
Phase 02

Implementation

Runs the focused task. The agent works inside the harness; I review the output.

Harness Engineering

Agent = Model + Harness

Model = Claude

Harness = what keeps Claude Code bounded and reviewable

Question: What is the harness?

Answer:

The operating system around Claude Code.
The rules, tools, environment, checks, and logs around the model.
What keeps the agent bounded, observable, and reviewable.

Why:

The model alone is not the system.
The harness is what makes the work usable.

Risk if wrong:

People over-credit the model and under-design the workflow.

1.1 Harness anatomy

Instructions

What the agent is.
What it should do.
What it must not do.

Examples: CLAUDE.md, agent skills, project conventions

Tools

What the agent can call.
How it acts on the repo.
Where it gets leverage.

Examples: file read/write, repo search, terminal commands

Execution Environment

Where the agent runs.
What it can access.
What it cannot cross.

Examples: branch boundary, local runtime, safe command scope

Orchestration Logic

How work moves through the harness.
When each step fires.
How handoff and verification happen.

Examples: implement-ticket ATS-123, coding + verification loop, context update step

Guardrails / Hooks

Hard rules the agent must obey.
Deterministic checks around execution.
Stops for common failure modes.

Examples: one focused task, max 100 LOC including docs, stop if unclear
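A guardrail like the LOC cap can be a deterministic check rather than an instruction. A minimal sketch, assuming "LOC" means added plus removed lines in a unified diff; the function names and that counting rule are my assumptions, not part of the harness as described.

```python
MAX_LOC = 100  # from the guardrail: max 100 LOC including docs


def count_changed_lines(diff_text: str) -> int:
    """Count added and removed lines in a unified diff.

    Assumption: file header lines ('+++' / '---') are not changes.
    Limitation: a removed line whose content starts with '--' is
    indistinguishable from a header here and is skipped.
    """
    changed = 0
    for line in diff_text.splitlines():
        if line.startswith(("+++", "---")):
            continue  # file headers, not content
        if line.startswith(("+", "-")):
            changed += 1
    return changed


def within_loc_budget(diff_text: str, max_loc: int = MAX_LOC) -> bool:
    """True when the diff stays at or under the LOC cap."""
    return count_changed_lines(diff_text) <= max_loc
```

A hook could run this on the staged diff and stop the run when it returns False, instead of relying on the agent to count.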

Observability

What the run produced.
What happened during execution.
What the next reviewer needs to see.

Examples: session log, verification output, PR review notes

1.2 Harness layers

Always-on harness

Loaded on every Claude Code run.

Examples: CLAUDE.md, project conventions, hard guardrails

Per-ticket harness

Loaded for one focused ticket.

Examples: focused task, proof path, review notes
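One way to picture the two layers is as context assembled per run. This is a sketch only: `HarnessLayer` and `build_run_context` are hypothetical names, and the rule that per-ticket items add to the always-on layer rather than replace it is my assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class HarnessLayer:
    name: str
    items: Tuple[str, ...]


def build_run_context(
    always_on: HarnessLayer,
    per_ticket: Optional[HarnessLayer] = None,
) -> List[str]:
    """Return the ordered context items for one run."""
    context = list(always_on.items)  # always-on layer comes first, every run
    if per_ticket is not None:
        # Per-ticket items are appended; duplicates are skipped.
        for item in per_ticket.items:
            if item not in context:
                context.append(item)
    return context


always_on = HarnessLayer("always-on", ("CLAUDE.md", "project conventions", "hard guardrails"))
per_ticket = HarnessLayer("per-ticket", ("focused task", "proof path", "review notes"))
run_context = build_run_context(always_on, per_ticket)
```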

Operating model

During Implementation, the developer defines the focused task and reviews the output. Claude Code implements inside the task, runs verification, and prepares a pull request.

Developer zone

I hold the wheel at task definition and review.

Agent zone

The agent runs inside the approved task.

Verification loops until the work is ready for human review.

2.1 Verification rule

Tests, type checks, and lint verify correctness deterministically.
Evals verify output quality and agent trajectory.
Human review owns the final merge decision.
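The verification loop can be sketched as below. The checks and the fix step are injected callables, so nothing here assumes a particular test runner or agent API; `verification_loop`, `attempt_fix`, and the `max_rounds` cap are hypothetical. The loop only reports "ready for review"; it never merges.

```python
from typing import Callable, Dict, List, Tuple

Check = Callable[[], bool]  # returns True when the check passes


def verification_loop(
    checks: Dict[str, Check],
    attempt_fix: Callable[[List[str]], None],
    max_rounds: int = 3,
) -> Tuple[bool, List[str]]:
    """Run all checks; on failure, let the agent try a fix and re-run.

    Returns (ready_for_review, names_of_checks_still_failing).
    """
    failing: List[str] = []
    for round_number in range(1, max_rounds + 1):
        failing = [name for name, check in checks.items() if not check()]
        if not failing:
            return True, []  # ready for human review, not merged
        if round_number < max_rounds:
            attempt_fix(failing)  # agent works only on what failed
    return False, failing  # out of rounds: hand the failures to the human
```

Capping the rounds is a design choice: a loop that cannot converge should stop and surface the failing checks rather than keep editing.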

2.2 Review rule

One acceptance criterion.
One focused task.
One coding agent.
One reviewable PR.
Small PRs keep review fast.
Max 100 LOC, docs included.

Review gate

3.1 Before a PR is accepted

Does the diff satisfy the focused task?
Did verification run?
Is the evidence reviewable?
Did the agent stay inside scope?
Are unknowns carried forward?
Is the PR small enough to review?
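The six questions can be held as an explicit checklist so that an unanswered question counts as "no". A minimal sketch; the question keys and `open_gate_items` are illustrative names I chose, and the answers still come from the human reviewer.

```python
from typing import Dict, List

GATE_QUESTIONS = (
    "diff satisfies the focused task",
    "verification ran",
    "evidence is reviewable",
    "agent stayed inside scope",
    "unknowns are carried forward",
    "PR is small enough to review",
)


def open_gate_items(answers: Dict[str, bool]) -> List[str]:
    """Return gate questions not yet answered 'yes'.

    Missing answers are treated as 'no', so the gate fails closed.
    """
    return [q for q in GATE_QUESTIONS if not answers.get(q, False)]
```

The PR is acceptable only when `open_gate_items` returns an empty list.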

3.2 Evidence package

focused task summary
files changed
tests run
typecheck / lint / build output
verifier notes
skipped checks with reason
context delta
remaining risks or items to confirm
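The evidence package maps naturally onto a small record type. This is a sketch under assumptions: the field names and types are mine, derived from the list above, and `missing_reasons` is a hypothetical helper for the "skipped checks with reason" item.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class EvidencePackage:
    focused_task_summary: str
    files_changed: List[str] = field(default_factory=list)
    tests_run: List[str] = field(default_factory=list)
    check_output: Dict[str, str] = field(default_factory=dict)  # typecheck / lint / build
    verifier_notes: str = ""
    skipped_checks: Dict[str, str] = field(default_factory=dict)  # check name -> reason
    context_delta: str = ""
    remaining_risks: List[str] = field(default_factory=list)  # risks or items to confirm

    def missing_reasons(self) -> List[str]:
        """Names of skipped checks that were recorded without a reason."""
        return [name for name, reason in self.skipped_checks.items() if not reason.strip()]
```

A skipped check with no reason is a gap in the evidence, so a reviewer can reject the package when `missing_reasons()` is non-empty.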
