Phase 02
Implementation
This phase runs the focused task. The agent works inside the harness; I review the output.
Harness Engineering
Agent = Model + Harness
Model = Claude
Harness = what keeps Claude Code bounded and reviewable
Question: What is the harness?
Answer:
- The operating system around Claude Code.
- The rules, tools, environment, checks, and logs around the model.
- What keeps the agent bounded, observable, and reviewable.
Why:
- The model alone is not the system.
- The harness is what makes the work usable.
Risk if wrong:
- People over-credit the model and under-design the workflow.
1.1 Harness anatomy
Instructions
- What the agent is.
- What it should do.
- What it must not do.
Examples: CLAUDE.md, agent skills, project conventions
Tools
- What the agent can call.
- How it acts on the repo.
- Where it gets leverage.
Examples: file read/write, repo search, terminal commands
Execution Environment
- Where the agent runs.
- What it can access.
- What it cannot cross.
Examples: branch boundary, local runtime, safe command scope
Orchestration Logic
- How work moves through the harness.
- When each step fires.
- How handoff and verification happen.
Examples: implement-ticket ATS-123, coding + verification loop, context update step
Guardrails / Hooks
- Hard rules the agent must obey.
- Deterministic checks around execution.
- Stops for common failure modes.
Examples: one focused task, max 100 LOC including docs, stop if unclear
Observability
- What the run produced.
- What happened during execution.
- What the next reviewer needs to see.
Examples: session log, verification output, PR review notes
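To make the six parts concrete, here is a minimal sketch that models them as one record. The `Harness` class, its field names, and the `missing_parts` helper are hypothetical names chosen for illustration; they are not part of Claude Code.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Harness:
    """Hypothetical container for the six harness parts listed above."""

    instructions: list[str] = field(default_factory=list)   # e.g. CLAUDE.md, project conventions
    tools: list[str] = field(default_factory=list)          # e.g. file read/write, repo search
    environment: list[str] = field(default_factory=list)    # e.g. branch boundary, safe command scope
    orchestration: list[str] = field(default_factory=list)  # e.g. coding + verification loop
    guardrails: list[str] = field(default_factory=list)     # e.g. one focused task, stop if unclear
    observability: list[str] = field(default_factory=list)  # e.g. session log, PR review notes

    def missing_parts(self) -> list[str]:
        """Return the names of parts that have nothing configured yet."""
        return [name for name, value in vars(self).items() if not value]


harness = Harness(
    instructions=["CLAUDE.md"],
    guardrails=["one focused task", "max 100 LOC including docs", "stop if unclear"],
)
print(harness.missing_parts())
```

An empty part is a design gap worth noticing before a run, which is the point of listing the parts explicitly.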
1.2 Harness layers
Always-on harness
Loaded on every Claude Code run.
Examples: CLAUDE.md, project conventions, hard guardrails
Per-ticket harness
Loaded for one focused ticket.
Examples: focused task, proof path, review notes
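One way to picture the two layers is as two lists that are combined at the start of a run. This is only a sketch: `build_run_context` is a hypothetical helper, and the ordering (always-on first) and the de-duplication are assumptions, not documented Claude Code behavior.

```python
from __future__ import annotations


def build_run_context(always_on: list[str], per_ticket: list[str]) -> list[str]:
    """Combine both layers: always-on items first, then per-ticket items, no duplicates."""
    seen: set[str] = set()
    context: list[str] = []
    for item in always_on + per_ticket:
        if item not in seen:
            seen.add(item)
            context.append(item)
    return context


always_on = ["CLAUDE.md", "project conventions", "hard guardrails"]
per_ticket = ["focused task", "proof path", "review notes"]
print(build_run_context(always_on, per_ticket))
```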
Operating model

Developer zone
- Define focused task
- Review + approve
Agent zone
- Coding agent
- Verification: tests, typecheck, lint, evals
- Pull Request
Developer hands the task to the agent zone.
Failure feedback stays inside the task.
Developer reviews PR output.
I hold the wheel at task definition and review.
The agent runs inside the approved task.
Verification loops until the work is ready for human review.
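The hand-off and feedback loop described above can be sketched as a small function. `implement` and `verify` are hypothetical callables standing in for the coding agent and the verification step, and the `max_attempts` cap is an added assumption so that the sketch always terminates.

```python
from __future__ import annotations

from typing import Callable


def run_task(
    implement: Callable[[str, list[str]], str],
    verify: Callable[[str], list[str]],
    task: str,
    max_attempts: int = 3,  # assumption: the source does not state a retry limit
) -> tuple[str, list[str], int]:
    """Loop implement -> verify, feeding failures back in, until verification is clean.

    Returns (last output, remaining failures, attempts used).
    An empty failure list means the work is ready for human review.
    """
    output = ""
    failures: list[str] = []
    for attempt in range(1, max_attempts + 1):
        output = implement(task, failures)  # failure feedback stays inside the task
        failures = verify(output)
        if not failures:
            return output, failures, attempt
    return output, failures, max_attempts
```

If failures remain after the last attempt, the run stops and the unresolved failures go to the developer instead of being retried indefinitely.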
2.1 Verification rule
- Tests, typecheck, and lint verify deterministic correctness.
- Evals verify output quality and agent trajectory.
- Human review owns the final merge decision.
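The deterministic half of this rule can be sketched as a runner that executes each check command and records pass or fail by exit code. `run_checks` is a hypothetical helper, and the commands are supplied by the caller because the actual test, typecheck, and lint commands are project-specific. Evals and human review are deliberately left out: they are judgments, not exit codes.

```python
from __future__ import annotations

import subprocess
import sys


def run_checks(checks: dict[str, list[str]]) -> dict[str, bool]:
    """Run each named command; a check passes when its exit code is 0."""
    results: dict[str, bool] = {}
    for name, command in checks.items():
        completed = subprocess.run(command, capture_output=True, text=True)
        results[name] = completed.returncode == 0
    return results


# Stand-in commands so the sketch runs anywhere; replace with the project's real ones.
demo_checks = {
    "passing-check": [sys.executable, "-c", "pass"],
    "failing-check": [sys.executable, "-c", "raise SystemExit(1)"],
}
print(run_checks(demo_checks))
```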
2.2 Review rule
- One acceptance criterion.
- One focused task.
- One coding agent.
- One reviewable PR.
- Small PRs keep review fast.
- Max 100 LOC, docs included.
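The size limit is simple enough to check mechanically. The sketch below counts changed lines in a unified diff and compares the total to the limit. Treating "100 LOC" as added plus removed lines is an assumption, and `count_changed_lines` and `within_limit` are hypothetical helpers.

```python
MAX_CHANGED_LINES = 100  # from the review rule above; docs are included in the count


def count_changed_lines(diff_text: str) -> int:
    """Count added and removed lines in a unified diff, skipping file headers.

    Simplification: a changed line whose own content starts with "--" or "++"
    looks like a file header and is skipped too.
    """
    count = 0
    for line in diff_text.splitlines():
        if line.startswith(("+++", "---")):
            continue  # file header lines, not changes
        if line.startswith(("+", "-")):
            count += 1
    return count


def within_limit(diff_text: str, limit: int = MAX_CHANGED_LINES) -> bool:
    """True when the diff is small enough to review under the rule above."""
    return count_changed_lines(diff_text) <= limit


sample_diff = """--- a/app.py
+++ b/app.py
@@ -1,2 +1,2 @@
-old = 1
+new = 2
 unchanged
"""
print(count_changed_lines(sample_diff), within_limit(sample_diff))
```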
Review gate
3.1 Before a PR is accepted
- Does the diff satisfy the focused task?
- Did verification run?
- Is the evidence reviewable?
- Did the agent stay inside scope?
- Are unknowns carried forward?
- Is the PR small enough to review?
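The checklist above can be recorded as explicit yes/no answers so that an unanswered question counts as a failure. This is a sketch only: `review_gate` is a hypothetical helper, and it supports the reviewer rather than replacing the human merge decision.

```python
from __future__ import annotations

GATE_QUESTIONS = (
    "Does the diff satisfy the focused task?",
    "Did verification run?",
    "Is the evidence reviewable?",
    "Did the agent stay inside scope?",
    "Are unknowns carried forward?",
    "Is the PR small enough to review?",
)


def review_gate(answers: dict[str, bool]) -> tuple[bool, list[str]]:
    """Return (passed, failed questions); a missing answer is treated as 'no'."""
    failed = [q for q in GATE_QUESTIONS if not answers.get(q, False)]
    return (not failed, failed)


answers = {question: True for question in GATE_QUESTIONS}
answers["Did verification run?"] = False
print(review_gate(answers))
```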
3.2 Evidence package
- focused task summary
- files changed
- tests run
- typecheck / lint / build output
- verifier notes
- skipped checks with reason
- context delta
- remaining risks or items to confirm
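A minimal sketch of the evidence package as a record, with one check a reviewer is likely to care about: every skipped check must carry a reason. The `EvidencePackage` class, its field names, and the `problems` method are hypothetical and mirror the list above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class EvidencePackage:
    """Hypothetical record mirroring the evidence list above."""

    task_summary: str
    files_changed: list[str]
    tests_run: list[str]
    check_output: dict[str, str]  # typecheck / lint / build output, keyed by check name
    verifier_notes: str = ""
    skipped_checks: dict[str, str] = field(default_factory=dict)  # check name -> reason
    context_delta: str = ""
    remaining_risks: list[str] = field(default_factory=list)

    def problems(self) -> list[str]:
        """List gaps that would make the package hard to review."""
        found: list[str] = []
        if not self.task_summary.strip():
            found.append("missing focused task summary")
        if not self.files_changed:
            found.append("no files changed listed")
        for check, reason in self.skipped_checks.items():
            if not reason.strip():
                found.append(f"skipped check '{check}' has no reason")
        return found


package = EvidencePackage(
    task_summary="Implement ticket ATS-123",
    files_changed=["app.py"],
    tests_run=["test_app.py"],
    check_output={"lint": "ok"},
    skipped_checks={"build": ""},
)
print(package.problems())
```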