The Agentic Project Harness
A baseline operating layer for agent-friendly repositories — four readiness levels from context harness to product harness.
Every repository that an AI agent touches needs a harness: instructions, checks, scripts, docs, pipelines, observability, recovery, and feedback loops. The harness is not the product itself but the operating layer around it.
The goal is to make a repo easy for humans and agents to understand, modify, verify, deploy, recover, and operate — without relying on hidden context.
The core rule
If a repo needs hidden memory to be changed safely, the harness is incomplete. Encode that memory in files, scripts, checks, dashboards, or documented workflows.
Ground truth is disk, not memory.
Four readiness levels
The harness is organized into four levels so you can adopt incrementally. Each level adds what matters at that stage.
L0: Context harness
Make the repo legible.
| Item | Purpose |
|---|---|
| Agent instructions (AGENTS.md / CLAUDE.md) | Defines repo-specific behavior, guardrails, commands, and forbidden actions |
| README | Fast entry point for humans |
| Skills | Encodes repeatable workflows agents should follow |
| MCP configs | Gives agents structured access to external tools |
| Docs folder | Stores durable product, architecture, and operational context |
| Scratch-pad | Safe place for temporary investigation notes |
| Scripts folder | Holds repeatable operational commands |
| Decision records | Preserves why important choices were made |
L1: Development harness
Make local development repeatable and reviewable.
- `.env.example`: documents required environment variables without leaking secrets
- `.gitignore` / `.dockerignore`: keep noise and secrets out of git and Docker contexts
- Local bootstrap: one documented setup path for a new machine
- Linting and formatting: catches problems before review, keeps diffs focused on behavior
- Pre-commit hooks: moves cheap checks earlier than CI
- Tests and type checks: verifies correctness
- Automated migrations: handles schema changes reproducibly
- Changelog folder: records what changed and why
- Dependency management: controls the supply chain
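As one illustration of how `.env.example` can double as a check rather than only documentation, the sketch below reads the variable names from an example file and reports which of them are missing or empty in a given environment. The function names `parse_env_keys` and `missing_env_vars` are hypothetical, and the parser assumes simple `KEY=value` lines with `#` comments; real env files may need a proper parser.

```python
from typing import Mapping


def parse_env_keys(example_text: str) -> list[str]:
    """Return variable names declared in .env.example-style text.

    Assumption: simple KEY=value lines; blank lines and # comments are skipped.
    """
    keys = []
    for raw in example_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Tolerate shell-style "export KEY=value" lines.
        if line.startswith("export "):
            line = line[len("export "):]
        key = line.split("=", 1)[0].strip()
        if key:
            keys.append(key)
    return keys


def missing_env_vars(example_text: str, env: Mapping[str, str]) -> list[str]:
    """List declared variables that are absent or empty in env."""
    return [key for key in parse_env_keys(example_text) if not env.get(key)]
```

A bootstrap script could call `missing_env_vars(Path(".env.example").read_text(), os.environ)` and stop with a clear message when the list is non-empty, so a new machine fails early instead of at first use.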
L2: Production harness
Make the project shippable and recoverable.
This level is required before production traffic, sensitive data, payments, or customer commitments.
- CI/CD pipelines with gated deployment
- Health checks and monitoring
- Database backup and restore automation
- Alerting and runbooks
- Security audits and vulnerability scanning
- Secrets management
- Performance testing
- Incident history
L3: Product harness
Learn from users and operate as a product.
This level is project-dependent — adopt it when stage, team size, compliance, or growth motion justifies it.
- User feedback collection and analysis
- License and legal documentation
- Feature flags and experimentation
- Customer support system
- Issue tracking
- Marketing and outreach
Recommended repo shape
.
├── AGENTS.md or CLAUDE.md
├── README.md
├── .env.example
├── .gitignore
├── .dockerignore
├── docs/
├── scripts/
├── changelog/
├── scratch-pad/
├── infra/
├── migrations/
├── tests/
└── .github/workflows/
Folder names matter less than discoverability. A new maintainer or agent should find instructions, commands, docs, verification, release history, and runbooks within minutes.
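Discoverability can be spot-checked mechanically. The sketch below walks the recommended shape and lists which entries are absent from a repository root. The labels, the `EXPECTED_SHAPE` table, and the function name `missing_harness_items` are hypothetical; since folder names matter less than discoverability, a real project would adjust the table to its own layout.

```python
from pathlib import Path

# Entries mirror the recommended repo shape above.
# Any one of the listed candidates satisfies an entry.
EXPECTED_SHAPE: dict[str, tuple[str, ...]] = {
    "agent instructions": ("AGENTS.md", "CLAUDE.md"),
    "README": ("README.md",),
    "env example": (".env.example",),
    "gitignore": (".gitignore",),
    "dockerignore": (".dockerignore",),
    "docs": ("docs",),
    "scripts": ("scripts",),
    "changelog": ("changelog",),
    "scratch-pad": ("scratch-pad",),
    "infra": ("infra",),
    "migrations": ("migrations",),
    "tests": ("tests",),
    "CI workflows": (".github/workflows",),
}


def missing_harness_items(root: Path) -> list[str]:
    """Return labels of expected entries that do not exist under root."""
    return [
        label
        for label, candidates in EXPECTED_SHAPE.items()
        if not any((root / candidate).exists() for candidate in candidates)
    ]
```

An empty result only shows that the files exist. Whether a newcomer can actually find commands and runbooks within minutes still needs a human or agent read-through.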
Completion discipline
The harness includes a completion protocol that agents follow before declaring work done:
- Recover state from disk: start with `git status` and `git diff` to find the actual changed files
- Inspect: look for edge cases, stale assumptions, missing tests, state leakage, and docs drift
- Verify: run the completion check script and fix issues before final response
- Report: state files changed, checks run, and remaining risk
A lightweight `completion_check.sh` script prints dirty files, touched files, untracked files, conflict markers, debug/TODO markers, and large uncommitted files. Wire it into git hooks later.
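The source names a shell script; the following is a Python sketch of a subset of the same checks, not the actual `completion_check.sh`. It assumes git's `status --porcelain` output format, and the marker list, the 1 MB size threshold, and all function names are illustrative choices rather than values from the harness.

```python
import subprocess
from pathlib import Path

LARGE_FILE_BYTES = 1_000_000  # assumed threshold for "large"
DEBUG_MARKERS = ("TODO", "FIXME", "breakpoint()")  # assumed marker list
CONFLICT_PREFIXES = ("<<<<<<< ", ">>>>>>> ")


def parse_porcelain(output: str) -> tuple[list[str], list[str]]:
    """Split `git status --porcelain` output into (dirty, untracked) paths."""
    dirty, untracked = [], []
    for line in output.splitlines():
        if len(line) < 4:
            continue
        status, path = line[:2], line[3:]
        # Renames are reported as "old -> new"; keep the new path.
        if " -> " in path:
            path = path.split(" -> ", 1)[1]
        (untracked if status == "??" else dirty).append(path)
    return dirty, untracked


def scan_text(text: str) -> dict[str, list[int]]:
    """Return 1-based line numbers of conflict and debug/TODO markers."""
    found: dict[str, list[int]] = {"conflict": [], "marker": []}
    for number, line in enumerate(text.splitlines(), start=1):
        if line.startswith(CONFLICT_PREFIXES):
            found["conflict"].append(number)
        if any(marker in line for marker in DEBUG_MARKERS):
            found["marker"].append(number)
    return found


def completion_report(root: Path) -> dict[str, list[str]]:
    """Collect dirty, untracked, flagged, and large files under a git repo."""
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=root, capture_output=True, text=True, check=True,
    ).stdout
    dirty, untracked = parse_porcelain(status)
    report: dict[str, list[str]] = {
        "dirty": dirty, "untracked": untracked,
        "conflicts": [], "markers": [], "large": [],
    }
    for rel in dirty + untracked:
        path = root / rel
        if not path.is_file():  # deleted files and untracked directories
            continue
        if path.stat().st_size > LARGE_FILE_BYTES:
            report["large"].append(rel)
            continue
        found = scan_text(path.read_text(encoding="utf-8", errors="replace"))
        if found["conflict"]:
            report["conflicts"].append(rel)
        if found["marker"]:
            report["markers"].append(rel)
    return report
```

Keeping the parsing and scanning functions free of side effects makes them easy to test without a repository; only `completion_report` shells out to git.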
When context is tight, scale the pass down but don't skip it. Prioritize: money/payment state, auth/privacy, production/deploy risk, data correctness, runtime state, user-facing UX, docs/tests.
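One way to make the scaled-down pass concrete is to rank the areas a change touches and review only the top few. The sketch below assumes the list above is in descending priority, which the text implies but does not state outright; `scaled_pass` and its `budget` parameter are hypothetical names.

```python
# Order copied from the prioritization list above (assumed to be descending).
PRIORITY = [
    "money/payment state",
    "auth/privacy",
    "production/deploy risk",
    "data correctness",
    "runtime state",
    "user-facing UX",
    "docs/tests",
]


def scaled_pass(touched_areas: set[str], budget: int) -> list[str]:
    """Return up to `budget` touched areas to review, highest priority first.

    The budget is floored at 1 so the pass is scaled down, never skipped.
    """
    ordered = [area for area in PRIORITY if area in touched_areas]
    return ordered[: max(budget, 1)]
```

For example, a change touching docs, auth, and runtime state with a budget of two would review auth/privacy and runtime state, and leave docs/tests as reported remaining risk.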
Standards
Each harness item uses one of three requirement levels:
| Standard | Meaning |
|---|---|
| Required | Every serious repo needs this. Missing it creates avoidable risk. |
| Production-required | Required before production traffic, sensitive data, or customer commitments. |
| Project-dependent | Required only when stage, team size, or compliance justifies it. |
This keeps the harness proportionate to the project's actual risk surface rather than applying everything everywhere.
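The three standards can be read as a small decision rule. The sketch below encodes that rule and applies it to a tiny example catalog. The names `Standard`, `applies`, and `outstanding` are hypothetical, and the catalog entries are illustrative: the mapping of README to Required is an assumption, while the other two follow the L2 and L3 descriptions.

```python
from enum import Enum


class Standard(Enum):
    REQUIRED = "Required"
    PRODUCTION_REQUIRED = "Production-required"
    PROJECT_DEPENDENT = "Project-dependent"


def applies(standard: Standard, *, production: bool, justified: bool) -> bool:
    """Decide whether an item with this standard is required right now.

    production: the repo handles production traffic, sensitive data,
                or customer commitments.
    justified:  stage, team size, or compliance justifies the item.
    """
    if standard is Standard.REQUIRED:
        return True
    if standard is Standard.PRODUCTION_REQUIRED:
        return production
    return justified


def outstanding(
    catalog: dict[str, Standard],
    present: set[str],
    *,
    production: bool,
    justified: bool,
) -> list[str]:
    """List catalog items that are required now but not yet present."""
    return [
        item
        for item, standard in catalog.items()
        if item not in present
        and applies(standard, production=production, justified=justified)
    ]


# Illustrative catalog only; a real one would cover every harness item.
EXAMPLE_CATALOG = {
    "README": Standard.REQUIRED,
    "Database backup and restore automation": Standard.PRODUCTION_REQUIRED,
    "Feature flags and experimentation": Standard.PROJECT_DEPENDENT,
}
```

With this shape, a prototype with no production exposure is asked only for the Required items, and the list grows as the risk surface does.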