Agile Vibe Coding (AVC)
A framework for managing agent-based software development projects
| Version | DRAFT |
| License | MIT |
| Authors | @NachoColl (How to Contribute) |
| Website | agilevibecoding.org |
Agile Vibe Coding (AVC) is a structured approach to implementing complex software features using AI agents. The framework takes its starting points from Anthropic's best practices for long-running agents, which offer patterns for structuring AI agent systems, and from the Agile Manifesto principles, which distill decades of iterative software development wisdom.
AVC provides best practices and tools to break down large projects into verifiable, trackable features that can be implemented efficiently by specialized AI coding agents. The framework makes long-running development sustainable through parallel execution—multiple agents working simultaneously on different features without conflicts—combined with continuous context quality improvement through systematic measurement and retrospective analysis.
The Challenge
Large software projects overwhelm LLM coding agents in ways fundamentally different from human developers. Unlike humans, who build an increasingly abstract understanding of a system through iteration, LLMs hold no such understanding. Take it as an axiom: an LLM contains vast knowledge that it can 'use' with the proper context input to produce the proper output. That's it (at least as a strategy for getting closer to deterministic behaviour).
- The predictive nature of LLMs yields its best results on a single-threaded chain of thought, while developers usually work on many threads at once.
- An LLM's context SHOULD NOT be understood as memory but as information.
Can we then use LLMs to build large and complex systems without such an understanding? Yes. As in many disciplines, shaping the proper questions at the right time is critical to getting a good answer. For LLM coding agents, the context-building process provides that guidance: the recent success of tools like Claude Code lies not primarily in the quality of their underlying models, but in their ability to decouple human intent from model interaction by delivering the appropriate context through a multi-step workflow.
The Solution
AVC structures a project around four building blocks:
- Context Scopes - A project contains as many scopes as required, each defined by the context its features share.
- Scope Features - A scope (e.g. epic, sprint) contains child scopes and short single-agent tasks with no dependencies on other tasks in the same scope.
- Context Retrospectives - Regular analysis of the next sprint's context requirements while maintaining the full project context database(s).
- Specialized Agents - Framework agents with dedicated roles: orchestration, implementation, testing, and documentation.
Context Scopes
A project is organized into a tree of scopes, each representing a layer of work that shares a set of contexts (Agile Vibe Coding mimics Agile epic and sprint naming, but is not limited to it). The key part is the hierarchy of contexts.
project/
├── context.md # project-wide context
├── epic-1/
│ ├── context.md # epic-level context
│ ├── sprint-1/
│ │ ├── context.md # sprint-level context
│ │ ├── feature-001.json # feature with embedded tests
│ │ ├── feature-002.json
│ │ └── ...
│ ├── sprint-2/
│ │ ├── context.md
│ │ └── ...
│ └── ...
├── epic-2/
│ ├── context.md
│ └── ...
└── ...

Scope hierarchy is flexible - use whatever structure fits your project:
- Simple: project/ → feature-001.json (single level, good for <50 features)
- Agile-style: project/ → epic-1/ → sprint-1/ → feature-001.json (shown above)
- Custom: project/ → module-auth/ → phase-mvp/ → feature-001.json (domain-based)
Context inheritance: Each scope inherits context from its parent and adds its own specifics. Coding agents receive the full context chain when implementing features.
Scope Features
Each feature is a small, independent task defined by a prompt and a set of contexts that AI coding agent(s) use to implement it, plus a set of tests, each containing a test prompt and validation criteria to verify correct implementation.
Prompt-Driven Development
Features and tests are defined via prompts that LLMs interpret, not explicit commands. This allows agents to determine the best implementation and testing approaches based on provided context(s).
Characteristics:
- Prompt-based - LLMs read prompts and scope context hierarchy to determine what to build
- Two-stage completion - Implementation first, then testing/validation
- Independent git commits - One for implementation, one for test validation results
- No intra-scope dependencies - Features within the same scope don't depend on each other
- Context inheritance - Each feature receives full context chain (project → epic → sprint)
- Embedded test definitions - Tests defined within feature file with prompts and validation criteria
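The feature file format (shown below) can be read as a schema. Here is a hedged TypeScript sketch of that schema, inferred from the examples in this README - the field names match the samples, while the exact types (including the generatedTest and executionResult fields introduced later) are assumptions, not a normative definition:

```typescript
// Sketch of the feature file schema, inferred from the feature-001.json
// examples in this README. Types are assumptions, not a normative definition.
type FeatureStatus =
  | "pending" | "implementing" | "implemented"
  | "testing" | "completed" | "blocked";

interface ValidationCriterion {
  criteria: string; // what to verify, in plain language
  status: "pending" | "passed" | "failed";
}

interface EmbeddedTest {
  status: "pending" | "generating" | "generated" | "running" | "passed" | "failed";
  testPrompt: string;        // LLM-interpreted prompt used to generate the test
  generatedTest?: string;    // test code produced by the testing agent
  executionResult?: string;  // captured output from running the test
  validation: ValidationCriterion[];
}

interface Feature {
  id: string;                // e.g. "feature-001"
  name: string;
  scope: string;             // e.g. "epic-1/sprint-1"
  status: FeatureStatus;
  dependencies: string[];    // empty within a scope (no intra-scope dependencies)
  prompt: string;            // LLM-interpreted implementation prompt
  completedAt: string | null;
  gitCommit: string | null;
  tests: EmbeddedTest[];
}
```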
Feature file structure (feature-001.json):
{
"id": "feature-001",
"name": "Create User interface",
"scope": "epic-1/sprint-1",
"status": "pending",
"dependencies": [],
"prompt": "Create a TypeScript User interface in src/types/User.ts with properties: id (string), email (string), name (string), createdAt (Date). Follow the patterns from the scope context hierarchy. Export the interface for use in other modules.",
"completedAt": null,
"gitCommit": null,
"tests": [{
"status": "pending",
"testPrompt": "Generate comprehensive tests to verify the User interface implementation. Tests should confirm: 1) Interface is properly exported from src/types/User.ts, 2) All required properties (id, email, name, createdAt) exist with correct types, 3) TypeScript compilation succeeds. Use appropriate testing tools (TypeScript compiler, file parsing, build verification).",
"validation":[{
"criteria": "User interface is exported",
"status": "pending"
}, {
"criteria": "TypeScript compiles without errors",
"status": "pending"
}]
}]
}

Understanding Context Scope Hierarchy
Core concept: AVC uses a flexible, multi-level scope hierarchy to organize features and share context, not limited to Agile's epic/sprint structure.
What is a Scope?
A scope is any organizational level in your project where you want to:
- Group related features together
- Share common context (patterns, principles, tech stack)
- Create an inheritance boundary for comprehension
Scopes form a hierarchy - child scopes inherit all context from parent scopes and add their own specifics.
Flexible Hierarchy - Not Limited to Epics/Sprints
While AVC draws inspiration from Agile (epics → sprints), you can use ANY hierarchy that fits your project:
- Agile-style: project → epic → sprint → feature
- Domain-based: project → module → phase → feature
- Layer-based: project → layer → component → feature
- Custom mix: project → area → iteration → task-group → feature
- Single-level (simple projects): project → feature (no intermediate scopes)

The only rule: Each scope must have a context.md file defining shared comprehension for all features in that scope and its children.
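As a quick illustration, here is a minimal sketch that checks this rule, assuming a Node.js environment and that every directory under the features root is a scope (the function name and traversal strategy are illustrative, not part of AVC):

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Walk the scope tree and report any scope directory missing context.md.
// Assumes every directory under the root is a scope, per the layouts above.
function findScopesMissingContext(root: string): string[] {
  const missing: string[] = [];
  const walk = (dir: string) => {
    if (!fs.existsSync(path.join(dir, "context.md"))) missing.push(dir);
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      if (entry.isDirectory()) walk(path.join(dir, entry.name));
    }
  };
  walk(root);
  return missing;
}

console.log(findScopesMissingContext("features/")); // [] when the rule holds
```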
How Context Inheritance Works
Each scope's context.md file inherits from its parent and adds its own specifics:
Example hierarchy:
features/
├── context.md # Project-wide (root)
├── epic-auth/
│ ├── context.md # Epic-level (inherits project)
│ └── sprint-mvp/
│ ├── context.md # Sprint-level (inherits epic + project)
│ └── feature-001.json # Feature (receives merged context)

When an agent implements feature-001:
- Reads features/context.md (project-wide patterns)
- Reads features/epic-auth/context.md (authentication domain patterns)
- Reads features/epic-auth/sprint-mvp/context.md (MVP-specific details)
- Merges all three to get complete context for feature-001
Result: Agent has full context chain without duplication - project principles, domain patterns, and sprint specifics all available.
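A hedged sketch of that merge step, assuming contexts are simply concatenated from root to leaf (how the merge is performed is up to the controller; concatenation is one reasonable choice):

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Build the merged context for a feature by concatenating every context.md
// from the root scope down to the feature's own scope.
// e.g. buildContextChain("features", "epic-auth/sprint-mvp")
function buildContextChain(root: string, scopePath: string): string {
  const parts: string[] = [];
  let current = root;
  for (const segment of ["", ...scopePath.split("/")]) {
    if (segment) current = path.join(current, segment);
    const file = path.join(current, "context.md");
    if (fs.existsSync(file)) {
      parts.push(`<!-- ${file} -->\n${fs.readFileSync(file, "utf8")}`);
    }
  }
  return parts.join("\n\n"); // project-wide first, most specific last
}
```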
When to Add Scope Levels
Add a scope level when:
- Multiple features share common patterns/context
- You need a logical grouping boundary
- Context becomes too large at current level
- Different domains require different principles
Don't add unnecessary levels:
- If <10 features total, single level is fine
- If all features share identical context, no subdivision needed
- Don't create levels "just because" - each level should add value
Example decision:
- 200 features with 5 domains → Use epic-sprint hierarchy
- 20 features, all backend → Use single sprint level
- 100 features, 3 layers (data, business, API) → Use layer-sprint hierarchy
Scope Naming is Flexible
Built-in convention (Agile-inspired):
- epic-X/ - For domain/area groupings
- sprint-X/ - For iteration/phase groupings
But you can use ANY names:
- module-auth/, module-payments/
- layer-data/, layer-api/
- phase-mvp/, phase-beta/
- iteration-1/, iteration-2/
Only requirement: Directories must contain context.md and feature files.
Benefits of Scope Hierarchy
- No duplication - Define shared knowledge once per scope, inherited by children
- Flexibility - Structure matches your project's natural organization
- Scalability - Add/remove levels as project grows or shrinks
- Clarity - Each scope has clear boundaries and responsibilities
- Parallel work - Multiple agents work in different scopes simultaneously
- Context quality - Retrospectives improve context at appropriate scope levels
Example: Choosing Your Hierarchy
Scenario 1: Small project (30 features)
project/
└── features/
├── context.md # All shared context here
├── feature-001.json
├── feature-002.json
└── ...

Why: Simple structure, all features share one context file
Scenario 2: Medium project (120 features, 3 domains)
project/
└── features/
├── context.md # Project-wide patterns
├── module-auth/
│ ├── context.md # Auth-specific patterns
│ ├── feature-001.json
│ └── ...
├── module-payments/
│ └── ...
└── module-reporting/
└── ...

Why: Domain-based grouping, each module has unique patterns
Scenario 3: Large project (500 features, iterative development)
project/
└── features/
├── context.md # Project-wide principles
├── epic-core/
│ ├── context.md # Core domain patterns
│ ├── sprint-1/
│ │ ├── context.md # Sprint-specific details
│ │ ├── feature-001.json
│ │ └── ...
│ └── sprint-2/
│ └── ...
└── epic-integrations/
└── ...

Why: Large scale needs epic (domain) and sprint (iteration) levels
Context Retrospectives
Core principle: Provide shared comprehension once per scope - the tech stack, boundaries, principles, and patterns - ensuring every agent starts with complete context about HOW to implement features, not WHAT features to implement.
Benefits:
- No duplication - Define shared knowledge once per scope, not per feature
- Consistency - All features follow same tech stack, principles, and patterns
- Parallel execution - Multiple agents can work simultaneously with same comprehension
- 2-3x faster implementation - Agents spend 70% time coding vs 30% searching for context
- First-attempt accuracy - Comprehensive shared understanding reduces clarification questions
- Independent feature evolution - Features and tests evolve separately from shared context
Context quality metrics:
Track these metrics per scope (epic, sprint, or custom level):
- Clarification Rate - How often agents ask questions (target: 0%)
- Rework Rate - Features needing fixes after initial implementation (target: <5%)
- Implementation Time Variance - Actual vs estimated time (target: within 20%)
- Test Pass Rate - First-attempt test passes (target: >90%)
Improvement cycle:
Implement Scope → Track Metrics → Identify Patterns → Update Context → Apply Learning → Next Scope

Example retrospective after epic-1/sprint-1:
Scope: epic-1/sprint-1
Metrics: 68 features, 12 clarifications (17.6%), 8 rework (11.8%)
Pattern: SessionManager methods needed more error handling examples
Action: Added error handling section to epic-1/sprint-1/context.md
Result: sprint-2 had only 3 clarifications (8.6%) ✅
Insight: Error handling patterns promoted to epic-1/context.md for all sprints

Key insight: Context quality is to agent development what team dynamics are to human development. Invest in it, measure it, improve it systematically.
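As an illustration, the four metrics above could be computed from per-feature tracking data along these lines. Note that fields such as clarifications or reworked are hypothetical tracking data, not part of the feature file format shown earlier:

```typescript
// Illustrative retrospective metrics for one scope. The input fields are
// assumed tracking data (hypothetical), not part of the AVC feature schema.
interface FeatureRecord {
  clarifications: number;    // questions the agent asked
  reworked: boolean;         // needed fixes after first implementation
  firstAttemptPass: boolean; // all tests passed on the first run
  actualMinutes: number;
  estimatedMinutes: number;
}

function scopeMetrics(features: FeatureRecord[]) {
  const pct = (count: number) =>
    Math.round((count / features.length) * 1000) / 10;
  return {
    clarificationRate: pct(features.filter(f => f.clarifications > 0).length), // target: 0%
    reworkRate: pct(features.filter(f => f.reworked).length),                  // target: <5%
    testPassRate: pct(features.filter(f => f.firstAttemptPass).length),        // target: >90%
    // Share of features whose actual time deviates >20% from the estimate
    timeVarianceRate: pct(
      features.filter(
        f => Math.abs(f.actualMinutes - f.estimatedMinutes) / f.estimatedMinutes > 0.2
      ).length
    ),
  };
}
```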
Specialized Agents
The framework defines domain-specific agents responsible for orchestrating and managing AVC so that it works properly.
Agent types:
- Initializer Agent (runs once) - Creates tracking infrastructure, feature files, baseline commit
- Controller Agent (every session) - Orchestrates feature selection, spawns coding agents, updates tracking
- Server Agent - Backend/API implementation, business logic
- Client Agent - SDK/client library implementation
- Infrastructure Agent - Cloud deployment, IaC, monitoring setup
- Testing Agent - Test suites, CI/CD, quality assurance
- Documentation Agent - User guides, API docs, examples
How agents interact with scope hierarchy:
- Controller reads the scope context chain (project → epic → sprint context.md files) plus feature files
- Controller selects pending features and spawns the appropriate specialized agent
- Specialized agent receives merged scope contexts (full hierarchy) plus specific feature details
- Agent has complete context: project-wide patterns + epic-level specifics + sprint-level details
- Agent implements, validates with embedded tests, commits, updates feature status
Parallel Execution
Multiple agents implement different features simultaneously without merge conflicts through individual feature files and shared scope context hierarchy.
Conflict prevention mechanisms:
1. Individual feature files - Each agent updates different JSON files
   - feature-001.json, feature-002.json, etc.
   - File-system level atomicity
   - Git handles merges naturally
2. Exclusive status updates - Agents mark features "in_progress" before starting (see the sketch after this list)
   - Other agents skip in_progress features
   - Git commits serve as synchronization points
3. Separate code files - Different features modify different source files
   - Example: feature-001 → User.ts, feature-002 → UserService.ts
   - Rare conflicts caught by git
4. Shared context enables independence - No inter-agent communication needed
   - All agents read the same scope context chain (project → epic → sprint context.md files)
   - Complete information for independent work
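A minimal sketch of the exclusive status update (mechanism 2), assuming Node.js and a lock-file convention - the .lock suffix is an assumption; any atomic create-if-absent primitive works:

```typescript
import * as fs from "node:fs";

// Atomically claim a feature before working on it. The "wx" flag fails when
// the lock file already exists, so two agents can never claim the same feature.
function claimFeature(featurePath: string): boolean {
  const lock = featurePath + ".lock";
  try {
    fs.writeFileSync(lock, String(process.pid), { flag: "wx" });
  } catch {
    return false; // another agent holds the lock
  }
  const feature = JSON.parse(fs.readFileSync(featurePath, "utf8"));
  if (feature.status !== "pending") {
    fs.unlinkSync(lock); // already done or claimed in an earlier session
    return false;
  }
  feature.status = "in_progress";
  fs.writeFileSync(featurePath, JSON.stringify(feature, null, 2));
  return true;
}
```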
Example parallel workflow:
Terminal 1: Agent implements feature-001 (User interface) → commits to User.ts
Terminal 2: Agent implements feature-002 (UserService) → commits to UserService.ts
Terminal 3: Agent implements feature-003 (UserRepository) → commits to UserRepository.ts
Result: 3 features in 15 min (parallel) vs 45 min (sequential)
Long-running impact: 100 features across 10 agents = days instead of weeks

Continuous Verification
Principle: Generate and run tests for every feature using LLM-interpreted test prompts, catching issues early when context is still fresh, preventing integration problems across parallel work.
Prompt-Based Testing
Instead of predefined test commands, test files contain test prompts that LLMs interpret to generate appropriate tests.
Test Generation and Execution Workflow:
Stage 1: Feature Implementation
1. Coding agent reads feature.prompt + scope context hierarchy (project → epic → sprint)
2. Implements code based on prompt interpretation and merged contexts
3. Updates feature.status: "implemented"
4. Git commit implementation
5. Feature ready for testing (NOT completed)

Stage 2: Test Generation
6. Testing agent detects feature.status="implemented"
7. Reads feature.tests[].testPrompt from feature.json (embedded tests)
8. LLM generates test code for each test based on testPrompt
9. Stores generated code in feature.tests[].generatedTest
10. Updates feature.tests[].status: "generated"

Stage 3: Test Execution
11. Testing agent runs each feature.tests[].generatedTest sequentially
12. Captures execution results in feature.tests[].executionResult
13. Updates feature.tests[].validation[].status: "passed" or "failed"
14. Updates feature.tests[].status: "passed" or "failed"
15. Determines if all validation criteria passed

Stage 4: Feature Completion
16. If all feature.tests[].validation[].status = "passed":
- Update feature.status: "completed"
- Git commit test results
17. If any validation failed:
- Update feature.status: "blocked"
- Report failures for review/debugging

Critical Rule: A feature can ONLY be marked "completed" when:
- feature.status == "implemented" AND
- All feature.tests[].validation[].status == "passed"
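Expressed as a guard, a sketch over the feature fields used above:

```typescript
// Returns true only when a feature may transition to "completed":
// implementation done AND every validation criterion of every test passed.
function canMarkCompleted(feature: {
  status: string;
  tests: { validation: { status: string }[] }[];
}): boolean {
  return (
    feature.status === "implemented" &&
    feature.tests.every(test =>
      test.validation.every(v => v.status === "passed")
    )
  );
}
```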
Test Lifecycle States:
Each test progresses through:
- pending → Test defined but not generated
- generating → LLM creating test code
- generated → Test code ready to run
- running → Test executing
- passed → Test succeeded
- failed → Test failed
Feature Status Values:
- pending → Not started
- implementing → Code being written
- implemented → Code done, awaiting tests
- testing → Tests being generated/executed
- completed → Implementation AND all tests passed
- blocked → Test failures or dependency issues
Why prompt-based testing:
- Flexibility: LLMs choose appropriate testing methods based on implementation
- Adaptability: Tests adjust to actual code structure
- Intelligence: Agents can debug and improve tests iteratively
- Context-aware: Tests use testing strategies from scope context hierarchy
- Separation: Implementation and verification are distinct stages with different agents
Why this matters for parallel execution:
- Implementation and testing can happen in parallel across different features
- Catches conflicts immediately when tests run
- Prevents cascading failures across parallel streams
- Makes rollback easier (one feature at a time)
- Maintains clean git history with verified code
- Multiple tests per feature ensure thorough verification
Context Quality Self-Improvement
Core innovation: Systematically analyze agent questions and rework patterns to continuously improve specifications throughout the long-running project.
Retrospective process after each sprint:
## Sprint N Context Quality Retrospective
### Metrics Summary
- Features completed: 68
- Clarification questions: 12 (17.6%)
- Rework required: 8 (11.8%)
- Avg implementation time: 8.5 min (target: 10 min) ✅
### Patterns Identified
1. SessionManager methods needed more error handling examples (6 clarifications)
2. Middleware features had highest rework rate (4/12)
3. Type definitions were fastest (avg 5 min) - excellent context
### Context Improvements for Next Sprint
1. ✅ Add error handling section with try-catch patterns
2. ✅ Add middleware testing examples with mocking
3. ✅ Template the successful type definition approach
### Apply Learnings
- Reuse error handling patterns in Sprint 2
- Create middleware template for Sprint 3

Continuous improvement cycle:
Sprint Implementation
↓
Track Metrics (questions, rework, time)
↓
Identify Patterns (which context sections caused issues?)
↓
Update Context (add examples, clarify specs)
↓
Apply to Next Sprint
↓
Repeat

Session Continuity
Purpose: Progress survives across sessions through tracking files and git history, essential for long-running projects.
Tracking mechanisms:
- Individual feature files - features/epic-1/sprint-1/feature-XXX.json with implementation details, embedded tests, and status tracking
- Git history - One commit per feature with standard format, source of truth
- Scope context files - Hierarchical context.md files at each level (project, epic, sprint) with inherited comprehension
  - Example: features/context.md (project) → features/epic-1/context.md (epic) → features/epic-1/sprint-1/context.md (sprint)
- Optional: index.json - Machine-readable progress summary (regenerable from feature files)
Session workflow:
- Session start - Controller reads feature files + git log to determine what's completed
- During session - Agent updates feature status (pending → in_progress → completed)
- Session end - Git commits serve as permanent progress markers
- Next session - New controller instance picks up exactly where previous session ended
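A sketch of the session-start step under these conventions, assuming Node.js and the feature file layout above (the directory scan and the git cross-check are illustrative):

```typescript
import { execSync } from "node:child_process";
import * as fs from "node:fs";
import * as path from "node:path";

// Determine where the previous session left off: status comes from the
// feature files themselves, cross-checked against git history.
function resumeState(featuresRoot: string) {
  const files: string[] = [];
  const walk = (dir: string) => {
    for (const e of fs.readdirSync(dir, { withFileTypes: true })) {
      const p = path.join(dir, e.name);
      if (e.isDirectory()) walk(p);
      else if (e.name.endsWith(".json")) files.push(p);
    }
  };
  walk(featuresRoot);
  const features = files.map(f => JSON.parse(fs.readFileSync(f, "utf8")));
  const gitLog = execSync('git log --oneline --grep="feature-"').toString();
  return {
    completed: features.filter(f => f.status === "completed").map(f => f.id),
    pending: features.filter(f => f.status === "pending").map(f => f.id),
    // Every completed feature should also appear in git history (source of truth)
    unverified: features
      .filter(f => f.status === "completed" && !gitLog.includes(f.id))
      .map(f => f.id),
  };
}
```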
Why this enables long-running projects:
- No loss of context between sessions
- Progress is persistent and verifiable
- Multiple sessions can work in parallel
- Easy to resume after interruptions
Feature-Test Lifecycle
Complete State Machine
Understanding the full lifecycle of features and tests is critical for working with AVC.
Feature State Transitions:
pending → implementing → implemented → testing → completed
                                          ↓
                                       blocked (test failures)

Feature has TWO completion stages:
- Implementation Stage: Code is written and committed
- Testing Stage: All tests pass
Detailed Feature Status:
| Status | Description | feature.tests[].status | All validation passed? | Can git commit? |
|---|---|---|---|---|
| pending | Not started | pending | N/A | No |
| implementing | Code being written | pending | N/A | No |
| implemented | Code complete, awaiting tests | pending | N/A | Yes (impl only) |
| testing | Tests being generated/run | generating/running | No | No |
| completed | Everything done | passed | Yes | Yes (test results) |
| blocked | Tests failed | failed | No | No |
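The same state machine as a transition table - a hedged sketch that an update script could use to reject illegal transitions (the blocked → testing retry edge reflects the fix-and-retry flow described later and is an assumption):

```typescript
// Legal feature status transitions, per the diagram and table above.
const TRANSITIONS: Record<string, string[]> = {
  pending: ["implementing"],
  implementing: ["implemented"],
  implemented: ["testing"],
  testing: ["completed", "blocked"],
  blocked: ["testing"], // retry after fixing implementation or tests (assumed)
  completed: [],        // terminal
};

function assertTransition(from: string, to: string): void {
  if (!TRANSITIONS[from]?.includes(to)) {
    throw new Error(`Illegal feature transition: ${from} → ${to}`);
  }
}

assertTransition("pending", "implementing"); // ok
// assertTransition("pending", "completed"); // throws: tests never ran
```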
Embedded Test State Transitions (feature.tests[]):
pending → generating → generated → running → passed
                                      ↓
                                   failed

Individual Embedded Test Status:
| Status | Description | What's happening |
|---|---|---|
| pending | Test defined but not generated | Awaiting testing agent |
| generating | LLM creating test code | Testing agent working |
| generated | Test code ready | Ready to execute |
| running | Test executing | Testing agent running |
| passed | Test succeeded, all validation criteria met | Success |
| failed | Test failed, one or more validation criteria failed | Needs attention |
Validation Criteria (feature.tests[].validation[]):
Each test can have multiple validation criteria to verify. Each criterion has:
- criteria: Description of what to verify (e.g., "User interface is exported")
- status: "pending" | "passed" | "failed"
Parallel Agent Workflow
Terminal 1 - Coding Agent (feature-001):
1. Read feature-001.json prompt and scope hierarchy
2. Read scope context chain: project/context.md → epic-1/context.md → sprint-1/context.md
3. Implement code in src/types/User.ts based on merged contexts
4. Update feature.status: "implemented"
5. Git commit: "feat: Implement User interface [feature-001]"
6. STOP - do not mark as completed

Terminal 2 - Testing Agent (feature-001):
7. Detect feature-001.status="implemented"
8. Read feature-001.tests[].testPrompt from feature.json (embedded tests)
9. Generate test code for each test based on testPrompt
10. Store in feature-001.tests[].generatedTest fields
11. Update feature-001.tests[].status: "generated"
12. Execute each generated test
13. Update feature-001.tests[].validation[].status: "passed" or "failed"
14. Update feature-001.tests[].status: "passed" or "failed"
15. If all validation criteria passed:
- Update feature.status: "completed"
- Git commit: "test: Verify User interface [feature-001]"
16. If any validation failed:
- Update feature.status: "blocked"
- Report: "Test validation failed: <criteria> - <reason>"Terminal 3 - Coding Agent (feature-002):
(Runs in parallel with Terminals 1 & 2 on a different feature)

Handling Test Failures
When tests fail:
1. Testing agent marks feature as blocked:

   { "status": "blocked", "testing": { "status": "failed", "allTestsPassed": false } }

2. Review failure reasons
   - Check test.executionResult for each failed test
   - Determine if issue is in implementation or test
3. Fix and retry
   - Option A: Fix implementation, re-run tests
   - Option B: Improve testPrompt, regenerate tests
   - Option C: Fix specific test.generatedTest, re-run
4. Escalation
   - After N failed attempts, escalate to human review
   - Human can override or provide guidance
Critical Rules
Rule 1: Features complete only after ALL tests pass
# This should ERROR:
./update-feature.sh feature-001 completed
# Error: Cannot mark as completed
# testing.allTestsPassed = false
# 2 of 3 tests passed

Rule 2: Implementation and testing are separate
# Correct workflow:
./update-feature.sh feature-001 --implementation implemented abc123f
# Later, after tests pass:
./update-feature.sh feature-001 --testing completed

Rule 3: Tests cannot be generated until implementation done
# This should ERROR if implementation.status != "implemented":
./generate-tests.sh test-001.json
# Error: Cannot generate tests
# feature-001 implementation.status = "implementing"

Query Features by Stage
# Find features ready for implementation
./query-pending.sh --status pending
# Find features ready for testing
./query-pending.sh --ready-for-testing
# Returns features where implementation.status="implemented"
# Find blocked features
./query-pending.sh --status blocked

Example: Complete Feature Lifecycle
# Day 1, 10:00 AM - Coding Agent
$ ./feature-status.sh feature-001
Status: pending
Implementation: pending
Testing: pending
# Day 1, 10:15 AM - After implementation
$ ./feature-status.sh feature-001
Status: implemented
Implementation: implemented (commit: abc123f)
Testing: pending
Ready for: Testing
# Day 1, 10:20 AM - Testing Agent generates tests
$ ./test-status.sh test-001.json
Test 1: pending → generating
Test 2: pending → generating
Test 3: pending → generating
# Day 1, 10:22 AM - Tests generated
$ ./test-status.sh test-001.json
Test 1: generated
Test 2: generated
Test 3: generated
# Day 1, 10:25 AM - Tests running
$ ./test-status.sh test-001.json
Test 1: running
Test 2: running
Test 3: running
# Day 1, 10:27 AM - Tests complete
$ ./test-status.sh test-001.json
Test 1: passed ✓
Test 2: passed ✓
Test 3: passed ✓
Summary: 3/3 tests passed
$ ./feature-status.sh feature-001
Status: completed ✓
Implementation: implemented (commit: abc123f)
Testing: completed (commit: def456g)
All tests passed: true

Quick Start
1. Install AVC in Your Project
# Clone AVC framework
git clone https://github.com/yourusername/agilevibecoding.git .avc
# Or copy essential files
cp -r .avc/templates/* ./avc/
cp -r .avc/scripts/* ./avc/scripts/

2. Set Up Your Project Structure
your-project/
├── avc/ # AVC tracking directory
│ ├── tracking/
│ │ ├── features/
│ │ │ ├── context.md # Project-wide context (inherited by all)
│ │ │ ├── epic-1/
│ │ │ │ ├── context.md # Epic-level context (inherits project context)
│ │ │ │ ├── sprint-1/
│ │ │ │ │ ├── context.md # Sprint-level context (inherits epic + project)
│ │ │ │ │ ├── feature-001.json # Feature with embedded tests
│ │ │ │ │ └── feature-002.json
│ │ │ │ └── sprint-2/
│ │ │ │ ├── context.md
│ │ │ │ └── feature-003.json
│ │ │ ├── epic-2/
│ │ │ │ └── ...
│ │ ├── index.json # Progress summary (or omit if using individual files)
│ │ ├── claude-progress.txt # Session log
│ │ └── init.sh # Environment setup
│ ├── scripts/
│ │ ├── query-pending.sh
│ │ ├── rebuild-index.sh
│ │ ├── feature-status.sh
│ │ └── update-feature.sh
│ └── README.md # Project-specific AVC guide
├── src/ # Your actual code
└── ...

3. Create Your Implementation Plan
Document your project's scope hierarchy and features:
- Break down the project into scope hierarchy (project → epics → sprints, or custom levels)
- Identify verifiable features for each scope (5-30 minutes each)
- Create scope context files at each level with inherited specifications
- Project-level: Overall tech stack, principles, patterns
- Epic-level: Domain-specific patterns, boundaries
- Sprint-level: Specific implementation details
- Generate feature files with embedded tests (can be automated - see Auto-Generation)
4. Run Initializer Agent
Use the Initializer Agent prompt to set up tracking:
# The agent will:
# 1. Generate feature files based on your plan
# 2. Create tracking infrastructure (index.json, claude-progress.txt, init.sh)
# 3. Create baseline git commit
# See prompts/initializer.md for the prompt

5. Run Controller Agent
Use the Controller Agent for every development session:
# Each session, the controller will:
# 1. Read claude-progress.txt (resume from last session)
# 2. Review git log (understand what's been completed)
# 3. Run baseline tests (verify system health)
# 4. Select next feature (based on dependencies)
# 5. Spawn specialized coding agent
# 6. Update progress tracking
# See prompts/controller.md for the prompt

Project Setup Guide
Step 1: Define Your Scope Hierarchy
Organize your project into logical scope hierarchy (flexible structure):
Option A: Epic-Sprint Structure (Agile-style)
## Project: Your Project Name
### Epic 1: Core Foundation
#### Sprint 1.1: Data Layer (15 features, ~3 hours)
- Core types and interfaces
- Base service classes
- Database models
#### Sprint 1.2: Business Logic (20 features, ~4 hours)
- Service implementations
- Validation logic
- Error handling
### Epic 2: API Layer
#### Sprint 2.1: REST Endpoints (25 features, ~5 hours)
- Express routes
- API endpoints
- Request validation
...

Option B: Custom Hierarchy (Domain-based)
## Project: Your Project Name
### Module: Authentication
#### Phase: MVP (10 features, ~2 hours)
- User login
- Token generation
- Session management
### Module: Payments
#### Phase: MVP (15 features, ~3 hours)
- Payment processing
- Invoice generation
...

Step 2: Create Scope Contexts
For each scope level (project, epic, sprint, or custom), create a context.md file with shared comprehension that inherits from parent scopes:
Context Hierarchy:
- Project-level (features/context.md): Overall tech stack, global principles, project-wide patterns
- Epic-level (features/epic-1/context.md): Domain-specific patterns, epic boundaries, shared domain knowledge
- Sprint-level (features/epic-1/sprint-1/context.md): Specific implementation details, sprint-specific patterns
Template Structure:
# Scope: [Level] - [Name]
# (e.g., "Scope: Epic - Core Foundation" or "Scope: Sprint 1.1 - Data Layer")
## Overview
[What this scope accomplishes - the domain/layer, not specific features]
[How it inherits/extends parent scope context]
## Tech Stack
- Language: [e.g., TypeScript 5.0]
- Framework: [e.g., Express 4.18]
- Libraries: [e.g., zod for validation]
- Tools: [e.g., Jest for testing]
## Boundaries
**In Scope:**
- [What types of functionality belong in this sprint]
- [What problems this sprint solves]
**Out of Scope:**
- [What explicitly does NOT belong here]
- [What other sprints handle]
## Exclusions (What NOT to do)
- ❌ [Anti-patterns to avoid]
- ❌ [Common mistakes]
- ❌ [Explicitly forbidden approaches]
## Coding Principles
- [Naming conventions]
- [Code organization patterns]
- [Style guide requirements]
- [Best practices to follow]
## Architecture Constraints
- [Design decisions that apply to all features]
- [Patterns that must be used consistently]
## Shared Knowledge
- [Domain concepts all features need]
- [Business rules that apply]
- [Data models that are referenced]
## Code Examples (Generic Patterns)
```typescript
// Example: Error handling pattern (NOT a specific feature)
try {
// All features should follow this error handling approach
} catch (error) {
// Standard error handling
}
```
## Integration Points
- [How this scope connects to parent/child/sibling scopes]
- [External APIs or systems]
- [Dependencies on other work]
## Error Handling Approach
[How errors should be handled consistently]
## Dependencies
[External libraries, versions, why chosen]

IMPORTANT: The context.md contains shared comprehension for ALL features in that scope and child scopes, NOT individual feature implementations or test definitions.
See: examples/simple-api/features/epic-1/sprint-1/context.md for a complete example (note the scope hierarchy).
Step 3: Generate Feature Files with Embedded Tests
Create individual feature files with embedded test definitions, or use auto-generation:
Manual Creation - Feature File (feature-001.json):
{
"id": "feature-001",
"name": "Create User interface",
"scope": "epic-1/sprint-1",
"status": "pending",
"dependencies": [],
"prompt": "Create a TypeScript User interface in src/types/User.ts with properties: id (string), email (string), name (string), createdAt (Date). Follow the patterns from the scope context hierarchy (project → epic-1 → sprint-1). Export the interface for use in other modules.",
"contextReference": "epic-1/sprint-1/context.md#data-models",
"completedAt": null,
"gitCommit": null,
"tests": [{
"status": "pending",
"testPrompt": "Generate comprehensive tests to verify the User interface implementation. Tests should confirm: 1) Interface is properly exported from src/types/User.ts, 2) All required properties (id, email, name, createdAt) exist with correct types, 3) TypeScript compilation succeeds. Use appropriate testing tools (TypeScript compiler, file parsing, build verification).",
"validation": [{
"criteria": "User interface is exported from src/types/User.ts",
"status": "pending"
}, {
"criteria": "All required properties exist with correct types",
"status": "pending"
}, {
"criteria": "TypeScript compiles without errors",
"status": "pending"
}]
}]
}

Note: Tests are now embedded in feature files, not separate test files. The Testing Agent will:
- Read the testPrompt from feature.tests[]
- Generate test code (stored in the generatedTest field)
- Execute tests and update validation[].status
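A hedged sketch of that loop, with hypothetical generateTest and runTest helpers standing in for the LLM call and the test runner (setting every validation criterion from a single pass/fail result is a simplification):

```typescript
import * as fs from "node:fs";

// Hypothetical helper: in a real harness this would prompt an LLM.
function generateTest(testPrompt: string, context: string): string {
  return `// test generated from prompt: ${testPrompt.slice(0, 40)}...`;
}
// Hypothetical helper: executes generated test code and reports the outcome.
function runTest(testCode: string): { passed: boolean; log: string } {
  return { passed: true, log: "stub runner: always passes" };
}

function processTests(featurePath: string, mergedContext: string): void {
  const feature = JSON.parse(fs.readFileSync(featurePath, "utf8"));
  if (feature.status !== "implemented") return; // Rule 3: implementation first

  for (const test of feature.tests) {
    test.generatedTest = generateTest(test.testPrompt, mergedContext);
    test.status = "generated";
    const result = runTest(test.generatedTest);
    test.executionResult = result.log;
    for (const v of test.validation) v.status = result.passed ? "passed" : "failed";
    test.status = result.passed ? "passed" : "failed";
  }

  const allPassed = feature.tests.every((t: { status: string }) => t.status === "passed");
  feature.status = allPassed ? "completed" : "blocked";
  fs.writeFileSync(featurePath, JSON.stringify(feature, null, 2));
}
```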
Auto-Generation: See docs/AUTO_GENERATION.md for pattern templates and scripts to generate both feature and test files.
Step 4: Configure Tracking Scripts
Copy scripts from scripts/ to your project:
- query-pending.sh - Find next features to implement
- rebuild-index.sh - Regenerate index.json from feature files
- feature-status.sh - Check feature status
- update-feature.sh - Update feature status atomically
Step 5: Run Initializer Agent
Use the prompt from prompts/initializer.md to:
- Verify feature files are correct
- Create claude-progress.txt
- Create init.sh environment setup script
- Create baseline git commit
Agent Prompts
Initializer Agent (Run Once)
Purpose: Set up tracking infrastructure
Prompt: See prompts/initializer.md
Output:
- Verifies/creates feature files
- Creates claude-progress.txt
- Creates init.sh
- Creates baseline git commit
Duration: 1 session (~30 minutes)
Controller Agent (Every Session)
Purpose: Orchestrate feature implementation
Prompt: See prompts/controller.md
Workflow:
- Read claude-progress.txt (resume state)
- Review git log (understand recent work)
- Run baseline tests (verify health)
- Select next feature (dependencies met)
- Spawn specialized coding agent
- Track completion
- Update progress files
Duration: Ongoing (10-20 features per session)
Coding Agents (Specialized)
Purpose: Implement specific features
Prompt: See prompts/coding-agent.md
Types:
- Server Agent - Backend/API implementation
- Client Agent - SDK/frontend implementation
- Infrastructure Agent - Cloud/deployment
- Testing Agent - Test suites
- Documentation Agent - User guides
Workflow:
- Receive feature assignment with complete context
- Implement feature
- Run tests
- Create git commit
- Update feature status
Duration: 5-30 minutes per feature
File Structure Options
AVC supports two approaches for feature tracking:
Option 1: Individual Files per Feature (Recommended)
avc/tracking/features/
├── sprint-1/
│ ├── context.md
│ ├── feature-001.json
│ ├── feature-002.json
│ └── ...
├── sprint-2/
│ ├── context.md
│ └── ...
└── index.json (generated)

Benefits:
- Natural conflict prevention (different agents work on different files)
- Clear git history (one file per commit)
- File-system level atomicity
- Easy to query (find, grep)
Option 2: Single features.json File
avc/tracking/
├── features.json (all features)
├── features/
│ ├── sprint-1/
│ │ └── context.md
│ └── ...
└── claude-progress.txt

Benefits:
- Single source of truth
- Simpler querying
- Easier to understand for humans
Drawbacks:
- Requires locking mechanism for parallel agents
- Merge conflicts possible
See: docs/FILE_STRUCTURE.md for detailed comparison.
Shared Context Strategy
Key Principle
Features within the same sprint share the same implementation context.
Context File Structure
Each sprint folder contains a context.md file:
Contents:
- Sprint Overview - What this sprint accomplishes
- Complete Specifications - Detailed specs with code examples
- Implementation Patterns - Reusable code patterns
- Testing Strategy - How to test with expected outputs
- Dependencies - Required packages
- Success Criteria - What "done" looks like
Context Delivery Workflow
1. Controller selects feature-001
2. Controller reads:
- feature-001.json (feature name, file, test)
- sprint-1/context.md (complete specification)
3. Controller passes to Coding Agent:
- Feature assignment
- Full specification from context
- Implementation guidance
- Expected behaviors
4. Coding Agent implements (no searching needed)

Benefits
- No Duplication - Write specs once per sprint, not per feature
- Consistency - All features follow same patterns
- Parallelization - Features sharing context can run simultaneously
- 2-3x Faster - Agents spend ~70% of time implementing and ~30% searching, versus ~60% searching without shared context
See: docs/SHARED_CONTEXT_STRATEGY.md for complete details.
Workflow Diagram
┌─────────────────────────────────────────────┐
│ 1. Run Initializer Agent (Once) │
│ - Generate/verify feature files │
│ - Create tracking infrastructure │
│ - Create baseline commit │
└─────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────┐
│ 2. Run Controller Agent (Every Session) │
│ - Read claude-progress.txt │
│ - Review git log │
│ - Run baseline tests │
│ - Select next feature │
│ - Read sprint context │
│ - Spawn specialized coding agent │
└─────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────┐
│ 3. Coding Agent Implements Feature │
│ - Receive complete specification │
│ - Write code │
│ - Run tests │
│ - Create git commit │
│ - Update feature status │
└─────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────┐
│ 4. Controller Updates Tracking │
│ - Update index.json (if using) │
│ - Update claude-progress.txt │
│ - Select next feature │
└─────────────────────────────────────────────┘
↓
Repeat

Git Commit Format
Standard format for feature commits:
feat: [Feature name] - [brief description]
Feature ID: feature-XXX
Sprint: sprint-N
File: src/path/to/file.ts
Test: npm run test:unit -- ComponentName
Status: ✅ Tests passing
[Optional: Additional context or notes]
Co-Authored-By: [Agent Name] <noreply@anthropic.com>

Example:
feat: Create User interface - user type definition
Feature ID: feature-001
Sprint: sprint-1
File: src/types/User.ts
Test: npm run build
Status: ✅ Build passing
Co-Authored-By: Server Agent <noreply@anthropic.com>

Progress Tracking Files
claude-progress.txt
Human-readable session log:
=== Project Name - Progress Log ===
Session: 5
Completed: 96/247 (38.9%)
== Recent Activity ==
[2026-01-19] ✅ feature-015 COMPLETED
SessionManager.getAvailableSession()
Commit: abc123f
== Current Task ==
feature-016: SessionManager.isInCooldown()
Sprint: sprint-1
Status: in_progress
== Next Up ==
feature-017: CookieRefreshService setup
feature-018: OperationExecutor.search()

index.json (if using)
Progress summary:
{
"projectName": "Your Project",
"totalFeatures": 247,
"completedFeatures": 96,
"completionPercentage": 38.9,
"lastUpdated": "2026-01-19T10:30:00Z",
"sprints": [
{
"sprintId": "sprint-1",
"sprintName": "Foundation",
"totalFeatures": 68,
"completedFeatures": 68,
"completionPercentage": 100.0
}
]
}

init.sh
Environment verification script:
#!/bin/bash
# Verifies development environment is ready
check_node() { ... }
check_git() { ... }
check_dependencies() { ... }
setup_directories() { ... }

Benefits vs Monolithic Approach
| Aspect | Monolithic | AVC Agent Harness |
|---|---|---|
| Progress Visibility | Per sprint (1-2 weeks) | Per feature (5-30 min) |
| Resume Capability | Manual notes | Read tracking files |
| Testing Frequency | End of sprint | After every feature |
| Context Management | Entire sprint | One feature at a time |
| Failure Recovery | Lose week of work | Lose one feature |
| Quality Assurance | Final testing only | Continuous testing |
| Parallel Execution | Not possible | Features sharing context |
| Implementation Speed | Baseline | 2-3x faster with context |
Examples
Simple API Example
See examples/simple-api/ for a complete example project:
- REST API with user management
- 3 sprints, 10 features
- Complete sprint contexts
- Example feature files
Real-World Project
See the BWS X SDK Remote Sessions project for a production implementation:
- 247 features across 7 sprints
- 161KB of sprint context
- Parallel agent execution
- Complete integration testing
Tools and Scripts
query-pending.sh
Find next features to implement:
# Find all pending features
./scripts/query-pending.sh
# Find pending features in sprint-1
./scripts/query-pending.sh --sprint sprint-1
# Find next 5 pending features
./scripts/query-pending.sh --limit 5

rebuild-index.sh
Regenerate index.json from feature files:
./scripts/rebuild-index.sh
# Output: index.json regenerated (96/247 complete, 38.9%)

feature-status.sh
Check feature status:
./scripts/feature-status.sh feature-001
# Output: feature-001: completed (✅)

update-feature.sh
Update feature status:
./scripts/update-feature.sh feature-001 completed abc123f
# Output: feature-001 updated to completed

Best Practices
1. Keep Features Focused and Verifiable
- One clear deliverable per feature
- 5-30 minutes implementation time
- Independently testable
- Single git commit per stage
2. Test After Every Feature
- Don't batch testing to end of scope (test immediately after implementation)
- Catch issues early when context is fresh
- Maintain working state throughout development
3. Choose Appropriate Scope Hierarchy
- Use single level for small projects (<50 features)
- Add epic/module level for domain separation (100-500 features)
- Add sprint/phase level for iterative development
- Don't over-engineer - each level should add value
- Hierarchy should match your project's natural organization
- Context files at each level should have distinct content
4. Write Comprehensive Scope Context
- Complete specifications with code examples at each level
- Implementation patterns shared across scope
- Expected behaviors and validation criteria
- Testing strategies appropriate for the scope
- Inherit from parent, add specifics at child level
5. Use Specialized Agents
- Match agent expertise to task domain
- Clear responsibilities per agent type
- One feature per agent invocation
- Agents receive full scope context chain
6. Maintain Session Continuity
- Always read claude-progress.txt first
- Review git log before new work
- Run baseline tests before implementing
- Update tracking files after every feature
- Scope hierarchy persists across sessions
7. Enable Parallel Execution
- Features within same scope can run simultaneously
- Use individual feature files to prevent conflicts
- Shared scope context enables independence
- Coordinate via file-level locking or git commits
Auto-Generation
AVC supports auto-generating feature descriptions from sprint context.
See: docs/AUTO_GENERATION.md for:
- Pattern templates by feature type
- Auto-generation scripts
- Accuracy statistics (80-90%)
- Time savings analysis (12-37 hours)
Example Pattern:
# Type Definition Pattern
Pattern: "Create {Interface} interface"
Extract from context: Interface definition
Generate:
- Complete description with properties
- Test command
- Expected behaviors
- Context reference

Troubleshooting
Agent Can't Find Next Feature
Problem: Controller agent doesn't know which feature to implement next.
Solution:
# Check pending features
./scripts/query-pending.sh
# Verify dependencies are met
./scripts/feature-status.sh feature-XXX
# Check index.json is up to date
./scripts/rebuild-index.sh

Features Marked Complete but Code Missing
Problem: Feature status is "completed" but implementation is missing.
Solution:
# Check git log for feature commit
git log --grep="feature-XXX"
# If commit exists, code should be there
# If no commit, reset feature status
./scripts/update-feature.sh feature-XXX pending

Context Not Detailed Enough
Problem: Coding agent asks clarifying questions during implementation.
Solution:
- Enhance scope context.md files with more details (at appropriate level)
- Add code examples to context
- Include expected behaviors
- Specify error handling patterns
Parallel Agents Selecting Same Feature
Problem: Multiple agents pick the same feature.
Solution:
- Use individual feature files (natural file locking)
- Controller marks feature "in_progress" before spawning agent
- Run ONE Controller at a time
- Use file timestamps to detect conflicts
Agile Principles for AI-Agent Development
AVC adapts the 12 principles of the Agile Manifesto for AI-agent development, transforming human-centric practices into systematic processes:
Key Transformations
| Agile Principle | Traditional Agile | AVC Adaptation |
|---|---|---|
| Delivery frequency | Every 1-4 weeks | Every 5-30 minutes (per feature) |
| Collaboration | Daily face-to-face meetings | Asynchronous via comprehensive context files |
| Team motivation | Psychological safety & trust | Clear specifications & context quality |
| Progress measure | Working software | Working software + passing tests (enforced per feature) |
| Sustainable pace | Avoid developer burnout | Human validation capacity + economic sustainability |
| Technical excellence | Code reviews & pair programming | Consistent pattern application via shared context |
| Reflection | Team retrospectives | Context quality retrospectives (data-driven) |
Documentation
Core Documentation
- ARCHITECTURE.md - Complete framework architecture
- WORKFLOW.md - Agent workflow and coordination
- SHARED_CONTEXT_STRATEGY.md - Context strategy details
- AUTO_GENERATION.md - Auto-generating features
- FILE_STRUCTURE.md - File organization options
Templates
- templates/feature.json - Feature file template
- templates/context.md - Sprint context template
- templates/claude-progress.txt - Progress log template
- templates/init.sh - Environment setup template
Prompts
- prompts/initializer.md - Initializer agent prompt
- prompts/controller.md - Controller agent prompt
- prompts/coding-agent.md - Coding agent prompt template
Contributing
Contributions to improve AVC are welcome:
- Fork the repository
- Create a feature branch
- Implement your improvement
- Add tests/examples
- Submit a pull request
Areas for contribution:
- Additional examples
- Auto-generation scripts
- Enhanced templates
- Better documentation
- Tool improvements
References
Foundational Documents
- Effective Harnesses for Long-Running Agents - Anthropic's best practices for AI agent systems
- Agile Manifesto Principles - 12 principles for iterative software development
- Claude Code - AI coding assistant
Example Implementations
- Example Project: BWS X SDK Remote Sessions - Production implementation with 247 features
- Simple API Example - See examples/simple-api/ in this repository
License
MIT License - See LICENSE file for details
Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Examples: See examples/ directory