Method Comparison Notes
Date: 2026-05-15
Question
What structure should govern AI-assisted coding so the system works across projects instead of changing by taste each time?
Conclusion
Use spec/ as the repository-native Intent Specification Layer.
- spec is the operational name: a verifiable promise that code must satisfy.
- intent is the philosophy: preserving human/system intention so the AI does not have to guess.
- layer is the architecture concept: L0 Constitution, L1 Domain, L2 Behavior, L3 Contracts.
- The method is bidirectional: it guides implementation before code exists and exposes missing user journey, failure, and recovery behavior after code exists.
- REQ-IDs need a generated test bridge. Otherwise traceability remains a manual convention and can drift silently.
Investigated Alternatives
Project Rule Files
Examples: AGENTS.md, CLAUDE.md, .cursorrules, Kiro steering files.
Verdict: necessary but insufficient. They provide ambient constraints but do not capture feature-specific behavior, unwanted cases, or rollback contracts.
PRD
Verdict: useful for product alignment, insufficient as an AI implementation contract. PRDs often assume humans will fill gaps with organizational context. AI agents fill gaps too, but not reliably in the intended direction.
GitHub Spec Kit
Observed facts:
- Treats specifications as the central source of truth.
- Uses a constitution to constrain architecture and implementation.
- Emphasizes implementable specs, plans, and tasks.
- Uses templates and commands to reduce ambiguity and premature implementation.
Verdict: strong workflow and philosophy. Best used as a tool that consumes
spec/, not as the only structure that owns project intent.
Kiro Feature Specs
Observed facts:
- Supports requirements-first and design-first workflows.
- Requirements use EARS notation.
- Emphasizes clarity, testability, traceability, and completeness.
Verdict: validates EARS as a practical AI-coding requirement syntax. Avoid locking the repository source layer to one IDE’s folder convention.
OpenSpec
Observed facts:
- Uses change proposals before code.
- Describes deltas and archives accepted changes.
- Works well for brownfield evolution.
Verdict: strong delta workflow. Borrow the changes/ idea, but keep current
governing specs explicit.
BMAD Method
Observed facts:
- Provides structured workflows and specialized roles.
- Targets full lifecycle planning and implementation.
- Useful when planning itself is large.
Verdict: useful for complex planning, too heavy as the default for ordinary
feature work. Its outputs should still anchor back to spec/.
Augment Intent
Observed facts:
- Treats specs as source of truth.
- Tries to reduce drift with agent-updated specs.
- Uses coordinator and specialist agents.
Verdict: directly attacks spec drift. The repository folder should remain
tool-neutral as spec/.
EARS
Observed facts:
- EARS reduces ambiguity, complexity, and vagueness in natural-language requirements.
- The unwanted-behavior pattern is valuable because undesired situations are a common source of omissions.
Verdict: best Layer 2 syntax because it keeps natural language while forcing testable behavior categories.
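To make the "behavior categories" concrete, the sketch below classifies a sentence into one of the five basic EARS patterns by its opening keyword. The function names and sample requirements are hypothetical, and the keyword matching is a rough heuristic rather than a full EARS parser: compound requirements (for example "While ..., when ...") are reported under their first keyword only.

```python
import re
from typing import Optional

# The five basic EARS patterns, identified by the keyword that opens the sentence.
# Conditional patterns are checked before the ubiquitous fallback.
PATTERNS = [
    ("unwanted", re.compile(r"^if\b.+\bthen\b.+\bshall\b", re.IGNORECASE)),
    ("event", re.compile(r"^when\b.+\bshall\b", re.IGNORECASE)),
    ("state", re.compile(r"^while\b.+\bshall\b", re.IGNORECASE)),
    ("optional", re.compile(r"^where\b.+\bshall\b", re.IGNORECASE)),
    ("ubiquitous", re.compile(r"^the\b.+\bshall\b", re.IGNORECASE)),
]


def classify(requirement: str) -> Optional[str]:
    """Return the EARS pattern name, or None if the sentence fits no pattern."""
    text = requirement.strip()
    for name, pattern in PATTERNS:
        if pattern.search(text):
            return name
    return None


def missing_categories(requirements: list) -> list:
    """Report which EARS categories a requirement set never uses."""
    used = {classify(r) for r in requirements}
    return [name for name, _ in PATTERNS if name not in used]


# Hypothetical requirements for illustration.
reqs = [
    "The service shall log every request.",
    "When the user clicks save, the editor shall persist the draft.",
    "While offline, the client shall queue writes.",
    "Uploads should be fast.",  # not EARS: no trigger, no 'shall'
]

for r in reqs:
    print(classify(r), "<-", r)
print(missing_categories(reqs))  # ['unwanted', 'optional']
```

A report like the last line is where the unwanted-behavior pattern earns its place: an empty "unwanted" category is a prompt to ask what should happen on failure.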
Local Empirical Checks
Two local checks are included:
npm run benchmark
- Measures whether a candidate spec explicitly contains behavior, vocabulary, interface, constitution, and traceability signals.
- Result: PRD 0.000, BDD 0.100, EARS 0.513, Domain+EARS 0.750, Full Spec Layer 1.000.
npm run simulate
- Uses concrete implementations constrained to each spec level and runs the same 15 tests against all of them.
- Result: PRD 2/15, BDD 5/15, EARS 11/15, Domain+EARS 12/15, Full Spec Layer 15/15.
The simulation is not fully blind because the same author created the experiment. Its value is diagnostic: failures appear exactly where a layer is missing.
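The kind of signal check the benchmark performs could be approximated as follows. This is a sketch under stated assumptions, not the repository's benchmark: the detector regexes, the equal weighting, and the sample texts are all hypothetical, and the reported scores (for example 0.513) show that the real scoring is not a simple five-way fraction.

```python
import re

# Hypothetical detectors, one per signal named in the benchmark description.
SIGNALS = {
    "behavior": re.compile(r"\bshall\b", re.IGNORECASE),
    "vocabulary": re.compile(r"^glossary\b|^term:", re.IGNORECASE | re.MULTILINE),
    "interface": re.compile(r"^(input|output|endpoint):", re.IGNORECASE | re.MULTILINE),
    "constitution": re.compile(r"\bconstitution\b|\bmust never\b", re.IGNORECASE),
    "traceability": re.compile(r"\bREQ-[A-Z]+-\d+\b"),
}


def signal_report(spec_text: str) -> dict:
    """Map each signal name to whether the text explicitly contains it."""
    return {name: bool(p.search(spec_text)) for name, p in SIGNALS.items()}


def coverage(spec_text: str) -> float:
    """Equal-weight fraction of signals present (an assumed scoring rule)."""
    report = signal_report(spec_text)
    return sum(report.values()) / len(report)


# Hypothetical candidate specs.
prd_like = "Users want a faster checkout. We should make payment simple."
full_layer = """Constitution: the service must never store card numbers.
Term: Cart = the set of items a shopper intends to buy.
REQ-PAY-001: When the shopper confirms payment, the system shall create an order.
Endpoint: POST /orders
"""

print(coverage(prd_like))    # 0.0
print(coverage(full_layer))  # 1.0
print(signal_report(prd_like))
```

The per-signal report is the diagnostic part: it names the layer a candidate spec is missing instead of only producing a number.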