Spec-driven development with GSD for Claude Code
Tokens at the front of the context window are more effective than tokens at the end, so the longer you use Claude in a single session, the less efficient it becomes.
Claude Code has autocompact, but we can do better. GSD gives each task a fresh 200k context via subagent orchestration.
An open-source system (12.8k ★) that adds spec-driven development and multi-agent orchestration to Claude Code via slash commands.
XML task plans with built-in verification, so Claude works from explicit criteria instead of guessing
PROJECT.md, STATE.md, REQUIREMENTS.md keep Claude oriented across sessions
Each task runs in a fresh 200k context — your main window stays lean
Built-in checks at every step — plans, execution, and UAT
Works with Claude Code, OpenCode, and Gemini CLI. Mac, Windows, Linux.
💡 Best with claude --dangerously-skip-permissions — GSD automates many small commands; approving each one defeats the purpose.
Five commands that take you from idea to shipped, verified code. Each step feeds the next — decisions compound, context stays fresh.
Each step creates files that the next step reads. That's the context engineering — Claude is never starting cold.
One command extracts everything Claude needs.
Creates: PROJECT.md (big picture — vision, values, requirements), ROADMAP.md (tactical — exact phases + tasks), STATE.md (living doc — progress, metrics, session handoff)
💡 These are living documents — validated requirements get marked as verified as you go.
The most underrated step. Shapes implementation before any code is written.
Creates: {phase}-CONTEXT.md — feeds into research + planning
Skip → defaults. Use it → your exact vision.
Creates: {phase}-RESEARCH.md, {phase}-{N}-PLAN.md
Each plan is a dense XML prompt — max 3 tasks — small enough for one fresh 200k context window. Verification criteria built in.
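To make the plan constraints concrete, here is a minimal Python sketch that checks a plan for the three-task cap and for per-task verification criteria. The tag names (`plan`, `task`, `name`, `action`, `verify`) and the sample tasks are assumptions for illustration; the text above does not show GSD's actual schema.

```python
import xml.etree.ElementTree as ET

MAX_TASKS = 3  # cap stated in the text above

# Hypothetical plan: tag names and task contents are illustrative only.
SAMPLE_PLAN = """
<plan phase="01" number="1">
  <task>
    <name>Create login endpoint</name>
    <action>Add a POST /login handler that validates credentials.</action>
    <verify>Valid credentials return 200; invalid ones return 401.</verify>
  </task>
  <task>
    <name>Add session cookie</name>
    <action>Set an httpOnly cookie on successful login.</action>
    <verify>Response includes a Set-Cookie header.</verify>
  </task>
</plan>
"""


def check_plan(xml_text: str) -> list[str]:
    """Return a list of problems; an empty list means the plan passes."""
    root = ET.fromstring(xml_text)
    tasks = root.findall("task")
    problems = []
    if not tasks:
        problems.append("plan has no tasks")
    if len(tasks) > MAX_TASKS:
        problems.append(f"plan has {len(tasks)} tasks, max is {MAX_TASKS}")
    for i, task in enumerate(tasks, start=1):
        # Every task must carry non-empty verification criteria.
        verify = task.findtext("verify", default="").strip()
        if not verify:
            problems.append(f"task {i} has no verification criteria")
    return problems


if __name__ == "__main__":
    print(check_plan(SAMPLE_PLAN))
```

A check like this could run before a plan is handed to an executor, so an oversized or unverifiable plan is caught while it is still cheap to fix.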
Walk away. Come back to completed work.
Tests pass — but does it actually work?
If needed, execution pauses mid-build to ask you to verify something manually. After your thumbs up, it creates a summary file and commits.
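The file handoff between steps can be pictured as a small dependency chain. The sketch below is illustrative only: the step labels (`new-project`, `discuss`, `plan`, `execute`, `verify`) and the read sets are assumptions, and `{phase}-{N}-SUMMARY.md` is a made-up name for the summary file. The other file names come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str                 # hypothetical label, not a real command name
    reads: tuple[str, ...]    # files the step needs (assumed)
    creates: tuple[str, ...]  # files the step writes


# File names come from the text; which step reads what is an assumption.
PIPELINE = (
    Step("new-project", (), ("PROJECT.md", "ROADMAP.md", "STATE.md")),
    Step("discuss", ("PROJECT.md", "ROADMAP.md"), ("{phase}-CONTEXT.md",)),
    Step("plan", ("{phase}-CONTEXT.md",),
         ("{phase}-RESEARCH.md", "{phase}-{N}-PLAN.md")),
    Step("execute", ("{phase}-{N}-PLAN.md",), ("{phase}-{N}-SUMMARY.md",)),
    Step("verify", ("{phase}-{N}-SUMMARY.md", "STATE.md"), ()),
)


def missing_inputs(pipeline: tuple[Step, ...]) -> dict[str, list[str]]:
    """Map each step to the files it reads that no earlier step created."""
    available: set[str] = set()
    gaps: dict[str, list[str]] = {}
    for step in pipeline:
        missing = [f for f in step.reads if f not in available]
        if missing:
            gaps[step.name] = missing
        available.update(step.creates)
    return gaps


if __name__ == "__main__":
    # An empty dict means no step starts cold.
    print(missing_inputs(PIPELINE))
```

Dropping the first step from the chain makes later steps report missing inputs, which is the "starting cold" situation the file handoff is meant to avoid.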
At every stage, a thin orchestrator spawns specialized agents, waits for them, and integrates their results.
| Stage | Agents |
|---|---|
| Research | 4× parallel (stack, features, arch, pitfalls) |
| Planning | Planner → checker → loop until pass |
| Execution | N× parallel executors (fresh 200k each) |
| Verification | Verifier + debuggers |
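
The planner → checker loop in the table can be sketched in a few lines of Python. This is a toy model, not GSD's code: `plan_until_pass`, the two callables, and the `max_rounds` safety cap are all assumptions, and the toy planner and checker stand in for real subagents.

```python
from typing import Callable, Optional

Planner = Callable[[str, Optional[str]], str]  # (goal, feedback) -> plan
Checker = Callable[[str], Optional[str]]       # plan -> feedback, None if it passes


def plan_until_pass(goal: str, planner: Planner, checker: Checker,
                    max_rounds: int = 5) -> str:
    """Re-plan with the checker's feedback until the checker passes the plan."""
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        plan = planner(goal, feedback)
        feedback = checker(plan)
        if feedback is None:
            return plan
    # Assumed safety cap so a stubborn checker cannot loop forever.
    raise RuntimeError(f"no passing plan after {max_rounds} rounds")


# Toy stand-ins for the planner and checker agents.
def toy_planner(goal: str, feedback: Optional[str]) -> str:
    plan = f"1. implement {goal}"
    if feedback:  # address the checker's complaint on the next round
        plan += "\n2. verify: run the tests"
    return plan


def toy_checker(plan: str) -> Optional[str]:
    return None if "verify" in plan else "plan lacks a verification step"


if __name__ == "__main__":
    print(plan_until_pass("login form", toy_planner, toy_checker))
```

Here the first draft fails the check, the feedback goes back to the planner, and the second draft passes.
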
Main context stays at 30–40%. The work happens in subagents.
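Why the main window stays small: the heavy token spend happens inside each subagent, and only a short summary comes back. The Python sketch below models that with threads. The 200k limit is from the text; every other number (tokens used per executor, summary cost, starting main usage) is invented for illustration, and `run_executor` is a stand-in rather than a real subagent call.

```python
from concurrent.futures import ThreadPoolExecutor

CONTEXT_LIMIT = 200_000  # per-context size quoted in the text


def run_executor(plan: str) -> dict:
    """Stand-in for a subagent doing heavy work in its own fresh context."""
    tokens_used = 120_000  # made-up figure, spent in the subagent only
    return {"plan": plan, "summary": f"{plan}: done", "tokens_used": tokens_used}


def orchestrate(plans: list[str], main_tokens: int,
                summary_cost: int = 500) -> tuple[list[dict], int]:
    """Run plans in parallel; only a short summary per plan returns to main."""
    with ThreadPoolExecutor(max_workers=max(1, len(plans))) as pool:
        results = list(pool.map(run_executor, plans))
    # The main context pays for the summaries, not for the executors' work.
    main_tokens += summary_cost * len(results)
    return results, main_tokens


if __name__ == "__main__":
    results, main_tokens = orchestrate(["01-1", "01-2", "01-3"], main_tokens=60_000)
    spent_in_subagents = sum(r["tokens_used"] for r in results)
    print(f"subagents spent {spent_in_subagents} tokens")
    print(f"main context at {main_tokens / CONTEXT_LIMIT:.0%}")
```

With these invented numbers, three executors burn 360,000 tokens between them while the main context grows by only 1,500.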
Model profiles: use /gsd:set-profile budget for exploration and /gsd:set-profile quality for final builds.
Without GSD:
"build me a SaaS app"
→ 50 messages of clarification
→ context window at 95%
→ quality degrades with every message
→ no git history, no tests
→ works until it doesn't
With GSD:
/gsd:new-project → describe idea
→ structured research + roadmap
→ fresh 200k context per task
→ atomic commits, auto-verification
→ main window at 30–40%
→ reliable at any scale
For tasks that don't need the full loop. The same agents at the same quality, minus the research and verification steps.
Use for: bug fixes, config changes, one-off tasks.
| Profile | Plan | Exec | Verify |
|---|---|---|---|
| quality | Opus | Opus | Sonnet |
| balanced | Opus | Sonnet | Sonnet |
| budget | Sonnet | Sonnet | Haiku |
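
The profile table reads naturally as a lookup. A minimal Python sketch, with the values copied from the table; the `PROFILES` dict and the `model_for` helper are hypothetical names for illustration, not part of GSD.

```python
# Values copied from the table above; structure and names are illustrative.
PROFILES = {
    "quality":  {"plan": "Opus",   "exec": "Opus",   "verify": "Sonnet"},
    "balanced": {"plan": "Opus",   "exec": "Sonnet", "verify": "Sonnet"},
    "budget":   {"plan": "Sonnet", "exec": "Sonnet", "verify": "Haiku"},
}


def model_for(profile: str, stage: str) -> str:
    """Return the model tier a profile assigns to a stage."""
    try:
        return PROFILES[profile][stage]
    except KeyError:
        raise ValueError(f"unknown profile/stage: {profile}/{stage}") from None


if __name__ == "__main__":
    for name in PROFILES:
        print(name, [model_for(name, s) for s in ("plan", "exec", "verify")])
```
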
Session management: /gsd:pause-work + /gsd:resume-work