
Test Driven Development
Make your coding agent follow a strict test-first red-green-refactor cycle before it writes production code for any feature, bugfix, or refactor.
Overview
Test-Driven Development is an agent skill most often used in Build (also Ship testing) that enforces writing a failing test before any production code for features, bugfixes, and refactors.
Install
npx skills add https://github.com/obra/superpowers --skill test-driven-development
What is this skill?
- Iron law: no production code without a failing test first—delete any code written ahead of tests
- Red-green-refactor cycle with explicit verify-fail and verify-pass checkpoints
- Clear exceptions only for prototypes, generated code, and configuration after human approval
Adoption & trust: 118k installs on skills.sh; 221k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
Your agent writes implementation first, so you cannot tell whether tests actually catch regressions or match real behavior.
Who is it for?
Solo builders who want disciplined, verifiable increments when pairing with an agent on real product code.
Skip if: Throwaway spikes, one-off generated scaffolding, or config-only edits where you have explicitly agreed to skip TDD.
When should I use this skill?
Use when implementing any feature or bugfix, before writing implementation code.
What do I get? / Deliverables
You get a red-green-refactor loop where every change starts with a verified failing test, then minimal passing code, then cleanup.
- A failing test that proves the intended behavior or bug
- Minimal production code that makes the test pass
- Optional refactor with tests still green
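As a rough illustration of those three deliverables, here is a minimal sketch in plain TypeScript. The `slugify` feature and the `check` and `outcome` helpers are hypothetical names invented for this example and are not part of the skill; a real project would use its own test runner rather than these stand-ins.

```typescript
// Hypothetical sketch: all names here are illustrative, not part of the skill.

// Minimal stand-in for a test-runner assertion.
function check(actual: string, expected: string): void {
  if (actual !== expected) {
    throw new Error(`expected "${expected}", got "${actual}"`);
  }
}

// Deliverable 1: the test, written before the feature exists.
function slugifyTest(slugify: (title: string) => string): void {
  check(slugify("Hello Big World"), "hello-big-world");
}

// Reports whether the test passes against a given implementation.
function outcome(impl: (title: string) => string): "pass" | "fail" {
  try {
    slugifyTest(impl);
    return "pass";
  } catch {
    return "fail";
  }
}

// Placeholder standing in for the missing feature, so the test can be seen failing.
const notYetImplemented = (_title: string): string => "";

// Deliverable 2: the simplest code that makes the test pass.
function slugifyMinimal(title: string): string {
  return title.toLowerCase().split(" ").join("-");
}

// Deliverable 3: an optional cleanup that must keep the same test green.
function slugifyRefactored(title: string): string {
  return title.toLowerCase().replace(/ /g, "-");
}

console.log(outcome(notYetImplemented)); // fail
console.log(outcome(slugifyMinimal));    // pass
console.log(outcome(slugifyRefactored)); // pass
```

The same test function is reused at every step, which is what makes the refactor safe to attempt: if the cleanup changed behavior, the third line would report a failure.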
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
TDD binds to the implementation loop—features and fixes—so the canonical shelf is Build where solo builders ship code daily, even though the discipline also hardens quality before release. Backend is the default implementation shelf for language-agnostic TDD workflows that apply equally to APIs, services, and shared modules before UI-specific work is singled out.
Where it fits
Add an API endpoint by defining a failing integration test for the contract before touching handler code.
Change UI behavior by writing a failing component or interaction test before editing the view logic.
Fix a production bug by reproducing it with a failing test, then ship the minimal patch that turns it green.
Wire a third-party webhook by asserting expected payloads in a test before implementing the adapter.
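The bug-fix scenario above might look something like the following sketch. The `parsePrice` functions and the thousands-separator bug are hypothetical, chosen only to show a reproducing test that fails against the current code and passes once the minimal patch is in place.

```typescript
// Hypothetical sketch: the bug and all names are invented for illustration.

// Current production code: parseFloat stops at the first comma,
// so "1,299.50" is read as 1.
function parsePriceBuggy(input: string): number {
  return parseFloat(input);
}

// Minimal patch: strip thousands separators before parsing, nothing more.
function parsePricePatched(input: string): number {
  return parseFloat(input.replace(/,/g, ""));
}

// The reproducing test, written before the patch.
function reproducesReportedBug(parsePrice: (input: string) => number): void {
  const actual = parsePrice("1,299.50");
  if (actual !== 1299.5) {
    throw new Error(`expected 1299.5, got ${actual}`);
  }
}

// Runs the test and reports the result instead of throwing.
function passes(parsePrice: (input: string) => number): boolean {
  try {
    reproducesReportedBug(parsePrice);
    return true;
  } catch {
    return false;
  }
}

console.log(passes(parsePriceBuggy));   // false: the test reproduces the bug
console.log(passes(parsePricePatched)); // true: the patch turns it green
```

Seeing the test fail against the unpatched function is what shows that it exercises the reported bug rather than behavior that already works.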
How it compares
Use instead of post-hoc test generation after the agent has already written the feature.
Common Questions / FAQ
Who is test-driven-development for?
It is for solo and indie builders who implement features and fixes with AI coding agents and want test-first discipline rather than implementation-first guesses.
When should I use test-driven-development?
Use it in Build when adding APIs, modules, or integrations; in Build frontend work for behavior changes; and in Ship testing when fixing regressions. Always apply it before writing production code, unless you and your human partner have agreed to exempt a throwaway prototype, generated code, or configuration files.
Is test-driven-development safe to install?
Review the Security Audits panel on this Prism page for pass/fail signals and inspect the SKILL.md in your repo before enabling it in automated agent runs.
SKILL.md
SKILL.md - Test Driven Development
# Test-Driven Development (TDD)

## Overview

Write the test first. Watch it fail. Write minimal code to pass.

**Core principle:** If you didn't watch the test fail, you don't know if it tests the right thing.

**Violating the letter of the rules is violating the spirit of the rules.**

## When to Use

**Always:**
- New features
- Bug fixes
- Refactoring
- Behavior changes

**Exceptions (ask your human partner):**
- Throwaway prototypes
- Generated code
- Configuration files

Thinking "skip TDD just this once"? Stop. That's rationalization.

## The Iron Law

```
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
```

Write code before the test? Delete it. Start over.

**No exceptions:**
- Don't keep it as "reference"
- Don't "adapt" it while writing tests
- Don't look at it
- Delete means delete

Implement fresh from tests. Period.

## Red-Green-Refactor

```dot
digraph tdd_cycle {
    rankdir=LR;
    red [label="RED\nWrite failing test", shape=box, style=filled, fillcolor="#ffcccc"];
    verify_red [label="Verify fails\ncorrectly", shape=diamond];
    green [label="GREEN\nMinimal code", shape=box, style=filled, fillcolor="#ccffcc"];
    verify_green [label="Verify passes\nAll green", shape=diamond];
    refactor [label="REFACTOR\nClean up", shape=box, style=filled, fillcolor="#ccccff"];
    next [label="Next", shape=ellipse];

    red -> verify_red;
    verify_red -> green [label="yes"];
    verify_red -> red [label="wrong\nfailure"];
    green -> verify_green;
    verify_green -> refactor [label="yes"];
    verify_green -> green [label="no"];
    refactor -> verify_green [label="stay\ngreen"];
    verify_green -> next;
    next -> red;
}
```

### RED - Write Failing Test

Write one minimal test showing what should happen.

<Good>
```typescript
test('retries failed operations 3 times', async () => {
  let attempts = 0;
  const operation = () => {
    attempts++;
    if (attempts < 3) throw new Error('fail');
    return 'success';
  };

  const result = await retryOperation(operation);

  expect(result).toBe('success');
  expect(attempts).toBe(3);
});
```
Clear name, tests real behavior, one thing
</Good>

<Bad>
```typescript
test('retry works', async () => {
  const mock = jest.fn()
    .mockRejectedValueOnce(new Error())
    .mockRejectedValueOnce(new Error())
    .mockResolvedValueOnce('success');
  await retryOperation(mock);
  expect(mock).toHaveBeenCalledTimes(3);
});
```
Vague name, tests mock not code
</Bad>

**Requirements:**
- One behavior
- Clear name
- Real code (no mocks unless unavoidable)

### Verify RED - Watch It Fail

**MANDATORY. Never skip.**

```bash
npm test path/to/test.test.ts
```

Confirm:
- Test fails (not errors)
- Failure message is expected
- Fails because feature missing (not typos)

**Test passes?** You're testing existing behavior. Fix test.

**Test errors?** Fix error, re-run until it fails correctly.

### GREEN - Minimal Code

Write simplest code to pass the test.

<Good>
```typescript
async function retryOperation<T>(fn: () => Promise<T>): Promise<T> {
  for (let i = 0; i < 3; i++) {
    try {
      return await fn();
    } catch (e) {
      if (i === 2) throw e;
    }
  }
  throw new Error('unreachable');
}
```
Just enough to pass
</Good>

<Bad>
```typescript
async function retryOperation<T>(
  fn: () => Promise<T>,
  options?: {
    maxRetries?: number;
    backoff?: 'linear' | 'exponential';
    onRetry?: (attempt: number) => void;
  }
): Promise<T> {
  // YAGNI
}
```
Over-engineered
</Bad>

Don't add features, refactor other code, or "improve" beyond the test.

### Verify GREEN - Watch It Pass

**MANDATORY.**

```bash
npm test path/to/test.test.ts
```

Confirm:
- Test passes
- Other tests still pass
- Output pristine (no errors, warnings)

**Test fails?** Fix code, not test.

**Other tests fail?** Fix now.

### REFACTOR - Clean Up

After green only:
- Remove du