
Test Driven Development
Install this so your coding agent always writes a failing test before production code for features, fixes, and refactors.
Overview
Test-Driven Development is an agent skill most often used in Ship (also Build) that forces failing tests before any production code so agents cannot fake coverage.
Install
npx skills add https://github.com/neolabhq/context-engineering-kit --skill test-driven-developmentWhat is this skill?
- Iron law: no production code without a failing test first—delete any code written ahead of tests
- Red-Green-Refactor cycle with explicit verify-fail and verify-pass checkpoints
- Applies by default to new features, bug fixes, refactoring, and behavior changes
- Exceptions limited to throwaway prototypes, generated code, and configuration (human partner approval)
- Anti-rationalization guardrails when tempted to skip TDD once
- Core rule: no production code without a failing test first
- Default scope: new features, bug fixes, refactoring, and behavior changes
Adoption & trust: 564 installs on skills.sh; 1.1k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You need a feature or fix shipped fast, but agent-written tests often pass without ever failing—so you cannot trust they guard the right behavior.
Who is it for?
Solo builders who want agent-implemented features and bugfixes backed by tests that demonstrably failed before the fix.
Skip if: Throwaway prototypes, bulk-generated code, or pure config edits where you and your partner explicitly skip TDD—or work where no automated test harness exists yet.
When should I use this skill?
Use when implementing any feature or bugfix, before writing implementation code—write the test first, watch it fail, write minimal code to pass.
What do I get? / Deliverables
Every change follows red-green-refactor with observed failure first, minimal passing code, and refactor only on green—leaving a test suite that actually proved what it checks.
- Failing test committed or run before implementation
- Minimal production code that passes the new test
- Optional refactor step with full suite still green
Recommended Skills
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
TDD is cataloged under Ship because verified behavior and test-first discipline are the gate before you trust releases—implementation happens in Build, but the skill’s iron law is test verification. Testing is the canonical shelf: red-green-refactor, mandatory failure observation, and no production code without a failing test first.
Where it fits
Add an API endpoint by defining a failing integration test, then implement only enough handler code to turn it green.
Change UI behavior by writing a failing component test before touching the React view.
Reproduce a production bug with a failing test, fix minimally, and refactor only after the suite is green.
Patch a regression by requiring a new failing test that captures the reported failure mode before any hotfix.
How it compares
Use instead of letting the agent write implementation first and backfill tests that never failed.
Common Questions / FAQ
Who is test-driven-development for?
Indie and solo developers using AI coding agents who want disciplined red-green-refactor on every meaningful code change, not optional testing advice.
When should I use test-driven-development?
During Build when implementing features or refactors, and during Ship when fixing bugs or changing behavior—always before production code except approved prototype/generated/config cases.
Is test-driven-development safe to install?
It is procedural guidance only; review the Security Audits panel on this Prism page before installing any skill from the repo.
SKILL.md
READMESKILL.md - Test Driven Development
# Test-Driven Development (TDD) ## Overview Write the test first. Watch it fail. Write minimal code to pass. **Core principle:** If you didn't watch the test fail, you don't know if it tests the right thing. **Violating the letter of the rules is violating the spirit of the rules.** ## When to Use **Always:** - New features - Bug fixes - Refactoring - Behavior changes **Exceptions (ask your human partner):** - Throwaway prototypes - Generated code - Configuration files Thinking "skip TDD just this once"? Stop. That's rationalization. ## The Iron Law ``` NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST ``` Write code before the test? Delete it. Start over. **No exceptions:** - Don't keep it as "reference" - Don't "adapt" it while writing tests - Don't look at it - Delete means delete Implement fresh from tests. Period. ## Red-Green-Refactor ```dot digraph tdd_cycle { rankdir=LR; red [label="RED\nWrite failing test", shape=box, style=filled, fillcolor="#ffcccc"]; verify_red [label="Verify fails\ncorrectly", shape=diamond]; green [label="GREEN\nMinimal code", shape=box, style=filled, fillcolor="#ccffcc"]; verify_green [label="Verify passes\nAll green", shape=diamond]; refactor [label="REFACTOR\nClean up", shape=box, style=filled, fillcolor="#ccccff"]; next [label="Next", shape=ellipse]; red -> verify_red; verify_red -> green [label="yes"]; verify_red -> red [label="wrong\nfailure"]; green -> verify_green; verify_green -> refactor [label="yes"]; verify_green -> green [label="no"]; refactor -> verify_green [label="stay\ngreen"]; verify_green -> next; next -> red; } ``` ### RED - Write Failing Test Write one minimal test showing what should happen. <Good> ```typescript test('retries failed operations 3 times', async () => { let attempts = 0; const operation = () => { attempts++; if (attempts < 3) throw new Error('fail'); return 'success'; }; const result = await retryOperation(operation); expect(result).toBe('success'); expect(attempts).toBe(3); }); ``` Clear name, tests real behavior, one thing </Good> <Bad> ```typescript test('retry works', async () => { const mock = jest.fn() .mockRejectedValueOnce(new Error()) .mockRejectedValueOnce(new Error()) .mockResolvedValueOnce('success'); await retryOperation(mock); expect(mock).toHaveBeenCalledTimes(3); }); ``` Vague name, tests mock not code </Bad> **Requirements:** - One behavior - Clear name - Real code (no mocks unless unavoidable) ### Verify RED - Watch It Fail **MANDATORY. Never skip.** ```bash npm test path/to/test.test.ts ``` Confirm: - Test fails (not errors) - Failure message is expected - Fails because feature missing (not typos) **Test passes?** You're testing existing behavior. Fix test. **Test errors?** Fix error, re-run until it fails correctly. ### GREEN - Minimal Code Write simplest code to pass the test. <Good> ```typescript async function retryOperation<T>(fn: () => Promise<T>): Promise<T> { for (let i = 0; i < 3; i++) { try { return await fn(); } catch (e) { if (i === 2) throw e; } } throw new Error('unreachable'); } ``` Just enough to pass </Good> <Bad> ```typescript async function retryOperation<T>( fn: () => Promise<T>, options?: { maxRetries?: number; backoff?: 'linear' | 'exponential'; onRetry?: (attempt: number) => void; } ): Promise<T> { // YAGNI } ``` Over-engineered </Bad> Don't add features, refactor other code, or "improve" beyond the test. ### Verify GREEN - Watch It Pass **MANDATORY.** ```bash npm test path/to/test.test.ts ``` Confirm: - Test passes - Other tests still pass - Output pristine (no e