Test Driven Development

Name: Test Driven Development
Author: addyosmani

addyosmani/agent-skills

Enforce RED-GREEN-REFACTOR so every behavior change and bug fix is proven by a failing test first.

Overview

Test-Driven Development is a journey-wide agent skill that drives implementation and bug fixes through a failing-test-first RED-GREEN-REFACTOR cycle—usable whenever a solo builder needs to prove code works before committ

Install

npx skills add https://github.com/addyosmani/agent-skills --skill test-driven-development

What is this skill?

Three-step TDD cycle: RED write failing test, GREEN minimal pass, REFACTOR with tests still green
Prove-It Pattern: reproduce bugs with a test before attempting a fix
Applies to new logic, bug fixes, behavior changes, and edge-case handling
Explicit when NOT to use: pure config, docs, or static content with no behavioral impact
Optional browser verification via Chrome DevTools MCP for UI-related changes
3-step TDD cycle: RED, GREEN, REFACTOR

Compatible agents: Claude Code, Cursor, Codex, Windsurf

Adoption & trust: 4.1k installs on skills.sh; 49.1k GitHub stars; 2/3 security scanners passed (skills.sh audits); trending (+200% hot-view momentum).

What problem does it solve?

You or your agent ship changes that look correct in chat but break edge cases because nothing objectively proved the behavior first.

Who is it for?

Solo builders using AI agents on feature work and regressions who want tests as the contract for what "done" means.

Skip if: Pure documentation edits, static content, or configuration-only tweaks with no runtime behavior to assert.

When should I use this skill?

Implementing any logic, fixing any bug, changing behavior, adding edge cases, or modifying functionality that could break existing behavior.

What do I get? / Deliverables

Every logic change and bug fix is anchored by tests that failed then passed, giving you refactor-safe proof instead of anecdotal confidence.

Failing then passing tests per change
Minimal implementation per green step
Refactored code with sustained green suite

Recommended Skills

Agent Browservercel-labs/open-agents

agent-browser is a Vercel Open Agents skill that wraps a CLI for programmatic browser control—ideal when solo builders n…404k installs·5.6k stars

Tddmattpocock/skills

TDD is an agent skill that coaches test-driven development using the red-green-refactor loop for solo and indie builders…214k installs·121k stars

Use My Browserxixu-me/skills

Use My Browser skill forces agents to classify tasks as static-capable or browser-required before choosing tools—staying…198k installs·61 stars

Test Driven Developmentobra/superpowers

Test-Driven Development is an agent skill from obra/superpowers that forces a test-first implementation ritual: write a …118k installs·221k stars

Verification Before Completionobra/superpowers

Verification Before Completion is an agent skill from the Superpowers lineage that blocks premature success claims durin…100k installs·221k stars

Webapp Testinganthropics/skills

webapp-testing is an agent skill for solo builders who need to prove that a local web application actually works—not jus…90.9k installs·148k stars

Journey fit

Useful at every journey phase - explore requirements and options before committing to a direction.

Where it fits

Example use

BuildUI/UX & frontend

Add a checkout validation rule only after a failing unit or component test defines the expected error.

Example use

ValidatePrototype & spike

Lock prototype behavior with thin tests before expanding the spike into production code.

Example use

ShipTesting & QA

Expand edge-case coverage while implementing the minimal green pass for each new scenario.

Example use

OperateIteration & experiments

Reproduce a production bug with a test, then patch and refactor under green CI.

How it compares

Process discipline for test-first coding—not a test runner installer or a one-shot coverage report generator.

Common Questions / FAQ

Who is test-driven-development for?

Indie developers and agent users who implement features or fix bugs and want a repeatable ritual so AI-generated code stays verifiable.

When should I use test-driven-development?

In build when adding logic, in ship when fixing bugs or running review prep, and in operate when patching production regressions—anytime behavior might change.

Is test-driven-development safe to install?

It guides workflow only; check the Security Audits panel on this Prism page for the parent repo before install.

SKILL.md

READMESKILL.md - Test Driven Development

# Test-Driven Development

## Overview

Write a failing test before writing the code that makes it pass. For bug fixes, reproduce the bug with a test before attempting a fix. Tests are proof — "seems right" is not done. A codebase with good tests is an AI agent's superpower; a codebase without tests is a liability.

## When to Use

- Implementing any new logic or behavior
- Fixing any bug (the Prove-It Pattern)
- Modifying existing functionality
- Adding edge case handling
- Any change that could break existing behavior

**When NOT to use:** Pure configuration changes, documentation updates, or static content changes that have no behavioral impact.

**Related:** For browser-based changes, combine TDD with runtime verification using Chrome DevTools MCP — see the Browser Testing section below.

## The TDD Cycle

```
    RED                GREEN              REFACTOR
 Write a test    Write minimal code    Clean up the
 that fails  ──→  to make it pass  ──→  implementation  ──→  (repeat)
      │                  │                    │
      ▼                  ▼                    ▼
   Test FAILS        Test PASSES         Tests still PASS
```

### Step 1: RED — Write a Failing Test

Write the test first. It must fail. A test that passes immediately proves nothing.

```typescript
// RED: This test fails because createTask doesn't exist yet
describe('TaskService', () => {
  it('creates a task with title and default status', async () => {
    const task = await taskService.createTask({ title: 'Buy groceries' });

    expect(task.id).toBeDefined();
    expect(task.title).toBe('Buy groceries');
    expect(task.status).toBe('pending');
    expect(task.createdAt).toBeInstanceOf(Date);
  });
});
```

### Step 2: GREEN — Make It Pass

Write the minimum code to make the test pass. Don't over-engineer:

```typescript
// GREEN: Minimal implementation
export async function createTask(input: { title: string }): Promise<Task> {
  const task = {
    id: generateId(),
    title: input.title,
    status: 'pending' as const,
    createdAt: new Date(),
  };
  await db.tasks.insert(task);
  return task;
}
```

### Step 3: REFACTOR — Clean Up

With tests green, improve the code without changing behavior:

- Extract shared logic
- Improve naming
- Remove duplication
- Optimize if necessary

Run tests after every refactor step to confirm nothing broke.

## The Prove-It Pattern (Bug Fixes)

When a bug is reported, **do not start by trying to fix it.** Start by writing a test that reproduces it.

```
Bug report arrives
       │
       ▼
  Write a test that demonstrates the bug
       │
       ▼
  Test FAILS (confirming the bug exists)
       │
       ▼
  Implement the fix
       │
       ▼
  Test PASSES (proving the fix works)
       │
       ▼
  Run full test suite (no regressions)
```

**Example:**

```typescript
// Bug: "Completing a task doesn't update the completedAt timestamp"

// Step 1: Write the reproduction test (it should FAIL)
it('sets completedAt when task is completed', async () => {
  const task = await taskService.createTask({ title: 'Test' });
  const completed = await taskService.completeTask(task.id);

  expect(completed.status).toBe('completed');
  expect(completed.completedAt).toBeInstanceOf(Date);  // This fails → bug confirmed
});

// Step 2: Fix the bug
export async function completeTask(id: string): Promise<Task> {
  return db.tasks.update(id, {
    status: 'completed',
    completedAt: new Date(),  // This was missing
  });
}

// Step 3: Test passes → bug fixed, regression guarded
```

## The Test Pyramid

Invest testing effort according to the pyramid — most tests should be small and fast, with progressively fewer tests at higher levels:

```

What is this skill?

Three-step TDD cycle: RED write failing test, GREEN minimal pass, REFACTOR with tests still green

Prove-It Pattern: reproduce bugs with a test before attempting a fix

Applies to new logic, bug fixes, behavior changes, and edge-case handling

Explicit when NOT to use: pure config, docs, or static content with no behavioral impact

Optional browser verification via Chrome DevTools MCP for UI-related changes

3-step TDD cycle: RED, GREEN, REFACTOR

Compatible agents: Claude Code, Cursor, Codex, Windsurf

Adoption & trust: 4.1k installs on skills.sh; 49.1k GitHub stars; 2/3 security scanners passed (skills.sh audits); trending (+200% hot-view momentum).

Journey fit

Useful at every journey phase - explore requirements and options before committing to a direction.

Where it fits

Example use

BuildUI/UX & frontend

Add a checkout validation rule only after a failing unit or component test defines the expected error.

Example use

ValidatePrototype & spike

Lock prototype behavior with thin tests before expanding the spike into production code.

Example use

ShipTesting & QA

Expand edge-case coverage while implementing the minimal green pass for each new scenario.

Example use

OperateIteration & experiments

Reproduce a production bug with a test, then patch and refactor under green CI.

SKILL.md

READMESKILL.md - Test Driven Development

# Test-Driven Development

## Overview

Write a failing test before writing the code that makes it pass. For bug fixes, reproduce the bug with a test before attempting a fix. Tests are proof — "seems right" is not done. A codebase with good tests is an AI agent's superpower; a codebase without tests is a liability.

## When to Use

- Implementing any new logic or behavior
- Fixing any bug (the Prove-It Pattern)
- Modifying existing functionality
- Adding edge case handling
- Any change that could break existing behavior

**When NOT to use:** Pure configuration changes, documentation updates, or static content changes that have no behavioral impact.

**Related:** For browser-based changes, combine TDD with runtime verification using Chrome DevTools MCP — see the Browser Testing section below.

## The TDD Cycle

```
    RED                GREEN              REFACTOR
 Write a test    Write minimal code    Clean up the
 that fails  ──→  to make it pass  ──→  implementation  ──→  (repeat)
      │                  │                    │
      ▼                  ▼                    ▼
   Test FAILS        Test PASSES         Tests still PASS
```

### Step 1: RED — Write a Failing Test

Write the test first. It must fail. A test that passes immediately proves nothing.

```typescript
// RED: This test fails because createTask doesn't exist yet
describe('TaskService', () => {
  it('creates a task with title and default status', async () => {
    const task = await taskService.createTask({ title: 'Buy groceries' });

    expect(task.id).toBeDefined();
    expect(task.title).toBe('Buy groceries');
    expect(task.status).toBe('pending');
    expect(task.createdAt).toBeInstanceOf(Date);
  });
});
```

### Step 2: GREEN — Make It Pass

Write the minimum code to make the test pass. Don't over-engineer:

```typescript
// GREEN: Minimal implementation
export async function createTask(input: { title: string }): Promise<Task> {
  const task = {
    id: generateId(),
    title: input.title,
    status: 'pending' as const,
    createdAt: new Date(),
  };
  await db.tasks.insert(task);
  return task;
}
```

### Step 3: REFACTOR — Clean Up

With tests green, improve the code without changing behavior:

- Extract shared logic
- Improve naming
- Remove duplication
- Optimize if necessary

Run tests after every refactor step to confirm nothing broke.

## The Prove-It Pattern (Bug Fixes)

When a bug is reported, **do not start by trying to fix it.** Start by writing a test that reproduces it.

```
Bug report arrives
       │
       ▼
  Write a test that demonstrates the bug
       │
       ▼
  Test FAILS (confirming the bug exists)
       │
       ▼
  Implement the fix
       │
       ▼
  Test PASSES (proving the fix works)
       │
       ▼
  Run full test suite (no regressions)
```

**Example:**

```typescript
// Bug: "Completing a task doesn't update the completedAt timestamp"

// Step 1: Write the reproduction test (it should FAIL)
it('sets completedAt when task is completed', async () => {
  const task = await taskService.createTask({ title: 'Test' });
  const completed = await taskService.completeTask(task.id);

  expect(completed.status).toBe('completed');
  expect(completed.completedAt).toBeInstanceOf(Date);  // This fails → bug confirmed
});

// Step 2: Fix the bug
export async function completeTask(id: string): Promise<Task> {
  return db.tasks.update(id, {
    status: 'completed',
    completedAt: new Date(),  // This was missing
  });
}

// Step 3: Test passes → bug fixed, regression guarded
```

## The Test Pyramid

Invest testing effort according to the pyramid — most tests should be small and fast, with progressively fewer tests at higher levels:

```

Overview

Install

What is this skill?

What problem does it solve?

Who is it for?

When should I use this skill?

What do I get? / Deliverables

Recommended Skills

Journey fit

Where it fits

Who is test-driven-development for?

When should I use test-driven-development?

Is test-driven-development safe to install?

SKILL.md

This week for builders

Overview

Install

What is this skill?

What problem does it solve?

Who is it for?

When should I use this skill?

What do I get? / Deliverables

Recommended Skills

Journey fit

Where it fits

Who is test-driven-development for?

When should I use test-driven-development?

Is test-driven-development safe to install?

SKILL.md