AI Regression Testing

Author: affaan-m
Repository: affaan-m/everything-claude-code

Catch regressions when AI agents change API or backend code—especially where the same model wrote and reviewed the fix.

Overview

AI Regression Testing is an agent skill, used mainly in the Ship phase (and in Build backend work), that defines regression and sandbox-mode API tests to catch the blind spots that arise when the same AI writes and reviews code.

Install

npx skills add https://github.com/affaan-m/everything-claude-code --skill ai-regression-testing

What is this skill?

Documents the write-then-review blind spot: the same model approves fixes that still miss SELECT columns, types, or path gaps
Provides sandbox-mode API testing without database dependencies, for faster agent iteration loops
Activates after an AI modifies API routes, after bug fixes, or when running bug-check workflows post-change
Covers multi-path setups: sandbox vs production, feature flags, and mock layers
Focuses regression tests on re-introduction after a production bug was found and patched

Compatible agents: Claude Code, Cursor, Codex, Windsurf

Adoption & trust: 4.2k installs on skills.sh; 210k GitHub stars; 3/3 security scanners passed (skills.sh audits).

What problem does it solve?

Your agent said a backend fix looked correct, but the bug returned because automated tests never exercised the missed query column or type path.

Who is it for?

Solo builders with sandbox/mock API modes who ship agent-edited routes and want DB-free regression loops after /bug-check.

Skip if: Pure frontend UI tweaks with no API surface, or teams that already enforce full integration tests on every change without AI involvement.

When should I use this skill?

AI agent has modified API routes or backend logic; a bug was fixed; project has sandbox/mock mode; running /bug-check after changes; multiple paths like sandbox vs production.

What do I get? / Deliverables

You add repeatable sandbox or mock-backed regression tests that run after agent edits and bug fixes so reintroductions surface before merge or deploy.

Regression test cases for agent-touched API routes
Documented bug-check workflow tied to sandbox execution

Recommended Skills

Agent Browser (vercel-labs/open-agents)
agent-browser is a Vercel Open Agents skill that wraps a CLI for programmatic browser control, ideal when solo builders n…
404k installs · 5.6k stars

Tdd (mattpocock/skills)
TDD is an agent skill that coaches test-driven development using the red-green-refactor loop for solo and indie builders…
214k installs · 121k stars

Use My Browser (xixu-me/skills)
Use My Browser skill forces agents to classify tasks as static-capable or browser-required before choosing tools, staying…
198k installs · 61 stars

Test Driven Development (obra/superpowers)
Test-Driven Development is an agent skill from obra/superpowers that forces a test-first implementation ritual: write a …
118k installs · 221k stars

Verification Before Completion (obra/superpowers)
Verification Before Completion is an agent skill from the Superpowers lineage that blocks premature success claims durin…
100k installs · 221k stars

Webapp Testing (anthropics/skills)
webapp-testing is an agent skill for solo builders who need to prove that a local web application actually works, not jus…
90.9k installs · 148k stars

Journey fit

Spans multiple journey phases - primary shelf plus alternate fits below.

Primary fit

Canonical shelf is Ship/testing because the skill targets automated regression and sandbox API verification after agent-driven code changes. Testing fits DB-free sandbox runs, /bug-check style reviews, and preventing reintroduction of fixed bugs on API routes.

Also useful

Build / Backend, data & payments
Ship / Code review

Where it fits

Example use: Build / Backend, data & payments
After Codex adds a new response field, add a sandbox test that asserts the SELECT list and serialized JSON stay aligned.

Example use: Ship / Testing & QA
Run DB-free route tests in sandbox mode before merging agent-generated API changes.

Example use: Ship / Code review
Follow /bug-check with automated regression cases so review is not only the same model re-reading its patch.

Example use: Operate / Iteration & experiments
When a production bug was fixed by an agent, lock a regression test so the failure class does not return on the next autopatch.
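
The Build example above (keeping the SELECT list and the serialized JSON aligned) can be sketched as a small pure check. This is only an illustration under stated assumptions: the function names `parseSelectColumns` and `findUnselectedFields` are hypothetical, the SELECT list is assumed to be a plain comma-separated string without aliases or joins, and `*` is treated as selecting everything.

```typescript
// Hypothetical helpers, not part of the skill: they assume a flat,
// comma-separated SELECT list with no aliases, joins, or nested selects.
function parseSelectColumns(selectList: string): string[] {
  return selectList
    .split(",")
    .map((column) => column.trim())
    .filter((column) => column.length > 0);
}

// Returns the response fields that the SELECT list does not cover.
function findUnselectedFields(selectList: string, responseFields: string[]): string[] {
  const columns = parseSelectColumns(selectList);
  if (columns.indexOf("*") !== -1) {
    return []; // SELECT * covers every column
  }
  return responseFields.filter((field) => columns.indexOf(field) === -1);
}

// The response serializer was updated, but the query was not.
const unselectedFields = findUnselectedFields("id, email", [
  "id",
  "email",
  "notification_settings",
]);
console.log(unselectedFields);
```

A sandbox test could assert that the returned list is empty, so a field added to the response without a matching column fails before merge.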

How it compares

Targets AI-specific review blind spots—use alongside Ship code-review skills, not instead of deterministic test suites.

Common Questions / FAQ

Who is ai-regression-testing for?

Builders using AI agents on backend and API code who need regression coverage that does not trust the same model’s self-review.

When should I use ai-regression-testing?

In Ship after agent-modified routes or bug fixes and during Build backend work when sandbox mode exists; also when running bug-check flows before release.

Is ai-regression-testing safe to install?

It guides test design and workflows; review the Security Audits panel on this Prism page and avoid pointing sandbox tests at production secrets or live data.

SKILL.md


# AI Regression Testing

Testing patterns specifically designed for AI-assisted development, where the same model writes code and reviews it — creating systematic blind spots that only automated tests can catch.

## When to Activate

- AI agent (Claude Code, Cursor, Codex) has modified API routes or backend logic
- A bug was found and fixed — need to prevent re-introduction
- Project has a sandbox/mock mode that can be leveraged for DB-free testing
- Running `/bug-check` or similar review commands after code changes
- Multiple code paths exist (sandbox vs production, feature flags, etc.)

## The Core Problem

When an AI writes code and then reviews its own work, it carries the same assumptions into both steps. This creates a predictable failure pattern:

```
AI writes fix → AI reviews fix → AI says "looks correct" → Bug still exists
```

**Real-world example** (observed in production):

```
Fix 1: Added notification_settings to API response
  → Forgot to add it to the SELECT query
  → AI reviewed and missed it (same blind spot)

Fix 2: Added it to SELECT query
  → TypeScript build error (column not in generated types)
  → AI reviewed Fix 1 but didn't catch the SELECT issue

Fix 3: Changed to SELECT *
  → Fixed production path, forgot sandbox path
  → AI reviewed and missed it AGAIN (4th occurrence)

Fix 4: Test caught it instantly on first run (PASS)
```

The pattern: **sandbox/production path inconsistency** is the #1 AI-introduced regression.
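
The path inconsistency in Fix 3 is the kind of drift a mechanical comparison can surface. As a minimal sketch, assuming both paths return flat JSON objects (the helper name `diffResponseKeys` and the sample payloads are hypothetical, not taken from the skill), a key comparison reports fields present on one path but missing on the other:

```typescript
// Hypothetical helper and payloads for illustration only.
type ResponsePayload = Record<string, unknown>;

function hasKey(payload: ResponsePayload, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(payload, key);
}

// Lists the top-level keys that exist on one path but not on the other.
function diffResponseKeys(
  production: ResponsePayload,
  sandbox: ResponsePayload,
): { missingInSandbox: string[]; missingInProduction: string[] } {
  return {
    missingInSandbox: Object.keys(production)
      .filter((key) => !hasKey(sandbox, key))
      .sort(),
    missingInProduction: Object.keys(sandbox)
      .filter((key) => !hasKey(production, key))
      .sort(),
  };
}

// Mirrors Fix 3: the production path was updated, the sandbox path was not.
const productionPayload: ResponsePayload = {
  id: "u1",
  email: "a@example.com",
  notification_settings: { email: true },
};
const sandboxPayload: ResponsePayload = { id: "u1", email: "a@example.com" };

const pathDrift = diffResponseKeys(productionPayload, sandboxPayload);
console.log(pathDrift);
```

This only compares top-level keys; nested objects and value types would need their own checks.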

## Sandbox-Mode API Testing

Most projects with AI-friendly architecture have a sandbox/mock mode. This is the key to fast, DB-free API testing.
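
What a sandbox mode looks like varies by project. A minimal sketch, assuming a hypothetical `ProfileStore` interface and an in-memory `SandboxProfileStore` (neither name comes from the skill), is a single selection point driven by a `SANDBOX_MODE` flag. A real store would normally be asynchronous; that is left out here to keep the example short.

```typescript
// Hypothetical types and names for illustration only.
interface UserProfile {
  id: string;
  email: string;
  notification_settings: Record<string, boolean>;
}

interface ProfileStore {
  getProfile(userId: string): UserProfile | null;
}

// In-memory fixtures stand in for the database when sandbox mode is on.
class SandboxProfileStore implements ProfileStore {
  private fixtures: Record<string, UserProfile> = {
    "sandbox-user-1": {
      id: "sandbox-user-1",
      email: "sandbox@example.com",
      notification_settings: { email: true },
    },
  };

  getProfile(userId: string): UserProfile | null {
    if (!Object.prototype.hasOwnProperty.call(this.fixtures, userId)) {
      return null;
    }
    return this.fixtures[userId] ?? null;
  }
}

// Single decision point, so sandbox/production selection lives in one place.
function selectProfileStore(
  env: Record<string, string | undefined>,
  productionStore: ProfileStore,
): ProfileStore {
  return env.SANDBOX_MODE === "true" ? new SandboxProfileStore() : productionStore;
}

// Stand-in for a database-backed store; it is never reached in sandbox mode.
const stubProductionStore: ProfileStore = { getProfile: () => null };

const sandboxStore = selectProfileStore({ SANDBOX_MODE: "true" }, stubProductionStore);
const sandboxProfile = sandboxStore.getProfile("sandbox-user-1");
console.log(sandboxProfile);
```

Passing the environment in as an argument, rather than reading `process.env` inside the function, keeps the selection logic testable without mutating global state.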

### Setup (Vitest + Next.js App Router)

```typescript
// vitest.config.ts
import { defineConfig } from "vitest/config";
import path from "path";

export default defineConfig({
  test: {
    environment: "node",
    globals: true,
    include: ["__tests__/**/*.test.ts"],
    setupFiles: ["__tests__/setup.ts"],
  },
  resolve: {
    alias: {
      "@": path.resolve(__dirname, "."),
    },
  },
});
```

```typescript
// __tests__/setup.ts
// Force sandbox mode — no database needed
process.env.SANDBOX_MODE = "true";
process.env.NEXT_PUBLIC_SUPABASE_URL = "";
process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY = "";
```

### Test Helper for Next.js API Routes

```typescript
// __tests__/helpers.ts
import { NextRequest } from "next/server";

export function createTestRequest(
  url: string,
  options?: {
    method?: string;
    body?: Record<string, unknown>;
    headers?: Record<string, string>;
    sandboxUserId?: string;
  },
): NextRequest {
  const { method = "GET", body, headers = {}, sandboxUserId } = options || {};
  const fullUrl = url.startsWith("http") ? url : `http://localhost:3000${url}`;
  const reqHeaders: Record<string, string> = { ...headers };

  if (sandboxUserId) {
    reqHeaders["x-sandbox-user-id"] = sandboxUserId;
  }

  const init: { method: string; headers: Record<string, string>; body?: string } = {
    method,
    headers: reqHeaders,
  };

  if (body) {
    init.body = JSON.stringify(body);
    reqHeaders["content-type"] = "application/json";
  }

  return new NextRequest(fullUrl, init);
}

export async function parseResponse(response: Response) {
  const json = await response.json();
  return { status: response.status, json };
}
```

### Writing Regression Tests

The key principle: **write tests for bugs that were found, not for code that works**.
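
One way to apply this principle is to keep a list in which every entry corresponds to a bug that was actually found. The sketch below is illustrative and is not the skill's own test file: `RegressionCase`, `profileRegressionCases`, and `findReintroducedBugs` are hypothetical names, and the second case is invented purely to show the shape.

```typescript
// Hypothetical names; each case records one bug that was found and fixed.
type ProfilePayload = Record<string, unknown>;

interface RegressionCase {
  bug: string; // short description of the original bug
  holds: (payload: ProfilePayload) => boolean; // true while the fix is still in place
}

const profileRegressionCases: RegressionCase[] = [
  {
    // Taken from the example above: the field was missing from the response.
    bug: "notification_settings missing from profile response",
    holds: (payload) =>
      Object.prototype.hasOwnProperty.call(payload, "notification_settings"),
  },
  {
    // Invented case, included only to show a second entry.
    bug: "profile id returned as a number instead of a string",
    holds: (payload) => typeof payload["id"] === "string",
  },
];

// Returns the descriptions of every bug that has reappeared in the payload.
function findReintroducedBugs(
  payload: ProfilePayload,
  cases: RegressionCase[],
): string[] {
  return cases.filter((regression) => !regression.holds(payload)).map((regression) => regression.bug);
}

const reintroducedBugs = findReintroducedBugs(
  { id: "u1", email: "a@example.com" },
  profileRegressionCases,
);
console.log(reintroducedBugs);
```

In a test runner, asserting that `findReintroducedBugs` returns an empty array makes the failure message name the exact bug that came back.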

```typescript
// __tests__/api/user/profile.test.ts
import { describe, it, expect } from "vitest";
import { createTestRequest, parseResponse } from "../../helpers";
import { GET, PATCH } from "@/app/api/user/profile/route";

// Define the contract — what fields MUST be in the response
const REQUIRED
```
