
AI Regression Testing
Catch regressions when AI agents change API or backend code—especially where the same model wrote and reviewed the fix.
Overview
AI Regression Testing is an agent skill most often used in Ship (also Build backend) that defines regression and sandbox API tests to catch blind spots when the same AI writes and reviews code.
Install
npx skills add https://github.com/affaan-m/everything-claude-code --skill ai-regression-testing
What is this skill?
- Documents the write-then-review blind spot: the same model approves fixes that still miss SELECT columns, types, or path gaps
- Sandbox-mode API testing without database dependencies for faster agent iteration loops
- Activate after AI modifies API routes, after bug fixes, or when running bug-check workflows post-change
- Covers multi-path setups: sandbox vs production, feature flags, and mock layers
- Regression focus on re-introduction after a production bug was found and patched
Adoption & trust: 4.2k installs on skills.sh; 210k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
Your agent said a backend fix looked correct, but the bug returned because automated tests never exercised the missed query column or type path.
Who is it for?
Solo builders with sandbox/mock API modes who ship agent-edited routes and want DB-free regression loops after /bug-check.
Skip if: Pure frontend UI tweaks with no API surface, or teams that already enforce full integration tests on every change without AI involvement.
When should I use this skill?
Use it when an AI agent has modified API routes or backend logic, after a bug has been fixed, when the project has a sandbox/mock mode, when running /bug-check after changes, or when multiple code paths exist (such as sandbox vs production).
What do I get? / Deliverables
You add repeatable sandbox or mock-backed regression tests that run after agent edits and bug fixes so reintroductions surface before merge or deploy.
- Regression test cases for agent-touched API routes
- Documented bug-check workflow tied to sandbox execution
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Canonical shelf is Ship/testing because the skill targets automated regression and sandbox API verification after agent-driven code changes. Testing fits DB-free sandbox runs, /bug-check style reviews, and preventing reintroduction of fixed bugs on API routes.
Where it fits
After Codex adds a new response field, add a sandbox test that asserts the SELECT list and serialized JSON stay aligned.
Run DB-free route tests in sandbox mode before merging agent-generated API changes.
Follow /bug-check with automated regression cases so review is not only the same model re-reading its patch.
When a production bug was fixed by an agent, lock a regression test so the failure class does not return on the next autopatch.
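The first item above, keeping the SELECT list and the serialized JSON aligned, can be sketched as a pair of small checks. This is a minimal illustration and not code from the skill: `REQUIRED_PROFILE_FIELDS`, `PROFILE_SELECT`, `missingFromSelect`, and `missingFromResponse` are hypothetical names, and the column list is assumed to be a plain comma-separated string.

```typescript
// Hypothetical contract: fields the API response must always contain.
const REQUIRED_PROFILE_FIELDS: string[] = ["id", "email", "notification_settings"];

// Hypothetical column list used by the production query.
const PROFILE_SELECT = "id, email, notification_settings";

// Returns the required fields that are absent from a comma-separated SELECT list.
// A bare "*" is treated as selecting every column.
function missingFromSelect(selectList: string, required: string[]): string[] {
  const columns = new Set(selectList.split(",").map((column) => column.trim()));
  if (columns.has("*")) {
    return [];
  }
  return required.filter((field) => !columns.has(field));
}

// Returns the required fields that are absent from a serialized response body.
function missingFromResponse(json: Record<string, unknown>, required: string[]): string[] {
  return required.filter((field) => !(field in json));
}
```

In a regression test, both results would be asserted to be empty arrays, so a field added to the response but not to the query (or the reverse) fails immediately instead of waiting for a second review pass.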
How it compares
Targets AI-specific review blind spots—use alongside Ship code-review skills, not instead of deterministic test suites.
Common Questions / FAQ
Who is ai-regression-testing for?
Builders using AI agents on backend and API code who need regression coverage that does not trust the same model’s self-review.
When should I use ai-regression-testing?
In Ship after agent-modified routes or bug fixes and during Build backend work when sandbox mode exists; also when running bug-check flows before release.
Is ai-regression-testing safe to install?
It guides test design and workflows; review the Security Audits panel on this Prism page and avoid pointing sandbox tests at production secrets or live data.
SKILL.md
# AI Regression Testing

Testing patterns specifically designed for AI-assisted development, where the same model writes code and reviews it, creating systematic blind spots that only automated tests can catch.

## When to Activate

- AI agent (Claude Code, Cursor, Codex) has modified API routes or backend logic
- A bug was found and fixed, and re-introduction needs to be prevented
- Project has a sandbox/mock mode that can be leveraged for DB-free testing
- Running `/bug-check` or similar review commands after code changes
- Multiple code paths exist (sandbox vs production, feature flags, etc.)

## The Core Problem

When an AI writes code and then reviews its own work, it carries the same assumptions into both steps. This creates a predictable failure pattern:

```
AI writes fix → AI reviews fix → AI says "looks correct" → Bug still exists
```

**Real-world example** (observed in production):

```
Fix 1: Added notification_settings to API response
  → Forgot to add it to the SELECT query
  → AI reviewed and missed it (same blind spot)

Fix 2: Added it to SELECT query
  → TypeScript build error (column not in generated types)
  → AI reviewed Fix 1 but didn't catch the SELECT issue

Fix 3: Changed to SELECT *
  → Fixed production path, forgot sandbox path
  → AI reviewed and missed it AGAIN (4th occurrence)

Fix 4: Test caught it instantly on first run (PASS)
```

The pattern: **sandbox/production path inconsistency** is the #1 AI-introduced regression.

## Sandbox-Mode API Testing

Most projects with AI-friendly architecture have a sandbox/mock mode. This is the key to fast, DB-free API testing.
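As a rough illustration of what a sandbox/production split can look like, the sketch below routes both paths through one shared shaping function, so a new field cannot be added to only one of them. It is an assumption-laden example and not the skill's own code: `ProfileRow`, `sandboxSource`, `toProfileResponse`, and `getProfile` are hypothetical names, and the handler is kept synchronous and framework-free for brevity, whereas a real route would await a database call.

```typescript
// Hypothetical profile shape; field names are illustrative only.
interface ProfileRow {
  id: string;
  email: string;
  notification_settings: { email: boolean };
}

interface HandlerResult {
  status: number;
  json: Record<string, unknown>;
}

type ProfileSource = (userId: string) => ProfileRow | undefined;

// Sandbox path: an in-memory fixture, no database involved.
const sandboxSource: ProfileSource = (userId) => {
  const fixtures: Record<string, ProfileRow> = {
    "sandbox-user-1": {
      id: "sandbox-user-1",
      email: "dev@example.com",
      notification_settings: { email: true },
    },
  };
  return fixtures[userId];
};

// Shared shaping step used by BOTH paths, so the response fields stay identical.
function toProfileResponse(row: ProfileRow): Record<string, unknown> {
  return {
    id: row.id,
    email: row.email,
    notification_settings: row.notification_settings,
  };
}

// Picks the data source for the current mode, then shapes the result the same way.
function getProfile(
  userId: string,
  sandboxMode: boolean,
  productionSource: ProfileSource,
): HandlerResult {
  const source = sandboxMode ? sandboxSource : productionSource;
  const row = source(userId);
  if (!row) {
    return { status: 404, json: { error: "Not found" } };
  }
  return { status: 200, json: toProfileResponse(row) };
}
```

A regression test built on this shape can call the handler once per mode and compare the keys of the two responses. That is the check that would have caught the "fixed production path, forgot sandbox path" failure described above.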
### Setup (Vitest + Next.js App Router)

```typescript
// vitest.config.ts
import { defineConfig } from "vitest/config";
import path from "path";

export default defineConfig({
  test: {
    environment: "node",
    globals: true,
    include: ["__tests__/**/*.test.ts"],
    setupFiles: ["__tests__/setup.ts"],
  },
  resolve: {
    alias: {
      "@": path.resolve(__dirname, "."),
    },
  },
});
```

```typescript
// __tests__/setup.ts
// Force sandbox mode: no database needed
process.env.SANDBOX_MODE = "true";
process.env.NEXT_PUBLIC_SUPABASE_URL = "";
process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY = "";
```

### Test Helper for Next.js API Routes

```typescript
// __tests__/helpers.ts
import { NextRequest } from "next/server";

export function createTestRequest(
  url: string,
  options?: {
    method?: string;
    body?: Record<string, unknown>;
    headers?: Record<string, string>;
    sandboxUserId?: string;
  },
): NextRequest {
  const { method = "GET", body, headers = {}, sandboxUserId } = options || {};
  const fullUrl = url.startsWith("http") ? url : `http://localhost:3000${url}`;
  const reqHeaders: Record<string, string> = { ...headers };

  if (sandboxUserId) {
    reqHeaders["x-sandbox-user-id"] = sandboxUserId;
  }

  const init: { method: string; headers: Record<string, string>; body?: string } = {
    method,
    headers: reqHeaders,
  };

  if (body) {
    init.body = JSON.stringify(body);
    reqHeaders["content-type"] = "application/json";
  }

  return new NextRequest(fullUrl, init);
}

export async function parseResponse(response: Response) {
  const json = await response.json();
  return { status: response.status, json };
}
```

### Writing Regression Tests

The key principle: **write tests for bugs that were found, not for code that works**.

```typescript
// __tests__/api/user/profile.test.ts
import { describe, it, expect } from "vitest";
import { createTestRequest, parseResponse } from "../../helpers";
import { GET, PATCH } from "@/app/api/user/profile/route";

// Define the contract: what fields MUST be in the response
const REQUIRED