Santa Method

Name: Santa Method
Author: affaan-m

affaan-m/everything-claude-code

Adversarial dual-agent review with a convergence loop before user-facing or production output ships.

Overview

Santa Method is an agent skill that runs two independent adversarial reviews and repeats them until both pass before output ships. It is most often used in Ship (code review and security) and also fits Launch distribution.

Install

npx skills add https://github.com/affaan-m/everything-claude-code --skill santa-method

What is this skill?

Two independent review agents with no shared generator context
Convergence loop: both reviewers must pass or output is sent back for fixes
Phase 1 generator (make a list) then Phase 2 dual check (check it twice)
Targets hallucination, compliance, brand, and factual accuracy at scale
Explicitly skip when build/test/lint already give deterministic verification

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 3.7k installs on skills.sh; 210k GitHub stars; 3/3 security scanners passed (skills.sh audits).

What problem does it solve?

A single agent reviewing its own output repeats the blind spots that produced the wrong claims, unsafe code, or off-brand copy you are about to publish.

Who is it for?

Builders shipping production-bound code, customer-facing docs, regulated copy, or large batch generations, where a single reviewer is not enough.

Skip for: internal drafts, exploratory research, or tasks where test, lint, and build pipelines already verify correctness deterministically.

When should I use this skill?

Output will be published, deployed, or end-user facing; compliance, regulatory, or brand rules apply; production ships without human review; hallucination or accuracy risk is elevated.

What do I get? / Deliverables

Output ships only after two isolated reviewers both return a pass. Failing ("naughty") findings are fixed in a loop rather than waved through by a single self-review.

Converged approved artifact after dual independent review
Iteration log of issues fixed between review rounds

Journey fit

Spans multiple journey phases - primary shelf plus alternate fits below.

Primary fit

Ship is where unreviewed agent output becomes production risk; Santa Method is shelved under review as the last structured gate before release. Dual independent reviewers and an iterate-until-pass loop map directly to the code review and content QA subphases, not to exploratory drafting.

Also useful

Ship: Security

Launch: Distribution & launch channels

Where it fits

Example use (Ship: Code review)

Run dual reviewers on a PR-sized agent diff before merging to main without a human reviewer on call.

Example use (Ship: Security)

Stress-test production config or policy text for unsafe defaults before unattended deploy.

Example use (Launch: Distribution & launch channels)

Verify landing page claims, pricing copy, and API references before public launch.

Example use (Grow: Content & marketing)

Batch-check educational or support articles for factual drift and off-brand tone at scale.

How it compares

Santa Method is a multi-agent adversarial QA workflow. It is not a substitute for CI, and it is not a single-pass code-review checklist skill.

Common Questions / FAQ

Who is santa-method for?

Solo builders and indie teams using multiple agent roles who need publication- or deployment-grade verification without a full human editorial bench.

When should I use santa-method?

Use it when compliance, brand, or hallucination risk is high: in Ship (review/security) before a production deploy or unattended release; in Launch (distribution) for customer-facing accuracy; and in Grow (content) for factual marketing or docs.

Is santa-method safe to install?

The skill orchestrates review behavior and does not by itself grant arbitrary shell access. Review the Security Audits panel on this Prism page for the hosting repo before install.

SKILL.md

# Santa Method

Multi-agent adversarial verification framework. Make a list, check it twice. If it's naughty, fix it until it's nice.

The core insight: a single agent reviewing its own output shares the same biases, knowledge gaps, and systematic errors that produced the output. Two independent reviewers with no shared context break this failure mode.

## When to Activate

Invoke this skill when:
- Output will be published, deployed, or consumed by end users
- Compliance, regulatory, or brand constraints must be enforced
- Code ships to production without human review
- Content accuracy matters (technical docs, educational material, customer-facing copy)
- Batch generation at scale where spot-checking misses systemic patterns
- Hallucination risk is elevated (claims, statistics, API references, legal language)

Do NOT use for internal drafts, exploratory research, or tasks with deterministic verification (use build/test/lint pipelines for those).
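
The activation rules above can be read as a simple predicate. The sketch below is one possible encoding and is not part of the skill: `TaskProfile`, its field names, and `should_activate` are hypothetical, and letting the exclusions override the triggers is an assumption about precedence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskProfile:
    """Hypothetical description of a task; field names are illustrative."""
    user_facing: bool = False                  # published, deployed, or consumed by end users
    compliance_constraints: bool = False       # compliance, regulatory, or brand rules apply
    ships_without_human_review: bool = False   # code goes to production unreviewed
    accuracy_critical: bool = False            # docs, educational material, customer copy
    batch_at_scale: bool = False               # spot-checking would miss systemic patterns
    elevated_hallucination_risk: bool = False  # claims, statistics, API references, legal text
    internal_or_exploratory: bool = False      # internal drafts, exploratory research
    deterministically_verifiable: bool = False # build/test/lint can verify the result


def should_activate(profile: TaskProfile) -> bool:
    """Return True when the profile matches any trigger and no exclusion."""
    # Assumption: the "Do NOT use" cases take precedence over every trigger.
    if profile.internal_or_exploratory or profile.deterministically_verifiable:
        return False
    return any((
        profile.user_facing,
        profile.compliance_constraints,
        profile.ships_without_human_review,
        profile.accuracy_critical,
        profile.batch_at_scale,
        profile.elevated_hallucination_risk,
    ))
```

For example, `should_activate(TaskProfile(user_facing=True))` is true, while the same profile with `deterministically_verifiable=True` is routed to the build/test/lint pipeline instead.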

## Architecture

```
┌─────────────┐
│  GENERATOR   │  Phase 1: Make a List
│  (Agent A)   │  Produce the deliverable
└──────┬───────┘
       │ output
       ▼
┌──────────────────────────────┐
│     DUAL INDEPENDENT REVIEW   │  Phase 2: Check It Twice
│                                │
│  ┌───────────┐ ┌───────────┐  │  Two agents, same rubric,
│  │ Reviewer B │ │ Reviewer C │  │  no shared context
│  └─────┬─────┘ └─────┬─────┘  │
│        │              │        │
└────────┼──────────────┼────────┘
         │              │
         ▼              ▼
┌──────────────────────────────┐
│        VERDICT GATE           │  Phase 3: Naughty or Nice
│                                │
│  B passes AND C passes → NICE  │  Both must pass.
│  Otherwise → NAUGHTY           │  No exceptions.
└──────┬──────────────┬─────────┘
       │              │
    NICE           NAUGHTY
       │              │
       ▼              ▼
   [ SHIP ]    ┌─────────────┐
               │  FIX CYCLE   │  Phase 4: Fix Until Nice
               │              │
               │ iteration++  │  Collect all flags.
               │ if i > MAX:  │  Fix all issues.
               │   escalate   │  Re-run both reviewers.
               │ else:        │  Loop until convergence.
               │   goto Ph.2  │
               └──────────────┘
```
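
A minimal sketch of the loop in the diagram, assuming the generator, reviewers, and fixer are plain callables. `santa_loop`, `EscalationRequired`, the `(passed, issues)` verdict shape, and the default cap of 3 are illustrative assumptions; the diagram only names `MAX`. Reviewers are called one after the other here to keep the control flow readable, whereas Phase 2 runs them in parallel.

```python
from typing import Callable, List, Tuple

Verdict = Tuple[bool, List[str]]              # (passed, issues found)
Reviewer = Callable[[str, str], Verdict]      # (task_spec, output) -> verdict
Generator = Callable[[str], str]              # task_spec -> output
Fixer = Callable[[str, str, List[str]], str]  # (task_spec, output, issues) -> new output

MAX_ITERATIONS = 3  # assumed value; the diagram leaves MAX unspecified


class EscalationRequired(Exception):
    """Raised when the reviewers have not converged within the iteration cap."""


def santa_loop(
    task_spec: str,
    generate: Generator,
    reviewer_b: Reviewer,
    reviewer_c: Reviewer,
    fix: Fixer,
    max_iterations: int = MAX_ITERATIONS,
) -> Tuple[str, int]:
    """Return (approved_output, fix_cycles_used) or raise EscalationRequired."""
    output = generate(task_spec)                             # Phase 1: make a list
    iteration = 0
    while True:
        passed_b, issues_b = reviewer_b(task_spec, output)   # Phase 2: check it twice
        passed_c, issues_c = reviewer_c(task_spec, output)
        if passed_b and passed_c:                            # Phase 3: both must pass
            return output, iteration
        iteration += 1                                       # Phase 4: fix until nice
        if iteration > max_iterations:
            raise EscalationRequired(
                f"no convergence after {max_iterations} fix cycles"
            )
        # Collect all flags from both reviewers, fix, then re-run both.
        output = fix(task_spec, output, issues_b + issues_c)
```

The gate is a strict AND: a pass from one reviewer never offsets a fail from the other, and every fix is followed by a fresh review from both.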

## Phase Details

### Phase 1: Make a List (Generate)

Execute the primary task. No changes to your normal generation workflow. Santa Method is a post-generation verification layer, not a generation strategy.

```python
# The generator runs as normal
output = generate(task_spec)
```

### Phase 2: Check It Twice (Independent Dual Review)

Spawn two review agents in parallel. Critical invariants:

1. **Context isolation** — neither reviewer sees the other's assessment
2. **Identical rubric** — both receive the same evaluation criteria
3. **Same inputs** — both receive the original spec AND the generated output
4. **Structured output** — each returns a typed verdict, not prose
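
The invariants above can be sketched with the standard library, assuming each reviewer is a callable that takes a dict of inputs and returns a verdict dict. `run_dual_review` and the reviewer signature are hypothetical; real reviewers would be separate agent sessions rather than in-process functions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Tuple

# Hypothetical reviewer signature: review inputs in, structured verdict out.
ReviewFn = Callable[[Dict[str, str]], dict]


def run_dual_review(
    review_b: ReviewFn,
    review_c: ReviewFn,
    task_spec: str,
    output: str,
    rubric: str,
) -> Tuple[dict, dict]:
    """Run both reviewers in parallel on identical, separately copied inputs."""
    inputs = {"task_spec": task_spec, "output": output, "rubric": rubric}
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Each reviewer gets its own copy of the same spec, output, and rubric,
        # and neither call is ever handed the other's verdict.
        future_b = pool.submit(review_b, dict(inputs))
        future_c = pool.submit(review_c, dict(inputs))
        return future_b.result(), future_c.result()
```

Passing copies rather than one shared dict keeps a reviewer that mutates its inputs from leaking state into the other review.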

```python
REVIEWER_PROMPT = """
You are an independent quality reviewer. You have NOT seen any other review of this output.

## Task Specification
{task_spec}

## Output Under Review
{output}

## Evaluation Rubric
{rubric}

## Instructions
Evaluate the output against EACH rubric criterion. For each:
- PASS: criterion fully met, no issues
- FAIL: specific issue found (cite the exact problem)

Return your assessment as structured JSON:
{
  "verdict": "PASS" | "FAIL",
  "checks": [
    {"criterion": "...", "result": "PASS|FAIL", "detail": "..."}
  ],
  "critical_issues": ["..."],   // blockers that must be fixed
  "suggestions": ["..."]         // non-blocking improvements
}

Be rigorous. Your job is to find problems, not to approve.
"""
```
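
One way to fill this template and read the reply back as a typed verdict is sketched below. `render_prompt`, `ReviewVerdict`, and `parse_verdict` are hypothetical helpers, not part of the skill. The template contains literal JSON braces, so `str.format` would fail on it; the sketch substitutes only the three named placeholders instead.

```python
import json
import re
from dataclasses import dataclass, field
from typing import Dict, List

# Matches only the three placeholders used by the reviewer template.
_PLACEHOLDER = re.compile(r"\{(task_spec|output|rubric)\}")


def render_prompt(template: str, task_spec: str, output: str, rubric: str) -> str:
    """Fill the placeholders in a single pass, leaving all other braces alone."""
    values = {"task_spec": task_spec, "output": output, "rubric": rubric}
    return _PLACEHOLDER.sub(lambda match: values[match.group(1)], template)


@dataclass
class ReviewVerdict:
    """Typed form of the JSON object the reviewer is asked to return."""
    verdict: str
    checks: List[Dict[str, str]] = field(default_factory=list)
    critical_issues: List[str] = field(default_factory=list)
    suggestions: List[str] = field(default_factory=list)

    @property
    def passed(self) -> bool:
        return self.verdict == "PASS"


def parse_verdict(raw: str) -> ReviewVerdict:
    """Parse a reviewer reply; raise ValueError if it is not the expected shape."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict) or data.get("verdict") not in ("PASS", "FAIL"):
        raise ValueError("reply must be a JSON object with verdict PASS or FAIL")
    return ReviewVerdict(
        verdict=data["verdict"],
        checks=list(data.get("checks", [])),
        critical_issues=list(data.get("critical_issues", [])),
        suggestions=list(data.get("suggestions", [])),
    )
```

A call such as `render_prompt(REVIEWER_PROMPT, task_spec, output, rubric)` produces the text sent to each reviewer. A reply that echoes the `//` annotations from the template is not valid JSON, so `parse_verdict` rejects it rather than guessing.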

```python
# Spawn reviewers in parallel (Claude Code suba…
```
