Critique

Name: Critique
Author: neolabhq

neolabhq/context-engineering-kit

Orchestrate multi-judge debate and consensus on finished work—code, docs, or plans—without auto-applying fixes.

Overview

Critique is a journey-wide agent skill that orchestrates specialized judges, debate, and consensus to review completed work and deliver report-only quality findings.

Install

npx skills add https://github.com/neolabhq/context-engineering-kit --skill critique

What is this skill?

Multi-Agent Debate plus LLM-as-a-Judge evaluation framework
Chain-of-Verification: each judge validates its critique before submission
Consensus building across specialized judges after independent review
Report-only delivery—no automatic fixes applied to your repo
Scope from explicit file paths, commits, or default recent conversation changes
Review pattern combines 4 named techniques: Multi-Agent Debate, LLM-as-a-Judge, Chain-of-Verification, and consensus building

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 559 installs on skills.sh; 1.1k GitHub stars; 3/3 security scanners passed (skills.sh audits).

What problem does it solve?

You finished a chunk of work but only got a single-pass review, which misses conflicting expert perspectives and gives you no way to verify the quality of the critique itself.

Who is it for?

Builders using context-engineering patterns who want debate-style review on arbitrary completed artifacts without auto-editing the codebase.

Skip if: you only need quick lint fixes, you want fully automated patch application, or you cannot supply scope via paths, commits, or recent session context.

When should I use this skill?

You need a comprehensive multi-perspective, report-only review of completed work, with optional file paths, commits, or recent-change scope.

What do I get? / Deliverables

You receive a consensus-backed critique report from multiple judges with CoVe-validated findings, ready for you to prioritize fixes manually.

Consensus critique report from specialized judges
Debated findings with CoVe-validated recommendations (no automatic patches)

Recommended Skills

Improve Codebase Architecture (mattpocock/skills)

Improve Codebase Architecture is an agent skill that teaches how to deepen a cluster of shallow modules without breaking… (226k installs · 121k stars)

Zoom Out (mattpocock/skills)

Lightweight meta-prompt skill that tells the agent to zoom out and deliver a domain-aligned overview of modules and call… (181k installs · 121k stars)

Caveman Review (juliusbrussee/caveman)

Formats code review as single actionable lines: location, problem, fix, with minimal noise. (139k installs · 70k stars)

Requesting Code Review (obra/superpowers)

Requesting Code Review is an agent skill from the Superpowers collection that gives solo and indie builders a copy-ready… (119k installs · 221k stars)

Receiving Code Review (obra/superpowers)

Superpowers methodology for agents receiving code review: prioritize technical correctness over social comfort, verify e… (96.2k installs · 221k stars)

Request Refactor Plan (mattpocock/skills)

request-refactor-plan is a structured agent workflow for solo and small-team maintainers who want refactors filed as act… (30.5k installs · 121k stars)

Journey fit

Useful at every journey phase - explore requirements and options before committing to a direction.

Where it fits

Ship: Code review

Example use: Have the judges debate the latest commit range before you tag a release candidate.

Build: Project management & tracking

Example use: Critique an implementation plan or spec the agent produced so gaps surface before coding continues.

Validate: Prototype & spike

Example use: Review a thin prototype and its README for correctness and UX risks before expanding scope.

Launch: Distribution & launch channels

Example use: Run a multi-judge pass on a launch announcement draft to check factual and messaging consistency.

Operate: Iteration & experiments

Example use: Assess a postmortem doc and related patches for completeness before sharing with users.

How it compares

Use it for multi-judge debate and consensus reports. It is not a drop-in replacement for a single adversarial-persona reviewer that blocks merges.

Common Questions / FAQ

Who is critique for?

Solo builders and agent-heavy teams who want structured multi-judge review of finished work using context-engineering-kit patterns.

When should I use critique?

Use it in Ship before release review, in Build after a feature lands, in Validate when checking a prototype narrative, in Launch when reviewing positioning copy, or in Operate when assessing post-incident writeups—whenever you need consensus critique without auto-fixes.

Is critique safe to install?

It is primarily analytical and report-only, but judges may process sensitive repo or conversation content—check the Security Audits panel on this page and scope reviews to non-secret files.

SKILL.md


# Work Critique Command

<task>
You are a critique coordinator conducting a comprehensive multi-perspective review of completed work using the Multi-Agent Debate + LLM-as-a-Judge pattern. Your role is to orchestrate multiple specialized judges who will independently review the work, debate their findings, and reach consensus on quality, correctness, and improvement opportunities.
</task>

<context>
This command implements a sophisticated review pattern combining:
- **Multi-Agent Debate**: Multiple specialized judges provide independent perspectives
- **LLM-as-a-Judge**: Structured evaluation framework for consistent assessment
- **Chain-of-Verification (CoVe)**: Each judge validates their own critique before submission
- **Consensus Building**: Judges debate findings to reach agreement on recommendations

The review is **report-only** - findings are presented for user consideration without automatic fixes.
</context>
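
As a rough illustration of the Chain-of-Verification item above, the loop a single judge follows might be sketched in Python as below. This is not code from the skill: `critique_with_verification` and the `ask` callable (a stand-in for whatever sends a prompt to a model and returns its text) are hypothetical names, and the default of three questions is simply the low end of the 3-5 range the judge prompts ask for.

```python
from typing import Callable


def critique_with_verification(
    work: str,
    ask: Callable[[str], str],  # hypothetical: sends one prompt, returns the reply
    num_questions: int = 3,
) -> dict:
    """Draft a critique, question it, then refine it using the answers."""
    # Step 1: initial analysis.
    draft = ask(f"Critique this work:\n{work}")

    # Step 2: self-verification. Generate questions about the draft,
    # then answer each one against the work itself.
    questions = [
        ask(f"Write verification question {i + 1} about this critique:\n{draft}")
        for i in range(num_questions)
    ]
    answers = [ask(f"Answer honestly: {q}\nWork:\n{work}") for q in questions]

    # Step 3: final critique, refined with the question/answer pairs.
    qa = "\n".join(f"Q: {q}\nA: {a}" for q, a in zip(questions, answers))
    final = ask(f"Refine this critique using the answers.\n{draft}\n{qa}")

    return {"draft": draft, "questions": questions, "answers": answers, "final": final}
```

Keeping the draft, questions, and answers alongside the final critique makes it possible to show which verification answers influenced the result, which is what the judge output format asks for.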

## Your Workflow

### Phase 1: Context Gathering

Before starting the review, understand what was done:

1. **Identify the scope of work to review**:
   - If arguments provided: Use them to identify specific files, commits, or conversation context
   - If no arguments: Review the recent conversation history and file changes
   - Ask user if scope is unclear: "What work should I review? (recent changes, specific feature, entire conversation, etc.)"

2. **Capture relevant context**:
   - Original requirements or user request
   - Files that were modified or created
   - Decisions made during implementation
   - Any constraints or assumptions

3. **Summarize scope for confirmation**:

   ```
   📋 Review Scope:
   - Original request: [summary]
   - Files changed: [list]
   - Approach taken: [brief description]

   Proceeding with multi-agent review...
   ```
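
The scope steps above can be sketched in Python to make the decision order explicit. This is a minimal sketch under stated assumptions, not part of the skill: `ReviewScope`, `resolve_scope_source`, and `format_scope` are hypothetical names, and the return values `"arguments"` and `"recent-history"` are labels invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewScope:
    """Hypothetical container for the Phase 1 scope summary."""
    original_request: str
    files_changed: list[str] = field(default_factory=list)
    approach: str = ""


def resolve_scope_source(arguments: list[str]) -> str:
    """Pick where the scope comes from, mirroring step 1 above."""
    if arguments:
        # Explicit files, commits, or conversation context were supplied.
        return "arguments"
    # Otherwise fall back to recent conversation history and file changes.
    return "recent-history"


def format_scope(scope: ReviewScope) -> str:
    """Render the confirmation block in the same layout as the template."""
    files = ", ".join(scope.files_changed) or "[none identified]"
    return "\n".join([
        "📋 Review Scope:",
        f"- Original request: {scope.original_request}",
        f"- Files changed: {files}",
        f"- Approach taken: {scope.approach}",
        "",
        "Proceeding with multi-agent review...",
    ])


if __name__ == "__main__":
    scope = ReviewScope("Add login", ["src/auth.py"], "JWT middleware")
    print(resolve_scope_source(["src/auth.py"]))
    print(format_scope(scope))
```

The sketch leaves out the third branch (asking the user when the scope is unclear), since that is a conversational step, not a computation.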

### Phase 2: Independent Judge Reviews (Parallel)

Use the Task tool to spawn three specialized judge agents in parallel. Each judge operates independently without seeing others' reviews.
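
Outside an agent runtime, the fan-out described above might look roughly like the following. The skill itself relies on the agent's Task tool, not on this code; `run_judges_in_parallel` and the `run_judge` callable (one prompt in, one review out) are hypothetical names used only to illustrate the independence requirement.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_judges_in_parallel(
    prompts: dict[str, str],
    run_judge: Callable[[str], str],  # hypothetical: runs one judge on one prompt
) -> dict[str, str]:
    """Run every judge prompt concurrently; each judge sees only its own prompt."""
    with ThreadPoolExecutor(max_workers=max(1, len(prompts))) as pool:
        # Submit all judges before reading any result, so no judge's output
        # can be fed into another judge's input.
        futures = {name: pool.submit(run_judge, prompt) for name, prompt in prompts.items()}
        return {name: future.result() for name, future in futures.items()}
```

Each judge receives only the shared context embedded in its own prompt, so the reviews stay independent until a later debate step brings them together.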

#### Judge 1: Requirements Validator

**Prompt for Agent:**

```
You are a Requirements Validator conducting a thorough review of completed work.

## Your Task

Review the following work and assess alignment with original requirements:

[CONTEXT]
Original Requirements: {requirements}
Work Completed: {summary of changes}
Files Modified: {file list}
[/CONTEXT]

## Your Process (Chain-of-Verification)

1. **Initial Analysis**:
   - List all requirements from the original request
   - Check each requirement against the implementation
   - Identify gaps, over-delivery, or misalignments

2. **Self-Verification**:
   - Generate 3-5 verification questions about your analysis
   - Example: "Did I check for edge cases mentioned in requirements?"
   - Answer each question honestly
   - Refine your analysis based on answers

3. **Final Critique**:
   Provide structured output:

   ### Requirements Alignment Score: X/10

   ### Requirements Coverage:
   ✅ [Met requirement 1]
   ✅ [Met requirement 2]
   ⚠️ [Partially met requirement 3] - [explanation]
   ❌ [Missed requirement 4] - [explanation]

   ### Gaps Identified:
   - [gap 1 with severity: Critical/High/Medium/Low]
   - [gap 2 with severity]

   ### Over-Delivery/Scope Creep:
   - [item 1] - [is this good or problematic?]

   ### Verification Questions & Answers:
   Q1: [question]
   A1: [answer that influenced your critique]
   ...

Be specific, objective, and cite examples from the code.
```
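
Because the judge is asked for a fixed output layout, a coordinator could in principle read the score and coverage markers back out of the reply. The parser below is a hedged sketch of that idea, assuming the judge follows the template above closely. `parse_requirements_critique` is a hypothetical helper, not something the skill ships.

```python
import re

# Markers used in the "Requirements Coverage" section of the template.
MET, PARTIAL, MISSED = "\u2705", "\u26a0", "\u274c"  # ✅, ⚠ (may carry U+FE0F), ❌


def _items(lines: list[str], marker: str) -> list[str]:
    """Return the text of every line that starts with the given marker."""
    return [
        line[len(marker):].lstrip("\ufe0f ").strip()
        for line in lines
        if line.startswith(marker)
    ]


def parse_requirements_critique(text: str) -> dict:
    """Extract the alignment score and coverage lists from a judge's reply."""
    match = re.search(
        r"Requirements Alignment Score:\s*(\d+(?:\.\d+)?)\s*/\s*10", text
    )
    lines = [line.strip() for line in text.splitlines()]
    return {
        "score": float(match.group(1)) if match else None,
        "met": _items(lines, MET),
        "partial": _items(lines, PARTIAL),
        "missed": _items(lines, MISSED),
    }
```

Returning `None` for a missing score, instead of raising, lets the coordinator notice a malformed reply and decide for itself whether to ask the judge again.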

#### Judge 2: Solution Architect

**Prompt for Agent:**

```
You are a Solution Architect evaluating the technical approach and design decisions.

## Your Task

Review the implementation approach and assess if it's optimal:

[CONTEXT]
Problem to Solve: {problem description}
Solution Implemented: {summary of approach}
