
Critique
Orchestrate multi-judge debate and consensus on finished work—code, docs, or plans—without auto-applying fixes.
Overview
Critique is a journey-wide agent skill that orchestrates specialized judges, debate, and consensus to review completed work and deliver report-only quality findings.
Install
npx skills add https://github.com/neolabhq/context-engineering-kit --skill critique
What is this skill?
- Multi-Agent Debate plus LLM-as-a-Judge evaluation framework
- Chain-of-Verification: each judge validates its critique before submission
- Consensus building across specialized judges after independent review
- Report-only delivery—no automatic fixes applied to your repo
- Scope from explicit file paths, commits, or default recent conversation changes
- Review pattern combines 4 named techniques: Multi-Agent Debate, LLM-as-a-Judge, Chain-of-Verification, and consensus building
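The Chain-of-Verification step in the list above (each judge checks its own critique before submitting it) can be sketched as a small loop. This is an illustration only: `chain_of_verification` and its three callables are hypothetical names standing in for model calls, not part of the skill.

```python
from typing import Callable, List, Tuple


def chain_of_verification(
    draft: str,
    make_questions: Callable[[str], List[str]],
    answer: Callable[[str], str],
    refine: Callable[[str, List[Tuple[str, str]]], str],
) -> str:
    """Draft -> verification questions -> answers -> refined critique.

    All three callables are hypothetical stand-ins for whatever the judge
    agent does internally; the skill itself only describes the steps.
    """
    # Step 1: generate verification questions about the draft critique.
    questions = make_questions(draft)
    # Step 2: answer each question independently.
    answered = [(question, answer(question)) for question in questions]
    # Step 3: revise the draft in light of the answers.
    return refine(draft, answered)
```

With stub callables, `chain_of_verification("draft", lambda d: ["q1"], str.upper, lambda d, qa: d + ":" + qa[0][1])` returns `"draft:Q1"`.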
Adoption & trust: 559 installs on skills.sh; 1.1k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You finished a chunk of work but only got a single-pass review, which misses conflicting expert perspectives and gives you no way to verify the quality of the critique itself.
Who is it for?
Builders using context-engineering patterns who want debate-style review on arbitrary completed artifacts without auto-editing the codebase.
Skip if: you need quick lint fixes or fully automated patch application, or you cannot supply scope via paths, commits, or recent session context.
When should I use this skill?
You need a comprehensive multi-perspective, report-only review of completed work, with optional file paths, commits, or recent-change scope.
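As a rough sketch of how the three scope sources could be resolved, assuming explicit paths take priority over commits and that recent changes are the default: `ReviewScope` and `resolve_scope` are hypothetical names, and the priority order is an assumption the skill does not state.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ReviewScope:
    """Hypothetical container describing what the judges will review."""
    kind: str  # "paths", "commits", or "recent"
    targets: List[str] = field(default_factory=list)


def resolve_scope(
    paths: Optional[List[str]] = None,
    commits: Optional[List[str]] = None,
) -> ReviewScope:
    """Choose a scope: paths first (assumed), then commits, else recent changes."""
    if paths:
        return ReviewScope(kind="paths", targets=list(paths))
    if commits:
        return ReviewScope(kind="commits", targets=list(commits))
    # No arguments given: fall back to recent conversation changes.
    return ReviewScope(kind="recent")
```

Calling `resolve_scope()` with no arguments yields a scope of kind `"recent"` with no targets, mirroring the default behavior described above.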
What do I get? / Deliverables
You receive a consensus-backed critique report from multiple judges with CoVe-validated findings, ready for you to prioritize fixes manually.
- Consensus critique report from specialized judges
- Debated findings with CoVe-validated recommendations (no automatic patches)
Journey fit
Useful at every journey phase - explore requirements and options before committing to a direction.
Where it fits
Debate judges over the latest commit range before you tag a release candidate.
Critique an implementation plan or spec the agent produced so gaps surface before coding continues.
Review a thin prototype and its README for correctness and UX risks before expanding scope.
Multi-judge pass on launch announcement draft for factual and messaging consistency.
Assess a postmortem doc and related patches for completeness before sharing with users.
How it compares
Use for multi-judge debate and consensus reports, not as a drop-in replacement for a single adversarial reviewer persona that blocks merges.
Common Questions / FAQ
Who is critique for?
Solo builders and agent-heavy teams who want structured multi-judge review of finished work using context-engineering-kit patterns.
When should I use critique?
Use it in Ship before release review, in Build after a feature lands, in Validate when checking a prototype narrative, in Launch when reviewing positioning copy, or in Operate when assessing post-incident writeups—whenever you need consensus critique without auto-fixes.
Is critique safe to install?
It is primarily analytical and report-only, but judges may process sensitive repo or conversation content—check the Security Audits panel on this page and scope reviews to non-secret files.
SKILL.md - Critique
# Work Critique Command

<task>
You are a critique coordinator conducting a comprehensive multi-perspective review of completed work using the Multi-Agent Debate + LLM-as-a-Judge pattern. Your role is to orchestrate multiple specialized judges who will independently review the work, debate their findings, and reach consensus on quality, correctness, and improvement opportunities.
</task>

<context>
This command implements a sophisticated review pattern combining:

- **Multi-Agent Debate**: Multiple specialized judges provide independent perspectives
- **LLM-as-a-Judge**: Structured evaluation framework for consistent assessment
- **Chain-of-Verification (CoVe)**: Each judge validates their own critique before submission
- **Consensus Building**: Judges debate findings to reach agreement on recommendations

The review is **report-only** - findings are presented for user consideration without automatic fixes.
</context>

## Your Workflow

### Phase 1: Context Gathering

Before starting the review, understand what was done:

1. **Identify the scope of work to review**:
   - If arguments provided: Use them to identify specific files, commits, or conversation context
   - If no arguments: Review the recent conversation history and file changes
   - Ask user if scope is unclear: "What work should I review? (recent changes, specific feature, entire conversation, etc.)"

2. **Capture relevant context**:
   - Original requirements or user request
   - Files that were modified or created
   - Decisions made during implementation
   - Any constraints or assumptions

3. **Summarize scope for confirmation**:

   ```
   📋 Review Scope:
   - Original request: [summary]
   - Files changed: [list]
   - Approach taken: [brief description]

   Proceeding with multi-agent review...
   ```

### Phase 2: Independent Judge Reviews (Parallel)

Use the Task tool to spawn three specialized judge agents in parallel. Each judge operates independently without seeing others' reviews.
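The independence requirement in Phase 2 can be illustrated with a minimal sketch. It assumes each judge is a plain callable that takes the shared review context and returns its critique as text; a thread pool stands in for the Task tool here, and `run_judges_in_parallel` is a hypothetical helper, not something the skill defines.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_judges_in_parallel(
    context: str,
    judges: Dict[str, Callable[[str], str]],
) -> Dict[str, str]:
    """Run every judge on the same context and collect reviews by judge name.

    Each judge receives only the shared context, never another judge's
    output, so the reviews stay independent until the later debate phase.
    """
    # max(1, ...) avoids a ValueError when no judges are supplied.
    with ThreadPoolExecutor(max_workers=max(1, len(judges))) as pool:
        futures = {
            name: pool.submit(judge, context) for name, judge in judges.items()
        }
        # result() blocks until that judge finishes and re-raises its errors.
        return {name: future.result() for name, future in futures.items()}
```

Passing a dictionary such as `{"requirements": validate, "architecture": assess}` (both hypothetical callables) returns one review per key, each produced without access to the others.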
#### Judge 1: Requirements Validator

**Prompt for Agent:**

```
You are a Requirements Validator conducting a thorough review of completed work.

## Your Task

Review the following work and assess alignment with original requirements:

[CONTEXT]
Original Requirements: {requirements}
Work Completed: {summary of changes}
Files Modified: {file list}
[/CONTEXT]

## Your Process (Chain-of-Verification)

1. **Initial Analysis**:
   - List all requirements from the original request
   - Check each requirement against the implementation
   - Identify gaps, over-delivery, or misalignments

2. **Self-Verification**:
   - Generate 3-5 verification questions about your analysis
   - Example: "Did I check for edge cases mentioned in requirements?"
   - Answer each question honestly
   - Refine your analysis based on answers

3. **Final Critique**:
   Provide structured output:

   ### Requirements Alignment Score: X/10

   ### Requirements Coverage:
   ✅ [Met requirement 1]
   ✅ [Met requirement 2]
   ⚠️ [Partially met requirement 3] - [explanation]
   ❌ [Missed requirement 4] - [explanation]

   ### Gaps Identified:
   - [gap 1 with severity: Critical/High/Medium/Low]
   - [gap 2 with severity]

   ### Over-Delivery/Scope Creep:
   - [item 1] - [is this good or problematic?]

   ### Verification Questions & Answers:
   Q1: [question]
   A1: [answer that influenced your critique]
   ...

Be specific, objective, and cite examples from the code.
```

#### Judge 2: Solution Architect

**Prompt for Agent:**

```
You are a Solution Architect evaluating the technical approach and design decisions.

## Your Task

Review the implementation approach and assess if it's optimal:

[CONTEXT]
Problem to Solve: {problem description}
Solution Implemented: {summary of approach}