Do Competitively

Name: Do Competitively
Author: neolabhq

neolabhq/context-engineering-kit

Run competitive parallel agent generations with meta-judge rubrics and synthesis when one-shot coding answers are not good enough for high-stakes work.

Overview

do-competitively is an agent skill, most often used in Build (also in Ship review and Validate scope), that runs competitive multi-agent generation, meta-judge evaluation, and synthesis to produce higher-quality outputs.

Install

npx skills add https://github.com/neolabhq/context-engineering-kit --skill do-competitively

What is this skill?

Generate-Critique-Synthesize (GCS) with adaptive polish, synthesize, or redesign strategies
Meta-judge builds tailored rubrics before multi-judge evaluation
Constitutional AI self-critique in generation and Chain-of-Verification in evaluation
Orchestrator must not read sub-agent context files or reports to avoid context bloat
Claims ~15–20% average cost savings via adaptive strategy selection

Compatible agents: Claude Code, Cursor, Codex, Windsurf

Adoption & trust: 524 installs on skills.sh; 1.1k GitHub stars; 2/3 security scanners passed (skills.sh audits).

What problem does it solve?

One agent draft is not trustworthy enough for a high-stakes task and you have no structured way to compare parallel solutions.

Who is it for?

High-stakes specs, designs, or implementations where you can describe task criteria and optional output paths upfront.

Skip if: Quick typo fixes, trivial scripts, or solo work where reading every sub-agent artifact in the main thread is acceptable.

When should I use this skill?

High-stakes task where quality matters more than speed; provide task description and optional output path or criteria.

What do I get? / Deliverables

You get a synthesized or polished result backed by multi-judge evidence, with strategy chosen to save cost versus naive re-runs.

Synthesized or polished deliverable from competitive runs
Evidence-backed evaluation outcome

Journey fit

Spans multiple journey phases - primary shelf plus alternate fits below.

Primary fit

Build: Agent skills & templates

Primary shelf is build agent-tooling because the command orchestrates multi-agent generation and synthesis for implementation-quality outputs. Matches agent-tooling as an orchestration pattern (GCS) rather than a single integration or test script.

Also useful

Ship: Code review

Validate: Scope & plan

Where it fits

Example uses

Validate: Scope & plan

Run parallel architecture sketches and synthesize the winner before you commit to a stack.

Build: Backend, data & payments

Compete two API designs, then polish or merge based on judge scores.

Ship: Code review

Multi-judge a sensitive refactor when reviewers disagree on approach.

Launch: Distribution & launch channels

Synthesize competing launch copy variants against a tailored rubric.

How it compares

Orchestration workflow for multi-agent quality, not a single checker skill or a plain MCP tool call.

Common Questions / FAQ

Who is do-competitively for?

Indie builders and small teams using agent stacks who need repeatable, judge-backed quality on important tasks without manually merging three chat transcripts.

When should I use do-competitively?

In validate when scoping architecture, in build for competitive implementations, and in ship review when split judges need synthesis—not for speed-first chores.

Is do-competitively safe to install?

It spawns sub-agents and may touch files per your task; check the Security Audits panel on this Prism page and scope permissions before use.

SKILL.md

# do-competitively

<task>
Execute tasks through competitive multi-agent generation, meta-judge evaluation specification, multi-judge evaluation, and evidence-based synthesis to produce superior results by combining the best elements from parallel implementations.
</task>

<context>
This command implements the Generate-Critique-Synthesize (GCS) pattern with adaptive strategy selection for high-stakes tasks where quality matters more than speed. It combines competitive generation with meta-judge evaluation specification and multi-perspective evaluation, then intelligently selects the optimal synthesis strategy based on results.

**Key features:**

- Self-critique loops in generation (Constitutional AI)
- Structured evaluation - Meta-judge produces tailored rubrics before judging
- Verification loops in evaluation (Chain-of-Verification)
- Adaptive strategy: polish clear winners, synthesize split decisions, redesign failures
- Average 15-20% cost savings through intelligent strategy selection
</context>

CRITICAL: You are not an implementation agent or a judge. You shouldn't read files that are provided as context for a sub-agent or task. You shouldn't read reports, and you shouldn't overwhelm your context with unnecessary information. You MUST follow the process step by step. Any deviations will be considered a failure and you will be killed!

## Pattern: Generate-Critique-Synthesize (GCS)

This command implements a multi-phase adaptive competitive orchestration pattern:

```
Phase 1: Competitive Generation with Self-Critique + Meta-Judge (IN PARALLEL)
         ┌─ Meta-Judge → Evaluation Specification YAML ───────────┐
Task ────┼─ Agent 1 → Draft → Critique → Revise → Solution A ───┐ │
         ├─ Agent 2 → Draft → Critique → Revise → Solution B ───┼─┤
         └─ Agent 3 → Draft → Critique → Revise → Solution C ───┘ │
                                                                  │
Phase 2: Multi-Judge Evaluation with Verification                 │
         ┌─ Judge 1 → Evaluate → Verify → Revise → Report A ─┐    │
         ├─ Judge 2 → Evaluate → Verify → Revise → Report B ─┼────┤
         └─ Judge 3 → Evaluate → Verify → Revise → Report C ─┘    │
                                                                  │
Phase 2.5: Adaptive Strategy Selection                            │
         Analyze Consensus ───────────────────────────────────────┤
                ├─ Clear Winner? → SELECT_AND_POLISH              │
                ├─ All Flawed (<3.0)? → REDESIGN (return Phase 1) │
                └─ Split Decision? → FULL_SYNTHESIS               │
                                          │                       │
Phase 3: Evidence-Based Synthesis         │                       │
         (Only if FULL_SYNTHESIS)         │                       │
         Synthesizer ─────────────────────┴───────────────────────┴─→ Final Solution
```
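The Phase 2.5 branch in the diagram can be sketched as a small decision function. This is only an illustration: the 3.0 threshold comes from the diagram, but the skill text shown here does not define what counts as a "clear winner" or the order in which the checks run. The function name `select_strategy`, the score layout, the unanimity rule, and checking the redesign condition first are all assumptions.

```python
from statistics import mean

REDESIGN_THRESHOLD = 3.0  # from the diagram: "All Flawed (<3.0)"


def select_strategy(scores: dict[str, list[float]]) -> str:
    """Pick a strategy from per-solution judge scores (hypothetical helper).

    `scores` maps a solution label to one score per judge, in judge order.
    """
    averages = {name: mean(vals) for name, vals in scores.items()}

    # ASSUMPTION: "all flawed" is checked first and uses the average score.
    if all(avg < REDESIGN_THRESHOLD for avg in averages.values()):
        return "REDESIGN"

    # ASSUMPTION: a "clear winner" is a solution that every judge scored
    # strictly highest. The source does not state the actual criterion.
    n_judges = len(next(iter(scores.values())))
    favourites = set()
    for j in range(n_judges):
        column = {name: vals[j] for name, vals in scores.items()}
        best = max(column.values())
        leaders = [name for name, value in column.items() if value == best]
        # A tie at the top means this judge has no single favourite.
        favourites.add(leaders[0] if len(leaders) == 1 else None)

    if len(favourites) == 1 and None not in favourites:
        return "SELECT_AND_POLISH"
    return "FULL_SYNTHESIS"
```

With three judges who all rank Solution A highest, this sketch returns `SELECT_AND_POLISH`; if the judges each favour a different solution, it falls through to `FULL_SYNTHESIS`.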

## Process

### Setup: Create Reports Directory

Before starting, ensure the reports directory exists:

```bash
mkdir -p .specs/reports
```

**Report naming convention:** `.specs/reports/{solution-name}-{YYYY-MM-DD}.[1|2|3].md`

Where:

- `{solution-name}` - Derived from output path (e.g., `users-api` from output `specs/api/users.md`)
- `{YYYY-MM-DD}` - Current date
- `[1|2|3]` - Judge number

**Note:** Solutions remain in their specified output locations; only evaluation reports go to `.specs/reports/`
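As a rough illustration of the naming convention above, the three judge report paths could be built like this. The helper name `report_paths` is hypothetical, and the solution name is passed in rather than derived, because the source gives only one example of the derivation (`users-api` from `specs/api/users.md`) and not a general rule.

```python
from datetime import date
from pathlib import Path

REPORTS_DIR = Path(".specs/reports")


def report_paths(solution_name: str, day: date, judges: int = 3) -> list[Path]:
    """Return one report path per judge (hypothetical helper)."""
    stamp = day.isoformat()  # ISO format is YYYY-MM-DD
    return [
        REPORTS_DIR / f"{solution_name}-{stamp}.{n}.md"
        for n in range(1, judges + 1)
    ]
```

For example, `report_paths("users-api", date.today())` yields three paths ending in `.1.md`, `.2.md`, and `.3.md` under `.specs/reports/`.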

### Phase 1: Competitive Generation + Meta-Judge (IN PARALLEL)

Launch **3 independent generator agents AND 1 meta-judge agent in parallel** (4 agents total; Opus is recommended for all of them, for quality).

The meta-judge runs in parallel with the 3 generators because it does not need their output — it only needs the task description to generate evaluation criteria.
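The parallel launch can be pictured with a minimal `asyncio` sketch. The coroutines `run_generator` and `run_meta_judge` are hypothetical stand-ins for real sub-agent calls, which the skill dispatches through the host agent rather than through Python. The point is only that the meta-judge takes the task description alone, so nothing forces it to wait for the generators.

```python
import asyncio


async def run_generator(label: str, task: str) -> str:
    # Placeholder for a real sub-agent call (draft, critique, revise).
    await asyncio.sleep(0)
    return f"solution-{label} for: {task}"


async def run_meta_judge(task: str) -> str:
    # Placeholder: takes only the task description, never the solutions.
    await asyncio.sleep(0)
    return f"evaluation-spec for: {task}"


async def phase_one(task: str) -> tuple[list[str], str]:
    # gather() returns results in the order the awaitables were passed,
    # so the last item is always the meta-judge's evaluation spec.
    *solutions, spec = await asyncio.gather(
        run_generator("A", task),
        run_generator("B", task),
        run_generator("C", task),
        run_meta_judge(task),
    )
    return solutions, spec
```

Calling `asyncio.run(phase_one("design the users API"))` returns the three solutions plus the evaluation specification once all four agents have finished.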
