Karpathy Guardrails

Name: Karpathy Guardrails
Author: juliusbrussee

Repository: juliusbrussee/cavekit

Load four behavioral rules so Cavekit agents think before coding, avoid scope creep, and make surgical, goal-driven edits on every task.

Overview

Karpathy Guardrails is an agent skill that applies across the whole journey and keeps Cavekit agents disciplined. Use it whenever a solo builder needs to stop over-engineering and unverified assumptions before code is committed.

Install

npx skills add https://github.com/juliusbrussee/cavekit --skill karpathy-guardrails

What is this skill?

Four principles: think before coding, simplicity first, surgical changes, goal-driven execution
Reviewer enforces guardrails as Pass-1 filter before code-quality review
Think-before-coding checklist: one-sentence goal, assumption list, verifiable success mapping
Refusing to produce code is allowed when scope is unknown—flag NEEDS_CONTEXT instead of guessing
Integrates with revision skill for sharpening vague acceptance criteria (automated-trace subsection)

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 4 installs on skills.sh; 1k GitHub stars; 3/3 security scanners passed (skills.sh audits).

What problem does it solve?

Your coding agent adds speculative features, guesses requirements, and expands scope because nothing forces verifiable acceptance criteria upfront.

Who is it for?

Indie developers running Cavekit multi-agent loops who want consistent anti-scope-creep behavior across planning, implementation, and review.

Skip if: You work from fully approved specs with frozen scope and only need a single-shot formatter with no behavioral gate. The skill is still useful for review, but lighter-touch prompts may suffice.

When should I use this skill?

Trigger phrases: guardrails, karpathy, scope creep, over-engineering, stop adding features, surgical fix. Load the skill at the start of every Cavekit task.

What do I get? / Deliverables

Every task starts with stated goals, flagged assumptions, minimum necessary changes, and reviewer Pass-1 enforcement—with revision skill handoff when criteria need sharpening.

Documented goal sentence, assumption list, verifiable success mapping
Pass-1 compliant change set or explicit NEEDS_CONTEXT flags

Journey fit

Useful at every journey phase - explore requirements and options before committing to a direction.

Where it fits

Validate: Scope & plan
Example use: Planner maps each acceptance criterion to a verifiable test before approving a prototype milestone.

Build: Project management & tracking
Example use: Task-builder lists load-bearing assumptions and refuses to code until NEEDS_CONTEXT questions are answered.

Build: UI/UX & frontend
Example use: Agent limits a UI fix to the reported bug with no new component library “just in case.”

Ship: Code review
Example use: Reviewer Pass-1 rejects a PR that introduces speculative abstraction layers outside acceptance criteria.

Operate: Iteration & experiments
Example use: Hotfix task stays surgical: one observable behavior change with a mapped regression check.

How it compares

Behavioral contract for agents, not a linter—pairs with the revision skill instead of ad-hoc “keep it simple” chat reminders.

Common Questions / FAQ

Who is karpathy-guardrails for?

Solo and small-team builders using Cavekit task-builder, reviewer, planner, and inspector agents who want Karpathy-style discipline on every task.

When should I use karpathy-guardrails?

At the start of any Cavekit task: in Validate when scoping acceptance criteria, in Build before implementation, in Ship during reviewer Pass-1, and in Operate when iterating on fixes. It applies especially when you say guardrails, scope creep, over-engineering, or surgical fix.

Is karpathy-guardrails safe to install?

It is policy text with no network or shell requirements by itself; confirm combined Cavekit skills follow your policies and review the Security Audits panel on this page before enabling the full pipeline.

Workflow Chain

Then invoke: revision

SKILL.md

# Karpathy Guardrails

Four rules. Load them into context at the start of every task. The reviewer
enforces them as a Pass-1 filter before it looks at code quality.

## 1. Think Before Coding

Before the first edit, write down:

- **What am I actually building?** One sentence. If you cannot state it, stop.
- **What am I assuming?** List every assumption. If any is load-bearing and
  unverified, flag `NEEDS_CONTEXT` and ask — do not guess.
- **What does success look like?** Map each acceptance criterion to a concrete
  test, check, or observable behaviour. If a criterion is not verifiable,
  propose a sharpening via the `revision` skill (automated-trace subsection), not a vague attempt.

Refusing to produce code is allowed. A task with unknown scope is a spec bug,
not a coding task.
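
The three questions above can be captured as a small record that is checked before the first edit. This is a minimal sketch, not part of Cavekit: the `Assumption` and `Preflight` types and the `SHARPEN_VIA_REVISION` label are hypothetical names chosen for illustration. Only the `NEEDS_CONTEXT` flag comes from the skill itself.

```python
from dataclasses import dataclass, field


@dataclass
class Assumption:
    text: str
    load_bearing: bool  # would the approach change if this turned out false?
    verified: bool      # confirmed against the spec, the code, or a person


@dataclass
class Preflight:
    goal: str  # what am I actually building, in one sentence
    assumptions: list[Assumption] = field(default_factory=list)
    # acceptance criterion -> concrete test, check, or observable behaviour
    success: dict[str, str] = field(default_factory=dict)

    def blockers(self) -> list[str]:
        """Return reasons to stop. An empty list means the first edit may proceed."""
        found = []
        if not self.goal.strip():
            found.append("NEEDS_CONTEXT: goal is not stated")
        for a in self.assumptions:
            if a.load_bearing and not a.verified:
                found.append(f"NEEDS_CONTEXT: {a.text}")
        for criterion, check in self.success.items():
            if not check.strip():
                # Hypothetical label: the skill says to propose a sharpening
                # via the `revision` skill but does not name a flag for it.
                found.append(f"SHARPEN_VIA_REVISION: {criterion}")
        return found
```

A non-empty result from `blockers()` is the signal to ask or to hand the criterion to `revision` instead of writing code.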

## 2. Simplicity First

The correct amount of code is the minimum that meets the acceptance criteria.

- No speculative features. No abstraction layer "in case we need it."
- No new dependencies unless the task requires one and no existing dep fits.
- No "while I'm in here" refactors. Surface them as separate kits.
- Duplication is not always wrong. Three similar lines usually beat a premature
  abstraction with two configuration knobs.

If the diff is larger than the acceptance criteria seem to demand, explain why
in the commit body. If you cannot, trim the diff.
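
To illustrate the duplication bullet, here is a hedged sketch with invented names (`contact_line`, `clean_field`, and the `lower` and `default` knobs are all hypothetical). Both versions produce the same string, but the second adds a helper and two options that no acceptance criterion requested.

```python
# Direct version: three similar lines, nothing to configure.
def contact_line(user: dict) -> str:
    name = user["name"].strip()
    email = user["email"].strip()
    city = user["city"].strip()
    return f"{name} <{email}>, {city}"


# Premature abstraction: a helper with two knobs (`lower`, `default`)
# that nothing in the task asked for. Same output, more surface to review.
def clean_field(user: dict, key: str, lower: bool = False, default: str = "") -> str:
    value = str(user.get(key, default)).strip()
    return value.lower() if lower else value


def contact_line_abstracted(user: dict) -> str:
    name, email, city = (clean_field(user, k) for k in ("name", "email", "city"))
    return f"{name} <{email}>, {city}"
```

Under this rule the first version is the one to ship; the helper becomes worth its cost only once a real requirement needs the knobs.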

## 3. Surgical Changes

Every line in the diff must trace back to an acceptance criterion. Touching
code outside the task's owned files is justified only when a requirement forces
it. Examples of violations:

- Fixing a formatter warning in an unrelated file.
- Renaming a helper "to match new convention."
- Reordering imports, docstrings, whitespace.
- Tightening a type signature the task did not ask about.

If you see a real bug in adjacent code, log it to `.cavekit/history/backprop-log.md`
as a candidate kit item and keep it out of this task's diff.
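
One way to check the owned-files rule mechanically is to compare the changed paths against the task's owned paths. The sketch below assumes both are available as plain lists of repository-relative POSIX paths; `out_of_scope` and `backprop_note` are hypothetical helpers, and the log line format is invented, since the skill does not specify one.

```python
from pathlib import PurePosixPath


def out_of_scope(changed: list[str], owned: list[str]) -> list[str]:
    """Return changed paths that are not an owned file or inside an owned directory."""
    def is_owned(path: str) -> bool:
        return any(PurePosixPath(path).is_relative_to(o) for o in owned)

    return [path for path in changed if not is_owned(path)]


def backprop_note(path: str, summary: str) -> str:
    """Format one candidate kit item for the backprop log (hypothetical format)."""
    return f"- [ ] {path}: {summary}"
```

Any path this returns either needs a requirement that forces the edit or should leave the diff, with a note appended to `.cavekit/history/backprop-log.md` if it points at a real bug.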

## 4. Goal-Driven Execution

Transform vague tasks into verifiable success criteria before execution.

- A task that cannot be verified is not a task — escalate it.
- The verification plan must be concrete: exact commands, exact assertions,
  exact files to inspect. "Make sure it works" is not a plan.
- After implementation, run the verification plan. Report the output.
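
A verification plan with exact commands and exact assertions can be expressed as data and then executed. The following is a minimal sketch under assumed names (`Step` and `run_plan` are hypothetical, not a Cavekit API): each step pairs an acceptance criterion with a command and a substring that must appear in its standard output.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Step:
    criterion: str      # the acceptance criterion this step verifies
    command: list[str]  # exact command to run
    expect: str         # exact substring that must appear in stdout


def run_plan(steps: list[Step]) -> list[dict]:
    """Run every step and report the observed output next to the verdict."""
    report = []
    for step in steps:
        proc = subprocess.run(step.command, capture_output=True, text=True)
        report.append({
            "criterion": step.criterion,
            "command": " ".join(step.command),
            "output": proc.stdout.strip(),
            "passed": proc.returncode == 0 and step.expect in proc.stdout,
        })
    return report


# Usage: a toy plan that runs the current Python interpreter.
plan = [Step("AC1: adding 2 and 2 yields 4", [sys.executable, "-c", "print(2 + 2)"], "4")]
report = run_plan(plan)
```

Keeping the raw output in the report, not just a pass or fail flag, is what lets a reader confirm the plan was actually run.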

## Role-specific enforcement

- **task-builder** — must produce, alongside code, a Verification Report listing
  each AC, the verification step, and the observed result.
- **reviewer** — must refuse to advance to Pass 2 (code quality) if Pass 1 finds
  any of: undeclared assumptions, diff lines unjustified by an AC, out-of-scope
  edits, or unreachable verification steps.
- **planner** — must reject kits that contain un-testable ACs. They are spec
  bugs and block planning.
- **inspector** — must flag completed tasks whose verification logs are missing
  or hand-waved.
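
The reviewer's gate can be sketched as a function over the four finding categories listed above. This is an illustrative assumption about how such a gate might be modelled, not the reviewer agent's actual implementation; `Pass1Input` and `pass_1` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Pass1Input:
    undeclared_assumptions: list[str] = field(default_factory=list)
    unjustified_diff_lines: list[str] = field(default_factory=list)
    out_of_scope_edits: list[str] = field(default_factory=list)
    unreachable_verification_steps: list[str] = field(default_factory=list)


def pass_1(findings: Pass1Input) -> tuple[bool, list[str]]:
    """Return (may_advance_to_pass_2, reasons). Any single finding blocks Pass 2."""
    reasons = []
    for category, items in vars(findings).items():
        reasons.extend(f"{category}: {item}" for item in items)
    return (not reasons, reasons)
```

The gate is deliberately all-or-nothing: code quality is not discussed until the list of reasons is empty.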

## When you are tempted to break a rule

You are probably over-confident about a shortcut that will cost more than the
delay of asking. Stop and note the tension in the commit body or in the build
log so the reviewer can judge.
