
Backprop
Turn a failing test or reported bug into a SPEC.md invariant and test so the same class of failure cannot recur under spec-driven development.
Overview
Backprop is an agent skill, most often used in the Ship phase (also Operate and Build), that turns bugs and test failures into SPEC.md invariants so recurrence is caught by the spec, not by memory.
Install
npx skills add https://github.com/juliusbrussee/cavekit --skill backprop
What is this skill?
- Six-step protocol: TRACE, ANALYZE, PROPOSE, GENERATE TEST, VERIFY, LOG
- Adds §B backprop rows and §V invariants—plan-then-execute fixes code only; SDD fixes the spec too
- Forces a failing test before accepting a new invariant
- Caveman one-line root cause and pipe-table §B/§V templates
- Triggers on user bug reports and post-mortems—not only local test runs
- Template pairs §B row with numbered §V invariant line
Adoption & trust: 1.6k installs on skills.sh; 999 GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You fixed the code after a test failed but nothing in SPEC.md would stop the same bug from shipping again on the next change.
Who is it for?
Teams and solo builders using Cavekit SPEC.md who want every production bug or red test to tighten the living spec.
Skip if: Quick hotfix workflows with no SPEC.md, or one-off experiments where you are not maintaining §V invariants.
When should I use this skill?
Test failed at /build verification, user reports bug, post-mortem after production incident, or /check flags VIOLATE with root cause found.
What do I get? / Deliverables
Root cause is recorded in §B, a testable §V invariant is added with a failing test first, and the spec mutator applies the edit before you re-run build or check.
- Draft §B backprop row and §V invariant text
- New failing test citing the invariant
- Applied spec change after user or spec skill confirmation
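As a rough illustration of the first deliverable, the sketch below drafts a §B row and a §V line in the skill's pipe-separated template (`B<next>|<date>|<root cause>|V<N>`). It assumes that existing entries sit at the start of a line as `B<n>|...` and `V<n>: ...`; the helper names `next_id` and `draft_backprop` are hypothetical and not part of Cavekit.

```python
import re


def next_id(spec_text: str, prefix: str) -> int:
    """Return 1 + the highest <prefix><n> id found at the start of a line (1 if none)."""
    ids = [int(n) for n in re.findall(rf"^{prefix}(\d+)\b", spec_text, flags=re.MULTILINE)]
    return max(ids, default=0) + 1


def draft_backprop(spec_text: str, date: str, root_cause: str, rule: str) -> tuple[str, str]:
    """Draft the §B row and the §V line for a new invariant (hypothetical helper)."""
    b = next_id(spec_text, "B")
    v = next_id(spec_text, "V")
    b_row = f"B{b}|{date}|{root_cause}|V{v}"  # the §B row cites the new invariant
    v_line = f"V{v}: {rule}"
    return b_row, v_line


# Assumed SPEC.md fragment: two existing invariants and two existing §B rows.
spec = "§V\nV5: a\nV6: b\n§B\nB1|2026-01-10|x|V5\nB2|2026-02-02|y|V6\n"
b_row, v_line = draft_backprop(
    spec,
    "2026-04-20",
    "refund job ran twice on retry",
    "∀ refund → idempotency key check before charge reversal",
)
print(b_row)   # B3|2026-04-20|refund job ran twice on retry|V7
print(v_line)  # V7: ∀ refund → idempotency key check before charge reversal
```

The draft is only text at this point; applying it to SPEC.md still goes through user or spec skill confirmation, as listed above.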
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Failures surface most often during Ship verification, but the canonical shelf is testing because backprop closes the loop from red tests to spec updates. The skill is triggered by `/build` verification failures, `/check` VIOLATE results, and explicit test failures, all core test-and-verify moments.
Where it fits
A refund integration test fails at /build verification and you trace idempotency before adding §V7.
After a production duplicate-charge incident you run a post-mortem and backprop a new invariant into SPEC.md.
/check reports VIOLATE with root cause found and you propose §B and §V edits before the next task batch.
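For the refund scenario above, a test that cites the invariant in its name might look like the sketch below. `RefundService` is a hypothetical stand-in for the code under test, not a Cavekit API. It is shown in its fixed state; against a version without the key check, the second call would reverse the charge again and the test would fail, which is the red state the skill asks for before the invariant is accepted.

```python
import unittest


class RefundService:
    """Hypothetical stand-in for the code under test."""

    def __init__(self) -> None:
        self.reversals: list[str] = []
        self._seen_keys: set[str] = set()

    def refund(self, charge_id: str, idempotency_key: str) -> bool:
        # V7: check the idempotency key before reversing the charge.
        if idempotency_key in self._seen_keys:
            return False  # retry with a known key: do nothing
        self._seen_keys.add(idempotency_key)
        self.reversals.append(charge_id)
        return True


class TestV7_RefundIdempotent(unittest.TestCase):
    """Named after the invariant so a failure points straight at §V7."""

    def test_retry_with_same_key_reverses_once(self) -> None:
        svc = RefundService()
        self.assertTrue(svc.refund("ch_1", "key-1"))
        self.assertFalse(svc.refund("ch_1", "key-1"))  # simulated retry
        self.assertEqual(svc.reversals, ["ch_1"])
```

Saved as a module, this can be run with `python -m unittest`.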
How it compares
Use instead of closing tickets after a code-only patch when you practice spec-driven development.
Common Questions / FAQ
Who is backprop for?
Solo builders and small teams on Cavekit who treat SPEC.md as source of truth and need bugs to flow back into §B and §V automatically.
When should I use backprop?
After Ship test failures, user-reported bugs, production post-mortems, or when `/check` finds a VIOLATE. You can also run it during Operate incident review before the next Build cycle.
Is backprop safe to install?
It edits project spec and tests; review Security Audits on this page and ensure only trusted agents mutate SPEC.md in your repo.
SKILL.md
# backprop: bug → spec

Plan-then-execute fixes the code & forgets. SDD fixes the code AND edits spec so recurrence is impossible. That edit is backprop.

## WHEN TO BACKPROP

- Test failed at `/build` verification.
- User reports bug.
- Post-mortem after production incident.
- `/check` flags VIOLATE with root cause found.

## SIX STEPS

### 1. TRACE

Read failure output / bug report. Find exact file:line of wrong behavior. Name root cause in one caveman sentence.

### 2. ANALYZE

Ask three questions:

- Would a new §V invariant catch this class of bug? (most common: yes)
- Is §I wrong: did spec claim shape the code cannot deliver? (sometimes)
- Is §T wrong: did we build the wrong thing? (rare but real)

### 3. PROPOSE

Draft the spec change. Never skip §B; §V/§I/§T are case-by-case.

Template:

```
§B row: B<next>|<date>|<root cause>|V<N>
§V line: V<next>: <testable rule that would have caught it>
```

Example:

```
§B row: B3|2026-04-20|refund job ran twice on retry|V7
§V line: V7: ∀ refund → idempotency key check before charge reversal
```

### 4. GENERATE TEST

New invariant without test = lie. Add failing test first. Name test so it cites the invariant: `TestV7_RefundIdempotent`.

### 5. VERIFY

Fix code. Run test. Must pass. Run full suite. Must not regress.

### 6. LOG

Commit spec edit + test + code fix together. Commit msg: `backprop §B.<n> + §V.<N>: <one-line cause>`.

## WHAT MAKES A GOOD INVARIANT

- Testable in code (grep-able or assert-able).
- Scoped to a behavior, not a file.
- Stated positively when possible (`! hold` over `⊥ forbid`).
- References §I surface where it applies.

**Bad**: V8: code should be correct.

**Good**: V8: ∀ pg_query ! params interpolated via driver, ⊥ string concat.

## WHEN NOT TO ADD §V

- Bug was purely mechanical typo with no class (`i++` vs `i--` in throwaway).
- Fix is a one-time migration.
- Root cause is external dep (upgrade deps instead, note in §C).

Still append §B entry: record that this failure mode was considered. Future bug with same smell → §B search shows precedent.

## OUTPUT SHAPE

Every backprop run produces:

1. §B entry (always).
2. §V entry (usually).
3. Test file (when §V added).
4. Code fix.
5. One commit.

No dashboards. No log files. SPEC.md + git is the full history.
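
The "§B search shows precedent" idea can be sketched as a keyword lookup over §B rows. This is a minimal illustration that assumes rows are stored one per line in the template shape `B<n>|<date>|<root cause>|V<N>`; the function name `find_precedent` is hypothetical and not part of the skill.

```python
def find_precedent(spec_text: str, keyword: str) -> list[dict]:
    """Return §B rows whose root-cause column mentions the keyword (case-insensitive)."""
    hits = []
    for line in spec_text.splitlines():
        parts = line.strip().split("|")
        # Assumed row shape: B<n>|<date>|<root cause>|V<N>
        if len(parts) != 4:
            continue
        row_id, date, cause, invariant = parts
        if not (row_id.startswith("B") and row_id[1:].isdigit()):
            continue
        if keyword.lower() in cause.lower():
            hits.append({"id": row_id, "date": date, "cause": cause, "invariant": invariant})
    return hits


# Assumed SPEC.md fragment with two §B rows.
spec = (
    "§B\n"
    "B1|2026-01-10|stale cache served after deploy|V5\n"
    "B3|2026-04-20|refund job ran twice on retry|V7\n"
)
for hit in find_precedent(spec, "retry"):
    print(hit["id"], hit["invariant"], hit["cause"])
```

A plain `grep` over SPEC.md would serve the same purpose; the point is that the pipe-separated rows stay searchable without extra tooling.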