Post Mortem

Canonical shelf is Operate because the skill closes the incident loop after production or staging issues are fixed. Errors is the right facet for root-cause records tied to defects, outages, and regressions rather than greenfield build work.

Where it fits

Example use

Draft the RCA after a production 500 is patched and replay tests pass.

Example use

Attach a structured post-mortem before merging the hotfix PR so reviewers see mechanism and validation.

Example use

Archive what you learned from a recurring cron failure before scheduling preventive work.

How it compares

Use this skill instead of pasting debug chat logs: it structures the RCA fields engineers expect, which ad-hoc Slack threads do not.

Common Questions / FAQ

Who is post-mortem for?

Solo builders and small engineering teams who want a durable, identifier-friendly RCA after a real fix—not a placeholder while debugging is still open.

When should I use post-mortem?

After a debug session lands a validated fix in Operate/errors, when closing Ship/review tickets, or when you need iterate-phase documentation before archiving an incident.

Is post-mortem safe to install?

Review the Security Audits panel on this Prism page before installing; the skill mainly drafts text from context you provide and does not inherently require network access.

Workflow Chain

Then invoke: management-talk

SKILL.md

# Post-mortem

The canonical engineering record of a bug fix. Written **after** debugging lands a real fix, **for** other engineers (and future-you, who will have forgotten everything in 6 months). Code identifiers are welcome here — this is the artifact that lets the next person recover the mental model fast.

For the up-the-org version of this same content, hand the finished post-mortem to [`management-talk`](../../productivity/management-talk/SKILL.md). They compose: post-mortem owns the engineering truth, management-talk reframes it for leadership.

## When to invoke

- "/post-mortem"
- "write the post-mortem / postmortem / RCA / root-cause analysis"
- "document this fix" / "write up the root cause" / "close out this bug with a writeup"
- After a debug session has clearly landed a fix, proactively offer to draft one.

## When NOT to use

- **Bug not fixed yet, or fix not validated.** A post-mortem of a hypothesis is misleading. Refuse and tell the user what's missing.
- **Customer-visible outage / incident.** Those need a separate incident report (timeline, blast radius, paging history, comms). This skill is bug-fix scope. Flag and confirm before producing one.
- **Trivial fix** (typo, obvious one-liner). The PR description is the record. Don't manufacture ceremony.
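
The three exclusions above can be read as a small triage step that runs before any drafting. The following is a minimal sketch of that reading, not part of the skill itself: `FixContext`, `Decision`, and `triage` are hypothetical names, and the order in which the checks run is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    REFUSE = "refuse"          # bug not fixed yet, or fix not validated
    CONFIRM_FIRST = "confirm"  # customer-visible incident: flag and confirm
    SKIP = "skip"              # trivial fix: the PR description is the record
    DRAFT = "draft"            # in scope: draft the post-mortem


@dataclass
class FixContext:
    # Hypothetical fields for illustration; the skill defines no such structure.
    fix_landed: bool
    fix_validated: bool
    customer_visible_incident: bool
    trivial: bool


def triage(ctx: FixContext) -> Decision:
    """Map the 'When NOT to use' bullets onto a single decision."""
    if not (ctx.fix_landed and ctx.fix_validated):
        return Decision.REFUSE
    if ctx.customer_visible_incident:
        return Decision.CONFIRM_FIRST
    if ctx.trivial:
        return Decision.SKIP
    return Decision.DRAFT
```

Checking the unfixed case first reflects the point that a post-mortem of a hypothesis is misleading regardless of how large or small the bug is.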

## Required inputs — refuse to draft without these

Before writing a single line, confirm all four. If any are missing, list what's missing and stop:

- [ ] **Reliable repro** exists (not "happens sometimes" — a deterministic or high-rate-flake repro the next person can run).
- [ ] **Root cause is known** (the mechanism is identified, not a hypothesis).
- [ ] **Fix is identified** (PR / commit / branch pointer).
- [ ] **Fix is validated** (the original repro now passes; the customer workload / failing test now succeeds).

These map directly to `debug-mantra` steps 1–4. If you came in via `debug-mantra`, the breadcrumb ledger from step 4 is your raw material — pull from it.
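
As an illustration of the "list what's missing and stop" rule, the checklist can be modelled as four booleans and a gate that names the unmet ones. This is a sketch under assumptions: `Readiness`, `missing_inputs`, and `gate` are hypothetical names, and the message wording is invented for the example.

```python
from dataclasses import dataclass, fields


@dataclass
class Readiness:
    """The four required inputs; field names are illustrative only."""
    reliable_repro: bool = False
    root_cause_known: bool = False
    fix_identified: bool = False
    fix_validated: bool = False


def missing_inputs(r: Readiness) -> list[str]:
    """Return the names of unmet inputs, in checklist order."""
    return [f.name for f in fields(r) if not getattr(r, f.name)]


def gate(r: Readiness) -> str:
    """Refuse with a list of gaps, or report that drafting can start."""
    missing = missing_inputs(r)
    if missing:
        return "Cannot draft yet. Missing: " + ", ".join(missing)
    return "All four inputs confirmed; ready to draft."
```

For example, `gate(Readiness(reliable_repro=True, root_cause_known=True))` reports `fix_identified` and `fix_validated` as missing instead of producing a draft.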

## Structure

Use these blocks in this order. **Summary, Root cause, Fix, and Validation are mandatory.** The rest are conditional but usually present.
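
One way to enforce the mandatory blocks is a quick check over the finished draft's headings. The sketch below assumes the draft is markdown and that each mandatory block's heading contains its name as a whole phrase (for instance, that the validation block's heading includes the word "Validation"); `missing_sections` is a hypothetical helper, not something the skill ships.

```python
import re

# Mandatory block names as stated in the skill text.
MANDATORY = ("Summary", "Root cause", "Fix", "Validation")


def missing_sections(markdown: str) -> list[str]:
    """Return mandatory block names that have no matching heading."""
    # Collect heading text from lines such as "### 3. Root cause _(mandatory)_".
    headings = [
        m.group(1)
        for m in re.finditer(r"^#{1,6}[ \t]+(.*)$", markdown, flags=re.MULTILINE)
    ]
    return [
        name
        for name in MANDATORY
        if not any(
            re.search(rf"\b{re.escape(name)}\b", h, flags=re.IGNORECASE)
            for h in headings
        )
    ]
```

An empty list means every mandatory block has a heading; it says nothing about whether the content under each heading is adequate, which still needs a human read.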

### 1. Summary _(mandatory)_
One paragraph. What broke, in user/workload terms. What fixed it, in one sentence. JIRA key, PR number, owner. A reader who stops here should have the right answer.

### 2. Symptom
What was actually observed. Test output, error message, log line, perf number, customer report. Concrete identifiers — don't paraphrase the failure mode.

### 3. Root cause _(mandatory)_
The actual bug mechanism. **Code identifiers welcome and expected** — function names, file paths, struct fields, branch conditions, commit SHAs of the offending change. Walk the cause chain end-to-end. This is the most expensive section and the reason the post-mortem exists at all. Future-you will live or die by how clearly you write this.

### 4. Why it produced the symptom
Link the root cause to the symptom. Often non-obvious — the bug is in `tadaLaunchPrepare` but the visible failure is a customer training run hanging hours later. Walk the chain so a reader who only knows the symptom can connect it back to the cause without re-deriving it.

### 5. Fix _(mandatory)_
What changed and **why this change addresses the root cause** rather than hiding the symptom. Link to PR / commit. If a previous fix attempt papered over the symptom, name it and explain what was wrong with it — that history is part of the cause.

### 6. How it w

What is this skill?

Produces a canonical engineering post-mortem after the fix is landed and validated—not for open hypotheses

Welcomes code identifiers, mechanism detail, and validation evidence for engineer audiences

Explicit handoff path to management-talk for leadership-safe reframing of the same facts

Refuses premature writeups when the bug is unfixed or unvalidated

Triggered by /post-mortem, RCA, or “document this fix” after a debug session

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 1.1k installs on skills.sh; 2.7k GitHub stars; 3/3 security scanners passed (skills.sh audits).

Journey fit

Spans multiple journey phases: a primary shelf plus alternate fits.
