
Incident Response
Run a structured production-incident workflow from first alert through stakeholder updates and a blameless postmortem without ad-hoc Slack chaos.
Overview
incident-response is an agent skill most often used in Operate (also Ship launch prep, Grow support) that runs triage, communication, and postmortem workflows from detection through resolution.
Install
npx skills add https://github.com/anthropics/knowledge-work-plugins --skill incident-response
What is this skill?
- Three modes: new incident, mid-incident status update, and post-incident postmortem generation
- Phase workflow: triage (SEV1–4, systems, roles) then communicate through resolution
- Argument-hint driven: pass alert text or incident description as `/incident-response` input
- CONNECTORS.md integration for status channels and tooling when plugins are wired
- Asks which phase the incident is in when no mode is given, so triage, comms, and postmortem are never skipped
- SEV1–4 severity scale in triage phase
- 3 command modes: new, update, postmortem
- 4-phase workflow: triage, communicate, mitigate, postmortem
Adoption & trust: 3.1k installs on skills.sh; 19.6k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
When production breaks, you need SEV clarity, role assignment, and stakeholder updates—but ad-hoc chat threads leave gaps and no postmortem trail.
Who is it for?
Solo founders or tiny teams shipping live SaaS or APIs who get paged rarely but want Google/SRE-style incident hygiene without a full ops org.
Skip if: Local-only prototypes with no users, or teams that already run PagerDuty runbooks end-to-end and only need ticket sync—not a greenfield incident template.
When should I use this skill?
Trigger with "we have an incident" or "production is down", with an alert that needs a severity assessment, for mid-incident status updates, or for a blameless postmortem after resolution.
What do I get? / Deliverables
You get a severity-ranked incident record, comms-ready status updates, and postmortem material aligned to a blameless review after the service is stable.
- Incident triage summary with severity and roles
- Status update copy for stakeholders
- Postmortem draft from incident timeline
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Incidents are owned in Operate when systems fail in production; this skill is shelved under errors because triage and resolution are the canonical entry point. SEV assessment, affected-system identification, and IC/responder assignment map directly to error response rather than passive monitoring setup.
Where it fits
Grade a new alert as SEV2, name affected API paths, and assign yourself IC while a contractor handles comms.
Turn a noisy dashboard spike into a structured incident thread instead of restarting services blindly.
Draft rollback and customer-facing language during a bad deploy on launch day.
Escalate a support tsunami into a single incident with timeline updates for power users.
How it compares
Use instead of unstructured “fix prod in chat” threads; it is procedural incident workflow, not an MCP observability server.
Common Questions / FAQ
Who is incident-response for?
Indie builders and small teams running production web apps, APIs, or CLIs who need a lightweight IC/comms/postmortem playbook inside Claude Code or similar agents.
When should I use incident-response?
In Operate when alerts fire or users report outages; mid-incident for status updates; after resolution for postmortems. Also relevant in Ship when rehearsing launch rollback comms and in Grow when support escalations look like SEV issues.
Is incident-response safe to install?
Review the Security Audits panel on this Prism page and restrict connector permissions; incident workflows may touch production context and comms channels you configure via CONNECTORS.md.
SKILL.md
# /incident-response

> If you see unfamiliar placeholders or need to check which tools are connected, see [CONNECTORS.md](../../CONNECTORS.md).

Manage an incident from detection through postmortem.

## Usage

```
/incident-response $ARGUMENTS
```

## Modes

```
/incident-response new [description]   # Start a new incident
/incident-response update [status]     # Post a status update
/incident-response postmortem          # Generate postmortem from incident data
```

If no mode is specified, ask what phase the incident is in.

## How It Works

```
┌─────────────────────────────────────────────────────────────────┐
│                        INCIDENT RESPONSE                        │
├─────────────────────────────────────────────────────────────────┤
│  Phase 1: TRIAGE                                                │
│    ✓ Assess severity (SEV1-4)                                   │
│    ✓ Identify affected systems and users                        │
│    ✓ Assign roles (IC, comms, responders)                       │
│                                                                 │
│  Phase 2: COMMUNICATE                                           │
│    ✓ Draft internal status update                               │
│    ✓ Draft customer communication (if needed)                   │
│    ✓ Set up war room and cadence                                │
│                                                                 │
│  Phase 3: MITIGATE                                              │
│    ✓ Document mitigation steps taken                            │
│    ✓ Track timeline of events                                   │
│    ✓ Confirm resolution                                         │
│                                                                 │
│  Phase 4: POSTMORTEM                                            │
│    ✓ Blameless postmortem document                              │
│    ✓ Timeline reconstruction                                    │
│    ✓ Root cause analysis (5 whys)                               │
│    ✓ Action items with owners                                   │
└─────────────────────────────────────────────────────────────────┘
```

## Severity Classification

| Level | Criteria | Response Time |
|-------|----------|---------------|
| SEV1 | Service down, all users affected | Immediate, all-hands |
| SEV2 | Major feature degraded, many users affected | Within 15 min |
| SEV3 | Minor feature issue, some users affected | Within 1 hour |
| SEV4 | Cosmetic or low-impact issue | Next business day |

## Communication Guidance

Provide clear, factual updates at regular cadence. Include: what's happening, who's affected, what we're doing, when the next update is.
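
As an illustration only, the response-time column of the severity table can be read as a lookup from level to deadline. The sketch below is not part of the skill: the names `RESPONSE_TARGETS` and `respond_by` are hypothetical, and SEV4 is left uncomputed because "next business day" depends on a team's own calendar.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Response-time targets copied from the severity table. The dictionary
# layout and every name here are illustrative assumptions, not skill API.
RESPONSE_TARGETS: dict[str, Optional[timedelta]] = {
    "SEV1": timedelta(0),            # immediate, all-hands
    "SEV2": timedelta(minutes=15),   # within 15 min
    "SEV3": timedelta(hours=1),      # within 1 hour
    "SEV4": None,                    # next business day (calendar-dependent)
}


def respond_by(severity: str, detected_at: datetime) -> Optional[datetime]:
    """Return the latest acceptable response time, or None for SEV4."""
    key = severity.upper()
    if key not in RESPONSE_TARGETS:
        raise ValueError(f"unknown severity: {severity!r}")
    target = RESPONSE_TARGETS[key]
    return None if target is None else detected_at + target


detected = datetime(2024, 1, 8, 14, 0, tzinfo=timezone.utc)
print(respond_by("SEV2", detected))  # 2024-01-08 14:15:00+00:00
```

Keeping the targets in one table makes it easy to adjust them to a team's own commitments without touching the lookup logic.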
## Output — Status Update

```markdown
## Incident Update: [Title]
**Severity:** SEV[1-4] | **Status:** Investigating | Identified | Monitoring | Resolved
**Impact:** [Who/what is affected]
**Last Updated:** [Timestamp]

### Current Status
[What we know now]

### Actions Taken
- [Action 1]
- [Action 2]

### Next Steps
- [What's happening next and ETA]

### Timeline
| Time | Event |
|------|-------|
| [HH:MM] | [Event] |
```

## Output — Postmortem

```markdown
## Postmortem: [Incident Title]
**Date:** [Date] | **Duration:** [X hours] | **Severity:** SEV[X]
**Authors:** [Names] | **Status:** Draft

### Summary
[2-3 sentence plain-language summary]

### Impact
- [Users affected]
- [Duration of impact]
- [Business impact if quantifiable]

### Timeline
| Time (UTC) | Event |
|------------|-------|
| [HH:MM] | [Event] |

### Root Cause
[Detailed explanation of what cau