Geored Sre Skill
geored-sre-skill is a Claude Code plugin for the Operate phase that guides systematic, read-only Kubernetes incident investigation and root cause analysis.
Run structured Kubernetes incident triage in Claude Code when pods crash, degrade, or fail health checks instead of guessing from a single log line.
Add it to Claude Code
Install the plugin in Claude Code. One command, paste-ready.
/plugin install geored-sre-skill@geored/sre-skill
Built to be called by your agent
Skillselion is itself an MCP server. Your agent can pull this entry and a paste-ready install config straight from the API - no copy-paste.
Retrieve this entry with skillselion.get_details("plugin:geored/sre-skill") and the paste-ready config with skillselion.get_install_config("plugin:geored/sre-skill").
What it does
geored-sre-skill is a Claude Code plugin that teaches a systematic, read-only Kubernetes incident investigation flow aligned with SRE practice. Solo builders and small teams shipping on K8s install it when an outage or flaky rollout forces fast, evidence-based answers instead of ad-hoc kubectl spelunking. The skill emphasizes correlation across logs, events, and metrics, walks common pod and network failure modes, and pushes toward root cause before suggesting fixes. It fits anyone who owns cluster health but does not want the agent to mutate production blindly. Use it during active incidents, post-mortem prep, or when mentoring an agent through a structured 5-phase checklist. Complexity is intermediate: you should already know basic kubectl and pod lifecycle concepts. The outcome is a documented chain of evidence, likely cause, and next remediation steps you can execute or ticket.
Highlights
- Five-phase investigation methodology correlating logs, events, and metrics read-only via kubectl
- Common failure pattern library for pod, network, and resource incidents
- Multi-source triage distinguishing symptoms from underlying causes
- Actionable remediation recommendations without inventing cluster state
- Structured SRE workflow for Kubernetes service degradation
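To make the read-only idea in the highlights above concrete, here is a minimal Python sketch of how logs, events, and metrics might be gathered for one pod without touching cluster state. The function names (`is_read_only`, `evidence_commands`) and the verb allow-list are illustrative assumptions, not the plugin's actual implementation; the kubectl subcommands and flags themselves are standard. `kubectl top` additionally assumes metrics-server is installed in the cluster.

```python
from typing import List

# Assumption: an illustrative allow-list, not the plugin's real policy.
READ_ONLY_VERBS = {"get", "describe", "logs", "top"}


def is_read_only(argv: List[str]) -> bool:
    """Return True only when argv is a kubectl call whose verb is allow-listed.

    Deliberately conservative: the verb must come right after "kubectl",
    so forms with leading global flags are rejected rather than guessed at.
    """
    return len(argv) >= 2 and argv[0] == "kubectl" and argv[1] in READ_ONLY_VERBS


def evidence_commands(namespace: str, pod: str) -> List[List[str]]:
    """Build read-only commands covering logs, events, and metrics for one pod."""
    return [
        # Logs from the previous container instance, useful after a restart.
        ["kubectl", "logs", pod, "-n", namespace, "--previous", "--tail=200"],
        # Namespace events in chronological order.
        ["kubectl", "get", "events", "-n", namespace, "--sort-by=.lastTimestamp"],
        # Pod spec, conditions, and recent events in one view.
        ["kubectl", "describe", "pod", pod, "-n", namespace],
        # Current CPU and memory usage (requires metrics-server).
        ["kubectl", "top", "pod", pod, "-n", namespace],
    ]
```

Each command is returned as an argument list, so it could be passed to `subprocess.run` without shell quoting. Checking every command against `is_read_only` before running it is one simple way to keep an investigation from mutating production.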
Why builders use it
When a cluster misbehaves, solo builders waste time jumping between kubectl commands without a consistent way to tie symptoms to root cause.
After invoking the skill, you get a structured incident narrative, correlated evidence, and prioritized remediation ideas grounded in what you actually observed.
At a glance
- Type - Plugin in DevOps.
- Adoption - 0 installs, 1 star, 0 votes.
FAQ
Who is geored-sre-skill for?
Builders who operate their own Kubernetes workloads and want Claude Code to investigate outages using logs, events, and metrics in a fixed multi-phase workflow.
When should I use geored-sre-skill?
Use it when pods fail health checks, crash loop, cannot pull images, hit OOM, or show network issues and you need root cause analysis before changing the cluster.
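As a rough illustration of how the failure modes above map onto what the cluster reports, the Python sketch below classifies a single entry from `status.containerStatuses` as returned by `kubectl get pod -o json`. The reason strings are the ones Kubernetes itself emits; the category labels and the `classify` helper are hypothetical and are not taken from the plugin.

```python
from typing import Optional

# Reason strings are real Kubernetes values; the category labels are assumptions.
PATTERNS = {
    "CrashLoopBackOff": "crash loop",
    "ImagePullBackOff": "image pull failure",
    "ErrImagePull": "image pull failure",
    "OOMKilled": "out of memory",
}


def classify(container_status: dict) -> Optional[str]:
    """Map one containerStatuses entry to a failure category, or None.

    Termination reasons are checked before the waiting reason so that an
    underlying cause (OOMKilled) wins over its symptom (CrashLoopBackOff).
    """
    state = container_status.get("state", {})
    last = container_status.get("lastState", {})
    candidates = (
        last.get("terminated"),
        state.get("terminated"),
        state.get("waiting"),
    )
    for source in candidates:
        if source and source.get("reason") in PATTERNS:
            return PATTERNS[source["reason"]]
    return None
```

A container that is back-off restarting after an out-of-memory kill would be reported as "out of memory" rather than "crash loop", which mirrors the goal of separating symptoms from causes before changing anything.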
How do I add geored-sre-skill to my agent?
Install the geored/sre-skill Claude Code plugin from the registry, ensure kubectl context points at the target cluster, then invoke the skill with a concise incident summary and namespace scope.