
Code Debugging
Classify agent-generated code failures with a severity-ordered hierarchy and structured RunIssue fix instructions instead of ad-hoc printf debugging.
Overview
code-debugging is an agent skill most often used in Ship (also Build backend, Operate errors) that applies a severity-ordered CodeProblem hierarchy and RunIssue fix templates to agent-generated code failures.
Install
npx skills add https://github.com/lingzhi227/agent-research-skills --skill code-debugging
What is this skill?
- Index-ordered CodeProblem enum (IndexOrderedEnum) running from NoCode (most severe) through AllOK (least severe) for severity-ordered triage
- RunIssue dataclass fields: category, item, issue, instructions, requesting_small_change
- Patterns distilled from data-to-paper debugger.py and run_issues.py pipelines
- Covers timeout, syntax, runtime, static check, and output-file compilation categories
- Separates breaking vs non-breaking runtime issues for agent retry strategies
- The excerpt enumerates 16 CodeProblem levels, from NoCode through AllOK
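The severity-ordered triage described in the list above can be sketched as follows. This is a minimal sketch, not the data-to-paper implementation: a plain `Enum` with a hypothetical `rank` property stands in for `IndexOrderedEnum`, only a subset of levels and fields is shown, and the sample issues are invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class CodeProblem(Enum):
    """Stand-in for IndexOrderedEnum: definition order is severity order."""
    NoCode = 'No code'  # most severe
    StaticCheck = 'Static check'
    TimeoutError = 'Timeout error'
    RuntimeError = 'Runtime error'
    SyntaxError = 'Syntax error'
    MissingOutputFiles = 'Missing output files'
    NonBreakingRuntimeIssue = 'Non-breaking runtime issue'
    AllOK = 'All OK'  # least severe

    @property
    def rank(self) -> int:
        # Hypothetical helper: position in the definition list.
        # A lower rank means a more severe problem.
        return list(type(self)).index(self)


@dataclass
class RunIssue:
    code_problem: CodeProblem
    category: str
    item: str
    issue: str
    instructions: str
    requesting_small_change: bool = False


def fix_plan(issues: List[RunIssue]) -> List[RunIssue]:
    """Return issues ordered from most to least severe (stable for ties)."""
    return sorted(issues, key=lambda issue: issue.code_problem.rank)


# Invented sample issues, listed in arbitrary order.
issues = [
    RunIssue(CodeProblem.MissingOutputFiles, 'Output files', 'results.csv',
             'Expected file was not created', 'Write results.csv before exiting', True),
    RunIssue(CodeProblem.RuntimeError, 'Runtime exception', 'train()',
             'Tensors on different devices', 'Move all tensors to one device'),
    RunIssue(CodeProblem.TimeoutError, 'Timeout', 'main loop',
             'Run exceeded the time limit', 'Reduce the number of iterations'),
]

plan = fix_plan(issues)
for step, issue in enumerate(plan, start=1):
    scope = 'small change' if issue.requesting_small_change else 'larger rewrite'
    print(f'{step}. [{issue.code_problem.value}] {issue.item}: {issue.instructions} ({scope})')
```

Sorting by enum position keeps the ordering rule in one place, so adding a level only requires inserting it at the right point in the class body.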
Adoption & trust: 711 installs on skills.sh; 114 GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
Your agent run failed with a mix of syntax, timeout, and missing-output errors, and you have no shared severity order or structured fix brief for the next iteration.
Who is it for?
Solo builders running automated code generation pipelines, research agents, or batch eval jobs that emit logs needing consistent triage.
Skip if: You debug interactively in an IDE with breakpoints and do not need severity taxonomies or agent handoff instructions.
When should I use this skill?
Agent or generated code runs fail and you need severity-ordered categories plus structured fix instructions.
What do I get? / Deliverables
Failures map into CodeProblem levels and RunIssue objects with category, item, issue, and instructions so the agent prioritizes fixes and chooses small changes versus full rewrites.
- CodeProblem classification for each failure
- RunIssue records with instructions and small-change flag
- Ordered fix plan from most to least severe
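How an agent might choose between a small change and a full regeneration can be sketched as a lookup over the Action Matrix in SKILL.md. The table values are copied from that matrix; the function name `choose_action` and the `severity_not_lower` flag are assumptions made for illustration, and the caller is expected to compare the current and previous problem severities itself.

```python
from typing import Dict, List

# Rows: run outcome. Columns: stage 0 (initial), stage 1 (1st revision),
# stage 2 (2nd revision). Values copied from the Action Matrix in SKILL.md.
ACTION_MATRIX: Dict[str, List[str]] = {
    'incomplete':       ['regen0',  'regen1',         'regen1'],
    'not_single_block': ['leave',   'regen1',         'regen2'],
    'static_check':     ['repost0', 'repost0/regen1', 'regen2'],
    'run_failed':       ['repost0', 'repost0/leave',  'repost1'],
    'missing_files':    ['repost0', 'repost0/leave',  'repost0/regen1'],
    'run_completed':    ['repost0', 'repost0',        'repost0'],
}


def choose_action(outcome: str, stage: int, severity_not_lower: bool = True) -> str:
    """Pick the action for a run outcome at revision stage 0, 1 or 2.

    `severity_not_lower` (hypothetical parameter) is True when the current
    problem's severity is >= the previous one's. It only matters for
    conditional "action1/action2" entries.
    """
    entry = ACTION_MATRIX[outcome][stage]
    if '/' not in entry:
        return entry
    action1, action2 = entry.split('/')
    return action1 if severity_not_lower else action2


print(choose_action('run_failed', 1, severity_not_lower=True))   # repost0
print(choose_action('run_failed', 1, severity_not_lower=False))  # leave
print(choose_action('incomplete', 0))                            # regen0
```

Passing the severity comparison in as a boolean keeps the sketch independent of how the real pipeline orders its enum.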
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Debugging patterns matter most when you are validating runs, timeouts, and outputs before release, so the Ship testing shelf is where structured triage first pays off. The skill encodes test-run failures, static checks, missing artifacts, and output validation, which are core testing and debugging workflows rather than greenfield feature coding.
Where it fits
Map a failed CI sandbox run to CodeProblem.RuntimeError and emit RunIssue instructions before re-prompting the agent.
Integrate a new agent module and use StaticCheck vs SyntaxError ordering to decide whether to fix imports or block structure first.
Diagnose scheduled job MissingOutputFiles versus TimeoutError with requesting_small_change hints for a minimal patch.
Summarize non-breaking runtime issues separately from breaking failures before merge.
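The mapping from a failed run to a CodeProblem label in the scenarios above could look roughly like this. It is a sketch under stated assumptions, not the skill's own code: `classify_run` is a hypothetical helper, it runs the generated script in a temporary directory with the current Python interpreter, and it returns the enum's string values rather than enum members. Checks run in the order they can be detected, which is not the same as severity order.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Sequence


def classify_run(code: str, expected_files: Sequence[str], timeout: float = 10.0) -> str:
    """Run `code` in a scratch directory and return a CodeProblem-style label."""
    if not code.strip():
        return 'No code'
    try:
        # compile() reports syntax errors without executing anything.
        compile(code, '<generated>', 'exec')
    except SyntaxError:
        return 'Syntax error'
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / 'generated.py'
        script.write_text(code)
        try:
            result = subprocess.run(
                [sys.executable, str(script)],
                cwd=workdir, capture_output=True, text=True, timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return 'Timeout error'
        if result.returncode != 0:
            return 'Runtime error'
        # The run finished cleanly; verify that every expected artifact exists.
        if any(not (Path(workdir) / name).exists() for name in expected_files):
            return 'Missing output files'
    return 'All OK'


print(classify_run('def broken(:', []))            # Syntax error
print(classify_run('print("done")', ['results.csv']))
```

A real pipeline would also keep the captured stderr so it can fill in the `issue` and `instructions` fields of a RunIssue.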
How it compares
Use structured RunIssue taxonomy instead of pasting raw stack traces into chat without severity ordering.
Common Questions / FAQ
Who is code-debugging for?
Indie builders and agent authors who run generated code in sandboxes and need repeatable failure categories aligned with research-agent tooling.
When should I use code-debugging?
In Ship testing when runs fail validation; in Build backend when integrating agent outputs; in Operate errors when jobs time out or miss expected output files.
Is code-debugging safe to install?
Check the Security Audits panel on this Prism page; the skill describes patterns only and does not execute code by itself.
SKILL.md
# Code Debugging Patterns

Extracted from data-to-paper (debugger.py, run_issues.py), AI-Scientist-v2, and AgentLaboratory.

## Error Severity Hierarchy (data-to-paper)

```python
class CodeProblem(IndexOrderedEnum):
    NoCode = 'No code'  # Most severe
    IncompleteBlock = 'Incomplete block'
    NotSingleBlock = 'Not single block'
    StaticCheck = 'Static check'
    TimeoutError = 'Timeout error'
    RuntimeError = 'Runtime error'
    SyntaxError = 'Syntax error'
    MissingOutputFiles = 'Missing output files'
    NonBreakingRuntimeIssue = 'Non-breaking runtime issue'
    OutputFileCallingSyntax = 'Output file calling syntax'
    OutputFileContentA = 'Output file content first check'
    OutputFileContinuity = 'Check dependency on previous output'
    OutputFileContentB = 'Output file content second check'
    OutputFileCompilation = 'Output file failed compilation'
    OutputFileAnnotation = 'Output file annotation'
    AllOK = 'All OK'  # Least severe
```

## RunIssue Structure (data-to-paper)

```python
@dataclass
class RunIssue:
    code_problem: CodeProblem       # Severity category
    category: str                   # e.g., "Importing packages", "Timeout"
    item: str                       # Specific file/function name
    issue: str                      # Problem description
    instructions: str               # How to fix
    comment: str                    # Internal note
    requesting_small_change: bool   # Minor fix vs major rewrite
    forgive_after: int              # Forgive after N occurrences (None = never)
```

## Fix Strategy State Machine (data-to-paper)

### Action Matrix

```
                  Stage 0      Stage 1           Stage 2
                  (initial)    (1st revision)    (2nd revision)
incomplete        regen0       regen1            regen1
not_single_block  leave        regen1            regen2
static_check      repost0      repost0/regen1    regen2
run_failed        repost0      repost0/leave     repost1
missing_files     repost0      repost0/leave     repost0/regen1
run_completed     repost0      repost0           repost0
```

### Action Definitions

- **repost[N]**: Rewind conversation to stage N, post code as fresh response
  - Stage 0: "Here is the code to perform the requested analysis:"
  - Stage 1: "Here is the revised code to perform the requested analysis:"
- **regen[N]**: Delete messages back to stage N, regenerate from scratch
  - Resets `requesting_small_change` flag
- **leave**: Keep current response, post issue feedback requesting small change

### Conditional Actions (A/B)

When action contains "/": `action1/action2`
- If current problem severity ≥ previous: use action1
- Else: use action2

## Common Error Patterns

### Device Mismatch (PyTorch)

```
RuntimeError: Expected all tensors to be on the same device
Fix: Add .to(device) or ensure consistent device placement
```

### Shape Mismatch

```
RuntimeError: mat1 and mat2 shapes cannot be multiplied
Fix: Check tensor dimensions, add .reshape() or .view()
```

### Missing Data Normalization

```
Symptom: Loss is NaN or Inf
Fix: Add input normalization, check for zero-division
```

### Off-by-One Indexing

```
IndexError: index X is out of bounds for axis Y with size Z
Fix: Check loop bounds, array indexing
```

### Incorrect Loss Function

```
Symptom: Training loss doesn't decrease
Fix: Match loss function to task type:
  - Classification: CrossEntropyLoss (not MSE)
  - Regression: MSELoss (not CrossEntropy)
  - Multi-label: BCEWithLogitsLoss
```

## Automated Code Repair Prompt (AgentLaboratory)

```
You are a code repair specialist. Given:
1. The original code
2. The error message
3. The traceback

Identify the root cause and apply a MINIMAL fix:
- Do not rewrite working code
- Fix only the lines causing the error
- Preserve the original logic and structure
- Explain why the error occurred
```

## Truncation Rules

- Error output: Keep last 1500 characters of stder