Code Debugging

Name: Code Debugging
Author: lingzhi227

lingzhi227/agent-research-skills

Classify agent-generated code failures with a severity-ordered hierarchy and structured RunIssue fix instructions instead of ad-hoc printf debugging.

Overview

code-debugging is an agent skill most often used in Ship (also Build backend, Operate errors) that applies a severity-ordered CodeProblem hierarchy and RunIssue fix templates to agent-generated code failures.

Install

npx skills add https://github.com/lingzhi227/agent-research-skills --skill code-debugging

What is this skill?

CodeProblem enum built on IndexOrderedEnum, running from NoCode (most severe) through AllOK (least severe) for severity-ordered triage
RunIssue dataclass fields include code_problem, category, item, issue, instructions, and requesting_small_change
Patterns distilled from the data-to-paper debugger.py and run_issues.py pipelines
Covers timeout, syntax, runtime, static check, and output-file compilation categories
Separates breaking from non-breaking runtime issues for agent retry strategies
The severity ladder in the excerpt enumerates 16 problem types

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 711 installs on skills.sh; 114 GitHub stars; 3/3 security scanners passed (skills.sh audits).

What problem does it solve?

Your agent run failed with mixed syntax, timeout, and missing-output errors and you have no shared severity order or structured fix brief for the next iteration.

Who is it for?

Solo builders running automated code generation pipelines, research agents, or batch eval jobs that emit logs needing consistent triage.

Skip if: You debug interactively in an IDE with breakpoints and do not need severity taxonomies or agent handoff instructions.

When should I use this skill?

Use it when agent-generated code runs fail and you need severity-ordered categories plus structured fix instructions.

What do I get? / Deliverables

Failures map into CodeProblem levels and RunIssue objects with category, item, issue, and instructions, so the agent can prioritize fixes and choose between small changes and full rewrites.

CodeProblem classification for each failure
RunIssue records with instructions and small-change flag
Ordered fix plan from most to least severe

Journey fit

Spans multiple journey phases - primary shelf plus alternate fits below.

Primary fit

Debugging patterns matter most when you are validating runs, timeouts, and outputs before release, so the Ship testing shelf is where structured triage first pays off. The skill encodes test-run failures, static checks, missing artifacts, and output validation: core testing and debugging workflows, not greenfield feature coding.

Also useful

Operate: Error tracking

Build: Backend, data & payments

Where it fits

Example use

Ship: Testing & QA

Map a failed CI sandbox run to CodeProblem.RuntimeError and emit RunIssue instructions before re-prompting the agent.

Example use

Build: Backend, data & payments

Integrate a new agent module and use StaticCheck vs SyntaxError ordering to decide whether to fix imports or block structure first.

Example use

Operate: Error tracking

Diagnose scheduled job MissingOutputFiles versus TimeoutError with requesting_small_change hints for a minimal patch.

Example use

Ship: Code review

Summarize non-breaking runtime issues separately from breaking failures before merge.

How it compares

Use structured RunIssue taxonomy instead of pasting raw stack traces into chat without severity ordering.

Common Questions / FAQ

Who is code-debugging for?

Indie builders and agent authors who run generated code in sandboxes and need repeatable failure categories aligned with research-agent tooling.

When should I use code-debugging?

In Ship testing when runs fail validation; in Build backend when integrating agent outputs; in Operate errors when jobs time out or miss expected output files.

Is code-debugging safe to install?

Check the Security Audits panel on this Prism page; the skill describes patterns only and does not execute code by itself.

SKILL.md

# Code Debugging Patterns

Extracted from data-to-paper (debugger.py, run_issues.py), AI-Scientist-v2, and AgentLaboratory.

## Error Severity Hierarchy (data-to-paper)

```python
class CodeProblem(IndexOrderedEnum):
    NoCode = 'No code'                                    # Most severe
    IncompleteBlock = 'Incomplete block'
    NotSingleBlock = 'Not single block'
    StaticCheck = 'Static check'
    TimeoutError = 'Timeout error'
    RuntimeError = 'Runtime error'
    SyntaxError = 'Syntax error'
    MissingOutputFiles = 'Missing output files'
    NonBreakingRuntimeIssue = 'Non-breaking runtime issue'
    OutputFileCallingSyntax = 'Output file calling syntax'
    OutputFileContentA = 'Output file content first check'
    OutputFileContinuity = 'Check dependency on previous output'
    OutputFileContentB = 'Output file content second check'
    OutputFileCompilation = 'Output file failed compilation'
    OutputFileAnnotation = 'Output file annotation'
    AllOK = 'All OK'                                      # Least severe
```
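The excerpt does not show how `IndexOrderedEnum` is implemented. As a minimal sketch, assuming it simply compares members by their position in the class body (earlier means more severe, as the comments above indicate), the ordering could look like the following. The `index` property and the `most_severe` helper are hypothetical names introduced for illustration, not the data-to-paper API.

```python
from enum import Enum


class IndexOrderedEnum(Enum):
    """Hypothetical stand-in: members compare by position in the class body."""

    @property
    def index(self) -> int:
        # Iterating an Enum class yields members in definition order.
        return list(type(self)).index(self)

    def __lt__(self, other):
        if type(other) is not type(self):
            return NotImplemented
        return self.index < other.index

    def __le__(self, other):
        if type(other) is not type(self):
            return NotImplemented
        return self.index <= other.index

    def __gt__(self, other):
        if type(other) is not type(self):
            return NotImplemented
        return self.index > other.index

    def __ge__(self, other):
        if type(other) is not type(self):
            return NotImplemented
        return self.index >= other.index


class CodeProblem(IndexOrderedEnum):
    # Same members as the hierarchy above, most severe first.
    NoCode = 'No code'
    IncompleteBlock = 'Incomplete block'
    NotSingleBlock = 'Not single block'
    StaticCheck = 'Static check'
    TimeoutError = 'Timeout error'
    RuntimeError = 'Runtime error'
    SyntaxError = 'Syntax error'
    MissingOutputFiles = 'Missing output files'
    NonBreakingRuntimeIssue = 'Non-breaking runtime issue'
    OutputFileCallingSyntax = 'Output file calling syntax'
    OutputFileContentA = 'Output file content first check'
    OutputFileContinuity = 'Check dependency on previous output'
    OutputFileContentB = 'Output file content second check'
    OutputFileCompilation = 'Output file failed compilation'
    OutputFileAnnotation = 'Output file annotation'
    AllOK = 'All OK'


def most_severe(problems):
    """Return the earliest-listed (most severe) problem, or AllOK if none."""
    return min(problems, default=CodeProblem.AllOK)
```

With this convention a smaller index means a more severe problem, so `min` picks the failure to address first.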

## RunIssue Structure (data-to-paper)

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunIssue:
    code_problem: CodeProblem  # Severity category (see hierarchy above)
    category: str              # e.g., "Importing packages", "Timeout"
    item: str                  # Specific file/function name
    issue: str                 # Problem description
    instructions: str          # How to fix
    comment: str               # Internal note
    requesting_small_change: bool  # Minor fix vs major rewrite
    forgive_after: Optional[int]   # Forgive after N occurrences (None = never)
```
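To illustrate how these records might feed an ordered fix plan, here is a minimal sketch. It uses an abbreviated `IntEnum` stand-in for `CodeProblem` (lower value means more severe), and the field defaults and the `build_fix_plan` helper are assumptions made for the example; none of this is taken from the data-to-paper source.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Optional


class CodeProblem(IntEnum):
    # Abbreviated stand-in for the hierarchy above; lower value = more severe.
    StaticCheck = 0
    TimeoutError = 1
    RuntimeError = 2
    SyntaxError = 3
    MissingOutputFiles = 4
    NonBreakingRuntimeIssue = 5


@dataclass
class RunIssue:
    code_problem: CodeProblem
    category: str
    item: str
    issue: str
    instructions: str
    comment: str = ''                     # default is an assumption
    requesting_small_change: bool = False  # default is an assumption
    forgive_after: Optional[int] = None    # None = never forgive


def build_fix_plan(issues: List[RunIssue]) -> str:
    """Render issues as a numbered brief, most severe first."""
    ordered = sorted(issues, key=lambda issue: issue.code_problem)
    lines = []
    for number, issue in enumerate(ordered, start=1):
        scope = 'small change' if issue.requesting_small_change else 'rewrite allowed'
        lines.append(
            f'{number}. [{issue.code_problem.name}] {issue.category} / {issue.item}: '
            f'{issue.issue} Fix: {issue.instructions} ({scope})'
        )
    return '\n'.join(lines)
```

Because `sorted` is stable, issues that share a severity level keep the order in which they were reported.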

## Fix Strategy State Machine (data-to-paper)

### Action Matrix
```
                    Stage 0      Stage 1              Stage 2
                    (initial)    (1st revision)       (2nd revision)
incomplete          regen0       regen1               regen1
not_single_block    leave        regen1               regen2
static_check        repost0      repost0/regen1       regen2
run_failed          repost0      repost0/leave        repost1
missing_files       repost0      repost0/leave        repost0/regen1
run_completed       repost0      repost0              repost0
```

### Action Definitions

- **repost[N]**: Rewind conversation to stage N, post code as fresh response
  - Stage 0: "Here is the code to perform the requested analysis:"
  - Stage 1: "Here is the revised code to perform the requested analysis:"

- **regen[N]**: Delete messages back to stage N, regenerate from scratch
  - Resets `requesting_small_change` flag

- **leave**: Keep current response, post issue feedback requesting small change

### Conditional Actions (A/B)
When a matrix cell contains "/", it has the form `action1/action2`:
- If the current problem is at least as severe as the previous one: use action1
- Else: use action2
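The matrix lookup and the conditional rule can be sketched as a small function. This is an illustrative transcription of the table above, not the data-to-paper code: `ACTION_MATRIX`, `resolve_action`, and the `at_least_as_severe` flag (which the caller computes by comparing the current and previous problems) are hypothetical names.

```python
import re
from typing import Optional, Tuple

# Transcription of the action matrix above: outcome -> actions for stages 0, 1, 2.
ACTION_MATRIX = {
    'incomplete':       ('regen0',  'regen1',         'regen1'),
    'not_single_block': ('leave',   'regen1',         'regen2'),
    'static_check':     ('repost0', 'repost0/regen1', 'regen2'),
    'run_failed':       ('repost0', 'repost0/leave',  'repost1'),
    'missing_files':    ('repost0', 'repost0/leave',  'repost0/regen1'),
    'run_completed':    ('repost0', 'repost0',        'repost0'),
}


def resolve_action(outcome: str, stage: int,
                   at_least_as_severe: bool) -> Tuple[str, Optional[int]]:
    """Pick the action for a matrix cell and split it into (kind, target_stage)."""
    cell = ACTION_MATRIX[outcome][stage]
    if '/' in cell:
        # Conditional cell: action1 if severity did not improve, else action2.
        action1, action2 = cell.split('/')
        cell = action1 if at_least_as_severe else action2
    match = re.fullmatch(r'(repost|regen)(\d+)', cell)
    if match:
        return match.group(1), int(match.group(2))
    return cell, None  # 'leave' carries no stage number
```

Passing the comparison result in as a boolean keeps the lookup independent of how severity is represented.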

## Common Error Patterns

### Device Mismatch (PyTorch)
```
RuntimeError: Expected all tensors to be on the same device
Fix: Add .to(device) or ensure consistent device placement
```

### Shape Mismatch
```
RuntimeError: mat1 and mat2 shapes cannot be multiplied
Fix: Check tensor dimensions, add .reshape() or .view()
```

### Missing Data Normalization
```
Symptom: Loss is NaN or Inf
Fix: Add input normalization, check for zero-division
```

### Off-by-One Indexing
```
IndexError: index X is out of bounds for axis Y with size Z
Fix: Check loop bounds, array indexing
```

### Incorrect Loss Function
```
Symptom: Training loss doesn't decrease
Fix: Match loss function to task type:
- Classification: CrossEntropyLoss (not MSE)
- Regression: MSELoss (not CrossEntropy)
- Multi-label: BCEWithLogitsLoss
```
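The message-based patterns above lend themselves to a simple lookup. The sketch below matches captured stderr text against those messages and returns the suggested fix; the table layout and the `suggest_fix` name are illustrative assumptions. The symptom-based patterns (NaN loss, loss not decreasing) do not appear in stderr and are deliberately left out.

```python
import re
from typing import Optional

# Error messages from the patterns above, paired with their suggested fixes.
ERROR_PATTERNS = [
    (re.compile(r'Expected all tensors to be on the same device'),
     'Add .to(device) or ensure consistent device placement'),
    (re.compile(r'mat1 and mat2 shapes cannot be multiplied'),
     'Check tensor dimensions, add .reshape() or .view()'),
    (re.compile(r'index -?\d+ is out of bounds for axis \d+ with size \d+'),
     'Check loop bounds, array indexing'),
]


def suggest_fix(stderr_text: str) -> Optional[str]:
    """Return the fix hint for the first known pattern found, else None."""
    for pattern, fix in ERROR_PATTERNS:
        if pattern.search(stderr_text):
            return fix
    return None
```

A hint found this way could populate the `instructions` field of a `RunIssue`.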

## Automated Code Repair Prompt (AgentLaboratory)

```
You are a code repair specialist. Given:
1. The original code
2. The error message
3. The traceback

Identify the root cause and apply a MINIMAL fix:
- Do not rewrite working code
- Fix only the lines causing the error
- Preserve the original logic and structure
- Explain why the error occurred
```
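One way to assemble this prompt with its three inputs is sketched below. The section labels and the `build_repair_prompt` function are assumptions for illustration; AgentLaboratory may format its inputs differently.

```python
REPAIR_INSTRUCTIONS = """\
You are a code repair specialist. Given:
1. The original code
2. The error message
3. The traceback

Identify the root cause and apply a MINIMAL fix:
- Do not rewrite working code
- Fix only the lines causing the error
- Preserve the original logic and structure
- Explain why the error occurred
"""


def build_repair_prompt(code: str, error_message: str, traceback_text: str) -> str:
    """Append the three inputs to the instructions as labelled sections."""
    sections = [
        REPAIR_INSTRUCTIONS.rstrip(),
        'ORIGINAL CODE:\n' + code.rstrip(),
        'ERROR MESSAGE:\n' + error_message.strip(),
        'TRACEBACK:\n' + traceback_text.rstrip(),
    ]
    return '\n\n'.join(sections)
```

Keeping the inputs in the same order as the numbered list makes it easier for the model to map each section to its role.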

## Truncation Rules

- Error output: Keep last 1500 characters of stderr
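
A minimal sketch of this rule, assuming the error output is already captured as a string (the `truncate_stderr` name is hypothetical):

```python
def truncate_stderr(stderr_text: str, limit: int = 1500) -> str:
    """Keep only the last `limit` characters of the captured error output."""
    if limit <= 0:
        return ''  # guard: text[-0:] would return the whole string
    return stderr_text[-limit:]
```

Keeping the tail rather than the head preserves the final exception line, which in Python tracebacks appears last.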
