Paper2code

Name: Paper2code
Author: prathamlearnstocode

prathamlearnstocode/paper2code

Turn a research paper into a faithful codebase with explicit tags for specified, official-code, and unspecified choices instead of silent guesses.

Overview

Paper2Code is an agent skill, most often used in the Build phase (and also in Validate scope), that implements research papers as code, resolving ambiguity explicitly instead of guessing silently.

Install

npx skills add https://github.com/prathamlearnstocode/paper2code --skill paper2code

What is this skill?

Decision tree for ambiguity: explicit paper text, official GitHub, reimplementations, field defaults, or documented stub
Partial specification protocol for underspecified optimizers and hyperparameters—never guess silently
Marks provenance with [SPECIFIED], [FROM_OFFICIAL_CODE], and [UNSPECIFIED] in code and REPRODUCTION_NOTES.md
Stub-first path when no standard exists, with docstrings listing implementation options
Documented decision tree for explicit, partial, official-code, reimplementation, standard-field, and stub outcomes

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 541 installs on skills.sh; 1.4k GitHub stars; 2/3 security scanners passed (skills.sh audits).

What problem does it solve?

You want to code a paper, but the write-up omits optimizers, shapes, or data steps, and ad-hoc agents fill the gaps with wrong defaults.

Who is it for?

Solo ML builders reproducing papers, coursework, or baselines who need auditable choices when specs are partial or contradictory.

Skip if: Quick demos that only need a high-level blog summary, or production features with no tie to a published method you intend to reproduce faithfully.

When should I use this skill?

Implementing or reproducing code from a research paper where methods, hyperparameters, or architecture details may be vague or incomplete.

What do I get? / Deliverables

You get a structured codebase with labeled provenance, reproduction notes, and stubs where the literature truly leaves gaps.

Implementation modules with provenance tags
REPRODUCTION_NOTES.md for unspecified choices
Stub files with option docstrings when no standard exists
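
A stub of the kind listed above might look like the following minimal sketch. The function name `build_lr_schedule` and the listed options are hypothetical, chosen only to illustrate the docstring layout; they are not taken from the skill itself.

```python
def build_lr_schedule(total_steps: int):
    """[UNSPECIFIED] Learning-rate schedule (hypothetical example stub).

    In this illustration the paper reports a peak learning rate but no
    schedule, and neither official code nor a field-wide standard settles
    the question.

    Options to consider:
      1. Constant learning rate (simplest baseline).
      2. Linear warmup followed by linear decay.
      3. Linear warmup followed by cosine decay.

    Record the option you pick in REPRODUCTION_NOTES.md.
    """
    # Fail loudly instead of silently guessing a schedule.
    raise NotImplementedError(
        "Learning-rate schedule is unspecified; see the docstring for options."
    )
```

Raising `NotImplementedError` keeps the gap visible at run time, so a missing choice cannot pass unnoticed as a default.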

Recommended Skills

Paper Context Resolver (lllllllama/ai-paper-reproduction-skill)

Optional helper-tier skill that supplements README-guided deep learning reproduction by resolving specific paper details… (140k installs · 412 stars)

Repo Intake And Plan (lllllllama/ai-paper-reproduction-skill)

Rigor Intake scans repository docs and layout to classify documented commands and propose a minimal reproduction plan fo… (140k installs · 412 stars)

Env And Assets Bootstrap (lllllllama/ai-paper-reproduction-skill)

Rigor Setup establishes conservative environment and asset assumptions aligned with README and config evidence before ex… (140k installs · 412 stars)

Minimal Run And Audit (lllllllama/ai-paper-reproduction-skill)

RigorPilot executes the selected minimal reproduction command and produces normalized, auditable run evidence for paper … (140k installs · 412 stars)

Analyze Project (lllllllama/rigorpilot-skills)

analyze-project is a read-only agent skill from the RigorPilot family aimed at solo builders and small teams inheriting … (32.3k installs · 412 stars)

Ai Research Reproduction (lllllllama/rigorpilot-skills)

ai-research-reproduction is the RigorPilot Reproduce orchestrator for solo builders and small teams who need to rerun a … (32.3k installs · 412 stars)

Journey fit

Spans multiple journey phases - primary shelf plus alternate fits below.

Primary fit

Build: Backend, data & payments

Canonical shelf is Build backend because the deliverable is implementation code and stubs for models, training loops, and experiment utilities described in papers. Backend fits ML reproduction work—training scripts, model modules, and data pipelines—rather than marketing or launch surfaces.

Also useful

Validate: Scope & plan

Where it fits

Example use

Validate: Scope & plan

Walk the ambiguity decision tree to see whether the paper is implementable before promising a demo deadline.

Example use

Build: Backend, data & payments

Generate model and training files with [FROM_OFFICIAL_CODE] tags when the authors’ GitHub exists.

Example use

Ship: Testing & QA

Use REPRODUCTION_NOTES.md to design smoke tests only on behaviors the paper or official code actually specifies.

How it compares

Use instead of one-shot “implement this PDF” prompts that hide assumptions—this skill forces a documented decision tree and REPRODUCTION_NOTES.md.

Common Questions / FAQ

Who is paper2code for?

Indie researchers and developers implementing ML or systems papers who need transparent handling of missing hyperparameters and ambiguous methods text.

When should I use paper2code?

During Validate scope when scoping what a paper actually specifies, and during Build backend when generating modules, training scripts, and stubs with [SPECIFIED] versus [UNSPECIFIED] labels.

Is paper2code safe to install?

It may pull logic from cited official repos; review the Security Audits panel on this Prism page and verify third-party code licenses before merging into your product.

SKILL.md


# Guardrail: Handling Badly Written Papers

## Reality check

Many papers are vague, inconsistent, or incomplete. This is not a criticism — writing a paper is hard, page limits are strict, and reviewers rarely check hyperparameter tables. But it means you will regularly encounter papers where the text alone is insufficient to produce a correct implementation.

This file tells you what to do when that happens. The answer is never "guess silently."

---

## Decision tree for resolving ambiguity

```
Is the detail stated explicitly in the paper?
├── YES → Use it. Cite the section. Done.
├── PARTIALLY → Follow the partial specification protocol (below)
└── NO →
    Is there official code on GitHub?
    ├── YES → Use it as ground truth. Cite the file and line number.
    │         Mark as [FROM_OFFICIAL_CODE], not [SPECIFIED].
    └── NO →
        Is there a well-known reimplementation?
        ├── YES → Reference it in REPRODUCTION_NOTES.md.
        │         DO NOT blindly copy — they made their own choices.
        │         Mark as [UNSPECIFIED] and note the external reference.
        └── NO →
            Is there a "standard" choice in the field?
            ├── YES → Use it. Mark as [UNSPECIFIED] with alternatives.
            └── NO →
                Write a stub with a detailed docstring.
                Explain what should go here and what the options are.
```
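
The tree above can be sketched as a small function that checks each question in order. This is only an illustration: the `Evidence` class and its field names are assumptions, and the tag returned for the stub branch is also an assumption, since the tree does not name one.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Evidence:
    """What is known about one implementation detail (illustrative fields)."""
    in_paper: str = "no"                 # "yes", "partially", or "no"
    official_code: bool = False          # official repository exists
    known_reimplementation: bool = False
    field_standard: bool = False


def resolve(evidence: Evidence) -> Tuple[str, str]:
    """Return (tag, action), walking the decision tree from top to bottom."""
    if evidence.in_paper == "yes":
        return "[SPECIFIED]", "use it and cite the section"
    if evidence.in_paper == "partially":
        return "[PARTIALLY_SPECIFIED]", "follow the partial specification protocol"
    if evidence.official_code:
        return "[FROM_OFFICIAL_CODE]", "use official code and cite file and line"
    if evidence.known_reimplementation:
        return "[UNSPECIFIED]", "reference the reimplementation in REPRODUCTION_NOTES.md"
    if evidence.field_standard:
        return "[UNSPECIFIED]", "use the standard choice and list alternatives"
    # Assumption: the stub branch is tagged [UNSPECIFIED]; the tree names no tag here.
    return "[UNSPECIFIED]", "write a stub with a detailed docstring"
```

Because the checks run in order, a detail that is partially stated in the paper is handled by the partial specification protocol even when official code also exists, which matches the order of the branches in the tree.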

---

## Partial specification protocol

When the paper partially specifies something:

### "We use Adam"
The paper names an optimizer but not all of its parameters.
```python
# §4.1 — "We use the Adam optimizer"
# [PARTIALLY_SPECIFIED] Optimizer stated as Adam, but beta and epsilon values not specified
# Using: β₁=0.9, β₂=0.999, ε=1e-8 (PyTorch defaults)
# Alternatives: β₁=0.9, β₂=0.98, ε=1e-9 (Transformer default from Vaswani et al.)
optimizer = torch.optim.Adam(model.parameters(), lr=config.lr,
                              betas=(0.9, 0.999), eps=1e-8)
```

### "We use dropout for regularization"
The paper mentions dropout but not the rate or placement.
```python
# §3.3 — "We use dropout for regularization"
# [PARTIALLY_SPECIFIED] Dropout mentioned but rate not specified
# Using: 0.1 (common default for transformer models)
# Placement: after attention and FFN (not specified — this is our choice)
self.dropout = nn.Dropout(0.1)
```

### "Similar to Transformer [Vaswani et al., 2017]"
The paper references another work for a component.
```python
# §3.1 — "We use an architecture similar to the Transformer [Vaswani et al., 2017]"
# [PARTIALLY_SPECIFIED] "Similar" implies differences exist but none are specified
# We implement the standard Transformer from Vaswani et al. unless other sections
# describe modifications. Reader should check if the authors intended differences.
```
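
Tagged comments like the ones above can be gathered mechanically when drafting REPRODUCTION_NOTES.md. The sketch below is one possible approach, not part of the skill: the helper names `collect_tagged_comments` and `to_notes_markdown` and the output layout are hypothetical.

```python
import re
from typing import List, Tuple

# Matches a comment that starts with one of the provenance tags.
TAG_PATTERN = re.compile(
    r"#\s*\[(SPECIFIED|PARTIALLY_SPECIFIED|FROM_OFFICIAL_CODE|UNSPECIFIED)\]\s*(.*)"
)


def collect_tagged_comments(source: str) -> List[Tuple[int, str, str]]:
    """Return (line_number, tag, note) for each tagged comment in `source`."""
    found = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = TAG_PATTERN.search(line)
        if match:
            found.append((number, match.group(1), match.group(2).strip()))
    return found


def to_notes_markdown(entries: List[Tuple[int, str, str]]) -> str:
    """Format the entries that involve a choice of ours as Markdown bullets."""
    needs_review = {"PARTIALLY_SPECIFIED", "UNSPECIFIED"}
    lines = [
        f"- line {number}: [{tag}] {note}"
        for number, tag, note in entries
        if tag in needs_review
    ]
    return "\n".join(lines)


sample = "# [UNSPECIFIED] Dropout rate not given, using 0.1\nrate = 0.1\n"
print(to_notes_markdown(collect_tagged_comments(sample)))
```

Only `[PARTIALLY_SPECIFIED]` and `[UNSPECIFIED]` entries are listed, on the assumption that these are the choices a reader most needs to audit.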

---

## When the paper contradicts itself

### Figure vs text
Figures are often created early in the paper-writing process and may show an earlier version of the architecture or method. Text is usually more up-to-date.

**Rule: Trust the text. Flag the figure discrepancy.**
```python
# §3.2 — "We apply layer normalization before each sub-layer" (Pre-LN)
# NOTE: Figure 2 shows post-norm placement, inconsistent with this text.
# We implement pre-norm as stated in the text. If reproduction fails,
# try post-norm (change this to: x = self.norm(x + sublayer(x)))
```
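
To make the fallback cheap to try, the norm placement can sit behind a single flag. The following is a framework-free toy sketch on plain Python lists; `residual_block` and its `pre_norm` parameter are hypothetical names, and in a real model the `sublayer` and `norm` callables would be framework modules.

```python
from typing import Callable, List

Vector = List[float]


def residual_block(
    x: Vector,
    sublayer: Callable[[Vector], Vector],
    norm: Callable[[Vector], Vector],
    pre_norm: bool = True,
) -> Vector:
    """Apply one residual sub-layer with pre-norm or post-norm placement."""
    if pre_norm:
        # Pre-norm (as stated in the text): x + sublayer(norm(x))
        update = sublayer(norm(x))
        return [a + b for a, b in zip(x, update)]
    # Post-norm (as drawn in the figure): norm(x + sublayer(x))
    update = sublayer(x)
    return norm([a + b for a, b in zip(x, update)])


def center(v: Vector) -> Vector:
    """Toy stand-in for layer normalization: subtract the mean."""
    mean = sum(v) / len(v)
    return [a - mean for a in v]


def double(v: Vector) -> Vector:
    """Toy stand-in for an attention or feed-forward sub-layer."""
    return [2.0 * a for a in v]


pre = residual_block([1.0, 3.0], double, center, pre_norm=True)
post = residual_block([1.0, 3.0], double, center, pre_norm=False)
```

Switching `pre_norm` then becomes a one-line configuration change that can be recorded in REPRODUCTION_NOTES.md alongside the figure discrepancy.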

### Equation vs prose
Equations are usually checked more carefully in review than prose, and they are more precise.

**Rule: Trust the equation. Flag the prose discrepancy.**

### Different numbers in different sections
The paper might say "learning rate of 1e-4" in one section and "learning rate of 3e-4" in another.

**Rule: If one is in the appendix/hyperparameter table, trust that. If both are in prose, flag both and use the one from the main experimental section (usually the one paired with the best results).**
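
The rule above amounts to ranking each mention by where it appears and flagging the rest. A minimal sketch follows, assuming hypothetical source labels (`appendix_table`, `main_experiments`, `other_prose`) and a hypothetical `Mention` record; neither comes from the skill.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Lower number means more trusted, following the rule above.
# The label names are hypothetical and exist only for this sketch.
PRIORITY = {"appendix_table": 0, "main_experiments": 1, "other_prose": 2}


@dataclass
class Mention:
    value: float
    source: str   # one of the PRIORITY keys
    section: str  # where the value appears in the paper


def pick_value(mentions: List[Mention]) -> Tuple[Mention, List[Mention]]:
    """Return the most trusted mention and the conflicting ones to flag."""
    ranked = sorted(mentions, key=lambda m: PRIORITY[m.source])
    chosen = ranked[0]
    conflicts = [m for m in ranked[1:] if m.value != chosen.value]
    return chosen, conflicts


chosen, conflicts = pick_value([
    Mention(1e-4, "other_prose", "introduction"),
    Mention(3e-4, "appendix_table", "appendix"),
])
```

Here the appendix value is chosen and the prose value is returned as a conflict, so both can be written into the code comment and REPRODUCTION_NOTES.md.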

---

## When the paper is genuinely incomplete

Some papers are missing critical details that make reproduction impossible without additional information. Here
