
lllllllama/rigorpilot-skills
11 skills · 355k installs · 4.5k stars · GitHub
Install
npx skills add https://github.com/lllllllama/rigorpilot-skills

Skills in this repo
1. Analyze Project
analyze-project is a read-only agent skill from the RigorPilot family aimed at solo builders and small teams inheriting deep learning repositories they did not author. It prioritizes structural understanding over quick fixes: likely training, inference, and evaluation entrypoints, how configs relate to scripts, and where changes would logically land if you later choose to implement them. The embedded policy keeps recommendations low-ego and review-friendly while explicitly forbidding code patches and unevidenced bug claims. A Python analyzer supports conservative mapping rather than open-ended refactors. Use it when you import a paper implementation, audit a teammate’s experiment tree, or want your coding agent to explain model flow before you commit GPU time or merge risky edits. It is the calm first pass that makes subsequent build and ship work informed instead of speculative.
32.3k installs

2. Explore Code
explore-code is a RigorPilot Improve leaf skill for solo builders and small research teams who have already authorized exploratory edits during README-first deep learning reproduction. It tells the agent to keep all work on an isolated branch or worktree, document current_research and experiment metadata, and favor minimal adaptations over sweeping refactors. Results are exploratory candidates backed by CHANGESET.md, TOP_RUNS.md, and status.json under explore_outputs, not trusted reproduction conclusions. Use it only after a human or orchestrator explicitly allows code experiments; pair it with ai-research-reproduction for the overall repro flow, and do not treat this mode as permission to mutate the trusted baseline or claim rerun success from ad-hoc changes.
32.3k installs

3. Ai Research Reproduction
ai-research-reproduction is the RigorPilot Reproduce orchestrator for solo builders and small teams who need to rerun a deep learning repo the way its README describes, not by improvising fixes. It guides the agent through reading the repository first, picking the smallest trustworthy inference or evaluation path, then coordinating environment setup, trusted execution, optional training, optional analysis, and optional paper-gap resolution via helpers. The skill stresses faithful use of documented commands, weights, and datasets while logging evidence, assumptions, deviations, and where humans must decide. It explicitly rejects generic paper summaries, isolated scanning, standalone command runs without rigor, and broad research chat. Successful runs culminate in a standardized repro_outputs bundle suitable for audit and iteration; when gaps remain, invoke paper-context-resolver or authorized explore-code rather than bypassing README guidance.
32.3k installs

4. Paper Context Resolver
paper-context-resolver is a RigorPilot helper for builders mid-reproduction when the README and on-disk repo almost suffice but one reproduction-critical fact is missing, such as which split, preprocessing step, evaluation protocol, checkpoint mapping, or runtime assumption the authors actually used. It is not a general paper explainer: it answers a specific question needed to run or compare results, cites primary sources, and logs conflicts instead of silently preferring the PDF over the repo. Skip it when the README already documents the detail or when the user only wants a summary without a runnable repro goal. In the standard stack, ai-research-reproduction invokes this skill only after a concrete gap appears, keeping reproduction faithful and auditable rather than speculative.
32.3k installs

5. Run Train
run-train is a RigorPilot agent skill for solo builders and researchers who have already picked a training entrypoint from a deep-learning repo README or config. It executes that command conservatively (startup check, short verification run, full kickoff, or resume) without drifting into environment provisioning, hyperparameter sweeps, or autonomous feature work. Outputs land in a predictable train_outputs/ tree so you and your agent can audit what ran, which seed and config applied, where checkpoints landed, and how metrics evolved. Use it after repo-intake-and-plan or an equivalent command-selection step; if runs fail, hand off to safe-debug before any patch. Best for trustworthy reproduction and evidence-backed iteration, not for exploratory search or orchestrating the whole paper-to-code pipeline.
32.3k installs

6. Safe Debug
safe-debug implements Rigor Debug / Rigor Audit mode for deep-learning research repos: your agent reads the traceback or symptom, classifies the likely failure family, and proposes the smallest trustworthy fix while keeping repository code read-only until you authorize a patch. It aligns with RigorPilot’s conservative stance (no silent edits, no drift into speculative exploration) and recommends savepoints when the fix scope is medium or high. Use it after run-train failures, distributed job crashes, or checkpoint load mismatches, and before any automated refactor. The bundled categorization rules help solo builders triage OOM, DDP/NCCL, and tensor shape issues quickly so Codex or Claude Code does not burn tokens rewriting unrelated modules.
32.3k installs

7. Repo Intake And Plan
repo-intake-and-plan (Rigor Intake) gives solo builders a structured first pass on unfamiliar ML research repositories. The agent reads the README and dependency manifests, mines the configs and scripts folders, and classifies commands into inference, evaluation, training, or utility buckets, always preferring explicit documentation over filename guesses and marking inferred routes as uncertain. The payoff is a smallest trustworthy reproduction target so you do not jump straight into multi-day training or random notebook cells. It sits upstream of run-train and downstream of cloning a paper implementation; it does not provision environments or autonomously code around paper gaps. Ideal when you need scope clarity before Validate prototyping or Build backend work.
32.3k installs

8. Env And Assets Bootstrap
env-and-assets-bootstrap is the Rigor Setup leaf for RigorPilot: it prepares a conservative, auditable foundation before any reproduction command runs. Solo and indie builders shipping ML papers, open models, or research forks use it when README instructions are incomplete but they still need isolated dependencies and an honest asset inventory. The skill walks documented links and config defaults first, prefers conda-style environments when the repo expects them, and logs each checkpoint or dataset with source, local path, and status instead of silently pulling from unofficial mirrors. It fits validate and early ship moments when you must separate facts from guesses and avoid blocking a later smoke run on a missing tokenizer or wrong cache path. Outcomes are transparent assumptions your agent and future you can audit, not a claim that training or full eval already succeeded.
32.3k installs

9. Explore Run
explore-run is the Rigor Improve / Rigor Explore leaf for RigorPilot when you have permission to experiment beyond a single sanity command. It targets solo builders iterating on paper reproductions or model tweaks who need a conservative matrix of variants, clear isolation from the baseline, and human-readable candidate summaries instead of silent success claims. The skill encodes policy: no implicit exploration, record current_research and branch metadata, and use weighted selection across cost, success rate, and expected gain when generating variant plans. Bundled Python utilities help emit structured JSON specs for matrices. Use it after environment bootstrap and a minimal audited run establish what trusted means; outputs live under explore_outputs for review gates. It does not replace an end-to-end explore orchestrator and must not be treated as default behavior on every repo touch.
32.3k installs

10. Minimal Run And Audit
minimal-run-and-audit is the Rigor Run leaf for RigorPilot: it runs the one command you selected (smoke test, inference, evaluation, or sanity check) and packages proof an indie builder or reviewer can trust. The skill enforces short factual reporting, surfaces the exact documented command, and classifies whether the attempt was full, partial, smoke-only, sanity-only, or blocked without burying the blocker. A helper script runs subprocesses, combines logs, and pulls metric key-value pairs for machine-readable follow-up. When patches were applied, SUMMARY.md notes patch state while PATCHES.md holds the audit trail. It assumes environment and assets were prepared upstream and deliberately stops before training loops or unauthorized exploration. Ideal for closing the loop on README claims with copy-paste commands and a stable status.json for agents and CI-minded solo operators.
32.3k installs

11. Ai Research Explore
Rigor Explore (ai-research-explore) is a RigorPilot-compatible agent skill for solo and indie builders who need governed exploratory research on an already-defined current_research context. It is not a generic brainstorming or refactor tool: work stays on an isolated branch or worktree, outputs are candidate-only exploratory records, and code plus run exploration is coordinated conservatively rather than through freeform rewriting. The skill enforces clear separation between trusted and exploratory lanes, requires source-backed idea cards before transplant-style implementation planning, and keeps patch plans minimal, reversible, and auditable. Improvement mining stays bounded to the frozen task family, dataset, benchmark, evaluation source, and supplied SOTA references. Literature lookup is free-first and provider-optional, preferring local curated sources such as Zotero when available, so missing API keys do not halt the workflow. Use it when you have explicit authorization to explore on top of current_research and need auditable artifacts instead of silent workspace drift.
32.1k installs
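
The Safe Debug entry mentions categorization rules that triage OOM, DDP/NCCL, and tensor shape failures. The listing does not show those rules, so the following is only a rough sketch of what rule-based triage could look like. The family names, the regular expressions, and the `classify_failure` function are all hypothetical and are not taken from the skill itself.

```python
import re

# Hypothetical rule table: (failure family, pattern). These patterns are
# illustrative guesses at common error text, not the skill's bundled rules.
FAILURE_RULES = [
    ("oom", re.compile(r"out of memory", re.IGNORECASE)),
    ("ddp_nccl", re.compile(r"\bNCCL\b|DistributedDataParallel|ProcessGroup", re.IGNORECASE)),
    ("tensor_shape", re.compile(r"size mismatch|shapes cannot be multiplied|must match the size", re.IGNORECASE)),
]


def classify_failure(log_text: str) -> str:
    """Return the first matching failure family, or "unknown" if none match."""
    for family, pattern in FAILURE_RULES:
        if pattern.search(log_text):
            return family
    return "unknown"


if __name__ == "__main__":
    print(classify_failure("RuntimeError: CUDA out of memory. Tried to allocate 2.00 GiB"))
```

Returning "unknown" instead of guessing keeps the sketch in line with the entry's stated stance against unevidenced claims.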
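
The Explore Run entry describes weighted selection across cost, success rate, and expected gain, with utilities that emit JSON specs for variant matrices. As a minimal sketch under stated assumptions, the example below ranks candidate variants with a linear weighted score and serializes the top few. The `Variant` fields, the weights, the scoring formula, and the JSON layout are all invented for illustration; the actual skill may weigh and structure these differently.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Variant:
    name: str
    cost: float           # assumed normalized to 0..1, higher means more expensive
    success_rate: float   # assumed estimate in 0..1
    expected_gain: float  # assumed estimate in 0..1


# Hypothetical weights; they sum to 1 so the score stays in 0..1.
WEIGHTS = {"cost": 0.3, "success_rate": 0.3, "expected_gain": 0.4}


def score(v: Variant) -> float:
    """Linear weighted score; cost is inverted so cheaper variants rank higher."""
    return (
        WEIGHTS["cost"] * (1.0 - v.cost)
        + WEIGHTS["success_rate"] * v.success_rate
        + WEIGHTS["expected_gain"] * v.expected_gain
    )


def plan_matrix(variants: list[Variant], limit: int) -> str:
    """Return a JSON spec listing the top `limit` variants by score."""
    ranked = sorted(variants, key=score, reverse=True)[:limit]
    spec = {"variants": [dict(asdict(v), score=round(score(v), 4)) for v in ranked]}
    return json.dumps(spec, indent=2)


if __name__ == "__main__":
    candidates = [
        Variant("lr_small", cost=0.2, success_rate=0.9, expected_gain=0.3),
        Variant("big_backbone", cost=0.9, success_rate=0.5, expected_gain=0.8),
        Variant("aug_flip", cost=0.1, success_rate=0.95, expected_gain=0.1),
    ]
    print(plan_matrix(candidates, limit=2))
```

A fixed `limit` is one simple way to keep the matrix conservative, as the entry asks.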
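
The Minimal Run And Audit entry mentions a helper script that runs subprocesses, combines logs, and pulls metric key-value pairs. That helper is not shown in this listing. The sketch below illustrates the general idea using only the Python standard library. The function names, the metric regular expression, and the shape of the returned record are assumptions, and the record does not attempt the full/partial/smoke-only/blocked classification the entry describes.

```python
import re
import subprocess
import sys

# Assumed metric format: `name=number` or `name: number` (no scientific notation).
METRIC_RE = re.compile(r"\b([A-Za-z_][\w.]*)\s*[=:]\s*(-?\d+(?:\.\d+)?)\b")


def extract_metrics(log_text: str) -> dict[str, float]:
    """Collect metric key-value pairs from log text; later values overwrite earlier ones."""
    return {key: float(value) for key, value in METRIC_RE.findall(log_text)}


def run_and_audit(command: list[str]) -> dict:
    """Run one command, combine stdout and stderr, and return a small audit record."""
    proc = subprocess.run(command, capture_output=True, text=True)
    combined_log = proc.stdout + proc.stderr
    return {
        "command": command,
        "returncode": proc.returncode,
        "metrics": extract_metrics(combined_log),
    }


if __name__ == "__main__":
    # Stand-in for a documented repo command: a child Python process that prints a metric.
    record = run_and_audit([sys.executable, "-c", "print('acc=0.5')"])
    print(record)
```

Passing the command as a list avoids shell interpolation, so the recorded command is exactly what was executed.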