
lingzhi227/agent-research-skills
31 skills | 23.5k installs | 3.5k stars | GitHub
Install
npx skills add https://github.com/lingzhi227/agent-research-skills

Skills in this repo
1. Literature Review
Literature Review is an agent research skill that supplies dialogue prompts distilled from Stanford’s STORM system. Solo builders and indie researchers use it when they need structured prior-art exploration: first surfacing closely related topics and example pages, then spawning varied expert personas (or focused roundtable speakers) to stress-test assumptions before writing specs or code. The skill is intentionally prompt-forward (find related topics, generate expert personas, generate focused experts) so your coding agent can simulate multi-perspective review instead of one bland summary. It fits early journey phases when you are still choosing a problem space, drafting positioning, or validating that your angle is not already saturated in public knowledge bases. Pair it with human judgment on source quality; it accelerates framing and question generation, not peer-reviewed verification. Intermediate complexity assumes comfort orchestrating multi-turn agent research sessions.
1.5k installs

2. Literature Search
Literature-search is an agent skill backed by a small Python utility for solo and indie builders who need arXiv papers in a form agents can actually use. Instead of copying PDF text or guessing from abstracts, you search by paper title or arXiv identifier, download the publisher source bundle, and land extracted LaTeX in a local folder your Claude Code, Cursor, or Codex session can read. The script deliberately avoids extra dependencies (only urllib and the standard library XML parser) so it fits CI sandboxes and minimal dev machines. Use it during idea-phase desk research when validating whether an approach is already solved, when scoping a prototype that must match a named paper, or when wiring agent-tooling steps that need grounded citations.
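As a rough illustration of the literature-search lookup described above, the sketch below builds an arXiv API query from a title or identifier and parses the Atom response using only `urllib` and the standard library XML parser. The function names `build_query_url` and `parse_feed` are hypothetical, and whether the skill's own script uses these exact endpoints is an assumption; the code does no downloading itself.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Public arXiv API endpoint; that the skill's script calls it is an assumption.
ARXIV_API = "http://export.arxiv.org/api/query"
ATOM = "{http://www.w3.org/2005/Atom}"


def build_query_url(title=None, arxiv_id=None, max_results=5):
    """Build a query URL from an arXiv identifier or a paper title (hypothetical helper)."""
    params = {"start": 0, "max_results": max_results}
    if arxiv_id:
        params["id_list"] = arxiv_id
    elif title:
        params["search_query"] = f'ti:"{title}"'
    else:
        raise ValueError("provide a title or an arXiv identifier")
    return ARXIV_API + "?" + urllib.parse.urlencode(params)


def parse_feed(xml_text):
    """Extract id, title, and a source-bundle URL from an Atom feed (str or bytes)."""
    root = ET.fromstring(xml_text)
    papers = []
    for entry in root.findall(ATOM + "entry"):
        abs_url = entry.findtext(ATOM + "id", default="").strip()
        arxiv_id = abs_url.rsplit("/abs/", 1)[-1]
        # Titles in the feed may be wrapped across lines; collapse whitespace.
        title = " ".join(entry.findtext(ATOM + "title", default="").split())
        papers.append({
            "id": arxiv_id,
            "title": title,
            # arXiv serves source bundles under /e-print/<id>.
            "source_url": f"https://arxiv.org/e-print/{arxiv_id}",
        })
    return papers
```

In practice the URL would be fetched with `urllib.request.urlopen` and the response bytes passed to `parse_feed`; the download and LaTeX extraction steps are left out here.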
It does not replace a full reference manager; it optimizes the one job of getting arXiv .tex on disk fast for agent workflows.
1.1k installs

3. Data Analysis
data-analysis packages review prompts for scrutinizing quantitative analysis code the way research engineering pipelines expect: layered passes that catch math errors, careless data prep, and mis-specified statistical tests before results ship. Solo builders writing study scripts, benchmark notebooks, or backend analytics jobs can invoke it when they need an agent to systematically question normality assumptions, train-test leakage, unit consistency, and whether a test is appropriate for the data type. It originates from data-to-paper hypothesis-testing coding patterns and AgentLaboratory-style rigor, so it favors publication- or report-grade correctness over quick chart generation. Use after you have draft analysis code and before you trust p-values, effect sizes, or descriptive tables in a paper, investor deck, or production metric.
898 installs

4. Citation Management
citation-management packages a self-contained Python workflow for solo researchers and indie technical writers who draft in LaTeX and keep slipping into unsupported assertions. The harvest_citations.py utility walks your .tex file, flags sentences that look like factual claims without a \cite, turns those passages into search queries, and hits the Semantic Scholar graph API to propose matching papers as BibTeX you can merge into references.bib. Flags such as --dry-run, --max-rounds, and --verbose make it practical to iterate without bloating your bibliography overnight. It fits Prism’s audience when an agent should operationalize tedious citation gap-filling instead of hand-searching for every “recent work has shown” sentence. Use it while writing theses, lab notes turned papers, or long-form research posts where auditability matters.
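The claim-flagging step described for citation-management might look something like the sketch below. It is not taken from harvest_citations.py: the cue phrases, the regular expressions, and the function names `uncited_claims` and `to_query` are all assumptions chosen for illustration, and the Semantic Scholar lookup is omitted.

```python
import re

# Hypothetical cue phrases that often signal a factual claim needing a source.
CLAIM_CUES = re.compile(
    r"\b(recent work|has been shown|have shown|studies|"
    r"state[- ]of[- ]the[- ]art|outperforms?)\b",
    re.IGNORECASE,
)
# Matches \cite, \citep, \citet, ... with optional bracketed arguments.
CITE = re.compile(r"\\cite[a-zA-Z]*\*?(\[[^\]]*\])*\{[^}]+\}")


def split_sentences(tex):
    """Strip LaTeX comments, then split naively on sentence-ending punctuation."""
    text = re.sub(r"(?<!\\)%.*", "", tex)
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def uncited_claims(tex):
    """Return sentences that contain a claim cue but no citation command."""
    return [
        s for s in split_sentences(tex)
        if CLAIM_CUES.search(s) and not CITE.search(s)
    ]


def to_query(sentence, max_words=12):
    """Turn a flagged sentence into a plain keyword query string."""
    plain = re.sub(r"\\[a-zA-Z]+\*?(\[[^\]]*\])?(\{[^}]*\})?", " ", sentence)
    words = re.findall(r"[A-Za-z][A-Za-z\-]+", plain)
    return " ".join(words[:max_words])
```

A heuristic like this will miss claims without cue phrases and flag some harmless sentences, which is one reason a dry-run mode is useful before anything is written to references.bib.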
It does not replace reference managers or plagiarism checks; it accelerates finding plausible sources for already-written prose.
864 installs

5. Figure Generation
figure-generation packages research-grade prompts from MatPlotAgent so your coding agent can expand a vague chart request into step-by-step Python instructions, then emit executable Matplotlib code that saves PNGs. Solo builders validating ideas with quick exploratory plots, writing technical blogs, or preparing investor or academic figures benefit because the skill forces explicit library choices, parameter setting, and data preparation rather than one-shot hallucinated scripts. Use it when you need reproducible visualization code tied to a user query, especially where CSV ingestion or multi-step plotting logic matters. It sits in Build/docs as the primary shelf but also supports Idea research visuals and Grow content that needs charts. It does not replace a full notebook workflow or interactive BI; pair it with data you already trust and review generated code before running on sensitive datasets.
847 installs

6. Latex Formatting
LaTeX Formatting is an agent skill that gives solo builders and indie researchers a conference-ready reference for machine-learning venue templates. It maps NeurIPS, ICML, ICLR, AAAI, ACL/EMNLP, CVPR/ICCV, ICBINB, and arXiv against page limits, column layout, anonymous review rules, and style files so you pick the right `.sty` before writing. The recommended project skeleton (`main.tex`, `references.bib`, modular `sections/`, `figures/`, `tables/`, and appendix) keeps agents and humans aligned on where content lives. A single essential-packages block covers math, figures, professional tables, algorithms, subfigures, smart cross-refs, and microtype so first compilations are less painful. Use it when an agent is drafting or restructuring a submission, converting notes into LaTeX, or validating that a repo matches venue constraints.
It complements research workflows (related work, methods, experiments) without replacing bibliography management or figure-generation skills.
817 installs

7. Deep Research
Deep Research is an agent skill that teaches reliable arXiv API usage for literature discovery. It specifies query parameters, Boolean syntax, category filters, and sort options so agents do not hammer the public endpoint or return truncated, irrelevant feeds. Builders researching transformers, language models, multi-agent systems, or biomolecular ML get copy-paste patterns such as combined cat and all queries and conservative pagination. The skill foregrounds operational limits (100 results per call and a three-second pacing rule) so automated research loops stay polite and debuggable. It fits early opportunity validation and later technical spikes when you need primary sources before committing to an architecture or model choice.
779 installs

8. Idea Generation
idea-generation is a reference skill that packages prompt patterns from AI-Scientist and related autonomous research projects so an agent can propose the next experiment direction given task text, existing code, and a list of ideas already tried. Solo builders running research agents or benchmark pipelines use it at the Idea phase when they need creative but feasible hypotheses that respect code-only resources. The skill enforces a two-part response: a THOUGHT section for motivation and design, then NEW IDEA JSON with machine-friendly fields for downstream automation or report writing. It is not a general startup brainstorming tool; it targets ML paper-style novelty within supplied experiment.py context. Pair it with validation and build skills once an idea is selected for implementation.
769 installs

9. Paper Revision
Paper Revision Prompts packages patterns from AI-Scientist (perform_improvement) and AgentLaboratory (report_refinement) so agent-assisted researchers can respond to peer review systematically.
Solo builders and small lab teams use it when a manuscript returns with multi-page reviews and they need to classify each comment, map it to sections, and choose between a light edit and a partial pipeline rerun. The skill emphasizes Major vs Minor vs Question vs Suggestion triage, explicit revision plans, and optional experiment re-execution with state preservation. It fits builders who already run agentic science workflows and want procedural knowledge instead of one-off chat instructions. It does not replace domain expertise on statistics or ethics; it structures how feedback becomes actionable work and updated text.
767 installs

10. Experiment Design
Experiment-design packages a progressive research playbook for agent-driven science workflows. Stage 1 forces a minimal working implementation on the simplest dataset with converging training curves and sane validation metrics. Stage 2 shifts to baseline tuning (learning rate, batch size, and core hyperparameters), tested on at least two datasets with three random seeds so results are not lottery wins. Stage 3 opens creative research: novel method changes, broader dataset coverage, multiple baseline comparisons, ablations that support contribution claims, and figures suitable for a paper. The skill is prose prompts and criteria rather than a training binary, so it pairs naturally with coding agents that implement and run experiments. Solo builders and small research pods use it to keep exploratory ML work honest before they treat a spike as a shippable result or write up findings.
741 installs

11. Novelty Assessment
Novelty Assessment is an agent skill for solo researchers, indie ML builders, and small labs who must avoid duplicating published work before writing proposals, grant drafts, or implementation plans. The workflow breaks an idea into core contributions and subfields, then runs up to ten rounds of focused literature retrieval and critique, reviewing roughly ten top hits per round for substantive overlap.
A deliberately harsh critic stance forces a clear binary outcome (novel or not) with justification you can cite in emails to advisors or co-founders. Companion scripts can automate rounds and emit JSON reports, while reference prompts live in the skill package. Use it early in Idea when exploring directions, and again in Validate when tightening scope before a prototype. It does not replace human peer review, ethics review, or patent counsel; it reduces the risk of building on already-solved problems.
734 installs

12. Github Research
github-research is a methodology skill for solo builders running agentic literature-to-code pipelines. It documents how to move from deep-research deliverables, especially paper_db.jsonl and optional phase4_code code_repos.md, into systematic GitHub discovery with a search strategy matrix spanning broad topic queries, exact paper titles, method names, and author profiles. Expect structured outputs: handfuls of direct repo links, dozens of keywords at varied specificity, and explicit paper-to-repo mappings. The guide is intentionally phased (intake, discovery, and beyond in the full skill) so an agent does not improvise vague “search GitHub for X” steps when validating whether an idea has mature open implementations. It pairs with research automation stacks rather than replacing hands-on prototyping, and it assumes English technical terms when sources are multilingual. Builders use it when they need a reproducible map of the OSS landscape before scope lock or integration choices.
725 installs

13. Research Planning
research-planning is an agent skill for solo academics, indie ML builders, and research engineers who need a reproducible blueprint before writing code or drafting a paper.
Given a topic or paper to reproduce, it walks through context gathering (question, significance, datasets, compute), then emits a four-stage plan adapted from Paper2Code: strategic methodology and metrics, system/file architecture with diagrams, dependency-aware task breakdown with shared knowledge, and config extraction for training or experiments. It also structures the paper outline and evaluation story so later automation or human implementation has a single source of truth. Use at project kickoff, before benchmarking a new idea, or when cloning a published method. It does not run experiments or train models; it designs how you will. References live under the skill path for prompts and output schemas. Intermediate complexity reflects assumed familiarity with ML experimentation vocabulary.
724 installs

14. Paper Writing Section
Paper Writing Section packages two-pass refinement prompts extracted from the AI-Scientist perform_writeup flow so your coding agent can critique and complete one LaTeX section at a time. Solo builders and indie researchers who ship papers with Claude Code or similar agents use it after a rough draft exists and before the full manuscript is stitched together. Pass one targets mechanical and scientific integrity errors (unenclosed math, bad labels, references missing from the bibliography, results not traceable to experiment logs, and broken begin/end pairs) while insisting placeholders are filled. Pass two applies section-specific writing guidance to tighten verbosity and align tone with the rest of the paper.
It fits agentic research pipelines where experiments, figures, and notes.txt already live in-repo and the bottleneck is consistent, error-free section text rather than ideation or deployment.
713 installs

15. Code Debugging
code-debugging is an agent skill that packages research-agent debugging patterns, especially data-to-paper’s CodeProblem severity ladder and RunIssue records, so solo builders and small teams can triage failing agent-written code systematically. It targets builders shipping scientific agents, automation CLIs, or experiment runners where stderr alone is not enough and you need categorized failures with fix instructions. Use it during Ship when tests or sandbox runs fail, during Build when integrating generated blocks, and in Operate when production jobs regress on output files or timeouts. The skill matters because unstructured debug loops waste tokens: ordering problems from “no code” down to “all OK” tells the agent what to fix first and whether to request a small patch versus a rewrite. It is procedural knowledge, not a live debugger MCP; your agent applies the taxonomy when interpreting run logs and proposing fixes.
711 installs

16. Algorithm Design
Algorithm Design Templates is an agent skill for solo builders and indie researchers who need consistent, journal-style pseudo-code without hand-rolling LaTeX every time. It packages patterns extracted from Paper2Code’s planning stage and AI-Researcher’s plan agent: structured Require/Ensure headers, nested loops over mini-batches, subroutine calls with `\ref`, and iterative optimization with explicit convergence criteria. Use it when you are documenting ML or optimization procedures alongside an implementation spec, a thesis chapter, or an agent-generated research plan. The skill does not implement algorithms in Python or Rust; it gives you copy-paste LaTeX that agents can fill with your notation, loss definitions, and subroutine names.
That makes handoffs to writing-plans or code generation cleaner because the documented steps match what you intend to ship. Intermediate familiarity with LaTeX macros and standard ML notation helps; beginners can still start from the filled examples and edit symbols incrementally.
710 installs

17. Survey Generation
Survey Generation is a prompt-library skill distilled from AutoSurvey's prompt module for solo researchers and indie builders who need agent-ready templates to draft large-scale academic surveys. The first workflow runs parallel rough outline generation: the model receives a topic and enumerated papers, then returns a titled outline with a fixed number of sections, each paired with a one-sentence writing brief. A second merging stage takes multiple such outlines and asks an expert persona to produce a single final outline that is more comprehensive and logically ordered. Outputs are constrained to an explicit title-and-sections format so downstream agents can expand sections into full prose or citations. It fits early Idea work when you are mapping a technical field before building an RAG product, writing launch content, or scoping a validate-phase prototype, but it does not fetch papers or verify citations; you supply the reading list and edit the merged outline before treating it as publication-ready.
708 installs

18. Rebuttal Writing
Rebuttal Writing packages the ChatReviewer-style workflow for academic authors answering peer review under tight deadlines. It is aimed at solo researchers and small lab groups who must extract every numbered concern from heterogeneous review formats and reply discipline-by-discipline without drifting into future-tense commitments reviewers penalize. The skill embeds the full system prompt requiring point-to-point responses, a canonical review comment structure for parsing strengths and weaknesses, and a markdown output template starting with a concise gratitude paragraph and blue-highlight change notes in the revised manuscript.
Use it in Validate when reviews land and you need a consistent rebuttal voice across multiple reviewers while aligning claims to edits already made. It does not replace understanding your paper’s limits; pair human fact-checking on experiments and citations before submission uploads.
702 installs

19. Table Generation
Table Generation is a template-oriented agent skill for solo researchers and builders documenting ML or benchmark results in LaTeX. It supplies ready-made booktabs layouts (plain comparison tables and grouped multirow structures) with captions, labels, and notation patterns like ± standard deviations and bold best rows. Use it during Build when you are writing docs or a paper section and need tables that match venue conventions without relearning multirow and threeparttable every sprint. The skill is a phase-specific documentation aid, not a data pipeline; you still supply numbers and methods. It pairs well with analysis skills that produce metrics. Complexity is beginner to intermediate depending on how much you customize column groups and footnotes.
700 installs

20. Related Work Writing
Related Work Writing is an agent skill for solo builders and small research teams who need publication-quality Related Work sections aligned with AI-Scientist and AgentLaboratory-style guidance. It targets the gap between a bibliography dump and a scholarly positioning argument: each cited work should be an academic sibling (an alternative attempt at the same problem) with clear contrast in assumptions or method, not a chronological summary. The skill walks you through thematic clustering (problem variants, methodology families, application domains, evaluation paradigms, theoretical foundations), then prescribes per-theme paragraphs that state what key papers did and how they differ from each other and from your approach. When a method fits your problem setting, you are prompted to plan experimental comparison; when it does not, you must state why explicitly.
It is aimed at researchers writing papers or theses, not at casual blog posts. Organize by theme, cite broadly, and treat Related Work as the intellectual map that justifies your contribution before you invest months in Build and Ship.
695 installs

21. Math Reasoning
math-reasoning is a compact notation and LaTeX reference for machine-learning and AI research writing. Solo builders and indie researchers use it when agents draft paper sections, appendices, or blog posts that need standard symbols for input/output spaces, losses, expectations, divergences, and matrix calculus without inventing ad-hoc notation. It fits agents working in Claude Code, Cursor, or Codex on literature-style outputs where reviewers expect familiar 𝒳/𝒴, 𝔼[·], D_KL, and bold vector conventions. The skill does not run computations or verify proofs; it steers formatting and vocabulary so equations align with common ML paper style. Use it during doc-heavy Build work and when polishing Validate-stage research memos or Idea-stage survey notes that include formal definitions.
694 installs

22. Paper Compilation
Paper Compilation is a self-contained Python skill adapted from AI-Scientist and data-to-paper flows so solo researchers and agent builders turn LaTeX sources into PDFs without manual shell guesswork. You point it at paper/main.tex and it orchestrates the classic multi-pass build, optionally runs chktex, and reports errors in a form your coding agent can fix. It fits idea-phase research when you are iterating manuscripts, benchmarks, or internal tech reports, not when you are shipping a SaaS UI. Requires a working pdflatex and bibtex toolchain on the machine running the script. Use when your agent wrote or edited .tex and you need a reliable compile gate before sharing or archiving.
693 installs

23. Self Review
self-review is a multi-phase agent skill that applies the NeurIPS review form extracted from AI-Scientist’s perform_review.py so solo researchers can stress-test papers before submission.
It instructs the agent to produce a brief contribution summary, then a thorough strengths and weaknesses assessment along originality, technical quality, clarity, and significance, followed by concrete questions for authors and a limitations discussion. The rubric mirrors what venue reviewers expect, which helps indie labs without a formal peer circle catch gaps in related work, unsupported claims, or unclear exposition early. Use it when a draft is complete enough to critique but still private; it is complementary to brainstorming or experiment skills, not a replacement for statistical testing. For builder-journeys that ship research agents or publish ML writeups, it turns opaque “does this read okay?” anxiety into a repeatable review pass that can gate whether to iterate experiments or pursue arXiv/upload.
683 installs

24. Slide Generation
slide-generation supplies Beamer LaTeX templates and layout patterns tuned for research presentations rather than generic pitch slides. Solo builders and indie researchers use it when they need a credible deck structure (problem statement, literature gap, key idea figure, and formulation slides) without designing layout from scratch. The skill emphasizes metropolis theming, appendix numbering, booktabs-friendly tables, and reusable frame patterns for motivation versus method. It fits agents that emit .tex source for local compilation. Typical moments include preparing a conference talk after a paper or prototype exists, turning validation insights into a demo day deck, or supporting grow-phase content repurposing. Complexity assumes comfort with LaTeX toolchains. Outputs are starter .tex you customize with real figures, citations, and results, not automated PDF rendering inside the agent unless your environment compiles LaTeX.
681 installs

25. Experiment Code
experiment-code is a reference skill for solo builders and indie researchers who automate scientific coding with agents.
It documents concrete Python patterns for running repeatable experiment scripts: an outer run counter, an inner retry budget when subprocess calls fail, and prompts that feed stderr tails or JSON metric summaries back into the coding agent. The AI-Scientist loop caps iterations and runs so costs stay bounded while still allowing multiple attempts to fix broken training or evaluation scripts. A complementary hill-climbing optimization pattern supports stepwise code improvements when benchmarking against baselines. Use it when you are prototyping ML pipelines, ablation studies, or agent-lab workflows, not when you only need a one-off notebook cell. It pairs well with research planning skills and later ship-phase hardening once a run directory layout is stable.
678 installs

26. Paper To Code
Paper to Code is an agent skill that implements the Paper2Code workflow so your agent can ingest a machine-learning paper and emit a complete, runnable codebase. Solo builders and researchers use it when they need to validate that published methods are reproducible, bootstrap a thesis or product experiment, or skip weeks of manual translation from equations to modules. Stage one planning extracts methodology, experiments, metrics, and hyperparameters, then produces architecture diagrams, per-file logic notes, and a config.yaml. Stage two analyzes each file against the UML and config. Stage three generates code in dependency order while preserving interfaces. The skill points to bundled reference prompts under the skill’s references path. Expect advanced ML literacy; the agent still needs your review for correctness, licensing, and dataset access.
666 installs

27. Symbolic Equation
symbolic-equation is an agent-research skill that encodes patterns extracted from the LLM-SR codebase (pipeline, sampler, buffer, evaluator, and config) so coding agents can reason about symbolic equation discovery instead of guessing file layout.
It is aimed at builders and researchers automating scientific ML workflows: multi-island evolutionary search, parallel LLM sampling, and evaluator scoring loops that register improving programs over time. Use it when you are still in the research phase, prototyping discoverable function structures for datasets with interpretable inputs, not when you only need a one-line stats API in production. The readme emphasizes continuous sample/analyse cycles and island-aware prompts, which maps cleanly to agent orchestration over long-running search jobs. This is advanced, niche tooling compared with typical solo SaaS skills; prerequisites include comfort with Python ML pipelines and tolerance for experimental symbolic-regression stacks.
666 installs

28. Atomic Decomposition
Atomic Decomposition is an agent skill for solo builders and small research teams who need to turn a vague research idea or dense systems paper into inspectable building blocks. You supply an idea, paper, or method description; the agent breaks it into atomic concepts that each have clear mathematical foundations, a plausible code implementation, and citations back to sources. For every atom it runs a paper survey to extract LaTeX formulas, assumptions, and reference papers, then a code survey to find implementations, variations, and reference repositories, and finally assembles structured knowledge entries you can reuse in specs or agent workflows. Use it when a complex system paper requires formal grounding before you prototype or write an implementation plan. It pairs naturally with literature review and implementation planning skills rather than replacing hands-on coding or empirical validation.
659 installs

29. Backward Traceability
Backward Traceability Patterns is an agent skill that teaches LaTeX conventions for linking narrative claims in a paper to the exact code lines that produced each numeric value.
Solo builders and indie researchers shipping reproducible ML or statistics writeups install it when they generate results with Python and must defend every percentage and delta in peer review or preprint scrutiny. The skill encodes data-to-paper primitives: hypertarget markers in generated TeX, hyperlink references in the manuscript body, and compile-time \num formulas so derived statistics cannot drift from source outputs without a failed build. It also standardizes label grammar and letter suffix rules so agents and humans can grep from a PDF click target back to ref_numeric_values-style pipelines. Use it while integrating latex_to_pdf and referencable_text flows, before final submission, and whenever a collaborator asks where a table cell came from. It matters because ad-hoc copy-paste from notebooks is the fastest way to lose auditability; this skill makes traceability a repeatable agent procedure instead of a one-off postmortem.
653 installs

30. Excalidraw Skill
excalidraw-skill is a cheatsheet-oriented integration skill for driving an Excalidraw canvas through MCP rather than manual drawing. Solo builders and small teams use it when they need fast system diagrams, flowcharts, or whiteboard-style specs that agents can create, update, and re-read via describe_scene. The documented surface spans twenty-six tools (from create_element through batch_create_elements, alignment and distribution, grouping, locking, and scene-level description) plus health checks against a default local express server URL. It fits early architecture conversations, build-phase documentation, and validate-phase prototypes where a visual beats a long prose spec. You should have the canvas server running and MCP wired in your agent; the skill teaches tool parameters and workflows, not hosted Excalidraw account setup.
644 installs

31. Paper Assembly
Paper Assembly is an agent skill that orchestrates a full research-paper workflow for solo builders and small teams using Claude Code.
You point it at a paper project directory or plan; it assesses which of nine phases are complete, builds a dependency graph, and runs remaining work in order, from literature search and review through planning, code, experiments, figures, tables, writing, and review. A bundled assembly_checker script emits checkpoint JSON and verbose completeness reports so you can pause and resume without manually tracking artifacts. It is built for agent-research-skills stacks where each phase maps to companion skills (literature-search, research-planning, and similar). Use it when you have partial paper components and need a single conductor rather than ad-hoc phase hopping. The skill emphasizes state propagation and checkpoint files over one-shot generation, which matters for long-running empirical work where experiments and figures land on different days.
561 installs
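
The phase ordering and checkpointing idea in the Paper Assembly entry can be sketched with the standard library alone. The nine phase names follow the description above, but the dependency edges, the checkpoint layout, and the function names are assumptions made for illustration; this is not the bundled assembly_checker script.

```python
import json
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical dependency map: each phase lists the phases it depends on.
PHASE_DEPS = {
    "literature_search": set(),
    "literature_review": {"literature_search"},
    "planning": {"literature_review"},
    "code": {"planning"},
    "experiments": {"code"},
    "figures": {"experiments"},
    "tables": {"experiments"},
    "writing": {"figures", "tables", "literature_review"},
    "review": {"writing"},
}


def remaining_phases(completed):
    """Return unfinished phases in an order that respects dependencies."""
    order = TopologicalSorter(PHASE_DEPS).static_order()
    return [phase for phase in order if phase not in completed]


def save_checkpoint(path, completed):
    """Write the set of completed phases as JSON (assumed layout)."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"completed": sorted(completed)}, fh, indent=2)


def load_checkpoint(path):
    """Read completed phases; a missing file means nothing is done yet."""
    try:
        with open(path, encoding="utf-8") as fh:
            return set(json.load(fh)["completed"])
    except FileNotFoundError:
        return set()
```

A resume step would call `load_checkpoint`, run the first entry of `remaining_phases`, and save again after each phase finishes, so work that lands on different days is never recomputed.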