
Keyword Research Harvest
Install this when you need a repeatable, on-disk pipeline that searches scholarly APIs from topic keywords and saves deduplicated full texts for literature review.
Overview
keyword-research-harvest is an agent skill for the Idea phase that runs a local scholarly-API harvest: candidate table, full-text download, HTML-to-PDF retry, and deduplication into a project folder.
Install
npx skills add https://github.com/zhongzhx/literature-harvest --skill SKILL.md
What is this skill?
- Keyword-driven searches across scholarly APIs with results normalized into a candidate table
- Downloads accessible PDFs plus HTML/XML full texts into a project folder layout
- Second-pass HTML-to-PDF attempt for sources that did not land as PDF on first try
- File-level deduplication so reruns do not clutter the corpus
- Designed as a reusable local workflow you can rerun for any topic keywords
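The file-level deduplication mentioned above can be sketched as a content-hash pass over the project folder. This is a minimal illustration, not the skill's actual implementation: the function names `file_digest` and `dedupe_folder`, the choice of SHA-256, and the "keep the first path in sorted order" rule are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def dedupe_folder(folder: Path) -> list[Path]:
    """Delete byte-identical files under `folder` (hypothetical helper).

    The first file in sorted path order is kept; later copies are removed.
    Returns the list of removed paths.
    """
    seen: dict[str, Path] = {}
    removed: list[Path] = []
    for path in sorted(p for p in folder.rglob("*") if p.is_file()):
        digest = file_digest(path)
        if digest in seen:
            path.unlink()  # exact duplicate of seen[digest]
            removed.append(path)
        else:
            seen[digest] = path
    return removed
```

Hashing file contents rather than comparing names means a rerun that saves the same PDF under a different filename is still caught. It only detects byte-identical copies, so two different renderings of the same paper would both survive.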
Adoption & trust: 61 GitHub stars.
What problem does it solve?
You know your topic keywords but lack a reproducible way to search scholarly APIs, download full texts, and avoid duplicate files in one project folder.
Who is it for?
Solo builders doing keyword-driven literature reviews who want artifacts on disk and a workflow they can repeat per topic or project.
Skip if: you only need a quick web summary without PDFs, or your team requires licensed database access and compliance workflows, which this skill does not configure.
When should I use this skill?
User wants a reusable local workflow to search scholarly APIs for topic keywords, build a candidate table, download accessible full texts, run HTML-to-PDF retry, and deduplicate files.
What do I get? / Deliverables
You get a deduplicated local corpus and candidate table you can rerun for new keywords before scoping or prototyping.
- Candidate table of scholarly hits for the keyword set
- Deduplicated PDF or converted full-text files in the project folder
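To make the candidate table concrete, here is one way hits from several APIs might be normalized into shared columns and written as CSV. It is a sketch under stated assumptions: the column set, the input field names (`title`, `doi`, `year`, `url`), the source labels, and the rule of keying on DOI with a title fallback are all hypothetical and not taken from the skill itself.

```python
import csv
import io

# Hypothetical column set for the candidate table.
COLUMNS = ["title", "doi", "year", "source", "fulltext_url"]


def normalize(record: dict, source: str) -> dict:
    """Map one raw API record (assumed field names) onto the shared columns."""
    doi = (record.get("doi") or "").strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return {
        "title": " ".join((record.get("title") or "").split()),
        "doi": doi,
        "year": str(record.get("year") or ""),
        "source": source,
        "fulltext_url": record.get("url") or "",
    }


def build_table(batches: dict[str, list[dict]]) -> list[dict]:
    """Merge per-source batches, keeping the first row seen for each DOI or title."""
    rows, seen = [], set()
    for source, records in batches.items():
        for record in records:
            row = normalize(record, source)
            key = row["doi"] or row["title"].casefold()
            if key and key not in seen:
                seen.add(key)
                rows.append(row)
    return rows


def to_csv(rows: list[dict]) -> str:
    """Serialize the candidate rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Keying on a normalized DOI first keeps the same article from appearing twice when two APIs return it with different title formatting.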
Journey fit
Literature harvesting is a pre-commit research task: you gather evidence and sources before narrowing what to build or validate. The skill’s entire flow—API search, candidate table, downloads, second-pass conversion, dedupe—maps directly to systematic topic research, not shipping or growth work.
How it compares
Use it instead of one-off chat summaries or a generic research MCP when you need downloadable primary sources and a repeatable harvest script pattern.
Common Questions / FAQ
Who is keyword-research-harvest for?
Indie and solo builders using Claude Code, Cursor, or similar agents who want scholarly search and full-text collection automated into a project folder for early research.
When should I use keyword-research-harvest?
Use it in Idea/research when exploring a market or technology topic, and again in Validate/scope when you need papers to back a prototype or positioning decision.
Is keyword-research-harvest safe to install?
It performs network downloads and local file writes; review the Security Audits panel on this Prism page and inspect SKILL.md permissions before running it on sensitive machines.
SKILL.md
SKILL.md - Keyword Research Harvest
Use when the user wants a reusable local workflow to search scholarly APIs for any topic keywords, build a candidate table, download accessible PDFs or HTML/XML full texts, run a second-pass HTML-to-PDF attempt, and deduplicate the downloaded files. Best for keyword-driven research-article harvesting that should be reproducible and saved into a project folder.
# keyword-research-harvest
{
  "name": "keyword-research-harvest",
  "description": "Use when the user wants a reusable local workflow to search scholarly APIs for any topic keywords, build a candidate table, download accessible PDFs or HTML/XML full texts, run a second-pass HTML-to-PDF attempt, and deduplicate the downloaded files. Best for keyword-driven research-article harvesting that should be reproducible and saved into a project folder."
}
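The second-pass HTML-to-PDF attempt described above could be organized roughly as follows: find HTML/XML files that have no PDF beside them, then hand each one to a converter. This is only a sketch. The helper names, the "same stem, `.pdf` suffix" convention, and the `convert` callback are assumptions; the actual converter the skill uses is not stated in the source, so it is left as a pluggable function.

```python
from pathlib import Path
from typing import Callable

MARKUP_SUFFIXES = {".html", ".htm", ".xml"}


def pending_conversions(folder: Path) -> list[Path]:
    """List HTML/XML files that have no PDF with the same stem next to them."""
    return sorted(
        p
        for p in folder.rglob("*")
        if p.is_file()
        and p.suffix.lower() in MARKUP_SUFFIXES
        and not p.with_suffix(".pdf").exists()
    )


def second_pass(
    folder: Path, convert: Callable[[Path, Path], None]
) -> dict[str, list[Path]]:
    """Call convert(src, dst) for each pending file (hypothetical driver).

    A failure on one file is recorded and does not stop the pass.
    """
    result: dict[str, list[Path]] = {"converted": [], "failed": []}
    for src in pending_conversions(folder):
        dst = src.with_suffix(".pdf")
        try:
            convert(src, dst)
        except Exception:
            dst.unlink(missing_ok=True)  # drop any partial output
            result["failed"].append(src)
            continue
        if dst.exists():
            result["converted"].append(src)
        else:
            result["failed"].append(src)
    return result
```

Because the pending list is computed from what is on disk, rerunning the pass skips anything that already has a PDF and retries only the files that failed earlier.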