
Ref Verify
Run reference and citation verification on agent-generated research so you catch hallucinated sources before publishing docs, posts, or investor updates.
Overview
ref-verify is an agent skill most often used in Ship review (also Build docs, Launch distribution) that verifies references and citations in AI-assisted research outputs before you publish.
Install
npx skills add https://github.com/moonweave-research/ref-verify --skill ref-verify
What is this skill?
- Oriented around verification of claims and references—not open-ended research generation
- Two distinct modes, Quick Screen and Full Audit, trade cost against depth
- Explicit focus on real failure modes; fabricated hallucination examples are treated as unacceptable input
- Factual, correction-friendly workflow aligned with research integrity norms
- Proposed enhancements must weigh tradeoffs in API cost, output length, and effects on agents other than Claude Code
Adoption & trust: 1 install on skills.sh; 8 GitHub stars; 1/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).
What problem does it solve?
Your agent's draft cites papers, links, or statistics you have not validated, and a bad reference could embarrass you at launch.
Who is it for?
Indie founders and technical writers who publish research-heavy posts, docs, or pitches where citation accuracy matters more than speed.
Skip if: Pure codegen tasks with no external claims, or teams that already run a formal human fact-check pipeline on every paragraph.
When should I use this skill?
User needs to verify references, audit citations, or screen agent research for unsupported claims before publication.
What do I get? / Deliverables
You get a verification-oriented pass that surfaces unsupported or suspect references so you can fix or remove claims before shipping content.
- Reference verification report with flagged or unsupported citations
- Actionable corrections aligned to factual verification norms
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Ship review is the first hard gate where fabricated citations block a release, ahead of distribution in Launch. The review phase fits the Quick Screen and Full Audit verification modes described in the project materials, which apply to near-final drafts rather than initial drafting.
Where it fits
Scan API guide footnotes and external links before merging documentation PRs.
Run a verification pass on an investor memo with numbered citations.
Validate sources in a launch blog post before posting to HN or newsletters.
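Before any of these passes, it can help to list what a draft actually cites. The following is a minimal sketch, not part of ref-verify itself: the function name `extract_references` and both regular expressions are assumptions made for illustration, and the DOI in the usage note is a placeholder, not a real citation. It only collects DOI-like strings and URLs; it does not check that any of them resolve.

```python
import re

# Assumed pattern for the common "10.<registrant>/<suffix>" DOI shape.
# It is deliberately loose and may pick up trailing punctuation.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")

# Assumed pattern for http(s) links; stops at whitespace, quotes, and brackets.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_references(text: str) -> dict[str, list[str]]:
    """Collect unique DOI-like strings and URLs in order of first appearance."""
    dois: list[str] = []
    for match in DOI_PATTERN.finditer(text):
        # Drop sentence punctuation glued to the end of the match.
        # Caveat: this also trims a DOI that legitimately ends in ")".
        doi = match.group(0).rstrip(".,;:)")
        if doi not in dois:
            dois.append(doi)

    urls: list[str] = []
    for match in URL_PATTERN.finditer(text):
        url = match.group(0).rstrip(".,;:")
        if url not in urls:
            urls.append(url)

    return {"dois": dois, "urls": urls}
```

Calling `extract_references("See https://doi.org/10.1000/xyz123.")` on a draft returns a dictionary with `dois` and `urls` lists, which can then be pasted into the verification prompt so no cited source is skipped.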
How it compares
Citation verification workflow for agent drafts—not a general web search skill or SEO content generator.
Common Questions / FAQ
Who is ref-verify for?
Solo builders and small research teams using Claude Code or similar agents who need reference checking before sharing AI-written reports or documentation.
When should I use ref-verify?
Use it in Ship review on near-final drafts, in Build docs when embedding external sources in README or API docs, and in Launch distribution before SEO or PR content goes live.
Is ref-verify safe to install?
Verification skills may call external APIs; review the Security Audits panel on this Prism page and do not treat automated output as legal or academic certification.
SKILL.md
## What problem does this solve?

(describe a real failure mode or gap, not an abstract improvement)

## Proposed change

(what should the skill do differently?)

## Tradeoffs

- Does this add cost (more API calls, longer output)?
- Does this change the Quick Screen / Full Audit balance?
- Does it affect agents other than Claude Code?

## Example

(if possible, show a before/after of what the output would look like)

# Code of Conduct

## Our Pledge

We are committed to making participation in this project a respectful, harassment-free experience for everyone, regardless of background or experience level.

## Our Standards

**Expected behavior:**

- Use clear, factual language: this project is about verification, not advocacy
- Accept corrections gracefully: if a test case shows a bug, that's valuable information
- Focus feedback on the skill behavior, not on other contributors

**Unacceptable behavior:**

- Harassment, personal attacks, or discriminatory language
- Deliberate misrepresentation of test results or failure cases
- Submitting fabricated hallucination examples as real catches

## Enforcement

Instances of unacceptable behavior may be reported to the maintainers via GitHub Issues (mark as `conduct`). All reports will be reviewed and investigated. Maintainers have the right to remove comments, close issues, or ban contributors who violate these standards.

## Attribution

Adapted from the [Contributor Covenant](https://www.contributor-covenant.org), version 2.1.

## What this changes

(one paragraph: what problem does this PR solve?)

## Type of change

- [ ] Bug fix: skill was producing wrong verdicts
- [ ] New API source (e.g. Retraction Watch, IEEE Xplore)
- [ ] Trigger description improvement
- [ ] New test case in `evals/evals.json`
- [ ] Documentation

## Evidence

For skill changes: show a before/after example with a real DOI.
```
Before (without this change):
[verdict or behavior]

After (with this change):
[verdict or behavior]
```

For trigger changes: show which queries now correctly trigger (or don't) that didn't before.

## Checklist

- [ ] The core rule is preserved (verbatim abstract traceability, UNVERIFIABLE instead of guessing)
- [ ] Quick Screen and Full Audit remain distinct modes
- [ ] SKILL.md is under 500 lines
- [ ] If adding a test case: the DOI is real and independently verifiable
- [ ] CHANGELOG.md updated under `[Unreleased]`

# Contributing to ref-verify

Thank you for helping improve citation verification. Contributions of all kinds are welcome: new failure modes, broader API coverage, improved trigger descriptions, and additional test cases.

---

## Ways to contribute

### Report a hallucination case

If `ref-verify` missed a real error (false negative) or flagged something incorrectly (false positive), open an issue with:

- The prompt you used
- What the skill returned
- What the correct result should have been
- The DOI or paper title involved

These are the most valuable contributions. Real failure cases are how the skill improves.

### Add a test case

Test cases live in `evals/evals.json`. A good test case:

- Uses a real DOI that can be independently verified
- Tests a specific failure mode (wrong author, hallucinated content, near-miss, retracted paper)
- Has a clear `expected_output` description

See the existing three cases for format reference.

### Improve the skill

`SKILL.md` is the skill itself: the instructions the agent follows. Improvements should:

- Solve a documented problem (link to an issue or test case)
- Not add scope beyond citation verification
- Keep the two-mode design intact (Quick Screen and Full Audit)
- Preserve the core rule: every content statement must be verbatim from a fetched abstract

### Extend API coverage

Currently covers: CrossRef, Semantic Scholar, Unpaywall, arXiv, PubMed. Additions worth considering: Retraction Watc