Ref Verify

Name: Ref Verify
Author: moonweave-research

moonweave-research/ref-verify

Run reference and citation verification on agent-generated research so you catch hallucinated sources before publishing docs, posts, or investor updates.

Overview

ref-verify is an agent skill that verifies references and citations in AI-assisted research outputs before you publish. It is most often used in Ship review, and also fits Build docs and Launch distribution.

Install

npx skills add https://github.com/moonweave-research/ref-verify --skill ref-verify

What is this skill?

Oriented around verification of claims and references, not open-ended research generation
Two distinct modes, Quick Screen and Full Audit, balance cost against depth
Built around real failure modes; fabricated hallucination examples are not accepted as contributions
Factual, correction-friendly workflow aligned with research integrity norms
Proposed enhancements are weighed against API cost, output length, and their effect on agents other than Claude Code

Compatible agents: Claude Code, Cursor, Codex, and other compatible agents

Adoption & trust: 1 install on skills.sh; 8 GitHub stars; 1/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).

What problem does it solve?

Your agent draft cites papers, links, or statistics you have not validated, and a bad reference could embarrass you at launch.

Who is it for?

Indie founders and technical writers who publish research-heavy posts, docs, or pitches where citation accuracy matters more than speed.

Skip if: Pure codegen tasks with no external claims, or teams that already run a formal human fact-check pipeline on every paragraph.

When should I use this skill?

Use it when you need to verify references, audit citations, or screen agent research for unsupported claims before publication.

What do I get? / Deliverables

You get a verification-oriented pass that surfaces unsupported or suspect references so you can fix or remove claims before shipping content.

Reference verification report with flagged or unsupported citations
Actionable corrections aligned to factual verification norms

Journey fit

Spans multiple journey phases - primary shelf plus alternate fits below.

Primary fit

Ship review is the first hard gate where fabricated citations can block a release, ahead of distribution in Launch. The Quick Screen and Full Audit modes described in the project materials suit review of a finished draft rather than initial drafting.

Also useful

Build: Docs & content
Launch: Distribution & launch channels

Where it fits

Build: Docs & content
Example use: Scan API guide footnotes and external links before merging documentation PRs.

Ship: Code review
Example use: Run a verification pass on an investor memo with numbered citations.

Launch: Distribution & launch channels
Example use: Validate sources in a launch blog post before posting to HN or newsletters.

How it compares

Citation verification workflow for agent drafts—not a general web search skill or SEO content generator.

Common Questions / FAQ

Who is ref-verify for?

Solo builders and small research teams using Claude Code or similar agents who need reference checking before sharing AI-written reports or documentation.

When should I use ref-verify?

Use it in Ship review on near-final drafts, in Build docs when embedding external sources in README or API docs, and in Launch distribution before SEO or PR content goes live.

Is ref-verify safe to install?

Verification skills may call external APIs; review the Security Audits panel on this Prism page and do not treat automated output as legal or academic certification.

SKILL.md


## What problem does this solve?

(describe a real failure mode or gap, not an abstract improvement)

## Proposed change

(what should the skill do differently?)

## Tradeoffs

- Does this add cost (more API calls, longer output)?
- Does this change the Quick Screen / Full Audit balance?
- Does it affect agents other than Claude Code?

## Example

(if possible, show a before/after of what the output would look like)


# Code of Conduct

## Our Pledge

We are committed to making participation in this project a respectful, harassment-free experience for everyone, regardless of background or experience level.

## Our Standards

**Expected behavior:**

- Use clear, factual language — this project is about verification, not advocacy
- Accept corrections gracefully — if a test case shows a bug, that's valuable information
- Focus feedback on the skill behavior, not on other contributors

**Unacceptable behavior:**

- Harassment, personal attacks, or discriminatory language
- Deliberate misrepresentation of test results or failure cases
- Submitting fabricated hallucination examples as real catches

## Enforcement

Instances of unacceptable behavior may be reported to the maintainers via GitHub Issues (mark as `conduct`). All reports will be reviewed and investigated. Maintainers have the right to remove comments, close issues, or ban contributors who violate these standards.

## Attribution

Adapted from the [Contributor Covenant](https://www.contributor-covenant.org), version 2.1.


## What this changes

(one paragraph — what problem does this PR solve?)

## Type of change

- [ ] Bug fix — skill was producing wrong verdicts
- [ ] New API source (e.g. Retraction Watch, IEEE Xplore)
- [ ] Trigger description improvement
- [ ] New test case in `evals/evals.json`
- [ ] Documentation

## Evidence

For skill changes: show a before/after example with a real DOI.

```
Before (without this change):
[verdict or behavior]

After (with this change):
[verdict or behavior]
```

For trigger changes: show which queries now correctly trigger (or don't) that didn't before.

## Checklist

- [ ] The core rule is preserved (verbatim abstract traceability, UNVERIFIABLE instead of guessing)
- [ ] Quick Screen and Full Audit remain distinct modes
- [ ] SKILL.md is under 500 lines
- [ ] If adding a test case: the DOI is real and independently verifiable
- [ ] CHANGELOG.md updated under `[Unreleased]`


# Contributing to ref-verify

Thank you for helping improve citation verification. Contributions of all kinds are welcome — new failure modes, broader API coverage, improved trigger descriptions, and additional test cases.

---

## Ways to contribute

### Report a hallucination case

If `ref-verify` missed a real error (false negative) or flagged something incorrectly (false positive), open an issue with:

- The prompt you used
- What the skill returned
- What the correct result should have been
- The DOI or paper title involved

These are the most valuable contributions. Real failure cases are how the skill improves.

### Add a test case

Test cases live in `evals/evals.json`. A good test case:

- Uses a real DOI that can be independently verified
- Tests a specific failure mode (wrong author, hallucinated content, near-miss, retracted paper)
- Has a clear `expected_output` description

See the existing three cases for format reference.
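
As a rough illustration of the criteria above, a contributor could run a small pre-submission check like the one below. This is only a sketch: the schema of `evals/evals.json` is not shown here, so the `doi` and `failure_mode` field names and the failure-mode identifiers are hypothetical, and only `expected_output` comes from this guide.

```python
import re

# Hypothetical field names: only `expected_output` is named in this guide.
REQUIRED_FIELDS = ("doi", "failure_mode", "expected_output")

# Failure modes listed in this guide; the identifiers themselves are assumed.
FAILURE_MODES = {"wrong_author", "hallucinated_content", "near_miss", "retracted_paper"}

# Loose DOI shape check: "10.", a registrant code, "/", then a suffix.
DOI_SHAPE = re.compile(r"^10\.\d{4,9}/\S+$")


def check_test_case(case: dict) -> list:
    """Return a list of problems; an empty list means the case looks well formed."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = case.get(field)
        if value is None or not str(value).strip():
            problems.append(f"missing or empty field: {field}")
    doi = case.get("doi")
    if doi and not DOI_SHAPE.match(str(doi)):
        problems.append(f"not DOI-shaped: {doi}")
    mode = case.get("failure_mode")
    if mode and mode not in FAILURE_MODES:
        problems.append(f"unknown failure mode: {mode}")
    return problems
```

A shape check like this cannot confirm that a DOI resolves to a real paper; that still has to be verified independently before the case is submitted.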

### Improve the skill

`SKILL.md` is the skill itself — the instructions the agent follows. Improvements should:

- Solve a documented problem (link to an issue or test case)
- Not add scope beyond citation verification
- Keep the two-mode design intact (Quick Screen and Full Audit)
- Preserve the core rule: every content statement must be verbatim from a fetched abstract
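
The core rule can be pictured as a substring check against the fetched abstract. The sketch below is illustrative only and is not taken from `SKILL.md`: the function names, the `TRACEABLE` label, and the whitespace and case normalization are all assumptions. Only the `UNVERIFIABLE` verdict is named in this project's materials.

```python
import re
from typing import Optional


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so line wrapping does not break matching."""
    return re.sub(r"\s+", " ", text).strip().lower()


def trace_statement(statement: str, abstract: Optional[str]) -> str:
    """Return "TRACEABLE" only when the statement appears verbatim in the abstract.

    "UNVERIFIABLE" is the verdict named in this guide; "TRACEABLE" is an
    assumed label used here for illustration.
    """
    if not abstract:
        return "UNVERIFIABLE"  # nothing was fetched, so do not guess
    needle = normalize(statement)
    if needle and needle in normalize(abstract):
        return "TRACEABLE"
    return "UNVERIFIABLE"
```

The design choice worth noting is the default: anything that cannot be matched, including a missing abstract or an empty statement, falls through to `UNVERIFIABLE` rather than to a best guess.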

### Extend API coverage

Currently covers: CrossRef, Semantic Scholar, Unpaywall, arXiv, PubMed.
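
For orientation, a metadata lookup against one of these sources might look like the sketch below, which uses the public CrossRef REST route `https://api.crossref.org/works/{doi}`. How ref-verify itself queries CrossRef is not described here, so the helper names and the choice of fields to extract are assumptions.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

CROSSREF_WORKS = "https://api.crossref.org/works/"


def crossref_url(doi: str) -> str:
    """Build the CrossRef works URL for a DOI, percent-encoding unsafe characters."""
    return CROSSREF_WORKS + quote(doi.strip(), safe="/")


def summarize_work(payload: dict) -> dict:
    """Pull the title and author family names out of a CrossRef works response."""
    message = payload.get("message", {})
    titles = message.get("title") or []    # CrossRef returns titles as a list
    authors = message.get("author") or []  # each author is a dict; some lack "family"
    return {
        "title": titles[0] if titles else None,
        "families": [a["family"] for a in authors if "family" in a],
    }


def fetch_work(doi: str, timeout: float = 10.0) -> dict:
    """Fetch and summarize one DOI; network and HTTP errors propagate to the caller."""
    with urlopen(crossref_url(doi), timeout=timeout) as response:
        return summarize_work(json.load(response))
```

Comparing the returned title and family names with what a draft cites is one way to surface the wrong-author and near-miss failure modes mentioned earlier in this guide.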

Additions worth considering: Retraction Watc


