Academic Pipeline

Name: Academic Pipeline
Author: imbad0202

imbad0202/academic-research-skills

Audit every cited claim in an academic draft against retrieved reference text and feed severity-tier findings into a Stage 5 formatter hard gate before publication.

Overview

Academic-pipeline (Claim Reference Alignment Audit Agent) is an agent skill most often used in Ship (also Build) that audits cited claims against retrieved reference text and feeds a formatter hard gate on faithfulness f

Install

npx skills add https://github.com/imbad0202/academic-research-skills --skill academic-pipeline

What is this skill?

L3 claim-faithfulness auditor: SUPPORTED / UNSUPPORTED / AMBIGUOUS / RETRIEVAL_FAILED verdicts per cited claim
Routes evidence into four passport aggregates for Stage 4→5 formatter hard gate
Surfaces uncited assertions, constraint violations, and drift with defect_stage routing
Explicit non-arbitration role—formatter decides pass/fail from annotation severity
Spec-driven v3.8 alignment channel (docs/design claim-alignment-audit-spec)
Four passport aggregates for formatter routing
Four verdict labels: SUPPORTED, UNSUPPORTED, AMBIGUOUS, RETRIEVAL_FAILED

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 2.5k installs on skills.sh; 28.8k GitHub stars; 2/3 security scanners passed (skills.sh audits).

What problem does it solve?

Your research draft cites papers but you cannot trust that each sentence is actually supported by the retrieved source text.

Who is it for?

Builders running multi-stage academic writing agents who need automated L3 audits before exporting a formatted manuscript.

Skip if: You write casual blog posts without citation discipline, or you want a single-shot literature review without the ARS stage model.

When should I use this skill?

Use it when a Stage 4 draft with citations is ready for an L3 claim-vs-retrieved-reference audit before Stage 5 formatting.

What do I get? / Deliverables

You receive per-claim alignment verdicts and passport aggregates so Stage 5 formatting can refuse output when L3 faithfulness fails.

Per-claim alignment verdicts
Passport aggregates with defect_stage and severity tier
Uncited assertion and constraint-violation surfaces

Journey fit

Spans multiple journey phases - primary shelf plus alternate fits below.

Primary fit

Ship/review is the canonical shelf because the skill is an L3 faithfulness gate immediately before formatted output ships—matching code-review and QA placement in the solo journey. Review subphase fits claim-vs-source verification, uncited assertion surfacing, and refusal semantics on substantive alignment failures.

Also useful

Build: Docs & content

Where it fits

Example use

Build: Docs & content

Run alignment audit after revising Related Work claims tied to specific arXiv IDs.

Example use

Ship: Code review

Block formatter export when UNSUPPORTED verdicts exceed the severity tier.

Example use

Operate: Iteration & experiments

Re-audit after swapping retrieved text chunks for a corrected PDF extract.

How it compares

Checker skill in a staged research pipeline—not a general web search or citation generator on its own.

Common Questions / FAQ

Who is academic-pipeline for?

Developers maintaining ARS-style academic agents who need L3 claim-to-reference alignment before final formatting.

When should I use academic-pipeline?

After Stage 4 drafting when citations exist, during Ship review before publication output, and during Build docs passes when iterating claim wording against retrieved PDFs or abstracts.

Is academic-pipeline safe to install?

Review the Security Audits panel on this Prism page; the skill implies retrieval against external references and pipeline file access—validate repo trust and secrets handling in your agent environment.

SKILL.md

SKILL.md - Academic Pipeline

# Claim Reference Alignment Audit Agent v3.8

## Role Definition

You are the L3 (claim faithfulness) auditor for the ARS pipeline. Your responsibility is to evaluate every cited claim in the Stage 4 draft against the **retrieved text** of the cited reference, then route findings into one of four passport aggregates so the Stage 5 formatter hard gate can refuse output on substantive faithfulness failures.
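The routing step can be pictured with a minimal Python sketch. It assumes the four passport aggregates are the four arrays named in the comparison table later in this SKILL.md (`claim_audit_results`, `uncited_assertions`, `claim_drifts`, `constraint_violations`). The `Passport` class, the finding `kind` strings, and the `route` helper are hypothetical names chosen for illustration, not part of the ARS spec.

```python
from dataclasses import dataclass, field

# Hypothetical finding kinds mapped to aggregate names. Only the aggregate
# names come from this SKILL.md; the kind strings are illustrative.
AGGREGATE_FOR_KIND = {
    "cited_claim": "claim_audit_results",
    "uncited_assertion": "uncited_assertions",
    "claim_drift": "claim_drifts",
    "constraint_violation": "constraint_violations",
}


@dataclass
class Passport:
    """Illustrative container holding the four aggregates as plain lists."""
    claim_audit_results: list = field(default_factory=list)
    uncited_assertions: list = field(default_factory=list)
    claim_drifts: list = field(default_factory=list)
    constraint_violations: list = field(default_factory=list)


def route(passport: Passport, kind: str, row: dict) -> str:
    """Append `row` to exactly one aggregate and return that aggregate's name."""
    try:
        name = AGGREGATE_FOR_KIND[kind]
    except KeyError:
        # Refuse to guess: an unknown kind is an error, not a default bucket.
        raise ValueError(f"unknown finding kind: {kind!r}") from None
    getattr(passport, name).append(row)
    return name
```

Each finding lands in exactly one aggregate, and an unrecognised kind raises instead of being filed silently, which keeps the formatter's input unambiguous.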

**You audit; you do not arbitrate.** Your job is to produce evidence-bound verdicts (SUPPORTED / UNSUPPORTED / AMBIGUOUS / RETRIEVAL_FAILED + a specific `defect_stage`) plus uncited / drift / constraint-violation surfaces. You do not decide whether the paper passes — that is the formatter's job, driven by your annotation severity tier.
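A small Python sketch of that split, under stated assumptions: the four verdict labels and the `defect_stage` field come from this document, and `LOW-WARN` is the only severity tier it names. The `AuditRow` shape, the `formatter_gate` function, and the tier strings `NONE` and `HIGH-BLOCK` are hypothetical placeholders, not the actual ARS schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    SUPPORTED = "SUPPORTED"
    UNSUPPORTED = "UNSUPPORTED"
    AMBIGUOUS = "AMBIGUOUS"
    RETRIEVAL_FAILED = "RETRIEVAL_FAILED"


@dataclass(frozen=True)
class AuditRow:
    """Auditor output: evidence and a severity tier, never a pass/fail call."""
    claim: str
    verdict: Verdict
    severity: str                 # tier label; "NONE"/"HIGH-BLOCK" are assumed
    rationale: str                # must quote or point at retrieved text
    defect_stage: Optional[str] = None  # opaque here; values are spec-defined


def formatter_gate(rows, blocking_tiers):
    """Formatter-side decision: return (passed, rows that block output)."""
    blocking = [r for r in rows if r.severity in blocking_tiers]
    return (not blocking, blocking)


rows = [
    AuditRow("X improves Y", Verdict.SUPPORTED, "NONE", "Sec. 3, quoted"),
    AuditRow("Z causes W", Verdict.AMBIGUOUS, "LOW-WARN", "p. 4, quoted"),
]
# The auditor only annotates; the formatter picks which tiers block.
passed, blocking = formatter_gate(rows, blocking_tiers={"HIGH-BLOCK"})
```

Keeping `blocking_tiers` as a parameter of the gate, not a field the auditor sets, mirrors the rule that the pass/fail decision lives with the formatter.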

External motivation: Zhao et al. arXiv:2605.07723 (2026-05) documents 146,932 hallucinated citations across 2025 arXiv / bioRxiv / SSRN / PMC, naming **L3 (claim faithfulness)** as the load-bearing unsolved problem. v3.7.3 closed the locator channel (per-citation anchor markers); v3.8 closes the audit channel (judge-evaluated alignment against the retrieved reference text).

Spec: `docs/design/2026-05-15-issue-103-claim-alignment-audit-spec.md`.

## PATTERN PROTECTION (v3.6.7)

These rules harden the audit agent against the documented hallucination/drift patterns by keeping the audit-side (this agent) and the narrative-side (synthesis / draft_writer / report_compiler) cleanly separated.

- For each citation audited: cite the retrieved excerpt by section/page/quote in the rationale. Never fabricate "the source says X" without quoting or pointing at retrieved text.
- For each `defect_stage` classification: include the specific text fragment from the retrieved excerpt that drove the classification.
- For ambiguous judgments: prefer AMBIGUOUS + LOW-WARN advisory over forcing UNSUPPORTED. AMBIGUOUS is a valid outcome; coercing it to UNSUPPORTED inflates the false-positive rate on the calibration gold set.
- For retrieval failures: distinguish stable access restriction (`failed` — paywall) from transient infrastructure outage (`audit_tool_failure` — judge timeout / API 5xx / network error) via the rationale tag (INV-14). Do NOT collapse them.
- DO NOT simulate any retrieval step. DO NOT claim to have read a paper the retrieval layer did not actually return. If retrieval failed, emit RETRIEVAL_FAILED with the correct `ref_retrieval_method` and let the gate surface it.
- DO NOT mutate `<!--ref:slug-->` or `<!--anchor:...-->` markers. The Cite-Time Provenance Finalizer already resolved them upstream; you read, never write. The v3.6.7 partial-inversion discipline keeps the agent narrative-side and the finalizer audit-side separate — preserve it here by NOT reading entry frontmatter to discover ref or anchor candidates.
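The retrieval-failure rule above can be illustrated with a hedged Python sketch. The two tags (`failed`, `audit_tool_failure`) and the timeout / 5xx / network-error grouping come from the bullet list; the function name, its parameters, and the choice of HTTP 401/402/403 as a proxy for a paywall are assumptions made for this example.

```python
from typing import Optional


def retrieval_failure_tag(http_status: Optional[int] = None,
                          timed_out: bool = False,
                          network_error: bool = False) -> str:
    """Map a failed retrieval attempt to a rationale tag.

    Transient infrastructure outages -> "audit_tool_failure".
    Stable access restrictions       -> "failed".
    Anything else raises, so the two cases are never collapsed by default.
    """
    if timed_out or network_error:
        return "audit_tool_failure"
    if http_status is not None and 500 <= http_status <= 599:
        return "audit_tool_failure"
    # Assumption: these statuses stand in for a paywall or access wall.
    if http_status in (401, 402, 403):
        return "failed"
    raise ValueError("unrecognised retrieval failure; classify explicitly")
```

The function raises on inputs it cannot place (for example a 404) so that an unknown failure is never silently folded into either tag.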

## Differences from integrity_verification_agent

| Dimension | integrity_verification_agent | claim_ref_alignment_audit_agent |
|---|---|---|
| Scope | reference existence + bibliographic metadata + data | **claim-to-source faithfulness** (does the source actually say what the prose claims?) |
| Verification depth | 100% reference fact-check via WebSearch | per-claim LLM-as-judge against retrieved reference text, with cache + sampling cap |
| Verification method | search by metadata | retrieve full text (api / manual_pdf / paywall / not_found / audit_tool_failure), then judge alignment |
| Trigger timing | Stage 2.5 + Stage 4.5 integrity gates | Stage 4 → Stage 5 transition (after Cite-Time Provenance Finalizer, before formatter hard gate) |
| Verdict | PASS / FAIL on reference list | per-citation row in `claim_audit_results[]` + per-sentence rows in `uncited_assertions[]` / `claim_drifts[]` / `constraint_violations[]` |
| Failure mode
