
Citation Verification
Verify bibliographic metadata and existence of citations via Semantic Scholar and arXiv APIs before publishing writing.
Overview
citation-verification is an agent skill that checks academic citations against the Semantic Scholar and arXiv APIs. It is most often used in the Idea phase, and also fits Validate (docs) and Launch (content).
Install
npx skills add https://github.com/galaxy-dawn/claude-scholar --skill citation-verification
What is this skill?
- Documents Semantic Scholar search and get-by-DOI flows with Python `semanticscholar` client.
- Notes free API use with rate limit guidance (100 requests per 5 minutes).
- Covers arXiv API usage for preprint verification (companion section in skill).
- Returns rich metadata: authors, year, venue, externalIds, citationCount, abstract.
- Includes error-handling pattern to flag citations needing manual review.
- Semantic Scholar API documented rate limit: 100 requests per 5 minutes.
- Primary integration paths: paper search, get by paper ID / DOI, and arXiv API section in the skill.
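The rate limit noted above (100 requests per 5 minutes) can be respected with a small client-side throttle. The following is a minimal sketch, not part of the skill itself: the class name `SlidingWindowLimiter` and its parameters are hypothetical, and the `clock` and `sleep` arguments exist only so the behaviour can be checked without real waiting.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any `window_s`-second window."""

    def __init__(self, max_calls=100, window_s=300.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.window_s = window_s
        self._clock = clock
        self._sleep = sleep
        self._stamps = deque()  # times of recent calls, oldest first

    def wait(self):
        """Block until another call is allowed, then record it."""
        now = self._clock()
        # Forget calls that have already left the window.
        while self._stamps and now - self._stamps[0] >= self.window_s:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_calls:
            # Window is full: sleep until the oldest recorded call expires.
            self._sleep(self.window_s - (now - self._stamps[0]))
            now = self._clock()
            self._stamps.popleft()
        self._stamps.append(now)
```

Calling `limiter.wait()` immediately before each API request keeps a batch of lookups under the documented limit. A sliding window is used here rather than a fixed sleep between calls so that short bursts are not slowed down unnecessarily.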
Adoption & trust: 524 installs on skills.sh; 4.2k GitHub stars; 2/3 security scanners passed (skills.sh audits).
What problem does it solve?
You have a list of references or inline citations and cannot trust that titles, DOIs, and years are real without hitting authoritative catalogs.
Who is it for?
Indie founders, technical writers, and research-heavy agent projects validating references before shipping prose.
Skip if: you need full plagiarism detection, paywalled publisher PDF access, or legal compliance review of licensing. None of these are covered here.
When should I use this skill?
You need to confirm citations, DOIs, or preprint metadata against Semantic Scholar or arXiv before trusting them in writing.
What do I get? / Deliverables
Each citation is matched or flagged with structured metadata and explicit manual-review markers when APIs fail or results disagree.
- Per-citation metadata validation result (match or manual-review flag)
- Structured fields: title, authors, year, venue, DOI/arXiv ids where available
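The match-or-flag outcome described above could be produced by a comparison step along these lines. This is a sketch under stated assumptions: the function name `check_citation`, the dict shape with `title`, `year`, and `doi` keys, and the 0.9 similarity threshold are all hypothetical and not taken from the skill.

```python
from difflib import SequenceMatcher


def normalize(title):
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(title.lower().split())


def check_citation(cited, found, min_title_ratio=0.9):
    """Compare a cited reference with the record an API returned.

    Both arguments are plain dicts with optional "title", "year", and "doi"
    keys (an assumed shape). `found` is None when the lookup failed.
    """
    if found is None:
        return {"status": "manual-review", "reasons": ["lookup failed"],
                "metadata": None}
    reasons = []
    ratio = SequenceMatcher(None,
                            normalize(cited.get("title", "")),
                            normalize(found.get("title", ""))).ratio()
    if ratio < min_title_ratio:
        reasons.append(f"title similarity {ratio:.2f}")
    for key in ("year", "doi"):
        a, b = cited.get(key), found.get(key)
        # Only compare fields that are present on both sides.
        if a is not None and b is not None and str(a).lower() != str(b).lower():
            reasons.append(f"{key} mismatch: {a!r} vs {b!r}")
    status = "manual-review" if reasons else "match"
    return {"status": status, "reasons": reasons, "metadata": found}
```

Any disagreement or failed lookup produces a `manual-review` status with the reasons listed, so a human can decide which side is wrong instead of the check silently overwriting the citation.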
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
First appears in Idea when gathering and validating sources, before you commit claims in a product spec or paper draft. Research subphase covers literature lookup, DOI resolution, and cross-checking titles against authoritative indexes.
Where it fits
Resolve whether a seminal paper title and DOI match before you cite it in a competitive research note.
Audit ten references in a validation memo so investor-facing claims trace to real venues.
Verify API documentation footnotes and related-work section citations in your repo README.
Fact-check citations in long-form SEO content meant for AI search and human readers.
How it compares
API-driven reference checker skill, not a reference manager UI like Zotero nor a generic web search.
Common Questions / FAQ
Who is citation-verification for?
Solo builders writing research-backed content or product docs who want agent-assisted checks against Semantic Scholar and arXiv.
When should I use citation-verification?
In Idea while researching sources; in Validate when fact-checking a pitch or landing claims; in Build/Launch when polishing docs or SEO articles that cite papers.
Is citation-verification safe to install?
The skill instructs the agent to make outbound calls to public scholarly APIs. Review the Security Audits panel on this page, and avoid sending confidential manuscript text that must not be exposed to network calls.
SKILL.md - Citation Verification
# API Usage Guide

This document explains how to use the three main APIs for verifying references.

## Semantic Scholar API

### Overview

Semantic Scholar is a free academic search engine that provides an API for paper search and metadata retrieval.

**Strengths:**
- Free to use, no API key required
- Broad coverage across disciplines
- Rich metadata
- Supports fuzzy search

**Limitations:**
- Rate limit: 100 requests/5min
- Some papers may be missing

### API Endpoints

**1. Get a paper by Paper ID**
```
GET https://api.semanticscholar.org/graph/v1/paper/{paper_id}
```

**2. Search for papers**
```
GET https://api.semanticscholar.org/graph/v1/paper/search?query={query}
```

### Python Examples

**Install:**
```bash
pip install semanticscholar
```

**Basic usage:**
```python
from semanticscholar import SemanticScholar

sch = SemanticScholar()

# Search by title
results = sch.search_paper("Attention is All You Need", limit=5)
for paper in results:
    print(f"Title: {paper.title}")
    print(f"Authors: {[a.name for a in paper.authors]}")
    print(f"Year: {paper.year}")
    # externalIds can be missing for some records, so fall back to an empty dict
    print(f"DOI: {(paper.externalIds or {}).get('DOI', 'N/A')}")
    print("---")
```

**Get by DOI:**
```python
# DOI format: DOI:10.48550/arXiv.1706.03762
paper = sch.get_paper("DOI:10.48550/arXiv.1706.03762")
print(f"Title: {paper.title}")
print(f"Citations: {paper.citationCount}")
```

### Field Reference

**Main fields returned:**
- `paperId` - Semantic Scholar internal ID
- `title` - paper title
- `authors` - list of authors
- `year` - publication year
- `venue` - publication venue (conference/journal)
- `externalIds` - external identifiers (DOI, arXiv, PubMed, etc.)
- `citationCount` - number of citations
- `abstract` - abstract

### Error Handling

```python
try:
    paper = sch.get_paper("invalid_id")
except Exception as e:
    print(f"Error: {e}")
    # Handle the error: flag the citation for manual verification
```

## arXiv API

### Overview

arXiv is a preprint repository that provides a free API for accessing paper metadata.

**Strengths:**
- Completely free, no authentication required
- Covers physics, mathematics, computer science, and other fields
- Provides the full paper PDF
- Updated promptly

**Limitations:**
- Preprints only
- Does not include information about the published journal version

### API Endpoints

**Query interface:**
```
GET http://export.arxiv.org/api/query?search_query={query}&start={start}&max_results={max}
```

### Python Examples

**Install:**
```bash
pip install arxiv
```

**Basic usage:**
```python
import arxiv

# Note: recent releases of the arxiv package deprecate Search.results()
# in favor of arxiv.Client().results(search).

# Get by arXiv ID
paper = next(arxiv.Search(id_list=["1706.03762"]).results())
print(f"Title: {paper.title}")
print(f"Authors: {[a.name for a in paper.authors]}")
print(f"Published: {paper.published}")
print(f"PDF URL: {paper.pdf_url}")

# Search by title
search = arxiv.Search(
    query="Attention is All You Need",
    max_results=5,
    sort_by=arxiv.SortCriterion.Relevance
)
for result in search.results():
    print(f"Title: {result.title}")
    print(f"arXiv ID: {result.entry_id.split('/')[-1]}")
    print("---")
```

### arXiv ID Format

**Recognizing an arXiv ID:**
- New format: `YYMM.NNNNN` (e.g. 2301.12345)
- Old format: `arch-ive/YYMMNNN` (e.g. cs/0703001)

**Extracting from a URL:**
```python
import re

def extract_arxiv_id(text):
    # Match the new format
    match = re.search(r'\d{4}\.\d{4,5}', text)
    if match:
        return match.group()
    # Match the old format
    match = re.search(r'[a-z-]+/\d{7}', text)
    if match:
        return match.group()
    return None
```

## CrossRef API

### Overview

CrossRef is a DOI registration agency that provides authoritative metadata for scholarly literature.

**Strengths:**
- A DOI is the most reliable unique identifier
- Covers almost all formally published papers
- High-quality, authoritative data
- Supports fetching BibTeX directly

**Limitations:**
- Only papers that have a DOI
- Preprints usually have no DOI

### API Endpoints

**Get metadata by DOI:**
```
GET https://api.crossref.org/works/{doi}
```

**Get BibTeX by DOI:**
```
GET https://doi.org/{doi}
Headers: Accept: application/x-bibtex
```

### Python Examples

**Get metadata by DOI:**
```python
import requests

def get_crossref_metadata(doi):
    url = f"https://api.crossref.org/works/{doi}"
    response = requests.get(url, timeout=10)
    if response.status_code == 200:
        data = response.json()
        return data['message']
    return None

# Example
# Note: arXiv DOIs (10.48550/...) are registered through DataCite, so this
# lookup may return None; a CrossRef-registered DOI is needed for a hit.
doi = "10.48550/arXiv.1706.03762"
metadata = get_crossref_metadata(doi)
if metadata:
    # Build the author list first: nesting escaped quotes inside an f-string
    # expression is a syntax error before Python 3.12.
    authors = [f"{a.get('given', '')} {a.get('family', '')}".strip()
               for a in metadata.get('author', [])]
    print(f"Title: {metadata['title'][0]}")
    print(f"Authors: {authors}")
    print(f"Published: {metadata['published']['date-parts'][0]}")
```

**Get BibTeX by DOI:**
```python
def doi_to_bibtex(doi):
    url = f"https://doi.org/{doi}"
    headers = {"Accept": "application/x-bibtex"}
    response = requests.get(url, headers=headers, timeout=10)
    if response.status_code == 200:
        return response.text
    return None

# Example
bibtex = doi_to_bibtex("10.48550/arXiv.1706.03762")
print(bibt