
Semanticscholar Skill
Query the Semantic Scholar API to search academic papers, authors, and citations for an agent.
Install
npx skills add https://github.com/agents365-ai/365-skills --skill semanticscholar-skill
What is this skill?
- Semantic Scholar API
- Paper + citation search
- Research retrieval
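As a rough illustration of the kind of request a skill like this issues, the sketch below builds (but does not send) a relevance-search URL for the Semantic Scholar Graph API. The function name `build_search_url` and the default field list are assumptions made for illustration; they are not taken from the skill itself.

```python
from urllib.parse import urlencode

# Base URL of the Semantic Scholar Graph API.
GRAPH = "https://api.semanticscholar.org/graph/v1"


def build_search_url(query, fields=("title", "year", "citationCount"), limit=10):
    """Return a paper relevance-search URL (hypothetical helper; sends nothing)."""
    params = {"query": query, "fields": ",".join(fields), "limit": limit}
    # urlencode percent-escapes the commas and turns spaces into '+'.
    return f"{GRAPH}/paper/search?{urlencode(params)}"


url = build_search_url("graph neural networks", limit=5)
print(url)
```

Building the URL separately from sending it keeps the example runnable offline; an agent would pass the same parameters to an HTTP client, subject to the API's rate limits.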
Adoption & trust: 664 installs on skills.sh; 8 GitHub stars; 2/3 security scanners passed (skills.sh audits).
Journey fit
Primary fit
Searching the scholarly literature through Semantic Scholar supports background research and discovery. Paper and citation search is literature research, so the skill fits the research subphase of the idea phase.
Common Questions / FAQ
Is Semanticscholar Skill safe to install?
skills.sh reports 2 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.
SKILL.md - Semanticscholar Skill
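The helper module that follows describes translating snake_case filter keyword arguments into Semantic Scholar's camelCase query parameters. A minimal sketch of such a mapping is given here. `translate_filters` is a hypothetical stand-in rather than the module's own `_add_filters`, and its handling of lists and of the `open_access` flag is an assumption.

```python
# Mapping of snake_case keyword names to S2 query parameter names,
# following the list given in the module docstring.
FILTER_MAP = {
    "year": "year",
    "publication_date": "publicationDateOrYear",
    "fields_of_study": "fieldsOfStudy",
    "min_citations": "minCitationCount",
    "pub_types": "publicationTypes",
    "open_access": "openAccessPdf",
    "venue": "venue",
}


def translate_filters(**filters):
    """Hypothetical sketch: return a params dict with S2-style keys."""
    params = {}
    for key, value in filters.items():
        if key not in FILTER_MAP:
            raise TypeError(f"unknown filter: {key}")
        if value is None or value is False:
            continue  # unset filters are simply omitted
        if key == "open_access":
            params[FILTER_MAP[key]] = ""  # assumed: presence-only flag
        elif isinstance(value, (list, tuple)):
            # assumed: multi-valued filters are sent comma-separated
            params[FILTER_MAP[key]] = ",".join(str(v) for v in value)
        else:
            params[FILTER_MAP[key]] = str(value)
    return params


params = translate_filters(
    year="2020-2024",
    min_citations=50,
    fields_of_study=["Computer Science", "Biology"],
    open_access=True,
)
print(params)
```

Rejecting unknown keys early means a misspelled filter fails loudly instead of being silently dropped from the query.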
"""Semantic Scholar API helper for the semanticscholar-skill. Public surface (organized by phase of the skill's 4-phase workflow): Plan / construct queries build_bool_query(phrases, required, excluded, or_terms) -> str deduplicate(papers) -> list Execute searches search_relevance(query, **filters) broad, ranked by relevance search_bulk(query, sort=..., **filters) boolean syntax, up to 10M search_snippets(query, **filters) full-text passage match match_title(title) exact title lookup paper_autocomplete(query) query-completion suggestions Direct lookup get_paper(paper_id) single paper, full fields batch_papers(ids, fields) up to 500 papers in one POST get_citations(paper_id, max_results) who cites this work get_references(paper_id, max_results) what this work cites get_paper_authors(paper_id, max_results) Recommendations find_similar(paper_id, limit, pool) single seed recommend(positive_ids, negative_ids, ...) multi-seed Authors search_authors(query, max_results) get_author(author_id) get_author_papers(author_id, max_results) batch_authors(ids, fields) Present / export format_table(papers) summary markdown table format_details(papers) per-paper details with TLDR format_citations(citations) citation list with intent labels format_results(papers, query_desc) table + details combined format_authors(authors) author table export_bibtex(papers) | export_markdown(...) | export_json(...) Trust + safety contract: - Auth comes from the S2_API_KEY env var; never accepted via function args. - All endpoints are read-only. - Rate limiting (1.1s gap) and exponential backoff are enforced inside _request and shared across the process via a module-level lock. Filter kwargs are snake_case here and translated to the S2 camelCase params inside _add_filters (year, publication_date -> publicationDateOrYear, fields_of_study -> fieldsOfStudy, min_citations -> minCitationCount, pub_types -> publicationTypes, open_access -> openAccessPdf, venue). 
""" import time, os, json, requests, sys GRAPH = "https://api.semanticscholar.org/graph/v1" RECS = "https://api.semanticscholar.org/recommendations/v1" DATASETS = "https://api.semanticscholar.org/datasets/v1" _API_KEY = os.environ.get("S2_API_KEY", "").strip() HAS_KEY = bool(_API_KEY) HEADERS = {"x-api-key": _API_KEY} if HAS_KEY else {} # Per S2 docs: anonymous calls share a 1000 req/s pool globally and can be # further throttled during heavy use; API keys get an introductory 1 req/s # dedicated quota across all endpoints. 1.1s under a key, 5s otherwise — the # anon pool can vanish under load, so we stay conservative without a key. _last_request_time = 0 _MIN_GAP = 1.1 if HAS_KEY else 5.0 def _request(method, url, params=None, json_data=None, max_retries=5): """Send one request with rate limiting and exponential backoff. Enforces a 1.1s minimum gap between requests, retries on 429/504 with 2s -> 60s exponential backoff (max 5 retries), and re-raises on other 4xx/5xx after surfacing the response body to stderr for debugging. """ global _last_request_time elapsed = time.time() - _last_request_time if elapsed < _MIN_GAP: time.sleep(_MIN_GAP - elapsed) for attempt in range(max_retries + 1): _last_request_time = time.time() try: if method == "GET": r = requests.get(url, params=params, headers=HEADERS, timeout=30) else: r = requests.post(url, params=params, json=json_data, headers=HEADERS, timeout=30) except (requests.ConnectionError, requests.Timeout) as e: if attempt < max_retries: wait = min(2 ** (attempt + 1), 60) prin