
PubMed Database
Query PubMed and NCBI ELink bridges so your agent can trace citations, full text, genes, compounds, and related records from PMIDs.
Overview
PubMed Database is an agent skill for the Idea phase that navigates NCBI ELink target databases and linknames to pull citations, full text paths, and linked biological records from PubMed queries.
Install
npx skills add https://github.com/google-deepmind/science-skills --skill pubmed-database
What is this skill?
- Explains ELink requiring both target_database and linkname with NCBI naming convention dbfrom_db_subset
- Covers forward citations (pubmed_pubmed_citedin), backward refs (pubmed_pubmed_refs), and similar articles (pubmed_pubme
- Maps PubMed to PMC full text (pubmed_pmc), PubChem compounds/assays, gene, nucleotide, protein, and ClinVar bridges
- Teaches semantic bridge selection when one database pair exposes multiple relationship types
- Documents 10+ common PubMed/NCBI database and linkname pairs including citations, PMC, PubChem, gene, and ClinVar bridge
Adoption & trust: 643 installs on skills.sh; 1.7k GitHub stars; 2/3 security scanners passed (skills.sh audits).
What problem does it solve?
You need structured literature and entity links from PubMed but keep confusing NCBI databases with linknames and pull the wrong relationship (citations vs similar articles vs chemicals).
Who is it for?
Builders researching health, bioinformatics, or science-backed products who want agent-guided PubMed and NCBI cross-database queries during discovery.
Skip if: Pure commercial SEO content with no PMID-backed evidence, or teams that only need a single abstract summary without ELink traversal.
When should I use this skill?
You need PubMed literature, citation direction (cited-in vs refs), similar articles, or cross-links to PMC, compounds, assays, genes, sequences, proteins, or ClinVar.
What do I get? / Deliverables
Your agent uses the correct database–linkname pairs to retrieve forward and backward citations, PMC full text handles, and linked gene, compound, sequence, and variant records for downstream specs or datasets.
- Structured ELink query patterns with target_database and linkname
- Curated sets of related PMIDs, PMCIDs, CIDs, gene IDs, or variant links for your research notes
Journey fit
Idea phase is the canonical shelf because the skill supports discovery and evidence gathering before you commit product scope in health, bio, or research-adjacent builds. Research subphase matches literature retrieval, citation graphs, and cross-database linking via NCBI linknames rather than shipping production UI.
How it compares
Specialized PubMed/ELink procedural knowledge—not a general web research skill or a hosted literature SaaS integration.
Common Questions / FAQ
Who is pubmed-database for?
Solo builders and indie researchers wiring agents to PubMed, PMC, PubChem, and related NCBI resources when scoping science or health products.
When should I use pubmed-database?
During Idea research when mapping competitors and evidence, tracing citation networks, or resolving PMIDs to PMCIDs and linked entity records before you prototype.
Is pubmed-database safe to install?
Check the Security Audits panel on this page; NCBI access is read-oriented but still involves network calls and API keys or rate limits you should configure in your environment.
SKILL.md
# Advanced Biological Database Linking (ELink)

## 1. Core Concepts: Database vs. Linkname

A query requires both a `target_database` and a `linkname`:

* Target Database: The destination repository (e.g., `gene`, `nuccore`, `pccompound`).
* Linkname: The specific "semantic bridge" defined by NCBI.

Linknames follow the naming convention: `dbfrom_db_subset`. One database pair can have multiple linknames representing different relationships.

### Common Database and Linkname Pairs

* **pubmed**, `pubmed_pubmed_citedin`: **Forward Citations** - Find papers that cite the source paper.
* **pubmed**, `pubmed_pubmed_refs`: **Backward Citations** - Extract the source paper's bibliography.
* **pubmed**, `pubmed_pubmed`: **Similar Articles** - Papers sharing MeSH terms or keywords.
* **pmc**, `pubmed_pmc`: **Full Text** - Resolve a PMID to a PMCID (required for BioC API).
* **pccompound**, `pubmed_pccompound`: **Chemicals** - Find specific chemicals or drugs mentioned (CIDs).
* **pcassay**, `pubmed_pcassay`: **PubChem BioAssays** - Link to experimental results and screening data.
* **gene**, `pubmed_gene`: **Genetics** - Identify specific NCBI Gene records discussed.
* **nuccore**, `pubmed_nuccore`: **Sequence Data** - Link to GenBank/nucleotide sequences.
* **protein**, `pubmed_protein`: **Proteins** - Link to RefSeq or GenPept protein records.
* **clinvar**, `pubmed_clinvar`: **Clinical Variants** - Find links to the ClinVar database (mutations).
* **snp**, `pubmed_snp`: **SNPs** - Find specific Single Nucleotide Polymorphisms.
* **sra**, `pubmed_sra`: **Raw Data** - Find raw datasets in the Sequence Read Archive.
* **structure**, `pubmed_structure`: **3D Structures** - Find molecular structures (PDB) for proteins/ligands.

This list covers the most common `pubmed → X` links. For the full list of all ELink linknames across all NCBI databases, see the [NCBI Entrez Links catalog](https://eutils.ncbi.nlm.nih.gov/entrez/query/static/entrezlinks.html).

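
To make the database/linkname pairing concrete, here is a minimal Python sketch. It assumes the public NCBI E-utilities ELink endpoint and its XML response layout; the source does not show how the skill's own tools call NCBI, so the helper names `build_elink_url` and `parse_linked_ids` are hypothetical and used only for illustration. The sketch builds the request URL and parses a response without making any network call itself.

```python
import xml.etree.ElementTree as ET
from typing import List
from urllib.parse import urlencode

# Public E-utilities ELink endpoint (assumed here; not named in the skill text).
ELINK_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi"


def build_elink_url(pmid: str, target_database: str, linkname: str) -> str:
    """Hypothetical helper: build an ELink URL for one PMID.

    Checks that the linkname follows the `dbfrom_db_subset` convention,
    i.e. it starts from `pubmed` and points at `target_database`.
    """
    parts = linkname.split("_")
    if len(parts) < 2 or parts[0] != "pubmed" or parts[1] != target_database:
        raise ValueError(
            f"linkname {linkname!r} does not bridge pubmed -> {target_database}"
        )
    params = {
        "dbfrom": "pubmed",
        "db": target_database,
        "id": pmid,
        "linkname": linkname,
        "retmode": "xml",
    }
    return f"{ELINK_URL}?{urlencode(params)}"


def parse_linked_ids(xml_text: str, linkname: str) -> List[str]:
    """Hypothetical helper: collect linked IDs for one linkname from ELink XML."""
    root = ET.fromstring(xml_text)
    ids: List[str] = []
    for linkset_db in root.iter("LinkSetDb"):
        # One response can carry several LinkSetDb blocks; keep only the
        # relationship that was asked for.
        if linkset_db.findtext("LinkName") == linkname:
            for link in linkset_db.findall("Link"):
                linked_id = link.findtext("Id")
                if linked_id:
                    ids.append(linked_id)
    return ids


if __name__ == "__main__":
    # "12345" is a placeholder PMID, not a reference to a real paper.
    print(build_elink_url("12345", "pubmed", "pubmed_pubmed_citedin"))
```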

--------------------------------------------------------------------------------

## 2. Procedural Wisdom: Handling Failure Modes

### The "Indexing Lag" Problem (Recent Papers)

NCBI links are not created instantly. There is a human-in-the-loop and automated indexing process that results in a **4-8 week delay** for cross-database links.

**Symptom**: `find_linked_biological_data` returns `[]` for a paper published 2 weeks ago.

**Strategy**: If the paper is very recent, **pivot immediately** to semantic search or full-text extraction. Use `get_full_text_pmc` and search the prose manually for primary identifiers or nomenclature.

### The "High-Citation" Timeout

For foundational papers with >10,000 citations, `pubmed_pubmed_citedin` may fail or time out.

**Strategy**: Instead of linking, use `search_pubmed` with the title of the paper in quotes or a specific query like `"citations for PMID [SOURCE_PMID]"`.

### Verifying Open Access Availability

Before calling `get_full_text_pmc`, it is more reliable to check the link first.

**Workflow**: Call `find_linked_biological_data` with `target_database="pmc"` and `linkname="pubmed_pmc"`. If it returns a result, the paper is definitely in the PMC BioC database. If not, don't waste time on a full-text fetch; use `fetch_article_abstracts` instead.

--------------------------------------------------------------------------------

## 3. Category-Specific Tips

### Chemical Entities (`pccompound`, `pcassay`)

Links to chemical databases typically return **internal identifiers** (e.g., PubChem CIDs) rather than common names.

**Example**: A link to a drug study might return `["4091"]`.

**Note**: To resolve these to names, you must use a separate metadata lookup or search strategy, as the linking tool only provides the relationship, not the entity details.

### Sequence and Genomic Data (`nuccore`, `protein`, `gene