
Bioservices
Map protein, gene, and compound identifiers across UniProt, KEGG, UniChem, and related bio databases using BioServices patterns in Python.
Overview
BioServices is an agent skill for the Build phase that guides Python identifier mapping between biological databases via UniProt, UniChem, KEGG, and related BioServices APIs.
Install
npx skills add https://github.com/k-dense-ai/scientific-agent-skills --skill bioservices
What is this skill?
- UniProt `mapping()` for protein and gene ID conversion with `fr`/`to` database pairs
- Batch mapping with comma-separated or list inputs for KEGG and related targets
- UniChem compound identifier crosswalks alongside KEGG built-in references
- Documents PICR and common mapping patterns with troubleshooting guidance
- Python BioServices client examples for reproducible identifier reconciliation
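The batch pattern in the list above can be sketched as a small helper that splits a long ID list into comma-separated queries and merges the answers. This is a minimal sketch, not part of BioServices: `chunked`, `batch_map`, and the `chunk_size` default are hypothetical names and values, and the injected `mapper` callable is assumed to return a dict of the form `{source_id: [target_id, ...]}`.

```python
from typing import Callable, Dict, Iterable, List


def chunked(ids: List[str], size: int) -> Iterable[List[str]]:
    """Yield consecutive slices of `ids` with at most `size` items each."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def batch_map(
    ids: List[str],
    mapper: Callable[[str], Dict[str, List[str]]],
    chunk_size: int = 100,  # hypothetical default, tune to the service limits
) -> Dict[str, List[str]]:
    """Map `ids` chunk by chunk and merge the per-chunk results.

    `mapper` receives one comma-separated query string. It is assumed
    (not guaranteed by any library) to return {source_id: [target_id, ...]}.
    """
    merged: Dict[str, List[str]] = {}
    for chunk in chunked(ids, chunk_size):
        for source_id, targets in mapper(",".join(chunk)).items():
            merged.setdefault(source_id, []).extend(targets)
    return merged
```

With BioServices, `mapper` could wrap the UniProt call, for example `lambda q: u.mapping(fr="UniProtKB_AC-ID", to="KEGG", query=q)` where `u = UniProt()`. The shape of the value returned by `mapping()` may differ between BioServices releases, so check it before relying on the merge step.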
Adoption & trust: 529 installs on skills.sh; 27.6k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
Your dataset mixes UniProt, KEGG, and chemical IDs and you cannot reliably join pathways, genes, and compounds without a documented crosswalk.
Who is it for?
Solo builders writing Python bioinformatics scripts or agent tools that must convert IDs across UniProt, KEGG, and UniChem.
Skip if: Pure clinical compliance workflows with no Python integration, or teams that only need static ID tables with no live API mapping.
When should I use this skill?
You need to convert or batch-map biological database identifiers using BioServices in Python.
What do I get? / Deliverables
You implement BioServices mapping calls with correct `fr`/`to` pairs and batch patterns so downstream analysis uses consistent database identifiers.
- Mapping code snippets with `fr`/`to` parameters
- Batch conversion patterns for ID lists
- Troubleshooting notes for failed cross-database lookups
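For the troubleshooting deliverable, one way to spot failed lookups is to compare the IDs you asked for with the IDs that came back. The sketch below is illustrative only: `normalize_mapping` and `unmapped_ids` are hypothetical helpers, and both accepted result shapes are assumptions about what a mapping call may return (a plain dict of lists, or a dict carrying a `"results"` list of `{"from", "to"}` records).

```python
from typing import Dict, List


def normalize_mapping(result: dict) -> Dict[str, list]:
    """Coerce a mapping result into {source_id: [target, ...]}.

    Assumption: `result` is either already a dict of lists, or a dict
    with a "results" list of {"from": ..., "to": ...} records.
    """
    if isinstance(result.get("results"), list):
        flat: Dict[str, list] = {}
        for record in result["results"]:
            flat.setdefault(record["from"], []).append(record["to"])
        return flat
    # Dict-of-lists shape: wrap bare strings so they are not split into characters.
    return {
        key: [value] if isinstance(value, str) else list(value)
        for key, value in result.items()
    }


def unmapped_ids(requested: List[str], result: dict) -> List[str]:
    """Return the requested IDs that have no target, in input order."""
    mapped = normalize_mapping(result)
    return [identifier for identifier in requested if not mapped.get(identifier)]
```

Logging the output of `unmapped_ids` after each batch makes it easier to tell an obsolete or mistyped accession from a genuinely missing cross-reference.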
Journey fit
Build/integrations is the shelf for wiring external scientific APIs and ID crosswalks into scripts, agents, or data pipelines. Integrations covers third-party service usage—here the UniProt mapping service, UniChem, KEGG cross-references, and batch conversion flows.
How it compares
Use it as procedural glue for BioServices API usage, not as a substitute for domain-specific ontology curation tools.
Common Questions / FAQ
Who is bioservices for?
Developers and indie scientists building Python tooling that must cross-reference public biology database identifiers.
When should I use bioservices?
During Build/integrations when implementing UniProt-to-KEGG mapping, UniChem compound links, or batch ID conversion in pipelines.
Is bioservices safe to install?
Examples call external biology web services over the network; review the Security Audits panel on this page and handle API keys and data sensitivity per your policy.
SKILL.md
# BioServices: Identifier Mapping Guide

This document provides comprehensive information about converting identifiers between different biological databases using BioServices.

## Table of Contents

1. [Overview](#overview)
2. [UniProt Mapping Service](#uniprot-mapping-service)
3. [UniChem Compound Mapping](#unichem-compound-mapping)
4. [KEGG Identifier Conversions](#kegg-identifier-conversions)
5. [Common Mapping Patterns](#common-mapping-patterns)
6. [Troubleshooting](#troubleshooting)

---

## Overview

Biological databases use different identifier systems. Cross-referencing requires mapping between these systems. BioServices provides multiple approaches:

1. **UniProt Mapping**: Comprehensive protein/gene ID conversion
2. **UniChem**: Chemical compound ID mapping
3. **KEGG**: Built-in cross-references in entries
4. **PICR**: Protein identifier cross-reference service

---

## UniProt Mapping Service

The UniProt mapping service is the most comprehensive tool for protein and gene identifier conversion.

### Basic Usage

```python
from bioservices import UniProt

u = UniProt()

# Map single ID
result = u.mapping(
    fr="UniProtKB_AC-ID",  # Source database
    to="KEGG",             # Target database
    query="P43403"         # Identifier to convert
)

print(result)
# Output: {'P43403': ['hsa:7535']}
```

### Batch Mapping

```python
# Map multiple IDs (comma-separated)
ids = ["P43403", "P04637", "P53779"]
result = u.mapping(
    fr="UniProtKB_AC-ID",
    to="KEGG",
    query=",".join(ids)
)

for uniprot_id, kegg_ids in result.items():
    print(f"{uniprot_id} → {kegg_ids}")
```

### Supported Database Pairs

UniProt supports mapping between 100+ database pairs. Key ones include:

#### Protein/Gene Databases

| Source Format | Code | Target Format | Code |
|---------------|------|---------------|------|
| UniProtKB AC/ID | `UniProtKB_AC-ID` | KEGG | `KEGG` |
| UniProtKB AC/ID | `UniProtKB_AC-ID` | Ensembl | `Ensembl` |
| UniProtKB AC/ID | `UniProtKB_AC-ID` | Ensembl Protein | `Ensembl_Protein` |
| UniProtKB AC/ID | `UniProtKB_AC-ID` | Ensembl Transcript | `Ensembl_Transcript` |
| UniProtKB AC/ID | `UniProtKB_AC-ID` | RefSeq Protein | `RefSeq_Protein` |
| UniProtKB AC/ID | `UniProtKB_AC-ID` | RefSeq Nucleotide | `RefSeq_Nucleotide` |
| UniProtKB AC/ID | `UniProtKB_AC-ID` | GeneID (Entrez) | `GeneID` |
| UniProtKB AC/ID | `UniProtKB_AC-ID` | HGNC | `HGNC` |
| UniProtKB AC/ID | `UniProtKB_AC-ID` | MGI | `MGI` |
| KEGG | `KEGG` | UniProtKB | `UniProtKB` |
| Ensembl | `Ensembl` | UniProtKB | `UniProtKB` |
| GeneID | `GeneID` | UniProtKB | `UniProtKB` |

#### Structural Databases

| Source | Code | Target | Code |
|--------|------|--------|------|
| UniProtKB AC/ID | `UniProtKB_AC-ID` | PDB | `PDB` |
| UniProtKB AC/ID | `UniProtKB_AC-ID` | Pfam | `Pfam` |
| UniProtKB AC/ID | `UniProtKB_AC-ID` | InterPro | `InterPro` |
| PDB | `PDB` | UniProtKB | `UniProtKB` |

#### Expression & Proteomics

| Source | Code | Target | Code |
|--------|------|--------|------|
| UniProtKB AC/ID | `UniProtKB_AC-ID` | PRIDE | `PRIDE` |
| UniProtKB AC/ID | `UniProtKB_AC-ID` | ProteomicsDB | `ProteomicsDB` |
| UniProtKB AC/ID | `UniProtKB_AC-ID` | PaxDb | `PaxDb` |

#### Organism-Specific

| Source | Code | Target | Code |
|--------|------|--------|------|
| UniProtKB AC/ID | `UniProtKB_AC-ID` | FlyBase | `FlyBase` |
| UniProtKB AC/ID | `UniProtKB_AC-ID` | WormBase | `WormBase` |
| UniProtKB AC/ID | `UniProtKB_AC-ID` | SGD | `SGD` |
| UniProtKB AC/ID | `UniProtKB_AC-ID` | ZFIN | `ZFIN` |

#### Other Useful Mappings

| Source | Code | Target | Code |
|--------|------|--------|------|
| UniProtKB AC/ID | `UniProtKB_AC-ID` | GO | `GO` |
| UniProtKB AC/ID | `UniProtKB_AC-ID` | Reactome | `Reactome` |
| UniProtKB AC/ID | `UniProtKB_AC-ID` | STRING | `STRING` |
| UniProtKB AC/ID | `UniProtKB_AC-ID` | BioGRID | `BioGRID` |
| UniProtKB AC/ID | `UniProtKB_AC-ID` | OMA | `OMA` |

### Complete List of Database Codes

To get the complete, up-to-date list: