
Database Lookup
Retrieve and integrate data from 78+ scientific and public databases through a unified programmatic interface for research and data analysis workflows.
Install
npx skills add https://github.com/itallstartedwithaidea/agent-skills --skill database-lookup
What is this skill?
- Access 78+ scientific databases (PubChem, ChEMBL, UniProt, ClinicalTrials.gov, FRED, USPTO)
- Unified query interface with normalized response schemas across heterogeneous APIs
- Built-in caching, pagination handling, and rate limiting for production reliability
Adoption & trust: 1 install on skills.sh; 18 GitHub stars; 3/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).
Journey fit
This skill belongs in the build phase because it provides foundational infrastructure for agents and APIs that need to query external data sources. Database and third-party API integrations are core to building scalable data pipelines that fetch from multiple scientific sources.
Common Questions / FAQ
Is Database Lookup safe to install?
skills.sh reports 3 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.
SKILL.md
# Database Lookup

Part of [Agent Skills™](https://github.com/itallstartedwithaidea/agent-skills) by [googleadsagent.ai™](https://googleadsagent.ai)

## Description

Database Lookup provides unified programmatic access to 78+ scientific and public databases spanning chemistry (PubChem, ChEMBL), biology (UniProt, COSMIC, Ensembl), clinical (ClinicalTrials.gov, FDA), economics (FRED, World Bank), and intellectual property (USPTO, EPO). The agent constructs API queries, handles pagination, normalizes responses, and caches results for reproducible research workflows.

Scientific research increasingly depends on integrating data from multiple heterogeneous databases. A drug discovery project might query ChEMBL for bioactivity data, UniProt for target protein information, PubChem for compound properties, ClinicalTrials.gov for related clinical studies, and FRED for healthcare spending trends, all for a single research question. This skill abstracts the API differences into a unified query interface.

Each database connector handles authentication, rate limiting, response parsing, and error recovery. Results are normalized into consistent schemas (DataFrames with typed columns) regardless of the source API's format (REST JSON, XML, CSV, SPARQL). Caching prevents redundant API calls and enables offline analysis of previously retrieved data.
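The caching behaviour described above could be sketched roughly as follows. This is a minimal illustration, not the skill's actual cache layer: the `DiskCache` class, its `get_or_fetch` method, and the choice of a SHA-256 key over the URL and parameters are all assumptions made for the example. The `fetch` callable is injected so the sketch runs without network access.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Optional


class DiskCache:
    """Hypothetical on-disk JSON cache keyed by URL plus query parameters."""

    def __init__(self, cache_dir: str = ".db_cache"):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _path(self, url: str, params: Optional[dict]) -> Path:
        # Sort keys so that equivalent parameter dicts produce the same hash.
        raw = json.dumps({"url": url, "params": params or {}}, sort_keys=True)
        digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()
        return self.cache_dir / f"{digest}.json"

    def get_or_fetch(
        self,
        url: str,
        params: Optional[dict],
        fetch: Callable[[str, Optional[dict]], dict],
    ) -> dict:
        path = self._path(url, params)
        if path.exists():
            # Cache hit: serve the stored payload without calling the API.
            return json.loads(path.read_text(encoding="utf-8"))
        # Cache miss: call the API once and persist the JSON payload.
        payload = fetch(url, params)
        path.write_text(json.dumps(payload), encoding="utf-8")
        return payload
```

Because a repeated query is answered from the stored file, previously retrieved results remain available offline. A real connector would likely also want an expiry policy, which this sketch omits.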
## Use When

- Retrieving compound data from PubChem or ChEMBL
- Querying protein sequences or annotations from UniProt
- Searching clinical trials on ClinicalTrials.gov
- Fetching economic indicators from FRED or World Bank
- Looking up patent information from USPTO
- Integrating data across multiple scientific databases

## How It Works

```mermaid
graph TD
    A[Research Query] --> B[Query Router]
    B --> C{Database Selection}
    C -->|Chemistry| D[PubChem / ChEMBL / DrugBank]
    C -->|Biology| E[UniProt / Ensembl / COSMIC]
    C -->|Clinical| F[ClinicalTrials.gov / FDA / OMIM]
    C -->|Economics| G[FRED / World Bank / BLS]
    C -->|Patents| H[USPTO / EPO / WIPO]
    D --> I[API Request + Rate Limiting]
    E --> I
    F --> I
    G --> I
    H --> I
    I --> J[Response Normalization]
    J --> K[Cache Layer]
    K --> L[Unified DataFrame Output]
```

The query router identifies the appropriate database based on the query type and entity. All responses pass through normalization to produce consistent DataFrames with standardized column names and types.
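The routing and normalization steps in the diagram can be illustrated with a small sketch. Everything named here (`ROUTES`, `FIELD_MAPS`, `route`, `normalize`, and the unified `id`/`value` columns) is hypothetical and chosen for the example, and the sample records in the usage note are made up rather than real API responses.

```python
from typing import Dict, List

# Hypothetical mapping from entity type to a database family in the diagram.
ROUTES: Dict[str, str] = {
    "compound": "chemistry",
    "protein": "biology",
    "trial": "clinical",
    "indicator": "economics",
    "patent": "patents",
}

# Hypothetical field maps: source field -> (unified column name, type cast).
FIELD_MAPS = {
    "chemistry": {"CID": ("id", str), "MolecularWeight": ("value", float)},
    "economics": {"series_id": ("id", str), "value": ("value", float)},
}


def route(entity_type: str) -> str:
    """Pick the database family for an entity type, or fail loudly."""
    try:
        return ROUTES[entity_type]
    except KeyError:
        raise ValueError(
            f"no database family registered for {entity_type!r}"
        ) from None


def normalize(family: str, records: List[dict]) -> List[dict]:
    """Rename and cast source fields so every family yields the same columns."""
    mapping = FIELD_MAPS[family]
    rows = []
    for record in records:
        row = {"source": family}
        for src_field, (column, cast) in mapping.items():
            raw = record.get(src_field)
            # Missing fields become None instead of raising.
            row[column] = cast(raw) if raw is not None else None
        rows.append(row)
    return rows
```

For example, `normalize(route("compound"), [{"CID": 2244, "MolecularWeight": "180.16"}])` yields one row with the columns `source`, `id`, and `value`. Passing such rows to `pd.DataFrame(rows)` would give the typed, uniformly named table the text describes.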
## Implementation

```python
import requests
import pandas as pd
from functools import lru_cache
from time import sleep


class DatabaseClient:
    BASE_URLS = {
        "pubchem": "https://pubchem.ncbi.nlm.nih.gov/rest/pug",
        "chembl": "https://www.ebi.ac.uk/chembl/api/data",
        "uniprot": "https://rest.uniprot.org/uniprotkb",
        "clinicaltrials": "https://clinicaltrials.gov/api/v2/studies",
        "fred": "https://api.stlouisfed.org/fred/series/observations",
    }

    def __init__(self, cache_dir: str = ".db_cache"):
        self.session = requests.Session()
        self.session.headers["User-Agent"] = "AgentSkills/1.0 (research)"

    def pubchem_compound(self, name: str) -> dict:
        url = f"{self.BASE_URLS['pubchem']}/compound/name/{name}/JSON"
        resp = self._get(url)
        props = resp["PC_Compounds"][0]["props"]
        return {
            "cid": resp["PC_Compounds"][0]["id"]["id"]["cid"],
            "name": name,
            "properties": {p["urn"]["label"]: p["value"] for p in props},
        }

    def chembl_target(self, uniprot_id: str) -> pd.DataFrame:
        url = f"{self.BASE_URLS['chembl']}/target.json"
        resp = self._get(url, params={
            "target_components__ac