PubChem Database

The skill is procedural documentation for calling NCBI PubChem over HTTP, which is classic integration work when you are wiring external scientific data into an agent or backend during the Build phase. Integrations is the canonical shelf because every workflow here ends in a composed REST URL (compound/substance/assay domains, namespaces, operations, and output formats) rather than in UI work or pure product planning.

Where it fits

Example use

Compare marketed compound synonyms and xrefs before committing to a med-chem side project.

Example use

Confirm which PubChem operations (properties vs assays vs pathways) your MVP agent must support.

Example use

Build: Backend, data & payments

Implement fastsimilarity_2d lookups from user-supplied SMILES inside a Codex-driven backend.

Example use

Batch-fetch MolecularWeight, XLogP, and TPSA for a list of CIDs to enrich an internal scoring service.

How it compares

Procedural REST reference for NCBI PubChem, not a hosted MCP server or a general-purpose web search skill.

Common Questions / FAQ

Who is pubchem-database for?

It is for solo and indie developers, plus small science teams, who embed PubChem compound and assay data in AI coding agents during research or product build.

When should I use pubchem-database?

Use it when integrating PubChem during Build, when scouting compounds or assays in Idea research, or when scoping cheminformatics features in Validate. It is especially useful for fastsubstructure, fastsimilarity_2d, property lists, or xref pulls that the wrapper skips.

Is pubchem-database safe to install?

Treat it like any third-party agent skill: review the Security Audits panel on this Prism page and avoid piping untrusted SMILES or assay payloads into production without your usual input validation.

SKILL.md


# Advanced PubChem API Reference

This file documents the raw PUG-REST and PUG-View APIs for cases where the
`pubchem_api.py` wrapper does not support your specific query.

## PUG-REST (Computed Properties & Search)

**Base URL:** `https://pubchem.ncbi.nlm.nih.gov/rest/pug`

The URL path always follows this structure:
`/<domain>/<namespace>/<identifiers>/<operation>/<output>[?options]`

### 1. Domain
The core data type: `compound`, `substance`, `assay`, `gene`, `protein`,
`pathway`, `taxonomy`, `cell`.

### 2. Namespace & Identifiers
How you are identifying the target record(s):

- `cid/<cid>`: Compound ID
- `name/<name>`: Exact chemical name
- `smiles/<smiles>`: Exact SMILES match
- `inchikey/<inchikey>`: Exact InChIKey match
- `formula/<formula>`: Exact molecular formula
- Search namespaces (the `fast` prefix runs the search synchronously):
  - `fastsubstructure/smiles/<smiles>`
  - `fastsimilarity_2d/smiles/<smiles>`
  - `fastidentity/smiles/<smiles>`
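
A minimal sketch of composing one of these fast search URLs from a SMILES string. The helper name `fast_search_url` is hypothetical (it is not part of `pubchem_api.py`), and percent-encoding the SMILES before placing it in the path is an assumption made here for safety, since characters such as `(`, `=`, and `#` are common in SMILES:

```python
from urllib.parse import quote

PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"
# The three fast (synchronous) search namespaces listed above.
FAST_SEARCHES = ("fastsubstructure", "fastsimilarity_2d", "fastidentity")


def fast_search_url(search, smiles, output="JSON"):
    """Hypothetical helper: build a fast-search URL that returns matching CIDs."""
    if search not in FAST_SEARCHES:
        raise ValueError(f"unsupported search namespace: {search}")
    # Encode every reserved character so the SMILES stays a single path segment.
    encoded = quote(smiles, safe="")
    return f"{PUG_REST_BASE}/compound/{search}/smiles/{encoded}/cids/{output}"


aspirin_smiles = "CC(=O)OC1=CC=CC=C1C(=O)O"
similarity_url = fast_search_url("fastsimilarity_2d", aspirin_smiles)
print(similarity_url)
```

Requesting `cids` as the operation keeps the response small; the CIDs can then be fed into follow-up property or assay requests.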

### 3. Operation
What data you want to extract:

- `record` (default): The full raw record.
- `property/<property_list>`: Specific properties (e.g.,
  `MolecularWeight,XLogP,TPSA`).
- `synonyms`: List of synonyms.
- `cids`: Return only the CIDs (useful after a search).
- `assaysummary`: Summary of bioassays.
- `xrefs/<xref_type>`: Cross-references (e.g., `PatentID`, `PubMedID`).
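
As an illustration of the `property/<property_list>` operation, the sketch below flattens a JSON property response into a dictionary keyed by CID. It assumes the response nests rows under `PropertyTable` → `Properties`, each row carrying a `CID` field; the sample payload and its values are illustrative placeholders, not output captured from PubChem, and `properties_by_cid` is a hypothetical helper name:

```python
import json

# Illustrative payload only; real responses come from a URL such as
# .../compound/cid/2244/property/MolecularWeight,XLogP,TPSA/JSON
sample_response = json.loads("""
{
  "PropertyTable": {
    "Properties": [
      {"CID": 2244, "MolecularWeight": "180.16", "XLogP": 1.2, "TPSA": 63.6}
    ]
  }
}
""")


def properties_by_cid(payload):
    """Hypothetical helper: map each CID to its requested properties."""
    rows = payload.get("PropertyTable", {}).get("Properties", [])
    return {
        row["CID"]: {key: value for key, value in row.items() if key != "CID"}
        for row in rows
    }


table = properties_by_cid(sample_response)
print(table[2244]["XLogP"])
```

Values are kept in whatever type the JSON supplies (the sample mixes strings and numbers on purpose), so convert explicitly before doing arithmetic on them.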

### 4. Output
Format for the response: `JSON`, `XML`, `CSV`, `TXT`, `PNG`.

### Examples

*   **Properties by CID (JSON)**: `https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/2244/property/MolecularWeight,MolecularFormula/JSON`
*   **Mass Range Search (JSON)**: `https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/molecular_weight/range/400.0/400.05/cids/JSON`
*   **Patents by SID (JSON)**: `https://pubchem.ncbi.nlm.nih.gov/rest/pug/substance/sid/137349406/xrefs/PatentID/JSON`
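
The path structure can be sketched as a small URL builder. This is a minimal illustration rather than the wrapper's implementation: `build_pug_rest_url` is a hypothetical name, and the choice of which characters to leave unencoded is an assumption made so that comma-separated lists and two-part operations such as `xrefs/PatentID` pass through unchanged:

```python
from urllib.parse import quote, urlencode

PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def build_pug_rest_url(domain, namespace, identifiers, operation, output, options=None):
    """Compose /<domain>/<namespace>/<identifiers>/<operation>/<output>[?options]."""
    # Keep "," so ID lists and property lists stay readable.
    ids = quote(str(identifiers), safe=",")
    # Keep "/" as well, because operations like "property/<list>" span two segments.
    op = quote(operation, safe=",/")
    url = f"{PUG_REST_BASE}/{domain}/{namespace}/{ids}/{op}/{output}"
    if options:
        url += "?" + urlencode(options)
    return url


props_url = build_pug_rest_url(
    "compound", "cid", 2244, "property/MolecularWeight,MolecularFormula", "JSON"
)
patents_url = build_pug_rest_url("substance", "sid", 137349406, "xrefs/PatentID", "JSON")
print(props_url)
print(patents_url)
```

Both calls reproduce the corresponding example URLs above. The mass range example uses a multi-part namespace, so it would need to be composed separately.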

---

## PUG-View (Third-Party Annotations & Text)

Used for retrieving comprehensive textual annotations (like GHS Safety,
Pharmacology, Toxicity) compiled from external sources.

**Base URL:** `https://pubchem.ncbi.nlm.nih.gov/rest/pug_view`

The standard structure for retrieving specific sections:
`https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data/compound/<cid>/JSON?heading=<Section+Heading>`

*Note: Spaces in headings must be replaced with `+`.*

### Common Headings

*   `Safety+and+Hazards`
*   `Pharmacology+and+Biochemistry`
*   `Toxicity`
*   `Drug+and+Medication+Information`
*   `Experimental+Properties`
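
A short sketch of building a PUG-View section URL from a plain heading. `pug_view_section_url` is a hypothetical helper; it relies on `urllib.parse.quote_plus`, which turns spaces into `+` as the note above requires:

```python
from urllib.parse import quote_plus

PUG_VIEW_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug_view"


def pug_view_section_url(cid, heading):
    """Hypothetical helper: URL for one annotation section of a compound."""
    # int() rejects anything that is not a numeric CID before it reaches the URL.
    return f"{PUG_VIEW_BASE}/data/compound/{int(cid)}/JSON?heading={quote_plus(heading)}"


safety_url = pug_view_section_url(2244, "Safety and Hazards")
print(safety_url)
```

Passing the heading with ordinary spaces and encoding it in one place avoids hand-written `+` signs scattered through calling code.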



# PubChem Workflows

Follow these checklists for complex, multi-step queries to ensure accurate
results.

## Workflow 1: Comprehensive Chemical Profiling

When asked to provide a complete profile of a chemical (e.g., "Tell me
everything about Aspirin"):

1.  **Resolve Name**: Run `pubchem_api.py resolve` to get the primary CID.
2.  **Get Properties**: Run `pubchem_api.py properties` using the CID to get
    basic computed properties (e.g., molecular weight, XLogP).
3.  **Check Safety**: Run `pubchem_api.py safety` to fetch GHS hazard
    information.
4.  **Check Pharmacology**: Run `pubchem_api.py pharmacology` to understand its
    biological/medical use.
5.  **Synthesize**: Read all output JSON files and compile a comprehensive
    markdown report.
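
For cases where the wrapper is unavailable, steps 1 to 4 can be sketched as raw requests against the endpoints documented in the API reference. How `pubchem_api.py` maps its subcommands to endpoints is not shown here, so the correspondence below is an assumption, and the names `resolve_url` and `profile_urls` are hypothetical:

```python
from urllib.parse import quote, quote_plus

REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"
VIEW = "https://pubchem.ncbi.nlm.nih.gov/rest/pug_view"


def resolve_url(name):
    """Step 1: look up CIDs for a chemical name."""
    return f"{REST}/compound/name/{quote(name, safe='')}/cids/JSON"


def profile_urls(cid):
    """Steps 2-4: property, safety, and pharmacology requests for one CID."""

    def section(heading):
        return f"{VIEW}/data/compound/{cid}/JSON?heading={quote_plus(heading)}"

    return {
        "properties": f"{REST}/compound/cid/{cid}/property/MolecularWeight,XLogP/JSON",
        "safety": section("Safety and Hazards"),
        "pharmacology": section("Pharmacology and Biochemistry"),
    }


print(resolve_url("aspirin"))
# 2244 is the CID used in the reference examples.
for step, url in profile_urls(2244).items():
    print(step, url)
```

Step 5 (synthesis) stays with the agent: fetch each URL, read the JSON, and compile the report.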

## Workflow 2: Structure-Based BioAssay Lookup

When asked to find targets or assays for compounds similar to a given structure:

1.  **Search Structure**: Run `pubchem_api.py similarity` (for 2D similarity)
    or `pubchem_api.py substructure` using the target SMILES string.
2.  **Filter Results**: Read the resulting JSON file. The search may return
    hundreds of CIDs. Select the top 5-10 most relevant CIDs.
3.  **Fetch Assays**: For each selected CID, run `pubchem_api.py assays`.
4.  **Analyze**: Review the assay summaries to identify common biological
    targets (e.g., specific genes or proteins) that these compounds interact
    with.
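
Steps 2 and 3 can be sketched as follows, assuming the search result has the shape of a raw PUG-REST `cids` response (`IdentifierList` → `CID`); the wrapper's output file format may differ. The sample CIDs are placeholders, the helper names are hypothetical, and taking the first few hits is only a stand-in for a real relevance judgment:

```python
REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def select_cids(search_payload, limit=5):
    """Step 2: keep at most `limit` CIDs from a search response."""
    cids = search_payload.get("IdentifierList", {}).get("CID", [])
    # Placeholder selection: keep the response order and truncate.
    return cids[:limit]


def assay_summary_urls(cids):
    """Step 3: one assaysummary request per selected CID."""
    return [f"{REST}/compound/cid/{cid}/assaysummary/JSON" for cid in cids]


# Placeholder CIDs for illustration; a real search may return hundreds.
search_payload = {"IdentifierList": {"CID": [101, 102, 103, 104, 105, 106, 107]}}
selected = select_cids(search_payload, limit=5)
urls = assay_summary_urls(selected)
for url in urls:
    print(url)
```

Capping the number of CIDs before fetching assays keeps the request count proportional to the shortlist rather than to the size of the hit set.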


# Copyright 2026 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the Li

What is this skill?

Documents the full PUG-REST path pattern: domain / namespace / identifiers / operation / output, with optional query flags

Covers compound, substance, assay, gene, protein, pathway, taxonomy, and cell domains with cid, name, SMILES, InChIKey, and formula namespaces

Includes fast synchronous searches: substructure, 2D similarity, and identity match on SMILES

Lists operations for properties, synonyms, CIDs-only results, assay summaries, and cross-references such as PatentID and PubMedID

Explains JSON, XML, CSV, TXT, and PNG response formats for agent-friendly parsing or visualization

8 PUG-REST domain types documented (compound through cell)

3 synchronous fast search namespace patterns on SMILES (substructure, 2D similarity, identity)

5 response output formats: JSON, XML, CSV, TXT, PNG

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 571 installs on skills.sh; 1.7k GitHub stars; 2/3 security scanners passed (skills.sh audits).

Who is it for?

Solo builders adding cheminformatics or bioassay lookups to research agents, notebooks, or API backends that must hit PubChem directly with SMILES, InChIKey, or CID identifiers.

Skip if: your project only needs the limited calls already covered by pubchem_api.py, with no custom substructure, similarity, xref, or PUG-View needs.

What do I get? / Deliverables

Your agent builds valid PubChem REST URLs, retrieves JSON or other formats, and returns structured chemical identifiers and computed properties for downstream code or analysis.

Correctly formed PubChem REST request URLs

Parsed compound or assay property payloads (typically JSON)

Search result CIDs or similarity/substructure hit sets for follow-on code

Journey fit

Spans multiple journey phases: a primary shelf plus alternate fits.
