
Human Protein Atlas Database
Install when your agent must compose valid Human Protein Atlas search queries for expression, RNA, localization, and protein class filters during biomedical discovery.
Install
npx skills add https://github.com/google-deepmind/science-skills --skill human-protein-atlas-database
What is this skill?
- Documents HPA key-value search syntax with case-insensitive field:value pairs
- Explains subfield separators: semicolon between sub-categories, comma for multi-select within a sub-category
- Supports Boolean AND, OR, NOT and grouping with parentheses for compound filters
- Warns that wrapping values in double quotes can break queries; values containing spaces need no quoting
- Covers tissue_category_rna, protein_class, chromosome, and subcellular localization style filters
Adoption & trust: 537 installs on skills.sh; 1.7k GitHub stars; 3/3 security scanners passed (skills.sh audits).
Journey fit
Primary fit
HPA query construction supports early discovery and hypothesis filtering before you commit to a product or study design. Research is the right shelf for database query syntax and field semantics—not shipping a UI or production API yet.
Common Questions / FAQ
Is Human Protein Atlas Database safe to install?
skills.sh reports 3 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.
SKILL.md - Human Protein Atlas Database
# Human Protein Atlas Search Query API Reference

This document provides a comprehensive guide to constructing search queries for the **Human Protein Atlas (HPA)**. The search engine allows users to filter the HPA database based on protein expression, mRNA levels, subcellular localization, and functional classifications.

---

## 1. Core Syntax Overview

The HPA search engine follows a standard key-value pair format. All queries are **case-insensitive**.

**IMPORTANT:** Terms with spaces do **not** need to be enclosed in double quotes (e.g. `protein_class:Transcription factors`), and adding quotes may actually break the query.

When a field has sub-categories, values for them are separated by a semi-colon `;`. Multiple selections within a sub-category are separated by a comma `,`.

* **Field Search**: `field:value`
    * Example: `chromosome:12`
* **Field with Subfields**: `field:subvalue1;subvalue2`
    * Example: `tissue_category_rna:Brain;Tissue enriched`
* **Multiple selections**: `field:subval1;sel1,sel2`
    * Example: `tissue_category_rna:Any;Tissue enriched,Group enriched`
* **Boolean AND**: `term1 AND term2`
    * Example: `protein_class:Enzymes AND chromosome:X`
* **Boolean OR**: `term1 OR term2`
    * Example: `tissue_category_rna:Any;Tissue enriched OR tissue_category_rna:Any;Tissue enhanced`
* **Boolean NOT**: `term1 NOT term2`
    * Example: `protein_class:Enzymes NOT chromosome:1`
* **Grouping**: `( ... )`
    * Example: `(tissue_category_rna:Any;Tissue enriched OR tissue_category_rna:Any;Group enriched) AND chromosome:1`

---

## 2. Specificity Classifications

The HPA categorizes gene expression based on RNA-seq data across tissues (General Atlas), brain regions (Brain Atlas) and single cell types (Single Cell Atlas). These categories are built out of two dropdowns in the UI: the region/tissue to filter on, and the specificity category.

### RNA Tissue Specificity (`tissue_category_rna`)

Filters based on the general tissue distribution across the entire human body.
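As a minimal sketch of the core syntax from Section 1, the Python helpers below assemble query strings using the `field:value`, `;`, `,`, Boolean, and grouping rules described there. The helper names (`term`, `all_of`, `any_of`, `exclude`) are hypothetical and not part of any HPA client; the code only builds strings and does not contact the HPA search engine.

```python
def _check(value):
    """Return value as text, rejecting double quotes (the guide warns they may break queries)."""
    text = str(value)
    if '"' in text:
        raise ValueError(f"double quotes are not allowed in query values: {text!r}")
    return text


def term(field, *subvalues):
    """Build `field:sub1;sub2`; a list or tuple subvalue becomes a comma-separated multi-selection."""
    parts = []
    for sub in subvalues:
        if isinstance(sub, (list, tuple)):
            parts.append(",".join(_check(s) for s in sub))
        else:
            parts.append(_check(sub))
    return f"{_check(field)}:{';'.join(parts)}"


def all_of(*terms):
    """Join terms with Boolean AND."""
    return " AND ".join(terms)


def any_of(*terms):
    """Join terms with Boolean OR, wrapped in parentheses so the group combines safely."""
    return "(" + " OR ".join(terms) + ")"


def exclude(base, excluded):
    """Apply Boolean NOT: keep matches of `base` that do not match `excluded`."""
    return f"{base} NOT {excluded}"


# Rebuild the grouping example from Section 1.
query = all_of(
    any_of(
        term("tissue_category_rna", "Any", "Tissue enriched"),
        term("tissue_category_rna", "Any", "Group enriched"),
    ),
    term("chromosome", 1),
)
print(query)
```

Keeping the separators inside one `term` helper means the semicolon and comma rules live in a single place, and rejecting double quotes early surfaces the quoting pitfall before a query is ever sent.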
Format: `tissue_category_rna:<Tissue Name>;<Specificity Category>`

* **Specificity Categories**:
    * **`Tissue enriched`**: Genes with mRNA levels at least 4-fold higher in a single tissue compared to all others.
    * **`Group enriched`**: Genes with mRNA levels at least 4-fold higher in a group of 2-5 tissues compared to all others.
    * **`Tissue enhanced`**: Genes with mRNA levels at least 4-fold higher in a tissue compared to the *average* of all others.
    * **`Low tissue specificity`**: Low tissue specificity; detected in many tissues.
    * **`Not detected`**: Not detected in tissues.

*Example*: `tissue_category_rna:Liver;Tissue enriched`

*Example*: `tissue_category_rna:Any;Tissue enriched,Group enriched`

### RNA Brain Region Specificity (`brain_category_rna`)

Filters based on distribution across different brain structures.

Format: `brain_category_rna:<Brain Region>;<Specificity Category>`

* **Specificity Categories**:
    * **`Region enriched`**: Genes with mRNA levels at least 4-fold higher in one brain region compared to all others.
    * **`Group enriched`**: mRNA levels at least 4-fold higher in 2-5 brain regions.
    * **`Region enhanced`**: mRNA levels at least 4-fold higher in a brain region compared to the average of others.

*Example*: `brain_category_rna:Amygdala;Region enriched,Group enriched`

> **Pro-Tip:** If you are specifically looking for proteins that are *unique* to
> the brain compared to the rest of the body, combine `tissue_category_rna` and
> `brain_category_rna` fields.

---

## 3. Commonly Used Query Fields

Below is a reference of the most frequently used fields for filtering the database.

* **`gene_name`**: Search by the official HGNC symbol.
    * Examples: `gene_name:APOE`, `gene_name:TP53`
* **`chromosome`**: Filter by the genomic location.
    * Examples: `1`, `2`, ..., `X`, `Y`, `MT`
* **`protein_class`**: Functional classification of the protein.
    * Examples: `Enzymes`, `Transcription factors`, `FDA appro