Interpro Database

Name: Interpro Database
Author: google-deepmind

google-deepmind/science-skills

Query InterPro protein domains, families, and GO-linked entries through the official EBI API with correct filters, pagination, and aggregations for bioinformatics research.

Overview

interpro-database is an agent skill for the Idea phase. It documents InterPro API query parameters so agents can reliably fetch protein entries, domains, and families from the EBI InterPro service.

Install

npx skills add https://github.com/google-deepmind/science-skills --skill interpro-database

What is this skill?

Documents the global `page_size` parameter (default about 20, maximum 200 per page) and the `page_size=1` bulk count pattern for aggregation without full page downloads
Covers `/entry` filters: `type`, `integrated`, `go_term`, `annotation`, and context-dependent `group_by` aggregations
Maps Member Database constraints (e.g. `integrated` fails when `source_db=interpro`) to avoid broken queries
Aligns with the official InterPro7 Swagger (`interpro7-swagger.yml`, at ebi.ac.uk/interpro/api/static_files/interpro7-swagger.yml) for reproducible agent-generated API calls

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 542 installs on skills.sh; 1.7k GitHub stars; 3/3 security scanners passed (skills.sh audits).

What problem does it solve?

You need curated protein family and domain data from InterPro but the API has many endpoint-specific filters, pagination limits, and aggregation rules that are easy to get wrong in ad-hoc agent code.

Who is it for?

Solo builders or indie researchers shipping scientific agents, notebooks, or small APIs that must call InterPro with Swagger-accurate filters and pagination.

Skip if: Teams that only need one-off manual lookups in the InterPro web UI or projects with no protein annotation or structural-biology context.

When should I use this skill?

The user needs InterPro API queries, protein entry/domain/family filters, GO terms, member database integration flags, or `fetch_interpro_data` / count helpers for scientific pipelines.

What do I get? / Deliverables

Your agent builds valid InterPro `query_params`, uses efficient count and page strategies, and returns structured API results you can feed into downstream analysis or integration code.

Correctly formed InterPro API query parameter dictionaries
Paginated or aggregated InterPro API responses suitable for further analysis

Journey fit

Primary fit

Idea: Opportunity & market research

InterPro sits at the start of protein-function and domain research—before you commit to pipelines, models, or apps that depend on curated family and domain annotations. The skill is a parameter and endpoint reference for exploratory API queries (`fetch_interpro_data`, counts, `/entry` filters), which matches early research and literature-style discovery rather than shipping code.

Also useful

Build: Integrations & version control

How it compares

Use this procedural API reference instead of hallucinating InterPro endpoint parameters in generic chat.

Common Questions / FAQ

Who is interpro-database for?

It is for solo builders and researchers using Claude Code, Cursor, or Codex to automate InterPro searches, entry exploration, and count aggregations during bioinformatics work.

When should I use interpro-database?

Use it in the Idea phase while researching protein function and domains, and again in Build when wiring backend integrations that call the InterPro API with documented filters and `page_size` behavior.

Is interpro-database safe to install?

Treat it like any third-party agent skill: review the Security Audits panel on this Prism page and avoid sending secrets in query strings; the skill itself is documentation for public EBI API usage.

SKILL.md


# InterPro API Query Parameters Reference

This document provides a comprehensive list of all query parameters available
for the InterPro API endpoints, based on the official InterPro Swagger
documentation
(https://www.ebi.ac.uk/interpro/api/static_files/interpro7-swagger.yml). These
parameters can be passed into the `query_params` dictionary in
`fetch_interpro_data`.

## Global Parameters

*Available on all endpoints.*

*   `page_size`: (`int`) Number of results per page (typically defaults to 20,
    max is 200). Use `page_size=1` with `get_interpro_count` for rapid bulk
    aggregations without downloading pages.
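
The count and paging pattern above can be sketched as follows. This is a minimal illustration, not the skill's own implementation: `build_url`, `read_count`, `iter_results`, and the injected `fetch` callable are hypothetical names, and the signatures of `fetch_interpro_data` and `get_interpro_count` are not shown in this reference. The sketch assumes the base URL implied by the Swagger link and a paginated JSON response with top-level `count`, `next`, and `results` fields; check both against the Swagger file before relying on them.

```python
from typing import Any, Callable, Dict, Iterator, Optional
from urllib.parse import urlencode

# Assumed base URL, inferred from the Swagger link in this document.
BASE_URL = "https://www.ebi.ac.uk/interpro/api"
MAX_PAGE_SIZE = 200  # documented maximum for page_size


def build_url(endpoint: str, query_params: Optional[Dict[str, Any]] = None) -> str:
    """Join an endpoint path and query parameters into a request URL."""
    params = dict(query_params or {})
    size = params.get("page_size")
    if size is not None and not 1 <= int(size) <= MAX_PAGE_SIZE:
        raise ValueError(f"page_size must be between 1 and {MAX_PAGE_SIZE}")
    url = f"{BASE_URL}/{endpoint.strip('/')}/"
    return f"{url}?{urlencode(params)}" if params else url


def read_count(payload: Dict[str, Any]) -> int:
    """Read the total from one response (assumes a top-level 'count' field)."""
    return int(payload["count"])


def iter_results(
    fetch: Callable[[str], Dict[str, Any]], first_url: str
) -> Iterator[Dict[str, Any]]:
    """Yield every result, following 'next' links until there are none left."""
    url: Optional[str] = first_url
    while url:
        page = fetch(url)  # fetch is any callable that returns the parsed JSON
        yield from page.get("results", [])
        url = page.get("next")
```

Requesting `build_url("/entry/interpro", {"type": "family", "page_size": 1})` and passing the parsed response to `read_count` gives a total while transferring only one result. For full downloads, `page_size=200` keeps the number of requests to a minimum.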

--------------------------------------------------------------------------------

## 1. `/entry` Parameters

*For exploring protein entries (genes, domains, families, repeats).*

### General Filters

*   `type`: (`str`) Filter by entry type (e.g., `family`, `domain`,
    `active_site`, `binding_site`, `conserved_site`, `ptms`, `repeat`,
    `homologous_superfamily`).
*   `integrated`: (`str`) Comma-separated list of Member Databases (e.g.,
    `pfam`, `smart`) to filter integrated status. *(Fails if
    `source_db="interpro"`)*.
*   `go_term`: (`str`) Filter by exact Gene Ontology term (e.g., `GO:0016301`).
*   `annotation`: (`str`) Filter by annotation type (`logo`, `alignment`,
    `hmm`). *(Works only when `source_db` is a member database).*
*   `group_by`: (`str`) Aggregation method. *Note: Valid values depend on the
    context!*
    -   `/entry` (and `/entry/integrated`, `/entry/unintegrated`, `/entry/all`):
        `type`, `source_database`, `tax_id`, `go_terms`.
    -   `/entry/interpro`: `type`, `tax_id`, `source_database`,
        `member_databases`, `go_terms`, `go_categories`.
    -   `/entry/{sourceDB}`: `type`, `tax_id`, `source_database`, `go_terms`,
        `go_categories`.
*   `sort_by`: (`str`) Sort criteria (e.g., `accession`, `name`).
*   `interpro_status`: (`str`) Value `"interpro_status"` counts how many entries
    are integrated and how many are not. *(Fails unless `source_db` is a member
    database)*.
*   `ida`: (`str`) Include domain architecture (IDA) strings.
*   `extra_fields`: (`str`) Include additional data (e.g., `counters`,
    `entry_id`, `short_name`, `description`, `wikipedia`, `literature`,
    `hierarchy`, `cross_references`, `entry_date`, `is_featured`,
    `overlaps_with`). *(Only available for `/entry/{sourceDB}` and
    `/entry/{sourceDB}/{accession}`).*
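
Because the valid `group_by` values depend on the endpoint, it can help to check a value before sending the request. The sketch below copies the three value sets from the list above into a lookup table. `allowed_group_by` and `check_group_by` are hypothetical helper names, and treating every two-segment `/entry/{sourceDB}` path as a source database is an assumption of this sketch.

```python
from typing import Dict, FrozenSet

# group_by values per endpoint context, copied from the list above.
_ROOT_VALUES = frozenset({"type", "source_database", "tax_id", "go_terms"})
_INTERPRO_VALUES = frozenset(
    {"type", "tax_id", "source_database", "member_databases", "go_terms", "go_categories"}
)
_SOURCE_DB_VALUES = frozenset(
    {"type", "tax_id", "source_database", "go_terms", "go_categories"}
)

GROUP_BY_VALUES: Dict[str, FrozenSet[str]] = {
    "/entry": _ROOT_VALUES,
    "/entry/integrated": _ROOT_VALUES,
    "/entry/unintegrated": _ROOT_VALUES,
    "/entry/all": _ROOT_VALUES,
    "/entry/interpro": _INTERPRO_VALUES,
}


def allowed_group_by(endpoint: str) -> FrozenSet[str]:
    """Return the group_by values documented for an /entry endpoint."""
    path = "/" + endpoint.strip("/").lower()
    if path in GROUP_BY_VALUES:
        return GROUP_BY_VALUES[path]
    parts = path.split("/")  # "/entry/pfam" -> ["", "entry", "pfam"]
    if len(parts) == 3 and parts[1] == "entry":
        return _SOURCE_DB_VALUES  # assumed to be /entry/{sourceDB}
    raise ValueError(f"no group_by values documented for {path}")


def check_group_by(endpoint: str, value: str) -> Dict[str, str]:
    """Return a query_params fragment, or raise if the value is not listed."""
    allowed = allowed_group_by(endpoint)
    if value not in allowed:
        raise ValueError(
            f"group_by={value!r} is not listed for {endpoint}; "
            f"expected one of {sorted(allowed)}"
        )
    return {"group_by": value}
```

For example, `check_group_by("/entry/interpro", "member_databases")` returns a fragment that can be merged into `query_params`, while the same value on the root `/entry` endpoint raises `ValueError` before any request is made.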

### InterPro-Specific (`source_db="interpro"`)

*   `go_category`: (`str`) Filter by top-level GO (`biological_process`,
    `molecular_function`, `cellular_component`).
*   `signature_in`: (`str`) Filter to entries matching a given member database.
*   `latest_entries`: (`str`) Pass `"latest_entries"` to filter for entries
    modified in the most recent release.
*   `interactions`: (`str`) Pass `"interactions"` to limit to entries with known
    structural interactions.
*   `pathways`: (`str`) Pass `"pathways"` to filter for entries linked to
    pathway datasets.
*   `has_model`: (`str`) Pass `"has_model"` to filter for entries with
    structural models.

### Source-DB Specific

*   `subfamilies` / `subfamily`: (`str`) Filter specifically against Panther
    subfamilies. *(Fails unless `source_db="panther"`)*.
*   `model`: (`str`) Included models from `interpro` or `pfam`.
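
The `source_db` restrictions noted in the three lists above can be collected into one pre-flight check. This is a sketch under stated assumptions: `find_conflicts` is a hypothetical helper, the InterPro-specific parameters are treated as requiring `source_db="interpro"` based on the section heading, and a "member database" is taken to mean any `source_db` other than `interpro`, `all`, `integrated`, or `unintegrated`.

```python
from typing import Any, Dict, List, Optional

# Parameter groups taken from the lists above.
INTERPRO_ONLY = {
    "go_category", "signature_in", "latest_entries",
    "interactions", "pathways", "has_model",
}
MEMBER_DB_ONLY = {"annotation", "interpro_status"}
PANTHER_ONLY = {"subfamilies", "subfamily"}
# Assumption: these source_db values are not member databases.
NON_MEMBER = {None, "interpro", "all", "integrated", "unintegrated"}


def find_conflicts(source_db: Optional[str], query_params: Dict[str, Any]) -> List[str]:
    """Return one message per violated rule; an empty list means none found."""
    db = source_db.lower() if source_db else None
    problems: List[str] = []
    for name in query_params:
        if name in INTERPRO_ONLY and db != "interpro":
            problems.append(f"{name} requires source_db='interpro'")
        elif name in MEMBER_DB_ONLY and db in NON_MEMBER:
            problems.append(f"{name} requires a member database source_db")
        elif name in PANTHER_ONLY and db != "panther":
            problems.append(f"{name} requires source_db='panther'")
        elif name == "integrated" and db == "interpro":
            problems.append("integrated cannot be combined with source_db='interpro'")
    return problems
```

Returning a list of messages rather than raising on the first problem lets an agent report every conflict in one pass and repair the whole `query_params` dictionary at once.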

### IDA (Domain Architecture) Search

*(Can ONLY be used on the root `/entry` endpoint. Invalidates aggregations).*

*   `ida_search`: (`str`) Comma-separated list of domain accessions (InterPro or
    Pfam) to find architectures containing them.
*   `ida_ignore`: (`str`) Architectures to ignore. *(Requires `ida_search`)*.
*   `ordered`: (`str`) Pass `"ordered"` to mandate domains appear sequentially.
    *(Requires `ida_search`)*.
*   `exact`: (`str`) Pass `"exact"` to mandate exact composition (no surplus
    domains). *(Requires `ida_search` and `ordered`)*.
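
The dependencies between the IDA parameters can be checked before a request is sent. The sketch below is illustrative only: `check_ida_params` is a hypothetical helper, the accession in the usage note is a placeholder, and reading "invalidates aggregations" as "do not combine with `group_by`" is an interpretation of the note above rather than something the Swagger file is quoted as saying.

```python
from typing import Any, Dict, List

IDA_PARAMS = {"ida_search", "ida_ignore", "ordered", "exact"}


def check_ida_params(endpoint: str, query_params: Dict[str, Any]) -> List[str]:
    """Return one message per violated IDA rule; an empty list means none found."""
    problems: List[str] = []
    if not (query_params.keys() & IDA_PARAMS):
        return problems  # no IDA parameters present, nothing to check
    if endpoint.strip("/") != "entry":
        problems.append("IDA search parameters only work on the root /entry endpoint")
    if "ida_search" not in query_params:
        problems.append("ida_ignore, ordered, and exact require ida_search")
    if "exact" in query_params and "ordered" not in query_params:
        problems.append("exact requires ordered")
    if "group_by" in query_params:
        # Interpretation of "invalidates aggregations" (see lead-in).
        problems.append("group_by cannot be combined with an IDA search")
    return problems
```

A call such as `check_ida_params("/entry", {"ida_search": "IPR000001", "ordered": "ordered", "exact": "exact"})` returns an empty list, whereas dropping `ordered` from the same dictionary reports that `exact` requires it.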

--------------------------------------------------------------------------------

## 2. `/protein` Parame
