Cheminformatics

Name: Cheminformatics
Author: itallstartedwithaidea

itallstartedwithaidea/agent-skills

Run RDKit-based molecular property, ADMET, virtual screening, and docking-prep pipelines on SMILES/SDF libraries without hand-rolling every cheminformatics step.

Overview

Cheminformatics is an agent skill for the Validate phase that builds RDKit pipelines for molecular property prediction, virtual screening, ADMET analysis, docking prep, and chemical-space exploration from SMILES and SDF structures.

Install

npx skills add https://github.com/itallstartedwithaidea/agent-skills --skill cheminformatics

What is this skill?

RDKit workflows from SMILES/SDF parsing through descriptors, fingerprints, and chemical-space clustering
Virtual screening and drug-likeness filters including Lipinski’s Rule of Five
ADMET-oriented prediction to drop compounds likely to fail downstream
Molecular docking preparation and pose-oriented scoring hooks
Reproducible cheminformatics pipelines with PubChem-style database integration patterns

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 1 install on skills.sh; 18 GitHub stars; 2/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).

What problem does it solve?

You have huge compound libraries but no fast, reproducible way to filter for drug-likeness, ADMET risk, and diverse leads before synthesis or assays.

Who is it for?

Indie builders shipping chemistry-adjacent agents, internal discovery tools, or research prototypes that must score real structures with RDKit.

Skip if: Teams without chemistry inputs, builders who only need generic Python data science with no molecular structures, or production wet-lab protocols with no computational screening step.

When should I use this skill?

You have molecular structures (SMILES/SDF) and need property prediction, screening, ADMET triage, docking preparation, or chemical-space exploration with reproducible RDKit pipelines.

What do I get? / Deliverables

You get an ordered cheminformatics pipeline with computed descriptors, fingerprints, filters, and clustering output you can feed into docking, procurement, or the next modeling skill.

Reproducible cheminformatics pipeline scripts
Filtered or ranked compound tables
Descriptor/fingerprint outputs and clustering summaries for lead selection

Journey fit

Primary fit

Validate: Prototype & spike

Computational triage sits after you have candidate structures but before expensive synthesis or wet-lab bets, which makes it a classic validate/prototype task for drug-discovery and chemistry tooling. The prototype phase is where you narrow libraries with Lipinski filters, fingerprints, similarity search, and ADMET screens on representative sets.

Also useful

Idea: Opportunity & market research
Build: Integrations & version control

How it compares

Use this skill package for RDKit workflow scaffolding—not a hosted compound database or a clinical trial ops platform.

Common Questions / FAQ

Who is cheminformatics for?

Solo and indie builders working on drug discovery, cheminformatics SaaS, or agent tools that must reason over SMILES/SDF libraries with RDKit-backed predictions.

When should I use cheminformatics?

During Validate when prototyping lead lists—e.g. filtering a downloaded library before a landing-page demo, scoring analogs for a niche therapeutic idea, or clustering candidates before docking in a build-phase integration.

Is cheminformatics safe to install?

Treat it like any third-party agent skill: review the Security Audits panel on this Prism page and inspect generated scripts before running them on sensitive compound or IP data.

SKILL.md

# Cheminformatics

Part of [Agent Skills™](https://github.com/itallstartedwithaidea/agent-skills) by [googleadsagent.ai™](https://googleadsagent.ai)

## Description

Cheminformatics provides computational chemistry workflows using RDKit for molecular property prediction, virtual screening, ADMET analysis, molecular docking preparation, and chemical space exploration. The agent generates reproducible cheminformatics pipelines that transform molecular structures (SMILES, SDF) into actionable predictions about drug-likeness, toxicity, and binding affinity.

Drug discovery generates vast chemical libraries that cannot all be synthesized and tested. Cheminformatics narrows the search space computationally: filtering by Lipinski's Rule of Five, predicting ADMET properties (Absorption, Distribution, Metabolism, Excretion, Toxicity), scoring docking poses, and clustering chemical space to identify diverse lead candidates. Each step eliminates compounds that would fail in later, more expensive stages.
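
The clustering step mentioned above can be illustrated with a small pure-Python sketch. It assumes fingerprints are already available as sets of on-bit indices and uses a simple leader (sphere-exclusion) scheme; this is an illustrative stand-in, not the skill's own clustering code and not RDKit's Butina implementation. The function names, the toy fingerprints, and the 0.5 cutoff are all hypothetical.

```python
def tanimoto(a: frozenset[int], b: frozenset[int]) -> float:
    """Tanimoto coefficient |A & B| / |A | B| on sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def pick_diverse_leaders(
    fps: dict[str, frozenset[int]], cutoff: float = 0.5
) -> dict[str, list[str]]:
    """Leader clustering: a compound joins the first leader it resembles at or
    above `cutoff`; otherwise it becomes the leader of a new cluster."""
    clusters: dict[str, list[str]] = {}
    for name, fp in fps.items():
        for leader in clusters:
            if tanimoto(fp, fps[leader]) >= cutoff:
                clusters[leader].append(name)
                break
        else:
            # No existing leader was similar enough, so start a new cluster.
            clusters[name] = [name]
    return clusters


# Hypothetical toy fingerprints (on-bit indices), not derived from real molecules.
toy_fps = {
    "cpd_a": frozenset({1, 2, 3, 4}),
    "cpd_b": frozenset({1, 2, 3, 5}),  # close analog of cpd_a
    "cpd_c": frozenset({10, 11, 12}),  # unrelated to cpd_a
    "cpd_d": frozenset({10, 11, 13}),  # close analog of cpd_c
}
clusters = pick_diverse_leaders(toy_fps, cutoff=0.5)
print(clusters)
```

The cluster leaders (the dictionary keys) act as the diverse representatives; the result depends on input order, which is a known limitation of leader-style schemes.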

This skill covers the molecular informatics workflow from SMILES parsing through descriptor calculation, fingerprint generation, similarity searching, property prediction, and visualization. It integrates with databases like PubChem and ChEMBL for compound retrieval and benchmarking against known actives and inactives.
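
As a minimal sketch of the first stage of that workflow, the snippet below parses a list of SMILES with RDKit, rejects entries that fail to parse, and removes duplicates by canonical SMILES. The helper name `canonical_unique` and the example inputs are assumptions made for illustration; only `Chem.MolFromSmiles` and `Chem.MolToSmiles` are RDKit calls.

```python
from rdkit import Chem


def canonical_unique(smiles_list: list[str]) -> tuple[list[str], list[str]]:
    """Return (unique canonical SMILES, rejected inputs), preserving input order."""
    seen: set[str] = set()
    unique: list[str] = []
    rejected: list[str] = []
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)  # returns None when parsing fails
        if mol is None:
            rejected.append(smi)
            continue
        canonical = Chem.MolToSmiles(mol)  # canonical SMILES by default
        if canonical not in seen:
            seen.add(canonical)
            unique.append(canonical)
    return unique, rejected


# "CCO" and "OCC" are both ethanol; "C1CC" has an unclosed ring and is rejected.
unique, rejected = canonical_unique(["CCO", "OCC", "c1ccccc1", "C1CC"])
print(unique, rejected)
```

Deduplicating on canonical SMILES before computing descriptors avoids scoring the same structure twice when a library mixes different SMILES spellings.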

## Use When

- Calculating molecular properties and descriptors
- Screening compound libraries for drug-likeness
- Predicting ADMET properties for lead compounds
- Performing molecular similarity searches
- Preparing structures for molecular docking
- Visualizing chemical space and structure-activity relationships

## How It Works

```mermaid
graph TD
    A[Molecular Input: SMILES/SDF] --> B[Parse + Validate Structures]
    B --> C[Calculate Descriptors]
    C --> D[Drug-likeness Filters]
    D --> E{Passes Lipinski?}
    E -->|No| F[Flag as Non-Drug-like]
    E -->|Yes| G[ADMET Prediction]
    G --> H[Virtual Screening Score]
    H --> I[Docking Preparation]
    I --> J[Ranked Candidate List]
    F --> K[Report with Flags]
    J --> K
```

Compounds flow through increasingly selective filters. Drug-likeness removes obviously non-viable candidates, ADMET prediction flags absorption and toxicity risks, and virtual screening ranks the survivors by predicted activity.
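
The flow in the diagram can be sketched in plain Python as below. This is a hedged illustration under stated assumptions: the `Compound` fields, the `triage` function, the 0.5 ADMET risk cutoff, and the example values are all hypothetical, and the descriptor and score values are assumed to be precomputed elsewhere. The limit of one Lipinski violation follows the `lipinski_filter` function in the Implementation section.

```python
from dataclasses import dataclass, field


@dataclass
class Compound:
    name: str
    lipinski_violations: int  # assumed precomputed from descriptors
    admet_risk: float         # hypothetical 0-1 risk score from some ADMET model
    screen_score: float       # hypothetical screening score, higher is better
    flags: list[str] = field(default_factory=list)


def triage(compounds, max_violations=1, max_admet_risk=0.5):
    """Return (ranked candidates, flagged compounds), mirroring the flowchart."""
    ranked, flagged = [], []
    for c in compounds:
        if c.lipinski_violations > max_violations:
            # "Passes Lipinski? No" branch: flag and skip later stages.
            c.flags.append("non-drug-like")
            flagged.append(c)
            continue
        if c.admet_risk > max_admet_risk:
            # ADMET stage annotates risk here rather than dropping the compound.
            c.flags.append("admet-risk")
        ranked.append(c)
    # Virtual screening stage: order survivors by predicted activity.
    ranked.sort(key=lambda c: c.screen_score, reverse=True)
    return ranked, flagged


# Hypothetical example values, not real measurements.
library = [
    Compound("cpd_1", lipinski_violations=0, admet_risk=0.2, screen_score=7.1),
    Compound("cpd_2", lipinski_violations=2, admet_risk=0.1, screen_score=9.0),
    Compound("cpd_3", lipinski_violations=1, admet_risk=0.8, screen_score=8.4),
]
ranked, flagged = triage(library)
for c in ranked + flagged:
    print(c.name, c.screen_score, c.flags)
```

Both lists feed the final report, so a compound that fails the drug-likeness filter stays visible with its flag instead of disappearing silently.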

## Implementation

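```python
from rdkit import Chem
from rdkit.Chem import Descriptors, AllChem, Draw, Lipinski, DataStructs
from rdkit.Chem import rdMolDescriptors
import pandas as pd


def molecular_properties(smiles: str) -> dict:
    """Compute basic descriptors and count Lipinski Rule of Five violations."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Invalid SMILES: {smiles}")
    # Compute each Lipinski descriptor once and reuse it for the violation count.
    mw = Descriptors.MolWt(mol)
    logp = Descriptors.MolLogP(mol)
    hbd = Descriptors.NumHDonors(mol)
    hba = Descriptors.NumHAcceptors(mol)
    return {
        "smiles": smiles,
        "mw": mw,
        "logp": logp,
        "hbd": hbd,
        "hba": hba,
        "tpsa": Descriptors.TPSA(mol),
        "rotatable_bonds": Descriptors.NumRotatableBonds(mol),
        "rings": Descriptors.RingCount(mol),
        "lipinski_violations": sum([mw > 500, logp > 5, hbd > 5, hba > 10]),
    }


def lipinski_filter(df: pd.DataFrame) -> pd.DataFrame:
    """Keep compounds with at most one Rule of Five violation."""
    return df[df["lipinski_violations"] <= 1].copy()


def similarity_search(query_smiles: str, library: list[str], threshold: float = 0.7) -> list[dict]:
    """Return library entries whose Morgan-fingerprint similarity to the query meets the threshold."""
    query_mol = Chem.MolFromSmiles(query_smiles)
    if query_mol is None:
        raise ValueError(f"Invalid query SMILES: {query_smiles}")
    query_fp = AllChem.GetMorganFingerprintAsBitVect(query_mol, radius=2, nBits=2048)

    results = []
    for smi in library:
        mol = Chem.MolFromSmiles(smi)
        if mol is None:
            continue  # skip unparseable library entries
        # NOTE: the source listing is truncated partway through the next line.
        # Everything from here on is an assumed completion (Tanimoto scoring
        # against `threshold`), not the author's original code.
        fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius=2, nBits=2048)
        similarity = DataStructs.TanimotoSimilarity(query_fp, fp)
        if similarity >= threshold:
            results.append({"smiles": smi, "similarity": similarity})
    return sorted(results, key=lambda r: r["similarity"], reverse=True)
```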

