
AlphaGenome Single Variant Analysis
Call the AlphaGenome DNA client to score a single genomic variant with correct Interval and 1-based Variant types for interpretation workflows.
Overview
alphagenome-single-variant-analysis is an agent skill for the Build phase that integrates the AlphaGenome Python client for single-variant genomic scoring and visualization.
Install
npx skills add https://github.com/google-deepmind/science-skills --skill alphagenome-single-variant-analysis
What is this skill?
- Documents `alphagenome` imports for genome, track_data, variant scorers, and visualization
- Shows `dna_client.create` with `ALPHAGENOME_API_KEY` and Google science gRPC endpoint
- Clarifies 0-based half-open `genome.Interval` vs 1-based VCF-style `genome.Variant`
- Covers interval helpers: resize, overlap, intersect, and center positioning
- Points to interpretation modules (e.g. ISM) and plotting components for result review
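The coordinate distinction in the list above can be illustrated without the AlphaGenome client. The following is a minimal sketch in plain Python, assuming only the standard conventions (0-based half-open intervals, 1-based VCF-style positions). The helper name `variant_to_half_open` is hypothetical and is not part of the `alphagenome` package.

```python
def variant_to_half_open(position: int, reference_bases: str) -> tuple[int, int]:
    """Convert a 1-based VCF-style position to a 0-based half-open (start, end).

    Hypothetical helper for illustration; not an alphagenome API.
    """
    if position < 1:
        raise ValueError("VCF-style positions are 1-based and must be >= 1")
    start = position - 1                # shift from 1-based to 0-based
    end = start + len(reference_bases)  # half-open: `end` itself is excluded
    return start, end


# A single-base reference allele at 1-based position 36201698
# occupies the 0-based half-open span [36201697, 36201698).
start, end = variant_to_half_open(36201698, "A")
print(start, end, end - start)
```

Keeping the conversion in one place is one way to avoid off-by-one errors when variant positions come from a VCF file and intervals come from BED-style coordinates.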
Adoption & trust: 534 installs on skills.sh; 1.7k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You need to analyze one genomic variant with AlphaGenome but keep mixing up interval coordinates, API keys, and client setup.
Who is it for?
Builders or researchers automating variant-level AlphaGenome calls inside Python notebooks, batch jobs, or science-focused coding agents.
Skip if: you are doing general web app frontend work, building non-genomics products, or lack AlphaGenome API access and variant coordinates to analyze.
When should I use this skill?
You are writing Python that calls AlphaGenome for one variant and need correct client setup and genome data types.
What do I get? / Deliverables
You run a typed single-variant workflow with correct Interval/Variant objects, an initialized DNA client, and optional plots for interpretation.
- Initialized `dna_client` session and variant/interval objects
- Single-variant scoring or interpretation script scaffold
- Optional plots via alphagenome visualization helpers
Journey fit
Wiring a specialized science API into scripts or agent pipelines is Build work once you have a variant hypothesis to test. The Integrations subphase fits external API clients, env-based keys, and typed domain objects from third-party SDKs.
How it compares
An API integration reference for DeepMind AlphaGenome, not a general bioinformatics pipeline framework or workflow orchestrator such as Snakemake.
Common Questions / FAQ
Who is alphagenome-single-variant-analysis for?
Solo developers and small research teams using coding agents to integrate AlphaGenome single-variant analysis in Python.
When should I use alphagenome-single-variant-analysis?
During Build → integrations when scaffolding `alphagenome` client code, fixing coordinate bugs, or adding variant scorer calls before downstream interpretation.
Is alphagenome-single-variant-analysis safe to install?
The skill implies network API access and secrets via environment variables; review the Security Audits panel on this Prism page and protect `ALPHAGENOME_API_KEY`.
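One way to protect the key in practice is to read it from the environment and fail early when it is missing, rather than hard-coding it in a script. The sketch below assumes only the `ALPHAGENOME_API_KEY` variable name mentioned on this page. The `load_api_key` and `mask` helpers are hypothetical names, not part of the skill or the `alphagenome` package.

```python
import os


def load_api_key(env_var: str = "ALPHAGENOME_API_KEY") -> str:
    """Return the API key from the environment, or raise if it is unset or blank."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it or add it to your .env file."
        )
    return key


def mask(key: str) -> str:
    """Hide all but the last four characters so the key can be logged safely."""
    if len(key) <= 4:
        return "*" * len(key)  # too short to reveal any part of it
    return "*" * (len(key) - 4) + key[-4:]
```

A script could call `load_api_key()` once at startup and log only `mask(key)`, so the full secret never reaches notebooks outputs or job logs.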
SKILL.md
# AlphaGenome API Reference

Pip package: `alphagenome`

## Setup and Imports

Standard imports for AlphaGenome workflows:

```python
from alphagenome.data import gene_annotation
from alphagenome.data import genome
from alphagenome.data import track_data
from alphagenome.data import transcript as transcript_utils
from alphagenome.interpretation import ism
from alphagenome.models import dna_client
from alphagenome.models import variant_scorers
from alphagenome.visualization import plot_components
import matplotlib.pyplot as plt
import pandas as pd
```

### Client Initialization

The API key is automatically loaded by `dotenv` from the `.env` file in the agent configuration directory. To initialize a client:

```python
import os  # needed for os.environ below

from alphagenome.models import dna_client

dna_model = dna_client.create(
    api_key=os.environ.get('ALPHAGENOME_API_KEY'),
    address='dns:///gdmscience.googleapis.com:443',
)
```

## Core Data Types

### genome.Interval

0-based half-open interval (includes `start`, excludes `end`).

```python
interval = genome.Interval(chromosome='chr1', start=1_000, end=1_010)
interval.center()  # Returns center position (int)
interval.width  # Returns 10
interval.resize(100)  # Resizes around center
interval.overlaps(other_interval)
interval.contains(other_interval)
interval.intersect(other_interval)
```

### genome.Variant

Position is **1-based** (VCF-compatible).

```python
variant = genome.Variant(
    chromosome='chr22',
    position=36201698,  # 1-based!
    reference_bases='A',
    alternate_bases='C',
)

# Get interval around variant
interval = variant.reference_interval.resize(dna_client.SEQUENCE_LENGTH_1MB)
```

## Predictions

### Predict from DNA Sequence

```python
output = dna_model.predict_sequence(
    sequence='GATTACA'.center(dna_client.SEQUENCE_LENGTH_1MB, 'N'),
    requested_outputs=[dna_client.OutputType.DNASE],
    ontology_terms=['UBERON:0002048'],  # Lung
)

# Access predictions
print(output.dnase.values.shape)  # (sequence_length, num_tracks)
print(output.dnase.metadata)  # Track metadata DataFrame
```

### Predict from Genome Interval

```python
interval = genome.Interval('chr1', 1000000, 1000001)
interval = interval.resize(dna_client.SEQUENCE_LENGTH_1MB)

output = dna_model.predict_interval(
    interval=interval,
    requested_outputs=[dna_client.OutputType.RNA_SEQ],
    ontology_terms=['UBERON:0001114'],  # Right liver lobe
)
```

### Mouse Predictions

Specify `organism=dna_client.Organism.MUS_MUSCULUS` for mouse models.

```python
output = dna_model.predict_sequence(
    ...,
    organism=dna_client.Organism.MUS_MUSCULUS,
)
```

## Variant Analysis

### Predict Variant Effects (Raw Tracks)

Compare predictions for Reference (REF) vs Alternate (ALT) alleles.

```python
variant_output = dna_model.predict_variant(
    interval=interval,
    variant=variant,
    requested_outputs=[dna_client.OutputType.RNA_SEQ],
    ontology_terms=['UBERON:0001157'],  # Colon - Transverse
)

ref_tracks = variant_output.reference.rna_seq
alt_tracks = variant_output.alternate.rna_seq
```

### Score Variants (Aggregated Scores)

Get aggregated scores using recommended scorers.

```python
scorer = variant_scorers.RECOMMENDED_VARIANT_SCORERS['RNA_SEQ']
variant_scores_list = dna_model.score_variant(
    interval=interval,
    variant=variant,
    variant_scorers=[scorer],
)
scores = variant_scores_list[0]

# Tidy scores to DataFrame
df = variant_scorers.tidy_scores([scores], match_gene_strand=True)
print(df[['gene_symbol', 'raw_score', 'quantile_score']])
```

**Available recommended scorers:** `ATAC`, `CAGE`, `DNASE`, `PROCAP`, `RNA_SEQ`, `CHIP_TF`, `CHIP_HISTONE`, `SPLICE_SITES`, `SPLICE_SITE_USAGE`, `SPLICE_JUNCTIONS`, `POLYADENYLATION`, `CONTACT_MAPS`

### Batch Variant Scoring

```python
# Parse variants from VCF-like DataFrame
for _, row in vcf_df.iterrows():
    variant = genome.Variant(
        chromosome=str(row.CHROM),
        position=int(row.POS),
        reference_bases=row.REF,
        alternate_ba