
Scvi Tools
Run Bayesian differential-expression analysis on single-cell RNA (and related modalities) with batch correction and zero-inflation handling via scvi-tools.
Overview
scvi-tools is an agent skill for the Build phase that guides Bayesian differential expression testing with scvi-tools generative models, batch-aware contrasts, and zero-inflation-aware uncertainty.
Install
npx skills add https://github.com/k-dense-ai/scientific-agent-skills --skill scvi-tools
What is this skill?
- Three-stage DE workflow: posterior sampling, hypothesis testing (vanilla and change modes), and uncertainty-aware fold-change
- Batch-corrected comparisons across groups or cell types on RNA, totalVI protein, and PeakVI accessibility
- Log fold-change framing log(μ_B) − log(μ_A) with generative-model expression draws
- Handles dropout and zero inflation instead of treating zeros as simple missing data
- Flexible pairwise contrasts on learned latent representations
- Three-stage DE process: expression estimation, hypothesis testing, and contrast reporting
Adoption & trust: 515 installs on skills.sh; 27.6k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You have a trained single-cell model but need defensible DE between conditions without ignoring batch effects, dropout, or probabilistic effect sizes.
Who is it for?
Indie or solo builders automating single-cell DE in Python agents after models are trained and contrasts are defined.
Skip if: Teams without scvi-tools training experience, bulk-only RNA workflows, or anyone treating agent output as clinical significance without expert review.
When should I use this skill?
You need differential expression between groups or cell types on data already modeled with scvi-tools (RNA, totalVI, or PeakVI).
What do I get? / Deliverables
You get a structured DE workflow—posterior sampling, testing mode choice, and interpreted log fold-changes—ready to document in notebooks or downstream validation reports.
- DE contrast plan aligned to scvi-tools testing modes
- Interpreted log fold-change and uncertainty summary for chosen features
Journey fit
Canonical shelf is Build because the skill drives executable analysis pipelines and model-based DE workflows, not early market discovery. Backend fits data-modeling and probabilistic inference steps that sit outside UI and shipping checklists.
How it compares
Use instead of generic DESeq2-style chat recipes when your data already lives in scvi-tools’ probabilistic single-cell stack.
Common Questions / FAQ
Who is scvi-tools for?
Solo builders and small teams using Claude Code, Cursor, or Codex to run single-cell differential expression inside scvi-tools-trained pipelines.
When should I use scvi-tools?
Use it in Build when you need DE on batch-corrected latent or denoised expression; in Validate when comparing cell-type contrasts for a prototype dataset; not for idea-stage market research without omics data.
Is scvi-tools safe to install?
Review the Security Audits panel on this Prism page and treat the skill as procedural guidance over local Python and data files—do not pipe PHI into untrusted environments without your own controls.
SKILL.md
# Differential Expression Analysis in scvi-tools

This document provides detailed information about differential expression (DE) analysis using scvi-tools' probabilistic framework.

## Overview

scvi-tools implements Bayesian differential expression testing that leverages the learned generative models to estimate expression differences between groups. This approach provides several advantages over traditional methods:

- **Batch correction**: DE testing on batch-corrected representations
- **Uncertainty quantification**: Probabilistic estimates of effect sizes
- **Zero-inflation handling**: Proper modeling of dropout and zeros
- **Flexible comparisons**: Between any groups or cell types
- **Multiple modalities**: Works for RNA, proteins (totalVI), and accessibility (PeakVI)

## Core Statistical Framework

### Problem Definition

The goal is to estimate the log fold-change in expression between two conditions:

```
log fold-change = log(μ_B) - log(μ_A)
```

Where μ_A and μ_B are the mean expression levels in conditions A and B.
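As a rough illustration of the definition above, the sketch below computes a log fold-change from a handful of draws of the two means and summarises the resulting distribution. The draw values, the helper name `log_fold_change`, and the choice of base-2 logarithms are assumptions made for this example; it is not how scvi-tools produces its estimates, which come from the trained generative model.

```python
import math
import statistics


def log_fold_change(mu_a, mu_b):
    """Return log2(mu_b) - log2(mu_a) for two positive mean expression levels."""
    return math.log2(mu_b) - math.log2(mu_a)


# Hypothetical draws of mean expression for one gene in conditions A and B.
mu_a_draws = [1.0, 2.0, 1.0, 2.0]
mu_b_draws = [2.0, 2.0, 4.0, 8.0]

# One log fold-change per paired draw, then a mean and spread over the draws.
lfc_draws = [log_fold_change(a, b) for a, b in zip(mu_a_draws, mu_b_draws)]
lfc_mean = statistics.mean(lfc_draws)
lfc_std = statistics.stdev(lfc_draws)

print(f"lfc_mean={lfc_mean:.3f}, lfc_std={lfc_std:.3f}")
```

Working with a distribution of log fold-changes, instead of a single ratio of averages, is what makes it possible to report a spread alongside the point estimate.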
### Three-Stage Process

**Stage 1: Estimating Expression Levels**
- Sample from the posterior distribution of cellular states
- Generate expression values from the learned generative model
- Aggregate across cells to get population-level estimates

**Stage 2: Detecting Relevant Features (Hypothesis Testing)**
- Test for differential expression using a Bayesian framework
- Two testing modes available:
  - **"vanilla" mode**: Point null hypothesis (β = 0)
  - **"change" mode**: Composite null hypothesis (|β| ≤ δ)

**Stage 3: Controlling False Discovery**
- Posterior expected False Discovery Proportion (FDP) control
- Selects the maximum number of discoveries ensuring E[FDP] ≤ α

## Basic Usage

### Simple Two-Group Comparison

```python
import scvi

# Assumes `adata` is an AnnData object that has already been
# registered with scvi.model.SCVI.setup_anndata(...)
model = scvi.model.SCVI(adata)
model.train()

# Compare two cell types
de_results = model.differential_expression(
    groupby="cell_type",
    group1="T cells",
    group2="B cells"
)

# View top DE genes
top_genes = de_results.sort_values("lfc_mean", ascending=False).head(20)
print(top_genes[["lfc_mean", "lfc_std", "bayes_factor", "is_de_fdr_0.05"]])
```

### One vs. Rest Comparison

```python
# Compare one group against all others
de_results = model.differential_expression(
    groupby="cell_type",
    group1="T cells"  # No group2 = compare to rest
)
```

### All Pairwise Comparisons

```python
# Compare all cell types pairwise
# (runs both orderings, e.g. A_vs_B and B_vs_A)
all_comparisons = {}
cell_types = adata.obs["cell_type"].unique()

for ct1 in cell_types:
    for ct2 in cell_types:
        if ct1 != ct2:
            key = f"{ct1}_vs_{ct2}"
            all_comparisons[key] = model.differential_expression(
                groupby="cell_type",
                group1=ct1,
                group2=ct2
            )
```

## Key Parameters

### `groupby` (required)

Column in `adata.obs` defining groups to compare.

```python
# Must be a categorical variable
de_results = model.differential_expression(groupby="cell_type")
```

### `group1` and `group2`

Groups to compare. If `group2` is None, compares `group1` to all others.
```python
# Specific comparison
de = model.differential_expression(
    groupby="condition",
    group1="treated",
    group2="control"
)

# One vs rest
de = model.differential_expression(groupby="cell_type", group1="T cells")
```

### `mode` (Hypothesis Testing Mode)

**"vanilla" mode**: Point null hypothesis
- Tests if β = 0 exactly
- More sensitive, but may find trivially small effects

**"change" mode** (the default for `differential_expression` in current scvi-tools releases): Composite null hypothesis
- Tests if |β| ≤ δ
- Requires biologically meaningful change
- Reduces false discoveries of tiny effects

```python
# Change mode with minimum effect size
de = model.differential_expression(
    groupby="cell_type",
    group1="T cells",
    group2="B cells",
    mode="change",
    delta=0.25  # Minimum log fold-change
)
```

### `delta`

Minimum effect size threshold for "change" mode.

- Typical values: 0.25, 0.5, 0.7 (log scale)
- log2(1.5) ≈ 0.58
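To make the threshold concrete, the following sketch converts a multiplicative fold-change into a log2 `delta` and estimates, from a set of log fold-change draws, the fraction that falls outside the |β| ≤ δ region. The helper names `fold_change_to_delta` and `prob_change` and the draw values are hypothetical, and the base-2 scale is assumed from the log2(1.5) example above; scvi-tools performs its own version of this computation internally.

```python
import math


def fold_change_to_delta(fold_change):
    """Convert a multiplicative fold-change (e.g. 1.5x) to a log2 threshold."""
    return math.log2(fold_change)


def prob_change(lfc_draws, delta):
    """Fraction of log fold-change draws whose magnitude exceeds delta."""
    return sum(abs(x) > delta for x in lfc_draws) / len(lfc_draws)


# A 1.5x change corresponds to a threshold of roughly 0.58 on the log2 scale.
delta = fold_change_to_delta(1.5)

# Hypothetical log fold-change draws for a single gene.
lfc_draws = [0.1, 0.7, 0.9, -0.2, 1.3, 0.6]
p_de = prob_change(lfc_draws, delta)

print(f"delta={delta:.3f}, P(|lfc| > delta)={p_de:.3f}")
```

A larger `delta` shrinks this probability for genes with small effects, which is why "change" mode reports fewer trivially small differences than a point-null test.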