
Pymc
Stand up a Bayesian hierarchical (multilevel) PyMC model on grouped data with sampling, diagnostics, and plotting hooks.
Overview
pymc is an agent skill for the Validate phase that templates Bayesian hierarchical multilevel models in PyMC with ArviZ-backed analysis for grouped datasets.
Install
npx skills add https://github.com/k-dense-ai/scientific-agent-skills --skill pymc
What is this skill?
- End-to-end PyMC hierarchical/multilevel workflow with TODO-marked customization sections
- Demonstration generator: 10 groups × 20 observations per group (200 rows total), simulated from known true hyperparameters for sanity checks
- Uses PyMC, ArviZ, NumPy, Pandas, and Matplotlib in one scripted template
- Covers data prep, model specification, sampling, and posterior analysis patterns typical of nested datasets
- Oriented to grouped data such as students-in-schools or patients-in-hospitals examples in the docstring
- Template sections are numbered, starting with data preparation, model building, a prior predictive check, and model fitting
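The template indexes group-level parameters with an integer array (`alpha[groups_data]`), so string or non-contiguous group IDs from a CSV would need to be mapped to codes `0..n_groups-1` before they can be used that way. A minimal sketch of one way to do this, assuming NumPy only; the `raw_ids` labels are invented for illustration and are not part of the skill:

```python
import numpy as np

# Hypothetical raw group labels, as they might appear in a 'group_id' column.
raw_ids = np.array(["school_b", "school_a", "school_b", "school_c", "school_a"])

# np.unique returns the sorted unique labels; with return_inverse=True it also
# returns, for each row, the position of its label in that sorted array.
group_names, groups = np.unique(raw_ids, return_inverse=True)
n_groups = len(group_names)

print(group_names.tolist())  # ['school_a', 'school_b', 'school_c']
print(groups.tolist())       # [1, 0, 1, 2, 0]
```

Here `groups` can play the role of the template's `groups` array and `group_names` can be passed as the `'groups'` coordinate, so posterior summaries are labelled with the original IDs.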
Adoption & trust: 543 installs on skills.sh; 27.6k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
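Your outcomes are nested by group, and plain least-squares models either fit each group separately and overfit small groups, or pool all groups and ignore their differences. Neither gives the partial pooling you need for decision-grade uncertainty.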
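Partial pooling can be illustrated with the textbook normal-normal case, where each group estimate is a precision-weighted average of the group's own mean and the population mean. The sketch below assumes the population mean `mu`, within-group standard deviation `sigma`, and between-group standard deviation `tau` are known, which is a simplification: a hierarchical PyMC model places priors on these quantities and infers them from the data. The function name `partial_pool` and the numbers are illustrative only.

```python
import numpy as np

def partial_pool(group_means, group_sizes, mu, sigma, tau):
    """Conditional posterior means of group effects in a normal-normal model.

    Assumes mu, sigma, and tau are known (an illustrative simplification).
    """
    group_means = np.asarray(group_means, dtype=float)
    group_sizes = np.asarray(group_sizes, dtype=float)
    data_precision = group_sizes / sigma**2   # information from the group's own data
    prior_precision = 1.0 / tau**2            # information from the population level
    weight = data_precision / (data_precision + prior_precision)
    return weight * group_means + (1.0 - weight) * mu

# A group with 1 observation and a group with 100 observations.
pooled = partial_pool([2.0, 8.0], [1, 100], mu=5.0, sigma=1.0, tau=1.0)
print(pooled)
```

With these inputs the small group's estimate moves halfway toward the population mean (from 2.0 to 3.5), while the large group's estimate stays close to its own mean of 8.0. That asymmetric shrinkage is what partial pooling refers to.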
Who is it for?
Solo data scientists validating multilevel structure on modest nested datasets with Python and PyMC already in the environment.
Skip if: you need only descriptive analytics, non-Bayesian ML benchmarks, or production model serving without a Python notebook workflow.
When should I use this skill?
You need a PyMC hierarchical or multilevel Bayesian workflow for nested/grouped observational data with TODO customization points.
What do I get? / Deliverables
You adapt the PyMC template to your CSV, run hierarchical inference, and inspect posteriors with ArviZ and Matplotlib before locking model assumptions into production pipelines.
- Customized PyMC hierarchical model script
- Posterior summaries via ArviZ
- Diagnostic plots from the template workflow
Journey fit
The skill is a modeling prototype template, best filed under Validate, for testing whether grouped structure explains outcomes before production ML build-out. Hierarchical PyMC workflows suit prototype validation of assumptions, priors, and group effects, not frontend or SEO work.
How it compares
A Bayesian modeling script template, not a hosted experiment tracker or AutoML tabular API.
Common Questions / FAQ
Who is pymc for?
Indie researchers and builder-scientists who validate grouped or nested hypotheses with PyMC before embedding models into apps or reports.
When should I use pymc?
During Validate (prototype) when exploring hierarchical priors on real group IDs; optionally in Build (backend) when prototyping inference code adjacent to your API.
Is pymc safe to install?
Check the Security Audits panel on this Prism page. The skill runs Python locally on your data and has no built-in secret handling, so treat notebooks like any scientific code you run.
SKILL.md
"""
PyMC Hierarchical/Multilevel Model Template

This template provides a complete workflow for Bayesian hierarchical models,
useful for grouped/nested data (e.g., students within schools, patients within hospitals).

Customize the sections marked with # TODO
"""

import pymc as pm
import arviz as az
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# =============================================================================
# 1. DATA PREPARATION
# =============================================================================

# TODO: Load your data with group structure
# Example:
# df = pd.read_csv('data.csv')
# groups = df['group_id'].values
# X = df['predictor'].values
# y = df['outcome'].values

# For demonstration: Generate hierarchical data
np.random.seed(42)
n_groups = 10
n_per_group = 20
n_obs = n_groups * n_per_group

# True hierarchical structure
true_mu_alpha = 5.0
true_sigma_alpha = 2.0
true_mu_beta = 1.5
true_sigma_beta = 0.5
true_sigma = 1.0

group_alphas = np.random.normal(true_mu_alpha, true_sigma_alpha, n_groups)
group_betas = np.random.normal(true_mu_beta, true_sigma_beta, n_groups)

# Generate data
groups = np.repeat(np.arange(n_groups), n_per_group)
X = np.random.randn(n_obs)
y = group_alphas[groups] + group_betas[groups] * X + np.random.randn(n_obs) * true_sigma

# TODO: Customize group names
group_names = [f'Group_{i}' for i in range(n_groups)]

# =============================================================================
# 2. BUILD HIERARCHICAL MODEL
# =============================================================================

print("Building hierarchical model...")

coords = {
    'groups': group_names,
    'obs': np.arange(n_obs)
}

with pm.Model(coords=coords) as hierarchical_model:
    # Data containers (for later predictions)
    X_data = pm.Data('X_data', X)
    groups_data = pm.Data('groups_data', groups)

    # Hyperpriors (population-level parameters)
    # TODO: Adjust hyperpriors based on your domain knowledge
    mu_alpha = pm.Normal('mu_alpha', mu=0, sigma=10)
    sigma_alpha = pm.HalfNormal('sigma_alpha', sigma=5)
    mu_beta = pm.Normal('mu_beta', mu=0, sigma=10)
    sigma_beta = pm.HalfNormal('sigma_beta', sigma=5)

    # Group-level parameters (non-centered parameterization)
    # Non-centered parameterization improves sampling efficiency
    alpha_offset = pm.Normal('alpha_offset', mu=0, sigma=1, dims='groups')
    alpha = pm.Deterministic('alpha', mu_alpha + sigma_alpha * alpha_offset, dims='groups')

    beta_offset = pm.Normal('beta_offset', mu=0, sigma=1, dims='groups')
    beta = pm.Deterministic('beta', mu_beta + sigma_beta * beta_offset, dims='groups')

    # Observation-level model
    mu = alpha[groups_data] + beta[groups_data] * X_data

    # Observation noise
    sigma = pm.HalfNormal('sigma', sigma=5)

    # Likelihood; tie shape to X_data so prediction data can have a new row count
    y_obs = pm.Normal('y_obs', mu=mu, sigma=sigma, observed=y,
                      shape=X_data.shape[0], dims='obs')

print("Model built successfully!")
print(f"Groups: {n_groups}")
print(f"Observations: {n_obs}")

# =============================================================================
# 3. PRIOR PREDICTIVE CHECK
# =============================================================================

print("\nRunning prior predictive check...")

with hierarchical_model:
    prior_pred = pm.sample_prior_predictive(draws=500, random_seed=42)

# Visualize prior predictions
fig, ax = plt.subplots(figsize=(10, 6))
az.plot_ppc(prior_pred, group='prior', num_pp_samples=100, ax=ax)
ax.set_title('Prior Predictive Check')
plt.tight_layout()
plt.savefig('hierarchical_prior_check.png', dpi=300, bbox_inches='tight')
print("Prior predictive check saved to 'hierarchical_prior_check.png'")

# =============================================================================
# 4. FIT MODEL
# =============================================================================

print("\nFitting hierarchical model..."