
Statistical Analysis
Check statistical assumptions and diagnostics before trusting t-tests, ANOVA, regression, or correlation results in research or product experiments.
Overview
Statistical Analysis is an agent skill most often used in Validate (also Grow analytics and Idea research) that guides assumption checks and diagnostic procedures before interpreting statistical tests.
Install
npx skills add https://github.com/k-dense-ai/scientific-agent-skills --skill statistical-analysis
What is this skill?
- 5 general principles: check before interpret, visual + formal tests, robustness notes, document checks, report remedial
- Structured assumption sections including independence, normality, and related diagnostics
- Remedial guidance: mixed models, GEE, time-series methods when independence fails
- Severity callouts (e.g., HIGH for independence violations on Type I error)
- Explicit n>30 robustness notes for t-tests and ANOVA normality requirements
Adoption & trust: 611 installs on skills.sh; 27.6k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You have test output or regression fits but have not verified independence, normality, or other assumptions, risking wrong conclusions.
Who is it for?
Builders running experiments, A/B analyses, or scientific reports who need a repeatable assumption-audit ritual in the agent.
Skip if: You only need descriptive stats for quick dashboards, or you are doing regulated clinical work that requires institutional statistician sign-off and would use the skill in place of that human review.
When should I use this skill?
Before interpreting statistical test results or writing analysis reports that require assumption validation.
What do I get? / Deliverables
You produce documented assumption checks, violation severity notes, and remedial analysis choices aligned with the tests you plan to report.
- Documented assumption checks (visual and formal)
- Violation severity assessment and remedial method recommendations
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Assumption validation is the canonical shelf in Validate when interpreting whether evidence supports a scope or hypothesis decision. Fits scoping and proving work—confirming data supports conclusions before full build or growth bets.
Where it fits
Audit normality and independence before declaring an MVP A/B test winner.
Re-check regression residuals when funnel metrics feed a forecasting dashboard.
Screen pilot survey data for autocorrelation before citing effect sizes in an opportunity memo.
Attach assumption documentation to a stats section in a product requirements or research appendix.
How it compares
A diagnostic checker for inference assumptions—not a one-click auto-ML trainer or visualization-only chart skill.
Common Questions / FAQ
Who is statistical-analysis for?
Solo analysts, indie SaaS founders validating experiments, and agent users writing method sections or internal stats memos.
When should I use statistical-analysis?
In Validate when proving a hypothesis or scope; in Grow when reviewing analytics model validity; in Idea when assessing early survey or pilot data before committing to build.
Is statistical-analysis safe to install?
It is documentation-only procedural knowledge; review the Security Audits panel on this page and never treat automated checks as a substitute for domain expert review on high-stakes decisions.
SKILL.md
# Statistical Assumptions and Diagnostic Procedures

This document provides comprehensive guidance on checking and validating statistical assumptions for various analyses.

## General Principles

1. **Always check assumptions before interpreting test results**
2. **Use multiple diagnostic methods** (visual + formal tests)
3. **Consider robustness**: Some tests are robust to violations under certain conditions
4. **Document all assumption checks** in analysis reports
5. **Report violations and remedial actions taken**

## Common Assumptions Across Tests

### 1. Independence of Observations

**What it means**: Each observation is independent; measurements on one subject do not influence measurements on another.

**How to check**:
- Review study design and data collection procedures
- For time series: Check autocorrelation (ACF/PACF plots, Durbin-Watson test)
- For clustered data: Consider intraclass correlation (ICC)

**What to do if violated**:
- Use mixed-effects models for clustered/hierarchical data
- Use time series methods for temporally dependent data
- Use generalized estimating equations (GEE) for correlated data

**Critical severity**: HIGH - violations can severely inflate Type I error

---

### 2. Normality

**What it means**: Data or residuals follow a normal (Gaussian) distribution.
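The "data or residuals" distinction matters in regression, where the assumption applies to the residuals and not to the raw outcome. The following is a minimal sketch on simulated data. The variable names, the exponential predictor, and the coefficients are illustrative assumptions and are not part of the skill. Because the outcome inherits the skew of the predictor, a test on the raw outcome is expected to reject normality even though the noise term is Gaussian by construction.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical data: a strongly skewed predictor plus Gaussian noise.
x = rng.exponential(scale=2.0, size=200)
y = 1.0 + 3.0 * x + rng.normal(scale=1.0, size=200)

# Fit a simple linear regression and compute the residuals.
fit = stats.linregress(x, y)
residuals = y - (fit.intercept + fit.slope * x)

# Shapiro-Wilk on the raw outcome versus on the residuals.
_, p_outcome = stats.shapiro(y)
_, p_residuals = stats.shapiro(residuals)

print(f"raw outcome: p = {p_outcome:.4g}")
print(f"residuals:   p = {p_residuals:.4g}")
```

In a setting like this, running the normality check on `y` alone could lead to an unnecessary transformation or a switch to a non-parametric method, so it is usually worth checking the residuals of the fitted model first.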

**When required**:
- t-tests (for small samples; robust for n > 30 per group)
- ANOVA (for small samples; robust for n > 30 per group)
- Linear regression (for residuals)
- Some correlation tests (Pearson)

**How to check**:

**Visual methods** (primary):
- Q-Q (quantile-quantile) plot: Points should fall on diagonal line
- Histogram with normal curve overlay
- Kernel density plot

**Formal tests** (secondary):
- Shapiro-Wilk test (recommended for n < 50)
- Kolmogorov-Smirnov test
- Anderson-Darling test

**Python implementation**:
```python
from scipy import stats
import matplotlib.pyplot as plt

# Shapiro-Wilk test (data: 1-D array of observations or residuals)
statistic, p_value = stats.shapiro(data)

# Q-Q plot against the normal distribution
stats.probplot(data, dist="norm", plot=plt)
plt.show()
```

**Interpretation guidance**:
- For n < 30: Both visual and formal tests important
- For 30 ≤ n < 100: Visual inspection primary, formal tests secondary
- For n ≥ 100: Formal tests overly sensitive; rely on visual inspection
- Look for severe skewness, outliers, or bimodality

**What to do if violated**:
- **Mild violations** (slight skewness): Proceed if n > 30 per group
- **Moderate violations**: Use non-parametric alternatives (Mann-Whitney, Kruskal-Wallis, Wilcoxon)
- **Severe violations**:
  - Transform data (log, square root, Box-Cox)
  - Use non-parametric methods
  - Use robust regression methods
  - Consider bootstrapping

**Critical severity**: MEDIUM - parametric tests are often robust to mild violations with adequate sample size

---

### 3. Homogeneity of Variance (Homoscedasticity)

**What it means**: Variances are equal across groups or across the range of predictors.

**When required**:
- Independent samples t-test
- ANOVA
- Linear regression (constant variance of residuals)

**How to check**:

**Visual methods** (primary):
- Box plots by group (for t-test/ANOVA)
- Residuals vs. fitted values plot (for regression) - should show random scatter
- Scale-location plot (square root of standardized residuals vs. fitted)

**Formal tests** (secondary):
- Levene's test (robust to non-normality)
- Bartlett's test (sensitive to non-normality, not recommended)
- Brown-Forsythe test (median-based version of Levene's)
- Breusch-Pagan test (for regression)

**Python implementation**:
```python
from scipy import stats
from statsmodels.stats.diagnostic import het_breuschpagan

# Levene's test (group1, group2, group3: one array of values per group)
statistic, p_value = stats.levene(group1, group2, group3)

# For regression: Breusch-Pagan test
# (residuals: from the fitted model; exog: design matrix including a constant)
_, p_value, _, _ = het_breuschpagan(residuals, exog)
```

**Interpretation guidance**:
- Variance ratio (max/min) < 2-3: Generally acceptable
- For ANOVA: Test is robust if groups have equal sizes
- For regression: Look