
Statistical Analysis
Interpret product and business metrics with the right center, spread, and tests instead of misleading averages.
Overview
Statistical Analysis is an agent skill most often used in Grow (also Validate, Build) that applies descriptive stats, tests, and anomaly checks so metric conclusions match the data’s shape.
Install
npx skills add https://github.com/anthropics/knowledge-work-plugins --skill statistical-analysis
What is this skill?
- Central tendency decision table: mean vs median vs mode by distribution shape
- Rule to report mean and median together on business metrics to surface skew
- Spread toolkit: standard deviation, IQR, coefficient of variation, and range caveats
- Trend analysis, outlier detection, and hypothesis testing workflows
- Explicit caution guidance on when statistical claims are unreliable
- Central tendency guidance table covers 4 decision situations (symmetric, skewed, categorical, highly skewed revenue-styl
- Explicit rule to always report mean and median together for business metrics
Adoption & trust: 2.5k installs on skills.sh; 19.6k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
Your dashboard shows a glowing average while medians, outliers, or tiny samples tell a different story—and you are not sure which test to trust.
Who is it for?
Founders analyzing activation, revenue per user, experiment results, or support volume without a dedicated data scientist.
Skip if: you need production-grade ML modeling pipelines with full sklearn workflows, or legal/regulatory statistical sign-off without human review.
When should I use this skill?
Analyzing distributions, testing for significance, detecting anomalies, computing correlations, or interpreting statistical results.
What do I get? / Deliverables
You get structured choices for center and spread, outlier and trend steps, and hypothesis-testing guidance with explicit limits on overclaiming.
- Recommended center and spread summaries for the metric type
- Interpretation notes for tests, outliers, or trends with caution callouts
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Statistical interpretation is anchored on grow/analytics where solo builders read funnels, revenue, and experiments. The skill’s tables and tests target distribution shape, skew, outliers, and significance—the core analytics literacy gap for indies.
Where it fits
Compare signup conversion before and after a landing change with a simple significance check.
Report median and mean ARPU together to explain skew from a few whale accounts.
Define anomaly thresholds for an internal usage scoring job using IQR instead of raw max-min range.
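The signup-conversion scenario above could be checked with a two-proportion z-test. The sketch below is one possible approach using only Python's standard library, not the skill's prescribed method; the function name `two_proportion_ztest` and the visitor and signup counts are hypothetical. It relies on the normal approximation, so it assumes reasonably large samples in both periods.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates.

    conv_a / n_a: conversions and visitors before the change.
    conv_b / n_b: conversions and visitors after the change.
    Returns (z statistic, two-sided p-value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both periods share one rate
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 120 of 2400 visitors signed up before, 156 of 2400 after
z, p_value = two_proportion_ztest(120, 2400, 156, 2400)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value only says the difference is unlikely under the no-change assumption; it does not rule out seasonality or other changes that happened between the two periods.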
How it compares
This skill is an interpretation methodology for metrics, not a replacement for your BI tool or warehouse SQL skill.
Common Questions / FAQ
Who is statistical-analysis for?
Solo and indie builders who export or query metrics and need statistically sound summaries and tests without hiring a data team.
When should I use statistical-analysis?
In grow analytics for KPI reviews and anomaly hunts, in validate when judging prototype or pricing experiments, and in build when defining features driven by distributions or significance checks.
Is statistical-analysis safe to install?
Check the Security Audits panel on this Prism page and the upstream plugin source; the skill is guidance-only but your agent may still access local CSVs or notebooks you attach.
SKILL.md - Statistical Analysis
# Statistical Analysis Skill

Descriptive statistics, trend analysis, outlier detection, hypothesis testing, and guidance on when to be cautious about statistical claims.

## Descriptive Statistics Methodology

### Central Tendency

Choose the right measure of center based on the data:

| Situation | Use | Why |
|---|---|---|
| Symmetric distribution, no outliers | Mean | Most efficient estimator |
| Skewed distribution | Median | Robust to outliers |
| Categorical or ordinal data | Mode | Only option for non-numeric |
| Highly skewed with outliers (e.g., revenue per user) | Median + mean | Report both; the gap shows skew |

**Always report mean and median together for business metrics.** If they diverge significantly, the data is skewed and the mean alone is misleading.

### Spread and Variability

- **Standard deviation**: How far values typically fall from the mean. Use with normally distributed data.
- **Interquartile range (IQR)**: Distance from p25 to p75. Robust to outliers. Use with skewed data.
- **Coefficient of variation (CV)**: StdDev / Mean. Use to compare variability across metrics with different scales.
- **Range**: Max minus min. Sensitive to outliers but gives a quick sense of data extent.

### Percentiles for Business Context

Report key percentiles to tell a richer story than mean alone:

```
p1:  Bottom 1% (floor / minimum typical value)
p5:  Low end of normal range
p25: First quartile
p50: Median (typical user)
p75: Third quartile
p90: Top 10% / power users
p95: High end of normal range
p99: Top 1% / extreme users
```

**Example narrative**: "The median session duration is 4.2 minutes, but the top 10% of users spend over 22 minutes per session, pulling the mean up to 7.8 minutes."
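
The center, spread, and percentile measures above can be computed together in one pass. The following is a minimal sketch using only Python's standard library; the `summarize` helper and the revenue figures are hypothetical illustrations, not part of the skill, and the "inclusive" percentile method is just one of several common interpolation conventions.

```python
import statistics


def summarize(values):
    """Return center, spread, and key percentiles for a numeric sample."""
    data = sorted(values)
    mean = statistics.fmean(data)
    median = statistics.median(data)
    stdev = statistics.stdev(data)  # sample standard deviation
    # quantiles(n=100) returns the 99 cut points p1..p99 (index 0 is p1)
    pct = statistics.quantiles(data, n=100, method="inclusive")
    p25, p75, p90, p99 = pct[24], pct[74], pct[89], pct[98]
    return {
        "mean": mean,
        "median": median,
        "mean_minus_median": mean - median,  # a large gap signals skew
        "stdev": stdev,
        "iqr": p75 - p25,
        "cv": stdev / mean if mean else float("nan"),
        "range": data[-1] - data[0],
        "p25": p25,
        "p75": p75,
        "p90": p90,
        "p99": p99,
    }


# Hypothetical revenue-per-user sample with one "whale" account
revenue = [5, 8, 9, 10, 11, 12, 12, 14, 15, 400]
summary = summarize(revenue)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

In this sample the mean is 49.6 while the median is 11.5, and the IQR (4.25) stays small while the range (395) is dominated by the single large account, which is the pattern the "report both" rule is meant to surface.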

### Describing Distributions

Characterize every numeric distribution you analyze:

- **Shape**: Normal, right-skewed, left-skewed, bimodal, uniform, heavy-tailed
- **Center**: Mean and median (and the gap between them)
- **Spread**: Standard deviation or IQR
- **Outliers**: How many and how extreme
- **Bounds**: Is there a natural floor (zero) or ceiling (100%)?

## Trend Analysis and Forecasting

### Identifying Trends

**Moving averages** to smooth noise:

```python
# Assumes df is a pandas DataFrame with one row per day, sorted by date,
# and a numeric 'metric' column.

# 7-day moving average (good for daily data with weekly seasonality)
df['ma_7d'] = df['metric'].rolling(window=7, min_periods=1).mean()

# 28-day moving average (smooths weekly AND monthly patterns)
df['ma_28d'] = df['metric'].rolling(window=28, min_periods=1).mean()
```

**Period-over-period comparison**:

- Week-over-week (WoW): Compare to the same day last week
- Month-over-month (MoM): Compare to the same period in the prior month
- Year-over-year (YoY): Gold standard for seasonal businesses
- Same-day-last-year: Compare a specific calendar day

**Growth rates**:

```
Simple growth: (current - previous) / previous
CAGR:          (ending / beginning) ^ (1 / years) - 1
Log growth:    ln(current / previous)  -- better for volatile series
```

### Seasonality Detection

Check for periodic patterns:

1. Plot the raw time series -- visual inspection first
2. Compute day-of-week averages: is there a clear weekly pattern?
3. Compute month-of-year averages: is there an annual cycle?
4. When comparing periods, always use YoY or same-period comparisons to avoid conflating trend with seasonality

### Forecasting (Simple Methods)

For business analysts (not data scientists), use straightforward methods:

- **Naive forecast**: Tomorrow = today. Use as a baseline.
- **Seasonal naive**: Tomorrow = same day last week/year.
- **Linear trend**: Fit a line to historical data. Only for clearly linear trends.
- **Moving average forecast**: Use trailing average as the forecast.

**Always commu