Statistical Analysis

Name: Statistical Analysis
Author: anthropics

anthropics/knowledge-work-plugins

3.4k installs
23.1k repo stars
Updated July 28, 2026

statistical-analysis is a skill that applies descriptive statistics, trend analysis, outlier detection, and hypothesis testing with business interpretation guardrails.

About

statistical-analysis teaches descriptive statistics, trend analysis, outlier detection, hypothesis testing, and cautionary guidance for interpreting business data. Central tendency guidance picks the mean for symmetric data, the median for skewed distributions, and the mode for categorical fields, and always reports mean and median together for business metrics. Spread covers standard deviation, IQR, coefficient of variation, and range, with percentile narratives from p1 through p99. Trend analysis documents 7-day and 28-day moving averages; WoW, MoM, and YoY comparisons; simple, CAGR, and log growth formulas; seasonality checks; and naive through moving-average forecasts that give uncertainty ranges instead of false precision. Outlier methods include absolute z-scores above 3, 1.5 × IQR bounds, and p1/p99 percentile cuts, with investigate-don't-delete handling that separates data errors from genuine extremes. The hypothesis testing framework covers null and alternative hypotheses, alpha 0.05, t-tests, z-tests for proportions, paired t-tests, ANOVA, Mann-Whitney U, and chi-squared, along with practical versus statistical significance and effect sizes. Caution sections warn about mistaking correlation for causation, multiple comparisons, Simpson's paradox, and survivorship bias.

Mean-versus-median selection rules, with both always reported for business metrics.
Moving averages; WoW, MoM, and YoY comparisons; and forecast ranges rather than point estimates.
Outlier methods: z-score, 1.5 × IQR, and percentile cuts, with investigation before removal.
Hypothesis test picker: t-test, z-test for proportions, ANOVA, Mann-Whitney U, chi-squared.
Cautions on correlation versus causation, multiple comparisons, Simpson's paradox, and survivorship bias.

Statistical Analysis by the numbers

3,384 all-time installs (skills.sh)
+127 installs in the week ending Jul 28, 2026 (Skillselion tracking)
Ranked #26 of 2,066 Data Science & ML skills by installs in the Skillselion catalog
Security screen: LOW risk (skills.sh audit)
Data as of Jul 28, 2026 (Skillselion catalog sync)

At a glance

statistical-analysis capabilities & compatibility

Capabilities: central tendency and spread methodology · moving average and period over period trends · z score iqr and percentile outlier detection · hypothesis test selection and interpretation · correlation causation and bias caution framework
Use cases: data analysis · research

From the docs

What statistical-analysis says it does

Always report mean and median together for business metrics.

SKILL.md

Correlation Is Not Causation

SKILL.md

npx skills add https://github.com/anthropics/knowledge-work-plugins --skill statistical-analysis


Installs	3.4k
repo stars	★ 23.1k
Security audit	3 / 3 scanners passed
Last updated	July 28, 2026
Repository	anthropics/knowledge-work-plugins ↗

How do I analyze metric distributions, detect outliers, test A/B significance, and avoid misleading statistical claims in business reports?

Apply descriptive stats, trend analysis, outlier detection, and hypothesis testing with business-safe interpretation guardrails.

Who is it for?

Analysts interpreting business metrics, experiment results, and time series who need methodology plus correlation-causation guardrails.

Skip if: You need deep ML modeling, non-linear multi-seasonality forecasting, or production data pipeline engineering without analysis context.

When should I use this skill?

User asks about descriptive stats, trend analysis, outlier detection, hypothesis testing, p-values, or statistical significance interpretation.

What you get

Structured analysis with mean-median spread, trend comparisons, outlier handling notes, test results with effect sizes, and explicit caution flags.

statistical summary
trend analysis
hypothesis test interpretation

Files

SKILL.md · Markdown · GitHub ↗

Statistical Analysis Skill

Descriptive statistics, trend analysis, outlier detection, hypothesis testing, and guidance on when to be cautious about statistical claims.

Descriptive Statistics Methodology

Central Tendency

Choose the right measure of center based on the data:

Situation	Use	Why
Symmetric distribution, no outliers	Mean	Most efficient estimator
Skewed distribution	Median	Robust to outliers
Categorical or ordinal data	Mode	Only option for non-numeric
Highly skewed with outliers (e.g., revenue per user)	Median + mean	Report both; the gap shows skew

Always report mean and median together for business metrics. If they diverge significantly, the data is skewed and the mean alone is misleading.
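As a minimal sketch of the "report both" rule, the snippet below computes the mean, the median, and the relative gap between them. The sample values and the `gap_ratio` name are illustrative assumptions, not part of the skill.

```python
from statistics import mean, median

# Hypothetical revenue-per-user sample: most users spend little, one spends a lot.
revenue_per_user = [12, 15, 18, 20, 22, 25, 30, 35, 40, 900]

avg = mean(revenue_per_user)
mid = median(revenue_per_user)

# Relative gap between mean and median; a large value hints at skew.
gap_ratio = (avg - mid) / mid

print(f"mean={avg:.1f}, median={mid:.1f}, gap={gap_ratio:.0%}")
```

Here a single large spender pulls the mean far above the median, which is the situation where quoting the mean alone would mislead.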

Spread and Variability

Standard deviation: How far values typically fall from the mean. Use with normally distributed data.
Interquartile range (IQR): Distance from p25 to p75. Robust to outliers. Use with skewed data.
Coefficient of variation (CV): StdDev / Mean. Use to compare variability across metrics with different scales.
Range: Max minus min. Sensitive to outliers but gives a quick sense of data extent.
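The four spread measures above can be sketched with the Python standard library. The data is hypothetical, and the choice of the sample standard deviation and the "inclusive" quantile method are assumptions; other quantile conventions give slightly different quartiles.

```python
from statistics import mean, quantiles, stdev

# Hypothetical metric values.
values = [4, 8, 15, 16, 23, 42]

sd = stdev(values)                                   # sample standard deviation
q1, _, q3 = quantiles(values, n=4, method="inclusive")
iqr = q3 - q1                                        # distance from p25 to p75
cv = sd / mean(values)                               # unitless, comparable across scales
value_range = max(values) - min(values)              # max minus min

print(f"sd={sd:.2f}, iqr={iqr:.2f}, cv={cv:.2f}, range={value_range}")
```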

Percentiles for Business Context

Report key percentiles to tell a richer story than mean alone:

p1:   Bottom 1% (floor / minimum typical value)
p5:   Low end of normal range
p25:  First quartile
p50:  Median (typical user)
p75:  Third quartile
p90:  Top 10% / power users
p95:  High end of normal range
p99:  Top 1% / extreme users

Example narrative: "The median session duration is 4.2 minutes, but the top 10% of users spend over 22 minutes per session, pulling the mean up to 7.8 minutes."
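One way to produce the key percentiles for such a narrative is sketched below, using only the standard library. The session durations are made up for illustration, and the "inclusive" interpolation method is an assumption.

```python
from statistics import quantiles

# Hypothetical session durations in minutes (right-skewed).
durations = [1, 1, 2, 2, 3, 3, 4, 4, 4, 5, 5, 6, 7, 8, 10, 12, 15, 22, 35, 60]

# n=100 returns 99 cut points; cuts[p - 1] is the p-th percentile.
cuts = quantiles(durations, n=100, method="inclusive")
key_percentiles = {f"p{p}": cuts[p - 1] for p in (1, 5, 25, 50, 75, 90, 95, 99)}

for name, value in key_percentiles.items():
    print(f"{name}: {value:.1f} min")
```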

Describing Distributions

Characterize every numeric distribution you analyze:

Shape: Normal, right-skewed, left-skewed, bimodal, uniform, heavy-tailed
Center: Mean and median (and the gap between them)
Spread: Standard deviation or IQR
Outliers: How many and how extreme
Bounds: Is there a natural floor (zero) or ceiling (100%)?

Trend Analysis and Forecasting

Identifying Trends

Moving averages to smooth noise:

# 7-day moving average (good for daily data with weekly seasonality)
df['ma_7d'] = df['metric'].rolling(window=7, min_periods=1).mean()

# 28-day moving average (smooths weekly AND monthly patterns)
df['ma_28d'] = df['metric'].rolling(window=28, min_periods=1).mean()

Period-over-period comparison:

Week-over-week (WoW): Compare to same day last week
Month-over-month (MoM): Compare to the same period in the prior month
Year-over-year (YoY): Gold standard for seasonal businesses
Same-day-last-year: Compare specific calendar day

Growth rates:

Simple growth: (current - previous) / previous
CAGR: (ending / beginning) ^ (1 / years) - 1
Log growth: ln(current / previous)  -- better for volatile series
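The three growth formulas translate directly into small functions. This is a sketch; the function names are illustrative and not defined by the skill.

```python
import math

def simple_growth(current, previous):
    """(current - previous) / previous"""
    return (current - previous) / previous

def cagr(ending, beginning, years):
    """Compound annual growth rate: (ending / beginning) ^ (1 / years) - 1"""
    return (ending / beginning) ** (1 / years) - 1

def log_growth(current, previous):
    """ln(current / previous)"""
    return math.log(current / previous)

print(f"simple: {simple_growth(120, 100):.1%}")
print(f"CAGR over 2 years: {cagr(200, 100, 2):.1%}")
print(f"log: {log_growth(120, 100):.3f}")
```

A practical property of log growth is that it adds across consecutive periods: the log growth from 100 to 120 plus the log growth from 120 to 150 equals the log growth from 100 to 150, which simple growth rates do not satisfy.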

Seasonality Detection

Check for periodic patterns:

1. Plot the raw time series -- visual inspection first
2. Compute day-of-week averages: is there a clear weekly pattern?
3. Compute month-of-year averages: is there an annual cycle?
4. When comparing periods, always use YoY or same-period comparisons to avoid conflating trend with seasonality
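The day-of-week check can be sketched as follows, using only the standard library. The series is synthetic (weekdays at 100, weekends at 60), and the `dow_index` ratio is an illustrative assumption for judging how strong the weekly pattern is.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Synthetic daily metric: four weeks starting on a Monday, lower on weekends.
start = date(2024, 1, 1)
daily = []
for i in range(28):
    day = start + timedelta(days=i)
    daily.append((day, 100 if day.weekday() < 5 else 60))

# Group values by weekday and average each group.
by_weekday = defaultdict(list)
for day, value in daily:
    by_weekday[WEEKDAYS[day.weekday()]].append(value)
dow_avg = {name: mean(values) for name, values in by_weekday.items()}

# Ratio to the overall average; values far from 1.0 suggest a weekly pattern.
overall_avg = mean(value for _, value in daily)
dow_index = {name: avg / overall_avg for name, avg in dow_avg.items()}

for name in WEEKDAYS:
    print(f"{name}: avg={dow_avg[name]:.0f}, index={dow_index[name]:.2f}")
```

The same grouping by calendar month gives the month-of-year check, provided the series spans more than one year.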

Forecasting (Simple Methods)

For business analysts (not data scientists), use straightforward methods:

Naive forecast: Tomorrow = today. Use as a baseline.
Seasonal naive: Tomorrow = same day last week/year.
Linear trend: Fit a line to historical data. Only for clearly linear trends.
Moving average forecast: Use trailing average as the forecast.

Always communicate uncertainty. Provide a range, not a point estimate:

"We expect 10K-12K signups next month based on the 3-month trend"
NOT "We will get exactly 11,234 signups next month"
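A rough sketch of a naive forecast and a moving-average forecast reported as a range is shown below. The signup counts are hypothetical, and the band (one standard deviation of month-to-month changes) is a heuristic assumption rather than a formal prediction interval.

```python
from statistics import mean, stdev

# Hypothetical monthly signup counts, oldest first.
monthly_signups = [9800, 10400, 11100, 10700, 11500, 11200]

naive_forecast = monthly_signups[-1]        # next month = this month (baseline)
ma_forecast = mean(monthly_signups[-3:])    # trailing 3-month average

# Assumption: use the spread of month-to-month changes as a rough uncertainty band.
changes = [b - a for a, b in zip(monthly_signups, monthly_signups[1:])]
band = stdev(changes)

def round_to(x, step=100):
    """Round to the nearest step so the range does not imply false precision."""
    return int(round(x / step) * step)

low, high = round_to(ma_forecast - band), round_to(ma_forecast + band)
print(f"Naive baseline: {naive_forecast:,}")
print(f"Expect roughly {low:,}-{high:,} signups next month (3-month average).")
```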

When to escalate to a data scientist: Non-linear trends, multiple seasonalities, external factors (marketing spend, holidays), or when forecast accuracy matters for resource allocation.

Outlier and Anomaly Detection

Statistical Methods

Z-score method (for normally distributed data):

z_scores = (df['value'] - df['value'].mean()) / df['value'].std()
outliers = df[abs(z_scores) > 3]  # More than 3 standard deviations

IQR method (robust to non-normal distributions):

Q1 = df['value'].quantile(0.25)
Q3 = df['value'].quantile(0.75)
IQR = Q3 - Q1
lower_bound = Q1 - 1.5 * IQR
upper_bound = Q3 + 1.5 * IQR
outliers = df[(df['value'] < lower_bound) | (df['value'] > upper_bound)]

Percentile method (simplest):

outliers = df[(df['value'] < df['value'].quantile(0.01)) |
              (df['value'] > df['value'].quantile(0.99))]

Handling Outliers

Do NOT automatically remove outliers. Instead:

1. Investigate: Is this a data error, a genuine extreme value, or a different population?
2. Data errors: Fix or remove (e.g., negative ages, timestamps in year 1970)
3. Genuine extremes: Keep them but consider using robust statistics (median instead of mean)
4. Different population: Segment them out for separate analysis (e.g., enterprise vs. SMB customers)

Report what you did: "We excluded 47 records (0.3%) with transaction amounts >$50K, which represent bulk enterprise orders analyzed separately."

Time Series Anomaly Detection

For detecting unusual values in a time series:

1. Compute expected value (moving average or same-period-last-year)
2. Compute deviation from expected
3. Flag deviations beyond a threshold (typically 2-3 standard deviations of the residuals)
4. Distinguish between point anomalies (single unusual value) and change points (sustained shift)
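Steps 1 to 3 can be sketched with a trailing moving average as the expected value. The function name, the 7-point window, and the synthetic series are assumptions for illustration. Step 4 (separating point anomalies from change points) still needs a human look at whether the flagged deviation persists.

```python
from statistics import mean, stdev

def flag_anomalies(values, window=7, threshold=3.0):
    """Return indices whose residual from the trailing mean exceeds
    threshold times the standard deviation of all residuals."""
    residuals = {}
    for i in range(window, len(values)):
        expected = mean(values[i - window:i])   # trailing window excludes the current point
        residuals[i] = values[i] - expected
    residual_sd = stdev(list(residuals.values()))
    return [i for i, r in residuals.items() if abs(r) > threshold * residual_sd]

# Synthetic series: flat at 100 with a single spike at index 15.
series = [100] * 20
series[15] = 200

print(flag_anomalies(series))
```

Note that the spike also inflates the trailing mean for the following points, so a very low threshold could flag those as well. That is one reason to inspect flagged points rather than act on them automatically.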

Hypothesis Testing Basics

When to Use

Use hypothesis testing when you need to determine whether an observed difference is likely real or could be due to random chance. Common scenarios:

A/B test results: Is variant B actually better than A?
Before/after comparison: Did the product change actually move the metric?
Segment comparison: Do enterprise customers really have higher retention?

The Framework

1. Null hypothesis (H0): There is no difference (the default assumption)
2. Alternative hypothesis (H1): There is a difference
3. Choose significance level (alpha): Typically 0.05 (a 5% chance of a false positive when H0 is true)
4. Compute test statistic and p-value
5. Interpret: If p < alpha, reject H0 (evidence of a real difference); otherwise, fail to reject H0, which is not proof that there is no difference

Common Tests

Scenario	Test	When to Use
Compare two group means	t-test (independent)	Normal data, two groups
Compare two group proportions	z-test for proportions	Conversion rates, binary outcomes
Compare paired measurements	Paired t-test	Before/after on same entities
Compare 3+ group means	ANOVA	Multiple segments or variants
Non-normal data, two groups	Mann-Whitney U test	Skewed metrics, ordinal data
Association between categories	Chi-squared test	Two categorical variables
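As one worked instance from the table, here is a sketch of a two-sided z-test for two proportions, written with the standard library only. The function name and the A/B counts are hypothetical; a statistics package would normally be used in practice.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference in proportions, using the pooled rate."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical A/B test: 200 of 5,000 convert on A, 250 of 5,000 on B.
z, p_value = two_proportion_z_test(200, 5000, 250, 5000)
print(f"z={z:.2f}, p={p_value:.4f}")
```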

Practical Significance vs. Statistical Significance

Statistical significance means a difference this large would be unlikely if chance alone were at work.

Practical significance means the difference is large enough to matter for business decisions.

A difference can be statistically significant but practically meaningless (common with large samples). Always report:

Effect size: How big is the difference? (e.g., "Variant B improved conversion by 0.3 percentage points")
Confidence interval: What's the range of plausible true effects?
Business impact: What does this translate to in revenue, users, or other business terms?
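The three items above can be reported together. The sketch below uses a simple normal-approximation (Wald) interval for the difference in conversion rates; the counts, the monthly traffic, and the revenue per conversion are all hypothetical inputs.

```python
from math import sqrt
from statistics import NormalDist

def diff_in_proportions_ci(success_a, n_a, success_b, n_b, confidence=0.95):
    """Return (difference, lower, upper) using a normal-approximation interval."""
    p_a, p_b = success_a / n_a, success_b / n_b
    diff = p_b - p_a
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff, diff - z * se, diff + z * se

diff, low, high = diff_in_proportions_ci(200, 5000, 250, 5000)

# Hypothetical business inputs for translating the effect into revenue.
monthly_visitors = 100_000
revenue_per_conversion = 40.0

print(f"Effect size: {diff * 100:.1f} percentage points")
print(f"95% CI: {low * 100:.1f} to {high * 100:.1f} percentage points")
print(f"Monthly revenue impact: ${low * monthly_visitors * revenue_per_conversion:,.0f} "
      f"to ${high * monthly_visitors * revenue_per_conversion:,.0f}")
```

Carrying the interval through to the revenue figure keeps the business estimate as a range rather than a single number.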

Sample Size Considerations

Small samples produce unreliable results, even with significant p-values
Rule of thumb for proportions: Need at least 30 events per group for basic reliability
For detecting small effects (e.g., 1% conversion rate change), you may need thousands of observations per group
If your sample is small, say so: "With only 200 observations per group, we have limited power to detect effects smaller than X%"
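To make "thousands of observations per group" concrete, the sketch below applies the standard normal-approximation sample size formula for comparing two proportions. The function name and the defaults of alpha 0.05 and 80% power are assumptions, not values prescribed by the skill.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(p_base, p_variant, alpha=0.05, power=0.80):
    """Approximate observations needed per group for a two-sided test
    comparing two proportions (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p_base * (1 - p_base) + p_variant * (1 - p_variant)
    return ceil((z_alpha + z_power) ** 2 * variance / (p_variant - p_base) ** 2)

# Detecting a lift from a 10% to an 11% conversion rate.
n = sample_size_per_group(0.10, 0.11)
print(f"About {n:,} observations per group")
```

Doubling the detectable lift to two percentage points cuts the requirement to roughly a quarter, because the required sample scales with the inverse square of the effect.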

When to Be Cautious About Statistical Claims

Correlation Is Not Causation

When you find a correlation, explicitly consider:

Reverse causation: Maybe B causes A, not A causes B
Confounding variables: Maybe C causes both A and B
Coincidence: With enough variables, spurious correlations are inevitable

What you can say: "Users who use feature X have 30% higher retention"
What you cannot say without more evidence: "Feature X causes 30% higher retention"

Multiple Comparisons Problem

When you test many hypotheses, some will be "significant" by chance:

Testing 20 metrics at alpha = 0.05 means about 1 will be falsely significant on average, even if no real effects exist
If you looked at many segments before finding one that's different, note that
Adjust for multiple comparisons with Bonferroni correction (divide alpha by number of tests) or report how many tests were run
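The arithmetic behind these points is sketched below. The p-values are invented for illustration, and the family-wise error figure assumes the 20 tests are independent.

```python
alpha = 0.05

# Hypothetical p-values from testing 20 metrics.
p_values = [0.001, 0.012, 0.03, 0.04, 0.20] + [0.5] * 15
n_tests = len(p_values)

expected_false_positives = alpha * n_tests           # about 1 when no real effects exist
family_wise_error = 1 - (1 - alpha) ** n_tests       # chance of at least one false positive
bonferroni_alpha = alpha / n_tests                   # corrected per-test threshold

naive_hits = [p for p in p_values if p < alpha]
corrected_hits = [p for p in p_values if p < bonferroni_alpha]

print(f"Expected false positives: {expected_false_positives:.1f}")
print(f"Chance of at least one false positive: {family_wise_error:.0%}")
print(f"Significant before correction: {len(naive_hits)}, after: {len(corrected_hits)}")
```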

Simpson's Paradox

A trend in aggregated data can reverse when data is segmented:

Always check whether the conclusion holds across key segments
Example: Overall conversion goes up, but conversion goes down in every segment -- because the mix shifted toward a higher-converting segment
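A small numeric sketch of that example follows. The segment names and counts are hypothetical, chosen so that the traffic mix shifts toward the higher-converting segment.

```python
# Hypothetical (visitors, conversions) by segment for two periods.
before = {"returning": (1000, 200), "new": (9000, 450)}
after = {"returning": (5000, 950), "new": (5000, 200)}

def rate(visitors, conversions):
    return conversions / visitors

def overall(period):
    visitors = sum(v for v, _ in period.values())
    conversions = sum(c for _, c in period.values())
    return conversions / visitors

# Each segment's rate falls, yet the blended rate rises because the mix changed.
for segment in before:
    print(f"{segment}: {rate(*before[segment]):.1%} -> {rate(*after[segment]):.1%}")
print(f"overall: {overall(before):.1%} -> {overall(after):.1%}")
```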

Survivorship Bias

You can only analyze entities that "survived" to be in your dataset:

Analyzing active users ignores those who churned
Analyzing successful companies ignores those that failed
Always ask: "Who is missing from this dataset, and would their inclusion change the conclusion?"

Ecological Fallacy

Aggregate trends may not apply to individuals:

"Countries with higher X have higher Y" does NOT mean "individuals with higher X have higher Y"
Be careful about applying group-level findings to individual cases

Anchoring on Specific Numbers

Be wary of false precision:

"Churn will be 4.73% next quarter" implies more certainty than is warranted
Prefer ranges: "We expect churn between 4-6% based on historical patterns"
Round appropriately: "About 5%" is often more honest than "4.73%"

Forks & variants (1)

Statistical Analysis has 1 known copy in the catalog, totaling 129 installs. It canonicalizes to this original listing.

smithery.ai - 129 installs

How it compares

Use statistical-analysis for interpretive metric validation; use a dedicated ML training skill when building predictive models rather than summarizing existing data.

FAQ

Should statistical-analysis automatically remove outliers?

No. Investigate whether they are data errors, genuine extremes, or a different population before deciding.

What is the default significance level?

Alpha 0.05 (5% false positive chance) with emphasis on practical significance and effect sizes.

When should analysis escalate to a data scientist?

Non-linear trends, multiple seasonalities, external factors, or when forecast accuracy drives resource allocation.

Is Statistical Analysis safe to install?

skills.sh reports 3 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.
