
A/B Testing
Plan statistically valid A/B tests by choosing sample size, MDE, power, and run duration before you ship experiments on landing pages or funnels.
Install
npx skills add https://github.com/infrasity-labs/dev-gtm-claude-skills --skill ab-testing
What is this skill?
- Sample size fundamentals: baseline rate, MDE, 95% significance, 80% power
- Quick-reference tables and duration formulas with minimum/maximum run rules
- Guidance on multiple variants, sequential testing, and when requirements are too high
- Common sample size mistakes and a quick decision framework
- Pointers to online calculators for hands-on sizing
Adoption & trust: 1 install on skills.sh; 24 GitHub stars; trending (+100% hot-view momentum).
Journey fit
Primary fit
Grow → analytics is where conversion experiments are sized and interpreted. The skill is a reference for experiment design rather than code generation. Its sample-size tables, duration calculator, and decision framework for measuring lift all belong to the analytics subphase.
SKILL.md
# Sample Size Guide

Reference for calculating sample sizes and test duration.

## Contents

- Sample Size Fundamentals (required inputs, what these mean)
- Sample Size Quick Reference Tables
- Duration Calculator (formula, examples, minimum duration rules, maximum duration guidelines)
- Online Calculators
- Adjusting for Multiple Variants
- Common Sample Size Mistakes
- When Sample Size Requirements Are Too High
- Sequential Testing
- Quick Decision Framework

## Sample Size Fundamentals

### Required Inputs

1. **Baseline conversion rate**: Your current rate
2. **Minimum detectable effect (MDE)**: Smallest change worth detecting
3. **Statistical significance level**: Usually 95% (α = 0.05)
4. **Statistical power**: Usually 80% (β = 0.20)

### What These Mean

**Baseline conversion rate**: If your page converts at 5%, that's your baseline.

**MDE (Minimum Detectable Effect)**: The smallest improvement you care about detecting. Set this based on:

- Business impact (is a 5% lift meaningful?)
- Implementation cost (worth the effort?)
- Realistic expectations (what have past tests shown?)

**Statistical significance (95%)**: Means that if there is no real difference between variants, there is at most a 5% chance of seeing a difference this large from random variation alone (the false positive rate).

**Statistical power (80%)**: Means that if there is a real effect of size MDE, you have an 80% chance of detecting it.
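The four inputs above can be combined into a per-variant sample size. The following is a minimal sketch, assuming a two-sided two-proportion z-test with the normal approximation and unpooled variance. The function name `sample_size_per_variant` and its parameters are hypothetical, chosen for illustration. This is not necessarily the method behind any particular table or calculator, so other tools may return different numbers.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline, relative_mde, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant (hypothetical helper).

    baseline:     current conversion rate, e.g. 0.05 for 5%
    relative_mde: smallest relative lift worth detecting, e.g. 0.10 for 10%
    alpha:        significance level (0.05 corresponds to 95% significance)
    power:        probability of detecting a true effect of size MDE
    """
    p1 = baseline
    p2 = baseline * (1 + relative_mde)  # conversion rate if the lift is real

    # Critical values of the standard normal distribution.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)

    # Assumption: unpooled variance of the difference between two proportions.
    variance = p1 * (1 - p1) + p2 * (1 - p2)

    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return math.ceil(n)
```

For a 5% baseline and a 10% relative lift, `sample_size_per_variant(0.05, 0.10)` comes out at about 31,000 visitors per variant. Halving the MDE roughly quadruples the requirement, because the absolute difference is squared in the denominator.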
---

## Sample Size Quick Reference Tables

### Conversion Rate: 1%

| Lift to Detect | Sample per Variant | Total Sample |
|----------------|-------------------|--------------|
| 5% (1% → 1.05%) | 1,500,000 | 3,000,000 |
| 10% (1% → 1.1%) | 380,000 | 760,000 |
| 20% (1% → 1.2%) | 97,000 | 194,000 |
| 50% (1% → 1.5%) | 16,000 | 32,000 |
| 100% (1% → 2%) | 4,200 | 8,400 |

### Conversion Rate: 3%

| Lift to Detect | Sample per Variant | Total Sample |
|----------------|-------------------|--------------|
| 5% (3% → 3.15%) | 480,000 | 960,000 |
| 10% (3% → 3.3%) | 120,000 | 240,000 |
| 20% (3% → 3.6%) | 31,000 | 62,000 |
| 50% (3% → 4.5%) | 5,200 | 10,400 |
| 100% (3% → 6%) | 1,400 | 2,800 |

### Conversion Rate: 5%

| Lift to Detect | Sample per Variant | Total Sample |
|----------------|-------------------|--------------|
| 5% (5% → 5.25%) | 119,000 | 238,000 |
| 10% (5% → 5.5%) | 30,000 | 60,000 |
| 20% (5% → 6%) | 7,500 | 15,000 |
| 50% (5% → 7.5%) | 1,200 | 2,400 |
| 100% (5% → 10%) | 300 | 600 |

### Conversion Rate: 10%

| Lift to Detect | Sample per Variant | Total Sample |
|----------------|-------------------|--------------|
| 5% (10% → 10.5%) | 130,000 | 260,000 |
| 10% (10% → 11%) | 34,000 | 68,000 |
| 20% (10% → 12%) | 8,700 | 17,400 |
| 50% (10% → 15%) | 1,500 | 3,000 |
| 100% (10% → 20%) | 400 | 800 |

### Conversion Rate: 20%

| Lift to Detect | Sample per Variant | Total Sample |
|----------------|-------------------|--------------|
| 5% (20% → 21%) | 60,000 | 120,000 |
| 10% (20% → 22%) | 16,000 | 32,000 |
| 20% (20% → 24%) | 4,000 | 8,000 |
| 50% (20% → 30%) | 700 | 1,400 |
| 100% (20% → 40%) | 200 | 400 |

---

## Duration Calculator

### Formula

```
Duration (days) = (Sample per variant × Number of variants) / (Daily traffic × % exposed)
```

### Examples

**Scenario 1: High-traffic page**

- Need: 10,000 per variant (2 variants = 20,000 total)
- Daily traffic: 5,000 visitors
- 100% exposed to test
- Duration: 20,000 / 5,000 = **4 days**

**Scenario 2: Medium-traffic page**

- Need: 30,000 per variant (60,000 total)
- Daily traffic: 2,000 visitors
- 100% exposed
- Duration: 60,000 / 2,000 = **30 days**

**Scenario 3: Low-traffic with partial exposure**

- Need: 15,000 per variant (30,000 total)
- Daily traffic: 500 visitors
- 50% exposed to test
- Effective daily: 250
- Duration: 30,000 / 250 = **120 days** (too long!)

### Minimum Duration Rules

Even with sufficient sample size, run tests for at least:

- **1 full week**: To capture day-of-week variation
- **2 business cycles**: If B2B (weekday vs. weekend patterns)
- **Through paydays**: If e-commerce (beginning/end of month)

### Maximum Duration Gu