Validate Data

Name: Validate Data
Author: anthropics

anthropics/knowledge-work-plugins

2.5k installs
23.1k repo stars
Updated July 28, 2026

validate-data reviews analyses for methodology, accuracy, bias, and presentation before they are shared with stakeholders.

About

The validate-data skill reviews analyses for accuracy, methodology, and bias before stakeholder delivery; it is invoked with /validate-data. It accepts documents, notebooks, SQL with results, charts, or methodology descriptions. The workflow examines question framing, data selection, population definition, metric consistency, and fair baselines, then runs a pre-delivery QA checklist covering data quality, calculations, reasonableness, and presentation. It systematically checks for join explosion, survivorship bias, incomplete period comparisons, denominator shifting, average-of-averages traps, timezone mismatches, and selection bias. Spot-checks recalculate key numbers, verify subtotals, and cross-validate against known benchmarks. Visualization review flags truncated axes, inconsistent scales, and misleading chart choices. The output rates the analysis as Ready to share, Share with caveats, or Needs revision, and lists severity-tagged issues, required caveats, and actionable improvements. Teams invoke it before executive presentations, quarterly reviews, or any high-stakes data narrative.

Eight-step workflow from methodology review to confidence assessment.
Pre-delivery checklist for data quality, calculations, and presentation.
Pitfall catalog: join explosion, survivorship bias, denominator shifting.
Spot-checks recalculate metrics and cross-validate against benchmarks.
Three-level verdict: Ready, Share with caveats, or Needs revision.

Validate Data by the numbers

2,530 all-time installs (skills.sh)
+119 installs in the week ending Jul 28, 2026 (Skillselion tracking)
Ranked #37 of 2,066 Data Science & ML skills by installs in the Skillselion catalog
Security screen: MEDIUM risk (skills.sh audit)
Data as of Jul 28, 2026 (Skillselion catalog sync)

At a glance

validate-data capabilities & compatibility

Capabilities: methodology and assumption review with pitfall c · pre delivery qa checklist across data calculatio · calculation spot checks and cross validation tec · visualization honesty and narrative conclusion e · three level confidence assessment with required
Use cases: data analysis · research

From the docs

What validate-data says it does

Run /validate-data before any high-stakes presentation or decision

SKILL.md

A many-to-many join silently multiplies rows, inflating counts and sums.

SKILL.md

Conclusions are supported by the data shown

SKILL.md

npx skills add https://github.com/anthropics/knowledge-work-plugins --skill validate-data


How do I verify my analysis is sound before presenting it to executives or stakeholders?

QA an analysis for methodology, calculation accuracy, visualization honesty, and bias before sharing with stakeholders.

Who is it for?

Analysts reviewing SQL results, dashboards, or reports before high-stakes presentations.

Skip if: you need to run the analysis itself; this skill reviews finished work only.

When should I use this skill?

User runs /validate-data, asks to spot-check calculations, or review analysis before sharing.

What you get

A validation report with assessment level, issues, spot-checks, caveats, and improvement suggestions.

Confidence assessment
Methodology and bias findings
Recommended corrections

Files

SKILL.md (Markdown)

/validate-data - Validate Analysis Before Sharing

If you see unfamiliar placeholders or need to check which tools are connected, see CONNECTORS.md.

Review an analysis for accuracy, methodology, and potential biases before sharing with stakeholders. Generates a confidence assessment and improvement suggestions.

Usage

/validate-data <analysis to review>

The analysis can be:

A document or report in the conversation
A file (markdown, notebook, spreadsheet)
SQL queries and their results
Charts and their underlying data
A description of methodology and findings

Workflow

1. Review Methodology and Assumptions

Examine:

Question framing: Is the analysis answering the right question? Could the question be interpreted differently?
Data selection: Are the right tables/datasets being used? Is the time range appropriate?
Population definition: Is the analysis population correctly defined? Are there unintended exclusions?
Metric definitions: Are metrics defined clearly and consistently? Do they match how stakeholders understand them?
Baseline and comparison: Is the comparison fair? Are time periods, cohort sizes, and contexts comparable?

2. Run the Pre-Delivery QA Checklist

Work through the checklist below: data quality, calculation, reasonableness, and presentation checks.

3. Check for Common Analytical Pitfalls

Systematically review against the detailed pitfall catalog below (join explosion, survivorship bias, incomplete period comparison, denominator shifting, average of averages, timezone mismatches, selection bias).

4. Verify Calculations and Aggregations

Where possible, spot-check:

Recalculate a few key numbers independently
Verify that subtotals sum to totals
Check that percentages sum to 100% (or close to it) where expected
Confirm that YoY/MoM comparisons use the correct base periods
Validate that filters are applied consistently across all metrics

Apply the result sanity-checking techniques below (magnitude checks, cross-validation, red-flag detection).
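The subtotal and percentage spot-checks above can be sketched in a few lines of Python. This is a minimal illustration, not part of the skill: the function names, the tolerances, and the sample figures are all assumptions.

```python
import math

def subtotals_match(parts, total, abs_tol=0.01):
    """Return True when the parts add up to the reported total (within abs_tol)."""
    return math.isclose(math.fsum(parts), total, rel_tol=0.0, abs_tol=abs_tol)

def percentages_sum_to_100(percentages, tolerance=0.5):
    """Return True when segment percentages sum to 100, allowing for rounding."""
    return abs(math.fsum(percentages) - 100.0) <= tolerance

# Hypothetical regional revenue that should roll up to a reported total.
regional = [120_000.00, 85_500.50, 44_499.50]
print(subtotals_match(regional, 250_000.00))        # True
print(percentages_sum_to_100([33.3, 33.3, 33.3]))   # True (99.9 is within tolerance)
print(percentages_sum_to_100([50.0, 40.0]))         # False (a segment is missing)
```

The tolerance is a judgment call: tight enough to catch a missing segment, loose enough to ignore display rounding.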

5. Assess Visualizations

If the analysis includes charts:

Do axes start at appropriate values (zero for bar charts)?
Are scales consistent across comparison charts?
Do chart titles accurately describe what's shown?
Could the visualization mislead a quick reader?
Are there truncated axes, inconsistent intervals, or 3D effects that distort perception?

6. Evaluate Narrative and Conclusions

Review whether:

Conclusions are supported by the data shown
Alternative explanations are acknowledged
Uncertainty is communicated appropriately
Recommendations follow logically from findings
The level of confidence matches the strength of evidence

7. Suggest Improvements

Provide specific, actionable suggestions:

Additional analyses that would strengthen the conclusions
Caveats or limitations that should be noted
Better visualizations or framings for key points
Missing context that stakeholders would want

8. Generate Confidence Assessment

Rate the analysis on a 3-level scale:

Ready to share -- Analysis is methodologically sound, calculations verified, caveats noted. Minor suggestions for improvement but nothing blocking.

Share with noted caveats -- Analysis is largely correct but has specific limitations or assumptions that must be communicated to stakeholders. List the required caveats.

Needs revision -- Found specific errors, methodological issues, or missing analyses that should be addressed before sharing. List the required changes with priority order.

Output Format

## Validation Report

### Overall Assessment: [Ready to share | Share with caveats | Needs revision]

### Methodology Review
[Findings about approach, data selection, definitions]

### Issues Found
1. [Severity: High/Medium/Low] [Issue description and impact]
2. ...

### Calculation Spot-Checks
- [Metric]: [Verified / Discrepancy found]
- ...

### Visualization Review
[Any issues with charts or visual presentation]

### Suggested Improvements
1. [Improvement and why it matters]
2. ...

### Required Caveats for Stakeholders
- [Caveat that must be communicated]
- ...

---

Pre-Delivery QA Checklist

Run through this checklist before sharing any analysis with stakeholders.

Data Quality Checks

[ ] Source verification: Confirmed which tables/data sources were used. Are they the right ones for this question?
[ ] Freshness: Data is current enough for the analysis. Noted the "as of" date.
[ ] Completeness: No unexpected gaps in time series or missing segments.
[ ] Null handling: Checked null rates in key columns. Nulls are handled appropriately (excluded, imputed, or flagged).
[ ] Deduplication: Confirmed no double-counting from bad joins or duplicate source records.
[ ] Filter verification: All WHERE clauses and filters are correct. No unintended exclusions.
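As a rough sketch of the null-handling and deduplication checks, the snippet below measures a null rate and finds repeated keys in a list of row dictionaries. The row shape, column names, and helper names are hypothetical; in practice the same checks would usually run as SQL against the source tables.

```python
from collections import Counter

def null_rate(rows, column):
    """Fraction of rows where `column` is missing or None."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)

def duplicate_keys(rows, key):
    """Return the key values that appear more than once, sorted."""
    counts = Counter(row[key] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)

# Hypothetical extract with one null amount and one duplicated order_id.
rows = [
    {"order_id": 1, "amount": 20.0},
    {"order_id": 2, "amount": None},
    {"order_id": 2, "amount": 35.0},
    {"order_id": 3, "amount": 15.0},
]
print(null_rate(rows, "amount"))          # 0.25
print(duplicate_keys(rows, "order_id"))   # [2]
```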

Calculation Checks

[ ] Aggregation logic: GROUP BY includes all non-aggregated columns. Aggregation level matches the analysis grain.
[ ] Denominator correctness: Rate and percentage calculations use the right denominator. Denominators are non-zero.
[ ] Date alignment: Comparisons use the same time period length. Partial periods are excluded or noted.
[ ] Join correctness: JOIN types are appropriate (INNER vs LEFT). Many-to-many joins haven't inflated counts.
[ ] Metric definitions: Metrics match how stakeholders define them. Any deviations are noted.
[ ] Subtotals sum: Parts add up to the whole where expected. If they don't, explain why (e.g., overlap).

Reasonableness Checks

[ ] Magnitude: Numbers are in a plausible range. Revenue isn't negative. Percentages are between 0-100%.
[ ] Trend continuity: No unexplained jumps or drops in time series.
[ ] Cross-reference: Key numbers match other known sources (dashboards, previous reports, finance data).
[ ] Order of magnitude: Total revenue is in the right ballpark. User counts match known figures.
[ ] Edge cases: What happens at the boundaries? Empty segments, zero-activity periods, new entities.

Presentation Checks

[ ] Chart accuracy: Bar charts start at zero. Axes are labeled. Scales are consistent across panels.
[ ] Number formatting: Appropriate precision. Consistent currency/percentage formatting. Thousands separators where needed.
[ ] Title clarity: Titles state the insight, not just the metric. Date ranges are specified.
[ ] Caveat transparency: Known limitations and assumptions are stated explicitly.
[ ] Reproducibility: Someone else could recreate this analysis from the documentation provided.

Common Data Analysis Pitfalls

Join Explosion

The problem: A many-to-many join silently multiplies rows, inflating counts and sums.

How to detect:

-- Check row count before and after join
SELECT COUNT(*) FROM table_a;  -- 1,000
SELECT COUNT(*) FROM table_a a JOIN table_b b ON a.id = b.a_id;  -- 3,500 (uh oh)

How to prevent:

Always check row counts after joins
If counts increase, investigate the join relationship (is it really 1:1 or 1:many?)
Use COUNT(DISTINCT a.id) instead of COUNT(*) when counting entities through joins
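The row-count check can be reproduced end to end with Python's built-in sqlite3 module. The tables and values below are made up for illustration; the point is only that the joined row count exceeds the base table's count while COUNT(DISTINCT ...) still returns the number of entities.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'a'), (2, 'b');
    INSERT INTO orders VALUES (10, 1, 5.0), (11, 1, 7.0), (12, 2, 3.0);
""")

def scalar(sql):
    """Run a query that returns a single value."""
    return conn.execute(sql).fetchone()[0]

join_sql = "FROM customers c JOIN orders o ON c.id = o.customer_id"
rows_before = scalar("SELECT COUNT(*) FROM customers")
rows_after = scalar(f"SELECT COUNT(*) {join_sql}")
distinct_customers = scalar(f"SELECT COUNT(DISTINCT c.id) {join_sql}")

# The join is 1:many, so the row count grows; the entity count does not.
print(rows_before, rows_after, distinct_customers)  # 2 3 2
```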

Survivorship Bias

The problem: Analyzing only entities that exist today, ignoring those that were deleted, churned, or failed.

Examples:

Analyzing user behavior of "current users" misses churned users
Looking at "companies using our product" ignores those who evaluated and left
Studying properties of "successful" outcomes without "unsuccessful" ones

How to prevent: Ask "who is NOT in this dataset?" before drawing conclusions.

Incomplete Period Comparison

The problem: Comparing a partial period to a full period.

Examples:

"January revenue is $500K vs. December's $800K" -- but January isn't over yet
"This week's signups are down" -- checked on Wednesday, comparing to a full prior week

How to prevent: Always filter to complete periods, or compare same-day-of-month / same-number-of-days.
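One way to apply the same-number-of-days rule is to sum only the first N days of each month being compared. The sketch below uses invented daily revenue and a hypothetical helper name; it assumes the data is keyed by calendar date.

```python
from datetime import date

def sum_first_n_days(daily, year, month, n_days):
    """Sum values for days 1..n_days of the given month."""
    return sum(
        value
        for day, value in daily.items()
        if day.year == year and day.month == month and day.day <= n_days
    )

# Hypothetical daily revenue: a full December, and January through the 10th.
daily = {date(2025, 12, d): 100.0 for d in range(1, 32)}
daily.update({date(2026, 1, d): 110.0 for d in range(1, 11)})

as_of = date(2026, 1, 10)
january_to_date = sum_first_n_days(daily, 2026, 1, as_of.day)
december_same_days = sum_first_n_days(daily, 2025, 12, as_of.day)
december_full = sum_first_n_days(daily, 2025, 12, 31)

# Against the full month January looks far behind; like for like it is ahead.
print(january_to_date, december_same_days, december_full)  # 1100.0 1000.0 3100.0
```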

Denominator Shifting

The problem: The denominator changes between periods, making rates incomparable.

Examples:

Conversion rate improves because you changed how you count "eligible" users
Churn rate changes because the definition of "active" was updated

How to prevent: Use consistent definitions across all compared periods. Note any definition changes.
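To see how much a definition change alone can move a rate, recompute it on the same data under both definitions. The users, the "visits" field, and both eligibility rules below are invented for illustration.

```python
def conversion_rate(users, is_eligible):
    """Converted / eligible, using the supplied eligibility definition."""
    eligible = [u for u in users if is_eligible(u)]
    if not eligible:
        raise ValueError("no eligible users: rate is undefined")
    return sum(u["converted"] for u in eligible) / len(eligible)

def eligible_any_visit(user):
    """Old (hypothetical) definition: everyone who visited at least once."""
    return user["visits"] >= 1

def eligible_repeat_visit(user):
    """New (hypothetical) definition: only users with three or more visits."""
    return user["visits"] >= 3

users = [
    {"visits": 1, "converted": False},
    {"visits": 1, "converted": False},
    {"visits": 3, "converted": True},
    {"visits": 5, "converted": True},
    {"visits": 4, "converted": False},
]

# Same users, same conversions: only the denominator moved.
print(conversion_rate(users, eligible_any_visit))               # 0.4
print(round(conversion_rate(users, eligible_repeat_visit), 3))  # 0.667
```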

Average of Averages

The problem: Averaging pre-computed averages gives wrong results when group sizes differ.

Example:

Group A: 100 users, average revenue $50
Group B: 10 users, average revenue $200
Wrong: Average of averages = ($50 + $200) / 2 = $125
Right: Weighted average = (100 × $50 + 10 × $200) / 110 = $63.64

How to prevent: Always aggregate from raw data. Never average pre-aggregated averages.
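The Group A / Group B example above, worked in Python. The function names are illustrative; each group is a (user_count, average_revenue) pair.

```python
def naive_mean_of_means(groups):
    """Average the group averages, ignoring group size (the trap)."""
    return sum(avg for _, avg in groups) / len(groups)

def weighted_mean(groups):
    """Weight each group average by its user count."""
    total_users = sum(n for n, _ in groups)
    return sum(n * avg for n, avg in groups) / total_users

groups = [(100, 50.0), (10, 200.0)]  # Group A, Group B
print(naive_mean_of_means(groups))       # 125.0
print(round(weighted_mean(groups), 2))   # 63.64
```

When raw rows are available, averaging them directly gives the same answer as the weighted mean and avoids carrying group sizes around.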

Timezone Mismatches

The problem: Different data sources use different timezones, causing misalignment.

Examples:

Event timestamps in UTC vs. user-facing dates in local time
Daily rollups that use different cutoff times

How to prevent: Standardize all timestamps to a single timezone (UTC recommended) before analysis. Document the timezone used.
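A small sketch of why the cutoff matters: the same event can land on different calendar days, or even different months, depending on the timezone used for the rollup. The UTC-8 offset and the timestamp are assumptions chosen for illustration, and to_utc is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

def to_utc(ts):
    """Convert an aware timestamp to UTC; refuse naive ones rather than guess."""
    if ts.tzinfo is None:
        raise ValueError("naive timestamp: source timezone is unknown")
    return ts.astimezone(timezone.utc)

# Hypothetical event recorded in a fixed UTC-8 local time.
local_zone = timezone(timedelta(hours=-8))
event_local = datetime(2026, 1, 31, 20, 30, tzinfo=local_zone)
event_utc = to_utc(event_local)

print(event_local.date())  # 2026-01-31 (counted in January by a local rollup)
print(event_utc.date())    # 2026-02-01 (counted in February by a UTC rollup)
```

Rejecting naive timestamps is a deliberate choice here: silently assuming a timezone is how mismatches get introduced.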

Selection Bias in Segmentation

The problem: Segments are defined by the outcome you're measuring, creating circular logic.

Examples:

"Users who completed onboarding have higher retention" -- obviously, they self-selected
"Power users generate more revenue" -- they became power users BY generating revenue

How to prevent: Define segments based on pre-treatment characteristics, not outcomes.

Other Statistical Traps

Simpson's paradox: Trend reverses when data is aggregated vs. segmented
Correlation presented as causation without supporting evidence
Small sample sizes leading to unreliable conclusions
Outliers disproportionately affecting averages (should medians be used instead?)
Multiple testing / cherry-picking significant results
Look-ahead bias: Using future information to explain past events
Cherry-picked time ranges that favor a particular narrative
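Simpson's paradox is easier to recognize with numbers in hand. In the illustrative counts below (not real data), variant A has the higher success rate in each segment, yet variant B looks better once the segments are pooled, because the two variants have very different segment mixes.

```python
# Illustrative (successes, trials) per segment for two variants.
data = {
    "A": {"segment_1": (81, 87), "segment_2": (192, 263)},
    "B": {"segment_1": (234, 270), "segment_2": (55, 80)},
}

def rate(successes, trials):
    return successes / trials

def overall(variant):
    """Pooled success rate across all segments for one variant."""
    successes = sum(s for s, _ in data[variant].values())
    trials = sum(t for _, t in data[variant].values())
    return rate(successes, trials)

for segment in ("segment_1", "segment_2"):
    a = rate(*data["A"][segment])
    b = rate(*data["B"][segment])
    print(segment, "A better" if a > b else "B better")

print("overall", "A better" if overall("A") > overall("B") else "B better")
```

Running it prints "A better" for both segments and "B better" overall, which is the reversal to look for when a headline number disagrees with its breakdown.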

Result Sanity Checking

Magnitude Checks

For any key number in your analysis, verify it passes the "smell test":

Metric Type	Sanity Check
User counts	Does this match known MAU/DAU figures?
Revenue	Is this in the right order of magnitude vs. known ARR?
Conversion rates	Is this between 0% and 100%? Does it match dashboard figures?
Growth rates	Is 50%+ MoM growth realistic, or is there a data issue?
Averages	Is the average reasonable given what you know about the distribution?
Percentages	Do segment percentages sum to ~100%?

Cross-Validation Techniques

1. Calculate the same metric two different ways and verify they match
2. Spot-check individual records -- pick a few specific entities and trace their data manually
3. Compare to known benchmarks -- match against published dashboards, finance reports, or prior analyses
4. Reverse engineer -- if total revenue is X, does per-user revenue times user count approximately equal X?
5. Boundary checks -- what happens when you filter to a single day, a single user, or a single category? Are those micro-results sensible?
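Two of these techniques, computing a metric two ways and the reverse-engineering check, can be sketched as follows. The order data is invented, reconciles is a hypothetical helper, and the 2% relative tolerance is an assumption rather than a rule from the skill.

```python
import math

def reconciles(total, per_unit, units, rel_tol=0.02):
    """True when per_unit * units is within rel_tol of the reported total."""
    return math.isclose(total, per_unit * units, rel_tol=rel_tol)

# Hypothetical (user, amount) order rows.
orders = [("u1", 40.0), ("u1", 10.0), ("u2", 25.0), ("u3", 25.0)]

# Way 1: sum the order amounts directly.
total_from_orders = sum(amount for _, amount in orders)

# Way 2: roll up to per-user totals first, then sum those.
per_user = {}
for user, amount in orders:
    per_user[user] = per_user.get(user, 0.0) + amount
total_from_users = sum(per_user.values())

# Reverse engineer: revenue per user times user count should recover the total.
revenue_per_user = total_from_users / len(per_user)

print(total_from_orders == total_from_users)                            # True
print(reconciles(total_from_orders, revenue_per_user, len(per_user)))   # True
```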

Red Flags That Warrant Investigation

Any metric that changed by more than 50% period-over-period without an obvious cause
Counts or sums that are exact round numbers (suggests a filter or default value issue)
Rates exactly at 0% or 100% (may indicate incomplete data)
Results that perfectly confirm the hypothesis (reality is usually messier)
Identical values across time periods or segments (suggests the query is ignoring a dimension)
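Some of these red flags are mechanical enough to automate as a first pass. The sketch below is one possible reading of the list: the function name is hypothetical, and treating "round number" as a non-zero multiple of 1,000 is an assumption that should be tuned to the metric's scale. A flag means "investigate", not "wrong".

```python
def red_flags(previous, current, is_rate=False):
    """Return a list of red-flag descriptions for one metric across two periods."""
    flags = []
    # More than 50% change period-over-period.
    if previous != 0 and abs(current - previous) / abs(previous) > 0.5:
        flags.append("changed by more than 50%")
    # Rates pinned at exactly 0% or 100% (rates expressed as fractions).
    if is_rate and current in (0.0, 1.0):
        flags.append("rate exactly 0% or 100%")
    # Suspiciously round counts or sums (assumed: multiples of 1,000).
    if not is_rate and current != 0 and current % 1000 == 0:
        flags.append("exact round number")
    # Identical values across periods.
    if current == previous:
        flags.append("identical to previous period")
    return flags

print(red_flags(4200, 9000))              # large change and a round number
print(red_flags(0.4, 1.0, is_rate=True))  # large change and a rate at 100%
print(red_flags(100, 120))                # []
```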

Documentation Standards for Reproducibility

Analysis Documentation Template

Every non-trivial analysis should include:

## Analysis: [Title]

### Question
[The specific question being answered]

### Data Sources
- Table: [schema.table_name] (as of [date])
- Table: [schema.other_table] (as of [date])
- File: [filename] (source: [where it came from])

### Definitions
- [Metric A]: [Exactly how it's calculated]
- [Segment X]: [Exactly how membership is determined]
- [Time period]: [Start date] to [end date], [timezone]

### Methodology
1. [Step 1 of the analysis approach]
2. [Step 2]
3. [Step 3]

### Assumptions and Limitations
- [Assumption 1 and why it's reasonable]
- [Limitation 1 and its potential impact on conclusions]

### Key Findings
1. [Finding 1 with supporting evidence]
2. [Finding 2 with supporting evidence]

### SQL Queries
[All queries used, with comments]

### Caveats
- [Things the reader should know before acting on this]

Code Documentation

For any code (SQL, Python) that may be reused:

"""
Analysis: Monthly Cohort Retention
Author: [Name]
Date: [Date]
Data Source: events table, users table
Last Validated: [Date] -- results matched dashboard within 2%

Purpose:
    Calculate monthly user retention cohorts based on first activity date.

Assumptions:
    - "Active" means at least one event in the month
    - Excludes test/internal accounts (user_type != 'internal')
    - Uses UTC dates throughout

Output:
    Cohort retention matrix with cohort_month rows and months_since_signup columns.
    Values are retention rates (0-100%).
"""

Version Control for Analyses

Save queries and code in version control (git) or a shared docs system
Note the date of the data snapshot used
If an analysis is re-run with updated data, document what changed and why
Link to prior versions of recurring analyses for trend comparison

Examples

/validate-data Review this quarterly revenue analysis before I send it to the exec team: [analysis]

/validate-data Check my churn analysis -- I'm comparing Q4 churn rates to Q3 but Q4 has a shorter measurement window

/validate-data Here's a SQL query and its results for our conversion funnel. Does the logic look right? [query + results]

Tips

Run /validate-data before any high-stakes presentation or decision
Even quick analyses benefit from a sanity check -- it takes a minute and can save your credibility
If the validation finds issues, fix them and re-validate
Share the validation output alongside your analysis to build stakeholder confidence

How it compares

Use validate-data for pre-share QA of finished analyses; use ETL or modeling skills when creating datasets from scratch.

FAQ

What inputs does validate-data accept?

Documents, notebooks, SQL with results, charts, or methodology and findings descriptions in conversation.

What are the three assessment levels?

Ready to share, Share with noted caveats, or Needs revision with prioritized required changes.

Should I re-validate after fixing issues?

Yes. Fix found issues and re-run validation before the stakeholder presentation.

Is Validate Data safe to install?

skills.sh reports 3 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.
