Data Analysis

Name: Data Analysis
Author: lingzhi227

lingzhi227/agent-research-skills

Run structured multi-round reviews on statistical and data-analysis code before publishing results or shipping research pipelines.

Overview

data-analysis is an agent skill that runs a 4-round structured review of statistical and data-handling code from research pipelines. It is most often used in Ship, and also fits Validate (prototype) and Build (backend).

Install

npx skills add https://github.com/lingzhi227/agent-research-skills --skill data-analysis

What is this skill?

4-round code review system adapted from data-to-paper and AgentLaboratory flows
Round 1: fundamental math/stat flaws, wrong calculations, and trivial tests, in four explicit check categories (flaws, calculations, trivialities, other issues)
Round 2: five data-handling areas (dataset preparation, descriptive statistics, preprocessing, analysis, and statistical tests)
Round 3: per-table review of numeric values, measures of uncertainty, and missing results
Round 4: cross-table completeness and consistency
Designed for hypothesis-testing and analysis coding review, not one-click EDA dashboards

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 898 installs on skills.sh; 114 GitHub stars; 2/3 security scanners passed (skills.sh audits).

What problem does it solve?

Your analysis script may encode silent math errors, bad preprocessing, or the wrong statistical test—and you only discover it after results are published.

Who is it for?

Indie researchers or data-heavy SaaS founders who already have analysis code and need systematic statistical QA before release.

Skip if: you are a beginner who wants automated EDA, charting, or SQL reporting and have no existing analysis code to review.

When should I use this skill?

You have data analysis or hypothesis-testing code and need structured agent prompts to review math, data prep, and statistical test correctness.

What do I get? / Deliverables

You get a staged review checklist applied to your code covering flaws, data handling, preprocessing, and statistical test validity before you ship findings.

Structured review findings across four rounds (code flaws, data handling, per-table review, cross-table completeness)
Explicit lists of calculations and statistical assumptions to fix

Recommended Skills

Paper Context Resolver (lllllllama/ai-paper-reproduction-skill)

Optional helper-tier skill that supplements README-guided deep learning reproduction by resolving specific paper details… (140k installs · 412 stars)

Repo Intake And Plan (lllllllama/ai-paper-reproduction-skill)

Rigor Intake scans repository docs and layout to classify documented commands and propose a minimal reproduction plan fo… (140k installs · 412 stars)

Env And Assets Bootstrap (lllllllama/ai-paper-reproduction-skill)

Rigor Setup establishes conservative environment and asset assumptions aligned with README and config evidence before ex… (140k installs · 412 stars)

Minimal Run And Audit (lllllllama/ai-paper-reproduction-skill)

RigorPilot executes the selected minimal reproduction command and produces normalized, auditable run evidence for paper … (140k installs · 412 stars)

Analyze Project (lllllllama/rigorpilot-skills)

analyze-project is a read-only agent skill from the RigorPilot family aimed at solo builders and small teams inheriting … (32.3k installs · 412 stars)

Ai Research Reproduction (lllllllama/rigorpilot-skills)

ai-research-reproduction is the RigorPilot Reproduce orchestrator for solo builders and small teams who need to rerun a … (32.3k installs · 412 stars)

Journey fit

Spans multiple journey phases - primary shelf plus alternate fits below.

Primary fit

Ship → review is the canonical shelf because the skill is a code-review system for analysis scripts, not primary exploratory research. Four explicit review rounds target code flaws, data handling, per-table results, and cross-table completeness, which is classic pre-merge or pre-publication review work.

Also useful

Validate: Prototype & spike

Build: Backend, data & payments

Where it fits

Ship: Code review

Example use: Run Round 1–2 prompts on notebook code before merging results into a report.

Validate: Prototype & spike

Example use: Sanity-check a quick experiment script for trivial significance tests and unit mistakes.

Build: Backend, data & payments

Example use: Review batch analytics job code for missing-value handling and test selection.

Operate: Iteration & experiments

Example use: Re-audit analysis after metric definitions change in production logging.

How it compares

A checker-style review rubric for analysis code—not a notebook generator or BI dashboard integration.

Common Questions / FAQ

Who is data-analysis for?

Builders and researchers with draft Python or statistical analysis code who need rigorous multi-pass review before papers, experiments, or production metrics go live.

When should I use data-analysis?

In Ship during code review of analysis modules; in Validate when hardening a prototype notebook; and in Build backend when implementing hypothesis tests in pipeline code.

Is data-analysis safe to install?

Review involves reading your analysis source; check the Security Audits panel on this Prism page and avoid sending sensitive raw datasets to agents you do not trust.

SKILL.md


# Data Analysis Review Prompts

Extracted from data-to-paper (hypothesis_testing/coding/analysis/coding.py) and AgentLaboratory.

## 4-Round Code Review System (data-to-paper)

### Round 1: Fundamental Code Flaws

```
### CHECK FOR FUNDAMENTAL FLAWS:
Check for any fundamental mathematical or statistical flaws in the code.

### CHECK FOR WRONG CALCULATIONS:
Explicitly list all key calculations and assess them.

### CHECK FOR MATH TRIVIALITIES:
Check for any mathematically trivial assessments / statistical tests.
For example, testing whether a value is different from zero when it is
defined as a sum of positive values.

### OTHER ISSUES:
Any other issues you find in the code.
```
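The "math trivialities" check in Round 1 can be illustrated with a small sketch. The data and the helper names (`one_sample_t`, `is_trivially_nonzero`) are hypothetical and not part of data-to-paper; the point is only that a total built from strictly positive terms will always test as different from zero, so the test carries no information.

```python
import math
import statistics


def one_sample_t(values, popmean=0.0):
    """Return the one-sample t statistic for H0: mean(values) == popmean."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    return (mean - popmean) / (sd / math.sqrt(n))


def is_trivially_nonzero(components_per_row):
    """True when every row total is a sum of strictly positive terms."""
    return all(all(x > 0 for x in row) for row in components_per_row)


# Hypothetical data: each row holds strictly positive component scores.
rows = [[1.2, 0.8, 2.1], [0.5, 1.1, 0.9], [2.0, 1.5, 0.7], [0.9, 1.3, 1.8]]
totals = [sum(row) for row in rows]

t_stat = one_sample_t(totals)          # large and positive by construction
trivial = is_trivially_nonzero(rows)   # the reviewer should flag this test

if trivial:
    print(f"t = {t_stat:.2f}, but the test is trivial: totals are sums of positives")
```

A reviewer following the prompt would ask for a more meaningful null hypothesis here, for example a comparison against a reference value or between groups.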

### Round 2: Data Handling Issues

```
### DATASET PREPARATIONS:
- Missing values: Are missing values handled correctly?
- Units: Are units consistent and correctly converted?
- Data restriction: Is data appropriately filtered/restricted?

### DESCRIPTIVE STATISTICS:
Check for issues in descriptive statistics calculations.

### PREPROCESSING:
Review data preprocessing steps:
- Normalization / standardization
- Feature encoding
- Train/test split methodology

### ANALYSIS:
Check data analysis issues:
- Correct statistical test selection
- Assumptions met (normality, independence, etc.)
- Multiple comparisons correction

### STATISTICAL TESTS:
Check choice and implementation of statistical tests:
- Is the test appropriate for the data type?
- Are assumptions validated?
- Are p-values correctly computed and interpreted?
```
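As one possible illustration of the "multiple comparisons correction" item in Round 2, the sketch below applies the Holm step-down adjustment to a list of p-values. The prompt does not name a specific correction method, so Holm is an assumption chosen for the example, and the p-values are made up.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from four tests run on the same dataset.
raw = [0.01, 0.04, 0.03, 0.005]
adj = holm_adjust(raw)
significant = [p <= 0.05 for p in adj]
```

All four raw p-values fall below 0.05, but only two remain below that threshold after adjustment, which is the kind of discrepancy this review round is meant to surface.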

### Round 3: Per-Table Individual Review

```
### SENSIBLE NUMERIC VALUES:
Check each numeric value in the table:
- Are values within expected ranges?
- Do percentages sum to 100% where expected?
- Are decimal places appropriate?

### MEASURES OF UNCERTAINTY:
Does the table report measures of uncertainty?
- p-values for statistical tests
- Confidence intervals for estimates
- Standard deviations for means

### MISSING DATA:
Are we missing key variables or important results?

### OTHER ISSUES:
Any other issues you find in the table.
```

*Note: This round runs individually for each output file (`df_*.pkl`).*
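The "sensible numeric values" checks in Round 3 can be partly automated. The following is a minimal sketch that uses a list of dicts as a stand-in for a loaded table; the function `check_table`, the column names, and the tolerance are all hypothetical.

```python
import math


def check_table(rows, pct_column=None, ranges=None, tol=0.5):
    """Return a list of human-readable issues found in a table.

    rows: list of dicts, one per table row (stand-in for a DataFrame).
    pct_column: column whose values should sum to 100, if any.
    ranges: mapping of column -> (low, high) inclusive bounds.
    """
    issues = []
    for col, (low, high) in (ranges or {}).items():
        for i, row in enumerate(rows):
            value = row[col]
            if math.isnan(value) or not (low <= value <= high):
                issues.append(f"row {i}: {col}={value} outside [{low}, {high}]")
    if pct_column is not None:
        total = sum(row[pct_column] for row in rows)
        if abs(total - 100.0) > tol:
            issues.append(f"{pct_column} sums to {total:.1f}, expected 100")
    return issues


# Hypothetical table with two deliberate problems.
table = [
    {"group": "A", "share_pct": 45.0, "p_value": 0.03},
    {"group": "B", "share_pct": 35.0, "p_value": 1.20},  # impossible p-value
    {"group": "C", "share_pct": 15.0, "p_value": 0.50},
]
issues = check_table(table, pct_column="share_pct", ranges={"p_value": (0.0, 1.0)})
for issue in issues:
    print(issue)
```

Checks like these complement the prompt rather than replace it: whether a value is plausible for the domain still needs a reviewer's judgment.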

### Round 4: Cross-Table Completeness

```
### COMPLETENESS OF TABLES:
Does the code create all needed results for the hypothesis testing plan?

### CONSISTENCY ACROSS TABLES:
Are tables consistent in:
- Variable naming conventions
- Measures of uncertainty reported
- Decimal precision
- Statistical test choices

### MISSING DATA:
Are we missing key variables or measures of uncertainty
that should be reported for a complete analysis?
```
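Two of the Round 4 consistency checks, uncertainty reporting and variable naming, lend themselves to a quick mechanical pass. This sketch assumes each table is represented only by its column names; the table names, the column names, and the set `UNCERTAINTY_COLUMNS` are hypothetical.

```python
# Hypothetical column names that count as a measure of uncertainty.
UNCERTAINTY_COLUMNS = {"p_value", "ci_low", "ci_high", "std"}


def missing_uncertainty(tables):
    """Return the names of tables that report no uncertainty column."""
    return sorted(
        name for name, columns in tables.items()
        if not UNCERTAINTY_COLUMNS & set(columns)
    )


def inconsistent_names(tables):
    """Return columns whose spelling differs only by case or separators."""
    seen = {}
    for columns in tables.values():
        for col in columns:
            key = col.lower().replace("-", "_").replace(" ", "_")
            seen.setdefault(key, set()).add(col)
    return {key: sorted(variants) for key, variants in seen.items() if len(variants) > 1}


# Hypothetical column listings for three output tables.
tables = {
    "df_0": ["group", "mean_age", "std"],
    "df_1": ["group", "Mean Age"],
    "df_2": ["predictor", "coef", "ci_low", "ci_high", "p_value"],
}

no_uncertainty = missing_uncertainty(tables)
name_clashes = inconsistent_names(tables)
```

Here `df_1` would be flagged for reporting a mean without any uncertainty measure, and `mean_age` versus `Mean Age` would be flagged as a naming inconsistency.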

## Allowed Packages Whitelist (data-to-paper)

```python
ALLOWED_PACKAGES = [
    'pandas',
    'numpy',
    'scipy',
    'statsmodels',
    'sklearn',
    'pingouin',     # For ANOVA and post-hoc tests
    'matplotlib',   # For diagnostic plots only
]
```
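A whitelist like this can be enforced before the review starts. The sketch below is an assumption about how one might do that, not how data-to-paper itself does it: it parses a script with the standard-library `ast` module and reports top-level packages that are not on the list. The function name `disallowed_imports` and the sample script are hypothetical.

```python
import ast

ALLOWED_PACKAGES = [
    'pandas', 'numpy', 'scipy', 'statsmodels',
    'sklearn', 'pingouin', 'matplotlib',
]


def disallowed_imports(source, allowed=ALLOWED_PACKAGES):
    """Return sorted top-level package names imported outside the whitelist."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import scipy.stats as stats" -> "scipy"
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # "from sklearn.linear_model import X" -> "sklearn"
            found.add(node.module.split(".")[0])
    return sorted(found - set(allowed))


# Hypothetical analysis script to check.
script = """
import pandas as pd
import scipy.stats as stats
from sklearn.linear_model import LogisticRegression
import requests
from torch import nn
"""

bad = disallowed_imports(script)
print(bad)
```

Note that this strict version also flags standard-library modules such as `os`; extend the `allowed` argument if those should pass.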

## Statistical Test Selection Guide

```
Select the appropriate statistical test based on:

| Data type and groups | Design | Test |
|----------------------|--------|------|
| Continuous, normal, 2 groups | Independent | Independent t-test |
| Continuous, normal, 2 groups | Paired | Paired t-test |
| Continuous, non-normal, 2 groups | Independent | Mann-Whitney U |
| Continuous, normal, 3+ groups | Independent | One-way ANOVA |
| Continuous, non-normal, 3+ groups | Independent | Kruskal-Wallis |
| Categorical, 2 variables | Independent | Chi-square test |
| Continuous, 2 variables | Correlation | Pearson/Spearman |
| Binary outcome | Multiple predictors | Logistic regression |

Always check assumptions before applying parametric tests:
1. Normality (Shapiro-Wilk test)
2. Homogeneity of variance (Levene's test)
3. Independence of observations
```
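The selection guide above can also be encoded as a lookup so that a reviewer, or a script, can compare the test used in the code with the test the guide suggests. This is a minimal sketch: the dictionary mirrors the eight rows of the table, while the names `TEST_GUIDE` and `suggest_test` are hypothetical.

```python
# Each key is (first column, second column) of the guide, lower-cased.
TEST_GUIDE = {
    ("continuous, normal, 2 groups", "independent"): "Independent t-test",
    ("continuous, normal, 2 groups", "paired"): "Paired t-test",
    ("continuous, non-normal, 2 groups", "independent"): "Mann-Whitney U",
    ("continuous, normal, 3+ groups", "independent"): "One-way ANOVA",
    ("continuous, non-normal, 3+ groups", "independent"): "Kruskal-Wallis",
    ("categorical, 2 variables", "independent"): "Chi-square test",
    ("continuous, 2 variables", "correlation"): "Pearson/Spearman",
    ("binary outcome", "multiple predictors"): "Logistic regression",
}


def suggest_test(data_type, design):
    """Return the test the guide lists for this data type and design."""
    key = (data_type.strip().lower(), design.strip().lower())
    if key not in TEST_GUIDE:
        # The guide is not exhaustive; unknown combinations need human judgment.
        raise KeyError(f"No row in the guide for {key!r}")
    return TEST_GUIDE[key]


print(suggest_test("Continuous, non-normal, 2 groups", "Independent"))
```

Combinations the guide does not cover, such as paired non-normal data, raise a `KeyError` rather than guessing, which keeps the lookup honest about the limits of the table.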

## Results Interpretation Dialogue (AgentLaboratory)

```
Postdoc guides PhD to extract insights:

1. "What are the key findings from the results?"
2. "Are there any surprising or unexpected results?"
3. "How do results compare to baselines?"
4. "What is the statistical significance of improvements?"
5. "Are there any failur…
```
