Peer Review

Name: Peer Review
Author: davila7

davila7/claude-code-templates

584 installs
29.9k repo stars
Updated July 27, 2026

peer-review is a Claude Code skill that systematically catches statistical, methodological, and reporting flaws in research-heavy code, docs, or manuscripts for developers who must validate scientific work before finalizing or publishing.

About

Peer-review is an agent skill that applies a standardized checklist of common methodological and statistical pitfalls to any research-oriented output. It scans for p-value misuse, inappropriate test selection, missing effect sizes, data dredging, poor visualization practices, and other frequent errors that undermine credibility. The skill returns specific, constructive feedback with recommended fixes such as reporting confidence intervals, applying multiple-testing corrections, and using equivalence tests. It is ideal for solo builders producing data-driven features, academic-style documentation, or scientific manuscripts who want to eliminate sloppy statistical claims before they reach users, collaborators, or publication.

Comprehensive catalog of statistical issues including p-hacking, multiple testing problems, and misinterpretation of significance
Structured guidance on inappropriate statistical tests, missing controls, and data visualization errors
Actionable recommendations with concrete fixes such as reporting effect sizes, applying corrections, and using pre-registration
Organized by category for rapid reference during peer-style reviews of scientific or data-driven work
Serves as a hard-gate checklist before releasing research-backed features or publications

Peer Review by the numbers

584 all-time installs (skills.sh)
+10 installs in the week ending Jun 27, 2026 (Skillselion tracking)
Ranked #215 of 1,382 Code Review & Quality skills by installs in the Skillselion catalog
Security screen: LOW risk (skills.sh audit)
Data as of Jul 28, 2026 (Skillselion catalog sync)

npx skills add https://github.com/davila7/claude-code-templates --skill peer-review


Security audit	3 / 3 scanners passed

How do you catch statistical flaws before publishing research?

Systematically catch statistical, methodological, and reporting flaws in research-heavy code, docs, or manuscripts before finalizing or publishing.

Who is it for?

Developers and data scientists preparing research notebooks, technical manuscripts, or analysis docs that require rigorous statistical and methodological review.

Skip if: Your team needs standard pull-request code review for application features without statistical or scientific reporting requirements.

When should I use this skill?

The user wants peer review of research-heavy code, manuscripts, or docs for statistical misuse, methodology gaps, or reporting flaws before publishing.

What you get

Peer-review feedback on statistical, methodological, and reporting issues with category-organized flaw annotations.

categorized peer-review feedback
methodological flaw annotations

Files

SKILL.md

Scientific Critical Evaluation and Peer Review

Overview

Peer review is a systematic process for evaluating scientific manuscripts. Assess methodology, statistics, design, reproducibility, ethics, and reporting standards. Apply this skill for manuscript and grant review across disciplines with constructive, rigorous evaluation.

When to Use This Skill

This skill should be used when:

Conducting peer review of scientific manuscripts for journals
Evaluating grant proposals and research applications
Assessing methodology and experimental design rigor
Reviewing statistical analyses and reporting standards
Evaluating reproducibility and data availability
Checking compliance with reporting guidelines (CONSORT, STROBE, PRISMA)
Providing constructive feedback on scientific writing

Visual Enhancement with Scientific Schematics

When creating documents with this skill, always consider adding scientific diagrams and schematics to enhance visual communication.

If your document does not already contain schematics or diagrams:

Use the scientific-schematics skill to generate AI-powered publication-quality diagrams
Simply describe your desired diagram in natural language
Nano Banana Pro will automatically generate, review, and refine the schematic

For new documents: Scientific schematics should be generated by default to visually represent key concepts, workflows, architectures, or relationships described in the text.

How to generate schematics:

python scripts/generate_schematic.py "your diagram description" -o figures/output.png

The AI will automatically:

Create publication-quality images with proper formatting
Review and refine through multiple iterations
Ensure accessibility (colorblind-friendly, high contrast)
Save outputs in the figures/ directory

When to add schematics:

Peer review workflow diagrams
Evaluation criteria decision trees
Review process flowcharts
Methodology assessment frameworks
Quality assessment visualizations
Reporting guidelines compliance diagrams
Any complex concept that benefits from visualization

For detailed guidance on creating schematics, refer to the scientific-schematics skill documentation.

---

Peer Review Workflow

Conduct peer review systematically through the following stages, adapting depth and focus based on the manuscript type and discipline.

Stage 1: Initial Assessment

Begin with a high-level evaluation to determine the manuscript's scope, novelty, and overall quality.

Key Questions:

What is the central research question or hypothesis?
What are the main findings and conclusions?
Is the work scientifically sound and significant?
Is the work appropriate for the intended venue?
Are there any immediate major flaws that would preclude publication?

Output: Brief summary (2-3 sentences) capturing the manuscript's essence and initial impression.

Stage 2: Detailed Section-by-Section Review

Conduct a thorough evaluation of each manuscript section, documenting specific concerns and strengths.

Abstract and Title

Accuracy: Does the abstract accurately reflect the study's content and conclusions?
Clarity: Is the title specific, accurate, and informative?
Completeness: Are key findings and methods summarized appropriately?
Accessibility: Is the abstract comprehensible to a broad scientific audience?

Introduction

Context: Is the background information adequate and current?
Rationale: Is the research question clearly motivated and justified?
Novelty: Is the work's originality and significance clearly articulated?
Literature: Are relevant prior studies appropriately cited?
Objectives: Are research aims/hypotheses clearly stated?

Methods

Reproducibility: Can another researcher replicate the study from the description provided?
Rigor: Are the methods appropriate for addressing the research questions?
Detail: Are protocols, reagents, equipment, and parameters sufficiently described?
Ethics: Are ethical approvals, consent, and data handling properly documented?
Statistics: Are statistical methods appropriate, clearly described, and justified?
Validation: Are controls, replicates, and validation approaches adequate?

Critical elements to verify:

Sample sizes and power calculations
Randomization and blinding procedures
Inclusion/exclusion criteria
Data collection protocols
Computational methods and software versions
Statistical tests and correction for multiple comparisons

Results

Presentation: Are results presented logically and clearly?
Figures/Tables: Are visualizations appropriate, clear, and properly labeled?
Statistics: Are statistical results properly reported (effect sizes, confidence intervals, p-values)?
Objectivity: Are results presented without over-interpretation?
Completeness: Are all relevant results included, including negative results?
Reproducibility: Are raw data or summary statistics provided?

Common issues to identify:

Selective reporting of results
Inappropriate statistical tests
Missing error bars or measures of variability
Over-fitting or circular analysis
Batch effects or confounding variables
Missing controls or validation experiments

Discussion

Interpretation: Are conclusions supported by the data?
Limitations: Are study limitations acknowledged and discussed?
Context: Are findings placed appropriately within existing literature?
Speculation: Is speculation clearly distinguished from data-supported conclusions?
Significance: Are implications and importance clearly articulated?
Future directions: Are next steps or unanswered questions discussed?

Red flags:

Overstated conclusions
Ignoring contradictory evidence
Causal claims from correlational data
Inadequate discussion of limitations
Mechanistic claims without mechanistic evidence

References

Completeness: Are key relevant papers cited?
Currency: Are recent important studies included?
Balance: Are contrary viewpoints appropriately cited?
Accuracy: Are citations accurate and appropriate?
Self-citation: Is there excessive or inappropriate self-citation?

Stage 3: Methodological and Statistical Rigor

Evaluate the technical quality and rigor of the research with particular attention to common pitfalls.

Statistical Assessment:

Are statistical assumptions met (normality, independence, homoscedasticity)?
Are effect sizes reported alongside p-values?
Is multiple testing correction applied appropriately?
Are confidence intervals provided?
Is sample size justified with power analysis?
Are parametric vs. non-parametric tests chosen appropriately?
Are missing data handled properly?
Are exploratory vs. confirmatory analyses distinguished?

Experimental Design:

Are controls appropriate and adequate?
Is replication sufficient (biological and technical)?
Are potential confounders identified and controlled?
Is randomization properly implemented?
Are blinding procedures adequate?
Is the experimental design optimal for the research question?

Computational/Bioinformatics:

Are computational methods clearly described and justified?
Are software versions and parameters documented?
Is code made available for reproducibility?
Are algorithms and models validated appropriately?
Are assumptions of computational methods met?
Is batch correction applied appropriately?

Stage 4: Reproducibility and Transparency

Assess whether the research meets modern standards for reproducibility and open science.

Data Availability:

Are raw data deposited in appropriate repositories?
Are accession numbers provided for public databases?
Are data sharing restrictions justified (e.g., patient privacy)?
Are data formats standard and accessible?

Code and Materials:

Is analysis code made available (GitHub, Zenodo, etc.)?
Are unique materials available or described sufficiently for recreation?
Are protocols detailed in sufficient depth?

Reporting Standards:

Does the manuscript follow discipline-specific reporting guidelines (CONSORT, PRISMA, ARRIVE, MIAME, MINSEQE, etc.)?
See references/reporting_standards.md for common guidelines
Are all elements of the appropriate checklist addressed?

Stage 5: Figure and Data Presentation

Evaluate the quality, clarity, and integrity of data visualization.

Quality Checks:

Are figures high resolution and clearly labeled?
Are axes properly labeled with units?
Are error bars defined (SD, SEM, CI)?
Are statistical significance indicators explained?
Are color schemes appropriate and accessible (colorblind-friendly)?
Are scale bars included for images?
Is data visualization appropriate for the data type?
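
The error-bar check above depends on knowing how SD, SEM, and a confidence interval differ. The following is a minimal sketch, not part of the skill itself: the function name `describe_variability` is hypothetical, and the interval uses a normal approximation (z = 1.96), which is an assumption that becomes loose for small samples where a t quantile is more appropriate.

```python
import math
import statistics


def describe_variability(values, z=1.96):
    """Return mean, sample SD, SEM, and an approximate 95% CI for the mean.

    Assumption: the CI uses a normal quantile (z); for small n a t quantile
    would give a wider, more appropriate interval.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)      # sample SD (n - 1 denominator)
    sem = sd / math.sqrt(n)            # standard error of the mean
    half_width = z * sem
    return {
        "mean": mean,
        "sd": sd,
        "sem": sem,
        "ci": (mean - half_width, mean + half_width),
    }


summary = describe_variability([2, 4, 4, 4, 5, 5, 7, 9])
print(summary)
```

Because SEM shrinks with sample size while SD does not, a figure legend that leaves the error-bar type undefined can make the same data look far more or less precise than it is.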

Integrity Checks:

Are there signs of image manipulation (duplications, splicing)?
Are Western blots and gels appropriately presented?
Are representative images truly representative?
Are all conditions shown (no selective presentation)?

Clarity:

Can figures stand alone with their legends?
Is the message of each figure immediately clear?
Are there redundant figures or panels?
Would data be better presented as tables or figures?

Stage 6: Ethical Considerations

Verify that the research meets ethical standards and guidelines.

Human Subjects:

Is IRB/ethics approval documented?
Is informed consent described?
Are vulnerable populations appropriately protected?
Is patient privacy adequately protected?
Are potential conflicts of interest disclosed?

Animal Research:

Is IACUC or equivalent approval documented?
Are procedures humane and justified?
Are the 3Rs (replacement, reduction, refinement) considered?
Are euthanasia methods appropriate?

Research Integrity:

Are there concerns about data fabrication or falsification?
Is authorship appropriate and justified?
Are competing interests disclosed?
Is funding source disclosed?
Are there concerns about plagiarism or duplicate publication?

Stage 7: Writing Quality and Clarity

Assess the manuscript's clarity, organization, and accessibility.

Structure and Organization:

Is the manuscript logically organized?
Do sections flow coherently?
Are transitions between ideas clear?
Is the narrative compelling and clear?

Writing Quality:

Is the language clear, precise, and concise?
Are jargon and acronyms minimized and defined?
Are grammar and spelling correct?
Are sentences unnecessarily complex?
Is the passive voice overused?

Accessibility:

Can a non-specialist understand the main findings?
Are technical terms explained?
Is the significance clear to a broad audience?

Structuring Peer Review Reports

Organize feedback in a hierarchical structure that prioritizes issues and provides actionable guidance.

Summary Statement

Provide a concise overall assessment (1-2 paragraphs):

Brief synopsis of the research
Overall recommendation (accept, minor revisions, major revisions, reject)
Key strengths (2-3 bullet points)
Key weaknesses (2-3 bullet points)
Bottom-line assessment of significance and soundness

Major Comments

List critical issues that significantly impact the manuscript's validity, interpretability, or significance. Number these sequentially for easy reference.

Major comments typically include:

Fundamental methodological flaws
Inappropriate statistical analyses
Unsupported or overstated conclusions
Missing critical controls or experiments
Serious reproducibility concerns
Major gaps in literature coverage
Ethical concerns

For each major comment:

1. Clearly state the issue
2. Explain why it's problematic
3. Suggest specific solutions or additional experiments
4. Indicate if addressing it is essential for publication

Minor Comments

List less critical issues that would improve clarity, completeness, or presentation. Number these sequentially.

Minor comments typically include:

Unclear figure labels or legends
Missing methodological details
Typographical or grammatical errors
Suggestions for improved data presentation
Minor statistical reporting issues
Supplementary analyses that would strengthen conclusions
Requests for clarification

For each minor comment:

1. Identify the specific location (section, paragraph, figure)
2. State the issue clearly
3. Suggest how to address it

Specific Line-by-Line Comments (Optional)

For manuscripts requiring detailed feedback, provide section-specific or line-by-line comments:

Reference specific page/line numbers or sections
Note factual errors, unclear statements, or missing citations
Suggest specific edits for clarity

Questions for Authors

List specific questions that need clarification:

Methodological details that are unclear
Seemingly contradictory results
Missing information needed to evaluate the work
Requests for additional data or analyses

Tone and Approach

Maintain a constructive, professional, and collegial tone throughout the review.

Best Practices:

Be constructive: Frame criticism as opportunities for improvement
Be specific: Provide concrete examples and actionable suggestions
Be balanced: Acknowledge strengths as well as weaknesses
Be respectful: Remember that authors have invested significant effort
Be objective: Focus on the science, not the scientists
Be thorough: Don't overlook issues, but prioritize appropriately
Be clear: Avoid ambiguous or vague criticism

Avoid:

Personal attacks or dismissive language
Sarcasm or condescension
Vague criticism without specific examples
Requesting unnecessary experiments beyond the scope
Demanding adherence to personal preferences rather than best practices
Revealing your identity if reviewing is double-blind

Special Considerations by Manuscript Type

Original Research Articles

Emphasize rigor, reproducibility, and novelty
Assess significance and impact
Verify that conclusions are data-driven
Check for complete methods and appropriate controls

Reviews and Meta-Analyses

Evaluate comprehensiveness of literature coverage
Assess search strategy and inclusion/exclusion criteria
Verify systematic approach and lack of bias
Check for critical analysis vs. mere summarization
For meta-analyses, evaluate statistical approach and heterogeneity

Methods Papers

Emphasize validation and comparison to existing methods
Assess reproducibility and availability of protocols/code
Evaluate improvements over existing approaches
Check for sufficient detail for implementation

Short Reports/Letters

Adapt expectations for brevity
Ensure core findings are still rigorous and significant
Verify that format is appropriate for findings

Preprints

Recognize that these have not undergone formal peer review
May be less polished than journal submissions
Still apply rigorous standards for scientific validity
Consider providing constructive feedback to help authors improve before journal submission

Presentations and Slide Decks

⚠️ CRITICAL: For presentations, NEVER read the PDF directly. ALWAYS convert to images first.

When reviewing scientific presentations (PowerPoint, Beamer, slide decks):

Mandatory Image-Based Review Workflow

NEVER attempt to read presentation PDFs directly - this causes buffer overflow errors and doesn't show visual formatting issues.

Required Process:

1. Convert PDF to images using Python:

   python skills/scientific-slides/scripts/pdf_to_images.py presentation.pdf review/slide --dpi 150
   # Creates: review/slide-001.jpg, review/slide-002.jpg, etc.

2. Read and inspect EACH slide image file sequentially
3. Document issues with specific slide numbers
4. Provide feedback on visual formatting and content

Print when starting review:

[HH:MM:SS] PEER REVIEW: Presentation detected - converting to images for review
[HH:MM:SS] PDF REVIEW: NEVER reading PDF directly - using image-based inspection
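
The status lines above follow a `[HH:MM:SS] STAGE: message` pattern. One way to produce them consistently is sketched below. The helper name `review_log` and its `now` parameter are hypothetical and are not defined by the skill.

```python
from datetime import datetime


def review_log(stage, message, now=None):
    """Format a status line as '[HH:MM:SS] STAGE: message'.

    `now` is an optional datetime, included only so the output can be
    made deterministic; by default the current local time is used.
    """
    stamp = (now or datetime.now()).strftime("%H:%M:%S")
    return f"[{stamp}] {stage}: {message}"


print(review_log("PEER REVIEW", "Presentation detected - converting to images for review"))
print(review_log("PDF REVIEW", "NEVER reading PDF directly - using image-based inspection"))
```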

Presentation-Specific Evaluation Criteria

Visual Design and Readability:

[ ] Text is large enough (minimum 18pt, ideally 24pt+ for body text)
[ ] High contrast between text and background (4.5:1 minimum, 7:1 preferred)
[ ] Color scheme is professional and colorblind-accessible
[ ] Consistent visual design across all slides
[ ] White space is adequate (not cramped)
[ ] Fonts are clear and professional
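
The contrast thresholds in the checklist above (4.5:1 and 7:1) can be checked numerically. The sketch below assumes those ratios refer to the WCAG 2.x contrast-ratio definition, which the skill does not state explicitly. The function names are hypothetical, and colors are assumed to be 8-bit sRGB tuples.

```python
def relative_luminance(rgb):
    """Relative luminance of an 8-bit sRGB color, per the WCAG 2.x formula."""
    def linearize(channel):
        c = channel / 255.0
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """Contrast ratio between two colors, from 1.0 (identical) up to about 21."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def meets_minimum(foreground, background, minimum=4.5):
    """True if the pair reaches the given minimum ratio (4.5 or 7.0 in the checklist)."""
    return contrast_ratio(foreground, background) >= minimum


# Example: dark gray text on a white slide background
print(round(contrast_ratio((51, 51, 51), (255, 255, 255)), 2))
```

Colors would need to be sampled from the slide images by hand or with an image library. That step is outside this sketch.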

Layout and Formatting (Check EVERY Slide Image):

[ ] No text overflow or truncation at slide edges
[ ] No element overlaps (text over images, overlapping shapes)
[ ] Titles are consistently positioned
[ ] Content is properly aligned
[ ] Bullets and text are not cut off
[ ] Figures fit within slide boundaries
[ ] Captions and labels are visible and readable

Content Quality:

[ ] One main idea per slide (not overloaded)
[ ] Minimal text (3-6 bullets per slide maximum)
[ ] Bullet points are concise (5-7 words each)
[ ] Figures are simplified and clear (not copy-pasted from papers)
[ ] Data visualizations have large, readable labels
[ ] Citations are present and properly formatted
[ ] Results/data slides dominate the presentation (40-50% of content)

Structure and Flow:

[ ] Clear narrative arc (introduction → methods → results → discussion)
[ ] Logical progression between slides
[ ] Slide count appropriate for talk duration (~1 slide per minute)
[ ] Title slide includes authors, affiliation, date
[ ] Introduction cites relevant background literature (3-5 papers)
[ ] Discussion cites comparison papers (3-5 papers)
[ ] Conclusions slide summarizes key findings
[ ] Acknowledgments/funding slide at end

Scientific Content:

[ ] Research question clearly stated
[ ] Methods adequately summarized (not excessive detail)
[ ] Results presented logically with clear visualizations
[ ] Statistical significance indicated appropriately
[ ] Conclusions supported by data shown
[ ] Limitations acknowledged where appropriate
[ ] Future directions or broader impact discussed

Common Presentation Issues to Flag:

Critical Issues (Must Fix):

Text overflow making content unreadable
Font sizes too small (<18pt)
Element overlaps obscuring data
Insufficient contrast (text hard to read)
Figures too complex or illegible
No citations (completely unsupported claims)
Slide count drastically mismatched to duration

Major Issues (Should Fix):

Inconsistent design across slides
Too much text (walls of text, not bullets)
Poorly simplified figures (axis labels too small)
Cramped layout with insufficient white space
Missing key structural elements (no conclusion slide)
Poor color choices (not colorblind-safe)
Minimal results content (<30% of slides)

Minor Issues (Suggestions for Improvement):

Could use more visuals/diagrams
Some slides slightly text-heavy
Minor alignment inconsistencies
Could benefit from more white space
Additional citations would strengthen claims
Color scheme could be more modern

Review Report Format for Presentations

Summary Statement:

Overall impression of presentation quality
Appropriateness for target audience and duration
Key strengths (visual design, content, clarity)
Key weaknesses (formatting issues, content gaps)
Recommendation (ready to present, minor revisions, major revisions)

Layout and Formatting Issues (By Slide Number):

Slide 3: Text overflow - bullet point 4 extends beyond right margin
Slide 7: Element overlap - figure overlaps with caption text
Slide 12: Font size - axis labels too small to read from distance
Slide 18: Alignment - title not centered

Content and Structure Feedback:

Adequacy of background context and citations
Clarity of research question and objectives
Quality of methods summary
Effectiveness of results presentation
Strength of conclusions and implications

Design and Accessibility:

Overall visual appeal and professionalism
Color contrast and readability
Colorblind accessibility
Consistency across slides

Timing and Scope:

Whether slide count matches intended duration
Appropriate level of detail for talk type
Balance between sections

Example Image-Based Review Process

[14:30:00] PEER REVIEW: Starting review of presentation
[14:30:05] PEER REVIEW: Presentation detected - converting to images
[14:30:10] PDF REVIEW: Running pdf_to_images.py on presentation.pdf
[14:30:15] PDF REVIEW: Converted 25 slides to images in review/ directory
[14:30:20] PDF REVIEW: Inspecting slide 1/25 - title slide
[14:30:25] PDF REVIEW: Inspecting slide 2/25 - introduction
...
[14:35:40] PDF REVIEW: Inspecting slide 25/25 - acknowledgments
[14:35:45] PDF REVIEW: Completed image-based review
[14:35:50] PEER REVIEW: Found 8 layout issues, 3 content issues
[14:35:55] PEER REVIEW: Generating structured feedback by slide number

Remember: For presentations, the visual inspection via images is MANDATORY. Never attempt to read presentation PDFs as text - it will fail and miss all visual formatting issues.

Resources

This skill includes reference materials to support comprehensive peer review:

references/reporting_standards.md

Guidelines for major reporting standards across disciplines (CONSORT, PRISMA, ARRIVE, MIAME, STROBE, etc.) to evaluate completeness of methods and results reporting.

references/common_issues.md

Catalog of frequent methodological and statistical issues encountered in peer review, with guidance on identifying and addressing them.

Final Checklist

Before finalizing the review, verify:

[ ] Summary statement clearly conveys overall assessment
[ ] Major concerns are clearly identified and justified
[ ] Suggested revisions are specific and actionable
[ ] Minor issues are noted but properly categorized
[ ] Statistical methods have been evaluated
[ ] Reproducibility and data availability assessed
[ ] Ethical considerations verified
[ ] Figures and tables evaluated for quality and integrity
[ ] Writing quality assessed
[ ] Tone is constructive and professional throughout
[ ] Review is thorough but proportionate to manuscript scope
[ ] Recommendation is consistent with identified issues

Common Methodological and Statistical Issues in Scientific Manuscripts

This document catalogs frequent issues encountered during peer review, organized by category. Use this as a reference to identify potential problems and provide constructive feedback.

Statistical Issues

1. P-Value Misuse and Misinterpretation

Common Problems:

P-hacking (trying multiple analyses or selectively reporting results until significance is reached)
Multiple testing without correction (familywise error rate inflation)
Interpreting non-significance as proof of no effect
Focusing exclusively on p-values without effect sizes
Dichotomizing continuous p-values at arbitrary thresholds (p=0.049 vs p=0.051)
Confusing statistical significance with biological/clinical significance

How to Identify:

Suspiciously high proportion of p-values just below 0.05
Many tests performed but no correction mentioned
Statements like "no difference was found" from non-significant results
No effect sizes or confidence intervals reported
Language suggesting p-values indicate strength of effect

What to Recommend:

Report effect sizes with confidence intervals
Apply appropriate multiple testing corrections (Bonferroni or Holm-Bonferroni for familywise error rate, Benjamini-Hochberg for false discovery rate)
Interpret non-significance cautiously (lack of evidence ≠ evidence of lack)
Pre-register analyses to avoid p-hacking
Consider equivalence testing for "no difference" claims
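
To make the multiple-testing recommendation concrete, here is a minimal pure-Python sketch of two common procedures: Holm-Bonferroni (controls the familywise error rate) and Benjamini-Hochberg (controls the false discovery rate). The function names are hypothetical, and the p-values in the example are made up for illustration only. In practice a reviewer would more likely ask authors to use an established statistics package.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected.

    Step-down procedure: compare the k-th smallest p-value (k = 0, 1, ...)
    with alpha / (m - k) and stop at the first failure.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break
    return reject


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected.

    Step-up procedure: find the largest rank k with p_(k) <= (k / m) * alpha
    and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    reject = [False] * m
    for idx in order[:cutoff]:
        reject[idx] = True
    return reject


# Illustrative p-values only (not from any real study)
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(holm_bonferroni(example))
print(benjamini_hochberg(example))
```

Several of the illustrative p-values fall below 0.05 when taken individually, yet few survive either correction. That gap is the pattern a reviewer is looking for when many tests are reported without adjustment.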

2. Inappropriate Statistical Tests

Common Problems:

Using parametric tests when assumptions are violated (non-normal data, unequal variances)
Analyzing paired data with unpaired tests
Using t-tests for multiple groups instead of ANOVA with post-hoc tests
Treating ordinal data as continuous
Ignoring repeated measures structure
Using correlation when regression is more appropriate

How to Identify:

No mention of assumption checking
Small sample sizes with parametric tests
Multiple pairwise t-tests instead of ANOVA
Likert scales analyzed with t-tests
Time-series data analyzed without accounting for repeated measures

What to Recommend:

Check assumptions explicitly (normality tests, Q-Q plots)
Use non-parametric alternatives when appropriate
Apply proper corrections for multiple comparisons after ANOVA
Use mixed-effects models for repeated measures
Consider ordinal regression for ordinal outcomes
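
The paired-versus-unpaired problem listed above is easy to demonstrate. The sketch below computes a paired t statistic and an unpaired (Welch) t statistic on the same made-up before/after measurements. The function names and the numbers are hypothetical illustrations, and only the test statistics are computed, not p-values.

```python
import math
from statistics import fmean, stdev, variance


def paired_t_statistic(before, after):
    """t statistic for paired samples: mean difference over its standard error."""
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(after, before)]
    return fmean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


def welch_t_statistic(x, y):
    """t statistic for two independent samples with unequal variances (Welch)."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (fmean(y) - fmean(x)) / se


# Made-up measurements from the same five subjects before and after treatment
before = [10, 12, 14, 16, 18]
after = [11, 14, 15, 18, 20]
print(paired_t_statistic(before, after))
print(welch_t_statistic(before, after))
```

When between-subject variability is large relative to the within-subject change, ignoring the pairing discards most of the signal, so a manuscript that analyzes paired data with an unpaired test may be badly underpowered.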

3. Sample Size and Power Issues

Common Problems:

No sample size justification or power calculation
Underpowered studies claiming "no effect"
Post-hoc power calculations (which are uninformative)
Stopping rules not pre-specified
Unequal group sizes without justification

How to Identify:

Small sample sizes (n<30 per group for typical designs)
No mention of power analysis in methods
Statements about post-hoc power
Wide confidence intervals suggesting imprecision
Claims of "no effect" with large p-values and small n

What to Recommend:

Conduct a priori power analysis based on expected effect size
Report precision (confidence interval width) rather than post-hoc "achieved" power
Acknowledge when studies are underpowered
Consider effect sizes and confidence intervals for interpretation
Pre-register sample size and stopping rules
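
As a rough illustration of an a priori power analysis, the sketch below estimates the per-group sample size for a two-sided comparison of two means. It assumes equal group sizes, a standardized effect size (Cohen's d), and a normal approximation, which slightly underestimates the n that a t-based calculation would give. The function name `n_per_group` is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-sample comparison of means.

    Uses n = 2 * ((z_{1 - alpha/2} + z_{power}) / d) ** 2, rounded up.
    Assumption: normal approximation with equal group sizes.
    """
    if effect_size_d <= 0:
        raise ValueError("effect size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)


for d in (0.2, 0.5, 0.8):
    print(d, n_per_group(d))
```

A reviewer can use a calculation like this as a sanity check: if a manuscript reports a much smaller n than the estimate for a plausible effect size, a claim of "no effect" deserves scrutiny.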

4. Missing Data Problems

Common Problems:

Complete case analysis without justification (listwise deletion)
Not reporting extent or pattern of missingness
Assuming data are missing completely at random (MCAR) without testing
Inappropriate imputation methods
Not performing sensitivity analyses

How to Identify:

Different n values across analyses without explanation
No discussion of missing data
Participants "excluded from analysis"
Simple mean imputation used
No sensitivity analyses comparing complete vs. imputed data

What to Recommend:

Report extent and patterns of missingness
Test MCAR assumption (Little's test)
Use appropriate methods (multiple imputation, maximum likelihood)
Perform sensitivity analyses
Consider intention-to-treat analysis for trials
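
Reporting the extent of missingness, the first recommendation above, can be as simple as a per-variable count plus the number of complete cases. The sketch below assumes records are dictionaries and that a missing value is represented as `None` or an absent key. The function name and field names are hypothetical.

```python
def missingness_report(rows, fields):
    """Count missing values per field and the number of complete cases.

    Assumption: a value is missing if the key is absent or maps to None.
    """
    per_field = {
        field: sum(1 for row in rows if row.get(field) is None)
        for field in fields
    }
    complete = sum(
        1 for row in rows
        if all(row.get(field) is not None for field in fields)
    )
    return {
        "n_rows": len(rows),
        "missing_per_field": per_field,
        "complete_cases": complete,
    }


# Made-up records for illustration
records = [
    {"age": 30, "score": 1.2},
    {"age": None, "score": 0.7},
    {"age": 41},
]
print(missingness_report(records, ["age", "score"]))
```

If the complete-case count is much lower than the row count, differing n values across analyses are expected, and the manuscript should explain them.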

5. Circular Analysis and Double-Dipping

Common Problems:

Using the same data for selection and inference
Defining regions of interest (ROIs) based on a contrast, then testing that same contrast in the same ROIs
Selecting outliers then testing for differences
Post-hoc subgroup analyses presented as planned
HARKing (Hypothesizing After Results are Known)

How to Identify:

ROIs or features selected based on results
Unexpected subgroup analyses
Post-hoc analyses not clearly labeled as exploratory
No data-independent validation
Introduction that perfectly predicts findings

What to Recommend:

Use independent datasets for selection and testing
Pre-register analyses and hypotheses
Clearly distinguish confirmatory vs. exploratory analyses
Use cross-validation or hold-out datasets
Correct for selection bias
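
The core remedy for double-dipping is to keep the data used for selection separate from the data used for inference. A minimal sketch of such a split follows. The function name, the 50/50 default, and the seed are hypothetical choices for illustration, not requirements of the skill.

```python
import random


def split_selection_and_test(samples, test_fraction=0.5, seed=0):
    """Randomly split samples into a selection set and a disjoint test set.

    Features, ROIs, or subgroups are chosen using only the selection set;
    the hypothesis is then tested using only the test set.
    """
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


selection_set, test_set = split_selection_and_test(range(20))
print(sorted(selection_set))
print(sorted(test_set))
```

When samples are clustered (for example, several measurements per subject), the split should be made at the level of the independent unit, which this sketch does not handle.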

6. Pseudoreplication

Common Problems:

Technical replicates treated as biological replicates
Multiple measurements from same subject treated as independent
Clustered data analyzed without accounting for clustering
Non-independence in spatial or temporal data

How to Identify:

n defined as number of measurements rather than biological units
Multiple cells from same animal counted as independent
Repeated measures not acknowledged
No mention of random effects or clustering

What to Recommend:

Define n as biological replicates (animals, patients, independent samples)
Use mixed-effects models for nested or clustered data
Account for repeated measures explicitly
Average technical replicates before analysis
Report both technical and biological replication
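
Averaging technical replicates before analysis, as recommended above, means that n reflects biological units rather than measurements. The sketch below assumes measurements arrive as `(unit_id, value)` pairs. The function name and the example identifiers are hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def collapse_technical_replicates(measurements):
    """Average repeated measurements within each biological unit.

    measurements: iterable of (biological_unit_id, value) pairs.
    Returns a dict mapping each unit to the mean of its measurements,
    so len(result) is the biological n.
    """
    by_unit = defaultdict(list)
    for unit, value in measurements:
        by_unit[unit].append(value)
    return {unit: fmean(values) for unit, values in by_unit.items()}


# Made-up data: three measurements from one animal, two from another
data = [("mouse1", 1.0), ("mouse1", 3.0), ("mouse1", 2.0),
        ("mouse2", 5.0), ("mouse2", 7.0)]
per_animal = collapse_technical_replicates(data)
print(per_animal, "biological n =", len(per_animal))
```

Here five measurements collapse to a biological n of two. A manuscript reporting n=5 for such data would be a case of pseudoreplication.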

Experimental Design Issues

7. Lack of Appropriate Controls

Common Problems:

Missing negative controls
Missing positive controls for validation
No vehicle controls for drug studies
No time-matched controls for longitudinal studies
No batch controls

How to Identify:

Methods section lists only experimental groups
No mention of controls in figures
Unclear baseline or reference condition
Cross-batch comparisons without controls

What to Recommend:

Include negative controls to assess specificity
Include positive controls to validate methods
Use vehicle controls matched to experimental treatment
Include sham surgery controls for surgical interventions
Include batch controls for cross-batch comparisons

8. Confounding Variables

Common Problems:

Systematic differences between groups besides intervention
Batch effects not controlled or corrected
Order effects in sequential experiments
Time-of-day effects not controlled
Experimenter effects not blinded

How to Identify:

Groups differ in multiple characteristics
Samples processed in different batches by group
No randomization of sample order
No mention of blinding
Baseline characteristics differ between groups

What to Recommend:

Randomize experimental units to conditions
Block on known confounders
Randomize sample processing order
Use blinding to minimize bias
Perform batch correction if needed
Report and adjust for baseline differences
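
Randomizing units to conditions while blocking on a known confounder can be sketched as follows. This is one simple scheme under stated assumptions: the function name, the `block_of` callback, and the example unit labels are all hypothetical, and balance within a block is exact only when the block size is a multiple of the number of conditions.

```python
import random


def randomize_within_blocks(units, block_of, conditions, seed):
    """Assign units to conditions at random, balanced within each block.

    units: list of unit identifiers.
    block_of: function mapping a unit to its block (e.g. batch, sex, cage).
    conditions: list of condition labels.
    seed: recorded so the allocation can be reproduced and audited.
    """
    rng = random.Random(seed)
    blocks = {}
    for unit in units:
        blocks.setdefault(block_of(unit), []).append(unit)

    assignment = {}
    for block in sorted(blocks):
        members = blocks[block][:]
        rng.shuffle(members)           # random order within the block
        for position, unit in enumerate(members):
            assignment[unit] = conditions[position % len(conditions)]
    return assignment


# Made-up example: the first character of each label is the batch
units = ["a1", "a2", "a3", "a4", "b1", "b2", "b3", "b4"]
plan = randomize_within_blocks(units, lambda u: u[0], ["control", "treated"], seed=42)
print(plan)
```

Because each batch contains both conditions in equal numbers, a batch effect cannot masquerade as a treatment effect, which addresses the "samples processed in different batches by group" warning sign.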

9. Insufficient Replication

Common Problems:

Single experiment without replication
Technical replicates mistaken for biological replication
Small n justified by "typical for the field"
No independent validation of key findings
Cherry-picking representative examples

How to Identify:

Methods state "experiment performed once"
n=3 with no justification
"Representative image shown"
Key claims based on single experiment
No validation in independent dataset

What to Recommend:

Perform independent biological replicates (typically ≥3)
Validate key findings in independent cohorts
Report all replicates, not just representative examples
Conduct power analysis to justify sample size
Show individual data points, not just summary statistics
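
A rough illustration of a power analysis for a two-group comparison of means is sketched below. It assumes a two-sided test and uses the normal approximation, which slightly underestimates the n that an exact t-based calculation gives; the function name and default values are assumptions for illustration.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample
    comparison of means, given a standardized effect size (Cohen's d)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size_d) ** 2)

medium = n_per_group(0.5)  # medium effect
large = n_per_group(1.0)   # very large effect
```

Under this approximation a medium effect (d = 0.5) needs about 63 units per group, which puts an unexplained n=3 in context.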

Reproducibility Issues

10. Insufficient Methodological Detail

Common Problems:

Methods not described in sufficient detail for replication
Key reagents not specified (vendor, catalog number)
Software versions and parameters not reported
Antibodies not validated
Cell line authentication not verified

How to Identify:

Vague descriptions ("standard protocols were used")
No information on reagent sources
Generic software mentioned without versions
No antibody validation information
Cell lines not authenticated

What to Recommend:

Provide detailed protocols or cite specific protocols
Include reagent vendors, catalog numbers, lot numbers
Report software versions and all parameters
Include antibody validation (Western blot, specificity tests)
Report cell line authentication method (STR profiling)
Make protocols available (protocols.io, supplementary materials)

11. Data and Code Availability

Common Problems:

No data availability statement
"Data available upon request" (often unfulfilled)
No code provided for computational analyses
Custom software not made available
No clear documentation

How to Identify:

Missing data availability statement
No repository accession numbers
Computational methods with no code
Custom pipelines without access
No README or documentation

What to Recommend:

Deposit raw data in appropriate repositories (GEO, SRA, Dryad, Zenodo)
Share analysis code on GitHub or similar
Provide clear documentation and README files
Include requirements.txt or environment files
Make custom software available with installation instructions
Use DOIs for permanent data citation

12. Lack of Method Validation

Common Problems:

New methods not compared to gold standard
Assays not validated for specificity, sensitivity, linearity
No spike-in controls
Cross-reactivity not tested
Detection limits not established

How to Identify:

Novel assays presented without validation
No comparison to existing methods
No positive/negative controls shown
Claims of specificity without evidence
No standard curves or controls

What to Recommend:

Validate new methods against established approaches
Show specificity (knockdown/knockout controls)
Demonstrate linearity and dynamic range
Include positive and negative controls
Report limits of detection and quantification
Show reproducibility across replicates and operators

Interpretation Issues

13. Overstatement of Results

Common Problems:

Causal language for correlational data
Mechanistic claims without mechanistic evidence
Extrapolating beyond data (species, conditions, populations)
Claiming "first to show" without thorough literature review
Overgeneralizing from limited samples

How to Identify:

"X causes Y" from observational data
Mechanism proposed without direct testing
Mouse data presented as relevant to humans without caveats
Claims of novelty with missing citations
Broad claims from narrow samples

What to Recommend:

Use appropriate language ("associated with" vs. "caused by")
Distinguish correlation from causation
Acknowledge limitations of model systems
Provide thorough literature context
Be specific about generalizability
Propose mechanisms as hypotheses, not conclusions

14. Cherry-Picking and Selective Reporting

Common Problems:

Reporting only significant results
Showing "representative" images that may not be typical
Excluding outliers without justification
Not reporting negative or contradictory findings
Switching between different statistical approaches

How to Identify:

All reported results are significant
"Representative of 3 experiments" with no quantification
Data exclusions mentioned in results but not methods
Supplementary data contradicts main findings
Multiple analysis approaches with only one reported

What to Recommend:

Report all planned analyses regardless of outcome
Quantify and show variability across replicates
Pre-specify outlier exclusion criteria
Include negative results
Pre-register analysis plan
Report effect sizes and confidence intervals for all comparisons

15. Ignoring Alternative Explanations

Common Problems:

Preferred explanation presented without considering alternatives
Contradictory evidence dismissed without discussion
Off-target effects not considered
Confounding variables not acknowledged
Limitations section minimal or absent

How to Identify:

Single interpretation presented as fact
Prior contradictory findings not cited or discussed
No consideration of alternative mechanisms
No discussion of limitations
Specificity assumed without controls

What to Recommend:

Discuss alternative explanations
Address contradictory findings from literature
Include appropriate specificity controls
Acknowledge and discuss limitations thoroughly
Consider and test alternative hypotheses

Figure and Data Presentation Issues

16. Inappropriate Data Visualization

Common Problems:

Bar graphs for continuous data (hiding distributions)
No error bars or error bars not defined
Truncated y-axes exaggerating differences
Dual y-axes creating misleading comparisons
Too many significant figures
Colors not colorblind-friendly

How to Identify:

Bar graphs with few data points
Unclear what error bars represent (SD, SEM, CI?)
Y-axis doesn't start at zero for ratio/percentage data
Left and right y-axes with different scales
Values reported to excessive precision (p=0.04562)
Red-green color schemes

What to Recommend:

Show individual data points with scatter/box/violin plots
Always define error bars (SD, SEM, 95% CI)
Start y-axis at zero or indicate breaks clearly
Avoid dual y-axes; use separate panels instead
Report appropriate significant figures
Use colorblind-friendly palettes (viridis, ColorBrewer)
Include sample sizes in figure legends
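
Because SD, SEM, and 95% CI produce very different error bars from the same data, a short sketch of how each is computed may help when checking a figure legend. The data are hypothetical, and the t critical value (2.776 for 4 degrees of freedom) is supplied by hand because the standard library has no t distribution.

```python
import math
from statistics import mean, stdev

def describe(values, t_crit):
    """Return n, mean, SD, SEM, and a 95% CI for one group.

    t_crit is the two-sided 97.5th percentile of the t distribution
    with n - 1 degrees of freedom, supplied by the caller.
    """
    n = len(values)
    m = mean(values)
    sd = stdev(values)          # sample standard deviation
    sem = sd / math.sqrt(n)     # standard error of the mean
    half_width = t_crit * sem   # half-width of the 95% CI
    return {"n": n, "mean": m, "sd": sd, "sem": sem,
            "ci95": (m - half_width, m + half_width)}

T_CRIT_DF4 = 2.776  # t(0.975, df=4)
summary = describe([4.0, 5.0, 6.0, 7.0, 8.0], T_CRIT_DF4)
```

Here the SD is more than twice the SEM, so a legend that does not say which one is plotted leaves the spread of the data ambiguous.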

17. Image Manipulation Concerns

Common Problems:

Excessive contrast/brightness adjustment
Spliced gels or images without indication
Duplicated images or panels
Uneven background in Western blots
Selective cropping
Over-processed microscopy images

How to Identify:

Suspicious patterns or discontinuities
Very high contrast with no background
Similar features in different panels
Straight lines suggesting splicing
Inconsistent backgrounds
Loss of detail suggesting over-processing

What to Recommend:

Apply adjustments uniformly across images
Indicate spliced gels with dividing lines
Show full, uncropped images in supplementary materials
Provide original images if requested
Follow journal image integrity policies
Use appropriate image analysis tools

Study Design Issues

18. Poorly Defined Hypotheses and Outcomes

Common Problems:

No clear hypothesis stated
Primary outcome not specified
Multiple outcomes without correction
Outcomes changed after data collection
Fishing expeditions presented as hypothesis-driven

How to Identify:

Introduction doesn't state clear testable hypothesis
Multiple outcomes with unclear hierarchy
Outcomes in results don't match those in methods
Exploratory study presented as confirmatory
Many tests with no multiple testing correction

What to Recommend:

State clear, testable hypotheses
Designate primary and secondary outcomes a priori
Pre-register studies when possible
Apply appropriate corrections for multiple outcomes
Clearly distinguish exploratory from confirmatory analyses
Report all pre-specified outcomes
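
One common correction for multiple outcomes is the Benjamini-Hochberg procedure, sketched below. It controls the false discovery rate rather than the family-wise error rate, so Bonferroni or Holm may be more appropriate for a small set of confirmatory outcomes; the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical p-values for four outcomes
adjusted = benjamini_hochberg(raw)
```

A manuscript that reports many unadjusted tests can be asked to present adjusted values like these alongside the raw ones.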

19. Baseline Imbalance and Selection Bias

Common Problems:

Groups differ at baseline
Selection criteria applied differentially
Healthy volunteer bias
Survivorship bias
Indication bias in observational studies

How to Identify:

Table 1 shows significant baseline differences
Inclusion criteria different between groups
Response rate <50% with no non-response analysis
Analysis only includes completers
Groups self-selected rather than randomized

What to Recommend:

Report baseline characteristics in Table 1
Use randomization (stratified on key covariates where needed) to balance groups in expectation
Adjust for baseline differences in analysis
Report response rates and compare responders vs. non-responders
Consider propensity score matching for observational data
Use intention-to-treat analysis

20. Temporal and Batch Effects

Common Problems:

Samples processed in batches by condition
Temporal trends not accounted for
Instrument drift over time
Different operators for different groups
Reagent lot changes between groups

How to Identify:

All treatment samples processed on same day
Controls from different time period
No mention of batch or time effects
Different technicians for groups
Long study duration with no temporal analysis

What to Recommend:

Randomize samples across batches/time
Include batch as covariate in analysis
Perform batch correction (ComBat, limma)
Include quality control samples across batches
Report and test for temporal trends
Balance operators across conditions
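
When a sample sheet is available, batch-by-condition confounding can be checked mechanically. The sketch below flags batches that contain only one condition; the sample sheet layout and names are hypothetical.

```python
from collections import defaultdict

def confounded_batches(samples):
    """Return batches in which only a single condition appears.

    samples: iterable of (sample_id, condition, batch) tuples.
    """
    conditions_by_batch = defaultdict(set)
    for _, condition, batch in samples:
        conditions_by_batch[batch].add(condition)
    return sorted(b for b, conds in conditions_by_batch.items()
                  if len(conds) < 2)

# Hypothetical sample sheet: day1 holds only treated samples.
sample_sheet = [
    ("s1", "treated", "day1"), ("s2", "treated", "day1"),
    ("s3", "control", "day2"), ("s4", "treated", "day2"),
]
flagged = confounded_batches(sample_sheet)
```

A flagged batch means the batch effect cannot be separated from the treatment effect for those samples, and statistical batch correction cannot fully repair that.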

Reporting Issues

21. Incomplete Statistical Reporting

Common Problems:

Test statistics not reported
Degrees of freedom missing
Exact p-values replaced with inequalities (p<0.05)
No confidence intervals
No effect sizes
Sample sizes not reported per group

How to Identify:

Only p-values given with no test statistics
p-values reported as p<0.05 rather than exact values
No measures of uncertainty
Effect magnitude unclear
n reported for total but not per group

What to Recommend:

Report complete test statistics (t, F, χ², etc. with df)
Report exact p-values (use p<0.001 only for very small values)
Include 95% confidence intervals
Report effect sizes (Cohen's d, odds ratios, correlation coefficients)
Report n for each group in every analysis
Consider CONSORT-style flow diagram
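
As an illustration of what complete reporting involves, the sketch below computes Cohen's d with a pooled standard deviation and formats a p-value following the convention above. The data and helper names are hypothetical, and the p-value itself is assumed to come from whatever test the authors ran.

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) +
                  (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

def format_p(p):
    """Exact p-value to three decimals, or an inequality if very small."""
    return "p < 0.001" if p < 0.001 else f"p = {p:.3f}"

treated = [5.0, 6.0, 7.0, 8.0, 9.0]
control = [3.0, 4.0, 5.0, 6.0, 7.0]
d = cohens_d(treated, control)
line = f"n = {len(treated)} vs {len(control)}, d = {d:.2f}, {format_p(0.0456)}"
```

The resulting line carries per-group n, effect size, and an exact p-value, which is the minimum a reviewer should expect next to each comparison.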

22. Methods-Results Mismatch

Common Problems:

Methods describe analyses not performed
Results include analyses not described in methods
Different sample sizes in methods vs. results
Methods mention controls not shown
Statistical methods don't match what was done

How to Identify:

Analyses in results without methodological description
Methods describe experiments not in results
Numbers don't match between sections
Controls mentioned but not shown
Software named in methods differs from what was actually used

What to Recommend:

Ensure complete concordance between methods and results
Describe all analyses performed in methods
Remove methodological descriptions of experiments not performed
Verify all numbers are consistent
Update methods to match actual analyses conducted

How to Use This Reference

When reviewing manuscripts:

1. Read through methods and results systematically
2. Check for common issues in each category
3. Note specific problems with evidence
4. Provide constructive suggestions for improvement
5. Distinguish major issues (affect validity) from minor issues (affect clarity)
6. Prioritize reproducibility and transparency

This is not an exhaustive list but covers the most frequently encountered issues. Always consider the specific context and discipline when evaluating potential problems.

Scientific Reporting Standards and Guidelines

This document catalogs major reporting standards and guidelines across scientific disciplines. When reviewing manuscripts, verify that authors have followed the appropriate guidelines for their study type and discipline.

Clinical Trials and Medical Research

CONSORT (Consolidated Standards of Reporting Trials)

Purpose: Randomized controlled trials (RCTs)

Key Requirements:

Trial design, participants, and interventions clearly described
Primary and secondary outcomes specified
Sample size calculation and statistical methods
Participant flow through trial (enrollment, allocation, follow-up, analysis)
Baseline characteristics of participants
Numbers analyzed in each group
Outcomes and estimation with confidence intervals
Adverse events
Trial registration number and protocol access

Reference: http://www.consort-statement.org/

STROBE (Strengthening the Reporting of Observational Studies in Epidemiology)

Purpose: Observational studies (cohort, case-control, cross-sectional)

Key Requirements:

Study design clearly stated
Setting, eligibility criteria, and participant sources
Variables clearly defined
Data sources and measurement methods
Bias assessment
Sample size justification
Statistical methods including handling of missing data
Participant flow and characteristics
Main results with confidence intervals
Limitations discussed

Reference: https://www.strobe-statement.org/

PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses)

Purpose: Systematic reviews and meta-analyses

Key Requirements:

Protocol registration
Systematic search strategy across multiple databases
Inclusion/exclusion criteria
Study selection process
Data extraction methods
Quality assessment of included studies
Statistical methods for meta-analysis
Assessment of publication bias
Heterogeneity assessment
PRISMA flow diagram showing study selection
Summary of findings tables

Reference: http://www.prisma-statement.org/

SPIRIT (Standard Protocol Items: Recommendations for Interventional Trials)

Purpose: Clinical trial protocols

Key Requirements:

Administrative information (title, registration, funding)
Introduction (rationale, objectives)
Methods (design, participants, interventions, outcomes, sample size)
Ethics and dissemination
Trial schedule and assessments

Reference: https://www.spirit-statement.org/

CARE (CAse REport guidelines)

Purpose: Case reports

Key Requirements:

Patient information and demographics
Clinical findings
Timeline of events
Diagnostic assessment
Therapeutic interventions
Follow-up and outcomes
Patient perspective
Informed consent

Reference: https://www.care-statement.org/

Animal Research

ARRIVE (Animal Research: Reporting of In Vivo Experiments)

Purpose: Studies involving animal research

Key Requirements:

Title indicates study involves animals
Abstract provides accurate summary
Background and objectives clearly stated
Ethical statement and approval
Housing and husbandry details
Animal details (species, strain, sex, age, weight)
Experimental procedures in detail
Experimental animals (number, allocation, welfare assessment)
Statistical methods appropriate
Exclusion criteria stated
Sample size determination
Randomization and blinding described
Outcome measures defined
Adverse events reported

Reference: https://arriveguidelines.org/

Genomics and Molecular Biology

MIAME (Minimum Information About a Microarray Experiment)

Purpose: Microarray experiments

Key Requirements:

Experimental design clearly described
Array design information
Samples (origin, preparation, labeling)
Hybridization procedures and parameters
Image acquisition and quantification
Normalization and data transformation
Raw and processed data availability
Database accession numbers

Reference: http://fged.org/projects/miame/

MINSEQE (Minimum Information about a high-throughput Nucleotide Sequencing Experiment)

Purpose: High-throughput sequencing (RNA-seq, ChIP-seq, etc.)

Key Requirements:

Experimental design and biological context
Sample information (source, preparation, QC)
Library preparation (protocol, adapters, size selection)
Sequencing platform and parameters
Data processing pipeline (alignment, quantification, normalization)
Quality control metrics
Raw data deposition (SRA, GEO, ENA)
Processed data and analysis code availability

MIGS/MIMS (Minimum Information about a Genome/Metagenome Sequence)

Purpose: Genome and metagenome sequencing

Key Requirements:

Sample origin and environmental context
Sequencing methods and coverage
Assembly methods and quality metrics
Annotation approach
Quality control and contamination screening
Data deposition in INSDC databases

Reference: https://gensc.org/

Structural Biology

PDB (Protein Data Bank) Deposition Requirements

Purpose: Macromolecular structure determination

Key Requirements:

Atomic coordinates deposited
Structure factors for X-ray structures
Restraints and experimental data for NMR
EM maps and metadata for cryo-EM
Model quality validation metrics
Experimental conditions (crystallization, sample preparation)
Data collection parameters
Refinement statistics

Reference: https://www.wwpdb.org/

Proteomics and Mass Spectrometry

MIAPE (Minimum Information About a Proteomics Experiment)

Purpose: Proteomics experiments

Key Requirements:

Sample processing and fractionation
Separation methods (2D gel, LC)
Mass spectrometry parameters (instrument, acquisition)
Database search and validation parameters
Peptide and protein identification criteria
Quantification methods
Statistical analysis
Data deposition (PRIDE, PeptideAtlas)

Reference: http://www.psidev.info/

Neuroscience

COBIDAS (Committee on Best Practices in Data Analysis and Sharing)

Purpose: MRI and fMRI studies

Key Requirements:

Scanner and sequence parameters
Preprocessing pipeline details
Software versions and parameters
Statistical analysis approach
Multiple comparison correction
ROI definitions
Data sharing (raw data, analysis scripts)

Reference: https://www.humanbrainmapping.org/cobidas

Flow Cytometry

MIFlowCyt (Minimum Information about a Flow Cytometry Experiment)

Purpose: Flow cytometry experiments

Key Requirements:

Experimental overview and purpose
Sample characteristics and preparation
Instrument information and settings
Reagents (antibodies, fluorophores, concentrations)
Compensation and controls
Gating strategy
Data analysis approach
Data availability

Reference: http://flowcyt.org/

Ecology and Environmental Science

MIAPPE (Minimum Information About a Plant Phenotyping Experiment)

Purpose: Plant phenotyping studies

Key Requirements:

Investigation and study metadata
Biological material information
Environmental parameters
Experimental design and factors
Phenotypic measurements and methods
Data file descriptions

Reference: https://www.miappe.org/

Chemistry and Chemical Biology

MIRIBEL (Minimum Information Reporting in Bio-Nano Experimental Literature)

Purpose: Nanomaterial characterization

Key Requirements:

Nanomaterial composition and structure
Size, shape, and morphology characterization
Surface chemistry and functionalization
Purity and stability
Experimental conditions
Characterization methods

Quality Assessment and Bias

CAMARADES (Collaborative Approach to Meta-Analysis and Review of Animal Data from Experimental Studies)

Purpose: Quality assessment for animal studies in systematic reviews

Key Items:

Publication in peer-reviewed journal
Statement of temperature control
Randomization to treatment
Blinded assessment of outcome
Avoidance of anesthetics with marked intrinsic neuroprotective properties
Use of appropriate animal model
Sample size calculation
Compliance with regulatory requirements
Statement of conflict of interest
Study pre-registration

SYRCLE's Risk of Bias Tool

Purpose: Assessing risk of bias in animal intervention studies

Domains:

Selection bias (sequence generation, baseline characteristics, allocation concealment)
Performance bias (random housing, blinding of personnel)
Detection bias (random outcome assessment, blinding of assessors)
Attrition bias (incomplete outcome data)
Reporting bias (selective outcome reporting)
Other sources of bias

General Principles Across Guidelines

Common Requirements

1. Transparency: All methods, materials, and analyses fully described
2. Reproducibility: Sufficient detail for independent replication
3. Data Availability: Raw data and analysis code shared or deposited
4. Registration: Studies pre-registered where applicable
5. Ethics: Appropriate approvals and consent documented
6. Conflicts of Interest: Disclosed for all authors
7. Statistical Rigor: Methods appropriate and fully described
8. Completeness: All outcomes reported, including negative results

Red Flags for Non-Compliance

Methods section lacks critical details
No mention of following reporting guidelines
Data availability statement missing or vague
No database accession numbers for omics data
No trial registration for clinical studies
Sample size not justified
Statistical methods inadequately described
Missing flow diagrams (CONSORT, PRISMA)
Selective reporting of outcomes

How to Use This Reference

When reviewing a manuscript:

1. Identify the study type and discipline
2. Find the relevant reporting guideline(s)
3. Check if authors mention following the guideline
4. Verify that key requirements are addressed
5. Note any missing elements in your review
6. Suggest the appropriate guideline if not mentioned

Many journals require authors to complete reporting checklists at submission. Reviewers should verify compliance even if a checklist was submitted.
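
The first two steps of that workflow amount to a lookup from study type to guideline, sketched below using the guidelines cataloged in this document. The study-type labels and the function name are hypothetical conveniences, not part of any guideline.

```python
# Study-type labels are illustrative; guideline names come from this document.
GUIDELINES = {
    "randomized controlled trial": "CONSORT",
    "observational study": "STROBE",
    "systematic review": "PRISMA",
    "clinical trial protocol": "SPIRIT",
    "case report": "CARE",
    "animal study": "ARRIVE",
    "microarray experiment": "MIAME",
    "high-throughput sequencing": "MINSEQE",
    "flow cytometry": "MIFlowCyt",
    "mri study": "COBIDAS",
}

def suggest_guideline(study_type):
    """Return the matching guideline name, or None if no entry matches."""
    return GUIDELINES.get(study_type.strip().lower())

suggestion = suggest_guideline(" Case Report ")
```

A None result signals that the reviewer needs to identify the discipline-specific guideline by hand.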

How it compares

Choose peer-review over generic code-review skills when the artifact is research output requiring statistical and methodological validation rather than application logic review.

FAQ

What flaws does peer-review catch in manuscripts?

peer-review catalogs frequent scientific peer-review issues including p-value misuse, multiple testing without correction, missing effect sizes, and misinterpreted non-significance. The skill organizes flaws by category so reviewers give constructive feedback on research-heavy co

Is peer-review for code pull requests or research papers?

peer-review targets research-heavy code, documentation, and scientific manuscripts rather than routine application pull requests. The skill focuses on statistical and methodological rigor—p-hacking, reporting standards, and analysis validity—before finalizing or publishing resear

Is Peer Review safe to install?

skills.sh reports 3 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.
