
Scholar Evaluation
Score and improve scholarly drafts, lit reviews, or agent-generated research memos with the ScholarEval rubric before you commit to a build or publication path.
Overview
scholar-evaluation is an agent skill most often used in Idea (also Validate, Build) that applies ScholarEval rubrics to systematically score problem formulation, research questions, and related scholarly quality dimensio
Install
npx skills add https://github.com/k-dense-ai/scientific-agent-skills --skill scholar-evaluation
What is this skill?
- Five-level rubrics (Excellent through Poor) for problem formulation and research questions with explicit quality indicat
- Systematic evaluation dimensions for scholarly work beyond a single pass/fail gut check
- Gap, significance, scope, and novelty criteria agents can apply consistently to drafts and proposals
- Structured standards for when incremental vs high-impact problems are inadequately justified
- Framework-oriented guidance suitable for peer review prep, thesis chapters, and research-agent outputs
- Five-level scoring scale per dimension: Excellent (5) through Poor (1)
- Dimension 1 rubric covers problem formulation and research questions with enumerated quality indicators
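The five-level scale in the list above can be modeled in a few lines. This is a minimal Python sketch, not part of the skill itself: the `Rating` enum and the `describe` helper are hypothetical names, and the three intermediate labels (Good, Adequate, Needs Improvement) are taken from the rubric in the skill's SKILL.md.

```python
from enum import IntEnum


class Rating(IntEnum):
    """Five-level scale; names and values follow the SKILL.md rubric labels."""
    POOR = 1
    NEEDS_IMPROVEMENT = 2
    ADEQUATE = 3
    GOOD = 4
    EXCELLENT = 5

    @property
    def label(self) -> str:
        # "NEEDS_IMPROVEMENT" -> "Needs Improvement"
        return self.name.replace("_", " ").title()


def describe(score: int) -> str:
    """Format a numeric score as 'Label (n)'; raises ValueError outside 1-5."""
    rating = Rating(score)
    return f"{rating.label} ({rating.value})"


print(describe(5))
print(describe(2))
```

Using an integer-backed enum keeps ratings comparable (`Rating.GOOD > Rating.ADEQUATE`) while rejecting out-of-range scores at construction time.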
Adoption & trust: 591 installs on skills.sh; 27.6k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You have draft research text or agent output but no consistent rubric to tell whether the problem, RQs, and claimed contribution are actually strong enough to pursue.
Who is it for?
Builders running research agents, writing theses, or validating AI-generated literature summaries who need repeatable academic quality gates.
Skip if: Pure implementation tasks with no research artifact, or users who only need citation formatting without evaluative depth.
When should I use this skill?
You need systematic ScholarEval-style evaluation of scholarly drafts, proposals, or agent-generated research content.
What do I get? / Deliverables
You receive dimension-scored, rubric-aligned feedback on scholarly work so you can revise questions, scope, and significance before investing in prototypes or writing.
- Dimension-level qualitative ratings aligned to ScholarEval indicators
- Actionable revision notes on RQs, gap, scope, novelty, and significance
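One way to hold these deliverables in code is a per-dimension record with a score and a list of revision notes. The sketch below is an assumption about shape only: the skill's actual output format is not specified here, and `DimensionResult`, `needs_revision`, `render`, and the revision threshold of 3 are all hypothetical. The sample data is made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DimensionResult:
    """One scored dimension plus its revision notes (hypothetical structure)."""
    dimension: str
    score: int  # 1 (Poor) through 5 (Excellent)
    notes: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not 1 <= self.score <= 5:
            raise ValueError(f"score must be 1-5, got {self.score}")


def needs_revision(results: list[DimensionResult], threshold: int = 3) -> list[str]:
    """Return dimensions scoring at or below the threshold (an assumed cutoff)."""
    return [r.dimension for r in results if r.score <= threshold]


def render(results: list[DimensionResult]) -> str:
    """Render each dimension as 'name: n/5' followed by its indented notes."""
    lines = []
    for r in results:
        lines.append(f"{r.dimension}: {r.score}/5")
        lines.extend(f"  - {note}" for note in r.notes)
    return "\n".join(lines)


# Illustrative data only; not real evaluation output.
example = [
    DimensionResult("Problem Formulation & Research Questions", 3,
                    ["Research question lacks specificity"]),
    DimensionResult("Literature Review", 4),
]
print(render(example))
print(needs_revision(example))
```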
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Rigorous problem formulation and research-question quality belong at the start of the journey when you are deciding what is worth building or publishing. ScholarEval dimensions map to literature-facing research quality—RQs, significance, and contribution clarity—not shipping code.
Where it fits
Idea: Rate an agent-drafted problem statement against Excellent-to-Poor RQ indicators before picking a product niche backed by literature.
Validate: Check whether proposed experiments are feasible and whether the contribution is differentiated enough to justify a prototype sprint.
Build: Review a technical report or README research section for weak significance justification before publishing or pitching.
How it compares
A structured evaluation checker for scholarly artifacts, not a paper search MCP or automatic plagiarism detector.
Common Questions / FAQ
Who is scholar-evaluation for?
Solo builders and small teams using agents on scientific writing, lit reviews, or grant sections who want ScholarEval-style criteria instead of generic editing.
When should I use scholar-evaluation?
In Idea research when framing RQs; in Validate scope when judging feasibility and contribution; in Build docs when reviewing agent-generated research sections before ship or launch narratives.
Is scholar-evaluation safe to install?
Check this page’s Security Audits panel for risk level and audit results before installing the skill into an agent that handles unpublished manuscripts.
SKILL.md
# ScholarEval Evaluation Framework

## Overview

This document provides detailed evaluation criteria, rubrics, and quality indicators for each dimension of the ScholarEval framework. Use these standards when conducting systematic evaluations of scholarly work.

---

## Dimension 1: Problem Formulation & Research Questions

### Quality Indicators

**Excellent (5):**
- Research question is specific, measurable, and clearly articulated
- Problem addresses significant gap in literature with high impact potential
- Scope is appropriate and feasible within constraints
- Novel contribution is clearly differentiated from existing work
- Theoretical or practical significance is compellingly justified

**Good (4):**
- Research question is clear with minor ambiguities
- Problem is relevant with moderate impact potential
- Scope is generally appropriate with minor feasibility concerns
- Contribution is identifiable though not groundbreaking
- Significance is adequately justified

**Adequate (3):**
- Research question is present but lacks specificity
- Problem relevance is unclear or incremental
- Scope may be too broad or narrow
- Contribution is unclear or overlaps heavily with existing work
- Significance justification is weak

**Needs Improvement (2):**
- Research question is vague or poorly defined
- Problem lacks clear relevance or significance
- Scope is inappropriate or infeasible
- Contribution is not articulated
- No clear justification for significance

**Poor (1):**
- No clear research question
- Problem is trivial or irrelevant
- Scope is fundamentally flawed
- No identifiable contribution
- No significance justification

### Assessment Checklist

- [ ] Is the research question clearly stated?
- [ ] Can the question be answered with the proposed approach?
- [ ] Is the problem significant to the field?
- [ ] Is the scope feasible within resource constraints?
- [ ] Is the novelty/contribution clearly articulated?
- [ ] Are key assumptions explicitly stated?
- [ ] Are success criteria or expected outcomes defined?

---

## Dimension 2: Literature Review

### Quality Indicators

**Excellent (5):**
- Comprehensive coverage of relevant literature across key areas
- Critical synthesis identifying patterns, contradictions, and gaps
- Literature is current (majority from last 3-5 years for rapidly evolving fields)
- Sources are authoritative and peer-reviewed
- Clear positioning of current work within scholarly conversation
- Identifies genuine research gaps that the work addresses

**Good (4):**
- Good coverage with minor gaps in key areas
- Mostly synthesis with some description
- Literature is mostly current with some older foundational works
- Sources are generally authoritative
- Work positioning is present but could be stronger
- Research gaps are identified but may not be critical

**Adequate (3):**
- Partial coverage with notable gaps
- More descriptive summarization than synthesis
- Literature mix of current and dated sources
- Mix of authoritative and less rigorous sources
- Weak positioning within existing literature
- Research gaps are vague or questionable

**Needs Improvement (2):**
- Minimal coverage with major gaps
- Purely descriptive without synthesis
- Literature is largely outdated
- Sources lack authority or rigor
- Little to no positioning of current work
- No clear research gaps identified

**Poor (1):**
- Inadequate or absent literature review
- No synthesis
- Outdated or inappropriate sources
- No engagement with scholarly conversation
- No gap identification

### Assessment Checklist

- [ ] Does review cover all major relevant areas?
- [ ] Is literature synthesized rather than just summarized?
- [ ] Are sources current and authoritative?
- [ ] Are contrasting viewpoints presented?
- [ ] Are research gaps clearly identified?
- [ ] Is the current work positioned within existing literature?
- [ ] Is citation balance appropriate (not over-relying on few authors)?
- [ ] Are seminal/foundational works included?

###