
Paper Analyst
Turn an uploaded academic PDF or pasted paper text into a structured Chinese summary, critique, and optional slide-plan handoff for solo researchers and builders learning from papers.
Overview
Paper Analyst is an agent skill, most often used in the Idea phase (also Validate and Build), that turns academic papers and research PDFs into schema-backed summaries and optional presentation plans.
Install
npx skills add https://github.com/flyer-li/paper-analyst --skill paper-analyst
What is this skill?
- Classifies paper type via a dedicated rubric before structuring the write-up
- Follows a fixed output schema plus an anti-hallucination quality checklist
- Supports standard analysis and presentation modes with slide JSON and pptx handoff docs
- Optional `extract_pdf_meta.py` script for PDF metadata extraction
- Bilingual triggers including Chinese group-meeting and PPT-outline requests
- 7 reference files including output schema, paper-type rubric, quality checklist, and presentation handoff docs
- Default analysis mode is standard with optional presentation mode
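As a rough illustration of how the bilingual triggers and the `standard` default might interact, the sketch below maps a request string to a mode using the trigger phrases listed in the Mode Selection table of the skill's SKILL.md. The function name `detect_mode`, the plain substring matching, and the priority order (most specific mode first) are assumptions of this sketch, not the skill's documented behavior.

```python
# Trigger phrases copied from the Mode Selection table in SKILL.md.
# ASSUMPTION: the ordering below (most specific mode first) is a choice
# made for this sketch; the skill does not document a priority order.
MODE_TRIGGERS = [
    ("presentation_with_figures", ["图表", "figures", "带图", "关键图"]),
    ("presentation", ["PPT", "组会", "汇报大纲", "slides"]),
    ("extended", ["前作", "课题组", "prior work"]),
    ("quick", ["quick", "简单说", "一句话", "简要"]),
]


def detect_mode(request: str) -> str:
    """Return the first mode whose trigger occurs in the request.

    Hypothetical helper: falls back to 'standard' when nothing matches,
    mirroring the documented default mode.
    """
    text = request.lower()
    for mode, triggers in MODE_TRIGGERS:
        # Case-insensitive substring match; Chinese phrases are unaffected
        # by lower-casing.
        if any(trigger.lower() in text for trigger in triggers):
            return mode
    return "standard"


if __name__ == "__main__":
    for sample in ["帮我做一个组会汇报", "一句话总结这篇论文", "Summarize this paper"]:
        print(sample, "->", detect_mode(sample))
```

Plain substring matching is deliberately naive (for example, "quickly" would also select `quick`), which is one reason the skill's own guidance to fall back to `standard` and offer to switch when a request is ambiguous remains useful.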
Adoption & trust: 2 installs on skills.sh; 47 GitHub stars; 2/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).
What problem does it solve?
You have a dense research PDF and need a trustworthy structured breakdown without misreading methods, results, or claims.
Who is it for?
Solo builders and researchers who regularly ingest arXiv-style PDFs and want Chinese-first structured reports or group-meeting outlines.
Skip if: you are working with general business PDFs, spreadsheets, invoices, or any other non-academic document task, all of which the skill explicitly excludes.
When should I use this skill?
Use it when a user uploads or pastes a research paper and asks to analyze, summarize, critique, or prepare meeting/PPT material. Academic sources only.
What do I get? / Deliverables
You get a checklist-governed analysis (and optional slide plan) you can cite in notes or hand off to pptx rendering for a talk.
- Schema-compliant Chinese paper analysis report
- Optional presentation slide plan JSON per presentation schema
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Literature understanding is the canonical first stop in the solo journey: you explore what prior work says before committing to a product or technique. The Research subphase is where primary-source papers are consumed, and this skill is built explicitly for academic PDF analysis, not general documents.
Where it fits
Map a new CV paper’s contributions before choosing a model architecture for your side project.
Check whether a cited study actually supports the feature you plan to promise on a landing page.
Extract training and evaluation details from a paper while implementing a reproduction spike.
How it compares
Use it instead of asking the agent to freestyle-summarize papers without a schema, a paper-type rubric, or anti-hallucination checks.
Common Questions / FAQ
Who is paper-analyst for?
Indie builders, ML-curious founders, and grad-style researchers who need fast, structured understanding of one academic paper at a time.
When should I use paper-analyst?
During Idea research when surveying literature, in Validate when a paper backs your scope claims, and in Build when you need method details before implementing an approach—always for academic PDFs or pasted paper text, not generic documents.
Is paper-analyst safe to install?
Review the Security Audits panel on this Prism page and treat uploaded PDFs as sensitive; the skill is read/analyze oriented but your files pass through the agent environment.
SKILL.md
# Paper Analyst

Analyze academic papers from PDF or pasted text. Output in Chinese by default.

All outputs follow `references/output-schema.md`. Paper type detection uses `references/paper-type-rubric.md`. Anti-hallucination rules are in `references/quality-checklist.md`.

## Quick Reference

| File | Purpose |
|------|---------|
| `references/output-schema.md` | Section structure and field rules |
| `references/paper-type-rubric.md` | How to classify paper type |
| `references/quality-checklist.md` | Anti-hallucination checklist |
| `references/presentation-schema.md` | Slide plan JSON schema |
| `references/presentation-style-guide.md` | Content compression rules for slides |
| `references/pptx-handoff.md` | How to call the pptx skill for rendering |
| `scripts/extract_pdf_meta.py` | Optional: extract PDF metadata to JSON |

## Mode Selection

Default mode: `standard`. Detect the mode from the user's request:

| Mode | Trigger | Output |
|------|---------|--------|
| `quick` | "quick", "简单说", "一句话", "简要" | Header + info + abstract + 3 contributions |
| `standard` | (default) | Full analysis: sections 1–5 |
| `extended` | "前作", "课题组", "prior work" | standard + author/group prior work |
| `presentation` | "PPT", "组会", "汇报大纲", "slides" | standard + slide outline |
| `presentation_with_figures` | "图表", "figures", "带图", "关键图" | presentation + figure annotations |

If ambiguous, use `standard` and offer to switch.

## Workflow

### Step 1: Assess Input Quality

Classify PDF quality before analysis:

- **良好** (good): Full text extractable
- **降级处理** (degraded): Partial text, scanned sections, garbled encoding
- **严重降级** (severely degraded): Minimal text, image-only PDF

If degraded: state the reason in the header line, proceed with the available content, and mark all gaps explicitly. Never fabricate content to fill gaps.

Optional: if the user has Python, suggest running `scripts/extract_pdf_meta.py` first for structured metadata.

### Step 2: Classify Paper Type

Read `references/paper-type-rubric.md` and classify. Do NOT assume AI/ML. Output the type label and 2–3 evidence indicators before proceeding.

### Step 3: Execute Analysis

Follow `references/output-schema.md` for the selected mode. Apply all rules from `references/quality-checklist.md` throughout every section.

### Step 4: Self-Check Before Output

Verify before finalizing:

- Every uncertain field marked `[不确定]` or `[未明确给出]`
- Every contribution tagged `[原文声明]` or `[模型归纳]`
- No section silently omitted; skipped sections state why
- Paper type label matches rubric evidence

## Anti-Hallucination Rules

Full rules in `references/quality-checklist.md`. Non-negotiable constraints:

1. **Source tagging**: `[原文声明]` = directly stated in paper (cite location); `[模型归纳]` = inferred by model (state reasoning basis)
2. **Uncertainty**: `[未明确给出]` when absent; `[不确定]` when ambiguous
3. **No domain assumption**: classify paper type first, always
4. **No fabrication**: venue, DOI, year, affiliations not in text → `[未明确给出]`
5. **Evidence binding**: each contribution must cite section/figure/table/quote
6. **Degraded PDF**: state which sections were unreadable; do not fill gaps

## Degraded Input Fallback

| Situation | Action |
|-----------|--------|
| Only abstract available | `quick` mode, note limitation |
| Scanned PDF, no text | Ask user for text or OCR first |
| Missing references section | Skip prior work analysis, note absence |
| Figures unreadable | Skip figure