
Systematic Literature Review
Run a reproducible systematic literature review pipeline—search, abstract enrichment, BibTeX, and LaTeX-friendly references—for Chinese academic writing with your agent.
Overview
Systematic Literature Review is an agent skill for the Idea phase that automates searchable, configurable literature review steps and LaTeX-ready references for Chinese research workflows.
Install
npx skills add https://github.com/huangwb8/chineseresearchlatex --skill systematic-literature-review
What is this skill?
- Multi-source search and abstract enrichment with config-driven defaults instead of silent hard-coded queries
- OpenAlex integration with cache-aware requests and safer handling of missing abstract indexes
- BibTeX generation with Unicode control-character sanitization to reduce LaTeX missing-character warnings
- Reference selection thresholds tied to config.yaml for consistent min-abstract and enrichment rules
- Validation utilities for citation distribution and reproducible retrieval behavior
- BibTeX Unicode sanitization validated with 14 automated test cases in upstream changelog notes
- Reference selection reads its default abstract-length threshold from search.abstract_enrichment.min_abstract_chars in config.yaml
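To make the OpenAlex and threshold bullets above concrete, here is a minimal sketch of null-safe abstract handling. OpenAlex delivers abstracts as an `abstract_inverted_index` (a mapping from each word to its positions), and the field can be null. The function names and the fallback value of 80 characters are assumptions for illustration, not the skill's actual code or its configured default.

```python
from typing import Optional

# Hypothetical fallback; the skill takes its real threshold from
# config.yaml (search.abstract_enrichment.min_abstract_chars).
MIN_ABSTRACT_CHARS = 80


def rebuild_abstract(inverted_index: Optional[dict]) -> str:
    """Turn an OpenAlex inverted index ({word: [positions]}) into plain text."""
    if not inverted_index:  # covers None (JSON null) and an empty dict
        return ""
    positioned = [
        (pos, word)
        for word, positions in inverted_index.items()
        for pos in positions
    ]
    # Sorting by position restores the original word order.
    return " ".join(word for _, word in sorted(positioned))


def needs_enrichment(work: dict, min_chars: int = MIN_ABSTRACT_CHARS) -> bool:
    """True when the abstract is missing or shorter than the threshold."""
    abstract = rebuild_abstract(work.get("abstract_inverted_index"))
    return len(abstract) < min_chars
```

Using one threshold for both the enrichment decision and reference selection keeps the two steps from disagreeing about which abstracts count as too short.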
Adoption & trust: 805 installs on skills.sh; 2.2k GitHub stars; 1/3 security scanners passed (skills.sh audits).
What problem does it solve?
You need a defensible systematic review and clean BibTeX for LaTeX, but manual OpenAlex/Google workflows drift, break on Unicode, and are hard to reproduce.
Who is it for?
Grad students, academic indie researchers, or builders documenting AI or domain surveys who already use LaTeX and want agent-driven, script-backed retrieval.
Skip if: Quick blog posts with three informal links, or product validation that only needs five competitor landing pages—not formal SLR methods.
When should I use this skill?
You are conducting a systematic or semi-systematic literature review for LaTeX output and need scripted search, enrichment, BibTeX, and citation checks—not ad-hoc chat lists.
What do I get? / Deliverables
You end with cached, config-aligned search runs, enriched abstracts, sanitized BibTeX, and validation checks you can rerun for the same research question.
- Curated reference sets and BibTeX suitable for LaTeX
- Cached search/enrichment runs you can replay
- Citation distribution or validation reports when those scripts are used
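The sanitized BibTeX deliverable relies on stripping invisible Unicode characters before fields reach LaTeX. The sketch below shows one way to do that with the standard library by dropping the control (Cc) and format (Cf) categories. The name `sanitize_unicode` is hypothetical and this is not the upstream implementation, only an illustration of the idea.

```python
import unicodedata


def sanitize_unicode(text: str) -> str:
    """Drop control (Cc) and format (Cf) characters; keep everything else.

    Note: Cc also covers tab and newline, so this assumes single-line
    fields such as titles, venues, and author names.
    """
    return "".join(
        ch for ch in text
        if unicodedata.category(ch) not in {"Cc", "Cf"}
    )


# Author name wrapped in directional marks (U+202C, U+200E), the kind of
# invisible characters that trigger LaTeX "Missing character" warnings.
raw_author = "\u202cZhang Wei\u200e"
clean_author = sanitize_unicode(raw_author)
```

Letters with diacritics and CJK characters fall in other Unicode categories, so they pass through unchanged.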
Journey fit
Systematic reviews belong in Idea when you are still framing evidence and sources before you commit product or thesis direction. Research subphase is the canonical shelf for literature search, citation hygiene, and scholarly source selection—not for shipping code.
How it compares
A structured research automation skill with Python tooling, not a general web-search MCP or a one-shot summarize prompt.
Common Questions / FAQ
Who is systematic-literature-review for?
Solo researchers and small teams producing Chinese LaTeX papers or evidence-heavy docs who want reproducible search, enrichment, and bibliography pipelines driven by an agent.
When should I use systematic-literature-review?
Use it in the Idea research phase when defining a review protocol, pulling OpenAlex-backed sources, enriching abstracts, and generating BibTeX before you write or validate claims.
Is systematic-literature-review safe to install?
It performs network calls to scholarly APIs; check the Security Audits panel on this page and review scripts and API keys before running on sensitive machines.
SKILL.md
SKILL.md - Systematic Literature Review
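# Changelog

All notable changes to the systematic-literature-review skill will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

### Fixed (controllability and reproducibility of search and abstract enrichment - 2026-01-25)

- `multi_query_search.py`: no longer silently falls back to a hard-coded query when no query is provided; it now fails with an error (so an unrelated topic is never run by accident)
- `openalex_search.py`: abstract enrichment defaults follow `config.yaml` and can be explicitly overridden from the CLI; enrichment requests reuse `--cache-dir`
- `multi_source_abstract.py`: enrichment requests now go through the `api_cache.py` cache, reducing duplicate requests and rate-limit risk; fixed a crash caused by OpenAlex returning `abstract_inverted_index=null`
- `select_references.py`: the abstract-length threshold defaults to `config.yaml:search.abstract_enrichment.min_abstract_chars`, so the enrichment decision and the selection-time exclusion use the same criterion

### Fixed (BibTeX Unicode control-character sanitization - 2026-01-03)

🧹 **Bug fix**: resolves the "Missing character" warnings raised during LaTeX compilation

- **Background** (breast-test-05 case):
  - The LaTeX build log showed 31 "Missing character" warnings
  - Unicode control characters involved: U+202C (POP DIRECTIONAL FORMATTING), U+200E (LEFT-TO-RIGHT MARK)
  - Source: author names returned by the OpenAlex API contained directional control characters
- **Solution**:
  - Added the `_sanitize_unicode()` function (`build_reference_bib_from_papers.py`, lines 24-48)
  - Removes Unicode control characters (categories Cc and Cf) while keeping normal characters and special academic characters
  - `_to_ref()` now calls the sanitizer on title, venue, and authors
- **Testing** (test/AUTOv202601030646):
  - All 14 test cases pass
  - Real-data test: the problematic strings from breast-test-05 are sanitized successfully
- **Impact**:
  - ✅ Eliminates Unicode character warnings during LaTeX compilation
  - ✅ Preserves special academic characters (such as ǹ, ę, and Chinese)
  - ✅ Backward compatible: the existing BibTeX generation flow is unaffected

---

### Fixed (validate_citation_distribution.py SyntaxWarning - 2026-01-03)

🔧 **Bug fix**: fixes a SyntaxWarning on Python 3.12+

- **Background**:
  - Running the script produced `SyntaxWarning: invalid escape sequence '\c'`
  - Cause: an unescaped `\cite` in a docstring
- **Solution**:
  - Changed `\cite` to `\\cite` in the docstring on line 28
- **Testing**:
  - `python3 -W error -c "import scripts.validate_citation_distribution"` produces no warnings
- **Impact**:
  - ✅ Eliminates the SyntaxWarning
  - ✅ The script runs normally under `-W error`

---

### Changed (transparent score-distribution statistics for reference selection - 2026-01-03)

📊 **Enhancement**: adds detailed score-distribution statistics to `selection_rationale.yaml`

- **Background** (breast-test-05 case):
  - `high_score_bucket: 196` is easily misread as "the number of high-scoring papers"
  - It actually means "the number of papers in the top 70% after sorting by score"
  - Users had no direct view of the actual score distribution of the selected papers
- **Solution**:
  - Added `score_distribution` statistics to the `_select_papers()` function
  - New fields:
    * `high_score_count`: number of high-scoring (≥7) papers
    * `mid_score_count`: number of mid-scoring (4-6.9) papers
    * `low_score_count`: number of low-scoring (<4) papers
    * `max_score`, `min_score`, `avg_score`: score range and mean
- **Example output** (after the change):

  ```yaml
  total_candidates: 279
  selected: 90
  high_score_fraction_used: 0.7
  high_score_bucket: 196  # kept for backward compatibility
  min_refs: 50
  max_refs: 90
  score_distribution:
    high_score_count: 32
    mid_score_count: 34
    low_score_count: 24
    max_score: 9.4
    min_score: 2.0
    avg_score: 6.13
  ```

- **Testing** (test/AUTOv202601030646):
  - All 3 test cases pass
  - Backward compatibility verified
- **Impact**:
  - ✅ Selection rationale is more transparent; users can see the score distribution directly
  - ✅ Backward compatible: the `high_score_bucket` field is retained
  - ✅ Easier debugging and quality assessment

---

### Added (cost tracking system - AI-driven price fetching and token statistics - 2026-01-02)

💰 **New feature**: adds a fully optional token-usage and cost-tracking system that helps users understand the AI cost of a review project.

- **Core features**:
  - **Single-file architecture**: all functionality lives in `scripts/pipeline_cost.py`
  - **AI-driven price fetching**: the AI looks up official prices online (OpenAI, Anthropic, Zhipu Qingyan)
  - **Per-project data isolation**: each review project keeps its own records
  - **Zero-intrusion design**: the core literature review workflow is unaffected
- **AI-driven price-fetching flow**:
  1. The user runs: `python3 scripts/pipeline_cost.py fetch-prices`
  2. The AI automatically:
     - Uses the WebSearch tool to look up official pricing
     - Extracts accurate price information from the official sites
     - Generates YAML
     - Saves it to `scripts/pipeline_cost.yaml`
  3. The file is copied automatically into the current project: `.systematic-literature-review/cost/price_config.yaml`
- **Fetched price data** (14 models in total, fetched 2026-01-02):

  **OpenAI models**:

  | Model | Input price | Output price | Currency |
  |------|----------|----------|------|
  | GPT-5.2 | $1.75/1M | $14.00/1M | USD |
  | GPT-5 Mini | $0.25/1M | $2.00/1M | USD |
  | GPT-4o | $2.50/1M | $10.00/1M | USD |
  | GPT-4o Mini | $0.15/1M | $0.60/1M | USD |
  | O1 | $15.00/1M | $60.00/1M | USD |
  | O3 | $2.00/1M | $8.00/1M | USD |

  **Anthropic models**:

  | Model | Input price | Output price | Currency |
  |------|----------|----------|------|
  | Claude Opus 4.5 | $5.00/1M | $25.00/1M | USD |
  | Claude Sonnet 4.5 | $3.00/1M | $15.00/1M | USD |
  | Claude Haiku 4.5 | $1.00/1M | $5.00/1M | USD |

  **Zhipu Qingyan models**:

  | Model | Input price | Output price | Currency |
  |------|---------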