Systematic Literature Review

Name: Systematic Literature Review
Author: huangwb8

huangwb8/chineseresearchlatex

Run a reproducible systematic literature review pipeline—search, abstract enrichment, BibTeX, and LaTeX-friendly references—for Chinese academic writing with your agent.

Overview

Systematic Literature Review is an agent skill for the Idea phase that automates searchable, configurable literature review steps and LaTeX-ready references for Chinese research workflows.

Install

npx skills add https://github.com/huangwb8/chineseresearchlatex --skill systematic-literature-review

What is this skill?

Multi-source search and abstract enrichment with config-driven defaults instead of silent hard-coded queries
OpenAlex integration with cache-aware requests and safer handling of missing abstract indexes
BibTeX generation with Unicode control-character sanitization to reduce LaTeX missing-character warnings
Reference selection thresholds tied to config.yaml for consistent min-abstract and enrichment rules
Validation utilities for citation distribution and reproducible retrieval behavior
BibTeX Unicode sanitization validated by 14 automated test cases, per the upstream changelog
Reference workflows read their default abstract-length threshold from config.yaml (search.abstract_enrichment.min_abstract_chars)

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 805 installs on skills.sh; 2.2k GitHub stars; 1/3 security scanners passed (skills.sh audits).

What problem does it solve?

You need a defensible systematic review and clean BibTeX for LaTeX, but manual OpenAlex/Google workflows drift, break on Unicode, and are hard to reproduce.

Who is it for?

Grad students, academic indie researchers, or builders documenting AI or domain surveys who already use LaTeX and want agent-driven, script-backed retrieval.

Skip if: Quick blog posts with three informal links, or product validation that only needs five competitor landing pages—not formal SLR methods.

When should I use this skill?

You are conducting a systematic or semi-systematic literature review for LaTeX output and need scripted search, enrichment, BibTeX, and citation checks—not ad-hoc chat lists.

What do I get? / Deliverables

You end with cached, config-aligned search runs, enriched abstracts, sanitized BibTeX, and validation checks you can rerun for the same research question.

Curated reference sets and BibTeX suitable for LaTeX
Cached search/enrichment runs you can replay
Citation distribution or validation reports when those scripts are used

Journey fit

Primary fit

Idea: Opportunity & market research

Systematic reviews belong in Idea when you are still framing evidence and sources before you commit to a product or thesis direction. The Research subphase is the canonical shelf for literature search, citation hygiene, and scholarly source selection, not for shipping code.

How it compares

A structured research automation skill with Python tooling, not a general web-search MCP or a one-shot summarize prompt.

Common Questions / FAQ

Who is systematic-literature-review for?

Solo researchers and small teams producing Chinese LaTeX papers or evidence-heavy docs who want reproducible search, enrichment, and bibliography pipelines driven by an agent.

When should I use systematic-literature-review?

Use it in the Idea research phase when defining a review protocol, pulling OpenAlex-backed sources, enriching abstracts, and generating BibTeX before you write or validate claims.

Is systematic-literature-review safe to install?

It performs network calls to scholarly APIs; check the Security Audits panel on this page and review scripts and API keys before running on sensitive machines.

SKILL.md

# Changelog

All notable changes to the systematic-literature-review skill will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

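### Fixed (Controllability and reproducibility of search and abstract enrichment - 2026-01-25)

- `multi_query_search.py`: when no query is provided, it no longer silently falls back to a hard-coded query and raises an error instead (so an unrelated topic is not run by mistake)
- `openalex_search.py`: abstract enrichment now follows `config.yaml` by default and can be explicitly overridden from the CLI; enrichment requests reuse `--cache-dir`
- `multi_source_abstract.py`: enrichment requests now go through the `api_cache.py` cache, reducing duplicate requests and rate-limit risk; fixed a crash caused by OpenAlex returning `abstract_inverted_index=null`
- `select_references.py`: the abstract-length threshold now defaults to `config.yaml:search.abstract_enrichment.min_abstract_chars`, so the enrichment check and the selection-time exclusion use the same criterion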
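
The behaviour described in these fixes can be illustrated with a minimal Python sketch. It is not the skill's actual code: the function names, the `DEFAULT_MIN_ABSTRACT_CHARS` fallback, and the shape of the parsed config dict are assumptions made for illustration. It shows a CLI value taking precedence over the `config.yaml` default, and an OpenAlex-style `abstract_inverted_index` being rebuilt into text without crashing when the field is `null`.

```python
from typing import Any, Mapping, Optional

# Hypothetical fallback used only by this sketch; the skill's real default is not stated here.
DEFAULT_MIN_ABSTRACT_CHARS = 80


def resolve_min_abstract_chars(config: Mapping[str, Any],
                               cli_value: Optional[int] = None) -> int:
    """Prefer an explicit CLI value; otherwise read search.abstract_enrichment.min_abstract_chars."""
    if cli_value is not None:
        return cli_value
    section = (config.get("search") or {}).get("abstract_enrichment") or {}
    return int(section.get("min_abstract_chars", DEFAULT_MIN_ABSTRACT_CHARS))


def abstract_from_inverted_index(inverted_index: Optional[Mapping[str, list]]) -> str:
    """Rebuild plain text from a word -> positions mapping; None or empty yields ''."""
    if not inverted_index:
        return ""
    positioned = [(pos, word)
                  for word, positions in inverted_index.items()
                  for pos in positions]
    return " ".join(word for _, word in sorted(positioned))


def needs_enrichment(work: Mapping[str, Any], min_chars: int) -> bool:
    """A record needs enrichment when its rebuilt abstract is shorter than the threshold."""
    abstract = abstract_from_inverted_index(work.get("abstract_inverted_index"))
    return len(abstract) < min_chars


if __name__ == "__main__":
    config = {"search": {"abstract_enrichment": {"min_abstract_chars": 20}}}
    threshold = resolve_min_abstract_chars(config)
    work = {"abstract_inverted_index": None}  # the null case that used to crash
    print(threshold, needs_enrichment(work, threshold))
```

Using one resolver for both the enrichment step and the selection step is what keeps the two checks on the same threshold.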

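### Fixed (BibTeX Unicode control-character sanitization - 2026-01-03) 🧹

**Bug fix**: resolves the "Missing character" warnings emitted during LaTeX compilation

- **Background** (breast-test-05 case):
  - The LaTeX compile log showed 31 "Missing character" warnings
  - Unicode control characters involved: U+202C (POP DIRECTIONAL FORMATTING) and U+200E (LEFT-TO-RIGHT MARK)
  - Source: author names returned by the OpenAlex API contained directional control characters

- **Solution**:
  - Added the `_sanitize_unicode()` function (`build_reference_bib_from_papers.py`, lines 24-48)
  - It removes Unicode control characters (categories Cc and Cf) while keeping normal characters and special academic characters
  - `_to_ref()` now calls the sanitizer on title, venue, and authors

- **Test verification** (test/AUTOv202601030646):
  - All 14 test cases pass
  - Real-data test: the problematic strings from breast-test-05 were sanitized successfully

- **Impact**:
  - ✅ Eliminates the Unicode character warnings during LaTeX compilation
  - ✅ Preserves special academic characters (e.g. ǹ, ę, Chinese)
  - ✅ Backward compatible: the existing BibTeX generation flow is unaffected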
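
As a rough illustration of category-based sanitization, the sketch below drops every character whose Unicode general category is Cc or Cf. It is an assumption about how such a filter could look, not a copy of `_sanitize_unicode()`; the name `sanitize_unicode_sketch` is hypothetical.

```python
import unicodedata

# General categories to drop: Cc (control) and Cf (format, e.g. directional marks).
_DROPPED_CATEGORIES = {"Cc", "Cf"}


def sanitize_unicode_sketch(text: str) -> str:
    """Return text without Cc/Cf characters; letters, CJK, and diacritics are kept.

    Note: Cc also covers tab and newline, and Cf covers the soft hyphen and
    zero-width joiner, so this filter suits single-line metadata fields only.
    """
    return "".join(ch for ch in text
                   if unicodedata.category(ch) not in _DROPPED_CATEGORIES)


if __name__ == "__main__":
    raw_author = "Wang\u202c Li\u200e"  # directional marks of the kind seen in author names
    print(repr(sanitize_unicode_sketch(raw_author)))
    print(sanitize_unicode_sketch("ǹ ę 中文"))
```

Filtering by category rather than by a fixed list of code points also catches directional marks that were not seen in the original bug report.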

---

### Fixed (validate_citation_distribution.py SyntaxWarning - 2026-01-03) 🔧

**Bug fix**: fixes a SyntaxWarning on Python 3.12+

- **Background**:
  - Running the script produced `SyntaxWarning: invalid escape sequence '\c'`
  - Cause: `\cite` in a docstring was not escaped

- **Solution**:
  - Changed `\cite` to `\\cite` in the docstring on line 28

- **Test verification**:
  - `python3 -W error -c "import scripts.validate_citation_distribution"` runs without warnings

- **Impact**:
  - ✅ Eliminates the SyntaxWarning
  - ✅ The script runs normally under `-W error`
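
For reference, there are two common ways to write a LaTeX command inside a Python docstring without triggering this warning: escape the backslash, as the fix does, or use a raw docstring. The functions below are hypothetical examples, not code from `validate_citation_distribution.py`.

```python
def count_cites_escaped(text: str) -> int:
    """Count occurrences of \\cite in LaTeX source (backslash escaped)."""
    return text.count("\\cite")


def count_cites_raw(text: str) -> int:
    r"""Count occurrences of \cite in LaTeX source (raw docstring)."""
    return text.count(r"\cite")


if __name__ == "__main__":
    sample = r"See \cite{a} and \cite{b}."
    print(count_cites_escaped(sample), count_cites_raw(sample))
    # Both docstrings contain a literal backslash followed by "cite".
    print(count_cites_escaped.__doc__)
    print(count_cites_raw.__doc__)
```

Both forms produce the same docstring text. The raw form is often easier to read when a docstring mentions several LaTeX commands.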

---

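### Changed (Transparent score-distribution statistics for reference selection - 2026-01-03) 📊

**Enhancement**: adds detailed score-distribution statistics to `selection_rationale.yaml`

- **Background** (breast-test-05 case):
  - `high_score_bucket: 196` is easily misread as "the number of high-scoring papers"
  - It actually means "the number of papers in the top 70% after sorting by score"
  - Users had no direct view of the actual score distribution of the selected papers

- **Solution**:
  - Added `score_distribution` statistics to the `_select_papers()` function
  - New fields:
    * `high_score_count`: number of high-scoring papers (≥7)
    * `mid_score_count`: number of mid-scoring papers (4-6.9)
    * `low_score_count`: number of low-scoring papers (<4)
    * `max_score`, `min_score`, `avg_score`: score range and mean

- **Example output** (after the change):
  ```yaml
  total_candidates: 279
  selected: 90
  high_score_fraction_used: 0.7
  high_score_bucket: 196  # kept for backward compatibility
  min_refs: 50
  max_refs: 90
  score_distribution:
    high_score_count: 32
    mid_score_count: 34
    low_score_count: 24
    max_score: 9.4
    min_score: 2.0
    avg_score: 6.13
  ```

- **Test verification** (test/AUTOv202601030646):
  - All 3 test cases pass
  - Backward-compatibility check passes

- **Impact**:
  - ✅ Selection rationale is more transparent; users can see the score distribution directly
  - ✅ Backward compatible: the `high_score_bucket` field is retained
  - ✅ Easier debugging and quality assessment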
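
The statistics above could be computed along the following lines. This is a sketch under stated assumptions, not the body of `_select_papers()`: the function name is hypothetical, the mid band is taken to be 4 ≤ score < 7, and the average is rounded to two decimals to match the example output.

```python
from statistics import mean
from typing import Dict, Sequence, Union

Number = Union[int, float]


def score_distribution_sketch(scores: Sequence[float]) -> Dict[str, Number]:
    """Summarise the scores of the selected papers into the fields listed above."""
    if not scores:
        raise ValueError("scores must not be empty")
    return {
        "high_score_count": sum(s >= 7 for s in scores),     # high: score >= 7
        "mid_score_count": sum(4 <= s < 7 for s in scores),  # mid: 4 <= score < 7 (assumed)
        "low_score_count": sum(s < 4 for s in scores),       # low: score < 4
        "max_score": max(scores),
        "min_score": min(scores),
        "avg_score": round(mean(scores), 2),
    }


if __name__ == "__main__":
    selected_scores = [9.4, 7.0, 6.9, 4.0, 3.5, 2.0]  # illustrative values only
    for field, value in score_distribution_sketch(selected_scores).items():
        print(f"{field}: {value}")
```

The three counts always sum to the number of selected papers, which gives a quick consistency check against the `selected` field.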

---

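### Added (Cost tracking system - AI-driven price retrieval and token statistics - 2026-01-02) 💰

**New feature**: adds a fully optional token-usage and cost-tracking system that helps users understand the AI cost of a review project.

- **Core features**:
  - **Single-file architecture**: all functionality lives in `scripts/pipeline_cost.py`
  - **AI-driven price retrieval**: the AI looks up official prices online (OpenAI, Anthropic, Zhipu Qingyan)
  - **Project-level data isolation**: each review project is recorded independently
  - **Zero-intrusion design**: the core literature review pipeline is unaffected

- **AI-driven price retrieval flow**:
  1. The user runs: `python3 scripts/pipeline_cost.py fetch-prices`
  2. The AI automatically:
     - Uses the WebSearch tool to look up official pricing
     - Extracts accurate price information from the official sites
     - Generates YAML output
     - Saves it to `scripts/pipeline_cost.yaml`
  3. The file is copied automatically into the current project: `.systematic-literature-review/cost/price_config.yaml`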
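
Once per-million-token prices are available, turning token counts into a cost estimate is simple arithmetic. The sketch below is not taken from `pipeline_cost.py`: the `ModelPrice` class and `estimate_cost` function are hypothetical, and the price values are the Claude Sonnet 4.5 figures recorded in this changelog entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Prices in USD per one million tokens (hypothetical structure)."""
    input_per_million: float
    output_per_million: float


def estimate_cost(price: ModelPrice, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for the given token counts."""
    total = (input_tokens * price.input_per_million
             + output_tokens * price.output_per_million)
    return total / 1_000_000


if __name__ == "__main__":
    sonnet = ModelPrice(input_per_million=3.00, output_per_million=15.00)
    # 1,000,000 input tokens and 200,000 output tokens
    print(f"${estimate_cost(sonnet, 1_000_000, 200_000):.2f}")
```

Multiplying before dividing keeps the intermediate values exact for typical token counts, which avoids small floating-point drift in the totals.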

- **Price data retrieved** (14 models in total, retrieved 2026-01-02):

  **OpenAI models**:
  | Model | Input price | Output price | Currency |
  |------|----------|----------|------|
  | GPT-5.2 | $1.75/1M | $14.00/1M | USD |
  | GPT-5 Mini | $0.25/1M | $2.00/1M | USD |
  | GPT-4o | $2.50/1M | $10.00/1M | USD |
  | GPT-4o Mini | $0.15/1M | $0.60/1M | USD |
  | O1 | $15.00/1M | $60.00/1M | USD |
  | O3 | $2.00/1M | $8.00/1M | USD |

  **Anthropic models**:
  | Model | Input price | Output price | Currency |
  |------|----------|----------|------|
  | Claude Opus 4.5 | $5.00/1M | $25.00/1M | USD |
  | Claude Sonnet 4.5 | $3.00/1M | $15.00/1M | USD |
  | Claude Haiku 4.5 | $1.00/1M | $5.00/1M | USD |

  **Zhipu Qingyan models**:
  | Model | Input price | Output price | Currency |
  |------|---------
