
Invoice Processor
Turn batches of Chinese invoice PDFs or images into structured Excel rows using GLM vision without manual data entry.
Overview
Invoice Processor is an agent skill for the Operate phase that converts Chinese invoice PDFs and images into Excel-ready rows using Zhipu GLM vision recognition.
Install
npx skills add https://github.com/jst-well-dan/skill-box --skill invoice-processor
What is this skill?
- End-to-end flow from PDF/JPG/PNG inputs to formatted Excel exports via AI vision
- Triggers on 发票 / invoice / 处理发票 / 发票转Excel style requests
- Requires .env GLM_API_KEY and documents non-ASCII path pitfalls with full paths under .claude/skills/
- Includes environment check scripts and cross-platform shell guidance (ls not dir)
- Designed for Zhipu BigModel GLM vision API integration
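The full-path rule in the list above can be sketched in Python. This is a minimal illustration, not code from the skill: `SKILL_ROOT` and `skill_script` are hypothetical names, and only the `.claude/skills/invoice-processor/scripts/` layout comes from the skill's documentation.

```python
from pathlib import PurePosixPath

# Layout taken from the skill's documentation; the names here are hypothetical.
SKILL_ROOT = PurePosixPath(".claude/skills/invoice-processor")


def skill_script(name: str) -> list[str]:
    """Build an argv list that calls a skill script by its full relative path."""
    script = SKILL_ROOT / "scripts" / name
    return ["python", str(script)]


print(skill_script("check_env.py"))
```

The resulting list could be passed to `subprocess.run(..., check=True)` from the project root, so the short `scripts/...` form never appears in the command.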
Adoption & trust: 1 install on skills.sh; 8 GitHub stars; 3/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).
What problem does it solve?
You have stacks of 发票 files and need structured spreadsheet data without hiring manual entry or building a custom OCR pipeline from scratch.
Who is it for?
Indie founders or ops leads who already use Claude Code-style agents and want repeatable 发票-to-Excel batches with a GLM API key.
Skip if: You need on-prem OCR with no cloud vision API, you only handle non-Chinese invoice schemas, or you will not store GLM_API_KEY in a local .env.
When should I use this skill?
Users mention 发票, invoice, 处理发票, 识别发票, 提取发票, or need invoice files converted to Excel.
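As a rough illustration of the trigger list above, the following sketch checks a message for those phrases. How an agent actually decides to load a skill depends on the agent runtime, so `mentions_invoice` is a hypothetical helper, not part of the skill.

```python
# Trigger phrases listed for the skill; the matcher itself is hypothetical.
TRIGGERS = ("发票", "invoice", "处理发票", "识别发票", "提取发票")


def mentions_invoice(message: str) -> bool:
    """Return True if the message contains any trigger phrase (case-insensitive)."""
    lowered = message.lower()
    return any(phrase in lowered for phrase in TRIGGERS)
```

Because every Chinese phrase in the list contains 发票, a plain substring test on that word plus "invoice" would behave the same; the full list is kept only to mirror the text.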
What do I get? / Deliverables
After the skill runs, the invoice fields extracted by GLM vision are written to a formatted Excel file, and the environment checks and path-safe script invocation are documented.
- Structured invoice data exported to Excel format
- Environment validation via check_env script pattern
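The environment check mentioned above can be approximated as follows. This is a sketch under assumptions, not the skill's actual `check_env.py`: it assumes `GLM_API_KEY` has already been loaded into the process environment, and the function name and messages are hypothetical. The package list (aiohttp, PyMuPDF, openpyxl) is the one the skill documents.

```python
import importlib.util
import os

# Package name -> import name (PyMuPDF is imported as `fitz`).
REQUIRED_MODULES = {"aiohttp": "aiohttp", "PyMuPDF": "fitz", "openpyxl": "openpyxl"}


def check_env(env=None):
    """Return a list of problems; an empty list means the environment looks ready."""
    env = os.environ if env is None else env
    problems = []
    # An empty or whitespace-only key counts as missing.
    if not env.get("GLM_API_KEY", "").strip():
        problems.append("GLM_API_KEY is not set")
    # find_spec returns None when a top-level module cannot be located.
    for package, module in REQUIRED_MODULES.items():
        if importlib.util.find_spec(module) is None:
            problems.append(f"missing package: {package}")
    return problems
```

Returning a list instead of exiting on the first failure lets a caller report every missing piece in one pass.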
Journey fit
The canonical shelf is Operate because invoice extraction is ongoing back-office work that follows shipping, not greenfield product design. It also fits Iterate: recurring document workflows that you refine as volume and formats change, rather than a one-off prototype.
How it compares
Use this procedural skill package instead of ad-hoc one-off vision prompts that skip env checks and Excel formatting steps.
Common Questions / FAQ
Who is invoice-processor for?
Solo builders and small teams automating Chinese invoice digitization inside an agent that can run Python scripts and hold a Zhipu GLM API key.
When should I use invoice-processor?
Use it in Operate when recurring 发票 files need Excel exports; also when a Build integrations task wires finance automation into your agent stack.
Is invoice-processor safe to install?
Review the Security Audits panel on this Prism page and treat GLM_API_KEY as a secret; the skill expects network calls to Zhipu and local filesystem access for invoices.
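One way to treat the key as a secret is to read it from `.env` and never log it in full. The helpers below are a hypothetical sketch; the skill's own scripts may load `.env` differently.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def mask(secret: str, visible: int = 4) -> str:
    """Hide all but the last few characters so the value is safer to log."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

For example, `mask(parse_env(open(".env", encoding="utf-8").read())["GLM_API_KEY"])` would yield a string that can appear in logs without exposing the full key.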
SKILL.md
SKILL.md - Invoice Processor
# Zhipu AI API configuration
# Get your key at: https://open.bigmodel.cn/
GLM_API_KEY=your_glm_api_key

---
name: invoice-processor
description: Automatically process invoices (发票) from PDFs/images to Excel spreadsheets using AI vision recognition. Use this skill when users mention "发票", "invoice", "处理发票", "识别发票", "提取发票", or need to convert invoice files to Excel format.
---

# Invoice Processor

Fully automated workflow for processing invoice files using AI vision models to extract structured information and generate formatted Excel reports.

## When to Use

**Auto-trigger when users mention:**
- "处理发票" / "识别发票" / "提取发票信息" / "发票转Excel"
- Processing invoice files (PDF, JPG, PNG)
- Converting invoices to Excel format

## Execution Environment Notes

**⚠️ CRITICAL: Path handling for non-ASCII directory names**

When the project directory contains Chinese or other non-ASCII characters (e.g., "发票助手agent"), you MUST use full relative paths from project root:

```bash
# ❌ WRONG - Will fail with encoding errors
python scripts/check_env.py

# ✅ CORRECT - Use full paths from .claude/skills/
python .claude/skills/invoice-processor/scripts/check_env.py
```

**Cross-platform compatibility:**
- Use Unix-style commands (Git Bash, Linux, macOS)
- ❌ `dir /b` → ✅ `ls`

## Setup

Create a `.env` file in the `invoice-processor` directory:

```bash
# Copy from template
cp .env.example .env

# Edit with your API key
# .env content:
GLM_API_KEY=your_actual_api_key_here
```

Get your API key from: https://open.bigmodel.cn/

## Workflow

### Step 1: Environment Check (Recommended)

```bash
python .claude/skills/invoice-processor/scripts/check_env.py
```

Verifies: GLM_API_KEY is set, required packages installed (aiohttp, PyMuPDF, openpyxl)

### Step 2: Recognize Invoices

```bash
# Default: process 'invoices' directory → 'invoice_results.json'
python .claude/skills/invoice-processor/scripts/invoice_ocr.py

# Custom paths
python .claude/skills/invoice-processor/scripts/invoice_ocr.py -i <input_path> -o <output.json>
```

**What it does:**
- Scans for invoice files (JPG, JPEG, PNG, PDF)
- Converts PDFs to images (200 DPI)
- Processes up to 5 files concurrently
- Extracts 9 fields: type, number, date, buyer/seller names, amounts (excl/incl tax), tax, items
- Saves to JSON with success/error status

**Arguments:**
- `-i, --input`: Input path (default: `invoices`)
- `-o, --output`: Output JSON (default: `invoice_results.json`)

**Prerequisites:**
- `.env` file with `GLM_API_KEY` (see Setup above)
- `pip install aiohttp PyMuPDF`

### Step 3: Generate Excel Report

```bash
# Default: 'invoice_results.json' → 'invoice_results.xlsx'
python .claude/skills/invoice-processor/scripts/convert_to_excel.py

# Custom paths
python .claude/skills/invoice-processor/scripts/convert_to_excel.py -i <input.json> -o <output.xlsx>
```

**What it does:**
- Reads JSON from Step 2
- Creates formatted Excel with 12 columns
- Auto-deletes input JSON after successful conversion

**Arguments:**
- `-i, --input`: Input JSON (default: `invoice_results.json`)
- `-o, --output`: Output Excel (default: `invoice_results.xlsx`)

**Prerequisites:**
- `pip install openpyxl`

## Usage Examples

### Basic (Non-ASCII directory names)

```bash
# Full 3-step workflow
python .claude/skills/invoice-processor/scripts/check_env.py
python .claude/skills/invoice-processor/scripts/invoice_ocr.py -i invoices -o invoice_results.json
python .claude/skills/invoice-processor/scripts/convert_to_excel.py -i invoice_results.json -o invoice_results.xlsx
```

### Custom Paths

```bash
python .claude/skills/invoice-processor/scripts/invoice_ocr.py -i invoices_2024 -o results_2024.json
python .claude/skills/invoice-processor/scripts/convert_to_excel.py -i results_2024.json -o report_2024.xlsx
```

## Troubleshooting

### Script Not Found / Encoding Errors

**Error:** `can't open file '...\��Ʊ����agent\scripts\...'`

**Cause:** Short paths (`scripts/`) fail in non-ASCII directories

**Solution:** Use full paths: `.cl