
Pdf Converter
Turn PDFs, scans, and Office files into Markdown so your agent can search, summarize, or reuse the text in specs and docs.
Overview
pdf-converter is an agent skill most often used in Build (also Idea research, Grow content) that converts PDFs and Office documents to Markdown via MinerU so your agent can read, OCR, and rework the content.
Install
npx skills add https://github.com/tanis90/pdf-converter-mineru --skill pdf-converter
What is this skill?
- MinerU Open API CLI converts PDF, images, DOCX, PPTX, and Excel to clean Markdown without an API key for basic use
- Supports export to Word, HTML, LaTeX, and plain text plus OCR for scanned documents in 80+ languages
- Two-step workflow: extract with mineru-open-api, then summarize tables, findings, or sections from the resulting Markdown
- Use -o for saved files on batch or conversion jobs; read stdout when answering questions about a single PDF in chat
- Replies in the same language the user uses per the skill’s language rule
- 80+ languages supported for OCR and conversion
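As a rough illustration of the `-o` versus stdout choice described above, the sketch below prints the command that would be run for each intent instead of executing it. The function name `flash_extract_cmd` and the `save`/`read` intent labels are hypothetical and not part of the MinerU CLI; only `mineru-open-api flash-extract` and `-o` come from the skill itself.

```bash
# Hypothetical helper (an assumption, not part of the MinerU CLI):
# print the flash-extract command that matches the user's intent.
#   save -> persistent output via -o (conversion, batch jobs)
#   read -> stdout only (summarization, Q&A in chat)
flash_extract_cmd() {
  local intent="$1" input="$2" out_dir="${3:-./output/}"
  if [ "$intent" = "save" ]; then
    printf 'mineru-open-api flash-extract %s -o %s\n' "$input" "$out_dir"
  else
    printf 'mineru-open-api flash-extract %s\n' "$input"
  fi
}

flash_extract_cmd save report.pdf   # conversion or batch job
flash_extract_cmd read report.pdf   # question about a single PDF
```

Printing the command rather than running it keeps the sketch usable without the CLI installed; an agent would run the printed line once the intent is clear.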
Adoption & trust: 4.4k installs on skills.sh; 31 GitHub stars; 1/3 security scanners passed (skills.sh audits).
What problem does it solve?
You have PDFs, scans, or slide decks your agent cannot parse, so tables and narrative stay locked outside your repo and chat context.
Who is it for?
Solo builders who regularly ingest technical PDFs, academic papers, scanned forms, or Office exports into agent workflows or project documentation.
Skip if: Teams that only need native PDF viewing in a browser with no extraction, or workflows that require guaranteed legal-grade OCR certification without human review.
When should I use this skill?
The user wants to convert, extract, read, parse, or summarize any PDF or document; shares a PDF file or link; needs tables or formulas extracted; wants PDF OCR; or says "turn this into a doc" or "what does this paper say".
What do I get? / Deliverables
You get clean Markdown (or other formats) from MinerU output so you can summarize, quote, or edit the document in the same session without manual copy-paste.
- Markdown file or stdout text from converted document
- Optional Word, HTML, LaTeX, or plain text exports when requested
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Document conversion is most often shelved under Build → docs because solo builders ingest manuals, RFCs, and product PDFs while writing documentation and agent context, rather than at one earlier stage. The docs subphase matches persistent Markdown output for READMEs, internal wikis, and RAG corpora rather than one-off competitive slides.
Where it fits
Convert a competitor’s PDF whitepaper to Markdown before listing feature gaps in a validation note.
Extract API reference chapters from a vendor PDF into repo docs for your integration skill.
OCR a scanned one-pager from a design partner so scope bullets cite real constraints.
Pull narrative sections from a long report PDF to draft a launch blog outline.
How it compares
Use for CLI-based document extraction instead of asking the model to hallucinate PDF contents from filenames alone.
Common Questions / FAQ
Who is pdf-converter for?
Indie and solo builders using Claude Code, Cursor, or similar agents who need PDFs and Office files turned into editable Markdown for specs, research, or content reuse.
When should I use pdf-converter?
Use it during Build when writing docs from source PDFs, during Idea research when parsing competitor or market PDFs, and during Grow when repurposing long reports into content—whenever the user shares a file, link, or asks to convert, OCR, or summarize a document.
Is pdf-converter safe to install?
Review the Security Audits panel on this Prism page for the ingested package signals; the skill runs external CLI conversion and may read local files or call network APIs per MinerU—treat untrusted PDFs like any downloaded binary.
SKILL.md
# Document to Markdown

Convert PDF, images, Office docs, and more to clean Markdown using the MinerU Open API CLI. No API key needed for basic use.

## Language Rule

Reply to the user in the SAME language they use. This is non-negotiable.

## Core Workflow

Extraction is often just the first step. The typical flow is:

1. **Extract**: Use `mineru-open-api` to convert the document to Markdown
2. **Read & Process**: Help the user with what they actually need

MinerU outputs raw Markdown; it doesn't interpret or restructure the content. If the user asks to "extract the tables", "summarize the paper", or "find the key findings", you need to read the output and do that work yourself. MinerU handles the OCR and layout; you handle the understanding.

Use `-o` to save to a file when the user wants persistent output (conversion, batch processing). Skip `-o` and read stdout directly when the content is consumed immediately (summarization, Q&A).

For example:

- "Convert this PDF to Markdown for me" → use `-o` to save to file, done
- "Extract the tables from this paper" → use `-o` to save, then read the file and pull out the tables
- "What is this paper about?" → stdout is fine, read the output directly and summarize
- "Compile the references from the PDF" → stdout or `-o`, then parse the references section

### Page Range Extraction Rule

When `--pages` is used with `-o` pointing to a **directory**, the CLI derives the output filename solely from the input file name. This means multiple page-range extracts of the same file will overwrite each other.

**CRITICAL**: You MUST avoid this by converting the output path to an **explicit file path** that includes the page range.
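The naming rule above can be sketched as a small shell helper that derives an explicit output path from the input file and page range. `chunk_output_path` is a hypothetical name introduced for illustration and is not part of the MinerU CLI; the default `./out` directory is also an assumption.

```bash
# Hypothetical helper (an assumption, not part of the MinerU CLI):
# build an explicit output file path that embeds the page range.
chunk_output_path() {
  local input="$1" range="$2" out_dir="${3:-./out}"
  local stem
  stem="$(basename "$input")"   # drop any leading directories
  stem="${stem%.*}"             # drop the extension (report.pdf -> report)
  printf '%s/%s_p%s.md\n' "${out_dir%/}" "$stem" "$range"
}

chunk_output_path report.pdf 1-20 ./out/    # ./out/report_p1-20.md
chunk_output_path docs/report.pdf 21-40     # ./out/report_p21-40.md
```

The printed path can then be passed to `-o`, so that each page-range extract of the same document gets its own file.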
```bash
# ❌ WRONG: same file overwrites itself
mineru-open-api flash-extract report.pdf --pages 1-20 -o ./out/
mineru-open-api flash-extract report.pdf --pages 21-40 -o ./out/

# ✅ CORRECT: unique filenames per chunk
mineru-open-api flash-extract report.pdf --pages 1-20 -o ./out/report_p1-20.md
mineru-open-api flash-extract report.pdf --pages 21-40 -o ./out/report_p21-40.md
```

Whenever the user asks to split a document by page ranges (e.g., "extract pages 1-20", "split into chunks"), always generate `-o` as an exact file path with the `_p{range}` suffix.

| User says | You generate |
|---|---|
| "Split report.pdf into multiple files of 20 pages each" | `-o ./out/report_p1-20.md`, `-o ./out/report_p21-40.md`... |
| "extract pages 1-10 and 11-20" | `-o ./out/report_p1-10.md`, `-o ./out/report_p11-20.md` |

## Two Extraction Modes

### flash-extract - Fast, no auth

Best for quick reads. No API key, no setup.

```bash
mineru-open-api flash-extract report.pdf                                           # to stdout (for immediate consumption)
mineru-open-api flash-extract report.pdf -o ./output/                              # save to file
mineru-open-api flash-extract report.pdf --pages 1-10 -o ./output/report_p1-10.md  # page range (explicit file path)
mineru-open-api flash-extract report.pdf -o ./output/ --language en                # language hint
mineru-open-api flash-extract https://example.com/paper.pdf                        # URL input
```

**Supports:** PDF, images (PNG, JPG, WebP...), DOCX, PPTX, Excel (XLS, XLSX)

**Limits:** 10 MB / 20 pages per document

**Output:** Markdown only; images, tables, and formulas may become placeholders

Use flash-extract as the default unless the user needs more.

### extract - Precision, auth required

Use when the user needs full-fidelity output: preserved images, accurate tables, LaTeX formulas, or non-Markdown formats. Requ