
MinerU
Parse PDFs, office files, and images into structured text through MinerU so your agent can ingest specs, decks, and scans into build workflows.
Overview
MinerU is an MCP server for the Build phase that lets agents parse PDFs, office documents, and images via the MinerU API with OCR and batch jobs.
What is this MCP server?
- MinerU parsing for PDFs, images, DOCX, and PPTX files, with OCR for scanned content
- Batch processing support for multi-file ingestion jobs
- npm package mineru-mcp version 1.1.3 with stdio MCP transport
- Agent-callable document API instead of manual copy-paste from PDFs
- GitHub repository linxule/mineru-mcp as MCP wrapper around MinerU capabilities
Community signal: 6 GitHub stars.
What problem does it solve?
Agents stall when your requirements live in PDFs and slides because manual copy-paste loses structure and OCR is tedious at volume.
Who is it for?
Indie builders automating spec ingestion, knowledge bases, or client document workflows inside MCP-driven development.
Skip if: your team works only in plain Markdown repos with no PDF or office file inputs, or your workflow requires certified legal document processing, which this server does not provide.
What do I get? / Deliverables
After you add MinerU MCP, your agent can request structured parses and batch document jobs to feed downstream build and RAG steps.
- Structured text extractions from PDF, image, DOCX, and PPTX inputs
- Batch-parse orchestration callable from the agent via MCP tools
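To make the batch idea concrete, here is a minimal sketch of how an agent-side helper might group local files into batches before handing them to the server. It is not part of mineru-mcp: the function name `plan_batches`, the default batch size, and the exact image extensions are assumptions for illustration; only the PDF, DOCX, PPTX, and image categories come from the server description.

```python
from pathlib import Path
from typing import Iterable, Iterator

# File categories named in the server description (PDF, DOCX, PPTX, images).
# The specific image extensions listed here are an assumption.
SUPPORTED_SUFFIXES = {".pdf", ".docx", ".pptx", ".png", ".jpg", ".jpeg"}


def plan_batches(paths: Iterable[str], batch_size: int = 10) -> Iterator[list[str]]:
    """Yield lists of supported file paths, at most batch_size per list.

    Hypothetical helper: files with unsupported extensions are skipped.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: list[str] = []
    for p in paths:
        # Compare extensions case-insensitively so "SPEC.PDF" is accepted.
        if Path(p).suffix.lower() not in SUPPORTED_SUFFIXES:
            continue
        batch.append(p)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch
```

Each yielded list could then be passed to whichever batch tool the server exposes; check the server's tool list for the actual tool name and argument shape.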
Journey fit
Document ingestion is a build-time integration when agents and apps need machine-readable content from heterogeneous files before features ship. MinerU sits as an MCP-callable document API bridge alongside other third-party services in the agent toolchain.
How it compares
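MinerU is a document-parsing API exposed over MCP. It is not a local-only tool such as a git search server, and it is unrelated to creative tools such as music composition servers.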
Common Questions / FAQ
Who is MinerU for?
MinerU is for builders using MCP agents who must extract text from PDFs, DOCX, PPTX, and images through the MinerU parsing service during product development.
When should I use MinerU?
Use it when integrating document ingestion into build workflows—RAG indexing, spec extraction, or batch OCR—before you rely on the content in code or prompts.
How do I add MinerU to my agent?
Configure MinerU API credentials per upstream MinerU docs, install mineru-mcp from npm, and register io.github.linxule/mineru with stdio transport in your MCP client.
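As a rough sketch of the registration step, the script below builds a stdio server entry in the `mcpServers` layout that several MCP clients read from their JSON config files. Several details are assumptions rather than documented behavior: launching the package through `npx`, the environment variable name `MINERU_API_KEY`, and the entry label `mineru`. Confirm the real credential variable and launch command in the upstream mineru-mcp README before using it.

```python
import json


def build_mcp_config(api_key: str) -> dict:
    """Return a client config dict with one stdio entry for mineru-mcp.

    Assumptions: the server is started with npx, and the credential is
    passed as MINERU_API_KEY (hypothetical variable name).
    """
    return {
        "mcpServers": {
            # "mineru" is an arbitrary local label for this server entry.
            "mineru": {
                "command": "npx",
                # Pin the version mentioned in the listing.
                "args": ["-y", "mineru-mcp@1.1.3"],
                "env": {"MINERU_API_KEY": api_key},
            }
        }
    }


if __name__ == "__main__":
    # Print JSON that can be merged into the client's config file.
    print(json.dumps(build_mcp_config("<your-api-key>"), indent=2))
```

Generating the entry from a script keeps the API key out of hand-edited files; the printed JSON can be merged into whatever config file your MCP client uses.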