
Space OCR
Wire multilingual document OCR into Claude Code or Cursor so agents return structured text with per-character bounding boxes instead of guessing layout from screenshots.
Overview
space-ocr is an MCP server for the Build phase that returns multilingual structured OCR with per-character bounding boxes for AI agents.
What is this MCP server?
- Multilingual structured OCR tuned for agent consumption
- Verified per-character bounding boxes for layout-aware extraction
- Distributed as the npm package space-ocr-mcp (v0.1.3), using stdio transport
- Requires a single secret, SPACE_OCR_API_KEY, from space-ocr.com Settings
- An MCP integration, not a one-off Python script in your repo
What problem does it solve?
Agents reading screenshots or scanned PDFs often misread characters and lose layout, which leaves you with automation you cannot trust downstream.
Who is it for?
Indie builders adding receipt, ID, or form extraction to agent workflows who need character-level layout, not just a plain string.
Skip if: you only need English plain text from clean digital PDFs and do not use MCP. Native PDF parsers may be enough.
What do I get? / Deliverables
After you register the server and API key, your agent can call OCR tools and work from structured text plus bbox metadata instead of raw pixels.
- Structured multilingual OCR payloads with per-character bboxes via MCP tools
- Agent-ready document text without custom OCR microservice code
- Repeatable stdio MCP wiring in your local dev setup
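To illustrate why per-character bboxes matter compared with a plain string, here is a minimal sketch of rebuilding text lines from character boxes. The `OcrChar` shape, the `[x, y, width, height]` box layout, and the top-left origin are assumptions made for illustration; the actual space-ocr payload may use different field names and coordinates.

```typescript
// Hypothetical shape: the real space-ocr payload may differ.
interface OcrChar {
  char: string;
  // [x, y, width, height] in pixels, origin at the top-left (assumed).
  bbox: [number, number, number, number];
}

// Vertical center of a character box.
function centerY(c: OcrChar): number {
  return c.bbox[1] + c.bbox[3] / 2;
}

// Group characters into text lines. A character joins the current line when
// its vertical center is within half its own height of the center of that
// line's first character; otherwise it starts a new line.
function groupIntoLines(chars: OcrChar[]): string[] {
  const sorted = [...chars].sort((a, b) => centerY(a) - centerY(b));
  const lines: OcrChar[][] = [];
  for (const c of sorted) {
    const last = lines[lines.length - 1];
    if (last && Math.abs(centerY(c) - centerY(last[0])) <= c.bbox[3] / 2) {
      last.push(c);
    } else {
      lines.push([c]);
    }
  }
  // Within each line, order characters left to right and join them.
  return lines.map((line) =>
    line
      .sort((a, b) => a.bbox[0] - b.bbox[0])
      .map((c) => c.char)
      .join("")
  );
}

// Characters arrive in arbitrary order; layout is recovered from the boxes.
const sample: OcrChar[] = [
  { char: "2", bbox: [12, 20, 10, 12] },
  { char: "B", bbox: [12, 1, 10, 12] },
  { char: "A", bbox: [0, 0, 10, 12] },
  { char: "1", bbox: [0, 20, 10, 12] },
];
console.log(groupIntoLines(sample)); // [ 'AB', '12' ]
```

The threshold of half a character height is a simple heuristic; skewed scans or mixed font sizes would likely need something more robust.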
Journey fit
Document ingestion belongs in Build, when you connect agent workflows to real files, scans, and UI captures. Integrations is the right shelf for a stdio MCP server that exposes OCR as callable tools alongside your coding agent.
How it compares
space-ocr is an MCP OCR integration that returns bbox metadata; it is not a local Tesseract skill or a general vision-only prompt.
Common Questions / FAQ
Who is space-ocr for?
Solo and indie builders using Claude Code, Cursor, or similar agents who need reliable multilingual OCR with layout boxes in automated workflows.
When should I use space-ocr?
Use it during Build when you integrate document upload, KYC, or ops automation and want the agent to call OCR instead of guessing from images.
How do I add space-ocr to my agent?
Create a SPACE_OCR_API_KEY in Settings at https://space-ocr.com, install the space-ocr-mcp npm package (stdio transport), and add the server entry to your agent’s MCP config.
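As a rough sketch of that last step, the snippet below builds a server entry under the `mcpServers` key, the layout used by project-level config files such as `.mcp.json` (Claude Code) and `.cursor/mcp.json` (Cursor), and prints it as JSON. Two things here are assumptions, not confirmed by the package: that the server starts with `npx -y space-ocr-mcp`, and the `space-ocr` label, which is an arbitrary name chosen for this example. Check the package README for the actual start command.

```typescript
// One stdio server entry as it appears under the "mcpServers" key.
interface StdioServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Assumption: the package can be launched with `npx -y space-ocr-mcp`.
function buildSpaceOcrEntry(apiKey: string): StdioServerEntry {
  if (apiKey.trim() === "") {
    throw new Error("SPACE_OCR_API_KEY must not be empty");
  }
  return {
    command: "npx",
    args: ["-y", "space-ocr-mcp"],
    env: { SPACE_OCR_API_KEY: apiKey },
  };
}

// "space-ocr" is an arbitrary label chosen for this sketch.
function buildMcpConfig(
  apiKey: string
): { mcpServers: Record<string, StdioServerEntry> } {
  return { mcpServers: { "space-ocr": buildSpaceOcrEntry(apiKey) } };
}

// Print JSON that can be pasted into the agent's MCP config file.
console.log(JSON.stringify(buildMcpConfig("YOUR_API_KEY"), null, 2));
```

Replace `YOUR_API_KEY` with the key from space-ocr.com Settings, and keep the resulting file out of version control if it contains the real secret.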