
Vision Squeezer
Preprocess screenshots and UI captures so vision-model tile billing stays predictable before you ship agent features that analyze images.
Overview
Vision Squeezer is a Ship-phase MCP server that cuts vision API token costs by snapping images to provider tile boundaries for Claude, GPT-4o, GPT-5, and Gemini workflows.
What is this MCP server?
- Snaps image dimensions to vision provider tile boundaries to reduce billed tokens
- Targets multimodal stacks using Claude, GPT-4o, GPT-5, and Gemini vision pricing models
- npm stdio package vision-squeezer at version 0.1.1 for local MCP wiring
- Runs as a focused preprocessing MCP rather than a full image editor or CDN
- GitHub-hosted eralpozcan/vision-squeezer for indie agent cost control
- npm package identifier vision-squeezer at version 0.1.1
- stdio transport only in published registry packages block
- Documented compatibility with Claude, GPT-4o, GPT-5, and Gemini vision stacks
Community signal: 2 GitHub stars.
What problem does it solve?
Multimodal agents burn budget sending oversized screenshots that providers bill as extra vision tiles.
Who is it for?
Indie builders shipping agent features that analyze UI screenshots or photos and need predictable vision API spend.
Skip if: Products with no vision models, teams needing lossless print workflows, or builders who only use text-only LLMs.
What do I get? / Deliverables
Preprocessed images align to tile grids so each vision call uses fewer tokens without you hand-resizing every capture.
- Tile-aligned image payloads sized for major vision providers
- Lower vision token usage on repeated screenshot or photo analysis runs
- Local preprocessing step plug-in before primary multimodal MCP or API calls
Recommended MCP Servers
Journey fit
Vision token cost is a ship-time performance and unit-economics concern once multimodal features exist, not an idea-phase research toy. Perf fits because the server optimizes image dimensions against provider tile boundaries to cut vision API token spend.
How it compares
Vision cost preprocessor MCP, not a full image CDN or creative editing skill.
Common Questions / FAQ
Who is Vision Squeezer for?
Solo developers running MCP agents that call Claude, GPT-4o, GPT-5, or Gemini vision APIs and want to trim token usage on image inputs.
When should I use Vision Squeezer?
Use it in ship and operate loops once multimodal features are live and screenshot or photo payloads are showing up on usage dashboards.
How do I add Vision Squeezer to my agent?
Install the npm package vision-squeezer and register the stdio MCP server in your Claude Code, Cursor, or compatible client config.