Mcp Florence2

Name: Mcp Florence2
Author: jkawamoto

jkawamoto/mcp-florence2

Expose Florence-2 vision capabilities to your agent as MCP tools for captioning, detection, and other image understanding tasks.

Overview

io.github.jkawamoto/mcp-florence2 is an MCP server for the Build phase that uses Florence-2 to process images and exposes the results to agents as vision tools.

What is this MCP server?

MCP server v0.3.9 that wraps Microsoft Florence-2 for image processing
Distributed as an mcpb stdio bundle from the jkawamoto/mcp-florence2 GitHub releases
Lets coding agents run vision tasks without first embedding a full model pipeline in app code
Suited to prototypes that need captions, OCR-style understanding, or visual Q&A hooks
Server version: 0.3.9
Transport type: stdio via mcpb package
Repository: github.com/jkawamoto/mcp-florence2

Compatible agents: Claude Code, Cursor, Codex, or any other MCP-compatible agent

Community signal: 7 GitHub stars.

What problem does it solve?

Adding vision to an agent-built product usually means wrestling with model weights and inference code before you can test a single user flow.

Who is it for?

Indie builders prototyping image captioning, UI understanding, or visual search with Florence-2 behind Claude Code or Cursor.

Skip if: you are building a text-only CRUD app, you need a managed cloud vision API with SLAs and no local model setup, or you are targeting production scale without first reviewing the repo's runtime requirements.

What do I get? / Deliverables

Registering mcp-florence2 exposes Florence-2 image processing to your agent so you can iterate on multimodal features through MCP tool calls.

Florence-2-backed image processing tools available to your agent
Faster multimodal feature spikes without bespoke inference wrappers
Documented mcpb-based install pinned to release v0.3.9
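
As a rough illustration of what an MCP tool call looks like on the wire, the sketch below builds a JSON-RPC 2.0 `tools/call` request of the kind an MCP client sends to a server. The tool name `caption` and the `src` argument are hypothetical placeholders, not confirmed names from this server; the server's `tools/list` response is the place to find the real tool names and input schemas.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and argument, used only for illustration.
request = build_tool_call(1, "caption", {"src": "/path/to/image.png"})
print(json.dumps(request, indent=2))
```

In practice the agent's MCP client constructs and sends this message for you; the sketch only shows the shape of the request.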

Recommended MCP Servers

0Latency Memory

0Latency Memory is a hosted MCP server that gives AI agents a persistent memory layer with fast recall, semantic search,…

0nMCP: Universal AI API Orchestrator (0nork/0nMCP)

0nMCP is a Universal AI API Orchestrator MCP server aimed at solo builders who would otherwise register a long list of p…

0xHumans Protocol MCP (DavidOrpeli/0xhumans-mcp-proxy)

io.github.DavidOrpeli/0xhumans-mcp is a Model Context Protocol offering for the 0xHumans Protocol, aimed at AI agents th…

1k Patient Mcp

The 1k Patient MCP server is a hosted Model Context Protocol endpoint described as serving on the order of one thousand …

1trippulse (gkcogz/OneTrip-Beta)

1trip PULSE is a travel-focused MCP server that packages twenty-one planning tools—flights, hotels, visa guidance, safet…

4bots Content

io.github.davidsiegel59/4bots-content is a remote MCP server that supplies daily, channelized content for AI agents buil…

Journey fit

Primary fit

Build: Integrations & version control

Vision MCP servers are added while building multimodal features and agent tooling, not during initial idea research. Florence-2 processing is an ML integration boundary: agents call it as a tool during backend and agent-feature implementation.

How it compares

A local Florence-2 vision MCP server, not a generic image CDN skill or a browser screenshot automation server.

Common Questions / FAQ

Who is io.github.jkawamoto/mcp-florence2 for?

Developers building multimodal or vision-assisted features who want agents to call Florence-2 through MCP instead of custom scripts.

When should I use io.github.jkawamoto/mcp-florence2?

During integration work in the Build phase, when you need image understanding in the agent loop for prototypes, internal tools, or feature spikes.

How do I add io.github.jkawamoto/mcp-florence2 to my agent?

Install the v0.3.9 mcpb bundle from the jkawamoto/mcp-florence2 releases, configure the stdio transport in your MCP client, satisfy the Florence-2 runtime prerequisites listed in the repo, then verify that the vision tools appear in a test session.
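
For clients that are configured through a JSON file rather than a bundle installer, the stdio step might look roughly like the sketch below. It assumes the `mcpServers` layout used by several MCP clients; the entry name `florence2` and the launcher path are placeholders, not values taken from the repo, so check the repo and your client's documentation for the actual command and arguments.

```python
import json


def stdio_server_entry(command, args=None, env=None):
    """Describe a stdio MCP server as a command the client should spawn."""
    entry = {"command": command, "args": list(args or [])}
    if env:
        entry["env"] = dict(env)
    return entry


# Placeholder entry name and launcher path, used only for illustration.
config = {
    "mcpServers": {
        "florence2": stdio_server_entry("/path/to/server-launcher"),
    }
}
print(json.dumps(config, indent=2))
```

The printed JSON is what would be merged into the client's configuration file; the client then starts the server process and talks to it over stdin and stdout.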

