Vox

Name: Vox
Author: boska

boska/vox

Add native macOS voice input and spoken replies to your coding agent over MCP without building custom audio plumbing.

Overview

Vox is an MCP server for the Build phase that provides native macOS voice input and text-to-speech output for agent workflows over stdio.

What is this MCP server?

Native Swift macOS binary distributed as MCPB v1.0.0 with stdio transport
On-device speech-to-text via Apple SFSpeechRecognizer
Text-to-speech via ElevenLabs when ELEVENLABS_API_KEY is set, otherwise macOS system voice
ELEVENLABS_API_KEY is an optional secret environment variable; the system-voice fallback needs no key
Purpose-built MCP server for voice I/O, not a general LLM or browser tool
Registry version 1.0.0
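
The text-to-speech fallback described in the list above can be sketched as a small selection function. This is an illustration of the documented behavior only, not Vox's actual Swift code: the function name `choose_tts_backend`, the return labels, and the choice to treat a blank key as unset are all assumptions made for the example.

```python
import os
from typing import Mapping, Optional


def choose_tts_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "elevenlabs" when an API key is present, else "system".

    Hypothetical helper: the labels and the blank-key handling are
    assumptions, not part of Vox.
    """
    env = os.environ if env is None else env
    # Treat a missing or whitespace-only key as "not set" so the
    # keyless macOS system voice remains the default.
    key = env.get("ELEVENLABS_API_KEY", "").strip()
    return "elevenlabs" if key else "system"
```

Passing an explicit mapping instead of reading `os.environ` directly keeps the rule easy to check without touching the real environment.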

Compatible agents: Claude Code, Cursor, Codex, and any other MCP-compatible agent

Community signal: 1 GitHub star.

What problem does it solve?

Talking to your coding agent still means typing and reading walls of text, which breaks flow when you are debugging, walking, or pair-programming alone.

Who is it for?

Solo Mac developers who want low-friction voice loops with Claude Code or Cursor and already accept macOS-only tooling.

Skip if: you build on Windows or Linux, your team needs hosted multi-user voice, or you only need text chat without MCP wiring.

What do I get? / Deliverables

After you register Vox in your agent, you can dictate prompts and hear responses through macOS speech APIs, or through ElevenLabs if you configure an API key.

stdio MCP connection for agent-invoked listen and speak flows
On-device transcription via SFSpeechRecognizer
Spoken agent responses via ElevenLabs or macOS system voice
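
As a rough picture of what an agent-invoked speak flow looks like on the wire, the sketch below builds an MCP `tools/call` request as a single JSON-RPC 2.0 line, which is how messages travel over the stdio transport. The tool name `speak` and its `text` argument are hypothetical; this listing does not name Vox's tools, so a real client would discover them with `tools/list` first.

```python
import json
from typing import Any, Dict


def tools_call_request(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP tools/call request as one newline-terminated JSON line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # json.dumps escapes newlines inside strings, so the message
    # occupies exactly one line before the trailing delimiter.
    return json.dumps(message) + "\n"


# "speak" and "text" are assumed names used only for illustration.
line = tools_call_request(1, "speak", {"text": "Build finished."})
```

In practice the MCP client inside Claude Code or Cursor produces these messages for you; the sketch only shows the shape of the exchange.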

Journey fit

Primary fit

Build / Agent skills & templates

Voice I/O sits in the Build phase because it extends how you interact with agents while you ship product code, not how you market the product or monitor production metrics. Agent tooling is the right shelf: Vox is an MCP stdio server wired into Claude Code, Cursor, and similar clients, not a standalone app feature.

How it compares

Vox is a native voice I/O MCP server, not a general chat skill or a cloud telephony integration.

Common Questions / FAQ

Who is io.github.boska/vox for?

It is for solo and indie builders on macOS who use MCP-enabled coding agents and want local speech recognition plus optional ElevenLabs TTS.

When should I use io.github.boska/vox?

Use it during active build sessions when hands-free capture or spoken agent replies help you stay in flow without leaving your dev environment.

How do I add io.github.boska/vox to my agent?

Add the catalog package (stdio transport, vox-mcp from the GitHub release) to your MCP client config. Optionally set ELEVENLABS_API_KEY to use ElevenLabs TTS instead of the macOS system voice.
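
A minimal sketch of that registration, assuming your client reads the `mcpServers` JSON shape used by several MCP clients (check your client's documentation, since some use a different file format). The helper name `vox_server_entry`, the server label `vox`, and the install path are hypothetical; point `command` at wherever you placed the vox-mcp executable.

```python
import json
from typing import Any, Dict, Optional


def vox_server_entry(binary_path: str, elevenlabs_key: Optional[str] = None) -> Dict[str, Any]:
    """Build an mcpServers config fragment for a stdio server."""
    entry: Dict[str, Any] = {"command": binary_path, "args": []}
    # Only add the env block when a key is supplied; without it the
    # server is expected to fall back to the macOS system voice.
    if elevenlabs_key:
        entry["env"] = {"ELEVENLABS_API_KEY": elevenlabs_key}
    return {"mcpServers": {"vox": entry}}


# The path below is a placeholder, not a documented install location.
print(json.dumps(vox_server_entry("/path/to/vox-mcp"), indent=2))
```

Merge the printed fragment into your client's existing MCP config rather than overwriting the file, and keep the API key out of version control.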
