
Speakturbo TTS
Add ultra-low-latency spoken feedback to your coding agent so you hear responses in real time without building a custom TTS pipeline.
Install
npx skills add https://github.com/emzod/speak-turbo --skill speakturbo-tts
What is this skill?
- Ultra-fast text-to-speech with roughly 90 ms to first audio once the daemon is warm
- Eight built-in voices, listed with speakturbo --list-voices; the default voice is alba
- CLI modes: immediate playback, -o to save a WAV file without playback, -q to suppress status messages
- The first run warms a daemon in about 2–5 seconds; later invocations stay near 90 ms to first audio
- Status line shows timing: first audio (⚡), playback start (▶), completion (✓)
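
As a rough illustration of the status line, the sketch below pulls the three timings out of a line shaped like the Quick Start sample (`⚡ 92ms → ▶ 93ms → ✓ 1245ms`). The function name `parse_status_line` and the output field names are hypothetical, and the example works on a literal string because the source does not say which stream speakturbo prints the status line to.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of speakturbo): extract the three
# millisecond timings from a status line such as
#   ⚡ 92ms → ▶ 93ms → ✓ 1245ms
parse_status_line() {
  local line=$1
  # Collect every "<digits>ms" token in order and strip the unit.
  # Word splitting is intentional: each number becomes one positional arg.
  set -- $(printf '%s\n' "$line" | grep -oE '[0-9]+ms' | tr -d 'ms')
  if [ "$#" -ne 3 ]; then
    echo "unexpected status line: $line" >&2
    return 1
  fi
  # Order follows the documented symbols: first audio, playback start, done.
  echo "first_audio_ms=$1 playback_start_ms=$2 done_ms=$3"
}
```

Called as `parse_status_line "⚡ 92ms → ▶ 93ms → ✓ 1245ms"`, it prints `first_audio_ms=92 playback_start_ms=93 done_ms=1245`; any line without exactly three timings returns a non-zero status.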
Adoption & trust: 978 installs on skills.sh; 18 GitHub stars; 2/3 security scanners passed (skills.sh audits).
Journey fit
Primary fit
Build agent-tooling is the primary shelf because the skill wires voice output into your local agent workflow; it is not a go-to-market or ops concern. Agent-tooling covers CLI integrations like speakturbo that extend how Claude Code, Cursor, or Codex talks back while you code.
Common Questions / FAQ
Is Speakturbo TTS safe to install?
skills.sh reports 2 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.
SKILL.md
# speakturbo - Talk to your Claude!

Give your agent the ability to speak to you real-time. Ultra-fast text-to-speech with ~90ms latency and 8 built-in voices.

## Quick Start

```bash
# Play immediately - you should hear "Hello world" through your speakers
speakturbo "Hello world"
# Output: ⚡ 92ms → ▶ 93ms → ✓ 1245ms

# Verify it's working by saving to file
speakturbo "Hello world" -o test.wav
ls -lh test.wav  # Should show ~50-100KB file
```

**Output explained:** `⚡` = first audio received, `▶` = playback started, `✓` = done

## First Run

The **first execution takes 2-5 seconds** while the daemon starts and loads the model into memory. Subsequent calls are ~90ms to first sound.

```bash
# First run (slow - daemon starting)
speakturbo "Starting up"   # ~2-5 seconds

# Second run (fast - daemon already running)
speakturbo "Now I'm fast"  # ~90ms
```

## Usage

```bash
# Basic - plays immediately (default voice: alba)
speakturbo "Hello world"

# Save to file (no audio playback)
speakturbo "Hello" -o output.wav

# Save to specific file
speakturbo "Goodbye" -o goodbye.wav

# Quiet mode (suppress status messages, still plays audio)
speakturbo "Hello" -q

# List available voices
speakturbo --list-voices
```

## Available Voices

| Voice | Type |
|-------|------|
| `alba` | Female (default) |
| `marius` | Male |
| `javert` | Male |
| `jean` | Male |
| `fantine` | Female |
| `cosette` | Female |
| `eponine` | Female |
| `azelma` | Female |

## Performance

| Metric | Value |
|--------|-------|
| Time to first sound | ~90ms (daemon warm) |
| First run | 2-5s (daemon startup) |
| Real-time factor | ~4x faster |
| Sample rate | 24kHz mono |

## Architecture

```
speakturbo (Rust CLI, 2.2MB)
    │
    │ HTTP streaming (port 7125)
    ▼
speakturbo-daemon (Python + pocket-tts)
    │
    │ Model in memory, auto-shutdown after 1hr idle
    ▼
Audio playback (rodio)
```

## Text Input

- **Encoding:** UTF-8
- **Quotes in text:** Use escaping: `speakturbo "She said \"hello\""`
- **Long text:** Supported, streams as it generates

## Output Path Security

The `-o` flag only writes to directories that are on the allowlist. By default, these are:

- `/tmp` and system temp directories
- Your current working directory
- `~/.speakturbo/`

If you need to write elsewhere, use `--allow-dir`:

```bash
speakturbo "Hello" -o /custom/path/audio.wav --allow-dir /custom/path
```

To permanently allow a directory, add it to `~/.speakturbo/config`:

```bash
mkdir -p ~/.speakturbo && echo "/custom/path" >> ~/.speakturbo/config
```

The config file is one directory per line. Lines starting with `#` are comments.

## Exit Codes

| Code | Meaning |
|------|---------|
| 0 | Success (audio played/saved) |
| 1 | Error (daemon connection failed, invalid args) |

## When to Use

**Use speakturbo when:**

- You need instant audio feedback (~90ms)
- Speed matters more than voice variety
- Built-in voices are sufficient

**Use `speak` instead when:**

- You need custom voice cloning (Morgan Freeman, etc.) → `speak "text" --voice ~/.chatter/voices/morgan_freeman.wav`
- You need emotion tags like `[laugh]`, `[sigh]`
- Quality/variety matters more than speed

See the `speak` skill documentation for full usage.

## Troubleshooting

**No audio plays:**

```bash
# Check daemon is running
curl http://127.0.0.1:7125/health
# Expected: {"status":"ready","voices":["alba","marius",...]}

# Verify by saving to file and playing manually
speakturbo "test" -o /tmp/test.wav
afplay /tmp/test.wav  # macOS
aplay /tmp/test.wav   # Linux
```

**Daemon won't start:**

```bash
# Check port availability
lsof -i :7125

# Manually kill and restart
pkill -f "daemon_streaming"
speakturbo "test"  # Auto-restarts daemon
```
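
The health check from Troubleshooting can be wrapped in a small script so an agent can test the daemon before speaking. This is a minimal sketch, not part of speakturbo: the function names `health_is_ready` and `daemon_ready` are hypothetical, and it assumes only what is documented above, namely that `http://127.0.0.1:7125/health` returns a JSON body containing `"status":"ready"` when the daemon is up.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if a /health response body
# reports "status":"ready" (the shape shown under Troubleshooting).
health_is_ready() {
  printf '%s' "$1" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"ready"'
}

# Hypothetical helper: query the documented health endpoint and
# return 0 if the daemon answers as ready, 1 otherwise.
daemon_ready() {
  local body
  # -f: fail on HTTP errors, -sS: quiet but keep error text,
  # --max-time 2: give up quickly if nothing is listening.
  body=$(curl -fsS --max-time 2 http://127.0.0.1:7125/health 2>/dev/null) || return 1
  health_is_ready "$body"
}
```

A caller could run `daemon_ready || echo "daemon not ready; the next speakturbo call may take 2-5 seconds"` to warn about a cold start. The text match is deliberately loose; a JSON parser would be stricter but adds a dependency.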