Sag

Name: Sag
Author: steipete

steipete/clawdis

Wire macOS-friendly ElevenLabs text-to-speech into agent or CLI workflows so spoken output feels like `say` with expressive voice control.

Overview

sag is an agent skill for the Build phase that integrates the sag CLI with ElevenLabs text-to-speech for local, say-like voice playback in agent workflows.

Install

npx skills add https://github.com/steipete/clawdis --skill sag

What is this skill?

ElevenLabs TTS via `sag` with mac-style quick commands (`sag "Hello there"`, `sag voices`)
Model tiers: default expressive `eleven_v3`, stable `eleven_multilingual_v2`, fast `eleven_flash_v2_5`
Pronunciation fixes via respelling, `--normalize auto`, and `--lang` bias for names and URLs
v3 delivery tags (`[whispers]`, `[laughs]`, pause markers) plus v2/v2.5 SSML break support
OpenClaw metadata: brew install steipete/tap/sag, requires `ELEVENLABS_API_KEY`

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 1.9k installs on skills.sh; 378k GitHub stars; 3/3 security scanners passed (skills.sh audits).

What problem does it solve?

You want your agent or script to speak naturally on macOS but do not want to hand-roll ElevenLabs API calls, voice IDs, and pronunciation edge cases.

Who is it for?

macOS-based agent builders using OpenClaw/Clawdis who need quick ElevenLabs TTS from the terminal.

Skip if: Windows/Linux environments without sag, offline-only audio needs, or teams that cannot use cloud TTS API keys.

When should I use this skill?

Use it when you want ElevenLabs text-to-speech with a macOS `say`-style UX via the sag CLI.

What do I get? / Deliverables

You configure `ELEVENLABS_API_KEY`, pick a model and voice, and run verified `sag` commands with normalization and delivery tags for reliable local audio.

Working sag speak commands with chosen voice and model
Documented normalization and delivery tag settings for repeatable phrasing

Journey fit

Primary fit

Build: Integrations & version control

Voice output is integrated during Build, when you connect external APIs and local binaries to your agent stack. sag is a third-party, brew-installed CLI that needs an API key and handles playback, which makes it classic integrations work rather than core app UI.

Also useful

Operate: Iteration & experiments

How it compares

A thin CLI integration skill—not a hosted voice widget SDK or a multi-speaker podcast editor.

Common Questions / FAQ

Who is sag for?

Solo builders on macOS wiring ElevenLabs voices into agents or shell automations via the sag binary.

When should I use sag?

During Build integrations when you add spoken feedback, alerts, or demo narration to an agent that already supports shell tools.

Is sag safe to install?

Review the Security Audits panel on this Prism page; sag needs network access and an API key—treat `ELEVENLABS_API_KEY` as a secret.

SKILL.md

# sag

Use `sag` for ElevenLabs TTS with local playback.

API key (required)

- `ELEVENLABS_API_KEY` (preferred)
- `SAG_API_KEY` also supported by the CLI
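
A minimal sketch of a pre-flight check for the two variables above. The function name `sag_key_source` is hypothetical, and checking `ELEVENLABS_API_KEY` first is an assumption based on it being marked "preferred"; the CLI's real precedence when both are set is not stated here.

```bash
# Print the name of the first API-key variable that is set, or fail.
# Assumption: ELEVENLABS_API_KEY is checked first because it is "preferred".
sag_key_source() {
  if [ -n "${ELEVENLABS_API_KEY:-}" ]; then
    echo "ELEVENLABS_API_KEY"
  elif [ -n "${SAG_API_KEY:-}" ]; then
    echo "SAG_API_KEY"
  else
    echo "no API key set: export ELEVENLABS_API_KEY first" >&2
    return 1
  fi
}
```

Calling `sag_key_source` before a long run gives a clear error up front instead of a failed request later.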

Quick start

- `sag "Hello there"`
- `sag speak -v "Roger" "Hello"`
- `sag voices`
- `sag prompting` (model-specific tips)

Model notes

- Default: `eleven_v3` (expressive)
- Stable: `eleven_multilingual_v2`
- Fast: `eleven_flash_v2_5`
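
If a script needs to pick a model by intent, the three tiers above can be kept in one lookup. This is only a sketch: `sag_model_for_tier` and the tier keywords are hypothetical names, and the flag that passes a model ID to `sag` is not listed on this page, so check `sag --help` before wiring the result into a command.

```bash
# Map a tier label from the notes above to its ElevenLabs model ID.
# The function name and tier keywords are illustrative, not part of sag.
sag_model_for_tier() {
  case "$1" in
    default|expressive) echo "eleven_v3" ;;
    stable)             echo "eleven_multilingual_v2" ;;
    fast)               echo "eleven_flash_v2_5" ;;
    *) echo "unknown tier: $1" >&2; return 1 ;;
  esac
}
```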

Pronunciation + delivery rules

- First fix: respell (e.g. "key-note"), add hyphens, adjust casing.
- Numbers/units/URLs: `--normalize auto` (or `off` if it harms names).
- Language bias: `--lang en|de|fr|...` to guide normalization.
- v3: SSML `<break>` not supported; use `[pause]`, `[short pause]`, `[long pause]`.
- v2/v2.5: SSML `<break time="1.5s" />` supported; `<phoneme>` not exposed in `sag`.
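
The pause rules differ by model family, so text built in a script can choose the markup from the model ID. A hedged sketch: `sag_pause` is a hypothetical helper, and treating `eleven_multilingual_v2` and `eleven_flash_v2_5` as the "v2/v2.5" models is an assumption drawn from the model notes.

```bash
# Print the pause markup that fits the given model ID.
# $1 = model ID, $2 = optional SSML break duration (default 1.5s, v2/v2.5 only).
sag_pause() {
  case "$1" in
    eleven_v3)
      # v3 does not support SSML <break>; use a bracketed pause tag instead.
      echo "[pause]" ;;
    eleven_multilingual_v2|eleven_flash_v2_5)
      printf '<break time="%s" />\n' "${2:-1.5s}" ;;
    *)
      echo "unknown model: $1" >&2; return 1 ;;
  esac
}
```

For example, `"First point. $(sag_pause eleven_v3) Second point."` yields text with `[pause]` between the sentences.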

v3 audio tags (put at the start of a line)

- `[whispers]`, `[shouts]`, `[sings]`
- `[laughs]`, `[starts laughing]`, `[sighs]`, `[exhales]`
- `[sarcastic]`, `[curious]`, `[excited]`, `[crying]`, `[mischievously]`
- Example: `sag "[whispers] keep this quiet. [short pause] ok?"`
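
To keep tags consistent in scripts, a small helper can prefix a line with one of the tags listed above and reject anything else. This is a sketch, not part of `sag`: the name `sag_tagged` is hypothetical, and the allowed set is simply the list on this page.

```bash
# Prefix a line with a v3 audio tag. $1 = tag name without brackets, $2 = text.
# Only tags from the list above are accepted.
sag_tagged() {
  case "$1" in
    whispers|shouts|sings|laughs|"starts laughing"|sighs|exhales|\
    sarcastic|curious|excited|crying|mischievously) ;;
    *) echo "tag not in the documented list: $1" >&2; return 1 ;;
  esac
  printf '[%s] %s\n' "$1" "$2"
}
```

The output can be passed straight to the CLI, for example `sag "$(sag_tagged whispers "keep this quiet.")"`.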

Voice defaults

- `ELEVENLABS_VOICE_ID` or `SAG_VOICE_ID`

Confirm voice + speaker before long output.

## Chat voice responses

When the user asks for a "voice" reply (e.g., "crazy scientist voice", "explain in voice"), generate audio and send it:

```bash
# Generate audio file
sag -v Clawd -o /tmp/voice-reply.mp3 "Your message here"

# Then include in reply:
# MEDIA:/tmp/voice-reply.mp3
```
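
The two steps above can be wrapped in one function so the `MEDIA:` line is only printed when generation succeeds. A minimal sketch: `voice_reply` and the `SAG_BIN` override are hypothetical and exist only to make the wrapper easy to stub; the `-v` and `-o` flags are the ones used in the example above.

```bash
# Generate an audio file and print the MEDIA line for the reply.
# $1 = voice name or ID, $2 = output path, $3 = text to speak.
# SAG_BIN is an illustrative override for testing; it defaults to `sag`.
voice_reply() {
  "${SAG_BIN:-sag}" -v "$1" -o "$2" "$3" || return 1
  echo "MEDIA:$2"
}
```

Usage: `voice_reply Clawd /tmp/voice-reply.mp3 "Your message here"` prints `MEDIA:/tmp/voice-reply.mp3` if `sag` exits successfully, and prints nothing otherwise.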

Voice character tips:

- Crazy scientist: Use `[excited]` tags, dramatic pauses `[short pause]`, vary intensity
- Calm: Use `[whispers]` or slower pacing
- Dramatic: Use `[sings]` or `[shouts]` sparingly

Default voice for Clawd: `lj2rcrvANS3gaWWnczSX` (or just `-v Clawd`)
