Local Whisper

Name: Local Whisper
Author: thinkfleetai

Repository: thinkfleetai/thinkfleet-engine

Transcribe voice memos, meeting audio, and agent-recorded WAV files locally without sending audio to a cloud STT API.

Overview

Local Whisper is an agent skill for the Build phase that transcribes audio files with OpenAI Whisper. It runs entirely offline once the chosen model has been downloaded.

Install

npx skills add https://github.com/thinkfleetai/thinkfleet-engine --skill local-whisper

What is this skill?

Fully offline transcription after a one-time Whisper model download
CLI with five model tiers, from tiny (39M parameters) through large-v3 (about 1.5B parameters)
Optional word timestamps and JSON output for pipelines
Default model is base (74M parameters); turbo is positioned as the best speed/quality tradeoff
uv-managed Python 3.12 venv, with ffmpeg as the required external binary

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 1 install on skills.sh; 3/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).

What problem does it solve?

You have audio you need as text but cloud STT adds cost, privacy risk, and network dependency for every run.

Who is it for?

Builders who want repeatable local transcription on macOS/Linux agents, with ffmpeg installed and disk space for the model weights (a few gigabytes for large-v3).

Skip if: you need real-time streaming dictation or GPU-only training pipelines, or you cannot install Python, torch, and ffmpeg.

When should I use this skill?

You need high-quality speech-to-text from audio files and want it to run fully offline after model download.

What do I get? / Deliverables

You get plain-text or JSON transcripts with optional timestamps from a local CLI, ready to paste into docs, tickets, or the next agent skill in your chain.

Plain-text transcript on stdout
Optional JSON transcript with word-level timestamps

Journey fit

Primary fit

Build: Integrations & version control

Local Whisper is a concrete integration skill: install the models, run the bundled script, and pipe transcripts into downstream build or agent workflows. Speech-to-text is wired in as an offline capability that your agent stack calls during product build, not as a launch or growth distribution task.

Also useful

Grow: Content & marketing

How it compares

Use instead of always-on cloud Whisper APIs when offline runs and data residency matter more than managed scaling.

Common Questions / FAQ

Who is local-whisper for?

Solo and indie developers running Claude Code, Cursor, or similar agents who need WAV-to-text locally for notes, content, or automation without a paid STT service.

When should I use local-whisper?

During build when you are ingesting voice recordings into specs or agent memory, or in grow/operate when you batch-transcribe support or meeting audio on your own machine.

Is local-whisper safe to install?

Review the Security Audits panel on this Prism page before installing; the skill runs shell scripts, downloads ML weights, and needs filesystem access for venv and models.

SKILL.md

# Local Whisper STT

Local speech-to-text using OpenAI's Whisper. **Fully offline** after initial model download.

## Usage

```bash
# Basic
~/.thinkfleetbot/skills/local-whisper/scripts/local-whisper audio.wav

# Better model
~/.thinkfleetbot/skills/local-whisper/scripts/local-whisper audio.wav --model turbo

# With timestamps
~/.thinkfleetbot/skills/local-whisper/scripts/local-whisper audio.wav --timestamps --json
```
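
For a folder of recordings, the basic invocation can be wrapped in a loop. The following is a minimal sketch rather than part of the skill: the `transcribe_dir` function, the `LOCAL_WHISPER` override variable, and the `.txt` sidecar naming are hypothetical names chosen for illustration. Only the script path and the `--model` and `--quiet` flags come from this document, and the sketch assumes the plain-text transcript is written to stdout.

```bash
# Hypothetical override so the CLI path can be changed without editing the function.
LOCAL_WHISPER="${LOCAL_WHISPER:-$HOME/.thinkfleetbot/skills/local-whisper/scripts/local-whisper}"

# transcribe_dir DIR [MODEL]
# Transcribes every .wav file in DIR to a .txt file next to it.
transcribe_dir() {
  local dir="$1" model="${2:-base}" wav
  for wav in "$dir"/*.wav; do
    [ -e "$wav" ] || continue   # glob matched nothing, so skip the literal pattern
    # The transcript arrives on stdout; redirect it to a sidecar file.
    "$LOCAL_WHISPER" "$wav" --model "$model" --quiet > "${wav%.wav}.txt" \
      || echo "failed: $wav" >&2
  done
}
```

Calling `transcribe_dir ~/recordings turbo` would then leave one `.txt` file per recording; failures are reported on stderr without stopping the loop.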

## Models

| Model | Parameters | Notes |
|-------|------------|-------|
| `tiny` | 39M | Fastest |
| `base` | 74M | **Default** |
| `small` | 244M | Good balance |
| `turbo` | 809M | Best speed/quality |
| `large-v3` | 1550M | Maximum accuracy |
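
The speed/quality tradeoff between tiers depends on your hardware and audio, so it can help to run one short clip through several models and compare. The sketch below is one way to do that; `compare_models` and `LOCAL_WHISPER` are hypothetical names, and the per-model output file naming is an assumption. Timing uses whole seconds from `date +%s`, and the first run of any tier also includes its one-time model download.

```bash
# Hypothetical override so the CLI path can be changed without editing the function.
LOCAL_WHISPER="${LOCAL_WHISPER:-$HOME/.thinkfleetbot/skills/local-whisper/scripts/local-whisper}"

# compare_models AUDIO MODEL...
# Transcribes AUDIO once per model and prints "model<TAB>elapsed-seconds".
compare_models() {
  local audio="$1" model start end
  shift
  for model in "$@"; do
    start=$(date +%s)
    # Keep each transcript so the texts can be compared side by side afterwards.
    "$LOCAL_WHISPER" "$audio" --model "$model" --quiet > "${audio%.*}.$model.txt"
    end=$(date +%s)
    printf '%s\t%s\n' "$model" "$((end - start))"
  done
}
```

For example, `compare_models clip.wav tiny base turbo` writes `clip.tiny.txt`, `clip.base.txt`, and `clip.turbo.txt` and prints one timing line per model.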

## Options

- `--model/-m` — Model size (default: base)
- `--language/-l` — Language code (auto-detect if omitted)
- `--timestamps/-t` — Include word timestamps
- `--json/-j` — JSON output
- `--quiet/-q` — Suppress progress
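
The options can be combined for scripted use. The sketch below assumes that `--json` writes a single JSON document to stdout; the schema is not documented here, so the code only checks that some output was produced. The `transcribe_json` helper, the `LOCAL_WHISPER` variable, and the `.partial` temporary suffix are hypothetical names used for illustration.

```bash
# Hypothetical override so the CLI path can be changed without editing the function.
LOCAL_WHISPER="${LOCAL_WHISPER:-$HOME/.thinkfleetbot/skills/local-whisper/scripts/local-whisper}"

# transcribe_json AUDIO OUT [LANG]
# Writes a JSON transcript with word timestamps to OUT.
# OUT is only created when the CLI succeeds and produced output.
transcribe_json() {
  local audio="$1" out="$2" lang="${3:-}" tmp
  tmp="$out.partial"
  # --language is passed only when a code is given; otherwise the CLI auto-detects.
  if "$LOCAL_WHISPER" "$audio" --timestamps --json --quiet \
       ${lang:+--language "$lang"} > "$tmp" && [ -s "$tmp" ]; then
    mv "$tmp" "$out"
  else
    rm -f "$tmp"
    echo "transcription failed: $audio" >&2
    return 1
  fi
}
```

Writing to a temporary file first means a downstream step never reads a half-written transcript, for example after `transcribe_json memo.wav memo.json en`.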

## Setup

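Uses a uv-managed venv at `.venv/`; `ffmpeg` must also be available on your `PATH`. To reinstall:

```bash
cd ~/.thinkfleetbot/skills/local-whisper
uv venv .venv --python 3.12
# Install the CPU-only torch build from the PyTorch index first.
uv pip install --python .venv/bin/python torch --index-url https://download.pytorch.org/whl/cpu
# Then install the rest from PyPI: --index-url applies to every package in a
# command, and click and openai-whisper are published on PyPI.
uv pip install --python .venv/bin/python click openai-whisper
```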
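
Before reinstalling or running the skill, it can be useful to confirm that the external tools are present. This is a small sketch under that assumption; `missing_tools` is a hypothetical helper and not part of the skill, and the tool names you pass in (such as `ffmpeg` and `uv`) are taken from the requirements described in this document.

```bash
# missing_tools NAME...
# Prints each named command that is not found on PATH.
# Returns 0 when all are present and 1 when at least one is missing.
missing_tools() {
  local tool rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      rc=1
    fi
  done
  return "$rc"
}

# Usage (not run here):
#   missing_tools ffmpeg uv || echo "install the tools listed above first" >&2
```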
