
Faster Whisper
Transcribe local audio files to text or JSON with word timestamps using faster-whisper from your agent workflow on Windows.
Overview
faster-whisper is an agent skill for the Build phase that transcribes local audio via a faster-whisper Python venv and PowerShell wrapper.
Install
npx skills add https://github.com/theplasmak/faster-whisper --skill faster-whisperWhat is this skill?
- PowerShell entry script auto-runs setup.ps1 when .venv is missing
- Supports model selection (e.g. large-v3-turbo), language, beam size, device, and compute type
- Optional VAD, word-level timestamps, and -Json structured output
- Quiet mode and custom -Output paths for agent-friendly artifacts
- Pass-through args to the underlying Python transcribe.py runner
Adoption & trust: 1.3k installs on skills.sh; 7 GitHub stars; 1/3 security scanners passed (skills.sh audits).
What problem does it solve?
You have local recordings but no repeatable, agent-invokable path to accurate transcripts without manual setup or paid cloud STT.
Who is it for?
Solo builders on Windows who want offline or self-hosted transcription while building voice notes, meeting capture, or caption pipelines.
Skip if: Teams needing managed cloud STT with SLAs, real-time streaming-only APIs, or macOS/Linux-only workflows without adapting the scripts.
When should I use this skill?
You need to transcribe a local audio file with faster-whisper models, optional VAD, or JSON/word-timestamp output from an agent session.
What do I get? / Deliverables
After setup, you get reproducible text or JSON transcripts (optionally with word timestamps) from audio files inside your dev environment.
- Plain-text transcript file or stdout output
- Optional JSON transcript with metadata and word timestamps
Recommended Skills
Journey fit
Local speech-to-text is an integration step while building features that need transcripts, captions, or voice-derived content—not a launch or growth activity. The skill wires a PowerShell wrapper, Python venv, and transcriber CLI into the repo—classic third-party/local tooling integration during implementation.
How it compares
Local CLI skill package—not a hosted transcription API or MCP media server.
Common Questions / FAQ
Who is faster-whisper for?
Indie and solo developers who already use agent coding tools and want local faster-whisper transcription wired into their repo with minimal ceremony.
When should I use faster-whisper?
During Build when you need transcripts from demos, user interviews, or voice memos for docs, RAG, or product features—especially when you prefer not to send audio to a third-party API.
Is faster-whisper safe to install?
It runs local Python in a project venv and processes files you specify; review the Security Audits panel on this Prism page and inspect setup.ps1 and transcribe.py before running in sensitive environments.
SKILL.md
READMESKILL.md - Faster Whisper
README.md CHANGELOG.md LICENSE IMPROVEMENTS-SUMMARY* TEST-SCENARIOS* *.zip #Requires -Version 5.1 <# .SYNOPSIS Transcribe audio using faster-whisper .DESCRIPTION Wrapper script that activates venv and runs the Python transcriber. Auto-runs setup if venv doesn't exist. .EXAMPLE .\transcribe.ps1 audio.mp3 .\transcribe.ps1 audio.wav -Model large-v3-turbo -Language en .\transcribe.ps1 audio.mp3 -Json -WordTimestamps #> param( [Parameter(Position=0, Mandatory=$true)] [string]$Audio, [Alias("m")] [string]$Model, [Alias("l")] [string]$Language, [switch]$WordTimestamps, [int]$BeamSize, [switch]$Vad, [Alias("j")] [switch]$Json, [Alias("o")] [string]$Output, [string]$Device, [string]$ComputeType, [Alias("q")] [switch]$Quiet, # Pass-through for any other args [Parameter(ValueFromRemainingArguments=$true)] [string[]]$RemainingArgs ) $ErrorActionPreference = "Stop" $ScriptDir = Split-Path -Parent $MyInvocation.MyCommand.Path $SkillDir = Split-Path -Parent $ScriptDir $VenvPython = Join-Path $SkillDir ".venv\Scripts\python.exe" $SetupScript = Join-Path $SkillDir "setup.ps1" $TranscribePy = Join-Path $ScriptDir "transcribe.py" # Auto-setup if venv doesn't exist if (-not (Test-Path $VenvPython)) { Write-Host "🎙️ faster-whisper not set up yet. Running setup..." -ForegroundColor Cyan Write-Host "" if (Test-Path $SetupScript) { & $SetupScript Write-Host "" if (-not (Test-Path $VenvPython)) { Write-Host "❌ Setup failed. Please check errors above." -ForegroundColor Red exit 1 } } else { Write-Host "❌ Setup script not found: $SetupScript" -ForegroundColor Red exit 1 } } # Build arguments for Python script $pyArgs = @($TranscribePy, $Audio) if ($Model) { $pyArgs += "--model", $Model } if ($Language) { $pyArgs += "--language", $Language } if ($WordTimestamps) { $pyArgs += "--word-timestamps" } if ($BeamSize) { $pyArgs += "--beam-size", $BeamSize } if ($Vad) { $pyArgs += "--vad" } if ($Json) { $pyArgs += "--json" } if ($Output) { $pyArgs += "--output", $Output } if ($Device) { $pyArgs += "--device", $Device } if ($ComputeType) { $pyArgs += "--compute-type", $ComputeType } if ($Quiet) { $pyArgs += "--quiet" } if ($RemainingArgs) { $pyArgs += $RemainingArgs } # Run transcription & $VenvPython @pyArgs exit $LASTEXITCODE .venv/ __pycache__/ *.pyc *.pyo *.egg-info/ .pytest_cache/ .coverage *.mp3 *.wav *.m4a *.flac *.ogg # Ignore writing-skills superpower skill's files IMPROVEMENTS-SUMMARY.md TEST-SCENARIOS.md MIT License Copyright (c) 2026 ThePlasmak Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. # Core dependency — PyTorch installed separately by setup script (with CUDA if GPU detected) faster-whisper>=1.2.1 @echo off REM faster-whisper transcription wrapper REM Auto-runs setup if venv doesn't exist setlocal EnableDela