
Audio Transcriber
Turn meeting or voice recordings into structured notes, atas (meeting minutes), or summaries using Whisper plus an optional LLM pass refined by the prompt-engineer skill.
Overview
Audio Transcriber is an agent skill most often used in Build (also Grow content, Validate scope) that transcribes audio with Whisper and optionally structures output via Claude or Copilot CLI with prompt-engineer-assiste
Install
npx skills add https://github.com/sickn33/antigravity-awesome-skills --skill audio-transcriber
What is this skill?
- Whisper transcription with tqdm segment progress and rich spinner during LLM steps
- Step 3b intelligent prompt workflow: improve user prompts via prompt-engineer or auto-suggest document type (ata, resumo
- LLM priority chain: Claude CLI, then GitHub Copilot CLI, else transcript-only fallback
- 5-minute default LLM timeout with graceful degradation when no CLI is available
- Timestamp-based output naming and changelog-documented RISEN/RODES/STAR-style structured prompts
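The priority chain and timeout behavior listed above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual `scripts/transcribe.py`: the executable names `claude` and `copilot`, the helper `run_with_timeout`, and the choice to pass the transcript on stdin are all assumptions made for the example.

```python
import shutil
import subprocess
from typing import Callable, Optional, Sequence

# Assumed executable names, in priority order; the real script may probe others.
CLI_CANDIDATES = ("claude", "copilot")


def detect_cli_tool(
    candidates: Sequence[str] = CLI_CANDIDATES,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the first candidate found on PATH, or None (transcript-only mode)."""
    for name in candidates:
        if which(name) is not None:
            return name
    return None


def run_with_timeout(
    cmd: Sequence[str], stdin_text: str, timeout_s: float = 300.0
) -> Optional[str]:
    """Run cmd with stdin_text on stdin; return stdout, or None on any failure.

    Hypothetical helper: a timeout, a missing binary, or a non-zero exit code
    all degrade to None so the caller can keep the raw transcript.
    """
    try:
        result = subprocess.run(
            list(cmd),
            input=stdin_text,
            capture_output=True,
            text=True,
            timeout=timeout_s,  # 300 s mirrors the documented 5-minute default
        )
    except (subprocess.TimeoutExpired, FileNotFoundError):
        return None
    if result.returncode != 0:
        return None
    return result.stdout
```

A caller would check `detect_cli_tool()` first and skip the LLM pass entirely when it returns `None`; a `None` from `run_with_timeout` is treated the same way, so the transcript is still delivered.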
Adoption & trust: 946 installs on skills.sh; 40.1k GitHub stars; 2/3 security scanners passed (skills.sh audits).
What problem does it solve?
You recorded an important conversation or voice memo but only have messy raw audio and no fast path to searchable meeting notes or summaries.
Who is it for?
Indie builders who take calls on the go and want local Whisper plus agent-guided prompts for atas (minutes), resumos (summaries), or meeting notes.
Skip if: Teams needing HIPAA-grade hosted transcription, real-time streaming captions, or workflows with no tolerance for installing Whisper and optional Claude/Copilot CLIs.
When should I use this skill?
You have audio to transcribe and want optional LLM structuring with prompt-engineer improvement or auto-generated document prompts.
What do I get? / Deliverables
You get a transcript and, when a CLI is available, a structured document aligned to your confirmed or auto-generated prompt, with transcript-only mode if LLM steps are skipped.
- Text transcript from Whisper
- Structured meeting or summary document when LLM path succeeds
- Timestamp-named output artifacts
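The changelog documents output names of the form `transcript-YYYYMMDD-HHMMSS.md` and `ata-YYYYMMDD-HHMMSS.md`. A minimal sketch of how such names could be built is shown below; the function name `timestamped_name` is hypothetical and not taken from the skill's code.

```python
from datetime import datetime
from typing import Optional


def timestamped_name(prefix: str, now: Optional[datetime] = None) -> str:
    """Build '<prefix>-YYYYMMDD-HHMMSS.md' so repeated runs do not overwrite files."""
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    return f"{prefix}-{stamp}.md"


# Example with a fixed time so the result is reproducible.
print(timestamped_name("transcript", datetime(2026, 2, 3, 14, 5, 9)))
# prints: transcript-20260203-140509.md
```

Second-level resolution is enough to keep normal repeated runs apart, though two runs started within the same second would still collide.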
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Canonical shelf is Build/PM because the skill’s primary artifact is structured meeting documentation and planning notes from raw audio. PM fits first: transcripts feed specs, meeting atas, and decision logs indie builders use while shipping—not pure media production.
Where it fits
After a weekly founder call, transcribe the recording into an ata your agent can reference in the implementation plan.
Transcribe five user interview clips and summarize constraints before you lock MVP scope.
Turn a podcast-style voice draft into a cleaned resumo you can publish as a changelog or blog outline.
Convert a walkthrough voiceover into step-by-step internal docs for a feature you just shipped.
How it compares
Skill workflow for offline transcription plus LLM structuring—not a browser MCP connector or a single-click SaaS recorder.
Common Questions / FAQ
Who is audio-transcriber for?
Solo and indie builders who capture audio meetings or memos and want agent-assisted transcription and structured notes without a dedicated transcription product.
When should I use audio-transcriber?
Use it in Build/PM for meeting atas after syncs, in Grow/content when turning interviews into posts, and in Validate when documenting discovery calls—especially before you commit specs from spoken input.
Is audio-transcriber safe to install?
Check the Security Audits panel on this Prism page; the skill runs local Whisper and optional CLIs that read your audio and may send transcript text to those tools—review permissions and data handling before use.
Workflow Chain
Requires first: prompt engineer
Then invoke: writing plans
SKILL.md - Audio Transcriber
# Changelog - audio-transcriber

All notable changes to the audio-transcriber skill will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

---

## [1.1.0] - 2026-02-03

### ✨ Added

- **Intelligent Prompt Workflow** (Step 3b) - Complete integration with prompt-engineer skill
  - **Scenario A**: User-provided prompts are automatically improved with prompt-engineer
    - Displays both original and improved versions side-by-side
    - Single confirmation: "Usar versão melhorada? [s/n]"
  - **Scenario B**: Auto-generation when no prompt provided
    - Analyzes transcript and suggests document type (ata, resumo, notas)
    - Shows suggestion and asks confirmation
    - Generates complete structured prompt (RISEN/RODES/STAR)
    - Shows preview and asks final confirmation
    - Falls back to DEFAULT_MEETING_PROMPT if declined
- **LLM Integration** - Process transcripts with Claude CLI or GitHub Copilot CLI
  - Priority: Claude > GitHub Copilot > None (transcript-only mode)
  - Step 0b: CLI detection logic documented
  - Timeout handling (5 minutes default)
  - Graceful fallback if CLI unavailable
- **Progress Indicators** - Visual feedback during long operations
  - `tqdm` progress bar for Whisper transcription segments
  - `rich` spinner for LLM processing
  - Clear status messages at each step
- **Timestamp-based File Naming** - Avoid overwriting previous transcriptions
  - Format: `transcript-YYYYMMDD-HHMMSS.md`
  - Format: `ata-YYYYMMDD-HHMMSS.md`
  - Prevents data loss from repeated runs
- **Automatic Cleanup** - Remove temporary files after processing
  - Deletes `metadata.json` and `transcription.json` automatically
  - `--keep-temp` flag to preserve if needed
  - Clean output directory
- **Rich Terminal UI** - Beautiful output with `rich` library
  - Formatted panels for prompt previews
  - Color-coded status messages (green=success, yellow=warning, red=error)
  - Spinner animations for long-running tasks
- **Dual Output Support** - Generate both transcript and processed ata
  - `transcript-*.md` - Raw transcription with timestamps
  - `ata-*.md` - Intelligent summary/meeting minutes (if LLM available)
  - User can decline LLM processing to get transcript-only

### 🔧 Changed

- **SKILL.md** - Major documentation updates
  - Added Step 0b (CLI Detection)
  - Updated Step 2 (Progress Indicators)
  - Added Step 3b (Intelligent Prompt Workflow with 150+ lines)
  - Updated version to 1.1.0
  - Added detailed workflow diagrams for both scenarios
- **install-requirements.sh** - Added UI libraries
  - Now installs `tqdm` and `rich` packages
  - Graceful fallback if installation fails
  - Updated success messages
- **Python Implementation** - Complete refactor
  - Created `scripts/transcribe.py` (516 lines)
  - Functions: `detect_cli_tool()`, `invoke_prompt_engineer()`, `handle_prompt_workflow()`, `process_with_llm()`, `transcribe_audio()`, `save_outputs()`, `cleanup_temp_files()`
  - Command-line arguments: `--prompt`, `--model`, `--output-dir`, `--keep-temp`
  - Auto-installs `rich` and `tqdm` if missing

### 🐛 Fixed

- **User prompts no longer ignored**
  - v1.0.0 completely ignored custom prompts
  - Now processes all prompts (custom or auto-generated) with LLM
  - Improves simple prompts into structured frameworks
- **Temporary files cleanup**
  - v1.0.0 left `metadata.json` and `transcription.json` as trash
  - Now automatically removed after processing
  - Clean output directory
- **File overwriting**
  - v1.0.0 used same filename (e.g., `meeting.md`) every time
  - Now uses timestamp to prevent data loss
  - Each run creates unique files
- **Missing ata/summary**
  - v1.0.0 only generated raw transcript
  - Now generates intelligent ata/resumo using LLM
  - Respects user's prompt instructions
- **No progress feedback**
  - v1.0.0 had silent processing (users didn't know if it