
Whisper
Install this when you wire OpenAI Whisper transcription into a product and need language tiers, ISO codes, and WER expectations for routing and QA.
Overview
Whisper is an agent skill most often used in Build (also Validate, Grow) that catalogs Whisper’s 99-language support tiers and codes for transcription integrations.
Install
npx skills add https://github.com/davila7/claude-code-templates --skill whisper
What is this skill?
- Documents 99 Whisper-supported languages with ISO-style codes for pipeline routing
- Top-tier WER under 10% called out for 12 languages including English, Spanish, Japanese, and Chinese
- Good-support tier (WER 10–20%) listed for 15 additional languages such as Arabic, Turkish, and Thai
- Full language roster from Afrikaans through Welsh for coverage checks before shipping voice features
- Structured guide format for multilingual capability planning in agent or SaaS products
Adoption & trust: 1.1k installs on skills.sh; 27.8k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You are adding Whisper STT but lack a structured list of supported languages, quality tiers, and codes for your agent to use in config and UX.
Who is it for?
Indie builders shipping voice notes, meeting bots, accessibility captions, or multilingual agent tools on Whisper.
Skip if: you need speaker diarization, custom acoustic models, or video generation workflows unrelated to speech recognition.
When should I use this skill?
When configuring Whisper transcription, language selection, or documenting multilingual STT coverage in an app or agent workflow.
What do I get? / Deliverables
Your integration and docs use consistent language codes and tiered WER expectations so locale rollout and QA plans match Whisper’s real coverage.
- Language allowlists and ISO codes in app config
- Documentation of expected accuracy tiers per locale
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Whisper integration is implemented during Build; language metadata also informs Validate prototypes and Grow content pipelines. The skill is integration-focused—multilingual STT configuration and language selection—not generic frontend layout.
Where it fits
Pick two launch locales with top-tier WER before building a voice-note MVP.
Map ISO language codes in Whisper API request bodies and server-side defaults.
Plan which podcast languages get automatic captions without surprising quality drops.
Compare user-reported transcription errors against documented good-support tiers for triage.
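For the triage case in the last item, one possible approach is to measure WER on a reported clip against a hand-corrected transcript and compare it with the documented band for that tier. The sketch below is an assumption about how a team might do this, not something the skill ships: `word_error_rate` and `exceeds_documented_tier` are hypothetical helpers, WER is computed as word-level edit distance divided by reference length, and splitting on whitespace is a simplification that does not suit unsegmented scripts such as Chinese or Japanese.

```python
# Upper WER bounds taken from the skill's tier headings (< 10% and 10-20%).
TIER_WER_CEILING = {"top": 0.10, "good": 0.20}


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def exceeds_documented_tier(measured_wer: float, tier: str) -> bool:
    """True when a measured WER is worse than the documented band for the tier."""
    return measured_wer > TIER_WER_CEILING[tier]
```

A single clip is a noisy signal, so a report that exceeds the band is better treated as a prompt to collect more samples for that locale than as proof of a regression.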
How it compares
Reference for Whisper locale coverage—not a wrapper skill for running ffmpeg, GPU batch jobs, or alternative STT vendors.
Common Questions / FAQ
Who is whisper for?
Solo builders and small teams integrating OpenAI Whisper who need authoritative language lists and quality tiers while coding in Claude Code, Cursor, or Codex.
When should I use whisper?
At Build → integrations when coding language pickers and API params; at Validate → prototype when choosing launch locales; at Grow → content when planning auto-transcription for podcasts or video.
Is whisper safe to install?
The skill is informational; it does not execute transcription. Review the Security Audits panel on this Prism page and treat any bundled install scripts in the parent repo separately.
SKILL.md
# Whisper Language Support Guide

Complete guide to Whisper's multilingual capabilities.

## Supported languages (99 total)

### Top-tier support (WER < 10%)

- English (en)
- Spanish (es)
- French (fr)
- German (de)
- Italian (it)
- Portuguese (pt)
- Dutch (nl)
- Polish (pl)
- Russian (ru)
- Japanese (ja)
- Korean (ko)
- Chinese (zh)

### Good support (WER 10-20%)

- Arabic (ar)
- Turkish (tr)
- Vietnamese (vi)
- Swedish (sv)
- Finnish (fi)
- Czech (cs)
- Romanian (ro)
- Hungarian (hu)
- Danish (da)
- Norwegian (no)
- Thai (th)
- Hebrew (he)
- Greek (el)
- Indonesian (id)
- Malay (ms)

### Full list (99 languages)

Afrikaans, Albanian, Amharic, Arabic, Armenian, Assamese, Azerbaijani, Bashkir, Basque, Belarusian, Bengali, Bosnian, Breton, Bulgarian, Burmese, Cantonese, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Faroese, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Lao, Latin, Latvian, Lingala, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Moldavian, Mongolian, Myanmar, Nepali, Norwegian, Nynorsk, Occitan, Pashto, Persian, Polish, Portuguese, Punjabi, Pushto, Romanian, Russian, Sanskrit, Serbian, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Spanish, Sundanese, Swahili, Swedish, Tagalog, Tajik, Tamil, Tatar, Telugu, Thai, Tibetan, Turkish, Turkmen, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, Yiddish, Yoruba

## Usage examples

### Auto-detect language

```python
import whisper

model = whisper.load_model("turbo")

# Auto-detect language
result = model.transcribe("audio.mp3")

print(f"Detected language: {result['language']}")
print(f"Text: {result['text']}")
```

### Specify language (faster)

```python
# Specify language for faster transcription
result = model.transcribe("audio.mp3", language="es")  # Spanish
result = model.transcribe("audio.mp3", language="fr")  # French
result = model.transcribe("audio.mp3", language="ja")  # Japanese
```

### Translation to English

```python
# Translate any language to English
result = model.transcribe(
    "spanish_audio.mp3",
    task="translate"  # Translates to English
)

print(f"Original language: {result['language']}")
print(f"English translation: {result['text']}")
```

## Language-specific tips

### Chinese

```python
# Chinese works well with larger models
model = whisper.load_model("large")

result = model.transcribe(
    "chinese_audio.mp3",
    language="zh",
    initial_prompt="这是一段关于技术的讨论"  # Context helps
)
```

### Japanese

```python
# Japanese benefits from initial prompt
result = model.transcribe(
    "japanese_audio.mp3",
    language="ja",
    initial_prompt="これは技術的な会議の録音です"
)
```

### Arabic

```python
# Arabic: Use large model for best results
model = whisper.load_model("large")

result = model.transcribe(
    "arabic_audio.mp3",
    language="ar"
)
```

## Model size recommendations

| Language Tier | Recommended Model | WER |
|---------------|-------------------|-----|
| Top-tier (en, es, fr, de) | base/turbo | < 10% |
| Good (ar, tr, vi) | medium/large | 10-20% |
| Lower-resource | large | 20-30% |

## Performance by language

### English

- **tiny**: WER ~15%
- **base**: WER ~8%
- **small**: WER ~5%
- **medium**: WER ~4%
- **large**: WER ~3%
- **turbo**: WER ~3.5%

### Spanish

- **tiny**: WER ~20%
- **base**: WER ~12%
- **medium**: WER ~6%
- **large**: WER ~4%

### Chinese

- **small**: WER ~15%
- **medium**: WER ~8%
- **large**: WER ~5%

## Best practices

1. **Use English-only models** - Better for small models (tiny/base)
2. **Specify language** - Faster than auto-detect
3. **Add initial prompt** - Improves accuracy for technical terms
4. **Use larger models** - For low-resource languages
5. **Test on sample** - Quality varies by accent/dialect
6. **Consider audio quality** - Clear audio = better results
7. **Check language cod