
Whisper
Choose Whisper language codes and quality tiers when adding multilingual speech-to-text to apps, agents, or pipelines.
Install
npx skills add https://github.com/orchestra-research/ai-research-skills --skill whisper
What is this skill?
- Documents 99 supported Whisper languages end to end
- Top-tier WER under 10% called out for 12 major locales including English, Spanish, and Chinese
- Good-tier WER 10–20% bucket for 15 additional languages
- Full alphabetical language list for locale and subtitle pipeline planning
- Guides multilingual ASR scope before shipping voice features
Adoption & trust: 1 install on skills.sh; 9.4k GitHub stars; 3/3 security scanners passed (skills.sh audits).
Journey fit
Primary fit
Speech-to-text integration decisions are made while wiring product features, so Build/integrations is the primary shelf, even though ops teams reuse the same reference. The skill fits work that maps ASR locales, WER expectations, and ISO-like language lists into APIs and agent tools.
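As a rough illustration of that mapping, the sketch below reduces an application locale to a Whisper language code, looks up its tier, and suggests a model size. The tier sets are copied from the lists in SKILL.md. The helper names `whisper_code` and `plan_transcription` are hypothetical, and picking one model per tier (turbo, medium, large) is an assumption, since the guide lists `base/turbo` and `medium/large` as pairs. The helper does not check that a code is among the 99 supported languages.

```python
# Tier membership copied from the SKILL.md lists; all names here are illustrative.
TOP_TIER = {"en", "es", "fr", "de", "it", "pt", "nl", "pl", "ru", "ja", "ko", "zh"}
GOOD_TIER = {"ar", "tr", "vi", "sv", "fi", "cs", "ro", "hu",
             "da", "no", "th", "he", "el", "id", "ms"}


def whisper_code(locale: str) -> str:
    """Reduce a locale tag such as 'es-MX' or 'pt_BR' to its base language code."""
    return locale.replace("_", "-").split("-")[0].lower()


def plan_transcription(locale: str) -> dict:
    """Return the tier, a suggested model, and transcribe() kwargs for a locale."""
    code = whisper_code(locale)
    if code in TOP_TIER:
        tier, model = "top", "turbo"        # assumption: turbo from "base/turbo"
    elif code in GOOD_TIER:
        tier, model = "good", "medium"      # assumption: medium from "medium/large"
    else:
        tier, model = "lower-resource", "large"
    return {"tier": tier, "model": model, "transcribe_kwargs": {"language": code}}


if __name__ == "__main__":
    for locale in ("es-MX", "ar", "sw"):
        print(locale, plan_transcription(locale))
```

The returned `transcribe_kwargs` can be passed straight to `model.transcribe(path, **kwargs)` once the suggested model is loaded.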
Common Questions / FAQ
Is Whisper safe to install?
skills.sh reports 3 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.
SKILL.md
# Whisper Language Support Guide

Complete guide to Whisper's multilingual capabilities.

## Supported languages (99 total)

### Top-tier support (WER < 10%)

- English (en)
- Spanish (es)
- French (fr)
- German (de)
- Italian (it)
- Portuguese (pt)
- Dutch (nl)
- Polish (pl)
- Russian (ru)
- Japanese (ja)
- Korean (ko)
- Chinese (zh)

### Good support (WER 10-20%)

- Arabic (ar)
- Turkish (tr)
- Vietnamese (vi)
- Swedish (sv)
- Finnish (fi)
- Czech (cs)
- Romanian (ro)
- Hungarian (hu)
- Danish (da)
- Norwegian (no)
- Thai (th)
- Hebrew (he)
- Greek (el)
- Indonesian (id)
- Malay (ms)

### Full list (99 languages)

Afrikaans, Albanian, Amharic, Arabic, Armenian, Assamese, Azerbaijani, Bashkir, Basque, Belarusian, Bengali, Bosnian, Breton, Bulgarian, Burmese, Cantonese, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Faroese, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Lao, Latin, Latvian, Lingala, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Moldavian, Mongolian, Myanmar, Nepali, Norwegian, Nynorsk, Occitan, Pashto, Persian, Polish, Portuguese, Punjabi, Pushto, Romanian, Russian, Sanskrit, Serbian, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Spanish, Sundanese, Swahili, Swedish, Tagalog, Tajik, Tamil, Tatar, Telugu, Thai, Tibetan, Turkish, Turkmen, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, Yiddish, Yoruba

## Usage examples

### Auto-detect language

```python
import whisper

model = whisper.load_model("turbo")

# Auto-detect language
result = model.transcribe("audio.mp3")
print(f"Detected language: {result['language']}")
print(f"Text: {result['text']}")
```

### Specify language (faster)

```python
# Specify language for faster transcription (reuses `model` from above)
result = model.transcribe("audio.mp3", language="es")  # Spanish
result = model.transcribe("audio.mp3", language="fr")  # French
result = model.transcribe("audio.mp3", language="ja")  # Japanese
```

### Translation to English

```python
# Translate any language to English.
# The turbo model is not trained for translation, so load a multilingual model.
model = whisper.load_model("medium")
result = model.transcribe(
    "spanish_audio.mp3",
    task="translate"  # Translates to English
)
print(f"Original language: {result['language']}")
print(f"English translation: {result['text']}")
```

## Language-specific tips

### Chinese

```python
# Chinese works well with larger models
model = whisper.load_model("large")
result = model.transcribe(
    "chinese_audio.mp3",
    language="zh",
    initial_prompt="这是一段关于技术的讨论"  # Context helps
)
```

### Japanese

```python
# Japanese benefits from initial prompt
result = model.transcribe(
    "japanese_audio.mp3",
    language="ja",
    initial_prompt="これは技術的な会議の録音です"
)
```

### Arabic

```python
# Arabic: Use large model for best results
model = whisper.load_model("large")
result = model.transcribe(
    "arabic_audio.mp3",
    language="ar"
)
```

## Model size recommendations

| Language Tier | Recommended Model | WER |
|---------------|-------------------|-----|
| Top-tier (en, es, fr, de) | base/turbo | < 10% |
| Good (ar, tr, vi) | medium/large | 10-20% |
| Lower-resource | large | 20-30% |

## Performance by language

### English

- **tiny**: WER ~15%
- **base**: WER ~8%
- **small**: WER ~5%
- **medium**: WER ~4%
- **large**: WER ~3%
- **turbo**: WER ~3.5%

### Spanish

- **tiny**: WER ~20%
- **base**: WER ~12%
- **medium**: WER ~6%
- **large**: WER ~4%

### Chinese

- **small**: WER ~15%
- **medium**: WER ~8%
- **large**: WER ~5%

## Best practices

1. **Use English-only models** - For English audio; the gain is largest at small sizes (tiny/base)
2. **Specify language** - Faster than auto-detect
3. **Add initial prompt** - Improves accuracy for technical terms
4. **Use larger models** - For low-resource languages
5. **Test on sample** - Quality varies by accent/dialect
6. **Consider audio quality** - Clear audio = better results
7. **Check language cod