Speech Recognition

Name: Speech Recognition
Author: dpearson2699

dpearson2699/swift-ios-skills

2.6k installs
944 repo stars
Updated July 15, 2026

speech-recognition is a Swift skill for Apple Speech framework live and file transcription with SpeechAnalyzer and SFSpeechRecognizer.

About

Speech Recognition transcribes live microphone and pre-recorded audio using Apple's Speech framework. It targets Swift 6.3 and iOS 26+, with SFSpeechRecognizer fallbacks for older targets. On iOS 26+, SpeechAnalyzer supports SpeechTranscriber, DictationTranscriber, SpeechDetector, AssetInventory model installation, and async result streams with finalizeAndFinish lifecycle methods. The setup checklist covers module choice, locale availability checks, preset selection for progressive or time-indexed transcription, and converting audio buffers to bestAvailableAudioFormat before yielding AnalyzerInput. The SFSpeechRecognizer paths handle authorization requests, AVAudioEngine live capture, file-based recognition, and on-device versus server recognition tradeoffs. The skill hands off post-transcript NLP to natural-language, playback UI to avkit, and generative summarization to apple-on-device-ai. Common mistakes include using undocumented offlineTranscription presets, assuming that finishing an AsyncStream input finishes the analyzer session, and skipping authorization before capture. Review checklists verify microphone and speech permissions, locale support, asset installation, and on-device model readiness.

SpeechAnalyzer with SpeechTranscriber on iOS 26+, including asset installation.
SFSpeechRecognizer fallback for older OS and server locales.
Live AVAudioEngine microphone and file-based recognition patterns.
Authorization for speech and microphone before capture starts.
Scope boundaries to natural-language, avkit, and on-device AI skills.

Speech Recognition by the numbers

2,618 all-time installs (skills.sh)
+111 installs in the week ending Jul 29, 2026 (Skillselion tracking)
Ranked #74 of 1,039 Mobile Development skills by installs in the Skillselion catalog
Security screen: LOW risk (skills.sh audit)
Data as of Jul 31, 2026 (Skillselion catalog sync)

At a glance

speech-recognition capabilities & compatibility

Capabilities: speechanalyzer module and preset selection · asset installation and locale availability check · sfspeechrecognizer live and file recognition · speech and microphone authorization flows · asyncsequence result consumption and session fin
Use cases: frontend · transcription
Platforms: macOS
Runs: Runs locally
Pricing: Free

npx skills add https://github.com/dpearson2699/swift-ios-skills --skill speech-recognition


Installs	2.6k
repo stars	★ 944
Security audit	3 / 3 scanners passed
Last updated	July 15, 2026
Repository	dpearson2699/swift-ios-skills ↗

How do I implement live microphone transcription with correct authorization and iOS 26 SpeechAnalyzer presets?

Implement live and file speech-to-text with SpeechAnalyzer, SFSpeechRecognizer, and microphone authorization in Swift.

Who is it for?

iOS developers adding dictation, live captions, or audio file transcription in Swift apps.

Skip if: you need post-transcript sentiment or translation; use natural-language instead.

When should I use this skill?

User implements AVAudioEngine transcription, SpeechAnalyzer, SFSpeechRecognizer, or speech authorization flows.

What you get

Authorized capture pipeline with locale-checked recognizer, asset installation, and async transcript results.

SpeechAnalyzer Swift implementation
Live transcription pipeline
Session cleanup logic

Files

SKILL.md (Markdown)

Speech Recognition

Transcribe live and pre-recorded audio to text using Apple's Speech framework. Covers SpeechAnalyzer / SpeechTranscriber (iOS 26+) and SFSpeechRecognizer (iOS 10+). Targets Swift 6.3 / iOS 26+ while preserving fallback guidance for apps that support older OS versions.

Scope boundary: Use this skill for speech-to-text recognition, speech authorization, microphone capture plumbing, and result handling. Hand off text analysis, language identification after transcription, sentiment, embeddings, and translation to natural-language; hand off audio playback UI to avkit; hand off summarization or generation over transcripts to apple-on-device-ai.

SpeechAnalyzer Strategy (iOS 26+)
SFSpeechRecognizer Setup
Authorization
Live Microphone Transcription
Pre-Recorded Audio File Recognition
On-Device vs Server Recognition
Handling Results
Common Mistakes
Review Checklist
References

SpeechAnalyzer Strategy (iOS 26+)

Use SpeechAnalyzer for modern iOS 26+ speech analysis, especially long-form recordings, live transcription, time-indexed transcripts, and fully on-device flows. Keep SFSpeechRecognizer for iOS 10+ deployment targets, server-backed locale coverage, or existing callback/delegate implementations.
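The choice described above can be sketched as a small decision function. This is only an illustration: the `RecognitionEngine` enum, the `chooseEngine` function, and all of its parameters are hypothetical names, and a real app would combine `#available` with the availability checks from the setup checklist rather than pass a version number around.

```swift
/// Hypothetical labels for the two recognition paths described above.
enum RecognitionEngine: Equatable {
    case speechAnalyzer      // SpeechAnalyzer + SpeechTranscriber (iOS 26+)
    case sfSpeechRecognizer  // SFSpeechRecognizer (iOS 10+)
}

/// Picks an engine from the OS major version and two app-level requirements.
/// All parameters are illustrative assumptions, not part of the Speech framework.
func chooseEngine(
    iOSMajorVersion: Int,
    needsServerBackedLocale: Bool,
    hasLegacyDelegatePipeline: Bool
) -> RecognitionEngine {
    // Older deployment targets can only use SFSpeechRecognizer.
    guard iOSMajorVersion >= 26 else { return .sfSpeechRecognizer }
    // Server-backed locale coverage or an existing callback/delegate
    // implementation are the stated reasons to stay on SFSpeechRecognizer.
    if needsServerBackedLocale || hasLegacyDelegatePipeline {
        return .sfSpeechRecognizer
    }
    return .speechAnalyzer
}
```

Keeping the decision in one pure function makes it easy to unit test and to revisit when the deployment target changes.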

Read SpeechAnalyzer patterns when implementing an iOS 26+ transcription pipeline, model asset handling, volatile results, or file/buffer examples.

SpeechAnalyzer setup checklist

1. Choose the module:
   SpeechTranscriber for the newer general-purpose on-device model.
   DictationTranscriber when SpeechTranscriber is unavailable for the current device or locale and dictation-compatible support is acceptable.
   SpeechDetector only in conjunction with a transcriber when voice activity detection is worth the accuracy/power tradeoff.
2. Check support before creating the session:
   SpeechTranscriber.isAvailable
   SpeechTranscriber.supportedLocale(equivalentTo:)
   SpeechTranscriber.installedLocales / supportedLocales when showing language choices.
3. Pick a documented preset:
   .transcription for basic accurate transcription.
   .progressiveTranscription for live UI updates.
   .timeIndexedProgressiveTranscription when playback highlighting needs audioTimeRange.
4. Install required assets with AssetInventory.assetInstallationRequest.
5. Convert live audio buffers to SpeechAnalyzer.bestAvailableAudioFormat(compatibleWith:) before yielding AnalyzerInput.
6. Consume module results from their AsyncSequence in a separate task.
7. Finish explicitly with finalizeAndFinish(through:), finalizeAndFinishThroughEndOfInput(), or cancelAndFinishNow().

Do not use an offlineTranscription preset; Apple does not document one. Finishing an AsyncStream input sequence does not finish the analyzer session.

SFSpeechRecognizer Setup

Creating a recognizer with locale

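import Speech

// Default locale (user's current language); nil if that locale is unsupported
let defaultRecognizer = SFSpeechRecognizer()

// Specific locale
let usRecognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))

// Return a recognizer only if recognition is available for this locale
func availableRecognizer(for locale: Locale) -> SFSpeechRecognizer? {
    guard let recognizer = SFSpeechRecognizer(locale: locale),
          recognizer.isAvailable else {
        print("Speech recognition not available")
        return nil
    }
    return recognizer
}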

Monitoring availability changes

final class SpeechManager: NSObject, SFSpeechRecognizerDelegate {
    private let recognizer = SFSpeechRecognizer()!

    override init() {
        super.init()
        recognizer.delegate = self
    }

    func speechRecognizer(
        _ speechRecognizer: SFSpeechRecognizer,
        availabilityDidChange available: Bool
    ) {
        // Update UI — disable record button when unavailable
    }
}

Authorization

Request both speech recognition and microphone permissions before starting live transcription. Add these keys to Info.plist:

NSSpeechRecognitionUsageDescription
NSMicrophoneUsageDescription

import Speech
import AVFoundation

func requestPermissions() async -> Bool {
    let speechStatus = await withCheckedContinuation { continuation in
        SFSpeechRecognizer.requestAuthorization { status in
            continuation.resume(returning: status)
        }
    }
    guard speechStatus == .authorized else { return false }

    let micStatus: Bool
    if #available(iOS 17, *) {
        micStatus = await AVAudioApplication.requestRecordPermission()
    } else {
        micStatus = await withCheckedContinuation { continuation in
            AVAudioSession.sharedInstance().requestRecordPermission { granted in
                continuation.resume(returning: granted)
            }
        }
    }
    return micStatus
}

Live Microphone Transcription

The standard pattern: AVAudioEngine captures microphone audio → buffers are appended to SFSpeechAudioBufferRecognitionRequest → results stream in.

import Speech
import AVFoundation

final class LiveTranscriber {
    private let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))!
    private let audioEngine = AVAudioEngine()
    private var recognitionRequest: SFSpeechAudioBufferRecognitionRequest?
    private var recognitionTask: SFSpeechRecognitionTask?

    func startTranscribing() throws {
        // Cancel any in-progress task
        recognitionTask?.cancel()
        recognitionTask = nil

        // Configure audio session
        let audioSession = AVAudioSession.sharedInstance()
        try audioSession.setCategory(.record, mode: .measurement, options: .duckOthers)
        try audioSession.setActive(true, options: .notifyOthersOnDeactivation)

        // Create request
        let request = SFSpeechAudioBufferRecognitionRequest()
        request.shouldReportPartialResults = true
        self.recognitionRequest = request

        // Start recognition task
        // Capture self weakly so the task callback does not keep the transcriber alive
        recognitionTask = recognizer.recognitionTask(with: request) { [weak self] result, error in
            if let result {
                let text = result.bestTranscription.formattedString
                print("Transcription: \(text)")

                if result.isFinal {
                    self?.stopTranscribing()
                }
            }
            if let error {
                print("Recognition error: \(error)")
                self?.stopTranscribing()
            }
        }

        // Install audio tap
        let inputNode = audioEngine.inputNode
        let recordingFormat = inputNode.outputFormat(forBus: 0)
        inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) {
            buffer, _ in
            request.append(buffer)
        }

        audioEngine.prepare()
        try audioEngine.start()
    }

    func stopTranscribing() {
        audioEngine.stop()
        audioEngine.inputNode.removeTap(onBus: 0)
        recognitionRequest?.endAudio()
        recognitionRequest = nil
        recognitionTask?.cancel()
        recognitionTask = nil
    }
}

Pre-Recorded Audio File Recognition

Use SFSpeechURLRecognitionRequest for audio files on disk:

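import Speech

// Errors used by the transcription helpers in this document
enum SpeechError: Error {
    case unavailable
    case unsupportedLocale
}

func transcribeFile(at url: URL) async throws -> String {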
    guard let recognizer = SFSpeechRecognizer(), recognizer.isAvailable else {
        throw SpeechError.unavailable
    }
    let request = SFSpeechURLRecognitionRequest(url: url)
    request.shouldReportPartialResults = false

    return try await withCheckedThrowingContinuation { continuation in
        var didResume = false
        recognizer.recognitionTask(with: request) { result, error in
            guard !didResume else { return }
            if let error {
                didResume = true
                continuation.resume(throwing: error)
            } else if let result, result.isFinal {
                didResume = true
                continuation.resume(
                    returning: result.bestTranscription.formattedString
                )
            }
        }
    }
}

On-Device vs Server Recognition

SFSpeechRecognizer can use on-device recognition for supported locales on iOS 13+. If supportsOnDeviceRecognition is false, the recognizer requires a network connection. requiresOnDeviceRecognition only has effect when the recognizer supports it.

let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))!

// Check if on-device is supported for this locale
if recognizer.supportsOnDeviceRecognition {
    let request = SFSpeechAudioBufferRecognitionRequest()
    request.requiresOnDeviceRecognition = true  // Force on-device
}

SFSpeechRecognizer requests may still be a poor fit for long-form capture. Apple documents a roughly one-minute task limit for speech recognition and other service limits. For long recordings on iOS 26+, prefer SpeechAnalyzer; otherwise chunk or restart recognition before the limit and preserve transcript state across tasks.
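One way to preserve transcript state across restarted tasks is to keep committed segments separate from the in-flight partial text. The sketch below is framework-free and rests on assumptions: `SegmentedTranscript` and its members are hypothetical names, and segments are joined with a single space, which may not suit every language.

```swift
import Foundation

/// Hypothetical helper that keeps transcript text across several short recognition tasks.
struct SegmentedTranscript {
    private(set) var committed: [String] = []   // text from finished segments
    private(set) var latestPartial = ""         // text from the active segment

    /// Call with each partial or final string reported by the active task.
    mutating func update(partial text: String) {
        latestPartial = text
    }

    /// Call right before restarting recognition, ahead of the task limit.
    mutating func commitSegment() {
        let trimmed = latestPartial.trimmingCharacters(in: .whitespacesAndNewlines)
        if !trimmed.isEmpty { committed.append(trimmed) }
        latestPartial = ""
    }

    /// Full text for display: committed segments plus the in-flight partial.
    var displayText: String {
        (committed + (latestPartial.isEmpty ? [] : [latestPartial]))
            .joined(separator: " ")
    }
}
```

Because each new task starts its transcription from scratch, replacing `latestPartial` on every callback and committing only at rollover avoids both lost and duplicated text.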

Handling Results

Partial vs final results

let request = SFSpeechAudioBufferRecognitionRequest()
request.shouldReportPartialResults = true  // default is true

recognizer.recognitionTask(with: request) { result, error in
    guard let result else { return }

    if result.isFinal {
        // Final transcription: recognition is complete
        let finalText = result.bestTranscription.formattedString
    } else {
        // Partial result: may change as more audio is processed
        let partialText = result.bestTranscription.formattedString
    }
}

Accessing alternative transcriptions and confidence

recognizer.recognitionTask(with: request) { result, error in
    guard let result else { return }

    // Best transcription
    let best = result.bestTranscription

    // All alternatives (sorted by confidence, descending)
    for transcription in result.transcriptions {
        for segment in transcription.segments {
            print("\(segment.substring): \(segment.confidence)")
        }
    }
}

Adding punctuation (iOS 16+)

let request = SFSpeechAudioBufferRecognitionRequest()
request.addsPunctuation = true

Contextual strings

Improve recognition of domain-specific terms:

let request = SFSpeechAudioBufferRecognitionRequest()
request.contextualStrings = ["SwiftUI", "Xcode", "CloudKit"]

Common Mistakes

Not requesting both speech and microphone authorization

// ❌ DON'T: Only request speech authorization for live audio
SFSpeechRecognizer.requestAuthorization { status in
    // Missing microphone permission — audio engine will fail
    self.startRecording()
}

// ✅ DO: Request both permissions before recording
SFSpeechRecognizer.requestAuthorization { status in
    guard status == .authorized else { return }
    AVAudioSession.sharedInstance().requestRecordPermission { granted in
        guard granted else { return }
        self.startRecording()
    }
}

Not handling availability changes

// ❌ DON'T: Assume recognizer stays available after initial check
let recognizer = SFSpeechRecognizer()!
// Recognition may fail if network drops or locale changes

// ✅ DO: Monitor availability via delegate
recognizer.delegate = self
func speechRecognizer(
    _ speechRecognizer: SFSpeechRecognizer,
    availabilityDidChange available: Bool
) {
    recordButton.isEnabled = available
}

Not stopping the audio engine when recognition ends

// ❌ DON'T: Leave audio engine running after recognition finishes
recognizer.recognitionTask(with: request) { result, error in
    if result?.isFinal == true {
        // Audio engine still running, wasting resources and battery
    }
}

// ✅ DO: Clean up all audio resources
recognizer.recognitionTask(with: request) { result, error in
    if result?.isFinal == true || error != nil {
        self.audioEngine.stop()
        self.audioEngine.inputNode.removeTap(onBus: 0)
        self.recognitionRequest?.endAudio()
        self.recognitionRequest = nil
    }
}

Assuming on-device recognition is available for all locales

// ❌ DON'T: Force on-device without checking support
let request = SFSpeechAudioBufferRecognitionRequest()
request.requiresOnDeviceRecognition = true // Ignored unless the recognizer supports it

// ✅ DO: Check support before requiring on-device
if recognizer.supportsOnDeviceRecognition {
    request.requiresOnDeviceRecognition = true
} else {
    // Fall back to server-based or inform user
}

Not handling the one-minute recognition limit

// ❌ DON'T: Start one long continuous recognition session
func startRecording() {
    // SFSpeechRecognizer tasks can be cut off after about 60 seconds
}

// ✅ DO: roll the segment before the limit and let cleanup end audio once
func scheduleRecognitionRollover() {
    recognitionTimer = Timer.scheduledTimer(withTimeInterval: 55, repeats: false) { [weak self] _ in
        self?.commitLatestPartialText()
        self?.stopTranscribing()     // owns endAudio(), tap removal, and task cancellation
        try? self?.startTranscribing()
    }
}

SFSpeechRecognitionTask exposes finish(), cancel(), state, and error; do not invent task properties such as recognitionTask to restart work. Keep the active SFSpeechAudioBufferRecognitionRequest in your manager and call endAudio() from one cleanup path only.
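A single cleanup path is easier to enforce when teardown is guarded by a small gate that runs it at most once per segment. The following is a minimal sketch under that assumption; `CleanupGate` is a hypothetical helper, not part of the Speech framework, and the teardown closure stands in for the `endAudio()`, tap removal, and task cancellation work.

```swift
/// Hypothetical helper: lets teardown run at most once per recording segment.
struct CleanupGate {
    private var isRecording = false

    /// Call when a segment starts (after the tap and task are set up).
    mutating func begin() { isRecording = true }

    /// Runs `teardown` only if a segment is active; returns whether it ran.
    @discardableResult
    mutating func end(_ teardown: () -> Void) -> Bool {
        guard isRecording else { return false }
        isRecording = false
        teardown()
        return true
    }
}
```

With this in place, a stop button, a rollover timer, and the recognition callback can all call `end`, and only the first call performs the teardown. The struct is not thread-safe, so it should be used from one actor or queue.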

Treating SpeechAnalyzer input completion as session completion

// ❌ DON'T: Only finish the AsyncStream and expect result streams to close
inputBuilder.finish()

// ✅ DO: explicitly finish or cancel the analyzer session
let lastSampleTime = try await analyzer.analyzeSequence(inputSequence)
if let lastSampleTime {
    try await analyzer.finalizeAndFinish(through: lastSampleTime)
} else {
    await analyzer.cancelAndFinishNow()
}

Duplicating volatile SpeechAnalyzer results

// ✅ Replace volatile text with the finalized result for the same audio range
for try await result in transcriber.results {
    if result.isFinal {
        volatileTranscript = AttributedString()
        finalizedTranscript.append(result.text)
    } else {
        volatileTranscript = result.text
    }
}

Creating multiple simultaneous recognition tasks

// ❌ DON'T: Start a new task without canceling the previous one
func startRecording() {
    recognitionTask = recognizer.recognitionTask(with: request) { ... }
    // Previous task is still running — undefined behavior
}

// ✅ DO: Cancel existing task before creating a new one
func startRecording() {
    recognitionTask?.cancel()
    recognitionTask = nil
    recognitionTask = recognizer.recognitionTask(with: request) { ... }
}

Review Checklist

[ ] NSSpeechRecognitionUsageDescription is in Info.plist
[ ] NSMicrophoneUsageDescription is in Info.plist (if using live audio)
[ ] Authorization is requested before starting recognition
[ ] SFSpeechRecognizerDelegate is set to handle availabilityDidChange
[ ] Audio engine is stopped and tap removed when recognition ends
[ ] recognitionRequest.endAudio() is called when done recording
[ ] Previous recognitionTask is canceled before starting a new one
[ ] supportsOnDeviceRecognition is checked before requiring on-device mode
[ ] Partial results are handled separately from final (isFinal) results
[ ] SFSpeechRecognizer one-minute/service limits are accounted for
[ ] For iOS 26+: AssetInventory assets are installed before using SpeechAnalyzer
[ ] For iOS 26+: SpeechTranscriber.isAvailable and locale support are checked
[ ] For iOS 26+: live buffers are converted to the analyzer-compatible format
[ ] For iOS 26+: analyzer sessions are explicitly finalized or canceled
[ ] For iOS 26+: volatile results are replaced by finalized results, not duplicated

References

{
  "skill_name": "speech-recognition",
  "evals": [
    {
      "id": 0,
      "name": "speechanalyzer-live-transcription",
      "prompt": "I'm building an iOS 26 meeting recorder with live transcript text and word highlighting during playback. Sketch the SpeechAnalyzer implementation shape, including model setup, audio input, result handling, and cleanup.",
      "expected_output": "An iOS 26 SpeechAnalyzer plan that uses current SpeechTranscriber APIs, installs assets, converts live audio buffers, handles volatile/final results, and explicitly finishes the analyzer session.",
      "files": [],
      "assertions": [
        "Uses SpeechAnalyzer with SpeechTranscriber and a documented preset such as .timeIndexedProgressiveTranscription, not .offlineTranscription.",
        "Checks SpeechTranscriber.isAvailable and supportedLocale(equivalentTo:) before starting transcription.",
        "Installs or verifies model assets with AssetInventory.assetInstallationRequest(supporting:) before analysis.",
        "Converts AVAudioEngine microphone buffers to SpeechAnalyzer.bestAvailableAudioFormat(compatibleWith:) before yielding AnalyzerInput.",
        "Handles volatile results separately from finalized results so live text is replaced rather than duplicated.",
        "Explicitly finalizes or cancels the analyzer session after input ends."
      ]
    },
    {
      "id": 1,
      "name": "sfspeechrecognizer-live-review",
      "prompt": "Review this iOS live dictation plan: request only NSSpeechRecognitionUsageDescription, start AVAudioEngine immediately, use SFSpeechRecognizer forever in one recognition task, force requiresOnDeviceRecognition for every locale, and ignore availability changes. Give corrected guidance and focused Swift snippets.",
      "expected_output": "A correction-focused SFSpeechRecognizer review that covers speech and microphone authorization, live audio setup, availability changes, on-device checks, task cleanup, and recognition duration limits.",
      "files": [],
      "assertions": [
        "Requires both NSSpeechRecognitionUsageDescription and NSMicrophoneUsageDescription for live microphone transcription.",
        "Requests speech recognition authorization and microphone record permission before activating the audio session or starting AVAudioEngine.",
        "Uses SFSpeechRecognizerDelegate or availability handling for recognition service changes.",
        "Checks supportsOnDeviceRecognition before setting requiresOnDeviceRecognition and gives a fallback when unsupported.",
        "Accounts for SFSpeechRecognizer recognition duration/service limits instead of running one unbounded task.",
        "Stops AVAudioEngine, removes the input tap, ends audio, and cancels or clears the recognition task during cleanup."
      ]
    },
    {
      "id": 2,
      "name": "speech-boundary-routing",
      "prompt": "A notes feature records a spoken note, transcribes it, detects the text language and sentiment, translates a summary to Spanish, plays back the recording with captions, and optionally uses Apple Intelligence to summarize it. Which parts belong in the speech-recognition skill and which should be handed to sibling skills?",
      "expected_output": "A boundary-aware routing answer that keeps speech-to-text and microphone capture in Speech, and routes text analysis, translation, playback UI, and generative summarization to the correct sibling skills.",
      "files": [],
      "assertions": [
        "Keeps microphone capture, speech authorization, SpeechAnalyzer or SFSpeechRecognizer transcription, and transcript result handling in the speech-recognition scope.",
        "Routes language detection, sentiment, text embeddings, and translation to the natural-language skill after transcription exists.",
        "Routes playback UI, captions during media playback, or AVPlayer/AVKit concerns to the avkit skill.",
        "Routes Apple Intelligence or Foundation Models summarization to the apple-on-device-ai skill.",
        "Does not present NaturalLanguage as a speech-to-text framework or Speech as a general text-analysis framework."
      ]
    }
  ]
}

SpeechAnalyzer Patterns

Use this reference when implementing iOS 26+ speech-to-text with SpeechAnalyzer, SpeechTranscriber, DictationTranscriber, SpeechDetector, or AssetInventory.

Choosing Modules
Preparing Assets
Transcribing Files
Live Audio
Handling Results
Finishing Sessions
References

Choosing Modules

Prefer SpeechTranscriber for the newer general-purpose on-device model. Before building UI around it, check both device and locale support:

guard SpeechTranscriber.isAvailable,
      let locale = SpeechTranscriber.supportedLocale(equivalentTo: Locale.current)
else {
    // Disable the feature or try DictationTranscriber for compatible devices/locales.
    return
}

Use documented presets only:

.transcription for basic accurate transcription.
.transcriptionWithAlternatives for editing suggestions.
.timeIndexedTranscriptionWithAlternatives for audio-time metadata plus alternatives.
.progressiveTranscription for low-latency live UI updates.
.timeIndexedProgressiveTranscription for live UI updates with time ranges.

Use DictationTranscriber when SpeechTranscriber is unavailable and the app can accept dictation-model behavior. Add SpeechDetector only with a transcriber module, and only when voice activity detection is worth the risk of dropping speech-like audio.
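The preset list above can be read as a small decision table. The sketch below encodes it without importing Speech: `PresetName`, `TranscriptionNeeds`, and `presetName(for:)` are hypothetical names that only mirror the preset identifiers listed in this section, and the tie-breaking rules in the comments are assumptions.

```swift
/// Hypothetical mirror of the preset names listed above, kept as a plain enum
/// so the sketch compiles without the Speech framework.
enum PresetName: String {
    case transcription
    case transcriptionWithAlternatives
    case timeIndexedTranscriptionWithAlternatives
    case progressiveTranscription
    case timeIndexedProgressiveTranscription
}

/// Illustrative description of what the feature needs from the transcriber.
struct TranscriptionNeeds {
    var liveUpdates = false
    var timeRanges = false
    var alternatives = false
}

func presetName(for needs: TranscriptionNeeds) -> PresetName {
    if needs.liveUpdates {
        // Assumption: no listed progressive preset carries alternatives,
        // so `alternatives` is ignored for live UI updates.
        return needs.timeRanges
            ? .timeIndexedProgressiveTranscription
            : .progressiveTranscription
    }
    if needs.timeRanges {
        // The only listed non-progressive preset with time metadata.
        return .timeIndexedTranscriptionWithAlternatives
    }
    return needs.alternatives ? .transcriptionWithAlternatives : .transcription
}
```

In an app, the returned case would be mapped to the matching `SpeechTranscriber` preset at the call site.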

Preparing Assets

SpeechAnalyzer modules require model assets. The system installs and shares them outside the app bundle, but the app must request installation for the module configuration it plans to use.

let transcriber = SpeechTranscriber(locale: locale, preset: .transcription)

if let request = try await AssetInventory.assetInstallationRequest(
    supporting: [transcriber]
) {
    try await request.downloadAndInstall()
}

For language pickers, use installedLocales, supportedLocales, and AssetInventory.status(forModules:) to distinguish installed, downloadable, and unsupported choices. The app has a limited number of locale reservations; release unused reservations with AssetInventory.release(reservedLocale:).
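For a language picker, the three-way split can be sketched as a pure function over the locale lists. This is an illustration under assumptions: `LocaleChoice` and `classify` are hypothetical names, the two arrays stand in for the values read from `installedLocales` and `supportedLocales`, and matching by normalized identifier is a simplification that ignores `AssetInventory.status(forModules:)`.

```swift
import Foundation

/// Hypothetical picker states for one candidate locale.
enum LocaleChoice: Equatable {
    case installed      // assets already on the device
    case downloadable   // supported, but assets still need installation
    case unsupported    // not offered by the transcriber
}

/// Normalizes identifiers such as "en_US" and "en-US" to the same key.
private func key(for locale: Locale) -> String {
    locale.identifier.replacingOccurrences(of: "_", with: "-").lowercased()
}

/// Classifies `candidate` against caller-supplied locale lists.
func classify(
    _ candidate: Locale,
    installed: [Locale],
    supported: [Locale]
) -> LocaleChoice {
    let candidateKey = key(for: candidate)
    if installed.contains(where: { key(for: $0) == candidateKey }) {
        return .installed
    }
    if supported.contains(where: { key(for: $0) == candidateKey }) {
        return .downloadable
    }
    return .unsupported
}
```

The picker can then show installed locales as ready, offer a download action for the second group, and hide or disable the rest.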

Transcribing Files

For files, let the analyzer convert the file to a compatible format and finish the session after the file is consumed.

func transcribeFile(at url: URL, locale: Locale) async throws -> AttributedString {
    guard let supportedLocale = SpeechTranscriber.supportedLocale(equivalentTo: locale) else {
        throw SpeechError.unsupportedLocale
    }

    let transcriber = SpeechTranscriber(
        locale: supportedLocale,
        preset: .transcription
    )

    if let request = try await AssetInventory.assetInstallationRequest(
        supporting: [transcriber]
    ) {
        try await request.downloadAndInstall()
    }

    let analyzer = SpeechAnalyzer(modules: [transcriber])
    async let transcript = transcriber.results.reduce(into: AttributedString()) {
        text, result in
        text.append(result.text)
    }

    let file = try AVAudioFile(forReading: url)
    let lastSampleTime = try await analyzer.analyzeSequence(from: file)
    if let lastSampleTime {
        try await analyzer.finalizeAndFinish(through: lastSampleTime)
    } else {
        await analyzer.cancelAndFinishNow()
    }

    return try await transcript
}

Live Audio

For live audio, create an AsyncStream<AnalyzerInput>, convert microphone buffers to the analyzer-compatible format, yield them, and consume results in a separate task.

let transcriber = SpeechTranscriber(
    locale: locale,
    preset: .timeIndexedProgressiveTranscription
)
let analyzer = SpeechAnalyzer(modules: [transcriber])
let audioFormat = await SpeechAnalyzer.bestAvailableAudioFormat(
    compatibleWith: [transcriber]
)
let (inputSequence, inputBuilder) = AsyncStream.makeStream(of: AnalyzerInput.self)

// In the audio-engine tap, convert each AVAudioPCMBuffer to audioFormat first.
inputBuilder.yield(AnalyzerInput(buffer: convertedBuffer))

Use AVAudioConverter or an existing project audio pipeline for the conversion. Do not feed arbitrary input-node formats directly unless they already match a compatible analyzer format.
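A conversion step with AVAudioConverter might look like the sketch below. It assumes one converter per input/output format pair that is reused for every buffer; `BufferConverter` and `BufferConversionError` are hypothetical names, and the output format would be the one returned by `bestAvailableAudioFormat(compatibleWith:)`. The `canImport` guard only keeps the file compiling on platforms without AVFoundation.

```swift
#if canImport(AVFoundation)
import AVFoundation

enum BufferConversionError: Error {
    case cannotCreateConverter, cannotAllocateOutput, conversionFailed
}

/// Hypothetical wrapper that reuses one AVAudioConverter across tap buffers.
final class BufferConverter {
    private let converter: AVAudioConverter
    private let outputFormat: AVAudioFormat

    init(from inputFormat: AVAudioFormat, to outputFormat: AVAudioFormat) throws {
        guard let converter = AVAudioConverter(from: inputFormat, to: outputFormat) else {
            throw BufferConversionError.cannotCreateConverter
        }
        // Skip priming so output is produced from the first buffer onward.
        converter.primeMethod = .none
        self.converter = converter
        self.outputFormat = outputFormat
    }

    func convert(_ buffer: AVAudioPCMBuffer) throws -> AVAudioPCMBuffer {
        // Size the output for the sample-rate change, rounding up.
        let ratio = outputFormat.sampleRate / buffer.format.sampleRate
        let frames = (Double(buffer.frameLength) * ratio).rounded(.up)
        let capacity = max(1, AVAudioFrameCount(frames))
        guard let output = AVAudioPCMBuffer(
            pcmFormat: outputFormat, frameCapacity: capacity
        ) else {
            throw BufferConversionError.cannotAllocateOutput
        }

        // Hand the input buffer to the converter exactly once per call.
        nonisolated(unsafe) var didSupplyInput = false
        var conversionError: NSError?
        let status = converter.convert(to: output, error: &conversionError) {
            _, inputStatus in
            if didSupplyInput {
                inputStatus.pointee = .noDataNow
                return nil
            }
            didSupplyInput = true
            inputStatus.pointee = .haveData
            return buffer
        }
        if status == .error {
            if let conversionError { throw conversionError }
            throw BufferConversionError.conversionFailed
        }
        return output
    }
}
#endif
```

The converted buffer is what gets wrapped in `AnalyzerInput(buffer:)` and yielded to the input stream. Reusing the converter matters because a fresh converter per buffer would discard resampler state between buffers.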

Handling Results

SpeechTranscriber.Result.text is an AttributedString. Time-indexed presets include audio time range attributes that can drive playback highlighting.

When using progressive presets, volatile results may be replaced by later final results. Keep volatile display state separate so the UI does not duplicate text.

for try await result in transcriber.results {
    if result.isFinal {
        volatileTranscript = AttributedString()
        finalizedTranscript.append(result.text)
    } else {
        volatileTranscript = result.text
    }
}
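The same replace-not-append rule can be exercised without the Speech framework, which is handy for unit tests. In this sketch `ToyResult` and `LiveTranscript` are hypothetical stand-ins: the real result carries an AttributedString, while plain strings are used here to keep the example self-contained.

```swift
/// Hypothetical stand-in for a transcriber result.
struct ToyResult {
    let text: String
    let isFinal: Bool
}

/// Keeps finalized text and the current volatile guess in separate fields.
struct LiveTranscript {
    private(set) var finalized = ""
    private(set) var volatile = ""

    mutating func apply(_ result: ToyResult) {
        if result.isFinal {
            // The final result supersedes the volatile guess for that audio.
            finalized += result.text
            volatile = ""
        } else {
            // Replace, never append: each volatile result restates the guess.
            volatile = result.text
        }
    }

    /// Text to show in the UI: stable text followed by the volatile tail.
    var display: String { finalized + volatile }
}
```

Appending volatile results instead of replacing them would render "hel" followed by "hello" as "helhello", which is the duplication this pattern avoids.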

Finishing Sessions

The analyzer can only analyze one input sequence at a time. Ending your stream does not finish the analyzer session; call a finish or cancel method.

Use:

finalizeAndFinish(through:) after analyzeSequence(_:) returns a final sample time.
finalizeAndFinishThroughEndOfInput() after autonomous start(inputSequence:).
cancelAndFinishNow() for immediate cancellation.

After the session finishes, result streams terminate and most analyzer methods no longer accept new work. Create a new analyzer for each new session.
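The lifecycle described in this section can be pictured as a tiny state machine. This is a toy model for reasoning and tests, not the analyzer's actual implementation: `SessionState`, `SessionEvent`, and `advance` are hypothetical names, and the event cases only borrow the names of the finish methods listed above.

```swift
/// Toy model of an analyzer session's lifecycle.
enum SessionState: Equatable {
    case analyzing    // input is still being fed
    case inputEnded   // the input sequence ended; the session is still open
    case finished     // result streams have terminated
}

enum SessionEvent {
    case inputSequenceEnded
    case finalizeAndFinish
    case cancelAndFinishNow
}

func advance(_ state: SessionState, on event: SessionEvent) -> SessionState {
    switch (state, event) {
    case (.finished, _):
        // A finished session stays finished; new work needs a new analyzer.
        return .finished
    case (_, .finalizeAndFinish), (_, .cancelAndFinishNow):
        // Only an explicit finish or cancel ends the session.
        return .finished
    case (.analyzing, .inputSequenceEnded), (.inputEnded, .inputSequenceEnded):
        // Ending the input alone never reaches `.finished`.
        return .inputEnded
    }
}
```

The point of the model is the last case: a loop over the results would keep waiting in `.inputEnded` until one of the finish events arrives.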

References

Related skills

Xcode Project Setup · Automatically create and configure a new Xcode project with Swift Package Manager dependencies for iOS or macOS agent projects. · 74.7k · 392

Expo Tailwind Setup · Instantly configure Tailwind CSS v4 with NativeWind v5 and react-native-css inside an Expo project for universal styling. · 46.7k · 2.3k

Expo Dev Client · Create custom development clients for Expo React Native apps that need native modules or Apple-specific targets. · 45.9k · 2.3k

Swiftui Expert Skill · Get expert guidance when writing, reviewing, or refactoring SwiftUI views, state, performance, and modern iOS/macOS APIs. · 27.6k · 3.3k

Flutter Apply Architecture Best Practices · Enforce clean layered architecture when creating or refactoring a Flutter mobile application. · 25.4k · 2.7k

Expo Module · Create custom config plugins that safely modify native Android and iOS projects generated by Expo prebuild. · 25k · 2.3k

FAQ

When should I use SpeechAnalyzer over SFSpeechRecognizer?

Use SpeechAnalyzer on iOS 26+ for long-form, live, or time-indexed on-device transcription flows.

Does finishing the input stream finish the analyzer?

No. Finishing an AsyncStream input does not finish the analyzer session; call finalizeAndFinish explicitly.

Where does text analysis go after transcription?

Hand off language identification, sentiment, and translation to the natural-language skill.

Is Speech Recognition safe to install?

skills.sh reports 3 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.
