Audio Transcriber

Name: Audio Transcriber
Author: sickn33

sickn33/antigravity-awesome-skills

1.2k installs
44k repo stars
Updated July 27, 2026

audio-transcriber provides documented workflows for transforming audio recordings into professional Markdown documentation with intelligent summaries using LLM integration.

About

The audio-transcriber skill transforms audio recordings into professional Markdown documentation with intelligent summaries using LLM integration. It automates audio-to-text transcription, extracts rich technical metadata (speakers, timestamps, language, file size, duration), and generates structured meeting minutes and executive summaries. It uses Faster-Whisper or Whisper with zero configuration, working across projects without hardcoded paths or API keys. Inspired by tools like Plaud, it turns raw audio recordings into actionable documentation, which makes it a good fit for meetings, interviews, lectures, and content analysis.

Invoke this skill when the user asks variations of "transcribe this audio", "convert audio to text", or "generate meeting notes from recording", has audio files in common formats (MP3, WAV, M4A, OGG, FLAC, WEBM), or matches any of the following:

User needs to transcribe audio/video files to text
User wants meeting minutes automatically generated from recordings
User requires speaker identification (diarization) in conversations
User needs subtitles/captions (SRT, VTT formats)
User wants executive summaries of long audio content

Audio Transcriber by the numbers

1,187 all-time installs (skills.sh)
+23 installs in the week ending Jul 28, 2026 (Skillselion tracking)
Ranked #355 of 2,209 Security skills by installs in the Skillselion catalog
Security screen: MEDIUM risk (skills.sh audit)
Data as of Jul 28, 2026 (Skillselion catalog sync)

At a glance

audio-transcriber capabilities & compatibility

Capabilities: transcribe audio/video files to text · generate meeting minutes from recordings · speaker identification (diarization) · subtitles/captions (SRT, VTT formats) · executive summaries of long audio content
Use cases: documentation

From the docs

What audio-transcriber says it does

It uses Faster-Whisper or Whisper with zero configuration, working universally across projects without hardcoded paths or API keys.

SKILL.md

Inspired by tools like Plaud, this skill transforms raw audio recordings into actionable documentation, making it ideal for meetings, interviews, lectures, and content analysis.

SKILL.md

npx skills add https://github.com/sickn33/antigravity-awesome-skills --skill audio-transcriber


Installs	1.2k
repo stars	★ 44k
Security audit	2 / 3 scanners passed
Last updated	July 27, 2026
Repository	sickn33/antigravity-awesome-skills ↗

How do I use audio-transcriber for the task described in its SKILL.md triggers?

Transform audio recordings into professional Markdown documentation with intelligent summaries using LLM integration

Who is it for?

Teams invoking audio-transcriber when the user request matches documented triggers and prerequisites.

Skip if: cached docs are missing, the request is a negative trigger, or another sibling skill owns the workflow.

When should I use this skill?

Transform audio recordings into professional Markdown documentation with intelligent summaries using LLM integration

What you get

Step-by-step guidance grounded in audio-transcriber documentation and reference files.

Structured LLM-processed document
Improved or auto-generated extraction prompt

By the numbers

Version 1.1.0 released 2026-02-03 with Step 3b prompt workflow

Files

SKILL.md (Markdown)

Purpose

This skill automates audio-to-text transcription with professional Markdown output, extracting rich technical metadata (speakers, timestamps, language, file size, duration) and generating structured meeting minutes and executive summaries. It uses Faster-Whisper or Whisper with zero configuration, working universally across projects without hardcoded paths or API keys.

Inspired by tools like Plaud, this skill transforms raw audio recordings into actionable documentation, making it ideal for meetings, interviews, lectures, and content analysis.

When to Use

Invoke this skill when:

User needs to transcribe audio/video files to text
User wants meeting minutes automatically generated from recordings
User requires speaker identification (diarization) in conversations
User needs subtitles/captions (SRT, VTT formats)
User wants executive summaries of long audio content
User asks variations of "transcribe this audio", "convert audio to text", "generate meeting notes from recording"
User has audio files in common formats (MP3, WAV, M4A, OGG, FLAC, WEBM)

Workflow

Step 0: Discovery (Auto-detect Transcription Tools)

Objective: Identify available transcription engines without user configuration.

Actions:

Run detection commands to find installed tools:

# Check for Faster-Whisper (preferred - 4-5x faster)
if python3 -c "import faster_whisper" 2>/dev/null; then
    TRANSCRIBER="faster-whisper"
    echo "✅ Faster-Whisper detected (optimized)"
# Fallback to original Whisper
elif python3 -c "import whisper" 2>/dev/null; then
    TRANSCRIBER="whisper"
    echo "✅ OpenAI Whisper detected"
else
    TRANSCRIBER="none"
    echo "⚠️  No transcription tool found"
fi

# Check for ffmpeg (audio format conversion)
if command -v ffmpeg &>/dev/null; then
    echo "✅ ffmpeg available (format conversion enabled)"
else
    echo "ℹ️  ffmpeg not found (limited format support)"
fi
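
The same discovery step can be sketched in Python. This is a minimal illustration rather than the skill's own code: the function names `detect_transcriber` and `has_ffmpeg` are hypothetical, and the lookup only checks that the module can be found, whereas the shell version above actually imports it.

```python
import importlib.util
import shutil


def detect_transcriber():
    """Return "faster-whisper", "whisper", or "none", preferring Faster-Whisper."""
    # find_spec locates a module without importing it, so detection stays cheap
    if importlib.util.find_spec("faster_whisper") is not None:
        return "faster-whisper"
    if importlib.util.find_spec("whisper") is not None:
        return "whisper"
    return "none"


def has_ffmpeg():
    """True when an ffmpeg executable is on PATH (enables format conversion)."""
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    print(f"transcriber={detect_transcriber()} ffmpeg={has_ffmpeg()}")
```

Because nothing is imported, a broken installation would still be reported as present; a real implementation may prefer a guarded import.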

If no transcriber found:

Offer automatic installation using the provided script:

echo "⚠️  No transcription tool found"
echo ""
echo "🔧 Auto-install dependencies? (Recommended)"
read -p "Run installation script? [Y/n]: " AUTO_INSTALL

if [[ ! "$AUTO_INSTALL" =~ ^[Nn] ]]; then
    # Get skill directory (works for both repo and symlinked installations)
    SKILL_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
    
    # Run installation script
    if [[ -f "$SKILL_DIR/scripts/install-requirements.sh" ]]; then
        bash "$SKILL_DIR/scripts/install-requirements.sh"
    else
        echo "❌ Installation script not found"
        echo ""
        echo "📦 Manual installation:"
        echo "  pip install faster-whisper  # Recommended"
        echo "  pip install openai-whisper  # Alternative"
        echo "  brew install ffmpeg         # Optional (macOS)"
        exit 1
    fi
    
    # Verify installation succeeded
    if python3 -c "import faster_whisper" 2>/dev/null || python3 -c "import whisper" 2>/dev/null; then
        echo "✅ Installation successful! Proceeding with transcription..."
    else
        echo "❌ Installation failed. Please install manually."
        exit 1
    fi
else
    echo ""
    echo "📦 Manual installation required:"
    echo ""
    echo "Recommended (fastest):"
    echo "  pip install faster-whisper"
    echo ""
    echo "Alternative (original):"
    echo "  pip install openai-whisper"
    echo ""
    echo "Optional (format conversion):"
    echo "  brew install ffmpeg  # macOS"
    echo "  apt install ffmpeg   # Linux"
    echo ""
    exit 1
fi

This ensures users can install dependencies with one confirmation, or opt for manual installation if preferred.

If transcriber found:

Proceed to Step 0b (CLI Detection).
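
Step 0b itself is not reproduced in this section. Based only on the priority stated in the changelog (Claude > GitHub Copilot > none), a detection routine might look like the sketch below. The function name matches `detect_cli_tool()` from the changelog, but the body is an assumption, and so is the injectable `which` parameter, added here only to make the logic easy to exercise.

```python
import shutil


def detect_cli_tool(which=shutil.which):
    """Return "claude", "gh-copilot", or None, following the documented priority."""
    if which("claude"):
        return "claude"
    # Assumption: an installed `gh` binary is treated as Copilot being available;
    # confirming the gh-copilot extension itself would need an extra check.
    if which("gh"):
        return "gh-copilot"
    # No LLM CLI found: the caller stays in transcript-only mode
    return None
```

Returning `None` rather than raising keeps the transcript-only fallback a normal code path.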

Step 1: Validate Audio File

Objective: Verify file exists, check format, and extract metadata.

Actions:

1. Accept file path or URL from user:

Local file: meeting.mp3
URL: https://example.com/audio.mp3 (download to temp directory)

2. Verify file exists:

if [[ ! -f "$AUDIO_FILE" ]]; then
    echo "❌ File not found: $AUDIO_FILE"
    exit 1
fi

3. Extract metadata using ffprobe or file utilities:

# Get file size
FILE_SIZE=$(du -h "$AUDIO_FILE" | cut -f1)

# Get duration and format using ffprobe
DURATION=$(ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$AUDIO_FILE" 2>/dev/null)
FORMAT=$(ffprobe -v error -select_streams a:0 -show_entries \
    stream=codec_name -of default=noprint_wrappers=1:nokey=1 "$AUDIO_FILE" 2>/dev/null)

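# Convert duration to HH:MM:SS (ffprobe returns fractional seconds; `date -r` is
# macOS-only and rejects decimals, so use shell arithmetic instead)
SECS="${DURATION%.*}"
if [[ "$SECS" =~ ^[0-9]+$ ]]; then
    DURATION_HMS=$(printf '%02d:%02d:%02d' $((SECS / 3600)) $((SECS % 3600 / 60)) $((SECS % 60)))
else
    DURATION_HMS="Unknown"
fi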

4. Check file size (warn if large for cloud APIs):

SIZE_MB=$(du -m "$AUDIO_FILE" | cut -f1)
if [[ $SIZE_MB -gt 25 ]]; then
    echo "⚠️  Large file ($FILE_SIZE) - processing may take several minutes"
fi

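5. Validate format (supported: MP3, WAV, M4A, OGG, FLAC, WEBM, MP4):

EXTENSION="${AUDIO_FILE##*.}"
# Lowercase with tr; the ${VAR,,} expansion needs bash 4+, which stock macOS lacks
EXTENSION_LC=$(printf '%s' "$EXTENSION" | tr '[:upper:]' '[:lower:]')
SUPPORTED_FORMATS=("mp3" "wav" "m4a" "ogg" "flac" "webm" "mp4")

if [[ ! " ${SUPPORTED_FORMATS[*]} " =~ " ${EXTENSION_LC} " ]]; then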
    echo "⚠️  Unsupported format: $EXTENSION"
    if command -v ffmpeg &>/dev/null; then
        echo "🔄 Converting to WAV..."
        ffmpeg -i "$AUDIO_FILE" -ar 16000 "${AUDIO_FILE%.*}.wav" -y
        AUDIO_FILE="${AUDIO_FILE%.*}.wav"
    else
        echo "❌ Install ffmpeg to convert formats: brew install ffmpeg"
        exit 1
    fi
fi
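
For readers who prefer Python, the duration formatting and extension check from this step can be sketched as follows. The helper names `format_hms` and `is_supported` are hypothetical, and the format set simply mirrors the `SUPPORTED_FORMATS` array above.

```python
from pathlib import Path

SUPPORTED_FORMATS = {"mp3", "wav", "m4a", "ogg", "flac", "webm", "mp4"}


def format_hms(seconds):
    """Format a duration in seconds (number or numeric string) as HH:MM:SS."""
    try:
        total = int(float(seconds))
    except (TypeError, ValueError, OverflowError):
        # Empty ffprobe output, non-numeric text, NaN, or infinity
        return "Unknown"
    if total < 0:
        return "Unknown"
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def is_supported(path):
    """True when the file extension (case-insensitive) is in SUPPORTED_FORMATS."""
    return Path(path).suffix.lower().lstrip(".") in SUPPORTED_FORMATS
```

For example, `format_hms("2732.45")` gives `00:45:32`, and `is_supported("Meeting.MP3")` is true because the comparison ignores case.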

Step 3: Generate Markdown Output

Objective: Create structured Markdown with metadata, transcription, meeting minutes, and summary.

Output Template:

# Audio Transcription Report

## 📊 Metadata

| Field | Value |
|-------|-------|
| **File Name** | {filename} |
| **File Size** | {file_size} |
| **Duration** | {duration_hms} |
| **Language** | {language} ({language_code}) |
| **Processed Date** | {process_date} |
| **Speakers Identified** | {num_speakers} |
| **Transcription Engine** | {engine} (model: {model}) |


## 📋 Meeting Minutes

### Participants
- {speaker_1}
- {speaker_2}
- ...

### Topics Discussed
1. **{topic_1}** ({timestamp})
   - {key_point_1}
   - {key_point_2}

2. **{topic_2}** ({timestamp})
   - {key_point_1}

### Decisions Made
- ✅ {decision_1}
- ✅ {decision_2}

### Action Items
- [ ] **{action_1}** - Assigned to: {speaker} - Due: {date_if_mentioned}
- [ ] **{action_2}** - Assigned to: {speaker}


*Generated by audio-transcriber skill v1.1.0*  
*Transcription engine: {engine} | Processing time: {elapsed_time}s*

Implementation:

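Use Python or bash with an AI model (Claude/GPT) for intelligent summarization. In the outline below, cluster_by_topic, extract_action_items, extract_decisions, and call_ai_model are placeholders left for the integrator to implement: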

def generate_meeting_minutes(segments):
    """Extract topics, decisions, action items from transcription."""
    
    # Group segments by topic (simple clustering by timestamps)
    topics = cluster_by_topic(segments)
    
    # Identify action items (keywords: "should", "will", "need to", "action")
    action_items = extract_action_items(segments)
    
    # Identify decisions (keywords: "decided", "agreed", "approved")
    decisions = extract_decisions(segments)
    
    return {
        "topics": topics,
        "decisions": decisions,
        "action_items": action_items
    }

def generate_summary(segments, max_paragraphs=5):
    """Create executive summary using AI (Claude/GPT via API or local model)."""
    
    full_text = " ".join([s["text"] for s in segments])
    
    # Use Chain of Density approach (from prompt-engineer frameworks)
    summary_prompt = f"""
    Summarize the following transcription in {max_paragraphs} concise paragraphs.
    Focus on key topics, decisions, and action items.
    
    Transcription:
    {full_text}
    """
    
    # Call AI model (placeholder - user can integrate Claude API or use local model)
    summary = call_ai_model(summary_prompt)
    
    return summary
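
The keyword heuristics named in the comments above can be made concrete. The sketch below is one possible reading, not the skill's actual implementation: it fills in `extract_action_items` and `extract_decisions` with whole-word matching on the listed keywords, and the helper `find_segments` is a hypothetical name introduced for illustration.

```python
import re

ACTION_KEYWORDS = ("should", "will", "need to", "action")
DECISION_KEYWORDS = ("decided", "agreed", "approved")


def find_segments(segments, keywords):
    """Return the segments whose text contains at least one keyword."""
    # Whole-word, case-insensitive match so "will" does not hit "willow"
    alternatives = "|".join(re.escape(k) for k in keywords)
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    return [s for s in segments if pattern.search(s["text"])]


def extract_action_items(segments):
    """Segments that look like action items, judged by keyword only."""
    return find_segments(segments, ACTION_KEYWORDS)


def extract_decisions(segments):
    """Segments that look like decisions, judged by keyword only."""
    return find_segments(segments, DECISION_KEYWORDS)
```

Keyword matching is deliberately crude: it will flag sentences such as "it will rain", so the results are best treated as candidates for the LLM step or a human reviewer.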

Output file naming:

# v1.1.0: Use a timestamp to avoid overwriting earlier runs
TIMESTAMP=$(date +%Y%m%d-%H%M%S)
TRANSCRIPT_FILE="transcript-${TIMESTAMP}.md"
ATA_FILE="ata-${TIMESTAMP}.md"

echo "$TRANSCRIPT_CONTENT" > "$TRANSCRIPT_FILE"
echo "✅ Transcript saved: $TRANSCRIPT_FILE"

if [[ -n "$ATA_CONTENT" ]]; then
    echo "$ATA_CONTENT" > "$ATA_FILE"
    echo "✅ Minutes saved: $ATA_FILE"
fi
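
The same naming scheme is easy to express in Python. In this sketch the function name `output_paths` and its `now` parameter are hypothetical; the `now` argument exists only so the result can be checked deterministically.

```python
from datetime import datetime
from pathlib import Path


def output_paths(output_dir=".", now=None):
    """Return (transcript_path, ata_path) sharing one YYYYMMDD-HHMMSS stamp."""
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    base = Path(output_dir)
    return base / f"transcript-{stamp}.md", base / f"ata-{stamp}.md"
```

Computing the stamp once and reusing it keeps the two files paired even if the run crosses a second boundary.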

SCENARIO A: User Provided Custom Prompt

Workflow:

1. Display user's prompt:

   📝 Prompt provided by the user:
   ┌──────────────────────────────────┐
   │ [User's prompt preview]          │
   └──────────────────────────────────┘

2. Automatically improve with prompt-engineer (if available):

   🔧 Improving prompt with prompt-engineer...
   [Invokes: gh copilot -p "improve this prompt: {user_prompt}"]

3. Show both versions:

   ✨ Improved version:
   ┌──────────────────────────────────┐
   │ Role: You are a documenter...    │
   │ Instructions: Transform...       │
   │ Steps: 1) ... 2) ...             │
   │ End Goal: ...                    │
   └──────────────────────────────────┘

   📝 Original version:
   ┌──────────────────────────────────┐
   │ [User's original prompt]         │
   └──────────────────────────────────┘

4. Ask which to use:

   💡 Use improved version? [y/n] (default: y):

5. Process with selected prompt:

If "y": use improved
If "n": use original
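
The confirmation in steps 4 and 5 reduces to a small decision function. The sketch below is an assumption about how that choice could be coded, with the hypothetical name `choose_prompt`; it treats an empty answer as the default (use the improved version) and only an answer starting with "n" as a refusal.

```python
def choose_prompt(original, improved, answer=""):
    """Pick the improved prompt unless the user explicitly answers no."""
    # An empty answer falls back to the default: use the improved version
    declined = answer.strip().lower().startswith("n")
    return original if declined else improved
```

Checking only for a leading "n" means any affirmative spelling is accepted without listing each one.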

LLM Processing (Both Scenarios)

Once prompt is finalized:

import subprocess

from rich.progress import Progress, SpinnerColumn, TextColumn

def process_with_llm(transcript, prompt, cli_tool='claude'):
    """Send the prompt plus transcript to the chosen CLI; return its output or None."""
    full_prompt = f"{prompt}\n\n---\n\nTranscription:\n\n{transcript}"
    
    with Progress(
        SpinnerColumn(),
        TextColumn("[progress.description]{task.description}"),
        transient=True
    ) as progress:
        progress.add_task(
            description=f"🤖 Processing with {cli_tool}...",
            total=None
        )
        
        if cli_tool == 'claude':
            result = subprocess.run(
                ['claude', '-'],
                input=full_prompt,
                capture_output=True,
                text=True,
                timeout=300  # 5 minutes
            )
        elif cli_tool == 'gh-copilot':
            result = subprocess.run(
                ['gh', 'copilot', 'suggest', '-t', 'shell', full_prompt],
                capture_output=True,
                text=True,
                timeout=300
            )
        else:
            # Unknown tool: nothing to run, so the caller stays in transcript-only mode
            return None
    
    if result.returncode == 0:
        return result.stdout.strip()
    else:
        return None
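
The changelog mentions timeout handling and a graceful fallback when the CLI is unavailable, which the function above does not show. A minimal sketch of that behavior, using a hypothetical helper `run_cli` that is not part of the skill, could wrap `subprocess.run` like this:

```python
import subprocess
import sys


def run_cli(cmd, stdin_text, timeout=300):
    """Run cmd with stdin_text on stdin; return stripped stdout, or None on failure."""
    try:
        result = subprocess.run(
            cmd, input=stdin_text, capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        # Missing binary or a hung process: report "no LLM output" to the caller
        return None
    return result.stdout.strip() if result.returncode == 0 else None


if __name__ == "__main__":
    # Stand-in for a real LLM CLI: a child Python process that upper-cases stdin
    echo_cmd = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"]
    print(run_cli(echo_cmd, "hello"))
```

Mapping every failure to `None` lets the caller fall back to transcript-only output with a single check.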

Progress output:

🤖 Processing with claude... ⠋
[After completion:]
✅ Minutes generated successfully!

Final Output

Success (both files):

💾 Saving files...

✅ Files created:
  - transcript-20260203-023045.md  (raw transcript)
  - ata-20260203-023045.md         (processed with LLM)

🧹 Removed temporary files: metadata.json, transcription.json

✅ Done! Total time: 3m 45s

Transcript only (user declined LLM):

💾 Saving files...

✅ File created:
  - transcript-20260203-023045.md

ℹ️  Minutes not generated (LLM processing declined by the user)

🧹 Removed temporary files: metadata.json, transcription.json

✅ Done!
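
The temporary-file cleanup reported in the output above could be implemented along these lines. This is a sketch under assumptions: the function name `cleanup_temp_files` appears in the changelog, but its signature and return value here are hypothetical, and `keep_temp` stands in for the documented `--keep-temp` flag.

```python
from pathlib import Path

TEMP_FILES = ("metadata.json", "transcription.json")


def cleanup_temp_files(output_dir, keep_temp=False):
    """Delete the known temporary files; return the names actually removed."""
    removed = []
    if keep_temp:
        return removed
    for name in TEMP_FILES:
        path = Path(output_dir) / name
        if path.is_file():
            path.unlink()
            removed.append(name)
    return removed
```

Returning the removed names makes it straightforward to print the cleanup line only when something was actually deleted.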

Step 5: Display Results Summary

Objective: Show completion status and next steps.

Output:

echo ""
echo "✅ Transcription Complete!"
echo ""
echo "📊 Results:"
echo "  File: $OUTPUT_FILE"
echo "  Language: $LANGUAGE"
echo "  Duration: $DURATION_HMS"
echo "  Speakers: $NUM_SPEAKERS"
echo "  Words: $WORD_COUNT"
echo "  Processing time: ${ELAPSED_TIME}s"
echo ""
echo "📝 Generated:"
echo "  - $OUTPUT_FILE (Markdown report)"
# Print the next two lines only if alternative formats were generated:
echo "  - ${OUTPUT_FILE%.*}.srt (Subtitles)"
echo "  - ${OUTPUT_FILE%.*}.json (Structured data)"
echo ""
echo "🎯 Next steps:"
echo "  1. Review meeting minutes and action items"
echo "  2. Share report with participants"
echo "  3. Track action items to completion"

Example Usage

Example 1: Basic Transcription

User Input:

copilot> transcribe audio to markdown: meeting-2026-02-02.mp3

Skill Output:

✅ Faster-Whisper detected (optimized)
✅ ffmpeg available (format conversion enabled)

📂 File: meeting-2026-02-02.mp3
📊 Size: 12.3 MB
⏱️  Duration: 00:45:32

🎙️  Processing...
[████████████████████] 100%

✅ Language detected: Portuguese (pt-BR)
👥 Speakers identified: 4
📝 Generating Markdown output...

✅ Transcription Complete!

📊 Results:
  File: meeting-2026-02-02.md
  Language: pt-BR
  Duration: 00:45:32
  Speakers: 4
  Words: 6,842
  Processing time: 127s

📝 Generated:
  - meeting-2026-02-02.md (Markdown report)

🎯 Next steps:
  1. Review meeting minutes and action items
  2. Share report with participants
  3. Track action items to completion

Example 3: Batch Processing

User Input:

copilot> transcreva estes áudios: recordings/*.mp3

Skill Output:

📦 Batch mode: 5 files found
  1. team-standup.mp3
  2. client-call.mp3
  3. brainstorm-session.mp3
  4. product-demo.mp3
  5. retrospective.mp3

🎙️  Processing batch...

[1/5] team-standup.mp3 ✅ (2m 34s)
[2/5] client-call.mp3 ✅ (15m 12s)
[3/5] brainstorm-session.mp3 ✅ (8m 47s)
[4/5] product-demo.mp3 ✅ (22m 03s)
[5/5] retrospective.mp3 ✅ (11m 28s)

✅ Batch Complete!
📝 Generated 5 Markdown reports
⏱️  Total processing time: 6m 15s
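
A batch loop like the one shown can be sketched as below. The names `run_batch` and `transcribe` are hypothetical; `transcribe` stands for whatever single-file routine is in use, and the sketch assumes that one failing file should not abort the rest of the batch.

```python
from pathlib import Path


def run_batch(directory, transcribe, pattern="*.mp3"):
    """Call transcribe(path) for each matching file; return {file name: succeeded}."""
    files = sorted(Path(directory).glob(pattern))
    results = {}
    for index, path in enumerate(files, start=1):
        try:
            transcribe(path)
        except Exception as exc:
            # Keep going so one bad file does not stop the whole batch
            print(f"[{index}/{len(files)}] {path.name} failed: {exc}")
            results[path.name] = False
        else:
            print(f"[{index}/{len(files)}] {path.name} done")
            results[path.name] = True
    return results
```

Sorting the matches gives a stable processing order, which keeps the `[n/N]` progress lines reproducible between runs.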

Example 5: Large File Warning

User Input:

copilot> transcribe audio to markdown: conference-keynote.mp3

Skill Output:

✅ Faster-Whisper detected (optimized)

📂 File: conference-keynote.mp3
📊 Size: 87.2 MB
⏱️  Duration: 02:15:47
⚠️  Large file (87.2 MB) - processing may take several minutes

Continue? [Y/n]:

User: Y

🎙️  Processing... (this may take 10-15 minutes)
[████░░░░░░░░░░░░░░░░] 20% - Estimated time remaining: 12m

This skill is platform-agnostic and works in any terminal context where GitHub Copilot CLI is available. It does not depend on specific project configurations or external APIs, following the zero-configuration philosophy.

Limitations

Use this skill only when the task clearly matches the scope described above.
Do not treat the output as a substitute for environment-specific validation, testing, or expert review.
Stop and ask for clarification if required inputs, permissions, safety boundaries, or success criteria are missing.

Changelog - audio-transcriber

All notable changes to the audio-transcriber skill will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

---

[1.1.0] - 2026-02-03

✨ Added

Intelligent Prompt Workflow (Step 3b) - Complete integration with prompt-engineer skill
Scenario A: User-provided prompts are automatically improved with prompt-engineer
Displays both original and improved versions side-by-side
Single confirmation: "Use improved version? [y/n]"
Scenario B: Auto-generation when no prompt provided
Analyzes transcript and suggests document type (minutes, summary, notes)
Shows suggestion and asks confirmation
Generates complete structured prompt (RISEN/RODES/STAR)
Shows preview and asks final confirmation
Falls back to DEFAULT_MEETING_PROMPT if declined

LLM Integration - Process transcripts with Claude CLI or GitHub Copilot CLI
Priority: Claude > GitHub Copilot > None (transcript-only mode)
Step 0b: CLI detection logic documented
Timeout handling (5 minutes default)
Graceful fallback if CLI unavailable

Progress Indicators - Visual feedback during long operations
tqdm progress bar for Whisper transcription segments
rich spinner for LLM processing
Clear status messages at each step

Timestamp-based File Naming - Avoid overwriting previous transcriptions
Format: transcript-YYYYMMDD-HHMMSS.md
Format: ata-YYYYMMDD-HHMMSS.md
Prevents data loss from repeated runs

Automatic Cleanup - Remove temporary files after processing
Deletes metadata.json and transcription.json automatically
--keep-temp flag to preserve if needed
Clean output directory

Rich Terminal UI - Beautiful output with rich library
Formatted panels for prompt previews
Color-coded status messages (green=success, yellow=warning, red=error)
Spinner animations for long-running tasks

Dual Output Support - Generate both the transcript and processed meeting minutes (ata)
transcript-*.md - Raw transcription with timestamps
ata-*.md - Intelligent summary/meeting minutes (if LLM available)
User can decline LLM processing to get transcript-only

🔧 Changed

SKILL.md - Major documentation updates
Added Step 0b (CLI Detection)
Updated Step 2 (Progress Indicators)
Added Step 3b (Intelligent Prompt Workflow with 150+ lines)
Updated version to 1.1.0
Added detailed workflow diagrams for both scenarios

install-requirements.sh - Added UI libraries
Now installs tqdm and rich packages
Graceful fallback if installation fails
Updated success messages

Python Implementation - Complete refactor
Created scripts/transcribe.py (516 lines)
Functions: detect_cli_tool(), invoke_prompt_engineer(), handle_prompt_workflow(), process_with_llm(), transcribe_audio(), save_outputs(), cleanup_temp_files()
Command-line arguments: --prompt, --model, --output-dir, --keep-temp
Auto-installs rich and tqdm if missing
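
The four flags listed for scripts/transcribe.py could be declared with argparse roughly as follows. Only the flag names come from the changelog; the positional `audio` argument, the defaults, and the help texts are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser for the flags named in the changelog; defaults are assumed."""
    parser = argparse.ArgumentParser(prog="transcribe.py")
    # Assumption: the audio file is passed as a positional argument
    parser.add_argument("audio", help="path to the audio file")
    parser.add_argument("--prompt", default=None, help="custom prompt for LLM processing")
    parser.add_argument("--model", default="base", help="Whisper model size")
    parser.add_argument("--output-dir", default=".", help="directory for output files")
    parser.add_argument("--keep-temp", action="store_true", help="keep temporary JSON files")
    return parser
```

A call such as `build_parser().parse_args(["meeting.mp3", "--keep-temp"])` yields a namespace with `keep_temp` set to true and the other options at their assumed defaults.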

🐛 Fixed

User prompts no longer ignored - v1.0.0 completely ignored custom prompts
Now processes all prompts (custom or auto-generated) with LLM
Improves simple prompts into structured frameworks

Temporary files cleanup - v1.0.0 left metadata.json and transcription.json as trash
Now automatically removed after processing
Clean output directory

File overwriting - v1.0.0 used same filename (e.g., meeting.md) every time
Now uses timestamp to prevent data loss
Each run creates unique files

Missing minutes/summary - v1.0.0 only generated raw transcript
Now generates intelligent minutes/summary using an LLM
Respects user's prompt instructions

No progress feedback - v1.0.0 had silent processing (users didn't know if it froze)
Now shows progress bar for transcription
Shows spinner for LLM processing
Clear status messages throughout

📝 Notes

Backward Compatibility: Fully compatible with v1.0.0 workflows
Requires: Python 3.8+, faster-whisper OR whisper, tqdm, rich
Optional: Claude CLI or GitHub Copilot CLI for intelligent processing
Optional: prompt-engineer skill for automatic prompt generation

🔗 Related Issues

Fixes #1: User's RISEN prompt ignored
Fixes #2: Temporary files (metadata.json, transcription.json) left behind as junk
Fixes #3: Incomplete output (RAW transcript only, no meeting minutes)
Fixes #4: No visual progress indicator
Fixes #5: Output file name without timestamp

---

[1.0.0] - 2026-02-02

✨ Initial Release

Audio transcription using Faster-Whisper or OpenAI Whisper
Automatic language detection
Speaker diarization (basic)
Voice Activity Detection (VAD)
Markdown output with metadata table
Installation script for dependencies
Example scripts for basic transcription
Support for multiple audio formats (MP3, WAV, M4A, OGG, FLAC, WEBM)
FFmpeg integration for format conversion
Zero-configuration philosophy

📝 Known Limitations (Fixed in v1.1.0)

User prompts ignored (no LLM integration)
Only raw transcript generated (no minutes/summary)
Temporary files not cleaned up
No progress indicators
Files overwritten on repeated runs

#!/usr/bin/env bash

# Basic Audio Transcription Example
# Demonstrates how to use the audio-transcriber skill manually

set -euo pipefail

# Configuration
AUDIO_FILE="${1:-}"
MODEL="${MODEL:-base}"  # Options: tiny, base, small, medium, large
OUTPUT_FORMAT="${OUTPUT_FORMAT:-markdown}"  # Options: markdown, txt, srt, vtt, json

# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

# Helper functions
error() {
    echo -e "${RED}❌ Error: $1${NC}" >&2
    exit 1
}

success() {
    echo -e "${GREEN}✅ $1${NC}"
}

info() {
    echo -e "${BLUE}ℹ️  $1${NC}"
}

warn() {
    echo -e "${YELLOW}⚠️  $1${NC}"
}

# Check if audio file is provided
if [[ -z "$AUDIO_FILE" ]]; then
    error "Usage: $0 <audio_file>"
fi

# Verify file exists
if [[ ! -f "$AUDIO_FILE" ]]; then
    error "File not found: $AUDIO_FILE"
fi

# Step 0: Discovery - Check for transcription tools
info "Step 0: Discovering transcription tools..."

TRANSCRIBER=""
if python3 -c "import faster_whisper" 2>/dev/null; then
    TRANSCRIBER="faster-whisper"
    success "Faster-Whisper detected (optimized)"
elif python3 -c "import whisper" 2>/dev/null; then
    TRANSCRIBER="whisper"
    success "OpenAI Whisper detected"
else
    error "No transcription tool found. Install with: pip install faster-whisper"
fi

# Check for ffmpeg
if command -v ffmpeg &>/dev/null; then
    success "ffmpeg available (format conversion enabled)"
else
    warn "ffmpeg not found (limited format support)"
fi

# Step 1: Extract metadata
info "Step 1: Extracting audio metadata..."

FILE_SIZE=$(du -h "$AUDIO_FILE" | cut -f1)
info "File size: $FILE_SIZE"

# Get duration if ffprobe is available
if command -v ffprobe &>/dev/null; then
    DURATION=$(ffprobe -v error -show_entries format=duration \
        -of default=noprint_wrappers=1:nokey=1 "$AUDIO_FILE" 2>/dev/null || echo "0")
    
    # Convert to HH:MM:SS
    if command -v date &>/dev/null; then
        if [[ "$OSTYPE" == "darwin"* ]]; then
            # macOS
            DURATION_HMS=$(date -u -r "${DURATION%.*}" +%H:%M:%S 2>/dev/null || echo "Unknown")
        else
            # Linux
            DURATION_HMS=$(date -u -d @"${DURATION%.*}" +%H:%M:%S 2>/dev/null || echo "Unknown")
        fi
    else
        DURATION_HMS="Unknown"
    fi
    
    info "Duration: $DURATION_HMS"
else
    warn "ffprobe not found - cannot extract duration"
    DURATION="0"
    DURATION_HMS="Unknown"
fi

# Check file size warning
SIZE_MB=$(du -m "$AUDIO_FILE" | cut -f1)
if [[ $SIZE_MB -gt 25 ]]; then
    warn "Large file ($FILE_SIZE) - processing may take several minutes"
    read -p "Continue? [Y/n]: " CONTINUE
    if [[ "$CONTINUE" =~ ^[Nn] ]]; then
        info "Transcription cancelled"
        exit 0
    fi
fi

# Step 2: Transcribe using Python
info "Step 2: Transcribing audio..."

OUTPUT_FILE="${AUDIO_FILE%.*}.md"
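# Keep the X's at the end of the template so BSD (macOS) mktemp randomizes them too
TEMP_JSON="$(mktemp "${TMPDIR:-/tmp}/transcription.XXXXXX")"
# Remove the temp file on any exit, including failures under set -e
trap 'rm -f "$TEMP_JSON"' EXIT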

AUDIO_FILE_ENV="$AUDIO_FILE" MODEL_ENV="$MODEL" TRANSCRIBER_ENV="$TRANSCRIBER" TEMP_JSON_ENV="$TEMP_JSON" python3 << 'EOF'
import os
import sys
import json
from datetime import datetime

try:
    audio_file = os.environ["AUDIO_FILE_ENV"]
    model_name = os.environ["MODEL_ENV"]
    transcriber = os.environ["TRANSCRIBER_ENV"]
    temp_json = os.environ["TEMP_JSON_ENV"]

    if transcriber == "faster-whisper":
        from faster_whisper import WhisperModel
        model = WhisperModel(model_name, device="cpu", compute_type="int8")
        segments, info = model.transcribe(audio_file, language=None, vad_filter=True)
        
        data = {
            "language": info.language,
            "language_probability": round(info.language_probability, 2),
            "duration": info.duration,
            "segments": []
        }
        
        for segment in segments:
            data["segments"].append({
                "start": round(segment.start, 2),
                "end": round(segment.end, 2),
                "text": segment.text.strip()
            })
    else:
        import whisper
        model = whisper.load_model(model_name)
        result = model.transcribe(audio_file)
        
        data = {
            "language": result["language"],
            "duration": result["segments"][-1]["end"] if result["segments"] else 0,
            "segments": result["segments"]
        }
    
    with open(temp_json, "w", encoding="utf-8") as f:
        json.dump(data, f)
    
    print(f"✅ Language detected: {data['language']}")
    print(f"📝 Transcribed {len(data['segments'])} segments")
    
except Exception as e:
    print(f"❌ Error: {e}", file=sys.stderr)
    sys.exit(1)
EOF

# Check if transcription succeeded (mktemp pre-creates the file, so test for non-empty)
if [[ ! -s "$TEMP_JSON" ]]; then
    error "Transcription failed"
fi

# Step 3: Generate Markdown output
info "Step 3: Generating Markdown report..."

AUDIO_FILE_ENV="$AUDIO_FILE" FILE_SIZE_ENV="$FILE_SIZE" DURATION_HMS_ENV="$DURATION_HMS" TRANSCRIBER_ENV="$TRANSCRIBER" MODEL_ENV="$MODEL" TEMP_JSON_ENV="$TEMP_JSON" OUTPUT_FILE_ENV="$OUTPUT_FILE" python3 << 'EOF'
import json
import os
from datetime import datetime

# Load transcription data
with open(os.environ["TEMP_JSON_ENV"], encoding="utf-8") as f:
    data = json.load(f)

# Prepare metadata
filename = os.path.basename(os.environ["AUDIO_FILE_ENV"])
file_size = os.environ["FILE_SIZE_ENV"]
duration_hms = os.environ["DURATION_HMS_ENV"]
language = data["language"]
process_date = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
num_segments = len(data["segments"])
transcriber = os.environ["TRANSCRIBER_ENV"]
model_name = os.environ["MODEL_ENV"]

# Generate Markdown
markdown = f"""# Audio Transcription Report

## 📊 Metadata

| Field | Value |
|-------|-------|
| **File Name** | {filename} |
| **File Size** | {file_size} |
| **Duration** | {duration_hms} |
| **Language** | {language.upper()} |
| **Processed Date** | {process_date} |
| **Segments** | {num_segments} |
| **Transcription Engine** | {transcriber} (model: {model_name}) |

---

## 🎙️ Full Transcription

"""

# Add transcription with timestamps
for seg in data["segments"]:
    start_time = f"{int(seg['start'] // 60):02d}:{int(seg['start'] % 60):02d}"
    end_time = f"{int(seg['end'] // 60):02d}:{int(seg['end'] % 60):02d}"
    markdown += f"**[{start_time} → {end_time}]**  \n{seg['text']}\n\n"

markdown += f"""---

## 📝 Summary

*Automatic summary generation requires AI integration (Claude/GPT).*  
*For now, review the full transcription above.*

---

*Generated by audio-transcriber skill example script*  
*Transcription engine: {transcriber} | Model: {model_name}*
"""

# Write to file
with open(os.environ["OUTPUT_FILE_ENV"], "w", encoding="utf-8") as f:
    f.write(markdown)

print(f"✅ Markdown report saved: {os.environ['OUTPUT_FILE_ENV']}")
EOF

# Clean up
rm -f "$TEMP_JSON"

# Step 4: Display summary
success "Transcription complete!"
echo ""
echo "📊 Results:"
echo "  Output file: $OUTPUT_FILE"
echo "  Transcription engine: $TRANSCRIBER"
echo "  Model: $MODEL"
echo ""
info "Next steps:"
echo "  1. Review the transcription: cat $OUTPUT_FILE"
echo "  2. Edit if needed: vim $OUTPUT_FILE"
echo "  3. Share with team or archive"

Audio Transcriber Skill v1.1.0

Transform audio recordings into professional Markdown documentation with intelligent meeting minutes (atas) and summaries using LLM integration (Claude/Copilot CLI) and automatic prompt engineering.

🆕 What's New in v1.1.0

🧠 LLM Integration - Claude CLI (primary) or GitHub Copilot CLI (fallback) for intelligent processing
✨ Smart Prompts - Automatic integration with prompt-engineer skill
User-provided prompts → automatically improved → user chooses version
No prompt → analyzes transcript → suggests format → generates structured prompt
📊 Progress Indicators - Visual progress bars (tqdm) and spinners (rich)
📁 Timestamp Filenames - transcript-YYYYMMDD-HHMMSS.md + ata-YYYYMMDD-HHMMSS.md
🧹 Auto-Cleanup - Removes temporary metadata.json and transcription.json
🎨 Rich Terminal UI - Beautiful formatted output with panels and colors

See [CHANGELOG.md](./CHANGELOG.md) for complete v1.1.0 details.

🎯 Core Features

📝 Rich Markdown Output - Structured reports with metadata tables, timestamps, and formatting
🎙️ Speaker Diarization - Automatically identifies and labels different speakers
📊 Technical Metadata - Extracts file size, duration, language, processing time
📋 Intelligent Minutes/Summaries - Generated via LLM (Claude/Copilot) with customizable prompts
💡 Executive Summaries - AI-generated structured summaries with topics, decisions, action items
🌍 Multi-language - Supports 99 languages with auto-detection
⚡ Zero Configuration - Auto-discovers Faster-Whisper/Whisper installation
🔒 Privacy-First - Whisper transcription runs 100% locally with no cloud uploads (the optional LLM step goes through the Claude or Copilot CLI)
🚀 Flexible Modes - Transcript-only or intelligent processing with LLM

📦 Installation

Quick Install (NPX)

npx cli-ai-skills@latest install audio-transcriber

This automatically:

Downloads the skill
Installs Python dependencies (faster-whisper, tqdm, rich)
Installs ffmpeg (macOS via Homebrew)
Sets up the skill globally

Manual Installation

1. Install Transcription Engine

Recommended (fastest):

pip install faster-whisper tqdm rich

Alternative (original Whisper):

pip install openai-whisper tqdm rich

2. Install Audio Tools (Optional)

For format conversion support:

# macOS
brew install ffmpeg

# Linux
apt install ffmpeg

3. Install LLM CLI (Optional - for intelligent summaries)

Claude CLI (recommended):

# Follow: https://docs.anthropic.com/en/docs/claude-cli

GitHub Copilot CLI (alternative):

gh extension install github/gh-copilot

4. Install Skill

Global installation (auto-updates with git pull):

cd /path/to/cli-ai-skills
./scripts/install-skills.sh "$(pwd)"

Repository only:

# Skill is already available if you cloned the repo

🚀 Usage

Basic Transcription

copilot> transcribe audio to markdown: meeting.mp3

Output:

meeting.md - Full Markdown report with metadata, transcription, minutes, summary

With Subtitles

copilot> convert audio file to text with subtitles: interview.wav

Generates:

interview.md - Markdown report
interview.srt - Subtitle file
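For readers curious how an `.srt` file relates to the transcription data, here is a minimal sketch that converts segments into SRT cues. It assumes segments shaped like `{start, end, text}` with times in seconds; the function names `srt_timestamp` and `segments_to_srt` are hypothetical and not taken from the skill.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Render [{start, end, text}, ...] as numbered SRT cues (hypothetical helper)."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # Joining with a newline leaves the blank line SRT expects between cues.
    return "\n".join(cues)


example = [
    {"start": 12.0, "end": 45.5, "text": "Good morning everyone."},
    {"start": 46.0, "end": 83.25, "text": "We completed the dashboard redesign."},
]
print(segments_to_srt(example))
```

Working in integer milliseconds avoids floating-point drift when splitting a time into hours, minutes, and seconds.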

Batch Processing

copilot> transcreva estes áudios: recordings/*.mp3

Processes all MP3 files in the directory. The prompt is Portuguese for "transcribe these audio files".
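Batch processing amounts to collecting the matching files and transcribing them one by one. The sketch below covers only the collection step and is an illustration under assumptions: `find_audio_files` is a hypothetical helper, and the extension set mirrors the formats the skill lists as supported (MP3, WAV, M4A, OGG, FLAC, WEBM).

```python
import tempfile
from pathlib import Path

# Extensions mirror the formats the skill lists as supported.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".flac", ".webm"}


def find_audio_files(directory, extensions=AUDIO_EXTENSIONS):
    """Return audio files directly inside `directory`, sorted by path (hypothetical helper)."""
    return sorted(
        path for path in Path(directory).iterdir()
        if path.is_file() and path.suffix.lower() in extensions
    )


# Demonstration in a throwaway directory so the example has no side effects.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.mp3", "b.WAV", "notes.txt"):
        (Path(tmp) / name).touch()
    print([p.name for p in find_audio_files(tmp)])            # ['a.mp3', 'b.WAV']
    print([p.name for p in find_audio_files(tmp, {".mp3"})])  # ['a.mp3']
```

Comparing the lowercased suffix means files such as `b.WAV` are not silently skipped.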

Trigger Phrases

Activate the skill with any of these phrases:

"transcribe audio to markdown"
"transcreva este áudio" (Portuguese: "transcribe this audio")
"convert audio file to text"
"extract speech from audio"
"áudio para texto com metadados" (Portuguese: "audio to text with metadata")

📋 Use Cases

1. Team Meetings

Record standups, planning sessions, or retrospectives and automatically generate:

Participant list
Discussion topics with timestamps
Decisions made
Action items assigned

2. Client Calls

Transcribe client conversations with:

Speaker identification
Key agreements documented
Follow-up tasks extracted

3. Interviews

Convert interviews to text with:

Question/answer attribution
Subtitle generation for video
Searchable transcript

4. Lectures & Training

Document educational content with:

Timestamped notes
Topic breakdown
Key concepts summary

5. Content Creation

Analyze podcasts, videos, YouTube content:

Full transcription
Chapter markers (timestamps)
Summary for show notes

📊 Output Example

# Audio Transcription Report

## 📊 Metadata

| Field | Value |
|-------|-------|
| **File Name** | team-standup.mp3 |
| **File Size** | 3.2 MB |
| **Duration** | 00:12:47 |
| **Language** | English (en) |
| **Processed Date** | 2026-02-02 14:35:21 |
| **Speakers Identified** | 5 |
| **Transcription Engine** | Faster-Whisper (model: base) |

---

## 🎙️ Full Transcription

**[00:00:12 → 00:00:45]** *Speaker 1*  
Good morning everyone. Let's start with updates from the frontend team.

**[00:00:46 → 00:01:23]** *Speaker 2*  
We completed the dashboard redesign and deployed to staging yesterday.

---

## 📋 Meeting Minutes

### Participants
- Speaker 1 (Meeting Lead)
- Speaker 2 (Frontend Developer)
- Speaker 3 (Backend Developer)
- Speaker 4 (Designer)
- Speaker 5 (Product Manager)

### Topics Discussed
1. **Dashboard Redesign** (00:00:46)
   - Completed and deployed to staging
   - Positive feedback from QA team

2. **API Performance Issues** (00:03:12)
   - Database query optimization needed
   - Target response time < 200ms

### Decisions Made
- ✅ Approved dashboard for production deployment
- ✅ Allocated 2 sprint points for API optimization

### Action Items
- [ ] **Deploy dashboard to production** - Assigned to: Speaker 2 - Due: 2026-02-05
- [ ] **Optimize database queries** - Assigned to: Speaker 3
- [ ] **Schedule user testing session** - Assigned to: Speaker 5

---

## 📝 Executive Summary

The team standup covered progress on the dashboard redesign, which has been successfully completed and is ready for production deployment. The frontend team received positive feedback from QA and the design aligns with user requirements.

Backend performance concerns were raised regarding API response times. The team decided to prioritize query optimization in the current sprint, with a target of sub-200ms response times.

Next steps include production deployment of the dashboard by end of week and scheduling user testing sessions to validate the new design with real users.

### Key Points
- 🔹 Dashboard redesign complete and staging-approved
- 🔹 API performance optimization prioritized
- 🔹 User testing scheduled for next week

### Next Steps
1. Production deployment (Speaker 2)
2. Database optimization (Speaker 3)
3. User testing coordination (Speaker 5)
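The metadata table at the top of the report is plain string formatting. The following is a rough sketch of how such a table could be rendered; the helper names (`human_size`, `hms`, `metadata_table`) are assumptions made for illustration, and the binary size units (1 MB = 1024 × 1024 bytes) are a choice made here, not something the report format specifies.

```python
def human_size(num_bytes):
    """Format a byte count with binary units (assumption: 1 MB = 1024 * 1024 bytes)."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB"):
        if size < 1024 or unit == "GB":
            return f"{size:.1f} {unit}"
        size /= 1024


def hms(seconds):
    """Format a duration in seconds as HH:MM:SS."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def metadata_table(rows):
    """Render (field, value) pairs as a two-column Markdown table."""
    lines = ["| Field | Value |", "|-------|-------|"]
    lines += [f"| **{name}** | {value} |" for name, value in rows]
    return "\n".join(lines)


print(metadata_table([
    ("File Name", "team-standup.mp3"),
    ("File Size", human_size(3_355_443)),
    ("Duration", hms(767)),
]))
```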

⚙️ Configuration

No configuration needed! The skill automatically:

Detects Faster-Whisper or Whisper installation
Chooses the fastest available engine
Selects appropriate model based on file size
Auto-detects language
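The auto-detection described above can be approximated as follows. This is a sketch under stated assumptions rather than the skill's real logic: `detect_engine` and `pick_model` are hypothetical names, and the size thresholds in `pick_model` are invented purely for illustration, since the README does not say which file sizes map to which model.

```python
import importlib.util


def detect_engine():
    """Return 'faster-whisper', 'whisper', or None, preferring the faster engine."""
    if importlib.util.find_spec("faster_whisper") is not None:
        return "faster-whisper"
    if importlib.util.find_spec("whisper") is not None:
        return "whisper"
    return None


def pick_model(file_size_bytes):
    """Pick a model name from the file size.

    The thresholds are hypothetical: larger files get a smaller model
    here to keep processing time bounded.
    """
    megabytes = file_size_bytes / (1024 * 1024)
    if megabytes < 10:
        return "small"
    if megabytes < 100:
        return "base"
    return "tiny"


print(detect_engine())               # depends on what is installed locally
print(pick_model(3 * 1024 * 1024))   # small
```

Checking with `importlib.util.find_spec` avoids importing a heavy package just to learn whether it exists.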

🔧 Troubleshooting

"No transcription tool found"

Solution: Install a transcription engine (Faster-Whisper is recommended):

pip install faster-whisper

"Unsupported format"

Solution: Install ffmpeg:

brew install ffmpeg  # macOS
apt install ffmpeg   # Linux

Slow processing

Solution: Use a smaller Whisper model:

# Edit the skill to use "tiny" or "base" model instead of "medium"

Poor speaker identification

Solution:

Ensure clear audio with minimal background noise
Use a better microphone for recordings
Try the "medium" or "large" Whisper model

🛠️ Advanced Usage

Custom Model Selection

Edit SKILL.md Step 2 to change model:

model = WhisperModel("small", device="cpu")  # Change "base" to "small", "medium", etc.

Output Language Control

Force output in specific language:

# Edit Step 3 to set language explicitly

Batch Settings

Process specific file types only:

copilot> transcribe audio: recordings/*.wav  # Only WAV files

📚 FAQ

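Q: Does this work offline? A: Transcription does. Whisper runs 100% locally and needs no internet after the initial model download. The optional LLM summaries rely on the Claude or Copilot CLI, which need a network connection.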

Q: What's the difference between Whisper and Faster-Whisper? A: Faster-Whisper is 4-5x faster with same quality. Always prefer it if available.

Q: Can I transcribe YouTube videos? A: Not directly. Use a YouTube downloader first, then transcribe the audio file. Or use the youtube-summarizer skill instead.

Q: How accurate is speaker identification? A: Accuracy depends on audio quality. Clear recordings with distinct voices work best. Currently uses simple estimation; future versions will use advanced diarization.

Q: What languages are supported? A: 99 languages including English, Portuguese, Spanish, French, German, Chinese, Japanese, Arabic, and more.

Q: Can I edit the meeting minutes format? A: Yes! Edit the Markdown template in SKILL.md Step 3.

🔗 Related Skills

youtube-summarizer - Extract and summarize YouTube video transcripts
prompt-engineer - Optimize prompts for better AI summaries

📄 License

This skill is part of the cli-ai-skills repository. MIT License - See repository LICENSE file.

🤝 Contributing

Found a bug or have a feature request? Open an issue in the cli-ai-skills repository.

---

Version: 1.0.0 Author: Eric Andrade Created: 2026-02-02

Transcription Tools Comparison

Comprehensive comparison of audio transcription engines supported by the audio-transcriber skill.

Overview

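| Tool | Type | Speed | Quality | Cost | Privacy | Offline | Languages |
|------|------|-------|---------|------|---------|---------|-----------|
| Faster-Whisper | Open-source | ⚡⚡⚡⚡⚡ | ⭐⭐⭐⭐⭐ | Free | 100% | ✅ | 99 |
| Whisper | Open-source | ⚡⚡⚡ | ⭐⭐⭐⭐⭐ | Free | 100% | ✅ | 99 |
| Google Speech-to-Text | Commercial API | ⚡⚡⚡⚡ | ⭐⭐⭐⭐⭐ | $0.006/15s | Partial | ❌ | 125+ |
| Azure Speech | Commercial API | ⚡⚡⚡⚡ | ⭐⭐⭐⭐ | $1/hour | Partial | ❌ | 100+ |
| AssemblyAI | Commercial API | ⚡⚡⚡⚡ | ⭐⭐⭐⭐⭐ | $0.00025/s | Partial | ❌ | 99 |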

---

Faster-Whisper (Recommended)

Pros

✅ 4-5x faster than original Whisper
✅ Same quality as original Whisper
✅ Lower memory usage (50-60% less RAM)
✅ Free and open-source
✅ 100% offline (privacy guaranteed)
✅ Easy installation (pip install faster-whisper)
✅ Drop-in replacement for Whisper

Cons

❌ Requires Python 3.8+
❌ Initial model download (~100MB-1.5GB)
❌ GPU optional but speeds up significantly

Installation

pip install faster-whisper

Usage Example

from faster_whisper import WhisperModel

# Load model (auto-downloads on first run)
model = WhisperModel("base", device="cpu", compute_type="int8")

# Transcribe
segments, info = model.transcribe("audio.mp3", language="pt")

# Print results
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")

Model Sizes

| Model | Parameters | RAM | Speed (CPU) | Quality |
|-------|------------|-----|-------------|---------|
| `tiny` | 39 M | ~1 GB | Very fast (~10x realtime) | Basic |
| `base` | 74 M | ~1 GB | Fast (~7x realtime) | Good |
| `small` | 244 M | ~2 GB | Moderate (~4x realtime) | Very good |
| `medium` | 769 M | ~5 GB | Slow (~2x realtime) | Excellent |
| `large` | 1550 M | ~10 GB | Very slow (~1x realtime) | Best |

Recommendation: small or medium for production use.
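The realtime factors in the table translate directly into a rough processing-time estimate: divide the audio length by the factor. The sketch below uses only the approximate CPU figures from the table; `estimate_minutes` is a hypothetical helper, and real throughput varies with hardware and audio content.

```python
# Approximate CPU realtime factors, copied from the model table above.
REALTIME_FACTOR = {"tiny": 10, "base": 7, "small": 4, "medium": 2, "large": 1}


def estimate_minutes(audio_minutes, model):
    """Estimate processing time as audio length divided by the realtime factor."""
    return audio_minutes / REALTIME_FACTOR[model]


for model in REALTIME_FACTOR:
    minutes = estimate_minutes(60, model)
    print(f"{model:>6}: ~{minutes:.1f} min for a 60-minute recording")
```

By this estimate a one-hour recording takes about 15 minutes with `small` and about 30 minutes with `medium`, which is worth weighing against the quality gain when choosing a production model.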

---

Whisper (Original)

Pros

✅ Official OpenAI model
✅ Excellent quality
✅ Free and open-source
✅ 100% offline
✅ Well-documented
✅ Large community

Cons

❌ Slower than Faster-Whisper (4-5x)
❌ Higher memory usage
❌ Requires PyTorch (large dependency)
❌ GPU highly recommended for larger models

Installation

pip install openai-whisper

Usage Example

import whisper

# Load model
model = whisper.load_model("base")

# Transcribe
result = model.transcribe("audio.mp3", language="pt")

# Print results
print(result["text"])

When to Use Whisper vs. Faster-Whisper

Use Faster-Whisper if:

Speed is important
Limited RAM available
Processing many files

Use Original Whisper if:

Faster-Whisper installation issues
Need exact OpenAI implementation
Already have Whisper in project dependencies

---

Google Cloud Speech-to-Text

Pros

✅ Very accurate (industry-leading)
✅ Fast processing (cloud infrastructure)
✅ 125+ languages
✅ Word-level timestamps
✅ Punctuation & capitalization
✅ Speaker diarization (premium)

Cons

❌ Requires internet (cloud-only)
❌ Costs money (after free tier)
❌ Privacy concerns (audio uploaded to Google)
❌ Requires GCP account setup
❌ Complex authentication

Pricing

Free tier: 60 minutes/month
Standard: $0.006 per 15 seconds ($1.44/hour)
Premium: $0.009 per 15 seconds (with diarization)

Installation

pip install google-cloud-speech

Setup

1. Create GCP project
2. Enable Speech-to-Text API
3. Create service account & download JSON key
4. Set environment variable:

   export GOOGLE_APPLICATION_CREDENTIALS="path/to/key.json"

Usage Example

from google.cloud import speech

client = speech.SpeechClient()

with open("audio.wav", "rb") as audio_file:
    content = audio_file.read()

audio = speech.RecognitionAudio(content=content)
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="pt-BR",
)

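# Synchronous recognize() only accepts short audio (about one minute);
# longer files need long_running_recognize().
response = client.recognize(config=config, audio=audio)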

for result in response.results:
    print(result.alternatives[0].transcript)

---

Azure Speech Services

Pros

✅ High accuracy
✅ 100+ languages
✅ Real-time transcription
✅ Custom models (train on your data)
✅ Good Microsoft ecosystem integration

Cons

❌ Requires internet
❌ Costs money (after free tier)
❌ Privacy concerns (cloud processing)
❌ Requires Azure account
❌ Complex setup

Pricing

Free tier: 5 hours/month
Standard: $1.00 per audio hour

Installation

pip install azure-cognitiveservices-speech

Setup

1. Create Azure account
2. Create Speech resource
3. Get API key and region
4. Set environment variables:

   export AZURE_SPEECH_KEY="your-key"
   export AZURE_SPEECH_REGION="your-region"

Usage Example

import os

import azure.cognitiveservices.speech as speechsdk

speech_config = speechsdk.SpeechConfig(
    subscription=os.environ.get('AZURE_SPEECH_KEY'),
    region=os.environ.get('AZURE_SPEECH_REGION')
)

audio_config = speechsdk.audio.AudioConfig(filename="audio.wav")
speech_recognizer = speechsdk.SpeechRecognizer(
    speech_config=speech_config,
    audio_config=audio_config
)

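# recognize_once() returns after the first recognized utterance;
# use continuous recognition for longer recordings.
result = speech_recognizer.recognize_once()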
print(result.text)

---

AssemblyAI

Pros

✅ Modern, developer-friendly API
✅ Excellent accuracy
✅ Advanced features (sentiment, topic detection, PII redaction)
✅ Speaker diarization (included)
✅ Fast processing
✅ Good documentation

Cons

❌ Requires internet
❌ Costs money (no free tier, only trial credits)
❌ Privacy concerns (cloud processing)
❌ Requires API key

Pricing

Free trial: $50 credits
Standard: $0.00025 per second (~$0.90/hour)

Installation

pip install assemblyai

Setup

1. Sign up at assemblyai.com
2. Get API key
3. Set environment variable:

   export ASSEMBLYAI_API_KEY="your-key"

Usage Example

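import os

import assemblyai as aai

aai.settings.api_key = os.environ["ASSEMBLYAI_API_KEY"]

# speaker_labels=True enables diarization; without it, transcript.utterances is not populated
config = aai.TranscriptionConfig(speaker_labels=True)
transcriber = aai.Transcriber()
transcript = transcriber.transcribe("audio.mp3", config=config)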

print(transcript.text)

# Speaker diarization
for utterance in transcript.utterances:
    print(f"Speaker {utterance.speaker}: {utterance.text}")

---

Recommendation Matrix

Use Faster-Whisper if:

✅ Privacy is critical (local processing)
✅ Want zero cost (free forever)
✅ Need offline capability
✅ Processing many files (speed matters)
✅ Limited budget

Use Google Speech-to-Text if:

✅ Need absolute best accuracy
✅ Have budget for cloud services
✅ Want advanced features (punctuation, diarization)
✅ Already using GCP ecosystem

Use Azure Speech if:

✅ In Microsoft ecosystem
✅ Need custom model training
✅ Want real-time transcription
✅ Have Azure credits

Use AssemblyAI if:

✅ Need advanced features (sentiment, topics)
✅ Want easiest API experience
✅ Need automatic PII redaction
✅ Value developer experience

---

Performance Benchmarks

Test: 1-hour podcast (MP3, 44.1kHz, stereo)

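| Tool | Processing Time | Accuracy | Cost |
|------|-----------------|----------|------|
| Faster-Whisper (small) | 8 min | 94% | $0 |
| Whisper (small) | 32 min | 94% | $0 |
| Google Speech | 2 min | 96% | $1.44 |
| Azure Speech | 3 min | 95% | $1.00 |
| AssemblyAI | 4 min | 96% | $0.90 |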

Benchmarks run on MacBook Pro M1, 16GB RAM
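The Cost column follows from the list prices quoted in the pricing sections. As a quick check, the sketch below converts each price to a per-second rate and multiplies by the audio length. It deliberately ignores free tiers, trial credits, and any billing-increment rounding, and `transcription_cost` is a hypothetical helper name.

```python
# Per-second rates derived from the prices quoted in this comparison.
PRICE_PER_SECOND = {
    "Google Speech-to-Text": 0.006 / 15,  # $0.006 per 15 seconds
    "Azure Speech": 1.00 / 3600,          # $1.00 per audio hour
    "AssemblyAI": 0.00025,                # $0.00025 per second
}


def transcription_cost(audio_seconds, provider):
    """Return the list-price cost in dollars, rounded to cents (free tiers ignored)."""
    return round(audio_seconds * PRICE_PER_SECOND[provider], 2)


one_hour = 60 * 60
for provider in PRICE_PER_SECOND:
    print(f"{provider}: ${transcription_cost(one_hour, provider):.2f} per hour of audio")
```

For one hour of audio this reproduces the $1.44, $1.00, and $0.90 figures in the table.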

---

Conclusion

For the audio-transcriber skill:

1. Primary: Faster-Whisper (best balance of speed, quality, privacy, cost)
2. Fallback: Whisper (if Faster-Whisper unavailable)
3. Optional: Cloud APIs (user choice for premium features)

This ensures the skill works out-of-the-box for most users while allowing advanced users to integrate commercial services if needed.

#!/usr/bin/env bash

# Audio Transcriber - Requirements Installation Script
# Automatically installs and validates dependencies

set -euo pipefail

# Colors
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
RED='\033[0;31m'
BLUE='\033[0;34m'
NC='\033[0m'

echo -e "${BLUE}🔧 Audio Transcriber - Dependency Installation${NC}"
echo ""

# Check Python
if ! command -v python3 &>/dev/null; then
    echo -e "${RED}❌ Python 3 not found. Please install Python 3.8+${NC}"
    exit 1
fi

PYTHON_VERSION=$(python3 --version | cut -d' ' -f2 | cut -d'.' -f1,2)
echo -e "${GREEN}✅ Python ${PYTHON_VERSION} detected${NC}"

# Check pip
if ! python3 -m pip --version &>/dev/null; then
    echo -e "${RED}❌ pip not found. Please install pip${NC}"
    exit 1
fi

echo -e "${GREEN}✅ pip available${NC}"
echo ""

# Install system dependencies (macOS only)
if [[ "$OSTYPE" == "darwin"* ]]; then
    echo -e "${BLUE}📦 Checking system dependencies (macOS)...${NC}"
    
    # Check for Homebrew
    if command -v brew &>/dev/null; then
        # Install pkg-config and ffmpeg if not present
        NEED_INSTALL=""
        
        if ! brew list pkg-config &>/dev/null 2>&1; then
            NEED_INSTALL="$NEED_INSTALL pkg-config"
        fi
        
        if ! brew list ffmpeg &>/dev/null 2>&1; then
            NEED_INSTALL="$NEED_INSTALL ffmpeg"
        fi
        
        if [[ -n "$NEED_INSTALL" ]]; then
            echo -e "${BLUE}Installing:$NEED_INSTALL${NC}"
            brew install $NEED_INSTALL --quiet
            echo -e "${GREEN}✅ System dependencies installed${NC}"
        else
            echo -e "${GREEN}✅ System dependencies already installed${NC}"
        fi
    else
        echo -e "${YELLOW}⚠️  Homebrew not found. Install manually if needed:${NC}"
        echo "  /bin/bash -c \"\$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)\""
    fi
fi

echo ""

# Install faster-whisper (recommended)
echo -e "${BLUE}📦 Installing Faster-Whisper...${NC}"

# Try different installation methods based on Python environment
if python3 -m pip install faster-whisper --quiet 2>/dev/null; then
    echo -e "${GREEN}✅ Faster-Whisper installed successfully${NC}"
elif python3 -m pip install --user --break-system-packages faster-whisper --quiet 2>/dev/null; then
    echo -e "${GREEN}✅ Faster-Whisper installed successfully (user mode)${NC}"
else
    echo -e "${YELLOW}⚠️  Faster-Whisper installation failed, trying Whisper...${NC}"
    
    if python3 -m pip install openai-whisper --quiet 2>/dev/null; then
        echo -e "${GREEN}✅ Whisper installed successfully${NC}"
    elif python3 -m pip install --user --break-system-packages openai-whisper --quiet 2>/dev/null; then
        echo -e "${GREEN}✅ Whisper installed successfully (user mode)${NC}"
    else
        echo -e "${RED}❌ Failed to install transcription engine${NC}"
        echo ""
        echo -e "${YELLOW}Manual installation options:${NC}"
        echo "  1. Use --break-system-packages (macOS/Homebrew Python):"
        echo "     python3 -m pip install --user --break-system-packages openai-whisper"
        echo ""
        echo "  2. Use virtual environment (recommended):"
        echo "     python3 -m venv ~/whisper-env"
        echo "     source ~/whisper-env/bin/activate"
        echo "     pip install faster-whisper"
        echo ""
        echo "  3. Use pipx (isolated):"
        echo "     brew install pipx"
        echo "     pipx install openai-whisper"
        exit 1
    fi
fi

# Install UI/progress libraries (tqdm, rich)
echo ""
echo -e "${BLUE}📦 Installing UI libraries (tqdm, rich)...${NC}"

if python3 -m pip install tqdm rich --quiet 2>/dev/null; then
    echo -e "${GREEN}✅ tqdm and rich installed successfully${NC}"
elif python3 -m pip install --user --break-system-packages tqdm rich --quiet 2>/dev/null; then
    echo -e "${GREEN}✅ tqdm and rich installed successfully (user mode)${NC}"
else
    echo -e "${YELLOW}⚠️  Optional UI libraries not installed (skill will still work)${NC}"
fi

# Check ffmpeg (optional but recommended)
echo ""
if command -v ffmpeg &>/dev/null; then
    echo -e "${GREEN}✅ ffmpeg already installed${NC}"
else
    echo -e "${YELLOW}⚠️  ffmpeg not found (should have been installed earlier)${NC}"
    if [[ "$OSTYPE" == "darwin"* ]] && command -v brew &>/dev/null; then
        echo -e "${BLUE}Installing ffmpeg via Homebrew...${NC}"
        brew install ffmpeg --quiet && echo -e "${GREEN}✅ ffmpeg installed${NC}"
    else
        echo -e "${BLUE}ℹ️  ffmpeg is optional but recommended for format conversion${NC}"
        echo ""
        echo "Install ffmpeg:"
        if [[ "$OSTYPE" == "darwin"* ]]; then
            echo "  brew install ffmpeg"
        elif [[ "$OSTYPE" == "linux-gnu"* ]]; then
            echo "  sudo apt install ffmpeg  # Debian/Ubuntu"
            echo "  sudo yum install ffmpeg  # CentOS/RHEL"
        fi
    fi
fi

# Verify installation
echo ""
echo -e "${BLUE}🔍 Verifying installation...${NC}"

if python3 -c "import faster_whisper" 2>/dev/null; then
    echo -e "${GREEN}✅ Faster-Whisper verified${NC}"
    TRANSCRIBER="Faster-Whisper"
elif python3 -c "import whisper" 2>/dev/null; then
    echo -e "${GREEN}✅ Whisper verified${NC}"
    TRANSCRIBER="Whisper"
else
    echo -e "${RED}❌ No transcription engine found after installation${NC}"
    exit 1
fi

# Download initial model (optional)
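# "|| DOWNLOAD_MODEL=..." keeps "set -e" from aborting when stdin is not interactive (read hits EOF)
read -r -p "Download Whisper 'base' model now? (recommended, ~74MB) [Y/n]: " DOWNLOAD_MODEL || DOWNLOAD_MODEL=""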

if [[ ! "$DOWNLOAD_MODEL" =~ ^[Nn] ]]; then
    echo ""
    echo -e "${BLUE}📥 Downloading 'base' model...${NC}"
    
    python3 << 'EOF'
try:
    import faster_whisper
    model = faster_whisper.WhisperModel("base", device="cpu", compute_type="int8")
    print("✅ Model downloaded successfully")
except Exception:
    # Faster-Whisper unavailable or failed: fall back to the original Whisper package
    try:
        import whisper
        model = whisper.load_model("base")
        print("✅ Model downloaded successfully")
    except Exception as e:
        print(f"❌ Model download failed: {e}")
EOF
fi

# Success summary
echo ""
echo -e "${GREEN}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo -e "${GREEN}✅ Installation Complete!${NC}"
echo -e "${GREEN}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo ""
echo "📊 Installed components:"
echo "  • Transcription engine: $TRANSCRIBER"
if command -v ffmpeg &>/dev/null; then
    echo "  • Format conversion: ffmpeg (available)"
else
    echo "  • Format conversion: ffmpeg (not installed)"
fi
echo ""
echo "🚀 Ready to use! Try:"
echo "  copilot> transcribe audio to markdown: myfile.mp3"
echo "  claude> transcreva este áudio: myfile.mp3"
echo ""

#!/usr/bin/env python3
"""
Audio Transcriber v1.1.0
Transcribes audio to text and generates meeting minutes/summaries using an LLM.
"""

import os
import sys
import json
import subprocess
import shutil
from datetime import datetime
from pathlib import Path

# Rich for beautiful terminal output
try:
    from rich.console import Console
    from rich.prompt import Prompt
    from rich.panel import Panel
    from rich.progress import Progress, SpinnerColumn, TextColumn, BarColumn
    from rich import print as rprint
    RICH_AVAILABLE = True
except ImportError:
    RICH_AVAILABLE = False
    print("⚠️  Installing rich for better UI...")
    subprocess.run([sys.executable, "-m", "pip", "install", "--user", "rich"], check=False)
    from rich.console import Console
    from rich.prompt import Prompt
    from rich.panel import Panel
    from rich.progress import Progress, SpinnerColumn, TextColumn, BarColumn
    from rich import print as rprint

# tqdm for progress bars
try:
    from tqdm import tqdm
except ImportError:
    print("⚠️  Installing tqdm for progress bars...")
    subprocess.run([sys.executable, "-m", "pip", "install", "--user", "tqdm"], check=False)
    from tqdm import tqdm

# Whisper engines
try:
    from faster_whisper import WhisperModel
    TRANSCRIBER = "faster-whisper"
except ImportError:
    try:
        import whisper
        TRANSCRIBER = "whisper"
    except ImportError:
        print("❌ Nenhum engine de transcrição encontrado!")
        print("   Instale: pip install faster-whisper")
        sys.exit(1)

console = Console()

# Default RISEN template, used as a fallback
DEFAULT_MEETING_PROMPT = """
Role: Você é um transcritor profissional especializado em documentação.

Instructions: Transforme a transcrição fornecida em um documento estruturado e profissional.

Steps:
1. Identifique o tipo de conteúdo (reunião, palestra, entrevista, etc.)
2. Extraia os principais tópicos e pontos-chave
3. Identifique participantes/speakers (se aplicável)
4. Extraia decisões tomadas e ações definidas (se reunião)
5. Organize em formato apropriado com seções claras
6. Use Markdown para formatação profissional

End Goal: Documento final bem estruturado, legível e pronto para distribuição.

Narrowing: 
- Mantenha objetividade e clareza
- Preserve contexto importante
- Use formatação Markdown adequada
- Inclua timestamps relevantes quando aplicável
"""


def detect_cli_tool():
    """Detect which LLM CLI is available (claude takes priority over gh copilot)."""
    if shutil.which('claude'):
        return 'claude'
    elif shutil.which('gh'):
        result = subprocess.run(['gh', 'copilot', '--version'], 
                                capture_output=True, text=True)
        if result.returncode == 0:
            return 'gh-copilot'
    return None


def invoke_prompt_engineer(raw_prompt, timeout=90):
    """
    Invoke the prompt-engineer skill via CLI to improve or generate prompts.
    
    Args:
        raw_prompt: Prompt to improve, or a meta-prompt
        timeout: Timeout in seconds
    
    Returns:
        str: Improved prompt, or DEFAULT_MEETING_PROMPT on failure
    """
    try:
        # Try via gh copilot
        console.print("[dim]   Invocando prompt-engineer...[/dim]")
        
        result = subprocess.run(
            ['gh', 'copilot', 'suggest', '-t', 'shell', raw_prompt],
            capture_output=True,
            text=True,
            timeout=timeout
        )
        
        if result.returncode == 0 and result.stdout.strip():
            return result.stdout.strip()
        else:
            console.print("[yellow]⚠️  prompt-engineer não respondeu, usando template padrão[/yellow]")
            return DEFAULT_MEETING_PROMPT
            
    except subprocess.TimeoutExpired:
        console.print(f"[red]⚠️  Timeout após {timeout}s, usando template padrão[/red]")
        return DEFAULT_MEETING_PROMPT
    except Exception as e:
        console.print(f"[red]⚠️  Erro ao invocar prompt-engineer: {e}[/red]")
        return DEFAULT_MEETING_PROMPT


def handle_prompt_workflow(user_prompt, transcript):
    """
    Manage the full prompt workflow with prompt-engineer.
    
    Scenario A: User supplied a prompt → improve AUTOMATICALLY → confirm
    Scenario B: No prompt → suggest type → confirm → generate → confirm
    
    Returns:
        str: Final prompt to use, or None if the user declined processing
    """
    prompt_engineer_available = os.path.exists(
        os.path.expanduser('~/.copilot/skills/prompt-engineer/SKILL.md')
    )
    
    # ========== SCENARIO A: USER SUPPLIED A PROMPT ==========
    if user_prompt:
        console.print("\n[cyan]📝 Prompt fornecido pelo usuário[/cyan]")
        console.print(Panel(user_prompt[:300] + ("..." if len(user_prompt) > 300 else ""), 
                           title="Prompt original", border_style="dim"))
        
        if prompt_engineer_available:
            # Improve AUTOMATICALLY (without asking)
            console.print("\n[cyan]🔧 Melhorando prompt com prompt-engineer...[/cyan]")
            
            improved_prompt = invoke_prompt_engineer(
                f"melhore este prompt:\n\n{user_prompt}"
            )
            
            # Show BOTH versions
            console.print("\n[green]✨ Versão melhorada:[/green]")
            console.print(Panel(improved_prompt[:500] + ("..." if len(improved_prompt) > 500 else ""), 
                               title="Prompt otimizado", border_style="green"))
            
            console.print("\n[dim]📝 Versão original:[/dim]")
            console.print(Panel(user_prompt[:300] + ("..." if len(user_prompt) > 300 else ""), 
                               title="Seu prompt", border_style="dim"))
            
            # Ask which one to use
            confirm = Prompt.ask(
                "\n💡 Usar versão melhorada?",
                choices=["s", "n"],
                default="s"
            )
            
            return improved_prompt if confirm == "s" else user_prompt
        else:
            # prompt-engineer not available
            console.print("[yellow]⚠️  prompt-engineer skill não disponível[/yellow]")
            console.print("[dim]✅ Usando seu prompt original[/dim]")
            return user_prompt
    
    # ========== SCENARIO B: NO PROMPT - AUTO-GENERATION ==========
    else:
        console.print("\n[yellow]⚠️  Nenhum prompt fornecido.[/yellow]")
        
        if not prompt_engineer_available:
            console.print("[yellow]⚠️  prompt-engineer skill não encontrado[/yellow]")
            console.print("[dim]Usando template padrão...[/dim]")
            return DEFAULT_MEETING_PROMPT
        
        # STEP 1: Ask whether to auto-generate
        console.print("Posso analisar o transcript e sugerir um formato de resumo/ata?")
        
        generate = Prompt.ask(
            "\n💡 Gerar prompt automaticamente?",
            choices=["s", "n"],
            default="s"
        )
        
        if generate == "n":
            console.print("[dim]✅ Ok, gerando apenas transcript.md (sem ata)[/dim]")
            return None  # Signal: skip LLM processing
        
        # STEP 2: Analyze the transcript and SUGGEST a type
        console.print("\n[cyan]🔍 Analisando transcript...[/cyan]")
        
        suggestion_meta_prompt = f"""
Analise este transcript ({len(transcript)} caracteres) e sugira:

1. Tipo de conteúdo (reunião, palestra, entrevista, etc.)
2. Formato de saída recomendado (ata formal, resumo executivo, notas estruturadas)
3. Framework ideal (RISEN, RODES, STAR, etc.)

Primeiros 4000 caracteres do transcript:
{transcript[:4000]}

Responda em 2-3 linhas concisas.
"""
        
        suggested_type = invoke_prompt_engineer(suggestion_meta_prompt)
        
        # STEP 3: Show the suggestion and CONFIRM
        console.print("\n[green]💡 Sugestão de formato:[/green]")
        console.print(Panel(suggested_type, title="Análise do transcript", border_style="green"))
        
        confirm_type = Prompt.ask(
            "\n💡 Usar este formato?",
            choices=["s", "n"],
            default="s"
        )
        
        if confirm_type == "n":
            console.print("[dim]Usando template padrão...[/dim]")
            return DEFAULT_MEETING_PROMPT
        
        # STEP 4: Generate the full prompt from the suggestion
        console.print("\n[cyan]✨ Gerando prompt estruturado...[/cyan]")
        
        final_meta_prompt = f"""
Crie um prompt completo e estruturado (usando framework apropriado) para:

{suggested_type}

O prompt deve instruir uma IA a transformar o transcript em um documento
profissional e bem formatado em Markdown.
"""
        
        generated_prompt = invoke_prompt_engineer(final_meta_prompt)
        
        # STEP 5: Show the generated prompt and CONFIRM
        console.print("\n[green]✅ Prompt gerado:[/green]")
        console.print(Panel(generated_prompt[:600] + ("..." if len(generated_prompt) > 600 else ""), 
                           title="Preview", border_style="green"))
        
        confirm_final = Prompt.ask(
            "\n💡 Usar este prompt?",
            choices=["s", "n"],
            default="s"
        )
        
        if confirm_final == "s":
            return generated_prompt
        else:
            console.print("[dim]Usando template padrão...[/dim]")
            return DEFAULT_MEETING_PROMPT


def process_with_llm(transcript, prompt, cli_tool='claude', timeout=300):
    """
    Process the transcript with an LLM using the given prompt.
    
    Args:
        transcript: Transcribed text
        prompt: Prompt describing how to process it
        cli_tool: 'claude' or 'gh-copilot'
        timeout: Timeout in seconds
    
    Returns:
        str: Processed minutes/summary, or None on failure
    """
    full_prompt = f"{prompt}\n\n---\n\nTranscrição:\n\n{transcript}"
    
    try:
        with Progress(
            SpinnerColumn(),
            TextColumn("[progress.description]{task.description}"),
            transient=True
        ) as progress:
            progress.add_task(description=f"🤖 Processando com {cli_tool}...", total=None)
            
            if cli_tool == 'claude':
                result = subprocess.run(
                    ['claude', '-p'],  # assumes Claude Code print mode (-p), which reads the prompt from stdin
                    input=full_prompt,
                    capture_output=True,
                    text=True,
                    timeout=timeout
                )
            elif cli_tool == 'gh-copilot':
                result = subprocess.run(
                    ['gh', 'copilot', 'suggest', '-t', 'shell', full_prompt],
                    capture_output=True,
                    text=True,
                    timeout=timeout
                )
            else:
                raise ValueError(f"CLI tool desconhecido: {cli_tool}")
        
        if result.returncode == 0:
            return result.stdout.strip()
        else:
            console.print(f"[red]❌ Erro ao processar com {cli_tool}[/red]")
            console.print(f"[dim]{result.stderr[:200]}[/dim]")
            return None
            
    except subprocess.TimeoutExpired:
        console.print(f"[red]❌ Timeout após {timeout}s[/red]")
        return None
    except Exception as e:
        console.print(f"[red]❌ Erro: {e}[/red]")
        return None


def transcribe_audio(audio_file, model="base"):
    """
    Transcribe audio with Whisper, showing a progress bar.
    
    Returns:
        dict: {language, duration, segments: [{start, end, text}]}
    """
    console.print(f"\n[cyan]🎙️  Transcribing audio with {TRANSCRIBER}...[/cyan]")
    
    try:
        if TRANSCRIBER == "faster-whisper":
            model_obj = WhisperModel(model, device="cpu", compute_type="int8")
            segments, info = model_obj.transcribe(
                audio_file,
                language=None,
                vad_filter=True,
                word_timestamps=True
            )
            
            data = {
                "language": info.language,
                "language_probability": round(info.language_probability, 2),
                "duration": info.duration,
                "segments": []
            }
            
            # Consume the segment generator into a list, showing progress
            console.print("[dim]Processing segments...[/dim]")
            for segment in tqdm(segments, desc="Segments", unit="seg"):
                data["segments"].append({
                    "start": round(segment.start, 2),
                    "end": round(segment.end, 2),
                    "text": segment.text.strip()
                })
        
        else:  # original Whisper
            import whisper
            model_obj = whisper.load_model(model)
            result = model_obj.transcribe(audio_file, word_timestamps=True)
            
            data = {
                "language": result["language"],
                "duration": result["segments"][-1]["end"] if result["segments"] else 0,
                "segments": result["segments"]
            }
        
        console.print(f"[green]✅ Transcription complete! Language: {data['language'].upper()}[/green]")
        console.print(f"[dim]   {len(data['segments'])} segments processed[/dim]")
        
        return data
        
    except Exception as e:
        console.print(f"[red]❌ Transcription error: {e}[/red]")
        sys.exit(1)


def save_outputs(transcript_text, ata_text, audio_file, output_dir="."):
    """
    Save the transcript and minutes to timestamped .md files.
    
    Returns:
        tuple: (transcript_path, ata_path or None)
    """
    timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    base_name = Path(audio_file).stem
    
    # Always save the transcript
    transcript_filename = f"transcript-{timestamp}.md"
    transcript_path = Path(output_dir) / transcript_filename
    
    with open(transcript_path, 'w', encoding='utf-8') as f:
        f.write(transcript_text)
    
    console.print(f"[green]✅ Transcript saved:[/green] {transcript_filename}")
    
    # Save the minutes if present
    ata_path = None
    if ata_text:
        ata_filename = f"ata-{timestamp}.md"
        ata_path = Path(output_dir) / ata_filename
        
        with open(ata_path, 'w', encoding='utf-8') as f:
            f.write(ata_text)
        
        console.print(f"[green]✅ Minutes saved:[/green] {ata_filename}")
    
    return str(transcript_path), (str(ata_path) if ata_path else None)


def main():
    """Main entry point."""
    import argparse
    
    parser = argparse.ArgumentParser(description="Audio Transcriber v1.1.0")
    parser.add_argument("audio_file", help="Audio file to transcribe")
    parser.add_argument("--prompt", help="Custom prompt for processing the transcript")
    parser.add_argument("--model", default="base", help="Whisper model (tiny/base/small/medium/large)")
    parser.add_argument("--output-dir", default=".", help="Output directory")
    
    args = parser.parse_args()
    
    # Check that the file exists
    if not os.path.exists(args.audio_file):
        console.print(f"[red]❌ File not found: {args.audio_file}[/red]")
        sys.exit(1)
    
    console.print("[bold cyan]🎵 Audio Transcriber v1.1.0[/bold cyan]\n")
    
    # Step 1: Transcribe
    transcription_data = transcribe_audio(args.audio_file, model=args.model)
    
    # Build the transcript text
    transcript_text = "# Audio Transcription\n\n"
    transcript_text += f"**File:** {Path(args.audio_file).name}\n"
    transcript_text += f"**Language:** {transcription_data['language'].upper()}\n"
    transcript_text += f"**Date:** {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n\n"
    transcript_text += "---\n\n## Full Transcript\n\n"
    
    for seg in transcription_data["segments"]:
        start_min = int(seg["start"] // 60)
        start_sec = int(seg["start"] % 60)
        end_min = int(seg["end"] // 60)
        end_sec = int(seg["end"] % 60)
        transcript_text += f"**[{start_min:02d}:{start_sec:02d} → {end_min:02d}:{end_sec:02d}]**  \n{seg['text']}\n\n"
    
    # Step 2: Detect the AI CLI
    cli_tool = detect_cli_tool()
    
    if not cli_tool:
        console.print("\n[yellow]⚠️  No AI CLI detected (Claude or GitHub Copilot)[/yellow]")
        console.print("[dim]ℹ️  Saving only the transcript...[/dim]")
        
        save_outputs(transcript_text, None, args.audio_file, args.output_dir)

        console.print("\n[cyan]💡 To generate minutes or a summary:[/cyan]")
        console.print("  - Install Claude CLI: pip install claude-cli")
        console.print("  - Or use GitHub Copilot CLI if it is already installed (gh copilot)")
        return
    
    console.print(f"\n[green]✅ CLI detected: {cli_tool}[/green]")
    
    # Step 3: Prompt workflow
    final_prompt = handle_prompt_workflow(args.prompt, transcript_text)
    
    if final_prompt is None:
        # User declined LLM processing
        save_outputs(transcript_text, None, args.audio_file, args.output_dir)
        return
    
    # Step 4: Process with the LLM
    ata_text = process_with_llm(transcript_text, final_prompt, cli_tool)
    
    if ata_text:
        console.print("[green]✅ Minutes generated successfully![/green]")
    else:
        console.print("[yellow]⚠️  Failed to generate minutes; saving only the transcript[/yellow]")
    
    # Step 5: Save files
    console.print("\n[cyan]💾 Saving files...[/cyan]")
    save_outputs(transcript_text, ata_text, args.audio_file, args.output_dir)

    console.print("\n[bold green]✅ Done![/bold green]")


if __name__ == "__main__":
    main()
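
The timestamp formatting that main() applies to each segment can be isolated into a small helper. The sketch below is an illustration only: `format_timestamp` and `render_segments` are hypothetical names that do not appear in the script, and each segment is assumed to be a dict with `start`, `end`, and `text` keys, as in the data returned by transcribe_audio.

```python
def format_timestamp(seconds):
    """Format an offset in seconds as MM:SS, truncating fractional seconds."""
    minutes = int(seconds // 60)
    secs = int(seconds % 60)
    return f"{minutes:02d}:{secs:02d}"


def render_segments(segments):
    """Render segments as Markdown blocks in the same shape the script emits.

    Hypothetical helper: `segments` is assumed to be a list of dicts
    with 'start', 'end' (seconds) and 'text' keys.
    """
    parts = []
    for seg in segments:
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        # Two trailing spaces force a Markdown line break after the time range.
        parts.append(f"**[{start} → {end}]**  \n{seg['text']}\n\n")
    return "".join(parts)


if __name__ == "__main__":
    demo = [
        {"start": 0.0, "end": 4.2, "text": "Hello"},
        {"start": 75.9, "end": 80.0, "text": "World"},
    ]
    print(render_segments(demo))
```

Minutes are not wrapped at 60 in this scheme, so a segment that starts 3725 seconds into a long recording is labelled 62:05 rather than 01:02:05.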

How it compares

Use audio-transcriber for agent-driven pipelines that turn audio into structured documents; use markitdown when the source is already a PDF or Office file rather than spoken audio.

FAQ

What does audio-transcriber do?

It transforms audio recordings into professional Markdown documentation with intelligent summaries, using LLM integration.

When should I use audio-transcriber?

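Use it when you need to transcribe audio or video files to text, generate meeting minutes from recordings, identify speakers (diarization), produce subtitles or captions (SRT, VTT), or get executive summaries of long audio content.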

What are common prerequisites?

Faster-Whisper or Whisper for transcription, plus the Claude CLI or GitHub Copilot CLI (gh copilot) if you want minutes or summaries generated; without an AI CLI the script saves only the transcript. The skill's frontmatter lists category: content, risk: safe, source: community.

Is Audio Transcriber safe to install?

skills.sh reports 2 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.
