Ffmpeg Analyse Video

Video understanding is most often needed while building (documenting flows, learning from recordings, or tooling agent pipelines), though the same summaries also support research and content reuse. The skill suits agent-tooling work because its sub-agent batching architecture is designed to preserve the main context window during vision-heavy tasks.

Where it fits

Example uses

Summarise a competitor demo video into timestamped feature notes before you scope your own product.

Run the 3-phase sub-agent pipeline on a codebase walkthrough recording for implementation checklists.

Build: Docs & content

Convert an internal how-to screencast into step-by-step documentation with time markers.

Draft blog or changelog outlines from webinar recordings without manual transcription.

How it compares

An agent workflow for vision summarisation via ffmpeg frames, not a hosted video CDN or automated video editor.

Common Questions / FAQ

Who is ffmpeg-analyse-video for?

Solo and indie builders using agentic coding tools who need to understand or document visual content in local video files via ffmpeg plus AI vision.

When should I use ffmpeg-analyse-video?

Use it in Build when turning recordings into agent-tooling notes or docs; in Idea when researching competitors or tutorials from video; and in Grow when repurposing webinar or demo footage into written content.

Is ffmpeg-analyse-video safe to install?

Review the Security Audits panel on this Prism page; the skill runs shell ffmpeg commands and reads local video files, so only analyse media you are allowed to process.

SKILL.md

# FFmpeg Video Analysis

Extract frames from video files with ffmpeg. Delegate frame reading to sub-agents to preserve the main context window. Synthesise a structured timestamped summary from text-only sub-agent reports.

## Architecture: Context-Efficient Sub-Agent Pipeline

**Problem**: Reading dozens of images into the main conversation context consumes most of the context window, leaving little room for synthesis and follow-up.

**Solution**: A 3-phase pipeline:

```
Main Agent                          Sub-Agents (disposable context)
──────────                          ──────────────────────────────
1. ffprobe metadata        ───►
2. ffmpeg frame extraction ───►
3. Split frames into batches ──►   4. Read images (vision)
                                      Write text descriptions
                                      to batch_N_analysis.md
5. Read text files only    ◄───    (context discarded)
6. Synthesise final output
```

Images only ever exist inside sub-agent contexts. The main agent only reads lightweight text files. This cuts context usage by ~90%.
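The batching in step 3 of the diagram could be sketched roughly as follows. The helper name `batch_frames` and the batch size are assumptions made for illustration; this section does not state the batch size the skill actually uses.

```bash
# Hypothetical helper: assign each frame path read from stdin to a batch.
# Output: "<batch number><TAB><frame path>", one line per frame.
batch_frames() {
  local size=$1   # frames per batch (assumed; chosen by the caller)
  awk -v size="$size" '{ printf "%d\t%s\n", int((NR - 1) / size) + 1, $0 }'
}
```

For example, piping a sorted frame listing through `batch_frames 10` would label the first ten frames as batch 1, the next ten as batch 2, and so on, so each sub-agent can be handed only the paths for its own batch.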

## 1. Prerequisites

```bash
which ffmpeg && which ffprobe
```

If either is missing, show platform-specific install instructions and STOP:
- **macOS**: `brew install ffmpeg`
- **Ubuntu/Debian**: `sudo apt install ffmpeg`
- **Windows**: `choco install ffmpeg` or `winget install ffmpeg`
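A scripted form of this check might look like the sketch below. It uses the POSIX `command -v` builtin rather than `which`; the function name `missing_tools` is hypothetical.

```bash
# Hypothetical helper: print the name of every listed tool not found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```

Calling `missing_tools ffmpeg ffprobe` prints nothing when both are installed. Any output names the tools to cover in the install instructions before stopping.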

## 2. Setup Temp Directory

```bash
# macOS/Linux
TMPDIR="/tmp/video-analysis-$(date +%s)"
mkdir -p "$TMPDIR"

# Windows (PowerShell)
# $TMPDIR = "$env:TEMP\video-analysis-$(Get-Date -UFormat %s)"
# New-Item -ItemType Directory -Path $TMPDIR
```
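`TMPDIR` is also a conventional environment variable that many tools use to locate the system temp directory, so a variant that leaves it untouched may be safer in shared shells. The sketch below assumes `mktemp -d` is available; the names `make_workdir` and `WORKDIR` are illustrative and not part of the skill.

```bash
# Hypothetical helper: create a unique working directory and print its path.
# Honours an existing TMPDIR if set, otherwise falls back to /tmp.
make_workdir() {
  mktemp -d "${TMPDIR:-/tmp}/video-analysis-XXXXXX"
}
```

A caller could then run `WORKDIR=$(make_workdir)` and register `trap 'rm -rf "$WORKDIR"' EXIT` so extracted frames are cleaned up when the session ends.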

## 3. Extract Video Metadata

```bash
ffprobe -v quiet -print_format json -show_format -show_streams "VIDEO_PATH"
```

Extract and report: duration, resolution (width x height), fps, codec, file size, whether audio is present.

If no video stream is found, report "audio-only file" and STOP.
If file size > 2GB, warn the user and suggest analysing a time range with `-ss START -to END`.
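The two stop/warn rules above can be sketched as a small pre-flight check. The function names are hypothetical, and the 2GB limit is interpreted here as 2 GiB, which is an assumption. The `ffprobe` options used (`-select_streams`, `-show_entries`, `-of`) are standard, but the wiring around them is only one possible arrangement.

```bash
# Count video streams in a file (prints an integer, 0 for audio-only input).
count_video_streams() {
  ffprobe -v error -select_streams v -show_entries stream=index \
    -of csv=p=0 "$1" | grep -c .
}

# Read the container-level file size in bytes.
file_size_bytes() {
  ffprobe -v error -show_entries format=size \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Decide what to do from a stream count and a size in bytes.
# Prints "audio-only", "warn-large", or "ok".
precheck() {
  local video_streams=$1 size_bytes=$2
  local limit=$((2 * 1024 * 1024 * 1024))   # 2 GiB (assumed reading of "2GB")
  if [ "$video_streams" -eq 0 ]; then
    echo "audio-only"
  elif [ "$size_bytes" -gt "$limit" ]; then
    echo "warn-large"
  else
    echo "ok"
  fi
}
```

A call such as `precheck "$(count_video_streams "$VIDEO")" "$(file_size_bytes "$VIDEO")"` would then drive the STOP or warning branch.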

## 4. Extract Frames

Choose strategy based on duration:

| Duration | Strategy | Command |
|----------|----------|---------|
| 0-60s | 1 frame every 2s | `ffmpeg -hide_banner -y -i INPUT -vf "fps=1/2,scale='min(1280,iw)':-2" -q:v 5 DIR/frame_%04d.jpg` |
| 1-10min | Scene detection (threshold 0.3) | `ffmpeg -hide_banner -y -i INPUT -vf "select='gt(scene,0.3)',scale='min(1280,iw)':-2" -vsync vfr -q:v 5 DIR/scene_%04d.jpg` |
| 10-30min | Keyframe extraction | `ffmpeg -hide_banner -y -skip_frame nokey -i INPUT -vf "scale='min(1280,iw)':-2" -vsync vfr -q:v 5 DIR/key_%04d.jpg` |
| 30min+ | Thumbnail filter | `ffmpeg -hide_banner -y -i INPUT -vf "thumbnail=SEGMENT_FRAMES,scale='min(1280,iw)':-2" -vsync vfr -q:v 5 DIR/thumb_%04d.jpg` |

For the thumbnail filter, calculate `SEGMENT_FRAMES = total_frames / 60`, where `total_frames` is approximately duration × fps, rounded down to an integer, to cap output at ~60 frames.
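The strategy table and the `SEGMENT_FRAMES` calculation can be expressed as two small helpers. This is a sketch: the function names are hypothetical, and because the table's ranges meet at exactly 60s, assigning that boundary to the interval strategy is an assumption.

```bash
# Map a duration in seconds to a strategy name from the table.
# Fractional durations (as reported by ffprobe) are truncated to whole seconds.
choose_strategy() {
  local duration=${1%.*}
  if   [ "$duration" -le 60 ];   then echo "interval"
  elif [ "$duration" -le 600 ];  then echo "scene"
  elif [ "$duration" -le 1800 ]; then echo "keyframe"
  else                                echo "thumbnail"
  fi
}

# SEGMENT_FRAMES = total_frames / 60, with total_frames ~ duration * fps.
# fps may be fractional (e.g. 29.97), so awk does the arithmetic.
# The result is clamped to at least 1 so the filter argument stays valid.
segment_frames() {
  awk -v d="$1" -v f="$2" \
    'BEGIN { n = int(d * f / 60); if (n < 1) n = 1; print n }'
}
```

For a one-hour video at 30 fps, `segment_frames 3600 30` gives 1800, so `thumbnail=1800` would select about one representative frame per minute.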

**Fallbacks:**
- Scene detection yields 0 frames → retry with interval at 1 frame/5s
- More than 100 frames extracted → subsample evenly to 80
- Frame extraction fails → try the next simpler strategy (scene → interval, keyframe → interval)
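The "subsample evenly" fallback could be implemented along these lines. The helper name `subsample` is hypothetical, and picking every `n / max`-th entry by index is one reasonable reading of "evenly", not necessarily the skill's own method.

```bash
# Hypothetical helper: read a list of paths on stdin and print at most
# `max` of them, evenly spaced by position. Shorter lists pass through as-is.
subsample() {
  awk -v max="$1" '
    { line[NR] = $0 }
    END {
      if (NR <= max) { for (i = 1; i <= NR; i++) print line[i]; exit }
      for (i = 0; i < max; i++) print line[int(i * NR / max) + 1]
    }'
}
```

Feeding a sorted frame listing through `subsample 80` yields the frames to keep. The remaining files can then be deleted or simply left out of the batches.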

**Time range analysis:** When the user specifies a range, place `-ss START -to END` before `-i`.
**Higher detail mode:** If requested, double the fps rate and lower scene threshold to 0.2.

After extraction, list all frame files and calculate each frame's timestamp from its sequence number and the extraction rate.
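For the fixed-interval strategy, the timestamp calculation might be sketched as below. It assumes frame N (1-indexed) sits at roughly `(N - 1) × interval` seconds, and the function name is hypothetical. Scene, keyframe, and thumbnail output is not evenly spaced, so this arithmetic would not apply there. Timestamps for those strategies would need another source, such as the `pts_time` values logged by ffmpeg's `showinfo` filter.

```bash
# Hypothetical helper: approximate timestamp (HH:MM:SS) of an interval frame.
#   $1 = frame path such as DIR/frame_0031.jpg
#   $2 = seconds between extracted frames (e.g. 2 for fps=1/2)
frame_timestamp() {
  local file=$1 interval=$2
  local base=${file##*/}     # strip the directory part
  local num=${base##*_}      # "0031.jpg"
  num=${num%.*}              # "0031"
  # 10# forces base 10 so leading zeros are not parsed as octal.
  local secs=$(( (10#$num - 1) * interval ))
  printf '%02d:%02d:%02d\n' \
    $((secs / 3600)) $((secs % 3600 / 60)) $((secs % 60))
}
```

With the 0-60s strategy (one frame every 2s), `frame_timestamp DIR/frame_0031.jpg 2` prints `00:01:00`.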

## 5. Delegate Frame Analysis to Sub-Agents

**This is the critical context-saving step.** Do NOT read frame images in th

What is this skill?

3-phase pipeline: ffprobe metadata, ffmpeg frame extraction, batched sub-agent vision analysis

Sub-agents write batch_N_analysis.md text reports so the main agent only reads prose, not images

Produces structured timestamped step-by-step summaries from tutorials, presentations, and footage

Triggers on "analyse this video", "summarise this recording", and similar visual-understanding requests

Delegates frame reading to disposable sub-agent contexts to avoid exhausting the primary context window

Documented as a 3-phase context-efficient sub-agent pipeline

Sub-agents write per-batch text reports (batch_N_analysis.md) for main-agent synthesis only

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 656 installs on skills.sh; 12 GitHub stars; 1/3 security scanners passed (skills.sh audits).
