Songsee

Name: Songsee
Author: steipete

steipete/clawdis

Turn local audio files into spectrogram and multi-panel feature visualizations from the terminal for demos, content, or ML inspection.

Overview

songsee is an agent skill for the Build phase that generates spectrograms and multi-panel audio feature visualizations via the songsee CLI.

Install

npx skills add https://github.com/steipete/clawdis --skill songsee

What is this skill?

Single-command spectrogram: `songsee track.mp3` with optional time slices via `--start` / `--duration`
Multi-panel grids via repeatable or comma-separated `--viz` (spectrogram, mel, chroma, hpss, selfsim, loudness, tempogram, mfcc, flux)
Palette and sizing controls: `--style` (classic, magma, inferno, viridis, gray), `--width` / `--height`, FFT `--window`
Stdin pipeline support: `cat track.mp3 | songsee - --format png -o out.png`
WAV/MP3 native decode; other formats use ffmpeg when available
9 named viz types in the multi-panel example (spectrogram, mel, chroma, hpss, selfsim, loudness, tempogram, mfcc, flux)
5 palette styles (classic, magma, inferno, viridis, gray)

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 1.8k installs on skills.sh; 378k GitHub stars; 3/3 security scanners passed (skills.sh audits).

What problem does it solve?

You have audio files or streams but no quick, repeatable way to produce spectrogram and feature-panel images for docs, content, or debugging.

Who is it for?

Solo builders who need terminal-driven spectrograms and feature panels from MP3/WAV (or ffmpeg-backed formats) for content, demos, or ML sanity checks.

Skip if: you need a GUI DAW, a real-time playback UI, or cloud batch rendering that avoids installing the songsee binary locally.

When should I use this skill?

You need spectrograms or stacked audio feature panels from files or stdin and want the agent to run songsee with the right flags.

What do I get? / Deliverables

After the skill runs, your agent emits correct songsee commands (and install steps) that write JPG/PNG spectrograms or viz grids to your chosen paths.

Spectrogram or multi-viz grid image (JPG/PNG)
Documented shell commands for reproducible renders

Journey fit

Primary fit

Build: Integrations & version control

Canonical shelf is Build because the skill wires a host CLI (songsee) into agent workflows for producing media artifacts, not for validating ideas or shipping production services. Integrations fits best: it documents install (Homebrew), binary dependency, flags, and piping stdin—classic third-party CLI hookup for solo builders.

How it compares

Use instead of hand-rolling matplotlib/librosa scripts when you want a single maintained CLI with named viz presets.

Common Questions / FAQ

Who is songsee for?

Indie developers and creators who work in Claude Code, Cursor, or similar agents and want fast spectrogram images from local audio without writing visualization code each time.

When should I use songsee?

Use it in the Build phase when integrating tooling—e.g., generating slice images for a landing demo, documenting an audio ML pipeline, or batch-exporting panels during backend or agent-tooling work.

Is songsee safe to install?

The skill only documents invoking the external songsee CLI and optional ffmpeg; review the Security Audits panel on this Prism page and verify the Homebrew tap and binary source before installing on your machine.

SKILL.md

# songsee

Generate spectrograms + feature panels from audio.

Quick start

- Spectrogram: `songsee track.mp3`
- Multi-panel: `songsee track.mp3 --viz spectrogram,mel,chroma,hpss,selfsim,loudness,tempogram,mfcc,flux`
- Time slice: `songsee track.mp3 --start 12.5 --duration 8 -o slice.jpg`
- Stdin: `cat track.mp3 | songsee - --format png -o out.png`
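The single-file commands above extend naturally to a folder of tracks. The following is a minimal sketch, not part of the skill itself: `render_dir` and the `SONGSEE_CMD` override are hypothetical names introduced here so the loop can be dry-run without the binary installed, and the `.jpg` output name assumes the `-o` usage shown in the time-slice example.

```shell
#!/usr/bin/env bash
# Sketch: render one spectrogram per MP3/WAV file in a directory.
# SONGSEE_CMD is a hypothetical override (not a songsee feature); set it to
# "echo songsee" to print the commands instead of running them.

render_dir() {
  local dir="$1" f
  for f in "$dir"/*.mp3 "$dir"/*.wav; do
    [ -e "$f" ] || continue   # skip glob patterns that matched nothing
    # Write the image next to the source file, swapping the extension.
    ${SONGSEE_CMD:-songsee} "$f" -o "${f%.*}.jpg"
  done
}
```

Running `SONGSEE_CMD="echo songsee" render_dir ./audio` first is a cheap way to check the generated commands before rendering anything.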

Common flags

- `--viz` list (repeatable or comma-separated)
- `--style` palette (classic, magma, inferno, viridis, gray)
- `--width` / `--height` output size
- `--window` / `--hop` FFT settings
- `--min-freq` / `--max-freq` frequency range
- `--start` / `--duration` time slice
- `--format` jpg|png
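To see how these flags combine, here is a hedged sketch of one invocation wrapped in a function. The flag names come from the list above; the numeric values for size and frequency range are illustrative assumptions (the source does not state units or defaults), and `render_panel` and `SONGSEE_CMD` are hypothetical helpers used only for this example. `--window` and `--hop` are left out because the source gives no example values for them.

```shell
#!/usr/bin/env bash
# Sketch: a three-panel grid for an 8-second slice, written as PNG.
# SONGSEE_CMD is a hypothetical override; "echo songsee" gives a dry run.

render_panel() {
  local input="$1" output="$2"
  # Width/height and min/max frequency values are assumed for illustration.
  ${SONGSEE_CMD:-songsee} "$input" \
    --viz spectrogram,mel,chroma \
    --style magma \
    --width 1920 --height 1080 \
    --min-freq 50 --max-freq 8000 \
    --start 12.5 --duration 8 \
    --format png -o "$output"
}
```

Keeping the flag set in one function makes a render reproducible: the same call with a different input file yields a comparable image.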

Notes

- WAV and MP3 are decoded natively; other formats go through ffmpeg if it is available.
- Passing multiple `--viz` values renders a grid.
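Because non-WAV/MP3 input depends on ffmpeg being present, a small preflight check can catch a missing dependency before a render is attempted. This is a sketch under one loud assumption: it guesses the format from the file extension, which may not match how songsee itself detects formats. `needs_ffmpeg` and `preflight` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: warn when an input probably needs ffmpeg and ffmpeg is missing.
# Extension-based detection is an assumption made for this example only;
# it will also flag stdin ("-") and files without an extension.

needs_ffmpeg() {
  local ext
  # Lowercase the extension so TRACK.MP3 and track.mp3 are treated alike.
  ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    wav|mp3) return 1 ;;   # decoded natively, per the notes above
    *)       return 0 ;;   # anything else is assumed to need ffmpeg
  esac
}

preflight() {
  if needs_ffmpeg "$1" && ! command -v ffmpeg >/dev/null 2>&1; then
    echo "warning: $1 likely needs ffmpeg, which was not found on PATH" >&2
    return 1
  fi
  return 0
}
```

A call such as `preflight track.flac && songsee track.flac` then runs the render only when the dependency check passes.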
