---
title: "Songsee: Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc.)"
sidebar_label: "Songsee"
description: "Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc.)"
---

{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}

# Songsee

Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc.) from audio files via CLI. Useful for audio analysis, music production debugging, and visual documentation.

## Skill metadata

| | |
|---|---|
| Source | Bundled (installed by default) |
| Path | `skills/media/songsee` |
| Version | `1.0.0` |
| Author | community |
| License | MIT |
| Tags | `Audio`, `Visualization`, `Spectrogram`, `Music`, `Analysis` |

## Reference: full SKILL.md

:::info
Below is the complete skill definition that Hermes loads when this skill is triggered. The agent sees it as its instructions while the skill is active.
:::

# songsee

Generate spectrograms and multi-panel audio feature visualizations from audio files.

## Prerequisites

Requires [Go](https://go.dev/doc/install). Install the `songsee` binary with:
```bash
go install github.com/steipete/songsee/cmd/songsee@latest
```

Optional: `ffmpeg` for formats beyond WAV/MP3.
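
`go install` writes binaries to `$GOBIN`, or to `$(go env GOPATH)/bin` when `GOBIN` is unset, so a fresh shell may not find `songsee` right after installation. The following is a minimal sketch of a check for that case; the helper name `ensure_songsee_on_path` is hypothetical and not part of songsee.

```bash
# Hypothetical helper: make sure the directory `go install` writes to is on PATH.
ensure_songsee_on_path() {
  # Already resolvable: nothing to do.
  if command -v songsee >/dev/null 2>&1; then
    return 0
  fi
  # Fall back to Go's install directory: GOBIN if set, else GOPATH/bin.
  local bin_dir
  bin_dir="$(go env GOBIN)"
  if [ -z "$bin_dir" ]; then
    bin_dir="$(go env GOPATH)/bin"
  fi
  export PATH="$PATH:$bin_dir"
  # Succeeds only if songsee can now be found.
  command -v songsee >/dev/null 2>&1
}
```

The function returns non-zero if `songsee` still cannot be found, which usually means the install step has not been run yet.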

## Quick Start

```bash
# Basic spectrogram
songsee track.mp3

# Save to specific file
songsee track.mp3 -o spectrogram.png

# Multi-panel visualization grid
songsee track.mp3 --viz spectrogram,mel,chroma,hpss,selfsim,loudness,tempogram,mfcc,flux

# Time slice (start at 12.5s, 8s duration)
songsee track.mp3 --start 12.5 --duration 8 -o slice.jpg

# From stdin
cat track.mp3 | songsee - --format png -o out.png
```

## Visualization Types

Use `--viz` with comma-separated values:

| Type | Description |
|------|-------------|
| `spectrogram` | Standard frequency spectrogram |
| `mel` | Mel-scaled spectrogram |
| `chroma` | Pitch class distribution |
| `hpss` | Harmonic/percussive separation |
| `selfsim` | Self-similarity matrix |
| `loudness` | Loudness over time |
| `tempogram` | Tempo estimation |
| `mfcc` | Mel-frequency cepstral coefficients |
| `flux` | Spectral flux (onset detection) |

Multiple `--viz` types render as a grid in a single image.
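
A grid does not need every type. As a sketch, a small wrapper can join a chosen subset with commas before passing it to `--viz`; the function name `render_panels` and its argument order are assumptions made for illustration, and only `--viz` and `-o` come from the tables on this page.

```bash
# Hypothetical wrapper: render_panels INPUT OUTPUT TYPE [TYPE...]
render_panels() {
  local input="$1" output="$2"
  shift 2
  # Join the remaining arguments with commas, the form --viz expects.
  local viz
  viz="$(IFS=,; echo "$*")"
  songsee "$input" --viz "$viz" -o "$output"
}
```

Calling `render_panels track.mp3 panels.png mel chroma tempogram` runs `songsee track.mp3 --viz mel,chroma,tempogram -o panels.png`.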

## Common Flags

| Flag | Description |
|------|-------------|
| `--viz` | Visualization types (comma-separated) |
| `--style` | Color palette: `classic`, `magma`, `inferno`, `viridis`, `gray` |
| `--width` / `--height` | Output image dimensions |
| `--window` / `--hop` | FFT window and hop size |
| `--min-freq` / `--max-freq` | Frequency range filter |
| `--start` / `--duration` | Time slice of the audio |
| `--format` | Output format: `jpg` or `png` |
| `-o` | Output file path |
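
The flags combine freely. One possible use, sketched here under the assumption that each `--style` value listed above is accepted as written, is rendering the same input once per palette to pick the most readable one. The helper name `render_all_styles` and the `mel-<style>.png` output naming are hypothetical.

```bash
# Hypothetical helper: render one mel spectrogram per palette for comparison.
render_all_styles() {
  local input="$1" style
  for style in classic magma inferno viridis gray; do
    # Same input and visualization each time; only the palette changes.
    songsee "$input" --viz mel --style "$style" --format png -o "mel-$style.png"
  done
}
```

Running `render_all_styles track.mp3` would leave five PNG files in the current directory, one per palette.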

## Notes

- WAV and MP3 are decoded natively; other formats require `ffmpeg`
- Output images can be inspected with `vision_analyze` for automated audio analysis
- Useful for comparing audio outputs, debugging synthesis, or documenting audio processing pipelines
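
For the comparison use case, it helps to render both files with identical flags so that any visible difference comes from the audio rather than from the rendering settings. The following is a sketch only: the helper name `compare_renders`, the choice of panels, and the output naming are assumptions, not part of songsee.

```bash
# Hypothetical helper: compare_renders BEFORE AFTER
compare_renders() {
  local before="$1" after="$2"
  local f
  for f in "$before" "$after"; do
    # Identical flags for both files; the image is named after the input
    # with its extension replaced by .png.
    songsee "$f" --viz spectrogram,mel,flux --style viridis -o "${f%.*}.png"
  done
}
```

After `compare_renders dry.wav wet.wav`, the resulting `dry.png` and `wet.png` can be viewed side by side or passed to `vision_analyze`.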