mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-03 09:17:09 +08:00

Files

Teknium 8d302e37a8 feat(tts): add Piper as a native local TTS provider (closes #8508 ) (#17885 )

Piper (OHF-Voice/piper1-gpl) is a fast, local neural TTS engine from the
Home Assistant project that supports 44 languages with zero API keys.
Adds it as a native built-in provider alongside edge/neutts/kittentts,
installable via 'hermes tools' with one keystroke.

What ships:

- New 'piper' built-in provider in tools/tts_tool.py
  - Lazy import via _import_piper()
  - Module-level voice cache keyed on (model_path, use_cuda) so switching
    voices doesn't invalidate older cached voices
  - _resolve_piper_voice_path() accepts either an absolute .onnx path or a
    voice name (auto-downloaded on first use via 'python -m
    piper.download_voices --download-dir <cache>')
  - Voice cache at ~/.hermes/cache/piper-voices/ (profile-aware via
    get_hermes_dir)
  - Optional SynthesisConfig knobs: length_scale, noise_scale,
    noise_w_scale, volume, normalize_audio, use_cuda — passed through
    only when configured, so older piper-tts versions aren't broken
  - WAV output then ffmpeg conversion path (same as neutts/kittentts) so
    Telegram voice bubbles work when ffmpeg is present
  - Piper added to BUILTIN_TTS_PROVIDERS so a user's
    tts.providers.piper.command cannot shadow the native provider
    (regression test included)

- 'hermes tools' wizard entry
  - Piper appears under Voice and TTS as local free, with
    'pip install piper-tts' auto-install via post_setup handler
  - Prints voice-catalog URL and default-voice info after install

- config.yaml defaults
  - tts.piper.voice defaults to en_US-lessac-medium
  - Commented advanced knobs for discoverability

- Docs
  - New 'Piper (local, 44 languages)' section in features/tts.md
    explaining install path, voice switching, pre-downloaded voices,
    and advanced knobs
  - Piper listed in the ten-provider table and ffmpeg table
  - Custom-command-providers section updated to drop the Piper example
    (now native) and add a piper-custom example for users with their own
    trained .onnx models
  - overview.md bumps provider count to ten

- Tests (tests/tools/test_tts_piper.py, 16 tests)
  - Registration (BUILTIN_TTS_PROVIDERS, PROVIDER_MAX_TEXT_LENGTH)
  - _resolve_piper_voice_path across every branch: direct .onnx path,
    cached voice name, fresh download with correct CLI args, download
    failure, successful-exit-but-missing-files, empty voice to default
  - _generate_piper_tts: loads voice once, reuses cache, voice-name
    download wiring, advanced knobs flow through SynthesisConfig
  - text_to_speech_tool end-to-end dispatch and missing-package error
  - check_tts_requirements: piper availability toggles the return value
  - Regression guard: piper cannot be shadowed by a command provider
    with the same name
  - Pre-existing test_tts_mistral test broadened to mock the new
    piper/kittentts/command-provider checks (otherwise it false-passes
    when piper is installed in the test venv)

E2E verification (live):

Actual pip install piper-tts, config piper + en_US-lessac-low,
text_to_speech_tool call, voice auto-downloaded from HuggingFace,
WAV synthesized, ffmpeg-converted to Ogg/Opus. Second call hits the
cache (~60ms). Cache dir populated with .onnx and .onnx.json.

This caught a real bug during development: the first pass used '-d' as
the download-dir flag; the actual piper.download_voices CLI wants
'--download-dir'. Fixed before PR opened.

2026-04-30 02:53:20 -07:00

docs

feat(tts): add Piper as a native local TTS provider (closes #8508 ) (#17885 )

2026-04-30 02:53:20 -07:00

scripts

fix(website): auto-wrap ASCII-art code blocks in generated skill pages (#16497 )

2026-04-27 03:38:39 -07:00

src

fix: light mode link/primary colors unreadable on white background (#10457 )

2026-04-15 11:17:44 -07:00

static

feat(dashboard): configure main + auxiliary models from Models page (#17802 )

2026-04-29 23:53:12 -07:00

.gitignore

chore: gitignore generated skills.json

2026-04-02 10:48:15 -07:00

docusaurus.config.ts

docs: exclude per-skill pages from search, add curator feature page (#17563 )

2026-04-29 10:28:15 -07:00

package-lock.json

docs: stabilize website diagrams

2026-03-14 22:49:57 -07:00

package.json

fix(website): run skill extraction automatically on npm run build/start (#13747 )

2026-04-21 18:02:04 -07:00

README.md

docs: replace ASCII diagrams with Mermaid/lists, add linting note

2026-03-21 17:58:30 -07:00

sidebars.ts

feat(dashboard): configure main + auxiliary models from Models page (#17802 )

2026-04-29 23:53:12 -07:00

tsconfig.json

feat: add documentation website (Docusaurus)

2026-03-05 05:24:55 -08:00

README.md

Website

This website is built using Docusaurus, a modern static website generator.

Installation

yarn

Local Development

yarn start

This command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server.

Build

yarn build

This command generates static content into the build directory and can be served using any static contents hosting service.

Deployment

Using SSH:

USE_SSH=true yarn deploy

Not using SSH:

GIT_USER=<Your GitHub username> yarn deploy

If you are using GitHub pages for hosting, this command is a convenient way to build the website and push to the gh-pages branch.

Diagram Linting

CI runs ascii-guard to lint docs for ASCII box diagrams. Use Mermaid (````mermaid`) or plain lists/tables instead of ASCII boxes to avoid CI failures.