mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-04-30 16:01:49 +08:00
Add Google's Gemini speech-generation API as the eighth TTS backend. The API returns base64-encoded signed 16-bit PCM at 24 kHz mono, which is wrapped in a WAV container using the stdlib wave module, with no external dependency. Conversion to mp3/ogg through ffmpeg is optional and is used for Telegram voice bubbles. The backend reads GEMINI_API_KEY and falls back to GOOGLE_API_KEY, offers 30 prebuilt voices, and has a configurable model (flash/pro). Cherry-picked from #10922 by @zhonghui5207. Fixes #10918.
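As a rough illustration of the WAV wrapping and key fallback described above, here is a minimal sketch using only the Python standard library. The function names `resolve_api_key` and `pcm_b64_to_wav` are hypothetical and are not taken from the commit; the audio parameters (24 kHz, mono, 16-bit) come from the description, and the sketch assumes the base64 payload has already been extracted from the API response.

```python
import base64
import io
import os
import wave

# Audio format stated in the commit description.
SAMPLE_RATE_HZ = 24_000
SAMPLE_WIDTH_BYTES = 2  # signed 16-bit samples
CHANNELS = 1            # mono


def resolve_api_key(env=None):
    """Return GEMINI_API_KEY if set, else GOOGLE_API_KEY, else None.

    Hypothetical helper: the real backend may resolve keys differently.
    """
    env = os.environ if env is None else env
    return env.get("GEMINI_API_KEY") or env.get("GOOGLE_API_KEY")


def pcm_b64_to_wav(pcm_b64: str) -> bytes:
    """Decode base64 raw PCM and wrap it in a WAV container in memory."""
    pcm = base64.b64decode(pcm_b64)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(SAMPLE_RATE_HZ)
        wav.writeframes(pcm)
    # wave does not close file objects it did not open, so buf is still readable.
    return buf.getvalue()
```

The resulting bytes can be written to a `.wav` file directly; converting them to mp3 or ogg would be a separate, optional ffmpeg step that this sketch does not cover.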