Compare commits

...

32 Commits

Author SHA1 Message Date
teknium1
b54591ddda fix(docker): require explicit env allowlist for container creds 2026-03-15 10:38:30 -07:00
Teknium
934fc9df22 Merge pull request #1440 from NousResearch/fix/1071-dict-tool-args
fix: handle dict tool call arguments from local backends
2026-03-15 08:04:09 -07:00
teknium1
5847c180c6 test: restore vllm integration coverage and add dict-args regression
Restore the existing vLLM integration test module that was accidentally replaced during development and add a focused agent-loop regression test for dict tool-call arguments from OpenAI-compatible local backends.
2026-03-15 08:02:29 -07:00
teknium1
93a0c0cddd fix: handle dict tool call arguments from local backends
Normalize tool call arguments when OpenAI-compatible backends return parsed dict/list payloads instead of JSON strings. This prevents the .strip() crash during tool-call validation for llama.cpp and similar servers, while preserving existing empty-string and invalid-JSON handling. Adds a focused regression test for dict arguments in the agent loop.
2026-03-15 08:00:19 -07:00
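The normalization this commit describes can be sketched as follows. This is an illustrative helper, not the actual Hermes code; the real fix lives in the tool-call validation path:

```python
import json

def normalize_tool_arguments(arguments):
    """Coerce tool-call arguments to a JSON string.

    OpenAI-compatible local backends (llama.cpp and similar servers) may
    return the already-parsed dict/list payload instead of a JSON string;
    calling .strip() on those crashes. Empty strings and invalid JSON are
    passed through unchanged so the existing validation handles them.
    """
    if isinstance(arguments, (dict, list)):
        return json.dumps(arguments)
    return arguments
```

For example, `normalize_tool_arguments({"path": "/tmp"})` yields the JSON string form, while a string input is returned untouched.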
Teknium
23e8fdd167 feat(discord): auto-thread on @mention + skip mention in bot threads
Two changes to align Discord behavior with Slack:

1. Auto-thread on @mention (default: true)
   - When someone @mentions the bot in a server channel, a thread is
     automatically created from their message and the response goes there.
   - Each thread gets its own isolated session (like Slack).
   - Configurable via discord.auto_thread in config.yaml (default: true)
     or DISCORD_AUTO_THREAD env var (env takes precedence).
   - DMs and existing threads are unaffected.

2. Skip @mention in bot-participated threads
   - Once the bot has responded in a thread (auto-created or manually
     entered), subsequent messages in that thread no longer require
     @mention. Users can just type normally.
   - Tracked via in-memory set (_bot_participated_threads). After a
     gateway restart, users need to @mention once to re-establish.
   - Threads the bot hasn't participated in still require @mention.

Config change:
   discord:
     auto_thread: true  # new, added to DEFAULT_CONFIG

Tests: 7 new tests covering auto-thread default, disable, bot thread
participation tracking, and mention skip logic. All 903 gateway tests pass.
2026-03-15 07:59:55 -07:00
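The env-over-config precedence described in change 1 can be sketched like this (a hypothetical helper; only the DISCORD_AUTO_THREAD name and the discord.auto_thread key come from the commit):

```python
import os

def resolve_auto_thread(config: dict) -> bool:
    """Resolve discord.auto_thread: env var wins over config.yaml; default true."""
    env_val = os.getenv("DISCORD_AUTO_THREAD")
    if env_val is not None:
        return env_val.lower() in ("true", "1", "yes")
    return bool(config.get("discord", {}).get("auto_thread", True))
```

With neither source set, the helper returns True, matching the commit's default.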
Teknium
3268b98779 Merge pull request #1437 from NousResearch/fix/1219-cron-thread-context
fix: preserve thread context for cronjob deliver=origin
2026-03-15 06:58:37 -07:00
teyrebaz33
20f381cfb6 fix: preserve thread context for cronjob deliver=origin
When a cronjob is created from within a Telegram or Slack thread,
deliver=origin was posting to the parent channel instead of the thread.

Root cause: the gateway never set HERMES_SESSION_THREAD_ID in the
session environment, so cronjob_tools.py could not capture thread_id
into the job's origin metadata — even though the scheduler already
reads origin.get('thread_id').

Fix:
- gateway/run.py: set HERMES_SESSION_THREAD_ID when thread_id is
  present on the session context, and clear it in _clear_session_env
- tools/cronjob_tools.py: read HERMES_SESSION_THREAD_ID into origin

Closes #1219
2026-03-15 06:57:00 -07:00
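The two sides of the fix can be sketched as below. Only the HERMES_SESSION_THREAD_ID variable and the origin thread_id key come from the commit; the function names are illustrative:

```python
import os

def set_session_env(thread_id=None):
    """Gateway side: expose the session's thread to spawned tools,
    clearing the variable when the session has no thread context."""
    if thread_id:
        os.environ["HERMES_SESSION_THREAD_ID"] = str(thread_id)
    else:
        os.environ.pop("HERMES_SESSION_THREAD_ID", None)

def capture_origin():
    """cronjob_tools side: record where the job was created so the
    scheduler's deliver=origin can post back into the same thread."""
    origin = {}
    thread_id = os.getenv("HERMES_SESSION_THREAD_ID")
    if thread_id:
        origin["thread_id"] = thread_id
    return origin
```

The scheduler already read origin.get('thread_id'); the bug was purely that the gateway never populated the variable for capture.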
Teknium
77bfa252b9 Merge pull request #1434 from NousResearch/fix/1244-env-override
fix(config): reload .env over stale shell overrides
2026-03-15 06:47:40 -07:00
teknium1
f24c00a5bf fix(config): reload .env over stale shell overrides
Hermes startup entrypoints now load ~/.hermes/.env and project fallback env files with user config taking precedence over stale shell-exported values. This makes model/provider/base URL changes in .env actually take effect after restarting Hermes. Adds a shared env loader plus regression coverage, and reproduces the original bug case where OPENAI_BASE_URL and HERMES_INFERENCE_PROVIDER remained stuck on old shell values before import.
2026-03-15 06:46:28 -07:00
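A minimal sketch of what a loader like load_hermes_dotenv has to do. The real implementation uses python-dotenv and handles quoting, comments, and fallback encodings; this version parses bare KEY=VALUE lines only, and loads both files if present, purely to illustrate the override-stale-shell-values behavior:

```python
import os
from pathlib import Path

def load_hermes_dotenv_sketch(hermes_home: Path, project_env: Path = None):
    """Load ~/.hermes/.env, then an optional project fallback, overriding
    any values already exported in the shell. Returns the files loaded."""
    loaded = []
    candidates = [hermes_home / ".env"]
    if project_env is not None:
        candidates.append(project_env)
    for env_file in candidates:
        if not env_file.exists():
            continue
        for line in env_file.read_text().splitlines():
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Unconditional assignment is the point of the fix: the .env
            # value beats a stale value exported in the shell.
            os.environ[key.strip()] = value.strip()
        loaded.append(env_file)
    return loaded
```

This is the behavior the regression test exercises: an exported OPENAI_BASE_URL no longer shadows the value the user just changed in .env.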
Teknium
463239ed85 docs: fallback providers + /background command documentation
* docs: comprehensive fallback providers documentation

- New dedicated page: user-guide/features/fallback-providers.md covering
  both primary model fallback and auxiliary task fallback systems
- Updated configuration.md with fallback_model config section
- Updated environment-variables.md noting fallback is config-only
- Fleshed out developer-guide/provider-runtime.md fallback section with
  internal architecture details (trigger points, activation flow, config flow)
- Added cross-reference from provider-routing.md distinguishing OpenRouter
  sub-provider routing from Hermes-level model fallback
- Added new page to sidebar under Integrations

* docs: comprehensive /background command documentation

- Added Background Sessions section to cli.md covering how it works
  (daemon threads, isolated sessions, config inheritance, Rich panel
  output, bell notification, concurrent tasks)
- Added Background Sessions section to messaging/index.md covering
  messaging-specific behavior (async execution, result delivery back
  to same chat, fire-and-forget pattern)
- Documented background_process_notifications config
  (all/result/error/off) in messaging docs and configuration.md
- Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page
- Fixed inconsistency in slash-commands.md: /background was listed as
  messaging-only but works in both CLI and messaging. Moved it to the
  'both surfaces' note.
- Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
Teknium
60cce9ca6d Merge pull request #1429 from NousResearch/fix/1336-discord-voice-reliability
fix(voice): Discord voice channel reliability fixes
2026-03-15 05:25:45 -07:00
teknium1
2d57946ee9 test(voice): clarify install guidance and local skips
Add an explicit messaging-extra install hint to the missing PyNaCl/davey error path, cover it with a voice-channel join regression test, and skip the low-level NaCl packet tests when PyNaCl is not installed locally.
2026-03-15 05:24:34 -07:00
0xbyt4
5f32fd8b6d feat(voice): add discord-voice-doctor diagnostic script
Checks the full voice environment and reports what's missing:
- Python packages: discord.py, PyNaCl, davey, STT/TTS providers
- System tools: Opus codec (macOS + Linux paths), ffmpeg
- Environment: bot token, allowed users (resolved to usernames), API keys
- Configuration: STT/TTS provider, voice mode state
- Bot permissions: live Discord API check for Connect, Speak, VAD, etc.

All sensitive values are masked. Gracefully handles missing deps,
invalid tokens, API timeouts, and unreachable Discord API.
2026-03-15 05:20:17 -07:00
0xbyt4
3ea039684e test(voice): add integration tests with real NaCl crypto and Opus codec
End-to-end voice channel tests using real crypto (no mocks):

NaCl decrypt (5): valid packet, wrong key, bot SSRC, multi-packet, multi-SSRC
DAVE passthrough (3): unknown SSRC, Unencrypted error, real error drop
Full flow (5): utterance lifecycle, auto-map, pause/resume, corruption, cleanup
SPEAKING hook (4): hook installed, map/overwrite, mapped audio processed
Auth filtering (3): allowed user, rejected user, empty allowlist
Rejoin flow (3): clean state, new SSRC, missing SPEAKING auto-map
Multi-guild (2): independent receivers, stop isolation
Echo prevention (2): paused audio ignored, resumed audio processed
2026-03-15 05:20:17 -07:00
0xbyt4
63f0ec96ec test(voice): add comprehensive flow tests for voice channel fixes
Tests cover the actual code paths changed in voice fixes:

_on_packet DAVE passthrough (8 tests):
- Known SSRC + DAVE decrypt success → buffered
- Unknown SSRC + DAVE → skip DAVE, passthrough to Opus
- DAVE "Unencrypted" error → passthrough, not dropped
- DAVE other error → packet dropped
- No DAVE session → direct decode
- Bot's own SSRC → ignored (echo prevention)
- Multiple SSRCs → separate buffers

SSRC auto-mapping (6 tests):
- Single allowed user → auto-mapped
- Multiple allowed users → no auto-map
- No allowlist → sole non-bot member inferred
- Unallowed user → rejected
- Only bot in channel → no map
- Auto-map persists across checks

Buffer lifecycle (4 tests):
- Known SSRC completed utterance
- Short buffer ignored
- Recent audio waits
- Stale unknown buffer discarded

TTS playback (10 tests):
- play_tts calls play_in_voice_channel in VC
- play_tts falls through when not in VC
- play_tts wrong channel no match
- Voice input dedup (runner skips)
- Text + voice_mode combinations
- Error/empty response skipped
- Agent TTS tool dedup

UDP keepalive (2 tests):
- Interval within bounds
- Silence frame actually sent via send_packet
2026-03-15 05:20:17 -07:00
0xbyt4
1cacaccca6 fix(voice): show clear error when voice dependencies are missing
When PyNaCl or davey is not installed, joining a voice channel fails
with a raw exception. Now shows a human-readable message pointing
the user to reinstall with voice support.

Closes #1336
2026-03-15 05:20:17 -07:00
0xbyt4
773f3c1137 fix(voice): DAVE passthrough + auto-map SSRC after bot rejoin
After bot leave/rejoin, Discord doesn't resend SPEAKING events for
users already in the channel. This left SSRC unmapped and all audio
was silently dropped by DAVE decrypt.

Fixes:
- Skip DAVE for unknown SSRCs instead of dropping (passthrough)
- Handle "UnencryptedWhenPassthroughDisabled" DAVE errors gracefully
- Auto-infer user_id from sole allowed member in voice channel
- Pass allowed_user_ids to VoiceReceiver for secure inference
2026-03-15 05:20:17 -07:00
0xbyt4
0cc784068d fix(voice): add UDP keepalive to prevent Discord dropping voice after silence
Discord drops the UDP voice route after ~60s of silence - no packets
arrive even when users start speaking again. Send an Opus silence
frame every 15s to keep the UDP session alive.
2026-03-15 05:20:17 -07:00
0xbyt4
f1b4d0b280 fix(voice): make play_tts play in VC instead of no-op
play_tts was returning success without playing anything when bot was
in a voice channel. Now it calls play_in_voice_channel directly.

Simplified skip_double dedup: base adapter handles voice input TTS
via play_tts (which now works for VC), runner skips to avoid double.
2026-03-15 05:20:17 -07:00
Teknium
5254d0bba1 Merge pull request #1427 from NousResearch/fix/1414-gateway-shutdown-restart
fix(gateway): cancel active runs during shutdown
2026-03-15 04:47:28 -07:00
teknium1
21c20aeaa5 fix(gateway): cancel active runs during shutdown
Track adapter background message-processing tasks, cancel them during gateway shutdown, and interrupt running agents before disconnecting adapters. This prevents old gateway instances from continuing in-flight work after stop/replace, which was contributing to the restart-time task continuation/flicker behavior reported in #1414. Adds regression coverage for adapter task cancellation and shutdown interrupts.
2026-03-15 04:21:50 -07:00
Teknium
dc095f8491 Merge pull request #1425 from NousResearch/fix/1412-session-delete-prefix
fix(cli): accept session ID prefixes for session actions
2026-03-15 04:02:47 -07:00
teknium1
621fd80b1e fix(cli): accept session ID prefixes for session actions
Resolve session IDs by exact match or unique prefix for sessions delete/export/rename so IDs copied from the sessions table view work even when the table truncates them. Add SessionDB prefix-resolution coverage and a CLI regression test for deleting by listed prefix.

2026-03-15 04:01:56 -07:00
Teknium
2b8fd9a8e3 Merge pull request #1422 from NousResearch/fix/1409-photo-burst-interrupts
fix(gateway): prevent Telegram photo burst interrupts
2026-03-15 03:51:51 -07:00
teknium1
fef710aca8 test(gateway): cover photo burst interrupt regressions
Add regression coverage for non-album Telegram photo burst batching, photo follow-ups that should queue without interrupting active runs, and the gateway priority-interrupt path for photo events.
2026-03-15 03:50:45 -07:00
CoinDegen
4ae1334287 fix(gateway): prevent telegram photo burst interrupts 2026-03-15 03:49:01 -07:00
Teknium
db3e3aa6c5 Merge pull request #1421 from NousResearch/fix/1247-preserve-mcp-toolsets
fix(tools): preserve MCP toolsets when saving platform tool config
2026-03-15 03:33:31 -07:00
ygd58
633488e0c0 fix(tools): preserve MCP toolsets when saving platform tool config
_save_platform_tools() overwrote the entire platform_toolsets list with
only the toolsets known to CONFIGURABLE_TOOLSETS. This silently dropped
any MCP server toolsets that users had added manually to config.yaml.

Fix: collect any existing toolset keys that are not in CONFIGURABLE_TOOLSETS
and append them back after the wizard's selections are written. This ensures
MCP toolsets survive a hermes tools save.

Fixes #1247
2026-03-15 03:28:20 -07:00
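The preservation logic this commit describes can be sketched as follows. The CONFIGURABLE_TOOLSETS contents and the function name here are illustrative; the real change is inside _save_platform_tools():

```python
# Illustrative subset; the real constant enumerates the wizard-known toolsets.
CONFIGURABLE_TOOLSETS = {"web", "files", "terminal"}

def merge_platform_toolsets(existing: list, selected: list) -> list:
    """Write the wizard's selections, then append back any toolset keys the
    wizard doesn't know about (e.g. user-added MCP servers) so they survive
    a `hermes tools save` instead of being silently dropped."""
    preserved = [t for t in existing if t not in CONFIGURABLE_TOOLSETS]
    return list(selected) + [t for t in preserved if t not in selected]
```

Before the fix, the equivalent of the `preserved` step was missing, so the save overwrote the whole platform_toolsets list with only the known keys.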
Teknium
0de200cf4d Merge pull request #1419 from NousResearch/fix/1264-env-secret-blocklist
fix(security): block gateway and tool env vars in subprocesses
2026-03-15 03:22:58 -07:00
Teknium
f6fdb18fe6 Merge pull request #1417 from NousResearch/fix/1056-dm-session-isolation
fix(gateway): isolate DM sessions by chat_id
2026-03-15 03:22:39 -07:00
teknium1
232ba441d7 test: cover DM session key isolation
Update interrupt-key expectations for namespaced DM session keys and add a regression test that different DM chat IDs produce distinct gateway sessions.
2026-03-15 02:38:48 -07:00
heyyyimmax
34e120bcbb fix(gateway): enforce chat_id isolation for all DM sessions 2026-03-15 02:37:53 -07:00
49 changed files with 3468 additions and 178 deletions

View File

@@ -42,19 +42,16 @@ def _setup_logging() -> None:
def _load_env() -> None:
"""Load .env from HERMES_HOME (default ``~/.hermes``)."""
from dotenv import load_dotenv
from hermes_cli.env_loader import load_hermes_dotenv
hermes_home = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
env_file = hermes_home / ".env"
if env_file.exists():
try:
load_dotenv(dotenv_path=env_file, encoding="utf-8")
except UnicodeDecodeError:
load_dotenv(dotenv_path=env_file, encoding="latin-1")
logging.getLogger(__name__).info("Loaded env from %s", env_file)
loaded = load_hermes_dotenv(hermes_home=hermes_home)
if loaded:
for env_file in loaded:
logging.getLogger(__name__).info("Loaded env from %s", env_file)
else:
logging.getLogger(__name__).info(
"No .env found at %s, using system env", env_file
"No .env found at %s, using system env", hermes_home / ".env"
)

View File

@@ -107,6 +107,12 @@ terminal:
# timeout: 180
# lifetime_seconds: 300
# docker_image: "nikolaik/python-nodejs:python3.11-nodejs20"
# # Optional: explicitly forward selected env vars into Docker.
# # These values come from your current shell first, then ~/.hermes/.env.
# # Warning: anything forwarded here is visible to commands run in the container.
# docker_forward_env:
# - "GITHUB_TOKEN"
# - "NPM_TOKEN"
# -----------------------------------------------------------------------------
# OPTION 4: Singularity/Apptainer container

cli.py (19 changes)
View File

@@ -61,23 +61,14 @@ import queue
_COMMAND_SPINNER_FRAMES = ("⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧", "⠇", "⠏")
# Load .env from ~/.hermes/.env first, then project root as dev fallback
from dotenv import load_dotenv
# Load .env from ~/.hermes/.env first, then project root as dev fallback.
# User-managed env files should override stale shell exports on restart.
from hermes_constants import OPENROUTER_BASE_URL
from hermes_cli.env_loader import load_hermes_dotenv
_hermes_home = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
_user_env = _hermes_home / ".env"
_project_env = Path(__file__).parent / '.env'
if _user_env.exists():
try:
load_dotenv(dotenv_path=_user_env, encoding="utf-8")
except UnicodeDecodeError:
load_dotenv(dotenv_path=_user_env, encoding="latin-1")
elif _project_env.exists():
try:
load_dotenv(dotenv_path=_project_env, encoding="utf-8")
except UnicodeDecodeError:
load_dotenv(dotenv_path=_project_env, encoding="latin-1")
load_hermes_dotenv(hermes_home=_hermes_home, project_env=_project_env)
# Point mini-swe-agent at ~/.hermes/ so it shares our config
os.environ.setdefault("MSWEA_GLOBAL_CONFIG_DIR", str(_hermes_home))
@@ -167,6 +158,7 @@ def load_cli_config() -> Dict[str, Any]:
"timeout": 60,
"lifetime_seconds": 300,
"docker_image": "python:3.11",
"docker_forward_env": [],
"singularity_image": "docker://python:3.11",
"modal_image": "python:3.11",
"daytona_image": "nikolaik/python-nodejs:python3.11-nodejs20",
@@ -322,6 +314,7 @@ def load_cli_config() -> Dict[str, Any]:
"timeout": "TERMINAL_TIMEOUT",
"lifetime_seconds": "TERMINAL_LIFETIME_SECONDS",
"docker_image": "TERMINAL_DOCKER_IMAGE",
"docker_forward_env": "TERMINAL_DOCKER_FORWARD_ENV",
"singularity_image": "TERMINAL_SINGULARITY_IMAGE",
"modal_image": "TERMINAL_MODAL_IMAGE",
"daytona_image": "TERMINAL_DAYTONA_IMAGE",

View File

@@ -288,6 +288,7 @@ class MessageEvent:
message_id: Optional[str] = None
# Media attachments
# media_urls: local file paths (for vision tool access)
media_urls: List[str] = field(default_factory=list)
media_types: List[str] = field(default_factory=list)
@@ -355,6 +356,10 @@ class BasePlatformAdapter(ABC):
# Key: session_key (e.g., chat_id), Value: (event, asyncio.Event for interrupt)
self._active_sessions: Dict[str, asyncio.Event] = {}
self._pending_messages: Dict[str, MessageEvent] = {}
# Background message-processing tasks spawned by handle_message().
# Gateway shutdown cancels these so an old gateway instance doesn't keep
# working on a task after --replace or manual restarts.
self._background_tasks: set[asyncio.Task] = set()
# Chats where auto-TTS on voice input is disabled (set by /voice off)
self._auto_tts_disabled_chats: set = set()
@@ -751,7 +756,25 @@ class BasePlatformAdapter(ABC):
# Check if there's already an active handler for this session
if session_key in self._active_sessions:
# Store this as a pending message - it will interrupt the running agent
# Special case: photo bursts/albums frequently arrive as multiple near-
# simultaneous messages. Queue them without interrupting the active run,
# then process them immediately after the current task finishes.
if event.message_type == MessageType.PHOTO:
print(f"[{self.name}] 🖼️ Queuing photo follow-up for session {session_key} without interrupt")
existing = self._pending_messages.get(session_key)
if existing and existing.message_type == MessageType.PHOTO:
existing.media_urls.extend(event.media_urls)
existing.media_types.extend(event.media_types)
if event.text:
if not existing.text:
existing.text = event.text
elif event.text not in existing.text:
existing.text = f"{existing.text}\n\n{event.text}".strip()
else:
self._pending_messages[session_key] = event
return # Don't interrupt now - will run after current task completes
# Default behavior for non-photo follow-ups: interrupt the running agent
print(f"[{self.name}] ⚡ New message while session {session_key} is active - triggering interrupt")
self._pending_messages[session_key] = event
# Signal the interrupt (the processing task checks this)
@@ -759,7 +782,15 @@ class BasePlatformAdapter(ABC):
return # Don't process now - will be handled after current task finishes
# Spawn background task to process this message
asyncio.create_task(self._process_message_background(event, session_key))
task = asyncio.create_task(self._process_message_background(event, session_key))
try:
self._background_tasks.add(task)
except TypeError:
# Some tests stub create_task() with lightweight sentinels that are not
# hashable and do not support lifecycle callbacks.
return
if hasattr(task, "add_done_callback"):
task.add_done_callback(self._background_tasks.discard)
@staticmethod
def _get_human_delay() -> float:
@@ -969,6 +1000,21 @@ class BasePlatformAdapter(ABC):
if session_key in self._active_sessions:
del self._active_sessions[session_key]
async def cancel_background_tasks(self) -> None:
"""Cancel any in-flight background message-processing tasks.
Used during gateway shutdown/replacement so active sessions from the old
process do not keep running after adapters are being torn down.
"""
tasks = [task for task in self._background_tasks if not task.done()]
for task in tasks:
task.cancel()
if tasks:
await asyncio.gather(*tasks, return_exceptions=True)
self._background_tasks.clear()
self._pending_messages.clear()
self._active_sessions.clear()
def has_pending_interrupt(self, session_key: str) -> bool:
"""Check if there's a pending interrupt for a session."""
return session_key in self._active_sessions and self._active_sessions[session_key].is_set()

View File

@@ -87,8 +87,9 @@ class VoiceReceiver:
SAMPLE_RATE = 48000 # Discord native rate
CHANNELS = 2 # Discord sends stereo
def __init__(self, voice_client):
def __init__(self, voice_client, allowed_user_ids: set = None):
self._vc = voice_client
self._allowed_user_ids = allowed_user_ids or set()
self._running = False
# Decryption
@@ -274,19 +275,21 @@ class VoiceReceiver:
if self._dave_session:
with self._lock:
user_id = self._ssrc_to_user.get(ssrc, 0)
if user_id == 0:
if self._packet_debug_count <= 10:
logger.warning("DAVE skip: unknown user for ssrc=%d", ssrc)
return # unknown user, can't DAVE-decrypt
try:
import davey
decrypted = self._dave_session.decrypt(
user_id, davey.MediaType.audio, decrypted
)
except Exception as e:
if self._packet_debug_count <= 10:
logger.warning("DAVE decrypt failed for ssrc=%d: %s", ssrc, e)
return
if user_id:
try:
import davey
decrypted = self._dave_session.decrypt(
user_id, davey.MediaType.audio, decrypted
)
except Exception as e:
# Unencrypted passthrough — use NaCl-decrypted data as-is
if "Unencrypted" not in str(e):
if self._packet_debug_count <= 10:
logger.warning("DAVE decrypt failed for ssrc=%d: %s", ssrc, e)
return
# If SSRC unknown (no SPEAKING event yet), skip DAVE and try
# Opus decode directly — audio may be in passthrough mode.
# Buffer will get a user_id when SPEAKING event arrives later.
# --- Opus decode -> PCM ---
try:
@@ -304,6 +307,32 @@ class VoiceReceiver:
# Silence detection
# ------------------------------------------------------------------
def _infer_user_for_ssrc(self, ssrc: int) -> int:
"""Try to infer user_id for an unmapped SSRC.
When the bot rejoins a voice channel, Discord may not resend
SPEAKING events for users already speaking. If exactly one
allowed user is in the channel, map the SSRC to them.
"""
try:
channel = self._vc.channel
if not channel:
return 0
bot_id = self._vc.user.id if self._vc.user else 0
allowed = self._allowed_user_ids
candidates = [
m.id for m in channel.members
if m.id != bot_id and (not allowed or str(m.id) in allowed)
]
if len(candidates) == 1:
uid = candidates[0]
self._ssrc_to_user[ssrc] = uid
logger.info("Auto-mapped ssrc=%d -> user=%d (sole allowed member)", ssrc, uid)
return uid
except Exception:
pass
return 0
def check_silence(self) -> list:
"""Return list of (user_id, pcm_bytes) for completed utterances."""
now = time.monotonic()
@@ -322,6 +351,10 @@ class VoiceReceiver:
if silence_duration >= self.SILENCE_THRESHOLD and buf_duration >= self.MIN_SPEECH_DURATION:
user_id = ssrc_user_map.get(ssrc, 0)
if not user_id:
# SSRC not mapped (SPEAKING event missing after bot rejoin).
# Infer from allowed users in the voice channel.
user_id = self._infer_user_for_ssrc(ssrc)
if user_id:
completed.append((user_id, bytes(buf)))
self._buffers[ssrc] = bytearray()
@@ -400,6 +433,9 @@ class DiscordAdapter(BasePlatformAdapter):
self._voice_listen_tasks: Dict[int, asyncio.Task] = {} # guild_id -> listen loop
self._voice_input_callback: Optional[Callable] = None # set by run.py
self._on_voice_disconnect: Optional[Callable] = None # set by run.py
# Track threads where the bot has participated so follow-up messages
# in those threads don't require @mention.
self._bot_participated_threads: set = set()
async def connect(self) -> bool:
"""Connect to Discord and start receiving events."""
@@ -580,7 +616,7 @@ class DiscordAdapter(BasePlatformAdapter):
"""Send a message to a Discord channel."""
if not self._client:
return SendResult(success=False, error="Not connected")
try:
# Get the channel
channel = self._client.get_channel(int(chat_id))
@@ -695,13 +731,14 @@ class DiscordAdapter(BasePlatformAdapter):
) -> SendResult:
"""Play auto-TTS audio.
When the bot is in a voice channel for this chat's guild, skip the
file attachment — the gateway runner plays audio in the VC instead.
When the bot is in a voice channel for this chat's guild, play
directly in the VC instead of sending as a file attachment.
"""
for gid, text_ch_id in self._voice_text_channels.items():
if str(text_ch_id) == str(chat_id) and self.is_in_voice_channel(gid):
logger.debug("[%s] Skipping play_tts for %s — VC playback handled by runner", self.name, chat_id)
return SendResult(success=True)
logger.info("[%s] Playing TTS in voice channel (guild=%d)", self.name, gid)
success = await self.play_in_voice_channel(gid, audio_path)
return SendResult(success=success)
return await self.send_voice(chat_id=chat_id, audio_path=audio_path, **kwargs)
async def send_voice(
@@ -805,7 +842,7 @@ class DiscordAdapter(BasePlatformAdapter):
# Start voice receiver (Phase 2: listen to users)
try:
receiver = VoiceReceiver(vc)
receiver = VoiceReceiver(vc, allowed_user_ids=self._allowed_user_ids)
receiver.start()
self._voice_receivers[guild_id] = receiver
self._voice_listen_tasks[guild_id] = asyncio.ensure_future(
@@ -1001,14 +1038,32 @@ class DiscordAdapter(BasePlatformAdapter):
# Voice listening (Phase 2)
# ------------------------------------------------------------------
# UDP keepalive interval in seconds — prevents Discord from dropping
# the UDP route after ~60s of silence.
_KEEPALIVE_INTERVAL = 15
async def _voice_listen_loop(self, guild_id: int):
"""Periodically check for completed utterances and process them."""
receiver = self._voice_receivers.get(guild_id)
if not receiver:
return
last_keepalive = time.monotonic()
try:
while receiver._running:
await asyncio.sleep(0.2)
# Send periodic UDP keepalive to prevent Discord from
# dropping the UDP session after ~60s of silence.
now = time.monotonic()
if now - last_keepalive >= self._KEEPALIVE_INTERVAL:
last_keepalive = now
try:
vc = self._voice_clients.get(guild_id)
if vc and vc.is_connected():
vc._connection.send_packet(b'\xf8\xff\xfe')
except Exception:
pass
completed = receiver.check_silence()
for user_id, pcm_data in completed:
if not self._is_allowed_user(str(user_id)):
@@ -1746,14 +1801,13 @@ class DiscordAdapter(BasePlatformAdapter):
async def _handle_message(self, message: DiscordMessage) -> None:
"""Handle incoming Discord messages."""
# In server channels (not DMs), require the bot to be @mentioned
# UNLESS the channel is in the free-response list.
# UNLESS the channel is in the free-response list or the message is
# in a thread where the bot has already participated.
#
# Config:
# DISCORD_FREE_RESPONSE_CHANNELS: Comma-separated channel IDs where the
# bot responds to every message without needing a mention.
# DISCORD_REQUIRE_MENTION: Set to "false" to disable mention requirement
# globally (all channels become free-response). Default: "true".
# Can also be set via discord.require_mention in config.yaml.
# Config (all settable via discord.* in config.yaml):
# discord.require_mention: Require @mention in server channels (default: true)
# discord.free_response_channels: Channel IDs where bot responds without mention
# discord.auto_thread: Auto-create thread on @mention in channels (default: true)
thread_id = None
parent_channel_id = None
@@ -1772,7 +1826,11 @@ class DiscordAdapter(BasePlatformAdapter):
require_mention = os.getenv("DISCORD_REQUIRE_MENTION", "true").lower() not in ("false", "0", "no")
is_free_channel = bool(channel_ids & free_channels)
if require_mention and not is_free_channel:
# Skip the mention check if the message is in a thread where
# the bot has previously participated (auto-created or replied in).
in_bot_thread = is_thread and thread_id in self._bot_participated_threads
if require_mention and not is_free_channel and not in_bot_thread:
if self._client.user not in message.mentions:
return
@@ -1781,17 +1839,18 @@ class DiscordAdapter(BasePlatformAdapter):
message.content = message.content.replace(f"<@!{self._client.user.id}>", "").strip()
# Auto-thread: when enabled, automatically create a thread for every
# new message in a text channel so each conversation is isolated.
# @mention in a text channel so each conversation is isolated (like Slack).
# Messages already inside threads or DMs are unaffected.
auto_threaded_channel = None
if not is_thread and not isinstance(message.channel, discord.DMChannel):
auto_thread = os.getenv("DISCORD_AUTO_THREAD", "").lower() in ("true", "1", "yes")
auto_thread = os.getenv("DISCORD_AUTO_THREAD", "true").lower() in ("true", "1", "yes")
if auto_thread:
thread = await self._auto_create_thread(message)
if thread:
is_thread = True
thread_id = str(thread.id)
auto_threaded_channel = thread
self._bot_participated_threads.add(thread_id)
# Determine message type
msg_type = MessageType.TEXT
@@ -1891,7 +1950,12 @@ class DiscordAdapter(BasePlatformAdapter):
reply_to_message_id=str(message.reference.message_id) if message.reference else None,
timestamp=message.created_at,
)
# Track thread participation so the bot won't require @mention for
# follow-up messages in threads it has already engaged in.
if thread_id:
self._bot_participated_threads.add(thread_id)
await self.handle_message(event)

View File

@@ -111,6 +111,11 @@ class TelegramAdapter(BasePlatformAdapter):
super().__init__(config, Platform.TELEGRAM)
self._app: Optional[Application] = None
self._bot: Optional[Bot] = None
# Buffer rapid/album photo updates so Telegram image bursts are handled
# as a single MessageEvent instead of self-interrupting multiple turns.
self._media_batch_delay_seconds = float(os.getenv("HERMES_TELEGRAM_MEDIA_BATCH_DELAY_SECONDS", "0.8"))
self._pending_photo_batches: Dict[str, MessageEvent] = {}
self._pending_photo_batch_tasks: Dict[str, asyncio.Task] = {}
self._media_group_events: Dict[str, MessageEvent] = {}
self._media_group_tasks: Dict[str, asyncio.Task] = {}
self._token_lock_identity: Optional[str] = None
@@ -289,13 +294,19 @@ class TelegramAdapter(BasePlatformAdapter):
release_scoped_lock("telegram-bot-token", self._token_lock_identity)
except Exception as e:
logger.warning("[%s] Error releasing Telegram token lock: %s", self.name, e, exc_info=True)
for task in self._pending_photo_batch_tasks.values():
if task and not task.done():
task.cancel()
self._pending_photo_batch_tasks.clear()
self._pending_photo_batches.clear()
self._mark_disconnected()
self._app = None
self._bot = None
self._token_lock_identity = None
logger.info("[%s] Disconnected from Telegram", self.name)
async def send(
self,
chat_id: str,
@@ -807,6 +818,49 @@ class TelegramAdapter(BasePlatformAdapter):
event.text = "\n".join(parts)
await self.handle_message(event)
def _photo_batch_key(self, event: MessageEvent, msg: Message) -> str:
"""Return a batching key for Telegram photos/albums."""
from gateway.session import build_session_key
session_key = build_session_key(event.source)
media_group_id = getattr(msg, "media_group_id", None)
if media_group_id:
return f"{session_key}:album:{media_group_id}"
return f"{session_key}:photo-burst"
async def _flush_photo_batch(self, batch_key: str) -> None:
"""Send a buffered photo burst/album as a single MessageEvent."""
current_task = asyncio.current_task()
try:
await asyncio.sleep(self._media_batch_delay_seconds)
event = self._pending_photo_batches.pop(batch_key, None)
if not event:
return
logger.info("[Telegram] Flushing photo batch %s with %d image(s)", batch_key, len(event.media_urls))
await self.handle_message(event)
finally:
if self._pending_photo_batch_tasks.get(batch_key) is current_task:
self._pending_photo_batch_tasks.pop(batch_key, None)
def _enqueue_photo_event(self, batch_key: str, event: MessageEvent) -> None:
"""Merge photo events into a pending batch and schedule flush."""
existing = self._pending_photo_batches.get(batch_key)
if existing is None:
self._pending_photo_batches[batch_key] = event
else:
existing.media_urls.extend(event.media_urls)
existing.media_types.extend(event.media_types)
if event.text:
if not existing.text:
existing.text = event.text
elif event.text not in existing.text:
existing.text = f"{existing.text}\n\n{event.text}".strip()
prior_task = self._pending_photo_batch_tasks.get(batch_key)
if prior_task and not prior_task.done():
prior_task.cancel()
self._pending_photo_batch_tasks[batch_key] = asyncio.create_task(self._flush_photo_batch(batch_key))
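The cancel-and-reschedule pattern in `_enqueue_photo_event` / `_flush_photo_batch` is a classic asyncio debounce: every new photo restarts the delay window, so a burst arrives as one merged batch. A minimal standalone sketch (class and names invented for illustration):

```python
import asyncio

class PhotoBatcher:
    """Debounce-and-merge: each enqueue cancels the pending flush task
    and schedules a new one, so only the last timer fires."""

    def __init__(self, delay: float, on_flush):
        self.delay = delay
        self.on_flush = on_flush
        self.pending: dict[str, list] = {}
        self.tasks: dict[str, asyncio.Task] = {}

    def enqueue(self, key: str, item) -> None:
        self.pending.setdefault(key, []).append(item)
        prior = self.tasks.get(key)
        if prior and not prior.done():
            prior.cancel()  # restart the debounce window
        self.tasks[key] = asyncio.create_task(self._flush(key))

    async def _flush(self, key: str) -> None:
        await asyncio.sleep(self.delay)
        items = self.pending.pop(key, [])
        if items:
            await self.on_flush(key, items)

flushed = []

async def demo():
    async def collect(key, items):
        flushed.append((key, items))
    b = PhotoBatcher(0.05, collect)
    for url in ("a.jpg", "b.jpg", "c.jpg"):
        b.enqueue("chat:album:1", url)
        await asyncio.sleep(0.01)  # arrivals inside the debounce window
    await asyncio.sleep(0.2)       # let the final timer fire

asyncio.run(demo())
assert flushed == [("chat:album:1", ["a.jpg", "b.jpg", "c.jpg"])]
```

The adapter's version additionally merges captions and cleans up its task map in a `finally` block, but the timing mechanics are the same.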
async def _handle_media_message(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle incoming media messages, downloading images to local cache."""
if not update.message:
@@ -858,14 +912,22 @@ class TelegramAdapter(BasePlatformAdapter):
if file_obj.file_path.lower().endswith(candidate):
ext = candidate
break
# Save to local cache (for vision tool access)
cached_path = cache_image_from_bytes(bytes(image_bytes), ext=ext)
event.media_urls = [cached_path]
event.media_types = [f"image/{ext.lstrip('.')}"]
logger.info("[Telegram] Cached user photo at %s", cached_path)
media_group_id = getattr(msg, "media_group_id", None)
if media_group_id:
await self._queue_media_group_event(str(media_group_id), event)
else:
batch_key = self._photo_batch_key(event, msg)
self._enqueue_photo_event(batch_key, event)
return
except Exception as e:
logger.warning("[Telegram] Failed to cache photo: %s", e, exc_info=True)
# Download voice/audio messages to cache for STT transcription
if msg.voice:
try:

View File

@@ -35,16 +35,12 @@ sys.path.insert(0, str(Path(__file__).parent.parent))
# Resolve Hermes home directory (respects HERMES_HOME override)
_hermes_home = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
# Load environment variables from ~/.hermes/.env first.
# User-managed env files should override stale shell exports on restart.
from dotenv import load_dotenv # backward-compat for tests that monkeypatch this symbol
from hermes_cli.env_loader import load_hermes_dotenv
load_hermes_dotenv(hermes_home=_hermes_home, project_env=Path(__file__).resolve().parents[1] / '.env')
# Bridge config.yaml values into the environment so os.getenv() picks them up.
# config.yaml is authoritative for terminal settings — overrides .env.
@@ -68,6 +64,7 @@ if _config_path.exists():
"timeout": "TERMINAL_TIMEOUT",
"lifetime_seconds": "TERMINAL_LIFETIME_SECONDS",
"docker_image": "TERMINAL_DOCKER_IMAGE",
"docker_forward_env": "TERMINAL_DOCKER_FORWARD_ENV",
"singularity_image": "TERMINAL_SINGULARITY_IMAGE",
"modal_image": "TERMINAL_MODAL_IMAGE",
"daytona_image": "TERMINAL_DAYTONA_IMAGE",
@@ -900,8 +897,19 @@ class GatewayRunner:
"""Stop the gateway and disconnect all adapters."""
logger.info("Stopping gateway...")
self._running = False
for session_key, agent in list(self._running_agents.items()):
try:
agent.interrupt("Gateway shutting down")
logger.debug("Interrupted running agent for session %s during shutdown", session_key[:20])
except Exception as e:
logger.debug("Failed interrupting agent during shutdown: %s", e)
for platform, adapter in list(self.adapters.items()):
try:
await adapter.cancel_background_tasks()
except Exception as e:
logger.debug("%s background-task cancel error: %s", platform.value, e)
try:
await adapter.disconnect()
logger.info("%s disconnected", platform.value)
@@ -909,6 +917,9 @@ class GatewayRunner:
logger.error("%s disconnect error: %s", platform.value, e)
self.adapters.clear()
self._running_agents.clear()
self._pending_messages.clear()
self._pending_approvals.clear()
self._shutdown_all_gateway_honcho()
self._shutdown_event.set()
@@ -1095,11 +1106,36 @@ class GatewayRunner:
)
return None
# PRIORITY handling when an agent is already running for this session.
# Default behavior is to interrupt immediately so user text/stop messages
# are handled with minimal latency.
#
# Special case: Telegram/photo bursts often arrive as multiple near-
# simultaneous updates. Do NOT interrupt for photo-only follow-ups here;
# let the adapter-level batching/queueing logic absorb them.
_quick_key = build_session_key(source)
if _quick_key in self._running_agents:
if event.message_type == MessageType.PHOTO:
logger.debug("PRIORITY photo follow-up for session %s — queueing without interrupt", _quick_key[:20])
adapter = self.adapters.get(source.platform)
if adapter:
# Reuse adapter queue semantics so photo bursts merge cleanly.
if _quick_key in adapter._pending_messages:
existing = adapter._pending_messages[_quick_key]
if getattr(existing, "message_type", None) == MessageType.PHOTO:
existing.media_urls.extend(event.media_urls)
existing.media_types.extend(event.media_types)
if event.text:
if not existing.text:
existing.text = event.text
elif event.text not in existing.text:
existing.text = f"{existing.text}\n\n{event.text}".strip()
else:
adapter._pending_messages[_quick_key] = event
else:
adapter._pending_messages[_quick_key] = event
return None
running_agent = self._running_agents[_quick_key]
logger.debug("PRIORITY interrupt for session %s", _quick_key[:20])
running_agent.interrupt(event.text)
@@ -2396,6 +2432,13 @@ class GatewayRunner:
except Exception as e:
logger.warning("Failed to join voice channel: %s", e)
adapter._voice_input_callback = None
err_lower = str(e).lower()
if "pynacl" in err_lower or "nacl" in err_lower or "davey" in err_lower:
return (
"Voice dependencies are missing (PyNaCl / davey). "
"Install or reinstall Hermes with the messaging extra, e.g. "
"`pip install hermes-agent[messaging]`."
)
return f"Failed to join voice channel: {e}"
if success:
@@ -2536,18 +2579,9 @@ class GatewayRunner:
if has_agent_tts:
return False
# Dedup: base adapter auto-TTS already handles voice input
# (play_tts plays in VC when connected, so runner can skip).
if is_voice_input:
return False
return True
@@ -3469,10 +3503,12 @@ class GatewayRunner:
os.environ["HERMES_SESSION_CHAT_ID"] = context.source.chat_id
if context.source.chat_name:
os.environ["HERMES_SESSION_CHAT_NAME"] = context.source.chat_name
if context.source.thread_id:
os.environ["HERMES_SESSION_THREAD_ID"] = str(context.source.thread_id)
def _clear_session_env(self) -> None:
"""Clear session environment variables."""
for var in ["HERMES_SESSION_PLATFORM", "HERMES_SESSION_CHAT_ID", "HERMES_SESSION_CHAT_NAME", "HERMES_SESSION_THREAD_ID"]:
if var in os.environ:
del os.environ[var]
@@ -3490,9 +3526,13 @@ class GatewayRunner:
1. Immediately understand what the user sent (no extra tool call).
2. Re-examine the image with vision_analyze if it needs more detail.
Athabasca persistence should happen through Athabasca's own POST
/api/uploads flow, using the returned asset.publicUrl rather than local
cache paths.
Args:
user_text: The user's original caption / message text.
image_paths: List of local file paths to cached images.
Returns:
The enriched message string with vision descriptions prepended.
@@ -3517,10 +3557,16 @@ class GatewayRunner:
result = _json.loads(result_json)
if result.get("success"):
description = result.get("analysis", "")
athabasca_note = (
"\n[If this image needs to persist in Athabasca state, upload the cached file "
"through Athabasca POST /api/uploads and use the returned asset.publicUrl. "
"Do not store the local cache path as the canonical imageUrl.]"
)
enriched_parts.append(
f"[The user sent an image~ Here's what I can see:\n{description}]\n"
f"[If you need a closer look, use vision_analyze with "
f"image_url: {path} ~]"
f"{athabasca_note}"
)
else:
enriched_parts.append(

View File

@@ -321,25 +321,32 @@ def build_session_key(source: SessionSource) -> str:
This is the single source of truth for session key construction.
DM rules:
- DMs include chat_id when present, so each private conversation is isolated.
- thread_id further differentiates threaded DMs within the same DM chat.
- Without chat_id, thread_id is used as a best-effort fallback.
- Without thread_id or chat_id, DMs share a single session.
Group/channel rules:
- chat_id identifies the parent group/channel.
- thread_id differentiates threads within that parent chat.
- Without identifiers, messages fall back to one session per platform/chat_type.
"""
platform = source.platform.value
if source.chat_type == "dm":
if source.chat_id:
if source.thread_id:
return f"agent:main:{platform}:dm:{source.chat_id}:{source.thread_id}"
return f"agent:main:{platform}:dm:{source.chat_id}"
if source.thread_id:
return f"agent:main:{platform}:dm:{source.thread_id}"
return f"agent:main:{platform}:dm"
if source.chat_id:
if source.thread_id:
return f"agent:main:{platform}:{source.chat_type}:{source.chat_id}:{source.thread_id}"
return f"agent:main:{platform}:{source.chat_type}:{source.chat_id}"
if source.thread_id:
return f"agent:main:{platform}:{source.chat_type}:{source.thread_id}"
return f"agent:main:{platform}:{source.chat_type}"
class SessionStore:

View File

@@ -106,6 +106,7 @@ DEFAULT_CONFIG = {
"cwd": ".", # Use current directory
"timeout": 180,
"docker_image": "nikolaik/python-nodejs:python3.11-nodejs20",
"docker_forward_env": [],
"singularity_image": "docker://nikolaik/python-nodejs:python3.11-nodejs20",
"modal_image": "nikolaik/python-nodejs:python3.11-nodejs20",
"daytona_image": "nikolaik/python-nodejs:python3.11-nodejs20",
@@ -280,6 +281,7 @@ DEFAULT_CONFIG = {
"discord": {
"require_mention": True, # Require @mention to respond in server channels
"free_response_channels": "", # Comma-separated channel IDs where bot responds without mention
"auto_thread": True, # Auto-create threads on @mention in channels (like Slack)
},
# Permanently allowed dangerous command patterns (added via "always" approval)
@@ -301,7 +303,7 @@ DEFAULT_CONFIG = {
},
# Config schema version - bump this when adding new required fields
"_config_version": 9,
}
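For reference, the new `docker_forward_env` allowlist and `discord.auto_thread` flag would be set in `~/.hermes/config.yaml` roughly like this; the forwarded variable names are illustrative, and only variables listed in the allowlist reach the container:

```yaml
terminal:
  docker_image: nikolaik/python-nodejs:python3.11-nodejs20
  # Empty by default: no credentials are forwarded unless listed here.
  docker_forward_env:
    - OPENAI_API_KEY   # illustrative; list only the creds your tools need
discord:
  require_mention: true
  auto_thread: true    # auto-create a thread on @mention; DISCORD_AUTO_THREAD env overrides
```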
# =============================================================================

hermes_cli/env_loader.py (new file, +46 lines)
View File

@@ -0,0 +1,46 @@
"""Helpers for loading Hermes .env files consistently across entrypoints."""
from __future__ import annotations
import os
from pathlib import Path
from typing import Iterable
from dotenv import load_dotenv
def _load_dotenv_with_fallback(path: Path, *, override: bool) -> None:
try:
load_dotenv(dotenv_path=path, override=override, encoding="utf-8")
except UnicodeDecodeError:
load_dotenv(dotenv_path=path, override=override, encoding="latin-1")
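The UTF-8 then Latin-1 fallback works because Latin-1 assigns a character to every byte value, so the retry can never raise `UnicodeDecodeError`. A self-contained illustration without python-dotenv:

```python
import tempfile
from pathlib import Path

def read_env_text(path: Path) -> str:
    """Try strict UTF-8 first; fall back to Latin-1, which maps all 256 bytes."""
    try:
        return path.read_text(encoding="utf-8")
    except UnicodeDecodeError:
        return path.read_text(encoding="latin-1")

with tempfile.TemporaryDirectory() as d:
    env = Path(d) / ".env"
    env.write_bytes(b"NAME=caf\xe9\n")  # lone 0xE9 is invalid UTF-8
    assert read_env_text(env) == "NAME=café\n"
```

The trade-off: a genuinely UTF-8 file is read exactly, while a legacy single-byte file still loads instead of crashing the entrypoint.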
def load_hermes_dotenv(
*,
hermes_home: str | os.PathLike | None = None,
project_env: str | os.PathLike | None = None,
) -> list[Path]:
"""Load Hermes environment files with user config taking precedence.
Behavior:
- `~/.hermes/.env` overrides stale shell-exported values when present.
- project `.env` acts as a dev fallback and only fills missing values when
the user env exists.
- if no user env exists, the project `.env` also overrides stale shell vars.
"""
loaded: list[Path] = []
home_path = Path(hermes_home or os.getenv("HERMES_HOME", Path.home() / ".hermes"))
user_env = home_path / ".env"
project_env_path = Path(project_env) if project_env else None
if user_env.exists():
_load_dotenv_with_fallback(user_env, override=True)
loaded.append(user_env)
if project_env_path and project_env_path.exists():
_load_dotenv_with_fallback(project_env_path, override=not loaded)
loaded.append(project_env_path)
return loaded
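The precedence rules in the docstring can be demonstrated with a toy stand-in for `load_dotenv`'s `override` flag (illustrative only; real loading goes through python-dotenv):

```python
import os

def apply_env(pairs: dict[str, str], *, override: bool) -> None:
    """Toy loader: with override=False, only missing keys are filled."""
    for k, v in pairs.items():
        if override or k not in os.environ:
            os.environ[k] = v

# A stale shell export that a restart should correct:
os.environ["API_KEY"] = "stale-shell-value"

user_env = {"API_KEY": "user-value"}                         # ~/.hermes/.env
project_env = {"API_KEY": "dev-value", "EXTRA": "dev-only"}  # project .env

apply_env(user_env, override=True)     # user env beats shell exports
apply_env(project_env, override=False) # project env only fills gaps

assert os.environ["API_KEY"] == "user-value"
assert os.environ["EXTRA"] == "dev-only"
```

This matches the `override=not loaded` trick in `load_hermes_dotenv`: when no user env was loaded, the project `.env` gets override rights instead.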

View File

@@ -54,16 +54,11 @@ from typing import Optional
PROJECT_ROOT = Path(__file__).parent.parent.resolve()
sys.path.insert(0, str(PROJECT_ROOT))
# Load .env from ~/.hermes/.env first, then project root as dev fallback.
# User-managed env files should override stale shell exports on restart.
from hermes_cli.config import get_hermes_home
from hermes_cli.env_loader import load_hermes_dotenv
load_hermes_dotenv(project_env=PROJECT_ROOT / '.env')
# Point mini-swe-agent at ~/.hermes/ so it shares our config
os.environ.setdefault("MSWEA_GLOBAL_CONFIG_DIR", str(get_hermes_home()))
@@ -3103,7 +3098,11 @@ For more help on a command:
elif action == "export":
if args.session_id:
resolved_session_id = db.resolve_session_id(args.session_id)
if not resolved_session_id:
print(f"Session '{args.session_id}' not found.")
return
data = db.export_session(resolved_session_id)
if not data:
print(f"Session '{args.session_id}' not found.")
return
@@ -3118,13 +3117,17 @@ For more help on a command:
print(f"Exported {len(sessions)} sessions to {args.output}")
elif action == "delete":
resolved_session_id = db.resolve_session_id(args.session_id)
if not resolved_session_id:
print(f"Session '{args.session_id}' not found.")
return
if not args.yes:
confirm = input(f"Delete session '{resolved_session_id}' and all its messages? [y/N] ")
if confirm.lower() not in ("y", "yes"):
print("Cancelled.")
return
if db.delete_session(resolved_session_id):
print(f"Deleted session '{resolved_session_id}'.")
else:
print(f"Session '{args.session_id}' not found.")
@@ -3140,10 +3143,14 @@ For more help on a command:
print(f"Pruned {count} session(s).")
elif action == "rename":
resolved_session_id = db.resolve_session_id(args.session_id)
if not resolved_session_id:
print(f"Session '{args.session_id}' not found.")
return
title = " ".join(args.title)
try:
if db.set_session_title(resolved_session_id, title):
print(f"Session '{resolved_session_id}' renamed to: {title}")
else:
print(f"Session '{args.session_id}' not found.")
except ValueError as e:

View File

@@ -354,9 +354,29 @@ def _get_platform_tools(config: dict, platform: str) -> Set[str]:
def _save_platform_tools(config: dict, platform: str, enabled_toolset_keys: Set[str]):
"""Save the selected toolset keys for a platform to config.
Preserves any non-configurable toolset entries (like MCP server names)
that were already in the config for this platform.
"""
config.setdefault("platform_toolsets", {})
# Get the set of all configurable toolset keys
configurable_keys = {ts_key for ts_key, _, _ in CONFIGURABLE_TOOLSETS}
# Get existing toolsets for this platform
existing_toolsets = config.get("platform_toolsets", {}).get(platform, [])
if not isinstance(existing_toolsets, list):
existing_toolsets = []
# Preserve any entries that are NOT configurable toolsets (i.e. MCP server names)
preserved_entries = {
entry for entry in existing_toolsets
if entry not in configurable_keys
}
# Merge preserved entries with new enabled toolsets
config["platform_toolsets"][platform] = sorted(enabled_toolset_keys | preserved_entries)
save_config(config)
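The preservation logic reduces to a set merge; a condensed sketch with invented toolset keys standing in for `CONFIGURABLE_TOOLSETS`:

```python
CONFIGURABLE_KEYS = {"terminal", "browser", "vision"}  # illustrative toolset keys

def merge_platform_toolsets(existing, enabled: set) -> list:
    """Entries that are not configurable toolsets (e.g. MCP server names)
    survive a re-save instead of being wiped by the picker UI."""
    if not isinstance(existing, list):
        existing = []
    preserved = {e for e in existing if e not in CONFIGURABLE_KEYS}
    return sorted(enabled | preserved)

existing = ["terminal", "my-mcp-server"]  # user previously added an MCP server
assert merge_platform_toolsets(existing, {"browser"}) == ["browser", "my-mcp-server"]
```

Without the `preserved` set, deselecting a toolset in the config UI would silently drop every MCP server name stored in the same list.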

View File

@@ -249,6 +249,32 @@ class SessionDB:
row = cursor.fetchone()
return dict(row) if row else None
def resolve_session_id(self, session_id_or_prefix: str) -> Optional[str]:
"""Resolve an exact or uniquely prefixed session ID to the full ID.
Returns the exact ID when it exists. Otherwise treats the input as a
prefix and returns the single matching session ID if the prefix is
unambiguous. Returns None for no matches or ambiguous prefixes.
"""
exact = self.get_session(session_id_or_prefix)
if exact:
return exact["id"]
escaped = (
session_id_or_prefix
.replace("\\", "\\\\")
.replace("%", "\\%")
.replace("_", "\\_")
)
cursor = self._conn.execute(
"SELECT id FROM sessions WHERE id LIKE ? ESCAPE '\\' ORDER BY started_at DESC LIMIT 2",
(f"{escaped}%",),
)
matches = [row["id"] for row in cursor.fetchall()]
if len(matches) == 1:
return matches[0]
return None
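The wildcard escaping plus `LIMIT 2` trick is easy to verify against an in-memory SQLite database (schema trimmed to the two columns the query touches; the exact-match fast path is omitted):

```python
import sqlite3

def resolve_prefix(conn: sqlite3.Connection, prefix: str):
    """Escape LIKE wildcards so a literal '%' or '_' in the input cannot
    over-match, then require the prefix to be unambiguous (exactly 1 hit)."""
    escaped = prefix.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT id FROM sessions WHERE id LIKE ? ESCAPE '\\' "
        "ORDER BY started_at DESC LIMIT 2",
        (f"{escaped}%",),
    ).fetchall()
    return rows[0][0] if len(rows) == 1 else None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, started_at TEXT)")
conn.executemany("INSERT INTO sessions VALUES (?, ?)", [
    ("abc123", "2026-01-01"), ("abd456", "2026-01-02"),
])
assert resolve_prefix(conn, "abc") == "abc123"   # unique prefix resolves
assert resolve_prefix(conn, "ab") is None        # ambiguous prefix rejected
assert resolve_prefix(conn, "zz") is None        # no match
```

`LIMIT 2` is the efficiency trick: one extra row is enough to prove ambiguity, so the query never scans for a full match list.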
# Maximum length for session titles
MAX_TITLE_LENGTH = 100

View File

@@ -27,25 +27,16 @@ from pathlib import Path
import fire
import yaml
# Load .env from ~/.hermes/.env first, then project root as dev fallback.
# User-managed env files should override stale shell exports on restart.
_hermes_home = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
_user_env = _hermes_home / ".env"
_project_env = Path(__file__).parent / '.env'
from hermes_cli.env_loader import load_hermes_dotenv
_loaded_env_paths = load_hermes_dotenv(hermes_home=_hermes_home, project_env=_project_env)
for _env_path in _loaded_env_paths:
print(f"✅ Loaded environment variables from {_env_path}")
# Set terminal working directory to tinker-atropos submodule
# This ensures terminal commands run in the right context for RL work

View File

@@ -45,24 +45,16 @@ import fire
from datetime import datetime
from pathlib import Path
# Load .env from ~/.hermes/.env first, then project root as dev fallback.
# User-managed env files should override stale shell exports on restart.
from hermes_cli.env_loader import load_hermes_dotenv
_hermes_home = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
_user_env = _hermes_home / ".env"
_project_env = Path(__file__).parent / '.env'
_loaded_env_paths = load_hermes_dotenv(hermes_home=_hermes_home, project_env=_project_env)
if _loaded_env_paths:
for _env_path in _loaded_env_paths:
logger.info("Loaded environment variables from %s", _env_path)
else:
logger.info("No .env file found. Using system environment variables.")
@@ -5590,6 +5582,12 @@ class AIAgent:
invalid_json_args = []
for tc in assistant_message.tool_calls:
args = tc.function.arguments
if isinstance(args, (dict, list)):
tc.function.arguments = json.dumps(args)
continue
if args is not None and not isinstance(args, str):
tc.function.arguments = str(args)
args = tc.function.arguments
# Treat empty/whitespace strings as empty object
if not args or not args.strip():
tc.function.arguments = "{}"
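The normalization above, extracted into a standalone sketch (function name invented) that shows each branch: parsed dict/list payloads from llama.cpp-style backends, non-string scalars, and the pre-existing empty-string handling that the `.strip()` crash motivated:

```python
import json

def normalize_tool_args(args):
    """Coerce tool-call arguments to the JSON string the validator expects."""
    if isinstance(args, (dict, list)):
        return json.dumps(args)          # backend already parsed the JSON
    if args is not None and not isinstance(args, str):
        return str(args)                 # defensive: unexpected scalar types
    if not args or not args.strip():
        return "{}"                      # empty/whitespace -> empty object
    return args                          # already a JSON string; left alone

assert normalize_tool_args({"path": "/tmp"}) == '{"path": "/tmp"}'
assert normalize_tool_args([1, 2]) == "[1, 2]"
assert normalize_tool_args(None) == "{}"
assert normalize_tool_args("   ") == "{}"
assert normalize_tool_args('{"x": 1}') == '{"x": 1}'
```

Ordering matters: the dict/list check must come before the `.strip()` path, since calling `.strip()` on a dict is exactly the crash the fix removes.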

scripts/discord-voice-doctor.py (new executable file, +389 lines)
View File

@@ -0,0 +1,389 @@
#!/usr/bin/env python3
"""Discord Voice Doctor — diagnostic tool for voice channel support.
Checks all dependencies, configuration, and bot permissions needed
for Discord voice mode to work correctly.
Usage:
python scripts/discord-voice-doctor.py
.venv/bin/python scripts/discord-voice-doctor.py
"""
import os
import sys
import shutil
from pathlib import Path
# Resolve project root
SCRIPT_DIR = Path(__file__).resolve().parent
PROJECT_ROOT = SCRIPT_DIR.parent
sys.path.insert(0, str(PROJECT_ROOT))
HERMES_HOME = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
ENV_FILE = HERMES_HOME / ".env"
OK = "\033[92m\u2713\033[0m"
FAIL = "\033[91m\u2717\033[0m"
WARN = "\033[93m!\033[0m"
# Track whether discord.py is available for later sections
_discord_available = False
def mask(value):
"""Mask sensitive value: show only first 4 chars."""
if not value or len(value) < 8:
return "****"
return f"{value[:4]}{'*' * (len(value) - 4)}"
def check(label, ok, detail=""):
symbol = OK if ok else FAIL
msg = f" {symbol} {label}"
if detail:
msg += f" ({detail})"
print(msg)
return ok
def warn(label, detail=""):
msg = f" {WARN} {label}"
if detail:
msg += f" ({detail})"
print(msg)
def section(title):
print(f"\n\033[1m{title}\033[0m")
def check_packages():
"""Check Python package dependencies. Returns True if all critical deps OK."""
global _discord_available
section("Python Packages")
ok = True
# discord.py
try:
import discord
_discord_available = True
check("discord.py", True, f"v{discord.__version__}")
except ImportError:
check("discord.py", False, "pip install discord.py[voice]")
ok = False
# PyNaCl
try:
import nacl
ver = getattr(nacl, "__version__", "unknown")
try:
import nacl.secret
nacl.secret.Aead(bytes(32))
check("PyNaCl", True, f"v{ver}")
except Exception:  # Aead is missing on PyNaCl < 1.5.0
check("PyNaCl (Aead)", False, f"v{ver} — need >=1.5.0")
ok = False
except ImportError:
check("PyNaCl", False, "pip install PyNaCl>=1.5.0")
ok = False
# davey (DAVE E2EE)
try:
import davey
check("davey (DAVE E2EE)", True, f"v{getattr(davey, '__version__', '?')}")
except ImportError:
check("davey (DAVE E2EE)", False, "pip install davey")
ok = False
# Optional: local STT
try:
import faster_whisper
check("faster-whisper (local STT)", True)
except ImportError:
warn("faster-whisper (local STT)", "not installed — local STT unavailable")
# Optional: TTS providers
try:
import edge_tts
check("edge-tts", True)
except ImportError:
warn("edge-tts", "not installed — edge TTS unavailable")
try:
import elevenlabs
check("elevenlabs SDK", True)
except ImportError:
warn("elevenlabs SDK", "not installed — premium TTS unavailable")
return ok
def check_system_tools():
"""Check system-level tools (opus, ffmpeg). Returns True if all OK."""
section("System Tools")
ok = True
# Opus codec
if _discord_available:
try:
import discord
opus_loaded = discord.opus.is_loaded()
if not opus_loaded:
import ctypes.util
opus_path = ctypes.util.find_library("opus")
if not opus_path:
# Platform-specific fallback paths
candidates = [
"/opt/homebrew/lib/libopus.dylib", # macOS Apple Silicon
"/usr/local/lib/libopus.dylib", # macOS Intel
"/usr/lib/x86_64-linux-gnu/libopus.so.0", # Debian/Ubuntu x86
"/usr/lib/aarch64-linux-gnu/libopus.so.0", # Debian/Ubuntu ARM
"/usr/lib/libopus.so", # Arch Linux
"/usr/lib64/libopus.so", # RHEL/Fedora
]
for p in candidates:
if os.path.isfile(p):
opus_path = p
break
if opus_path:
discord.opus.load_opus(opus_path)
opus_loaded = discord.opus.is_loaded()
if opus_loaded:
check("Opus codec", True)
else:
check("Opus codec", False, "brew install opus / apt install libopus0")
ok = False
except Exception as e:
check("Opus codec", False, str(e))
ok = False
else:
warn("Opus codec", "skipped — discord.py not installed")
# ffmpeg
ffmpeg_path = shutil.which("ffmpeg")
if ffmpeg_path:
check("ffmpeg", True, ffmpeg_path)
else:
check("ffmpeg", False, "brew install ffmpeg / apt install ffmpeg")
ok = False
return ok
def check_env_vars():
"""Check environment variables. Returns (ok, token, groq_key, eleven_key)."""
section("Environment Variables")
# Load .env
try:
from dotenv import load_dotenv
if ENV_FILE.exists():
load_dotenv(ENV_FILE)
except ImportError:
pass
ok = True
token = os.getenv("DISCORD_BOT_TOKEN", "")
if token:
check("DISCORD_BOT_TOKEN", True, mask(token))
else:
check("DISCORD_BOT_TOKEN", False, "not set")
ok = False
# Allowed users — resolve usernames if possible
allowed = os.getenv("DISCORD_ALLOWED_USERS", "")
if allowed:
users = [u.strip() for u in allowed.split(",") if u.strip()]
user_labels = []
for uid in users:
label = mask(uid)
if token and uid.isdigit():
try:
import requests
r = requests.get(
f"https://discord.com/api/v10/users/{uid}",
headers={"Authorization": f"Bot {token}"},
timeout=3,
)
if r.status_code == 200:
label = f"{r.json().get('username', '?')} ({mask(uid)})"
except Exception:
pass
user_labels.append(label)
check("DISCORD_ALLOWED_USERS", True, f"{len(users)} user(s): {', '.join(user_labels)}")
else:
warn("DISCORD_ALLOWED_USERS", "not set — all users can use voice")
groq_key = os.getenv("GROQ_API_KEY", "")
eleven_key = os.getenv("ELEVENLABS_API_KEY", "")
if groq_key:
check("GROQ_API_KEY (STT)", True, mask(groq_key))
else:
warn("GROQ_API_KEY", "not set — Groq STT unavailable")
if eleven_key:
check("ELEVENLABS_API_KEY (TTS)", True, mask(eleven_key))
else:
warn("ELEVENLABS_API_KEY", "not set — ElevenLabs TTS unavailable")
return ok, token, groq_key, eleven_key
def check_config(groq_key, eleven_key):
"""Check hermes config.yaml."""
section("Configuration")
config_path = HERMES_HOME / "config.yaml"
if config_path.exists():
try:
import yaml
with open(config_path) as f:
cfg = yaml.safe_load(f) or {}
stt_provider = cfg.get("stt", {}).get("provider", "local")
tts_provider = cfg.get("tts", {}).get("provider", "edge")
check("STT provider", True, stt_provider)
check("TTS provider", True, tts_provider)
if stt_provider == "groq" and not groq_key:
warn("STT config says groq but GROQ_API_KEY is missing")
if tts_provider == "elevenlabs" and not eleven_key:
warn("TTS config says elevenlabs but ELEVENLABS_API_KEY is missing")
except Exception as e:
warn("config.yaml", f"parse error: {e}")
else:
warn("config.yaml", "not found — using defaults")
# Voice mode state
voice_mode_path = HERMES_HOME / "gateway_voice_mode.json"
if voice_mode_path.exists():
try:
import json
modes = json.loads(voice_mode_path.read_text())
off_count = sum(1 for v in modes.values() if v == "off")
all_count = sum(1 for v in modes.values() if v == "all")
check("Voice mode state", True, f"{all_count} on, {off_count} off, {len(modes)} total")
except Exception:
warn("Voice mode state", "parse error")
else:
check("Voice mode state", True, "no saved state (fresh)")
def check_bot_permissions(token):
"""Check bot permissions via Discord API. Returns True if all OK."""
section("Bot Permissions")
if not token:
warn("Bot permissions", "no token — skipping")
return True
try:
import requests
except ImportError:
warn("Bot permissions", "requests not installed — skipping")
return True
VOICE_PERMS = {
"Priority Speaker": 8,
"Stream": 9,
"View Channel": 10,
"Send Messages": 11,
"Embed Links": 14,
"Attach Files": 15,
"Read Message History": 16,
"Connect": 20,
"Speak": 21,
"Mute Members": 22,
"Deafen Members": 23,
"Move Members": 24,
"Use VAD": 25,
"Send Voice Messages": 46,
}
REQUIRED_PERMS = {"Connect", "Speak", "View Channel", "Send Messages"}
ok = True
try:
headers = {"Authorization": f"Bot {token}"}
r = requests.get("https://discord.com/api/v10/users/@me", headers=headers, timeout=5)
if r.status_code == 401:
check("Bot login", False, "invalid token (401)")
return False
if r.status_code != 200:
check("Bot login", False, f"HTTP {r.status_code}")
return False
bot = r.json()
bot_name = bot.get("username", "?")
check("Bot login", True, f"{bot_name[:3]}{'*' * (len(bot_name) - 3)}")
# Check guilds
r2 = requests.get("https://discord.com/api/v10/users/@me/guilds", headers=headers, timeout=5)
if r2.status_code != 200:
warn("Guilds", f"HTTP {r2.status_code}")
return ok
guilds = r2.json()
check("Guilds", True, f"{len(guilds)} guild(s)")
for g in guilds[:5]:
perms = int(g.get("permissions", 0))
is_admin = bool(perms & (1 << 3))
if is_admin:
print(f" {OK} {g['name']}: Administrator (all permissions)")
continue
has = []
missing = []
for name, bit in sorted(VOICE_PERMS.items(), key=lambda x: x[1]):
if perms & (1 << bit):
has.append(name)
elif name in REQUIRED_PERMS:
missing.append(name)
if missing:
print(f" {FAIL} {g['name']}: missing {', '.join(missing)}")
ok = False
else:
print(f" {OK} {g['name']}: {', '.join(has)}")
except requests.exceptions.Timeout:
warn("Bot permissions", "Discord API timeout")
except requests.exceptions.ConnectionError:
warn("Bot permissions", "cannot reach Discord API")
except Exception as e:
warn("Bot permissions", f"check failed: {e}")
return ok
def main():
print()
print("\033[1m" + "=" * 50 + "\033[0m")
print("\033[1m Discord Voice Doctor\033[0m")
print("\033[1m" + "=" * 50 + "\033[0m")
all_ok = True
all_ok &= check_packages()
all_ok &= check_system_tools()
env_ok, token, groq_key, eleven_key = check_env_vars()
all_ok &= env_ok
check_config(groq_key, eleven_key)
all_ok &= check_bot_permissions(token)
# Summary
print()
print("\033[1m" + "-" * 50 + "\033[0m")
if all_ok:
print(f" {OK} \033[92mAll checks passed — voice mode ready!\033[0m")
else:
print(f" {FAIL} \033[91mSome checks failed — fix issues above.\033[0m")
print()
if __name__ == "__main__":
main()

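The doctor script above tests guild permissions with `perms & (1 << bit)` against the `VOICE_PERMS` bit positions, with Administrator (bit 3) short-circuiting everything. A minimal standalone sketch of that bitfield decoding (the helper name `decode_perms` is hypothetical; the bit positions are the ones the script uses):

```python
# Subset of the VOICE_PERMS bit positions from the doctor script above.
VOICE_PERMS = {
    "View Channel": 10,
    "Send Messages": 11,
    "Connect": 20,
    "Speak": 21,
}

ADMIN_BIT = 1 << 3  # Administrator implies every permission


def decode_perms(perms: int) -> set:
    """Return the permission names whose bit is set in the bitfield."""
    return {name for name, bit in VOICE_PERMS.items() if perms & (1 << bit)}


# A guild grant with View Channel, Connect, and Speak but no Send Messages:
perms = (1 << 10) | (1 << 20) | (1 << 21)
missing = {"Connect", "Speak", "View Channel", "Send Messages"} - decode_perms(perms)
```

With these bits, `missing` comes out to `{"Send Messages"}`, which is exactly the case the script reports as a failing guild.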
View File

@@ -252,3 +252,109 @@ async def test_discord_dms_ignore_mention_requirement(adapter, monkeypatch):
event = adapter.handle_message.await_args.args[0]
assert event.text == "dm without mention"
assert event.source.chat_type == "dm"
@pytest.mark.asyncio
async def test_discord_auto_thread_enabled_by_default(adapter, monkeypatch):
"""Auto-threading should be enabled by default (DISCORD_AUTO_THREAD defaults to 'true')."""
monkeypatch.delenv("DISCORD_AUTO_THREAD", raising=False)
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "false")
# Patch _auto_create_thread to return a fake thread
fake_thread = FakeThread(channel_id=999, name="auto-thread")
adapter._auto_create_thread = AsyncMock(return_value=fake_thread)
message = make_message(channel=FakeTextChannel(channel_id=123), content="hello")
await adapter._handle_message(message)
adapter._auto_create_thread.assert_awaited_once()
adapter.handle_message.assert_awaited_once()
event = adapter.handle_message.await_args.args[0]
assert event.source.chat_type == "thread"
assert event.source.thread_id == "999"
@pytest.mark.asyncio
async def test_discord_auto_thread_can_be_disabled(adapter, monkeypatch):
"""Setting auto_thread to false skips thread creation."""
monkeypatch.setenv("DISCORD_AUTO_THREAD", "false")
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "false")
adapter._auto_create_thread = AsyncMock()
message = make_message(channel=FakeTextChannel(channel_id=123), content="hello")
await adapter._handle_message(message)
adapter._auto_create_thread.assert_not_awaited()
adapter.handle_message.assert_awaited_once()
event = adapter.handle_message.await_args.args[0]
assert event.source.chat_type == "group"
@pytest.mark.asyncio
async def test_discord_bot_thread_skips_mention_requirement(adapter, monkeypatch):
"""Messages in a thread the bot has participated in should not require @mention."""
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "true")
monkeypatch.delenv("DISCORD_FREE_RESPONSE_CHANNELS", raising=False)
monkeypatch.setenv("DISCORD_AUTO_THREAD", "false")
# Simulate bot having previously participated in thread 456
adapter._bot_participated_threads.add("456")
thread = FakeThread(channel_id=456, name="existing thread")
message = make_message(channel=thread, content="follow-up without mention")
await adapter._handle_message(message)
adapter.handle_message.assert_awaited_once()
event = adapter.handle_message.await_args.args[0]
assert event.text == "follow-up without mention"
assert event.source.chat_type == "thread"
@pytest.mark.asyncio
async def test_discord_unknown_thread_still_requires_mention(adapter, monkeypatch):
"""Messages in a thread the bot hasn't participated in should still require @mention."""
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "true")
monkeypatch.delenv("DISCORD_FREE_RESPONSE_CHANNELS", raising=False)
monkeypatch.setenv("DISCORD_AUTO_THREAD", "false")
# Bot has NOT participated in thread 789
thread = FakeThread(channel_id=789, name="some thread")
message = make_message(channel=thread, content="hello from unknown thread")
await adapter._handle_message(message)
adapter.handle_message.assert_not_awaited()
@pytest.mark.asyncio
async def test_discord_auto_thread_tracks_participation(adapter, monkeypatch):
"""Auto-created threads should be tracked for future mention-free replies."""
monkeypatch.delenv("DISCORD_AUTO_THREAD", raising=False)
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "false")
fake_thread = FakeThread(channel_id=555, name="auto-thread")
adapter._auto_create_thread = AsyncMock(return_value=fake_thread)
message = make_message(channel=FakeTextChannel(channel_id=123), content="start a thread")
await adapter._handle_message(message)
assert "555" in adapter._bot_participated_threads
@pytest.mark.asyncio
async def test_discord_thread_participation_tracked_on_dispatch(adapter, monkeypatch):
"""When the bot processes a message in a thread, it tracks participation."""
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "false")
monkeypatch.setenv("DISCORD_AUTO_THREAD", "false")
thread = FakeThread(channel_id=777, name="manually created thread")
message = make_message(channel=thread, content="hello in thread")
await adapter._handle_message(message)
assert "777" in adapter._bot_participated_threads

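The tests above pin down the env-var semantics: auto-threading is on when `DISCORD_AUTO_THREAD` is unset and off only for an explicit falsy value. A sketch of that default-true toggle (the helper name `auto_thread_enabled` and the accepted falsy spellings are assumptions; only the default and the `"false"` case are fixed by the tests):

```python
import os


def auto_thread_enabled() -> bool:
    """DISCORD_AUTO_THREAD defaults to true when unset; an explicit
    falsy value ("false", "0", "no") disables auto-threading."""
    return os.getenv("DISCORD_AUTO_THREAD", "true").strip().lower() not in ("false", "0", "no")
```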
View File

@@ -363,11 +363,37 @@ async def test_auto_thread_creates_thread_and_redirects(adapter, monkeypatch):
@pytest.mark.asyncio
async def test_auto_thread_disabled_by_default(adapter, monkeypatch):
"""Without DISCORD_AUTO_THREAD, messages stay in the channel."""
async def test_auto_thread_enabled_by_default_slash_commands(adapter, monkeypatch):
"""Without DISCORD_AUTO_THREAD env var, auto-threading is enabled (default: true)."""
monkeypatch.delenv("DISCORD_AUTO_THREAD", raising=False)
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "false")
fake_thread = _FakeThreadChannel(channel_id=999, name="auto-thread")
adapter._auto_create_thread = AsyncMock(return_value=fake_thread)
captured_events = []
async def capture_handle(event):
captured_events.append(event)
adapter.handle_message = capture_handle
msg = _fake_message(_FakeTextChannel())
await adapter._handle_message(msg)
adapter._auto_create_thread.assert_awaited_once()
assert len(captured_events) == 1
assert captured_events[0].source.chat_id == "999" # redirected to thread
assert captured_events[0].source.chat_type == "thread"
@pytest.mark.asyncio
async def test_auto_thread_can_be_disabled(adapter, monkeypatch):
"""Setting DISCORD_AUTO_THREAD=false keeps messages in the channel."""
monkeypatch.setenv("DISCORD_AUTO_THREAD", "false")
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "false")
adapter._auto_create_thread = AsyncMock()
captured_events = []

View File

@@ -0,0 +1,106 @@
import asyncio
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from gateway.config import GatewayConfig, Platform, PlatformConfig
from gateway.platforms.base import BasePlatformAdapter, MessageEvent, SendResult
from gateway.run import GatewayRunner
from gateway.session import SessionSource, build_session_key
class StubAdapter(BasePlatformAdapter):
def __init__(self):
super().__init__(PlatformConfig(enabled=True, token="***"), Platform.TELEGRAM)
async def connect(self):
return True
async def disconnect(self):
return None
async def send(self, chat_id, content, reply_to=None, metadata=None):
return SendResult(success=True, message_id="1")
async def send_typing(self, chat_id, metadata=None):
return None
async def get_chat_info(self, chat_id):
return {"id": chat_id}
def _source(chat_id="123456", chat_type="dm"):
return SessionSource(
platform=Platform.TELEGRAM,
chat_id=chat_id,
chat_type=chat_type,
)
@pytest.mark.asyncio
async def test_cancel_background_tasks_cancels_inflight_message_processing():
adapter = StubAdapter()
release = asyncio.Event()
async def block_forever(_event):
await release.wait()
return None
adapter.set_message_handler(block_forever)
event = MessageEvent(text="work", source=_source(), message_id="1")
await adapter.handle_message(event)
await asyncio.sleep(0)
session_key = build_session_key(event.source)
assert session_key in adapter._active_sessions
assert adapter._background_tasks
await adapter.cancel_background_tasks()
assert adapter._background_tasks == set()
assert adapter._active_sessions == {}
assert adapter._pending_messages == {}
@pytest.mark.asyncio
async def test_gateway_stop_interrupts_running_agents_and_cancels_adapter_tasks():
runner = object.__new__(GatewayRunner)
runner.config = GatewayConfig(platforms={Platform.TELEGRAM: PlatformConfig(enabled=True, token="***")})
runner._running = True
runner._shutdown_event = asyncio.Event()
runner._exit_reason = None
runner._pending_messages = {"session": "pending text"}
runner._pending_approvals = {"session": {"command": "rm -rf /tmp/x"}}
runner._shutdown_all_gateway_honcho = lambda: None
adapter = StubAdapter()
release = asyncio.Event()
async def block_forever(_event):
await release.wait()
return None
adapter.set_message_handler(block_forever)
event = MessageEvent(text="work", source=_source(), message_id="1")
await adapter.handle_message(event)
await asyncio.sleep(0)
disconnect_mock = AsyncMock()
adapter.disconnect = disconnect_mock
session_key = build_session_key(event.source)
running_agent = MagicMock()
runner._running_agents = {session_key: running_agent}
runner.adapters = {Platform.TELEGRAM: adapter}
with patch("gateway.status.remove_pid_file"), patch("gateway.status.write_runtime_status"):
await runner.stop()
running_agent.interrupt.assert_called_once_with("Gateway shutting down")
disconnect_mock.assert_awaited_once()
assert runner.adapters == {}
assert runner._running_agents == {}
assert runner._pending_messages == {}
assert runner._pending_approvals == {}
assert runner._shutdown_event.is_set() is True

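The shutdown tests above assert that `cancel_background_tasks` empties `_background_tasks`, `_active_sessions`, and `_pending_messages` even while a handler is blocked. A minimal sketch of that bookkeeping, assuming the attribute names from the tests but not the real adapter's internals:

```python
import asyncio


class Adapter:
    """Sketch of the background-task bookkeeping the tests exercise."""

    def __init__(self):
        self._background_tasks = set()   # asyncio.Task objects
        self._active_sessions = {}       # session_key -> interrupt Event
        self._pending_messages = {}      # session_key -> queued event

    async def cancel_background_tasks(self):
        # Cancel every in-flight handler task, then wait for cancellation
        # to propagate; return_exceptions swallows the CancelledErrors.
        for task in list(self._background_tasks):
            task.cancel()
        await asyncio.gather(*self._background_tasks, return_exceptions=True)
        self._background_tasks.clear()
        self._active_sessions.clear()
        self._pending_messages.clear()
```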
View File

@@ -0,0 +1,25 @@
from unittest.mock import patch
import pytest
@pytest.mark.asyncio
async def test_image_enrichment_uses_athabasca_upload_guidance_without_stale_r2_warning():
from gateway.run import GatewayRunner
runner = object.__new__(GatewayRunner)
with patch(
"tools.vision_tools.vision_analyze_tool",
return_value='{"success": true, "analysis": "A painted serpent warrior."}',
):
enriched = await runner._enrich_message_with_vision(
"caption",
["/tmp/test.jpg"],
)
assert "R2 not configured" not in enriched
assert "Gateway media URL available for reference" not in enriched
assert "POST /api/uploads" in enriched
assert "Do not store the local cache path" in enriched
assert "caption" in enriched

View File

@@ -11,7 +11,7 @@ import asyncio
import pytest
from gateway.config import Platform, PlatformConfig
from gateway.platforms.base import BasePlatformAdapter, MessageEvent, SendResult
from gateway.platforms.base import BasePlatformAdapter, MessageEvent, MessageType, SendResult
from gateway.session import SessionSource, build_session_key
@@ -50,11 +50,11 @@ class TestInterruptKeyConsistency:
"""Ensure adapter interrupt methods are queried with session_key, not chat_id."""
def test_session_key_differs_from_chat_id_for_dm(self):
"""Session key for a DM is NOT the same as chat_id."""
"""Session key for a DM is namespaced and includes the DM chat_id."""
source = _source("123456", "dm")
session_key = build_session_key(source)
assert session_key != source.chat_id
assert session_key == "agent:main:telegram:dm"
assert session_key == "agent:main:telegram:dm:123456"
def test_session_key_differs_from_chat_id_for_group(self):
"""Session key for a group chat includes prefix, unlike raw chat_id."""
@@ -122,3 +122,29 @@ class TestInterruptKeyConsistency:
# Interrupt event was set
assert adapter._active_sessions[session_key].is_set()
@pytest.mark.asyncio
async def test_photo_followup_is_queued_without_interrupt(self):
"""Photo follow-ups should queue behind the active run instead of interrupting it."""
adapter = StubAdapter()
adapter.set_message_handler(lambda event: asyncio.sleep(0, result=None))
source = _source("-1001234", "group")
session_key = build_session_key(source)
interrupt_event = asyncio.Event()
adapter._active_sessions[session_key] = interrupt_event
event = MessageEvent(
text="caption",
source=source,
message_type=MessageType.PHOTO,
message_id="2",
media_urls=["/tmp/photo-a.jpg"],
media_types=["image/jpeg"],
)
await adapter.handle_message(event)
queued = adapter._pending_messages[session_key]
assert queued is event
assert queued.media_urls == ["/tmp/photo-a.jpg"]
assert interrupt_event.is_set() is False

View File

@@ -338,7 +338,7 @@ class TestSessionStoreRewriteTranscript:
class TestWhatsAppDMSessionKeyConsistency:
"""Regression: all session-key construction must go through build_session_key
so WhatsApp DMs include chat_id while other DMs do not."""
so DMs are isolated by chat_id across platforms."""
@pytest.fixture()
def store(self, tmp_path):
@@ -369,15 +369,24 @@ class TestWhatsAppDMSessionKeyConsistency:
)
assert store._generate_session_key(source) == build_session_key(source)
def test_telegram_dm_omits_chat_id(self):
"""Non-WhatsApp DMs should still omit chat_id (single owner DM)."""
def test_telegram_dm_includes_chat_id(self):
"""Non-WhatsApp DMs should also include chat_id to separate users."""
source = SessionSource(
platform=Platform.TELEGRAM,
chat_id="99",
chat_type="dm",
)
key = build_session_key(source)
assert key == "agent:main:telegram:dm"
assert key == "agent:main:telegram:dm:99"
def test_distinct_dm_chat_ids_get_distinct_session_keys(self):
"""Different DM chats must not collapse into one shared session."""
first = SessionSource(platform=Platform.TELEGRAM, chat_id="99", chat_type="dm")
second = SessionSource(platform=Platform.TELEGRAM, chat_id="100", chat_type="dm")
assert build_session_key(first) == "agent:main:telegram:dm:99"
assert build_session_key(second) == "agent:main:telegram:dm:100"
assert build_session_key(first) != build_session_key(second)
def test_discord_group_includes_chat_id(self):
"""Group/channel keys include chat_type and chat_id."""

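The session-key tests above fully determine the DM key shape: `agent:main:<platform>:dm:<chat_id>`, so distinct DM chats never share a session. A toy re-implementation consistent with those assertions (the real `build_session_key` lives in `gateway.session`; this sketch assumes a plain-string platform value):

```python
from dataclasses import dataclass


@dataclass
class SessionSource:
    platform: str
    chat_id: str
    chat_type: str


def build_session_key(source: SessionSource, agent: str = "main") -> str:
    # DMs now include chat_id too, so different DM users stay isolated.
    return f"agent:{agent}:{source.platform}:{source.chat_type}:{source.chat_id}"
```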
View File

@@ -0,0 +1,45 @@
import os
from gateway.config import Platform
from gateway.run import GatewayRunner
from gateway.session import SessionContext, SessionSource
def test_set_session_env_includes_thread_id(monkeypatch):
runner = object.__new__(GatewayRunner)
source = SessionSource(
platform=Platform.TELEGRAM,
chat_id="-1001",
chat_name="Group",
chat_type="group",
thread_id="17585",
)
context = SessionContext(source=source, connected_platforms=[], home_channels={})
monkeypatch.delenv("HERMES_SESSION_PLATFORM", raising=False)
monkeypatch.delenv("HERMES_SESSION_CHAT_ID", raising=False)
monkeypatch.delenv("HERMES_SESSION_CHAT_NAME", raising=False)
monkeypatch.delenv("HERMES_SESSION_THREAD_ID", raising=False)
runner._set_session_env(context)
assert os.getenv("HERMES_SESSION_PLATFORM") == "telegram"
assert os.getenv("HERMES_SESSION_CHAT_ID") == "-1001"
assert os.getenv("HERMES_SESSION_CHAT_NAME") == "Group"
assert os.getenv("HERMES_SESSION_THREAD_ID") == "17585"
def test_clear_session_env_removes_thread_id(monkeypatch):
runner = object.__new__(GatewayRunner)
monkeypatch.setenv("HERMES_SESSION_PLATFORM", "telegram")
monkeypatch.setenv("HERMES_SESSION_CHAT_ID", "-1001")
monkeypatch.setenv("HERMES_SESSION_CHAT_NAME", "Group")
monkeypatch.setenv("HERMES_SESSION_THREAD_ID", "17585")
runner._clear_session_env()
assert os.getenv("HERMES_SESSION_PLATFORM") is None
assert os.getenv("HERMES_SESSION_CHAT_ID") is None
assert os.getenv("HERMES_SESSION_CHAT_NAME") is None
assert os.getenv("HERMES_SESSION_THREAD_ID") is None

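The env tests above check that the four `HERMES_SESSION_*` variables are set from the session context and removed again on clear. A sketch of that pair of helpers, with names mirroring the tested env keys (the function signatures here are assumptions; the real methods take a `SessionContext`):

```python
import os

_SESSION_ENV_KEYS = (
    "HERMES_SESSION_PLATFORM",
    "HERMES_SESSION_CHAT_ID",
    "HERMES_SESSION_CHAT_NAME",
    "HERMES_SESSION_THREAD_ID",
)


def set_session_env(platform, chat_id, chat_name, thread_id):
    """Export the session's identity so child tools can read it."""
    values = dict(zip(_SESSION_ENV_KEYS, (platform, chat_id, chat_name, thread_id)))
    for key, value in values.items():
        if value is not None:
            os.environ[key] = value


def clear_session_env():
    """Remove all session env vars; missing keys are ignored."""
    for key in _SESSION_ENV_KEYS:
        os.environ.pop(key, None)
```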
View File

@@ -12,6 +12,7 @@ import asyncio
import importlib
import os
import sys
from types import SimpleNamespace
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
@@ -351,6 +352,26 @@ class TestDocumentDownloadBlock:
# ---------------------------------------------------------------------------
class TestMediaGroups:
@pytest.mark.asyncio
async def test_non_album_photo_burst_is_buffered_and_combined(self, adapter):
first_photo = _make_photo(_make_file_obj(b"first"))
second_photo = _make_photo(_make_file_obj(b"second"))
msg1 = _make_message(caption="two images", photo=[first_photo])
msg2 = _make_message(photo=[second_photo])
with patch("gateway.platforms.telegram.cache_image_from_bytes", side_effect=["/tmp/burst-one.jpg", "/tmp/burst-two.jpg"]):
await adapter._handle_media_message(_make_update(msg1), MagicMock())
await adapter._handle_media_message(_make_update(msg2), MagicMock())
assert adapter.handle_message.await_count == 0
await asyncio.sleep(adapter.MEDIA_GROUP_WAIT_SECONDS + 0.05)
adapter.handle_message.assert_awaited_once()
event = adapter.handle_message.await_args.args[0]
assert event.text == "two images"
assert event.media_urls == ["/tmp/burst-one.jpg", "/tmp/burst-two.jpg"]
assert len(event.media_types) == 2
@pytest.mark.asyncio
async def test_photo_album_is_buffered_and_combined(self, adapter):
first_photo = _make_photo(_make_file_obj(b"first"))
@@ -537,6 +558,51 @@ class TestSendDocument:
assert call_kwargs["reply_to_message_id"] == 50
class TestTelegramPhotoBatching:
@pytest.mark.asyncio
async def test_flush_photo_batch_does_not_drop_newer_scheduled_task(self, adapter):
old_task = MagicMock()
new_task = MagicMock()
batch_key = "session:photo-burst"
adapter._pending_photo_batch_tasks[batch_key] = new_task
adapter._pending_photo_batches[batch_key] = MessageEvent(
text="",
message_type=MessageType.PHOTO,
source=SimpleNamespace(channel_id="chat-1"),
media_urls=["/tmp/a.jpg"],
media_types=["image/jpeg"],
)
with (
patch("gateway.platforms.telegram.asyncio.current_task", return_value=old_task),
patch("gateway.platforms.telegram.asyncio.sleep", new=AsyncMock()),
):
await adapter._flush_photo_batch(batch_key)
assert adapter._pending_photo_batch_tasks[batch_key] is new_task
@pytest.mark.asyncio
async def test_disconnect_cancels_pending_photo_batch_tasks(self, adapter):
task = MagicMock()
task.done.return_value = False
adapter._pending_photo_batch_tasks["session:photo-burst"] = task
adapter._pending_photo_batches["session:photo-burst"] = MessageEvent(
text="",
message_type=MessageType.PHOTO,
source=SimpleNamespace(channel_id="chat-1"),
)
adapter._app = MagicMock()
adapter._app.updater.stop = AsyncMock()
adapter._app.stop = AsyncMock()
adapter._app.shutdown = AsyncMock()
await adapter.disconnect()
task.cancel.assert_called_once()
assert adapter._pending_photo_batch_tasks == {}
assert adapter._pending_photo_batches == {}
# ---------------------------------------------------------------------------
# TestSendVideo — outbound video delivery
# ---------------------------------------------------------------------------

View File

@@ -0,0 +1,49 @@
import asyncio
from unittest.mock import MagicMock
import pytest
from gateway.config import GatewayConfig, Platform, PlatformConfig
from gateway.platforms.base import MessageEvent, MessageType
from gateway.session import SessionSource, build_session_key
from gateway.run import GatewayRunner
class _PendingAdapter:
def __init__(self):
self._pending_messages = {}
def _make_runner():
runner = object.__new__(GatewayRunner)
runner.config = GatewayConfig(platforms={Platform.TELEGRAM: PlatformConfig(enabled=True, token="***")})
runner.adapters = {Platform.TELEGRAM: _PendingAdapter()}
runner._running_agents = {}
runner._pending_messages = {}
runner._pending_approvals = {}
runner._voice_mode = {}
runner._is_user_authorized = lambda _source: True
return runner
@pytest.mark.asyncio
async def test_handle_message_does_not_priority_interrupt_photo_followup():
runner = _make_runner()
source = SessionSource(platform=Platform.TELEGRAM, chat_id="12345", chat_type="dm")
session_key = build_session_key(source)
running_agent = MagicMock()
runner._running_agents[session_key] = running_agent
event = MessageEvent(
text="caption",
message_type=MessageType.PHOTO,
source=source,
media_urls=["/tmp/photo-a.jpg"],
media_types=["image/jpeg"],
)
result = await runner._handle_message(event)
assert result is None
running_agent.interrupt.assert_not_called()
assert runner.adapters[Platform.TELEGRAM]._pending_messages[session_key] is event

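The test above pins the routing rule: when an agent is already running for the session, a photo follow-up is queued in the adapter's `_pending_messages` rather than triggering a priority interrupt. A hedged sketch of that decision (the function `route_message` and the interrupt reason string are hypothetical; only the photo-queues branch is fixed by the test):

```python
def route_message(event_type, session_key, running_agents, pending_messages, event):
    """Queue photo follow-ups behind an active run instead of interrupting it."""
    if session_key in running_agents:
        if event_type == "photo":
            # Photo follow-up: buffer it for delivery after the current run.
            pending_messages[session_key] = event
            return "queued"
        # Other messages priority-interrupt the running agent (reason text
        # is illustrative, not taken from the real runner).
        running_agents[session_key].interrupt("new message")
        return "interrupted"
    return "dispatched"
```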
View File

@@ -1,5 +1,6 @@
"""Tests for the /voice command and auto voice reply in the gateway."""
import importlib.util
import json
import os
import queue
@@ -206,9 +207,11 @@ class TestAutoVoiceReply:
2. gateway _send_voice_reply: fires based on voice_mode setting
To prevent double audio, _send_voice_reply is skipped when voice input
already triggered base adapter auto-TTS (skip_double = is_voice_input).
Exception: Discord voice channel — both auto-TTS and Discord play_tts
override skip, so the runner must handle it via play_in_voice_channel.
already triggered base adapter auto-TTS.
For Discord voice channels, the base adapter now routes play_tts directly
into VC playback, so the runner should still skip voice-input follow-ups to
avoid double playback.
"""
@pytest.fixture
@@ -292,14 +295,14 @@ class TestAutoVoiceReply:
# -- Discord VC exception: runner must handle --------------------------
def test_discord_vc_voice_input_runner_fires(self, runner):
"""Discord VC + voice input: base play_tts skips (VC override),
so runner must handle via play_in_voice_channel."""
assert self._call(runner, "all", MessageType.VOICE, in_voice_channel=True) is True
def test_discord_vc_voice_input_base_handles(self, runner):
"""Discord VC + voice input: base adapter play_tts plays in VC,
so runner skips to avoid double playback."""
assert self._call(runner, "all", MessageType.VOICE, in_voice_channel=True) is False
def test_discord_vc_voice_only_runner_fires(self, runner):
"""Discord VC + voice_only + voice: runner must handle."""
assert self._call(runner, "voice_only", MessageType.VOICE, in_voice_channel=True) is True
def test_discord_vc_voice_only_base_handles(self, runner):
"""Discord VC + voice_only + voice: base adapter handles."""
assert self._call(runner, "voice_only", MessageType.VOICE, in_voice_channel=True) is False
# -- Edge cases --------------------------------------------------------
@@ -422,17 +425,23 @@ class TestDiscordPlayTtsSkip:
return adapter
@pytest.mark.asyncio
async def test_play_tts_skipped_when_in_vc(self):
async def test_play_tts_plays_in_vc_when_connected(self):
adapter = self._make_discord_adapter()
# Simulate bot in voice channel for guild 111, text channel 123
mock_vc = MagicMock()
mock_vc.is_connected.return_value = True
mock_vc.is_playing.return_value = False
adapter._voice_clients[111] = mock_vc
adapter._voice_text_channels[111] = 123
# Mock play_in_voice_channel to avoid actual ffmpeg call
async def fake_play(gid, path):
return True
adapter.play_in_voice_channel = fake_play
result = await adapter.play_tts(chat_id="123", audio_path="/tmp/test.ogg")
# play_tts now plays in VC instead of being a no-op
assert result.success is True
# send_voice should NOT have been called (no client, would fail)
@pytest.mark.asyncio
async def test_play_tts_not_skipped_when_not_in_vc(self):
@@ -728,6 +737,24 @@ class TestVoiceChannelCommands:
result = await runner._handle_voice_channel_join(event)
assert "failed" in result.lower()
@pytest.mark.asyncio
async def test_join_missing_voice_dependencies(self, runner):
"""Missing PyNaCl/davey should return a user-actionable install hint."""
mock_channel = MagicMock()
mock_channel.name = "General"
mock_adapter = AsyncMock()
mock_adapter.join_voice_channel = AsyncMock(
side_effect=RuntimeError("PyNaCl library needed in order to use voice")
)
mock_adapter.get_user_voice_channel = AsyncMock(return_value=mock_channel)
event = self._make_discord_event()
runner.adapters[event.source.platform] = mock_adapter
result = await runner._handle_voice_channel_join(event)
assert "voice dependencies are missing" in result.lower()
assert "hermes-agent[messaging]" in result
# -- _handle_voice_channel_leave --
@pytest.mark.asyncio
@@ -2031,3 +2058,534 @@ class TestDisconnectVoiceCleanup:
assert len(adapter._voice_receivers) == 0
assert len(adapter._voice_listen_tasks) == 0
assert len(adapter._voice_timeout_tasks) == 0
# =====================================================================
# Discord Voice Channel Flow Tests
# =====================================================================
@pytest.mark.skipif(
importlib.util.find_spec("nacl") is None,
reason="PyNaCl not installed",
)
class TestVoiceReception:
"""Audio reception: SSRC mapping, DAVE passthrough, buffer lifecycle."""
@staticmethod
def _make_receiver(allowed_ids=None, members=None, dave=False, bot_id=9999):
from gateway.platforms.discord import VoiceReceiver
vc = MagicMock()
vc._connection.secret_key = [0] * 32
vc._connection.dave_session = MagicMock() if dave else None
vc._connection.ssrc = bot_id
vc._connection.add_socket_listener = MagicMock()
vc._connection.remove_socket_listener = MagicMock()
vc._connection.hook = None
vc.user = SimpleNamespace(id=bot_id)
vc.channel = MagicMock()
vc.channel.members = members or []
receiver = VoiceReceiver(vc, allowed_user_ids=allowed_ids)
return receiver
@staticmethod
def _fill_buffer(receiver, ssrc, duration_s=1.0, age_s=3.0):
"""Add PCM data to buffer. 48kHz stereo 16-bit = 192000 bytes/sec."""
size = int(192000 * duration_s)
receiver._buffers[ssrc] = bytearray(b"\x00" * size)
receiver._last_packet_time[ssrc] = time.monotonic() - age_s
# -- Known SSRC (normal flow) --
def test_known_ssrc_returns_completed(self):
receiver = self._make_receiver()
receiver.start()
receiver.map_ssrc(100, 42)
self._fill_buffer(receiver, 100)
completed = receiver.check_silence()
assert len(completed) == 1
assert completed[0][0] == 42
assert len(receiver._buffers[100]) == 0 # cleared
def test_known_ssrc_short_buffer_ignored(self):
receiver = self._make_receiver()
receiver.start()
receiver.map_ssrc(100, 42)
self._fill_buffer(receiver, 100, duration_s=0.1) # too short
completed = receiver.check_silence()
assert len(completed) == 0
def test_known_ssrc_recent_audio_waits(self):
receiver = self._make_receiver()
receiver.start()
receiver.map_ssrc(100, 42)
self._fill_buffer(receiver, 100, age_s=0.0) # just arrived
completed = receiver.check_silence()
assert len(completed) == 0
# -- Unknown SSRC + DAVE passthrough --
def test_unknown_ssrc_no_automap_no_completed(self):
"""Unknown SSRC, no members to infer — buffer cleared, not returned."""
receiver = self._make_receiver(dave=True, members=[])
receiver.start()
self._fill_buffer(receiver, 100)
completed = receiver.check_silence()
assert len(completed) == 0
assert len(receiver._buffers[100]) == 0
def test_unknown_ssrc_late_speaking_event(self):
"""Audio buffered before SPEAKING → SPEAKING maps → next check returns it."""
receiver = self._make_receiver(dave=True)
receiver.start()
self._fill_buffer(receiver, 100, age_s=0.0) # still receiving
# No user yet
assert receiver.check_silence() == []
# SPEAKING event arrives
receiver.map_ssrc(100, 42)
# Silence kicks in
receiver._last_packet_time[100] = time.monotonic() - 3.0
completed = receiver.check_silence()
assert len(completed) == 1
assert completed[0][0] == 42
# -- SSRC auto-mapping --
def test_automap_single_allowed_user(self):
members = [
SimpleNamespace(id=9999, name="Bot"),
SimpleNamespace(id=42, name="Alice"),
]
receiver = self._make_receiver(allowed_ids={"42"}, members=members)
receiver.start()
self._fill_buffer(receiver, 100)
completed = receiver.check_silence()
assert len(completed) == 1
assert completed[0][0] == 42
assert receiver._ssrc_to_user[100] == 42
def test_automap_multiple_allowed_users_no_map(self):
members = [
SimpleNamespace(id=9999, name="Bot"),
SimpleNamespace(id=42, name="Alice"),
SimpleNamespace(id=43, name="Bob"),
]
receiver = self._make_receiver(allowed_ids={"42", "43"}, members=members)
receiver.start()
self._fill_buffer(receiver, 100)
completed = receiver.check_silence()
assert len(completed) == 0
def test_automap_no_allowlist_single_member(self):
"""No allowed_user_ids → sole non-bot member inferred."""
members = [
SimpleNamespace(id=9999, name="Bot"),
SimpleNamespace(id=42, name="Alice"),
]
receiver = self._make_receiver(allowed_ids=None, members=members)
receiver.start()
self._fill_buffer(receiver, 100)
completed = receiver.check_silence()
assert len(completed) == 1
assert completed[0][0] == 42
def test_automap_unallowed_user_rejected(self):
"""User in channel but not in allowed list — not mapped."""
members = [
SimpleNamespace(id=9999, name="Bot"),
SimpleNamespace(id=42, name="Alice"),
]
receiver = self._make_receiver(allowed_ids={"99"}, members=members)
receiver.start()
self._fill_buffer(receiver, 100)
completed = receiver.check_silence()
assert len(completed) == 0
def test_automap_only_bot_in_channel(self):
"""Only bot in channel — no one to map to."""
members = [SimpleNamespace(id=9999, name="Bot")]
receiver = self._make_receiver(allowed_ids=None, members=members)
receiver.start()
self._fill_buffer(receiver, 100)
completed = receiver.check_silence()
assert len(completed) == 0
def test_automap_persists_across_calls(self):
"""Auto-mapped SSRC stays mapped for subsequent checks."""
members = [
SimpleNamespace(id=9999, name="Bot"),
SimpleNamespace(id=42, name="Alice"),
]
receiver = self._make_receiver(allowed_ids={"42"}, members=members)
receiver.start()
self._fill_buffer(receiver, 100)
receiver.check_silence()
assert receiver._ssrc_to_user[100] == 42
# Second utterance — should use cached mapping
self._fill_buffer(receiver, 100)
completed = receiver.check_silence()
assert len(completed) == 1
assert completed[0][0] == 42
# -- Stale buffer cleanup --
def test_stale_unknown_buffer_discarded(self):
"""Buffer with no user and very old timestamp is discarded."""
receiver = self._make_receiver()
receiver.start()
receiver._buffers[200] = bytearray(b"\x00" * 100)
receiver._last_packet_time[200] = time.monotonic() - 10.0
receiver.check_silence()
assert 200 not in receiver._buffers
# -- Pause / resume (echo prevention) --
def test_paused_receiver_ignores_packets(self):
receiver = self._make_receiver()
receiver.start()
receiver.pause()
receiver._on_packet(b"\x00" * 100)
assert len(receiver._buffers) == 0
def test_resumed_receiver_accepts_packets(self):
receiver = self._make_receiver()
receiver.start()
receiver.pause()
receiver.resume()
assert receiver._paused is False
# -- _on_packet DAVE passthrough behavior --
def _make_receiver_with_nacl(self, dave_session=None, mapped_ssrcs=None):
"""Create a receiver that can process _on_packet with mocked NaCl + Opus."""
from gateway.platforms.discord import VoiceReceiver
vc = MagicMock()
vc._connection.secret_key = [0] * 32
vc._connection.dave_session = dave_session
vc._connection.ssrc = 9999
vc._connection.add_socket_listener = MagicMock()
vc._connection.remove_socket_listener = MagicMock()
vc._connection.hook = None
vc.user = SimpleNamespace(id=9999)
vc.channel = MagicMock()
vc.channel.members = []
receiver = VoiceReceiver(vc)
receiver.start()
# Pre-map SSRCs if provided
if mapped_ssrcs:
for ssrc, uid in mapped_ssrcs.items():
receiver.map_ssrc(ssrc, uid)
return receiver
@staticmethod
def _build_rtp_packet(ssrc=100, seq=1, timestamp=960):
"""Build a minimal valid RTP packet for _on_packet.
We need: RTP header (12 bytes) + encrypted payload + 4-byte nonce.
NaCl decrypt is mocked so payload content doesn't matter.
"""
import struct
# RTP header: version=2, payload_type=0x78, no extension, no CSRC
header = struct.pack(">BBHII", 0x80, 0x78, seq, timestamp, ssrc)
# Fake encrypted payload (NaCl will be mocked) + 4 byte nonce
payload = b"\x00" * 20 + b"\x00\x00\x00\x01"
return header + payload
def _inject_mock_decoder(self, receiver, ssrc):
"""Pre-inject a mock Opus decoder for the given SSRC."""
mock_decoder = MagicMock()
mock_decoder.decode.return_value = b"\x00" * 3840
receiver._decoders[ssrc] = mock_decoder
return mock_decoder
    def test_on_packet_dave_known_user_decrypt_ok(self):
        """Known SSRC + DAVE decrypt success → audio buffered."""
        dave = MagicMock()
        dave.decrypt.return_value = b"\xf8\xff\xfe"
        receiver = self._make_receiver_with_nacl(
            dave_session=dave, mapped_ssrcs={100: 42}
        )
        self._inject_mock_decoder(receiver, 100)
        with patch("nacl.secret.Aead") as mock_aead:
            mock_aead.return_value.decrypt.return_value = b"\xf8\xff\xfe"
            receiver._on_packet(self._build_rtp_packet(ssrc=100))
        assert 100 in receiver._buffers
        assert len(receiver._buffers[100]) > 0
        dave.decrypt.assert_called_once()

    def test_on_packet_dave_unknown_ssrc_passthrough(self):
        """Unknown SSRC + DAVE → skip DAVE, attempt Opus decode (passthrough)."""
        dave = MagicMock()
        receiver = self._make_receiver_with_nacl(dave_session=dave)
        self._inject_mock_decoder(receiver, 100)
        with patch("nacl.secret.Aead") as mock_aead:
            mock_aead.return_value.decrypt.return_value = b"\xf8\xff\xfe"
            receiver._on_packet(self._build_rtp_packet(ssrc=100))
        dave.decrypt.assert_not_called()
        assert 100 in receiver._buffers
        assert len(receiver._buffers[100]) > 0

    def test_on_packet_dave_unencrypted_error_passthrough(self):
        """DAVE decrypt 'Unencrypted' error → use data as-is, don't drop."""
        dave = MagicMock()
        dave.decrypt.side_effect = Exception(
            "Failed to decrypt: DecryptionFailed(UnencryptedWhenPassthroughDisabled)"
        )
        receiver = self._make_receiver_with_nacl(
            dave_session=dave, mapped_ssrcs={100: 42}
        )
        self._inject_mock_decoder(receiver, 100)
        with patch("nacl.secret.Aead") as mock_aead:
            mock_aead.return_value.decrypt.return_value = b"\xf8\xff\xfe"
            receiver._on_packet(self._build_rtp_packet(ssrc=100))
        assert 100 in receiver._buffers
        assert len(receiver._buffers[100]) > 0

    def test_on_packet_dave_other_error_drops(self):
        """DAVE decrypt non-Unencrypted error → packet dropped."""
        dave = MagicMock()
        dave.decrypt.side_effect = Exception("KeyRotationFailed")
        receiver = self._make_receiver_with_nacl(
            dave_session=dave, mapped_ssrcs={100: 42}
        )
        with patch("nacl.secret.Aead") as mock_aead:
            mock_aead.return_value.decrypt.return_value = b"\xf8\xff\xfe"
            receiver._on_packet(self._build_rtp_packet(ssrc=100))
        assert len(receiver._buffers.get(100, b"")) == 0

    def test_on_packet_no_dave_direct_decode(self):
        """No DAVE session → decode directly."""
        receiver = self._make_receiver_with_nacl(dave_session=None)
        self._inject_mock_decoder(receiver, 100)
        with patch("nacl.secret.Aead") as mock_aead:
            mock_aead.return_value.decrypt.return_value = b"\xf8\xff\xfe"
            receiver._on_packet(self._build_rtp_packet(ssrc=100))
        assert 100 in receiver._buffers
        assert len(receiver._buffers[100]) > 0

    def test_on_packet_bot_own_ssrc_ignored(self):
        """Bot's own SSRC → dropped (echo prevention)."""
        receiver = self._make_receiver_with_nacl()
        with patch("nacl.secret.Aead"):
            receiver._on_packet(self._build_rtp_packet(ssrc=9999))
        assert len(receiver._buffers) == 0

    def test_on_packet_multiple_ssrcs_separate_buffers(self):
        """Different SSRCs → separate buffers."""
        receiver = self._make_receiver_with_nacl(dave_session=None)
        self._inject_mock_decoder(receiver, 100)
        self._inject_mock_decoder(receiver, 200)
        with patch("nacl.secret.Aead") as mock_aead:
            mock_aead.return_value.decrypt.return_value = b"\xf8\xff\xfe"
            receiver._on_packet(self._build_rtp_packet(ssrc=100))
            receiver._on_packet(self._build_rtp_packet(ssrc=200))
        assert 100 in receiver._buffers
        assert 200 in receiver._buffers
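As a quick sanity sketch of the header layout these tests pack (hypothetical `parse_rtp_header` helper, not part of the adapter), the same `struct` format string round-trips the fields:

```python
import struct

def parse_rtp_header(packet: bytes):
    # ">BBHII": flags byte, payload type, then big-endian seq, timestamp, SSRC
    flags, ptype, seq, timestamp, ssrc = struct.unpack(">BBHII", packet[:12])
    return {"version": flags >> 6,  # RTP version lives in the top two bits
            "payload_type": ptype, "seq": seq,
            "timestamp": timestamp, "ssrc": ssrc}

header = struct.pack(">BBHII", 0x80, 0x78, 7, 960 * 7, 100)
fields = parse_rtp_header(header + b"\x00" * 24)
# version 2 (0x80 >> 6), Discord's 0x78 payload type, and seq/timestamp/ssrc round-trip
```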
class TestVoiceTTSPlayback:
    """TTS playback: play_tts in VC, dedup, fallback."""

    @staticmethod
    def _make_discord_adapter():
        from gateway.platforms.discord import DiscordAdapter
        from gateway.config import PlatformConfig, Platform

        config = PlatformConfig(enabled=True, extra={})
        config.token = "fake-token"
        adapter = object.__new__(DiscordAdapter)
        adapter.platform = Platform.DISCORD
        adapter.config = config
        adapter._voice_clients = {}
        adapter._voice_text_channels = {}
        adapter._voice_receivers = {}
        return adapter

    # -- play_tts behavior --

    @pytest.mark.asyncio
    async def test_play_tts_plays_in_vc(self):
        """play_tts calls play_in_voice_channel when bot is in VC."""
        adapter = self._make_discord_adapter()
        mock_vc = MagicMock()
        mock_vc.is_connected.return_value = True
        adapter._voice_clients[111] = mock_vc
        adapter._voice_text_channels[111] = 123
        played = []

        async def fake_play(gid, path):
            played.append((gid, path))
            return True

        adapter.play_in_voice_channel = fake_play
        result = await adapter.play_tts(chat_id="123", audio_path="/tmp/tts.ogg")
        assert result.success is True
        assert played == [(111, "/tmp/tts.ogg")]

    @pytest.mark.asyncio
    async def test_play_tts_fallback_when_not_in_vc(self):
        """play_tts sends as file attachment when bot is not in VC."""
        adapter = self._make_discord_adapter()
        from gateway.platforms.base import SendResult

        adapter.send_voice = AsyncMock(return_value=SendResult(success=False, error="no client"))
        result = await adapter.play_tts(chat_id="123", audio_path="/tmp/tts.ogg")
        assert result.success is False
        adapter.send_voice.assert_called_once()

    @pytest.mark.asyncio
    async def test_play_tts_wrong_channel_no_match(self):
        """play_tts doesn't match if chat_id is for a different channel."""
        adapter = self._make_discord_adapter()
        mock_vc = MagicMock()
        mock_vc.is_connected.return_value = True
        adapter._voice_clients[111] = mock_vc
        adapter._voice_text_channels[111] = 123
        from gateway.platforms.base import SendResult

        adapter.send_voice = AsyncMock(return_value=SendResult(success=True))
        # Different chat_id — shouldn't match VC
        result = await adapter.play_tts(chat_id="999", audio_path="/tmp/tts.ogg")
        adapter.send_voice.assert_called_once()
    # -- Runner dedup --

    @staticmethod
    def _make_runner():
        from gateway.run import GatewayRunner

        runner = object.__new__(GatewayRunner)
        runner._voice_mode = {}
        runner.adapters = {}
        return runner

    def _call_should_reply(self, runner, voice_mode, msg_type, response="Hello", agent_msgs=None):
        from gateway.platforms.base import MessageType, MessageEvent, SessionSource
        from gateway.config import Platform

        runner._voice_mode["ch1"] = voice_mode
        source = SessionSource(
            platform=Platform.DISCORD, chat_id="ch1",
            user_id="1", user_name="test", chat_type="channel",
        )
        event = MessageEvent(source=source, text="test", message_type=msg_type)
        return runner._should_send_voice_reply(event, response, agent_msgs or [])

    def test_voice_input_runner_skips(self):
        """Voice input: runner skips — base adapter handles via play_tts."""
        from gateway.platforms.base import MessageType

        runner = self._make_runner()
        assert self._call_should_reply(runner, "all", MessageType.VOICE) is False

    def test_text_input_voice_all_runner_fires(self):
        """Text input + voice_mode=all: runner generates TTS."""
        from gateway.platforms.base import MessageType

        runner = self._make_runner()
        assert self._call_should_reply(runner, "all", MessageType.TEXT) is True

    def test_text_input_voice_off_no_tts(self):
        """Text input + voice_mode=off: no TTS."""
        from gateway.platforms.base import MessageType

        runner = self._make_runner()
        assert self._call_should_reply(runner, "off", MessageType.TEXT) is False

    def test_text_input_voice_only_no_tts(self):
        """Text input + voice_mode=voice_only: no TTS for text."""
        from gateway.platforms.base import MessageType

        runner = self._make_runner()
        assert self._call_should_reply(runner, "voice_only", MessageType.TEXT) is False

    def test_error_response_no_tts(self):
        """Error response: no TTS regardless of voice_mode."""
        from gateway.platforms.base import MessageType

        runner = self._make_runner()
        assert self._call_should_reply(runner, "all", MessageType.TEXT, response="Error: boom") is False

    def test_empty_response_no_tts(self):
        """Empty response: no TTS."""
        from gateway.platforms.base import MessageType

        runner = self._make_runner()
        assert self._call_should_reply(runner, "all", MessageType.TEXT, response="") is False

    def test_agent_tts_tool_dedup(self):
        """Agent already called text_to_speech tool: runner skips."""
        from gateway.platforms.base import MessageType

        runner = self._make_runner()
        agent_msgs = [{"role": "assistant", "tool_calls": [
            {"id": "1", "type": "function", "function": {"name": "text_to_speech", "arguments": "{}"}}
        ]}]
        assert self._call_should_reply(runner, "all", MessageType.TEXT, agent_msgs=agent_msgs) is False
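The dedup check in the last test amounts to scanning the agent transcript for a prior `text_to_speech` tool call. A minimal sketch of that scan (hypothetical `agent_already_spoke` helper; the runner's real check may differ):

```python
def agent_already_spoke(agent_msgs) -> bool:
    # Walk assistant messages; a text_to_speech tool call means TTS already happened
    for msg in agent_msgs:
        for call in msg.get("tool_calls") or []:
            if call.get("function", {}).get("name") == "text_to_speech":
                return True
    return False

msgs = [{"role": "assistant", "tool_calls": [
    {"id": "1", "type": "function", "function": {"name": "text_to_speech", "arguments": "{}"}}
]}]
spoke = agent_already_spoke(msgs)          # True: the agent already produced audio
silent = agent_already_spoke([{"role": "assistant"}])  # False: no tool calls at all
```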
class TestUDPKeepalive:
    """UDP keepalive prevents Discord from dropping the voice session."""

    def test_keepalive_interval_is_reasonable(self):
        from gateway.platforms.discord import DiscordAdapter

        interval = DiscordAdapter._KEEPALIVE_INTERVAL
        assert 5 <= interval <= 30, f"Keepalive interval {interval}s should be between 5-30s"

    @pytest.mark.asyncio
    async def test_keepalive_sends_silence_frame(self):
        """Listen loop sends silence frame via send_packet after interval."""
        from gateway.platforms.discord import DiscordAdapter
        from gateway.config import PlatformConfig, Platform

        config = PlatformConfig(enabled=True, extra={})
        config.token = "fake"
        adapter = object.__new__(DiscordAdapter)
        adapter.platform = Platform.DISCORD
        adapter.config = config
        adapter._voice_clients = {}
        adapter._voice_text_channels = {}
        adapter._voice_receivers = {}
        adapter._voice_listen_tasks = {}
        # Mock VC and receiver
        mock_vc = MagicMock()
        mock_vc.is_connected.return_value = True
        mock_conn = MagicMock()
        adapter._voice_clients[111] = mock_vc
        mock_vc._connection = mock_conn
        from gateway.platforms.discord import VoiceReceiver

        mock_receiver_vc = MagicMock()
        mock_receiver_vc._connection.secret_key = [0] * 32
        mock_receiver_vc._connection.dave_session = None
        mock_receiver_vc._connection.ssrc = 9999
        mock_receiver_vc._connection.add_socket_listener = MagicMock()
        mock_receiver_vc._connection.remove_socket_listener = MagicMock()
        mock_receiver_vc._connection.hook = None
        receiver = VoiceReceiver(mock_receiver_vc)
        receiver.start()
        adapter._voice_receivers[111] = receiver
        # Set keepalive interval very short for test
        original_interval = DiscordAdapter._KEEPALIVE_INTERVAL
        DiscordAdapter._KEEPALIVE_INTERVAL = 0.1
        try:
            # Run listen loop briefly
            import asyncio

            loop_task = asyncio.create_task(adapter._voice_listen_loop(111))
            await asyncio.sleep(0.3)
            receiver._running = False  # stop loop
            await asyncio.sleep(0.1)
            loop_task.cancel()
            try:
                await loop_task
            except asyncio.CancelledError:
                pass
            # send_packet should have been called with silence frame
            mock_conn.send_packet.assert_called_with(b'\xf8\xff\xfe')
        finally:
            DiscordAdapter._KEEPALIVE_INTERVAL = original_interval

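The keepalive pattern the test above exercises — send an Opus silence frame whenever the interval has elapsed since the last send — can be sketched roughly like this (a standalone illustration with invented names, not the adapter's `_voice_listen_loop`):

```python
import asyncio
import time

SILENCE_FRAME = b"\xf8\xff\xfe"  # the Opus silence frame asserted in the test above

async def keepalive_loop(send_packet, interval=0.05, run_for=0.2):
    # Poll on a short tick; emit a silence frame once `interval` seconds
    # have passed since the previous send, until `run_for` elapses.
    last_sent = time.monotonic()
    deadline = last_sent + run_for
    while time.monotonic() < deadline:
        now = time.monotonic()
        if now - last_sent >= interval:
            send_packet(SILENCE_FRAME)
            last_sent = now
        await asyncio.sleep(0.01)

sent = []
asyncio.run(keepalive_loop(sent.append))
# with a 0.05s interval over ~0.2s, several silence frames accumulate
```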

@@ -0,0 +1,70 @@
import importlib
import os
import sys
from pathlib import Path

from hermes_cli.env_loader import load_hermes_dotenv


def test_user_env_overrides_stale_shell_values(tmp_path, monkeypatch):
    home = tmp_path / "hermes"
    home.mkdir()
    env_file = home / ".env"
    env_file.write_text("OPENAI_BASE_URL=https://new.example/v1\n", encoding="utf-8")
    monkeypatch.setenv("OPENAI_BASE_URL", "https://old.example/v1")
    loaded = load_hermes_dotenv(hermes_home=home)
    assert loaded == [env_file]
    assert os.getenv("OPENAI_BASE_URL") == "https://new.example/v1"


def test_project_env_overrides_stale_shell_values_when_user_env_missing(tmp_path, monkeypatch):
    home = tmp_path / "hermes"
    project_env = tmp_path / ".env"
    project_env.write_text("OPENAI_BASE_URL=https://project.example/v1\n", encoding="utf-8")
    monkeypatch.setenv("OPENAI_BASE_URL", "https://old.example/v1")
    loaded = load_hermes_dotenv(hermes_home=home, project_env=project_env)
    assert loaded == [project_env]
    assert os.getenv("OPENAI_BASE_URL") == "https://project.example/v1"


def test_user_env_takes_precedence_over_project_env(tmp_path, monkeypatch):
    home = tmp_path / "hermes"
    home.mkdir()
    user_env = home / ".env"
    project_env = tmp_path / ".env"
    user_env.write_text("OPENAI_BASE_URL=https://user.example/v1\n", encoding="utf-8")
    project_env.write_text("OPENAI_BASE_URL=https://project.example/v1\nOPENAI_API_KEY=project-key\n", encoding="utf-8")
    monkeypatch.setenv("OPENAI_BASE_URL", "https://old.example/v1")
    monkeypatch.delenv("OPENAI_API_KEY", raising=False)
    loaded = load_hermes_dotenv(hermes_home=home, project_env=project_env)
    assert loaded == [user_env, project_env]
    assert os.getenv("OPENAI_BASE_URL") == "https://user.example/v1"
    assert os.getenv("OPENAI_API_KEY") == "project-key"


def test_main_import_applies_user_env_over_shell_values(tmp_path, monkeypatch):
    home = tmp_path / "hermes"
    home.mkdir()
    (home / ".env").write_text(
        "OPENAI_BASE_URL=https://new.example/v1\nHERMES_INFERENCE_PROVIDER=custom\n",
        encoding="utf-8",
    )
    monkeypatch.setenv("HERMES_HOME", str(home))
    monkeypatch.setenv("OPENAI_BASE_URL", "https://old.example/v1")
    monkeypatch.setenv("HERMES_INFERENCE_PROVIDER", "openrouter")
    sys.modules.pop("hermes_cli.main", None)
    importlib.import_module("hermes_cli.main")
    assert os.getenv("OPENAI_BASE_URL") == "https://new.example/v1"
    assert os.getenv("HERMES_INFERENCE_PROVIDER") == "custom"
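The precedence these tests pin down — user `.env` over project `.env` over stale shell values — reduces to an ordered dict merge. A minimal sketch (hypothetical `merge_env`, not the real `load_hermes_dotenv`):

```python
def merge_env(shell: dict, project: dict, user: dict) -> dict:
    merged = dict(shell)
    merged.update(project)  # project .env overrides stale shell values...
    merged.update(user)     # ...and the user ~/.env overrides both
    return merged

env = merge_env(
    shell={"OPENAI_BASE_URL": "https://old.example/v1"},
    project={"OPENAI_BASE_URL": "https://project.example/v1",
             "OPENAI_API_KEY": "project-key"},
    user={"OPENAI_BASE_URL": "https://user.example/v1"},
)
# OPENAI_BASE_URL comes from the user file; OPENAI_API_KEY falls through from the project file
```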


@@ -0,0 +1,64 @@
import sys


def test_sessions_delete_accepts_unique_id_prefix(monkeypatch, capsys):
    import hermes_cli.main as main_mod
    import hermes_state

    captured = {}

    class FakeDB:
        def resolve_session_id(self, session_id):
            captured["resolved_from"] = session_id
            return "20260315_092437_c9a6ff"

        def delete_session(self, session_id):
            captured["deleted"] = session_id
            return True

        def close(self):
            captured["closed"] = True

    monkeypatch.setattr(hermes_state, "SessionDB", lambda: FakeDB())
    monkeypatch.setattr(
        sys,
        "argv",
        ["hermes", "sessions", "delete", "20260315_092437_c9a6", "--yes"],
    )
    main_mod.main()
    output = capsys.readouterr().out
    assert captured == {
        "resolved_from": "20260315_092437_c9a6",
        "deleted": "20260315_092437_c9a6ff",
        "closed": True,
    }
    assert "Deleted session '20260315_092437_c9a6ff'." in output


def test_sessions_delete_reports_not_found_when_prefix_is_unknown(monkeypatch, capsys):
    import hermes_cli.main as main_mod
    import hermes_state

    class FakeDB:
        def resolve_session_id(self, session_id):
            return None

        def delete_session(self, session_id):
            raise AssertionError("delete_session should not be called when resolution fails")

        def close(self):
            pass

    monkeypatch.setattr(hermes_state, "SessionDB", lambda: FakeDB())
    monkeypatch.setattr(
        sys,
        "argv",
        ["hermes", "sessions", "delete", "missing-prefix", "--yes"],
    )
    main_mod.main()
    output = capsys.readouterr().out
    assert "Session 'missing-prefix' not found." in output


@@ -1,6 +1,13 @@
"""Tests for hermes_cli.tools_config platform tool persistence."""
from hermes_cli.tools_config import _get_platform_tools, _platform_toolset_summary, _toolset_has_keys
from unittest.mock import patch

from hermes_cli.tools_config import (
    _get_platform_tools,
    _platform_toolset_summary,
    _save_platform_tools,
    _toolset_has_keys,
)


def test_get_platform_tools_uses_default_when_platform_not_configured():
@@ -31,7 +38,7 @@ def test_platform_toolset_summary_uses_explicit_platform_list():
def test_toolset_has_keys_for_vision_accepts_codex_auth(tmp_path, monkeypatch):
    monkeypatch.setenv("HERMES_HOME", str(tmp_path))
    (tmp_path / "auth.json").write_text(
        '{"active_provider":"openai-codex","providers":{"openai-codex":{"tokens":{"access_token":"codex-access-token","refresh_token":"codex-refresh-token"}}}}'
        '{"active_provider":"openai-codex","providers":{"openai-codex":{"tokens":{"access_token": "codex-...oken","refresh_token": "codex-...oken"}}}}'
    )
    monkeypatch.delenv("OPENROUTER_API_KEY", raising=False)
    monkeypatch.delenv("OPENAI_BASE_URL", raising=False)
@@ -40,3 +47,56 @@ def test_toolset_has_keys_for_vision_accepts_codex_auth(tmp_path, monkeypatch):
    monkeypatch.delenv("CONTEXT_VISION_PROVIDER", raising=False)
    assert _toolset_has_keys("vision") is True


def test_save_platform_tools_preserves_mcp_server_names():
    """Ensure MCP server names are preserved when saving platform tools.

    Regression test for https://github.com/NousResearch/hermes-agent/issues/1247
    """
    config = {
        "platform_toolsets": {
            "cli": ["web", "terminal", "time", "github", "custom-mcp-server"]
        }
    }
    new_selection = {"web", "browser"}
    with patch("hermes_cli.tools_config.save_config"):
        _save_platform_tools(config, "cli", new_selection)
    saved_toolsets = config["platform_toolsets"]["cli"]
    assert "time" in saved_toolsets
    assert "github" in saved_toolsets
    assert "custom-mcp-server" in saved_toolsets
    assert "web" in saved_toolsets
    assert "browser" in saved_toolsets
    assert "terminal" not in saved_toolsets


def test_save_platform_tools_handles_empty_existing_config():
    """Saving platform tools works when no existing config exists."""
    config = {}
    with patch("hermes_cli.tools_config.save_config"):
        _save_platform_tools(config, "telegram", {"web", "terminal"})
    saved_toolsets = config["platform_toolsets"]["telegram"]
    assert "web" in saved_toolsets
    assert "terminal" in saved_toolsets


def test_save_platform_tools_handles_invalid_existing_config():
    """Saving platform tools works when existing config is not a list."""
    config = {
        "platform_toolsets": {
            "cli": "invalid-string-value"
        }
    }
    with patch("hermes_cli.tools_config.save_config"):
        _save_platform_tools(config, "cli", {"web"})
    saved_toolsets = config["platform_toolsets"]["cli"]
    assert "web" in saved_toolsets
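One way to produce the behavior the regression test asserts — entries not offered in the selection dialog (like MCP server names, but also `time` and `github` here) survive the save, while offered toolsets are replaced by the user's picks — is sketched below. This is a guess at the shape of the logic with invented names (`merge_selection`, `offered`); the real `_save_platform_tools` may work differently:

```python
def merge_selection(existing: list, offered: set, selected: set) -> list:
    # Keep every entry the dialog did not offer (MCP server names, hidden toolsets),
    # then apply the user's picks among the offered ones.
    kept = [t for t in existing if t not in offered]
    return kept + sorted(selected)

merged = merge_selection(
    existing=["web", "terminal", "time", "github", "custom-mcp-server"],
    offered={"web", "terminal", "browser"},   # what this save dialog displayed (assumed)
    selected={"web", "browser"},
)
# "custom-mcp-server", "time", "github" survive; "terminal" was offered but deselected
```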


@@ -0,0 +1,611 @@
"""Integration tests for Discord voice channel audio flow.

Uses real NaCl encryption and Opus codec (no mocks for crypto/codec).
Does NOT require a Discord connection — tests the VoiceReceiver
packet processing pipeline end-to-end.

Requires: PyNaCl>=1.5.0, discord.py[voice] (opus codec)
"""
import struct
import time

import pytest

pytestmark = pytest.mark.integration

# Skip entire module if voice deps are missing
pytest.importorskip("nacl.secret", reason="PyNaCl required for voice integration tests")
discord = pytest.importorskip("discord", reason="discord.py required for voice integration tests")

import nacl.secret

try:
    if not discord.opus.is_loaded():
        import ctypes.util

        opus_path = ctypes.util.find_library("opus")
        if not opus_path:
            import sys

            for p in ("/opt/homebrew/lib/libopus.dylib", "/usr/local/lib/libopus.dylib"):
                import os

                if os.path.isfile(p):
                    opus_path = p
                    break
        if opus_path:
            discord.opus.load_opus(opus_path)
    OPUS_AVAILABLE = discord.opus.is_loaded()
except Exception:
    OPUS_AVAILABLE = False

from types import SimpleNamespace
from unittest.mock import MagicMock

from gateway.platforms.discord import VoiceReceiver

# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------


def _make_secret_key():
    """Generate a random 32-byte key."""
    import os

    return os.urandom(32)


def _build_encrypted_rtp_packet(secret_key, opus_payload, ssrc=100, seq=1, timestamp=960):
    """Build a real NaCl-encrypted RTP packet matching Discord's format.

    Format: RTP header (12 bytes) + encrypted(opus) + 4-byte nonce
    Encryption: aead_xchacha20_poly1305 with RTP header as AAD.
    """
    # RTP header: version=2, payload_type=0x78, no extension, no CSRC
    header = struct.pack(">BBHII", 0x80, 0x78, seq, timestamp, ssrc)
    # Encrypt with NaCl AEAD
    box = nacl.secret.Aead(secret_key)
    nonce_counter = struct.pack(">I", seq)  # 4-byte counter as nonce seed
    # Full 24-byte nonce: counter in first 4 bytes, rest zeros
    full_nonce = nonce_counter + b'\x00' * 20
    enc_msg = box.encrypt(opus_payload, header, full_nonce)
    ciphertext = enc_msg.ciphertext  # without nonce prefix
    # Discord format: header + ciphertext + 4-byte nonce
    return header + ciphertext + nonce_counter
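The builder's output splits back apart by fixed offsets, per the layout its docstring describes: 12-byte header, ciphertext in the middle, 4-byte nonce counter at the tail. A stdlib-only sketch (hypothetical `split_discord_packet`; the receiver's parsing code is not shown here):

```python
import struct

def split_discord_packet(packet: bytes):
    # Layout assumed from the helper above: header + ciphertext + trailing nonce counter
    header, rest = packet[:12], packet[12:]
    ciphertext, nonce_counter = rest[:-4], rest[-4:]
    return header, ciphertext, nonce_counter

# 3-byte Opus silence + 16-byte Poly1305 tag would be 19 bytes; illustrative stand-in
fake_ciphertext = b"\xaa" * 19
packet = (struct.pack(">BBHII", 0x80, 0x78, 1, 960, 100)
          + fake_ciphertext
          + struct.pack(">I", 1))
header, ct, nonce = split_discord_packet(packet)
# header is the 12 RTP bytes, nonce is the trailing 4-byte counter
```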
def _make_voice_receiver(secret_key, dave_session=None, bot_ssrc=9999,
                         allowed_user_ids=None, members=None):
    """Create a VoiceReceiver with real secret key."""
    vc = MagicMock()
    vc._connection.secret_key = list(secret_key)
    vc._connection.dave_session = dave_session
    vc._connection.ssrc = bot_ssrc
    vc._connection.add_socket_listener = MagicMock()
    vc._connection.remove_socket_listener = MagicMock()
    vc._connection.hook = None
    vc.user = SimpleNamespace(id=bot_ssrc)
    vc.channel = MagicMock()
    vc.channel.members = members or []
    receiver = VoiceReceiver(vc, allowed_user_ids=allowed_user_ids)
    receiver.start()
    return receiver
# ---------------------------------------------------------------------------
# Tests
# ---------------------------------------------------------------------------


class TestRealNaClDecrypt:
    """End-to-end: real NaCl encrypt → _on_packet decrypt → buffer."""

    def test_valid_encrypted_packet_buffered(self):
        """Real NaCl encrypted packet → decrypted → buffered."""
        key = _make_secret_key()
        opus_silence = b'\xf8\xff\xfe'
        receiver = _make_voice_receiver(key)
        packet = _build_encrypted_rtp_packet(key, opus_silence, ssrc=100)
        receiver._on_packet(packet)
        assert 100 in receiver._buffers
        assert len(receiver._buffers[100]) > 0

    def test_wrong_key_packet_dropped(self):
        """Packet encrypted with wrong key → NaCl fails → not buffered."""
        real_key = _make_secret_key()
        wrong_key = _make_secret_key()
        opus_silence = b'\xf8\xff\xfe'
        receiver = _make_voice_receiver(real_key)
        packet = _build_encrypted_rtp_packet(wrong_key, opus_silence, ssrc=100)
        receiver._on_packet(packet)
        assert len(receiver._buffers.get(100, b"")) == 0

    def test_bot_ssrc_ignored(self):
        """Packet from bot's own SSRC → ignored."""
        key = _make_secret_key()
        receiver = _make_voice_receiver(key, bot_ssrc=9999)
        packet = _build_encrypted_rtp_packet(key, b'\xf8\xff\xfe', ssrc=9999)
        receiver._on_packet(packet)
        assert len(receiver._buffers) == 0

    def test_multiple_packets_accumulate(self):
        """Multiple valid packets → buffer grows."""
        key = _make_secret_key()
        receiver = _make_voice_receiver(key)
        for seq in range(1, 6):
            packet = _build_encrypted_rtp_packet(
                key, b'\xf8\xff\xfe', ssrc=100, seq=seq, timestamp=960 * seq
            )
            receiver._on_packet(packet)
        assert 100 in receiver._buffers
        buf_size = len(receiver._buffers[100])
        assert buf_size > 0, "Multiple packets should accumulate in buffer"

    def test_different_ssrcs_separate_buffers(self):
        """Packets from different SSRCs → separate buffers."""
        key = _make_secret_key()
        receiver = _make_voice_receiver(key)
        for ssrc in [100, 200, 300]:
            packet = _build_encrypted_rtp_packet(key, b'\xf8\xff\xfe', ssrc=ssrc)
            receiver._on_packet(packet)
        assert len(receiver._buffers) == 3
        for ssrc in [100, 200, 300]:
            assert ssrc in receiver._buffers
class TestRealNaClWithDAVE:
    """NaCl decrypt + DAVE passthrough scenarios with real crypto."""

    def test_dave_unknown_ssrc_passthrough(self):
        """DAVE enabled but SSRC unknown → skip DAVE, buffer audio."""
        key = _make_secret_key()
        dave = MagicMock()  # DAVE session present but SSRC not mapped
        receiver = _make_voice_receiver(key, dave_session=dave)
        packet = _build_encrypted_rtp_packet(key, b'\xf8\xff\xfe', ssrc=100)
        receiver._on_packet(packet)
        # DAVE decrypt not called (SSRC unknown)
        dave.decrypt.assert_not_called()
        # Audio still buffered via passthrough
        assert 100 in receiver._buffers
        assert len(receiver._buffers[100]) > 0

    def test_dave_unencrypted_error_passthrough(self):
        """DAVE raises 'Unencrypted' → use NaCl-decrypted data as-is."""
        key = _make_secret_key()
        dave = MagicMock()
        dave.decrypt.side_effect = Exception(
            "DecryptionFailed(UnencryptedWhenPassthroughDisabled)"
        )
        receiver = _make_voice_receiver(key, dave_session=dave)
        receiver.map_ssrc(100, 42)
        packet = _build_encrypted_rtp_packet(key, b'\xf8\xff\xfe', ssrc=100)
        receiver._on_packet(packet)
        # DAVE was called but failed → passthrough
        dave.decrypt.assert_called_once()
        assert 100 in receiver._buffers
        assert len(receiver._buffers[100]) > 0

    def test_dave_real_error_drops(self):
        """DAVE raises non-Unencrypted error → packet dropped."""
        key = _make_secret_key()
        dave = MagicMock()
        dave.decrypt.side_effect = Exception("KeyRotationFailed")
        receiver = _make_voice_receiver(key, dave_session=dave)
        receiver.map_ssrc(100, 42)
        packet = _build_encrypted_rtp_packet(key, b'\xf8\xff\xfe', ssrc=100)
        receiver._on_packet(packet)
        assert len(receiver._buffers.get(100, b"")) == 0
class TestFullVoiceFlow:
    """End-to-end: encrypt → receive → buffer → silence detect → complete."""

    def test_single_utterance_flow(self):
        """Encrypt packets → buffer → silence → check_silence returns utterance."""
        key = _make_secret_key()
        receiver = _make_voice_receiver(key)
        receiver.map_ssrc(100, 42)
        # Send enough packets to exceed MIN_SPEECH_DURATION (0.5s).
        # At 48kHz stereo 16-bit, each Opus silence frame decodes to ~3840 bytes;
        # we need 96000 bytes = ~25 frames.
        for seq in range(1, 30):
            packet = _build_encrypted_rtp_packet(
                key, b'\xf8\xff\xfe', ssrc=100, seq=seq, timestamp=960 * seq
            )
            receiver._on_packet(packet)
        # Simulate silence by setting last_packet_time in the past
        receiver._last_packet_time[100] = time.monotonic() - 3.0
        completed = receiver.check_silence()
        assert len(completed) == 1
        user_id, pcm_data = completed[0]
        assert user_id == 42
        assert len(pcm_data) > 0

    def test_utterance_with_ssrc_automap(self):
        """No SPEAKING event → auto-map sole allowed user → utterance processed."""
        key = _make_secret_key()
        members = [
            SimpleNamespace(id=9999, name="Bot"),
            SimpleNamespace(id=42, name="Alice"),
        ]
        receiver = _make_voice_receiver(
            key, allowed_user_ids={"42"}, members=members
        )
        # No map_ssrc call — simulating missing SPEAKING event
        for seq in range(1, 30):
            packet = _build_encrypted_rtp_packet(
                key, b'\xf8\xff\xfe', ssrc=100, seq=seq, timestamp=960 * seq
            )
            receiver._on_packet(packet)
        receiver._last_packet_time[100] = time.monotonic() - 3.0
        completed = receiver.check_silence()
        assert len(completed) == 1
        assert completed[0][0] == 42  # auto-mapped to sole allowed user

    def test_pause_blocks_during_playback(self):
        """Pause receiver → packets ignored → resume → packets accepted."""
        key = _make_secret_key()
        receiver = _make_voice_receiver(key)
        # Pause (echo prevention during TTS playback)
        receiver.pause()
        packet = _build_encrypted_rtp_packet(key, b'\xf8\xff\xfe', ssrc=100)
        receiver._on_packet(packet)
        assert len(receiver._buffers.get(100, b"")) == 0
        # Resume
        receiver.resume()
        receiver._on_packet(packet)
        assert 100 in receiver._buffers
        assert len(receiver._buffers[100]) > 0

    def test_corrupted_packet_ignored(self):
        """Corrupted/truncated packet → silently ignored."""
        key = _make_secret_key()
        receiver = _make_voice_receiver(key)
        # Too short
        receiver._on_packet(b"\x00" * 5)
        assert len(receiver._buffers) == 0
        # Wrong RTP version
        bad_header = struct.pack(">BBHII", 0x00, 0x78, 1, 960, 100)
        receiver._on_packet(bad_header + b"\x00" * 20)
        assert len(receiver._buffers) == 0
        # Wrong payload type
        bad_pt = struct.pack(">BBHII", 0x80, 0x00, 1, 960, 100)
        receiver._on_packet(bad_pt + b"\x00" * 20)
        assert len(receiver._buffers) == 0

    def test_stop_cleans_everything(self):
        """stop() clears all state cleanly."""
        key = _make_secret_key()
        receiver = _make_voice_receiver(key)
        receiver.map_ssrc(100, 42)
        for seq in range(1, 10):
            packet = _build_encrypted_rtp_packet(
                key, b'\xf8\xff\xfe', ssrc=100, seq=seq, timestamp=960 * seq
            )
            receiver._on_packet(packet)
        assert len(receiver._buffers[100]) > 0
        receiver.stop()
        assert receiver._running is False
        assert len(receiver._buffers) == 0
        assert len(receiver._ssrc_to_user) == 0
        assert len(receiver._decoders) == 0
class TestSPEAKINGHook:
    """SPEAKING event hook correctly maps SSRC to user_id."""

    def test_speaking_hook_installed(self):
        """start() installs speaking hook on connection."""
        key = _make_secret_key()
        receiver = _make_voice_receiver(key)
        conn = receiver._vc._connection
        # hook should be set (wrapped)
        assert conn.hook is not None

    def test_map_ssrc_via_speaking(self):
        """SPEAKING op 5 event maps SSRC to user_id."""
        key = _make_secret_key()
        receiver = _make_voice_receiver(key)
        receiver.map_ssrc(500, 12345)
        assert receiver._ssrc_to_user[500] == 12345

    def test_map_ssrc_overwrites(self):
        """New SPEAKING event for same SSRC overwrites old mapping."""
        key = _make_secret_key()
        receiver = _make_voice_receiver(key)
        receiver.map_ssrc(500, 111)
        receiver.map_ssrc(500, 222)
        assert receiver._ssrc_to_user[500] == 222

    def test_speaking_mapped_audio_processed(self):
        """After SSRC is mapped, audio from that SSRC gets correct user_id."""
        key = _make_secret_key()
        receiver = _make_voice_receiver(key)
        receiver.map_ssrc(100, 42)
        for seq in range(1, 30):
            packet = _build_encrypted_rtp_packet(
                key, b'\xf8\xff\xfe', ssrc=100, seq=seq, timestamp=960 * seq
            )
            receiver._on_packet(packet)
        receiver._last_packet_time[100] = time.monotonic() - 3.0
        completed = receiver.check_silence()
        assert len(completed) == 1
        assert completed[0][0] == 42
class TestAuthFiltering:
    """Only allowed users' audio should be processed."""

    def test_allowed_user_audio_processed(self):
        """Allowed user's utterance is returned by check_silence."""
        key = _make_secret_key()
        members = [
            SimpleNamespace(id=9999, name="Bot"),
            SimpleNamespace(id=42, name="Alice"),
        ]
        receiver = _make_voice_receiver(
            key, allowed_user_ids={"42"}, members=members,
        )
        receiver.map_ssrc(100, 42)
        for seq in range(1, 30):
            packet = _build_encrypted_rtp_packet(
                key, b'\xf8\xff\xfe', ssrc=100, seq=seq, timestamp=960 * seq
            )
            receiver._on_packet(packet)
        receiver._last_packet_time[100] = time.monotonic() - 3.0
        completed = receiver.check_silence()
        assert len(completed) == 1
        assert completed[0][0] == 42

    def test_automap_rejects_unallowed_user(self):
        """Auto-map refuses to map SSRC to user not in allowed list."""
        key = _make_secret_key()
        members = [
            SimpleNamespace(id=9999, name="Bot"),
            SimpleNamespace(id=42, name="Alice"),
        ]
        receiver = _make_voice_receiver(
            key, allowed_user_ids={"99"},  # Alice not allowed
            members=members,
        )
        # No map_ssrc — SSRC unknown, auto-map should reject
        for seq in range(1, 30):
            packet = _build_encrypted_rtp_packet(
                key, b'\xf8\xff\xfe', ssrc=100, seq=seq, timestamp=960 * seq
            )
            receiver._on_packet(packet)
        receiver._last_packet_time[100] = time.monotonic() - 3.0
        completed = receiver.check_silence()
        assert len(completed) == 0

    def test_empty_allowlist_allows_all(self):
        """Empty allowed_user_ids means no restriction."""
        key = _make_secret_key()
        members = [
            SimpleNamespace(id=9999, name="Bot"),
            SimpleNamespace(id=42, name="Alice"),
        ]
        receiver = _make_voice_receiver(
            key, allowed_user_ids=None, members=members,
        )
        for seq in range(1, 30):
            packet = _build_encrypted_rtp_packet(
                key, b'\xf8\xff\xfe', ssrc=100, seq=seq, timestamp=960 * seq
            )
            receiver._on_packet(packet)
        receiver._last_packet_time[100] = time.monotonic() - 3.0
        completed = receiver.check_silence()
        # Auto-mapped to sole non-bot member
        assert len(completed) == 1
        assert completed[0][0] == 42
class TestRejoinFlow:
    """Leave and rejoin: state cleanup and fresh receiver."""

    def test_stop_then_new_receiver_clean_state(self):
        """After stop(), a new receiver starts with empty state."""
        key = _make_secret_key()
        receiver1 = _make_voice_receiver(key)
        receiver1.map_ssrc(100, 42)
        for seq in range(1, 10):
            packet = _build_encrypted_rtp_packet(
                key, b'\xf8\xff\xfe', ssrc=100, seq=seq, timestamp=960 * seq
            )
            receiver1._on_packet(packet)
        assert len(receiver1._buffers[100]) > 0
        receiver1.stop()
        # New receiver (simulates rejoin)
        receiver2 = _make_voice_receiver(key)
        assert len(receiver2._buffers) == 0
        assert len(receiver2._ssrc_to_user) == 0
        assert len(receiver2._decoders) == 0

    def test_rejoin_new_ssrc_works(self):
        """After rejoin, user may get new SSRC — still works."""
        key = _make_secret_key()
        receiver1 = _make_voice_receiver(key)
        receiver1.map_ssrc(100, 42)  # old SSRC
        receiver1.stop()
        receiver2 = _make_voice_receiver(key)
        receiver2.map_ssrc(200, 42)  # new SSRC after rejoin
        for seq in range(1, 30):
            packet = _build_encrypted_rtp_packet(
                key, b'\xf8\xff\xfe', ssrc=200, seq=seq, timestamp=960 * seq
            )
            receiver2._on_packet(packet)
        receiver2._last_packet_time[200] = time.monotonic() - 3.0
        completed = receiver2.check_silence()
        assert len(completed) == 1
        assert completed[0][0] == 42

    def test_rejoin_without_speaking_event_automap(self):
        """Rejoin without SPEAKING event — auto-map sole allowed user."""
        key = _make_secret_key()
        members = [
            SimpleNamespace(id=9999, name="Bot"),
            SimpleNamespace(id=42, name="Alice"),
        ]
        # First session
        receiver1 = _make_voice_receiver(
            key, allowed_user_ids={"42"}, members=members,
        )
        receiver1.stop()
        # Rejoin — new key (Discord may assign new secret_key)
        new_key = _make_secret_key()
        receiver2 = _make_voice_receiver(
            new_key, allowed_user_ids={"42"}, members=members,
        )
        # No map_ssrc — simulating missing SPEAKING event
        for seq in range(1, 30):
            packet = _build_encrypted_rtp_packet(
                new_key, b'\xf8\xff\xfe', ssrc=300, seq=seq, timestamp=960 * seq
            )
            receiver2._on_packet(packet)
        receiver2._last_packet_time[300] = time.monotonic() - 3.0
        completed = receiver2.check_silence()
        assert len(completed) == 1
        assert completed[0][0] == 42
class TestMultiGuildIsolation:
    """Each guild has independent voice state."""

    def test_separate_receivers_independent(self):
        """Two receivers (different guilds) don't interfere."""
        key1 = _make_secret_key()
        key2 = _make_secret_key()
        receiver1 = _make_voice_receiver(key1, bot_ssrc=1111)
        receiver2 = _make_voice_receiver(key2, bot_ssrc=2222)
        receiver1.map_ssrc(100, 42)
        receiver2.map_ssrc(200, 99)
        # Send to receiver1
        for seq in range(1, 10):
            packet = _build_encrypted_rtp_packet(
                key1, b'\xf8\xff\xfe', ssrc=100, seq=seq, timestamp=960 * seq
            )
            receiver1._on_packet(packet)
        # receiver2 should be empty
        assert len(receiver2._buffers) == 0
        assert 100 in receiver1._buffers

    def test_stop_one_doesnt_affect_other(self):
        """Stopping one receiver doesn't affect another."""
        key1 = _make_secret_key()
        key2 = _make_secret_key()
        receiver1 = _make_voice_receiver(key1)
        receiver2 = _make_voice_receiver(key2)
        receiver1.map_ssrc(100, 42)
        receiver2.map_ssrc(200, 99)
        for seq in range(1, 10):
            packet = _build_encrypted_rtp_packet(
                key2, b'\xf8\xff\xfe', ssrc=200, seq=seq, timestamp=960 * seq
            )
            receiver2._on_packet(packet)
        receiver1.stop()
        # receiver2 still has data
        assert receiver2._running is True
        assert len(receiver2._buffers[200]) > 0
class TestEchoPreventionFlow:
"""Receiver pause/resume during TTS playback prevents echo."""
def test_audio_during_pause_ignored(self):
"""Audio arriving while paused is completely ignored."""
key = _make_secret_key()
receiver = _make_voice_receiver(key)
receiver.map_ssrc(100, 42)
receiver.pause()
for seq in range(1, 30):
packet = _build_encrypted_rtp_packet(
key, b'\xf8\xff\xfe', ssrc=100, seq=seq, timestamp=960 * seq
)
receiver._on_packet(packet)
assert len(receiver._buffers.get(100, b"")) == 0
def test_audio_after_resume_processed(self):
"""Audio arriving after resume is processed normally."""
key = _make_secret_key()
receiver = _make_voice_receiver(key)
receiver.map_ssrc(100, 42)
# Pause → send packets → resume → send more packets
receiver.pause()
for seq in range(1, 5):
packet = _build_encrypted_rtp_packet(
key, b'\xf8\xff\xfe', ssrc=100, seq=seq, timestamp=960 * seq
)
receiver._on_packet(packet)
assert len(receiver._buffers.get(100, b"")) == 0
receiver.resume()
for seq in range(5, 35):
packet = _build_encrypted_rtp_packet(
key, b'\xf8\xff\xfe', ssrc=100, seq=seq, timestamp=960 * seq
)
receiver._on_packet(packet)
assert len(receiver._buffers[100]) > 0
receiver._last_packet_time[100] = time.monotonic() - 3.0
completed = receiver.check_silence()
assert len(completed) == 1
assert completed[0][0] == 42


@@ -0,0 +1,72 @@
import json
from types import SimpleNamespace


def _tool_call(name: str, arguments):
    return SimpleNamespace(
        id="call_1",
        type="function",
        function=SimpleNamespace(name=name, arguments=arguments),
    )


def _response_with_tool_call(arguments):
    assistant = SimpleNamespace(
        content=None,
        reasoning=None,
        tool_calls=[_tool_call("read_file", arguments)],
    )
    choice = SimpleNamespace(message=assistant, finish_reason="tool_calls")
    return SimpleNamespace(choices=[choice], usage=None)


class _FakeChatCompletions:
    def __init__(self):
        self.calls = 0

    def create(self, **kwargs):
        self.calls += 1
        if self.calls == 1:
            return _response_with_tool_call({"path": "README.md"})
        return SimpleNamespace(
            choices=[
                SimpleNamespace(
                    message=SimpleNamespace(content="done", reasoning=None, tool_calls=[]),
                    finish_reason="stop",
                )
            ],
            usage=None,
        )


class _FakeClient:
    def __init__(self):
        self.chat = SimpleNamespace(completions=_FakeChatCompletions())


def test_tool_call_validation_accepts_dict_arguments(monkeypatch):
    from run_agent import AIAgent

    monkeypatch.setattr("run_agent.OpenAI", lambda **kwargs: _FakeClient())
    monkeypatch.setattr(
        "run_agent.get_tool_definitions",
        lambda *args, **kwargs: [{"function": {"name": "read_file"}}],
    )
    monkeypatch.setattr(
        "run_agent.handle_function_call",
        lambda name, args, task_id=None, **kwargs: json.dumps({"ok": True, "args": args}),
    )

    agent = AIAgent(
        model="test-model",
        api_key="test-key",
        base_url="http://localhost:8080/v1",
        platform="cli",
        max_iterations=3,
        quiet_mode=True,
        skip_memory=True,
    )
    result = agent.run_conversation("read the file")
    assert result["final_response"] == "done"


@@ -361,6 +361,24 @@ class TestDeleteAndExport:
    def test_delete_nonexistent(self, db):
        assert db.delete_session("nope") is False

    def test_resolve_session_id_exact(self, db):
        db.create_session(session_id="20260315_092437_c9a6ff", source="cli")
        assert db.resolve_session_id("20260315_092437_c9a6ff") == "20260315_092437_c9a6ff"

    def test_resolve_session_id_unique_prefix(self, db):
        db.create_session(session_id="20260315_092437_c9a6ff", source="cli")
        assert db.resolve_session_id("20260315_092437_c9a6") == "20260315_092437_c9a6ff"

    def test_resolve_session_id_ambiguous_prefix_returns_none(self, db):
        db.create_session(session_id="20260315_092437_c9a6aa", source="cli")
        db.create_session(session_id="20260315_092437_c9a6bb", source="cli")
        assert db.resolve_session_id("20260315_092437_c9a6") is None

    def test_resolve_session_id_escapes_like_wildcards(self, db):
        db.create_session(session_id="20260315_092437_c9a6ff", source="cli")
        db.create_session(session_id="20260315X092437_c9a6ff", source="cli")
        assert db.resolve_session_id("20260315_092437") == "20260315_092437_c9a6ff"

    def test_export_session(self, db):
        db.create_session(session_id="s1", source="cli", model="test")
        db.append_message("s1", role="user", content="Hello")


@@ -153,6 +153,36 @@ class TestScheduleCronjob:
        assert job["provider"] == "custom"
        assert job["base_url"] == "http://127.0.0.1:4000/v1"

    def test_thread_id_captured_in_origin(self, monkeypatch):
        monkeypatch.setenv("HERMES_SESSION_PLATFORM", "telegram")
        monkeypatch.setenv("HERMES_SESSION_CHAT_ID", "123456")
        monkeypatch.setenv("HERMES_SESSION_THREAD_ID", "42")
        import cron.jobs as _jobs

        created = json.loads(schedule_cronjob(
            prompt="Thread test",
            schedule="every 1h",
            deliver="origin",
        ))
        assert created["success"] is True
        job_id = created["job_id"]
        job = _jobs.get_job(job_id)
        assert job["origin"]["thread_id"] == "42"

    def test_thread_id_absent_when_not_set(self, monkeypatch):
        monkeypatch.setenv("HERMES_SESSION_PLATFORM", "telegram")
        monkeypatch.setenv("HERMES_SESSION_CHAT_ID", "123456")
        monkeypatch.delenv("HERMES_SESSION_THREAD_ID", raising=False)
        import cron.jobs as _jobs

        created = json.loads(schedule_cronjob(
            prompt="No thread test",
            schedule="every 1h",
            deliver="origin",
        ))
        assert created["success"] is True
        job_id = created["job_id"]
        job = _jobs.get_job(job_id)
        assert job["origin"].get("thread_id") is None
# =========================================================================
# list_cronjobs


@@ -1,4 +1,5 @@
import logging
from io import StringIO
import subprocess
import pytest
@@ -86,3 +87,64 @@ def test_ensure_docker_available_uses_resolved_executable(monkeypatch):
        })
    ]


class _FakePopen:
    def __init__(self, cmd, **kwargs):
        self.cmd = cmd
        self.kwargs = kwargs
        self.stdout = StringIO("")
        self.stdin = None
        self.returncode = 0

    def poll(self):
        return self.returncode


def _make_execute_only_env(forward_env=None):
    env = docker_env.DockerEnvironment.__new__(docker_env.DockerEnvironment)
    env.cwd = "/root"
    env.timeout = 60
    env._forward_env = forward_env or []
    env._prepare_command = lambda command: (command, None)
    env._timeout_result = lambda timeout: {"output": f"timed out after {timeout}", "returncode": 124}
    env._inner = type("Inner", (), {
        "container_id": "test-container",
        "config": type("Cfg", (), {"executable": "/usr/bin/docker", "env": {}})(),
    })()
    return env


def test_execute_uses_hermes_dotenv_for_allowlisted_env(monkeypatch):
    env = _make_execute_only_env(["GITHUB_TOKEN"])
    popen_calls = []

    def _fake_popen(cmd, **kwargs):
        popen_calls.append(cmd)
        return _FakePopen(cmd, **kwargs)

    monkeypatch.delenv("GITHUB_TOKEN", raising=False)
    monkeypatch.setattr(docker_env, "_load_hermes_env_vars", lambda: {"GITHUB_TOKEN": "value_from_dotenv"})
    monkeypatch.setattr(docker_env.subprocess, "Popen", _fake_popen)

    result = env.execute("echo hi")
    assert result["returncode"] == 0
    assert "GITHUB_TOKEN=value_from_dotenv" in popen_calls[0]


def test_execute_prefers_shell_env_over_hermes_dotenv(monkeypatch):
    env = _make_execute_only_env(["GITHUB_TOKEN"])
    popen_calls = []

    def _fake_popen(cmd, **kwargs):
        popen_calls.append(cmd)
        return _FakePopen(cmd, **kwargs)

    monkeypatch.setenv("GITHUB_TOKEN", "value_from_shell")
    monkeypatch.setattr(docker_env, "_load_hermes_env_vars", lambda: {"GITHUB_TOKEN": "value_from_dotenv"})
    monkeypatch.setattr(docker_env.subprocess, "Popen", _fake_popen)

    env.execute("echo hi")
    assert "GITHUB_TOKEN=value_from_shell" in popen_calls[0]
    assert "GITHUB_TOKEN=value_from_dotenv" not in popen_calls[0]


@@ -30,6 +30,28 @@ class TestParseEnvVar:
        result = _parse_env_var("TERMINAL_DOCKER_VOLUMES", "[]", json.loads, "valid JSON")
        assert result == ["/host:/container"]

    def test_get_env_config_parses_docker_forward_env_json(self):
        with patch.dict("os.environ", {
            "TERMINAL_ENV": "docker",
            "TERMINAL_DOCKER_FORWARD_ENV": '["GITHUB_TOKEN", "NPM_TOKEN"]',
        }, clear=False):
            config = _tt_mod._get_env_config()
        assert config["docker_forward_env"] == ["GITHUB_TOKEN", "NPM_TOKEN"]

    def test_create_environment_passes_docker_forward_env(self):
        fake_env = object()
        with patch.object(_tt_mod, "_DockerEnvironment", return_value=fake_env) as mock_docker:
            result = _tt_mod._create_environment(
                "docker",
                image="python:3.11",
                cwd="/root",
                timeout=180,
                container_config={"docker_forward_env": ["GITHUB_TOKEN"]},
            )
        assert result is fake_env
        assert mock_docker.call_args.kwargs["forward_env"] == ["GITHUB_TOKEN"]

    def test_falls_back_to_default(self):
        with patch.dict("os.environ", {}, clear=False):
            # Remove the var if it exists, rely on default


@@ -72,6 +72,7 @@ def _origin_from_env() -> Optional[Dict[str, str]]:
            "platform": origin_platform,
            "chat_id": origin_chat_id,
            "chat_name": os.getenv("HERMES_SESSION_CHAT_NAME"),
            "thread_id": os.getenv("HERMES_SESSION_THREAD_ID"),
        }
    return None


@@ -7,6 +7,7 @@ persistence via bind mounts.
import logging
import os
import re
import shutil
import subprocess
import sys
@@ -30,6 +31,42 @@ _DOCKER_SEARCH_PATHS = [
]
_docker_executable: Optional[str] = None  # resolved once, cached

_ENV_VAR_NAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _normalize_forward_env_names(forward_env: list[str] | None) -> list[str]:
    """Return a deduplicated list of valid environment variable names."""
    normalized: list[str] = []
    seen: set[str] = set()
    for item in forward_env or []:
        if not isinstance(item, str):
            logger.warning("Ignoring non-string docker_forward_env entry: %r", item)
            continue
        key = item.strip()
        if not key:
            continue
        if not _ENV_VAR_NAME_RE.match(key):
            logger.warning("Ignoring invalid docker_forward_env entry: %r", item)
            continue
        if key in seen:
            continue
        seen.add(key)
        normalized.append(key)
    return normalized


def _load_hermes_env_vars() -> dict[str, str]:
    """Load ~/.hermes/.env values without failing Docker command execution."""
    try:
        from hermes_cli.config import load_env

        return load_env() or {}
    except Exception:
        return {}


def find_docker() -> Optional[str]:
@@ -171,6 +208,7 @@ class DockerEnvironment(BaseEnvironment):
        persistent_filesystem: bool = False,
        task_id: str = "default",
        volumes: list = None,
        forward_env: list[str] | None = None,
        network: bool = True,
    ):
        if cwd == "~":
@@ -179,6 +217,7 @@ class DockerEnvironment(BaseEnvironment):
        self._base_image = image
        self._persistent = persistent_filesystem
        self._task_id = task_id
        self._forward_env = _normalize_forward_env_names(forward_env)
        self._container_id: Optional[str] = None
        logger.info(f"DockerEnvironment volumes: {volumes}")
        # Ensure volumes is a list (config.yaml could be malformed)
@@ -330,8 +369,12 @@ class DockerEnvironment(BaseEnvironment):
        if effective_stdin is not None:
            cmd.append("-i")
        cmd.extend(["-w", work_dir])
        hermes_env = _load_hermes_env_vars() if self._forward_env else {}
        for key in self._forward_env:
            value = os.getenv(key)
            if value is None:
                value = hermes_env.get(key)
            if value is not None:
                cmd.extend(["-e", f"{key}={value}"])
        for key, value in self._inner.config.env.items():
            cmd.extend(["-e", f"{key}={value}"])


@@ -492,6 +492,7 @@ def _get_env_config() -> Dict[str, Any]:
    return {
        "env_type": env_type,
        "docker_image": os.getenv("TERMINAL_DOCKER_IMAGE", default_image),
        "docker_forward_env": _parse_env_var("TERMINAL_DOCKER_FORWARD_ENV", "[]", json.loads, "valid JSON"),
        "singularity_image": os.getenv("TERMINAL_SINGULARITY_IMAGE", f"docker://{default_image}"),
        "modal_image": os.getenv("TERMINAL_MODAL_IMAGE", default_image),
        "daytona_image": os.getenv("TERMINAL_DAYTONA_IMAGE", default_image),
@@ -536,6 +537,7 @@ def _create_environment(env_type: str, image: str, cwd: str, timeout: int,
    disk = cc.get("container_disk", 51200)
    persistent = cc.get("container_persistent", True)
    volumes = cc.get("docker_volumes", [])
    docker_forward_env = cc.get("docker_forward_env", [])

    if env_type == "local":
        return _LocalEnvironment(cwd=cwd, timeout=timeout)
@@ -546,6 +548,7 @@ def _create_environment(env_type: str, image: str, cwd: str, timeout: int,
            cpu=cpu, memory=memory, disk=disk,
            persistent_filesystem=persistent, task_id=task_id,
            volumes=volumes,
            forward_env=docker_forward_env,
        )
    elif env_type == "singularity":


@@ -130,7 +130,41 @@ When an auxiliary task is configured with provider `main`, Hermes resolves that
## Fallback models
Hermes supports a configured fallback model/provider pair, allowing runtime failover when the primary model encounters errors.
### How it works internally
1. **Storage**: `AIAgent.__init__` stores the `fallback_model` dict and sets `_fallback_activated = False`.
2. **Trigger points**: `_try_activate_fallback()` is called from three places in the main retry loop in `run_agent.py`:
- After max retries on invalid API responses (None choices, missing content)
- On non-retryable client errors (HTTP 401, 403, 404)
- After max retries on transient errors (HTTP 429, 500, 502, 503)
3. **Activation flow** (`_try_activate_fallback`):
- Returns `False` immediately if already activated or not configured
- Calls `resolve_provider_client()` from `auxiliary_client.py` to build a new client with proper auth
- Determines `api_mode`: `codex_responses` for openai-codex, `anthropic_messages` for anthropic, `chat_completions` for everything else
- Swaps in-place: `self.model`, `self.provider`, `self.base_url`, `self.api_mode`, `self.client`, `self._client_kwargs`
- For anthropic fallback: builds a native Anthropic client instead of OpenAI-compatible
- Re-evaluates prompt caching (enabled for Claude models on OpenRouter)
- Sets `_fallback_activated = True` — prevents firing again
- Resets retry count to 0 and continues the loop
4. **Config flow**:
- CLI: `cli.py` reads `CLI_CONFIG["fallback_model"]` → passes to `AIAgent(fallback_model=...)`
- Gateway: `gateway/run.py._load_fallback_model()` reads `config.yaml` → passes to `AIAgent`
- Validation: both `provider` and `model` keys must be non-empty, or fallback is disabled
### What does NOT support fallback
- **Subagent delegation** (`tools/delegate_tool.py`): subagents inherit the parent's provider but not the fallback config
- **Cron jobs** (`cron/`): run with a fixed provider, no fallback mechanism
- **Auxiliary tasks**: use their own independent provider auto-detection chain (see Auxiliary model routing above)
### Test coverage
See `tests/test_fallback_model.py` for comprehensive tests covering all supported providers, one-shot semantics, and edge cases.
## Related docs


@@ -76,6 +76,7 @@ For native Anthropic auth, Hermes prefers Claude Code's own credential files whe
|----------|-------------|
| `TERMINAL_ENV` | Backend: `local`, `docker`, `ssh`, `singularity`, `modal`, `daytona` |
| `TERMINAL_DOCKER_IMAGE` | Docker image (default: `python:3.11`) |
| `TERMINAL_DOCKER_FORWARD_ENV` | JSON array of env var names to explicitly forward into Docker terminal sessions |
| `TERMINAL_DOCKER_VOLUMES` | Additional Docker volume mounts (comma-separated `host:container` pairs) |
| `TERMINAL_SINGULARITY_IMAGE` | Singularity image or `.sif` path |
| `TERMINAL_MODAL_IMAGE` | Modal container image |
@@ -164,6 +165,7 @@ For native Anthropic auth, Hermes prefers Claude Code's own credential files whe
| `HERMES_QUIET` | Suppress non-essential output (`true`/`false`) |
| `HERMES_API_TIMEOUT` | LLM API call timeout in seconds (default: `900`) |
| `HERMES_EXEC_ASK` | Enable execution approval prompts in gateway mode (`true`/`false`) |
| `HERMES_BACKGROUND_NOTIFICATIONS` | Background process notification mode in gateway: `all` (default), `result`, `error`, `off` |
## Session Settings
@@ -197,6 +199,18 @@ For native Anthropic auth, Hermes prefers Claude Code's own credential files whe
For task-specific direct endpoints, Hermes uses the task's configured API key or `OPENAI_API_KEY`. It does not reuse `OPENROUTER_API_KEY` for those custom endpoints.
## Fallback Model (config.yaml only)
The primary model fallback is configured exclusively through `config.yaml` — there are no environment variables for it. Add a `fallback_model` section with `provider` and `model` keys to enable automatic failover when your main model encounters errors.
```yaml
fallback_model:
provider: openrouter
model: anthropic/claude-sonnet-4
```
See [Fallback Providers](/docs/user-guide/features/fallback-providers) for full details.
## Provider Routing (config.yaml only)
These go in `~/.hermes/config.yaml` under the `provider_routing` section:


@@ -31,7 +31,7 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in
| `/title` | Set a title for the current session (usage: /title My Session Name) |
| `/compress` | Manually compress conversation context (flush memories + summarize) |
| `/rollback` | List or restore filesystem checkpoints (usage: /rollback [number]) |
| `/background <prompt>` | Run a prompt in a separate background session. The agent processes your prompt independently — your current session stays free for other work. Results appear as a panel when the task finishes. See [CLI Background Sessions](/docs/user-guide/cli#background-sessions). |
| `/plan [request]` | Load the bundled `plan` skill to write a markdown plan instead of executing the work. Plans are saved under `.hermes/plans/` relative to the active workspace/backend working directory. |
### Configuration
@@ -109,7 +109,7 @@ The messaging gateway supports the following built-in commands inside Telegram,
| `/reasoning [level\|show\|hide]` | Change reasoning effort or toggle reasoning display. |
| `/voice [on\|off\|tts\|join\|channel\|leave\|status]` | Control spoken replies in chat. `join`/`channel`/`leave` manage Discord voice-channel mode. |
| `/rollback [number]` | List or restore filesystem checkpoints. |
| `/background <prompt>` | Run a prompt in a separate background session. Results are delivered back to the same chat when the task finishes. See [Messaging Background Sessions](/docs/user-guide/messaging/#background-sessions). |
| `/plan [request]` | Load the bundled `plan` skill to write a markdown plan instead of executing the work. Plans are saved under `.hermes/plans/` relative to the active workspace/backend working directory. |
| `/reload-mcp` | Reload MCP servers from config. |
| `/update` | Update Hermes Agent to the latest version. |
@@ -119,6 +119,6 @@ The messaging gateway supports the following built-in commands inside Telegram,
## Notes
- `/skin`, `/tools`, `/toolsets`, `/config`, `/prompt`, `/cron`, `/skills`, `/platforms`, `/paste`, and `/verbose` are **CLI-only** commands.
- `/status`, `/stop`, `/sethome`, `/resume`, and `/update` are **messaging-only** commands.
- `/background`, `/voice`, `/reload-mcp`, and `/rollback` work in **both** the CLI and the messaging gateway.
- `/voice join`, `/voice channel`, and `/voice leave` are only meaningful on Discord.


@@ -259,6 +259,55 @@ compression:
When compression triggers, middle turns are summarized while the first 3 and last 4 turns are always preserved.
## Background Sessions
Run a prompt in a separate background session while continuing to use the CLI for other work:
```
/background Analyze the logs in /var/log and summarize any errors from today
```
Hermes immediately confirms the task and gives you back the prompt:
```
🔄 Background task #1 started: "Analyze the logs in /var/log and summarize..."
Task ID: bg_143022_a1b2c3
```
### How It Works
Each `/background` prompt spawns a **completely separate agent session** in a daemon thread:
- **Isolated conversation** — the background agent has no knowledge of your current session's history. It receives only the prompt you provide.
- **Same configuration** — the background agent inherits your model, provider, toolsets, reasoning settings, and fallback model from the current session.
- **Non-blocking** — your foreground session stays fully interactive. You can chat, run commands, or even start more background tasks.
- **Multiple tasks** — you can run several background tasks simultaneously. Each gets a numbered ID.
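Conceptually, the flow above looks like the following sketch (a simplified illustration, not Hermes internals; `make_agent` and `on_done` are hypothetical callbacks for creating a fresh session and rendering the result panel):

```python
import threading


def start_background_task(prompt, make_agent, on_done, task_num):
    """Run `prompt` in an isolated agent session on a daemon thread."""
    def _run():
        agent = make_agent()              # fresh session: no shared history
        result = agent.run_conversation(prompt)
        on_done(task_num, result)         # surfaced later as a result panel
    thread = threading.Thread(target=_run, daemon=True)
    thread.start()                        # foreground session stays interactive
    return thread
```

Because the thread is a daemon, an unfinished background task does not keep the CLI process alive after you exit.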
### Results
When a background task finishes, the result appears as a panel in your terminal:
```
╭─ ⚕ Hermes (background #1) ──────────────────────────────────╮
│ Found 3 errors in syslog from today: │
│ 1. OOM killer invoked at 03:22 — killed process nginx │
│ 2. Disk I/O error on /dev/sda1 at 07:15 │
│ 3. Failed SSH login attempts from 192.168.1.50 at 14:30 │
╰──────────────────────────────────────────────────────────────╯
```
If the task fails, you'll see an error notification instead. If `display.bell_on_complete` is enabled in your config, the terminal bell rings when the task finishes.
### Use Cases
- **Long-running research** — "/background research the latest developments in quantum error correction" while you work on code
- **File processing** — "/background analyze all Python files in this repo and list any security issues" while you continue a conversation
- **Parallel investigations** — start multiple background tasks to explore different angles simultaneously
:::info
Background sessions do not appear in your main conversation history. They are standalone sessions with their own task ID (e.g., `bg_143022_a1b2c3`).
:::
## Quiet Mode
By default, the CLI runs in quiet mode, which:


@@ -421,6 +421,26 @@ provider_routing:
**Shortcuts:** Append `:nitro` to any model name for throughput sorting (e.g., `anthropic/claude-sonnet-4:nitro`), or `:floor` for price sorting.
## Fallback Model
Configure a backup provider:model that Hermes switches to automatically when your primary model fails (rate limits, server errors, auth failures):
```yaml
fallback_model:
provider: openrouter # required
model: anthropic/claude-sonnet-4 # required
# base_url: http://localhost:8000/v1 # optional, for custom endpoints
# api_key_env: MY_CUSTOM_KEY # optional, env var name for custom endpoint API key
```
When activated, the fallback swaps the model and provider mid-session without losing your conversation. It fires **at most once** per session.
Supported providers: `openrouter`, `nous`, `openai-codex`, `anthropic`, `zai`, `kimi-coding`, `minimax`, `minimax-cn`, `custom`.
:::tip
Fallback is configured exclusively through `config.yaml` — there are no environment variables for it. For full details on when it triggers, supported providers, and how it interacts with auxiliary tasks and delegation, see [Fallback Providers](/docs/user-guide/features/fallback-providers).
:::
## Terminal Backend Configuration
Configure which environment the agent uses for terminal commands:
@@ -433,6 +453,8 @@ terminal:
# Docker-specific settings
docker_image: "nikolaik/python-nodejs:python3.11-nodejs20"
docker_forward_env: # Optional explicit allowlist for env passthrough
- "GITHUB_TOKEN"
docker_volumes: # Share host directories with the container
- "/home/user/projects:/workspace/projects"
- "/home/user/data:/data:ro" # :ro for read-only
@@ -497,6 +519,24 @@ This is useful for:
Can also be set via environment variable: `TERMINAL_DOCKER_VOLUMES='["/host:/container"]'` (JSON array).
### Docker Credential Forwarding
By default, Docker terminal sessions do not inherit arbitrary host credentials. If you need a specific token inside the container, add it to `terminal.docker_forward_env`.
```yaml
terminal:
backend: docker
docker_forward_env:
- "GITHUB_TOKEN"
- "NPM_TOKEN"
```
Hermes resolves each listed variable from your current shell first, then falls back to `~/.hermes/.env` if it was saved with `hermes config set`.
:::warning
Anything listed in `docker_forward_env` becomes visible to commands run inside the container. Only forward credentials you are comfortable exposing to the terminal session.
:::
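The resolution order can be sketched as follows (a simplified illustration of the lookup described above, not the actual command builder):

```python
import os


def forward_env_flags(allowlist, hermes_env):
    """Build `docker run -e KEY=VALUE` flags for allowlisted variables.

    Shell environment wins; ~/.hermes/.env (passed in as `hermes_env`)
    is the fallback; variables set in neither place are skipped.
    """
    flags = []
    for key in allowlist:
        value = os.getenv(key)
        if value is None:
            value = hermes_env.get(key)
        if value is not None:
            flags.extend(["-e", f"{key}={value}"])
    return flags
```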
See [Code Execution](features/code-execution.md) and the [Terminal section of the README](features/tools.md) for details on each backend.
## Memory Configuration
@@ -733,6 +773,7 @@ display:
resume_display: full # full (show previous messages on resume) | minimal (one-liner only)
bell_on_complete: false # Play terminal bell when agent finishes (great for long tasks)
show_reasoning: false # Show model reasoning/thinking above each response (toggle with /reasoning show|hide)
background_process_notifications: all # all | result | error | off (gateway only)
```
| Mode | What you see |


@@ -0,0 +1,311 @@
---
title: Fallback Providers
description: Configure automatic failover to backup LLM providers when your primary model is unavailable.
sidebar_label: Fallback Providers
sidebar_position: 8
---
# Fallback Providers
Hermes Agent has two separate fallback systems that keep your sessions running when providers hit issues:
1. **Primary model fallback** — automatically switches to a backup provider:model when your main model fails
2. **Auxiliary task fallback** — independent provider resolution for side tasks like vision, compression, and web extraction
Both are optional and work independently.
## Primary Model Fallback
When your main LLM provider encounters errors — rate limits, server overload, auth failures, connection drops — Hermes can automatically switch to a backup provider:model pair mid-session without losing your conversation.
### Configuration
Add a `fallback_model` section to `~/.hermes/config.yaml`:
```yaml
fallback_model:
provider: openrouter
model: anthropic/claude-sonnet-4
```
Both `provider` and `model` are **required**. If either is missing, the fallback is disabled.
### Supported Providers
| Provider | Value | Requirements |
|----------|-------|-------------|
| OpenRouter | `openrouter` | `OPENROUTER_API_KEY` |
| Nous Portal | `nous` | `hermes login` (OAuth) |
| OpenAI Codex | `openai-codex` | `hermes model` (ChatGPT OAuth) |
| Anthropic | `anthropic` | `ANTHROPIC_API_KEY` or Claude Code credentials |
| z.ai / GLM | `zai` | `GLM_API_KEY` |
| Kimi / Moonshot | `kimi-coding` | `KIMI_API_KEY` |
| MiniMax | `minimax` | `MINIMAX_API_KEY` |
| MiniMax (China) | `minimax-cn` | `MINIMAX_CN_API_KEY` |
| Custom endpoint | `custom` | `base_url` + `api_key_env` (see below) |
### Custom Endpoint Fallback
For a custom OpenAI-compatible endpoint, add `base_url` and optionally `api_key_env`:
```yaml
fallback_model:
provider: custom
model: my-local-model
base_url: http://localhost:8000/v1
api_key_env: MY_LOCAL_KEY # env var name containing the API key
```
### When Fallback Triggers
The fallback activates automatically when the primary model fails with:
- **Rate limits** (HTTP 429) — after exhausting retry attempts
- **Server errors** (HTTP 500, 502, 503) — after exhausting retry attempts
- **Auth failures** (HTTP 401, 403) — immediately (no point retrying)
- **Not found** (HTTP 404) — immediately
- **Invalid responses** — when the API returns malformed or empty responses repeatedly
When triggered, Hermes:
1. Resolves credentials for the fallback provider
2. Builds a new API client
3. Swaps the model, provider, and client in-place
4. Resets the retry counter and continues the conversation
The switch is seamless — your conversation history, tool calls, and context are preserved. The agent continues from exactly where it left off, just using a different model.
:::info One-Shot
Fallback activates **at most once** per session. If the fallback provider also fails, normal error handling takes over (retries, then error message). This prevents cascading failover loops.
:::
### Examples
**OpenRouter as fallback for Anthropic native:**
```yaml
model:
provider: anthropic
default: claude-sonnet-4-6
fallback_model:
provider: openrouter
model: anthropic/claude-sonnet-4
```
**Nous Portal as fallback for OpenRouter:**
```yaml
model:
provider: openrouter
default: anthropic/claude-opus-4
fallback_model:
provider: nous
model: nous-hermes-3
```
**Local model as fallback for cloud:**
```yaml
fallback_model:
provider: custom
model: llama-3.1-70b
base_url: http://localhost:8000/v1
api_key_env: LOCAL_API_KEY
```
**Codex OAuth as fallback:**
```yaml
fallback_model:
provider: openai-codex
model: gpt-5.3-codex
```
### Where Fallback Works
| Context | Fallback Supported |
|---------|-------------------|
| CLI sessions | ✔ |
| Messaging gateway (Telegram, Discord, etc.) | ✔ |
| Subagent delegation | ✘ (subagents do not inherit fallback config) |
| Cron jobs | ✘ (run with a fixed provider) |
| Auxiliary tasks (vision, compression) | ✘ (use their own provider chain — see below) |
:::tip
There are no environment variables for `fallback_model` — it is configured exclusively through `config.yaml`. This is intentional: fallback configuration is a deliberate choice, not something a stale shell export should override.
:::
---
## Auxiliary Task Fallback
Hermes uses separate lightweight models for side tasks. Each task has its own provider resolution chain that acts as a built-in fallback system.
### Tasks with Independent Provider Resolution
| Task | What It Does | Config Key |
|------|-------------|-----------|
| Vision | Image analysis, browser screenshots | `auxiliary.vision` |
| Web Extract | Web page summarization | `auxiliary.web_extract` |
| Compression | Context compression summaries | `auxiliary.compression` or `compression.summary_provider` |
| Session Search | Past session summarization | `auxiliary.session_search` |
| Skills Hub | Skill search and discovery | `auxiliary.skills_hub` |
| MCP | MCP helper operations | `auxiliary.mcp` |
| Memory Flush | Memory consolidation | `auxiliary.flush_memories` |
### Auto-Detection Chain
When a task's provider is set to `"auto"` (the default), Hermes tries providers in order until one works:
**For text tasks (compression, web extract, etc.):**
```text
OpenRouter → Nous Portal → Custom endpoint → Codex OAuth →
API-key providers (z.ai, Kimi, MiniMax, Anthropic) → give up
```
**For vision tasks:**
```text
Main provider (if vision-capable) → OpenRouter → Nous Portal →
Codex OAuth → Anthropic → Custom endpoint → give up
```
If the resolved provider fails at call time, Hermes also has an internal retry: if the provider is not OpenRouter and no explicit `base_url` is set, it tries OpenRouter as a last-resort fallback.
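The chains above reduce to a simple first-available scan. The sketch below is illustrative only; the real detection logic checks credentials and model capabilities per provider:

```python
def resolve_auto_provider(chain, is_available):
    """Return the first provider in `chain` for which `is_available`
    (a hypothetical credential/capability check) succeeds, else None."""
    for provider in chain:
        if is_available(provider):
            return provider
    return None  # "give up": the auxiliary task is skipped or degraded
```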
### Configuring Auxiliary Providers
Each task can be configured independently in `config.yaml`:
```yaml
auxiliary:
vision:
provider: "auto" # auto | openrouter | nous | codex | main | anthropic
model: "" # e.g. "openai/gpt-4o"
base_url: "" # direct endpoint (takes precedence over provider)
api_key: "" # API key for base_url
web_extract:
provider: "auto"
model: ""
compression:
provider: "auto"
model: ""
session_search:
provider: "auto"
model: ""
skills_hub:
provider: "auto"
model: ""
mcp:
provider: "auto"
model: ""
flush_memories:
provider: "auto"
model: ""
```
Or via environment variables:
```bash
AUXILIARY_VISION_PROVIDER=openrouter
AUXILIARY_VISION_MODEL=openai/gpt-4o
AUXILIARY_WEB_EXTRACT_PROVIDER=nous
CONTEXT_COMPRESSION_PROVIDER=main
CONTEXT_COMPRESSION_MODEL=google/gemini-3-flash-preview
```
### Provider Options for Auxiliary Tasks
| Provider | Description | Requirements |
|----------|-------------|-------------|
| `"auto"` | Try providers in order until one works (default) | At least one provider configured |
| `"openrouter"` | Force OpenRouter | `OPENROUTER_API_KEY` |
| `"nous"` | Force Nous Portal | `hermes login` |
| `"codex"` | Force Codex OAuth | `hermes model` → Codex |
| `"main"` | Use whatever provider the main agent uses | Active main provider configured |
| `"anthropic"` | Force Anthropic native | `ANTHROPIC_API_KEY` or Claude Code credentials |
### Direct Endpoint Override
For any auxiliary task, setting `base_url` bypasses provider resolution entirely and sends requests directly to that endpoint:
```yaml
auxiliary:
  vision:
    base_url: "http://localhost:1234/v1"
    api_key: "local-key"
    model: "qwen2.5-vl"
```
`base_url` takes precedence over `provider`. Hermes uses the configured `api_key` for authentication, falling back to `OPENAI_API_KEY` if not set. It does **not** reuse `OPENROUTER_API_KEY` for custom endpoints.
---
## Context Compression Fallback
Context compression has a legacy configuration path in addition to the auxiliary system:
```yaml
compression:
  summary_provider: "auto"  # auto | openrouter | nous | main
  summary_model: "google/gemini-3-flash-preview"
```
This is equivalent to configuring `auxiliary.compression.provider` and `auxiliary.compression.model`. If both are set, the `auxiliary.compression` values take precedence.
If no provider is available for compression, Hermes drops middle conversation turns without generating a summary rather than failing the session.
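The merge of the legacy and auxiliary keys can be sketched roughly like this (treating empty values as unset; an illustrative reading of the documented precedence, not the actual implementation):

```python
def effective_compression(cfg: dict) -> tuple[str, str]:
    """Resolve the compression provider/model: auxiliary.compression
    values win over legacy compression.summary_* when both are set."""
    legacy = cfg.get("compression", {})
    aux = cfg.get("auxiliary", {}).get("compression", {})
    provider = aux.get("provider") or legacy.get("summary_provider") or "auto"
    model = aux.get("model") or legacy.get("summary_model") or ""
    return provider, model


# Both paths set: the auxiliary value takes precedence.
print(effective_compression({
    "compression": {"summary_provider": "nous"},
    "auxiliary": {"compression": {"provider": "openrouter"}},
}))  # -> ('openrouter', '')
```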
---
## Delegation Provider Override
Subagents spawned by `delegate_task` do **not** use the primary fallback model. However, they can be routed to a different provider:model pair for cost optimization:
```yaml
delegation:
  provider: "openrouter"                  # override provider for all subagents
  model: "google/gemini-3-flash-preview"  # override model
  # base_url: "http://localhost:1234/v1"  # or use a direct endpoint
  # api_key: "local-key"
```
See [Subagent Delegation](/docs/user-guide/features/delegation) for full configuration details.
---
## Cron Job Providers
Cron jobs run with whatever provider is configured at execution time. They do not support a fallback model. To use a different provider for cron jobs, configure `provider` and `model` overrides on the cron job itself:
```python
cronjob(
    action="create",
    schedule="every 2h",
    prompt="Check server status",
    provider="openrouter",
    model="google/gemini-3-flash-preview",
)
```
See [Scheduled Tasks (Cron)](/docs/user-guide/features/cron) for full configuration details.
---
## Summary
| Feature | Fallback Mechanism | Config Location |
|---------|-------------------|----------------|
| Main agent model | `fallback_model` in config.yaml — one-shot failover on errors | `fallback_model:` (top-level) |
| Vision | Auto-detection chain + internal OpenRouter retry | `auxiliary.vision` |
| Web extraction | Auto-detection chain + internal OpenRouter retry | `auxiliary.web_extract` |
| Context compression | Auto-detection chain, degrades to no-summary if unavailable | `auxiliary.compression` or `compression.summary_provider` |
| Session search | Auto-detection chain | `auxiliary.session_search` |
| Skills hub | Auto-detection chain | `auxiliary.skills_hub` |
| MCP helpers | Auto-detection chain | `auxiliary.mcp` |
| Memory flush | Auto-detection chain | `auxiliary.flush_memories` |
| Delegation | Provider override only (no automatic fallback) | `delegation.provider` / `delegation.model` |
| Cron jobs | Per-job provider override only (no automatic fallback) | Per-job `provider` / `model` |

View File

@@ -194,3 +194,7 @@ provider_routing:
## Default Behavior
When no `provider_routing` section is configured (the default), OpenRouter uses its own default routing logic, which generally balances cost and availability automatically.
:::tip Provider Routing vs. Fallback Models
Provider routing controls which **sub-providers within OpenRouter** handle your requests. For automatic failover to an entirely different provider when your primary model fails, see [Fallback Providers](/docs/user-guide/features/fallback-providers).
:::

View File

@@ -135,6 +135,8 @@ All container backends run with security hardening:
- Full namespace isolation
- Persistent workspace via volumes, not writable root layer
Docker can optionally receive an explicit env allowlist via `terminal.docker_forward_env`, but forwarded variables are visible to commands inside the container and should be treated as exposed to that session.
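The allowlist semantics can be sketched in a few lines (the function name is illustrative; only the behavior — explicit allowlist, empty by default — comes from the docs):

```python
import os

def container_env(allowlist: list[str]) -> dict[str, str]:
    """Build the env forwarded into the container: only explicitly
    allowlisted variables pass through, and the default empty list
    keeps all host secrets out of the container."""
    return {name: os.environ[name] for name in allowlist if name in os.environ}


os.environ["GITHUB_TOKEN"] = "ghp_example"
print(container_env([]))                # -> {} (default: nothing forwarded)
print(container_env(["GITHUB_TOKEN"]))  # -> {'GITHUB_TOKEN': 'ghp_example'}
```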
## Background Process Management
Start background processes and manage them:

View File

@@ -181,6 +181,63 @@ When enabled, the bot sends status messages as it works:
🐍 execute_code...
```
## Background Sessions
Run a prompt in a separate background session so the agent works on it independently while your main chat stays responsive:
```
/background Check all servers in the cluster and report any that are down
```
Hermes confirms immediately:
```
🔄 Background task started: "Check all servers in the cluster..."
Task ID: bg_143022_a1b2c3
```
### How It Works
Each `/background` prompt spawns a **separate agent instance** that runs asynchronously:
- **Isolated session** — the background agent has its own session with its own conversation history. It has no knowledge of your current chat context and receives only the prompt you provide.
- **Same configuration** — inherits your model, provider, toolsets, reasoning settings, and provider routing from the current gateway setup.
- **Non-blocking** — your main chat stays fully interactive. Send messages, run other commands, or start more background tasks while it works.
- **Result delivery** — when the task finishes, the result is sent back to the **same chat or channel** where you issued the command, prefixed with "✅ Background task complete". If it fails, you'll see "❌ Background task failed" with the error.
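The lifecycle above can be sketched with asyncio (both `run_agent_session` and `send` are hypothetical stand-ins; only the isolation and same-chat delivery behavior comes from the docs):

```python
import asyncio

async def run_agent_session(prompt: str) -> str:
    """Stand-in for a fresh, isolated agent session that sees only
    the prompt, not the main chat's history (hypothetical helper)."""
    await asyncio.sleep(0)  # placeholder for real agent work
    return "report for: " + prompt

async def run_background(prompt: str, chat_id: str, send) -> None:
    """Run the prompt independently; deliver the result (or the error)
    back to the same chat that issued /background."""
    try:
        result = await run_agent_session(prompt)
        await send(chat_id, "✅ Background task complete: " + result)
    except Exception as exc:
        await send(chat_id, "❌ Background task failed: " + str(exc))
```

In a real gateway this coroutine would be scheduled with something like `asyncio.create_task(...)` so the main chat loop never blocks on it.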
### Background Process Notifications
When the agent running a background session uses `terminal(background=true)` to start long-running processes (servers, builds, etc.), the gateway can push status updates to your chat. Control this with `display.background_process_notifications` in `~/.hermes/config.yaml`:
```yaml
display:
  background_process_notifications: all  # all | result | error | off
```
| Mode | What you receive |
|------|-----------------|
| `all` | Running-output updates **and** the final completion message (default) |
| `result` | Only the final completion message (regardless of exit code) |
| `error` | Only the final message when the exit code is non-zero |
| `off` | No process watcher messages at all |
You can also set this via environment variable:
```bash
HERMES_BACKGROUND_NOTIFICATIONS=result
```
### Use Cases
- **Server monitoring** — "/background Check the health of all services and alert me if anything is down"
- **Long builds** — "/background Build and deploy the staging environment" while you continue chatting
- **Research tasks** — "/background Research competitor pricing and summarize in a table"
- **File operations** — "/background Organize the photos in ~/Downloads by date into folders"
:::tip
Background tasks on messaging platforms are fire-and-forget — you don't need to wait or check on them. Results arrive in the same chat automatically when the task finishes.
:::
## Service Management
### Linux (systemd)

View File

@@ -212,6 +212,7 @@ Container resources are configurable in `~/.hermes/config.yaml`:
terminal:
  backend: docker
  docker_image: "nikolaik/python-nodejs:python3.11-nodejs20"
  docker_forward_env: []  # Explicit allowlist only; empty keeps secrets out of the container
  container_cpu: 1        # CPU cores
  container_memory: 5120  # MB (default 5GB)
  container_disk: 51200   # MB (default 50GB, requires overlay2 on XFS)
@@ -227,6 +228,10 @@ terminal:
For production gateway deployments, use `docker`, `modal`, or `daytona` backend to isolate agent commands from your host system. This eliminates the need for dangerous command approval entirely.
:::
:::warning
If you add names to `terminal.docker_forward_env`, those variables are intentionally injected into the container for terminal commands. This is useful for task-specific credentials like `GITHUB_TOKEN`, but it also means code running in the container can read and exfiltrate them.
:::
## Terminal Backend Security Comparison
| Backend | Isolation | Dangerous Cmd Check | Best For |

View File

@@ -91,6 +91,7 @@ const sidebars: SidebarsConfig = {
'user-guide/features/mcp',
'user-guide/features/honcho',
'user-guide/features/provider-routing',
'user-guide/features/fallback-providers',
],
},
{