Groundwork for injecting raw platform identifiers into the agent's
system prompt. Currently only `thread_id` is exposed as a raw ID —
callers in a Discord thread had to guess `channel_id == thread_id`
(which happens to work because threads are channels in Discord's REST
API) and had no way to reference the parent channel, guild, or the
triggering message.
Adds three optional fields:
- `guild_id` — Discord guild / Slack workspace / Matrix server scope
- `parent_chat_id` — parent channel when chat_id refers to a thread
- `message_id` — ID of the triggering message (pin/reply/react)
Extends `BasePlatformAdapter.build_source()` to accept + forward them
and teaches `to_dict`/`from_dict` to serialize them. Behaviourally a
no-op: nothing reads the fields yet and they default to None.
The feishu_doc and feishu_drive tools were registered in the tool
registry but never added to the hermes-feishu composite toolset.
The pipeline fix from the prior commit now recovers them automatically
once they are in the composite.
Split the monolithic discord_server tool (14 actions) into two:
- discord: core actions (fetch_messages, search_members, create_thread)
that are useful for the agent's normal operation. Auto-enabled on
the discord platform via the pipeline fix.
- discord_admin: server management actions (list channels/roles, pins,
role assignment) that require explicit opt-in via hermes tools.
Added to CONFIGURABLE_TOOLSETS and _DEFAULT_OFF_TOOLSETS.
The reverse-mapping loop in _get_platform_tools only checked
CONFIGURABLE_TOOLSETS, silently dropping platform-specific toolsets
like discord and feishu_doc whose tools were in the composite but
had no configurable key. Add a second pass over TOOLSETS that picks
up unclaimed toolsets whose tools are present in the resolved
composite.
The tool schema promised 'On update, pass an empty array to clear' but the
update branch ignored the context_from kwarg entirely — users could set
the field at create time and never modify or clear it afterward.
- tools/cronjob_tools.py: handle context_from in the update branch the
same way script/enabled_toolsets/workdir are handled: normalize str/list
to refs, validate each referenced job exists (same check the create
branch does), store as list-or-None to match create_job()'s shape.
Empty string or empty list clears the field.
- tests/cron/test_cron_context_from.py: 6 new tests covering add/change/
clear (both shapes)/bad-ref/preserve-across-unrelated-update.
Root installs on Linux now put the code at /usr/local/lib/hermes-agent and
the hermes command at /usr/local/bin/hermes. HERMES_HOME (~/.hermes) stays
state-only. Matches Claude Code / Codex CLI / OpenClaw, keeps Docker
bind-mounted /root/ volumes lean, and puts the command on every shell's
default PATH without touching shell RC files.
- Non-root users and macOS root: unchanged
- Existing root installs at $HERMES_HOME/hermes-agent: preserved in-place
(detected via .git dir) — no auto-migration, no breakage
- Explicit --dir / $HERMES_INSTALL_DIR: always wins, never overridden
- Termux: unchanged (package manager manages /data/data/...)
Requested by @souly9999 (Discord). Our own Dockerfile already uses this
split (code at /opt/hermes, data at /opt/data volume); the user-install
path now matches.
YAML parses bare numeric toolset names (e.g. 12306:) as int, causing
TypeError in sorted() since the read path normalizes to str but the
save path did not.
The no_mcp sentinel was preserved in existing entries even when the
user re-enabled MCP servers, causing MCP to stay silently disabled.
update_model() recalculated threshold_tokens but left tail_token_budget
and max_summary_tokens at their __init__ values. When switching from a
200K model to 32K, the tail budget stayed at ~20K tokens (62% of 32K)
instead of the intended ~10%.
Adds budget recalculation in update_model() and 2 regression tests.
The web-dashboard.md and dashboard-plugins.md pages had overlapping,
partial coverage of the theme and plugin systems. Themes were split
across two pages; the plugin docs had a minimal manifest reference but
no step-by-step guide, no slot catalog, and no theme+plugin demo.
New: user-guide/features/extending-the-dashboard.md — single navigable
reference for all three extension layers (themes, UI plugins, backend
plugins). Includes:
- Theme quick-start + full schema (palette, typography, layout, layout
variants, assets, componentStyles, colorOverrides, customCSS)
- Plugin quick-start + full schema (manifest, SDK, slots, tab.override,
tab.hidden, backend routes, custom CSS)
- 10-slot shell catalog with locations
- Plugin discovery + load lifecycle
- Combined theme+plugin walkthrough (Strike Freedom cockpit demo)
- API reference + troubleshooting
web-dashboard.md: trimmed to core tool docs (pages, REST API, CORS,
development). Theme/plugin content now points to the new page with a
built-in themes summary table.
dashboard-plugins.md: deleted (merged into extending-the-dashboard.md).
sidebars.ts: swap 'dashboard-plugins' → 'extending-the-dashboard' under
the Management group.
No user-facing behavior change; docs-only.
Subagents run inside a ThreadPoolExecutor. The CLI's interactive approval
callback lives in tools/terminal_tool.py's threading.local(), which worker
threads do not inherit. When a subagent hits a dangerous-command guard,
prompt_dangerous_approval() falls back to input() from the worker thread,
deadlocking against the parent's prompt_toolkit TUI that owns stdin.
Fix: install a non-interactive callback into every subagent worker thread
via ThreadPoolExecutor(initializer=set_approval_callback, initargs=(cb,)).
The callback is config-gated by delegation.subagent_auto_approve:
false (default) -> _subagent_auto_deny (safe; matches leaf tool blocklist)
true -> _subagent_auto_approve (opt-in YOLO for cron/batch)
Both emit a logger.warning audit line. Gateway sessions are unaffected
because they resolve approvals via tools/approval.py's per-session queue,
not through these TLS callbacks. Diagnosis credit: @MorAlekss (#14685).
- hermes_cli/config.py: DEFAULT_CONFIG.delegation.subagent_auto_approve: False
- cli-config.yaml.example: documented, commented (default)
- tools/delegate_tool.py: _subagent_auto_deny, _subagent_auto_approve,
_get_subagent_approval_callback, wired into the child timeout executor
- tests/tools/test_delegate.py: 7 tests covering defaults, truthy coercion,
and TLS scoping in the worker thread
On Windows WSL2, ConPTY implicitly enables mouse event injection when
the alternate screen buffer (DEC 1049) is entered, causing raw escape
sequences to appear in the transcript as ghost characters.
Fix (two parts):
1. ConPTY fix: send DISABLE_MOUSE_TRACKING immediately after entering
alt screen when mouse tracking is off (AlternateScreen.tsx)
2. Runtime toggle: add /mouse [on|off|toggle] slash command with config
persistence (display.tui_mouse) so users can manage this at runtime
The env var HERMES_TUI_DISABLE_MOUSE continues to work as the initial
default, but can now be overridden via /mouse and persisted to config.
Closes: upstream ConPTY mouse injection issue
Credits: OutThisLife / PR #13716 for the toggle concept
Two adjustments to make CI pass:
- In gateway/platforms/matrix.py: `DeviceID` is `NewType("DeviceID", str)`,
so passing `client.device_id` directly (already a str) works identically
at runtime. The explicit import was cosmetic and tripped CI environments
where `mautrix.types` doesn't re-export DeviceID at the expected path
("cannot import name 'DeviceID' from 'mautrix.types' (unknown location)").
- In tests/gateway/test_matrix.py: add `put_device_id` to the hand-written
`PgCryptoStore` fake so the three encryption-path tests
(test_connect_with_access_token_and_encryption,
test_connect_uses_configured_device_id_over_whoami,
test_connect_registers_encrypted_event_handler_when_encryption_on) can
exercise the new crypto-store binding without AttributeError.
PgCryptoStore.__init__ defaults _device_id to "" and put_account writes
that blank value into crypto_account. The UPSERT's ON CONFLICT DO UPDATE
clause deliberately does not touch device_id, so once the row is written
blank it stays blank forever — breaking every downstream device-scoped
olm operation. Peers' to-device olm ciphertext can't match our identity
key, no megolm sessions ever land, and the user sees "hermes is in the
room but never responds to encrypted messages".
Fix: call put_device_id(client.device_id) immediately after
crypto_store.open() and before olm.load(). This sets the store's
in-memory _device_id so the first put_account INSERT writes the correct
value from the start.
Observable symptoms without the fix, on a fresh crypto.db:
- crypto_account.device_id = ""
- crypto_tracked_user: 0 rows
- crypto_device: 0 rows
- crypto_olm_session: 0 rows
- crypto_megolm_inbound_session: 0 rows
- "No one-time keys nor device keys got when trying to share keys"
warning on every startup
- "olm event doesn't contain ciphertext for this device" DecryptionError
on any inbound to-device event
- Encrypted room messages arrive but never decrypt
After the fix (wiped crypto.db + restart):
- device_id populated with actual runtime device (e.g. CZIKTRFLOV)
- all counts populate from sync as expected
- encrypted DMs flow normally
Who hits this: anyone with a fresh crypto.db — includes first-time matrix
E2EE setup, nio→mautrix migrations (since matrix.py removes the legacy
pickle on startup, creating a fresh SQLite store), and anyone who wipes
crypto.db to start over. Existing installs that somehow already have a
non-blank device_id would be unaffected, but no prior code path writes
it correctly, so that set is likely empty.
* fix(nix): use --rebuild in fix-lockfiles to bypass cached FOD store paths
fix-lockfiles checked npm lockfile hashes by running
`nix build .#<attr>.npmDeps`, but fetchNpmDeps is a fixed-output
derivation — if the old store path exists locally, Nix returns it from
cache without re-fetching. This caused the script to report "ok" even
when hashes were stale, while CI (with no cache) failed with a hash
mismatch.
Adding --rebuild forces Nix to re-derive and verify the output hash
against the declared one, catching staleness regardless of local cache
state. Also updates the tui and web npm deps hashes that were stale.
* fix(nix): regenerate ui-tui lockfile to add missing @emnapi entries
npm ci was failing because @emnapi/core and @emnapi/runtime were
missing from ui-tui/package-lock.json despite being required as peer
deps by @napi-rs/wasm-runtime (via @rolldown/binding-wasm32-wasi).
Running npm install --package-lock-only adds the missing entries.
The npmDepsHash reverts to its previous value since fetchNpmDeps was
already fetching these packages as transitive dependencies.
/model gpt-5.5 on openai-codex showed 'Context: 1,050,000 tokens' because
the display block used ModelInfo.context_window directly from models.dev.
Codex OAuth actually enforces 272K for the same slug, and the agent's
compressor already runs at 272K via get_model_context_length() — so the
banner + real context budget said 272K while /model lied with 1M.
Route the display context through a new resolve_display_context_length()
helper that always prefers agent.model_metadata.get_model_context_length
(which knows about Codex OAuth, Copilot, Nous caps) and only falls back
to models.dev when that returns nothing.
Fix applied to all 3 /model display sites:
cli.py _handle_model_switch
gateway/run.py picker on_model_selected callback
gateway/run.py text-fallback confirmation
Reported by @emilstridell (Telegram, April 2026).
Closes#13626.
Two follow-ups on top of the _hermes_home helper from @jerome-benoit's #12729:
1. Declare a [google] optional extra in pyproject.toml
(google-api-python-client, google-auth-oauthlib, google-auth-httplib2) and
include it in [all]. Packagers (Nix flake, Homebrew) now ship the deps by
default, so `setup.py --check` does not need to shell out to pip at
runtime — the imports succeed and install_deps() is never reached.
This fixes the Nix breakage where pip/ensurepip are stripped.
2. Add `from __future__ import annotations` to setup.py so the PEP 604
`str | None` annotation parses on Python 3.9 (macOS system python).
Previously system python3 SyntaxError'd before any code ran.
install_deps() error message now also points users at the extra instead of
just the raw pip command.
The three google-workspace scripts (setup.py, google_api.py, gws_bridge.py)
each had their own way of resolving HERMES_HOME:
- setup.py imported hermes_constants (crashes outside Hermes process)
- google_api.py used os.getenv inline (no strip, no empty handling)
- gws_bridge.py defined its own local get_hermes_home() (duplicate)
Extract the common logic into _hermes_home.py which:
- Delegates to hermes_constants when available (profile support, etc.)
- Falls back to os.getenv with .strip() + empty-as-unset handling
- Provides display_hermes_home() with ~/ shortening for profiles
All three scripts now import from _hermes_home instead of duplicating.
7 regression tests cover the fallback path: env var override, default
~/.hermes, empty env var, display shortening, profile paths, and
custom non-home paths.
Closes#12722
Extracts _needs_kimi_tool_reasoning() for symmetry with the existing
_needs_deepseek_tool_reasoning() helper, so _copy_reasoning_content_for_api
uses the same detection logic as _build_assistant_message. Future changes
to either provider's signals now only touch one function.
Adds tests/run_agent/test_deepseek_reasoning_content_echo.py covering:
- All 3 DeepSeek detection signals (provider, model, host)
- Poisoned history replay (empty string fallback)
- Plain assistant turns NOT padded
- Explicit reasoning_content preserved
- Reasoning field promoted to reasoning_content
- Existing Kimi/Moonshot detection intact
- Non-thinking providers left alone
21 tests, all pass.
DeepSeek V4 thinking mode requires reasoning_content on every
assistant message that includes tool_calls. When this field is
missing from persisted history, replaying the session causes
HTTP 400: 'The reasoning_content in the thinking mode must be
passed back to the API.'
Two-part fix (refs #15250):
1. _copy_reasoning_content_for_api: Merge the Kimi-only and
DeepSeek detection into a single needs_tool_reasoning_echo
check. This handles already-poisoned persisted sessions by
injecting an empty reasoning_content on replay.
2. _build_assistant_message: Store reasoning_content='' on new
DeepSeek tool-call messages at creation time, preventing
future session poisoning at the source.
Additional fix:
3. _handle_max_iterations: Add missing call to
_copy_reasoning_content_for_api in the max-iterations flush
path (previously only main loop and flush_memories had it).
Detection covers:
- provider == 'deepseek'
- model name containing 'deepseek' (case-insensitive)
- base URL matching api.deepseek.com (for custom provider)
``run_conversation`` was calling ``memory_manager.sync_all(
original_user_message, final_response)`` at the end of every turn
where both args were present. That gate didn't consider the
``interrupted`` local flag, so an external memory backend received
partial assistant output, aborted tool chains, or mid-stream resets as
durable conversational truth. Downstream recall then treated the
not-yet-real state as if the user had seen it complete, poisoning the
trust boundary between "what the user took away from the turn" and
"what Hermes was in the middle of producing when the interrupt hit".
Extracted the inline sync block into a new private method
``AIAgent._sync_external_memory_for_turn(original_user_message,
final_response, interrupted)`` so the interrupt guard is a single
visible check at the top of the method instead of hidden in a
boolean-and at the call site. That also gives tests a clean seam to
assert on — the pre-fix layout buried the logic inside the 3,000-line
``run_conversation`` function where no focused test could reach it.
The new method encodes three independent skip conditions:
1. ``interrupted`` → skip entirely (the #15218 fix). Applies even
when ``final_response`` and ``original_user_message`` happen to
be populated — an interrupt may have landed between a streamed
reply and the next tool call, so the strings on disk are not
actually the turn the user took away.
2. No memory manager / no final_response / no user message →
preserve existing skip behaviour (nothing new for providerless
sessions, system-initiated refreshes, tool-only turns that never
resolved, etc.).
3. Sync_all / queue_prefetch_all exceptions → swallow. External
memory providers are strictly best-effort; a misconfigured or
offline backend must never block the user from seeing their
response.
The prefetch side-effect is gated on the same interrupt flag: the
user's next message is almost certainly a retry of the same intent,
and a prefetch keyed on the interrupted turn would fire against stale
context.
### Tests (16 new, all passing on py3.11 venv)
``tests/run_agent/test_memory_sync_interrupted.py`` exercises the
helper directly on a bare ``AIAgent`` (``__new__`` pattern that the
interrupt-propagation tests already use). Coverage:
- Interrupted turn with full-looking response → no sync (the fix)
- Interrupted turn with long assistant output → no sync (the interrupt
could have landed mid-stream; strings-on-disk lie)
- Normal completed turn → sync_all + queue_prefetch_all both called
with the right args (regression guard for the positive path)
- No final_response / no user_message / no memory manager → existing
pre-fix skip paths still apply
- sync_all raises → exception swallowed, prefetch still attempted
- queue_prefetch_all raises → exception swallowed after sync succeeded
- 8-case parametrised matrix across (interrupted × final_response ×
original_user_message) asserts sync fires iff interrupted=False AND
both strings are non-empty
Closes#15218
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
_load_auth_store() caught all parse/read exceptions and silently
returned an empty store, making corruption look like a logout with
no diagnostic information and no way to recover the original file.
Now copies the corrupt file to auth.json.corrupt before resetting,
and logs a warning with the exception and backup path.
Extends PR #15171 to also cover the server-side cancellation path (aiohttp
shutdown, request-level timeout) — previously only ConnectionResetError
triggered the incomplete-snapshot write, so cancellations left the store
stuck at the in_progress snapshot written on response.created.
Factors the incomplete-snapshot build into a _persist_incomplete_if_needed()
helper called from both the ConnectionResetError and CancelledError
branches; the CancelledError handler re-raises so cooperative cancellation
semantics are preserved.
Adds two regression tests that drive _write_sse_responses directly (the
TestClient disconnect path races the server handler, which makes the
end-to-end assertion flaky).
_submit_anthropic_pkce() retrieved sess under _oauth_sessions_lock but
wrote back to sess["status"] and sess["error_message"] outside the lock.
A concurrent session GC or cancel could race with these writes, producing
inconsistent session state.
Wrap all 4 sess write sites in _oauth_sessions_lock:
- network exception path (Token exchange failed)
- missing access_token path
- credential save failure path
- success path (approved)
Before this, typing during /compress was accepted by the classic CLI
prompt and landed in the next prompt after compression finished,
effectively consuming a keystroke for a prompt that was about to be
replaced. Wrapping the body in self._busy_command('Compressing
context...') blocks input rendering for the duration, matching the
pattern /skills install and other slow commands already use.
Salvages the useful part of #10303 (@iRonin). The `_compressing` flag
added to run_agent.py in the original PR was dead code (set in 3 spots,
read nowhere — not by cli.py, not by run_agent.py, not by the Ink TUI
which doesn't use _busy_command at all) and was dropped.
When display.busy_input_mode is 'queue', the runner-level PRIORITY block
in _handle_message was still calling running_agent.interrupt() for every
text follow-up to an active session. The adapter-level busy handler
already honors queue mode (commit 9d147f7fd), but this runner-level path
was an unconditional interrupt regardless of config.
Adds a queue-mode branch that queues the follow-up via
_queue_or_replace_pending_event() and returns without interrupting.
Salvages the useful part of #12070 (@knockyai). The config fan-out to
per-platform extra was redundant — runner already loads busy_input_mode
directly via _load_busy_input_mode().
Follow-up to @iRonin's Ctrl+D EOF fix. If the input text is empty but
the user has pending attached images, do nothing rather than exiting —
otherwise a stray Ctrl+D silently discards the attachments.
skill_view response went to the model verbatim; duplicating the SKILL.md
body as raw_content on every tool call added token cost with no agent-facing
benefit. Remove the field and update tests to assert on content only.
The slash/preload caller (agent/skill_commands.py) already falls back to
content when raw_content is absent, and it calls skill_view(preprocess=False)
anyway, so content is already unrendered on that path.