The terminal config->env-var bridge was hand-maintained as three parallel
dict literals (cli.py env_mappings, gateway/run.py _terminal_env_map,
hermes_cli/config.py TERMINAL_CONFIG_ENV_MAP). They drifted: docker_extra_args
and modal_mode were in the shared map but missing from BOTH cli.py and
gateway — so 'hermes config set terminal.docker_extra_args' bridged via
set_config_value but those keys never reached the running process through CLI
or gateway startup. Silent config-does-nothing, the exact bug class the
drift test guards against.
Fix the class by construction: cli.py and gateway/run.py now derive their
bridge from TERMINAL_CONFIG_ENV_MAP (the single source of truth) instead of
duplicating it. cli keeps two documented deltas (legacy env_type alias for
backend; sudo_password credential); gateway maps 1:1.
The drift test (tests/tools/test_terminal_config_env_sync.py) is rewritten off
fragile AST source-parsing — which broke when set_config_value was refactored
to drop its inline _config_to_env_sync literal (8 failures) — onto live
imported state: assert the shared map covers critical keys, every mapped
TERMINAL_* var is consumed by terminal_tool, and all three paths derive from
the shared map. Refactor-proof.
E2E: with the shared-map derivation, 'terminal.docker_extra_args' and
'terminal.modal_mode' now bridge through load_cli_config() (verified they did
NOT before).
set_config_value used to bridge terminal.X keys to env vars via an inline
_config_to_env_sync dict literal. Commit f8adefdeb (fix(tui): apply terminal
backend config before launch) refactored it to call the shared
terminal_config_env_var_for_key() helper, which reads the module-level
TERMINAL_CONFIG_ENV_MAP. The behavior is preserved, but
_save_config_env_sync_keys() parsed set_config_value's source for the now-gone
dict literal and raised 'Could not find _config_to_env_sync = {...} literal in
source', failing 8 tests in the slice.
Read TERMINAL_CONFIG_ENV_MAP directly — the single source of truth the
'hermes config set' path actually consults — instead of parsing a
function-local literal. This is behavior-equivalent (all required docker/
container keys present) and refactor-proof: it tracks the live map rather
than a source-text snapshot.
@file: attachments now work when the desktop is connected to a remote
gateway. Previously a referenced file resolved to a client-disk path the
gateway couldn't see, so context_references rejected it with "path is
outside the allowed workspace" and the agent never saw the file.
Adds a file.attach RPC (sibling to the existing image.attach_bytes /
pdf.attach byte-upload pipeline): the desktop uploads the file bytes, the
gateway stages them into <workspace>/.hermes/desktop-attachments/ and
returns a workspace-relative @file: ref that resolves cleanly. Local mode
passes the path directly; a gateway-visible file outside the workspace is
copied in; an in-workspace file is referenced as-is with no copy.
Consolidates the file-sync design from #38615 (LeonSGP43) and the
host-file-staging idea from #33455 (Carry00), rebased onto the
image/PDF remote-media helpers already on main.
Co-authored-by: LeonSGP43 <cine.dreamer.one@gmail.com>
The Portal's /api/nous/recommended-models endpoint is the source of truth for
which models are free/paid right now, but its result was cached in-process
only. When the live fetch failed (network, parse, non-2xx), the function
returned {} and the model picker silently dropped the free/paid
recommendations — free models would vanish with no indication anything went
wrong.
Add a per-base disk cache at $HERMES_HOME/cache/nous_recommended_cache.json:
a successful live fetch is persisted as last-known-good, and a failed fetch
with an empty in-process cache falls back to the disk copy instead of {}.
Self-heals on the next successful fetch. With no disk copy, still degrades to
{} (callers already handle that). Keyed by portal base URL so staging/prod
don't collide.
E2E: live fetch writes disk; simulated Portal failure returns the cached free
models from disk; no-disk + failure returns {}.
Two new free-tier slugs surfaced in /model and `hermes model`. owl-alpha
was already present. Regenerated website/static/api/model-catalog.json to
keep the manifest sync test green.
The autouse _suppress_concurrent_hermes_gate fixture did
monkeypatch.setattr(main, '_detect_concurrent_hermes_instances', ...) with
no raising=False. Its try/except guards the import but not the setattr, so
under pytest's per-test spawn isolation a transiently partial hermes_cli.main
module (one a concurrent worker is mid-importing) made setattr raise
AttributeError and errored unrelated tests in the slice.
Add raising=False so a transiently-absent attribute is a no-op default rather
than a hard error. The attribute always exists once main.py finishes
importing; the real-function opt-out (@pytest.mark.real_concurrent_gate) is
unaffected.
Follow-up to the salvaged fallback-chain fix:
- Replace the hand-rolled fallback loader with the shared
hermes_cli.fallback_config.get_fallback_chain() helper so the TUI path
matches HermesCLI and gateway/run.py exactly: fallback_providers stays
first and keeps order, with distinct legacy fallback_model entries
merged in after (deduped). Previously the TUI loader picked one key OR
the other, diverging from CLI/gateway when both were set.
- Update the test to assert the merged canonical semantics.
- Add psionic73 to scripts/release.py AUTHOR_MAP (CI gate).
Extend the sidecar and Python adapter to handle `voice` content
alongside `attachment`. Voice notes are inlined as base64 (same
size-cap logic), surfaced as `MessageType.VOICE`, and include an
optional `duration` field in fallback markers when bytes are
unavailable.
Switch `list_users`, `find_user_by_phone`, `create_user`,
`register_user_if_absent`, and `refresh_user_numbers` from the
Dashboard API (Bearer token) to the Spectrum API (Basic auth with
project credentials). Update response unwrapping to handle the nested
`data.users` envelope returned by Spectrum, add `_spectrum_host()`
resolver, `_basic()` header helper, and structured error helpers.
Update tests, docs, and plugin.yaml accordingly.
Store operator and assigned iMessage numbers in `auth.json` after
setup, and surface them in `hermes photon status`. When numbers are
missing, status auto-refreshes from the dashboard without provisioning
new lines.
image_generate returns its artifact as JSON ({"image": "/abs/path.png"})
with no MEDIA: tag, so the gateway auto-append path (which only recognized
text_to_speech MEDIA: tags) never delivered it — image delivery silently
depended on the model restating the path in its reply. Add image_generate to
the producer allowlist and extract the local path from its JSON result
(host_image > image > agent_visible_image), reusing the existing
extension-anchored matcher and history-dedupe so remote URLs, unknown
extensions, failures, and already-sent paths are rejected.
Closes the remaining unfixed path from #19105.
The blanket stdin=subprocess.DEVNULL pass added the kwarg to the docker
'version' preflight call; the test pinned the exact kwargs dict. Update
the expected dict to match.
Regression tests for the salvage follow-up: the interactive 'claude
setup-token' login must keep inherited stdin, and the guard's inline
'noqa: subprocess-stdin' marker must exempt a call.
The blanket DEVNULL pass muzzled run_oauth_setup_token()'s interactive
'claude setup-token' login, which needs inherited stdin to prompt the
user. Revert that one call and replace the guard's brittle file:line
whitelist with an inline 'noqa: subprocess-stdin' marker that travels
with the code.
scripts/check_subprocess_stdin.py scans agent/, tools/, plugins/, and
tui_gateway/ for subprocess.run() and subprocess.Popen() calls that
don't explicitly set stdin=. Missing stdin= means the child inherits the
parent's fd, which in TUI mode is the JSON-RPC pipe — causing gateway
crashes on stdin EOF.
Exits 0 (pass) or 1 (violations found). Can be run manually or added to
CI. Skips comments, docstring references, and calls that use input= (which
creates its own pipe).
Usage: python scripts/check_subprocess_stdin.py
When Hermes runs in TUI mode, the gateway child process communicates with
the Node.js parent over a JSON-RPC protocol on stdin. Subprocess calls that
inherit this stdin fd can trigger a race condition where the child's stdin
read returns EOF, causing the gateway to exit cleanly (exit code 0) mid-tool-
execution.
This is the same root cause as issue #14036 (byterover plugin) and PR #39257
(SSH environment backend). This commit applies the fix — stdin=subprocess.DEVNULL
— to all 85 subprocess.run() and subprocess.Popen() calls that execute inside
the TUI gateway child process.
Scope: TUI-context code only (agent/, tools/, plugins/, tui_gateway/server.py).
CLI code (cli.py, hermes_cli/), tests, scripts, and gateway process management
are excluded — they don't run inside the TUI child and inherit the terminal's
stdin, not the JSON-RPC pipe.
85 call sites across 28 files. All files pass syntax check.
hermes update pulls the latest repo, so the freshly-pulled
website/static/api/model-catalog.json is already the newest catalog. Copy
it straight over ~/.hermes/cache/model_catalog.json instead of relying on a
network fetch (which can be Vercel bot-gated or hit a Portal hiccup and
silently degrade the picker to a stale/short list).
Adds seed_cache_from_checkout() in model_catalog.py (read shipped manifest,
validate, atomic write via _write_disk_cache, reset in-process cache) and
calls it from both update paths in main.py: _cmd_update_impl (git pull) and
_update_via_zip (Docker/no-git). Non-fatal on missing/malformed/invalid
files — the normal network refresh still applies on next picker open.
The desktop code uses Array.prototype.findLast (chat/composer/index.tsx) and
findLastIndex (session/hooks/use-session-actions.ts), which are ES2023 APIs,
but tsconfig declared only the ES2022 lib. Some TypeScript builds tolerate this,
but a correct/stricter tsc fails the desktop build with:
TS2550: Property 'findLast' does not exist on type 'ChatMessage[]'.
Do you need to change your target library? Try changing 'lib' to 'es2023'.
Declare es2023 so the build is correct regardless of the resolved TypeScript
version (reported on Windows with Node 24).
Refs #38970
Terminal tool progress on markdown-capable gateways (Telegram, Slack,
Discord, WhatsApp, Matrix, Weixin, Feishu) renders the full command in a
fenced code block again, in all/new AND verbose modes — gated on the
adapter's supports_code_blocks capability. Plain-text platforms keep the
short truncated preview.
No language tag is emitted: Slack mrkdwn renders a '```bash' fence with
'bash' as a literal first code line, so a bare '```' fence is used, which
renders correctly on every platform that supports blocks.
This restores the #41215 feature (removed in #41950 due to the command
showing in group chats) as the default. For a personal assistant the
command display is desired; the group-chat concern is a preference, not a
vulnerability.
Drop `replyTo` from all outbound send paths and update the `/typing`
endpoint to use the documented `typing("start" | "stop")` content
builder. Adds a `stop_typing` method on the adapter to pair with
`send_typing`.
Replace raw `{ replyTo }` send options with the `spectrumReply` content
builder from spectrum-ts, which is the correct API for threading
replies.
Adds `maybeReplyContent` helper with graceful fallback to normal send
when
the reply target cannot be resolved.
Allow PHOTON_HOME_CHANNEL to accept a bare E.164 phone number or a
`any;-;+1...` DM chat GUID in addition to a Spectrum space id. Inbound
DM spaces are cached so replies resolve without a second SDK lookup,
and `photon` is added to _PHONE_PLATFORMS so send_message treats E.164
strings as explicit targets rather than falling through to channel-name
resolution.
During `hermes photon setup`, allowlist the operator's number and set
their DM as the cron home channel when those env vars are unset. Without
this, the gateway denies the operator's own messages and cron has no
default delivery target. Re-runs never overwrite hand-tuned values.
Also teaches the sidecar's `resolveSpace` to accept a bare E.164 number
as a space identifier, resolving it to the user's DM space so
`PHOTON_HOME_CHANNEL` can be set to a phone number instead of an opaque
space id.
On shared-number plans, `/lines` has no dedicated entry, so the
`assignedPhoneNumber` field on the user object is the source of truth
for which number to text the agent. Fall back to the line inventory
only when no per-user assignment exists.
Make Photon iMessage a first-class persistent-connection channel like
Discord/Slack, using the spectrum-ts gRPC stream for both directions.
- Inbound: the sidecar forwards the SDK's app.messages gRPC stream to the
adapter over a loopback GET /inbound (NDJSON) instead of webhooks. Drops
the aiohttp webhook server, HMAC signature verification, public URL, and
PHOTON_WEBHOOK_* config; adapter reconnects with backoff.
- Management plane: device login uses client_id=photon-cli against the
single dashboard host (Bearer), matching the official photon-hq/cli;
find-or-create "Hermes Agent" project, enable Spectrum, rotate secret,
register user (with phone dedup), surface the assigned iMessage line.
- SDK projectId is the project's spectrumProjectId, not the dashboard id;
runtime creds persist to ~/.hermes/.env like every other channel.
- CLI: 6-step setup, webhook subcommands removed.
- Tests/docs updated for the gRPC flow; sidecar pins spectrum-ts ^1.17.1.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Salvage of PR #27978 cherry-picked onto current main, resolving conflicts
with main's intervening SimpleX plugin fixes (resp-envelope normalization,
health-monitor reconnect-churn fix, bare-form DM addressing).
What's new:
- Group support via SIMPLEX_GROUP_ALLOWED (comma-separated IDs or '*');
inbound items surface chat_id=group:<id> + chat_type=group. Disabled by
default so a bot in a group doesn't process every member's traffic.
- Inbound files/voice via rcvFileDescrReady (immediate /freceive) deferred
through _pending_file_transfers, replayed on rcvFileComplete. Voice notes
-> MessageType.VOICE.
- Native outbound media: send_image (PNG/JPEG + inline thumbnail), send_voice
(msgContent.type=voice), send_video, send_document. All addressed by numeric
ID via /_send ... json [...].
- MEDIA:<path> tags in agent replies stripped and dispatched as voice/document.
- Text-burst batching (HERMES_SIMPLEX_TEXT_BATCH_DELAY, default 0.8s).
- Auto-accept contact requests (SIMPLEX_AUTO_ACCEPT, default true).
- Group send path uses structured /_send #<id> json form (the bracket
#[<id>] form is parsed as display-name lookup and silently drops).
plugin.yaml bumped to 1.1.0; docs updated. All inside plugins/platforms/simplex/
- no core edits.
Co-authored-by: Juraj Bednar <juraj@bednar.io>
* fix(cli): persist custom --portal-url to .env on dashboard register
`hermes dashboard register --portal-url <url>` resolved the custom portal
for the registration request but only persisted it to .env when the var was
absent AND non-default. So a user who re-registered against a different
portal (e.g. switching preview deploys) silently kept the stale
HERMES_DASHBOARD_PORTAL_URL, and an explicit request for the production
portal was never written at all.
Track whether a custom portal was *explicitly supplied* (--portal-url flag
or HERMES_DASHBOARD_PORTAL_URL env), separately from the resolved value:
- explicit custom URL -> always persist (update in place via
save_env_value, which overwrites the matching key rather than appending
a duplicate), even when it equals the production default; no-op when it
already matches.
- no custom URL supplied -> unchanged conservative behaviour: only write an
inferred portal when absent and non-default; never alter an existing
entry unexpectedly.
save_env_value already preserves other lines/comments and dedups in place;
this only changes the decision of *when* to call it.
Adds TestCustomPortalPersistence covering all four cases.
Co-authored-by: Hermes Agent <agent@nousresearch.com>
* feat(cli): persist dashboard public URL from --redirect-uri on register
When the user registers a publicly-exposed dashboard with --redirect-uri
(the full OAuth callback, e.g. https://hermes.example.com/auth/callback),
derive its origin and persist it as HERMES_DASHBOARD_PUBLIC_URL — the env var
the dashboard auth layer actually consumes at serve time.
dashboard_auth/routes._redirect_uri reconstructs the callback as
HERMES_DASHBOARD_PUBLIC_URL + "/auth/callback" (verbatim), and
dashboard_auth/prefix.resolve_public_url reads that var (then config.yaml
dashboard.public_url) to decide the public origin. Previously --redirect-uri
was sent to the portal at registration but never persisted, so the operator
had to set HERMES_DASHBOARD_PUBLIC_URL by hand for the login gate to engage
and the callback to round-trip. We now wire it automatically.
Persist the ORIGIN (scheme://host[:port]), not the full callback path —
persisting the raw redirect would double the path when the runtime appends
/auth/callback. Mirrors the portal-url persistence semantics already in this
PR: always write an explicitly-derived value (updating in place, no
duplicate), no-op when it already matches, never written on a localhost-only
install (no --redirect-uri), and skipped for a non-http(s)/malformed redirect.
Verified end-to-end: cmd_dashboard_register writes the origin to .env, then
resolve_public_url() reads it back and public_url + /auth/callback
reconstructs exactly the originally-supplied --redirect-uri.
Adds TestPublicUrlPersistence (8 cases) incl. origin-derivation, port
preservation, update-in-place, no-op, no-flag, non-http skip, and
both-portal-and-public-url-persisted.
Co-authored-by: Hermes Agent <agent@nousresearch.com>
---------
Co-authored-by: Hermes Agent <agent@nousresearch.com>
Re-running `hermes dashboard register` now updates the existing dashboard
record in nous-account-service instead of creating a duplicate.
The stable key is the client_id this install already persisted in
HERMES_DASHBOARD_OAUTH_CLIENT_ID on a prior run:
- No stored client_id -> first registration -> create a fresh client with an
auto-generated name (unchanged behavior).
- Stored client_id present -> re-send it as `client_id` so the portal updates
that row in place. Without an explicit --name, the name is omitted so the
portal-stored name isn't churned to a new random value on every re-run.
- Prints "Updated dashboard" vs "Registered dashboard" based on whether the
portal echoed back the same client_id. A stale/deleted id safely falls
through to a fresh create server-side.
Requires the matching nous-account-service change (POST
/api/oauth/self-hosted-client accepting an optional client_id + optional name).
Tests: 7 new TestIdempotentRerun cases (key sent, name preserved/overridden,
Updated message, persisted id, stale-id fall-through, blank-id first-run);
existing create-path tests unchanged (23 pass).
The raw-mode teardown path (rawModeEnabledCount -> 0) disabled
modifyOtherKeys, kitty keyboard, focus reporting, and bracketed paste,
then dropped raw mode and detached the readable listener -- but left DEC
mouse tracking (1000/1002/1003/1006) asserted. With raw mode off and no
reader attached, the terminal falls back to cooked-mode echo, so every
mouse move emits a hover report (DEC 1003) that prints as literal text:
a flood of '35;col;row M' shards over the prompt in a long session.
handleSuspend() already guards against exactly this (it writes
DISABLE_MOUSE_TRACKING before SIGSTOP); the ordinary teardown path
missed the same guard. Add DISABLE_MOUSE_TRACKING to the teardown, and
re-assert tracking on raw-mode re-entry (via the Ink instance's
reassertTerminalModes, which is gated on altScreenActive and idempotent)
so a transient drop->re-add round-trips cleanly instead of silently
leaving the mouse dead.
Adds a regression test driving a real Ink mount: the last raw-mode
consumer detaching must emit DISABLE_MOUSE_TRACKING.
Reported via a community bug report.
System messages (slash-command output like /debug, plus the generic
system-message fallback) were rendered as plain text, so the uploaded
paste.rs URLs in a debug report were neither clickable nor easily
copyable.
Route both through LinkifiedText so URLs become real <a> links (open
externally via the desktop bridge, selectable/copyable text). Add an
opt-in explicitOnly mode that matches only explicit http(s):// / www.
URLs, used here so filename-shaped tokens in the report (agent.log,
errors.log, gateway.log) aren't mistaken for bare domains and linkified.
Bare-domain matching is preserved for all other LinkifiedText callers.
Adds regression tests covering explicitOnly (links only real URLs, keeps
.log filenames as text) and the default bare-domain behavior.
The existing #33961 tests mock _prompt_text_input away, so they only assert
modal-vs-stdin routing — they cannot observe the actual hang. Add a guard
class that drives the real helper chain with a blocking input() on a win32
daemon thread and asserts the worker never hangs. Fails on the pre-#33961
code (win32 -> _prompt_text_input -> off-main input() -> deadlock), passes
on the modal path. Also covers the scheduling-failure degraded branch
(must clean-cancel to None, never call input()).
The four win32 tests asserted the old deadlocking behavior (win32 -> raw
input()). Rewrite them to the corrected contract: native Windows uses the
modal via the app loop, and stdin is kept only for the safe no-app /
scheduling-failure cases. Consolidate three near-identical daemon-thread
tests into one parametrized (linux/win32) test behind a shared _run_on_daemon
harness, and drop dead code from the old main-thread test.
Refs #33961
Native Windows bypassed the destructive-slash modal and fell back to a raw
input() prompt. When the confirm was triggered from the process_loop daemon
thread (the normal case), that input() deadlocked against prompt_toolkit's
main-thread stdin ownership: bare /reset froze with Ctrl-C swallowed, while
/reset now worked only because it skips the prompt. Route native Windows
through the existing call_soon_threadsafe modal path (the same key-binding
channel that already handles normal typing on Windows); keep the stdin
fallback only for the safe no-app / scheduling-failure cases, and clean-cancel
(None) off the main thread on win32 so a degraded path never re-deadlocks.
Addresses #33961
Refs #30768
When edit_message(finalize=True) fails with a MarkdownV2 parse error,
the silent fallback previously sent raw content with escape sequences.
Now it logs the error and strips markdown formatting via _strip_mdv2()
for clean plain-text fallback.
Also fixes _strip_mdv2 to handle standard markdown bold (\*\*text\*\*)
before MarkdownV2 bold (\*text\*), preventing half-stripped asterisks.
Refs: #41955, #41732
When a platform adapter sets REQUIRES_EDIT_FINALIZE=True (e.g.
TelegramAdapter), tool progress edits now pass finalize=True so
format_message() is applied before sending to the platform.
Previously, the initial send() formatted the message correctly via
MarkdownV2, but subsequent edit_message() calls skipped formatting
(finalize=False), causing raw markdown (e.g. triple backticks for
bash code blocks) to render as plain text on Telegram.
Refs: #41955, #41732
Collapse the bare-"custom" allowlist entry and the custom:<name> guard into
a single provider_accepts_vendor_slug predicate so the slug-warning suppression
reads as one rule instead of two scattered conditions. No behavior change.
#41215 rendered a terminal tool call as a native ```bash fenced block on
markdown platforms (Telegram, WhatsApp, Slack, and others), showing the full
command with no truncation, in both all/new and verbose modes. That posted
complete shell commands (heredocs, internal paths, destructive commands) into
the chat before the final answer, visible to everyone in it.
This restores the prior behavior: terminal progress shows the short, truncated
preview line that every other tool already uses, capped at tool_preview_length.
The supports_code_blocks capability flag is left in place for future use.
CLI/TUI rendering is a separate path and was unaffected.
Adds a regression test asserting terminal progress renders as a truncated
preview, not a fenced bash block, even on a markdown-capable gateway.
Fixes#41955
Photon now allowlists registered device clients on the device-code
endpoint; the old client_id "hermes-agent" is rejected with
400 invalid_client, breaking the entire login flow. Switch to Photon's
published "photon-cli" device client and send the standard scope.
Also validate the device-flow token against /api/auth/get-session and
/api/projects/ before persisting it, and extract token candidates from
every response shape Photon has used (access_token, accessToken,
data.*, set-auth-token header) so a token that authenticates the
session lookup but is rejected by the project API fails loudly at
login instead of 404ing downstream.
Verified live: request_device_code() now returns 200 + a valid
user_code where "hermes-agent" returned 400 invalid_client.
Salvaged from #34467 by @yanxue06.
Photon now exposes attachment send (Ray Sun, photon-nousresearch), so
the Photon plugin gains outbound media to match the BlueBubbles iMessage
channel.
- sidecar: new /send-attachment endpoint wrapping space.send(attachment())
/ space.send(voice()); caption sent as a trailing text bubble.
- adapter: override send_image/send_image_file/send_voice/send_video/
send_document/send_animation. URL helpers cache to a local path first
(cache_image_from_url), file helpers pass through. Defense-in-depth
path re-validation before the path reaches the Node sidecar.
- _standalone_send (cron): send text first, then each media_file as a
/send-attachment call (is_voice -> voice builder).
- docs/README: flip the 'outbound attachments not wired' note.
The Skills Hub lost every api.github.com-backed source — the OpenAI,
Anthropic, HuggingFace, NVIDIA, gstack, Claude Marketplace and Well-Known
tabs all vanished — while ClawHub/skills.sh/LobeHub/browse.sh survived. A
GitHub API rate limit during the docs-deploy crawl zeroed all three
api.github.com sources (github / claude-marketplace / well-known) at once.
Two compounding bugs let the broken index reach the live site:
1. build_skills_index.py wrote the output file BEFORE the health check, so
even when the github floor (30) tripped and the script exited 2, the
degenerate file was already on disk. deploy-site.yml then swallowed the
exit code with `|| echo non-fatal` and extract-skills.py read the partial
index. Fix: run the health check first, write the file only when healthy,
exit without writing on failure. Removed the non-fatal swallow in
deploy-site.yml so a collapse fails the deploy and the last good site
stays live (Pages serves the previous build).
2. The build-time GitHub listing path returned [] on a 403 rate-limit without
retrying or flagging it, so a rate-limited crawl looked identical to an
empty source. Fix: a shared _github_get() helper on GitHubSource with
retry/backoff (honors Retry-After / X-RateLimit-Reset on 403/429, backs
off on 5xx + transport errors) and flags is_rate_limited. Routed
_list_skills_in_repo and _fetch_file_content through it; gave
ClaudeMarketplaceSource a persistent GitHubSource + is_rate_limited so the
builder can name the rate limit as the cause instead of '0 results'.
Added tests/scripts/test_build_skills_index_health.py pinning both contracts:
a degenerate crawl exits non-zero and writes no file; a healthy crawl writes
the index with github/claude-marketplace/well-known all present.
The chat transcript reaches the screen through a requestAnimationFrame-gated
flush (useSessionStateCache). The main BrowserWindow never set
backgroundThrottling, so Chromium paused rAF and clamped timers whenever the
window was blurred or occluded -- the live answer would stall until the window
regained focus or the user refreshed. In practice this bit any time Hermes
wasn't the focused window mid-turn (typing in your editor while the agent
replies, detached devtools, another window on top), presenting as "thinking,
no text, have to refresh."
Opt the renderer out of background throttling so a streaming chat app actually
streams in the background:
- backgroundThrottling: false on the main window (matches the secondary
windows that already set it)
- disable-renderer-backgrounding / disable-backgrounding-occluded-windows /
disable-background-timer-throttling at the process level for the
occlusion case
Latent since the desktop app landed (#20059), not a recent regression.
The test keyed the 'which call raises' decision on a shared invocation
counter (first call → raise, second → success), then asserted the error
landed in messages[0] (c1) and success in messages[1] (c2). But
_execute_tool_calls_concurrent runs the two web_search calls on a thread
pool with no ordering guarantee — c2's handler can be invoked first, take
the 'first call raises' branch, and the error ends up in messages[1].
Results are ordered by tool_call_id, so messages[0] (c1) was then 'success'
and the assertion failed.
It passed in isolation but reliably failed under CI's full parallel slice
(8 xdist workers) where the scheduler actually interleaves the two handlers.
Fix: tie the raise to a specific tool call via its arguments (q=boom raises,
q=ok succeeds) instead of invocation order, and assert tool_call_id ↔ content
pairing explicitly. Deterministic regardless of thread scheduling — verified
10/10 in isolation and the full TestConcurrentToolExecution class (32) green.
The parallel test runner sharded a present, tracked test file
(tests/plugins/platforms/photon/test_inbound.py) onto a slice that then
reported 'file or directory not found' (pytest exit 4) at exec time —
even though the planner had just enumerated the file via --collect-only
('5269 passed, 0 failed' in the same run). On loaded shared CI runners
the per-file subprocess can fail to stat a file the planner already saw;
the deterministic LPT slicer then reproduces it on every rerun because
the same file set lands on the same shard.
Fix: when a per-file run exits 4 AND the file still exists on disk, retry
the subprocess once before surfacing it as a hard failure. This kills the
shard-flake class for everyone, not just this PR.
Does NOT widen the exit-5-is-pass rule — exit 4 on a genuinely missing
file still fails (verified). Retry reuses the same pgroup-kill cleanup as
the primary run so no grandchildren orphan.
Validation: photon dir runs green through scripts/run_tests_parallel.py;
unit-level negative case confirms a nonexistent file still returns rc=4.
Adds the last missing parity piece vs the established channels: group
chats can be made opt-in via a mention wake word, exactly like the
BlueBubbles iMessage channel.
- require_mention + mention_patterns, read from config.extra (config.yaml
via the generic gateway bridge) or PHOTON_REQUIRE_MENTION /
PHOTON_MENTION_PATTERNS env vars. Same shapes BlueBubbles accepts
(list / JSON / comma / newline), same default Hermes wake words.
- _dispatch_inbound drops unmatched group messages and strips the leading
wake word from matched ones; DMs are never gated.
- plugin.yaml + docs document both knobs and the config.yaml form.
- New test_mention_gating.py (8 tests): default-off, group drop/pass,
wake-word strip, DM bypass, custom patterns, env comma-list, invalid
regex skip.
The config.yaml -> extra bridge needed no core change — the generic
shared-key loop in gateway/config.py already iterates plugin platforms
(_shared_loop_targets += plugin_entries()), so require_mention /
mention_patterns flow through automatically.
Note: outbound media is the one capability Photon still can't reach —
Photon exposes no HTTP send-attachment endpoint yet (documented API
limitation), so the sidecar can't send files. Not faked.
Validation: 34/34 photon tests; E2E confirms config.yaml require_mention
+ custom mention_patterns bridge through load_gateway_config into a live
adapter and gate/strip correctly.
Brings Photon in line with how every other Hermes gateway channel
behaves, instead of being a one-off with its own surfaces.
- gateway setup: register a `setup_fn` so Photon appears in
`hermes gateway setup` (the unified wizard) and runs the same
device-login + project + user + sidecar flow as `hermes photon setup`.
Adds `cli.gateway_setup()` as the zero-arg entry point.
- PII redaction: flip `pii_safe` False -> True. The comment already
said iMessage E.164 numbers should be redacted; the value contradicted
it. Now matches BlueBubbles (the other iMessage channel) which is in
_PII_SAFE_PLATFORMS — phone numbers are stripped before reaching the LLM.
- Pairing/authz: already worked via the registry's allowed_users_env /
allow_all_env generic path in authz_mixin; documented it. The adapter
forwards unauthorized DMs to the gateway (no intake gating), so the
pairing handshake fires and `hermes pairing approve photon <CODE>` works.
- Docs: fixed the `hermes photon status` output block to match the real
labels (project key / webhook key, not project secret / webhook secret),
added the missing PHOTON_API_HOST / PHOTON_DASHBOARD_HOST /
PHOTON_HOME_CHANNEL_NAME env vars, and added gateway-setup +
authorize-users sections mirroring the other channel docs.
Validation: 26/26 photon tests, 6504/6504 gateway+plugins tests, registry
E2E confirms setup_fn dispatch + pii_safe + authz envs all wired.
Every other Hermes gateway channel onboards through a single setup
surface (paste a token / run the wizard) with no per-platform login
command. Photon's device-code flow is unavoidable because Photon mints
credentials via API rather than a copy-paste dashboard field, but
exposing it as a top-level `hermes photon login` verb broke channel
parity.
- Remove the `login` subcommand; setup already runs the device flow as
its first step. `--no-browser` moves onto `setup`.
- Rename `_cmd_login` -> `_run_device_login` (internal helper).
- Status / credential-summary hints now point at `hermes photon setup`.
- README updated to the one-command onboarding flow.
The advisory lint-diff bot flagged 17 new ty diagnostics. 6 are
`unresolved-import` for httpx/aiohttp/pytest, which is structural
(CI lint env has no project deps) and matches every other platform
plugin's noise floor. The remaining 11 are real and fixable:
- `Optional[callable]` → `Optional[Callable[..., None]]` (auth.py)
invalid-type-form on `callable` as a type expression. Added the
proper `typing.Callable` import. Two sites: on_pending in
poll_for_token, on_user_code in login_device_flow.
- Dropped three unused `# type: ignore` comments on
hermes_constants / hermes_cli.config imports — ty can resolve
those modules fine, the comments were dead.
- _supervise_sidecar(proc) widened `proc.stdout` from
`IO[Any] | None` to a narrowed local after an early `is None`
guard. Defensive against subprocesses launched without
stdout=PIPE.
- cli.py _cmd_setup: dropped the `has_existing_project = bool(...)`
intermediate, did the narrowing inline with `if existing_id and
existing_secret:` so ty can see project_id/project_secret are
non-None when create_user is called.
- test_inbound.py: replaced three `adapter.handle_message =
fake_handle # type: ignore[assignment]` with
`monkeypatch.setattr(adapter, 'handle_message', fake_handle)`.
Same behavior, no type-ignore, and the monkeypatch reverts
cleanly between tests.
Validation:
ty check plugins/platforms/photon/ tests/plugins/platforms/photon/
→ All checks passed!
tests/plugins/platforms/photon/ → 26/26 pass
py_compile clean
Windows footgun checker → 0 footguns
CodeQL ignored the # lgtm[...] suppressions on default-config hosted
scans — same three high-severity false positives stayed open at
auth.py:461-463.
Last code-level attempt: drop the per-line emit() calls in favor of
- reading every credential into a tight prelude block that resolves
each to a display literal in a dict-typed local
- assembling the full 6-line banner as a list of plain strings
- calling emit() ONCE with '\\n'.join(rows)
CodeQL's flow tracker often gives up at the dict-literal + str-concat
+ list-join boundary because it has to track taint through index
access AND string concatenation AND join. Worth one more shot before
asking for an admin dismissal.
Output is byte-identical; live smoke confirms the same status table
renders. 26/26 photon tests still pass.
If CodeQL still flags this on the next scan, the architecture is as
clean as it can get without obfuscation and the right call is to
dismiss the three alerts as false positives in the Security tab
(documented escape valve for this rule).
After four iterations the taint flow finally settled on auth.py's
print_credential_summary, which emits four lines like
`emit(f" device token : {_present_token()}")`. The
`_present_*()` closures collapse credentials into display literals
("✓ stored" / "✗ missing") before the f-string evaluation, so no
secret bytes ever reach emit() — but CodeQL's interprocedural taint
tracker can't see through the closure-then-literal-return pattern
and keeps flagging the four lines.
This is the appropriate place for an inline suppression:
- auth.py is the only module that legitimately handles the secret;
every other surface (cli.py, adapter.py, tests) routes through
these helpers and stays clear of taint.
- The four lines are physically the boundary between
credential-reading code and a display callback. Without the
`emit(...)` calls there is no status command.
- The suppression is per-line with a comment explaining the
misfire pattern so a future maintainer can see the reasoning
without git-archaeology.
If GitHub's hosted CodeQL doesn't honor # lgtm comments on default-
config scans we'll need to dismiss these as false positives in the
Security tab once — that's the standard escape valve for this rule.
Validation:
tests/plugins/platforms/photon/ → 26/26 pass
py_compile clean
The previous pass moved credential reads into auth.credential_summary()
which returned a dict of pre-formatted display strings. CodeQL's
interprocedural taint analysis still flagged the cli.py prints because
the dict's values were transitively derived from load_photon_token()
and load_project_credentials().
Pattern that finally works: same as persist_webhook_signing_secret —
the helper takes an emit callback and does the formatting + emitting
itself. cli.py passes `print` as the sink and never receives any
return value derived from credential reads. CodeQL's flow stops at
the helper's emit() boundary.
Changes:
- auth.print_credential_summary(emit=print) — closure-scoped probes,
emits 6 lines (header + separator + 4 credential rows) via the
callback. Returns None.
- cli._cmd_status now calls print_credential_summary(print) then
appends the two non-credential rows (node binary, sidecar deps)
locally with no credential flow.
- Added test_print_credential_summary_emits_only_display_strings
asserting the emit callback never sees raw token/secret bytes.
Validation:
tests/plugins/platforms/photon/ → 26/26 pass
live smoke: hermes photon status (with empty HERMES_HOME) renders
the expected layout cleanly
CodeQL was still flagging three taint-flow alerts in cli.py — its
flow tracker keeps spreading the 'sensitive' label through every
variable that even touched a credential-returning function, including
'has_token = bool(load_photon_token())' and the redacted-response
dict returned by persist_webhook_signing_secret.
Refactor:
1. cli.py _cmd_status now calls a new auth.credential_summary() that
returns a {key: pre-formatted display string} dict. All probes +
bool checks happen inside the helper. cli.py never sees a token
or secret variable, only literals like '✓ stored' / '✗ missing'.
2. persist_webhook_signing_secret(webhook_data, *, on_summary=print)
now owns the formatting + writing + status messages. It returns
only a bool. The redacted-response JSON dump + 'saved to <path>'
confirmation are emitted via the on_summary callback, so cli.py
passes as the sink and never receives the path/dict back.
cli.py is now mechanical: register_webhook → persist (with print)
→ return 0/1. Zero credential-tainted variables in cli.py at all.
3. Tests updated for the new signatures and a credential_summary
guard added (the helper must never leak raw token/secret bytes
into its return strings).
Validation:
tests/plugins/platforms/photon/ → 25/25 pass
scripts/check-windows-footguns.py --all → 0 footguns
py_compile clean
Down to 4 CodeQL alerts after the last pass; all addressed:
cli.py:215 (clear-text-logging-sensitive-data)
The status banner literal 'project secret : ✓ stored' tripped
CodeQL's variable-name heuristic even though only a boolean was
interpolated. Renamed the column labels to 'project key' and
'webhook key' — fields contain only ✓ stored / ✗ missing / ⚠ unset
literals now, the word 'secret' is no longer in the source.
cli.py:283 (clear-text-logging-sensitive-data)
The fallback path for register-webhook used to echo
'PHOTON_WEBHOOK_SECRET=<value>' to stdout when the .env write
failed. Removed entirely — there is no scenario where we should
print the secret. On failure we now tell the user to fix the .env
permissions and re-register (after deleting the orphaned webhook
from the Photon dashboard).
cli.py:354 (clear-text-storage-sensitive-data) +
cli.py:276 (clear-text-logging-sensitive-data)
Replaced the hand-rolled .env writer in cli.py with the canonical
hermes_cli.config.save_env_value helper that every other API-key
persistence path uses (OpenAI key, Anthropic, Telegram, ...).
Moved the persist logic into auth.py as
persist_webhook_signing_secret(webhook_data) so the signing-secret
value never gets bound to a local in cli.py at all — cli.py hands
the raw API response straight to the helper and receives back only
the path + a redacted copy of the response for display. This both
matches project convention and removes the taint flow CodeQL was
tracking.
Bonus cleanup:
- dropped unused 'from typing import Any, Optional' in cli.py
- added 2 tests covering persist_webhook_signing_secret (writes
env successfully + returns redacted copy + no-secret-no-write)
Validation:
tests/plugins/platforms/photon/ → 24/24 pass
scripts/check-windows-footguns.py --all → 0 footguns
py_compile on all photon modules → clean
CI red on three blocking checks; all addressed:
1. Windows footguns: os.killpg() flagged as POSIX-only despite the
sys.platform != 'win32' guard. Static scanner doesn't see flow.
Added the documented '# windows-footgun: ok' suppression.
2. test (3): tests/plugins/platforms/photon/__init__.py shadowed the
real plugin's __init__.py because test_plugin_platform_interface.py
looks at PROJECT_ROOT/plugins/platforms/<name>/__init__.py with
PROJECT_ROOT=tests/ (pre-existing bug in that test, made visible
by the new test directory layout). Dropping the empty test
__init__.py restores the prior NOTSET parametrize behavior.
3. CodeQL (7 alerts in new code):
- cli.py: stop printing the first 8 chars of the bearer token after
login — even prefixes are partial credentials.
- cli.py: stop printing the first 8 chars of project_secret after
setup, same reason.
- cli.py 'hermes photon webhook register': stop dumping the raw
register-webhook response (contained signingSecret) and stop
echoing PHOTON_WEBHOOK_SECRET to stdout. Write it directly to
~/.hermes/.env (0o600), preserving existing entries; fall back
to manual instructions only if the file write fails. Photon
still only returns the secret once; this just doesn't put it
in scrollback / shell history.
- cli.py setup + status: rename project_id/project_secret/token
locals to has_* booleans before printing, breaking CodeQL's
taint flow through f-string interpolations. Drop diagnostic
prints of phone / assignedPhoneNumber that flagged as
'sensitive data' false positives.
- sidecar/index.mjs: stop returning the raw error message
(potentially containing stack trace) in HTTP 500 responses;
supervisor logs the real error to stderr, client only sees
a generic 'internal sidecar error'.
Validation:
- scripts/check-windows-footguns.py --all → 0 footguns (518 files)
- tests/plugins/platforms/photon/ → 22/22 pass
- tests/gateway/test_plugin_platform_interface.py → 7/7 pass, collects
NOTSET (matches pre-PR state)
- tests/gateway/test_platform_registry.py → 50/50 pass
- node --check sidecar/index.mjs clean
First-class iMessage support via Photon's managed Spectrum platform.
Targeted as a successor to the BlueBubbles adapter — Photon allocates
the iMessage line, handles delivery, and abuse-prevention so users
don't have to run their own Mac relay. Free tier uses Photon's shared
line pool.
Architecture:
- Inbound: signed JSON webhooks (X-Spectrum-Signature, HMAC-SHA256)
delivered to a local aiohttp listener. Dedupes on message.id,
rejects deliveries with >5min timestamp drift.
- Outbound: small supervised Node sidecar that runs the spectrum-ts
SDK. Photon does not currently expose a public HTTP send-message
endpoint; the sidecar is the only way to call Space.send() today.
When Photon ships an HTTP send endpoint we collapse the sidecar
into _sidecar_send and drop the Node dep — every other layer of
the plugin stays the same.
- Setup: 'hermes photon login' runs the RFC 8628 device-code flow;
'hermes photon setup' creates a Spectrum-enabled project, creates
a shared user (free tier), installs the sidecar's npm deps.
- Webhook management: 'hermes photon webhook register|list|delete'.
- Credentials persisted under credential_pool.photon /
credential_pool.photon_project in ~/.hermes/auth.json.
Plugin path (not built-in) — per current policy (May 2026), all new
platforms ship under plugins/platforms/. Registers itself via
ctx.register_platform() + ctx.register_cli_command(), zero edits to
core gateway code.
Tests cover:
- HMAC-SHA256 signature verification (happy path, tampered body,
wrong secret, drift, missing v0 prefix, empty inputs, non-integer
timestamp)
- Inbound dispatch for text DMs, group ids (any;+;...), and
attachment metadata markers
- Deduplication window
- check_requirements gating when Node is absent
- Device-code flow: request, header-based token return,
body-fallback token return, access_denied propagation
- Project/user/webhook API clients with mocked httpx
Known limitations (current Photon API):
- Attachments are metadata only — no download URL yet
- Outbound attachment send not wired (sidecar can add easily)
- Reactions / message effects not exposed yet
Docs: website/docs/user-guide/messaging/photon.md + sidebar entry.
#42178 dropped every session-scoped gateway event that arrived without an
explicit session_id, to stop background activity attaching to the focused
chat. But the gateway already stamps background sessions with their own id, so
an unscoped message/reasoning/tool/prompt event can only be the focused turn's
own output. Dropping those swallowed the live answer — it reappeared only after
a transcript refetch (manual refresh).
Narrow the guard to subagent.* (the only genuinely background/async family);
everything else falls back to the active session as before.
A bare `git fetch origin` (and `git fetch upstream`) pulls every ref. The
repo carries thousands of auto-generated branches, so on any
non-single-branch checkout the installer's update path and `hermes update`
spend minutes downloading the full branch list — long enough to stall the
desktop installer or trip the follow-up `git pull --ff-only`.
Scope every update-path fetch to the branch we actually compare/merge
against:
- scripts/install.sh: collapse the remote to single-branch and fetch only
$BRANCH on the "existing install, updating" path.
- hermes_cli/main.py: fetch the resolved branch in the apply path, the
--check path (upstream + origin), and the fork upstream-sync.
Tracking-ref updates still happen via git's opportunistic refspec, so the
later origin/<branch> rev-parse/rev-list checks are unaffected.
Tests assert the apply-path fetch is branch-scoped and never bare.
The anthropic extra pinned anthropic==0.86.0 while LAZY_DEPS['provider.anthropic']
pins 0.87.0 (CVE-2026-34450, CVE-2026-34452) — the same drift class as the
aiohttp #31817 downgrade. On hermes update the extra pin won and rolled
anthropic 0.87.0 -> 0.86.0, reopening both CVEs until the native-Anthropic
lazy refresh re-bumped it.
Bump the extra to 0.87.0, regenerate uv.lock, and generalize the regression
guard: test_pyproject_pins_match_lazy_deps_pins now fails if ANY package
pinned in both a pyproject extra and a LAZY_DEPS entry drifts, so a third
package can't reintroduce this class. The aiohttp-specific test is kept for
focused #31817 coverage.
hermes auth add openai-codex now creates an independent
manual:device_code pool entry per account instead of routing through
the singleton _save_codex_tokens save path, which collapsed every
added account into the latest login (the second add overwrote the
first account's singleton-mirrored device_code entry). This is the
add-path half of #39236; PR #39243 (already on this branch) fixes the
re-auth half.
manual:device_code entries refresh from their own token pair
(_sync_codex_entry_from_auth_store only adopts the singleton for
source=="device_code"), so they need no providers.openai-codex
shadow. Adding the first credential marks openai-codex active (the
singleton path did this implicitly) so the setup wizard's
get_active_provider() check still passes; subsequent adds leave the
active provider untouched.
Adds SOURCE_MANUAL_DEVICE_CODE constant and a regression test that two
distinct accounts keep distinct token pairs. Updates two existing add
tests to the pool-only behavior.
Co-authored-by: glesperance <info@glesperance.com>
The #33538 fix refreshed every credential_pool entry with source
"manual:device_code" on every Codex OAuth re-auth, on the assumption that
such entries were always legacy aliases of the singleton from the #33000
workaround era. That assumption is no longer true: `hermes auth add
openai-codex` also produces "manual:device_code" entries for independent
ChatGPT accounts, and the broad sync silently clobbered them with the
latest-authenticated token pair (labels preserved, token material
overwritten, status / quota readings then lie).
Narrow the sync: refresh a "manual:device_code" entry only when its
existing access_token matches the previous singleton access_token (true
legacy alias). Entries with distinct token material represent independent
accounts and are now left alone. Error markers are cleared only on
entries actually rewritten, so an independent account's own 429 / 401
state survives a re-auth that targeted a different account.
Tests:
* New: independent acctB/acctC are not overwritten when acctA re-auths.
* New: legacy singleton-alias still refreshed (preserves #33538).
* New: missing previous singleton state handled (no crash, no false
alias match).
* New: access_token-only alias match (legacy schema without
refresh_token still recognized).
* New: error markers cleared only on entries actually refreshed.
* Updated: existing manual-device-code sync test now covers both the
legacy-alias path AND the independent-account path in one fixture.
Behaviour change is zero for users with a single Codex account and zero
for users whose only "manual:device_code" entry is the legacy alias of
the singleton. Users with multiple independent Codex accounts added via
`hermes auth add` now keep their distinct token material across
re-auths.
Local: 29 passed in tests/hermes_cli/test_auth_codex_provider.py, no
new failures in tests/hermes_cli/ vs upstream/main baseline.
Fixes#39236.
* fix(stream): don't report dropped mid-tool-call streams as output truncation
A streaming tool call whose SSE ends with no finish_reason (the upstream
delivers the tool name + opening '{' then closes the connection cleanly,
no terminator, no [DONE]) was stamped finish_reason='length' by the mock
builder. That routed it through the output-cap truncation path: 3 useless
max_tokens-boosted retries, then the misleading 'Response truncated due to
output length limit' error — even though the model never reported hitting
any cap.
Reproduced live on nvidia/nemotron-3-ultra:free via the Nous dedicated
endpoint, which stalls/drops during large tool-arg generation (50s-4m41s).
Now: when tool args are incomplete AND the provider sent no finish_reason,
tag the response as a partial-stream stub so the loop reports an honest
mid-tool-call drop and asks the model to chunk its output (existing
continuation machinery), instead of escalating output budget and lying.
A provider-reported finish_reason='length' still takes the real-truncation
path unchanged.
* test(stream): update truncated-tool-args test for drop-vs-cap split
test_truncated_tool_call_args_upgrade_finish_reason_to_length pinned the
old behaviour where ANY incomplete tool args → finish_reason='length' with
tool_calls preserved. That single-chunk-no-finish_reason scenario is exactly
the mid-tool-call stream drop now reclassified as a partial-stream stub.
Split into two tests matching the new contract:
- no finish_reason + incomplete args → PARTIAL_STREAM_STUB_ID, tool_calls=None,
_dropped_tool_names set (the drop path)
- explicit finish_reason='length' + incomplete args → tool_calls preserved,
'length' upgrade unchanged (the genuine output-cap path)
helix4u's fix snapshotted the resolved HERMES_HOME into the static
config/env patterns at module-import time. That breaks when HERMES_HOME
is set after tools.approval is imported (the hermetic test conftest, any
deferred-profile-resolution path), and made the PR's own 4 new tests red.
Move the resolution into _normalize_command_for_detection(): rewrite the
live resolved absolute home prefix (and its symlink-resolved form) to the
canonical ~/.hermes/ form before pattern matching. Tracks the live env,
needs no regex recompile, and folds the absolute form into the shared
_SENSITIVE_WRITE_TARGET so > redirects, tee, cp, etc. are covered too —
not just sed/perl/ruby in-place edits.
* fix(desktop): stop running app locking win-unpacked before pack
On Windows a running Hermes.exe keeps an exclusive lock on
release/win-unpacked/Hermes.exe, so electron-builder's pack cannot
replace it and dies with "remove ...\Hermes.exe: Access is denied" /
ERR_ELECTRON_BUILDER_CANNOT_EXECUTE (before-pack hits the same EPERM
cleaning the dir, and the cache-purge retry repeats the failure since
the lock is still held).
Before building the packaged app, terminate any process whose
executable lives inside this build's release/ tree so the rebuild --
including the installer's headless --update rebuild -- can replace the
binary. Scope is narrow (only exes under release/), POSIX is a no-op
(it can unlink a running binary), and the final error now points
Windows users at the running-app cause.
* test(desktop): cover the win-unpacked lock-breaker helper
Verify _stop_desktop_processes_locking_build is a no-op off-Windows,
terminates only processes whose exe lives under release/ (sparing our
own PID and unrelated installs), and short-circuits when no release dir
exists.
* feat(windows): enable dashboard chat tab via ConPTY (win_pty_bridge)
Add hermes_cli/win_pty_bridge.py — a pywinpty-backed drop-in for
PtyBridge with the same spawn/read/write/resize/close surface — and
wire it into the web_server PTY import block so Windows picks it up
instead of falling back to None.
pywinpty is already a declared win32 dependency (pyproject.toml).
The ConPTY read path runs inside run_in_executor so the event loop
is never blocked. Spawn/read/write/terminate call shapes are taken
directly from tools/process_registry.py which already exercises the
same pywinpty version.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs: remove WSL2-only caveat for dashboard chat tab
The chat pane now works on native Windows via the ConPTY bridge added
in the previous commit.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* test(windows): cover ConPTY bridge + web_server platform-branched import
Companion to the bridge added in the previous commits. Verified live on
native Windows 11 (pywinpty 2.0.15) against `hermes dashboard`'s
`/api/pty` WebSocket: the spawned `hermes --tui` (node entry.js) renders
through ConPTY, resize escapes reach `setwinsize`, and closing the WS
reaps both the node child and the pywinpty agent with zero orphans.
tests/hermes_cli/test_win_pty_bridge.py
Mirrors the layout of the existing POSIX test_pty_bridge.py:
spawn/io/resize/close/env coverage against cmd.exe and python -c,
plus the cross-platform fallback surface (PtyUnavailableError, the
off-Windows `spawn -> raises PtyUnavailableError` guard, and the
load-bearing _clamp() helper that protects setwinsize from garbage
winsize values out of xterm.js).
tests/hermes_cli/test_web_server_pty_import.py
Asserts that web_server.PtyBridge resolves to WinPtyBridge on win32
and to the POSIX PtyBridge on POSIX, that PtyUnavailableError is the
matching class on each side (so isinstance checks in /api/pty's
spawn fallback path work), and a source-text check that pins the
platform-branched import shape so a future refactor can't quietly
collapse it back to a POSIX-only import.
scripts/release.py
AUTHOR_MAP entries so CI release-note generation can resolve both
authors' plain (non-noreply) emails to their GitHub logins.
Co-Authored-By: JoelJJohnson <josephjohnson.joel@gmail.com>
Co-Authored-By: Nea74 <andreas@schwarz-ketsch.de>
---------
Co-authored-by: JoelJJohnson <josephjohnson.joel@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Nea74 <andreas@schwarz-ketsch.de>
The messaging/slack/homeassistant/sms extras exact-pinned aiohttp==3.13.3
while LAZY_DEPS['platform.slack'] already pins 3.13.4 (the CVE fix). On
`hermes update` the extras pin won, downgrading aiohttp 3.13.4 -> 3.13.3
and reopening 10 published advisories (CVE-2026-34513/34515/34516/34517/
34518/34519/34520/34525, -22815, -34514) until Slack's lazy refresh
re-upgraded it.
Bump all four extras to 3.13.4 to match the lazy pin, regenerate uv.lock,
and add test_pyproject_aiohttp_pins_match_lazy_slack_pin to guard the
alignment going forward.
Fixes#31817
quiet_mode was being used to suppress tool-result display when
tool_progress_mode was 'off'. But quiet_mode also gates operational
status messages, so users with /verbose + tool-progress off lost all
status output.
Adds a dedicated tool_progress_mode attribute to AIAgent; the
tool_executor result-rendering path gates on tool_progress_mode != 'off'.
The CLI passes its tool_progress_mode through agent setup and the
tool-progress cycle command syncs it onto the live agent.
Fixes#33860.
prompt_toolkit's ANSI parser does not handle OSC escape sequences
(\x1b]...\x07 / \x1b]...\x1b\), which caused Rich's [link=...] markup
to leak raw OSC 8 payload into the banner title after /clear.
Added _OSC_ESCAPE_RE to strip OSC sequences in ChatConsole.print()
before routing through _cprint(). CSI/SGR color sequences are
preserved. Visible text between OSC sequences is kept intact.
When the agent has its own SessionDB reference (_session_db is not None),
_flush_messages_to_session_db() persists user messages to SQLite during the
agent run. Two gateway fallback paths also wrote the same user message
without skip_db=True, creating duplicate entries in state.db:
1. agent_failed_early path (transient 429/timeout failures)
2. not-new-messages path (history_offset >= len(messages) edge case)
Move agent_persisted flag definition to before the if/elif/else block so
all paths can use it, and pass skip_db=agent_persisted to every fallback
append_to_transcript() call.
Fixes#42039
* feat(desktop): assignable themes per profile
The desktop skin was a single global preference, so every profile shared
one look. Make the theme assignment per profile: picking a theme assigns it
to the profile that's currently live, and switching profiles paints that
profile's own skin. A profile with no assignment inherits the global default,
so single-profile installs and existing setups are unchanged.
- themes/context.tsx: per-profile skin record in localStorage; ThemeProvider
follows $activeGatewayProfile; boot paint uses the last active profile's
theme to avoid a flash on a non-default relaunch; setTheme assigns to the
live profile (default profile also seeds the legacy global fallback).
- settings/appearance-settings.tsx: caption noting the theme is saved per
profile, shown only when more than one profile exists.
- i18n: themeProfileNote string across en/zh/zh-hant/ja.
- themes/profile-theme.test.ts: resolution + inheritance coverage.
* feat(desktop): make light/dark mode per profile too
The command palette / theme picker sets skin + mode together on each pick,
so leaving mode global meant a profile couldn't actually remember the full
look it was given (e.g. "Ember Dark" in one profile would render Ember Light
if another profile last flipped the global mode). Mirror the per-profile skin
record for light/dark mode: ThemeProvider resolves and applies the active
profile's mode on switch, the boot paint uses it, and setMode assigns to the
live profile (default profile also seeds the legacy global mode fallback).
* refactor(desktop): collapse per-profile skin/mode into one helper
Skin and mode were near-identical resolve/assign pairs with hand-rolled
try/catch around localStorage. Fold both into a single profilePref<T>
factory (resolve + assign, default profile seeds the legacy global) and
lean on storedString/persistString for the error-swallowing. Tests go
table-driven over both prefs since they share one contract. No behavior
change; -89 LOC.
* refactor(desktop): treat default profile as the global slot directly
"default" isn't a real profile — it is the legacy global value. Stop
double-writing (record['default'] + global) on assign; route default
straight to the global. resolve is unchanged: a profile with no record
entry already falls back to the global, so default reads it for free.
A brand-new session's first turn persists to the SessionDB a beat after
the gateway emits message.complete, so a refresh fired in that window gets
a listSessions(min_messages=1) page that omits the new row. sessionsToKeep()
already shields the *active* chat from this race, but a session you started
and then navigated away from is — at the next refresh — neither working,
pinned, nor active, so mergeSessionPage() evicts it. Nothing re-fetches
afterward, so it stays gone until the app restarts.
Track sessions whose turn just settled (a real working->idle transition) in
a short, auto-expiring grace window and add them to the merge keep-set. This
bridges the persist race for non-active chats without resurrecting deleted
rows (mergeSessionPage only revives rows still in the in-memory list, which
optimistic delete/archive already drop).
Repro: start a new chat, send a message, then click another session before
the reply lands — the new session vanishes from the sidebar.
Main folded slash_worker.close() into _finalize_session (the single
_finalized-guarded chokepoint) while #42143 was open. The rebase
conflicted with the PR's worker-close in _teardown_session. Keep both —
they target the same #38095 leak and _SlashWorker.close() is
idempotent (_closed/poll()-guarded) — so callers reaching
_teardown_session without the real _finalize_session (and the PR's own
tests, which monkeypatch _finalize_session out) still reap the worker.
Same for _shutdown_sessions, now routed through the unified
_close_session_by_id funnel.
Salvaged from #35626 (banditburai) and re-scoped after maintainers landed the
parent-death watchdog (slash_worker.py) and PTY process-group teardown
(pty_bridge.py) directly on main. Those pieces are intentionally NOT included
here — this carries only what is still missing:
- C1 disconnect reap: ws.py's `finally` only re-pointed the dead transport at
stdio. `_close_sessions_for_transport` now reaps `close_on_disconnect`
sessions and schedules the grace-reap for the rest, offloaded via
`asyncio.to_thread` so the blocking worker.close() + DB write never stalls
the uvicorn loop.
- C2 create/close orphan race: `_attach_worker` stores the worker iff
`_sessions.get(sid) is session` under the lock (else closes it), applied at
every spawn site incl. the post-turn `_restart_slash_worker`.
- Single idempotent teardown funnel: session.close, WS disconnect, the
generous-TTL idle reaper, shutdown, and the WS grace-reap all reach
`_close_session_by_id` → `_teardown_session`; `_finalized`/`_closed` flags
make concurrent/double teardown a no-op. `_sessions_lock` upgraded to RLock.
- uvicorn `ws_ping_interval/timeout=20s` so a half-open socket (reverse-proxy
524) becomes a `WebSocketDisconnect` and the C1 path runs.
Plus two review-driven hardening fixes (mine):
- `session.active_list` now skips `_finalized` sessions so the footer
"N sessions" count reflects attachable sessions instead of only ever
growing until restart (#38950). Keys on `_finalized` only, NOT the stdio
sentinel, so a standalone `hermes --tui` session stays visible.
- `_schedule_ws_orphan_reap._reap` pops via `_close_session_by_id`
(under `_sessions_lock`) instead of `_sessions.pop` under the unrelated
`_session_resume_lock` (#39591); the resume_lock now only guards the orphan
re-check against `session.resume`.
- Float env knobs (`HERMES_SLASH_WATCHDOG_*`, `HERMES_TUI_SESSION_TTL_S`)
parse with a fallback helper so a malformed value can't crash the worker at
import.
Fixes#32377Fixes#38950
Addresses #22855
Co-authored-by: banditburai <123342691+banditburai@users.noreply.github.com>
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
resolve_provider() auto-detection only checked OPENROUTER_API_KEY/
OPENAI_API_KEY env vars, never the credential pool. A key added via
`hermes auth add openrouter` (manual pool entry, no env var) was invisible:
the provider failed to resolve or resolved with an empty api_key, so
requests went out with no Authorization header and OpenRouter returned
"HTTP 401: Missing Authentication header" while `hermes auth list` showed
the credential. Closes#42130.
- auth.py: check load_pool("openrouter").has_credentials() after the env check
- dump.py: `debug share` shows 'openrouter set (auth pool)' instead of the
misleading 'not set' when the key lives in the pool
- add regression tests (pool credential auto-detects; empty pool still raises)
* fix(desktop): require session ids for scoped gateway events
Drop unscoped stream, tool, and subagent events in the desktop renderer so async activity cannot attach to whichever chat is currently focused.
* fix(desktop): preserve unscoped session info events
Keep session.info out of the scoped-event drop list so global desktop runtime broadcasts still initialize UI state before a session is active.
Lift the 18 _model_flow_* provider-setup wizard functions out of hermes_cli/main.py
into hermes_cli/model_setup_flows.py. Behavior-neutral; main.py 14050 -> 11479 LOC.
select_provider_and_model (the dispatcher) STAYS in main.py and re-imports the
flows via an explicit 'from hermes_cli.model_setup_flows import (...)' block, so
both its bare-name calls and existing test monkeypatches targeting
hermes_cli.main._model_flow_* keep resolving against main's namespace unchanged.
Imports: 3 neutral deps (argparse, os, subprocess) at the module top; the 14
main.py-internal helpers the flows call (_prompt_api_key, _save_custom_provider,
the reasoning-effort/stepfun/qwen helpers, _run_anthropic_oauth_flow, ...) are
lazy-imported per-flow (from hermes_cli.main import ...) so the new module never
imports main at module scope -> no import cycle.
Repointed one source-inspection change-detector (test_setup_ollama_cloud_force_refresh)
to read the module the ollama-cloud branch moved to.
Validation: 6563/6563 hermes_cli tests pass; live flow-dispatch probe confirms the
lazy main-internal imports resolve at runtime.
Lift the post-loop finalization tail out of run_conversation into
agent/turn_finalizer.py:finalize_turn. Behavior-neutral; run_conversation
4204 -> 3846 LOC, conversation_loop.py 4578 -> 4220.
The region (everything after the main tool-calling while loop): budget-exhaustion
summary, trajectory save, session persist, turn diagnostics, response transforms,
result-dict assembly, steer drain, and the memory/skill review trigger. Lifted
verbatim into a synchronous single-return free function; the 12 post-loop locals
it reads are passed as keyword args and the assembled result dict is returned to
run_conversation (which returns it to the caller). All agent.* side effects fire
exactly as before.
Imports: os + _summarize_user_message_for_log at module top; logger lazy from
agent.conversation_loop (preserves the gateway... err 'agent.conversation_loop'
logger name, no import cycle).
Validation: 1609/1609 tests/run_agent/ pass; live PTY agent turn PASS.
Lift the 4 inbound-message authorization methods out of GatewayRunner into
gateway/authz_mixin.py:GatewayAuthorizationMixin. Behavior-neutral; gateway/run.py
16200 -> 15812 LOC.
Methods moved (~389 LOC): _is_user_authorized, _get_unauthorized_dm_behavior,
_adapter_dm_policy, _adapter_enforces_own_access_policy. The two adapter-policy
helpers are private to _is_user_authorized, so the cluster is fully self-contained
(zero outside-cluster self.method calls after the lift). All self.* calls resolve
unchanged via the MRO (GatewayRunner(GatewayAuthorizationMixin, ...)).
Import split: 6 neutral deps (os, Optional, Platform, SessionSource, the two
whatsapp_identity helpers) at the mixin module top; the module-level logger is
imported lazily inside _is_user_authorized (from gateway.run import logger) so
the mixin never imports gateway.run at module scope -> no cycle. The lazy import
preserves the exact logger name (gateway.run) so log records are unchanged.
Lift the 5 agent-construction/session-resume methods out of HermesCLI into
hermes_cli/cli_agent_setup_mixin.py:CLIAgentSetupMixin. Behavior-neutral; cli.py
14139 -> 13492 LOC.
Methods moved (~647 LOC): _ensure_runtime_credentials, _resolve_turn_agent_config,
_init_agent, _preload_resumed_session, _display_resumed_history. All self.* calls
resolve unchanged via the MRO (HermesCLI(CLIAgentSetupMixin, CLICommandsMixin)).
Import split (same recipe as #41942): 2 neutral deps (sys, _escape) imported at
the mixin module top; 12 cli.py-internal helpers/constants (AIAgent, ChatConsole,
CLI_CONFIG, _cprint, _DIM, _RST, _accent_hex, ...) imported lazily per-method
(from cli import ...) so the mixin never imports cli at module scope -> no cycle.
Repointed one source-inspection change-detector (test_callable_api_key.py) to read
the mixin file where the method now lives.
Two narrow Windows desktop fixes:
1. tools/process_registry.py — PTY stdin writes are now platform-aware.
pywinpty (Windows) expects str; ptyprocess (POSIX) expects bytes.
Previously bytes was unconditionally passed, producing a TypeError on
Windows ("'bytes' object cannot be converted to 'PyString'").
2. tui_gateway/server.py + ws.py — Detached WebSocket sessions now park on
a _DropTransport sink instead of _stdio_transport. In the desktop the
gateway runs in-process and stdout is captured by Electron into
desktop.log, so falling back to stdio leaked raw JSON-RPC frames into
the desktop log after WS disconnects. Orphan-reap semantics are
preserved via _ws_session_is_orphaned.
Verified on a Windows desktop install:
- pywinpty 2.0.15 rejects bytes / accepts str — reproduced exactly
- Focused suite green (write_stdin × 2, write_json_drops_detached_ws_frames,
ws_orphan_reap × 2)
- All 6 CI test shards green, e2e green, nix (ubuntu/macos) green
Salvage commit (21be7ca) fixes the new test referencing an undefined
_ThreadUnsafeStdout — uses the existing _ChunkyStdout helper.
The TUI docs presented HERMES_TUI_GATEWAY_URL + /api/ws as a supported
'attach the TUI to a standalone running gateway' workflow. It isn't.
/api/ws exists only inside the dashboard's FastAPI server
(hermes_cli/web_server.py), which spawns its own embedded TUI child and
injects the var as an internal wiring detail. The OpenAI-compat API
server (api_server platform) deliberately does not serve /api/ws, so the
documented ws://host:port/api/ws workflow 404s — the cause of #32882 and
the two PRs (#32904, #32955) that tried to add the route to the wrong
surface.
Rewrites the section in en + zh-Hans to describe the var accurately and
point users at shared state.db / dashboard embedded chat for multi-surface
session sharing.
A Responses-API-shaped payload carrying instructions=/input=/store=/
parallel_tool_calls= can reach the native Anthropic messages.stream() /
messages.create() call under a rare api_mode-flip race (e.g. a concurrent
auxiliary vision call mutating a shared agent between the kwargs build and
the stream dispatch). The Anthropic SDK rejects these with a non-retryable
TypeError that kills the whole turn and propagates the entire fallback chain.
Add sanitize_anthropic_kwargs() at both Anthropic dispatch sites: it drops
the Responses-only keys in place and logs a WARNING (with #31673 breadcrumb)
when one is present, so the underlying race stays visible in the wild
instead of being silently papered over.
* fix(plugins): add thread-safe lazy-singleton helpers, fix honcho TOCTOU (#24759)
get_honcho_client() and fal's _load_fal_client() used unlocked
check-then-init: racing threads both ran the expensive build and the
loser's client (open connection) leaked.
Rather than one-off locks, add plugins/plugin_utils.py with two
reusable primitives every plugin author can drop in:
- lazy_singleton: decorator for zero-arg accessors
- SingletonSlot: manual slot for config-keyed accessors (first wins)
Both use double-checked locking; factory runs at most once; failed
builds aren't cached. honcho is the reference consumer; fal's sibling
TOCTOU gets a matching double-checked lock. Plugin dev guide documents
the pattern so future plugins don't reintroduce the race.
Closes#24759
* test(honcho): update reset test for SingletonSlot internals
test_reset_clears_singleton poked the removed _honcho_client module
global directly. Assert through the slot's public peek() surface
instead, matching the #24759 refactor.
Two independent reviewers flagged that applyBackendUpdate's in-progress and
error messages were inline English while the rest of the update overlay is
i18n'd. Move them into updates.applyStatus (preparing/pulling/restarting/
notAvailable/failed/noReturn) across en, ja, zh, zh-hant + types.
The no-return error said 'Backend updated but did not come back online' — but
once the connection drops the client can't know the update's exit code, only
that it was started and the backend is unreachable. Reword to not overclaim:
the update may not have completed.
Three rough edges in the remote backend apply flow:
- On success the overlay dropped to IDLE, briefly re-rendering the pre-install
'update available' view and then the generic 'you're all set' before settling.
Close the overlay outright once the backend is confirmed back instead of
bouncing through the idle view.
- If the backend never came back (a failed restart), the flow still reported
success. waitForBackendReturn now returns whether the backend answered;
finishBackendApply surfaces an error when it didn't.
- The up-to-date copy said 'you're running the latest version', conflating
client and backend. Backend target now reads 'the backend is running the
latest version' — the client's own version is a separate pill.
The backend Install path set stage:'restart' and stopped — in remote mode no
boot-progress events arrive to carry the overlay to done, so it sat on the
restarting spinner until a manual reload while the backend had already come
back. Poll the backend until it answers again, then clear the overlay and
refresh the backend status. Target-aware applying copy explains the remote
restart + auto-reconnect instead of the local-updater-window wording.
Also switch the apply poll sleeps from window.setTimeout to globalThis.setTimeout
so the flow is exercisable off the renderer.
HERMES_DESKTOP_REMOTE_URL forces a remote connection but never writes
connection.json, so the gateway panel read mode/url from persisted config
and mislabelled an env-remote session as local with no url.
The poller starts at mount, before the gateway connects, so its initial
checkBackendUpdates() ran while mode was still unset and no-op'd via the
remote-mode guard — leaving the backend button empty until the user clicked it.
Subscribe to $connection and re-check the backend when mode resolves to remote.
Two follow-ups from testing the two-button bar:
- The background poller and focus handler only checked the client, so the
backend behind-count and changelog stayed empty until the user opened the
overlay — and the overlay's first render then hit the empty-commits fallback
('Improvements and fixes') instead of the real changelog. Check the backend
alongside the client on poller start, interval, and focus so its state is
ready before the button is clicked.
- Order the status bar client-first, backend-second.
The status bar merged both versions into one pill with a single click target,
so there was no way to tell which artifact an update acted on — and the apply
path was overloaded by connection mode. Separate them:
- store: independent client (checkUpdates/applyUpdates) and backend
(checkBackendUpdates/applyBackendUpdate) flows with their own status/apply
atoms; openUpdateOverlayFor(target) drives the overlay.
- status bar: two buttons — client vX (always) and backend vY (+N) (remote
only), each with its own behind-count, opening the overlay for its target.
- overlay: reads the active target's atoms; install/check route per target.
Removes the version-bar merge helper (no longer merging the two versions).
The updates overlay showed generic 'New update available / improvements and
fixes' with no indication of whether it was updating the client or the backend.
In remote mode it now reads 'Backend update available' and names the connected
backend, and when there's no commit changelog (e.g. pip/non-git backend) it
degrades to honest 'release notes aren't available for this install type' copy
instead of filler.
Copy selection extracted to a pure resolveUpdateCopy() helper (unit-tested);
threads target ('client'|'backend') from connection.mode through the overlay.
The behind-count (banner._check_via_local_git) measures HEAD..origin/main, but
_recent_upstream_commits logged HEAD..@{upstream}. On a feature-branch checkout
@{upstream} is the branch's own tip (0 commits), so the changelog came back
empty while behind>0 — the overlay then showed generic filler instead of what
changed. Pin the commit range to origin/main so count and changelog agree.
Verified against a checkout 11 behind origin/main: now returns 11 commits.
In remote mode, checkUpdates()/applyUpdates() branch on connection.mode and
drive the existing updates overlay from the connected backend instead of the
local Electron git bridge:
- checkUpdates -> GET /api/hermes/update/check, mapped onto DesktopUpdateStatus
(behind, commits, supported=can_apply, message). The overlay renders the
commit list as 'what's changed' and shows guidance (not Install) when the
backend install can't self-apply (docker/nix).
- applyUpdates -> POST /api/hermes/update (the proven command-center path),
polling the action to completion and handling the expected mid-update
connection drop as the restart phase.
Local mode is unchanged. Adds checkHermesUpdate() to hermes.ts and a
BackendUpdateCheckResponse type.
In remote thin-client mode the Electron client and the backend it connects to
are separate installs that drift independently. The status bar previously showed
only the client version, hiding skew (e.g. client 0.15.1 talking to backend
0.16.0 looked fine).
Add a pure resolveVersionBar() helper (unit-tested) that, gated on
connection.mode === 'remote', renders both 'client vX · backend vY' from the
desktop appVersion and StatusResponse.version, and flags skew. Local mode is
byte-identical to before. Wire it into the status-bar version item.
Add a best-effort `commits` list (sha/summary/author/at) to the update-check
response for git/pip installs that are behind upstream, so the desktop's
remote update overlay can show what's changed before applying.
Additive and non-breaking: existing consumers (legacy dashboard, tests using
subset assertions) ignore the new field. Leaves the shared check_for_updates()
int contract untouched — commits come from a separate best-effort git call.
#19194's fix added process.exit(0) to die()/dieWithCode() with a comment
relying on a process.on('exit') handler in entry.tsx that resets terminal
modes — but that handler was never installed. So /quit, Ctrl+C, Ctrl+D and
every process.exit() path left DEC mouse tracking (?1000/1002/1003/1006)
armed in the parent shell. The terminal then kept emitting mouse reports
into stdin — read as keystrokes by the shell or a freshly relaunched TUI —
surfacing as ...;...M garbage in the input box.
Install the missing handler. 'exit' fires once on real termination and runs
synchronous code only; resetTerminalModes() writes via writeSync, so the
disable sequence lands before the process is gone.
Fixes#28419
#41867 replaced mkNpmPassthru's patchPhase with
`cp $npmDeps/package-lock.json package-lock.json`, on the theory that
prefetch-npm-deps strips advisory fields (engines/os/cpu) from the cache
lockfile. That diagnosis was wrong.
prefetch-npm-deps copies the lockfile into the cache *verbatim*
(prefetch-npm-deps/src/main.rs reads it and writes it unchanged). Building the
cache fresh from the current root lockfile yields exactly the pinned
npmDepsHash, and that cache's package-lock.json is byte-identical to the source
(740 "engines" blocks on each side). With the hash correct, npmConfigHook's
consistency check passes on its own — verified by building .#tui and .#default
green with this (original) patchPhase.
So the cp was unnecessary, and worse: it bypasses the consistency check
wholesale, silently masking a genuinely stale npmDepsHash (a lockfile that
changed without its hash being refreshed) instead of failing loudly. The
original patchPhase keeps the check meaningful while still handling the one real
cosmetic difference it was written for (trailing newlines); stale-hash drift is
caught by the npmDepsHash itself plus the auto-fix workflow.
Keeps the fix-lockfiles real-build verification and the nix-lockfile-fix.yml
file-path fix from #41867 — only the patchPhase cp is reverted.
Ensure failed plugin-config clear operations still re-arm managed reinitialization on the next Hermes session.
Add focused regression coverage for successful init, failed final-session clear, and next-session recovery.
Signed-off-by: mnajafian-nv <mnajafian@nvidia.com>
addToSystemPackages exports HERMES_HOME system-wide and puts the hermes CLI on
interactive users' PATH, so those users (in the hermes group) share the
gateway's state — that's the option's whole purpose. But the activation script
wrote config.yaml as 0640 (group read-only), so an interactive user saving a
setting via the CLI/TUI hit:
error: [Errno 13] Permission denied: '/var/lib/hermes/.hermes/config.yaml'
Make the mode conditional: 0660 when addToSystemPackages is set (group hermes
can write), else the previous 0640. .env stays 0640 either way — it holds
secrets, not user-facing settings. The config merge already preserves
user-added keys across rebuilds, so this simply lets interactive hermes-group
users actually make those edits.
Verified by evaluating the module's activation script for both option values:
addToSystemPackages=true -> chmod 0660, false -> chmod 0640.
Fold the slash-worker subprocess close into _finalize_session itself —
the single _finalized-guarded session-end chokepoint — instead of
relying on each caller (_teardown_session, _shutdown_sessions) to close
it separately. A future code path that finalizes a session directly can
no longer reintroduce the #38095 worker leak.
Idempotent: _SlashWorker.close() is poll()-guarded and _finalize_session
short-circuits on _finalized, so the existing teardown paths are
unaffected. Drops the now-redundant separate close() in
_shutdown_sessions.
Note: the active leak this issue reported was already fixed on main
(WS-orphan reaper #38591, _restart_slash_worker close, atexit shutdown).
This addresses the residual defense-in-depth gap the reporter correctly
identified in their follow-up comment.
close() is the hard teardown for true session boundaries (/new, /reset,
session expiry). It already closes the OpenAI client and child agents but
left the conversation-history list intact. Mirror the soft-eviction path
(_release_evicted_agent_soft clears _session_messages) so a held reference
to a closed agent — e.g. a draining background task — doesn't pin tens of
MB of tool outputs until the agent object itself is collected.
Daemon thread polls _is_orphaned (original ppid check + psutil create_time PID-reuse
guard, no PR_SET_PDEATHSIG). On orphan, drains an in-flight command up to a grace
window then os._exit(0). Started before the HermesCLI build to cover the spawn window.
Task: swl-qrf.8
_evict_cached_agent (the chokepoint for /new, /model, /undo, session
resets — 17 call sites) only popped the cache entry, dropping the
AIAgent reference without releasing its httpx client pool. AIAgent
holds reference cycles (callbacks, tool state) so CPython refcounting
does not free the client promptly; under steady gateway traffic the
held sockets + buffers accumulate and RSS climbs (the leak class behind
Now the chokepoint pops AND schedules a soft release_clients() on a
daemon thread (mirrors the cap-enforcer / idle-sweeper). Soft release
frees the client pool + per-turn child subagents but preserves the
session's terminal sandbox / browser / bg processes for resumption.
Mid-turn agents are skipped so a running request is never torn down.
Also fixes the no-lock branch which previously never popped at all.
Long-lived gateways under heavy cron/build workloads grow steadily (~18 MB/hr
post-phantom-dispatch-fix) and eventually need a restart-or-OOM. Four retention
sites, all confirmed live on current main:
1. _evict_cached_agent() (/model, /reasoning, codex-runtime, /undo, etc.) popped
the cache entry without releasing the agent's OpenAI client, httpx transport,
SSL context, or conversation history. Only /new cleaned up first. Now releases
clients on a daemon thread, matching _enforce_agent_cache_cap.
2. _release_evicted_agent_soft() now clears _session_messages after
release_clients() — tool outputs (file reads, terminal output, search results)
can be tens of MB per 100+-tool-call session; the list is rebuilt from
persisted session JSON on resume, so dropping it on soft eviction is safe.
3. The session-expiry watcher (permanent finalization) now drops the session's
per-session control dicts (_session_model_overrides, _session_reasoning_overrides,
_pending_approvals, _update_prompt_pending, _pending_model_notes). These leaked
one entry per session per gateway lifetime. NOTE: this is the session-finalize
path, NOT idle agent-cache eviction — an idle-evicted session is still alive and
rebuilds its agent from these overrides, so pruning them there would silently
reset a user's /model choice.
4. _tool_defs_cache is now bounded (_TOOL_DEFS_CACHE_MAX=8) with oldest-first
eviction instead of growing unboundedly across the distinct toolset/config
fingerprints a gateway sees over its lifetime.
Salvaged from #25318 by Michael Steuer (@mssteuer); fix 3 redirected from the
idle-sweep to the session-finalize lifecycle, magic number 8 lifted to a named
constant, test ported.
Fixes#19251
Co-authored-by: Michael Steuer <michael@make.software>
test_concurrent_compressions_same_session_serialize relied on a
time.sleep(0.25) inside the stubbed compressor to make the two threads
overlap inside the per-session lock window. Under CI CPU starvation that
sleep is insufficient: one thread can acquire -> compress -> rotate ->
RELEASE the lock before the other reaches try_acquire, so both acquire on
the shared session_id and both compress (the recurring 'Expected exactly
one agent to compress, got 2' failure on shard test (1)).
Replace the timing dependency with a threading.Barrier(2) wrapped around
the shared db's try_acquire_compression_lock: both threads rendezvous
immediately before the real (atomic) acquire, guaranteeing genuine
simultaneous contention regardless of scheduling. The real lock logic is
unchanged and still picks exactly one winner — this only fixes the test's
overlap guarantee. Restored after join so the post-join lock-leak
assertion hits the unwrapped method.
Verified: 20/20 plain + 15/15 under all-core CPU stress (load avg ~4.6),
where the old version flaked.
#41076 makes `hermes plugins list` discover nested category plugins (e.g.
observability/nemo_relay). This adds the missing enable/disable mutation path
so those plugins can actually be toggled, and fixes two incomplete-update
breakages on the #41076 base.
Before: `hermes plugins enable nemo_relay` -> "Plugin 'nemo_relay' is not
installed or bundled." (exit 1), because cmd_enable/cmd_disable went through
_plugin_exists(), which only checked top-level plugins/<name>/.
Changes:
- Add _resolve_plugin_key(): resolve a bare manifest/leaf name OR a full
path-derived key (observability/nemo_relay) to the canonical key the runtime
loader gates on, reusing #41076's _discover_all_plugins(). A bare leaf name
ambiguous across two categories resolves to None rather than silently picking
one.
- cmd_enable/cmd_disable resolve first, persist the canonical key, and drop any
stale legacy bare-name alias so the enabled/disabled lists can't drift into a
contradictory state. _plugin_exists delegates to the same resolver.
- Fix#41076 base breakages: _discover_all_plugins now returns 6-tuples, but
web_server._merged_plugins_hub() still unpacked 5 (ValueError on the
dashboard plugins-hub endpoint) and several test_plugins_cmd_list.py fixtures
were still 5-tuples. Both updated; the hub status check is now key-aware.
Verified e2e on the real CLI + runtime loader (isolated HERMES_HOME):
`hermes plugins enable nemo_relay` writes observability/nemo_relay to
config.yaml and the loader then loads it (enabled=True, error=None); a stale
bare-name alias is cleared on disable; the dashboard _merged_plugins_hub() runs
without crashing. Adds resolution + enable/disable tests; full
tests/hermes_cli/test_plugins_cmd* + web_server plugin tests green.
Follow-up to #41076 (#41066). Branched from that PR's head.
Salvaged from #6600 (@kristianvast) — re-scoped to the voice half only and
rebased onto current main. The cascading-interrupt hang half of the original
PR landed independently in dd0d1222a, so this carries ONLY Problem 1.
When a voice/audio message arrives while the agent is busy on the same
session, it hit the interrupt path with empty text because STT only ran after
the running-agent guard — the voice was effectively lost. Now we transcribe
audio BEFORE signaling the agent (and on the fresh-message path), echo the raw
transcript back to the user (🎙️), and _enrich_message_with_transcription
returns (text, transcripts) so callers can echo. A new
_dequeue_pending_with_transcription drives the post-agent drain the same way.
Reapplied onto _prepare_inbound_message_text (inbound enrichment was extracted
from the inline dispatch block since the original PR).
Co-authored-by: Kristian Vastveit <kristian@agrointel.no>
On-disk vitest coverage for the auto-heapdump disk-safety guard: opt-in
gating (suppressed diagnostics-only path), truthy-spelling acceptance,
manual-trigger passthrough, and the retention prune. Test approach
adapted from #21780 (briandevans) and #21822 (LeonSGP43), reconciled to
the merged gate semantics. Maps alarcritty into AUTHOR_MAP for CI.
Automatic heap dumps from the TUI memory monitor could write multi-GiB
.heapsnapshot files on every threshold cross, growing ~/.hermes/heapdumps
to tens of GiB. Add four layered safeguards:
- Gate auto-high/auto-critical snapshots behind HERMES_AUTO_HEAPDUMP=1;
manual dumps remain unchanged.
- Always write the lightweight diagnostics JSON sidecar so users still
get an actionable artifact when the snapshot is suppressed.
- Cap total bytes in the dump dir (HERMES_HEAPDUMP_MAX_BYTES, default
2 GiB), evicting oldest first, retaining the newest.
- Add a cooldown between auto dumps (HERMES_AUTO_HEAPDUMP_COOLDOWN_MS,
default 10 min) so an oscillating heap can't re-trigger.
Closes#21767
When agent.interrupt() fires during an active LLM call, the main poll loop
force-closes the worker-local httpx client to stop token generation. That
raises a transport error (RemoteProtocolError) on the worker thread — the
EXPECTED consequence of our own close, not a network bug.
The streaming retry loop misclassified it as a transient connection error
and retried; each doomed retry stalled for the full stream-stale timeout
(up to 300s). Because the gateway caches AIAgent instances per session, the
stale worker outlived the interrupted turn and raced the next turn's request
on shared client state — the root of the multi-minute cascading-interrupt
hang reported in the wild.
Fix: a request-local _request_cancelled token set by the poll loop right
before the force-close, in both interruptible_api_call (non-streaming) and
interruptible_streaming_api_call. The worker's exception handler checks the
token and exits cleanly — no retry, no fallback, no 'reconnecting' status —
instead of treating the forced error as transient. The token is request-
local (not agent._interrupt_requested, which is cleared at turn boundaries)
so a stale worker outliving its turn still recognizes its own forced close.
Original diagnosis and fix by @kristianvast (PR #6600), against the then-
inline methods in run_agent.py. Those were since extracted into
agent/chat_completion_helpers.py, so the fix is reapplied there.
Co-authored-by: Kristian Vastveit <kristianvast@users.noreply.github.com>
A misconfigured/slow external memory provider could hold the agent in
the 'running' state for minutes after the final response was delivered.
MemoryManager.sync_all / queue_prefetch_all looped provider.sync_turn /
queue_prefetch INLINE on the turn-completion path; a provider making a
blocking network/daemon call (a broken Hindsight daemon was observed
blocking ~298s before failing) blocked run_conversation from returning.
Because every interface (CLI, TUI, gateway) marks the agent 'running'
until run_conversation returns, the agent stayed busy for the full block
and any follow-up message triggered an aggressive interrupt that dropped
the message.
Dispatch provider sync/prefetch to a lazily-created single-worker
background executor. sync_all / queue_prefetch_all return immediately;
work completes (or fails, logged) in the background. A single worker
serializes writes so turn N lands before turn N+1. flush_pending()
provides a barrier for session boundaries and deterministic tests.
shutdown_all() drains the executor with a bounded timeout so a wedged
provider can never hang teardown.
Builtin-only / no-provider sessions spawn no executor (zero new threads
in the common case).
Review feedback (#40998): `rm -rf` / `Remove-Item -Recurse -Force` on the
install dir is destructive -- a user might still want whatever is there.
Rename the broken checkout to a timestamped `<dir>.broken-<ts>` backup and
re-clone fresh, so nothing is ever deleted. Transient cleanup of a clone
attempt that fails within the same run is left as-is.
Behavioral coverage for install.sh's clone_repo() guard (removes a
commit-less checkout, keeps a real one, ignores a non-repo dir) plus a
contract check that install.ps1's repo-validity gate requires a resolvable
HEAD.
An interrupted previous clone leaves the install dir's .git present but with
no initial commit. rev-parse --is-inside-work-tree and git status both still
succeed there, so the installer entered the update path and ran `git stash`,
which aborts with "You do not have the initial commit yet" and failed the
desktop install at the "Cloning Hermes repository" stage.
- install.ps1: add a `git rev-parse --verify HEAD` probe to the repo-validity
check so a commit-less checkout is treated as broken and re-cloned fresh.
- install.sh: mirror it at the top of clone_repo() — drop a partial checkout
with no resolvable HEAD so the fresh-clone path handles it (POSIX parity).
Fixes#40998
Lift the `_handle_*_command` cluster (2,077 LOC) out of HermesCLI into
hermes_cli/cli_commands_mixin.py; HermesCLI now inherits CLICommandsMixin so
every self.<handler> call resolves unchanged via the MRO. Behavior-neutral.
Import discipline mirrors gateway/slash_commands.py (PR #41886): neutral deps
imported at the mixin module top level; cli.py-internal helpers/constants
(_cprint, _ACCENT, save_config_value, ...) imported lazily inside each handler
via 'from cli import ...' so the mixin never imports cli at module scope.
cli.py 16215 -> 14139 LOC. One test mock repointed (cli.is_browser_debug_ready
-> hermes_cli.cli_commands_mixin.is_browser_debug_ready).
* fix(cli): set PYTHON env for node-gyp native builds on NixOS
node-gyp (triggered by node-pty during npm ci) looks for python3 on
PATH, which fails on NixOS because python3 lives in the nix store and
is not on the system PATH.
Add _nixos_build_env() — a two-tier helper that detects NixOS and:
1. Fast path: hermes venv python3 (~0s)
2. Fallback: nix-shell which python3 (~2-5s)
Wire it into _run_npm_install_deterministic() via a new env= parameter,
then pass it through cmd_gui() and _update_node_dependencies().
Non-NixOS systems: _nixos_build_env() returns None, behavior unchanged.
* fix(cli): merge _nixos_build_env() with os.environ, fix NixOS detection, add explicit return None
- Critical fix: both Tier 1 (venv) and Tier 2 (nix-shell) now return
{**os.environ, "PYTHON": ...} instead of {"PYTHON": ...} — subprocess.run
with env= replaces the entire environment, so the old code wiped PATH
and broke npm/node on NixOS entirely.
- Uses re.search(r"^ID=nixos$", ...) for anchored NixOS detection instead
of unanchored substring match (could match ID_LIKE=...nixos).
- Removes redundant Path.exists() guard before read_text(); just catches
OSError (one filesystem read instead of two).
- Adds explicit return None at end of function for type-hint consistency.
test_gateway_run_clamped read gateway/run.py asserting the /usage stats handler
clamps pct with min(100, ...). That handler moved to gateway/slash_commands.py
in this PR's extraction; repoint the guard so it still fires on clamp removal.
tests/run_agent/ + tests/gateway/ 8024 passed / 0 failed.
Tests for the extracted handlers mocked symbols at gateway.run.*; the handlers
now resolve top-level-imported deps (atomic_json_write, fetch_account_usage,
render_account_usage_lines) and __file__ from gateway.slash_commands. Repoint
those mocks. run.py-resident methods (_increment_restart_failure_counts,
_clear_restart_failure_count) keep their gateway.run.atomic_json_write mock —
only the moved handlers' mocks change.
tests/gateway/ 6415 passed / 0 failed.
The in-session slash commands (/model, /reset, /usage, /compress, /voice, ...)
— 42 _handle_*_command handlers, ~3,200 LOC — move out of gateway/run.py into a
mixin GatewayRunner inherits. self._handle_*_command dispatch + all test
references resolve unchanged via the MRO.
Neutral deps (MessageEvent, EphemeralReply, Platform, t, cfg_get, atomic_*_write,
account-usage helpers, stdlib) imported at the mixin top level. The ~10 run.py-
internal helpers (_hermes_home, _load_gateway_config, _resolve_gateway_model,
_AGENT_PENDING_SENTINEL, ...) imported lazily inside the handlers that need them
to avoid an import cycle.
gateway/run.py 19157 -> 15870 LOC; GatewayRunner direct methods 214 -> 172.
Behavior-neutral: voice/update/model/compress command test suites pass; all 42
resolve to the mixin via MRO.
A one-off transient transport failure (streaming-close / incomplete
chunked read / 5xx / 408) on an auxiliary LLM call escalated straight to
provider/model fallback (or, for context compression, dropped the summary
and entered cooldown), even when an immediate retry on the same provider
would have succeeded.
Add a single same-target retry at the top of call_llm() and
async_call_llm() — before the existing except-chain — gated on a new
_is_transient_transport_error() that reuses the canonical
_is_connection_error() detector plus a 5xx/408 status check. A second
failure (or any non-transient error: auth, other 4xx, malformed payload)
falls through to first_err and the existing fallback handling unchanged.
This lives in call_llm so every auxiliary task (compression, memory flush,
title generation, session search, vision) shares one transient-retry
surface, rather than each caller re-implementing it. The context
compressor needs no change — it calls call_llm and inherits the retry; its
existing fallback-to-main path (#18458) now composes naturally (retry the
aux model once, then fall back to main only if the retry also fails).
Co-authored-by: ARegalado1 <alberto.regalado@ymail.com>
Under systemd's Restart=always, --replace turns every restart into a
self-kill loop: the new instance reads gateway.pid, kills the previous
process, writes its own PID, and on the next restart the cycle repeats.
A process supervisor owns the lifecycle — --replace is for manual
one-shot takeovers and fights the supervisor.
Remove --replace from both the system-level and user-level systemd
ExecStart lines. The --replace flag stays available for manual
'hermes gateway run --replace' and on the macOS launchd fallback path
(#23387), which is a deliberate manual takeover, not a supervised unit.
Also drop RestartMaxDelaySec / RestartSteps from the templates — they
require systemd v255+ and are silently ignored on older versions. The
_strip_optional_systemd_directives normalizer stays so existing installs
whose on-disk unit still carries those directives aren't flagged as
outdated.
Credit: reported and diagnosed by @Skippy-the-Magnificent-one (PR #37145);
reimplemented here under project authorship because the original commit
was authored under a non-existent email.
* fix(nix): fix-lockfiles real-build verification + point auto-fix at nix/lib.nix
Two related fixes to the npm lockfile-hash tooling that, together, let a
broken nix build slip onto main and stay there:
1. fix-lockfiles trusted prefetch-npm-deps. It computes the hash from the
lockfile *contents* and early-exited "ok" whenever that matched the pin,
never running the real fetchNpmDeps + npmConfigHook build. Those two can
disagree (the --apply path already works around it), so `--check`
reported "ok" while a cold build was actually broken (e.g. lockfile
engines/os/cpu fields the pinned nixpkgs strips from the deps cache,
tripping npmConfigHook's consistency diff). Now, when prefetch says the
hash matches, confirm with `nix build .#<attr>` before believing it:
adopt the real fetchNpmDeps hash if nix reports a 'got:' mismatch,
surface non-hash failures honestly (exit 1) instead of claiming "ok",
and keep the transient-cache-failure skip.
2. nix-lockfile-fix.yml's auto-fix-main (and the PR-fix job) whitelisted and
staged nix/tui.nix + nix/web.nix, but the single npmDepsHash moved to
nix/lib.nix. So fix-lockfiles --apply edited nix/lib.nix, the guard
flagged it as an "unexpected modified file", and the job exited without
committing — the auto-healer could never push a fix. Point the guard
regex and both `git add` lines at nix/lib.nix.
* fix(nix): fix cold npm builds — adopt the deps-cache lockfile in patchPhase
hermes-tui/hermes-agent could not be built from source on the pinned nixpkgs:
prefetch-npm-deps strips advisory lockfile fields (engines/os/cpu/funding/
bin/…) that newer npm writes into package-lock.json, then npmConfigHook
byte-compares the source lockfile against the cache's stripped copy and fails
on the difference. CI only stayed green because it substitutes the prebuilt
hermes-tui from Cachix and never cold-builds it; anyone building cold (e.g. a
local path: input, or a cache miss) hit the failure.
mkNpmPassthru's patchPhase now copies the cache's own normalized
package-lock.json over the source before npmConfigHook runs, so the
consistency check is trivially satisfied. The resolved dependency set
(version/resolved/integrity/dependencies) is identical — fetchNpmDeps derived
the cache from this very lockfile — so `npm ci` installs the same tree; only
advisory metadata is dropped. Genuine drift is still caught by the
fixed-output npmDepsHash check, which runs before this phase.
Verified by cold-building .#tui and .#default (full hermes-agent) from scratch
on the pinned nixpkgs (6201e2) — both succeed where they previously failed at
npmConfigHook.
Completes the worktree-misroute fix from #35399, which made misroutes
visible (resolved_path) but did not prevent them: its divergence warning
only fired once a terminal command had populated the live cwd registry.
A fresh worktree session (registry still empty) with a stale TERMINAL_CWD='.'
got neither a worktree anchor nor a warning, so a relative write_file/patch
silently landed in the MAIN checkout.
Two changes in tools/file_tools.py:
- Treat sentinel TERMINAL_CWD values ('', '.', './', 'auto', 'cwd') and any
relative value as UNSET rather than a literal anchor. Previously '.' was
joined onto the process cwd, silently routing edits to wherever the process
happened to be (the main repo, in a worktree session). The gateway already
sanitizes the same set at import time; the file-tool layer now matches.
- New _authoritative_workspace_root(): prefers the live terminal cwd, else a
sentinel-free absolute TERMINAL_CWD (the worktree path cli.py/main.py set
for -w). _resolve_base_dir() and _path_resolution_warning() both use it, so
a worktree session resolves into — and warns about escaping — the worktree
from the very first write, before any cd has run.
Validation: 11 new/parametrized tests (sentinel handling, empty-registry
anchoring, early divergence warning, live-cwd precedence). 32/32 pass under
scripts/run_tests.sh. Live E2E: relative write in an empty-registry worktree
session lands in the worktree, main untouched.
When --replace force-kills an unresponsive old gateway, SIGKILL can fail
to reap it (uninterruptible sleep, zombie-reaping parent, etc.). The old
code unconditionally cleared the PID file and scoped locks and started a
fresh instance anyway, leaving two live gateways fighting over the same
bot token — a duplicate-gateway failure mode of #19471.
Re-verify the process is actually gone (via the Windows-safe _pid_exists
helper) after the force-kill; if it still appears alive, clear the
takeover marker and abort the replacement instead of duplicating.
Co-authored-by: Hermes <noreply@nousresearch.com>
PR #41822 collapsed CWD-only overrides to the shared 'default' container
via _resolve_container_task_id, but three call sites kept routing the
*env/override lookup* through that collapsed id:
- the foreground exec path read _task_env_overrides[effective_task_id],
yet register_task_env_overrides writes under the raw task_id, so a
CWD-only override's cwd was silently dropped (env spun up at the wrong
root, exit 126);
- the get-or-create env lookup keyed solely on effective_task_id, so an
env cached under the raw task_id was missed and duplicated;
- register_task_env_overrides synced the new cwd onto the env under the
collapsed id, missing a live env cached under the raw task_id.
Container *identity* still collapses to 'default' (sharing preserved);
only the per-session env/override *lookup* now prefers the raw task_id and
falls back to the collapsed id. Fixes the 3 regressions in
test_terminal_task_cwd.py left red by #41822.
eslint --fix (import sort + padding-line-between-statements) on sidebar/index.tsx
after cherry-picking @dangelo352's commits; add release.py AUTHOR_MAP entry so
CI doesn't block on the unmapped author email.
gateway/run.py is the largest god file (20k LOC, GatewayRunner with 220
methods). This lifts the cohesive kanban-watcher cluster — _kanban_notifier_watcher,
_kanban_dispatcher_watcher, _kanban_advance/unsub/rewind, _deliver_kanban_artifacts
(~1,035 LOC, 6 methods) — into gateway/kanban_watchers.py as a mixin that
GatewayRunner inherits.
Mixin (not free functions) because the methods use only self state: inheriting
keeps every self._kanban_* call site working unchanged via the MRO, making this
a behavior-neutral move. The methods' lazy imports (_kb, _decomp, _load_config,
Platform) travel with them; the mixin needs only stdlib + a matching
logging.getLogger('gateway.run').
run.py 20187 -> 19157 LOC; GatewayRunner direct methods 220 -> 214.
Behavior-neutral: gateway test suite 6582 passed / 0 failed; start() still wires
both watchers via self._kanban_*; MRO resolves all 6 to the mixin. One test
(corrupt-board quarantine retry) keyed its time-travel mock on the caller's
filename being gateway/run.py — updated to also accept gateway/kanban_watchers.py.
Establishes the mixin-extraction pattern for further GatewayRunner decomposition
(the 2406-LOC _run_agent and 1164-LOC _handle_message remain, but their callback
closures need a context-object redesign — deferred).
When register_task_env_overrides is called with only a 'cwd' key
(ACP adapter workspace tracking), the task_id should collapse to
'default' so all interactive surfaces (TUI, gateway, dashboard)
share one long-lived container.
Previously, any override registration — even CWD-only — caused
_resolve_container_task_id to return the session key unchanged,
spinning up a separate container per session. This made it
impossible to authenticate into external services once and have
that auth available across all surfaces.
Now only overrides containing isolation keys (docker_image,
modal_image, singularity_image, daytona_image, env_type) trigger
per-task container isolation.
Fixes#37361
Subcommands whose handler was a closure defined inside main() — memory, acp,
tools, insights, skills, pairing, plugins, mcp, claw — have their handler
promoted to a top-level function and their parser block extracted into
hermes_cli/subcommands/<name>.py (build_<name>_parser, injected handler).
These 9 had zero closure-over-main-locals, so promotion is a pure relocation.
acp/mcp parser blocks use the shared add_accept_hooks_flag helper.
main() 1798 -> 954 LOC (71% below the 3297 Phase-2 starting point);
add_parser calls in main.py 89 -> 28.
Deferred: sessions, computer-use, secrets handlers reference <name>_parser
(for a no-subcommand print_help fallback) — left in place to avoid the
_self_parser indirection; minority, low value.
Behavior-neutral: all 9 subcommands' --help (incl nested subactions) byte-
identical to pre-extraction (diff-verified). tests/hermes_cli/ 6519 passed /
0 failed; new test_subcommands_followup.py covers the 9 builders.
run_conversation's inner retry loop tracked recovery state in ~15 scattered
bare booleans (per-provider OAuth refresh guards, format-recovery guards,
restart signals). They are now fields on a single TurnRetryState dataclass the
loop mutates in place (_retry.<flag>), giving the recovery bookkeeping a named,
testable home.
Loop-control vars (retry_count, max_retries, max_compression_attempts) stay as
plain locals — they're while-mechanics, not recovery bookkeeping.
Behavior-neutral: pure local→attribute rewrite of 42 references; kwarg NAMES
preserved (e.g. has_retried_429=_retry.has_retried_429). Live simple + tool
turns OK.
Validation: tests/run_agent/ 1615 passed / 0 failed under per-file process
isolation; new test_turn_retry_state.py pins the field contract.
When context compaction rotates agent.session_id, it updates the gateway/tools
session context (set_current_session_id -> HERMES_SESSION_ID env + ContextVar)
but never updates the separate logging session context. The [session_id] tag on
log lines comes from hermes_logging._session_context (set once per turn in
conversation_loop.py), so post-compaction log lines in the same turn carry the
STALE old id while the message/DB/gateway state carry the new one — breaking log
correlation exactly at the compaction boundary.
Call hermes_logging.set_session_context(agent.session_id) alongside the existing
set_current_session_id, guarded so a logging failure can't regress the routing
update. Logs-only; no runtime or caching impact.
Refs #34089
The curator's idle-archival path (apply_automatic_transitions under
prune_builtins) could archive the bundled `plan` skill, killing the
/plan slash command silently — typing /plan then returned 'Unknown
command' with no signal that a skill had vanished. The archived skill's
hash stays in .bundled_manifest, so 'hermes update' wouldn't re-seed it.
Add PROTECTED_BUILTIN_SKILLS ({plan}) enforced at the master gate
is_curation_eligible() (covers archive_skill + the transition walk) and
in the candidate enumerator (so the LLM consolidation pass never sees
them). Immune to prune_builtins, pin state, and LLM judgment.
Closes#33617. Adds additive _meta.hermes.sessionProvenance to ACP session
surfaces so clients can detect compression-driven internal session rotation
without parsing status text, guessing from token drops, or reading state.db.
Derived on demand from the existing compression chain (parent_session_id /
end_reason) — no new persisted state, no schema change, no ACP protocol change.
ACP session_id stays the stable client handle.
- acp_adapter/provenance.py: derive provenance from SessionDB
- server.py: attach _meta to new/load/resume responses; emit a
session_info_update when the internal head rotates during a prompt
Salvage follow-up for PR #33221 — the cherry-picked commit is authored
under martin.alca@gmail.com (not the draixagent@gmail.com already mapped),
which would fail the CI author-attribution gate.
VolcEngine's api/plan endpoint occasionally leaks raw XML attribute
fragments into tool_use.name when its protocol-translation layer
converts the model's native XML-style tool emission to Anthropic
Messages tool_use blocks, producing names like:
terminal" parameter="command" string="true
execute_code" parameter="code" string="true
session_search" parameter="session_id" string="true
The corruption happens server-side at the provider, but it breaks
every tool call for affected users — no normalization rule in
repair_tool_call can rescue them, so each request runs through three
retries and then aborts as partial.
Add an early sanitizer in agent_runtime_helpers.repair_tool_call that
trims at the first ' " ', " ' ", '<', or '>' character (idx > 0
only) so the rest of the existing repair pipeline (lowercase /
snake_case / fuzzy match) can resolve the cleaned name normally.
Whitespace is deliberately NOT a separator — the legitimate
"write file" -> write_file repair path (covered by
test_space_to_underscore) must keep working.
Tests: 11 new regression cases in TestVolcEngineXmlPollution
covering all three observed polluted names, CamelCase + pollution
mix, single-quote variants, angle-bracket variants, clean-name
passthrough, and the whitespace-preservation guard. All 18 pre-
existing repair tests still pass (29 total in the file).
Replace the ACP-local prefix/suffix matcher + helper with a single
startswith() check against INTERRUPT_WAITING_FOR_MODEL_PREFIX, now
defined once in conversation_loop.py where the sentinel is produced.
Keeps the source of truth in one place so the guard cannot drift if
the status string changes. Net -17 LOC in server.py.
Also add lsaether to release.py AUTHOR_MAP.
The auxiliary Codex adapter maintained its own chat->Responses conversion
loop that forwarded every non-system message's role verbatim into
Responses input[]. When flush_memories()/compression replayed session
history containing assistant tool_calls + role=tool results, those tool
messages leaked into the request and the Responses API rejected them with
HTTP 400: Invalid value: 'tool'.
Route _CodexCompletionsAdapter.create() through the same shared converter
the main agent transport uses (_chat_messages_to_responses_input), so tool
calls become function_call items and tool results become function_call_output
items with a valid call_id. Single conversion path means no future drift.
Also remove the now-dead _convert_content_for_responses() helper — its only
caller was the private conversion loop this change deletes.
Co-authored-by: ProgramCaiCai <techxacm@gmail.com>
Batch extraction of every remaining subcommand whose handler is top-level and
whose parser block is pure argparse: model, setup, postinstall, whatsapp, slack,
login, logout, auth, status, webhook, hooks, doctor, security, dump, debug,
backup, import, config, version, update, uninstall, dashboard, gui, logs,
prompt-size.
Each becomes hermes_cli/subcommands/<name>.py with build_<name>_parser() and an
injected handler (no main import). dashboard also injects cmd_dashboard_register
for its nested 'register' action.
Behavior-neutral: all 25 subcommands' --help output (and nested subaction help)
diff-verified byte-identical to pre-extraction. Two RawDescriptionHelpFormatter
epilogs (debug, logs) needed their multi-line string interiors preserved at
column 0 — caught by the --help diff, not compile.
main() 3297 -> 1798 LOC across this PR; add_parser calls in main.py 179 -> 89.
Validation: tests/hermes_cli/ 6476 passed / 0 failed under per-file process
isolation; new test_subcommands_batch.py smoke-tests all 25 builders + the
dashboard two-handler case.
Follow-on to the cron extraction in the same Phase 2 PR. Same pattern:
per-group build_<name>_parser() functions with injected handlers, no main
import.
- subcommands/profile.py: build_profile_parser (190-line block out of main()).
- subcommands/gateway.py: build_gateway_parser (gateway + proxy, 238-line block;
they shared one inline section). Imports argparse for SUPPRESS defaults.
- main(): two more inline blocks become single builder calls.
Behavior-neutral: 'profile [sub] --help' and 'gateway/proxy [sub] --help'
byte-identical to pre-extraction (diff-verified).
main() now 2723 LOC (was 3297 at Phase 2 start); add_parser calls in main.py
179 -> 141.
Validation: tests/hermes_cli/ 6476 passed / 0 failed under per-file process
isolation; new builder unit tests cover subactions, aliases, dispatch, flags.
Phase 2 of the god-file decomposition plan. main()'s argparse tree is 179
inline add_parser calls in one 3,297-line function. This establishes the
hermes_cli/subcommands/ package and extracts the first group (cron) as the
proof-of-pattern:
- hermes_cli/subcommands/_shared.py: shared parser helpers (add_accept_hooks_flag),
re-exported from main.py for backwards compat.
- hermes_cli/subcommands/cron.py: build_cron_parser(subparsers, cmd_cron=...).
Handler injected so the module never imports main (cycle avoidance).
- main()'s ~155-line inline cron block becomes one build_cron_parser() call.
Behavior-neutral: 'hermes cron create --help' output is byte-identical to
origin/main. main() 3297 -> 3143 LOC.
Validation: tests/hermes_cli/ 6466 passed / 0 failed under per-file process
isolation; new test_subcommands_cron.py covers subactions, aliases, options,
no-agent tristate, injected dispatch, and --accept-hooks.
Phase 1 of the god-file decomposition plan. run_conversation's ~470-line
once-per-turn setup block (stdio guarding, retry-counter resets, user-message
sanitization, todo/nudge hydration, system-prompt restore-or-build,
crash-resilience persistence, preflight compression, the pre_llm_call hook, and
external-memory prefetch) is moved verbatim into build_turn_context(), which
returns a TurnContext dataclass the loop unpacks.
Behavior-neutral move-and-name refactor: the builder mutates `agent` exactly as
the inline code did; only the locals the loop reads back are returned.
- run_conversation: 4602 -> 4217 LOC (-385)
- agent/conversation_loop.py: 4965 -> ~4580 LOC
- new agent/turn_context.py: focused, dependency-injected, unit-tested in isolation
Tests: tests/run_agent/ 1570 passed / 0 failed under per-file process isolation.
Relocation follow-ups: 413_compression mocks now patch both module references;
nudge/on_turn_start source-inspection guards point at the extracted module.
* fix(memory): make overflow errors instruct in-turn consolidation + retry
When bounded memory is full, the add/replace overflow errors now explicitly
tell the model to consolidate (merge/remove/shorten) and retry the write in
the same turn, matching the documented behavior. The replace-overflow path
now also echoes current_entries + usage for parity with add-overflow, so the
model has the same context to act on.
Closes#23378 (working-as-documented; this sharpens runtime to match docs).
* fix(memory): broaden overflow remediation hint beyond 'stale'
Say 'stale or less important' — entries don't have to be stale to be the
right ones to drop when making room.
Session-scoped /model and /reasoning overrides were silently lost on
Telegram DM/forum topics and after compression session splits (#30479).
Root cause: _handle_message_with_agent rewrites source.thread_id via
_recover_telegram_topic_thread_id (lobby/stripped reply -> the user's
bound topic) before deriving the session key. The /model and /reasoning
handlers derived their override key from the raw inbound event.source,
skipping that recovery, so the override was stored under one key and the
next message turn read a different key.
Fix: add _normalize_source_for_session_key (applies the same recovery a
message turn does) and use it in both handlers before deriving the key.
session_id rotation on compression was never the cause — overrides are
keyed by the durable session_key; the split path preserves it.
Author: teknium1 <127238744+teknium1@users.noreply.github.com>
The conflict-retry path called asyncio.get_event_loop() to reschedule
itself when a retry's start_polling raised. On Python 3.11+ (our floor)
that raises 'RuntimeError: There is no current event loop in thread
MainThread' when no loop is attached to the thread, which is what
happens when PTB dispatches this error callback. The retry never gets
scheduled, the adapter goes silent-but-alive, and gateway --replace
keeps spawning fresh instances that hit the same wall — the crash loop
reported in #19471 (worse under multi-profile, where two bots hold the
same conflict open).
We are inside a coroutine here, so asyncio.get_running_loop() is the
correct, guaranteed-valid replacement. Only get_event_loop() call in
any platform adapter, so no sibling sites.
Fixes#19471
ContextCompressor inherited a no-op on_session_end() from ContextEngine, so
per-session iterative-summary state (_previous_summary) survived a real session
boundary on a reused compressor instance. Override it to clear the summary the
moment the owning session ends, complementing the point-of-use guard in
compress(). Closes the cross-session contamination path in #38788.
Co-authored-by: dusterbloom <32869278+dusterbloom@users.noreply.github.com>
When a cron or background session compacts, it sets _previous_summary for
iterative updates. If that session ends without /new or /reset (which calls
on_session_reset()), the stale summary survives on the ContextCompressor
instance. A subsequent live messaging session's compaction then injects it as
'PREVIOUS SUMMARY:' into the summarizer prompt — contaminating the live
session with unrelated content from the prior session.
Add an else guard in compress(): when no handoff summary is found in the
current messages but _previous_summary is non-empty, discard it so
_generate_summary() starts fresh instead of iteratively updating a stale
cross-session summary.
Fixes#38788
The module docstring and get_timezone()/cache comments documented a
reset_cache() helper for forcing tz re-resolution after config changes,
but the function was never defined — doc-followers calling it hit
AttributeError. Adds the helper to clear the cached tz state.
Surfaced in #32043.
build_session_key collapsed every DM that arrived without a chat_id into
one shared 'agent:main:<platform>:dm' key. A single cached AIAgent then
served multiple users' conversations, bleeding history across senders.
DMs now fall back to the sender's user_id_alt/user_id (mirroring the
group-path participant precedence and the telegram auth-path fallback)
before the bare per-platform sink. Telegram's normal event path always
sets chat_id, so this hardens the synthetic-source / non-standard-adapter
paths that don't.
The per-session compression lock prevents same-window concurrent forks but
not cross-turn ones: the background-review fork shares the parent's
session_id, so if it won a compression race its new child session was never
adopted by the gateway (the fork is single-lifecycle). The next foreground
turn then started from the stale parent and compressed it again, leaving the
same parent with two sibling children.
Set review_agent.compression_enabled = False so the fork never triggers
compression. Both trigger sites in conversation_loop.py gate on
compression_enabled before calling _compress_context, so the fork can never
rotate the shared parent. Review needs full context anyway — compressing
would degrade the memory/skill summary.
The per-session lock is kept as defense-in-depth for any future shared-session
path. Adds a regression test that fails without the flag and passes with it.
Closes#38727
Subagents delegated to a custom endpoint were misrouted when the parent
ran on a different custom endpoint. Both runtimes collapse to
provider="custom", so _resolve_child_credential_pool() treated them as
interchangeable and handed the child the parent's pool. Leasing from it
then overwrote the child's delegated base_url with the parent's endpoint
via _swap_credential() — the child sent the delegated model name to the
wrong endpoint.
Custom runtimes now resolve by endpoint identity (the custom:<name> pool
key derived from base_url). The parent pool is reused only when both
parent and child resolve to the same custom endpoint; unregistered raw
endpoints return None so the child keeps its fixed delegated credential.
Non-custom provider paths are unchanged.
Fixes#7833.
A packaged desktop app launches to a blank page with a bare
ERR_FILE_NOT_FOUND when dist/index.html isn't in the bundle (#39484).
This happens when the build step fails (e.g. a stale checkout that
fails typecheck) but electron-builder packages anyway, shipping an
empty dist/.
- build-time: scripts/assert-dist-built.cjs runs at the tail of the
`build` script and aborts before electron-builder if dist/index.html
or the vite JS bundle is missing/empty. Every packaging path
(pack, dist*) inherits it via `npm run build &&`.
- runtime: resolveRendererIndex() now logs a clear 'packaged without a
renderer bundle — rebuild with hermes desktop --force-build' message
when no index.html exists, instead of silently loading a missing path.
- runtime: resolveWebDist() logs when it falls back to an asar-internal
dist that isn't a real directory (the dashboard 404 class, #41327/#39472),
rather than returning an unservable path silently.
Adds scripts/assert-dist-built.test.cjs (node:test) covering the guard.
The dashboard backend serves HTTP 404 on all static routes (/, /assets,
/health) in packaged builds because resolveWebDist() points at
app.asar.unpacked/dist/, but dist/** was not listed in asarUnpack.
Add dist/** to the asarUnpack glob list so electron-builder extracts the
built frontend assets alongside the asar archive, making them accessible
to the Express static file server at runtime.
Fixes#41327
Inspired by Claude Code's /simplify. A bundled skill that captures recent
changes via git diff, fans out three focused reviewers (reuse, quality,
efficiency) via delegate_task batch mode, then aggregates findings and
applies the fixes worth applying.
Zero core changes — orchestrates existing tools (terminal/git, search_files,
delegate_task). Supports focus, dry-run, and scoped-diff modifiers.
Closes#379.
- web_server.py: after proc.poll() returns a non-None exit code, call
proc.wait() to reap the child and move the entry from _ACTION_PROCS
to _ACTION_RESULTS. Previously .poll() alone left <defunct> zombies.
- meet_bot.py: terminate and wait on the pcm_pump subprocess (paplay/
ffmpeg) during the finally-block teardown. Previously leaked on every
normal bot exit.
- tests: add test_action_status_reaps_completed_process and
test_action_status_ignores_wait_failure covering both the happy path
and the wait()-raises-OSError edge case.
Closes#38032
The copytree ignore lambda in _copy_dist_payload applied USER_OWNED_EXCLUDE
recursively at every directory depth. This caused nested directories whose
names matched exclude entries (bin, logs, cache, etc.) to be silently dropped
during distribution install/update.
Fix: only apply USER_OWNED_EXCLUDE filtering at the root of the staged tree,
matching the two-tier pattern used by _clone_all_copytree_ignore and
_default_export_ignore in profiles.py.
Add 5 tests covering nested bin/logs/cache preservation and top-level
filtering still working.
Fixes#37954
The WeChat iLink typing ticket has a 600-second TTL. When a long-running
session exceeds that window, the cached ticket evicts from TypingTicketCache.
Both send_typing and stop_typing silently returned early when the ticket was
None, meaning the TYPING_STOP=2 signal was never sent to iLink. The WeChat
client then showed the typing indicator indefinitely.
Fix: add _ensure_typing_ticket() that transparently refreshes the ticket
via getConfig when the cached one has expired or is missing. Both send_typing
and stop_typing now call this method instead of silently no-oping.
Fixes#38085
Problem: get_model_context_length() had an early return at the end of the
custom-endpoint probe branch (step 3) that returned DEFAULT_FALLBACK_CONTEXT
(256K) without ever consulting the hardcoded DEFAULT_CONTEXT_LENGTHS catalog
(step 8). Models served through a custom/proxied gateway (e.g. corporate
Anthropic proxy) that didn't expose Ollama or local-server endpoints would
hit this path and get capped at 256K, even when the model name clearly
matched a known entry in the catalog (e.g. claude-opus-4-8 → 1M).
Changes:
- agent/model_metadata.py: Before returning DEFAULT_FALLBACK_CONTEXT at the
end of the custom-endpoint branch, consult DEFAULT_CONTEXT_LENGTHS using
the same longest-key-first fuzzy matching as step 8. Only fall through
to 256K if no catalog entry matches.
- tests/agent/test_model_metadata.py: Updated existing test and added new
test covering the custom-endpoint → catalog fallback behavior.
Fixes#38865
When ~/.hermes/profiles/default/ exists as a directory, list_profiles()
returns 'default' twice: once as the built-in default profile (~/.hermes)
and once from the directory scan (~/.hermes/profiles/default).
This causes the cron dashboard API (profile=all) to read the same
jobs.json twice, showing every default-profile job duplicated in the UI.
Fix: skip name=='default' in the named profiles loop, since it's already
added as the built-in default at the top of the function.
Fixes#39346
Xiaomi MiMo (and potentially other providers) support multimodal user
messages but reject list-type tool message content with 400 'text is not
set'. Previously this was handled reactively — the API call would fail,
images would be stripped, and the request retried, losing visual info.
Fix: add supports_vision_tool_messages field to ProviderProfile (default
True). Xiaomi sets it to False. _tool_result_content_for_active_model
now checks this field proactively and returns a text summary instead of
list content, avoiding the round-trip failure entirely.
_supports_vision_override() in image_routing.py checked model.supports_vision
and providers.<name>.models, but not the legacy list-style custom_providers
config. A custom provider entry like:
custom_providers:
- name: my-provider
models:
my-model:
supports_vision: true
was ignored, causing image_input_mode=auto to route through the auxiliary
vision_analyze path instead of natively attaching images.
Fix: added a lookup step for custom_providers list entries, matching by
provider name (including 'custom:<name>' variants at runtime).
providers.<name>.models still takes precedence over custom_providers.
13 new tests covering: true/false override, custom: prefix matching,
no-match fallback, non-dict entries, empty lists, models key missing.
On older systemd versions that don't support RestartMaxDelaySec /
RestartSteps, the installed unit file has those directives silently
dropped. systemd_unit_is_current() did a strict text comparison, so
the unit was perpetually flagged as outdated.
Fix: _strip_optional_systemd_directives() removes RestartMaxDelaySec
and RestartSteps from both the installed and expected text before
comparison. Units that differ only by these optional directives are
now correctly considered current.
When summary_target_ratio is large (e.g. 0.45) and the context_length is
moderate (e.g. 96000), the soft_ceiling (token_budget * 1.5) can exceed
the total transcript size. _find_tail_cut_by_tokens walks the entire
transcript without breaking early, and the resulting compress window is
either empty (compress_start >= compress_end) or a single message whose
summary-of-one overhead saves ~0 tokens.
Both outcomes cause a no-op compression that does not increment
_ineffective_compression_count, so should_compress() returns True on
every subsequent turn and the loop repeats endlessly.
Fix (two layers):
1. _find_tail_cut_by_tokens: when the backward walk consumed the entire
transcript without breaking (cut_idx <= head_end and accumulated <=
soft_ceiling), re-walk with the raw (non-inflated) token budget to
find a meaningful cut that gives the summarizer a useful middle window.
2. compress(): when compress_start >= compress_end, increment
_ineffective_compression_count and log a warning so the existing
anti-thrashing guard in should_compress() can break the loop.
Fixes#40803
The custom/Ollama provider profile had no default_max_tokens, so no
max_tokens was sent on requests and Ollama fell back to its internal
num_predict=128 — truncating responses after a few tokens with
finish_reason='length' (#39281, e.g. gemma4).
max_tokens resolution is ephemeral > user model.max_tokens > profile
default, so this is only a floor used when the user hasn't set their own
cap. Set it to 65536 (matching the qwen-oauth tier) rather than a
conservative value, since users can always override per-model.
Fixes#39281
Two findings from Copilot's review on #15464, both addressed:
1. ``event.get("thread_ts")`` truthy vs
``event_thread_ts != ts``: the new channel branch treated ANY
truthy ``thread_ts`` as a real thread reply, but three lines below
``is_thread_reply`` is defined with the stricter
``event_thread_ts and event_thread_ts != ts`` invariant. If Slack
ever ships a payload where ``thread_ts == ts`` on a thread root,
the stricter check would treat it as a top-level message for the
``is_thread_reply`` path but as a thread reply for session keying
— divergent behaviour. Aligned this branch to the same
``and event_thread_ts_raw != ts`` invariant.
2. ``test_top_level_reply_to_id_stays_none_when_shared`` docstring
had the ternary logic backwards ("None != ts → reply_to_message_id
IS set"). The code reads
``reply_to_message_id = thread_ts if thread_ts != ts else None`` —
with ``thread_ts = None``, the condition is True so the expression
evaluates to ``thread_ts`` itself (None), meaning the reply stays
un-threaded. The test asserted the correct end-state; only the
explanatory docstring was wrong. Rewrote the docstring to match
the actual code flow, with the note that Copilot caught the
reversal.
7/7 tests still pass. No behaviour change for the existing
test_thread_reply_scopes_by_thread_even_when_shared case because
``event_thread_ts_raw = "1700000000.000000"`` and ``ts =
"1700000000.000005"`` are distinct — the new
``!= ts`` guard is a no-op there.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Top-level Slack channel messages previously fell back to the message's
own ``ts`` as a synthetic ``thread_ts``:
thread_ts = event.get("thread_ts") or ts # ts fallback for channels
That value flows into ``build_source(thread_id=thread_ts)`` at
line 1247. The gateway session store keys sessions by
``(platform, channel_id, thread_id)``, so every top-level channel
message ended up on a unique session. Operators who set
``reply_in_thread: false`` in ``config.yaml`` expected all top-level
channel messages to share one session (the whole point of that flag)
— instead each one spawned a fresh conversation with no context
carry-over.
### Fix
Three explicit cases in the channel branch:
| event.thread_ts | reply_in_thread | thread_ts for session keying |
|---|---|---|
| non-null (real thread reply) | either | event.thread_ts |
| null (top-level) | true (default) | ts (legacy: own-thread sessions) |
| null (top-level) | false | **None** (shared channel session) |
The outbound-reply gate at line 1264 (``reply_to_message_id =
thread_ts if thread_ts != ts else None``) still works correctly in
all three cases without further changes: ``None != ts`` is True, so
shared-channel top-level messages don't get their reply threaded
either — matching the operator's ``reply_in_thread=false`` intent
end-to-end.
Genuine thread replies still scope per-thread under both modes so
multi-person threaded conversations can't collide with unrelated
channel chatter.
### Tests (7 new in ``tests/gateway/test_slack_channel_session_scope.py``)
All drive the real ``SlackAdapter._handle_slack_message`` code path
(not a re-implementation) via the standard pytest fixture pattern
used by ``tests/gateway/test_slack.py``. Messages @mention the bot
so the mention gate doesn't drop them — the tests are specifically
about what happens once the handler decides to emit a ``MessageEvent``.
* ``TestChannelSessionScopeDefault`` (2 cases):
- Explicit ``reply_in_thread: true`` keeps ``thread_id = ts``
(legacy behaviour — regression guard)
- Unset config behaves like ``reply_in_thread: true`` (pins the
default)
* ``TestChannelSessionScopeShared`` (3 cases):
- ``reply_in_thread: false`` + top-level → ``thread_id is None``
(the #15421 bug 1 fix)
- ``reply_to_message_id is None`` in the same case (no threaded
outbound reply)
- Genuine thread reply still scopes per-thread when shared mode is
on — only TOP-LEVEL messages collapse to the channel session
* ``TestThreadReplyAlwaysScopesByThread`` (2 parametrised cases):
- Thread replies get ``thread_id = event.thread_ts`` regardless of
``reply_in_thread`` — critical invariant for multi-thread
channels; a regression here would leak per-thread context across
threads
**Regression guard verified**: reverted the else-branch to the legacy
``thread_ts = event.get("thread_ts") or ts`` one-liner;
``test_top_level_maps_to_none_when_reply_in_thread_false`` correctly
failed (asserts ``thread_id is None`` but got ``"1700000000.000003"``).
Restored → 182 slack tests pass (175 existing + 7 new).
Scope: this fixes#15421 bug 1 only. Bug 2 (sessions.json not
persisting across compression) lives elsewhere in the session
manager and is left for a separate diff.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Desktop zoom shortcuts (Cmd/Ctrl +/-/0) and the View menu only called
webContents.setZoomLevel(), which mutates the live renderer but persists
nothing. On reload, renderer crash/restart, or page recreation the app
snapped back to the default zoom, so the shortcuts felt broken for users
who need larger text.
Persist the selected zoom in the renderer's own localStorage rather than a
main-process JSON file. localStorage is per-origin and survives the
renderer lifecycle automatically, so there's no atomic-write/userData file
machinery to maintain. The main process still owns setZoomLevel: every
zoom change is mirrored into localStorage via executeJavaScript, and the
value is read back and re-applied on did-finish-load (covering reloads and
crash recovery). Clamping to Electron's [-9, 9] range now happens once in
setAndPersistZoomLevel instead of at each call site.
* feat(desktop): hover-reveal collapsed chat sidebar as a fixed overlay
When the sessions sidebar is collapsed, hovering the left edge now floats
it back in as a fixed overlay over the main content instead of just being
hidden. The collapsed grid track stays at 0px so the panel never reserves
space — it slides over whatever's underneath and retracts on pointer-leave.
- PaneShell: new hoverReveal prop. When a pane is collapsed + hoverReveal,
render an edge hot-zone + a side-anchored floating panel (absolute, full
height, honors any persisted resize width) that slides in on hover/focus.
- ChatSidebar: force the (otherwise opacity-0 when collapsed) sidebar fully
visible + interactive while the overlay is revealed, via an
in-data-[pane-hover-reveal=open] variant.
- desktop-controller: opt the chat-sidebar pane into hoverReveal.
* feat(desktop): lower window minWidth 900→400
Lets the window shrink to a narrow rail (e.g. for the collapsed
hover-reveal sidebar) instead of being floored at 900px.
* fix(desktop): render full sidebar content in hover-reveal overlay
The hover-reveal overlay showed only the nav rail — session rows, search,
pinned/recents were gated behind `sidebarOpen` (false while collapsed), so
they never mounted in the floated panel.
Add a $sidebarRevealed store the PaneShell overlay drives via a new
onHoverRevealChange callback, and gate ChatSidebar's content on
`sidebarOpen || sidebarRevealed` (contentVisible) instead of raw open
state. The overlay now shows the complete sidebar.
* fix(desktop): drop shadow on hover-reveal sidebar overlay
* feat(desktop): hover-reveal the file-browser sidebar too
The reveal mechanism already lives in the shared Pane primitive — the
right rail just opts in with hoverReveal. Its content renders
unconditionally, so (unlike the chat sidebar) it needs no extra
content-visibility gating.
* clean(desktop): tighten hover-reveal pane code
KISS pass — flatten the translate ternary, derive a single `revealed`,
inline the edge style, drop the redundant set-guard, and trim comments to
the house one-liner style. No behavior change.
* fix(desktop): stop hiding sidebar nav labels on narrow windows
The nav labels (New session, Skills, …) and the ⌘N hint were gated on a
viewport breakpoint (max-[46.25rem]:hidden), so shrinking the window hid
them even when the sidebar itself was wide — including in the hover-reveal
overlay. Drop the gate; the label already truncates (min-w-0 flex-1) so it
ellipsizes gracefully in a narrow rail, and contentVisible already hides it
when collapsed to the icon rail.
* feat(desktop): auto-collapse both sidebars below 600px into hover-reveal
Add a Pane `forceCollapsed` prop — collapses the track without writing to
the store (so the saved open state restores when the window widens) while
keeping hoverReveal alive (unlike `disabled`, which suppresses it).
desktop-controller watches (max-width: 600px) and force-collapses the chat
sidebar + file browser, so on a narrow window both rails get out of the way
and the hover-reveal overlay becomes the way in.
* feat(desktop): hover-intent + refined easing for sidebar reveal
- Gate the reveal on pointer velocity: the full-height edge hot-zone now
only arms on a slow, deliberate pass (<=0.55 px/ms). Fast sweeps toward
the titlebar/statusbar — or off the window — blow past the threshold and
never trigger, so the wide hit area stops being a nuisance.
- Swap the slide easing to cubic-bezier(0.32,0.72,0,1) at 260ms (snappy-out,
soft-land) for a more serious-app feel.
* fix(desktop): don't reveal sidebar during window resize
Resizing the window parks the cursor on the screen edge and fires slow
pointermoves over the hot-zone, reading as deliberate intent. Guard the
reveal on (a) e.buttons !== 0 — any button-held drag, incl. edge-resize —
and (b) a 250ms cooldown after any window resize event.
* feat(desktop): hoverIntent-style poll gate + inert contents during slide
Replace the single-sample velocity check (too eager — fired on any one slow
move, incl. resize drift) with a port of Brian Cherne's hoverIntent: poll
the pointer every 90ms and only arm once it has *settled* (moved <5px between
two consecutive polls inside the edge zone). Fly-bys, pass-throughs, and
resize drift never produce two close samples in a row, so they don't trigger.
Also keep the revealed panel's CONTENTS pointer-events-none until the slide-in
transition finishes (onTransitionEnd → settled), so you can't misclick a
session row mid-animation. Resets on retract.
* fix(desktop): no cursor/hit-test leak before reveal settles
The edge hot-zone showed cursor:pointer the instant the pointer touched it —
before the panel was armed or in view. And contents were inert but the panel
itself still hit-tested, so the cursor could flip mid-slide. Fix: hot-zone is
cursor-default (it's invisible), and the whole panel is pointer-events-none
until revealed && settled, so the cursor never changes or lands on a row
before the slide-in finishes.
* fix(desktop): geometry-driven close so revealed panel always retracts
The revealed panel relied on its own onPointerLeave to close — but a panel
that slid in under a still cursor (or whose contents were inert during the
slide) never fires enter/leave, so it got stuck open (esp. the file browser).
onTransitionEnd also bubbled from the file-tree's own row transitions,
tripping the settled flag wrongly.
Replace with a document-level pointermove watcher that closes once the cursor
leaves the panel's bounding rect + a 24px grace — independent of pointer-events
state or what the contents do. Gate interactivity on a simple slide-duration
timer (interactive) instead of the fragile transitionEnd, so the cursor still
can't flip or land on a row before the panel is in view.
* feat(desktop): make sidebar toggle shortcuts reveal when force-collapsed
mod+b / mod+j were no-ops on a narrow (force-collapsed) window — they
flipped the store but the pane ignores it. Now the toggle handlers also
dispatch PANE_TOGGLE_REVEAL_EVENT; a force-collapsed Pane listens (only while
overlayActive) and flips its hover-reveal, so the shortcut floats the rail in
(and back out) at this responsive breakpoint.
* refactor(desktop): name the 600px sidebar collapse breakpoint
Hoist the inline '(max-width: 600px)' literal into
SIDEBAR_COLLAPSE_BREAKPOINT_PX + SIDEBAR_COLLAPSE_MEDIA_QUERY in
layout-constants, so the responsive collapse point is a single named source
of truth instead of a magic string in the controller.
* tweak(desktop): sidebar auto-collapse breakpoint 600px -> 768px
768 is the standard md breakpoint and a more honest 'no room to dock' point.
* tweak(desktop): halve sidebar reveal slide duration 260ms -> 130ms
* Revert "tweak(desktop): halve sidebar reveal slide duration 260ms -> 130ms"
This reverts commit 6009a13200.
* perf(desktop): pre-mount hover-reveal contents to kill slide-in stall
The reveal mounted the (heavy, virtualized) sidebar contents in the same
frame the slide started, so the browser stalled painting the transform until
the mount finished — a ~100-200ms beat before the panel moved, very visible
on the instant keyboard toggle (hover masked it via the 90ms intent poll).
Report overlayActive (collapsed-overlay mode) rather than the live reveal
state to the mount consumer, so contents stay mounted off-screen while
collapsed and reveal is a pure transform. Visibility is still driven
separately by the data-pane-hover-reveal attr + the slide transform.
* fix(desktop): make reveal hotkey spammable
Two throttles on the reveal toggle:
- The handler fired both the reveal event AND toggleSidebarOpen() per press;
the store write hits localStorage synchronously every keystroke + recomputes
the grid, janking rapid presses. When collapsed, only dispatch the reveal
event (the store toggle was a no-op anyway).
- The geometry close-watcher slammed a keyboard-opened panel shut on the first
stray pointermove (trackpad jitter), fighting hotkey spam. Keyboard reveals
now ignore geometry until the cursor actually enters the panel, then the
mouse takes over.
* fix(desktop): inset reveal hot-zone past the OS window-resize gutter
The hot-zone sat flush at the window edge (left-0/right-0), overlapping the
OS resize grab strip — reaching to drag-resize naturally slows the cursor
there, which hoverIntent reads as settled and reveals before the resize drag
even starts. Inset the hot-zone 8px so the outermost edge stays a pure
resize/drag region and only an intentful move just inside it arms a reveal.
* fix(desktop): keep reveal hot-zone at edge, gate arming past resize gutter
Insetting the hot-zone made it unreachable when moving fast. Instead, anchor
the zone flush at the edge (w-4, always captures the pointer) but only ARM the
reveal when the cursor settles >=8px in from the edge — so a resize-reach that
parks on the outermost OS grab strip never triggers, while a deliberate move
into the zone still does. Keeps polling while in the gutter so moving inward
still arms.
* refactor(desktop): rebuild hover-reveal as pure CSS, delete the JS state machine
The hand-rolled pointer state machine (hoverIntent poll, refs, timers, document
pointermove geometry-close, interactive gate, resize cooldowns, keyboard-held
suppression) was fragile and side/instance-specific — hover broke on the right
rail, keyboard toggles triggered phantom animations, resize popped it open.
Replace all of it with the native primitive: CSS group-hover drives the slide
transform; a transition-delay on enter (instant on leave) is the hover-intent
gate (a fast pass-by doesn't dwell long enough to open); a thin edge trigger
inset past the OS resize grab strip arms it; and a single `forced` bool
(data-forced, toggled by the keyboard event) pins it open. Side-agnostic by
construction — group-hover doesn't care which edge or which pane.
Net: ~200 lines of imperative pointer logic → ~40 lines of declarative CSS.
* fix(desktop): don't animate hover-reveal panel across viewport on side flip
Flipping panes changed the off-screen transform from -translateX (off the
left) to +translateX (off the right). transition-transform interpolated
between them, passing through translate-x-0 (fully on-screen) mid-way — so the
hidden panel visibly slid across the window to reach its new hiding spot.
Key the panel on side so it remounts off-screen on the new edge with no
transition to play.
* clean(desktop): tighten hover-reveal markup
KISS pass on the CSS-driven reveal: reuse the existing `side` instead of a
local `left`, move the static duration/ease to inline style (drop two
single-use CSS vars + their arbitrary-value classes, keep only the
state-dependent enter-delay var), and trim comments to the house one-liner
density. No behavior change.
* fix(desktop): inset titlebar past traffic lights when sidebar is force-collapsed
The titlebar content inset (clearing the macOS traffic lights) keyed off the
stored sidebarOpen/fileBrowserOpen, but below the collapse breakpoint both
rails are force-collapsed so the left edge is uncovered while the store still
says open — content (the intro wordmark) overflowed under the lights. Gate
leftEdgePaneOpen on !narrowViewport using the shared SIDEBAR_COLLAPSE_MEDIA_QUERY.
Also rename the now-misleading reveal plumbing to match what it actually does:
onHoverRevealChange -> onOverlayActiveChange, $sidebarRevealed ->
$sidebarOverlayMounted (+ setter/consumer). It reports/stores collapsed-overlay
mode (mount gate), not live reveal state.
* feat(desktop): small --nous-shadow lift on revealed hover-reveal panels
Add a --nous-shadow token (white-based on light, black-based on dark) and apply
it to the floating sidebar panel only while revealed (group-hover / data-forced)
so it reads as lifted off the surface. No shadow on the off-screen panel.
* feat(desktop): shadow-reveal lift on revealed hover-reveal panels
Mirror the --shadow-nous layered falloff into a new --shadow-reveal token whose
drop color flips per mode (white on light, black on dark) via --shadow-reveal-raw
set in :root / :root.dark. Apply the generated shadow-reveal utility to the
floated panel only while revealed (group-hover / data-forced). Leaves the shared
--shadow-nous untouched.
* feat(desktop): use tuned reveal shadow, drop per-mode token
Replace the --shadow-reveal token machinery with Brooklyn's tuned literal
(0 -18px 18px -5px #0000003b) inline per-panel via --reveal-shadow, y-offset
sign flipped for the right side. Same color both modes. Reverts styles.css to
pristine (token removed).
* fix(desktop): use the reveal shadow verbatim, don't invert it per side
Flipping the y-offset sign for the right side inverted the shadow's direction
(cast-up -> cast-down), making it read heavier — not a mirror. The mirror axis
for a left/right panel is offset-x, which is 0 here, so both sides take the
tuned value as-is: 0 -18px 18px -5px #0000003b.
* clean(desktop): hoist reveal shadow to a named const
Move the inline reveal-shadow literal to HOVER_REVEAL_SHADOW alongside the
other HOVER_REVEAL_* tuning consts; drop the now-stale per-side comment.
* fix(desktop): truncate titlebar title before the right tool cluster
The session title used a hardcoded max-w-[52vw] that's blind to where the
right-side tools start, so it ran under them at narrow widths / with pane
tools present. Bound the title container by the same vars the titlebar drag
region uses (--titlebar-content-inset + --titlebar-tools-right +
--titlebar-tools-width) so it truncates exactly at the cluster's left edge.
* fix(desktop): responsive markdown tables — floor width + nowrap headers
The wrapper had overflow-x-auto but the table was w-full with auto layout, so
instead of scrolling it crushed columns until even header words broke mid-word
(Tim/e, Nig/ht). Add a min-w-[18rem] floor so it scrolls horizontally when the
column is narrower than readable, and whitespace-nowrap on th so headers never
break mid-word. Above the floor it still wraps cells naturally.
* fix intro
Adds thread_id and chat_type to the agent:start/end plugin hook context
(via getattr with safe defaults; both are real `source` attrs already used
in gateway/run.py). agent:end inherits them via **hook_ctx. Purely additive
— no prompt/history mutation. Documents the full ctx dict in hooks.py.
Co-authored-by: SNooZyy2 <SNooZyy2@users.noreply.github.com>
The status-bar zap currently toggles per-session approval bypass (the same
scope as the TUI's Shift+Tab). This adds a global escape hatch: Shift+clicking
the zap flips the persistent approvals.mode in config.yaml between "off"
(bypass on) and "manual" (bypass off), affecting every session, the CLI, the
TUI, and cron — and it survives restarts.
- statusbar-controls: thread the click's shiftKey through onSelect via a new
StatusbarSelectModifiers arg.
- yolo-session: add setGlobalYolo() that calls config.set with scope="global".
- use-statusbar-items: branch toggleYolo on modifiers.shiftKey; plain click
stays per-session, Shift+click goes global.
- tui_gateway config.set "yolo" key: add scope="global" that reads/writes
approvals.mode through the gateway's own (mtime-cached) config view, honors
an explicit value, and re-emits session.info to every live session so each
window's zap reflects the flip immediately.
- i18n: tooltip copy in en/ja/zh/zh-hant notes Shift+click toggles globally.
Tests: two new tui_gateway tests cover the global toggle and explicit-value
paths; existing session/process-scope yolo tests still pass.
The todo list is re-injected into the model's context after every
context-compression event (TodoStore.format_for_injection), so an oversized
todo item or an unbounded number of items defeats the compression it is meant
to ride through. TodoStore.write/_validate previously enforced no size or count
bounds, so a single 50KB item produced a ~50KB re-injection block on every
subsequent turn.
Add two caps:
- MAX_TODO_CONTENT_CHARS (4000): per-item content is truncated with a marker.
Routed through a shared _cap_content() so the merge-update path (which writes
content directly, bypassing _validate) is capped too.
- MAX_TODO_ITEMS (256): total list length is bounded, keeping the
highest-priority head (list order is priority).
Both caps are generous relative to real plans — a todo item is a short task
description and active lists are a handful of items.
NOT a security fix. Raised externally via GHSA-5g4g-6jrg-mw3g, which framed a
caller-supplied conversation_history on the authenticated API server replaying
into _hydrate_todo_store as a DoS. That path is authenticated (the API server
refuses to start without API_SERVER_KEY) and self-scoped (the caller supplies
their own entire history and can only inflate their own response chain — forged
role=tool entries are never persisted to the session DB), so it is out of scope
as a vulnerability under SECURITY.md 3.2. These bounds are footgun containment
that also applies to the trusted agent path, where the model itself authors the
todos. Credit to the reporter for the observation.
Co-authored-by: YLChen-007 <30854794+YLChen-007@users.noreply.github.com>
Tool-progress now shows a terminal command in a ```bash fenced block —
full command, no surrounding quotes, no label, no 40-char truncation —
instead of the noisy `terminal: "cmd…"` line, on every platform that
renders markdown code blocks (Telegram, Slack, Matrix, WhatsApp, Feishu,
Weixin, Discord). Plain-text platforms keep the compact preview line.
Gated on a new `BasePlatformAdapter.supports_code_blocks` capability
(default False) rather than a hardcoded platform list, so plugin adapters
(Discord lives in plugins/platforms/) opt in by setting the flag. Applies
to both all/new and verbose progress modes, with a safe fallback when the
command arg is missing or blank.
The desktop chat GUI pinned the viewport to the bottom on every content
growth while a turn streamed, so the window chased tokens as they arrived.
Remove that follow behavior: once a turn is running the viewport stays
exactly where the user left it.
- Delete the streaming ResizeObserver re-pin loop in useThreadScrollAnchor.
- Delete the post-run bottom lock (kept pinning ~1.2s after completion).
- Keep the one-time jump-to-bottom on user submit / new turn / session
change so a freshly submitted message still lands in view.
- Update streaming.test.tsx to assert the viewport no longer follows
streaming growth or snaps down on final code-highlight remeasure.
After sleep/wake, a remote (global-remote) primary backend can become
unreachable, but it has no child process whose 'exit' clears the main
process's cached connectionPromise. The renderer then re-dials the same
dead remote forever and the composer stays stuck on "Starting Hermes…";
only a quit+reopen recovered.
Fix: the renderer's existing backoff-paced reconnect loop now asks the
main process to revalidate the cached connection before re-dialing. The
main process liveness-probes the cached REMOTE backend's public
/api/status and, if unreachable, drops the cache (resetHermesConnection
only nulls connectionPromise for a remote — no child to SIGTERM) so the
next getConnection() rebuilds a reachable descriptor. Local backends are
never touched here; they self-heal via the child 'exit' handler. The
renderer's loop already provides retry pacing and rides out transient
blips, so no streak/episode bookkeeping is needed in the main process.
The boot hook dismisses the boot-progress overlay on the post-rebuild
'open' so an in-place rebuild can't leave it stuck at ~94%.
Reimplements #40135 by @AlchemistChaos on a smaller, more interpretable
path (63 added lines vs 555): no extracted helper module, no
failure-streak / episode-window state, the renderer's backoff loop is
the retry mechanism. Original diagnosis and fix by @AlchemistChaos.
Co-authored-by: AlchemistChaos <alchemistchaos@protonmail.com>
Clear NeMo Relay plugin-config observability only after the last active Hermes session finalizes.
Use the plugin's async-safe awaitable helper for both initialize and clear so session rotation remains safe under active event loops.
Disable the direct ATIF fallback when plugins.toml already owns the ATIF exporter lifecycle to avoid duplicate trajectory export on finalization.
Follow-up to #34306. The provider fix made SearXNG *usable* with a
config-only SEARXNG_URL, but tools/web_tools._has_env still read raw
os.getenv, so the backend auto-detect cascade and check_web_api_key
remained blind to it — SearXNG worked when explicitly selected but was
never auto-selected. Route _has_env (and the SearXNG diagnostic print)
through a config-aware _env_value helper mirroring the provider's
_searxng_url(). Fixing the shared helper covers every provider key in
one place. Adds regression tests for config-only auto-detect and
check_web_api_key. See #34290.
In classic CLI mode the dangerous-command approval prompt (and the clarify,
sudo, and secret-capture prompts) could fail to render: the user saw
'⏱ Timeout — denying command' after 60s without ever seeing the panel,
making approvals.mode: manual unusable.
Root cause. These prompts run their wait loop on the agent/background thread:
they set modal state that a ConditionalContainer's filter reads, then call
self._invalidate() to repaint so the panel appears. _invalidate() is a
THROTTLED wrapper built for high-frequency background repaints (spinner frames,
streaming) — it (a) returns early while a SIGWINCH resize-recovery is pending,
and (b) otherwise only repaints if 250ms elapsed since the last paint. Under
either condition the modal's entry paint is silently dropped, the
ConditionalContainer never re-evaluates, and the prompt times out unseen.
The throttle never belonged on these paths. Originally the callbacks painted
with a direct self._app.invalidate() and worked; a throttle PR blanket-replaced
every invalidate (including these rare, one-shot, user-blocking modal paints)
with the throttled _invalidate(); a later commit removed an idle 1Hz repaint
that had been masking dropped modal paints, surfacing the bug. Notably the
modal KEY-BINDING handlers (↑/↓/Enter) already paint with a direct
event.app.invalidate(), never the throttle — the background-thread callbacks
were the inconsistent ones.
Fix. Add a small _paint_now() helper that paints directly (guarded for a
missing _app, exception-safe) and route the four modal paths' entry, response,
countdown, and teardown paints through it — matching the key-handler idiom.
This covers approval, clarify, sudo, and the secret-capture teardown
(_submit_secret_response, which previously used the throttled _invalidate() so
its panel could linger after submit). _invalidate() is left untouched and its
docstring now states it is for high-frequency background repaints only;
modal/interactive paints must use _paint_now()/_app.invalidate() directly. This
also fixes the resize-recovery edge case for free (a direct paint never
consults the resize guard) without a throttle-bypass flag that could be
cargo-culted onto hot paths. Countdown refresh cadence tightened 5s->1s so the
timer stays visible while waiting, and a copy-pasted duplicate countdown block
in _clarify_callback is removed.
Tests: TestModalPaintNow drives all three wait-loop callbacks on a background
thread with BOTH gates active (_resize_recovery_pending=True + a recent
_last_invalidate in the throttle window) and asserts the panel paints on entry
AND repaints on teardown; plus a secret-teardown test, a direct
_paint_now-vs-_invalidate gate test, and a no-_app safety test. Each modal test
fails if its paint is reverted to _invalidate(). 17 in-file tests pass; full
tests/cli suite green (900).
Diagnosis credit: the throttle-drop root cause was identified by @sanidhyasin
in #41116; @islam666 independently reached the same direct-invalidate approach
in #41166; original report #41098 by @jodonnel.
When apps/desktop's `npm ci`/`npm install` fails, install_desktop printed a
single "Desktop workspace npm install failed" line and aborted, leaving the
user with a wall of raw npm output. A common trigger is a root-owned ~/.npm
cache left by an earlier `sudo npm`/`sudo npx`: the non-root install then
cannot write the shared cache, and npm reports it as EEXIST / "File exists"
while the real errno is EACCES (-13) -- so it reads like an installer bug.
Add a targeted remediation hint on that failure path pointing at:
sudo chown -R "$(id -un)" ~/.npm && npm cache verify
followed by the manual rebuild command. The stage stays a hard failure by
design (a silent skip yields a "complete" install with no app); only the
failure output changes.
hermes skills browse capped the hermes-index source at 5000, so it
surfaced ~5.4k of the ~90.7k skills the index actually carries. Raise
the per-source ceiling above catalog size; browse already paginates
client-side and the index is disk-cached, so no extra fetch cost.
Desktop connected to a remote gateway can now attach images and PDFs and
display agent-written images. Previously the desktop passed a LOCAL file path
to image.attach; on a remote gateway that path doesn't exist, so the image was
silently dropped ("skipped unreadable path") and the vision model never saw it.
The reverse direction was also broken — images the agent wrote on the gateway
rendered as dead links in the remote client.
Gateway (tui_gateway/server.py):
- image.attach_bytes: base64 byte upload written into the gateway's own images
dir and queued via the existing native-image-attach pipeline. Magic-byte
extension sniffing, data-URL prefix + whitespace tolerance, 25 MB cap,
structured error codes. Accepts content_base64/filename (canonical) and
data/ext (older-desktop aliases).
- pdf.attach: renders each page to PNG via pdftoppm (poppler-utils) at 150 DPI
and queues the pages as images; 50 MB / 25-page caps. Accepts host path or
base64 upload.
- Shared helpers (_decode_attach_base64, _sniff_image_ext, _queue_attached_image)
so the two methods and the existing image.attach don't duplicate logic.
Gateway (hermes_cli/web_server.py):
- GET /api/media: returns a gateway-local image as a base64 data URL so remote
clients can display it. Auth-gated like every /api route, extension
allowlist + size cap, AND confined to the gateway's own media roots
(images/screenshots/cache, resolved symlink-safe) so an authed caller can't
read image-extension files anywhere on disk.
Desktop (apps/desktop):
- syncImageAttachmentsForSubmit uploads bytes via image.attach_bytes when the
connection mode is 'remote'; the local fast path is unchanged.
- media.ts gains isRemoteGateway() + gatewayMediaDataUrl(); directive-text and
markdown-text fetch images over /api/media in remote mode.
Consolidates the competing remote-media PRs (#38876, #40317, #21908, #39437)
into one coherent implementation, taking the strongest parts of each and adding
shared-helper cleanup plus the /api/media root-confinement hardening on top.
The per-profile gateway switching from #38876 is intentionally left out as a
separable feature. TUI file uploads (#40492) remain a separate surface.
Tested: 11 new tui_gateway tests + 5 /api/media endpoint tests + desktop
media.remote unit tests; full tui_gateway + web_server suites green (472
passed); tsc -b clean; E2E verified the full attach→disk→queue and
gateway-path→data-URL display round-trip plus the out-of-root security block.
Co-authored-by: Max Mitcham <maxmitcham@mac.home>
Co-authored-by: Justlrnal4 <Justlrnal4@users.noreply.github.com>
Co-authored-by: Chris Cook <ccook@nvms.com>
Co-authored-by: Thomas Paquette <thomas.paquette@gmail.com>
* feat(desktop): surface TTS/STT/terminal backends as Settings dropdowns
Every native tool backend that the agent supports now shows up as a
clickable picker in the desktop Settings UI instead of a free-text box.
Desktop Settings renders a config field as a <Select> only if its dotpath
is a key in ENUM_OPTIONS (helpers.ts::enumOptionsFor returns undefined ->
free-text <Input> otherwise). Three backend-selector fields were surfaced
in their sections but missing from the map, so users had to hand-type the
provider name and could reasonably assume it was unsupported:
- tts.provider — now lists all built-in TTS backends incl. xai (Grok)
- stt.provider — local/groq/openai/mistral/elevenlabs
- terminal.backend — local/docker/singularity/modal/daytona/ssh
Each list is kept in sync with its backend source of truth (TTS:
agent/tts_registry.py::_BUILTIN_NAMES + tools/tts_tool.py; STT + terminal:
hermes_cli/config.py / tools/terminal_tool.py). The existing
enumOptionsFor current-value-append keeps any hand-typed/legacy value
selected, and command-type TTS providers still work.
Reported for Grok/xAI TTS, which was already a fully-wired built-in
provider (tts.provider: xai + XAI_API_KEY) with no picker entry.
* feat(desktop): expose per-backend TTS/STT/terminal config fields in Settings
Completes the backend-coverage pass: not just the provider PICKER but every
backend's own config fields are now tunable from desktop Settings, so a user
who picks (e.g.) Grok TTS can also set its voice/language without hand-editing
config.yaml.
Also fixes the STT provider dropdown: added 'xai' (Grok STT), which the
transcription dispatcher (tools/transcription_tools.py) handles but the
config.py comment had omitted — the dispatch ladder is the source of truth.
New Settings fields (Voice section):
- TTS xai (voice_id, language), minimax (model, voice_id), mistral
(model, voice_id), gemini (model, voice), neutts (model, device),
kittentts (model, voice), piper (voice)
- STT openai (model), groq (model), mistral (model)
New Settings fields (Advanced section):
- terminal docker_image / singularity_image / modal_image / daytona_image
New ENUM_OPTIONS dropdowns: stt.provider (+xai), stt.openai.model,
stt.mistral.model, tts.openai.model, tts.elevenlabs.model_id,
tts.neutts.device. Each list mirrors the backend generator's accepted values
(tools/tts_tool.py, tools/transcription_tools.py, hermes_cli/config.py).
i18n: FIELD_LABELS/FIELD_DESCRIPTIONS cover all locales via the English
fallback in config-settings.tsx; added native translations to ja/zh/zh-hant.
Secrets (provider API keys, modal/daytona tokens, ssh host/key) intentionally
stay in Settings -> Keys as env vars, not duplicated as config fields.
The agent-facing cronjob tool scans the user prompt with _scan_cron_prompt()
before creating/updating a job (tools/cronjob_tools.py); the REST cron
endpoints (POST /api/jobs, PATCH /api/jobs/{id}) validated length but not
content. This adds the same scan to both handlers so an exfiltration/injection
prompt is rejected the same way regardless of which surface created the job.
NOT a security boundary, defense-in-depth / parity only: the REST cron
endpoints are authenticated (every handler runs _check_auth, and connect()
refuses to start without API_SERVER_KEY), and _scan_cron_prompt is a documented
in-process heuristic, not a containment boundary (SECURITY.md 3.2).
Raised externally via GHSA-fr3q-rjg3-x6mf (DNS-rebinding pre-auth RCE). The
report's load-bearing 'no auth by default' premise was already closed three
weeks after it was filed by the API_SERVER_KEY-required guard (commit
1a9ef8314); this lands the create/update prompt-validation parity the report
also pointed at. Scanner imported defensively so a missing scanner cannot
disable the cron REST API.
Follow-up on the deferred-cleanup salvage (#33774): _cleanup_workspace
returned early for a non-scratch ('dir'/'worktree') task and never ran the
parent sweep, so a scratch parent waiting on a 'dir' child would leak its
deferred workspace forever. Run the parent sweep before the early return.
Adds regression tests: deferred-while-child-active, swept-after-last-child,
and dir-child-unblocks-scratch-parent.
When a Kanban task with workspace_kind=scratch completes, the
_cleanup_workspace() function immediately deletes the workspace
directory. If the task has children linked via task_links, those
children find the workspace deleted when they start.
This fix adds two checks:
1. Before deleting, check if any children are still active
(todo/ready/running). If so, defer cleanup.
2. After a child completes, check if parent workspace can now
be cleaned up (all children terminal).
FixesNousResearch/hermes-agent#33774
* feat(onboarding): opt-in structured profile-build path on first contact
On a user's very first gateway message, Hermes now optionally offers to
build a short profile of them — then, only with consent, gathers durable
facts and persists them to the user-profile memory store (memory tool,
target="user") so future sessions start already knowing who they are.
Inspired by Poke's zero-input onboarding, but consent-first by design:
- The agent OFFERS, never assumes. Declining stops it immediately.
- Before ANY external lookup it states what it will look up and asks.
- It never reads connected accounts (email/calendar) silently — the
exact privacy concern that made naive implementations feel invasive.
Wiring reuses existing infrastructure end-to-end:
- gateway/run.py first-message hook (was a plain self-intro) now swaps in
the profile-build directive when enabled and not yet offered.
- agent/onboarding.py gains profile_build_mode()/profile_build_directive()
+ PROFILE_BUILD_FLAG, latched once via the existing onboarding.seen
mechanism so the offer fires at most once per install.
- config default onboarding.profile_build: "ask" (set "off" to disable).
Added to an existing section, so no _config_version bump needed.
No new storage layer, no new injection path, no prompt-cache impact.
* fix(dashboard): fold onboarding into agent tab to avoid 1-field category
onboarding.profile_build is the only schema-surfaced onboarding field
(onboarding.seen is an internal latch dict), so the dashboard CONFIG_SCHEMA
single-field-category invariant rejected it. Merge onboarding -> agent like
the other small categories.
Compaction summaries now receive the current date and instruct the
summarizer to rewrite completed actions as absolute, dated, past-tense
facts (e.g. "email John about the proposal" -> "Sent the proposal email
to John on 2026-06-07"). A resumed conversation no longer re-issues work
that already happened or treats a finished action as still pending.
The date is resolved via hermes_time.now() (date-only, user-configured
timezone) inside _generate_summary. The compaction summary is a
mid-conversation message that is never part of the cached prefix, so the
date does not affect prompt-cache stability. Date resolution is
best-effort: a clock failure omits the rule rather than blocking
compaction. The rule rides the shared template, so both first-compaction
and iterative-update prompts carry it.
Inspired by Poke's summarization (temporal anchoring + semantic
preservation).
Three gateway tests broke on main after the component-auth security
hardening (test_discord_component_auth.py) made empty Discord component
allowlists fail-closed: a view built with allowed_user_ids=set() now
rejects every click instead of allowing anyone.
The clarify and model-picker BEHAVIOR tests still constructed their views
with an empty allowlist and expected the click to succeed — a stale
assumption from before the hardening. Fixed by giving each view an
allowlist containing the clicking user (the interaction's own id), which
is the realistic shape and what the security model requires.
Production code unchanged — this only updates the test fixtures to match
the intended (and separately pinned) fail-closed contract. The security
regression suite and these behavior suites now both pass.
Fixes:
- test_discord_clarify_buttons.py: test_choice_falls_back_to_label_text_when_entry_missing, test_other_flips_entry_to_awaiting_text
- test_discord_model_picker.py: test_model_picker_clears_controls_before_running_switch_callback
Salvage of the Discord half of PR #30964 by @LaPhilosophie. Discord
component button callbacks (ExecApprovalView, SlashConfirmView,
UpdatePromptView, ModelPickerView) bypass the normal message dispatch
authorization path. _component_check_auth previously returned True when
both the user and role allowlists were empty, so any guild member who
could see an approval prompt could click Approve on a dangerous command.
Fail closed instead: require DISCORD_ALLOWED_USERS / DISCORD_ALLOWED_ROLES
/ GATEWAY_ALLOWED_USERS membership, or an explicit DISCORD_ALLOW_ALL_USERS
/ GATEWAY_ALLOW_ALL_USERS opt-in for deliberately-open deployments.
Mirrors the Telegram (#24457) and Matrix fail-closed precedent.
The Slack half of #30964 is superseded by PR #33844's helper.
Reported via GHSA-mc26-p6fw-7pp6 (@whyiug).
Co-authored-by: LaPhilosophie <804436395@qq.com>
A non-numeric value in env vars like HERMES_STREAM_RETRIES,
HERMES_KANBAN_SPECIFY_MAX_TOKENS, GOOGLE_CHAT_MAX_BYTES, IRC_PORT, etc.
raised ValueError at import/init and crashed startup. Parse them safely,
falling back to the default.
Unified onto the existing utils.env_int(key, default) helper for core/
hermes_cli/tools modules instead of the original PR's three duplicate
local helpers; plugins keep minimal inline guards (no core-utils import).
All existing max()/min()/`or extra.get()` wrappers preserved.
Co-authored-by: annguyenNous <annguyenNous@users.noreply.github.com>
hermes doctor and hermes honcho status warned 'Honcho config not found'
whenever ~/.honcho/config.json was absent, even though HONCHO_API_KEY in
.env resolves a working config via HonchoClientConfig.from_global_config()
-> from_env(). Both now check hcfg.api_key/base_url before warning.
Co-authored-by: oxngon <98992931+oxngon@users.noreply.github.com>
## What does this PR do?
The trajectory compressor could corrupt training trajectories by cutting a
conversation in the middle of a tool-call/tool-response pair. In the from/value
trajectory format a `tool` turn (carrying `<tool_response>` markers) is always
emitted immediately after the `gpt` turn whose `<tool_call>` it answers, so the
two turns must stay together. The compressible region's end boundary, however,
was chosen purely by token accumulation: the loop stopped at the first turn where
the accumulated tokens met the savings target, with no regard for turn roles. For
any over-budget trajectory whose savings boundary happened to land between a `gpt`
turn and its `tool` turn, the `gpt` (with its `<tool_call>`) was summarised away
into the replacement `human` message while the now-orphaned `tool` turn (with its
`<tool_response>`) was kept verbatim in the tail — producing an unmatched marker
and silently corrupting the training signal. The head boundary had the mirror
problem when the first tool turn was not protected.
This change snaps both compression boundaries to a clean turn boundary before the
region is extracted and replaced, so the summary always covers whole gpt+tool
blocks and a `tool` turn is never separated from the `gpt` turn that precedes it.
The boundary is moved forward when possible (folding an orphaned tool turn into
the region that already holds its gpt) and falls back to moving backward when no
clean boundary exists ahead, such as when the protected tail itself begins on a
tool turn.
## Related Issue
N/A
## Type of Change
- [x] 🐛 Bug fix (non-breaking change that fixes an issue)
## Changes Made
- `trajectory_compressor.py`: added `_is_boundary_clean()` and `_snap_boundary()`
helpers on `TrajectoryCompressor`, and applied them to both the head and tail
compression boundaries in `compress_trajectory()` and
`compress_trajectory_async()`. When snapping collapses the region to nothing
safe to compress, the trajectory is returned unchanged and flagged as still
over the limit rather than being corrupted.
- `tests/test_trajectory_compressor.py`: added `TestCompressionToolPairIntegrity`
covering the sync and async paths plus direct unit tests for the boundary
snapping (forward skip and backward fallback).
## How to Test
1. Run the focused tests: `pytest tests/test_trajectory_compressor.py -q`.
2. The new sync/async cases build a trajectory of gpt/tool pairs with an oversized
middle gpt turn and choose a token target that forces the accumulation
boundary to stop between a `<tool_call>` and its `<tool_response>`. They assert
that `<tool_call>` and `<tool_response>` markers stay balanced after
compression and that every kept `tool` turn is immediately preceded by a `gpt`
turn (never the inserted summary or another tool turn).
## Checklist
### Code
- [x] I've read the [Contributing Guide](https://github.com/NousResearch/hermes-agent/blob/main/CONTRIBUTING.md)
- [x] My commit messages follow [Conventional Commits](https://www.conventionalcommits.org/) (`fix(scope):`, `feat(scope):`, etc.)
- [x] I searched for [existing PRs](https://github.com/NousResearch/hermes-agent/pulls) to make sure this isn't a duplicate
- [x] My PR contains **only** changes related to this fix/feature (no unrelated commits)
- [x] I've run `pytest tests/ -q` and all tests pass
- [x] I've added tests for my changes (required for bug fixes, strongly encouraged for features)
- [x] I've tested on my platform: macOS 15 (Darwin 25.5)
### Documentation & Housekeeping
- [x] I've updated relevant documentation (README, `docs/`, docstrings) — or N/A
- [x] I've updated `cli-config.yaml.example` if I added/changed config keys — or N/A
- [x] I've updated `CONTRIBUTING.md` or `AGENTS.md` if I changed architecture or workflows — or N/A
- [x] I've considered cross-platform impact (Windows, macOS) per the [compatibility guide](https://github.com/NousResearch/hermes-agent/blob/main/CONTRIBUTING.md#cross-platform-compatibility) — or N/A
- [x] I've updated tool descriptions/schemas if I changed tool behavior — or N/A
SIMPLEX_ALLOWED_USERS silently denied every contact when operators
listed display names instead of numeric contactIds. The SimpleX UI
never surfaces the numeric id, so display names are what operators
naturally put in the env var. _is_user_authorized only compared
source.user_id (the contactId), so the allowlist never matched.
Expand check_ids to include source.user_name for the simplex platform,
mirroring the existing WhatsApp phone-LID aliasing pattern. Adds doc +
setup-prompt clarification and three regression tests.
Salvaged from PR #40393. Adds manishbyatroy to release.py AUTHOR_MAP.
The desktop statusbar turn timer read a single process-global $turnStartedAt,
set/cleared only for the active session. With multiple same-profile sessions
running at once, switching to session B reset the one shared clock, so
session A's still-running turn "restarted from zero" the moment you left it —
exactly the behaviour @Da7_Tech reported after the profile-scoped session work.
Move turnStartedAt onto ClientSessionState so each session owns its own turn
clock. The global atom now just mirrors whichever session is focused, written
on view-sync (the flush that already stages the active session's state). A
backgrounded turn keeps counting in its own cache entry, and focusing it
restores its real elapsed time instead of zeroing it.
Set/clear sites: message.start (seed), message.complete + error + interrupted
bail (clear), and the session.info running-state path (seed if missing / clear
on stop) so a turn that goes busy via session.info — e.g. resuming a session
that's already running — also gets a clock.
Note: the agent loop itself never froze — every same-profile session runs in
its own backend thread and background deltas are buffered per-session. This
fixes the timer-reset symptom; the "no live progress until you return" is
inherent to a single-view transcript and is out of scope here.
DANGEROUS_PATTERNS and HARDLINE_PATTERNS are matched on the raw command string,
so backslash-escape (r\m) and empty-quote split (r''m) bypass both lists.
_normalize_command_for_detection now strips these before pattern matching.
tui_gateway shell.exec had a bare 'except ImportError: pass' that silently
disabled the entire safety gate if tools.approval wasn't importable. Changed
to fail-closed (return 5001 error). Added detect_hardline_command check.
Fixes#36846, #36847.
Two isolated reliability fixes:
- chat_completion_helpers: raise on a zero-chunk stream (no finish_reason,
no content/reasoning/tool_calls) so retry handles it instead of
fabricating a successful empty turn.
- model_metadata: parse the OpenRouter/Nous output-cap error phrasing
("maximum context length is N ... (A of text input, B of tool input,
C in the output)") so parse_available_output_tokens_from_error returns
a real cap and the caller stops looping on it.
Salvaged from #40405 (@ashishpatel26) — took the two stream/error-parsing
fixes. The PR also bundled compression-state changes (on_session_start
clearing _previous_summary; cron session-id prefix preservation, #38788);
those touch the compression hot path and are split out for separate review.
Co-authored-by: ashishpatel26 <ashishpatel26@users.noreply.github.com>
Packaged Desktop first-launch bootstrap no longer dies with a fatal HTTP
404 when install-stamp.json pins a commit that isn't fetchable from GitHub.
This only happens for locally-built desktop apps: write-build-stamp.cjs's
fromLocalGit() pins `git rev-parse HEAD`, which can be an unpushed commit
or dirty tree. CI builds stamp $GITHUB_SHA and are unaffected. The fix
unblocks the dev / self-builder workflow.
resolveInstallScript() now wraps the GitHub download in try/catch; on
failure it resolves ~/.hermes/hermes-agent/scripts/install.sh (the
already-installed agent checkout), copies it into bootstrap-cache, and
returns it as source 'installed-agent'. If the cache copy fails (read-only
FS), it uses the source path directly. With no installed checkout to fall
back to, the original error rethrows unchanged.
Download is now injectable via an optional _download param so the fallback
path is tested hermetically (no network).
Reported with a precise repro and suggested fix by @Tamaz-sujashvili (#40815).
Co-authored-by: Tamaz-sujashvili <56168197+Tamaz-sujashvili@users.noreply.github.com>
The dashboard font is now selectable from the UI, not just YAML. A new Font
section in the header theme picker overrides the UI font of whatever theme is
active; the choice is orthogonal to the theme and survives theme switches.
Each theme keeps its own font as the default — picking "Theme default" clears
the override.
- web/src/themes/fonts.ts: curated font catalog (system + Google Fonts across
sans/serif/mono), each with a family stack and optional webfont URL. The
catalog is the only injected-font surface — no free-text URL box, so the
injected <link> origins stay fixed.
- web/src/themes/context.tsx: font-override state (localStorage + server),
applied after theme typography so it wins; theme apply re-asserts it, and
clearing re-runs theme apply to restore the theme's own font. Mono is left
to the theme so code/terminal are untouched.
- web/src/components/ThemeSwitcher.tsx: Font section with grouped, self-
previewing font rows and a "Theme default" clear option.
- hermes_cli/web_server.py: GET/PUT /api/dashboard/font persisting to
config.yaml dashboard.font, with a server-side id allow-list (unknown ids
coerce to the theme sentinel).
- i18n + types, api client methods, tests, and docs.
Validation: 6 new backend endpoint tests pass; tsc + vite build clean; live
browser test confirmed pick/persist/survive-theme-switch/clear all work.
process_command() is typed -> bool, but the /clear, /new, and /undo
cancel paths did a bare `return` (None) when _confirm_destructive_slash
was declined, leaking None through the bool contract. Return True
(command handled, keep the REPL alive) on cancel.
Co-authored-by: yubingz <yubingz@users.noreply.github.com>
The desktop model picker calls POST /api/model/set with provider+model only
(no base_url). _apply_main_model_assignment cleared model.base_url for every
non-custom provider, so re-picking a Xiaomi MiMo model wiped a Token Plan
endpoint (https://token-plan-*.xiaomimimo.com/v1) back to the registry default
api.xiaomimimo.com — breaking valid tp- keys with 401s.
Now base_url is cleared only when switching to a different provider (the stale
URL belonged to the old one); same-provider re-assignment preserves it, and an
explicitly supplied base_url is honored for any provider.
The desktop markdown preprocessor autolinks bare URLs by wrapping them in
<...>. RAW_URL_RE allowed '*' in its character classes, so a bold line with
a URL and no separating space — e.g. '**PR opened: https://.../pull/123**' —
greedily pulled the closing '**' into the href, producing a broken link and
an unterminated bold run. Exclude '*' from both URL character classes; '_'
and '~' (which can appear in real paths) are preserved.
The cron run-history endpoint (GET /api/cron/jobs/{id}/runs, added in
#40684) reused list_sessions_rich's order_by_last_active path with a
leading-wildcard id_query. That routes through the recursive
compression-chain CTE, which seeds from EVERY source='cron' row in the DB
and runs per-row preview/last_active subqueries before filtering to one
job and applying LIMIT. Work scaled with the total cron history, so a
large pile made the run-history load time out before eventually
populating.
Cron runs are flat, never-compressed sessions with ids of the form
cron_{job_id}_{ts}, so the chain machinery is pure overhead and the
job binding is a true prefix, not a substring.
- New SessionDB.list_cron_job_runs(): bounded [prefix, hi) id-range scan
on source='cron', ordered by started_at DESC, with the same
preview/last_active enrichment. No CTE, no leading-wildcard LIKE.
- Add idx_sessions_source(source, id) so the range is an index scan;
bump SCHEMA_VERSION 14 -> 15 (index reconciles onto existing DBs via
CREATE INDEX IF NOT EXISTS on startup).
- Point the endpoint at the new method.
Measured on a real SessionDB with 30k cron rows: 5ms vs 85ms for the old
path (16x), and the new path stays flat as the pile grows while the old
one scaled with it. Verified the query plan uses idx_sessions_source_id
(range scan, no full table scan), runs are correctly scoped (substring
collisions like cron_xalpha_ excluded), newest-first, and paged.
* fix(desktop): scope in-session /model switch per-session, stop process-env leak
The desktop/dashboard tui_gateway backend hosts every same-profile session
in ONE process. An in-session /model switch wrote process-global env vars
(HERMES_MODEL / HERMES_INFERENCE_MODEL / HERMES_TUI_PROVIDER /
HERMES_INFERENCE_PROVIDER), which _resolve_startup_runtime() reads when
building a fresh agent. So switching the model in one session leaked into
every other live session's next agent rebuild (/new, resume) — changing the
model in session B silently changed it in session A.
Fix: record the switch as a per-session model_override on the session dict
instead of mutating os.environ. _make_agent honors that override on rebuild
(carrying the concrete base_url/api_key/api_mode the switch resolved), and
falls back to global config when absent. Global persistence on the --global
flag is unchanged.
Also a cleaner fix for #16857 (/new after switching to a custom-provider
model): the override carries the resolved credentials, so the rebuild keeps
the right endpoint without relying on the leaky env vars.
Reported via Twitter (@Da7_Tech): MiniMax M3 in one session + GLM 5.1 in
another interfere when switching between them.
* test(tui_gateway): align /model switch tests with per-session override contract
The three test_config_set_model_syncs_* tests asserted the old leaky contract
(switch writes HERMES_MODEL / HERMES_TUI_PROVIDER / HERMES_INFERENCE_PROVIDER to
process env). That env-sync IS the cross-session contamination bug this PR
removes. Updated to assert the new contract: shared process env untouched, the
switch recorded as a per-session model_override carrying provider/model/base_url/
api_key/api_mode. #16857's intent (a custom-provider switch survives /new) is
still covered — now via the override _make_agent honors on rebuild.
The desktop sidebar fetched the unified cross-profile session list as
profile='all' and filtered it client-side by the active profile. On a
large multi-profile install the active profile's rows could be windowed
out of the cross-profile recency page entirely, so switching to a profile
agent showed an empty history panel (and the 'all' fetch could exceed the
15s IPC timeout on startup). Scope the fetch to the active profile so its
own page comes back on its merits, and bump the session-list IPC timeout
to 60s. profileScope is now a refreshSessions dep, so the existing
gateway-open effect re-pulls on profile switch.
Persist the inbound user turn before provider/tool execution so a crash
before run_conversation() (e.g. provider/httpx client init failure) keeps
the inbound message in the transcript. Repair stale/missing SSL_CERT_FILE
state on gateway startup, and avoid duplicate gateway fallback writes.
The salvaged main-agent fix (sanidhyasin) applies model.default_headers
to the primary OpenAI client, but the auxiliary client (title generation,
context compression, vision routing) builds its own clients and did not
read the override. For a `provider: custom` endpoint behind a gateway/WAF
that rejects the OpenAI SDK's identifying headers, the main turn would
succeed while auxiliary calls to the same endpoint still failed with the
opaque 502/4xx from #40033.
Add agent.auxiliary_client._apply_user_default_headers() (user values win
over provider/SDK defaults; no-op when unconfigured) and apply it at every
OpenAI-wire client construction site:
- _try_custom_endpoint() — config-level `model.provider: custom`
- the named custom-provider branch (custom_providers/providers entries),
including the anthropic-SDK-missing OpenAI-wire fallback
- the api-key-provider, async-conversion, and main resolve_provider_client
fallback branches
To prevent the two clients ever drifting on precedence/value handling,
AIAgent._apply_user_default_headers (run_agent.py) now delegates the config
read + merge to this shared helper (run_agent already imports from
auxiliary_client). Native Anthropic/Bedrock branches are untouched (they
don't use the OpenAI wire).
8 new tests (helper semantics + config-level custom + named custom);
full aux + attribution header suites green (295).
Custom OpenAI-compatible endpoints sitting behind a gateway/WAF can reject
the OpenAI Python SDK's default identifying headers (User-Agent: OpenAI/Python,
X-Stainless-*) and return an opaque 502/4xx even though the same request body
succeeds under curl. There was no supported way to override those headers.
Add a model.default_headers config key whose values are merged onto the
OpenAI client's default_headers, taking precedence over provider- and
SDK-supplied defaults. Applied at client construction and on every credential
swap / client rebuild so the override survives reconnects. No-op for native
Anthropic / Bedrock modes and when unconfigured.
Mirrors the EN deep-audit fixes (PR #40952) into the zh-Hans translation so the
two locales agree. zh-Hans is the only non-English locale; 26 translated pages
carried the same stale claims.
Corrections ported (code tokens identical across locales; prose re-translated
where the surrounding text was already Chinese):
- reference: /version slash command + dual-surface list; cli --provider adds
openai-api + novita aliases; tool count 70->71 (+ removed phantom "10 RL tools"
and fixed kanban 7->9); model_catalog ttl 24->1.
- user-guide: hermes -w -q -> -w -z; language list 8->16; aux slots 8->11;
docker separate-dashboard claim; gateway-streaming per-platform note;
computer-use frontmatter.
- features: curator prune_builtins truth; codex-runtime aux keys
(context_compression->compression, vision_detect->vision); voice-mode STT/TTS
enums; removed phantom rl toolset.
- integrations: StepFun step-3-mini->step-3.5-flash; web-search backends 4->8;
nous-portal status subcommand.
- messaging: WeCom typing/streaming columns; telegram transport default
edit->auto; sms host 0.0.0.0->127.0.0.1; simplex/ntfy gateway-setup + pairing
approve; line smart-chunking; matrix MATRIX_DM_AUTO_THREAD; msgraph host note.
- developer-guide: entry-point group hermes.plugins->hermes_agent.plugins;
PLUGIN.yaml->plugin.yaml.
Net-new EN sections (mcp mTLS, api-server run-approval, kanban CLI verbs) are
untranslated in zh-Hans and fall back to English source, consistent with the
mirror's existing partial-coverage state. Verified: docusaurus build --locale
zh-Hans succeeds; no new broken anchors from these edits.
compress_context() sets last_prompt_tokens=-1 right after compression to
mark "no real API usage yet". The preflight display-seed used
`_preflight_tokens > (last_prompt_tokens or 0)`, and `(-1 or 0)` is -1
(truthy), so any positive rough estimate clobbered the sentinel with a
schema-inflated count — re-triggering compression on the next turn.
Treat any negative value as "no real data yet" and skip the seed.
Salvaged from #40246 as the minimal root-cause fix. The original also
added an `_awaiting_suppression_count` bounded-window state machine to
should_compress() across 3 files; left out here to keep blast radius
small — the sentinel guard alone fixes the re-fire. The suppression
window can be added separately if the usage=None-stub edge case warrants it.
Co-authored-by: davidgut1982 <davidgut1982@users.noreply.github.com>
Adds the AUTHOR_MAP entry for the #40403 salvage (model.default_headers
for custom OpenAI-compatible providers, fixes#40033) so contributor_audit
passes when the salvage PR lands.
* test(kimi): align stale parity/profile tests with thinking-xor-effort contract
ce4e74b3 (fix(kimi): send thinking xor reasoning_effort, never both)
changed the Kimi profile to emit at most one of extra_body.thinking or a
top-level reasoning_effort, and added tests/plugins/model_providers/test_kimi_profile.py
to pin it — but left two older test files still asserting the removed
'send both' behavior, turning main red for every PR branched after it.
Update the stale assertions to the xor contract:
- explicit recognized effort (low|medium|high) -> reasoning_effort only,
no thinking
- enabled w/o effort, or no reasoning_config -> thinking:enabled only,
no reasoning_effort
- disabled -> thinking:disabled only
No production change.
* test(kimi): cover remaining xor stale assertions (profile_wiring, run_agent)
Two more test files asserted the pre-ce4e74b3 'thinking + reasoning_effort
together' behavior — landed in a different CI shard so they surfaced only
after the first batch went green:
- tests/providers/test_profile_wiring.py::TestKimiProfileParity (2)
- tests/run_agent/test_run_agent.py::TestBuildApiKwargs (3: kimi-coding,
moonshot, moonshot-cn)
Same realignment to the xor contract: default/enabled-without-effort emits
thinking:enabled and no reasoning_effort; explicit effort emits
reasoning_effort only. Verified by running the full provider +
TestBuildApiKwargs Kimi surface (202 passed) plus a codebase-wide grep for
any remaining paired thinking+effort assertion (none).
The ChatGPT Codex OAuth backend hard-caps gpt-5.5 at a 272K context window
(verified live: a ~330K-token request to chatgpt.com/backend-api/codex/responses
is rejected with context_length_exceeded while ~250K succeeds; the same slug
exposes 1.05M on the direct OpenAI API / OpenRouter and 400K on Copilot). At the
default 50% trigger, auto-compaction fires at ~136K — half the usable window.
Raise the trigger to 85% (~231K) on this exact route only, gated by a new
compression.codex_gpt55_autoraise config flag (default true). When it fires,
emit a one-time notice (CLI inline print + gateway status_callback replay) with
the exact opt-back-out command. gpt-5.5 on any other provider keeps the user's
global threshold.
- _is_codex_gpt55() matches the 5.5 family only on provider=openai-codex
- _compression_threshold_for_model() now provider-aware + opt-out param
- config key + _config_version bump (27->28) for backfill
- docs + tests (40 cases in test_arcee_trinity_overrides.py)
Follow-up to the subprocess timeout: _prepare_local_audio only caught
CalledProcessError, so a timeout would raise uncaught. Return a clean
error instead.
subprocess.run() and proc.wait() without timeout can hang indefinitely
if the child process becomes unresponsive. This blocks the calling
thread forever.
Fixed locations:
- tools/transcription_tools.py: ffmpeg conversion (timeout=300) and
user-configured STT commands with shell=True (timeout=300)
- gateway/run.py: helper script proc.wait() (timeout=3600)
Not fixed:
- agent/anthropic_adapter.py: interactive 'claude setup-token' —
user-driven, timeout would be inappropriate
The standalone Kimi/Moonshot profile (api.moonshot.ai/v1) sent both
extra_body.thinking AND a top-level reasoning_effort. With no reasoning
config it even defaulted to thinking:enabled + reasoning_effort:medium,
pairing them on every default call. Moonshot treats these as mutually
exclusive (cannot specify both 'thinking' and 'reasoning_effort').
Align with the kimi-k2 handling already shipped for the opencode-go relay:
send effort when a recognized low|medium|high is requested, otherwise fall
back to the extra_body.thinking toggle. Disabled sends thinking:disabled
only. Never both.
Reported by Cars29 (NOUS Discord). DeepSeek was deliberately left untouched:
its native endpoint accepts both (verified by the live guardrail in
test_deepseek_v4_thinking_live.py), so the report's DeepSeek claim does not
hold there.
Tests: tests/plugins/model_providers/test_kimi_profile.py pins the xor
contract across all config shapes.
#40909 added `CREATE_BREAKAWAY_FROM_JOB` to `windows_detach_flags()`,
which fixed the headline bug (gateway dies after Desktop GUI update
and never comes back). The flag's own docstring acknowledges that
restrictive parent job objects can still refuse breakaway with
`ERROR_ACCESS_DENIED`, surfacing as `OSError` on the `subprocess.Popen`
call:
"Callers in this codebase already wrap detached spawns in
try/except OSError and fall back to a cmd.exe wrapper, so the
breakaway-denied case degrades gracefully rather than crashing."
That's true for `_spawn_detached` in `gateway_windows.py` (the
`hermes gateway start` path), which has both the breakaway bit AND a
retry-without-breakaway fallback. It's NOT true for the post-update
watcher path in `launch_detached_profile_gateway_restart`
(`hermes_cli/gateway.py`), which only has `except OSError: return
False` and gives up entirely. If a user's shell/terminal/container
wraps Hermes in a breakaway-denying job, the gateway-respawn watcher
silently fails to launch instead of trying again without breakaway.
This PR closes that gap and adds the regression tests that were
missing from the original fix.
## Changes
### `hermes_cli/_subprocess_compat.py`
Adds a sibling helper `windows_detach_flags_without_breakaway()` so
callers can express the fallback symbolically (via the helper) rather
than coding the magic `& ~0x01000000` mask at every site. Documented
on `windows_detach_flags` and `windows_detach_flags_without_breakaway`
with the recommended try/except pattern.
### `hermes_cli/gateway.py::launch_detached_profile_gateway_restart`
Two changes, both aligned with the canonical pattern in
`gateway_windows._spawn_detached`:
1. The outer watcher Popen now wraps in `try/except OSError`, and on
failure retries with `windows_detach_flags_without_breakaway()`
(POSIX never reaches this branch — `start_new_session=True` can't
raise OSError).
2. The inlined respawn payload (the `python -c` watcher) also
wraps its CreateProcess in try/except OSError and retries with
`_flags & ~_CREATE_BREAKAWAY_FROM_JOB` on failure. This matters
because the watcher's job-object inheritance is independent of the
outer process's — even if the outer Popen succeeds with breakaway,
the respawned gateway might inherit a job that doesn't.
### Regression tests in `tests/tools/test_windows_native_support.py`
#40909 shipped the fix without any test that the breakaway bit is
present (the existing `test_windows_detach_flags_has_expected_win32_bits`
asserted only the three legacy bits). Four new tests close that:
- `test_windows_detach_flags_includes_breakaway_from_job` — explicit
assertion that the breakaway bit is in the default bundle, with the
rationale spelled out in the docstring so a future maintainer
staring at this test understands why removing it would resurrect
the gateway-dies-after-GUI-update bug.
- `test_windows_detach_flags_without_breakaway_drops_only_that_bit`
— fallback payload keeps the other three detach bits intact.
- `test_launch_detached_profile_gateway_restart_inlined_watcher_uses_breakaway`
— static-text check on the stringified watcher payload. The inlined
Python program isn't reachable via normal import-time inspection
because it lives in a `textwrap.dedent("""...""")` literal that
gets passed to a separate `python -c` interpreter. Asserting that
both `_CREATE_BREAKAWAY_FROM_JOB` (symbolic) and `0x01000000` (hex
literal) appear inside the dedent block is a sufficient regression
guard against accidental refactors.
- `test_launch_detached_profile_gateway_restart_outer_popen_has_access_denied_fallback`
— static check that this PR's fallback retry is wired up
symbolically. Without standing up a real Windows job object that
refuses breakaway, we can't trigger the OSError in a unit test;
the text guard catches the case where a future refactor removes
the helper import or the `& ~_CREATE_BREAKAWAY_FROM_JOB` retry.
Also extends `test_windows_detach_flags_has_expected_win32_bits` to
include the breakaway bit assertion and updates
`test_windows_flags_zero_on_posix` to cover the new helper.
## Tests
Locally on Windows: 8/8 in the `-k "detach or breakaway or
popen_kwargs or launch_detached or gateway_run_update or
hermes_cli_gateway"` slice pass.
Broader `tests/hermes_cli/test_gateway*.py + test_windows_native_support.py`:
172 passed, 10 failed. All 10 failures are pre-existing POSIX-only
tests running on a Windows host (os.geteuid, SIGKILL fallback,
is_linux fixture mismatches). Stashing this PR and re-running on bare
post-#40909 main reproduces all 10 identically — none are regressions.
POSIX paths unchanged: `windows_detach_flags()` and
`windows_detach_flags_without_breakaway()` both return 0 off Windows,
`windows_detach_popen_kwargs()` still yields `{"start_new_session": True}`.
## Out of scope
- The other detached-spawn site in `hermes_cli/gateway.py` (around
line 3068) also uses `windows_detach_popen_kwargs()` + `except
OSError`. It deserves the same fallback treatment but the codepath
is different enough (not the update-flow watcher) that it warrants
a separate PR with its own scrutiny.
- `gateway/run.py` has Windows branches with `windows_detach_popen_kwargs`
too — same reasoning.
## Context
Follow-up to #40909 (merged). I had a parallel PR (#40934, closed)
that duplicated the core breakaway fix; the bits unique to that PR
that #40909 didn't cover are the contents of this one. Closing #40934
and opening this slimmed-down version as the focused follow-up.
_discover_all_plugins() previously did a flat iterdir() scan, missing
all category-namespaced plugins (web/*, image_gen/*, browser/*, video_gen/*).
Now recurses up to 2 levels deep, matching PluginManager._scan_directory_level().
Also fixes _plugin_status() to check both manifest name AND path-derived
key against enabled/disabled sets, so category plugins like 'web/tavily'
show correct status when enabled via config.
Follow-up to the salvaged perf fix. The new force_fresh_nous_tier param was
inserted into list_authenticated_providers between custom_providers and
max_models. Make it keyword-only (*) so a positional caller passing max_models
as the 5th arg can never silently mis-bind it to the tier-refresh flag, and
add a signature-contract test that fails if the keyword-only separator is
later dropped. All in-repo callers already use keyword args; verified no
caller breaks.
Mirror the bootstrap-installer (Rust) fix in the Electron first-launch
runner. spawnPowerShell launched bare 'powershell.exe', trusting PATH to
contain %SystemRoot%\System32\WindowsPowerShell\v1.0 — the same latent
weakness that stalled the native installer at "0 of 0 steps" when PATH is
trimmed/truncated or stored as a non-expanding REG_SZ. Resolve by absolute
path first (%SystemRoot%/%windir%), then PATH (powershell 5.1 -> pwsh 7),
then bare name as last resort.
Make `powershell_under_root` visible under `cfg(test)` so the
%SystemRoot%\System32\WindowsPowerShell\v1.0\powershell.exe layout is
asserted on any host (the rest of the resolution is gated to Windows).
The native Windows installer spawned PowerShell via the bare program name
`powershell.exe`, which trusts PATH to contain
%SystemRoot%\System32\WindowsPowerShell\v1.0. On machines whose PATH was
trimmed or truncated (Windows silently drops entries once the variable
exceeds its length limit), the lookup fails and the spawn dies with
"program not found" before install.ps1 runs at all — the installer then
stalls at "0 of 0 steps".
Resolve PowerShell by absolute path first (%SystemRoot%/%windir%), then
fall back to PATH (powershell 5.1, then pwsh 7), then a bare name as a
last resort. Also include the resolved interpreter in the spawn-failure
context; the old message printed only the script path, which misleadingly
read as if the .ps1 itself was missing.
Regression guard for the emoji-fallback fix: checks DEFAULT_TYPOGRAPHY and every
defined builtin-theme fontSans/fontMono stack contains a color-emoji font.
None of the UI sans/mono font stacks (themes/presets.ts, styles.css) carry
emoji glyphs, so on platforms whose default text font lacks them (e.g. Linux)
emoji rendered as tofu boxes in the composer and chat.
Append a color-emoji fallback — Apple Color Emoji / Segoe UI Emoji / Segoe UI
Symbol / Noto Color Emoji / the `emoji` generic — to every font stack
(SYSTEM_SANS, SYSTEM_MONO, the Courier theme, and the CSS --dt-font-* defaults).
Text still uses the primary fonts; the browser only falls back for emoji
codepoints. Custom themes build on SYSTEM_* so they inherit it automatically.
On Windows with non-UTF-8 console encodings (e.g. cp949, cp1252),
StreamHandler emits raise UnicodeEncodeError when log messages contain
characters outside the console codepage — such as the em-dash (U+2014)
in the session hygiene message.
This crashed the gateway process silently, leaving no diagnostic output.
Fix: add _safe_stderr() helper that wraps sys.stderr in a TextIOWrapper
with encoding='utf-8' and errors='replace' when the console encoding
is not UTF-8. Applied to both:
- hermes_logging.py setup_verbose_logging() stderr handler
- gateway/run.py optional stderr handler
The wrapper ensures log lines are never lost — un-encodable characters
are replaced with '?' instead of crashing the process.
Fixes#40432
* fix(gateway,windows): reliability — supervisor task, JOB breakaway, status --deep
Three coordinated fixes for the Windows gateway reliability story:
1. CREATE_BREAKAWAY_FROM_JOB on every detached spawn
The 'hermes update' triggered from the Electron Desktop GUI ran inside
Electron's job object. Without breakaway, the post-update gateway
watcher spawned by update — already DETACHED_PROCESS — was still
reaped when Electron's job tore down, so the gateway never came back
after a GUI-initiated update. Adds CREATE_BREAKAWAY_FROM_JOB (0x01000000)
to:
- hermes_cli/_subprocess_compat.py::windows_detach_flags() — used by
every helper that calls windows_detach_popen_kwargs(), including
launch_detached_profile_gateway_restart()
- The watcher subprocess's own respawn snippet in
hermes_cli/gateway.py (inlined flags so the watcher's child
respawn also breaks away)
_spawn_detached() in gateway_windows.py already had the flag; this
change brings the rest of the codebase to parity.
2. Per-minute supervisor Scheduled Task — Windows equivalent of
systemd Restart=always
Introduces hermes_cli/gateway_supervisor.py and registers it as a
second Scheduled Task ('Hermes_Gateway_Supervisor', SC MINUTE /MO 1,
LIMITED rights) alongside the existing ONLOGON task. Every minute,
the supervisor uses the same gateway.status.get_running_pid() probe
as 'hermes gateway status' and, if no gateway is alive, calls
gateway_windows._spawn_detached() (which now includes BREAKAWAY) to
bring one back.
Covers every crash mode, not just 'machine rebooted': taskkill,
OOM, GUI update SIGTERM, parent job teardown. Cheap — one pythonw
startup per minute when down, one PID-existence check per minute
when up.
Wired into both the schtasks-success and Startup-folder-fallback
install paths via _install_supervisor_best_effort(), and removed in
uninstall(). Best-effort: a failing supervisor install logs a
warning but doesn't roll back the primary install.
3. 'hermes gateway status --deep' shows per-probe PASS/FAIL
Replaces the existing terse '--deep' output (which only printed
paths) with an actual diagnostic table:
[1] PID file present
[2] Lock file held by a live process
[3] get_running_pid() result
[4] _pid_exists(pid) — OS-level liveness
[5] gateway_state.json (state + age)
[6] Last lifecycle event from gateway-exit-diag.log
When the high-level summary disagrees with reality, the user can
see exactly which signal is lying.
Test-leak fix
-------------
tests/hermes_cli/test_gateway_wsl.py::TestGatewayCommandWSLMessages
monkey-patched is_linux/is_wsl/supports_systemd_services to simulate
WSL but did NOT stub is_windows(). On a Windows host, the dispatcher
in _gateway_command_inner takes the is_windows() branch BEFORE the
WSL guidance branch, so the test invoked gateway_windows.install()
for real. install() writes to %APPDATA%\...\Startup\Hermes_Gateway.cmd
— the REAL user Startup folder, never sandboxed by tmp_path — pointing
at the test's pytest-of-<user>/pytest-<N>/.../gateway-service/ wrapper.
When pytest tore down the tmp_path, every subsequent Windows login
flashed a cmd.exe window that failed to find the missing target.
Stubs is_windows=False on all four affected tests:
test_install_wsl_no_systemd
test_start_wsl_no_systemd
test_status_wsl_running_manual
test_status_wsl_not_running
Defense-in-depth: _build_startup_launcher() now prefixes the launcher
with 'if not exist <target> exit /b 0', so any future stale Startup
entry silently no-ops instead of flashing a console window.
Status enhancements
-------------------
- status() now reports supervisor task presence alongside the existing
schtasks/Startup info, and nudges the user to reinstall if the
supervisor isn't registered.
- Deep mode dumps both the supervisor task name + script path.
* fix(gateway,windows): drop the per-minute supervisor task — keep breakaway + deep probes
Earlier in this branch we added a per-minute schtasks-based supervisor to
respawn the gateway after crashes / GUI-update SIGTERMs. The implementation
flashed a brief console window on every firing, which stole window focus.
We tried several variants:
- cmd.exe wrapper invoking pythonw -> flashes (cmd.exe is console-subsystem)
- schtasks /TR pointing at pythonw -> flashes (uv venv launcher pythonw is
actually subsystem=Console, not GUI; it respawns the real pythonw)
- schtasks /TR pointing at base uv -> still flashes (Task Scheduler-side
conhost preallocation; documented Windows quirk)
- XML registration with <Hidden>true> -> still flashes (<Hidden> only hides
the task in the Task Scheduler UI, not the spawned window)
Researched what leading projects do:
- Ollama: GUI-subsystem tray exe + Startup-folder shortcut. No supervisor.
- Tailscale: real Windows Service via SCM. Session 0, no console possible.
- Syncthing: --no-console flag inside the binary + Startup folder.
- openclaw: VBS Run(..., 0, False) wrapper. Suppresses the *window* but
Super User Q971162 confirms focus-steal still occurs in some cases.
None of these use a per-minute polling scheduled task. The 'auto-restart on
crash' responsibility belongs INSIDE the daemon (Tailscale's in-process
recovery / Ollama's monitor+worker pair) OR is delegated to the Windows
Service Control Manager — not Task Scheduler.
So this commit drops the supervisor entirely. The CREATE_BREAKAWAY_FROM_JOB
fix in _subprocess_compat.py (from commit c1e5fa433) survives — that is the
*real* fix for problem #2 (GUI-update kills gateway): the post-update
watcher in launch_detached_profile_gateway_restart() now breaks out of
Electron's job object, so the gateway respawn watcher survives the GUI
quit and successfully respawns the gateway.
Surviving from c1e5fa433:
* CREATE_BREAKAWAY_FROM_JOB in hermes_cli/_subprocess_compat.py (fixes#2)
* Inlined breakaway flag in the watcher respawn snippet in gateway.py
* hermes gateway status --deep PASS/FAIL probes (fixes#1 — visibility)
* 'if not exist <target> exit /b 0' guard in _build_startup_launcher
(fixes#3 — silent no-op for stale Startup entries)
* tests/hermes_cli/test_gateway_wsl.py is_windows=False stubs (root cause
of #3 — pytest WSL tests no longer leak Startup entries on Win hosts)
Removed in this commit:
* hermes_cli/gateway_supervisor.py (entire file)
* Supervisor section in hermes_cli/gateway_windows.py (~180 lines):
get_supervisor_task_name, get_supervisor_script_path,
_build_supervisor_cmd_script, _write_supervisor_script,
_install_supervisor_task, is_supervisor_task_registered,
_install_supervisor_best_effort
* _install_supervisor_best_effort() calls in install() (3 spots)
* supervisor cleanup block in uninstall()
* supervisor display lines in status() / status(deep=True)
Future direction (out of scope for this PR): the right place for Windows
'Restart=always' semantics is a real Windows Service installed via
pywin32's win32serviceutil.ServiceFramework — session-0 isolation, SCM
auto-restart, no console window possible. That's a meaningful next-PR
project, not a band-aid.
Tests: 51 pass / 2 pre-existing failures in
tests/hermes_cli/test_gateway_{windows,wsl}.py (the 2 failures are
TestSupportsSystemdServicesWSL cases that fail on origin/main too —
unrelated to this PR).
_hook_jsonable() referenced SimpleNamespace without importing it, so
sanitizing any hook payload that contained one raised
NameError: name 'SimpleNamespace' is not defined.
Bedrock, Codex-responses, and the auxiliary client build their
response / message / tool_call objects as SimpleNamespace and hand the
raw objects to the post_api_request hook. The hook call sites swallow
exceptions (except Exception: pass), so the crash silently dropped the
observability hook for those providers.
Add the missing `from types import SimpleNamespace` and a regression
test covering the SimpleNamespace sanitization path.
Add a help line under the Orchestrator profile selector explaining it
owns the root task after fan-out and does not drive how tasks split;
point at auxiliary.kanban_decomposer for the decomposer model. Also fix
the Profile descriptions hint to credit the decomposer (not the
orchestrator) for routing. This is the dashboard surface that prompted
the original support confusion.
_read_events() returned normally when self._ws was closed-but-non-None
(the while-condition is false on entry). _listen_loop treats a normal
return as a clean read, resets backoff to 0, and immediately retries —
a tight busy-loop pinning CPU. Raising on entry routes it through the
reconnect/backoff path instead.
Co-authored-by: xushibo <xushibo@users.noreply.github.com>
Co-authored-by: cnfi <cnfi@users.noreply.github.com>
Pillow drives the byte/pixel image-shrink path that runs at vision-embed
time. Without it, an oversized image (>5 MB or >8000px) bakes into
immutable history and bricks the session on Anthropic's non-retryable
400. It's a pure-wheel dep with no system-lib requirement for the codecs
we use, so there's no reason to gate it behind an extra + a mid-session
lazy install (the install that deadlocked the CLI under prompt_toolkit,
#40490). Every install — base, [all], packagers — now ships it.
The [vision] extra becomes a no-op back-compat alias so existing
'pip install hermes-agent[vision]' invocations still resolve. The
tool.vision lazy-deps entry is kept as a belt-and-suspenders fallback for
stripped/source-build installs.
The vision (Pillow) and faster-whisper STT tool paths were the only
ensure() call sites that defaulted to prompt=True, so they could fire a
blocking input() confirmation mid-session. Every other call site already
passes prompt=False. Under the interactive CLI prompt_toolkit owns stdin,
so that input() deadlocks the terminal (#40490). The install is already
gated by security.allow_lazy_installs, so the prompt was redundant
consent anyway. This makes the deadlock-capable input() branch
unreachable from any tool-call path.
A memory provider tool whose name collides with a built-in core tool
(e.g. clarify, delegate_task) was skipped from agent.tools at init but
lingered in MemoryManager._tool_to_provider, where the has_tool dispatch
branch could route a call to a tool that was never registered (#40466).
Block the collision at registration instead of patching dispatch:
- MemoryManager.add_provider rejects any tool whose name is in
_HERMES_CORE_TOOLS (warn + skip), so it never enters the routing table.
- get_all_tool_schemas applies the same filter, so the manager never
advertises a schema it would refuse to route.
Built-ins always win, matching the invariant used by the TTS/browser/
search provider registries. Makes the dispatch-hijack structurally
impossible regardless of branch ordering.
Closes#40466.
Adds regression tests for the SSH cwd fix: local backend keeps
host-validated session cwd; non-local backend uses TERMINAL_CWD (or
terminal.cwd config) verbatim without host isdir() validation; sentinel
values fall back to session cwd.
Complete the desktop app's tool-backend configuration so it fully
mirrors `hermes tools`. The toolset config panel already did
enable/disable, provider selection, and API-key save/reveal/clear — the
one remaining gap was post-setup install hooks, which previously just
told the user to run the CLI.
Now a provider that declares a post_setup hook (browser Chromium,
Camofox, cua-driver, KittenTTS/Piper, ddgs, Spotify, Langfuse, xAI)
renders a 'Run setup' button that spawns the install via the
`POST /api/tools/toolsets/{name}/post-setup` endpoint and tails the
log inline, feeding the desktop activity rail — mirroring
command-center's runSystemAction poll loop. On completion the panel
refreshes so a now-installed backend reports itself ready.
- hermes.ts: runToolsetPostSetup(name, key) -> profile-scoped POST.
- toolset-config-panel.tsx: PostSetupRunner sub-component (Run setup
button + inline live log + activity-rail upsert + unmount guard),
replacing the CLI-only placeholder.
- i18n: replace the orphaned `toolsets.postSetup` (CLI redirect) string
with proper post-setup UI keys (hint / run / running / starting /
complete / error / failed) across en, ja, zh, zh-hant + types.
- test: post-setup run+poll+log-tail coverage; mock additions for
runToolsetPostSetup/getActionStatus/activity store.
Works against local AND remote backends: all calls route through the
desktop's single `hermes:api` IPC handler to connection.baseUrl, so a
connected remote configures the remote host's tools (keys -> remote
.env, install runs on the remote). Relies on the post-setup endpoint +
'hermes tools post-setup' CLI shipped in #40418.
Verification: tsc -b clean (all 5 locales), eslint clean (the lone
exhaustive-deps warning is pre-existing on origin/main), vitest 4/5
(new post-setup test passes; the failing 'saves an API key' test fails
identically on origin/main — pre-existing EnvVarActionsMenu drift).
Home Assistant is a bundled plugin now (#40709) and declares
allow_update_command=True on its PlatformEntry. The registry fallback
in _handle_update_command already covers it, so the frozenset entry is
a redundant double-allow — same cleanup #40711 did for Discord and
Mattermost. Adds a registry-fallback test mirroring the existing
discord/mattermost cases.
* feat: uninstall the Chat GUI without removing the agent (CLI + desktop UI)
Adds a GUI-only uninstall path so people can remove the desktop Chat GUI
while keeping the Hermes agent + their config/sessions/.env, and surfaces
the three CLI uninstall modes inside the desktop app's Settings → About.
CLI:
- New hermes_cli/gui_uninstall.py: cross-platform discovery + removal of the
desktop GUI's artifacts (source-built dist/release/node_modules + build
stamp, the packaged app bundle, and the Electron userData dir) on Linux,
macOS, and Windows. Never touches the agent source, venv, or user data.
- `hermes uninstall --gui` removes only the Chat GUI; `--gui-summary` prints a
JSON install snapshot (used by the desktop UI to gate options + detect a
missing agent for a future lite client).
- `hermes uninstall --yes` / `--full --yes` now run non-interactively, sharing
the destructive sequence via a new _perform_uninstall() helper. The keep-data
and full flows also sweep the GUI artifacts.
Desktop:
- electron/desktop-uninstall.cjs: pure helpers mapping each mode (gui/lite/full)
to CLI flags, resolving the running app bundle per OS, and building the
detached cleanup script that waits for the app to exit, runs the Python
uninstall, and removes the bundle.
- IPC hermes:uninstall:summary / :run, preload bridge, and types.
- Settings → About "Danger zone" with the three options; agent-removing
options hide when no local agent is detected.
Tests: tests/hermes_cli/test_gui_uninstall.py (22 pass with the existing
uninstall tests), electron/desktop-uninstall.test.cjs (17 pass, wired into
test:desktop:platforms). Docs: desktop.md "Uninstalling" + cli-commands.md.
* fix(desktop): tear down backend process tree before GUI uninstall (Windows lock safety)
The desktop uninstall cleanup script waited only on the desktop app's own
PID, but a backend grandchild (gateway / pty terminal / hermes REPL) can
outlive it and keep hermes.exe + venv files mandatory-locked on Windows —
making the script's rmdir half-fail and leaving a partial install, the same
failure class as the self-update path's #37532.
- main.cjs: runDesktopUninstall now awaits releaseBackendLock() before
spawning the cleanup script — tree-kills every backend PID the desktop owns
(primary + pool) via taskkill /T /F and polls the venv shim until unlocked.
Extracted the shared core out of releaseBackendLockForUpdate so both the
update hand-off and the uninstaller use the identical, incident-hardened
teardown. No-op on macOS/Linux (no mandatory locks).
- desktop-uninstall.cjs: Windows cleanup script removes the bundle via a
bounded rmdir retry loop (10x, 1s) instead of a single rmdir, since Windows
releases directory handles lazily even after the holding process exits.
- Dropped a fragile tasklist|findstr reap-by-path attempt; the Electron-side
tree-kill-by-PID is the reliable mechanism.
Tests: desktop-uninstall.test.cjs updated for the retry-loop output (17 pass).
* fix(desktop): address review on GUI uninstall (venv self-delete, gates, wait-loop)
Resolves @OutThisLife's review on #40355:
1. full mode now gated on agent presence (needsAgent: true). It removes the
agent + user data, so on a lite client with no local agent it's hidden
like lite — no more offering to remove an agent that isn't there.
2. (Finding 3, the real bug) lite/full no longer rmtree the venv from the
venv's OWN python. On Windows a running python.exe is mandatory-locked, so
that half-fails. New lightweight 'python -m hermes_cli.uninstall --mode X'
entrypoint (stdlib-only imports) lets the desktop run agent-removing modes
under the SYSTEM python (findSystemPython) with PYTHONPATH=<agentRoot>, so
import hermes_cli resolves from source while the venv is torn down. Falls
back to venv python + logs when no system python (gui-only unaffected).
3. Windows wait-loop is now bounded (60 tries, matching POSIX) and matches the
PID as a whole space-delimited token via findstr (no substring 99->990
trap, no redundant bare find). set HERMES_HOME/PID/PYTHONPATH now quoted.
4. Renamed the misleading 'returns null for dev run' test — the dev-run safety
is shouldRemoveAppBundle(isPackaged=false), which the test now asserts.
Docs: note that --gui on a source checkout also sweeps node_modules/build
output. Tests: 18 python + 19 desktop pass.
Conflict resolution prefixes --workspace web before --silent (preserving
the Termux npm_workspace_args path); update test_cmd_update fixture to match.
Add zakame@zakame.net -> zakame mapping so CI author check passes.
- check_dir = npm_dir if audit_extra else npm_dir evaluated identically in
both branches; change to PROJECT_ROOT if audit_extra else npm_dir so
workspace-scoped audits check the workspace root's node_modules
- Add test_npm_install_uses_workspace_web_scope asserting --workspace web is
passed adjacently in the _build_web_ui npm install invocation
- Update the --skip-build pre-build hint in the dashboard startup path
to use `npm install --workspace web && npm run build -w web` so users
don't accidentally trigger a desktop rebuild by following the hint.
- Add test_tui_launch_install_uses_workspace_scope to assert that the
TUI launch npm install carries --workspace ui-tui, covering the call
site added in the prior commit.
- Add --workspace ui-tui to the TUI launch npm install, the one call
site missed by the prior commit. Without scoping it ran from
PROJECT_ROOT and still resolved apps/desktop via the apps/* glob.
- Update the two manual-recovery hints in _build_web_ui (npm install
failure and build failure paths) to use the scoped form
`npm install --workspace web && npm run build -w web` so users
following the hint don't accidentally trigger a desktop rebuild.
- Update the stale test assertion in test_cmd_update.py to expect
--workspace web in the _build_web_ui npm ci call, which was
previously unreachable through the if-guard and left the workspace-
scoping change from the prior commit unverified.
Root package.json uses apps/* workspaces glob which unconditionally
includes apps/desktop (Electron + node-pty@1.1.0, ~200MB, requires
make/g++ to build) in every unscoped npm command run from the repo root.
This commit addresses the core problem by adding explicit workspace
scoping to all internal npm calls:
hermes_cli/main.py (_build_web_ui):
- Add --workspace web to the npm install call so only the web
workspace deps are resolved, never apps/desktop.
hermes_cli/tools_config.py:
- Add --workspaces=false to agent-browser and Camofox root installs
so only root-level deps (agent-browser, @streamdown/math) are
installed, bypassing the workspace graph entirely.
hermes_cli/doctor.py (run_doctor npm audit):
- Replace the single unscoped 'npm audit --json' at PROJECT_ROOT with
three scoped invocations:
* --workspaces=false for root deps (Browser tools)
* --workspace web for the web workspace
* --workspace ui-tui for the TUI workspace
- Update remediation hints to use matching scoped 'npm audit fix'
commands so users don't accidentally trigger a desktop rebuild.
package.json:
- Add convenience scripts for scoped operations:
npm run install:root / install:web / install:tui / install:desktop
npm run audit:root / audit:web / audit:tui
npm run audit:fix:root / audit:fix:web / audit:fix:tui
These give developers and CI a safe, explicit interface for the
most common per-workspace tasks without accidentally pulling desktop.
Fixes#38772
Codify the desktop overlay/design conventions in apps/desktop/DESIGN.md:
surfaces & elevation (shadow-nous + --stroke-nous), stroke/color tokens, the
single Button (variants/sizes, no per-call overrides), shared form controls
(controlVariants / SearchField / SegmentedControl / Switch), flat layout
(PAGE_INSET_X, OverlaySplitLayout, ListRow, no card-in-card), feedback states
(Loader / ErrorState / LogView / EmptyState), BrandMark, motion, i18n, and the
nanostore state model. Ends with a pre-merge checklist.
Two fixes so the doc isn't aspirational:
- brand-mark: rounded-md + overflow-hidden (doc says "softly rounded")
- i18n ja/zh/zh-hant: mirror en's "Begin" + drop trailing period on
connectedProvider (doc says update all locales together)
The manage overlay held its own local jobs list, so deleting/creating a
job there left the sidebar's $cronJobs atom stale until the 30s poll
(delete all → section lingered). Make the overlay read and mutate the
shared atom directly (updateCronJobs), so sidebar + overlay are one
source of truth and changes show immediately.
active/createFirst/refresh/refreshing went unused when the cron overlay
moved to the shared split layout (no count header, no refresh button, no
EmptyState CTA). Remove from types + all four locales.
Cron's manage overlay now uses the shared OverlaySplitLayout (sidebar
list + main detail) instead of a bespoke PageSearchShell + grid, matching
profiles. Extract OverlayNewButton (the "+ New …" sidebar action) so
profiles and cron share one component — its hover underline is scoped to
the label span so it never strokes the leading icon glyph.
* fix(desktop): unify dialog/overlay buttons on shared Button component
Replace raw <button> action/text controls across the modal layer (boot
failure, install, update, onboarding, clarify, model-visibility,
notifications, gateway menu) with the shared Button + its variants
(text / ghost / icon-xs). Drops the bespoke square-cornered styling so
every dialog matches the app's slightly-rounded button system, and
swaps clarify-tool's hardcoded "Skip" for the existing i18n string.
* feat(desktop): add dev-only dialog gallery for auditing overlays
A code-split, DEV-gated harness (toggle ⌘/Ctrl+Alt+Shift+D) that triggers
every dialog/overlay so their buttons can be eyeballed in one place:
store-driven overlays (boot failure, updates, notifications, sudo/secret)
plus in-place dialogs (confirm, profile create/rename, attach-url, model
picker/visibility, clarify, tool approval). Never ships to production.
* fix(desktop): use Ctrl+Shift+D for dialog gallery (mac-friendly)
The Cmd/Ctrl+Alt+Shift+D chord is impractical on macOS (Option mangles
the keypress). Ctrl+Shift+D is the same chord on every platform and uses
neither Cmd nor Option.
* fix(desktop): stop overriding button icon size to size-4
Action buttons hardcoded size-4 icons, overriding the Button component's
built-in size-3.5. That extra 2px is why boot-failure / onboarding / gateway
buttons looked chunkier than the settings "Apply" (size-3.5 spinner) despite
being the same component+size. Drop the overrides so icons inherit 3.5.
* feat(desktop): add BrandMark, use it in the updates overlay hero
New BrandMark renders the white logo.png on a hardcoded brand-blue tile
(#0000F2 light / #222 dark), replacing the generic Sparkles hero glyph in
the "update available" overlay. Trying it here first to iterate on the look.
NOTE: apps/desktop/public/logo.png is currently a 1x1 placeholder — the tile
renders now; the glyph appears once the real white logo art is dropped in.
* feat(desktop): add real logo.png asset, render it white in BrandMark
logo.png is blue line-art on transparent, so force it white via filter to
read on both the brand-blue (#0000F2) and near-black (#222) tiles. Bump the
glyph to 62% of the tile for the portrait aspect.
* fix(desktop): BrandMark renders logo as-is, no light bg/radius/padding
Drop the white filter, the hardcoded light-mode blue tile, the radius, and
the inner padding. Logo now fills the tile over a transparent surface in
light mode; dark keeps the #222 tile.
* fix(desktop): bump updates-overlay BrandMark to size-16
* feat(desktop): use downscaled karb.webp in BrandMark
Swap the BrandMark glyph to karb.webp, downscaled from 1129x1418/888KB to
254x320/81KB for the hero badge.
* feat(desktop): use nous-girl mark in BrandMark, invert in dark
Key the white background to transparent so only the black line-art remains
(384px/20KB webp). Light mode shows black art; dark mode flips it white via
dark:invert on the #222 tile. Drop the now-unused karb.webp and logo.png.
* fix(desktop): BrandMark uses nous-girl as-is (no transparent/invert)
The dark-mode invert read as a creepy negative. Use the opaque black-on-white
mark unchanged in both themes; drop the white-key, dark:invert, and #222 tile.
* fix(desktop): give BrandMark an explicit white bg tile
* fix(desktop): use nous-girl.jpg directly in BrandMark
* perf(desktop): downscale nous-girl.jpg to 256x256 (466KB -> 19KB)
* style(desktop): bump nous light --theme-secondary to 14% blue
* fix(desktop): outline button is transparent, not chrome-filled
The outline variant used bg-background (the chrome color), so on cards/overlays
with a different surface it rendered as an odd gray-blue fill (visible on the
boot overlay's Repair install / Use local gateway). Make it bg-transparent so
it inherits the surface like a real outline. Reverts the unrelated
--theme-secondary tweak.
* fix(desktop): clean outline button — thin border, no shadow/fill
Drop shadow-xs and the resting fills (light chrome bg, dark bg-input/30) so
outline is just a thin clean border with a subtle hover, in both themes.
* fix(desktop): stop forcing tertiary bg on outline buttons
A global [data-variant='outline'] rule set background: var(--ui-bg-tertiary),
which (attribute-selector specificity) overrode the cva bg-transparent — so
outline buttons always showed the pale tertiary fill on cards/overlays
regardless of the variant classes. Scope that fill to secondary only; outline
is now a true transparent border.
* style(desktop): unified overlay design system + restore #38631 flat-UI
Overlays/dialogs/toasts share a custom shadow-nous (downward-weighted) and
--stroke-nous hairline instead of hard borders: boot-failure, install,
notifications, model-picker, onboarding, prompt-overlays, updates, Dialog.
- button: outline is a 1px inset ring (no fill/shadow); chrome lives in Button
- BrandMark: 256px nous-girl mark replaces sparkle glyphs (updates/onboarding/about)
- onboarding: conditional header, lemniscate-bloom loaders, OTP device-code boxes,
NOUS CONNECTED hero (ascii decode) + cuneiform easter egg, "Begin" matrix exit
- shared LogView + ErrorState; math/ascii loaders over "Loading..." text
- appearance-settings flattened to SegmentedControl/ListRow; keybind-panel on
shadow-nous + text-variant reset
- restore flat-UI clobbered by #38631's stale-squash (4a1907bd1): command-center,
profiles, skills, messaging, cron de-boxed; shared SearchField + PAGE_INSET_X;
profiles back on OverlaySplitLayout; skills tabs+search one row, no row dividers
* refactor(desktop): clean pass — drop dead code, dedupe, fix stale docs
- log-view: drop unused `bare` prop + forwardRef (no caller uses ref)
- install-overlay: drop `stateOverride` (only the removed dev gallery used it)
- profiles: ProfilesViewProps down to { onClose } (drop vestigial section/titlebar)
- onboarding: hoist shared PROVIDER_ROW_CLASS (was duplicated 2x)
- brand-mark / error-state: tighten comments, fix stale AlertCircle reference
Master/detail separated by gap, not a divider; header rule, schedule-
preview chip border, and error-box border removed (subtle bg tints carry
the grouping/semantics). Fully borderless to match the flat overlay pass.
De-box the master/detail Cron page ahead of #40708's flat-UI system:
drop the two rounded-lg border/bg cards for a single --ui-stroke-tertiary
hairline between list and detail, swap the header divider and schedule-
preview chip onto the same stroke/bg-quinary tokens. No --stroke-nous
(that lands with #40708); only tokens already on this branch.
Follow-up on the backend-visible artifact-path fix.
- Extract the cache-mount iteration loop into a reusable, backend-agnostic
credential_files.map_cache_path_to_container(host_path, container_base) that
returns the POSIX container path or None. to_agent_visible_cache_path() now
delegates to it (keeping its Docker-only gate), and image_generation_tool's
_agent_visible_cache_path() delegates to it too — eliminating the duplicated
loop and the divergent path-join (posixpath vs Path) between the two.
- Drop the now-unused posixpath/Path imports from image_generation_tool.py.
- Document the agent_visible_cache_base getattr probe as a forward-looking
optional hook (no producer yet) so it doesn't read as a typo'd attribute.
- Add unit tests for map_cache_path_to_container.
Sidebar and Cron page each carried a near-identical name→prompt→id
title fn. Collapse to a single jobTitle in cron/job-state.ts (the
page variant, which also falls back to script then 'Cron job').
Redesign the cron surface around jobs (not run sessions), following
power-user patterns (GitHub Actions / Airflow / Dagu): master → detail → output.
Sidebar "Cron jobs" section:
- jobs with a state pip + live next-run countdown
- click toggles an inline run-history peek; a run opens its chat (active run highlighted)
- hover: trigger-now + manage (open the Cron page)
- capped at 50 with a "50+" badge
Cron page: de-nested from a collapse-in-row accordion to master/detail —
job list + the selected job's schedule, actions, and run history.
Backend: GET /api/cron/jobs/{id}/runs lists a job's run sessions.
Share STATE_DOT/jobState across both surfaces; drop dead code/keys.
`gateway/run.py::_UPDATE_ALLOWED_PLATFORMS` was a hardcoded frozenset
listing every messaging platform allowed to invoke the `/update` slash
command. Plugin-migrated platforms (currently Discord and Mattermost,
soon also Home Assistant via #32500) declare `allow_update_command=True`
on their `PlatformEntry`, and `_handle_update_command` already falls
back to the registry when a platform isn't in the frozenset. The result
was a silent redundancy: those entries said "allowed" twice, and the
registry flag was a no-op for them in practice.
- Removed `Platform.DISCORD` and `Platform.MATTERMOST` from the frozenset.
- Updated the docstring to make the split explicit (built-ins live in
the frozenset; plugins use `allow_update_command` on the registry entry).
The remaining frozenset entries are all still built-in platforms living
under `gateway/platforms/` today. Future plugin migrations should drop
their entry from the frozenset as part of the migration PR (or in a
sibling chore PR like this one).
Added a `TestUpdateCommandPlatformGate` test class that pins down all
three branches of the gate so future changes don't silently regress:
- Programmatic interfaces (`Platform.WEBHOOK`, `Platform.API_SERVER`)
must remain blocked.
- Plugin-migrated platforms (Discord, Mattermost) must pass via the
registry fallback.
- Built-in platforms in the hardcoded frozenset (Telegram) must
still pass without needing the registry.
The gate previously had zero direct test coverage — its only existing
coverage was `test_no_adapter_for_platform` which exercised a different
code path.
Move gateway/platforms/homeassistant.py into plugins/platforms/homeassistant/
following the same shape as the Mattermost and Discord migrations.
- Adapter file is renamed via git mv (history is preserved).
- register() exposes the platform via the plugin system instead of the
hardcoded Platform.HOMEASSISTANT elif in gateway/run.py::build_adapter().
- _standalone_send() replaces the legacy _send_homeassistant() helper in
tools/send_message_tool.py. Out-of-process cron delivery
(deliver=homeassistant from a cron process not co-located with the
gateway) now flows through the registry's standalone_sender_fn path
instead of the hardcoded elif.
- _is_connected() probes HASS_TOKEN via hermes_cli.gateway.get_env_value
so existing connected-platform checks behave identically.
The HASS_TOKEN / HASS_URL env-to-PlatformConfig seeding in
gateway/config.py stays in core — same pattern bluebubbles, mattermost,
and discord migrations followed. No setup_fn or apply_yaml_config_fn is
registered because Home Assistant has no _setup_homeassistant wizard in
hermes_cli/setup.py and no homeassistant: YAML block in config.yaml today;
setup runs through the existing hermes_cli/tools_config.py toolset wizard.
Test imports were rewritten across tests/gateway/test_homeassistant.py,
tests/integration/test_ha_integration.py, and
tests/tools/test_send_message_missing_platforms.py; the legacy
(token, extra, chat_id, message)-shaped _send_homeassistant call site is
preserved via a small SimpleNamespace shim in
test_send_message_missing_platforms.py (same approach used when
mattermost moved).
- Focused HA suites (64 tests across the three rewritten files) pass.
- Broader gateway/cron sweep produces 10 failures identical to main
baseline (telegram approval/model-picker xdist isolation flakes,
wecom_callback defusedxml issue, cron script_timeout fixture issue).
Zero net new failures.
A cron session's first message is the injected "[IMPORTANT: you are running as
a scheduled cron job …]" delivery hint, so with no explicit title the sidebar
and history rows fell back to that hint as their label.
Set the session title from the job (name → short prompt → id) with a run-time
suffix for uniqueness against the sessions.title index. Done after the run so
the agent's own INSERT keeps model/system_prompt — this only updates the title.
The dependency-verification repair in _verify_core_dependencies_installed
ran 'pip install --reinstall -e .' via _run_install_with_heartbeat directly,
bypassing the Windows shim-quarantine that the primary install path performs.
That reinstall rewrites the entry-point shims, and on Windows the live
hermes.exe is the running process — pip can neither delete nor overwrite it.
With no quarantine, the shim was left missing and 'hermes' dropped off PATH
('hermes' is not recognized... after update).
Extract the rename-out-of-the-way / restore-on-failure logic into a reusable
_run_quarantined_install helper and route both the primary editable installs
and the --reinstall -e . repair through it. The per-package repair installs
only third-party deps (never hermes-agent), so they don't touch the shims and
are left untouched. Add a regression test (fails on old code, passes on new).
- check-attribution: add chilltulpa@gmail.com -> TheGardenGallery to
AUTHOR_MAP in scripts/release.py (new external contributor via the
carried-over commits).
- ty: the dashboard back-compat test imported pytest but never used it,
tripping unresolved-import. Drop the dead import — tests are plain
functions driving the parser via subprocess, no pytest API needed.
Supersedes the single-.1 rotation from the prior commit, which only bounded
FUTURE growth: rotating a pre-existing oversized desktop.log just renamed the
monster to .1 (no disk reclaimed) and left it stranded until a second rotation
cycle that a now-healthy app may never reach. The ~326 GB file that motivated
this PR would therefore persist as desktop.log.1 after the user updated.
Two changes bring desktop.log in line with the Python-side logs
(hermes_logging.py RotatingFileHandler, maxBytes x backupCount):
1. Cascade rotation: live -> .1 -> .2 -> .3, dropping the oldest. Steady-state
usage is bounded at ~(backupCount + 1) x cap regardless of loop intensity,
instead of the old ~2x with a single backup.
2. Pathological-size discard: a file past 4x the cap is a boot-loop artifact
with no diagnostic value — delete it (and any equally poisoned backups)
outright instead of relocating the disk-exhaustion problem into a sibling.
This is what lets an updated app self-heal a disk a stale build filled,
on the very next launch, rather than one rotation cycle later.
Behavior verified against a real filesystem in a temp dir: under cap -> no
rotation; normal overflow -> live becomes .1; repeated overflow keeps exactly
backupCount backups (no .4) with total bounded; a pathological live file plus
poisoned backups are all reclaimed. node --check passes.
Co-authored-by: The Garden <chilltulpa@gmail.com>
desktop.log is an append-only forensic log written via appendFileSync /
fs.promises.appendFile with no rotation. When the backend enters a boot
loop — e.g. the version-skew crash where an old app shell spawns
`dashboard --tui`, argparse exits(2) instantly, and the renderer keeps
retrying — the full bootstrap transcript plus repeated stack traces are
appended on every attempt. In the wild this drove a single desktop.log to
~326 GB, exhausting the disk and breaking `hermes update`/install (git
index.lock, venv rebuild, and npm all need scratch space).
Rotate to a single .1 sibling once the live file crosses a 10 MB cap, so
total on-disk usage stays ~2x the cap while preserving the most recent
transcript for diagnostics. The size check runs before each append in both
the sync (shutdown) and async (steady-state) flush paths. All filesystem
ops stay inside try/catch so logging can never block startup/shutdown or
crash the shell — consistent with the existing append error handling.
Paired with the CLI --tui back-compat guard in this PR: the guard stops the
crash loop from starting, and this stops a crash loop (from any cause) from
ever filling the disk.
Older Hermes desktop app shells (<= 0.15.x) spawn the backend as
`hermes dashboard --no-open --tui --host ... --port ...`. The --tui flag
was removed from the dashboard subcommand in cae6b5486 (embedded chat is
always on now).
When a user's CLI updates past that commit but their desktop app binary
has not, argparse hard-errored with 'unrecognized arguments: --tui' and
exit(2). The backend died before becoming ready and the desktop GUI showed
only 'Hermes couldn't start' with no actionable cause — a confusing brick
for anyone whose app and CLI versions drift apart across an update.
Add a hidden, deprecated, accepted-and-ignored --tui flag to the dashboard
subparser so an old app shell + new CLI degrades gracefully. Hidden from
--help via argparse.SUPPRESS so we don't re-advertise a removed feature.
Safe to delete once the floor app version is well past 0.16.0.
Adds tests/hermes_cli/test_dashboard_tui_backcompat.py pinning: the flag
parses without error, stays hidden from --help, and the modern (no --tui)
invocation is unaffected.
The cron scheduler tick loop only ran inside `hermes gateway run`, but the
desktop app spawns a `hermes dashboard` backend with no gateway — so any cron
a user created in the app was saved and never fired (silently).
Run a minimal scheduler ticker inside the dashboard lifespan, gated on a new
HERMES_DESKTOP=1 marker the electron shell injects, so server `hermes dashboard`
is unaffected. Cross-process safe via the existing cron/.tick.lock, so it never
double-fires alongside a real gateway.
Address the two non-blocking follow-ups from review:
- next_call is now single-use per middleware frame. A second invocation
raises instead of silently re-running the downstream provider/tool, so
the terminal call cannot execute twice via the chain. The error surfaces
through the existing handler, which preserves the first downstream result.
- Request-middleware payload copies go through _safe_copy(), which falls
back to a shallow dict copy when deepcopy() fails on a non-deepcopyable
member (clients, callbacks, file handles) instead of aborting the pass.
Adds regression coverage for both: double next_call() keeps the terminal
single-run, and a non-deepcopyable (threading.Lock) request payload still
runs middleware via the shallow fallback.
Scheduler sessions (source=cron) were listed in recents, where their
`[IMPORTANT: …]` first-message previews spammed the list — and because
cron runs are always newest, a burst of them consumed the whole recents
page budget and starved real conversations (sidebar showed 0 sessions).
Recents and cron jobs are now two independent lists:
- Backend: /api/sessions + /api/profiles/sessions accept source /
exclude_sources; session_count gains exclude_sources. Recents query
excludes cron; the cron section queries source=cron.
- Desktop: separate $cronSessions store + refreshCronSessions fetch, a
collapsed (persisted) "Cron jobs" section below Sessions that only
renders when cron sessions exist, with its own bounded scroller.
macOS reserves cmd+` for window cycling, so the keydown never reached the
renderer and profile.default never fired. Move it to ⌥⌘0 — the "0 slot" of
the ⌘⌥-digit profile range — which is unreserved and fits the scheme.
Add rebindable actions for the high-frequency gaps: focus composer, open
model picker, next/prev session, search sessions (⌘⇧F), show files/
terminal tab, and nav→artifacts. Reconcile the duplicate Shift+N new-
session listener into session.new's defaults, and surface the remaining
context-local shortcuts (⌘↵ steer, ⌘L terminal selection, ⌘W close
preview) as read-only rows so the panel is the honest source of truth.
Add a central keybind registry + nanostore so desktop hotkeys are
discoverable and user-rebindable. A titlebar ⌨ button (and ⌘/) opens a
collapsible map grouped by Composer (read-only) / Profiles / Session /
Navigation / View; click any chip to capture a new combo. Overrides
persist to localStorage as a delta against shipped defaults, so future
default changes aren't shadowed by a stored snapshot.
Migrates the previously scattered inline listeners (palette, command
center, new session, sidebar, theme) into the registry, and adds profile
switch/cycle/create + default-profile hotkeys.
":" is FTS5's column-filter operator. With a single-column "content" FTS table,
an unquoted query like "TODO: fix" parses as "column:term" and raises
"no such column: TODO". search_messages() catches that OperationalError at the
execute site and returns [], so colon queries silently yield zero hits even when
the content is present. This hits both the session_search tool and the dashboard
search.
Add ":" to the Step 2 metacharacter strip in _sanitize_fts5_query(), mirroring
how the other FTS5 syntax characters are already stripped. Colons inside quoted
phrases are preserved (Step 1 protects them). Adds a regression test asserting a
colon query still finds matching content, plus unit assertions on the sanitizer.
* feat(desktop): surface every provider + models from `hermes model` in the GUI
The desktop GUI's model/provider choices were starved relative to the
`hermes model` CLI. Onboarding listed ~8 providers, Settings → Model only
showed authenticated ones, because the global `/api/model/options` endpoint
called build_models_payload() without the full-universe flags the TUI's
model.options JSON-RPC already used.
- web_server.py: `/api/model/options` now passes include_unconfigured +
picker_hints + canonical_order (matching the TUI handler), so every GUI
surface fed by it sees all 37 canonical providers with auth hints.
- Settings → Model: provider dropdown lists every provider; picking an
unconfigured api_key provider shows an inline 'paste key → Activate' flow
(auto-selects the recommended default); OAuth/external route to onboarding.
- Onboarding: the API-key form is now driven by the full provider catalog
(curated five first, then the rest), not a hand-maintained list of five.
- types/hermes.ts: ModelOptionProvider gains authenticated/auth_type/key_env.
- Tests: model-settings covers the full-universe list + inline activation;
fixed a pre-existing stale assertion (nous / hermes-4 was never rendered).
* feat(desktop): /model in GUI chat opens the model picker instead of a dead-end notice
Typing /model in a desktop chat session printed "/model uses the desktop
model picker instead of a slash command" and did nothing — it never opened
the picker. (The slash worker can't render the prompt_toolkit modal /model
opens in the CLI, so the desktop just showed the unavailable-notice.)
- use-prompt-actions.ts: intercept /model client-side. No args → open the
desktop model picker overlay (setModelPickerOpen) — the same full
provider+model picker as the status-bar button. With args (/model <name>
[--provider ...]) → run the switch directly via slash.exec so power users
can still type it.
- desktop-slash-commands.ts: export isModelPickerCommand() so the hook can
detect picker-owned commands without duplicating the PICKER_OWNED_COMMANDS set.
- Test: covers isModelPickerCommand for /model (+ args) vs non-picker commands.
* fix(desktop): make onboarding provider lists scrollable + clean up card styling
The full-catalog onboarding picker could overflow the modal with no way to
scroll — the OAuth provider list and the api-key grid both grew past the
viewport, hiding the key input and the bottom action row (overflow-hidden card,
no scroll container).
- Scope a `max-h-[60dvh] overflow-y-auto` region to just the provider list /
api-key card grid; the "other providers" disclosure, key input, and action
row stay pinned and reachable.
- Inner `p-1` so card borders / focus rings aren't clipped by the scroll viewport.
- Flatter card styling: drop the persistent border, the redundant selected-state
checkmark, and the modal shadow — selection now reads from the ring alone (the
muted "already configured" check stays).
- Remove the " — set up" suffix from the Settings → Model provider dropdown; the
inline setup flow already signals unconfigured providers.
* fix(desktop): identify api-key onboarding cards by env var, not id
Selecting "Google Gemini" also highlighted "Google AI Studio": the curated
catalog and the backend-derived providers can collide on `id` (a provider slug
can equal a curated id like `gemini`), so `option.id === o.id` matched two
cards at once. Key selection (and the React key + snap-back effect) on `envKey`
instead, which the catalog dedups and is therefore unique per card.
---------
Co-authored-by: Brooklyn Nicholson <brooklyn.bb.nicholson@gmail.com>
Track successful next_call completion separately from invocation so execution
middleware that catches and translates a downstream provider/tool failure does
not accidentally convert that failure into a successful None result.
Also avoid wrapping BaseException from downstream execution, and document the
execution middleware error semantics.
Tests cover:
- pre-next_call middleware failures fail open to the remaining chain
- post-next_call middleware failures preserve the downstream result
- translated downstream failures propagate instead of returning None
- downstream BaseException is not wrapped
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Salvages the primary fix from #24275 (asdlem) and layers a last-resort
fallback on top:
Primary (from #24275): the real macOS 26 root cause is that `gui/<uid>`
isn't reachable from non-Aqua/background sessions. Switch the launchd
domain to `user/<uid>` and mark the plist valid for both Aqua and
Background sessions (LimitLoadToSessionType), restoring a real supervised
service. Treat exit code 125 as "job unloaded" so start/restart
re-bootstrap and retry.
Last resort (this PR): the #23387 reporter saw `user/<uid>` bootstrap
also fail with error 5 on some hosts. When even a fresh bootstrap can't
manage the domain (codes 5/125 persist), degrade to a CLI-managed
detached background process instead of crashing — logs to gateway.log,
PID tracked via gateway.pid so stop/status/restart keep working. Print
guidance that it won't auto-start at login or auto-restart on crash.
Co-authored-by: asdlem <asdlem@users.noreply.github.com>
macOS 26+ broke launchctl management of the gui/<uid> (and user/<uid>)
domains: `bootstrap` returns error 5 and `kickstart` returns error 125
("Domain does not support specified action"), so `hermes gateway
start/install/restart` crashed with a cryptic traceback (#23387).
Detect these codes and degrade gracefully: launch the gateway as a
CLI-managed detached background process (the documented `nohup hermes
gateway run --replace` workaround), with logs to gateway.log and the PID
tracked via gateway.pid so stop/status/restart keep working. Print clear
guidance that the service won't auto-start at login or auto-restart on
crash on this macOS version. launchd_stop also tolerates 125/5 from
bootout and falls through to the PID-based kill.
* fix: respect disabled auto-compaction on context overflow
Port from anomalyco/opencode#30749.
When compression.enabled is false, NO automatic compaction trigger may
fire. The proactive token-threshold paths (preflight + post-response
should_compress gate) already honoured the setting, but the three
provider-overflow recovery paths in the agent loop — long-context-tier
429, 413 payload-too-large, and context-overflow — called
_compress_context() unconditionally, silently compressing and rotating
the session against the user's explicit choice.
Add a single guard at the top of the overflow-recovery dispatch: when
compression is disabled and the error is one of those three overflow
classes, surface a terminal error (compaction_disabled: True) telling the
user to /compress manually, /new, switch to a larger-context model, or
reduce attachments. Manual /compress (force=True) is unaffected — it never
enters this loop.
Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't
compress when disabled; control case still compresses when enabled).
Existing overflow-recovery tests updated to enable compaction explicitly
(they verify the recovery fires); fixture defaults flipped to True to
match production (compression.enabled defaults to True).
* fix(install): detect TLS cert-trust failures during npm install on Windows
Corporate MITM proxies and missing root CAs surface as 'unable to get
local issuer certificate' while npm (most often Electron's install.js
postinstall) downloads over HTTPS. The installer surfaced this as an
opaque 'desktop workspace npm install failed (exit 1)', so users
misread it as a permissions/admin-rights problem (issue #38016).
Add a shared Show-NpmCertHint detector and route all three npm-install
failure paths (agent-browser global install, browser-tools workspace,
desktop workspace) through it. On a cert error it prints actionable
NODE_EXTRA_CA_CERTS / strict-ssl remediation; on any other failure it
stays silent.
* fix: respect disabled auto-compaction on context overflow
Port from anomalyco/opencode#30749.
When compression.enabled is false, NO automatic compaction trigger may
fire. The proactive token-threshold paths (preflight + post-response
should_compress gate) already honoured the setting, but the three
provider-overflow recovery paths in the agent loop — long-context-tier
429, 413 payload-too-large, and context-overflow — called
_compress_context() unconditionally, silently compressing and rotating
the session against the user's explicit choice.
Add a single guard at the top of the overflow-recovery dispatch: when
compression is disabled and the error is one of those three overflow
classes, surface a terminal error (compaction_disabled: True) telling the
user to /compress manually, /new, switch to a larger-context model, or
reduce attachments. Manual /compress (force=True) is unaffected — it never
enters this loop.
Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't
compress when disabled; control case still compresses when enabled).
Existing overflow-recovery tests updated to enable compaction explicitly
(they verify the recovery fires); fixture defaults flipped to True to
match production (compression.enabled defaults to True).
* fix(gateway): plain text while busy interrupts by default again
busy_input_mode (default 'interrupt') was advertised as the busy-behavior
knob, but a second knob added in 7abd62719 — busy_text_mode, defaulting to
'queue' — short-circuited every plain TEXT message before busy_input_mode
was consulted. Result: plain follow-ups silently queued instead of
interrupting, even with busy_input_mode left at its 'interrupt' default
(regression #38390, silent-queue #31588).
Collapse to one source of truth: busy_input_mode drives text handling.
busy_text_mode is kept only as a legacy explicit override for back-compat
(existing queue setups keep working); when unset it follows busy_input_mode.
All default fallbacks flipped queue->interrupt. The debounce mechanism is
preserved and now keyed off the resolved mode.
Fixes#38390, #31588.
Conversation rhythm:
- Single `--paragraph-gap` knob drives paragraph spacing both inside a
markdown block and between consecutive prose parts, out-specifying Tailwind
Typography's prose margins. Code cards carry the same gap themselves so it
holds at any Streamdown nesting depth.
- Two-tier vertical rhythm: `--turn-block-gap` separates scaffolding (tools /
thinking) from the reply; `--tool-row-gap` keeps a tool run tight.
- Drop the prose indent so prose, tools, todos, and thinking share one left
edge. `---` renders as quiet spacing, not a heavy rule.
Flat tool list:
- Tools always render as a standalone-row stack, never a "Tool actions · N
steps" group. assistant-ui slices the tool range unstably (interleaved live
vs. reconstructed-consecutive when settled), so grouping reshuffled the whole
turn the instant it settled. Flat rows are pixel-identical either way.
- Inline approvals can no longer be buried in a collapsed group body.
- Remove the now-dead grouping helpers from tool-fallback-model.
Empty thinking:
- Suppress reasoning disclosures with no visible text (encrypted / spinner-
coerced reasoning) instead of leaving an empty "Thinking" header.
- Tail stall indicator returns "thinking" when a running turn goes quiet.
Streaming cadence:
- Smooth character-reveal decouples visible cadence from bursty arrival.
- Flush queued text deltas before applying tool events so a tool row can't
jump ahead of its preceding text.
- Disable Nagle on the GUI WebSocket so per-token frames aren't coalesced.
Polish: clarify/patch/vision_analyze tool meta, queue-panel + diff-lines
spacing, sticky human bubble expands on focus (not hover).
on_post_llm_call extracted usage via `if response is not None:`, taking the
response-object path. But post_api_request delivers `response` as a sanitized
dict (no `.usage` attribute) alongside a separate `usage` summary dict, so
`getattr(response, "usage")` was always None and token/cost data was dropped
for every gateway turn (traces showed usage 0 / cost 0).
Gate on a real `.usage` attribute so the existing usage-dict fallback is
reached. Real response objects (post_llm_call / legacy) still take the
response-object path. Adds regression tests for both paths.
Replicate the `hermes tools` configurator in the dashboard Skills →
Toolsets view. Each toolset now opens a config drawer that covers the
full lifecycle the CLI offers: enable/disable, pick a provider/backend,
enter and save API keys, and run a provider's post-setup install hook
with a live log tail.
The toolset view was previously read+toggle only — the provider matrix
and key-status endpoints existed but the page never called them, and
there was no way to save a key or run a backend install (npm/pip/binary)
from the browser.
Backend:
- New CLI subcommand `hermes tools post-setup <KEY>` — non-interactive,
scriptable target that runs a provider's install hook (agent_browser,
camofox, cua_driver, kittentts, piper, ddgs, spotify, langfuse,
xai_grok). Validated against valid_post_setup_keys() so an arbitrary
key can't drive _run_post_setup.
- PUT /api/tools/toolsets/{name}/env — save API keys to ~/.hermes/.env
via save_env_value (same store the CLI writes), validated against the
toolset category's env-var allowlist; blank values skipped.
- POST /api/tools/toolsets/{name}/post-setup — spawn-action that runs
`hermes tools post-setup <key>`; frontend tails the log via the
existing /api/actions/tools-post-setup/status. Registered in
_ACTION_LOG_FILES.
Frontend:
- New ToolsetConfigDrawer component (provider radios, password key
inputs with saved-state, get-a-key links, Run-setup + live install
log). Toolset cards get a Configure button + the drawer also exposes
the enable toggle.
- api.ts: toggleToolset, getToolsetConfig, selectToolsetProvider,
saveToolsetEnv, runToolsetPostSetup + ToolsetConfig/Provider/EnvVar/
EnvResult types.
Validation: 56 admin-endpoint tests pass (10 new: env save w/ CLI
parity + allowlist reject + blank-skip, post-setup spawn validation,
auth gate); 232 web_server tests pass; web npm run build + eslint clean;
HTTP E2E exercises save-key (CLI reads it back) and spawn+poll
post-setup to exit 0.
The skill detail dialog (Skills hub browser) had several awkward
spacing/placement issues:
- description and identifier crammed together with no breathing room
(-mt-1 pulled the description tight to the header)
- the identifier line touched the action-row border
- Install was stranded far right with a large empty void in the middle
of the action row
- the SKILL.md <pre> opened with a leading blank line
Fixes:
- group description + identifier in a spaced flex-col block (mt-1, gap-1)
- give the action row mt-3 + py-2.5 so it separates from the meta block
- move the repo link into the right-side group with Install (ml-auto,
gap-3) so the row reads left=tabs / right=repo+install, no middle void
- mt-3 on the body for consistent vertical rhythm
- trim() the SKILL.md content so it starts at the first real line
- clarify-tool: top-align the help icon (items-start + mt-px) so it sits
beside the first line of a multi-line question instead of floating
centered against the whole block.
- tool-fallback: a non-zero exit code alone no longer paints the whole
terminal/execute_code card red. grep no-match, diff differences, and
piped commands routinely exit non-zero while producing useful output;
only flag an error when the command produced no output. Explicit error
signals (error field, success=false, status=error, isError) still go red.
- Add regression tests covering the exit-code -> status matrix.
Strip shadow-composer (and its focus/open-state variants) from the
composer surface, composer fallback surface, and the shared user-bubble
base class. Also drop the !important box-shadow override on
[data-slot=composer-surface] that re-applied the shadow regardless of
the utility class, so the flatter look actually takes effect.
The Browse-hub tab was a blank search box with sparse result cards (name +
source + one Install button), no way to read a skill before installing, no
visual security scan, and no indication it was even connected to any hubs.
Backend (web_server.py):
- GET /api/skills/hub/sources — lists the configured hubs (label + trust
tier + GitHub rate-limit + index availability) and featured skills pulled
from the centralized index (zero extra API calls), plus installed-skill
provenance so the UI can mark already-installed results.
- GET /api/skills/hub/preview — fetches a skill's SKILL.md text + file
manifest WITHOUT installing (decodes byte-stored text, masks binaries).
- GET /api/skills/hub/scan — runs the SAME quarantine + scan_skill +
should_allow_install pipeline the CLI installer uses, then cleans up
quarantine, returning verdict / per-finding detail / severity tally /
install-policy decision.
- search now returns per-source counts + timed-out sources + installed map.
Frontend (SkillsPage HubBrowser):
- Landing state: connected-hubs strip + featured skill grid (no more blank
page).
- Rich cards: trust-level color coding, source, tags, identifier,
Details + Install (or Installed state).
- Detail dialog: read the actual SKILL.md, on-demand visual security scan
(verdict pill, severity tally, per-finding list, allow/block policy),
GitHub repo link.
- Search meta line: result count + timing + per-source breakdown (the
'feels slow / no feedback' complaint).
Tests: 4 new endpoint test classes (sources/preview/scan + updated search
shape) in test_dashboard_admin_endpoints.py.
profile list and profile show assumed the wrapper script is always named
after the profile (wrapper_dir / name). When a custom alias exists — e.g.
`hermes profile alias steve --name qiaobusi` creates ~/.local/bin/qiaobusi
pointing at `hermes -p steve` — the display silently showed the profile
name (or nothing) instead of the alias the user actually typed.
The custom-alias *creation* path (create_wrapper_script(name, target)) was
added later; the *display* path was never updated to match.
Add find_alias_for_profile() — a reverse lookup that scans the wrapper dir
for our own wrappers (alias-named file containing 'hermes -p <profile>'),
prefers a custom alias over the profile-named one, strips .bat on Windows,
and sorts for deterministic output. Populate ProfileInfo.alias_name and wire
it into the three display sites (profile describe, list, show).
Credit: salvages the intent of #11506 by wss434631143, reimplemented on
current main against the post-#11506 custom-alias (--name/target) mechanism.
Tests: 6 new (profile-named, custom-name, none, unrelated-file rejection,
windows .bat strip, list_profiles surfacing). All 123 in test_profiles pass.
E2E verified against the real CLI for both custom and profile-named aliases.
credits.grant_spent is a one-time "your monthly grant is used up, you're now on
top-up" heads-up, but it was sticky — it camped the TUI status bar until the grant
refilled, so a user with healthy top-up saw "Grant spent · $990 top-up left"
indefinitely. Treat it like the usage-band notice: flash once, then clear on the
next prompt (startMessage). Depletion stays sticky (you actually can't make
requests). The Python `active` latch keeps the key, so it won't re-fire next turn.
* feat(tui): HERMES_DEV_CREDITS live-spend dev readout (L0 tracer for usage-aware credits)
L0 of the usage-aware-credits feature: a dev-only, env-gated tracer that
exercises the real header -> CreditsState -> TUI pipe end-to-end behind
HERMES_DEV_CREDITS, de-risking the L1/L5 build before the notice policy exists.
- agent/credits_tracker.py: CreditsState + parse_credits_headers (headers are
strings -> paid_access via == "true", never bool(); retain-last-known; only
subscription_micros may be negative; *_usd kept verbatim).
- run_agent.py: _capture_credits / get_credits_state / get_credits_spent_micros,
session-start baseline latch, + dev-gated "credits" capture log.
- agent/chat_completion_helpers.py: capture on the streaming response.
- agent/agent_init.py: init _credits_state + _credits_session_start_micros.
- tui_gateway/server.py: _get_usage emits dev_credits_spent_micros only when flagged.
- ui-tui appChrome.tsx / types.ts: cents delta status segment + "(dev credits)" banner.
Off by default; silent for normal users. Validated live against staging
(capture log delta matches the TUI segment). Throwaway consumer (readout/log/
banner); credits_tracker + the capture plumbing are the real feature foundation.
* test(credits): lock parser under 9-state matrix + harden validation (L2)
Add tests/agent/test_credits_tracker.py with 92 tests covering the 9-state
matrix (healthy, sub_90pct, grant_exhausted, purchased_only, tool_pool_free,
depleted, debt, missing, no_org) plus validation edge cases: version strict==1
with warn-once latch for v>1, bool-string trap (paid_access/tool_pool_gated_off
== "true"/"false", never bool()), half-pair subscription limit treated as
both-absent while parse succeeds, USD regex ^-?\d+\.\d{2}$, non-int micros
→ None, negative non-subscription micros → None, as_of_ms junk → None, zero
limit ZeroDivision guard.
Harden agent/credits_tracker.py to match the spec:
- Add tool_pool_micros/tool_pool_gated_off/from_header fields to CreditsState
- Add depleted property (== not paid_access, never remaining==0)
- Change used_fraction guard to key off subscription_limit_micros (the actual
denominator) not denominator_kind (metadata)
- Replace fail-soft _safe_int with a sentinel-returning variant; full validation
now returns None on any malformed field rather than silently defaulting
- Add module-level warn-once latch for version > 1
- Add USD regex validation; add denominator_kind allow-list check
- Parse x-nous-tool-pool-* prefix headers (not x-nous-credits-tool-pool-*)
* feat(credits): notice spine — AgentNotice + notice_callback/notice_clear_callback + TUI binding (L1)
L1 of usage-aware credits: the driver-agnostic notice delivery spine that L4's
policy will fire through and L5's TUI render will consume.
- agent/credits_tracker.py: AgentNotice dataclass (text/level/kind/ttl_ms/key/id;
kind defaults "sticky", kept TTL-expressive for a future config seam).
- run_agent.py: AIAgent gains notice_callback + notice_clear_callback slots and
_emit_notice / _emit_notice_clear emitters (swallow all callback errors — a
notice must never break the agent loop; no-op when unbound).
- agent/agent_init.py: thread both callbacks through init_agent.
- tui_gateway/server.py: bind both in _agent_cbs → notification.show / notification.clear
WS events (snake_case payload, matching the existing gateway-event convention).
- ui-tui/src/gatewayTypes.ts: notification.show / notification.clear arms on GatewayEvent.
- tests/run_agent/test_notice_spine.py: 15 tests (emitter fire + fail-open + no-op,
signature threading, TUI binding payload shape).
Messaging push is out of v1 (binds neither callback). CLI binding + the TUI render/
decode land with L4 (firing) and L5 (render) so turn-end flush is wired correctly.
* feat(credits): threshold reconciliation policy + tests (L4.1)
* feat(credits): wire threshold policy into capture + latch (L4.2)
After a fresh header parse, _capture_credits runs evaluate_credits_notices against
the agent's _credits_latch and emits the result — clears first, then shows (so a
recovered depletion clears before the "restored" success lands, and depleted wins
the latest-wins slot). Gated on a bound notice_callback: messaging (no callbacks)
still caches state for /usage but runs no policy. Parse stays fail-open (miss →
keep last-known); the eval/emit path warns on failure rather than swallowing, so a
depletion-notice bug can't vanish silently.
- run_agent.py: _capture_credits split into parse (swallow→miss) + policy (warn);
latch lazy-guarded (object.__new__ safety).
- agent/agent_init.py: init agent._credits_latch = {"active": set(), "seen_below_90": False}.
* feat(tui): render credits notices in the status bar (L5, Strategy B)
The TUI now renders the notification.show / notification.clear gateway events the
agent emits — a level-colored notice overrides the status/verb slot when not busy.
- Notice state machine on turnController (pendingNotice + dedicated noticeTimer +
show/clear/applyNotice/flushPendingNotice/clearNoticeState). createGatewayEventHandler
decodes the events and delegates.
- Render priority busy > notice > status (appChrome StatusRule); notice text rendered
verbatim (its glyph comes from the policy), shrinkable so it never clips model│ctx;
dev-credits banner + Δ segment preserved. UiState.notice is snake_case (matches wire).
- Busy-wins: a notice arriving mid-turn is held and flushed at the THREE turn-end sites
(recordMessageComplete / interruptTurn / recordError) — never idle(), which reset()
also calls (would leak across sessions); reset() clears instead.
- Dedicated noticeTimer (never statusTimer); TTL starts on visibility with an id-guard;
latest-wins cancels the prior timer; clear is key-matched (no-op on mismatch); a sticky
survives a turn (flush no-ops with no pending); session reset clears (no cross-session leak).
- 20 tests (handler/turnController logic incl. R3-C2 timer isolation + render priority).
* feat(credits): cold-start seed for new Nous sessions (L3)
A genuinely-new Nous session has no inference header yet, so seed credits state from
the authoritative GET /api/oauth/account snapshot at session start (in the new-session
branch of _restore_or_build_system_prompt — inline, since the on_session_start plugin
hook gets no agent reference). The seed runs the shared notice policy, so a session that
opens already depleted warns IMMEDIATELY rather than only after the first turn.
- Maps the nested account fields (paid_service_access → paid_access; total_usable /
subscription / purchased on paid_service_access_info; rollover on subscription), each
None-guarded; float dollars → micros via round(d*1e6), *_usd left "" (render formats
from micros — never synthesize a verbatim usd from a float).
- Magnitudes-only: no monthlyCredits on the endpoint → subscription_limit_* unset →
used_fraction None → no warn90 from the seed (% only once a header lands, per D-E).
- Provider-guarded to Nous; fail-open (any error leaves _credits_state None, never
blocks startup); paid_access unknown ⇒ True (never falsely depleted).
- run_agent.py: extracted the warm-path policy/emit block into a shared
_emit_credits_notices() so capture and the seed fire notices identically.
* feat(credits): /usage Nous credits magnitudes view + recovery trigger (L6)
Add Nous credit dollar magnitudes to /usage (subscription / top-up / total
+ rollover + renewal + portal CTA), magnitudes-only per v1 (no % until the
account endpoint exposes a denominator). Reuses the existing account-usage
render machinery via a new pure build_nous_credits_snapshot() that maps a
NousPortalAccountInfo to an AccountUsageSnapshot; no nous branch is added to
fetch_account_usage (keeps the per-provider boundary intact).
CLI /usage also doubles as a depletion-recovery trigger: a force_fresh
account fetch, kept in a SEPARATE local so it never clobbers the
header-sourced agent._credits_state (which alone carries used_fraction). If
paid access recovered while credits.depleted is latched and a notice
consumer is bound, it reuses agent._emit_credits_notices() to clear it.
Gateway /usage displays magnitudes only — messaging binds no notice
consumer, so it performs no recovery emit.
Fail-open throughout: any portal hiccup leaves /usage unaffected.
* refactor(credits): dedupe HERMES_DEV_CREDITS flag parse via shared helpers
The dev-flag truthy check was inlined in three places. Replace with the shared
utils.is_truthy_value (run_agent.py, tui_gateway/server.py — also drops a
redundant inline `import os`) and a hoisted DEV_CREDITS_MODE export in
ui-tui/src/config/env.ts (consumed by appChrome, which also stops recomputing the
env check on every render). Behaviour-preserving; identical truthy set.
* fix(credits): cut dead /usage recovery trigger + bound portal fetches (L6 review)
Adversarial review found the /usage depletion-recovery trigger dead AND broken:
the CLI binds no notice_clear_callback, the TUI runs /usage in a separate
slash-worker subprocess (its own agent/latch), and the no-clobber rule made it
evaluate stale paid_access anyway. Recovery already happens on the next inference
(warm path), so the trigger was redundant — remove it and stop the depleted
notice over-promising.
- cli.py: remove the dead recovery block; bound the /usage portal fetch with a
10s wall-clock timeout (ThreadPoolExecutor) like the per-provider fetch —
urllib's per-socket timeout is not a wall-clock guarantee.
- agent/credits_tracker.py: reword the depleted CTA to "run /usage for balance"
(no false recovery promise; /usage shows fresh magnitudes, sticky clears next turn).
- agent/conversation_loop.py: same wall-clock timeout on the cold-start seed fetch
so a stalled portal can't hang session startup; tidy its time import.
* chore(credits): dev notice-state fixtures (HERMES_DEV_CREDITS_FIXTURE)
Throwaway dev scaffolding to exercise the notice pipeline without real spend or
Redis seeding. Set HERMES_DEV_CREDITS_FIXTURE to a state name (healthy / sub_90pct
/ grant_exhausted / depleted / clear) or a file path whose contents name a state
(re-read each turn → flip states live for recovery testing). _capture_credits
injects the chosen CreditsState instead of parsing real headers and runs the
shared notice policy. Deletable with the rest of the HERMES_DEV_CREDITS scaffolding.
* feat(credits): /usage monthly-grant % gauge
The portal /api/oauth/account subscription block now carries monthly_credits
(the per-period grant allowance, the % denominator). The consumer parsed
monthly_charge but dropped monthly_credits, so /usage stayed magnitudes-only.
Capture monthly_credits into NousPortalSubscriptionInfo + _subscription_from_payload.
build_nous_credits_snapshot emits a Subscription usage window (real % used, routed
through the existing render machinery) when monthly_credits is a finite positive
denominator and credits_remaining is finite and <= cap; otherwise it degrades to
magnitudes-only (older portals, rollover-over-cap, or non-finite payloads).
Guards (adversarial-review-driven): reject non-finite operands (json.loads parses
bare NaN/Infinity by default → would render $nan + a false 100% used), reject
bools, guard div-by-zero (cap>0), and suppress the gauge when remaining > cap
(rollover spanning the period makes the cap a nonsensical denominator → the
$X-of-$Y detail would read as a contradiction). Debt (remaining<0) clamps to 100%.
Money rule preserved: the ratio + magnitudes are computed from numeric float
account fields via display formatting, never by parsing a server *_usd string
(there are none on these dataclasses).
13 gauge tests added (tests/agent/test_nous_credits_gauge.py).
* fix(credits): show /usage Nous block whenever a Nous account is present
/usage runs in a slash-worker subprocess whose resolved inference provider is
often not "nous" even when the user has a Nous account, so gating the Nous
credits block on (provider == "nous") hid it entirely — the account data was
fully available but never rendered.
Gate instead on "a Nous account is logged in": a cheap local auth-state lookup
(get_provider_auth_state('nous') has an access_token) decides whether to attempt
the portal fetch, regardless of which provider inference runs on. In the gateway
the block is also lifted out of the 'if provider:' scope so a Nous-credentialled
user with another (or no) resident inference provider still sees their balance.
Fail-open and the per-fetch wall-clock timeout are preserved.
* fix(credits): show /usage Nous block when there's no live agent (TUI slash-worker)
In the TUI, /usage runs in a slash-worker subprocess that resumes the session
WITHOUT building an agent (self.agent is None), so _show_usage early-returned
"(._.) No active agent" before ever reaching the Nous credits block — which is
agent-independent (a portal fetch gated on Nous auth-state). Extract the block
into _print_nous_credits_block() and run it at the no-agent / no-calls
early-returns too (returns True if it printed, so the fallback message only
shows when there's genuinely nothing).
Verified live against staging: the block + monthly-grant gauge now render in the
slash-worker /usage path (previously hidden). The plain CLI REPL + messaging
paths are unchanged (they have a live agent).
* feat(credits): escalating 50/75/90 usage bands (single status line)
Replace the lone 90%-used warning with three escalating bands (50 info, 75 warn,
90 warn) shown as ONE status-bar line: it displays the highest band the
subscription grant has crossed, replaces the line as usage climbs, steps back
down on recovery, and clears below 50%. No stacking, no per-turn churn.
Bands live in a tunable CREDITS_USAGE_BANDS list; the policy derives everything
from it. Single notice key (credits.usage) with a usage_band latch field so the
notice only re-emits when the band actually changes. The crossing gate
(seen_below_90) is preserved so a fresh live session that opens mid-range stays
quiet until it has been observed below the lowest band (cold-start primes it when
it wants an open-high warning). Denominator math unchanged: % = subscription
grant burn (cap - grant_remaining)/cap, clamped [0,1]; top-up never moves the %.
Migrated test_credits_policy.py to the new key + added TestUsageBands (climb,
step-down, recovery-clear, idempotent, inclusive boundaries).
* feat(credits): hydrate notices at session OPEN via shared seed (TUI + first-turn)
Notices previously only fired inside a conversation turn (first message), so a
session that opened already depleted / past a usage band showed nothing at
'ready'. Extract the cold-start seed into a shared seed_credits_at_session_start()
and call it (a) in the TUI/desktop agent build right after the notice callback is
wired (fires at 'ready', before any message) and (b) as the first-turn fallback in
conversation_loop. Idempotent (skips once _credits_state exists) and fail-open.
The seed now maps monthly_credits -> subscription_limit_micros +
denominator_kind='subscription_cap', so used_fraction is computable at seed time
and usage-band warnings (not just depletion) hydrate on open. Primes the crossing
latch so a session opening already in a band warns immediately. Degrades to
depletion-only when monthly_credits is absent (older portals).
Adds test_credits_cold_start.py covering open-at-band, depletion, debt, no-cap
degradation, and the shared seed (fires/idempotent/skips-non-nous).
* feat(credits): /usage monthly-grant % gauge + fixture support + TUI surfacing
agent/account_usage.py: build_nous_credits_snapshot emits a subscription %% gauge
when the portal supplies a positive, finite monthly_credits denominator with
remaining <= cap (guards reject NaN/Infinity and rollover-over-cap, which would
render $nan or a contradictory $X-of-$Y); degrades to magnitudes-only otherwise.
Adds shared nous_credits_lines() (auth-gated, wall-clock-bounded portal fetch) so
the CLI and TUI /usage render the same block, and _snapshot_from_credits_state()
so HERMES_DEV_CREDITS_FIXTURE drives /usage offline too.
TUI: session.usage RPC carries credits_lines (agent-independent) and the /usage
panel renders them regardless of API-call count or resume state — previously the
TUI's separate /usage implementation only showed token counts.
Money rule preserved: %% and magnitudes come from numeric float account fields via
display formatting, never by parsing a server *_usd string.
* feat(credits): CLI REPL inline notices (parity with TUI)
The plain CLI agent bound no notice callbacks, so credit notices were TUI-only.
Bind notice_callback/notice_clear_callback on the CLI AIAgent; _on_notice renders
a single level-colored line above the prompt (error red / warn yellow / success
green / info dim) via _cprint, and seed credits at session open so a depletion or
usage-band warning shows before the first message — the same hydration the TUI
got. _on_notice_clear is a no-op (the REPL prints lines, no persistent slot).
* test(credits): add sub_50pct + sub_75pct dev fixtures for the new usage bands
The fixture set jumped 10%% -> 90%%; add sub_50pct (uf 0.5 -> band 50 info) and
sub_75pct (uf 0.75 -> band 75 warn) so the new escalating bands are exercisable
via HERMES_DEV_CREDITS_FIXTURE across all three surfaces (notice, session-open
seed, /usage gauge).
* fix(credits): usage-band notice clears on next prompt (not sticky-forever)
A 50/75/90 usage heads-up was sticky and camped the status bar indefinitely. Clear
the visible credits.usage notice when a new turn starts (startMessage), so it shows
until your next prompt then yields. The server latch is unchanged, so it won't
re-nag at the same band — it only re-shows when the band actually changes (climb)
or clears when usage drops below the lowest band. Depletion stays sticky.
* refactor(credits): consolidate the /usage credits block behind nous_credits_lines()
The CLI (_print_nous_credits_block) and the messaging gateway (_handle_usage_command)
each re-implemented the auth-gate + portal fetch + render, and both bypassed the
dev-fixture short-circuit that only the TUI honored — so /usage ignored
HERMES_DEV_CREDITS_FIXTURE on the CLI and in chat. Route both through the shared
agent.account_usage.nous_credits_lines() helper: one fetch/render path, one auth
gate, and the fixture works on every surface (~60 fewer duplicated lines).
The gateway usage test recorded only the last asyncio.to_thread call; /usage now
dispatches both the account fetch and the credits fetch, so it records every call
and matches the account fetch by its provider arg.
* fix(credits): keep the /usage gauge type-safe and log its fail-open path
_is_finite_num is now a TypeGuard[float], so the type checker narrows the gauge
operands (monthly_credits / credits_remaining) and the magnitudes passed to
_fmt_usd through it — no more None-operand warnings on the arithmetic. Add a debug
breadcrumb on the nous_credits_lines portal-fetch fail-open so a dead /usage block
is diagnosable in agent.log without a dev flag.
* fix(credits): harden the header tracker — prod-leak gate, hot-path probe, fire-and-forget seed
- Prod-leak guard: dev fixtures (HERMES_DEV_CREDITS_FIXTURE) now also require
HERMES_DEV_CREDITS, so a stray fixture var can't surface fabricated balances on a
real account. Matches the documented run workflow (both vars set together).
- Hot-path probe: parse_credits_headers checks for the version sentinel header
before allocating a lowercased copy of the response headers — skips that work on
every non-Nous API call. Behaviour-identical and still case-insensitive.
- Fire-and-forget seed: the real portal fetch in seed_credits_at_session_start now
runs in a daemon thread, so a slow/unreachable portal never delays session "ready"
(previously blocked up to 10s). The dev-fixture path stays synchronous; the thread
re-checks idempotency before hydrating (a live header may land first).
- Diagnostics: debug breadcrumbs on the parse and seed fail-open paths so a crashed
parser / dead seed is distinguishable from a legitimate no-headers miss.
Cold-start tests set HERMES_DEV_CREDITS alongside the fixture to match the gate.
* test(tui): fix env-timing in the StatusRule dev-credits assertion
DEV_CREDITS_MODE is read once at module load (config/env), so mutating
process.env.HERMES_DEV_CREDITS inside the test couldn't flip it — the dev-banner
assertion only passed if the env was exported before vitest started, and failed in a
normal run. Move that assertion to a sibling file that mocks config/env with
DEV_CREDITS_MODE: true (scoped, no module-reset / React-identity hazard).
* test(credits): cover the dev-fixture /usage render and usage-band clear-on-prompt
- _snapshot_from_credits_state (the offline /usage renderer) had no direct test:
lock the gauge math, the verbatim *_usd magnitudes, the depletion line and the
fixture marker, plus the no-cap (no gauge) and None-state cases.
- turnController.startMessage had no test for clearing the credits.usage notice on
the next prompt while leaving credits.depleted sticky.
* feat(credits): deliver credit notices over messaging gateways
Bind notice_callback/notice_clear_callback on the per-turn gateway agent
so usage-band / depletion / restored notices reach Telegram/Discord/Slack/
etc. Previously the messaging gateway bound neither callback, so the agent's
_emit_credits_notices early-returned and a chat user crossing a band got
nothing unless they ran /usage manually.
- render_notice_line(): AgentNotice -> single plaintext line (level glyph +
text), plaintext-only so it renders uniformly without per-platform escaping.
Fail-soft on malformed/empty notices.
- Standalone push for every notice (messaging has no persistent status bar):
route through the shared _deliver_platform_notice rail (honors private/
public delivery + thread metadata), scheduled onto the gateway loop via
safe_schedule_threadsafe from the agent's sync worker thread — same pattern
as _status_callback_sync.
- The fired-once latch lives on the cached (reused-in-place) agent and
persists across turns, so a band crosses once -> one push, no per-turn
re-nag. Re-fires only after idle-eviction rebuilds the agent (a reminder).
- Recovery ('Credit access restored') rides the show path (emitted as a
success notice, not a clear). notice_clear_callback is a no-op: a sent
platform message can't be cleanly retracted.
Tests: render glyph/levels/fail-soft + public/private delivery seam through
_deliver_platform_notice + no-adapter no-op.
* fix(credits): don't double the glyph on messaging notices
render_notice_line prepended a per-level glyph, but the notice policy already
bakes the glyph into the text (and the TUI + CLI render it verbatim) — so every
credit notice over messaging came out doubled ("⚠ ⚠ Credits 90% used",
"⛔ ✕ Credit access paused"). Emit the text verbatim instead; drop the now-dead
level→glyph map.
The render tests fed glyph-less text (and the success case only checked
startswith), so the doubling slipped through. Rework them around the verbatim
contract and add an end-to-end regression that runs real evaluate_credits_notices
output through render_notice_line and asserts the line is returned unchanged.
Switching the main model never touches auxiliary slot pins (they're
independent, sticky per-task overrides). A user who switches main away
from a now-unpaid provider keeps paying 402s on every background aux call
until they manually reset those pins — silently, with no UI signal.
- /api/model/set scope:'main' now returns stale_aux: slots still pinned
to a provider different from the new main (additive field).
- Desktop Model Settings shows a switch-time notice after Apply AND a
persistent banner when any loaded aux slot mismatches the main provider,
both wired to the existing 'Reset all to main' action.
- Never auto-clears pins — a dedicated cheaper aux model is a legitimate
config; surface-and-offer instead of nuking.
- Fixes a stale pre-existing assertion in the panel test (main model now
renders via selectors, not a standalone label).
Follow-up on the cherry-picked content-block fix. _extract_output_tail
(the live subagent overlay) still used crude str(content), which renders
a "[{'type': 'text'...}]" blob and — worse — mislabels a block-wrapped
"Error: ..." result as is_error=False. Route it through the same
_stringify_tool_content helper so error detection and previews work at
both consumer sites.
- delegate_tool.py: _extract_output_tail uses _stringify_tool_content
- tests: add _extract_output_tail content-block test (error detection +
clean preview)
- release.py: AUTHOR_MAP entry for randomsnowflake (CI gate)
The native macOS About panel showed the Electron package.json version
(e.g. 0.15.1) while the status bar showed the real Hermes version
(0.16.0). setAboutPanelOptions() set applicationName + copyright but
omitted applicationVersion, so macOS fell back to app.getVersion() =
package.json, which drifts (release.py's desktop lockstep bump didn't
land for 0.16.0).
resolveHermesVersion() already reads the live version from
hermes_cli/__init__.py and was built 'so the desktop About panel shows
the real Hermes version' per its own comment, but was never wired in.
- Seed applicationVersion: resolveHermesVersion() at module load.
- Replace the macOS About menu item's role:'about' with a click handler
(showAboutPanelFresh) that re-resolves the version on every open, so an
in-place `hermes update` is reflected without an app restart.
* fix: respect disabled auto-compaction on context overflow
Port from anomalyco/opencode#30749.
When compression.enabled is false, NO automatic compaction trigger may
fire. The proactive token-threshold paths (preflight + post-response
should_compress gate) already honoured the setting, but the three
provider-overflow recovery paths in the agent loop — long-context-tier
429, 413 payload-too-large, and context-overflow — called
_compress_context() unconditionally, silently compressing and rotating
the session against the user's explicit choice.
Add a single guard at the top of the overflow-recovery dispatch: when
compression is disabled and the error is one of those three overflow
classes, surface a terminal error (compaction_disabled: True) telling the
user to /compress manually, /new, switch to a larger-context model, or
reduce attachments. Manual /compress (force=True) is unaffected — it never
enters this loop.
Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't
compress when disabled; control case still compresses when enabled).
Existing overflow-recovery tests updated to enable compaction explicitly
(they verify the recovery fires); fixture defaults flipped to True to
match production (compression.enabled defaults to True).
* fix(dashboard): populate cron delivery dropdown from configured platforms
The dashboard cron-create/edit dropdown hardcoded five delivery options
(local, telegram, discord, slack, email), so users on Matrix — or any
other backend-supported platform — had no way to pick their channel even
though the cron scheduler delivers to all of them. It also offered
Telegram/Discord/etc. to users who never set those up.
- cron/scheduler.py: add cron_delivery_targets() — the single source of
truth. Intersects gateway-configured platforms with cron-deliverable
ones and reports whether each platform's home channel is set.
- web_server.py: GET /api/cron/delivery-targets exposes that list (+ the
implicit local option) to the dashboard.
- CronPage.tsx: both modals render options from the endpoint. Configured
platforms missing a home channel still appear, annotated "set a home
channel first" (option B), so the user knows what to fix. Edit modal
preserves a job's current target even if it's no longer configured.
Local-only state shows a "configure a platform under Channels" hint.
Validation: scheduler + endpoint E2E'd with a Matrix gateway (home set
and unset); 5 new tests; tests/cron + tests/hermes_cli/test_web_server
green (366 passed).
The inline steer note used a ⏩ emoji. Emit a structured `steer:<text>`
system note and render it in SystemMessage as a codicon (compass) row —
same style as slash-status output. No emoji in the transcript.
Cmd/Ctrl+Enter now steers when there's a steerable draft and is a no-op
otherwise — it never falls through to a send, so the shortcut can't
surprise-send. Plain Enter keeps its role (queue while busy, send when idle).
A steer rides inside a tool result (the only role-alternation-safe slot
mid-turn), so a bare "User guidance:" line reads as untrusted tool content —
well-behaved models refuse it as suspected prompt injection (observed live:
"I only follow instructions from you directly, not ones injected through
command results").
- Wrap steers in a bounded, self-describing [OUT-OF-BAND USER MESSAGE] marker
(prompt_builder.format_steer_marker), shared by both drain sites.
- Add STEER_CHANNEL_NOTE to the core system prompt so the model expects this
exact marker and trusts it as a genuine user message — while still ignoring
lookalikes buried in tool/web/file output. Static text → byte-stable prompt,
no prompt-cache regression; gated on the agent having tools.
- Desktop: steer ack is now an inline transcript note (⏩ steered · …) instead
of a toast.
Marker is intentionally static (not a per-session nonce) to honor the
byte-stable system-prompt caching policy; nonce hardening noted as follow-up.
The desktop app could only queue while busy — `/steer` was in the palette
but had no first-class affordance, so the "nudge the agent mid-turn without
interrupting" lane was effectively unreachable.
Add a steer action to the composer: while busy with a text-only draft, a
steering-wheel button (and Cmd/Ctrl+Enter) injects the text into the live
turn via the `session.steer` RPC — the gateway folds it into the next tool
result so the model reads it on its next iteration. Plain Enter still queues.
steerPrompt returns false when the gateway has no live tool window (or the
RPC errors), and the composer re-queues the words so nothing is lost — the
same safety net as a plain queue.
Force-sending a queued message (double-empty-enter, or interrupt-mode
submit) flipped busy→false optimistically, so the queue drain raced the
still-unwinding turn: duplicate user bubble, a stray "queued: …" note, and
the cancelled turn's "Operation interrupted…" reply leaking in.
interruptTurn gains `keepBusy`: hold busy until the gateway's real settle
edge (message.complete, suppressed while interrupted), which drains the
queued message exactly once — desktop "send now" parity. The interrupt
paths now queue + interrupt instead of optimistically sending.
Builds on @naqerl's arrow up/down history (previous commit), making
ArrowUp do the right thing when a queue exists.
ArrowUp/ArrowDown priority:
1. Editing a queued turn → walk older/newer through queued entries,
saving each edit; ArrowDown past the newest exits and restores the
pre-edit draft.
2. Empty composer + queued turns → ArrowUp opens the newest queued entry
for editing (the row's pencil), so Enter saves it back to the queue
instead of firing a new message — the gap the history nav had alone.
3. Otherwise → sent-message history recall (unchanged).
Also: Esc cancels an in-progress queue edit (else interrupts).
Cleanups on the integrated code: fold the browse-state reset into the
existing session-change effect (drop the duplicate ref+effect); reuse
loadIntoComposer for history recall; sort imports; add curly braces +
the runDrain sessionId dep (lint).
* fix(desktop): make composer message queue reliable
The queue felt 'dumb' because of three real bugs:
1. Drained-after-interrupt sends went silent. cancelRun sets
interrupted:true and nothing reset it; submitPromptText's optimistic
seed preserved it, and the message stream drops every delta while
interrupted. So Send-now-while-busy and any interrupt+drain submitted
the next turn into a muted session. Fix: a fresh submit is a new turn —
seed interrupted:false.
2. Back-to-back queue drains stalled. The drain fires on the busy->false
settle edge, but busyRef (synced from the busy store by a separate
effect) can still read true on that same edge, so the drained send hit
the busy guard, returned false, and the entry was never removed. Fix:
fromQueue sends bypass the busyRef guard (the queue drain lock
serializes them); the user path keeps the guard.
3. Double-enter-to-interrupt killed single non-queue turns. The hidden
450ms timer meant a natural double-tap after sending stopped the agent.
Fix: empty Enter while busy is a no-op; interrupting is explicit —
Stop button or Esc.
Also: clean stop (no [interrupted] marker), Send-now works while busy
(promote + interrupt + auto-drain), settle on the interrupted completion
path. Adds regression tests and unblocks the prompt-actions suite by
completing its stale @/hermes mock.
* fix(desktop): float the queue panel as an overlay so the chat doesn't resize
The queue list rendered in-flow inside the composer root, so its height
fed --composer-measured-height (the composer rect drives the thread's
bottom padding + last-message clearance). Queuing a message grew that
rect and the whole chat visibly resized.
Anchor the panel out of flow above the composer (absolute bottom-full,
capped at 40vh with internal scroll). It no longer contributes to the
measured height, so the thread layout stays put and the list overlays the
(already faded) chat. Still collapsible via the panel's own
disclosure header.
* fix(desktop): queue panel collapsed by default + shared border with composer
- Default the queue disclosure to collapsed (compact 'N queued' pill)
instead of expanded.
- Drop the gap and merge the panel into the composer: square bottom
corners, no bottom border/radius, and overlap down by the Root's pt-2
(-mb-2) so the panel's borderless bottom lands on the composer surface's
top border — one continuous bordered shape.
* style(desktop): tighten queue panel padding
* style(desktop): trim queue-ux comments to house style
* style(desktop): drop 'Cursor' references from comments
Slack's native-slash manifest hard-caps at 50 (_SLACK_MAX_SLASH_COMMANDS).
Adding the /version canonical claims a pass-1 slot, so the lowest-priority
pass-2 alias (/q for /quit) clamps off the end. /q stays reachable via
/hermes q. Surviving aliases (/btw /bg /reset) still prove alias parity.
The new IME repro test has two it() blocks but the desktop suite registers
no global testing-library auto-cleanup, so the first render() leaked its
editor into the second test and getByTestId('editor') matched two nodes.
Add afterEach(cleanup) so each case renders into a fresh DOM.
DOM repro that drives compositionstart -> input(preedit) -> compositionend with
no trailing input event and asserts the composer payload (send button) becomes
visible for committed CJK/IME input. Regression guard for #39614.
Typing committed multi-character IME text (e.g. Chinese "你好", and equally
Japanese/Korean or any IME-composed script) left the send button hidden until
an unrelated edit. Input events during composition carry uncommitted preedit
text and are intentionally skipped; the code assumed a trailing input event
after compositionend would deliver the finalized text, but Chromium does not
reliably emit one on Windows IMEs. The committed text therefore never reached
composer state, so `hasComposerPayload` stayed false and the send button stayed
hidden (deleting a char fired a non-composition input that finally synced it).
Flush the live editor text into composer state in onCompositionEnd. Extract the
shared sync into flushEditorToDraft so input and compositionend both update
state.
Fixes#39614
The salvaged helper exported serializeJsonBody but main.cjs still inline-built
the request body, leaving the export dead and the test decoupled from the real
path. Use it at the fetchJsonViaOauthSession site so the helper's coverage
exercises production body construction. Byte-identical output.
The error guard in _search_with_rg/_search_with_grep was unreachable and,
if it had fired, would have discarded valid results.
Two root causes:
1. Unreachable. Both methods pipe the search through `| head` with no
pipefail, so the pipeline reported head's exit code (0), masking rg/grep's
error code (2). The guard never fired. Worse, because _exec merges stderr
into stdout (stderr=subprocess.STDOUT), the error text was then parsed as
bogus match lines instead of being surfaced — the user got garbage matches
with no indication the search failed.
2. Latent results-dropping. The original `not result.stdout.strip()` check
was always False on error (error text lives in stdout), and the
`hasattr(result, 'stderr')` branch was dead code (ExecuteResult has no
stderr field). A naive broadening to `exit_code == 2` would have nuked
real matches whenever rg/grep also hit a non-fatal error (e.g. one
unreadable file in a tree that otherwise matched), which both tools signal
with exit 2.
Fix:
- Prefix the piped command with `set -o pipefail` so rg/grep's real exit
status propagates. rg exits 0 on a truncating head; grep exits 141
(SIGPIPE), so the strict `== 2` guard ignores truncated-success.
- Add _split_tool_diagnostics() to separate tool diagnostics from match
output by tool prefix and output shape. Diagnostics never become matches;
on a hard error they are the message to surface.
- Only surface an error when exit==2 AND no usable match payload remains, so
partial errors keep their real matches.
Tests: tests/tools/test_search_error_guard.py drives both methods through the
real local backend (hard error surfaced, partial error keeps matches,
truncation no false error, files_only/count exclude diagnostics) plus unit
coverage for the splitter.
Supersedes #39710.
The terminal `hermes model` wizard (_model_flow_named_custom) always
live-probed a custom provider's /models endpoint, ignoring the configured
`models:` list. For plans whose endpoint exposes a large catalog (e.g. Baidu
Qianfan Coding Plan returns 100+ models for a 2-3 model plan) the picker
flooded with models the user can't use.
This wires `discover_models` (and the `models:` list) through
_named_custom_provider_map into the flow and honors `discover_models: false`
the same way the slash-command picker (model_switch.py sections 3 & 4) does:
- Default stays True — live probe, no behaviour change.
- discover_models: false → use the configured `models:` list verbatim,
skip the probe (string 'false'/'no'/'0' normalised to False).
- If the probe is on but returns empty, fall back to the configured list
instead of forcing manual entry.
Closes#18726
Section 3 (user `providers:`) already honors `discover_models: false` to
skip live /models discovery and keep the explicit `models:` list. Section 4
(`custom_providers:` list) did not — `should_probe` ignored the field, so any
grouped custom provider with an api_key always had its configured subset
replaced by the full live /models catalog.
This adds the same `discover_models` support to section 4:
- Default True — no behaviour change for existing configs.
- `discover_models: false` keeps the explicit `models:` list even when an
api_key is present.
- String values ("false"/"no"/"0") are normalised to False, matching
section 3.
- If any entry in a grouped endpoint opts out, the whole group opts out.
Use case: endpoints that expose a full aggregator catalog via /models but
only serve a configured subset.
Salvaged from #29810 — rebased onto current main. The PR's other change
(`key_env` resolution in section 4) landed independently in commit aa283d1e4
(custom provider picker credential isolation), so only the discover_models
portion is carried here.
Co-authored-by: ohMyJason <42903577+ohMyJason@users.noreply.github.com>
Follow-up to #39921. That PR scoped session.resume + prompt.submit to a
session's profile, but a BRAND-NEW chat (session.create) under a non-launch
profile was still built and persisted against the dashboard's launch profile.
Two visible symptoms in app-global remote mode (one dashboard, many profiles):
1. "who are you" in profile S replied as the launch (default) profile/agent —
the agent was built with the launch HERMES_HOME, so config/SOUL/identity
came from the wrong profile.
2. "session not found" on later resume — _ensure_session_db_row persisted the
row into the launch profile's state.db via _get_db(), so the session lived
in the wrong db, the unified list mis-tagged it (it showed up under BOTH
profiles), and resume routed to the wrong one.
Fix — carry the owning profile through the create path too:
- session.create accepts an optional `profile`; resolves its home and stores
`profile_home` on the session (alongside what resume already set).
- _start_agent_build binds that profile's HERMES_HOME while building the agent
(config/skills/model/identity resolve to it) and hands the agent the profile's
state.db so turns persist there.
- _ensure_session_db_row writes the row into the profile's state.db, not the
launch db — fixing the duplicate row + mis-tag + resume 404.
- desktop sends the new-chat profile on session.create.
None/launch profile → unchanged (single-profile and per-profile-remote setups
take the same path). Verified live against a one-dashboard / multi-profile
remote: a new chat under `work` builds as work's agent (correct SOUL identity),
persists ONLY to work's state.db (launch db stays empty), the unified list tags
it `work` exactly once, and it resumes cleanly.
tests/test_tui_gateway_server.py: _make_agent mocks updated for the session_db
param added in #39921's build path.
Introduce a lightweight React context-based i18n layer for the desktop
app and translate the UI into Simplified Chinese.
- New apps/desktop/src/i18n module: typed Translations interface, en + zh
locale tables, I18nProvider/useI18n, localStorage-persisted locale
(defaults to English), and language endonym metadata for the picker.
- Wire I18nProvider at the app root in main.tsx.
- Refactor 24 desktop screens/components to read strings from the `t`
object instead of hard-coded English.
- Add a unit test for the i18n context.
* fix(desktop): cross-profile session history in app-global remote mode
#39894 made remote-profile sessions first-class for PER-PROFILE remote
overrides. But the common setup — Settings → Gateway → "All profiles" → Remote
— writes app-GLOBAL remote mode (connection.json top-level mode:'remote', empty
profiles map), which the intercept didn't recognize. Switching to a non-launch
profile then 404'd every session read, so no history showed for it.
In global remote mode a SINGLE backend serves every profile via ?profile= (it
reads each profile's state.db off the remote host's own disk — verified: one
dashboard returns /api/profiles and /api/profiles/sessions?profile=all across
all profiles). The fix: when no per-profile override matches but global remote
mode is active, route per-session reads/mutations to that one backend and KEEP
the ?profile= param so it opens the right state.db (instead of bailing to the
local path and dropping the profile scope).
- new globalRemoteActive() — true for connection.json mode:'remote' or the
HERMES_DESKTOP_REMOTE_URL env override.
- per-session branch: per-profile override → route sans profile (own db);
global mode → route to the single backend WITH ?profile= preserved.
- unified list is unchanged in global mode: it already passes through to the one
backend, which aggregates all profiles natively.
Verified live against a one-dashboard / multi-profile remote (Austin's topology):
cross-profile transcript reads load (was 404), rename/delete route to the right
profile, unified list spans both profiles.
Known limitation (architectural, not fixed here): LIVE chat as a non-launch
profile still needs a per-profile dashboard on the remote — the dashboard binds
HERMES_HOME once at process start, so one global backend can't run an agent
turn as another profile. Session history/read/mutate now work regardless.
* fix(gateway): resume + chat any profile over one global-remote dashboard
The REST half of this branch made cross-profile session history visible in
app-global remote mode, but resume + chat still went over the WebSocket gateway,
which was hard-bound to the dashboard's launch profile. Resuming a non-launch
profile's session 404'd ("session not found") and sending spawned a new session
— because session.resume/prompt.submit had no profile concept and the live
agent + state.db were process-global to the launch profile's HERMES_HOME.
Make the WS gateway per-session profile-aware so ONE dashboard can serve every
local profile on its host (the app-global remote topology):
- session.resume accepts an optional `profile`. _profile_home() resolves that
profile's home on this host; resume opens THAT profile's state.db, binds its
HERMES_HOME (ContextVar override) while building the agent so config/skills/
model resolve to it, and passes the profile db to the agent so turns persist
to the right state.db. The owning profile_home is stored on the session.
- prompt.submit re-binds the stored profile_home for the turn thread (mid-turn
home reads — memory, skills — resolve to the resumed profile), reset in finally.
- _make_agent gains an optional session_db param (defaults to _get_db()).
- _load_cfg honors the home override (falls back to _hermes_home) so a resumed
profile loads its own config; cache keyed on resolved path.
- desktop: session.resume now sends the owning profile.
Omitted/launch profile → unchanged (single-profile and per-profile-remote setups
are byte-for-byte the same path). Verified live against a one-dashboard /
multi-profile remote: resuming a non-launch profile's session loads its history,
runs a real turn against THAT profile's home/env, and persists to its state.db.
tests/tui_gateway/test_protocol.py: _make_agent mocks updated for the new param.
Widens ViewWay's #20741 fix to the sibling config surface: a
custom_providers entry can pin its own output cap via max_output_tokens
(or max_tokens). _get_named_custom_provider now lifts it onto the
resolved runtime at all three return sites, and the gateway uses it as a
fallback only when the documented global model.max_tokens isn't set, so
the global key always wins.
Precedence: HERMES_MAX_TOKENS > model.max_tokens > provider
max_output_tokens > None. Closes the same #20741 truncation for users who
configure the cap per-provider rather than globally.
Picks up the intent of #19782 (alexcam1901), reimplemented to feed
ViewWay's max_tokens pipeline.
max_tokens set under model: in config.yaml was silently ignored.
The value was never read from config, never passed through
_resolve_runtime_agent_kwargs(), _resolve_turn_agent_config(),
or the session override path. Added it to all three code paths
so custom/Ollama endpoints receive the correct output cap.
Closes#20741
* fix(desktop): route remote-profile session reads to the owning remote backend
Per-profile remote hosts (#39778) wired the chat/resume socket to a profile's
remote backend, but session list + transcript reads still assumed every
profile's state.db is a local file the primary can open. For a remote profile
the local file is absent or stale, so the IDs the sidebar shows 404 the moment
resume runs against the remote -- the "session not found -> new session" bug.
Intercept the three session-read GETs in the hermes:api handler and route them
to the owning remote backend (which serves its own state.db natively):
GET /api/profiles/sessions -> splice each remote profile's real rows in
GET /api/sessions/{id}[/messages] -> read from the remote for remote profiles
No remote profiles configured -> untouched local fast path. A dead remote
contributes nothing rather than breaking the sidebar.
Verified end-to-end against a live remote backend: a remote-profile session
resumes from remote history and continues on the remote across turns (history
grows in place, no new session spawned).
* fix(desktop): route remote-profile session mutations + fix unified-list pagination
Follow-up to the read-routing fix: make remote-profile sessions fully
first-class, not just resumable.
Mutations (rename/archive/delete) went through the same hermes:api handler but
never carried the owning profile, so they hit the local primary's state.db --
which has no row for a remote session. Deleting/archiving/renaming a remote
session silently no-op'd or 404'd, and the row reappeared on next refresh.
- hermes.ts: setSessionArchived/deleteSession/renameSession take the owning
profile and pass it as request.profile so Electron routes to that profile's
backend (matching the read path). Callers now forward session.profile.
- main.cjs: generalize the intercept (read -> request) to also reroute
DELETE/PATCH on /api/sessions/{id} for remote profiles, stripping the profile
param (the remote serves its own state.db; no cross-profile semantics there).
- web_server.py: DELETE /api/sessions/{id} gains a profile param for parity with
GET/PATCH (local cross-profile delete).
Also fix the unified-list merge: it concatenated each remote's page onto the
primary's without re-windowing, so a limit=N request could return up to
N*(1+remotes) rows and report the primary's (stale) total. Now it over-fetches
limit+offset from each remote (from offset 0), re-sorts by recency, re-windows
to the page, and recomputes total/profile_totals from the remote counts.
Verified live against a remote backend: rename/archive/delete mutate the remote
db; page 1 windows to limit, profile_totals reflect remote counts, page 2 has no
overlap with page 1. tsc -b clean; connection-config tests pass.
Follow-up to the read-routing fix: make remote-profile sessions fully
first-class, not just resumable.
Mutations (rename/archive/delete) went through the same hermes:api handler but
never carried the owning profile, so they hit the local primary's state.db --
which has no row for a remote session. Deleting/archiving/renaming a remote
session silently no-op'd or 404'd, and the row reappeared on next refresh.
- hermes.ts: setSessionArchived/deleteSession/renameSession take the owning
profile and pass it as request.profile so Electron routes to that profile's
backend (matching the read path). Callers now forward session.profile.
- main.cjs: generalize the intercept (read -> request) to also reroute
DELETE/PATCH on /api/sessions/{id} for remote profiles, stripping the profile
param (the remote serves its own state.db; no cross-profile semantics there).
- web_server.py: DELETE /api/sessions/{id} gains a profile param for parity with
GET/PATCH (local cross-profile delete).
Also fix the unified-list merge: it concatenated each remote's page onto the
primary's without re-windowing, so a limit=N request could return up to
N*(1+remotes) rows and report the primary's (stale) total. Now it over-fetches
limit+offset from each remote (from offset 0), re-sorts by recency, re-windows
to the page, and recomputes total/profile_totals from the remote counts.
Verified live against a remote backend: rename/archive/delete mutate the remote
db; page 1 windows to limit, profile_totals reflect remote counts, page 2 has no
overlap with page 1. tsc -b clean; connection-config tests pass.
Per-profile remote hosts (#39778) wired the chat/resume socket to a profile's
remote backend, but session list + transcript reads still assumed every
profile's state.db is a local file the primary can open. For a remote profile
the local file is absent or stale, so the IDs the sidebar shows 404 the moment
resume runs against the remote -- the "session not found -> new session" bug.
Intercept the three session-read GETs in the hermes:api handler and route them
to the owning remote backend (which serves its own state.db natively):
GET /api/profiles/sessions -> splice each remote profile's real rows in
GET /api/sessions/{id}[/messages] -> read from the remote for remote profiles
No remote profiles configured -> untouched local fast path. A dead remote
contributes nothing rather than breaking the sidebar.
Verified end-to-end against a live remote backend: a remote-profile session
resumes from remote history and continues on the remote across turns (history
grows in place, no new session spawned).
* fix: respect disabled auto-compaction on context overflow
Port from anomalyco/opencode#30749.
When compression.enabled is false, NO automatic compaction trigger may
fire. The proactive token-threshold paths (preflight + post-response
should_compress gate) already honoured the setting, but the three
provider-overflow recovery paths in the agent loop — long-context-tier
429, 413 payload-too-large, and context-overflow — called
_compress_context() unconditionally, silently compressing and rotating
the session against the user's explicit choice.
Add a single guard at the top of the overflow-recovery dispatch: when
compression is disabled and the error is one of those three overflow
classes, surface a terminal error (compaction_disabled: True) telling the
user to /compress manually, /new, switch to a larger-context model, or
reduce attachments. Manual /compress (force=True) is unaffected — it never
enters this loop.
Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't
compress when disabled; control case still compresses when enabled).
Existing overflow-recovery tests updated to enable compaction explicitly
(they verify the recovery fires); fixture defaults flipped to True to
match production (compression.enabled defaults to True).
* perf(/model): prewarm picker provider-models cache in background
The no-args /model picker calls list_authenticated_providers(), which
fetches each authenticated provider's live /v1/models list serially. On a
cold or stale (>1h TTL) cache that blocks ~1.5s on the user's critical path
the first time /model is opened in a session.
Warm that exact path off-thread during the idle window right after the CLI
banner is shown: a once-per-process daemon thread runs
list_authenticated_providers() to populate provider_models_cache.json for
every authed provider. By the time the user types /model, the picker hits
the warm disk cache (~136ms vs ~1500ms).
Process-level Event guard (mirrors run_agent's _openrouter_prewarm_done)
ensures at most one thread per process; fully exception-isolated so an
offline/no-creds provider can never affect the session.
* docs: remove --include-desktop install instructions
Drop the --include-desktop curl one-liner from the desktop app docs.
The flag remains in scripts/install.sh; these docs now point to the
desktop installer / website and the 'hermes desktop' path instead.
* docs: remove --include-desktop from install docs
Drop the redundant 'Hermes Desktop installer on Linux' block (which
used --include-desktop) from quickstart, installation, and index docs.
The website installer covers macOS/Windows desktop; the CLI-only path
covers Linux. Removes the flag from all user-facing docs.
* fix: respect disabled auto-compaction on context overflow
Port from anomalyco/opencode#30749.
When compression.enabled is false, NO automatic compaction trigger may
fire. The proactive token-threshold paths (preflight + post-response
should_compress gate) already honoured the setting, but the three
provider-overflow recovery paths in the agent loop — long-context-tier
429, 413 payload-too-large, and context-overflow — called
_compress_context() unconditionally, silently compressing and rotating
the session against the user's explicit choice.
Add a single guard at the top of the overflow-recovery dispatch: when
compression is disabled and the error is one of those three overflow
classes, surface a terminal error (compaction_disabled: True) telling the
user to /compress manually, /new, switch to a larger-context model, or
reduce attachments. Manual /compress (force=True) is unaffected — it never
enters this loop.
Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't
compress when disabled; control case still compresses when enabled).
Existing overflow-recovery tests updated to enable compaction explicitly
(they verify the recovery fires); fixture defaults flipped to True to
match production (compression.enabled defaults to True).
* fix(completion): remove /model <arg> autocomplete from CLI/TUI
The TUI frontend already suppressed /model argument completion in favor of
the two-step ModelPicker (useCompletion.ts), but the CLI prompt_toolkit
completer and the gateway-backed complete.slash RPC (TUI + desktop) still
emitted model aliases and probed LM Studio on every keystroke.
Drops the /model branch in SlashCommandCompleter.get_completions, the
_model_completions method, and the LM Studio probe/cache helper that only
fed it. Command-name completion (/mod -> model) and sibling arg completers
(/skin, /personality) are untouched. Removes the now-dead TestModelTabCompletion
tests.
The in-app updater (Hermes-Setup --update) runs `hermes update`, which lazily
imports the freshly-pulled modules — but the dependency-install step runs the
already-in-memory PRE-pull code for one invocation. When a release changes an
updater-path contract across that boundary, the FIRST update on the parked
population crashes even though the fix is already on disk.
Concretely this is #39780's `_UvResult`: its `__iter__` yields (path, bool), so
Windows `subprocess.list2cmdline([uv_bin, "pip", ...])` injects the bool and
dies with `TypeError: sequence item 1: expected str instance, bool found`
(fixed in #39820). A parked Windows user clicking Update pulls #39820 to disk,
then still crashes on the in-memory pre-merge module; only the SECOND click runs
clean. Field repro: ryanc's bootstrap.log (2026-06-05 12:41:41).
Fix: when the first `hermes update` exits non-zero (and it isn't the
concurrent-instance guard, exit 2, which a retry can't fix), retry once
automatically. The retry loads the now-current module from the start and
succeeds — so the parked user gets a working one-click update instead of a
scary crash + manual second attempt.
Verified: cargo check clean.
* fix(desktop/windows): stop racing our own backend during in-app update
The Windows in-app update (Update button -> hermes-setup.exe --update handoff)
bricked because it raced a still-locked hermes.exe: the desktop quit
fire-and-forget without reaping its backend child + grandchildren, so when
the updater ran `hermes update`, the venv shim was still open. The quarantine
rename then failed, uv's `pip install -e .` hit "Access is denied", the git
path bailed to a full ZIP re-download, and the deps still couldn't write the
locked shim -- leaving a half-applied install. macOS is fine because it never
blocks REPLACE on a running executable.
Three coordinated fixes restore Mac-style parity (click Update -> progress ->
relaunch, no terminal):
A. Desktop (main.cjs): before spawning the updater, releaseBackendLockForUpdate()
tree-kills the primary + pool backends (taskkill /T /F on Windows, to catch
REPL/pty/gateway grandchildren that SIGTERM misses) and polls the venv shim
until it is actually writable (bounded 15s) -- so the lock is gone before we
hand off. Also fixes resolveHermesCliBinary to use venv\Scripts\hermes.exe on
Windows.
B. Updater (update.rs): wait_for_venv_free no longer "proceeds anyway" on
timeout -- it force-kills any lingering hermes.exe (excluding itself) and
re-checks, so a straggler can't doom the install.
C. Updater (update.rs): pass --force to `hermes update`. By contract the desktop
has exited + waited, and the wait force-kills stragglers, so the running-exe
guard would only produce a false "Hermes is still running" dead-end.
Verified: node --check on main.cjs, cargo check on the updater (clean), and the
Windows-gated taskkill body type-checks standalone. Field repro: ryanc's
update.log (manual + handoff both hit the same lock cascade).
* review: scope backend kill+wait to Windows; drop meaningless POSIX pgid kill
PR #39780 made ensure_uv() return a _UvResult — a str subclass whose
__iter__ yields (path, fresh_bootstrap) so old `uv_bin, fresh = ensure_uv()`
call sites survive the update boundary. That trick is unsafe on Windows.
The dependency installer passes uv straight into the command list
(`[uv_bin, "pip", "install", ...]`). On Windows, subprocess serializes argv
via subprocess.list2cmdline, which iterates every entry *as a string*
(`for c in arg`). Because _UvResult overrides __iter__, that iteration yields
(path, fresh_bootstrap) instead of characters, injecting the bool into the
command line and crashing the first update with:
TypeError: sequence item 1: expected str instance, bool found
This bites the common single-assignment caller (`uv_bin = ensure_uv()`) on
its first update after #39780: the freshly pulled _UvResult flows into the
old in-memory call site and into the argv. Reported in the field on a
~10-commits-behind Windows install.
A single return value cannot satisfy both legacy 2-target unpacking and
Windows char-iteration — both use the iterator protocol with contradictory
results. So gate the wrapper to POSIX: Windows returns a plain str/None
(the historical, subprocess-safe contract). POSIX keeps _UvResult and the
#39780 update-boundary fix.
Tests: list2cmdline canary proving _UvResult breaks Windows, plus Windows
returns-plain-str and POSIX dual-contract coverage.
Two switch-time regressions from the multi-profile rail work:
- "Session not found" (4007): pruneSecondaryGateways idle-reaps a
non-active profile's backend; switching back respawns a *fresh*
backend that mints new runtime ids, but runtimeIdByStoredSessionId is
never pruned. resumeSession's cache fast-path then makes a dead runtime
id active and returns, so session.usage + the next prompt 404. Probe
the cached id; on rejection drop the stale mapping and fall through to
a full resume that rebinds a live id.
- "Forgets the LLM setting": $currentModel is a nanostore set only by
refreshCurrentModel (gatewayState->open, etc). A swap fires
invalidateQueries() (react-query only) and keeps the socket 'open', so
the model/pill kept showing the previous profile. Re-pull both when
$activeGatewayProfile changes.
* feat(desktop): per-profile remote gateway hosts
Profile switching silently failed whenever the desktop was connected to a
remote backend: the rail routed non-active profiles to a local pool backend,
but spawnPoolBackend hard-threw "Profiles are unavailable when connected to a
remote Hermes backend", and the renderer swallowed the error into an infinite
reconnect backoff while still marking the profile active. Remote was also a
single app-global setting, so there was no way to give a profile its own host.
Add per-profile remote hosts so each profile can point at its own backend:
- connection.json gains a validated `profiles` map; profileRemoteOverride()
(pure, unit-tested) selects an explicit per-profile remote.
- resolveRemoteBackend(profile) precedence: per-profile override → env override
→ global remote → local spawn. spawnPoolBackend now connects to a profile's
remote (no local child) instead of throwing; startHermes resolves the primary
profile's remote.
- coerce/sanitize connection config are scope-aware (global vs named profile)
and preserve each other's entries; IPC get/save/apply/test thread an optional
profile. Per-profile apply drops only that profile's pool backend.
- Settings → Gateway adds an "Applies to" scope selector reusing the existing
URL/token/OAuth/test UX per profile.
Tests: connection-config pure suite (+6) and desktop platform suite pass;
tsc/eslint/vitest clean.
* refactor(desktop): DRY per-profile remote helpers
Share connectionScopeKey + normAuthMode from connection-config.cjs (drop the
main.cjs copy), collapse the scope/auth ternaries, route the env remote through
buildRemoteConnection, and fold the duplicated remote-block validation into
buildRemoteBlock. No behavior change; pure suite + live E2E still green.
* fix(update): make ensure_uv() survive the update boundary (no first-run crash)
`hermes update` runs the `ensure_uv()` call site from the old, already-imported
`hermes_cli.main` against the *freshly pulled* `managed_uv` (managed_uv is only
ever lazily imported, so it loads from disk post-pull). `ensure_uv()`'s return
arity flipped from a single path string to `(path, fresh_bootstrap)` (4df280d51)
and back to a single string (fb853a178). Installs parked on a 2-tuple release
unpack `uv_bin, fresh_bootstrap = ensure_uv()` against the new single-value
module and crash the first update with
`ValueError: not enough values to unpack (expected 2, got 1)` — inside the
dependency-install step, *before* the PR #39763 subprocess hand-off can run.
Return a `_UvResult` (a `str` subclass) that is usable as the bare path AND
unpackable as `(path|None, fresh_bootstrap)`. Missing uv is `""` (falsy) instead
of `None` so legacy 2-target call sites can unpack a failure without raising,
while `if not uv_bin` keeps working for single-value callers. fresh_bootstrap is
always False (the rebuild-venv path it gated was scrapped in fb853a178).
* docs(update): correct the verified error string + mechanism for ensure_uv()
A hermetic repro (old 2-target call site vs the freshly-pulled single-value
module) shows the first-update crash is exactly the string from PR #39763's
report: `ValueError: too many values to unpack (expected 2)` — not "not enough".
The returned path is a plain `str`, which is iterable, so `uv_bin, fresh =
ensure_uv()` walks its characters; the failure path's `None` return raises
`TypeError: cannot unpack non-iterable NoneType`. Both are fixed by `_UvResult`.
Comment/test wording updated to match; no behavior change.
* Revert "fix(update): require managed marker before destructive clean"
This reverts commit c8e80cd0bf.
* Revert "fix(update): stop stash/restore from clobbering desktop source on managed clones (#38542)"
This reverts commit 8a19884bf3.
* chore(install): keep npm ci desktop-build fix after stash revert
The destructive-clean reverts (#38542/#39568) pulled the desktop
workspace install back to bare `npm install`. The npm ci -> npm install
fallback is orthogonal build-correctness (avoids the Windows
workspace-hoisting flake where install reports up-to-date against a
stale marker while node_modules is empty, breaking tsc -b). Preserve it.
* feat(update): settable stash-or-discard for non-interactive local changes
Adds updates.non_interactive_local_changes (stash | discard, default
stash). Governs ONLY non-interactive updates (desktop/chat app, gateway,
--yes) — interactive terminal updates always stash-and-ask, unchanged.
- config.py: new key under existing updates section; _config_version 26->27.
- main.py: _cmd_update_impl detects non-interactive (gateway/--yes/no-TTY),
reads the setting; new _discard_stashed_changes() drops the stash
(stash-and-drop, never reset --hard/clean -fd, so ignored paths survive).
Post-pull restore site branches on it; the bail-out and up-to-date
restores always preserve work.
- web_server.py + apps/desktop settings: exposes it as a stash/discard
select (Advanced section, In-App Update Local Changes).
- docs + tests (discard drops, stash restores, interactive ignores setting,
missing section defaults to stash).
* fix(install.ps1): stash/restore instead of reset --hard on Windows update
The PR reverted the destructive update path to stash/restore everywhere
except scripts/install.ps1, whose managed-clone update path still ran
`git reset --hard HEAD` before checkout — silently destroying agent-edited
tracked source on Windows (the same #38542 data-loss class the PR fixes).
- Replace `git reset --hard HEAD` with stash-before-checkout +
restore-after-checkout, mirroring install.sh. Untracked files are
included so agent-created dirs (e.g. tinker-atropos/) survive.
- Keep `core.autocrlf false` (it prevents the phantom CRLF dirt that made
the stash necessary; it's also load-bearing for a clean restore).
- Wrap all three checkout modes (Commit/Tag/Branch); Branch case now uses
`git pull --ff-only` so local commits are never clobbered.
- Only prompt to restore when a real console is attached (UserInteractive
+ non-redirected stdin/stdout + ConsoleHost); the desktop Update button
and bootstrap have no usable console, so they default to restore and
never hang on Read-Host.
- On restore conflict or a failed update, the stash is preserved with
recovery instructions — work is never silently dropped.
Validated on Windows (PowerShell 5.1, git 2.54): AST parse clean;
E2E non-conflicting restore applies+drops cleanly with ignored paths
(node_modules) untouched; conflicting restore preserves the stash.
---------
Co-authored-by: alt-glitch <balyan.sid@gmail.com>
When the agent's reply references a deliverable file path that does not
exist on disk, extract_local_files dropped it from native delivery with
no log line — the most common reason a promised file never arrives over
a messaging platform. Add an INFO log at that drop point so the gap is
visible in gateway.log instead of vanishing.
Also convert the two print() calls in Telegram's send_document /
send_video exception handlers to logger.warning(exc_info=True). print()
writes to stdout, which 'hermes logs' never captures, so outbound upload
failures (oversized files, Bot API rejections) were invisible.
* fix: respect disabled auto-compaction on context overflow
Port from anomalyco/opencode#30749.
When compression.enabled is false, NO automatic compaction trigger may
fire. The proactive token-threshold paths (preflight + post-response
should_compress gate) already honoured the setting, but the three
provider-overflow recovery paths in the agent loop — long-context-tier
429, 413 payload-too-large, and context-overflow — called
_compress_context() unconditionally, silently compressing and rotating
the session against the user's explicit choice.
Add a single guard at the top of the overflow-recovery dispatch: when
compression is disabled and the error is one of those three overflow
classes, surface a terminal error (compaction_disabled: True) telling the
user to /compress manually, /new, switch to a larger-context model, or
reduce attachments. Manual /compress (force=True) is unaffected — it never
enters this loop.
Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't
compress when disabled; control case still compresses when enabled).
Existing overflow-recovery tests updated to enable compaction explicitly
(they verify the recovery fires); fixture defaults flipped to True to
match production (compression.enabled defaults to True).
* feat(delegation): uncap max_spawn_depth to match max_concurrent_children
Removed the hard ceiling of 3 on delegation.max_spawn_depth. Depth now has
a floor of 1 and no upper limit, mirroring max_concurrent_children. Cost
(each level multiplies API spend) is the practical limiter, not a constant.
- delegate_tool.py: drop _MAX_SPAWN_DEPTH_CAP, _get_max_spawn_depth() floors
at 1 instead of clamping to [1,3]; depth-limit error string reworded
- config.py / cli-config.yaml.example: doc comments say floor 1, no ceiling
- docs (configuration, delegation, delegation-patterns): range 1-3 -> >=1
- tests: convert clamp-above-3 change-detector into a no-ceiling invariant,
drop the _MAX_SPAWN_DEPTH_CAP==3 snapshot assert, fix warning-text assert
A bare /voice silently toggled on/off with a one-line result, leaving
users with no idea what the modes mean or that Discord also supports
TTS-all and live voice-channel join/leave. Bare /voice now still
toggles but appends a usage explainer covering on/off/tts/status, with
the Discord voice-channel lines shown only on adapters that support
them.
Adds gateway.voice.help + gateway.voice.help_channels across all 16
locales (placeholders {toggle}/{channels}).
* fix: respect disabled auto-compaction on context overflow
Port from anomalyco/opencode#30749.
When compression.enabled is false, NO automatic compaction trigger may
fire. The proactive token-threshold paths (preflight + post-response
should_compress gate) already honoured the setting, but the three
provider-overflow recovery paths in the agent loop — long-context-tier
429, 413 payload-too-large, and context-overflow — called
_compress_context() unconditionally, silently compressing and rotating
the session against the user's explicit choice.
Add a single guard at the top of the overflow-recovery dispatch: when
compression is disabled and the error is one of those three overflow
classes, surface a terminal error (compaction_disabled: True) telling the
user to /compress manually, /new, switch to a larger-context model, or
reduce attachments. Manual /compress (force=True) is unaffected — it never
enters this loop.
Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't
compress when disabled; control case still compresses when enabled).
Existing overflow-recovery tests updated to enable compaction explicitly
(they verify the recovery fires); fixture defaults flipped to True to
match production (compression.enabled defaults to True).
* fix(gemini): default native maxOutputTokens + strip OpenAI extra_body on Gemini endpoints
Two distinct failures hit users on the gemini provider with only Google
AI Studio keys set.
1. Truncation loop: build_gemini_request() only set maxOutputTokens when
max_tokens was non-None. Hermes passes None to mean "unlimited", but
Gemini's native generateContent does NOT treat an absent maxOutputTokens
as full budget — it applies a low internal default and stops early with
finishReason=MAX_TOKENS, truncating tool calls. The agent then retries
3x and refuses the incomplete call. Now default to the published 65,535
ceiling (shared by all current Gemini text models) when max_tokens=None.
2. HTTP 400 on Gemini endpoint: the chat_completions transport assembles
profile extra_body (Nous portal 'tags', reasoning, provider prefs) and
sends it via the OpenAI client to whatever base_url is resolved. When a
profile that emits extra_body (e.g. Nous) is active but the endpoint is a
native Gemini base_url — typical when only Google creds exist and a
fallback/aux call lands on Gemini — Google rejects the unknown 'tags'
field with a non-retryable 400. Strip all non-thinking_config extra_body
keys when the resolved endpoint is native Gemini.
Verified E2E against real transport code: tags stripped on native Gemini,
preserved on Nous and the /openai compat endpoint; maxOutputTokens=65535
on None, explicit values respected.
* feat(discord): voice-channel mixer — ambient idle bed + verbal acks that overlap TTS
Discord voice mode can now feel conversational: the bot speaks a short
acknowledgement before it starts working, and a subtle ambient 'thinking' bed
plays underneath while tools run, ducking under speech and swelling back — the
Grok-voice-mode feel.
discord.py plays only one audio stream per voice connection, so this adds a
software mixer (VoiceMixer, a discord.AudioSource) installed once per guild on
join. It sums an ambient loop, verbal acks, and TTS replies into that single
20ms/48kHz/stereo stream (numpy int16 add + clip), so they overlap instead of
stop-and-swap. Speech ducks the ambient gain down and releases it smoothly.
- plugins/platforms/discord/voice_mixer.py: VoiceMixer + MixerChild (gain,
loop, fade, duck/release), decode_to_pcm (ffmpeg), synth_ambient_pcm (no
asset needed — synthesised pad).
- adapter: install mixer on join, tear down on leave, route
play_in_voice_channel through the mixer (legacy one-shot path kept as
fallback), play_ack_in_voice, voice_mixer_active. Defensive getattr for the
object.__new__ test helpers.
- gateway/run.py: tool_start_callback fires a one-time verbal ack on the first
tool call of a turn when in a voice channel (independent of the text
tool-progress gate). No system-prompt or message-flow changes.
- config: discord.voice_fx.* (OFF by default; ambient/duck/speech gains, ack
phrases). All in config.yaml, not .env.
- docs + tests (mixer unit + adapter integration).
Verified: 19 new tests pass, existing voice suite green (2 pre-existing
davey-module env failures unchanged), and a real-mixer E2E confirms ambient
streams, TTS overlaps it, acks layer in, and teardown is clean.
* fix(discord): make voice mixer numpy import lazy (numpy is voice-extra-only)
numpy ships in the optional 'voice' extra, not [all,dev], so a module-level
'import numpy' broke CI test collection (and would break the always-imported
Discord adapter on any install without the voice extra). Defer numpy to the
functions that actually mix audio via _require_numpy(); guard the test module
with pytest.importorskip('numpy').
Follow-up on the salvaged fix: point the Nous silent-default override at
deepseek/deepseek-v4-flash (a cheap chat model) instead of the nvidia
nemotron entry. Keeps the no-model-configured fallback off the priciest
flagship while landing on a low-cost, broadly-capable default.
Assert get_default_model_for_provider("nous") never returns the priciest
catalog entry (anthropic/claude-opus-4.8) and that an override pointing at a
model absent from the catalog falls back to catalog order. Regression for the
silent flagship-billing footgun.
When a provider is configured but no model is selected (e.g. a profile sets
provider: nous with no model), the gateway/CLI fall back to
get_default_model_for_provider(), which returned the first curated catalog
entry. The Nous Portal list is ordered most-capable-first, so entry [0] is
anthropic/claude-opus-4.8 — the single most expensive model ($5/$25 per Mtok).
A misconfigured profile therefore silently routed every call to the flagship
and billed it for traffic the user never opted into.
Pin the silent (non-interactive) default for metered aggregators to the cheapest
curated tier via _PROVIDER_SILENT_DEFAULT_OVERRIDES so a missing model can never
auto-escalate to the flagship. The interactive default (GUI onboarding /
`hermes model`) keeps using the richer free/paid-tier-aware resolver.
Fixes the unexpected anthropic/claude-opus-4.8 charges reported for a
free-tier Nous account whose new profile had no default model.
The Desktop bootstrap installer writes `.hermes-bootstrap-complete` into the
managed git checkout root. Because it wasn't gitignored, `hermes update`'s
`git stash push --include-untracked` treated it as a local change and created an
autostash on every run — prompting the user to restore "local changes" that were
really Hermes-managed runtime state (and risking the marker getting stranded in a
stash, which re-triggers Desktop bootstrap).
Add the marker to .gitignore; `git stash -u` and `git status --porcelain` both
skip ignored files, so the updater now sees a clean tree.
Fixes#38529
Add HERMES_DASHBOARD_SESSION_TOKEN to the Hermes-managed subprocess environment blocklist so dashboard authorization material does not propagate into shell, PTY, or background process launches.
Extend the local environment blocklist regression coverage to prove the dashboard session token is stripped like other Hermes-managed secrets.
* docs(dashboard): clarify auth provider suitability + document dashboard registration
- Add a 'Registering a dashboard' subsection under the Nous Research
provider covering both the 'hermes dashboard register' CLI command
and the Portal /local-dashboards GUI page.
- Note that the Nous provider is the one suitable for public-internet
exposure (logins verified against your Nous account).
- Add a warning that the username/password provider is for trusted
networks / VPN only and is not suitable for direct public-internet
exposure; point readers to the Nous / OIDC / custom OAuth providers.
- Surface the same distinction in the two-provider intro list.
* docs(dashboard): count three bundled auth providers, add self-hosted OIDC to intro
'Two providers ship in the box' undercounted — the bundled
plugins/dashboard_auth/self_hosted (generic OpenID Connect) is a third.
List all three in the gated-mode intro and link each to its section.
* docs(dashboard): extend auth provider updates to Docker and Desktop pages
- docker.md: list all three bundled gate providers (was username/password
+ OAuth only), adding the self-hosted OIDC provider and its env vars,
and note username/password is not for public-internet exposure.
- desktop.md: reframe the remote-backend connection so OAuth (Nous Portal)
is the preferred option for any backend reachable beyond the local
machine, with username/password positioned for local / trusted-network
use only. Cover the 'Sign in with <provider>' OAuth flow in the in-app
steps and scope the VPN warning to the password path.
* docs(dashboard): align env-var, CLI, and remote-Desktop recipe with provider changes
- environment-variables.md: reframe the Web Dashboard & Hermes Desktop
intro (OAuth preferred for remote/public, username/password for
trusted networks), add the self-hosted OIDC env vars
(HERMES_DASHBOARD_OIDC_*) that were missing from the table, and note
hermes dashboard register provisions the OAuth client_id.
- cli-commands.md: document the 'hermes dashboard register' subcommand
(flags, behavior, /local-dashboards GUI alternative).
- web-dashboard.md: apply the OAuth-preferred reframe to the bottom
'Connecting Hermes Desktop to a remote backend' recipe and scope its
VPN warning to the username/password path, matching desktop.md.
* docs(dashboard): move 'recommended remote Desktop path' framing from username/password to OAuth
The gated-mode intro list claimed the username/password provider was the
recommended path for a remote Hermes Desktop connection, contradicting the
OAuth-preferred framing established elsewhere. Move that recommendation onto
the OAuth (Nous Portal) item so the docs are consistent: OAuth is the
recommended provider for any remote/internet-facing backend; username/password
is for trusted networks only.
* docs(dashboard): drop unreleased managed/hosted-install provisioning notes
Remove the 'not available in managed/hosted installs, where the client id is
provisioned by the hosting platform' line from the dashboard register docs
(web-dashboard.md, cli-commands.md) and the 'provisioned by the Nous Portal for
hosted deploys' clause from the HERMES_DASHBOARD_OAUTH_CLIENT_ID env-var row —
that platform-provisioning path is unreleased.
* docs(dashboard): drop --portal-url / HERMES_DASHBOARD_PORTAL_URL from user docs
The portal-URL override targets a non-production Nous Portal and only works
for internal Nous usage — it won't function for end users (the access token
must be issued by the same portal). Remove it from the register CLI flags,
the Nous-provider config/env tables, and the verify-the-gate example so users
aren't pointed at an option that can't work for them.
* docs(dashboard): add worked examples for Nous and username/password providers
The self-hosted OIDC provider already had a full 'Worked example: Keycloak'
walkthrough; the Nous and username/password providers only had scattered
config snippets. Add parallel '#### Worked example' sections for both
(register/run/login + /api/status verification), mirroring the Keycloak
example's structure so all three bundled providers read consistently.
* docs(env): move HERMES_DESKTOP_REMOTE_URL to end of the dashboard auth table
It was sitting between the HERMES_DASHBOARD_BASIC_AUTH_* block and the
HERMES_DASHBOARD_OAUTH/OIDC block, splitting the dashboard-side vars. As the
only desktop-side var in the table, it belongs at the end so the dashboard
provider vars (basic, OAuth, OIDC) stay grouped together.
* docs(dashboard): remove Fly.io references from dashboard auth docs
Fly.io is the internal hosting implementation for hosted Hermes — it shouldn't
leak into user-facing dashboard auth docs. Reword the OAuth provider intro,
the env-var-path rationale, the public-URL-override section, the cookie Secure
note, and the verify-the-gate example to generic 'hosting platform' / 'reverse
proxy' / 'TLS terminator' phrasing.
Left the legitimate user-facing Fly.io mentions in telegram.md (a deliberate
cloud-deployment walkthrough) and work-with-skills.md (a generic example)
untouched.
`cron_list` read `job.get("repeat", {})`, but the dict-default only
applies to a MISSING key. A one-shot job persisted with `"repeat": null`
returns None, and the next `.get("times")` raised AttributeError, taking
down the whole `cron list` output. Coalesce with `or {}` so a
present-but-null repeat renders as ∞ like the other cron readers already
do. Adds a regression test.
Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
`os.getcwd()` raises FileNotFoundError when the process's working
directory was removed out from under it (e.g. a scratch workspace
cleaned up mid-session), crashing terminal env setup.
Extract a `_safe_getcwd()` helper that falls back to TERMINAL_CWD, then
the user's home, on FileNotFoundError, and route all three `os.getcwd()`
call sites in terminal_tool.py through it (local default_cwd, the Docker
cwd-passthrough source, and the debug-config print) so the same crash
can't resurface at a sibling site. Adds unit tests for the real-cwd path
and both fallback branches.
Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
Inside the published Docker image, both the `--tui` banner and the
dashboard-embedded TUI report `1 commit behind — run docker pull
nousresearch/hermes-agent:latest to update` even though the container
has no git repo and no way to compute a commit delta.
Root cause: two independent update-detection paths, only one of which
knows it's running in Docker.
- `recommended_update_command()` → `detect_install_method()` reads the
`.install_method` stamp that `docker/stage2-hook.sh` writes at boot →
returns "docker", so the *command string* correctly says `docker pull`.
- `banner.check_for_updates()` (the source of the "N commits behind"
*count*) has no notion of the docker install method. It only detects a
build via `HERMES_REVISION` (nix-only, unset in the image) or a `.git`
dir (excluded from the image by .dockerignore). Neither matches, so it
silently falls through to `check_via_pypi()`, whose PyPI-version
mismatch flag (1) is then rendered verbatim by the CLI banner
(build_welcome_banner), the Ink TUI badge (branding.tsx), and `hermes
version` as "1 commit behind" — a phantom count, no commit math
involved. `hermes update` already refuses to run in-place in the
container.
The dashboard's REST `/api/hermes/update/check` endpoint already
short-circuits docker (returns behind=None + the docker guidance). This
mirrors that guard inside `check_for_updates()` so the banner/TUI/version
surfaces agree: when `detect_install_method() == "docker"`, return None
before any git/pypi probe (and before writing a cache entry). None makes
the render guards (`typeof === 'number' && > 0`, `behind and behind > 0`)
stay false, so the badge/line disappears entirely — matching the System
page.
Fix is in one place (check_for_updates) because all three consumers route
through it via get_update_result()/_update_result.
Tests: test_check_for_updates_docker_returns_none asserts None + no
git/pypi probe + no cache write; test_check_for_updates_non_docker_still_checks
guards against over-broadening (pip still version-checks). Mutation-tested:
removing the guard fails the docker test.
Verified against a real `docker build` of the image — see PR description.
The desktop OAuth remote-gateway path gated connectivity on
hasOauthSessionCookie(), which checks only the access-token cookie
(hermes_session_at, ~15 min TTL). The moment that cookie's Max-Age
lapsed, Electron's cookie jar dropped it and both resolveRemoteBackend()
and sanitizeDesktopConnectionConfig() reported "not signed in" — forcing
a full IDP re-login every ~15 min — even though a valid 24h refresh-token
cookie (hermes_session_rt) was sitting in the same jar.
The desktop OAuth code (2026-06-04) was written against the obsolete
"contract v1 issues no refresh token" model, two days after #37247
re-introduced server-side transparent refresh: Portal now issues a 24h
rotating, reuse-detected refresh token, and the gateway middleware
(_attempt_refresh) rotates a fresh AT from the RT on the next
authenticated request. So an expired-AT/live-RT session is fully
connectable — the desktop just never let the request through.
Fix:
- connection-config.cjs: add RT_COOKIE_VARIANTS + cookiesHaveLiveSession()
(true when EITHER a live AT or RT cookie is present). Keep
cookiesHaveSession() AT-only for callers that need that specific signal.
- main.cjs: add hasLiveOauthSession(); resolveRemoteBackend()'s oauth
branch now early-outs only when NEITHER cookie is present, otherwise
uses the ws-ticket mint as the authoritative liveness probe (that POST
carries the RT cookie and triggers the server-side AT rotation). A real
401 still surfaces as needsOauthLogin. Settings indicator + oauth-logout
report against the same AT-or-RT notion.
- Remove the stale "contract v1 / NO refresh token" docstrings in
cookies.py and the verify_session comments in the Nous provider that
contradicted #37247.
Tests: +57 lines in connection-config.test.cjs covering the RT-only
"still connectable" case. node --test: 32/32. dashboard-auth +
nous-provider Python suites: 223/223.
Note: server-side files (hermes_cli/dashboard_auth/, plugins/dashboard_auth/)
are comment/docstring-only here, but this touches outside apps/desktop/ so
it needs Teknium review.
`_read_json_file` caught OSError but not UnicodeDecodeError, so a status
file holding binary/non-UTF-8 bytes (truncated or clobbered write) would
crash the gateway status path instead of being treated as unreadable.
UnicodeDecodeError is a ValueError subclass, not an OSError, so it
escaped the existing guard.
Widen the catch to (OSError, UnicodeDecodeError) at both read sites in
gateway/status.py — `_read_json_file` and the sibling `_read_pid_record`,
which had the identical gap. Adds tests covering binary input (returns
None) and valid input (still parses) for both.
Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
The LINE adapter classified every non-text inbound message as
`MessageType.IMAGE`, which doesn't exist on the enum — so any image,
video, audio, file, sticker, or location message raised AttributeError
the moment it was constructed.
Beyond fixing the crash, every non-text message was being collapsed onto
a single type. The gateway routes on MessageType (voice → STT, files →
document handling, etc.), so misclassification silently mishandled media.
Replace the inline ternary with a `_LINE_MESSAGE_TYPES` lookup that maps
each LINE webhook type to its proper enum member (audio → VOICE to match
how Telegram/WhatsApp treat voice notes), falling back to TEXT for
unknown types. Adds regression tests covering the mapping and the old
AttributeError.
Co-authored-by: Sahibzada Allahyar <94376830+sahibzada-allahyar@users.noreply.github.com>
The previous fix committed the hash from `prefetch-npm-deps`
(sha256-hgnqc...), but the actual `fetchNpmDeps` FOD (fetcherVersion 2)
that `nix flake check` builds wants sha256-cY+gM... . These two tools
disagree for this lockfile, so the build's npm-deps derivation failed
with a hash mismatch even though `fix-lockfiles --check` reported "ok".
Corrected to the build-verified value. Confirmed `nix build .#tui`,
`.#web`, and `.#desktop` all build cleanly with the new hash.
The react-router-dom 7.14->7.17 lockfile change stales the pinned
npm-deps hash in nix/lib.nix, turning the nix flake checks red. Bump
to the hash CI's prefetch diagnostic computed for the new lockfile.
Clears the npm-audit React Router advisory CVE-2026-42342 in the web
and apps/desktop workspaces by bumping react-router-dom 7.14.x -> ^7.17.0
(patched in 7.15.0; both react-router and react-router-dom now resolve
to 7.17.0 in the root lockfile).
Note: the advisory's DoS only affects React Router *Framework Mode*
(the __manifest server endpoint). Both workspaces use Declarative Mode
(web: <BrowserRouter>, desktop: <HashRouter>) as pure client-side SPAs,
so we were never actually exploitable -- this is audit-hygiene only.
npm audit --omit=dev: 0 vulnerabilities. Web + desktop + ui-tui builds
and tsc typecheck all green on 7.17.0.
The Hermes Docker image's venv is built with `uv sync`, which does not
bootstrap pip into the venv. When the google-workspace setup script needs
to install its deps and the running interpreter has no pip,
`sys.executable -m pip install` dead-ends with "No module named pip"
(reported via Discord support).
install_deps() now falls back to `uv pip install --python <interpreter>`
when the pip path fails and uv is on PATH. uv installs into the exact
interpreter the script is running under without needing pip present, so
the pip-less venv self-heals (e.g. a dep evicted on image update, or a
build without the [google]/[all] extra). On environments with neither
pip nor uv, the [google] extra hint is printed as before.
Verified E2E against nousresearch/hermes-agent:latest: under the venv
python with a missing dep, --install-deps now prints "Dependencies
installed." and exits 0 instead of failing.
Adds TestInstallDeps regression coverage: pip path, uv fallback,
uv-not-consulted-when-pip-works control, and both no-installer-available
and uv-also-fails failure cases.
Since #38591 made the dashboard's embedded chat unconditional, every
browser refresh of /chat spins up a fresh session.create (new sid + a
fresh _SlashWorker via _deferred_build) over /api/ws, but the old tab's
WS disconnect only DETACHES the transport (ws.py) — it never closes the
old session or its slash_worker. The dashboard's in-process gateway is
long-lived, so the detached _SlashWorker subprocess's stdin pipe stays
open forever and the worker never reaches EOF: one leaked python process
per refresh.
Fix at the session-lifecycle layer (not PTY signal timing — verified that
a process whose owning gateway dies is always reaped via stdin-EOF; the
leak is specifically the long-lived dashboard process keeping detached
sessions parked). On WS disconnect, schedule a grace-delayed reap of any
session left orphaned (transport detached to stdio, not mid-turn). A quick
reconnect / session.resume / prompt.submit rebinds a live transport and
cancels the reap, preserving the intentional detach-for-reconnect window.
- server.py: extract _teardown_session() (shared with session.close),
add _ws_session_is_orphaned() + _schedule_ws_orphan_reap(), gated by
HERMES_TUI_WS_ORPHAN_REAP_GRACE_S (default 20s, 0 disables).
- ws.py: schedule the reap for each detached session on disconnect.
- tests: reap-closes-worker, spares-reattached/mid-turn/finalized,
disabled-when-grace-zero.
Youssef's review caught a residual false-positive: resolveTestWsUrl
swallowed an OAuth ticket-mint failure and returned null, so the caller
skipped the WS probe and reported the remote test as reachable. But the
real boot path (resolveRemoteBackend) treats a mint failure as a hard
'session expired' auth error and refuses to connect — so an expired OAuth
session passed the test then failed boot, the exact false-positive this
PR exists to kill.
Extract resolveTestWsUrl into the electron-free connection-config.cjs
(injectable mintTicket) so it's unit-testable, and make OAuth mint
failure throw an actionable needsOauthLogin error instead of skipping.
Adds the three cases Youssef requested plus a mintTicket-required guard.
The "Test remote" button only checked HTTP GET /api/status, but the chat
surface depends on the renderer opening a live WebSocket to /api/ws — a
separate transport with separate server-side guards (Host/Origin checks,
ws-ticket/token auth, peer-IP checks). A gateway could pass the HTTP check yet
reject the WebSocket, so the test reported "reachable" while boot still failed
with the opaque "Could not connect to Hermes gateway".
testDesktopConnectionConfig now mirrors the renderer's connect: after the
status check it opens the WS URL (token/local) or a freshly minted ws-ticket
(OAuth) and confirms the upgrade is accepted and not immediately torn down by
a post-handshake auth rejection. Failures surface an actionable message instead
of a false-positive. The WS leg is skipped when the runtime lacks a global
WebSocket so it never fails spuriously.
Adds electron/gateway-ws-probe.cjs: a small helper that opens a gateway
WebSocket URL and classifies the handshake (open/frame → ok; error or close
before open → fail; open-then-early-close → credential rejected; never-opens →
timeout). The WebSocket implementation is injected so it can be unit-tested
without a real socket.
Wires gateway-ws-probe.test.cjs into test:desktop:platforms, covering every
handshake outcome plus constructor-throw and missing-impl.
The first-run provider picker was a hard gate — the only way out was
connecting a provider. Add an 'I'll choose a provider later' link that
dismisses the overlay and persists the skip to localStorage so it never
re-nags on subsequent launches. Users connect a provider any time from
Settings -> Providers (manual onboarding already bypasses the skip gate).
- onboarding.ts: firstRunSkipped state seeded from localStorage
(hermes-onboarding-skipped-v1) + dismissFirstRunOnboarding() action;
completeDesktopOnboarding clears the flag once a provider connects.
- overlay: skip gate (firstRunSkipped && !manual returns null); ChooseLaterLink
rendered in both the OAuth picker footer and the API-key fallback, first-run only.
- tests: skip persists + hidden in manual mode; full-state fixtures updated.
C1: Add _sessions_lock to protect all compound mutations and iterations
on the global _sessions dict across 5+ concurrent execution contexts
(main dispatcher, pool workers, daemon threads, notification poller,
atexit handler).
C2: Add _prompt_lock to protect _pending/_pending_prompt_payloads/_answers
dicts from races between _block() (agent callback thread) and
_respond() (pool worker). Lock scope is kept tight — _block() only
holds the lock during registration/cleanup, releasing it before
_emit() and ev.wait() to avoid blocking other prompts for 300s.
All 187 existing TUI tests pass with no regressions.
check_execute_code_guard() never called is_approved() before entering the
approval flow, and never persisted session/permanent approvals from the
gateway response. This meant 'Approve session' and 'Always' buttons had
no effect — every execute_code call re-prompted the user.
- Add is_approved() check after get_current_session_key(), matching
check_all_command_guards()
- Persist session ('approve_session') and permanent ('approve_permanent')
approvals based on the gateway choice, same as terminal command guard
- Add 3 regression tests for session persistence, permanent persistence,
and short-circuit on pre-existing approval
Windows counterpart of #39127: scripts/install.ps1 `Install-Desktop` runs
`npm run pack` once and throws on the opaque ENOENT a corrupt cached Electron
download produces, with no recovery. Add `Clear-ElectronBuildCache` plus a
purge-and-retry-once on pack failure, mirroring the install.sh fix: remove the
cached electron-*.zip (%LOCALAPPDATA%\electron\Cache + ELECTRON_CACHE /
electron_config_cache overrides) and stale *-unpacked output, then retry so
@electron/get re-downloads with its own SHASUM verification.
Refs #37544.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replace [{_DIM}] with [dim] in all _restore_session_cwd and
_preload_resumed_session messages that go through _console_print (Rich
Console.print). _DIM is an ANSI escape (\x1b[2;3m) that Rich cannot
parse as a markup tag, causing MarkupError on session resume when the
stored cwd is missing or inaccessible.
Also uses [/dim] closing tag for explicit tag matching.
Fixes#39469
Mirror the workspace-group "+": each profile header in the all-profiles
session list gets a new-session button. Unlike selecting the profile, it
leaves the browse scope untouched (newSessionInProfile keeps
$showAllProfiles), so creating a chat doesn't collapse the unified view.
Keep one persistent socket per profile with live work instead of closing
the single socket on every profile swap, so background sessions across
profiles keep streaming at once. A gateway registry owns the primary
(window) socket plus lazy secondaries (own backoff/reconnect); all feed
the same session-keyed event handler. Secondaries are pruned to profiles
with a working/needs-input session, the keepalive pings every open
backend, and LRU eviction spares freshly-touched backends so the soft cap
can't abort a running agent. Approval/sudo/secret prompts are parked
per-session (surfaced via the needs-input badge) so a background turn can
block without hijacking the foreground. Single-profile users only ever
have the primary, so their path is unchanged.
Resolve conflicts in desktop settings/cron/messaging/sidebar: adopt main's
ListRow + actions-menu refactors for credential rows; keep our profileColor
import on the sidebar. Drop the now-orphaned Tip-based helpers.
Hold (~450ms) a profile square — or right-click → Color… — to open a
shadcn Popover of swatches and override its rail color, with Auto to fall
back to the deterministic hue. The hold timer rides alongside the dnd
pointer listener (a real drag cancels it, the trailing click is
suppressed), so reorder/select/recolor stay distinct gestures.
Overrides persist in localStorage ($profileColors), resolved via
resolveProfileColor (override wins, else the name-hashed hue). Cosmetic
and gated on the multi-profile rail, so single-profile users are
unaffected. Adds a reusable ui/popover.tsx (radix-ui umbrella).
The previous catch-all except OSError would silently swallow real
errors (disk full, bad path, permission issues unrelated to symlink
privilege). Narrow the handler to winerror == 1314 — the specific
Windows error code for "A required privilege is not held by the
client" — and re-raise every other OSError so genuine failures are
not hidden.
On Windows, os.symlink() raises OSError (WinError 1314) unless the
process has Administrator rights or Developer Mode is enabled. The SSH
bulk-upload staging logic used symlinks to mirror the remote layout
before piping through tar; this caused all ssh_bulk_upload tests to
fail on Windows.
- ssh.py: wrap os.symlink() in try/except OSError and fall back to
shutil.copy2() so staging works on every platform. shutil was already
imported, no new dependency introduced.
- file_sync.py: replace str(Path(remote).parent) with
posixpath.dirname(remote) in unique_parent_dirs(). pathlib.Path uses
the host separator (\ on Windows), but these paths are sent to a
remote Linux host over SSH and must always use forward slashes.
- test_ssh_bulk_upload.py: make test_staging_symlinks_mirror_remote_layout
platform-agnostic — assert file existence and content instead of
os.path.islink() + os.readlink(), since the staged entry may be a
copy on Windows.
When reasoning text grows during streaming, new parts can be appended
beyond endIndex. The pending check used slice(startIndex, endIndex)
which excluded these new parts — if the original part completed, the
block would close while new reasoning was still streaming.
Fix: remove the endIndex cap from slice() so all parts from startIndex
onward are checked. During non-streaming, the array is stable and
all parts are within range anyway.
web_tools.is_safe_url was replaced by async_is_safe_url, but three
web-provider test files still monkeypatched the old sync name, raising
AttributeError. Patch the async variant with an async lambda.
Add async_is_safe_url() wrapping is_safe_url via asyncio.to_thread, and route
all async SSRF call sites through it: web_extract_tool, the vision/video
preflight checks, and both download redirect guards. socket.getaddrinfo blocks;
calling it inline from async tool paths froze the event loop for the duration of
DNS resolution.
vision_tools: split _validate_image_url into _image_url_shape_ok (no DNS) +
sync _validate_image_url (for sync callers/tests) + async _validate_image_url_async.
Widened beyond the original PR #3691 to sibling async sites that also blocked
the loop (second redirect guard, video preflight).
Salvage of #3691 by @Kewe63 — surgically re-applied onto current main because
the original branch was too stale to cherry-pick cleanly (would have reverted
the web_crawl_tool refactor).
Co-authored-by: Kewe63 <kewe.3217@gmail.com>
PASSIVE checkpoint never shrinks the WAL file, causing state.db-wal to
grow without bound. Change to TRUNCATE in _try_wal_checkpoint() and
close() so the WAL is truncated regularly.
Fixes#24034
session.py _persist() bypassed SessionDB's thread-safe write path by
accessing private internals db._lock and db._conn directly:
with db._lock:
db._conn.execute("UPDATE sessions SET model_config = ? ...")
db._conn.commit()
This was fragile for three reasons:
1. It bypassed _execute_write()'s BEGIN IMMEDIATE + jitter-retry logic,
so concurrent writes could hit SQLite BUSY without retrying.
2. It called db._conn.commit() manually, breaking the transactional
contract that _execute_write() enforces.
3. Any internal rename of _lock or _conn would silently break this
call site with an AttributeError at runtime.
Fix:
- Add SessionDB.update_session_meta(session_id, model_config_json, model)
to hermes_state.py. Routes through _execute_write() for the standard
BEGIN IMMEDIATE + lock + jitter-retry guarantee. Uses COALESCE so
passing model=None leaves the stored model column unchanged.
- Replace the db._lock / db._conn block in session.py _persist() with
a single db.update_session_meta() call.
Tests (tests/acp/test_session_db_private_access.py, 11 tests):
- Unit tests for update_session_meta: updates model_config, updates
model, preserves existing model on None, routes through _execute_write,
no-op on non-existent session.
- AST checks: db._lock and db._conn not referenced in session.py;
_persist() calls update_session_meta().
- Integration round-trips: cwd and model persisted correctly; COALESCE
prevents overwriting an existing model with NULL.
The models.dev supports_vision field reflects model IMAGE-INPUT capability,
which is not the same contract as 'provider API accepts images inside
tool-result messages' — the looser heuristic could re-introduce the exact
HTTP 400 'text is not set' it aims to fix. Keep only the explicit, opt-in
ProviderProfile.supports_vision flag (set on xiaomi); add catalog-based
detection later if a concrete provider needs it.
_supports_media_in_tool_results() had a hardcoded provider allowlist
that missed custom providers and newer vision-capable providers like
xiaomi. Added ProviderProfile.supports_vision flag and made the
function check:
1. Registered provider profile (supports_vision flag)
2. Model capabilities from models.dev catalog (supports_vision)
3. Existing hardcoded allowlist (unchanged)
This fixes HTTP 400 "text is not set" errors when vision-capable
custom providers receive text-only tool results instead of
multipart image content.
Related: #25594
Tests in test_gateway_service.py imported grp inline without a
platform guard, causing ImportError on systems where grp is
unavailable (e.g. macOS, WSL without grp module).
Added pytest.importorskip('grp') at module level alongside the
existing pwd guard, and removed three redundant inline import grp
statements.
Fixes#24531
Some VPS providers (Hetzner Cloud and others) offer a browser-based
console for managing hosts. These consoles transmit special characters
incorrectly — ':' may arrive as ';', '@' may be mis-rendered, and
non-English keyboard layouts fare worse — which silently corrupts
'docker run' arguments like '-v ~/.hermes:/opt/data', '-e KEY=value',
and pasted API keys / tokens.
Adds a :::caution admonition above the Quick start 'docker run' block
in website/docs/user-guide/docker.md recommending SSH for copy-paste-
safe command entry, with manual-typing guidance as a fallback.
Pure docs change, no code touched.
Closes#36279
Co-authored-by: Bedirhan Celayir <bedirhancode@users.noreply.github.com>
Centralize the fallback in DeleteProfileDialog (the single delete choke
point) so both the rail and the Profiles view inherit it. Reset *after*
the host's onDeleted refresh so a refreshActiveProfile racing the dying
backend can't clobber the pill back to the deleted profile, and set
$activeProfile too (selectProfile only moved the gateway, leaving the
statusbar pill stranded on the dead profile).
The standalone `_send_slack()` function used by the send_message tool
and cron delivery fallback was not passing `thread_ts` to the Slack API,
causing messages to post to the top-level channel instead of inside
threads.
- Add `thread_ts` parameter to `_send_slack()`
- Include `thread_ts` in the chat.postMessage payload when present
- Pass `thread_id` from `_send_to_platform()` to `_send_slack()`
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Drag a sidebar session into the composer to drop an @session:<profile>/<id>
chip the agent resolves via session_search. New READ shape dumps a whole
session by id (head+tail when large); a `profile` param reads another
profile's DB read-only, and a cross-profile locate scan resolves bare ids
when the model drops the owning profile from the link.
Also: ASCII "waking up <profile>" overlay during lazy gateway swaps,
global haptic rate-limit to kill the reconnect-storm "clickity" buzz, and
reauth toasts surfaced once per disconnect instead of every backoff tick.
_build_memory_uri produced URIs of the form:
viking://user/{user}/memories/{subdir}/mem_{slug}.md
The /agent/{agent}/ segment was missing, causing every agent under
the same user to write into the same flat namespace. In multi-agent
deployments agents silently overwrite each other's memories and
vector retrieval cross-pollinates results.
self._agent was already populated correctly (from OPENVIKING_AGENT
env var, default 'hermes') and sent via X-OpenViking-Agent header —
it was simply not interpolated into the URI.
Fix: add the missing segment so URIs follow the documented shape:
viking://user/{user}/agent/{agent}/memories/{subdir}/mem_{slug}.md
Tests: 4 new regression tests in TestOpenVikingMemoryUriBuilder,
13/13 passed (9 existing + 4 new).
Both the desktop and web-dashboard remote-backend sections now state up front
that the 'remote backend' is a running 'hermes dashboard' process the desktop
app attaches to (it does not start it for you), and that the gateway is a
separate process needed only for messaging channels.
Salvage of #36631 (@annguyenNous), rebased onto current main with
regression tests added. Fixes#36266.
When a persistent Docker sandbox container is removed out-of-band (idle
reaper, `docker prune`, OOM kill, daemon restart), the gateway kept
issuing `docker exec` against the dead container ID, returning
"No such container" on every subsequent tool call — the agent was
permanently blocked until the gateway process restarted.
DockerEnvironment.execute() now detects the "No such container" /
"is not running" error after a non-zero exit (gated on
persist_across_processes) and calls _recreate_container(): it tries
label-based reuse first, falls back to a fresh container replaying the
same image + full all_run_args set, re-runs init_session(), and retries
the command once. A genuine non-zero exit is NOT misclassified as
container-gone.
Differs from #36631 as submitted: adds the tests the original lacked.
tests/tools/test_docker_environment.py covers _is_container_gone pattern
matching (incl. the negative/control case), the recover-and-retry path,
the persist_across_processes=False opt-out (no recovery), and the
ordinary-failure passthrough (no spurious recreation). _make_dummy_env
now forwards persist_across_processes.
Verified:
- Unit: 67/67 in test_docker_environment.py (4 new + existing).
- Live E2E against the real docker daemon: started a persistent
container, `docker rm -f`'d it out-of-band, and the next execute()
transparently recreated a fresh container and succeeded; a follow-up
command worked in the recovered container; a real `exit N` passed
through without triggering recovery.
Co-authored-by: annguyenNous <annguyenNous@users.noreply.github.com>
The desktop `/title <name>` command 404s with "Session not found" on
every platform (reported on Windows in #38508).
Root cause: `session.create` returns two distinct ids — a *runtime*
session id (held in `activeSessionIdRef`) and a `stored_session_id` (the
DB `sessions.id`) — and deliberately does NOT persist a DB row until the
first turn. Routing `/title` through the REST `PATCH /api/sessions/{id}`
endpoint (as #38576 proposed) resolves the id against the `sessions`
table, so the runtime id — or any brand-new, not-yet-persisted session —
never resolves and returns 404. This is an id-type mismatch, not a
Windows file-locking quirk, so it fails on macOS and Linux too.
Fix: route `/title <name>` through the gateway's `session.title` RPC —
the exact path the TUI already uses (`ui-tui/.../slash/commands/core.ts`).
The RPC maps the runtime id to the in-memory session, writes through the
gateway's own DB connection, and queues the title (`pending: true`) when
the row isn't persisted yet, so it works for a fresh chat. The sidebar is
then refreshed via the existing `refreshSessions()` plumbing.
Keeps the sidebar-refresh wiring and `refreshSessions` threading from
#38576; replaces only the broken REST/slash-worker write path. A bare
`/title` (no arg) still falls through to the worker to show the current
title.
Tests rewritten to assert `session.title` routing with the runtime-vs-
stored id distinction (which the original mock collapsed), plus the
queued/`pending` fresh-chat case and the error path.
Supersedes #38576. Fixes#38508.
Co-authored-by: xxxigm <54813621+xxxigm@users.noreply.github.com>
Adds qwen/qwen3.7-plus directly under qwen/qwen3.7-max in both the
OpenRouter curated catalog (OPENROUTER_MODELS) and the Nous portal
catalog (_PROVIDER_MODELS['nous']), then regenerates the docs-hosted
model-catalog.json manifest from those source lists.
When a remote gateway with username/password (or OAuth) auth restarts, its
session cookie lapses and Desktop boots into the recovery overlay with a
session-expired error. That overlay only exposed local-recovery actions —
Retry (resets the local bootstrap latch) and Repair (re-runs the installer) —
neither of which can re-establish a remote session, so the user is stuck in a
no-op Retry loop with no way to sign in again.
The overlay now detects a remote-reauth boot failure from the saved connection
config (remote + gated + not currently connected + has a URL) and surfaces a
primary 'Sign in to remote gateway' button that opens the gateway login window
(the username/password form for a basic gateway, the OAuth redirect otherwise)
and reloads on success. Button copy is driven by a best-effort provider probe,
matching the gateway-settings page. Detection and copy logic live in a pure
helper module with unit coverage.
When `docker run -d` fails after Docker has already created the container
object (e.g. exit 125 when the daemon isn't ready, or a timeout mid image
pull), the code raised before `self._container_id` was set — so the
container leaked permanently in "Created" state. Reported in #7439:
110+ orphaned containers accumulated over 3 days from hourly cron-
scheduled gateway sessions hitting a Docker Desktop startup race.
The orphan reaper added in #33645 (reap_orphan_containers) does NOT cover
this case: it filters `status=exited`, but a failed-create container is in
`Created` state, so it slips through and is never reaped.
Wrap the `docker run -d` call in try/except and `docker rm -f` the
container by its known name before re-raising.
Salvages #7440 by @Tranquil-Flow. Their branch predated the cross-process
reuse + labels rework on `main`, so a cherry-pick conflicted; reconstructed
the same intent (plus their two regression tests, adapted to mock the new
reuse `docker ps` probe) against current `main`.
Verified adversarially: reverted just the product change to origin/main's
`docker.py`, ran the two new tests -> both FAIL with
`assert 0 == 1 ("docker rm should be called once")`. With the fix applied,
both pass; full test_docker_environment.py is 65/65 green.
Closes#7440. Fixes#7439.
Co-authored-by: Evi Nova <66773372+Tranquil-Flow@users.noreply.github.com>
Docker Compose service names (e.g. ollama, litellm, hermes-litellm)
are unqualified hostnames with no dots. These are always local — they
resolve via Docker DNS, /etc/hosts, or mDNS. Without this fix, the
stale stream timeout fires on local LLM proxies, causing infinite
reconnect loops.
Closes#7905
Wrap the _tools iteration in _probe_single_server() in try/finally
so that server.shutdown() is called even if iterating tool metadata
raises. Without this, the MCP server connection leaks until the
event loop is torn down by _stop_mcp_loop().
The existing-message overflow split path in stream_consumer.run() sealed the
first chunk via _send_or_edit(chunk) (finalize=False) then reset _message_id
to None — so that chunk was never edited again and never received the adapter's
final rich-text pass. On Telegram, MarkdownV2 formatting is applied on the
finalize edit, so early split messages of a long multi-part streamed reply
rendered raw markdown (##, **bold**, code fences) while only the last chunk
rendered correctly.
Fix: seal the overflow chunk with finalize=True so it gets its final
formatting pass before _message_id is cleared.
Salvaged from #32609 (the streaming-format portion only; the PR's send_draft
parse_mode change is already superseded on main, and its media-roots change
conflicts with the current denylist + recency-window delivery model).
- right-click a profile square to rename or delete it, via shared
self-contained dialogs (also reused by the profiles page)
- switching or creating a profile now resets to a fresh new-session
draft so the prior session doesn't stay sticky across contexts
- deleting the profile you're currently in falls back to default
instead of stranding the gateway on a dead profile
- shared ConfirmDialog: Enter/Space confirm from anywhere in the dialog;
profile-delete and cron-delete both route through it
The per-session icon picker added more noise than value — rip it out end
to end (sessions.icon column, set_session_icon, the PATCH field, the
picker UI, and the SessionInfo.icon type).
The cross-profile session aggregator now opens each profile's state.db
read-only (mode=ro, no schema init), so listing other profiles on every
sidebar refresh never DDLs or takes a write lock on their live DBs. The
single-profile hot path stays on par with /api/sessions.
Left-align the default's home icon next to the create "+" in the
single-profile state (toggle/squares/Manage still appear only once a
second profile exists).
Always mount the profile rail, but when only the default profile exists
render just the create-profile "+" (hide the default/all toggle, the
draggable squares, and Manage). Gives a first-profile affordance without
the full switcher chrome; everything else appears once a 2nd profile exists.
If a user drops back to a single profile while scope is still ALL
(persisted), the rail is hidden — they'd be stuck in the grouped view
with no toggle out. Fall back to the scoped view when only one profile.
- Wheel maps vertical scroll → horizontal so the rail is navigable with a
plain mouse (trackpad x-scroll still passes through).
- Springy easeOutBack reflow; dragged square glides between snapped cells
(no scale — overflow-x strip would clip it) with a subtle lift.
- Haptic 'selection' tick per crossed cell + 'success' on a committed reorder.
Snap the drag transform to whole cells (no free glide) and clamp it to the
occupied squares strip via a relative wrapper as offsetParent, so a square
can't float past the last profile onto the "+" and break the layout.
overflow-x-auto makes overflow-y compute to auto, so a vertical drag
translate faulted in a cross-axis scrollbar. Pin the drag transform to
y:0 with a modifier — squares only slide horizontally now.
Bind Cmd/Ctrl+P to the command palette alongside Cmd+K (VS Code quick-open
muscle memory); Cmd+. stays the command center. No Print accelerator
competes, so the renderer preventDefault is enough.
Make the named-profile squares reorderable via dnd-kit (horizontal sort,
4px activation so a tap still selects). Order persists in localStorage
($profileOrder); unordered/new profiles alphabetize at the tail.
- Add a "+" in the profile rail that opens a self-contained CreateProfileDialog
(name + clone toggle + optional SOUL.md); extract it and ActionStatus from
the profiles view so both surfaces share one flow.
- Keep the profile rail pinned to the bottom when a profile has no sessions by
rendering a flex-1 spacer (previously the rail floated up to the nav).
Add first-class profile support to the desktop app without app reloads.
- Swap the single live gateway onto a session's profile lazily (spawned on
demand by the Electron backend pool), so one backend serves the active
profile and others stay cold — no OOM with many profiles.
- Aggregate sessions across profiles by reading each profile's state.db
read-only; unified "All profiles" view groups sessions per profile with
per-profile pagination, while the default view stays scoped to one profile.
- Add an Arc-style profile rail at the sidebar foot: a default<->all toggle
pinned left, colored named-profile squares scrolling between, Manage pinned
right. Profile identity is a deterministic per-name color.
- Route profile-scoped REST (config/env/skills/tools/model) to the active
gateway profile and invalidate React Query caches on swap. Single-profile
users never trigger a swap, so their path is unchanged.
Backend:
- web_server: profile-aware active/list endpoints + per-profile session
totals; hermes_state: session_count(exclude_children); main.py: honor
--profile over HERMES_HOME env for pooled backends.
UI primitives:
- Add a position-aware Tip tooltip (instant, themed) as a drop-in for native
title=, and strip redundant tooltips from self-descriptive chrome.
<a href="https://nousresearch.com"><img src="https://img.shields.io/badge/Built%20by-Nous%20Research-blueviolet?style=for-the-badge" alt="Built by Nous Research"></a>
**The self-improving AI agent built by [Nous Research](https://nousresearch.com).** It's the only agent with a built-in learning loop — it creates skills from experience, improves them during use, nudges itself to persist knowledge, searches its own past conversations, and builds a deepening model of who you are across sessions. Run it on a $5 VPS, a GPU cluster, or serverless infrastructure that costs nearly nothing when idle. It's not tied to your laptop — talk to it from Telegram while it works on a cloud VM.
@@ -33,7 +34,7 @@ Use any model you want — [Nous Portal](https://portal.nousresearch.com), [Open
The installer handles everything: uv, Python 3.11, Node.js, ripgrep, ffmpeg, **and a portable Git Bash** (MinGit, unpacked to `%LOCALAPPDATA%\hermes\git` — no admin required, completely isolated from any system Git install). Hermes uses this bundled Git Bash to run shell commands.
@@ -52,7 +53,7 @@ If you already have Git installed, the installer detects it and uses that instea
> **Android / Termux:** The tested manual path is documented in the [Termux guide](https://hermes-agent.nousresearch.com/docs/getting-started/termux). On Termux, Hermes installs a curated `.[termux]` extra because the full `.[all]` extra currently pulls Android-incompatible voice dependencies.
>
> **Windows:** Native Windows is fully supported — the PowerShell one-liner above installs everything. If you'd rather use WSL2, the Linux command works there too. Native Windows install lives under `%LOCALAPPDATA%\hermes`; WSL2 installs under `~/.hermes` as on Linux. The only Hermes feature that currently needs WSL2 specifically is the browser-based dashboard chat pane (it uses a POSIX PTY — classic CLI and gateway both run natively).
> **Windows:** Native Windows is fully supported — the PowerShell one-liner above installs everything. If you'd rather use WSL2, the Linux command works there too. Native Windows install lives under `%LOCALAPPDATA%\hermes`; WSL2 installs under `~/.hermes` as on Linux.
<a href="https://nousresearch.com"><img src="https://img.shields.io/badge/Built%20by-Nous%20Research-blueviolet?style=for-the-badge" alt="Built by Nous Research"></a>
**[نوس ریسرچ (Nous Research)](https://nousresearch.com) کا تیار کردہ خود کو بہتر بنانے والا اے آئی (AI) ایجنٹ۔** یہ واحد ایجنٹ ہے جس میں سیکھنے کا عمل (learning loop) پہلے سے موجود ہے — یہ اپنے تجربات سے نئی مہارتیں (skills) بناتا ہے، استعمال کے دوران ان کو بہتر کرتا ہے، معلومات کو محفوظ رکھنے کے لیے خود کو یاد دہانی کرواتا ہے، اپنی پرانی بات چیت کو تلاش کر سکتا ہے، اور مختلف سیشنز کے دوران آپ کے بارے میں ایک گہری سمجھ پیدا کرتا ہے۔ اسے $5 والے VPS پر چلائیں، GPU کلسٹر پر، یا سرور لیس (serverless) انفراسٹرکچر پر جس کی قیمت استعمال نہ ہونے پر تقریباً صفر ہے۔ یہ آپ کے لیپ ٹاپ تک محدود نہیں ہے — آپ ٹیلی گرام (Telegram) سے اس کے ساتھ بات چیت کر سکتے ہیں جبکہ یہ کلاؤڈ VM پر کام کر رہا ہو۔
آپ اپنی مرضی کا کوئی بھی ماڈل استعمال کر سکتے ہیں — [Nous Portal](https://portal.nousresearch.com)، [OpenRouter](https://openrouter.ai) (200 سے زائد ماڈلز)، [NovitaAI](https://novita.ai) (ماڈل API، ایجنٹ سینڈ باکس، اور GPU کلاؤڈ کے لیے اے آئی مقامی کلاؤڈ)، [NVIDIA NIM](https://build.nvidia.com) (Nemotron)، [Xiaomi MiMo](https://platform.xiaomimimo.com)، [z.ai/GLM](https://z.ai)، [Kimi/Moonshot](https://platform.moonshot.ai)، [MiniMax](https://www.minimax.io)، [Hugging Face](https://huggingface.co)، OpenAI، یا اپنا حسب ضرورت اینڈ پوائنٹ (endpoint) استعمال کریں۔ ماڈل تبدیل کرنے کے لیے صرف `hermes model` استعمال کریں — کسی کوڈ کو تبدیل کرنے کی ضرورت نہیں، کوئی پابندی نہیں۔
<table>
<tr><td><b>حقیقی ٹرمینل انٹرفیس</b></td><td>مکمل TUI جس میں ملٹی لائن ایڈیٹنگ، سلیش-کمانڈ آٹو کمپلیٹ، بات چیت کی ہسٹری، انٹرپٹ اور ری ڈائریکٹ، اور سٹریمنگ ٹول آؤٹ پٹ شامل ہے۔</td></tr>
<tr><td><b>یہ وہاں موجود ہے جہاں آپ ہیں</b></td><td>ٹیلی گرام، ڈسکارڈ (Discord)، سلیک (Slack)، واٹس ایپ (WhatsApp)، سگنل (Signal)، اور CLI — سب ایک ہی گیٹ وے پروسیس سے کام کرتے ہیں۔ وائس میمو (Voice memo) ٹرانسکرپشن، کراس پلیٹ فارم بات چیت کا تسلسل۔</td></tr>
<tr><td><b>سیکھنے کا ایک مکمل عمل</b></td><td>ایجنٹ کی اپنی ترتیب دی گئی میموری، جس میں وہ خود کو وقتاً فوقتاً یاد دہانی کرواتا ہے۔ پیچیدہ کاموں کے بعد خود کار طریقے سے مہارت (skill) کی تخلیق۔ استعمال کے دوران مہارتوں میں بہتری۔ LLM سمرائزیشن کے ساتھ FTS5 سیشن سرچ تاکہ پرانے سیشنز کی یاددہانی کی جا سکے۔ <a href="https://github.com/plastic-labs/honcho">Honcho</a> کے ذریعے صارف کی ماڈلنگ۔ <a href="https://agentskills.io">agentskills.io</a> اوپن سٹینڈرڈ کے ساتھ مکمل مطابقت۔</td></tr>
<tr><td><b>شیڈول کی گئی خودکار کارروائیاں</b></td><td>بلٹ ان (Built-in) کرون (cron) شیڈیولر جو کسی بھی پلیٹ فارم پر ڈیلیوری کے لیے استعمال ہو سکتا ہے۔ روزانہ کی رپورٹس، رات کے بیک اپس، ہفتہ وار آڈٹس — یہ سب کچھ قدرتی زبان (natural language) میں اور بغیر کسی نگرانی کے کام کرتا ہے۔</td></tr>
<tr><td><b>کام کی تقسیم اور متوازی عمل</b></td><td>متوازی (parallel) کاموں کے لیے الگ سے ذیلی ایجنٹس (subagents) بنائیں۔ پائتھون (Python) سکرپٹس لکھیں جو RPC کے ذریعے ٹولز کو استعمال کریں، تاکہ کئی مراحل پر مشتمل کاموں کو بغیر کسی سیاق و سباق (context) کے خرچ کے، ایک ہی باری میں انجام دیا جا سکے۔</td></tr>
<tr><td><b>کہیں بھی چلائیں، صرف اپنے لیپ ٹاپ پر نہیں</b></td><td>چھ (Six) ٹرمینل بیک اینڈز — لوکل، Docker، SSH، Singularity، Modal، اور Daytona۔ ڈیٹونا (Daytona) اور موڈل (Modal) سرور لیس (serverless) فعالیت پیش کرتے ہیں — جب آپ کا ایجنٹ فارغ ہوتا ہے تو اس کا ماحول سلیپ (hibernate) ہو جاتا ہے اور ضرورت پڑنے پر خود بخود جاگ جاتا ہے، جس کی وجہ سے سیشنز کے درمیان لاگت تقریباً صفر رہتی ہے۔ اسے $5 والے VPS یا GPU کلسٹر پر چلائیں۔</td></tr>
<tr><td><b>تحقیق کے لیے تیار</b></td><td>بیچ (Batch) ٹریجیکٹری (trajectory) جنریشن، اگلی نسل کے ٹول کالنگ ماڈلز کی تربیت کے لیے ٹریجیکٹری کمپریشن۔</td></tr>
</table>
---
## فوری انسٹالیشن (Quick Install)
### لینکس (Linux)، میک او ایس (macOS)، ڈبلیو ایس ایل ٹو (WSL2)، ٹرمکس (Termux)
> **توجہ فرمائیں:** مقامی ونڈوز (Native Windows) پر ہرمیس بغیر WSL کے چلتا ہے — CLI، گیٹ وے، TUI، اور ٹولز سب مقامی طور پر کام کرتے ہیں۔ اگر آپ WSL2 استعمال کرنا پسند کرتے ہیں، تو اوپر دی گئی لینکس/میک او ایس کی کمانڈ وہاں بھی کام کرے گی۔ کوئی مسئلہ نظر آیا؟ براہ کرم [مسائل (issues) درج کریں](https://github.com/NousResearch/hermes-agent/issues)۔
انسٹالر سب کچھ خود سنبھالتا ہے: uv، Python 3.11، Node.js، ripgrep، ffmpeg، **اور ایک پورٹ ایبل (portable) گٹ بیش (Git Bash)** (یعنی MinGit، جو `%LOCALAPPDATA%\hermes\git` میں ان پیک ہوتا ہے — اس کے لیے ایڈمن کی اجازت درکار نہیں، اور یہ سسٹم کے کسی بھی گٹ انسٹال سے بالکل الگ ہے)۔ ہرمیس اس بنڈل شدہ گٹ بیش کو شیل کمانڈز چلانے کے لیے استعمال کرتا ہے۔
اگر آپ کے پاس پہلے سے گٹ (Git) انسٹال ہے، تو انسٹالر اسے شناخت کر لیتا ہے اور اسے ہی استعمال کرتا ہے۔ بصورت دیگر آپ کو صرف ~45MB کے MinGit ڈاؤنلوڈ کی ضرورت ہوگی — یہ آپ کے سسٹم کے گٹ پر کوئی اثر نہیں ڈالے گا۔
> **اینڈرائیڈ (Android) / ٹرمکس (Termux):** ٹیسٹ کیا گیا مینوئل طریقہ [Termux گائیڈ](https://hermes-agent.nousresearch.com/docs/getting-started/termux) میں موجود ہے۔ ٹرمکس پر ہرمیس ایک مخصوص `.[termux]` ایکسٹرا انسٹال کرتا ہے کیونکہ مکمل `.[all]` ایکسٹرا میں ایسی وائس ڈیپینڈینسیز شامل ہیں جو اینڈرائیڈ کے ساتھ مطابقت نہیں رکھتیں۔
>
> **ونڈوز (Windows):** مقامی ونڈوز کی مکمل سپورٹ موجود ہے — اوپر دی گئی پاور شیل کی کمانڈ سب کچھ انسٹال کر دیتی ہے۔ اگر آپ WSL2 استعمال کرنا چاہتے ہیں، تو لینکس کی کمانڈ وہاں کام کرتی ہے۔ مقامی ونڈوز میں انسٹالیشن `%LOCALAPPDATA%\hermes` میں ہوتی ہے؛ جبکہ WSL2 میں لینکس کی طرح `~/.hermes` میں ہوتی ہے۔ ہرمیس کا وہ واحد فیچر جسے فی الحال خاص طور پر WSL2 کی ضرورت ہے وہ براؤزر پر مبنی ڈیش بورڈ چیٹ پین ہے (یہ POSIX PTY استعمال کرتا ہے — کلاسک CLI اور گیٹ وے دونوں مقامی طور پر چلتے ہیں)۔
ہرمیس آپ کے پسندیدہ پرووائیڈر کے ساتھ کام کرتا ہے — یہ چیز تبدیل نہیں ہو رہی۔ لیکن اگر آپ ماڈل، ویب سرچ، امیج جنریشن، TTS، اور کلاؤڈ براؤزر کے لیے پانچ الگ الگ API کیز جمع نہیں کرنا چاہتے، تو **[Nous Portal](https://portal.nousresearch.com)** ان سب کو ایک ہی سبسکرپشن کے تحت کور کرتا ہے:
- **300+ ماڈلز** — ان میں سے کوئی بھی ماڈل `/model <name>` کے ذریعے منتخب کریں
- **ٹول گیٹ وے (Tool Gateway)** — ویب سرچ (Firecrawl)، امیج جنریشن (FAL)، ٹیکسٹ ٹو سپیچ (OpenAI)، کلاؤڈ براؤزر (Browser Use)، یہ سب آپ کی سبسکرپشن کے ذریعے چلتے ہیں۔ کسی اضافی اکاؤنٹ کی ضرورت نہیں۔
نئی انسٹالیشن کے بعد بس ایک کمانڈ کی ضرورت ہے:
<div dir="ltr">
```bash
hermes setup --portal
```
</div>
یہ آپ کو OAuth کے ذریعے لاگ ان کرواتا ہے، Nous کو آپ کا پرووائیڈر مقرر کرتا ہے، اور ٹول گیٹ وے کو آن کر دیتا ہے۔ `hermes portal info` کمانڈ استعمال کر کے آپ کسی بھی وقت چیک کر سکتے ہیں کہ کون کون سی سروسز منسلک ہیں۔ مکمل تفصیلات [Tool Gateway دستاویزات کے صفحے](https://hermes-agent.nousresearch.com/docs/user-guide/features/tool-gateway) پر موجود ہیں۔
آپ اب بھی کسی بھی ٹول کے لیے اپنی مرضی کی API کیز استعمال کر سکتے ہیں — گیٹ وے ہر سروس کے لیے الگ الگ کام کرتا ہے، ایسا نہیں کہ یا تو سب کچھ استعمال کریں یا کچھ بھی نہیں۔
---
## CLI بمقابلہ میسجنگ فوری حوالہ
ہرمیس کے دو بنیادی انٹر فیس ہیں: آپ ٹرمینل UI کو `hermes` کے ساتھ شروع کریں، یا گیٹ وے چلا کر اس کے ساتھ ٹیلی گرام، ڈسکارڈ، سلیک، واٹس ایپ، سگنل، یا ای میل کے ذریعے بات کریں۔ جب آپ کسی بات چیت میں ہوتے ہیں، تو بہت سی سلیش (slash) کمانڈز دونوں انٹرفیسز میں ایک جیسی ہوتی ہیں۔
<div dir="ltr">
| کارروائی (Action) | سی ایل آئی (CLI) | میسجنگ پلیٹ فارمز (Messaging platforms) |
| [فوری آغاز (Quickstart)](https://hermes-agent.nousresearch.com/docs/getting-started/quickstart) | انسٹالیشن → سیٹ اپ → 2 منٹ میں پہلی بات چیت شروع کریں |
| [CLI کا استعمال](https://hermes-agent.nousresearch.com/docs/user-guide/cli) | کمانڈز، کی بائنڈنگز (keybindings)، پرسنلٹیز (personalities)، سیشنز |
| [کنفیگریشن (Configuration)](https://hermes-agent.nousresearch.com/docs/user-guide/configuration) | کنفگ فائل، پرووائیڈرز، ماڈلز، اور تمام آپشنز |
| [MCP انضمام (Integration)](https://hermes-agent.nousresearch.com/docs/user-guide/features/mcp) | صلاحیتوں کو بڑھانے کے لیے کسی بھی MCP سرور کو جوڑیں |
| [کرون (Cron) شیڈیولنگ](https://hermes-agent.nousresearch.com/docs/user-guide/features/cron) | پلیٹ فارم ڈیلیوری کے ساتھ شیڈول کیے گئے کام |
| [کانٹیکسٹ (Context) فائلز](https://hermes-agent.nousresearch.com/docs/user-guide/features/context-files)| پروجیکٹ کا سیاق و سباق (context) جو ہر بات چیت پر اثر انداز ہوتا ہے |
| [آرکیٹیکچر (Architecture)](https://hermes-agent.nousresearch.com/docs/developer-guide/architecture) | پروجیکٹ کا ڈھانچہ، ایجنٹ لوپ، اہم کلاسز |
| [تعاون (Contributing)](https://hermes-agent.nousresearch.com/docs/developer-guide/contributing) | ڈیویلپمنٹ سیٹ اپ، PR کا طریقہ کار، کوڈنگ کا انداز |
| [CLI حوالہ جات (Reference)](https://hermes-agent.nousresearch.com/docs/reference/cli-commands) | تمام کمانڈز اور فلیگز (flags) |
اگر آپ OpenClaw سے منتقل ہو رہے ہیں، تو ہرمیس آپ کی سیٹنگز، یادیں (memories)، مہارتیں (skills)، اور API کیز کو خود بخود امپورٹ کر سکتا ہے۔
**پہلی بار سیٹ اپ کے دوران:** سیٹ اپ وزرڈ (`hermes setup`) خود بخود `~/.openclaw` کو پہچان لیتا ہے اور کنفیگریشن شروع ہونے سے پہلے مائیگریٹ (migrate) کرنے کا آپشن دیتا ہے۔
- **ورک اسپیس کی ہدایات** — AGENTS.md (`--workspace-target` کے ساتھ)
تمام آپشنز دیکھنے کے لیے `hermes claw migrate --help` استعمال کریں، یا انٹرایکٹو ایجنٹ کی مدد سے مائیگریٹ کرنے کے لیے `openclaw-migration` سکل کا استعمال کریں (جس میں ڈرائی رن (dry-run) پریویوز شامل ہیں)۔
---
## تعاون کریں (Contributing)
ہم آپ کے تعاون کا خیرمقدم کرتے ہیں! ڈیویلپمنٹ سیٹ اپ، کوڈ کے انداز اور PR کے طریقہ کار کے لیے براہ کرم ہماری [Contributing گائیڈ](https://hermes-agent.nousresearch.com/docs/developer-guide/contributing) دیکھیں۔
معاونین (contributors) کے لیے فوری آغاز — کلون (clone) کریں اور `setup-hermes.sh` چلائیں:
- 🔌 [computer-use-linux](https://github.com/avifenesh/computer-use-linux) — ہرمیس اور دیگر MCP ہوسٹس کے لیے لینکس (Linux) ڈیسک ٹاپ کنٹرول MCP سرور، جس میں AT-SPI ایکسیسیبلٹی ٹریز، Wayland/X11 ان پٹ، سکرین شاٹس، اور کمپوزیٹر ونڈو ٹارگیٹنگ شامل ہے۔
- 🔌 [HermesClaw](https://github.com/AaronWong1999/hermesclaw) — کمیونٹی وی چیٹ (WeChat) برج: ہرمیس ایجنٹ اور OpenClaw کو ایک ہی وی چیٹ اکاؤنٹ پر چلائیں۔
---
## لائسنس (License)
MIT — تفصیلات کے لیے [LICENSE](LICENSE) دیکھیں۔
[نوس ریسرچ (Nous Research)](https://nousresearch.com) کی جانب سے تیار کردہ۔
<a href="https://nousresearch.com"><img src="https://img.shields.io/badge/Built%20by-Nous%20Research-blueviolet?style=for-the-badge" alt="Built by Nous Research"></a>
# references into absolute, dated, past-tense facts so a resumed
# conversation does not re-issue completed actions. Only emitted when the
# current date resolved successfully; otherwise the rule is omitted so the
# summarizer is never handed an empty date placeholder.
if_today_str:
_temporal_anchoring_rule=(
f"\nTEMPORAL ANCHORING: The current date is {_today_str}. When an "
"action has already been carried out, phrase it as a completed, "
"dated, past-tense fact rather than an open instruction. For "
'example, rewrite "email John about the proposal" as "Sent the '
f'proposal email to John on {_today_str}." Never leave a finished '
"action worded as if it still needs doing, and never invent a date "
"for work that has not happened yet.\n"
)
else:
_temporal_anchoring_rule=""
# Shared structured template (used by both paths).
_template_sections=f"""## Active Task
[THE SINGLE MOST IMPORTANT FIELD. Capture the user's most recent unfulfilled
@@ -1337,7 +1384,7 @@ Be specific with file paths, commands, line numbers, and results.]
[Any specific values, error messages, configuration details, or data that would be lost without explicit preservation. NEVER include API keys, tokens, passwords, or credentials — write [REDACTED] instead.]
Target ~{summary_budget} tokens. Be CONCRETE — include file paths, command outputs, error messages, line numbers, and specific values. Avoid vague descriptions like "made some changes" — say exactly what changed.
{_temporal_anchoring_rule}
Write only the summary body. Do not include any preamble or prefix."""
ifself._previous_summary:
@@ -1787,6 +1834,41 @@ The user has requested that this compaction PRIORITISE preserving all informatio
accumulated+=msg_tokens
cut_idx=i
# If the backward walk never broke early because the entire transcript
# fits within soft_ceiling, accumulated now holds the total transcript
# size. Without intervention _ensure_last_user_message_in_tail pushes
# cut_idx forward to include the last user message, and the caller's
# compress_start >= compress_end guard either returns unchanged (no-op)
# or compresses a single message — both of which trigger the infinite
# compaction loop described in #40803.
#
# Fix: when the whole transcript fits in soft_ceiling, compute a
# meaningful cut point using the raw (non-inflated) budget so that
# compression actually summarizes a worthwhile middle section.
@@ -40,7 +34,7 @@ It builds and launches the GUI against your existing install — same config, ke
### Prebuilt installers
When a release ships desktop installers they're attached to its [releases page](https://github.com/NousResearch/hermes-agent/releases) — `.dmg` (macOS), `.exe` / `.msi` (Windows), `.AppImage` / `.deb` / `.rpm` (Linux). These are published manually, so the install-with-Hermes path above is the most reliable way to get the latest.
Prebuilt installers are built and distributed via [the Hermes Desktop website.](https://hermes-agent.nousresearch.com/desktop).
---
@@ -56,10 +50,7 @@ hermes update
## Requirements
The installer handles everything for you (Python 3.11+, a portable Git, ripgrep). The only thing worth knowing:
- **Windows** — the installer bundles its own Git and Python; no admin rights or system changes required.
- **macOS / Linux** — uses your system Python 3.11+ (installed automatically if missing).
The installer handles everything for you (Python 3.11+, a portable Git, ripgrep).
'Module scripts are being served with the wrong MIME type. This usually means a static file server is serving a Vite/React app instead of the project dev server.',
description:copy.moduleMimeDescription,
url: webview.getURL?.()||target.url
})
setLoading(false)
@@ -566,13 +567,11 @@ export function PreviewPane({
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.