Replicate the `hermes tools` configurator in the dashboard Skills →
Toolsets view. Each toolset now opens a config drawer that covers the
full lifecycle the CLI offers: enable/disable, pick a provider/backend,
enter and save API keys, and run a provider's post-setup install hook
with a live log tail.
The toolset view was previously read+toggle only — the provider matrix
and key-status endpoints existed but the page never called them, and
there was no way to save a key or run a backend install (npm/pip/binary)
from the browser.
Backend:
- New CLI subcommand `hermes tools post-setup <KEY>` — non-interactive,
scriptable target that runs a provider's install hook (agent_browser,
camofox, cua_driver, kittentts, piper, ddgs, spotify, langfuse,
xai_grok). Validated against valid_post_setup_keys() so an arbitrary
key can't drive _run_post_setup.
- PUT /api/tools/toolsets/{name}/env — save API keys to ~/.hermes/.env
via save_env_value (same store the CLI writes), validated against the
toolset category's env-var allowlist; blank values skipped.
- POST /api/tools/toolsets/{name}/post-setup — spawn-action that runs
`hermes tools post-setup <key>`; frontend tails the log via the
existing /api/actions/tools-post-setup/status. Registered in
_ACTION_LOG_FILES.
Frontend:
- New ToolsetConfigDrawer component (provider radios, password key
inputs with saved-state, get-a-key links, Run-setup + live install
log). Toolset cards get a Configure button + the drawer also exposes
the enable toggle.
- api.ts: toggleToolset, getToolsetConfig, selectToolsetProvider,
saveToolsetEnv, runToolsetPostSetup + ToolsetConfig/Provider/EnvVar/
EnvResult types.
Validation: 56 admin-endpoint tests pass (10 new: env save w/ CLI
parity + allowlist reject + blank-skip, post-setup spawn validation,
auth gate); 232 web_server tests pass; web npm run build + eslint clean;
HTTP E2E exercises save-key (CLI reads it back) and spawn+poll
post-setup to exit 0.
The skill detail dialog (Skills hub browser) had several awkward
spacing/placement issues:
- description and identifier crammed together with no breathing room
(-mt-1 pulled the description tight to the header)
- the identifier line touched the action-row border
- Install was stranded far right with a large empty void in the middle
of the action row
- the SKILL.md <pre> opened with a leading blank line
Fixes:
- group description + identifier in a spaced flex-col block (mt-1, gap-1)
- give the action row mt-3 + py-2.5 so it separates from the meta block
- move the repo link into the right-side group with Install (ml-auto,
gap-3) so the row reads left=tabs / right=repo+install, no middle void
- mt-3 on the body for consistent vertical rhythm
- trim() the SKILL.md content so it starts at the first real line
The Browse-hub tab was a blank search box with sparse result cards (name +
source + one Install button), no way to read a skill before installing, no
visual security scan, and no indication it was even connected to any hubs.
Backend (web_server.py):
- GET /api/skills/hub/sources — lists the configured hubs (label + trust
tier + GitHub rate-limit + index availability) and featured skills pulled
from the centralized index (zero extra API calls), plus installed-skill
provenance so the UI can mark already-installed results.
- GET /api/skills/hub/preview — fetches a skill's SKILL.md text + file
manifest WITHOUT installing (decodes byte-stored text, masks binaries).
- GET /api/skills/hub/scan — runs the SAME quarantine + scan_skill +
should_allow_install pipeline the CLI installer uses, then cleans up
quarantine, returning verdict / per-finding detail / severity tally /
install-policy decision.
- search now returns per-source counts + timed-out sources + installed map.
Frontend (SkillsPage HubBrowser):
- Landing state: connected-hubs strip + featured skill grid (no more blank
page).
- Rich cards: trust-level color coding, source, tags, identifier,
Details + Install (or Installed state).
- Detail dialog: read the actual SKILL.md, on-demand visual security scan
(verdict pill, severity tally, per-finding list, allow/block policy),
GitHub repo link.
- Search meta line: result count + timing + per-source breakdown (the
'feels slow / no feedback' complaint).
Tests: 4 new endpoint test classes (sources/preview/scan + updated search
shape) in test_dashboard_admin_endpoints.py.
profile list and profile show assumed the wrapper script is always named
after the profile (wrapper_dir / name). When a custom alias exists — e.g.
`hermes profile alias steve --name qiaobusi` creates ~/.local/bin/qiaobusi
pointing at `hermes -p steve` — the display silently showed the profile
name (or nothing) instead of the alias the user actually typed.
The custom-alias *creation* path (create_wrapper_script(name, target)) was
added later; the *display* path was never updated to match.
Add find_alias_for_profile() — a reverse lookup that scans the wrapper dir
for our own wrappers (alias-named file containing 'hermes -p <profile>'),
prefers a custom alias over the profile-named one, strips .bat on Windows,
and sorts for deterministic output. Populate ProfileInfo.alias_name and wire
it into the three display sites (profile describe, list, show).
Credit: salvages the intent of #11506 by wss434631143, reimplemented on
current main against the post-#11506 custom-alias (--name/target) mechanism.
Tests: 6 new (profile-named, custom-name, none, unrelated-file rejection,
windows .bat strip, list_profiles surfacing). All 123 in test_profiles pass.
E2E verified against the real CLI for both custom and profile-named aliases.
credits.grant_spent is a one-time "your monthly grant is used up, you're now on
top-up" heads-up, but it was sticky — it camped the TUI status bar until the grant
refilled, so a user with healthy top-up saw "Grant spent · $990 top-up left"
indefinitely. Treat it like the usage-band notice: flash once, then clear on the
next prompt (startMessage). Depletion stays sticky (you actually can't make
requests). The Python `active` latch keeps the key, so it won't re-fire next turn.
* feat(tui): HERMES_DEV_CREDITS live-spend dev readout (L0 tracer for usage-aware credits)
L0 of the usage-aware-credits feature: a dev-only, env-gated tracer that
exercises the real header -> CreditsState -> TUI pipe end-to-end behind
HERMES_DEV_CREDITS, de-risking the L1/L5 build before the notice policy exists.
- agent/credits_tracker.py: CreditsState + parse_credits_headers (headers are
strings -> paid_access via == "true", never bool(); retain-last-known; only
subscription_micros may be negative; *_usd kept verbatim).
- run_agent.py: _capture_credits / get_credits_state / get_credits_spent_micros,
session-start baseline latch, + dev-gated "credits" capture log.
- agent/chat_completion_helpers.py: capture on the streaming response.
- agent/agent_init.py: init _credits_state + _credits_session_start_micros.
- tui_gateway/server.py: _get_usage emits dev_credits_spent_micros only when flagged.
- ui-tui appChrome.tsx / types.ts: cents delta status segment + "(dev credits)" banner.
Off by default; silent for normal users. Validated live against staging
(capture log delta matches the TUI segment). Throwaway consumer (readout/log/
banner); credits_tracker + the capture plumbing are the real feature foundation.
* test(credits): lock parser under 9-state matrix + harden validation (L2)
Add tests/agent/test_credits_tracker.py with 92 tests covering the 9-state
matrix (healthy, sub_90pct, grant_exhausted, purchased_only, tool_pool_free,
depleted, debt, missing, no_org) plus validation edge cases: version strict==1
with warn-once latch for v>1, bool-string trap (paid_access/tool_pool_gated_off
== "true"/"false", never bool()), half-pair subscription limit treated as
both-absent while parse succeeds, USD regex ^-?\d+\.\d{2}$, non-int micros
→ None, negative non-subscription micros → None, as_of_ms junk → None, zero
limit ZeroDivision guard.
Harden agent/credits_tracker.py to match the spec:
- Add tool_pool_micros/tool_pool_gated_off/from_header fields to CreditsState
- Add depleted property (== not paid_access, never remaining==0)
- Change used_fraction guard to key off subscription_limit_micros (the actual
denominator) not denominator_kind (metadata)
- Replace fail-soft _safe_int with a sentinel-returning variant; full validation
now returns None on any malformed field rather than silently defaulting
- Add module-level warn-once latch for version > 1
- Add USD regex validation; add denominator_kind allow-list check
- Parse x-nous-tool-pool-* prefix headers (not x-nous-credits-tool-pool-*)
* feat(credits): notice spine — AgentNotice + notice_callback/notice_clear_callback + TUI binding (L1)
L1 of usage-aware credits: the driver-agnostic notice delivery spine that L4's
policy will fire through and L5's TUI render will consume.
- agent/credits_tracker.py: AgentNotice dataclass (text/level/kind/ttl_ms/key/id;
kind defaults "sticky", kept TTL-expressive for a future config seam).
- run_agent.py: AIAgent gains notice_callback + notice_clear_callback slots and
_emit_notice / _emit_notice_clear emitters (swallow all callback errors — a
notice must never break the agent loop; no-op when unbound).
- agent/agent_init.py: thread both callbacks through init_agent.
- tui_gateway/server.py: bind both in _agent_cbs → notification.show / notification.clear
WS events (snake_case payload, matching the existing gateway-event convention).
- ui-tui/src/gatewayTypes.ts: notification.show / notification.clear arms on GatewayEvent.
- tests/run_agent/test_notice_spine.py: 15 tests (emitter fire + fail-open + no-op,
signature threading, TUI binding payload shape).
Messaging push is out of v1 (binds neither callback). CLI binding + the TUI render/
decode land with L4 (firing) and L5 (render) so turn-end flush is wired correctly.
* feat(credits): threshold reconciliation policy + tests (L4.1)
* feat(credits): wire threshold policy into capture + latch (L4.2)
After a fresh header parse, _capture_credits runs evaluate_credits_notices against
the agent's _credits_latch and emits the result — clears first, then shows (so a
recovered depletion clears before the "restored" success lands, and depleted wins
the latest-wins slot). Gated on a bound notice_callback: messaging (no callbacks)
still caches state for /usage but runs no policy. Parse stays fail-open (miss →
keep last-known); the eval/emit path warns on failure rather than swallowing, so a
depletion-notice bug can't vanish silently.
- run_agent.py: _capture_credits split into parse (swallow→miss) + policy (warn);
latch lazy-guarded (object.__new__ safety).
- agent/agent_init.py: init agent._credits_latch = {"active": set(), "seen_below_90": False}.
* feat(tui): render credits notices in the status bar (L5, Strategy B)
The TUI now renders the notification.show / notification.clear gateway events the
agent emits — a level-colored notice overrides the status/verb slot when not busy.
- Notice state machine on turnController (pendingNotice + dedicated noticeTimer +
show/clear/applyNotice/flushPendingNotice/clearNoticeState). createGatewayEventHandler
decodes the events and delegates.
- Render priority busy > notice > status (appChrome StatusRule); notice text rendered
verbatim (its glyph comes from the policy), shrinkable so it never clips model│ctx;
dev-credits banner + Δ segment preserved. UiState.notice is snake_case (matches wire).
- Busy-wins: a notice arriving mid-turn is held and flushed at the THREE turn-end sites
(recordMessageComplete / interruptTurn / recordError) — never idle(), which reset()
also calls (would leak across sessions); reset() clears instead.
- Dedicated noticeTimer (never statusTimer); TTL starts on visibility with an id-guard;
latest-wins cancels the prior timer; clear is key-matched (no-op on mismatch); a sticky
survives a turn (flush no-ops with no pending); session reset clears (no cross-session leak).
- 20 tests (handler/turnController logic incl. R3-C2 timer isolation + render priority).
* feat(credits): cold-start seed for new Nous sessions (L3)
A genuinely-new Nous session has no inference header yet, so seed credits state from
the authoritative GET /api/oauth/account snapshot at session start (in the new-session
branch of _restore_or_build_system_prompt — inline, since the on_session_start plugin
hook gets no agent reference). The seed runs the shared notice policy, so a session that
opens already depleted warns IMMEDIATELY rather than only after the first turn.
- Maps the nested account fields (paid_service_access → paid_access; total_usable /
subscription / purchased on paid_service_access_info; rollover on subscription), each
None-guarded; float dollars → micros via round(d*1e6), *_usd left "" (render formats
from micros — never synthesize a verbatim usd from a float).
- Magnitudes-only: no monthlyCredits on the endpoint → subscription_limit_* unset →
used_fraction None → no warn90 from the seed (% only once a header lands, per D-E).
- Provider-guarded to Nous; fail-open (any error leaves _credits_state None, never
blocks startup); paid_access unknown ⇒ True (never falsely depleted).
- run_agent.py: extracted the warm-path policy/emit block into a shared
_emit_credits_notices() so capture and the seed fire notices identically.
* feat(credits): /usage Nous credits magnitudes view + recovery trigger (L6)
Add Nous credit dollar magnitudes to /usage (subscription / top-up / total
+ rollover + renewal + portal CTA), magnitudes-only per v1 (no % until the
account endpoint exposes a denominator). Reuses the existing account-usage
render machinery via a new pure build_nous_credits_snapshot() that maps a
NousPortalAccountInfo to an AccountUsageSnapshot; no nous branch is added to
fetch_account_usage (keeps the per-provider boundary intact).
CLI /usage also doubles as a depletion-recovery trigger: a force_fresh
account fetch, kept in a SEPARATE local so it never clobbers the
header-sourced agent._credits_state (which alone carries used_fraction). If
paid access recovered while credits.depleted is latched and a notice
consumer is bound, it reuses agent._emit_credits_notices() to clear it.
Gateway /usage displays magnitudes only — messaging binds no notice
consumer, so it performs no recovery emit.
Fail-open throughout: any portal hiccup leaves /usage unaffected.
* refactor(credits): dedupe HERMES_DEV_CREDITS flag parse via shared helpers
The dev-flag truthy check was inlined in three places. Replace with the shared
utils.is_truthy_value (run_agent.py, tui_gateway/server.py — also drops a
redundant inline `import os`) and a hoisted DEV_CREDITS_MODE export in
ui-tui/src/config/env.ts (consumed by appChrome, which also stops recomputing the
env check on every render). Behaviour-preserving; identical truthy set.
* fix(credits): cut dead /usage recovery trigger + bound portal fetches (L6 review)
Adversarial review found the /usage depletion-recovery trigger dead AND broken:
the CLI binds no notice_clear_callback, the TUI runs /usage in a separate
slash-worker subprocess (its own agent/latch), and the no-clobber rule made it
evaluate stale paid_access anyway. Recovery already happens on the next inference
(warm path), so the trigger was redundant — remove it and stop the depleted
notice over-promising.
- cli.py: remove the dead recovery block; bound the /usage portal fetch with a
10s wall-clock timeout (ThreadPoolExecutor) like the per-provider fetch —
urllib's per-socket timeout is not a wall-clock guarantee.
- agent/credits_tracker.py: reword the depleted CTA to "run /usage for balance"
(no false recovery promise; /usage shows fresh magnitudes, sticky clears next turn).
- agent/conversation_loop.py: same wall-clock timeout on the cold-start seed fetch
so a stalled portal can't hang session startup; tidy its time import.
* chore(credits): dev notice-state fixtures (HERMES_DEV_CREDITS_FIXTURE)
Throwaway dev scaffolding to exercise the notice pipeline without real spend or
Redis seeding. Set HERMES_DEV_CREDITS_FIXTURE to a state name (healthy / sub_90pct
/ grant_exhausted / depleted / clear) or a file path whose contents name a state
(re-read each turn → flip states live for recovery testing). _capture_credits
injects the chosen CreditsState instead of parsing real headers and runs the
shared notice policy. Deletable with the rest of the HERMES_DEV_CREDITS scaffolding.
* feat(credits): /usage monthly-grant % gauge
The portal /api/oauth/account subscription block now carries monthly_credits
(the per-period grant allowance, the % denominator). The consumer parsed
monthly_charge but dropped monthly_credits, so /usage stayed magnitudes-only.
Capture monthly_credits into NousPortalSubscriptionInfo + _subscription_from_payload.
build_nous_credits_snapshot emits a Subscription usage window (real % used, routed
through the existing render machinery) when monthly_credits is a finite positive
denominator and credits_remaining is finite and <= cap; otherwise it degrades to
magnitudes-only (older portals, rollover-over-cap, or non-finite payloads).
Guards (adversarial-review-driven): reject non-finite operands (json.loads parses
bare NaN/Infinity by default → would render $nan + a false 100% used), reject
bools, guard div-by-zero (cap>0), and suppress the gauge when remaining > cap
(rollover spanning the period makes the cap a nonsensical denominator → the
$X-of-$Y detail would read as a contradiction). Debt (remaining<0) clamps to 100%.
Money rule preserved: the ratio + magnitudes are computed from numeric float
account fields via display formatting, never by parsing a server *_usd string
(there are none on these dataclasses).
13 gauge tests added (tests/agent/test_nous_credits_gauge.py).
* fix(credits): show /usage Nous block whenever a Nous account is present
/usage runs in a slash-worker subprocess whose resolved inference provider is
often not "nous" even when the user has a Nous account, so gating the Nous
credits block on (provider == "nous") hid it entirely — the account data was
fully available but never rendered.
Gate instead on "a Nous account is logged in": a cheap local auth-state lookup
(get_provider_auth_state('nous') has an access_token) decides whether to attempt
the portal fetch, regardless of which provider inference runs on. In the gateway
the block is also lifted out of the 'if provider:' scope so a Nous-credentialled
user with another (or no) resident inference provider still sees their balance.
Fail-open and the per-fetch wall-clock timeout are preserved.
* fix(credits): show /usage Nous block when there's no live agent (TUI slash-worker)
In the TUI, /usage runs in a slash-worker subprocess that resumes the session
WITHOUT building an agent (self.agent is None), so _show_usage early-returned
"(._.) No active agent" before ever reaching the Nous credits block — which is
agent-independent (a portal fetch gated on Nous auth-state). Extract the block
into _print_nous_credits_block() and run it at the no-agent / no-calls
early-returns too (returns True if it printed, so the fallback message only
shows when there's genuinely nothing).
Verified live against staging: the block + monthly-grant gauge now render in the
slash-worker /usage path (previously hidden). The plain CLI REPL + messaging
paths are unchanged (they have a live agent).
* feat(credits): escalating 50/75/90 usage bands (single status line)
Replace the lone 90%-used warning with three escalating bands (50 info, 75 warn,
90 warn) shown as ONE status-bar line: it displays the highest band the
subscription grant has crossed, replaces the line as usage climbs, steps back
down on recovery, and clears below 50%. No stacking, no per-turn churn.
Bands live in a tunable CREDITS_USAGE_BANDS list; the policy derives everything
from it. Single notice key (credits.usage) with a usage_band latch field so the
notice only re-emits when the band actually changes. The crossing gate
(seen_below_90) is preserved so a fresh live session that opens mid-range stays
quiet until it has been observed below the lowest band (cold-start primes it when
it wants an open-high warning). Denominator math unchanged: % = subscription
grant burn (cap - grant_remaining)/cap, clamped [0,1]; top-up never moves the %.
Migrated test_credits_policy.py to the new key + added TestUsageBands (climb,
step-down, recovery-clear, idempotent, inclusive boundaries).
* feat(credits): hydrate notices at session OPEN via shared seed (TUI + first-turn)
Notices previously only fired inside a conversation turn (first message), so a
session that opened already depleted / past a usage band showed nothing at
'ready'. Extract the cold-start seed into a shared seed_credits_at_session_start()
and call it (a) in the TUI/desktop agent build right after the notice callback is
wired (fires at 'ready', before any message) and (b) as the first-turn fallback in
conversation_loop. Idempotent (skips once _credits_state exists) and fail-open.
The seed now maps monthly_credits -> subscription_limit_micros +
denominator_kind='subscription_cap', so used_fraction is computable at seed time
and usage-band warnings (not just depletion) hydrate on open. Primes the crossing
latch so a session opening already in a band warns immediately. Degrades to
depletion-only when monthly_credits is absent (older portals).
Adds test_credits_cold_start.py covering open-at-band, depletion, debt, no-cap
degradation, and the shared seed (fires/idempotent/skips-non-nous).
* feat(credits): /usage monthly-grant % gauge + fixture support + TUI surfacing
agent/account_usage.py: build_nous_credits_snapshot emits a subscription %% gauge
when the portal supplies a positive, finite monthly_credits denominator with
remaining <= cap (guards reject NaN/Infinity and rollover-over-cap, which would
render $nan or a contradictory $X-of-$Y); degrades to magnitudes-only otherwise.
Adds shared nous_credits_lines() (auth-gated, wall-clock-bounded portal fetch) so
the CLI and TUI /usage render the same block, and _snapshot_from_credits_state()
so HERMES_DEV_CREDITS_FIXTURE drives /usage offline too.
TUI: session.usage RPC carries credits_lines (agent-independent) and the /usage
panel renders them regardless of API-call count or resume state — previously the
TUI's separate /usage implementation only showed token counts.
Money rule preserved: %% and magnitudes come from numeric float account fields via
display formatting, never by parsing a server *_usd string.
* feat(credits): CLI REPL inline notices (parity with TUI)
The plain CLI agent bound no notice callbacks, so credit notices were TUI-only.
Bind notice_callback/notice_clear_callback on the CLI AIAgent; _on_notice renders
a single level-colored line above the prompt (error red / warn yellow / success
green / info dim) via _cprint, and seed credits at session open so a depletion or
usage-band warning shows before the first message — the same hydration the TUI
got. _on_notice_clear is a no-op (the REPL prints lines, no persistent slot).
* test(credits): add sub_50pct + sub_75pct dev fixtures for the new usage bands
The fixture set jumped 10%% -> 90%%; add sub_50pct (uf 0.5 -> band 50 info) and
sub_75pct (uf 0.75 -> band 75 warn) so the new escalating bands are exercisable
via HERMES_DEV_CREDITS_FIXTURE across all three surfaces (notice, session-open
seed, /usage gauge).
* fix(credits): usage-band notice clears on next prompt (not sticky-forever)
A 50/75/90 usage heads-up was sticky and camped the status bar indefinitely. Clear
the visible credits.usage notice when a new turn starts (startMessage), so it shows
until your next prompt then yields. The server latch is unchanged, so it won't
re-nag at the same band — it only re-shows when the band actually changes (climb)
or clears when usage drops below the lowest band. Depletion stays sticky.
* refactor(credits): consolidate the /usage credits block behind nous_credits_lines()
The CLI (_print_nous_credits_block) and the messaging gateway (_handle_usage_command)
each re-implemented the auth-gate + portal fetch + render, and both bypassed the
dev-fixture short-circuit that only the TUI honored — so /usage ignored
HERMES_DEV_CREDITS_FIXTURE on the CLI and in chat. Route both through the shared
agent.account_usage.nous_credits_lines() helper: one fetch/render path, one auth
gate, and the fixture works on every surface (~60 fewer duplicated lines).
The gateway usage test recorded only the last asyncio.to_thread call; /usage now
dispatches both the account fetch and the credits fetch, so it records every call
and matches the account fetch by its provider arg.
* fix(credits): keep the /usage gauge type-safe and log its fail-open path
_is_finite_num is now a TypeGuard[float], so the type checker narrows the gauge
operands (monthly_credits / credits_remaining) and the magnitudes passed to
_fmt_usd through it — no more None-operand warnings on the arithmetic. Add a debug
breadcrumb on the nous_credits_lines portal-fetch fail-open so a dead /usage block
is diagnosable in agent.log without a dev flag.
* fix(credits): harden the header tracker — prod-leak gate, hot-path probe, fire-and-forget seed
- Prod-leak guard: dev fixtures (HERMES_DEV_CREDITS_FIXTURE) now also require
HERMES_DEV_CREDITS, so a stray fixture var can't surface fabricated balances on a
real account. Matches the documented run workflow (both vars set together).
- Hot-path probe: parse_credits_headers checks for the version sentinel header
before allocating a lowercased copy of the response headers — skips that work on
every non-Nous API call. Behaviour-identical and still case-insensitive.
- Fire-and-forget seed: the real portal fetch in seed_credits_at_session_start now
runs in a daemon thread, so a slow/unreachable portal never delays session "ready"
(previously blocked up to 10s). The dev-fixture path stays synchronous; the thread
re-checks idempotency before hydrating (a live header may land first).
- Diagnostics: debug breadcrumbs on the parse and seed fail-open paths so a crashed
parser / dead seed is distinguishable from a legitimate no-headers miss.
Cold-start tests set HERMES_DEV_CREDITS alongside the fixture to match the gate.
* test(tui): fix env-timing in the StatusRule dev-credits assertion
DEV_CREDITS_MODE is read once at module load (config/env), so mutating
process.env.HERMES_DEV_CREDITS inside the test couldn't flip it — the dev-banner
assertion only passed if the env was exported before vitest started, and failed in a
normal run. Move that assertion to a sibling file that mocks config/env with
DEV_CREDITS_MODE: true (scoped, no module-reset / React-identity hazard).
* test(credits): cover the dev-fixture /usage render and usage-band clear-on-prompt
- _snapshot_from_credits_state (the offline /usage renderer) had no direct test:
lock the gauge math, the verbatim *_usd magnitudes, the depletion line and the
fixture marker, plus the no-cap (no gauge) and None-state cases.
- turnController.startMessage had no test for clearing the credits.usage notice on
the next prompt while leaving credits.depleted sticky.
* feat(credits): deliver credit notices over messaging gateways
Bind notice_callback/notice_clear_callback on the per-turn gateway agent
so usage-band / depletion / restored notices reach Telegram/Discord/Slack/
etc. Previously the messaging gateway bound neither callback, so the agent's
_emit_credits_notices early-returned and a chat user crossing a band got
nothing unless they ran /usage manually.
- render_notice_line(): AgentNotice -> single plaintext line (level glyph +
text), plaintext-only so it renders uniformly without per-platform escaping.
Fail-soft on malformed/empty notices.
- Standalone push for every notice (messaging has no persistent status bar):
route through the shared _deliver_platform_notice rail (honors private/
public delivery + thread metadata), scheduled onto the gateway loop via
safe_schedule_threadsafe from the agent's sync worker thread — same pattern
as _status_callback_sync.
- The fired-once latch lives on the cached (reused-in-place) agent and
persists across turns, so a band crosses once -> one push, no per-turn
re-nag. Re-fires only after idle-eviction rebuilds the agent (a reminder).
- Recovery ('Credit access restored') rides the show path (emitted as a
success notice, not a clear). notice_clear_callback is a no-op: a sent
platform message can't be cleanly retracted.
Tests: render glyph/levels/fail-soft + public/private delivery seam through
_deliver_platform_notice + no-adapter no-op.
* fix(credits): don't double the glyph on messaging notices
render_notice_line prepended a per-level glyph, but the notice policy already
bakes the glyph into the text (and the TUI + CLI render it verbatim) — so every
credit notice over messaging came out doubled ("⚠ ⚠ Credits 90% used",
"⛔ ✕ Credit access paused"). Emit the text verbatim instead; drop the now-dead
level→glyph map.
The render tests fed glyph-less text (and the success case only checked
startswith), so the doubling slipped through. Rework them around the verbatim
contract and add an end-to-end regression that runs real evaluate_credits_notices
output through render_notice_line and asserts the line is returned unchanged.
Switching the main model never touches auxiliary slot pins (they're
independent, sticky per-task overrides). A user who switches main away
from a now-unpaid provider keeps paying 402s on every background aux call
until they manually reset those pins — silently, with no UI signal.
- /api/model/set scope:'main' now returns stale_aux: slots still pinned
to a provider different from the new main (additive field).
- Desktop Model Settings shows a switch-time notice after Apply AND a
persistent banner when any loaded aux slot mismatches the main provider,
both wired to the existing 'Reset all to main' action.
- Never auto-clears pins — a dedicated cheaper aux model is a legitimate
config; surface-and-offer instead of nuking.
- Fixes a stale pre-existing assertion in the panel test (main model now
renders via selectors, not a standalone label).
Follow-up on the cherry-picked content-block fix. _extract_output_tail
(the live subagent overlay) still used crude str(content), which renders
a "[{'type': 'text'...}]" blob and — worse — mislabels a block-wrapped
"Error: ..." result as is_error=False. Route it through the same
_stringify_tool_content helper so error detection and previews work at
both consumer sites.
- delegate_tool.py: _extract_output_tail uses _stringify_tool_content
- tests: add _extract_output_tail content-block test (error detection +
clean preview)
- release.py: AUTHOR_MAP entry for randomsnowflake (CI gate)
The native macOS About panel showed the Electron package.json version
(e.g. 0.15.1) while the status bar showed the real Hermes version
(0.16.0). setAboutPanelOptions() set applicationName + copyright but
omitted applicationVersion, so macOS fell back to app.getVersion() =
package.json, which drifts (release.py's desktop lockstep bump didn't
land for 0.16.0).
resolveHermesVersion() already reads the live version from
hermes_cli/__init__.py and was built 'so the desktop About panel shows
the real Hermes version' per its own comment, but was never wired in.
- Seed applicationVersion: resolveHermesVersion() at module load.
- Replace the macOS About menu item's role:'about' with a click handler
(showAboutPanelFresh) that re-resolves the version on every open, so an
in-place `hermes update` is reflected without an app restart.
* fix: respect disabled auto-compaction on context overflow
Port from anomalyco/opencode#30749.
When compression.enabled is false, NO automatic compaction trigger may
fire. The proactive token-threshold paths (preflight + post-response
should_compress gate) already honoured the setting, but the three
provider-overflow recovery paths in the agent loop — long-context-tier
429, 413 payload-too-large, and context-overflow — called
_compress_context() unconditionally, silently compressing and rotating
the session against the user's explicit choice.
Add a single guard at the top of the overflow-recovery dispatch: when
compression is disabled and the error is one of those three overflow
classes, surface a terminal error (compaction_disabled: True) telling the
user to /compress manually, /new, switch to a larger-context model, or
reduce attachments. Manual /compress (force=True) is unaffected — it never
enters this loop.
Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't
compress when disabled; control case still compresses when enabled).
Existing overflow-recovery tests updated to enable compaction explicitly
(they verify the recovery fires); fixture defaults flipped to True to
match production (compression.enabled defaults to True).
* fix(dashboard): populate cron delivery dropdown from configured platforms
The dashboard cron-create/edit dropdown hardcoded five delivery options
(local, telegram, discord, slack, email), so users on Matrix — or any
other backend-supported platform — had no way to pick their channel even
though the cron scheduler delivers to all of them. It also offered
Telegram/Discord/etc. to users who never set those up.
- cron/scheduler.py: add cron_delivery_targets() — the single source of
truth. Intersects gateway-configured platforms with cron-deliverable
ones and reports whether each platform's home channel is set.
- web_server.py: GET /api/cron/delivery-targets exposes that list (+ the
implicit local option) to the dashboard.
- CronPage.tsx: both modals render options from the endpoint. Configured
platforms missing a home channel still appear, annotated "set a home
channel first" (option B), so the user knows what to fix. Edit modal
preserves a job's current target even if it's no longer configured.
Local-only state shows a "configure a platform under Channels" hint.
Validation: scheduler + endpoint E2E'd with a Matrix gateway (home set
and unset); 5 new tests; tests/cron + tests/hermes_cli/test_web_server
green (366 passed).
The inline steer note used a ⏩ emoji. Emit a structured `steer:<text>`
system note and render it in SystemMessage as a codicon (compass) row —
same style as slash-status output. No emoji in the transcript.
Cmd/Ctrl+Enter now steers when there's a steerable draft and is a no-op
otherwise — it never falls through to a send, so the shortcut can't
surprise-send. Plain Enter keeps its role (queue while busy, send when idle).
A steer rides inside a tool result (the only role-alternation-safe slot
mid-turn), so a bare "User guidance:" line reads as untrusted tool content —
well-behaved models refuse it as suspected prompt injection (observed live:
"I only follow instructions from you directly, not ones injected through
command results").
- Wrap steers in a bounded, self-describing [OUT-OF-BAND USER MESSAGE] marker
(prompt_builder.format_steer_marker), shared by both drain sites.
- Add STEER_CHANNEL_NOTE to the core system prompt so the model expects this
exact marker and trusts it as a genuine user message — while still ignoring
lookalikes buried in tool/web/file output. Static text → byte-stable prompt,
no prompt-cache regression; gated on the agent having tools.
- Desktop: steer ack is now an inline transcript note (⏩ steered · …) instead
of a toast.
Marker is intentionally static (not a per-session nonce) to honor the
byte-stable system-prompt caching policy; nonce hardening noted as follow-up.
The desktop app could only queue while busy — `/steer` was in the palette
but had no first-class affordance, so the "nudge the agent mid-turn without
interrupting" lane was effectively unreachable.
Add a steer action to the composer: while busy with a text-only draft, a
steering-wheel button (and Cmd/Ctrl+Enter) injects the text into the live
turn via the `session.steer` RPC — the gateway folds it into the next tool
result so the model reads it on its next iteration. Plain Enter still queues.
steerPrompt returns false when the gateway has no live tool window (or the
RPC errors), and the composer re-queues the words so nothing is lost — the
same safety net as a plain queue.
Force-sending a queued message (double-empty-enter, or interrupt-mode
submit) flipped busy→false optimistically, so the queue drain raced the
still-unwinding turn: duplicate user bubble, a stray "queued: …" note, and
the cancelled turn's "Operation interrupted…" reply leaking in.
interruptTurn gains `keepBusy`: hold busy until the gateway's real settle
edge (message.complete, suppressed while interrupted), which drains the
queued message exactly once — desktop "send now" parity. The interrupt
paths now queue + interrupt instead of optimistically sending.
Builds on @naqerl's arrow up/down history (previous commit), making
ArrowUp do the right thing when a queue exists.
ArrowUp/ArrowDown priority:
1. Editing a queued turn → walk older/newer through queued entries,
saving each edit; ArrowDown past the newest exits and restores the
pre-edit draft.
2. Empty composer + queued turns → ArrowUp opens the newest queued entry
for editing (the row's pencil), so Enter saves it back to the queue
instead of firing a new message — the gap the history nav had alone.
3. Otherwise → sent-message history recall (unchanged).
Also: Esc cancels an in-progress queue edit (else interrupts).
Cleanups on the integrated code: fold the browse-state reset into the
existing session-change effect (drop the duplicate ref+effect); reuse
loadIntoComposer for history recall; sort imports; add curly braces +
the runDrain sessionId dep (lint).
* fix(desktop): make composer message queue reliable
The queue felt 'dumb' because of three real bugs:
1. Drained-after-interrupt sends went silent. cancelRun sets
interrupted:true and nothing reset it; submitPromptText's optimistic
seed preserved it, and the message stream drops every delta while
interrupted. So Send-now-while-busy and any interrupt+drain submitted
the next turn into a muted session. Fix: a fresh submit is a new turn —
seed interrupted:false.
2. Back-to-back queue drains stalled. The drain fires on the busy->false
settle edge, but busyRef (synced from the busy store by a separate
effect) can still read true on that same edge, so the drained send hit
the busy guard, returned false, and the entry was never removed. Fix:
fromQueue sends bypass the busyRef guard (the queue drain lock
serializes them); the user path keeps the guard.
3. Double-enter-to-interrupt killed single non-queue turns. The hidden
450ms timer meant a natural double-tap after sending stopped the agent.
Fix: empty Enter while busy is a no-op; interrupting is explicit —
Stop button or Esc.
Also: clean stop (no [interrupted] marker), Send-now works while busy
(promote + interrupt + auto-drain), settle on the interrupted completion
path. Adds regression tests and unblocks the prompt-actions suite by
completing its stale @/hermes mock.
* fix(desktop): float the queue panel as an overlay so the chat doesn't resize
The queue list rendered in-flow inside the composer root, so its height
fed --composer-measured-height (the composer rect drives the thread's
bottom padding + last-message clearance). Queuing a message grew that
rect and the whole chat visibly resized.
Anchor the panel out of flow above the composer (absolute bottom-full,
capped at 40vh with internal scroll). It no longer contributes to the
measured height, so the thread layout stays put and the list overlays the
(already faded) chat. Still collapsible via the panel's own
disclosure header.
* fix(desktop): queue panel collapsed by default + shared border with composer
- Default the queue disclosure to collapsed (compact 'N queued' pill)
instead of expanded.
- Drop the gap and merge the panel into the composer: square bottom
corners, no bottom border/radius, and overlap down by the Root's pt-2
(-mb-2) so the panel's borderless bottom lands on the composer surface's
top border — one continuous bordered shape.
* style(desktop): tighten queue panel padding
* style(desktop): trim queue-ux comments to house style
* style(desktop): drop 'Cursor' references from comments
Slack's native-slash manifest hard-caps at 50 (_SLACK_MAX_SLASH_COMMANDS).
Adding the /version canonical claims a pass-1 slot, so the lowest-priority
pass-2 alias (/q for /quit) clamps off the end. /q stays reachable via
/hermes q. Surviving aliases (/btw /bg /reset) still prove alias parity.
The new IME repro test has two it() blocks but the desktop suite registers
no global testing-library auto-cleanup, so the first render() leaked its
editor into the second test and getByTestId('editor') matched two nodes.
Add afterEach(cleanup) so each case renders into a fresh DOM.
DOM repro that drives compositionstart -> input(preedit) -> compositionend with
no trailing input event and asserts the composer payload (send button) becomes
visible for committed CJK/IME input. Regression guard for #39614.
Typing committed multi-character IME text (e.g. Chinese "你好", and equally
Japanese/Korean or any IME-composed script) left the send button hidden until
an unrelated edit. Input events during composition carry uncommitted preedit
text and are intentionally skipped; the code assumed a trailing input event
after compositionend would deliver the finalized text, but Chromium does not
reliably emit one on Windows IMEs. The committed text therefore never reached
composer state, so `hasComposerPayload` stayed false and the send button stayed
hidden (deleting a char fired a non-composition input that finally synced it).
Flush the live editor text into composer state in onCompositionEnd. Extract the
shared sync into flushEditorToDraft so input and compositionend both update
state.
Fixes#39614
The salvaged helper exported serializeJsonBody but main.cjs still inline-built
the request body, leaving the export dead and the test decoupled from the real
path. Use it at the fetchJsonViaOauthSession site so the helper's coverage
exercises production body construction. Byte-identical output.
The error guard in _search_with_rg/_search_with_grep was unreachable and,
if it had fired, would have discarded valid results.
Two root causes:
1. Unreachable. Both methods pipe the search through `| head` with no
pipefail, so the pipeline reported head's exit code (0), masking rg/grep's
error code (2). The guard never fired. Worse, because _exec merges stderr
into stdout (stderr=subprocess.STDOUT), the error text was then parsed as
bogus match lines instead of being surfaced — the user got garbage matches
with no indication the search failed.
2. Latent results-dropping. The original `not result.stdout.strip()` check
was always False on error (error text lives in stdout), and the
`hasattr(result, 'stderr')` branch was dead code (ExecuteResult has no
stderr field). A naive broadening to `exit_code == 2` would have nuked
real matches whenever rg/grep also hit a non-fatal error (e.g. one
unreadable file in a tree that otherwise matched), which both tools signal
with exit 2.
Fix:
- Prefix the piped command with `set -o pipefail` so rg/grep's real exit
status propagates. rg exits 0 on a truncating head; grep exits 141
(SIGPIPE), so the strict `== 2` guard ignores truncated-success.
- Add _split_tool_diagnostics() to separate tool diagnostics from match
output by tool prefix and output shape. Diagnostics never become matches;
on a hard error they are the message to surface.
- Only surface an error when exit==2 AND no usable match payload remains, so
partial errors keep their real matches.
Tests: tests/tools/test_search_error_guard.py drives both methods through the
real local backend (hard error surfaced, partial error keeps matches,
truncation no false error, files_only/count exclude diagnostics) plus unit
coverage for the splitter.
Supersedes #39710.
The terminal `hermes model` wizard (_model_flow_named_custom) always
live-probed a custom provider's /models endpoint, ignoring the configured
`models:` list. For plans whose endpoint exposes a large catalog (e.g. Baidu
Qianfan Coding Plan returns 100+ models for a 2-3 model plan) the picker
flooded with models the user can't use.
This wires `discover_models` (and the `models:` list) through
_named_custom_provider_map into the flow and honors `discover_models: false`
the same way the slash-command picker (model_switch.py sections 3 & 4) does:
- Default stays True — live probe, no behaviour change.
- discover_models: false → use the configured `models:` list verbatim,
skip the probe (string 'false'/'no'/'0' normalised to False).
- If the probe is on but returns empty, fall back to the configured list
instead of forcing manual entry.
Closes#18726
Section 3 (user `providers:`) already honors `discover_models: false` to
skip live /models discovery and keep the explicit `models:` list. Section 4
(`custom_providers:` list) did not — `should_probe` ignored the field, so any
grouped custom provider with an api_key always had its configured subset
replaced by the full live /models catalog.
This adds the same `discover_models` support to section 4:
- Default True — no behaviour change for existing configs.
- `discover_models: false` keeps the explicit `models:` list even when an
api_key is present.
- String values ("false"/"no"/"0") are normalised to False, matching
section 3.
- If any entry in a grouped endpoint opts out, the whole group opts out.
Use case: endpoints that expose a full aggregator catalog via /models but
only serve a configured subset.
Salvaged from #29810 — rebased onto current main. The PR's other change
(`key_env` resolution in section 4) landed independently in commit aa283d1e4
(custom provider picker credential isolation), so only the discover_models
portion is carried here.
Co-authored-by: ohMyJason <42903577+ohMyJason@users.noreply.github.com>
Follow-up to #39921. That PR scoped session.resume + prompt.submit to a
session's profile, but a BRAND-NEW chat (session.create) under a non-launch
profile was still built and persisted against the dashboard's launch profile.
Two visible symptoms in app-global remote mode (one dashboard, many profiles):
1. "who are you" in profile S replied as the launch (default) profile/agent —
the agent was built with the launch HERMES_HOME, so config/SOUL/identity
came from the wrong profile.
2. "session not found" on later resume — _ensure_session_db_row persisted the
row into the launch profile's state.db via _get_db(), so the session lived
in the wrong db, the unified list mis-tagged it (it showed up under BOTH
profiles), and resume routed to the wrong one.
Fix — carry the owning profile through the create path too:
- session.create accepts an optional `profile`; resolves its home and stores
`profile_home` on the session (alongside what resume already set).
- _start_agent_build binds that profile's HERMES_HOME while building the agent
(config/skills/model/identity resolve to it) and hands the agent the profile's
state.db so turns persist there.
- _ensure_session_db_row writes the row into the profile's state.db, not the
launch db — fixing the duplicate row + mis-tag + resume 404.
- desktop sends the new-chat profile on session.create.
None/launch profile → unchanged (single-profile and per-profile-remote setups
take the same path). Verified live against a one-dashboard / multi-profile
remote: a new chat under `work` builds as work's agent (correct SOUL identity),
persists ONLY to work's state.db (launch db stays empty), the unified list tags
it `work` exactly once, and it resumes cleanly.
tests/test_tui_gateway_server.py: _make_agent mocks updated for the session_db
param added in #39921's build path.
Introduce a lightweight React context-based i18n layer for the desktop
app and translate the UI into Simplified Chinese.
- New apps/desktop/src/i18n module: typed Translations interface, en + zh
locale tables, I18nProvider/useI18n, localStorage-persisted locale
(defaults to English), and language endonym metadata for the picker.
- Wire I18nProvider at the app root in main.tsx.
- Refactor 24 desktop screens/components to read strings from the `t`
object instead of hard-coded English.
- Add a unit test for the i18n context.
* fix(desktop): cross-profile session history in app-global remote mode
#39894 made remote-profile sessions first-class for PER-PROFILE remote
overrides. But the common setup — Settings → Gateway → "All profiles" → Remote
— writes app-GLOBAL remote mode (connection.json top-level mode:'remote', empty
profiles map), which the intercept didn't recognize. Switching to a non-launch
profile then 404'd every session read, so no history showed for it.
In global remote mode a SINGLE backend serves every profile via ?profile= (it
reads each profile's state.db off the remote host's own disk — verified: one
dashboard returns /api/profiles and /api/profiles/sessions?profile=all across
all profiles). The fix: when no per-profile override matches but global remote
mode is active, route per-session reads/mutations to that one backend and KEEP
the ?profile= param so it opens the right state.db (instead of bailing to the
local path and dropping the profile scope).
- new globalRemoteActive() — true for connection.json mode:'remote' or the
HERMES_DESKTOP_REMOTE_URL env override.
- per-session branch: per-profile override → route sans profile (own db);
global mode → route to the single backend WITH ?profile= preserved.
- unified list is unchanged in global mode: it already passes through to the one
backend, which aggregates all profiles natively.
Verified live against a one-dashboard / multi-profile remote (Austin's topology):
cross-profile transcript reads load (was 404), rename/delete route to the right
profile, unified list spans both profiles.
Known limitation (architectural, not fixed here): LIVE chat as a non-launch
profile still needs a per-profile dashboard on the remote — the dashboard binds
HERMES_HOME once at process start, so one global backend can't run an agent
turn as another profile. Session history/read/mutate now work regardless.
* fix(gateway): resume + chat any profile over one global-remote dashboard
The REST half of this branch made cross-profile session history visible in
app-global remote mode, but resume + chat still went over the WebSocket gateway,
which was hard-bound to the dashboard's launch profile. Resuming a non-launch
profile's session 404'd ("session not found") and sending spawned a new session
— because session.resume/prompt.submit had no profile concept and the live
agent + state.db were process-global to the launch profile's HERMES_HOME.
Make the WS gateway per-session profile-aware so ONE dashboard can serve every
local profile on its host (the app-global remote topology):
- session.resume accepts an optional `profile`. _profile_home() resolves that
profile's home on this host; resume opens THAT profile's state.db, binds its
HERMES_HOME (ContextVar override) while building the agent so config/skills/
model resolve to it, and passes the profile db to the agent so turns persist
to the right state.db. The owning profile_home is stored on the session.
- prompt.submit re-binds the stored profile_home for the turn thread (mid-turn
home reads — memory, skills — resolve to the resumed profile), reset in finally.
- _make_agent gains an optional session_db param (defaults to _get_db()).
- _load_cfg honors the home override (falls back to _hermes_home) so a resumed
profile loads its own config; cache keyed on resolved path.
- desktop: session.resume now sends the owning profile.
Omitted/launch profile → unchanged (single-profile and per-profile-remote setups
are byte-for-byte the same path). Verified live against a one-dashboard /
multi-profile remote: resuming a non-launch profile's session loads its history,
runs a real turn against THAT profile's home/env, and persists to its state.db.
tests/tui_gateway/test_protocol.py: _make_agent mocks updated for the new param.
Widens ViewWay's #20741 fix to the sibling config surface: a
custom_providers entry can pin its own output cap via max_output_tokens
(or max_tokens). _get_named_custom_provider now lifts it onto the
resolved runtime at all three return sites, and the gateway uses it as a
fallback only when the documented global model.max_tokens isn't set, so
the global key always wins.
Precedence: HERMES_MAX_TOKENS > model.max_tokens > provider
max_output_tokens > None. Closes the same #20741 truncation for users who
configure the cap per-provider rather than globally.
Picks up the intent of #19782 (alexcam1901), reimplemented to feed
ViewWay's max_tokens pipeline.
max_tokens set under model: in config.yaml was silently ignored.
The value was never read from config, never passed through
_resolve_runtime_agent_kwargs(), _resolve_turn_agent_config(),
or the session override path. Added it to all three code paths
so custom/Ollama endpoints receive the correct output cap.
Closes#20741
* fix(desktop): route remote-profile session reads to the owning remote backend
Per-profile remote hosts (#39778) wired the chat/resume socket to a profile's
remote backend, but session list + transcript reads still assumed every
profile's state.db is a local file the primary can open. For a remote profile
the local file is absent or stale, so the IDs the sidebar shows 404 the moment
resume runs against the remote -- the "session not found -> new session" bug.
Intercept the three session-read GETs in the hermes:api handler and route them
to the owning remote backend (which serves its own state.db natively):
GET /api/profiles/sessions -> splice each remote profile's real rows in
GET /api/sessions/{id}[/messages] -> read from the remote for remote profiles
No remote profiles configured -> untouched local fast path. A dead remote
contributes nothing rather than breaking the sidebar.
Verified end-to-end against a live remote backend: a remote-profile session
resumes from remote history and continues on the remote across turns (history
grows in place, no new session spawned).
* fix(desktop): route remote-profile session mutations + fix unified-list pagination
Follow-up to the read-routing fix: make remote-profile sessions fully
first-class, not just resumable.
Mutations (rename/archive/delete) went through the same hermes:api handler but
never carried the owning profile, so they hit the local primary's state.db --
which has no row for a remote session. Deleting/archiving/renaming a remote
session silently no-op'd or 404'd, and the row reappeared on next refresh.
- hermes.ts: setSessionArchived/deleteSession/renameSession take the owning
profile and pass it as request.profile so Electron routes to that profile's
backend (matching the read path). Callers now forward session.profile.
- main.cjs: generalize the intercept (read -> request) to also reroute
DELETE/PATCH on /api/sessions/{id} for remote profiles, stripping the profile
param (the remote serves its own state.db; no cross-profile semantics there).
- web_server.py: DELETE /api/sessions/{id} gains a profile param for parity with
GET/PATCH (local cross-profile delete).
Also fix the unified-list merge: it concatenated each remote's page onto the
primary's without re-windowing, so a limit=N request could return up to
N*(1+remotes) rows and report the primary's (stale) total. Now it over-fetches
limit+offset from each remote (from offset 0), re-sorts by recency, re-windows
to the page, and recomputes total/profile_totals from the remote counts.
Verified live against a remote backend: rename/archive/delete mutate the remote
db; page 1 windows to limit, profile_totals reflect remote counts, page 2 has no
overlap with page 1. tsc -b clean; connection-config tests pass.
Follow-up to the read-routing fix: make remote-profile sessions fully
first-class, not just resumable.
Mutations (rename/archive/delete) went through the same hermes:api handler but
never carried the owning profile, so they hit the local primary's state.db --
which has no row for a remote session. Deleting/archiving/renaming a remote
session silently no-op'd or 404'd, and the row reappeared on next refresh.
- hermes.ts: setSessionArchived/deleteSession/renameSession take the owning
profile and pass it as request.profile so Electron routes to that profile's
backend (matching the read path). Callers now forward session.profile.
- main.cjs: generalize the intercept (read -> request) to also reroute
DELETE/PATCH on /api/sessions/{id} for remote profiles, stripping the profile
param (the remote serves its own state.db; no cross-profile semantics there).
- web_server.py: DELETE /api/sessions/{id} gains a profile param for parity with
GET/PATCH (local cross-profile delete).
Also fix the unified-list merge: it concatenated each remote's page onto the
primary's without re-windowing, so a limit=N request could return up to
N*(1+remotes) rows and report the primary's (stale) total. Now it over-fetches
limit+offset from each remote (from offset 0), re-sorts by recency, re-windows
to the page, and recomputes total/profile_totals from the remote counts.
Verified live against a remote backend: rename/archive/delete mutate the remote
db; page 1 windows to limit, profile_totals reflect remote counts, page 2 has no
overlap with page 1. tsc -b clean; connection-config tests pass.
Per-profile remote hosts (#39778) wired the chat/resume socket to a profile's
remote backend, but session list + transcript reads still assumed every
profile's state.db is a local file the primary can open. For a remote profile
the local file is absent or stale, so the IDs the sidebar shows 404 the moment
resume runs against the remote -- the "session not found -> new session" bug.
Intercept the three session-read GETs in the hermes:api handler and route them
to the owning remote backend (which serves its own state.db natively):
GET /api/profiles/sessions -> splice each remote profile's real rows in
GET /api/sessions/{id}[/messages] -> read from the remote for remote profiles
No remote profiles configured -> untouched local fast path. A dead remote
contributes nothing rather than breaking the sidebar.
Verified end-to-end against a live remote backend: a remote-profile session
resumes from remote history and continues on the remote across turns (history
grows in place, no new session spawned).
* fix: respect disabled auto-compaction on context overflow
Port from anomalyco/opencode#30749.
When compression.enabled is false, NO automatic compaction trigger may
fire. The proactive token-threshold paths (preflight + post-response
should_compress gate) already honoured the setting, but the three
provider-overflow recovery paths in the agent loop — long-context-tier
429, 413 payload-too-large, and context-overflow — called
_compress_context() unconditionally, silently compressing and rotating
the session against the user's explicit choice.
Add a single guard at the top of the overflow-recovery dispatch: when
compression is disabled and the error is one of those three overflow
classes, surface a terminal error (compaction_disabled: True) telling the
user to /compress manually, /new, switch to a larger-context model, or
reduce attachments. Manual /compress (force=True) is unaffected — it never
enters this loop.
Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't
compress when disabled; control case still compresses when enabled).
Existing overflow-recovery tests updated to enable compaction explicitly
(they verify the recovery fires); fixture defaults flipped to True to
match production (compression.enabled defaults to True).
* perf(/model): prewarm picker provider-models cache in background
The no-args /model picker calls list_authenticated_providers(), which
fetches each authenticated provider's live /v1/models list serially. On a
cold or stale (>1h TTL) cache that blocks ~1.5s on the user's critical path
the first time /model is opened in a session.
Warm that exact path off-thread during the idle window right after the CLI
banner is shown: a once-per-process daemon thread runs
list_authenticated_providers() to populate provider_models_cache.json for
every authed provider. By the time the user types /model, the picker hits
the warm disk cache (~136ms vs ~1500ms).
Process-level Event guard (mirrors run_agent's _openrouter_prewarm_done)
ensures at most one thread per process; fully exception-isolated so an
offline/no-creds provider can never affect the session.
* docs: remove --include-desktop install instructions
Drop the --include-desktop curl one-liner from the desktop app docs.
The flag remains in scripts/install.sh; these docs now point to the
desktop installer / website and the 'hermes desktop' path instead.
* docs: remove --include-desktop from install docs
Drop the redundant 'Hermes Desktop installer on Linux' block (which
used --include-desktop) from quickstart, installation, and index docs.
The website installer covers macOS/Windows desktop; the CLI-only path
covers Linux. Removes the flag from all user-facing docs.
* fix: respect disabled auto-compaction on context overflow
Port from anomalyco/opencode#30749.
When compression.enabled is false, NO automatic compaction trigger may
fire. The proactive token-threshold paths (preflight + post-response
should_compress gate) already honoured the setting, but the three
provider-overflow recovery paths in the agent loop — long-context-tier
429, 413 payload-too-large, and context-overflow — called
_compress_context() unconditionally, silently compressing and rotating
the session against the user's explicit choice.
Add a single guard at the top of the overflow-recovery dispatch: when
compression is disabled and the error is one of those three overflow
classes, surface a terminal error (compaction_disabled: True) telling the
user to /compress manually, /new, switch to a larger-context model, or
reduce attachments. Manual /compress (force=True) is unaffected — it never
enters this loop.
Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't
compress when disabled; control case still compresses when enabled).
Existing overflow-recovery tests updated to enable compaction explicitly
(they verify the recovery fires); fixture defaults flipped to True to
match production (compression.enabled defaults to True).
* fix(completion): remove /model <arg> autocomplete from CLI/TUI
The TUI frontend already suppressed /model argument completion in favor of
the two-step ModelPicker (useCompletion.ts), but the CLI prompt_toolkit
completer and the gateway-backed complete.slash RPC (TUI + desktop) still
emitted model aliases and probed LM Studio on every keystroke.
Drops the /model branch in SlashCommandCompleter.get_completions, the
_model_completions method, and the LM Studio probe/cache helper that only
fed it. Command-name completion (/mod -> model) and sibling arg completers
(/skin, /personality) are untouched. Removes the now-dead TestModelTabCompletion
tests.
The in-app updater (Hermes-Setup --update) runs `hermes update`, which lazily
imports the freshly-pulled modules — but the dependency-install step runs the
already-in-memory PRE-pull code for one invocation. When a release changes an
updater-path contract across that boundary, the FIRST update on the parked
population crashes even though the fix is already on disk.
Concretely this is #39780's `_UvResult`: its `__iter__` yields (path, bool), so
Windows `subprocess.list2cmdline([uv_bin, "pip", ...])` injects the bool and
dies with `TypeError: sequence item 1: expected str instance, bool found`
(fixed in #39820). A parked Windows user clicking Update pulls #39820 to disk,
then still crashes on the in-memory pre-merge module; only the SECOND click runs
clean. Field repro: ryanc's bootstrap.log (2026-06-05 12:41:41).
Fix: when the first `hermes update` exits non-zero (and it isn't the
concurrent-instance guard, exit 2, which a retry can't fix), retry once
automatically. The retry loads the now-current module from the start and
succeeds — so the parked user gets a working one-click update instead of a
scary crash + manual second attempt.
Verified: cargo check clean.
* fix(desktop/windows): stop racing our own backend during in-app update
The Windows in-app update (Update button -> hermes-setup.exe --update handoff)
bricked because it raced a still-locked hermes.exe: the desktop quit
fire-and-forget without reaping its backend child + grandchildren, so when
the updater ran `hermes update`, the venv shim was still open. The quarantine
rename then failed, uv's `pip install -e .` hit "Access is denied", the git
path bailed to a full ZIP re-download, and the deps still couldn't write the
locked shim -- leaving a half-applied install. macOS is fine because it never
blocks REPLACE on a running executable.
Three coordinated fixes restore Mac-style parity (click Update -> progress ->
relaunch, no terminal):
A. Desktop (main.cjs): before spawning the updater, releaseBackendLockForUpdate()
tree-kills the primary + pool backends (taskkill /T /F on Windows, to catch
REPL/pty/gateway grandchildren that SIGTERM misses) and polls the venv shim
until it is actually writable (bounded 15s) -- so the lock is gone before we
hand off. Also fixes resolveHermesCliBinary to use venv\Scripts\hermes.exe on
Windows.
B. Updater (update.rs): wait_for_venv_free no longer "proceeds anyway" on
timeout -- it force-kills any lingering hermes.exe (excluding itself) and
re-checks, so a straggler can't doom the install.
C. Updater (update.rs): pass --force to `hermes update`. By contract the desktop
has exited + waited, and the wait force-kills stragglers, so the running-exe
guard would only produce a false "Hermes is still running" dead-end.
Verified: node --check on main.cjs, cargo check on the updater (clean), and the
Windows-gated taskkill body type-checks standalone. Field repro: ryanc's
update.log (manual + handoff both hit the same lock cascade).
* review: scope backend kill+wait to Windows; drop meaningless POSIX pgid kill
PR #39780 made ensure_uv() return a _UvResult — a str subclass whose
__iter__ yields (path, fresh_bootstrap) so old `uv_bin, fresh = ensure_uv()`
call sites survive the update boundary. That trick is unsafe on Windows.
The dependency installer passes uv straight into the command list
(`[uv_bin, "pip", "install", ...]`). On Windows, subprocess serializes argv
via subprocess.list2cmdline, which iterates every entry *as a string*
(`for c in arg`). Because _UvResult overrides __iter__, that iteration yields
(path, fresh_bootstrap) instead of characters, injecting the bool into the
command line and crashing the first update with:
TypeError: sequence item 1: expected str instance, bool found
This bites the common single-assignment caller (`uv_bin = ensure_uv()`) on
its first update after #39780: the freshly pulled _UvResult flows into the
old in-memory call site and into the argv. Reported in the field on a
~10-commits-behind Windows install.
A single return value cannot satisfy both legacy 2-target unpacking and
Windows char-iteration — both use the iterator protocol with contradictory
results. So gate the wrapper to POSIX: Windows returns a plain str/None
(the historical, subprocess-safe contract). POSIX keeps _UvResult and the
#39780 update-boundary fix.
Tests: list2cmdline canary proving _UvResult breaks Windows, plus Windows
returns-plain-str and POSIX dual-contract coverage.
Two switch-time regressions from the multi-profile rail work:
- "Session not found" (4007): pruneSecondaryGateways idle-reaps a
non-active profile's backend; switching back respawns a *fresh*
backend that mints new runtime ids, but runtimeIdByStoredSessionId is
never pruned. resumeSession's cache fast-path then makes a dead runtime
id active and returns, so session.usage + the next prompt 404. Probe
the cached id; on rejection drop the stale mapping and fall through to
a full resume that rebinds a live id.
- "Forgets the LLM setting": $currentModel is a nanostore set only by
refreshCurrentModel (gatewayState->open, etc). A swap fires
invalidateQueries() (react-query only) and keeps the socket 'open', so
the model/pill kept showing the previous profile. Re-pull both when
$activeGatewayProfile changes.
* feat(desktop): per-profile remote gateway hosts
Profile switching silently failed whenever the desktop was connected to a
remote backend: the rail routed non-active profiles to a local pool backend,
but spawnPoolBackend hard-threw "Profiles are unavailable when connected to a
remote Hermes backend", and the renderer swallowed the error into an infinite
reconnect backoff while still marking the profile active. Remote was also a
single app-global setting, so there was no way to give a profile its own host.
Add per-profile remote hosts so each profile can point at its own backend:
- connection.json gains a validated `profiles` map; profileRemoteOverride()
(pure, unit-tested) selects an explicit per-profile remote.
- resolveRemoteBackend(profile) precedence: per-profile override → env override
→ global remote → local spawn. spawnPoolBackend now connects to a profile's
remote (no local child) instead of throwing; startHermes resolves the primary
profile's remote.
- coerce/sanitize connection config are scope-aware (global vs named profile)
and preserve each other's entries; IPC get/save/apply/test thread an optional
profile. Per-profile apply drops only that profile's pool backend.
- Settings → Gateway adds an "Applies to" scope selector reusing the existing
URL/token/OAuth/test UX per profile.
Tests: connection-config pure suite (+6) and desktop platform suite pass;
tsc/eslint/vitest clean.
* refactor(desktop): DRY per-profile remote helpers
Share connectionScopeKey + normAuthMode from connection-config.cjs (drop the
main.cjs copy), collapse the scope/auth ternaries, route the env remote through
buildRemoteConnection, and fold the duplicated remote-block validation into
buildRemoteBlock. No behavior change; pure suite + live E2E still green.
* fix(update): make ensure_uv() survive the update boundary (no first-run crash)
`hermes update` runs the `ensure_uv()` call site from the old, already-imported
`hermes_cli.main` against the *freshly pulled* `managed_uv` (managed_uv is only
ever lazily imported, so it loads from disk post-pull). `ensure_uv()`'s return
arity flipped from a single path string to `(path, fresh_bootstrap)` (4df280d51)
and back to a single string (fb853a178). Installs parked on a 2-tuple release
unpack `uv_bin, fresh_bootstrap = ensure_uv()` against the new single-value
module and crash the first update with
`ValueError: not enough values to unpack (expected 2, got 1)` — inside the
dependency-install step, *before* the PR #39763 subprocess hand-off can run.
Return a `_UvResult` (a `str` subclass) that is usable as the bare path AND
unpackable as `(path|None, fresh_bootstrap)`. Missing uv is `""` (falsy) instead
of `None` so legacy 2-target call sites can unpack a failure without raising,
while `if not uv_bin` keeps working for single-value callers. fresh_bootstrap is
always False (the rebuild-venv path it gated was scrapped in fb853a178).
* docs(update): correct the verified error string + mechanism for ensure_uv()
A hermetic repro (old 2-target call site vs the freshly-pulled single-value
module) shows the first-update crash is exactly the string from PR #39763's
report: `ValueError: too many values to unpack (expected 2)` — not "not enough".
The returned path is a plain `str`, which is iterable, so `uv_bin, fresh =
ensure_uv()` walks its characters; the failure path's `None` return raises
`TypeError: cannot unpack non-iterable NoneType`. Both are fixed by `_UvResult`.
Comment/test wording updated to match; no behavior change.
* Revert "fix(update): require managed marker before destructive clean"
This reverts commit c8e80cd0bf.
* Revert "fix(update): stop stash/restore from clobbering desktop source on managed clones (#38542)"
This reverts commit 8a19884bf3.
* chore(install): keep npm ci desktop-build fix after stash revert
The destructive-clean reverts (#38542/#39568) pulled the desktop
workspace install back to bare `npm install`. The npm ci -> npm install
fallback is orthogonal build-correctness (avoids the Windows
workspace-hoisting flake where install reports up-to-date against a
stale marker while node_modules is empty, breaking tsc -b). Preserve it.
* feat(update): settable stash-or-discard for non-interactive local changes
Adds updates.non_interactive_local_changes (stash | discard, default
stash). Governs ONLY non-interactive updates (desktop/chat app, gateway,
--yes) — interactive terminal updates always stash-and-ask, unchanged.
- config.py: new key under existing updates section; _config_version 26->27.
- main.py: _cmd_update_impl detects non-interactive (gateway/--yes/no-TTY),
reads the setting; new _discard_stashed_changes() drops the stash
(stash-and-drop, never reset --hard/clean -fd, so ignored paths survive).
Post-pull restore site branches on it; the bail-out and up-to-date
restores always preserve work.
- web_server.py + apps/desktop settings: exposes it as a stash/discard
select (Advanced section, In-App Update Local Changes).
- docs + tests (discard drops, stash restores, interactive ignores setting,
missing section defaults to stash).
* fix(install.ps1): stash/restore instead of reset --hard on Windows update
The PR reverted the destructive update path to stash/restore everywhere
except scripts/install.ps1, whose managed-clone update path still ran
`git reset --hard HEAD` before checkout — silently destroying agent-edited
tracked source on Windows (the same #38542 data-loss class the PR fixes).
- Replace `git reset --hard HEAD` with stash-before-checkout +
restore-after-checkout, mirroring install.sh. Untracked files are
included so agent-created dirs (e.g. tinker-atropos/) survive.
- Keep `core.autocrlf false` (it prevents the phantom CRLF dirt that made
the stash necessary; it's also load-bearing for a clean restore).
- Wrap all three checkout modes (Commit/Tag/Branch); Branch case now uses
`git pull --ff-only` so local commits are never clobbered.
- Only prompt to restore when a real console is attached (UserInteractive
+ non-redirected stdin/stdout + ConsoleHost); the desktop Update button
and bootstrap have no usable console, so they default to restore and
never hang on Read-Host.
- On restore conflict or a failed update, the stash is preserved with
recovery instructions — work is never silently dropped.
Validated on Windows (PowerShell 5.1, git 2.54): AST parse clean;
E2E non-conflicting restore applies+drops cleanly with ignored paths
(node_modules) untouched; conflicting restore preserves the stash.
---------
Co-authored-by: alt-glitch <balyan.sid@gmail.com>
When the agent's reply references a deliverable file path that does not
exist on disk, extract_local_files dropped it from native delivery with
no log line — the most common reason a promised file never arrives over
a messaging platform. Add an INFO log at that drop point so the gap is
visible in gateway.log instead of vanishing.
Also convert the two print() calls in Telegram's send_document /
send_video exception handlers to logger.warning(exc_info=True). print()
writes to stdout, which 'hermes logs' never captures, so outbound upload
failures (oversized files, Bot API rejections) were invisible.
* fix: respect disabled auto-compaction on context overflow
Port from anomalyco/opencode#30749.
When compression.enabled is false, NO automatic compaction trigger may
fire. The proactive token-threshold paths (preflight + post-response
should_compress gate) already honoured the setting, but the three
provider-overflow recovery paths in the agent loop — long-context-tier
429, 413 payload-too-large, and context-overflow — called
_compress_context() unconditionally, silently compressing and rotating
the session against the user's explicit choice.
Add a single guard at the top of the overflow-recovery dispatch: when
compression is disabled and the error is one of those three overflow
classes, surface a terminal error (compaction_disabled: True) telling the
user to /compress manually, /new, switch to a larger-context model, or
reduce attachments. Manual /compress (force=True) is unaffected — it never
enters this loop.
Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't
compress when disabled; control case still compresses when enabled).
Existing overflow-recovery tests updated to enable compaction explicitly
(they verify the recovery fires); fixture defaults flipped to True to
match production (compression.enabled defaults to True).
* feat(delegation): uncap max_spawn_depth to match max_concurrent_children
Removed the hard ceiling of 3 on delegation.max_spawn_depth. Depth now has
a floor of 1 and no upper limit, mirroring max_concurrent_children. Cost
(each level multiplies API spend) is the practical limiter, not a constant.
- delegate_tool.py: drop _MAX_SPAWN_DEPTH_CAP, _get_max_spawn_depth() floors
at 1 instead of clamping to [1,3]; depth-limit error string reworded
- config.py / cli-config.yaml.example: doc comments say floor 1, no ceiling
- docs (configuration, delegation, delegation-patterns): range 1-3 -> >=1
- tests: convert clamp-above-3 change-detector into a no-ceiling invariant,
drop the _MAX_SPAWN_DEPTH_CAP==3 snapshot assert, fix warning-text assert
A bare /voice silently toggled on/off with a one-line result, leaving
users with no idea what the modes mean or that Discord also supports
TTS-all and live voice-channel join/leave. Bare /voice now still
toggles but appends a usage explainer covering on/off/tts/status, with
the Discord voice-channel lines shown only on adapters that support
them.
Adds gateway.voice.help + gateway.voice.help_channels across all 16
locales (placeholders {toggle}/{channels}).
* fix: respect disabled auto-compaction on context overflow
Port from anomalyco/opencode#30749.
When compression.enabled is false, NO automatic compaction trigger may
fire. The proactive token-threshold paths (preflight + post-response
should_compress gate) already honoured the setting, but the three
provider-overflow recovery paths in the agent loop — long-context-tier
429, 413 payload-too-large, and context-overflow — called
_compress_context() unconditionally, silently compressing and rotating
the session against the user's explicit choice.
Add a single guard at the top of the overflow-recovery dispatch: when
compression is disabled and the error is one of those three overflow
classes, surface a terminal error (compaction_disabled: True) telling the
user to /compress manually, /new, switch to a larger-context model, or
reduce attachments. Manual /compress (force=True) is unaffected — it never
enters this loop.
Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't
compress when disabled; control case still compresses when enabled).
Existing overflow-recovery tests updated to enable compaction explicitly
(they verify the recovery fires); fixture defaults flipped to True to
match production (compression.enabled defaults to True).
* fix(gemini): default native maxOutputTokens + strip OpenAI extra_body on Gemini endpoints
Two distinct failures hit users on the gemini provider with only Google
AI Studio keys set.
1. Truncation loop: build_gemini_request() only set maxOutputTokens when
max_tokens was non-None. Hermes passes None to mean "unlimited", but
Gemini's native generateContent does NOT treat an absent maxOutputTokens
as full budget — it applies a low internal default and stops early with
finishReason=MAX_TOKENS, truncating tool calls. The agent then retries
3x and refuses the incomplete call. Now default to the published 65,535
ceiling (shared by all current Gemini text models) when max_tokens=None.
2. HTTP 400 on Gemini endpoint: the chat_completions transport assembles
profile extra_body (Nous portal 'tags', reasoning, provider prefs) and
sends it via the OpenAI client to whatever base_url is resolved. When a
profile that emits extra_body (e.g. Nous) is active but the endpoint is a
native Gemini base_url — typical when only Google creds exist and a
fallback/aux call lands on Gemini — Google rejects the unknown 'tags'
field with a non-retryable 400. Strip all non-thinking_config extra_body
keys when the resolved endpoint is native Gemini.
Verified E2E against real transport code: tags stripped on native Gemini,
preserved on Nous and the /openai compat endpoint; maxOutputTokens=65535
on None, explicit values respected.
* feat(discord): voice-channel mixer — ambient idle bed + verbal acks that overlap TTS
Discord voice mode can now feel conversational: the bot speaks a short
acknowledgement before it starts working, and a subtle ambient 'thinking' bed
plays underneath while tools run, ducking under speech and swelling back — the
Grok-voice-mode feel.
discord.py plays only one audio stream per voice connection, so this adds a
software mixer (VoiceMixer, a discord.AudioSource) installed once per guild on
join. It sums an ambient loop, verbal acks, and TTS replies into that single
20ms/48kHz/stereo stream (numpy int16 add + clip), so they overlap instead of
stop-and-swap. Speech ducks the ambient gain down and releases it smoothly.
- plugins/platforms/discord/voice_mixer.py: VoiceMixer + MixerChild (gain,
loop, fade, duck/release), decode_to_pcm (ffmpeg), synth_ambient_pcm (no
asset needed — synthesised pad).
- adapter: install mixer on join, tear down on leave, route
play_in_voice_channel through the mixer (legacy one-shot path kept as
fallback), play_ack_in_voice, voice_mixer_active. Defensive getattr for the
object.__new__ test helpers.
- gateway/run.py: tool_start_callback fires a one-time verbal ack on the first
tool call of a turn when in a voice channel (independent of the text
tool-progress gate). No system-prompt or message-flow changes.
- config: discord.voice_fx.* (OFF by default; ambient/duck/speech gains, ack
phrases). All in config.yaml, not .env.
- docs + tests (mixer unit + adapter integration).
Verified: 19 new tests pass, existing voice suite green (2 pre-existing
davey-module env failures unchanged), and a real-mixer E2E confirms ambient
streams, TTS overlaps it, acks layer in, and teardown is clean.
* fix(discord): make voice mixer numpy import lazy (numpy is voice-extra-only)
numpy ships in the optional 'voice' extra, not [all,dev], so a module-level
'import numpy' broke CI test collection (and would break the always-imported
Discord adapter on any install without the voice extra). Defer numpy to the
functions that actually mix audio via _require_numpy(); guard the test module
with pytest.importorskip('numpy').
Follow-up on the salvaged fix: point the Nous silent-default override at
deepseek/deepseek-v4-flash (a cheap chat model) instead of the nvidia
nemotron entry. Keeps the no-model-configured fallback off the priciest
flagship while landing on a low-cost, broadly-capable default.
Assert get_default_model_for_provider("nous") never returns the priciest
catalog entry (anthropic/claude-opus-4.8) and that an override pointing at a
model absent from the catalog falls back to catalog order. Regression for the
silent flagship-billing footgun.
When a provider is configured but no model is selected (e.g. a profile sets
provider: nous with no model), the gateway/CLI fall back to
get_default_model_for_provider(), which returned the first curated catalog
entry. The Nous Portal list is ordered most-capable-first, so entry [0] is
anthropic/claude-opus-4.8 — the single most expensive model ($5/$25 per Mtok).
A misconfigured profile therefore silently routed every call to the flagship
and billed it for traffic the user never opted into.
Pin the silent (non-interactive) default for metered aggregators to the cheapest
curated tier via _PROVIDER_SILENT_DEFAULT_OVERRIDES so a missing model can never
auto-escalate to the flagship. The interactive default (GUI onboarding /
`hermes model`) keeps using the richer free/paid-tier-aware resolver.
Fixes the unexpected anthropic/claude-opus-4.8 charges reported for a
free-tier Nous account whose new profile had no default model.
The Desktop bootstrap installer writes `.hermes-bootstrap-complete` into the
managed git checkout root. Because it wasn't gitignored, `hermes update`'s
`git stash push --include-untracked` treated it as a local change and created an
autostash on every run — prompting the user to restore "local changes" that were
really Hermes-managed runtime state (and risking the marker getting stranded in a
stash, which re-triggers Desktop bootstrap).
Add the marker to .gitignore; `git stash -u` and `git status --porcelain` both
skip ignored files, so the updater now sees a clean tree.
Fixes#38529
Add HERMES_DASHBOARD_SESSION_TOKEN to the Hermes-managed subprocess environment blocklist so dashboard authorization material does not propagate into shell, PTY, or background process launches.
Extend the local environment blocklist regression coverage to prove the dashboard session token is stripped like other Hermes-managed secrets.
* docs(dashboard): clarify auth provider suitability + document dashboard registration
- Add a 'Registering a dashboard' subsection under the Nous Research
provider covering both the 'hermes dashboard register' CLI command
and the Portal /local-dashboards GUI page.
- Note that the Nous provider is the one suitable for public-internet
exposure (logins verified against your Nous account).
- Add a warning that the username/password provider is for trusted
networks / VPN only and is not suitable for direct public-internet
exposure; point readers to the Nous / OIDC / custom OAuth providers.
- Surface the same distinction in the two-provider intro list.
* docs(dashboard): count three bundled auth providers, add self-hosted OIDC to intro
'Two providers ship in the box' undercounted — the bundled
plugins/dashboard_auth/self_hosted (generic OpenID Connect) is a third.
List all three in the gated-mode intro and link each to its section.
* docs(dashboard): extend auth provider updates to Docker and Desktop pages
- docker.md: list all three bundled gate providers (was username/password
+ OAuth only), adding the self-hosted OIDC provider and its env vars,
and note username/password is not for public-internet exposure.
- desktop.md: reframe the remote-backend connection so OAuth (Nous Portal)
is the preferred option for any backend reachable beyond the local
machine, with username/password positioned for local / trusted-network
use only. Cover the 'Sign in with <provider>' OAuth flow in the in-app
steps and scope the VPN warning to the password path.
* docs(dashboard): align env-var, CLI, and remote-Desktop recipe with provider changes
- environment-variables.md: reframe the Web Dashboard & Hermes Desktop
intro (OAuth preferred for remote/public, username/password for
trusted networks), add the self-hosted OIDC env vars
(HERMES_DASHBOARD_OIDC_*) that were missing from the table, and note
hermes dashboard register provisions the OAuth client_id.
- cli-commands.md: document the 'hermes dashboard register' subcommand
(flags, behavior, /local-dashboards GUI alternative).
- web-dashboard.md: apply the OAuth-preferred reframe to the bottom
'Connecting Hermes Desktop to a remote backend' recipe and scope its
VPN warning to the username/password path, matching desktop.md.
* docs(dashboard): move 'recommended remote Desktop path' framing from username/password to OAuth
The gated-mode intro list claimed the username/password provider was the
recommended path for a remote Hermes Desktop connection, contradicting the
OAuth-preferred framing established elsewhere. Move that recommendation onto
the OAuth (Nous Portal) item so the docs are consistent: OAuth is the
recommended provider for any remote/internet-facing backend; username/password
is for trusted networks only.
* docs(dashboard): drop unreleased managed/hosted-install provisioning notes
Remove the 'not available in managed/hosted installs, where the client id is
provisioned by the hosting platform' line from the dashboard register docs
(web-dashboard.md, cli-commands.md) and the 'provisioned by the Nous Portal for
hosted deploys' clause from the HERMES_DASHBOARD_OAUTH_CLIENT_ID env-var row —
that platform-provisioning path is unreleased.
* docs(dashboard): drop --portal-url / HERMES_DASHBOARD_PORTAL_URL from user docs
The portal-URL override targets a non-production Nous Portal and only works
for internal Nous usage — it won't function for end users (the access token
must be issued by the same portal). Remove it from the register CLI flags,
the Nous-provider config/env tables, and the verify-the-gate example so users
aren't pointed at an option that can't work for them.
* docs(dashboard): add worked examples for Nous and username/password providers
The self-hosted OIDC provider already had a full 'Worked example: Keycloak'
walkthrough; the Nous and username/password providers only had scattered
config snippets. Add parallel '#### Worked example' sections for both
(register/run/login + /api/status verification), mirroring the Keycloak
example's structure so all three bundled providers read consistently.
* docs(env): move HERMES_DESKTOP_REMOTE_URL to end of the dashboard auth table
It was sitting between the HERMES_DASHBOARD_BASIC_AUTH_* block and the
HERMES_DASHBOARD_OAUTH/OIDC block, splitting the dashboard-side vars. As the
only desktop-side var in the table, it belongs at the end so the dashboard
provider vars (basic, OAuth, OIDC) stay grouped together.
* docs(dashboard): remove Fly.io references from dashboard auth docs
Fly.io is the internal hosting implementation for hosted Hermes — it shouldn't
leak into user-facing dashboard auth docs. Reword the OAuth provider intro,
the env-var-path rationale, the public-URL-override section, the cookie Secure
note, and the verify-the-gate example to generic 'hosting platform' / 'reverse
proxy' / 'TLS terminator' phrasing.
Left the legitimate user-facing Fly.io mentions in telegram.md (a deliberate
cloud-deployment walkthrough) and work-with-skills.md (a generic example)
untouched.
`cron_list` read `job.get("repeat", {})`, but the dict-default only
applies to a MISSING key. A one-shot job persisted with `"repeat": null`
returns None, and the next `.get("times")` raised AttributeError, taking
down the whole `cron list` output. Coalesce with `or {}` so a
present-but-null repeat renders as ∞ like the other cron readers already
do. Adds a regression test.
Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
`os.getcwd()` raises FileNotFoundError when the process's working
directory was removed out from under it (e.g. a scratch workspace
cleaned up mid-session), crashing terminal env setup.
Extract a `_safe_getcwd()` helper that falls back to TERMINAL_CWD, then
the user's home, on FileNotFoundError, and route all three `os.getcwd()`
call sites in terminal_tool.py through it (local default_cwd, the Docker
cwd-passthrough source, and the debug-config print) so the same crash
can't resurface at a sibling site. Adds unit tests for the real-cwd path
and both fallback branches.
Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
Inside the published Docker image, both the `--tui` banner and the
dashboard-embedded TUI report `1 commit behind — run docker pull
nousresearch/hermes-agent:latest to update` even though the container
has no git repo and no way to compute a commit delta.
Root cause: two independent update-detection paths, only one of which
knows it's running in Docker.
- `recommended_update_command()` → `detect_install_method()` reads the
`.install_method` stamp that `docker/stage2-hook.sh` writes at boot →
returns "docker", so the *command string* correctly says `docker pull`.
- `banner.check_for_updates()` (the source of the "N commits behind"
*count*) has no notion of the docker install method. It only detects a
build via `HERMES_REVISION` (nix-only, unset in the image) or a `.git`
dir (excluded from the image by .dockerignore). Neither matches, so it
silently falls through to `check_via_pypi()`, whose PyPI-version
mismatch flag (1) is then rendered verbatim by the CLI banner
(build_welcome_banner), the Ink TUI badge (branding.tsx), and `hermes
version` as "1 commit behind" — a phantom count, no commit math
involved. `hermes update` already refuses to run in-place in the
container.
The dashboard's REST `/api/hermes/update/check` endpoint already
short-circuits docker (returns behind=None + the docker guidance). This
mirrors that guard inside `check_for_updates()` so the banner/TUI/version
surfaces agree: when `detect_install_method() == "docker"`, return None
before any git/pypi probe (and before writing a cache entry). None makes
the render guards (`typeof === 'number' && > 0`, `behind and behind > 0`)
stay false, so the badge/line disappears entirely — matching the System
page.
Fix is in one place (check_for_updates) because all three consumers route
through it via get_update_result()/_update_result.
Tests: test_check_for_updates_docker_returns_none asserts None + no
git/pypi probe + no cache write; test_check_for_updates_non_docker_still_checks
guards against over-broadening (pip still version-checks). Mutation-tested:
removing the guard fails the docker test.
Verified against a real `docker build` of the image — see PR description.
The desktop OAuth remote-gateway path gated connectivity on
hasOauthSessionCookie(), which checks only the access-token cookie
(hermes_session_at, ~15 min TTL). The moment that cookie's Max-Age
lapsed, Electron's cookie jar dropped it and both resolveRemoteBackend()
and sanitizeDesktopConnectionConfig() reported "not signed in" — forcing
a full IDP re-login every ~15 min — even though a valid 24h refresh-token
cookie (hermes_session_rt) was sitting in the same jar.
The desktop OAuth code (2026-06-04) was written against the obsolete
"contract v1 issues no refresh token" model, two days after #37247
re-introduced server-side transparent refresh: Portal now issues a 24h
rotating, reuse-detected refresh token, and the gateway middleware
(_attempt_refresh) rotates a fresh AT from the RT on the next
authenticated request. So an expired-AT/live-RT session is fully
connectable — the desktop just never let the request through.
Fix:
- connection-config.cjs: add RT_COOKIE_VARIANTS + cookiesHaveLiveSession()
(true when EITHER a live AT or RT cookie is present). Keep
cookiesHaveSession() AT-only for callers that need that specific signal.
- main.cjs: add hasLiveOauthSession(); resolveRemoteBackend()'s oauth
branch now early-outs only when NEITHER cookie is present, otherwise
uses the ws-ticket mint as the authoritative liveness probe (that POST
carries the RT cookie and triggers the server-side AT rotation). A real
401 still surfaces as needsOauthLogin. Settings indicator + oauth-logout
report against the same AT-or-RT notion.
- Remove the stale "contract v1 / NO refresh token" docstrings in
cookies.py and the verify_session comments in the Nous provider that
contradicted #37247.
Tests: +57 lines in connection-config.test.cjs covering the RT-only
"still connectable" case. node --test: 32/32. dashboard-auth +
nous-provider Python suites: 223/223.
Note: server-side files (hermes_cli/dashboard_auth/, plugins/dashboard_auth/)
are comment/docstring-only here, but this touches outside apps/desktop/ so
it needs Teknium review.
`_read_json_file` caught OSError but not UnicodeDecodeError, so a status
file holding binary/non-UTF-8 bytes (truncated or clobbered write) would
crash the gateway status path instead of being treated as unreadable.
UnicodeDecodeError is a ValueError subclass, not an OSError, so it
escaped the existing guard.
Widen the catch to (OSError, UnicodeDecodeError) at both read sites in
gateway/status.py — `_read_json_file` and the sibling `_read_pid_record`,
which had the identical gap. Adds tests covering binary input (returns
None) and valid input (still parses) for both.
Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
The LINE adapter classified every non-text inbound message as
`MessageType.IMAGE`, which doesn't exist on the enum — so any image,
video, audio, file, sticker, or location message raised AttributeError
the moment it was constructed.
Beyond fixing the crash, every non-text message was being collapsed onto
a single type. The gateway routes on MessageType (voice → STT, files →
document handling, etc.), so misclassification silently mishandled media.
Replace the inline ternary with a `_LINE_MESSAGE_TYPES` lookup that maps
each LINE webhook type to its proper enum member (audio → VOICE to match
how Telegram/WhatsApp treat voice notes), falling back to TEXT for
unknown types. Adds regression tests covering the mapping and the old
AttributeError.
Co-authored-by: Sahibzada Allahyar <94376830+sahibzada-allahyar@users.noreply.github.com>
The previous fix committed the hash from `prefetch-npm-deps`
(sha256-hgnqc...), but the actual `fetchNpmDeps` FOD (fetcherVersion 2)
that `nix flake check` builds wants sha256-cY+gM... . These two tools
disagree for this lockfile, so the build's npm-deps derivation failed
with a hash mismatch even though `fix-lockfiles --check` reported "ok".
Corrected to the build-verified value. Confirmed `nix build .#tui`,
`.#web`, and `.#desktop` all build cleanly with the new hash.
The react-router-dom 7.14->7.17 lockfile change stales the pinned
npm-deps hash in nix/lib.nix, turning the nix flake checks red. Bump
to the hash CI's prefetch diagnostic computed for the new lockfile.
Clears the npm-audit React Router advisory CVE-2026-42342 in the web
and apps/desktop workspaces by bumping react-router-dom 7.14.x -> ^7.17.0
(patched in 7.15.0; both react-router and react-router-dom now resolve
to 7.17.0 in the root lockfile).
Note: the advisory's DoS only affects React Router *Framework Mode*
(the __manifest server endpoint). Both workspaces use Declarative Mode
(web: <BrowserRouter>, desktop: <HashRouter>) as pure client-side SPAs,
so we were never actually exploitable -- this is audit-hygiene only.
npm audit --omit=dev: 0 vulnerabilities. Web + desktop + ui-tui builds
and tsc typecheck all green on 7.17.0.
The Hermes Docker image's venv is built with `uv sync`, which does not
bootstrap pip into the venv. When the google-workspace setup script needs
to install its deps and the running interpreter has no pip,
`sys.executable -m pip install` dead-ends with "No module named pip"
(reported via Discord support).
install_deps() now falls back to `uv pip install --python <interpreter>`
when the pip path fails and uv is on PATH. uv installs into the exact
interpreter the script is running under without needing pip present, so
the pip-less venv self-heals (e.g. a dep evicted on image update, or a
build without the [google]/[all] extra). On environments with neither
pip nor uv, the [google] extra hint is printed as before.
Verified E2E against nousresearch/hermes-agent:latest: under the venv
python with a missing dep, --install-deps now prints "Dependencies
installed." and exits 0 instead of failing.
Adds TestInstallDeps regression coverage: pip path, uv fallback,
uv-not-consulted-when-pip-works control, and both no-installer-available
and uv-also-fails failure cases.
Since #38591 made the dashboard's embedded chat unconditional, every
browser refresh of /chat spins up a fresh session.create (new sid + a
fresh _SlashWorker via _deferred_build) over /api/ws, but the old tab's
WS disconnect only DETACHES the transport (ws.py) — it never closes the
old session or its slash_worker. The dashboard's in-process gateway is
long-lived, so the detached _SlashWorker subprocess's stdin pipe stays
open forever and the worker never reaches EOF: one leaked python process
per refresh.
Fix at the session-lifecycle layer (not PTY signal timing — verified that
a process whose owning gateway dies is always reaped via stdin-EOF; the
leak is specifically the long-lived dashboard process keeping detached
sessions parked). On WS disconnect, schedule a grace-delayed reap of any
session left orphaned (transport detached to stdio, not mid-turn). A quick
reconnect / session.resume / prompt.submit rebinds a live transport and
cancels the reap, preserving the intentional detach-for-reconnect window.
- server.py: extract _teardown_session() (shared with session.close),
add _ws_session_is_orphaned() + _schedule_ws_orphan_reap(), gated by
HERMES_TUI_WS_ORPHAN_REAP_GRACE_S (default 20s, 0 disables).
- ws.py: schedule the reap for each detached session on disconnect.
- tests: reap-closes-worker, spares-reattached/mid-turn/finalized,
disabled-when-grace-zero.
Youssef's review caught a residual false-positive: resolveTestWsUrl
swallowed an OAuth ticket-mint failure and returned null, so the caller
skipped the WS probe and reported the remote test as reachable. But the
real boot path (resolveRemoteBackend) treats a mint failure as a hard
'session expired' auth error and refuses to connect — so an expired OAuth
session passed the test then failed boot, the exact false-positive this
PR exists to kill.
Extract resolveTestWsUrl into the electron-free connection-config.cjs
(injectable mintTicket) so it's unit-testable, and make OAuth mint
failure throw an actionable needsOauthLogin error instead of skipping.
Adds the three cases Youssef requested plus a mintTicket-required guard.
The "Test remote" button only checked HTTP GET /api/status, but the chat
surface depends on the renderer opening a live WebSocket to /api/ws — a
separate transport with separate server-side guards (Host/Origin checks,
ws-ticket/token auth, peer-IP checks). A gateway could pass the HTTP check yet
reject the WebSocket, so the test reported "reachable" while boot still failed
with the opaque "Could not connect to Hermes gateway".
testDesktopConnectionConfig now mirrors the renderer's connect: after the
status check it opens the WS URL (token/local) or a freshly minted ws-ticket
(OAuth) and confirms the upgrade is accepted and not immediately torn down by
a post-handshake auth rejection. Failures surface an actionable message instead
of a false-positive. The WS leg is skipped when the runtime lacks a global
WebSocket so it never fails spuriously.
Adds electron/gateway-ws-probe.cjs: a small helper that opens a gateway
WebSocket URL and classifies the handshake (open/frame → ok; error or close
before open → fail; open-then-early-close → credential rejected; never-opens →
timeout). The WebSocket implementation is injected so it can be unit-tested
without a real socket.
Wires gateway-ws-probe.test.cjs into test:desktop:platforms, covering every
handshake outcome plus constructor-throw and missing-impl.
The first-run provider picker was a hard gate — the only way out was
connecting a provider. Add an 'I'll choose a provider later' link that
dismisses the overlay and persists the skip to localStorage so it never
re-nags on subsequent launches. Users connect a provider any time from
Settings -> Providers (manual onboarding already bypasses the skip gate).
- onboarding.ts: firstRunSkipped state seeded from localStorage
(hermes-onboarding-skipped-v1) + dismissFirstRunOnboarding() action;
completeDesktopOnboarding clears the flag once a provider connects.
- overlay: skip gate (firstRunSkipped && !manual returns null); ChooseLaterLink
rendered in both the OAuth picker footer and the API-key fallback, first-run only.
- tests: skip persists + hidden in manual mode; full-state fixtures updated.
C1: Add _sessions_lock to protect all compound mutations and iterations
on the global _sessions dict across 5+ concurrent execution contexts
(main dispatcher, pool workers, daemon threads, notification poller,
atexit handler).
C2: Add _prompt_lock to protect _pending/_pending_prompt_payloads/_answers
dicts from races between _block() (agent callback thread) and
_respond() (pool worker). Lock scope is kept tight — _block() only
holds the lock during registration/cleanup, releasing it before
_emit() and ev.wait() to avoid blocking other prompts for 300s.
All 187 existing TUI tests pass with no regressions.
check_execute_code_guard() never called is_approved() before entering the
approval flow, and never persisted session/permanent approvals from the
gateway response. This meant 'Approve session' and 'Always' buttons had
no effect — every execute_code call re-prompted the user.
- Add is_approved() check after get_current_session_key(), matching
check_all_command_guards()
- Persist session ('approve_session') and permanent ('approve_permanent')
approvals based on the gateway choice, same as terminal command guard
- Add 3 regression tests for session persistence, permanent persistence,
and short-circuit on pre-existing approval
Windows counterpart of #39127: scripts/install.ps1 `Install-Desktop` runs
`npm run pack` once and throws on the opaque ENOENT a corrupt cached Electron
download produces, with no recovery. Add `Clear-ElectronBuildCache` plus a
purge-and-retry-once on pack failure, mirroring the install.sh fix: remove the
cached electron-*.zip (%LOCALAPPDATA%\electron\Cache + ELECTRON_CACHE /
electron_config_cache overrides) and stale *-unpacked output, then retry so
@electron/get re-downloads with its own SHASUM verification.
Refs #37544.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replace [{_DIM}] with [dim] in all _restore_session_cwd and
_preload_resumed_session messages that go through _console_print (Rich
Console.print). _DIM is an ANSI escape (\x1b[2;3m) that Rich cannot
parse as a markup tag, causing MarkupError on session resume when the
stored cwd is missing or inaccessible.
Also uses [/dim] closing tag for explicit tag matching.
Fixes#39469
Mirror the workspace-group "+": each profile header in the all-profiles
session list gets a new-session button. Unlike selecting the profile, it
leaves the browse scope untouched (newSessionInProfile keeps
$showAllProfiles), so creating a chat doesn't collapse the unified view.
Keep one persistent socket per profile with live work instead of closing
the single socket on every profile swap, so background sessions across
profiles keep streaming at once. A gateway registry owns the primary
(window) socket plus lazy secondaries (own backoff/reconnect); all feed
the same session-keyed event handler. Secondaries are pruned to profiles
with a working/needs-input session, the keepalive pings every open
backend, and LRU eviction spares freshly-touched backends so the soft cap
can't abort a running agent. Approval/sudo/secret prompts are parked
per-session (surfaced via the needs-input badge) so a background turn can
block without hijacking the foreground. Single-profile users only ever
have the primary, so their path is unchanged.
Resolve conflicts in desktop settings/cron/messaging/sidebar: adopt main's
ListRow + actions-menu refactors for credential rows; keep our profileColor
import on the sidebar. Drop the now-orphaned Tip-based helpers.
Hold (~450ms) a profile square — or right-click → Color… — to open a
shadcn Popover of swatches and override its rail color, with Auto to fall
back to the deterministic hue. The hold timer rides alongside the dnd
pointer listener (a real drag cancels it, the trailing click is
suppressed), so reorder/select/recolor stay distinct gestures.
Overrides persist in localStorage ($profileColors), resolved via
resolveProfileColor (override wins, else the name-hashed hue). Cosmetic
and gated on the multi-profile rail, so single-profile users are
unaffected. Adds a reusable ui/popover.tsx (radix-ui umbrella).
The previous catch-all except OSError would silently swallow real
errors (disk full, bad path, permission issues unrelated to symlink
privilege). Narrow the handler to winerror == 1314 — the specific
Windows error code for "A required privilege is not held by the
client" — and re-raise every other OSError so genuine failures are
not hidden.
On Windows, os.symlink() raises OSError (WinError 1314) unless the
process has Administrator rights or Developer Mode is enabled. The SSH
bulk-upload staging logic used symlinks to mirror the remote layout
before piping through tar; this caused all ssh_bulk_upload tests to
fail on Windows.
- ssh.py: wrap os.symlink() in try/except OSError and fall back to
shutil.copy2() so staging works on every platform. shutil was already
imported, no new dependency introduced.
- file_sync.py: replace str(Path(remote).parent) with
posixpath.dirname(remote) in unique_parent_dirs(). pathlib.Path uses
the host separator (\ on Windows), but these paths are sent to a
remote Linux host over SSH and must always use forward slashes.
- test_ssh_bulk_upload.py: make test_staging_symlinks_mirror_remote_layout
platform-agnostic — assert file existence and content instead of
os.path.islink() + os.readlink(), since the staged entry may be a
copy on Windows.
When reasoning text grows during streaming, new parts can be appended
beyond endIndex. The pending check used slice(startIndex, endIndex)
which excluded these new parts — if the original part completed, the
block would close while new reasoning was still streaming.
Fix: remove the endIndex cap from slice() so all parts from startIndex
onward are checked. During non-streaming, the array is stable and
all parts are within range anyway.
web_tools.is_safe_url was replaced by async_is_safe_url, but three
web-provider test files still monkeypatched the old sync name, raising
AttributeError. Patch the async variant with an async lambda.
Add async_is_safe_url() wrapping is_safe_url via asyncio.to_thread, and route
all async SSRF call sites through it: web_extract_tool, the vision/video
preflight checks, and both download redirect guards. socket.getaddrinfo blocks;
calling it inline from async tool paths froze the event loop for the duration of
DNS resolution.
vision_tools: split _validate_image_url into _image_url_shape_ok (no DNS) +
sync _validate_image_url (for sync callers/tests) + async _validate_image_url_async.
Widened beyond the original PR #3691 to sibling async sites that also blocked
the loop (second redirect guard, video preflight).
Salvage of #3691 by @Kewe63 — surgically re-applied onto current main because
the original branch was too stale to cherry-pick cleanly (would have reverted
the web_crawl_tool refactor).
Co-authored-by: Kewe63 <kewe.3217@gmail.com>
PASSIVE checkpoint never shrinks the WAL file, causing state.db-wal to
grow without bound. Change to TRUNCATE in _try_wal_checkpoint() and
close() so the WAL is truncated regularly.
Fixes#24034
session.py _persist() bypassed SessionDB's thread-safe write path by
accessing private internals db._lock and db._conn directly:
with db._lock:
db._conn.execute("UPDATE sessions SET model_config = ? ...")
db._conn.commit()
This was fragile for three reasons:
1. It bypassed _execute_write()'s BEGIN IMMEDIATE + jitter-retry logic,
so concurrent writes could hit SQLite BUSY without retrying.
2. It called db._conn.commit() manually, breaking the transactional
contract that _execute_write() enforces.
3. Any internal rename of _lock or _conn would silently break this
call site with an AttributeError at runtime.
Fix:
- Add SessionDB.update_session_meta(session_id, model_config_json, model)
to hermes_state.py. Routes through _execute_write() for the standard
BEGIN IMMEDIATE + lock + jitter-retry guarantee. Uses COALESCE so
passing model=None leaves the stored model column unchanged.
- Replace the db._lock / db._conn block in session.py _persist() with
a single db.update_session_meta() call.
Tests (tests/acp/test_session_db_private_access.py, 11 tests):
- Unit tests for update_session_meta: updates model_config, updates
model, preserves existing model on None, routes through _execute_write,
no-op on non-existent session.
- AST checks: db._lock and db._conn not referenced in session.py;
_persist() calls update_session_meta().
- Integration round-trips: cwd and model persisted correctly; COALESCE
prevents overwriting an existing model with NULL.
The models.dev supports_vision field reflects model IMAGE-INPUT capability,
which is not the same contract as 'provider API accepts images inside
tool-result messages' — the looser heuristic could re-introduce the exact
HTTP 400 'text is not set' it aims to fix. Keep only the explicit, opt-in
ProviderProfile.supports_vision flag (set on xiaomi); add catalog-based
detection later if a concrete provider needs it.
_supports_media_in_tool_results() had a hardcoded provider allowlist
that missed custom providers and newer vision-capable providers like
xiaomi. Added ProviderProfile.supports_vision flag and made the
function check:
1. Registered provider profile (supports_vision flag)
2. Model capabilities from models.dev catalog (supports_vision)
3. Existing hardcoded allowlist (unchanged)
This fixes HTTP 400 "text is not set" errors when vision-capable
custom providers receive text-only tool results instead of
multipart image content.
Related: #25594
Tests in test_gateway_service.py imported grp inline without a
platform guard, causing ImportError on systems where grp is
unavailable (e.g. macOS, WSL without grp module).
Added pytest.importorskip('grp') at module level alongside the
existing pwd guard, and removed three redundant inline import grp
statements.
Fixes#24531
Some VPS providers (Hetzner Cloud and others) offer a browser-based
console for managing hosts. These consoles transmit special characters
incorrectly — ':' may arrive as ';', '@' may be mis-rendered, and
non-English keyboard layouts fare worse — which silently corrupts
'docker run' arguments like '-v ~/.hermes:/opt/data', '-e KEY=value',
and pasted API keys / tokens.
Adds a :::caution admonition above the Quick start 'docker run' block
in website/docs/user-guide/docker.md recommending SSH for copy-paste-
safe command entry, with manual-typing guidance as a fallback.
Pure docs change, no code touched.
Closes#36279
Co-authored-by: Bedirhan Celayir <bedirhancode@users.noreply.github.com>
Centralize the fallback in DeleteProfileDialog (the single delete choke
point) so both the rail and the Profiles view inherit it. Reset *after*
the host's onDeleted refresh so a refreshActiveProfile racing the dying
backend can't clobber the pill back to the deleted profile, and set
$activeProfile too (selectProfile only moved the gateway, leaving the
statusbar pill stranded on the dead profile).
The standalone `_send_slack()` function used by the send_message tool
and cron delivery fallback was not passing `thread_ts` to the Slack API,
causing messages to post to the top-level channel instead of inside
threads.
- Add `thread_ts` parameter to `_send_slack()`
- Include `thread_ts` in the chat.postMessage payload when present
- Pass `thread_id` from `_send_to_platform()` to `_send_slack()`
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Drag a sidebar session into the composer to drop an @session:<profile>/<id>
chip the agent resolves via session_search. New READ shape dumps a whole
session by id (head+tail when large); a `profile` param reads another
profile's DB read-only, and a cross-profile locate scan resolves bare ids
when the model drops the owning profile from the link.
Also: ASCII "waking up <profile>" overlay during lazy gateway swaps,
global haptic rate-limit to kill the reconnect-storm "clickity" buzz, and
reauth toasts surfaced once per disconnect instead of every backoff tick.
_build_memory_uri produced URIs of the form:
viking://user/{user}/memories/{subdir}/mem_{slug}.md
The /agent/{agent}/ segment was missing, causing every agent under
the same user to write into the same flat namespace. In multi-agent
deployments agents silently overwrite each other's memories and
vector retrieval cross-pollinates results.
self._agent was already populated correctly (from OPENVIKING_AGENT
env var, default 'hermes') and sent via X-OpenViking-Agent header —
it was simply not interpolated into the URI.
Fix: add the missing segment so URIs follow the documented shape:
viking://user/{user}/agent/{agent}/memories/{subdir}/mem_{slug}.md
Tests: 4 new regression tests in TestOpenVikingMemoryUriBuilder,
13/13 passed (9 existing + 4 new).
Both the desktop and web-dashboard remote-backend sections now state up front
that the 'remote backend' is a running 'hermes dashboard' process the desktop
app attaches to (it does not start it for you), and that the gateway is a
separate process needed only for messaging channels.
Salvage of #36631 (@annguyenNous), rebased onto current main with
regression tests added. Fixes#36266.
When a persistent Docker sandbox container is removed out-of-band (idle
reaper, `docker prune`, OOM kill, daemon restart), the gateway kept
issuing `docker exec` against the dead container ID, returning
"No such container" on every subsequent tool call — the agent was
permanently blocked until the gateway process restarted.
DockerEnvironment.execute() now detects the "No such container" /
"is not running" error after a non-zero exit (gated on
persist_across_processes) and calls _recreate_container(): it tries
label-based reuse first, falls back to a fresh container replaying the
same image + full all_run_args set, re-runs init_session(), and retries
the command once. A genuine non-zero exit is NOT misclassified as
container-gone.
Differs from #36631 as submitted: adds the tests the original lacked.
tests/tools/test_docker_environment.py covers _is_container_gone pattern
matching (incl. the negative/control case), the recover-and-retry path,
the persist_across_processes=False opt-out (no recovery), and the
ordinary-failure passthrough (no spurious recreation). _make_dummy_env
now forwards persist_across_processes.
Verified:
- Unit: 67/67 in test_docker_environment.py (4 new + existing).
- Live E2E against the real docker daemon: started a persistent
container, `docker rm -f`'d it out-of-band, and the next execute()
transparently recreated a fresh container and succeeded; a follow-up
command worked in the recovered container; a real `exit N` passed
through without triggering recovery.
Co-authored-by: annguyenNous <annguyenNous@users.noreply.github.com>
The desktop `/title <name>` command 404s with "Session not found" on
every platform (reported on Windows in #38508).
Root cause: `session.create` returns two distinct ids — a *runtime*
session id (held in `activeSessionIdRef`) and a `stored_session_id` (the
DB `sessions.id`) — and deliberately does NOT persist a DB row until the
first turn. Routing `/title` through the REST `PATCH /api/sessions/{id}`
endpoint (as #38576 proposed) resolves the id against the `sessions`
table, so the runtime id — or any brand-new, not-yet-persisted session —
never resolves and returns 404. This is an id-type mismatch, not a
Windows file-locking quirk, so it fails on macOS and Linux too.
Fix: route `/title <name>` through the gateway's `session.title` RPC —
the exact path the TUI already uses (`ui-tui/.../slash/commands/core.ts`).
The RPC maps the runtime id to the in-memory session, writes through the
gateway's own DB connection, and queues the title (`pending: true`) when
the row isn't persisted yet, so it works for a fresh chat. The sidebar is
then refreshed via the existing `refreshSessions()` plumbing.
Keeps the sidebar-refresh wiring and `refreshSessions` threading from
#38576; replaces only the broken REST/slash-worker write path. A bare
`/title` (no arg) still falls through to the worker to show the current
title.
Tests rewritten to assert `session.title` routing with the runtime-vs-
stored id distinction (which the original mock collapsed), plus the
queued/`pending` fresh-chat case and the error path.
Supersedes #38576. Fixes#38508.
Co-authored-by: xxxigm <54813621+xxxigm@users.noreply.github.com>
Adds qwen/qwen3.7-plus directly under qwen/qwen3.7-max in both the
OpenRouter curated catalog (OPENROUTER_MODELS) and the Nous portal
catalog (_PROVIDER_MODELS['nous']), then regenerates the docs-hosted
model-catalog.json manifest from those source lists.
When a remote gateway with username/password (or OAuth) auth restarts, its
session cookie lapses and Desktop boots into the recovery overlay with a
session-expired error. That overlay only exposed local-recovery actions —
Retry (resets the local bootstrap latch) and Repair (re-runs the installer) —
neither of which can re-establish a remote session, so the user is stuck in a
no-op Retry loop with no way to sign in again.
The overlay now detects a remote-reauth boot failure from the saved connection
config (remote + gated + not currently connected + has a URL) and surfaces a
primary 'Sign in to remote gateway' button that opens the gateway login window
(the username/password form for a basic gateway, the OAuth redirect otherwise)
and reloads on success. Button copy is driven by a best-effort provider probe,
matching the gateway-settings page. Detection and copy logic live in a pure
helper module with unit coverage.
When `docker run -d` fails after Docker has already created the container
object (e.g. exit 125 when the daemon isn't ready, or a timeout mid image
pull), the code raised before `self._container_id` was set — so the
container leaked permanently in "Created" state. Reported in #7439:
110+ orphaned containers accumulated over 3 days from hourly cron-
scheduled gateway sessions hitting a Docker Desktop startup race.
The orphan reaper added in #33645 (reap_orphan_containers) does NOT cover
this case: it filters `status=exited`, but a failed-create container is in
`Created` state, so it slips through and is never reaped.
Wrap the `docker run -d` call in try/except and `docker rm -f` the
container by its known name before re-raising.
Salvages #7440 by @Tranquil-Flow. Their branch predated the cross-process
reuse + labels rework on `main`, so a cherry-pick conflicted; reconstructed
the same intent (plus their two regression tests, adapted to mock the new
reuse `docker ps` probe) against current `main`.
Verified adversarially: reverted just the product change to origin/main's
`docker.py`, ran the two new tests -> both FAIL with
`assert 0 == 1 ("docker rm should be called once")`. With the fix applied,
both pass; full test_docker_environment.py is 65/65 green.
Closes#7440. Fixes#7439.
Co-authored-by: Evi Nova <66773372+Tranquil-Flow@users.noreply.github.com>
Docker Compose service names (e.g. ollama, litellm, hermes-litellm)
are unqualified hostnames with no dots. These are always local — they
resolve via Docker DNS, /etc/hosts, or mDNS. Without this fix, the
stale stream timeout fires on local LLM proxies, causing infinite
reconnect loops.
Closes#7905
Wrap the _tools iteration in _probe_single_server() in try/finally
so that server.shutdown() is called even if iterating tool metadata
raises. Without this, the MCP server connection leaks until the
event loop is torn down by _stop_mcp_loop().
The existing-message overflow split path in stream_consumer.run() sealed the
first chunk via _send_or_edit(chunk) (finalize=False) then reset _message_id
to None — so that chunk was never edited again and never received the adapter's
final rich-text pass. On Telegram, MarkdownV2 formatting is applied on the
finalize edit, so early split messages of a long multi-part streamed reply
rendered raw markdown (##, **bold**, code fences) while only the last chunk
rendered correctly.
Fix: seal the overflow chunk with finalize=True so it gets its final
formatting pass before _message_id is cleared.
Salvaged from #32609 (the streaming-format portion only; the PR's send_draft
parse_mode change is already superseded on main, and its media-roots change
conflicts with the current denylist + recency-window delivery model).
_ notification_poller_loop_ re-emits status.update every cycle
when a background process completes while the session is busy.
The same completion event gets re-queued and re-emitted to the
TUI every few ms, flooding the transcript with duplicate lines.
Add _notification_event_dedup_key(evt) that returns a tuple
identity for each notification event. Only emit status.update
on first sight per identity:
- completions: (sid, type) — one-shot per process session
- watch_match: (sid, type, command, pattern, output, ...)
- watch_overflow/disabled: (sid, type, command, message, ...)
The dedup key design was refined from an initial sid:type approach
after @lordbuffcloud identified that distinct watch_match events
(READY vs DONE) for the same process would be incorrectly collapsed.
Tests from @tymrtn cover distinct watch matches, exact replay
dedup, and completion one-shot behavior.
Co-authored-by: tymrtn <ty@tmrtn.com>
* Revert "fix(gateway): anchor Google Chat OAuth client secret to default Hermes root"
This reverts commit fff0561441.
* Revert "fix(cli): honor global-root active_provider fallback for named profiles"
This reverts commit 3858cf4307.
* docs(google_chat): describe OAuth client secret as profile-scoped, not host-wide
The setup docs, oauth docstring, and the adapter's 'no credentials'
error message all described the Google Chat OAuth client secret as
host-wide shared infrastructure. That contradicts profile isolation:
profiles are separate auth boundaries, so two profiles can point at
different Google OAuth apps / accounts. Reword all three to say the
secret is profile-scoped and each profile registers its own.
Windows contributors checking out on NTFS with git's default core.autocrlf
will end up with CRLF in docker/entrypoint.sh. When COPY'd into the image
and invoked as ENTRYPOINT, the kernel interprets the trailing \r as part of
the interpreter path, producing a confusing 'no such file or directory'
despite the file being present and executable.
Lock LF for the usual suspects (*.sh, Dockerfile, *.dockerfile, and the
specific docker/entrypoint.sh). The existing tree is already LF; this is
preventive against future Windows regressions only.
`markdown` was declared only in the `matrix` optional extra, and the
official Docker image installs `--extra all --extra messaging --extra
anthropic --extra bedrock --extra azure-identity --extra hindsight` —
notably NOT `--extra matrix` (the matrix extra is deliberately routed to
lazy-install because `mautrix[encryption]`/`python-olm` can't build on
Windows/macOS — see the 2026-05-12 policy comment in `[all]`).
Result: `markdown` never lands in the image venv, so the Markdown->HTML
conversion on the DEFAULT delivery path silently falls back to plain
text. Cron/agent deliveries render raw `##`/`**`/tables in clients like
Element (no `formatted_body`). The conversion is now used by BOTH
`gateway/platforms/matrix.py` and `tools/send_message_tool.py`, so it is
no longer matrix-specific.
`markdown` is a pure-Python `py3-none-any` wheel (~108KB, no compiled
extensions, no platform constraints), so none of the reasons the matrix
extra was lazy-routed apply to it. Promote it to a core dependency so it
ships in the wheel, the Docker image, and every install; drop the now
redundant copies from the `matrix` extra and the `platform.matrix`
lazy-deps group; refresh the stale "installed with the matrix extra"
docstring.
Verified against a real build: ran the image's exact `uv sync` command
(same extras, no `--extra matrix`) in a clean container off the new
lockfile -> `import markdown` succeeds (3.10.2). On `origin/main` the
same command leaves markdown absent. 223 targeted tests pass
(test_matrix.py + test_lazy_deps.py).
Closes#32486.
When a clarify prompt times out (backend _block returns an empty answer
after the configured timeout) or is dismissed with Esc/Ctrl+C, the live
ClarifyPrompt overlay was torn down by turnController.idle() ->
resetFlowOverlays() with no persistent transcript record. The question and
options vanished from the screen while the agent's follow-up still referred
to "the options above".
The answered path already persists the question + answer; only the
unanswered exits left no trace. This asymmetry is the bug.
Fix (TUI layer only, no Python/protocol change):
- formatAbandonedClarify() in lib/text.ts renders the question + the same
1-based numbered option list shown by ClarifyPrompt, plus a reason
('timed out' / 'cancelled').
- Timeout: createGatewayEventHandler flushes a still-live clarify into the
transcript as a plain system line when the clarify tool's own tool.complete
fires. A live capture of the event stream confirmed this is the only point
where the overlay is still set after a timeout: the sequence is
clarify.request -> (timeout) -> tool.complete -> message.complete, with NO
intervening message.start/tool.start. On a real answer, answerClarify()
clears the overlay before tool.complete arrives, so the hook no-ops there
(no double-write); a per-requestId guard set is belt-and-braces.
- Explicit cancel: answerClarify('') persists the prompt as a system line
instead of a transient 'prompt cancelled' flash.
System lines always render (unlike trail lines, which /details can hide),
so the record reliably survives on screen as standard output.
Verified live in the TUI: an Esc-cancelled clarify now leaves the question +
options + '(cancelled - no selection)' in the transcript after the turn ends.
Tests: formatAbandonedClarify unit cases + gateway-handler behavioral cases
(persist on clarify tool.complete, no flush on a non-clarify tool.complete,
no double-persist on repeat tool.complete, no-op when the overlay was already
cleared by an answer).
- right-click a profile square to rename or delete it, via shared
self-contained dialogs (also reused by the profiles page)
- switching or creating a profile now resets to a fresh new-session
draft so the prior session doesn't stay sticky across contexts
- deleting the profile you're currently in falls back to default
instead of stranding the gateway on a dead profile
- shared ConfirmDialog: Enter/Space confirm from anywhere in the dialog;
profile-delete and cron-delete both route through it
The per-session icon picker added more noise than value — rip it out end
to end (sessions.icon column, set_session_icon, the PATCH field, the
picker UI, and the SessionInfo.icon type).
The cross-profile session aggregator now opens each profile's state.db
read-only (mode=ro, no schema init), so listing other profiles on every
sidebar refresh never DDLs or takes a write lock on their live DBs. The
single-profile hot path stays on par with /api/sessions.
Left-align the default's home icon next to the create "+" in the
single-profile state (toggle/squares/Manage still appear only once a
second profile exists).
Always mount the profile rail, but when only the default profile exists
render just the create-profile "+" (hide the default/all toggle, the
draggable squares, and Manage). Gives a first-profile affordance without
the full switcher chrome; everything else appears once a 2nd profile exists.
If a user drops back to a single profile while scope is still ALL
(persisted), the rail is hidden — they'd be stuck in the grouped view
with no toggle out. Fall back to the scoped view when only one profile.
- Wheel maps vertical scroll → horizontal so the rail is navigable with a
plain mouse (trackpad x-scroll still passes through).
- Springy easeOutBack reflow; dragged square glides between snapped cells
(no scale — overflow-x strip would clip it) with a subtle lift.
- Haptic 'selection' tick per crossed cell + 'success' on a committed reorder.
Snap the drag transform to whole cells (no free glide) and clamp it to the
occupied squares strip via a relative wrapper as offsetParent, so a square
can't float past the last profile onto the "+" and break the layout.
overflow-x-auto makes overflow-y compute to auto, so a vertical drag
translate faulted in a cross-axis scrollbar. Pin the drag transform to
y:0 with a modifier — squares only slide horizontally now.
Bind Cmd/Ctrl+P to the command palette alongside Cmd+K (VS Code quick-open
muscle memory); Cmd+. stays the command center. No Print accelerator
competes, so the renderer preventDefault is enough.
Make the named-profile squares reorderable via dnd-kit (horizontal sort,
4px activation so a tap still selects). Order persists in localStorage
($profileOrder); unordered/new profiles alphabetize at the tail.
- Add a "+" in the profile rail that opens a self-contained CreateProfileDialog
(name + clone toggle + optional SOUL.md); extract it and ActionStatus from
the profiles view so both surfaces share one flow.
- Keep the profile rail pinned to the bottom when a profile has no sessions by
rendering a flex-1 spacer (previously the rail floated up to the nav).
Add first-class profile support to the desktop app without app reloads.
- Swap the single live gateway onto a session's profile lazily (spawned on
demand by the Electron backend pool), so one backend serves the active
profile and others stay cold — no OOM with many profiles.
- Aggregate sessions across profiles by reading each profile's state.db
read-only; unified "All profiles" view groups sessions per profile with
per-profile pagination, while the default view stays scoped to one profile.
- Add an Arc-style profile rail at the sidebar foot: a default<->all toggle
pinned left, colored named-profile squares scrolling between, Manage pinned
right. Profile identity is a deterministic per-name color.
- Route profile-scoped REST (config/env/skills/tools/model) to the active
gateway profile and invalidate React Query caches on swap. Single-profile
users never trigger a swap, so their path is unchanged.
Backend:
- web_server: profile-aware active/list endpoints + per-profile session
totals; hermes_state: session_count(exclude_children); main.py: honor
--profile over HERMES_HOME env for pooled backends.
UI primitives:
- Add a position-aware Tip tooltip (instant, themed) as a drop-in for native
title=, and strip redundant tooltips from self-descriptive chrome.
Two fixes for desktop app slash command handling:
1. Slash commands submitted while the agent is busy now execute
immediately instead of being queued. Previously submitDraft()
unconditionally queued any draft when busy, but slash commands
are client-side operations or self-contained gateway RPCs that
should run regardless of busy state (matching TUI behavior).
executeSlashCommand already has its own per-command busy guard
for commands that genuinely need an idle session.
2. Slash command trigger items no longer leak the "|index" suffix
from their item.id into the serialized chip text. The
toItem callback now sets rawText in metadata so
hermesDirectiveFormatter.serialize takes the direct-insertion
path instead of the legacy @type:id fallback. This also means
slash commands enter the composer as plain text (not chips),
matching selectSkinSlashCommand and TUI behavior.
/branch (aka /fork) sessions vanished from /resume and /sessions. Both
surfaces funnel through list_sessions_rich(include_children=False), which
hid any session with a parent_session_id unless identified as a branch via a
heuristic — parent.end_reason == 'branched' AND child.started_at >=
parent.ended_at.
Two ways that heuristic failed:
1. CLI/gateway branches: once the parent was reopened (e.g. resumed) and
re-ended with a different end_reason (tui_shutdown overwriting 'branched'),
the heuristic stopped matching and the branch was hidden permanently.
2. TUI branches (tui_gateway session.branch): the TUI never ends the parent
as 'branched' — it creates the child while the parent is still live — so
the heuristic NEVER matched and TUI branches were hidden from the moment
they were created (this is the macOS desktop app's primary symptom).
Fix: persist a stable '_branched_from' marker in the branch session's
model_config at creation time across ALL THREE branch paths (CLI cli.py,
gateway gateway/run.py, and TUI tui_gateway/server.py), and OR a
json_extract(model_config, '$._branched_from') IS NOT NULL check into the
list_sessions_rich filter. The marker is immutable across the parent's
lifecycle, so the branch stays visible regardless of how/whether the parent
is ended. The legacy end_reason heuristic is kept (OR'd) so pre-existing
branches remain visible. Subagent/compression children (no marker, parent
not 'branched') stay correctly hidden. Fixes#20856.
Approach by liuhao1024 (PR #20864); reimplemented on current main, extended
to the TUI branch path (which the original missed), with regression tests for
the reopen+re-end scenario and the TUI marker persistence.
hermes update can brick a Windows install. When 'hermes update --force' runs
past the concurrent-process guard, rebuild_venv runs while the venv is still in
use: shutil.rmtree(ignore_errors=True) deletes site-packages + certifi's cert
bundle but can't remove the locked python.exe, leaving a half-gutted venv that
uv venv then refuses to overwrite. Every later HTTPS call dies with
FileNotFoundError for the missing cacert and there is no recovery.
--clear alone (the c136eb4de retry path) does not fix the real lock case: when
the locked interpreter is *inside* the venv being rebuilt, neither rmtree nor
uv venv --clear can delete it. os.replace of the parent directory *is* allowed
on Windows (a running .exe is tracked by handle, not path), so we move the old
venv aside atomically to <venv>.old, rebuild with --clear in its place, and the
still-running gateway/desktop keep using the moved-aside copy until they
restart. If the venv genuinely can't be moved, we abort cleanly and leave it
fully intact; if the rebuild fails, we restore the moved-aside copy.
Folds in the call-site guards from #38511 (@f3rs3n):
- rebuild_venv() returns False (and restores the backup) if uv exits 0 without
producing an interpreter.
- both hermes update venv-rebuild call sites abort with RuntimeError instead of
continuing into dependency install when rebuild_venv() returns False.
Also gitignore /venv.old/ so the update autostash (git stash --include-untracked)
doesn't sweep the moved-aside venv on every run.
Root-cause fix for #37881. Supersedes the --clear-only retry from c136eb4de.
Co-authored-by: f3rs3n <32328813+f3rs3n@users.noreply.github.com>
The salvaged fix held _session_resume_lock across _make_agent (MCP discovery
+ AIAgent construction, seconds), serializing it against session.close. Since
session.close runs on the main RPC dispatch thread (not a _LONG_HANDLER), a
close racing a mid-build resume would stall all fast-path RPCs (approval.respond,
session.interrupt).
Restructure to double-checked locking: build the agent outside the lock, then
re-check _find_live_session_by_key under the lock before _init_session. A losing
concurrent resume discards its just-built agent (no worker/poller wired yet) and
reuses the winner. Updated the concurrent-resume regression test to assert the
real invariant (one surviving live session + loser agent closed) rather than the
implementation detail of a single _make_agent call.
quick() and dry_run() previously trusted the stored category from
tracked.json without re-validating at delete time. Stale entries from
before #34840 could carry category="cron-output" for cron control-plane
paths (e.g. cron/jobs.json), causing quick() to delete the live
scheduler registry.
Fix:
- Fix guess_category() to only classify cron/output/** as cron-output
(was classifying ALL cron/* paths, missing the #34840 fix).
- Re-validate cron-output entries via guess_category() at delete time
in quick() and dry_run(); stale entries that are no longer classified
as cron-output are skipped and removed from tracked.json.
- Add _is_protected_cron_path() as a hard defense-in-depth guard that
blocks deletion of cron/cronjobs directories and known control-plane
files (jobs.json, .tick.lock) regardless of stored category.
- Update test_cron_subtree_categorised to match fixed guess_category
(only cron/output/* is cron-output, not all of cron/).
Tests: add 5 regression tests in TestStaleCronEntryMigration.
A session_reset (/new, /cc) that bumps the run generation while an agent
turn is in flight left the dead agent in the _running_agents slot: the
in-flight run's own release is generation-guarded and correctly returns
False, and the outer finally's sentinel-only check also missed the
leftover real agent. The session then silently dropped every subsequent
message as 'agent busy' until a full gateway restart. (#28686)
- _process_message_or_command outer finally now calls the unconditional,
idempotent _release_running_agent_state(key) on all exit paths instead
of the sentinel-vs-else branch that could strand a dead agent.
- _handle_reset_command evicts the slot right after bumping the
generation, so the zombie is cleared at reset time regardless of how
the in-flight run unwinds.
Co-authored-by: CryptoByz <cryptobyz.airdrop@gmail.com>
Root-run gateways have $HOME=/root, which is on the MEDIA system-path
denylist, so the gateway silently dropped agent-generated deliverables
under /root (e.g. /root/work/proposal.docx) — the user got a 'here is
your file' reply with nothing attached.
_path_under_denied_prefix now treats the running user's own home as
deliverable: the home tree itself is no longer denied, while the
more-specific denied paths inside it (~/.ssh, ~/.aws, ~/.hermes/.env,
auth.json, config.yaml) stay blocked because they are separate denylist
entries. The exception only matches when the denied prefix IS $HOME, so
a non-root gateway still can't deliver another user's home.
Diagnosis, reproduction, and the failing-case analysis are from
@GodsBoy (#38108 / #38106). Implemented here as the minimal denylist
fix rather than a staging/copy subsystem.
Co-authored-by: GodsBoy <dhuysamen@gmail.com>
search_sessions_by_id previously fetched up to 10k sessions via
list_sessions_rich and filtered them in Python — O(n) per keystroke.
Push the id match into SQL instead.
- list_sessions_rich gains an optional id_query param: a case-insensitive
LIKE pushed into the outer WHERE, matched against each surfaced row's id
AND every id in its forward compression chain (via the existing chain
CTE). Searching a compression root id or a tip id both resolve to the
same projected conversation. LIKE wildcards in the needle are escaped.
- search_sessions_by_id now fetches only matching rows (limit*4) and ranks
exact > prefix > substring in Python over that small set.
- web_server /api/sessions/search: route ID matches and content matches
through one lineage-keyed dedup helper so an id-hit and a content-hit on
the same conversation collapse to a single result (the contributor's
version keyed ID hits by raw sid and content hits by root, which could
double-list a compression tip).
- command-center haystack also matches _lineage_root_id for parity.
E2E verified against a real DB: exact match over 3000+ sessions
materializes 1 row in Python (was ~3000), 5ms; root-id resolves to tip;
LIKE-wildcard escaping holds.
Follow-up to @0xharryriddle's feat(desktop): search sessions by id.
The dashboard specify and decompose endpoints run as sync FastAPI threadpool
handlers and pinned the active board by mutating the process-global
HERMES_KANBAN_BOARD env var. Two concurrent requests for different boards
race on that shared global and cross-write — the same bug class as the CLI
path (#38323), now using the scoped_current_board() contextvar introduced by
the CLI fix.
handleSaveDesc and handleAutoDescribe both set their loading flag in a
try block but always cleared it unconditionally in finally. When a user
opened profile A's description editor, clicked Save, then quickly
switched to profile B's editor and saved, profile A's resolving request
would clear descSaving/describing while profile B's request was still
in-flight, making the "Saving…" indicator disappear prematurely.
Track concurrent in-flight counts with descSavingCount and
describingCount refs (mirrors the existing activeDescRequest guard
pattern). The loading flag is cleared only when the counter reaches
zero, i.e. all overlapping requests have settled.
POST /api/profiles returns model_set: false when the model assignment
step fails (e.g. filesystem error) while the profile itself was created
successfully. handleCreate discarded the response, so the user received
a "Profile created" success toast with no indication that their chosen
model was not persisted.
Capture the response and show an error toast when a model was selected
but model_set is explicitly false, directing the user to set it from
the profile editor.
The apply handler sent SIGTERM then fired a 150 ms setTimeout to reload
the renderer. If the backend took longer to shut down the port was still
bound when startHermes() ran after reload, causing an "address already
in use" failure.
Capture the process reference before resetHermesConnection() nulls it,
then await the actual exit event. A 5 s SIGKILL fallback ensures the
wait never hangs if the backend ignores SIGTERM.
The salvaged detector validated each cached electron-*.zip with
zipfile.testzip() and only purged ones it judged corrupt. But stdlib
zipfile reads from the end-of-central-directory backward, so it silently
tolerates prepended/concatenated junk — which is exactly the corruption
the bug report names ('86257938 extra bytes at beginning or within
zipfile', a partial download resumed into the same file). testzip()
returns clean on those zips, so the self-heal never fired for the
reported failure mode.
Drop the self-rolled validator: on any packaged-build failure, purge the
version's cached zips AND the half-written unpacked dir, then retry once.
@electron/get re-downloads with its own SHASUM verification — the real
source of truth, which catches prepend/concat/truncate alike. An
unrelated failure just costs one clean re-download and fails the same way.
Verified empirically: zipfile.testzip() returns None (clean) on a
prepended-junk zip; the unconditional purge removes it correctly.
hermes desktop failed on Linux with an ENOENT renaming
release/linux-unpacked/electron -> Hermes. Root cause is a corrupt
cached Electron zip (~/.cache/electron/electron-*.zip): app-builder
unpack-electron extracts a partial tree from the bad zip that is
missing the electron binary, so electron-builder dies on the final
rename. Re-running repeats the broken extraction, leaving the desktop
app permanently unlaunchable until the cache is manually purged.
- Add _electron_download_cache_dirs() + _purge_corrupt_electron_cache()
to hermes_cli/main.py: validate every electron-*.zip via
zipfile.testzip() and delete corrupt ones; honor electron_config_cache
/ ELECTRON_CACHE overrides with per-OS defaults.
- Wire purge + single retry into cmd_gui packaged-build failure path so
a poisoned download self-heals (electron re-downloads clean).
- Add beforePack hook (apps/desktop/scripts/before-pack.cjs) to wipe the
target unpacked dir before staging, making packaging idempotent across
interrupted runs. Cross-platform, best-effort.
- Tests: corrupt-zip detector, cmd_gui purge/retry/launch path,
no-retry-when-clean path, and node --test for the cleanup helper.
The MiniMax t2a_v2 code path calls response.json() without first
checking the HTTP status code. If the API returns HTTP 4xx/5xx with
non-JSON content (e.g. HTML error page), response.json() raises an
opaque JSONDecodeError instead of a clear HTTPError.
The non-t2a_v2 path already has response.raise_for_status() at line
1299. Add the same check before response.json() in the t2a_v2 path
for consistent error handling.
Follow-up to the salvaged #37727. That PR fixed the reactive recovery path
(classifier + post-failure shrinker) but left the PROACTIVE embed-time guard
in vision_tools byte-only — a tall small-byte screenshot (e.g. 1200x12000 at
0.06 MB) still baked into immutable history un-resized, relying on a failed
round-trip to trigger reactive shrink.
- vision_tools: add _image_exceeds_dimension() + _EMBED_MAX_DIMENSION (7900px);
the embed-time cap now fires on bytes OR pixels and passes max_dimension to
the resizer, so tall small-byte images are shrunk before they're embedded.
- vision_tools: best-effort lazy-install of Pillow (tool.vision) in the resize
ImportError fallback so the soft dep self-heals (respects allow_lazy_installs).
- error_classifier: add two more Anthropic dimension-cap wording variants.
- pyproject + lazy_deps: declare Pillow as the [vision] extra / tool.vision
lazy dep (it was undeclared everywhere; without it ALL resize recovery no-ops).
- tests: cover _image_exceeds_dimension (tall/small/edge/no-Pillow/corrupt).
Co-authored-by: kyssta-exe <kyssta-exe@users.noreply.github.com>
Anthropic enforces two independent ceilings per image:
1. 5 MB encoded byte size
2. 8000 px longest side
Hermes only guarded #1. A tall screenshot (e.g. 1200x12000 at 0.06 MB)
passes every byte check but fails the pixel check, returning a
non-retryable HTTP 400 that permanently bricks the conversation thread.
Fixes:
- error_classifier: add 'image dimensions exceed' pattern to
_IMAGE_TOO_LARGE_PATTERNS so the 400 is classified as image_too_large
and triggers the shrink/retry path instead of falling through to
non-retryable error.
- conversation_compression: check pixel dimensions (via Pillow) even
when byte size is under the 4 MB target. If max(dims) > 8000, force
shrink.
- vision_tools._resize_image_for_vision: add optional max_dimension param.
When set, images exceeding the pixel cap are downscaled even if they're
under the byte budget. The resize loop now checks both byte AND pixel
limits before accepting a candidate.
Closes#37677
The ResponseStore.get() method calls json.loads(row[0]) without any
error handling. If the SQLite responses table contains corrupted JSON
data (e.g. from a crash mid-write or disk corruption), this raises
an unhandled JSONDecodeError that propagates to the caller.
Fix: wrap in try/except (json.JSONDecodeError, TypeError). On parse
failure, log a warning, evict the corrupted entry from the cache, and
return None (consistent with the function's Optional return type).
Collapse the payload-shape normalization helpers into one _as_dict and
drop unused dataclass fields (user_type/user_role, duplicate id, bot) on
the meeting-invite handler. Module 274->212 LOC, behavior unchanged.
Add zhaolei.vc@bytedance.com -> zhaoleibd to release.py AUTHOR_MAP.
All Discord interactive views (ExecApprovalView, SlashConfirmView,
UpdatePromptView, ModelPickerView, ClarifyChoiceView) now edit their
message when the view times out, disabling buttons and updating the
embed to show a 'Prompt expired' footer. Previously, timed-out buttons
remained visually clickable in the UI, causing Discord's generic
'Interaction failed' error when clicked.
Fixes#38022
Two complementary fixes for a silent partial-install failure that bit
``hermes update`` in the wild: a fresh checkout pulled 145 commits,
``rebuild_venv`` failed to recreate the venv on Windows because
``shutil.rmtree(ignore_errors=True)`` couldn't delete files held open by
the running ``hermes.exe`` shim. ``uv venv`` then refused with
"A directory already exists at: venv" and the update fell back to
installing on top of the stale venv. The resulting partial install
missed exactly one newly-added base dep — ``pathspec==1.1.1`` — which
``hermes desktop --build-only`` imports at the top of its content-hash
check. The desktop rebuild died with ModuleNotFoundError and the parent
update only logged "⚠ Desktop build failed (non-fatal)". Same root cause
made the "default: sync failed" line in the skill-sync stage, because
that sync subprocess hit the same missing import.
Fix 1: ``rebuild_venv`` retries with ``--clear``
------------------------------------------------
If ``uv venv`` fails with "already exists" in stderr (which is what uv
prints, and what uv's own hint tells you to fix with --clear), retry
once with ``--clear``. Only this specific failure pattern triggers the
retry — disk-full / interpreter-download failures still surface as
before so we don't mask real problems.
Fix 2: post-install dep verification
------------------------------------
Belt-and-suspenders so future uv resolver quirks (or any other cause of
partial installs) surface immediately instead of hours later in a
downstream subprocess. After ``_install_python_dependencies_with_optional_fallback``
runs, ``_verify_core_dependencies_installed``:
1. Reads ``[project.dependencies]`` straight from pyproject.toml
(so we don't trust the venv's stale metadata).
2. Filters by environment markers via ``packaging.requirements.Requirement``
so cross-platform exclusions (``ptyprocess ; sys_platform != 'win32'``)
don't false-positive on Windows.
3. Runs ``importlib.metadata.version()`` for each remaining dep inside
the *target* venv interpreter (resolved from ``VIRTUAL_ENV``, not
``sys.executable``).
4. If anything is missing, reinstalls the base group with
``--reinstall`` to force re-resolution. If a second probe still
reports missing deps, force-installs each one with its pinned spec.
5. Treats final failure as a warning rather than a hard error — a
single broken-on-PyPI dep shouldn't block an otherwise-successful
update — but the message points at ``hermes update --force`` and
names the missing packages so the user knows what's wrong.
Tests
-----
- ``TestRebuildVenv::test_retries_with_clear_when_dir_already_exists`` —
simulates the rmtree-couldn't-delete-it failure mode and asserts the
``--clear`` retry path is taken and succeeds.
- ``TestRebuildVenv::test_does_not_retry_when_first_failure_is_not_dir_exists``
— guards against masking real failures (disk full, etc.).
- ``test_verify_core_dependencies.py`` — 7 tests covering the happy
path, the regression (missing pathspec triggers --reinstall), the
per-package fallback when --reinstall doesn't help, the platform-
marker filter so Windows doesn't try to install ptyprocess, the
missing-pyproject noop, and the VIRTUAL_ENV resolver.
Co-authored-by: Kyssta <218078013+kyssta-exe@users.noreply.github.com>
Two error handling gaps in the gateway kanban dispatcher:
1. float() on dispatch_interval_seconds crashes with ValueError if the
config value is a non-numeric string. Wrap in try/except and fall
back to the default 60-second interval with a warning log.
2. splitlines()[0] on payload_summary and task.result raises IndexError
when the string is whitespace-only (truthy but strip() produces empty
string, splitlines() returns []). Guard with a check on the lines
list before indexing.
* docs(guides): add "Run Nemotron 3 Ultra free in Hermes Agent" launch guide
Day-0 NVIDIA Nemotron 3 Ultra availability on Nous Portal (free June 4-18,
in partnership with NVIDIA + Nebius). Quick Setup walkthrough for selecting
the nvidia/nemotron-3-ultra:free tier, plus switching/troubleshooting notes.
Registered at the top of Guides & Tutorials.
* docs(guides): reword Nemotron lead-in to match launch copy
Frame as Nemotron Coalition induction (working with NVIDIA) + Nebius
partnership for the free tier, rather than a direct NVIDIA partnership,
to avoid overstating the relationship.
* docs(guides): lead Nemotron guide with desktop app, CLI second
Add a one-click desktop-app install track (download → Nous Portal
recommended sign-in → pick the Free-tier nemotron-3-ultra model) as the
recommended path for non-terminal users, and keep the CLI curl flow as
Option B. Update switching/troubleshooting to cover both surfaces.
hermes auth add qwen-oauth called pool.add_entry() but never wrote to
providers["qwen-oauth"] or set active_provider in auth.json.
_model_section_has_credentials() checks get_active_provider() first; with
active_provider unset and no api_key_env_vars configured for oauth_external
providers, the setup wizard reported "No inference provider configured" even
after a successful Qwen CLI OAuth login.
Add _mark_qwen_oauth_active() in auth.py: writes a minimal provider state
entry (base_url for display only) and calls _save_provider_state() to set
active_provider. The function deliberately does not copy the api_key — that
lives in the Qwen CLI credential file managed by _save_qwen_cli_tokens /
resolve_qwen_runtime_credentials and must not be duplicated in auth.json
where it would become stale.
pool.add_entry() is retained so "hermes auth list" continues to show the entry.
Runtime credential resolution continues to use resolve_qwen_runtime_credentials.
Mirrors the fix applied to openai-codex (#37517) and xai-oauth (#37576).
The WeCom adapter delivers each response as a single complete message
via aibot_respond_msg / aibot_send_msg — it does not stream tokens
incrementally (no edit_message override) and send_typing is a no-op.
Reword the 'Reply-mode streaming' feature bullet to 'Reply correlation',
retitle the section to 'Reply-Mode Responses', and add a note clarifying
that neither token streaming nor typing indicators are supported.
Three Copilot inline review comments on #37664, two worth landing
in a polish pass before merge:
1. auxiliary_client.py:270 — Copilot suggested keeping the
minimax-* entries in _API_KEY_PROVIDER_AUX_MODELS_FALLBACK as
a safety net for environments where the profile-based
resolution can't import or run plugin discovery. **Declined.**
The deepseek precedent (commit 773a0faca) explicitly removed
deepseek from the same dict for the same reason — the profile
layer is the source of truth and the dict is a legacy
pre-profiles-system fallback. We do not want to fragment the
codebase by provider: either the profile layer is authoritative
or the dict is. The minimax PR picks profile (matching deepseek)
and the dict stays cleaned up. The risk Copilot raises is
real but theoretical — plugin discovery runs at import time of
the providers module, which is the first thing any modern
Hermes entrypoint imports.
2. tests/agent/test_minimax_provider.py:162 — Copilot flagged
that the test class relies on _get_aux_model_for_provider()
resolving via provider profiles but doesn't explicitly trigger
plugin discovery. **Fixed.** Added 'import model_tools # noqa:
F401' at the top of both test_minimax_aux_is_standard and
test_minimax_aux_not_highspeed. The fixtures in the parallel
test_minimax_profile.py already did this; the legacy test in
test_minimax_provider.py was order-dependent and would silently
break if anyone reorganised the test ordering. Pinned the
dependency explicitly so the test is order-independent.
3. tests/plugins/model_providers/test_minimax_profile.py:46 —
Copilot flagged that the docstring referenced a hard-coded
line number 'hermes_cli/models.py:298' that would go stale.
**Fixed.** Replaced with the symbol reference
'hermes_cli.models._PROVIDER_MODELS[\'minimax\']' which is
stable under file edits and grep-friendly. The new docstring
also reads more naturally — readers don't have to look up
'what's at line 298' to follow the reasoning.
All 221 minimax-related tests still pass.
Two follow-ups to the M3 default-aux-model PR (#37664):
1. AUTHOR_MAP entry: add fearvox1015@gmail.com -> Fearvox so the
check-attribution CI job recognises Nolan's real contributor
email. The previous run of the attribution check on #37664
failed because the commit was authored as nolan@0xvox.com
(wrong local git config) which isn't in AUTHOR_MAP. The
commit itself is now re-authored to fearvox1015@gmail.com
so both the per-commit check and the AUTHOR_MAP lookup pass.
2. tests/hermes_cli/test_api_key_providers.py::TestMinimaxOAuthProvider
::test_minimax_oauth_aux_model_registered was pinning the aux
model in the legacy _API_KEY_PROVIDER_AUX_MODELS dict, which
the PR correctly removed (mirrors the deepseek cleanup in
773a0faca). The test now asserts the new world order: the
aux model comes from ProviderProfile.default_aux_model on
the minimax-oauth profile, not the fallback dict. This is
the same pattern that the profile-layer deepseek fix
introduced.
The minimax / minimax-cn / minimax-oauth profiles still advertised
M2.7 (and M2.7-highspeed for OAuth) as their default_aux_model,
predating the M3 release (2026-06-01). The user-facing
_PROVIDER_MODELS['minimax'] catalog top entry is M3, and the
recommended config for a Token-Plan install now sets
model.default: MiniMax-M3, so the aux default was the only
remaining drift.
Updates:
* minimax default_aux_model: M2.7 -> M3
* minimax-cn default_aux_model: M2.7 -> M3
* minimax-oauth default_aux_model: M2.7-highspeed -> M2.7
(M3 is not on the OAuth / Coding Plan tier per
platform docs as of this PR; the highspeed
variant was the 2x-cost regression from #4082
that PR #6082 collapsed to plain M2.7 for
minimax / minimax-cn but missed OAuth)
* agent/auxiliary_client.py: drop the three legacy
_API_KEY_PROVIDER_AUX_MODELS_FALLBACK entries for the minimax
family. _get_aux_model_for_provider() reads from
ProviderProfile.default_aux_model first (line 250) and only
falls back to the dict when the profile has no aux model or
the profile import fails. With the profile now set, the dict
entries are dead code and a drift hazard. Mirrors the deepseek
cleanup in 773a0faca.
* tests/agent/test_minimax_provider.py: update the existing
TestMinimaxAuxModel assertions from MiniMax-M2.7 to MiniMax-M3
(the intent — 'standard, not highspeed' — is unchanged; the
pin value is).
* tests/plugins/model_providers/test_minimax_profile.py: new
file mirroring tests/plugins/model_providers/test_deepseek_profile.py.
Pins each of the three profiles' default_aux_model and
asserts _get_aux_model_for_provider() returns it. A second
class guards against the highspeed regression coming back.
Refs:
- Closes#36196 in spirit (M3 support — the catalog half of
that issue is #36212; this PR covers the profile half)
- Related: #4082 (M2.7-highspeed 2x-cost), #6082 (previous
M2.7-highspeed -> M2.7 fix that missed OAuth + the
auxiliary_client.py fallback dict)
- Pattern: 773a0faca (same profile-layer fix for deepseek)
hermes auth add xai-oauth called pool.add_entry() directly, writing only the
credential-pool entry (source "manual:xai_pkce") without touching
providers["xai-oauth"] or setting active_provider in auth.json.
_model_section_has_credentials() checks get_active_provider() first; with
active_provider unset and no api_key_env_vars configured for oauth_external
providers, the setup wizard reported "No inference provider configured" even
after a successful OAuth login.
Use _save_xai_oauth_tokens() — the canonical path already called from the
hermes model xAI login flow — which writes providers["xai-oauth"]["tokens"]
(setting active_provider) and lets _seed_from_singletons seed the pool with
a "loopback_pkce" entry on the next load_pool() call.
Mirrors the fix applied to openai-codex in #37517.
hermes auth add google-gemini-cli called pool.add_entry() but never wrote
to providers["google-gemini-cli"] or set active_provider in auth.json.
_model_section_has_credentials() checks get_active_provider() first; with
active_provider unset and no api_key_env_vars configured for oauth_external
providers, the setup wizard reported "No inference provider configured" even
after a successful OAuth login.
Add _mark_google_gemini_cli_active() in auth.py: writes a minimal provider
state entry (email for display only) and calls _save_provider_state() to set
active_provider. The function deliberately does not copy access_token or
refresh_token — those are managed by agent.google_oauth in the Google
credential file and must not be duplicated in auth.json where they would
become stale.
pool.add_entry() is retained so "hermes auth list" continues to show the entry.
Runtime credential resolution continues to use agent.google_oauth directly.
Mirrors the fix applied to openai-codex (#37517) and xai-oauth (#37576).
Follow-up on the parallel-dispatch decoupling: the sequential pass for
workdir/profile jobs still ran inline in the ticker thread, so a long
workdir/profile job reintroduced the exact starvation #37312 describes,
just for env-mutating jobs. And the MCP orphan sweep ran immediately
after dispatch in sync=False mode — before jobs finished — defeating its
own 'runs after every job' contract and racing jobs still spawning MCP
children.
- Sequential jobs now queue to a persistent single-thread cron-seq pool
(preserves one-at-a-time ordering across ticks, never blocks the tick).
- Same in-flight dedup guard now covers sequential jobs.
- MCP orphan sweep runs via a done-callback after the LAST dispatched job
completes in async mode; inline after as_completed in sync mode.
Verified E2E: tick(sync=False) returns in ~1ms with a 1.5s sequential job
in flight; sweep fires only after that job ends.
PR #13021 fixed serial starvation by adding ThreadPoolExecutor to tick(),
but kept as_completed(timeout=600) which still blocks the ticker thread
until the slowest job finishes. This causes the same starvation pattern:
when one job runs long (15+ min), other jobs' next_run_at expires past the
grace window and they get perpetually fast-forwarded instead of running.
This PR decouples dispatch from completion:
- Persistent ThreadPoolExecutor (reused across ticks, no auto-join)
- Fire-and-forget dispatch: tick submits and returns immediately
- Running-job guard: prevents re-dispatching active jobs
- sync parameter: defaults to True (backward compatible), callers opt
into sync=False for non-blocking behavior
- atexit shutdown handler for clean pool teardown
- gateway/run.py: production ticker opts into sync=False
Refs #33315 (complementary — that issue's PRs fix grace handling in
jobs.py; this PR prevents the grace from expiring in the first place)
Replace KeepAlive.SuccessfulExit=false dict with <key>KeepAlive</key><true/>
so launchd restarts hermes-gateway on any exit, matching the documented
drain-then-exit restart protocol used by --graceful-restart.
The salvaged conversion emitted type:"input_video", which MiniMax M3 rejects
just like the original video_url block. Per MiniMax's Anthropic-compat docs,
the video content block is type:"video" with an image-style source (base64 or
url). Fixes the block type, converts URL-based videos too, and adds 4 video
conversion tests (none shipped with the original PR).
The video_analyze tool sends OpenAI-style 'video_url' content blocks, which
breaks Anthropic-protocol providers (minimax, minimax-cn). These providers
expect 'input_video' blocks with base64 data instead of data: URLs.
Extends _convert_openai_images_to_anthropic() to also handle video_url
blocks, converting them to Anthropic's input_video format when targeting
Anthropic-compatible endpoints.
Fixes#37219
The ``grok-4.3`` (1M context) catalog entry was added on 2026-05-15
(ce0e189d3). Between 2026-04-10 (when ``grok-4`` at 256,000 was first
added by b57769718) and 2026-05-15, grok-4.3 slugs resolved via the
generic ``grok-4`` substring catch-all and that 256,000 value was
persisted to context_length_cache.yaml. Users who first queried
grok-4.3 in that 35-day window are stuck at 256K forever — the cache
is read at step 1 before the hardcoded defaults in step 8, so the
correct 1M entry is never reached.
Mirror the existing Kimi/Codex/MiniMax-M3 stale-cache guards: add
_model_name_suggests_grok_4_3() and an elif branch that drops any
cached value ≤ 256,000 for a grok-4.3 slug so the next lookup falls
through to the 1M hardcoded default.
Adds 4 regression tests: helper unit test, stale-drop-and-re-resolve,
correct-cache-preserved, and no-clobber for plain grok-4 (256K correct).
The salvaged pattern matched -i only inside the first flag token, so
`perl -p -i -e '...' config.yaml` (the -i split out after -p) slipped
through. Widen to match a -...i flag token anywhere in the args; still
no false positive on `perl -e` code eval or config reads. Adds tests
for the separate-token, backup-suffix, and read-safe forms.
sed -i coverage for ~/.hermes/config.yaml and .env was added in #14639,
but perl -i and ruby -i — which perform the same direct file mutation —
were not covered. The existing perl/ruby pattern only catches -e/-c (code
evaluation), not -i (file mutation), so:
perl -i -pe 's/approvals.mode: on/approvals.mode: off/' ~/.hermes/config.yaml
bypasses the approval gate entirely, letting the agent flip approvals.mode
off mid-session via the mtime-keyed config cache reload.
Add a single pattern mirroring the sed -i lines: `\b(?:perl|ruby)\s+-[^\s]*i`
against both _HERMES_CONFIG_PATH and _HERMES_ENV_PATH. Three regression
tests pin the new coverage.
User-installed memory providers load under the synthetic
_hermes_user_memory.<name> package, but the loader never registered that
parent namespace in sys.modules (it only registers "plugins" and
"plugins.memory" for bundled providers). As a result any external provider
using a relative import failed to load:
from . import config
ModuleNotFoundError: No module named '_hermes_user_memory'
The same gap in discover_plugin_cli_commands() meant an external provider's
cli.py with a relative import could never be discovered, so the documented
"hermes <plugin>" CLI integration did not work for standalone plugins.
Register the synthetic parent namespace before loading user-installed
providers, mirror it for cli.py discovery (including the per-provider parent
package, without executing the plugin's __init__.py), and make
_load_provider_from_dir() reuse only modules actually loaded from disk so a
parent shell registered by CLI discovery is never mistaken for the loaded
provider.
Regressions cover: a flat provider with a sibling relative import, a provider
with its implementation in a nested subpackage (including a namespace
intermediate directory), cli.py discovery with a relative import, and
provider load after CLI discovery ran first.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The shared-key bridging loop (allow_from, require_mention,
free_response_channels, …) read only the top-level yaml platform block
(yaml_cfg.get(plat.value)). When a user configured a platform solely
under ``platforms:`` or ``gateway.platforms:`` with no top-level block,
the loop skipped that platform entirely and all bridged keys were silently
dropped into PlatformConfig.extra — making allow_from, require_mention,
etc. ineffective for nested-only configs.
The apply_yaml_config_fn dispatch already received this same fallback in
44f3e51 to handle plugin adapters (e.g. Discord allow_from). The
shared-key loop now mirrors it: if yaml_cfg.get(plat.value) is absent,
fall back to gateway.platforms.<name> then platforms.<name>.
The enabled field is deliberately excluded from the nested fallback
(guarded by _cfg_toplevel): _merge_platform_map already merged it with
the correct precedence, so re-applying it from a single nested source
would overwrite the correctly-merged value.
Two new regression tests assert that allow_from and require_mention
configured under platforms.telegram and gateway.platforms.telegram are
bridged into PlatformConfig.extra. All 54 existing config tests pass.
The classic CLI left its live bottom chrome — the status bar, input box,
and separator rules — frozen in terminal scrollback after exit, on every
exit path (/exit, /quit, Ctrl+C, EOF) and on both Linux and Windows. The
prior erase_when_done=True fix (bf82a7f1c) routes prompt_toolkit's teardown
through renderer.erase(), but that walks back by the renderer's internal
cursor model and does not reliably wipe the chrome in practice — users still
saw a dead status bar + the rest of the session sitting above the resume
summary.
Clear the screen + scrollback directly at the single exit funnel instead.
All exit paths converge on _print_exit_summary() (called from the run-loop
finally block after app.run() returns and prompt_toolkit has restored
terminal modes), so a new _clear_terminal_on_exit() helper runs there before
the summary prints. It writes ESC[3J ESC[2J ESC[H (erase scrollback, erase
screen, home cursor) on a real TTY, no-ops silently when stdout is not a
terminal (pipes/redirects), and falls back to the platform clear command if
the escape write fails. Works on Linux, macOS, and modern Windows terminals
(Terminal/conhost with VT processing, already enabled by prompt_toolkit).
The resume/goodbye summary now prints at a clean top-left with nothing
stranded above it.
Fixes#38252.
The documented path for connecting Hermes Desktop to a remote backend was
`--insecure` + a pinned HERMES_DASHBOARD_SESSION_TOKEN — an unauthenticated
bind plus a copy-pasted token. Replace it everywhere with the bundled
username/password dashboard-auth provider: set HERMES_DASHBOARD_BASIC_AUTH_*,
run `hermes dashboard --host 0.0.0.0` (the non-loopback bind engages the auth
gate), and Sign in from the app.
- desktop.md: rewrite 'Connecting to a remote backend' for the user/pass + Sign in flow
- web-dashboard.md: rewrite both remote-backend sections (overview + dedicated);
reframe the auth-gate section so --insecure is a discouraged escape hatch, not a
co-equal use case; drop the removed --tui flag from the systemd example
- environment-variables.md: lead with HERMES_DASHBOARD_BASIC_AUTH_*; drop the
session-token / HERMES_DESKTOP_REMOTE_TOKEN remote-connect entries
- docker.md: mention the username/password provider as the simplest gate provider
The hermes tools save summary printed '- kanban' (and would print
'+ kanban') for a platform even though kanban is never offered as a
checklist option. kanban is a check_fn-gated toolset whose tools are a
subset of the platform composite, so _get_platform_tools resolves it as
enabled, but _prompt_toolset_checklist only renders CONFIGURABLE_TOOLSETS
— so it can never survive into the returned selection. The added/removed
diff (current_enabled - new_enabled) then surfaced kanban as removed.
Scope the printed diff to the checklist's actual universe via the new
_checklist_toolset_keys() helper at all three diff sites (first-install,
all-platforms, per-platform). The persisted config is unaffected —
_save_platform_tools already preserves non-configurable entries; this was
purely a false-signal in the UI.
The gated dashboard verifies a session cookie by trying each registered
DashboardAuthProvider's verify_session in turn (the session cookie stores
only the access token, not which provider issued it). A provider that
doesn't recognise a token returns None; a provider whose IDP/JWKS is
unreachable raises ProviderError.
The loop used to return HTTP 503 on the FIRST ProviderError, before any
later provider got a turn. With multiple providers stacked, that means an
unreachable IDP for a session you didn't even use blocks login through a
different, reachable provider.
Concrete repro: a self-hosted-OIDC session hits the 'nous' provider first
(registered earlier); nous tries to reach Nous Portal's JWKS, which is
unreachable in a self-hosted deployment, so it raises — and the gate
503s before the 'self-hosted' provider can verify the token. Hit live
while testing the new self-hosted OIDC plugin against a local Keycloak.
Fix: a ProviderError from one provider is logged and the loop continues
to the next. A 503 is returned only if NO provider verified the token
AND at least one was unreachable — distinguishing a transient IDP outage
(don't force a needless re-login) from a token that's genuinely invalid
(fall through to refresh/relogin). Single-provider behaviour is
unchanged.
Tests: adds an _UnreachableProvider stub and three cases — unreachable
provider first must not block a working second; all-unreachable still
503s; reachable-but-unrecognised falls through to 401/relogin (not 503).
Mutation-tested: reverting the fix makes the first case fail with the
exact 503 bug.
Adds a bundled dashboard-auth provider plugin that authenticates the
web dashboard against any conformant self-hosted OpenID Connect server
(Authentik, Keycloak, Zitadel, Authelia, Auth0, Okta, Google, …) using
standard OIDC — no per-IDP code.
It's a pure drop-in plugin implementing the DashboardAuthProvider
protocol; it touches no core auth/runtime/login paths. Mechanics:
- OIDC discovery from {issuer}/.well-known/openid-configuration
(cached; issuer pinned; endpoints required HTTPS, loopback http
allowed for local-dev IDPs)
- authorization-code + PKCE (S256), public client
- verifies the OIDC ID token (RS256/ES256) against the discovered
jwks_uri with iss/aud pinned to the configured issuer/client_id, and
maps standard claims (sub/email/name/preferred_username, groups→org)
onto a Session
- standard refresh_token grant for silent re-auth; RFC 7009 revocation
on logout when advertised
Verifies the ID token (not the access token) because OIDC guarantees the
ID token is a signed JWT carrying identity, while access-token format is
opaque to the client per spec — the only universally-correct choice
across self-hosted IDPs.
Config via dashboard.oauth.self_hosted.{issuer,client_id,scopes} in
config.yaml or HERMES_DASHBOARD_OIDC_{ISSUER,CLIENT_ID,SCOPES} env vars
(env-wins-config, empty-is-unset — same convention as the nous plugin).
Confidential clients (client_secret) left as a documented TODO seam.
Docs: adds a Self-hosted OIDC section to the web-dashboard guide,
including a copy-paste Keycloak worked example (realm import + docker
run + dashboard wiring + login walkthrough).
Tests: 65 cases covering construction, discovery (incl. issuer
mismatch + https enforcement), start_login/PKCE, complete_login, ID
token verification, refresh/revoke, and env/config precedence.
The dashboard's embedded Chat surface (/chat, /api/ws, /api/pty) was gated
behind `hermes dashboard --tui` / HERMES_DASHBOARD_TUI=1. The desktop app and
the dashboard's own Chat tab both drive the agent over the /api/ws + /api/pty
WebSockets, so a dashboard started without the flag would pass the /api/status
health check but slam the chat WebSocket shut with WS code 4403 — the app
connects, reports "ready", and chat stays dead. This was the root cause behind
multiple user reports of the desktop app failing to connect to a self-hosted
gateway/dashboard, and it bit Docker and host installs alike.
Make the embedded chat unconditional:
- web_server.py: _DASHBOARD_EMBEDDED_CHAT_ENABLED defaults to True; drop the
embedded_chat parameter and the runtime reassignment from start_server().
The WS gates still read the constant (now always true) so the seam — and its
"rejects when disabled" contract test — stays meaningful.
- main.py: remove the `--tui` argument from the dashboard subparser and the
`embedded_chat = args.tui or HERMES_DASHBOARD_TUI==1` derivation.
- web/: isDashboardEmbeddedChatEnabled() returns true unconditionally; drop the
deprecated __HERMES_DASHBOARD_TUI__ alias and the dead LEGACY_TUI_RE scrape in
the vite dev-token plugin.
- apps/desktop/electron/main.cjs: drop `--tui` from the spawned dashboardArgs
(it would now error with "unrecognized arguments: --tui") and the redundant
HERMES_DASHBOARD_TUI env injection.
- Docker: no s6 run-script change needed — the script never passed --tui; the
HERMES_DASHBOARD_TUI env var is now simply a no-op, so the image works out of
the box with no extra var.
- Docs: remove every dashboard --tui / HERMES_DASHBOARD_TUI reference across the
CLI reference, env-var reference, docker/desktop/web-dashboard guides, in-app
tips, and the zh-Hans translations. The terminal `hermes --tui` / HERMES_TUI
references are intentionally left untouched.
Tests: 270 passing across web_server, dashboard lifecycle, host-header,
auth-gate, and docker-override-scripts suites.
Sets erase_when_done=True on the classic CLI's prompt_toolkit Application so the
live bottom chrome (status bar, input box, separator rules) is wiped on exit
instead of frozen into scrollback.
Previously prompt_toolkit's render_as_done teardown repainted the chrome one
final time and left it on screen (ESC[J only erases below the cursor, not the
chrome above), so a dead status bar + empty prompt + rules were stranded
between the conversation transcript and the 'Resume this session' summary, and
stacked with the next session's UI on resume. erase_when_done routes teardown
through renderer.erase() which wipes exactly the managed chrome region; the
conversation transcript prints through patch_stdout into normal scrollback and
is untouched. Applies to every exit path (/exit, /quit, EOF, Ctrl+C).
Fixes#38252.
Root installs on Linux (FHS layout, #15608) put the `hermes` command in
`/usr/local/bin` (on PATH) but symlinked the bundled node/npm/npx into
`~/.local/bin`, which isn't on PATH for a stock root shell. `node`/`npm`
were 'command not found' and `hermes dashboard` failed with 'npm is not
available' because its build-on-demand fallback couldn't find npm.
Fix: `install_node()` now symlinks into `get_command_link_dir()` — the same
helper the `hermes` command link already uses — so node/npm/npx land
wherever the command does (`/usr/local/bin` on FHS root, `~/.local/bin`
otherwise, `$PREFIX/bin` on Termux). Non-root and Termux installs are
unchanged.
Also fixes:
- `scripts/lib/node-bootstrap.sh`: adds `_nb_get_link_dir()` mirroring
the same root/Termux/user logic for the standalone bootstrap path
(used by `hermes update`, TUI node bootstrap, etc.)
- `hermes_cli/uninstall.py`: `remove_node_symlinks()` now checks all
candidate directories (`~/.local/bin`, `/usr/local/bin`, `$PREFIX/bin`)
so root FHS uninstalls don't leave orphan symlinks
Regression from #15608, which created the FHS path for the command but
left `install_node` pointed at the legacy user-local dir.
The desktop command-approval ApprovalBar renders inline inside ToolEntry,
which lives inside ToolGroupSlot. When 2+ tools group, the group body is
hidden until expanded, so an approval raised by a pending terminal/
execute_code call was buried behind "Tool actions · N steps" and required
manual expansion to act on (sudo/secret were unaffected — they use modal
overlays).
ToolGroupSlot now subscribes to $approvalRequest and force-opens its body
while an approval targeting one of its pending approval-eligible tools is in
flight, so the inline controls surface with nothing expanded. The group
reverts to the user's stored collapse state once the approval resolves.
The desktop renderer is bundled as one chunk on purpose (codeSplitting:
false) because Shiki's many dynamic chunks make electron-builder OOM
scanning thousands of files. That makes the ~22 MB bundle expected, but
Vite still nags with 'Some chunks are larger than 500 kB' on every build.
Raise chunkSizeWarningLimit to 25000 kB so the cosmetic warning stays
quiet while still firing as a regression alarm if the bundle grows well
past today's size. Config-only; codeSplitting:false is untouched.
attemptReconnect() connected with the stale cached conn.wsUrl. OAuth WS
tickets are single-use with a ~30s TTL, so the first sign-in (which goes
through boot() and re-mints via resolveGatewayWsUrl) succeeds, but every
reconnect (sleep/wake, network online, window refocus, socket drop, app
restart) reused a dead ticket and failed the WS upgrade with an opaque
"Could not connect to Hermes gateway" — even though backend resolution
(cookie + REST) reported ready.
attemptReconnect now mints a fresh ticket before connecting, mirroring
use-gateway-request.ts, and surfaces the reauth "sign in again" message
once on OAuth expiry instead of silently looping backoff against a dead
ticket. Local/token gateways are unaffected (re-mint is a no-op).
When 'hermes update' rebuilds the project venv (rmtree + uv venv on the
first managed-uv migration), the desktop-rebuild and profile-skills-sync
steps that follow both spawn sys.executable. Firing while the venv is
mid-rewrite makes the child interpreter abort with the bare stderr line
'No pyvenv.cfg file', surfacing as a spurious 'Desktop build failed' /
'default: sync failed' on an update that actually succeeded.
Add _wait_for_interpreter_venv_ready(): resolve the venv hosting
sys.executable and poll briefly for pyvenv.cfg to (re)appear before each
of those subprocess steps. No-op when the interpreter isn't venv-hosted.
The desktop rebuild also retries once after re-waiting, and keeps
streaming its output live (no capture). Best-effort throughout — callers
proceed regardless, so a genuinely broken venv still surfaces the real
error.
Surface the username/password dashboard-auth provider in Hermes Desktop's
remote-gateway connect flow. A password gateway gates the same way an OAuth
one does (auth_required + session cookie + ws-ticket), so the desktop already
drives it through the existing sign-in window; the only gaps were that the
probe dropped supports_password and the UI always said "OAuth".
- main.cjs: capture supports_password from /api/auth/providers in the probe.
- global.d.ts: add optional supportsPassword to DesktopAuthProvider.
- gateway-settings.tsx: derive isPasswordProvider; render a plain "Sign in"
button + "username and password" copy instead of an OAuth provider label
when every advertised provider is password-based. Login still flows through
the gateway's /login credential form (POST /auth/password-login).
PR #38743 split the dashboard PTY WebSocket refusal codes (4404 = chat
disabled, 4403 = host/origin mismatch — see web_server.py refusal site
comment) but left test_rejects_when_embedded_chat_disabled asserting the
old 4403, so it has expected 4403 while the server sends 4404. Main CI has
been red on test (2)/(4) shards since that commit. Update the assertion to
4404 to match the disabled-chat path.
The reconnect and boot paths resolved the WS URL with
`(await getGatewayWsUrl().catch(() => null)) || conn.wsUrl`. For OAuth
gateways the cached conn.wsUrl carries a single-use, ~30s-TTL ticket; the
desktop connection is memoized for the process lifetime, so on reconnect
that ticket is both expired and already consumed. A failed fresh mint
therefore fell back to a guaranteed-dead ticket and surfaced as an opaque
"connection closed", masking the gateway's actionable "session expired,
sign in again" message.
Extract resolveGatewayWsUrl() (with unit tests): in OAuth mode a mint
failure throws a tagged GatewayReauthRequiredError instead of falling back;
token/local modes keep the long-lived-token fallback. Thread that error
through the reconnect path so requestGateway surfaces the reauth message
rather than the generic transport error that triggered the retry.
Co-authored-by: Kenmege <205099287+Kenmege@users.noreply.github.com>
The remote-gateway settings rendered the session-token box for every gateway
during the idle/probing window before the first /api/status probe lands,
because authMode defaults to 'token'. Gate both the OAuth sign-in button and
the token box behind an authResolved flag so neither renders until the probe
resolves the scheme (or a previously-saved remote config is being re-shown,
so re-opening settings doesn't flicker).
The gateway-side WS Origin fix that lets the packaged desktop (file:// origin)
connect to an OAuth-gated remote gateway landed separately in #37870; this
branch is now purely the desktop client + this UI fix.
The desktop remote-gateway settings now auto-detect whether a gateway
authenticates with OAuth or a static session token and present the
matching UI + connection mechanism.
Detection: an unauthenticated GET {base}/api/status reads auth_required
(true => OAuth, false => session token); /api/auth/providers supplies the
provider label. The settings UI debounce-probes the entered URL and shows
either a 'Sign in with <provider>' button or the session-token box.
OAuth connection mechanism:
- REST is authed by the HttpOnly session cookie held in a persistent
Electron session partition (persist:hermes-remote-oauth); main-process
REST routes through electron net bound to that partition so the cookie
attaches automatically.
- Login opens a BrowserWindow on {base}/login in that partition and
resolves once the hermes_session_at cookie lands.
- WebSocket upgrades use a single-use ?ticket= minted at
POST /api/auth/ws-ticket (the gateway rejects ?token= in gated mode);
getGatewayWsUrl() re-mints before every (re)connect since tickets are
single-use and short-lived.
- Missing cookie / 401 surfaces needsOauthLogin to prompt re-sign-in
(Nous Portal contract v1 issues no refresh token).
Local and token modes are unchanged.
Pure helpers (URL normalize, ws-url token/ticket builders, auth-mode
classify/resolve, cookie detector) are extracted to a standalone
connection-config.cjs (no electron import) and unit-tested with
node --test (26 tests), matching the backend-probes.cjs pattern.
* feat(desktop): dedicated Providers settings with Accounts/API-keys subnav
Rework provider configuration in the desktop app into its own Providers
page that mirrors the first-run onboarding picker, instead of burying
provider keys in the generic Tools & Keys list.
- Add a Providers settings page (providers-settings.tsx) reusing the
onboarding picker cards/ApiKeyForm so the two surfaces stay identical
- Add a sidebar subnav (Accounts vs API keys) backed by a deep-linkable
`pview` URL param; nested OverlayNavItem variant for a lighter active
state so children don't compete with the parent item
- Scope provider search to the active sub-view in its native card format
(no more accordion fallback); collapse the API-key grid to the top
providers behind a "Show all" toggle to cut scrolling
- Launch real in-app OAuth from settings via startManualProviderOAuth;
fix the misleading red "reason" banner that showed during an active
connect (neutral style, hidden during a flow, omitted for direct
per-provider launches)
- Expand PROVIDER_GROUPS and add longest-prefix matching so providers
like xAI/Ollama group correctly instead of landing under "Other"
- Drop redundant messaging API keys from Tools & Keys (channel_managed)
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(desktop): Cursor-style provider key list with inline inputs
Replace the card-grid API-key form on the Providers page with a
per-provider list (mirrors Cursor's API keys section):
- One row per vendor with its primary key input inline; rows with extra
vars (base URL, region, alt tokens) expand to reveal those on focus
- Set keys show their redacted value as the placeholder; Save appears on
edit, Remove on a set key
- Hide redundant alias key fields (e.g. ANTHROPIC_TOKEN vs
ANTHROPIC_API_KEY) unless already set, and label set aliases by env var
name so they're unambiguous
- Smaller mono input text + compact height
Co-authored-by: Cursor <cursoragent@cursor.com>
* style(desktop): flatten providers settings UI chrome
Tighten the providers settings surface to match the newer desktop style:
remove extra card rails/borders in API-key rows, reduce visual noise in the
providers subnav, replace bespoke link-like controls with shared text-button
variants, and improve key input readability.
* feat(desktop): rework providers settings UI
- Flatten the shared OAuth picker rows (accounts + onboarding): drop the
rounded-2xl/border cards for flat hover-bg rows; Nous hero keeps a subtle
tint plus an animated blue→purple arc border.
- Key fields collapse to a single input: a set key reads read-only (redacted)
and edits in place on focus/click — no Replace/Cancel chrome. Save on type,
Esc cancels (without closing the overlay), "Remove or esc to cancel" hint.
- Non-key overrides render boxless, content-sized (field-sizing) and
right-anchored; advanced fields align under the primary key column.
- Add `xs` control size; size fields via padding (no fixed heights).
- Cards expand on key-input focus; chevron shows on hover/expanded; expanded
state uses a ring + softer bg tier so hover ≠ focus.
- Relocate "Get a key" to the bottom-right of the expanded panel; drop the
redundant provider description.
- Cmd+K: add Providers (accounts) and Provider API keys deep-links.
* fix(desktop): flatten provider fields, drop input shadows, fix Cmd+K provider rank
- KeyField: collapse to one stacked label-above-input form field (drop the
bespoke `naked`/inline/column branches); empty advanced overrides fade until
hover/focus/set
- styles: kill the resting + focus drop shadow on shared input chrome so form
inputs sit flat (composer keeps its own shadow)
- Cmd+K: drop stray `providers` keyword from Skills & Tools so the Providers
settings entry ranks first for "provider"
* fix(desktop): nous portal arc blue → orange
* fix(desktop): rank appearance above settings in Cmd+K
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Brooklyn Nicholson <brooklyn.bb.nicholson@gmail.com>
Add a 'Username/password provider (no OAuth IDP)' section to the web
dashboard guide (config.yaml + env surfaces, the explicit-secret caveat,
the rate-limit/generic-401 properties, and a 'write your own password
provider' pointer to the supports_password extension point), and list the
HERMES_DASHBOARD_BASIC_AUTH_* env vars in the environment-variables
reference.
- test_dashboard_auth_password_login.py: drives /auth/password-login
end-to-end through the REAL gated_auth_middleware (login -> session
cookie -> authenticated /api/auth/me -> transparent refresh via the RT
cookie), plus protocol-extension checks, the generic-401/404 oracle
properties, the rate limiter, and login-page rendering (form+script
when supports_password, script-free otherwise, both for mixed
providers). Reuses the existing StubAuthProvider harness convention.
- test_basic_provider.py: scrypt hash/verify, login mint, kind-claim
enforcement (access != refresh), cross-secret rejection, and the
register() config/env precedence + skip reasons.
Mutation-tested: dropping the kind-claim check in verify_session makes
test_access_token_not_accepted_as_refresh fail, confirming the test isn't
theater.
A bundled, zero-infrastructure 'just put a password on my dashboard'
provider that uses the supports_password extension point. No external IDP,
no database: sessions are stateless HMAC-signed tokens the provider mints
and verifies itself, and passwords are hashed with stdlib scrypt (no
third-party dependency — deliberately avoids bcrypt to keep the dep
surface unchanged).
- plugins/dashboard_auth/basic: BasicAuthProvider (scrypt verify with a
constant-time dummy-hash path for unknown users so the endpoint is not
a username-timing oracle; access/refresh tokens carry a 'kind' claim
that verify/refresh enforce; cross-secret tokens are rejected). The
register() entry point mirrors the Nous plugin's config/env precedence
(env wins; empty treated as unset) and LAST_SKIP_REASON channel.
- config.py: document the canonical dashboard.basic_auth.* surface
(username / password_hash / password / secret / session_ttl_seconds).
Activates only when username + (password or password_hash) are set, so
OAuth users and loopback/--insecure operators are unaffected. Without an
explicit secret a random per-process key is generated (logged): fine for a
single process, but sessions then don't survive restart or span workers.
The dashboard auth gate was OAuth-only: a DashboardAuthProvider could
authenticate only via a redirect to an IDP (start_login -> /auth/callback
-> complete_login). There was no first-class path for username/password
auth, so self-hosters who just want a password on their dashboard had no
clean option short of an external OAuth IDP.
Extend the provider framework with a parallel, non-redirect front door
that converges on the same Session + cookie + refresh machinery:
- base.py: add the optional supports_password flag and
complete_password_login(username, password) -> Session (default
raises NotImplementedError so an OAuth-only provider that forgets the
flag fails loudly). Add InvalidCredentialsError. OAuth providers are
unaffected (flag defaults False; the method is never called).
- routes.py: add POST /auth/password-login, mirroring the cookie-minting
tail of /auth/callback but skipping PKCE/state/code. Returns JSON
{ok, next} (the form POSTs via fetch). Generic 401 for both unknown
user and wrong password (no enumeration oracle); 404 hides whether a
provider exists or supports passwords; per-IP sliding-window rate
limit (10/min -> 429). /api/auth/providers now reports
supports_password so the login page can branch.
- middleware.py: allowlist /auth/password-login (a bootstrap route).
verify/refresh/revoke/ws-tickets/logout need zero changes — a password
session is just a Session with provider-minted opaque tokens.
- login_page.py: render a credential form (instead of a redirect button)
for supports_password providers, wired by a small inline script that
POSTs to /auth/password-login and navigates on success. OAuth-only
pages stay script-free.
The Nous dashboard OAuth login rejected any http:// redirect_uri whose
host was not localhost/127.0.0.1, surfacing "redirect_uri may only use
http:// for localhost/127.0.0.1" on the login screen. This broke
self-hosted dashboards reached over plain HTTP — LAN IPs, internal
hostnames, and reverse proxies that terminate TLS upstream.
The Portal-side check (agent-redirect-uri.ts) is authoritative on which
redirect_uris are permitted; this client-side _validate_redirect_uri is
only a fast-fail for obvious operator error and should not second-guess
valid http:// deployments.
Fix: drop the localhost-only branch on the http scheme. Validation now
enforces only that the scheme is http(s) and the path ends with
/auth/callback. Updated the docstring to explain the relaxed contract,
and replaced test_rejects_http_with_non_localhost (which pinned the old
behavior) with test_allows_http_with_arbitrary_host covering a Fly
hostname, a LAN IP, and an internal hostname.
* Port from google-gemini/gemini-cli#21541: back up corrupted config.yaml
When config.yaml fails to parse, load_config() silently falls back to
DEFAULT_CONFIG and leaves the broken file on disk. If the user then re-runs
the setup wizard or hermes config set (both rewrite config.yaml), their
broken-but-recoverable overrides are lost for good.
Adapts the policy-file recovery from gemini-cli#21541: on the first parse
warning for a given broken file, snapshot it to config.yaml.corrupt.<ts>.bak
(best-effort, symlink-guarded, size-deduped) and tell the user where it
landed. Unlike Gemini's version we deliberately do NOT reset config.yaml to a
clean state — hermes never silently mutates user config, and leaving it means
a hand-fixed file is re-read on the next load.
Tests: 3 new cases (backup created + content preserved + original untouched;
same-size backup dedup; symlink not copied). E2E verified with isolated
HERMES_HOME and a real tab-indented broken config.
* fix(dashboard): explain WHY a chat WS connection was refused
The embedded-chat PTY WebSocket (/api/pty) collapsed every rejection
into a bare close code: 4401 for any auth failure, 4403 for three
unrelated failures (host mismatch, origin mismatch, peer-IP). Neither
the server log nor the browser said which gate fired or why, so a
"chat won't connect" report was undiagnosable without a repro.
Server (web_server.py):
- _ws_auth_reason / _ws_host_origin_reason / _ws_client_reason return a
short machine-parseable reason; old bool wrappers kept for callers/tests.
- pty_ws splits the overloaded 4403 into 4401 (auth), 4403 (host/origin),
4408 (peer not allowed), 4404 (chat disabled), and sends the reason on
the close frame (clamped to the 123-byte RFC6455 limit).
- Each path logs one line: 'pty auth rejected reason=.. mode=.. cred=.. peer=..'
/ 'pty refused: <reason> ..'. Accepted path logs 'pty accepted peer=..
mode=.. cred=..' so an audit shows HOW a peer authed, not just that it did.
tui_gateway/ws.py:
- 'ws send/write failed' now logs error_type=<ExcName> so an exception
whose str() is empty (closed-transport sends) no longer logs 'error='.
web/src/pages/ChatPage.tsx:
- console.warn the real close code + server reason on every close.
- Map 4404/4408 to specific banners; 4401/4403 banners echo the server
reason; [session ended] prints the close code.
E2E verified all five reject paths + accepted path produce matching
close code, wire reason, and server log line.
The register command resolved the portal base URL purely from the stored
login, ignoring any override. That meant `HERMES_DASHBOARD_PORTAL_URL` (and
the absence of any flag) gave no way to point registration at a staging or
preview portal — the request always hit the login's portal, returning 404
against a branch that wasn't deployed there.
- _resolve_portal_base_url now takes an optional override (precedence:
override > stored login portal > prod default).
- New --portal-url flag; falls back to HERMES_DASHBOARD_PORTAL_URL env.
- Documents that the access token must be valid at the overridden portal
(it's minted by whoever you logged into).
- 3 new tests for override precedence.
Verified live against the PR #324 Vercel preview: CLI -> preview endpoint ->
real agent:{id} client_id written to .env.
Adds a CLI command that registers this install as a self-hosted dashboard
with the user's Nous Portal account, automating the manual browser flow on
/local-dashboards.
- New hermes_cli/dashboard_register.py: resolves a fresh Nous access token
from auth.json (fast-fails with a `hermes setup` hint when not logged in),
POSTs to {portal}/api/oauth/self-hosted-client, and writes
HERMES_DASHBOARD_OAUTH_CLIENT_ID into ~/.hermes/.env idempotently.
- Docker-style adjective_noun auto-naming; --name and --redirect-uri overrides.
- Persists HERMES_DASHBOARD_PORTAL_URL only when non-default and unset (so a
Vercel preview / staging portal sticks, prod default stays implicit).
- Refuses in managed/hosted installs (the orchestrator stamps the client_id).
- Post-register hint explains the OAuth gate only engages on a non-loopback bind.
- Nested 'register' subparser leaves bare `hermes dashboard` unchanged.
- 9 unit tests (name gen, fast-fails, POST shape, env writes, redirect URI,
portal-URL persistence, 401/403 mapping); dashboard lifecycle tests still green.
Depends on NousResearch/nous-account-service#324 (the portal endpoint).
* refactor(supermemory): session-level conversation ingest + kebab tool aliases
Salvaged from #32487 (by @MaheshtheDev), rebased onto current main.
- sync_turn now buffers cleaned turns; the full session is ingested once
at session end / switch / shutdown via the conversations endpoint
- ingest_conversation() accepts and forwards functional document metadata
(type, session_id, message_count, partial)
- register kebab-case tool aliases (supermemory-save/search/forget/profile)
alongside the snake_case names
- README + docs (EN/zh-Hans) updated for the simplified session model
Source/vendor-attribution removed per project policy (no telemetry):
dropped x-sm-source header, sm_source metadata, and sm_capture_mode tags.
Preserved the post-branch atomic_json_write(mode=0o600) hardening that the
PR's stale base had reverted. Updated provider tests for the new behavior
and added maheshthedev@gmail.com to release.py AUTHOR_MAP.
Co-authored-by: alt-glitch <balyan.sid@gmail.com>
* feat(supermemory): restore x-sm-source for Spaces routing
Reinstates x-sm-source: hermes (SDK default_headers + conversations POST)
and sm_source: hermes document metadata. Per @Dhravya (Supermemory), this
is a functional routing key, not telemetry: it groups Hermes writes into a
dedicated "Hermes" Space in the Supermemory app so users can filter and
bulk-manage memories per source agent.
sm_capture_mode remains dropped (appears analytics-only; Spaces are routed
by sm_source) pending confirmation. Adds README note + a unit test covering
_merge_metadata sm_source stamping and legacy source->type migration.
---------
Co-authored-by: Mahesh Sanikommu <maheshthedev@gmail.com>
* fix(desktop): surface background-session clarify prompts instead of hanging
clarify.request is a one-shot blocking event: the gateway turn blocks on
clarify.respond. The desktop handler dropped it for any non-focused session
(`if (!isActiveEvent) return`) and stored at most one request in a single
global atom, so a background session that asked a clarifying question hung
forever and re-focusing it could never recover (the event was already gone).
- store/clarify.ts: key pending requests by runtime session id; expose the
active session's request via a focus-scoped computed view (ClarifyTool is
unchanged). clearClarifyRequest takes an optional session id for targeted
clears, with a request-id fallback.
- use-message-stream.ts: park every session's clarify (drop the isActiveEvent
early return); toast when one lands for a background session since the row
otherwise just keeps spinning like normal work.
- clarify-tool.tsx: clear by session id so answering one chat can't wipe
another's pending request.
- store/clarify.test.ts: concurrent independence, focus-scoped view,
targeted/stale/fallback clears.
* feat(desktop): persistent needs-input indicator + icon button consolidation
Replace the background-clarify toast (expired on alt-tab, easy to miss) with a
persistent, glowing amber "needs input" dot on the session's sidebar row,
driven off a new ClientSessionState.needsInput flag mirrored into a
$attentionSessionIds store. The flag is set on clarify.request and cleared the
moment the turn resumes (tool.complete) or ends.
Also: redesign the clarify tool UI (borderless choices, pseudo-radio dots,
right-aligned checkmark, arc border, tighter padding), make Button the single
source of icon-button styling (4px radius, new icon-titlebar variant, titlebar
buttons rendered polymorphically via asChild, Codicons throughout), put the
file-tree refresh action first, and .trim() pasted composer text.
* style(desktop): padding-driven, square non-icon buttons
Default button sizing was vanilla-shadcn chunky (fixed h-9, 16px padding) and
inconsistent with the icon-button radius pass. Size text variants by
padding + line-height instead of fixed heights so they stay snug and scale
with content, and drop the radius on non-icon buttons (icon buttons keep the
shared 4px). Move the update-overlay CTAs off a hardcoded h-10 onto the
padding-based lg variant. Composer and the inline approval strip are untouched.
* style(desktop): shrink button scale, flush overlay sidebar, variant-ize stray buttons
- Buttons: smaller default font (14px -> 13px) and tighter padding-driven sizes
across every variant; the chunky shadcn scale read as oversized in a dense
desktop UI.
- Overlay split layout (settings / command center): the shared OverlayView top
padding left the card surface showing as a gap above the sidebar. Move the
titlebar clearance into each column so the sidebar background runs flush to
the card's top edge.
- Consolidate buttons that hardcoded size/radius/font onto the proper size
variants (tooltip-icon-button, overlay close, cron IconAction, SidebarTrigger,
gateway system button, session-row actions radius, title chip radius, release
notes link) so styling flows from variant props, not per-call overrides.
Composer and the inline approval strip are intentionally left as-is.
* style(desktop): 12px button text, drop sparkle decoration + redundant settings titles
- Button base font down to 12px (text-xs) for the dense desktop scale.
- Remove the decorative Sparkles glyph from the model "Apply" button (keep the
spinner while applying).
- Drop the page-level section titles that just restate the left nav ("Main
model", "Appearance", "MCP servers") — the sidebar already labels the pane.
Sub-section headings (Auxiliary models, LLM providers, etc.) stay.
* feat(desktop): add boxless `text` button variant; use for aux-model actions
New reusable `text` variant renders a button as inline label text (no
bg/border, muted -> foreground, underline-on-hover affordance). Emphasize the
actionable word by adding `font-semibold`/`underline` at the call site. Applied
to the auxiliary-model "Set to main" (plain), "Change" and "Reset all to main"
(bold + underlined) actions, replacing the boxed ghost/outline buttons.
* style(desktop): nudge button scale up + 2.5px radius on non-icon buttons
Bump default/sm vertical padding a step (the 12px pass read too small) and give
non-icon buttons a subtle 2.5px radius instead of square corners. Icon buttons
keep their 4px.
* style(desktop): unify Input/Textarea/SelectTrigger on shared controlVariants
Mirror the buttonVariants exercise for non-composer form controls: add a
single controlVariants source of truth (2.5px radius, 12px text,
padding-driven sizing, chrome via desktop-input-chrome) and consume it from
Input, Textarea, and SelectTrigger. Drop per-call radius/height/font
overrides that fought the shared look.
* style(desktop): flatten appearance settings — drop card-in-card sections
Remove the outer card chrome (border/bg/shadow/rounded) wrapping each
appearance section so they're flat headings + option grids instead of
boxes nested inside boxes, matching the other settings pages.
* style(desktop): de-box appearance options into flat rows + bare theme swatches
Color Mode and Tool Call Display become flat radio-style rows (no tile
border/fill, no inner icon box, no filled check badge — just a subtle active
bg and a check). Theme drops its outer card wrapper so only the preview
swatch shows, with a primary ring marking the active palette.
* style(desktop): primitive-level pointer cursor + borderless settings lists
Add a base-layer rule giving every interactive control (button, select,
menu item, switch, tab, summary) cursor:pointer, and strip the now-redundant
hardcoded cursor-pointer from those elements (plain clickable divs/labels
keep theirs). Remove the divide-y separators from settings list sections so
they breathe.
* style(desktop): Color Mode + Tool Call Display as one-row segmented controls
Replace the vertical option-row lists with a compact SegmentedControl
(grouped pill buttons on a single track), dropping the per-option
descriptions since the section subtitle already covers the context.
* style(desktop): drop redundant On/Off label next to boolean config switches
The switch already communicates state, so the text label was noise.
* style(desktop): add Switch xs size; move appearance controls inline-right
Add an xs size variant to the Switch primitive and use it for the provider
edit submenu toggles. In appearance settings, drop the redundant selection
Pills (the UI already shows the active choice), move the Color Mode and Tool
Call Display segmented controls into the section header's right side
(responsive: stacks under the heading on narrow widths), and shrink the
segmented control.
* feat(desktop): titlebar toggle to flip sidebar sides
Adds a top-left swap button (replacing the search icon) that mirrors the
layout: sessions sidebar ↔ file browser + preview rail. Persisted via
$panesFlipped. The left/right sidebar toggles, content inset, and pane
borders all follow the active side so the buttons stay accurate after a flip.
* feat(desktop): global Cmd+K palette + UI consistency overhaul
Builds on the clarify/needs-input work with a cross-cutting pass to make
the desktop surfaces feel like one app.
- Global Cmd+K command palette (cmdk): nav, settings deep-links, async
API-key / MCP-server / archived-session groups, reusable theme sub-page
(light/dark groups, stays open on pick), loop nav, fuzzy match. Replaces
per-page settings search.
- Shared SearchField: borderless, underline-on-focus, `field-sizing`
auto-width. Unifies sessions sidebar, pages, overlays, command center,
cron; drops bespoke OverlaySearchInput.
- Cron & Profiles converted to OverlayView; flat token-driven panels
(no card-in-card / divider borders) matching command center.
- `r` refresh hotkey via useRefreshHotkey; drop the visible refresh buttons.
- Button text/textStrong link variants applied across settings & views;
shared PAGE_INSET_X content gutters.
- Math/ascii loaders replace "Loading…" text placeholders; x-icon close
over text "Close"; cursor-pointer at the dropdown/select primitive level.
* style(desktop): tidy root error-boundary actions
Reload window → text link, Open logs pushed right (ml-auto), and the
error message box drops the oversized rounded-2xl for rounded-md.
* style(desktop): fix profiles sidebar — header + add-icon, drop text-link
The full-width `text` New-profile button drew an underline under the +
glyph on hover (text-decoration spans the icon). Replace with a proper
"PROFILES" section header + ghost add-icon button, matching the chat
sidebar's header/new-item pattern.
* style(desktop): kill focus rings globally
Tab/focus showed Tailwind's `focus-visible:ring-*` (a box-shadow) plus the
native outline. Drop both via an unlayered reset that nulls --tw-ring-*;
the composer / input soft-glow is untouched (those use direct box-shadows).
* style(desktop): shared Badge component; tidy profile metadata
Add a proper shadcn-style Badge (CVA tones, app radius — not a full pill)
and use it for the Default/.env tags instead of bespoke rounded-full spans.
Drop the oversized text-sm metadata values to text-xs.
* style(desktop): migrate bespoke pills to shared Badge; tidy cron/titlebar
- Sidebar toggles in the titlebar no longer carry an active highlight —
they're plain show/hide affordances now.
- Replace every bespoke rounded-full status pill (cron, messaging,
settings, skills) with the shared Badge (adds a `warn` tone). App radius,
one component.
- Cron row actions use Codicons (play/debug-pause/zap/edit/trash) to match
the rest of the chrome instead of stray lucide glyphs.
* style(desktop): drop active background on titlebar actions
Mute/haptics state reads from the icon glyph (and aria-pressed) — no
background highlight on any titlebar action.
* style(desktop): tighten error-boundary action gap
gap-4 → gap-2.5 between Try again / Reload window.
* style(desktop): hide search when there's nothing to search
Empty datasets no longer render a search field. Adds a `searchHidden` prop
to PageSearchShell (artifacts/skills/messaging) and gates cron + command
center sessions search on a non-empty list. The chat sidebar already did
this via showSessionSections.
* fix(desktop): composer wraps long text & expands at the real wrap point
Long unbroken input ran off horizontally and the stacked layout flipped
on a char-count guess (too early). Add wrap rules to the contentEditable
and drive expansion off the editor's actual rendered height via the
resize observer, so it stacks exactly when the text wraps to a 2nd line.
* feat(desktop): composer/intro polish + shared ErrorState
- Composer single-line row centers (was bottom-aligned); placeholder
randomizes per session (starter vs follow-up) without mid-stream flip.
- Drop chat header on brand-new sessions (dead label + border).
- ⌘N flashes its sidebar hint; ⌘. toggles the command center.
- Intro wordmark fills width (drop 8rem fit cap).
- Unify error states on a shared ErrorState component (boundary + updates).
* style(desktop): satisfy lint across PR-touched files
* refactor(desktop): DRY/elegance pass over PR-touched files
- Shared useDeepLinkHighlight hook collapses 3 near-identical settings
deep-link effects (keys/mcp); config kept inline (distinct bail-clear).
- command-center: table-driven SECTION_ICONS + single errorText helper.
- clarify-tool: OPTION_ROW_CLASS + RadioDot extracted from option rows.
- desktop-controller: merge Cmd+K / Cmd+. into one keydown handler.
- statusbar-controls: hoist shared action class.
- Misc: drop redundant cn()/cursor-pointer/dead fields; tidy switch.
* feat(desktop): Cmd+K jumps to sessions; drop API-key entries
Add active sessions to the palette (fuzzy jump-to-chat), remove the
low-value per-API-key entries, and move the lazy palette sources
(config/sessions/archived) to react-query instead of hand-rolled
useState + effect fetching. Hoist the shared nav helper.
Add active sessions to the palette (fuzzy jump-to-chat), remove the
low-value per-API-key entries, and move the lazy palette sources
(config/sessions/archived) to react-query instead of hand-rolled
useState + effect fetching. Hoist the shared nav helper.
Long unbroken input ran off horizontally and the stacked layout flipped
on a char-count guess (too early). Add wrap rules to the contentEditable
and drive expansion off the editor's actual rendered height via the
resize observer, so it stacks exactly when the text wraps to a 2nd line.
Empty datasets no longer render a search field. Adds a `searchHidden` prop
to PageSearchShell (artifacts/skills/messaging) and gates cron + command
center sessions search on a non-empty list. The chat sidebar already did
this via showSessionSections.
- Sidebar toggles in the titlebar no longer carry an active highlight —
they're plain show/hide affordances now.
- Replace every bespoke rounded-full status pill (cron, messaging,
settings, skills) with the shared Badge (adds a `warn` tone). App radius,
one component.
- Cron row actions use Codicons (play/debug-pause/zap/edit/trash) to match
the rest of the chrome instead of stray lucide glyphs.
Add a proper shadcn-style Badge (CVA tones, app radius — not a full pill)
and use it for the Default/.env tags instead of bespoke rounded-full spans.
Drop the oversized text-sm metadata values to text-xs.
Tab/focus showed Tailwind's `focus-visible:ring-*` (a box-shadow) plus the
native outline. Drop both via an unlayered reset that nulls --tw-ring-*;
the composer / input soft-glow is untouched (those use direct box-shadows).
The full-width `text` New-profile button drew an underline under the +
glyph on hover (text-decoration spans the icon). Replace with a proper
"PROFILES" section header + ghost add-icon button, matching the chat
sidebar's header/new-item pattern.
Builds on the clarify/needs-input work with a cross-cutting pass to make
the desktop surfaces feel like one app.
- Global Cmd+K command palette (cmdk): nav, settings deep-links, async
API-key / MCP-server / archived-session groups, reusable theme sub-page
(light/dark groups, stays open on pick), loop nav, fuzzy match. Replaces
per-page settings search.
- Shared SearchField: borderless, underline-on-focus, `field-sizing`
auto-width. Unifies sessions sidebar, pages, overlays, command center,
cron; drops bespoke OverlaySearchInput.
- Cron & Profiles converted to OverlayView; flat token-driven panels
(no card-in-card / divider borders) matching command center.
- `r` refresh hotkey via useRefreshHotkey; drop the visible refresh buttons.
- Button text/textStrong link variants applied across settings & views;
shared PAGE_INSET_X content gutters.
- Math/ascii loaders replace "Loading…" text placeholders; x-icon close
over text "Close"; cursor-pointer at the dropdown/select primitive level.
* fix(desktop): critical fixes — attachments, IME composition, scroll, fetchJson
DC2: Pass attachments to onSubmit() on direct Enter submit and call
clearComposerAttachments(). Previously attachments were silently
dropped — only text was sent while attachment pills remained visible.
DH1: Add 'open' to ThinkingDisclosure ResizeObserver effect deps.
When the disclosure toggles, refs point to new DOM but the observer
wasn't reattached, breaking live-scroll preview after expand/collapse
and leaking detached DOM nodes.
DH3+DH4: Add composition tracking via composingRef (set by
compositionstart/compositionend). Guards handleEditorInput (skip
preedit state writes), handleEditorKeyDown (prefer composingRef over
unreliable isComposing), and form onSubmit (prevent IME Enter from
triggering submission). Fixes IME Enter message splitting and preedit
text leaking into app state on CJK input.
DH6: Add res.on('error', reject) to fetchJson response stream.
Without this, a TCP reset mid-transfer left the promise hanging forever,
freezing the desktop UI.
All TypeScript compiles cleanly.
* chore: add copii.list@gmail.com to AUTHOR_MAP (stremtec)
* fix(desktop): prevent scroll snap-back during streaming, atomic config writes
DH2: Defer pinToBottom() in useLayoutEffect to rAF so that browser
scroll/wheel events from the current frame are processed first.
Previously an immediate pinToBottom() could snap the viewport back
to bottom against the user's trackpad scroll-up intent during
streaming — the wheel event hadn't fired yet so stickyBottomRef was
still true.
DH7: Add writeFileAtomic() helper (write to .tmp then rename) and
use it in writeDesktopConnectionConfig, writeDesktopUpdateConfig,
and writeBootstrapMarker. Prevents partial writes on crash/power
loss that would corrupt JSON config files, requiring manual repair.
* fix(desktop): guard nativeTheme listener from duplicates, invalidate connection config cache
DM9: Guard nativeTheme.on('updated') with a one-shot flag so that
multiple createWindow() calls (e.g. macOS activate after all windows
closed) don't accumulate duplicate listeners on the process-wide
singleton.
DM3: Add mtime-based cache invalidation to readDesktopConnectionConfig.
Previously the cache was populated once and never invalidated — if an
external tool modified connection.json, the desktop ignored the change
until restart. Now re-reads when the file's mtime differs.
* fix(desktop): widen fetchJson res.on('error') to sibling fetch + sort JSX props
Follow-up to salvaged #38502:
- resourceBufferFromUrl had the same mid-stream-reset hang class as
fetchJson (req.on('error') present, res.on('error') missing). Add the
response-stream error handler so a TCP reset during body read rejects
instead of leaving the promise unsettled.
- Sort the new onComposition* JSX props to satisfy perfectionist/sort-jsx-props
(was an introduced eslint error in the composer).
---------
Co-authored-by: asill-livestream <copii.list@gmail.com>
Closes#25495 (matrix/synapse broken in the official docker image).
`tools/lazy_deps.py` routes `platform.matrix` to
`mautrix[encryption]==0.21.0`, which transitively depends on
`python-olm`. `python-olm` is a Cython extension that links against
`libolm`; without `libolm-dev` in the image's apt set the lazy-install
build fails. Add `libolm-dev` to the runtime apt install line so the
in-container source build succeeds on first matrix use.
Salvages #27795 by @konsisumer. Their PR targeted a pre-rework
Dockerfile (still had `build-essential nodejs npm` in the apt list,
no `ca-certificates`); cherry-pick conflicts on incidental apt-list
churn, so this re-applies the same one-word insert against the
current apt line plus the matching pyproject.toml comment update.
Co-authored-by: konsisumer <11262660+konsisumer@users.noreply.github.com>
`_install_dependencies` (hermes memory setup) hard-aborted with
"uv not found — cannot install dependencies" whenever `uv` was not on
PATH, even when a perfectly good `pip` was available. Slim container
images and some CI environments don't ship uv, so memory-provider
dependency installation dead-ended there for no good reason.
Now: use `uv pip install` when uv is present, otherwise fall back to
`<python> -m pip install` when pip3/pip is available, and only abort
(with the uv install hint) when neither is found. The "Run manually:"
hints reflect whichever installer was selected.
Salvages #5954 by @MustafaKara7. Their patch added redundant local
`import subprocess` / `import sys` (both are already in scope — module
-level `sys`, function-top `subprocess`); this salvage drops those and
adds a regression test (TestInstallDependenciesRunner) covering all
three paths (uv / pip-fallback / abort). Verified adversarially: the
pip-fallback test fails against origin/main's unfixed code with the
exact dead-end symptom and passes with the fix.
Closes#5954.
Co-authored-by: MustafaKara7 <186085093+MustafaKara7@users.noreply.github.com>
Salvage of #37928 (@sarvesh1327), reduced to the still-needed delta.
`/opt/hermes/gateway` is a runtime-writable Python package: on first import
the supervised gateway writes `__pycache__` beneath it, and the image does
not set PYTHONDONTWRITEBYTECODE. When HERMES_UID/PUID is remapped at boot
(e.g. Unraid 99), `usermod -u` only re-chowns the hermes home dir; the build
trees under /opt/hermes keep the build-time UID (10000). main already chowns
`.venv`, `ui-tui`, and `node_modules` on remap (#38556) but missed `gateway`,
so the remapped gateway hits EACCES writing `__pycache__` (#27221).
Add `/opt/hermes/gateway` to both chown sites — the Dockerfile build-time
`chown -R hermes:hermes` line and the stage2-hook build-tree repair — so it
tracks the remapped UID like the sibling trees.
Differs from #37928 as submitted: dropped the `uid_gid_remapped` flag and the
`|| [ "$uid_gid_remapped" = true ]` chown gate. main's #38556 already solved
that half, and more correctly — it probes the actual tree ownership
(`venv_owner != actual_hermes_uid`) rather than tracking same-boot remaps,
which also catches pre-existing ownership drift and stays idempotent. Keeping
#37928's flag would regress that. The salvage is the `gateway`-tree addition
only.
Verified end-to-end against a real image build: on baseline main a remap to
UID 99 leaves `gateway` owned by 10000 and a write as uid 99 fails EACCES;
with this change `gateway` is chowned to 99:100 and the write succeeds, while
the default-uid (no-remap) path is unchanged.
Fixes#27221.
Co-authored-by: Sarvesh <sarveshagl1327@gmail.com>
Adds a top-left swap button (replacing the search icon) that mirrors the
layout: sessions sidebar ↔ file browser + preview rail. Persisted via
$panesFlipped. The left/right sidebar toggles, content inset, and pane
borders all follow the active side so the buttons stay accurate after a flip.
Both POST /api/model/set and the profile-model writer hand-rolled the same
provider/default/base_url/context_length reconciliation. Extract it into
_apply_main_model_assignment so the custom-vs-hosted base_url logic lives in
one place — removing the future-drift risk where one site learns about
custom base_url persistence and the other forgets.
Behavior unchanged; pinned with a direct helper unit test.
`test_tty_passthrough_to_container` asserted `int(numeric_lines[0]) > 0`
where `numeric_lines` was every `.isdigit()` token in the FULL PTY stream
— but the container's s6 boot output (cont-init diagnostics, the preinit
`uid=0 ... egid=0` line, skills-sync summaries like
`Done: 90 new, 0 updated, 0 unchanged. 90 total bundled.`) is written to
the same PTY before the `tput cols` probe runs. So the test was really
asserting on "the first number anywhere in the boot log", which passed
only by luck on whatever that first digit happened to be.
Any PR that shifts boot output flips the first digit to a stray `0` and
breaks the test with `assert 0 > 0` — even when TTY passthrough is
working perfectly (`tput cols` returns the right value). This is a latent
landmine for every Docker PR that changes boot output (e.g. adding a
bundled dependency changes the skills-sync counts).
Fix: emit the probe result behind a unique marker
(`HERMES_TTY_COLS=<cols>` / `HERMES_TTY_COLS=NO_TTY`) and parse only the
marked value, ignoring all boot-log noise. The test's real intent — verify
`docker run -t` delivers a real TTY with a positive column count — is
preserved (NO_TTY and non-numeric values still fail).
Verified against a real build, adversarially:
- Built an image with extra boot output (the markdown core-dep change from
#38649, which is what surfaced this) so the OLD logic grabs a stray `0`
-> reproduced `assert 0 > 0` locally.
- The hardened test PASSES against that same image, and against a clean
image. `tput cols` correctly returns 123 in both.
Add an xs size variant to the Switch primitive and use it for the provider
edit submenu toggles. In appearance settings, drop the redundant selection
Pills (the UI already shows the active choice), move the Color Mode and Tool
Call Display segmented controls into the section header's right side
(responsive: stacks under the heading on narrow widths), and shrink the
segmented control.
Replace the vertical option-row lists with a compact SegmentedControl
(grouped pill buttons on a single track), dropping the per-option
descriptions since the section subtitle already covers the context.
Add a base-layer rule giving every interactive control (button, select,
menu item, switch, tab, summary) cursor:pointer, and strip the now-redundant
hardcoded cursor-pointer from those elements (plain clickable divs/labels
keep theirs). Remove the divide-y separators from settings list sections so
they breathe.
Color Mode and Tool Call Display become flat radio-style rows (no tile
border/fill, no inner icon box, no filled check badge — just a subtle active
bg and a check). Theme drops its outer card wrapper so only the preview
swatch shows, with a primary ring marking the active palette.
Remove the outer card chrome (border/bg/shadow/rounded) wrapping each
appearance section so they're flat headings + option grids instead of
boxes nested inside boxes, matching the other settings pages.
Mirror the buttonVariants exercise for non-composer form controls: add a
single controlVariants source of truth (2.5px radius, 12px text,
padding-driven sizing, chrome via desktop-input-chrome) and consume it from
Input, Textarea, and SelectTrigger. Drop per-call radius/height/font
overrides that fought the shared look.
Bump default/sm vertical padding a step (the 12px pass read too small) and give
non-icon buttons a subtle 2.5px radius instead of square corners. Icon buttons
keep their 4px.
New reusable `text` variant renders a button as inline label text (no
bg/border, muted -> foreground, underline-on-hover affordance). Emphasize the
actionable word by adding `font-semibold`/`underline` at the call site. Applied
to the auxiliary-model "Set to main" (plain), "Change" and "Reset all to main"
(bold + underlined) actions, replacing the boxed ghost/outline buttons.
- Button base font down to 12px (text-xs) for the dense desktop scale.
- Remove the decorative Sparkles glyph from the model "Apply" button (keep the
spinner while applying).
- Drop the page-level section titles that just restate the left nav ("Main
model", "Appearance", "MCP servers") — the sidebar already labels the pane.
Sub-section headings (Auxiliary models, LLM providers, etc.) stay.
- Buttons: smaller default font (14px -> 13px) and tighter padding-driven sizes
across every variant; the chunky shadcn scale read as oversized in a dense
desktop UI.
- Overlay split layout (settings / command center): the shared OverlayView top
padding left the card surface showing as a gap above the sidebar. Move the
titlebar clearance into each column so the sidebar background runs flush to
the card's top edge.
- Consolidate buttons that hardcoded size/radius/font onto the proper size
variants (tooltip-icon-button, overlay close, cron IconAction, SidebarTrigger,
gateway system button, session-row actions radius, title chip radius, release
notes link) so styling flows from variant props, not per-call overrides.
Composer and the inline approval strip are intentionally left as-is.
Default button sizing was vanilla-shadcn chunky (fixed h-9, 16px padding) and
inconsistent with the icon-button radius pass. Size text variants by
padding + line-height instead of fixed heights so they stay snug and scale
with content, and drop the radius on non-icon buttons (icon buttons keep the
shared 4px). Move the update-overlay CTAs off a hardcoded h-10 onto the
padding-based lg variant. Composer and the inline approval strip are untouched.
Replace the background-clarify toast (expired on alt-tab, easy to miss) with a
persistent, glowing amber "needs input" dot on the session's sidebar row,
driven off a new ClientSessionState.needsInput flag mirrored into a
$attentionSessionIds store. The flag is set on clarify.request and cleared the
moment the turn resumes (tool.complete) or ends.
Also: redesign the clarify tool UI (borderless choices, pseudo-radio dots,
right-aligned checkmark, arc border, tighter padding), make Button the single
source of icon-button styling (4px radius, new icon-titlebar variant, titlebar
buttons rendered polymorphically via asChild, Codicons throughout), put the
file-tree refresh action first, and .trim() pasted composer text.
* Port from google-gemini/gemini-cli#21541: back up corrupted config.yaml
When config.yaml fails to parse, load_config() silently falls back to
DEFAULT_CONFIG and leaves the broken file on disk. If the user then re-runs
the setup wizard or hermes config set (both rewrite config.yaml), their
broken-but-recoverable overrides are lost for good.
Adapts the policy-file recovery from gemini-cli#21541: on the first parse
warning for a given broken file, snapshot it to config.yaml.corrupt.<ts>.bak
(best-effort, symlink-guarded, size-deduped) and tell the user where it
landed. Unlike Gemini's version we deliberately do NOT reset config.yaml to a
clean state — hermes never silently mutates user config, and leaving it means
a hand-fixed file is re-read on the next load.
Tests: 3 new cases (backup created + content preserved + original untouched;
same-size backup dedup; symlink not copied). E2E verified with isolated
HERMES_HOME and a real tab-indented broken config.
* feat(dashboard): add Debug Share to the System page
Surface `hermes debug share` in the dashboard. The System > Operations
section gets a dedicated card that uploads a redacted report + full logs
and returns the paste URLs as real, copyable links instead of a log tail.
- debug.py: factor a pure build_debug_share() returning structured
{urls, failures, redacted, auto_delete_seconds}; run_debug_share now
calls it (CLI output unchanged).
- web_server.py: POST /api/ops/debug-share runs the share core in a
worker thread and returns the structured payload synchronously (the
URLs are the whole point — not a backgrounded action).
- api.ts: runDebugShare() + DebugShareResponse.
- SystemPage.tsx: share card with a redaction toggle (on by default),
per-link + copy-all buttons, and the 6h auto-delete countdown.
- tests: build_debug_share core + endpoint (redact toggle, failure 502,
token gate).
clarify.request is a one-shot blocking event: the gateway turn blocks on
clarify.respond. The desktop handler dropped it for any non-focused session
(`if (!isActiveEvent) return`) and stored at most one request in a single
global atom, so a background session that asked a clarifying question hung
forever and re-focusing it could never recover (the event was already gone).
- store/clarify.ts: key pending requests by runtime session id; expose the
active session's request via a focus-scoped computed view (ClarifyTool is
unchanged). clearClarifyRequest takes an optional session id for targeted
clears, with a request-id fallback.
- use-message-stream.ts: park every session's clarify (drop the isActiveEvent
early return); toast when one lands for a background session since the row
otherwise just keeps spinning like normal work.
- clarify-tool.tsx: clear by session id so answering one chat can't wipe
another's pending request.
- store/clarify.test.ts: concurrent independence, focus-scoped view,
targeted/stale/fallback clears.
* fix(desktop): render approval/sudo/secret prompts so tools stop silently timing out
The desktop app's gateway event handler (use-message-stream.ts) handled
clarify.request but had no case for approval.request, sudo.request, or
secret.request. When a tool needed approval, the gateway emitted
approval.request and blocked the agent thread in _await_gateway_decision()
for up to 5 min (approvals.gateway_timeout); the desktop dropped the unknown
event, never showed a dialog, then the agent returned BLOCKED. No prompt,
just a stall then a block.
The Ink TUI already handles all three (createGatewayEventHandler.ts); this
brings the Electron app to parity.
- store/prompts.ts: approval/sudo/secret atoms (+ request-id-guarded clears)
- components/prompt-overlays.tsx: Radix dialogs; close/Esc maps to refusal so
silence is never mistaken for consent (parity with TUI Esc->deny)
- use-message-stream.ts: wire the three *.request cases; clearAllPrompts on
message.complete so an overlay can't outlive its turn
- chat-messages.ts: GatewayEventPayload gains command/description/env_var/prompt
- mount PromptOverlays in the chat shell
* feat(desktop): inline tool-call approval bar (Cursor-style "Run")
Render dangerous-command / execute_code approval inline on the pending
tool row instead of as a modal. Binding is positional: the desktop
tool.start payload carries no structured args, but approval.request only
fires from the terminal/execute_code guards and the agent blocks on one
approval at a time, so the single pending row of those tools is the one
that raised it. Command/description text comes from $approvalRequest.
Drops ApprovalDialog from PromptOverlays (sudo/secret stay modal).
* style(desktop): make inline approval bar match Cursor's command card
Drop the amber alert styling for a neutral elevated card: command on a
terminal-prefixed row up top, a divided footer with the muted description
on the left and right-aligned controls — a ghost "Reject" (Esc) plus a
split primary "Run" (⌘⏎) whose chevron opens "Allow this session" /
"Always allow" / "Reject". Wire ⌘/Ctrl+Enter → Run and Esc → Reject to
match Cursor's accept/skip bindings, guarded against double-send via the
$approvalRequest atom.
* style(desktop): shrink inline approval to a tiny Cursor-style button strip
The running tool row already shows the command, so drop the whole card +
command echo + description band. What's left is a compact strip under the
row: a small split "Run ⌘⏎" button (chevron → Allow this session / Always
allow / Reject) and a ghost "Reject Esc", indented to sit under the row's
title text.
* style(desktop): drop the loud blue Run button for a quiet outlined control
Swap the primary (blue) Run for a subtle outlined split control — neutral
border, transparent fill, hover-accent — so the approval strip reads as
quiet inline affordance rather than a big CTA. Reject stays ghost.
* style(desktop): make Run a soft primary badge
Tint the Run split control with the primary color as a badge (bg-primary/10,
primary text, primary/25 border, rounded-md, hover primary/15) instead of a
solid CTA or a neutral outline.
* style(desktop): slim the approval chevron and space out Reject
The chevron button had ballooned because dropping the size prop fell back
to the big default size (h-9 + has-svg px-3). Pin size=xs everywhere and
give the chevron a tight w-5/px-0. Bump the gap between the Run badge and
Reject (gap-2.5) and loosen Reject's internal spacing.
* feat(desktop): confirm before "Always allow" persists an approval
"Always allow" writes the matched pattern to ~/.hermes/config.yaml and
suppresses the prompt in every future session — too consequential to fire
straight from a menu click. Route it through a confirm dialog that names
the pattern + command and the file it touches. The dialog owns the
keyboard while open so Esc closes it instead of denying the approval.
* fix(gateway): make sudo + secret prompts actually fire in the desktop
Tek's PR added the sudo/secret overlays and callback wiring, but neither
reached the live path:
- Sudo: the sudo password callback is thread-local (terminal_tool
_callback_tls), and _wire_callbacks runs on the agent-build thread, not
the turn thread that executes tools. At command time the callback was
missing, so terminal sudo fell through to /dev/tty and hung the headless
gateway. Re-wire callbacks at the top of the prompt-submit turn thread.
- Secret: skills_tool short-circuited to the "secret entry unsupported"
hint for any gateway surface, before invoking the callback. Interactive
surfaces (desktop/TUI) register a secret-capture callback that routes to
the secret.request overlay; only short-circuit when no callback exists,
so messaging still gets the hint but the desktop prompts.
* docs(desktop): drop Cursor references from approval comments
* docs(desktop): drop Cursor reference from prompt-overlays comment
* fix(skills): gate in-band secret capture on HERMES_INTERACTIVE, not callback presence
The desktop/sudo PR switched the gateway secret-capture short-circuit from
"any gateway surface" to "gateway surface with no callback registered". That
made a messaging gateway (telegram/discord/...) attempt interactive in-band
secret capture whenever any callback happened to be registered, instead of
returning the safe "setup unsupported" hint — and broke
test_gateway_still_loads_skill_but_returns_setup_guidance.
Discriminate on HERMES_INTERACTIVE instead: the desktop app / TUI set it in
_enable_gateway_prompts (alongside registering the secret.request callback),
while messaging platforms never do. This is the same flag tools/approval.py
uses to tell an interactive surface from a messaging one, so messaging keeps
the hint and desktop/TUI still prompt.
---------
Co-authored-by: Brooklyn Nicholson <brooklyn.bb.nicholson@gmail.com>
Salvage of #35508 (@dchenk), rebased onto current main. Resolved the
tests/tools/test_stage2_hook_puid_pgid.py conflict (kept both the
envdir-creation regression test on main and the new config-migration
tests).
Docker image upgrades replace code under $INSTALL_DIR but preserve
$HERMES_HOME on the mounted volume, so the persisted config.yaml never
received the schema migrations that non-Docker `hermes update` runs
(#35406). This adds scripts/docker_config_migrate.py, invoked from
stage2-hook after first-boot seeding and before gateway services start:
it backs up config.yaml + .env, runs migrate_config(interactive=False),
and honors HERMES_SKIP_CONFIG_MIGRATION=1 for manual control.
Also fixes a latent bug in check_config_version(): it called load_config()
which deep-merges DEFAULT_CONFIG, so a legacy config with no raw
_config_version falsely reported as already-current. It now reads the raw
on-disk file so legacy configs are correctly detected for migration.
Differs from #35508 as submitted (Option B cleanup): dropped the
`_config_version` line added to cli-config.yaml.example and removed the
accompanying test_cli_config_example_declares_latest_version change-detector
test. The example is a copy-template and has no business asserting a schema
version; check_config_version() reads the user's real config.yaml, not the
example. This removes a second sync point that drifts on every version bump.
Closes#35508. Fixes#35406.
Co-authored-by: Dmitriy Cherchenko <17372886+dchenk@users.noreply.github.com>
`docker run --user $(id -u):$(id -g)` was a tini-era trick to make
container-written files match the host user. Under s6-overlay it no longer
works: the bootstrap (UID remap, volume + build-tree chown, config seeding)
needs root, and the baked image dirs (/opt/data, /opt/hermes/.venv, ui-tui,
node_modules) are owned by the hermes build UID (10000). A pinned arbitrary
UID can't write them, so the runtime fails with EACCES on a bind mount or
hard-crashes on a named volume (Docker inits the volume from the image as
10000; the non-root start can't even `cd /opt/data`, and the profile
reconciler dies with PermissionError on gateway_state.json).
Detect that start early in both the cont-init hook (stage2-hook.sh) and the
CMD wrapper (main-wrapper.sh) and fail fast with actionable guidance pointing
at the supported path: root start + HERMES_UID/HERMES_GID (or the PUID/PGID
aliases), which remaps the hermes user and chowns the volume — the same
host-UID-matching outcome --user was used for, without breaking s6.
The guard fires only when the current UID is neither root NOR the hermes UID.
This preserves the supported non-root start from #34648/#34837 (running with
`--user 10000:10000`, i.e. pinned to the hermes UID itself), which is
unaffected — only the arbitrary-UID variant that #34837 never actually made
writable is rejected.
Verified live across five scenarios (built image, bind + named volume):
arbitrary --user on bind -> rejected with guidance, hermes does not run;
arbitrary --user on named volume -> guidance shown, no raw 'can't cd' crash;
--user 10000:10000 -> boots; root + HERMES_UID=4242 remap -> boots, guard not
tripped; default root start -> boots. Pre-fix control reproduces the raw
PermissionError + 'can't cd' crash with no guidance.
`hermes mcp add --auth header` built `Authorization: Bearer ${MCP_X_API_KEY}`
and passed it straight to the discovery probe without interpolation, so the
probe sent the literal placeholder and auth-requiring servers (e.g. n8n)
returned 401. Runtime tool loading worked because `_load_mcp_config()`
interpolates, but the four CLI probe call sites (add/test/login/configure)
all used unresolved config.
Resolve `${ENV}` inside `_probe_single_server` via a new
`_resolve_mcp_server_config()` (load_hermes_dotenv + _interpolate_env_vars),
mirroring runtime loading. This covers all four call sites, not just add.
Also strip a leading `Bearer ` from pasted tokens before saving to
`MCP_*_API_KEY`, so a token pasted with the prefix doesn't produce
`Bearer Bearer <jwt>` (also a 401).
Reported with a precise root-cause analysis in #37792.
Co-authored-by: ThyFriendlyFox <116314616+ThyFriendlyFox@users.noreply.github.com>
Onboarding's "Local / custom endpoint" only wrote the OPENAI_BASE_URL env
var, which runtime resolution ignores — so a self-hosted endpoint was never
wired in and setup failed with "No usable credentials found for custom" even
though local servers need no key.
Route the local option through saveOnboardingLocalEndpoint: probe the
endpoint, auto-discover a model from /v1/models, persist provider=custom +
base_url + model via /api/model/set, then verify the runtime directly
(not via completeWithModelConfirm, which would re-assign the model without
base_url and wipe it). No onboarding form/UI changes — the existing single
URL field is enough.
The runtime resolver reads model.base_url from config and ignores the
OPENAI_BASE_URL env var, so a self-hosted endpoint could not be configured
from the GUI. Two changes enable it:
- POST /api/model/set accepts an optional base_url and persists it as
model.base_url when provider=custom (still clearing stale base_url for
hosted providers).
- POST /api/providers/validate now returns the model ids a custom endpoint
advertises at /v1/models, so the GUI can auto-pick a default without
asking the user to type a model name.
Refs desktop onboarding "Local / custom endpoint" bug.
The setup-flow provider list showed two Anthropic/Claude entries with
ambiguous labels ('Anthropic (Claude API)' and 'Claude Code (subscription)')
in no deliberate order. Relabel and reorder so the distinction and the
subscription caveat are explicit:
- 'Anthropic API Key' (PKCE, API path)
- 'Anthropic OAuth: Required Extra Usage Credits to Use Subscription' (external)
- Both Anthropic entries moved to the bottom of the list.
- 'OpenAI Codex (ChatGPT)' -> 'OpenAI OAuth (ChatGPT)', now first after Nous.
Applied consistently to the backend OAuth catalog (web_server.py) and the
desktop onboarding overlay's PROVIDER_DISPLAY title/order map; test
assertions updated to the new titles.
Hammer createTokenizer with the worst stalls a terminal can produce —
split + flush at every interior byte, and a 200-report byte-by-byte feed
that flushes after every single byte — and assert the two invariants that
make the SGR-leak class structurally impossible: nothing ever leaks as a
text token, and every complete report reassembles whole. A mixed
mouse+keystroke variant proves real input survives the same storm.
hermes -c / --resume now reopen a session in its original working
directory. The sessions table already had a cwd column; the classic CLI
just never wrote or read it.
- run_agent._ensure_db_session stamps cwd for local CLI sessions only
(new _launch_cwd_for_session gates out gateway/cron and non-local
terminal backends, where a host cwd is meaningless to restore).
- cli._restore_session_cwd chdir's the process AND retargets TERMINAL_CWD
so the terminal tool, code-exec tool, and relative-path resolution all
land in the restored dir. Called from both resume paths (interactive
run() and the -q single-query path).
- Robust degradation: no-op when no cwd recorded, when already there, or
when the dir is gone (single dim warning, stays put — no crash).
With the tokenizer reassembling split CSI sequences across a flush (prior
commit), no SGR mouse fragment can reach a text token anymore — terminals
write a mouse report as one atomic sequence, and any read/flush split now
re-joins in the tokenizer buffer instead of leaking. That makes the whole
downstream recovery layer dead code:
- SGR_MOUSE_FRAGMENT_RE, MOUSE_BURST_NOISE_RE, MOUSE_BURST_RESIDUE_RE
- parseTextWithSgrMouseFragments / parseSgrMouseFragment /
normalizeSgrMouseFragment
- the whole-text mouse-burst noise fast path in parseMultipleKeypresses
Remove all of it (~185 lines) and the tests that only exercised it. The
narrow legacy X10 wheel-tail resynth stays (distinct mechanism, kept with
its own test). This retires the #17701 → #18113 → #26781 → #28463 → #35512
regex hardening chain in favor of the one correct parser fix.
The guard it covered was removed in the previous commit (fragments no
longer reach input-event — they reassemble at the tokenizer). Reassembly
is now covered by termio/tokenize.test.ts and the flush-boundary cases in
parse-keypress.test.ts.
Root-cause fix for the SGR mouse fragment leak (`46M35;40M...` typed into
the prompt). The leak was never really about the fragments — it was the
flush emitting them. When App's 50ms watchdog fires mid-CSI during a render
stall, the tokenizer was force-emitting the buffered partial as a token and
resetting to ground, so both the prefix and the ESC-less remainder surfaced
as unparseable input.
Make the flush state-aware (xterm.js discipline): a bare ESC still flushes
to the Escape key (the legitimate ESCDELAY case), but a buffer still inside
a multi-byte control sequence (csi/osc/dcs/apc/ss3/intermediate) is NOT
emitted — it's kept so the continuation reassembles on the next feed. A
one-tick truncation valve in createTokenizer.flush() drops a partial that
survives a second flush with no progress, so a genuinely truncated write
can't fuse into the next keypress.
With partials never entering the input stream, the downstream scrubber is
dead code: remove the SGR fragment guard from input-event.ts (both the
original `/^\[<\d+;\d+;\d+[Mm]/` and the consolidated form added earlier in
this PR). The parse-keypress burst-recovery regexes (MOUSE_BURST_*) are now
also redundant but left in place as a safety net for one release; they can
be removed in a follow-up once this soaks.
Tests: tokenize.test.ts proves a mid-CSI flush keeps/reassembles and that a
stale partial is dropped after a second flush and a bare ESC still emits;
parse-keypress.test.ts adds the end-to-end split-then-reassemble case
yielding a single clean mouse event with no leaked key.
Supersedes #29337.
The stage2 hook gates the recursive chown of the build trees under
$INSTALL_DIR (.venv, ui-tui, node_modules) so a HERMES_UID/PUID remap
leaves them writable by the new runtime UID — needed for lazy_deps
'uv pip install' of platform extras (#15012, #21100) and the TUI esbuild
rebuild into ui-tui/dist (#28851).
#35027 folded that chown under the $HERMES_HOME ownership check
('stat $HERMES_HOME != hermes_uid'). But 'usermod -u <new> hermes'
re-chowns the hermes home dir ($HERMES_HOME == /opt/data) to the new UID
as a side effect, so after any remap that stat is already satisfied and
needs_chown is false — silently skipping the build-tree chown on the
common PUID/NAS path. The venv stays owned by the build-time UID (10000),
so lazy installs and TUI rebuilds fail with EACCES.
Probe the build trees directly instead: chown only when /opt/hermes/.venv
is not already owned by the runtime hermes UID. Independent of
$HERMES_HOME ownership, idempotent across restarts.
Verified live: built the image, booted with HERMES_UID/HERMES_GID on a
fresh named volume, confirmed .venv/ui-tui/node_modules end up owned by
the remapped UID and 'uv pip install' into the venv succeeds; confirmed
the recursive chown fires once and is skipped on restart.
When App's 50ms flush watchdog fires mid-CSI during a render stall, an
SGR mouse report (ESC[<btn;col;row M/m) is split across stdin chunks: the
tokenizer force-emits the buffered prefix and resets to ground, so both
the prefix and the ESC-less remainder reach InputEvent as nameless tokens.
The previous guard only matched a full `[<\d+;\d+;\d+[Mm]` fragment, so
the flushed prefixes (`ESC[<0;35;`) and the 1-/2-field and leading-`;`
tails (`46M`, `35;46M`, `;46M`) still leaked into the composer as
`46M35;40M...` during long sessions.
Replace the three would-be narrow regexes with one consolidated rule that
covers every split position. A `(?=...\d)` lookahead keeps typed `<`, `[`,
`;`, and `M` safe (no coordinate digit), and the embedded M/m terminator
in the param class leaves stuck-together fragments / prose intact. The
existing `!keypress.name` gate continues to protect real keystrokes, which
arrive one char per chunk with a name set.
Supersedes #29337 (covers the prefix-leak and leading-`;`/1-/2-field tail
cases that PR's two added guards missed).
The HERMES_AGENT_HELP_GUIDANCE block (added #16535) only fired when the
user explicitly asked about configuring/setting up Hermes. Broaden it so
the agent treats the docs as a standing source of self-knowledge for any
Hermes-related help and for understanding its own features/tools, points
to the hermes-agent skill for additional guidance, and treats the docs as
the authoritative/latest source of truth when the two differ.
Static constant in the cache-safe stable tier — no prompt-cache impact.
Dashboard plugins (kanban, hermes-achievements) read
window.__HERMES_SESSION_TOKEN__ directly and hand-assembled WebSocket
URLs with ?token=. That works in loopback/--insecure mode but is
rejected on OAuth-gated deployments, where the session token is absent
and _ws_auth_ok only accepts single-use ?ticket= auth. The result was
401s on plugin REST calls and 1008/403 on the kanban live-events WS
whenever the dashboard ran behind OAuth (e.g. hosted Fly agents).
Make the plugin SDK the single sanctioned auth surface:
- web/src/lib/api.ts: add authedFetch() (raw Response for FormData
uploads / blob downloads, token-or-cookie auth, no throw / no 401
redirect) and buildWsUrl() (assembles a ws(s):// URL with the correct
auth param for the active mode — fresh single-use ticket in gated
mode, token in loopback).
- web/src/plugins/registry.ts: expose authedFetch, buildWsUrl,
buildWsAuthParam, and sdkVersion on window.__HERMES_PLUGIN_SDK__;
add SDK_CONTRACT_VERSION.
- web/src/plugins/sdk.d.ts: hand-authored typed contract for the
plugin SDK + registry globals (single source of truth for the
Window declarations).
- plugins/kanban + hermes-achievements dist bundles: stop reading the
session token directly; route uploads/downloads through
SDK.authedFetch and the live-events WS through SDK.buildWsUrl.
- plugins/kanban plugin_api.py: _ws_upgrade_authorized() delegates the
/events WS upgrade to the canonical web_server._ws_auth_ok gate, so
it transparently accepts loopback token / gated ticket / internal
credential and can never drift from core auth again.
- tests: guard test asserting no plugin dist reads
__HERMES_SESSION_TOKEN__ directly; kanban gated-ticket WS test.
Verified live on a gated staging Fly agent: kanban /events upgrades
101 with a minted ticket (ticket_len=43, ws_auth_ok=True) where the
old code got 403.
uv selects the project Python from requires-python and from the UV_PYTHON
env var, both of which override an already-created venv on the next
'uv sync'. With no upper bound on requires-python, an inherited
UV_PYTHON=3.14 (or a fresh distro whose newest interpreter uv auto-picks)
silently recreated the installer's 3.11 venv at 3.14, where Rust-backed
transitives (pydantic-core) have no cp314 wheel and fall back to a maturin
source build that fails. This bit a Windows/WSL user with UV_PYTHON set in
their shell and a fresh WSL-arch box where uv auto-picked 3.14.
Two layers:
- pyproject: requires-python '>=3.11' -> '>=3.11,<3.14' (+ uv lock regen).
uv now refuses a 3.14 interpreter with a clear error instead of attempting
the maturin build. Backstop independent of the installer.
- install.sh / install.ps1: pin UV_PYTHON to the venv interpreter after
creating it (in both the venv step and the deps step, since bootstrap runs
those stages as separate processes). An inherited UV_PYTHON can no longer
hijack the sync/pip tiers, so the install just works regardless of shell env.
Verified E2E: hostile UV_PYTHON=3.14 + uv venv --python 3.11 + uv sync now
installs into 3.11 with pydantic-core's 3.11 wheel; without the re-pin the
capped requires-python produces a legible incompatibility error rather than a
cryptic build failure.
Fireworks/Mistral reject HTTP 400 'Extra inputs are not permitted, field:
messages[N].tool_calls[M].extra_content' on any session whose history
contains prior Gemini tool calls. Gemini 3 thinking models attach
extra_content (thought_signature) to tool_calls; it survived to the wire
because the sanitize paths only stripped call_id/response_item_id.
Strip extra_content from the outgoing wire copy in both sanitize paths
(ChatCompletionsTransport.convert_messages + _sanitize_tool_calls_for_strict_api),
but gate it on the target model: keep extra_content for Gemini-family
targets (the thought_signature MUST be replayed or Gemini 400s), strip it
for everyone else — including non-Gemini models that inherit a stale Gemini
signature earlier in a mixed-provider session. Native Gemini is unaffected
(GeminiNativeClient bypasses these paths).
Original stored history is never mutated (only the per-call copy).
Fixes#17986.
The Desktop Chat section described chat-only and gave no signpost that
remote-hosted Hermes connection is documented. Adds a pointer to the
in-page remote-backend section and to the deeper Web Dashboard doc.
The TUI hardcoded --max-old-space-size=8192. V8 is not cgroup-aware, so in a
Docker/k8s container capped below ~9-10GB the heap grows past the container
limit and the cgroup OOM-killer SIGKILLs the Node parent BEFORE V8's own heap
monitor fires. SIGKILL runs no JS handler, writes no [tui-parent] breadcrumb,
and closes the gateway child's stdin — the user sees only a bare gateway
'stdin EOF'. Complements #38224 (trail-text cap), which reduced pressure but
left the 8GB-vs-container mismatch in place.
- _read_cgroup_memory_limit(): read cgroup v2 (memory.max) then v1
(memory.limit_in_bytes); handle 'max', the v1 unlimited sentinel, blank/zero,
and >=1PB as unconstrained.
- _resolve_tui_heap_mb(): unconstrained -> 8192; constrained -> 75% of the
cgroup limit (headroom for non-heap RSS + the Python child sharing the
cgroup), floored at 1536MB, never above 8192.
- NODE_OPTIONS block uses the sized value; still respects a user-supplied
--max-old-space-size.
Net: V8 now GCs/exits gracefully (onCritical breadcrumb fires) instead of being
reaped silently. Display/transport only — no agent context or behavior change.
Tests: tests/hermes_cli/test_tui_heap_sizing.py (20 tests).
The stash/restore cycle in the update path was observed to clobber
freshly-pulled source files (apps/desktop/ deletion -> Vite
'[UNRESOLVED_ENTRY] Cannot resolve entry module index.html'). On a
managed clone the user never edits the source tree, so any 'dirty' state
is pure git artifact (CRLF renormalization, npm lockfile churn, files
left behind when a directory was deleted upstream such as
apps/bootstrap-installer/). Stashing that and re-applying it after a pull
is fragile and unnecessary.
- hermes update (hermes_cli/main.py): on a non-fork (managed) clone,
discard working-tree dirt via reset --hard HEAD + clean -fd instead of
stash/apply. Forks keep the stash machinery so intentional edits
survive. Also pin core.autocrlf=false on Windows so the dirt is never
created (mirrors install.ps1 #38239).
- install.sh: replace the update-path stash/restore dance with a hard
reset to origin/<branch>; the installer is a managed-only entry point.
- install.sh + install.ps1 desktop stage: prefer 'npm ci' (wipes and
reinstalls node_modules from the lockfile) over bare 'npm install',
which can report 'up to date' against a stale marker while node_modules
is empty -- leaving tsc unresolved so 'npm run pack' fails.
Tests: managed clone cleans instead of stashing; fork still stashes;
existing stash tests force the stash path explicitly.
The launch provider setup screen rejected too many legitimate users:
a live credential probe ("key rejected"), a post-save runtime check
("still cannot reach X"), and an 8-char minimum all gated progression.
Corporate proxies, regional blocks, rate-limited/flaky probes, and
self-hosted endpoints all tripped these. Now we just require a
non-empty value and save it; a genuinely bad key surfaces later at
chat time instead of blocking onboarding.
Desktop's readiness probe only checks GET /api/status (public), but the
live chat rides /api/ws, which is gated by --tui (4403), a matching
session token (4401), and a non-loopback bind. The web-dashboard doc
covered --tui and the OAuth gate but never the Desktop remote-connection
flow, so the three independent failure modes weren't documented together.
Adds a 'Connecting Hermes Desktop to a remote backend' section: pin
HERMES_DASHBOARD_SESSION_TOKEN, run with --host 0.0.0.0 --insecure --tui,
the curl token-verification one-liner, and WS close-code triage.
The desktop chat app's slash curation (desktop-slash-commands.ts) only
suggested the ~19 curated built-ins. isDesktopSlashSuggestion required
membership in DESKTOP_COMMANDS, so every skill-derived command and user
quick_command was silently dropped from both completion paths
(commands.catalog empty-query + complete.slash typed-query) and from
filterDesktopCommandsCatalog — even though isDesktopSlashCommand let them
EXECUTE when typed in full. The tui_gateway backend already includes skills
in both RPCs; the gap was purely renderer-side.
Add isDesktopSlashExtensionCommand() (= not-a-known-Hermes-built-in, the
same predicate that already gates execution) and let extensions through the
suggestion path. The catalog filter routes through isDesktopSlashSuggestion,
so skill/quick-command categories and pairs are kept automatically.
The native Hindsight memory provider lazy-installs hindsight-client into
/opt/hermes/.venv at first use (tools/lazy_deps.py: memory.hindsight).
That venv lives inside the immutable image layer, not the mounted
/opt/data volume, so the dependency is wiped on every container recreate
/ image update. After an update, profile config still points at Hindsight
and the Hindsight server is healthy, but recall/retain fails with:
ModuleNotFoundError: No module named 'hindsight_client'
The manual workaround (uv pip install hindsight-client inside the running
container) doesn't survive the next recreate, and pip-install-into-.venv
is not an officially supported durable Docker workflow.
Fix: add --extra hindsight to the image's uv sync line, same pattern as
the --extra anthropic/bedrock/azure-identity providers (#30504) and
--extra messaging (#24698) — bake the optional dependency into the build
layer so it survives container recreate. The pyproject [hindsight] pin
(hindsight-client==0.6.1) already matches tools/lazy_deps.py and uv.lock,
so this is a pure additive --extra with no lockfile churn.
Verified: 'uv sync --frozen --no-install-project --extra hindsight'
against the committed uv.lock installs hindsight-client 0.6.1 and the
module imports cleanly.
Adds a regression test (mirrors test_dockerfile_preinstalls_gateway_
messaging_dependencies) so a future Dockerfile cleanup can't silently
drop the extra.
* fix(packaging): modernize project.license to PEP 639 SPDX string
Drops the SetuptoolsDeprecationWarning ('project.license as a TOML table
is deprecated') emitted on every editable build under setuptools>=77 by
switching license = { text = "MIT" } to the SPDX string form plus an
explicit license-files entry. Bumps build-system requires to
setuptools>=77 so an older build backend can't reject the string form.
The warning was non-fatal (builds succeed with it) but surfaces
prominently in install.ps1 build-failure output, where it gets mistaken
for the cause of unrelated Windows build_editable crashes.
* fix(packaging): bound setuptools build requirement per supply-chain policy
Add the <83 upper bound to setuptools>=77.0 so the dep-bounds supply-chain
gate (>=floor,<next_major) passes.
Self-review of #38465 surfaced three real items:
1. SystemExit escape (defense): `_login_nous` raises SystemExit(130)/(1) on
cancel/failure. The logged-out login path inside `_model_flow_nous` catches
it, but the expired-session re-login path (main.py) only catches Exception,
so a Ctrl-C during re-auth could propagate past `_run_portal_one_shot` and
kill the CLI. Add SystemExit to the portal handler so all cancel/abort cases
end with the graceful 'Setup cancelled / retry later' message.
2. Doc sweep: the model-pick step was only added to the bare-`hermes portal`
prose. Propagate it to the surfaces describing `hermes setup --portal`
behavior that still omitted model selection:
- `--portal` argparse help (main.py)
- nous-portal.md intro + the numbered 'what it does' step list (EN + zh-Hans)
- run-hermes-with-nous-portal.md 'default model after setup --portal' line,
which was now contradictory (there's a picker, not a forced default) (EN + zh)
3. Test coverage: add parametrized regression test asserting the portal handler
swallows KeyboardInterrupt / EOFError / SystemExit (returns None, no escape).
Note on 'Skip (keep current)': delegating to _model_flow_nous means picking
Skip preserves the prior provider instead of force-switching to nous — this is
intentional and matches quick setup exactly; docs now say 'sets Nous as your
provider (when you pick a model)' rather than unconditionally.
`hermes portal` / `hermes setup --portal` previously logged in and set
provider=nous but left the model UNSELECTED (blank -> runtime default) and
never showed a picker — unlike the first-time quick setup, which runs the
model picker.
Route `_run_portal_one_shot` through `_model_flow_nous` — the exact same
routine quick setup (`_run_first_time_quick_setup`) and `hermes model` -> Nous
use. It handles both the logged-out path (device-code OAuth, which picks a
model internally) and the logged-in path (curated Nous model picker), then
offers the Tool Gateway opt-in and sets provider=nous. Net effect: `hermes
portal` now offers a model picker every time and is a true single-command
collapse of quick setup's Nous step.
Removes the hand-rolled auth_add_command + manual provider write + separate
Tool Gateway prompt (now a single source of truth). Re-syncs the in-memory
config from disk afterward so a caller's later save_config can't clobber the
model/provider written by the login flow.
Docs (CLI help, portal_cli docstrings, nous-portal EN + zh-Hans) updated to
mention model selection. New regression test asserts `_run_portal_one_shot`
delegates to `_model_flow_nous`.
Verified live: `hermes portal` now shows the 27-model curated picker, 'Skip
(keep current)' preserves prior provider/model.
* fix(desktop): prevent IME Enter from splitting messages and viewport resize from disarming scroll anchor
Two fixes for the Hermes Desktop composer:
1. IME composition Enter was treated as message submission. When a Korean/
Japanese/Chinese IME is composing text and the user presses Enter to
finalise the preedit, handleEditorKeyDown fired submitDraft() because it
did not check event.nativeEvent.isComposing. The assistant-ui hidden
textarea already guards this correctly; the custom contentEditable
handler was missing it. Added an early return when isComposing is true.
2. Viewport resize (composer expand/collapse, window resize) was disarming
the scroll sticky-bottom anchor. When the composer grows, the thread
viewport shrinks, the browser adjusts scrollTop down to keep content
visible, and the onScroll handler misread this as a user scroll-up.
Added lastClientHeightRef tracking so the disarm condition now requires
BOTH stable scrollHeight AND stable clientHeight before treating a
scrollTop decrease as user intent.
Fixes: random mid-message sends during IME typing; scroll jumps when the
composer resizes or the window changes size.
* fix(desktop): prevent virtualizer measurement adjustments from fighting scroll anchoring
The virtualizer's measureElement callbacks trigger scroll adjustments when
item sizes differ from estimates. These fight our ResizeObserver +
pinToBottom loop, creating visible rubber-banding (view snaps to composer
then jumps back up), even during idle.
Three changes:
1. React.memo on VirtualizedThread to stop parent re-renders cascading
2. Shared stickyBottomRef so scrollToFn can check bottom state
3. scrollToFn override: skip adjustments when user is at bottom
* fix(desktop): use stable useCallback ref instead of inline arrow for onBranchInNewChat
The inline arrow `messageId => void branchInNewChat(messageId)` created a
new function reference on every render. This cascaded through:
desktop-controller → ChatView → Thread → useMemo([...onBranchInNewChat])
→ new messageComponents object → VirtualizedThread receives new prop
→ React.memo overridden → virtualizer recalculates → measurement
adjustments trigger scroll jumps at the 15-second useStatusSnapshot
interval.
Pass the already-useCallback'd branchInNewChat directly.
* fix(desktop): use ctrlEnter submitMode on hidden textarea + gate ResizeObserver on isRunning
Two root-cause fixes:
1. IME message splitting: The hidden ComposerPrimitive.Input textarea had
submitMode='enter' (default), so any Enter keydown it received — even
during IME composition — triggered form.requestSubmit(). Changed to
submitMode='ctrlEnter' so only the contentEditable div (which correctly
checks isComposing) handles plain-Enter submission.
2. Scroll jumps during idle: The ResizeObserver auto-follow loop was
active even when the thread wasn't running, causing spurious
pinToBottom calls whenever any layout shift occurred (browser reflow,
font load, GPU cache eviction). Gated the ResizeObserver on
thread.isRunning so auto-scroll only follows during active streaming.
User messages still pin via useLayoutEffect, and thread.runStart still
calls jumpToBottom.
* fix(desktop): keep chat bottom anchor stable through idle layout shifts
* fix(desktop): prevent code block shrink scroll bounce
* fix(desktop): release bottom height lock on run completion
* fix(desktop): keep streaming code blocks rendered
* fix(desktop): keep bottom anchored through final render
* fix(desktop): render streaming reasoning code blocks
* feat(desktop): add subtle streaming block animations
The two retry hints inside _run_portal_one_shot (shown when the OAuth login
fails) still suggested `hermes auth add nous --type oauth`. Since this path
backs both `hermes portal` and `hermes setup --portal`, point users at the
new human-readable `hermes portal` for consistency.
`hermes portal` (no subcommand) now runs the one-shot Nous Portal onboarding
— OAuth login, switch provider to Nous, offer Tool Gateway — identical to
`hermes setup --portal` and the human-readable alias for
`hermes auth add nous --type oauth` (which still works).
The prior status default moves to `hermes portal info`; `status` is kept as a
hidden back-compat alias. `open`/`tools` subcommands are unchanged.
User-facing hints and docs (status.py, conversation_loop 401 guidance,
SystemPage, README, website docs + zh-Hans) now point at `hermes portal` /
`hermes portal info`. `--manual-paste` references keep the explicit auth
command since `hermes portal` does not expose that flag.
The macOS self-update relaunches and installs over the app it derives via
resolve_hermes_desktop_app (.../Hermes.app/Contents/MacOS/Hermes ->
.../Hermes.app). That derivation is load-bearing for both the ditto
install target and the auto-relaunch (open <app>), but had no test.
Add unit coverage:
- resolve_hermes_desktop_app_finds_built_bundle: a fake built release tree
resolves to the .app bundle on macOS (and the exe elsewhere).
- resolve_hermes_desktop_app_is_none_without_a_build: no build => None.
Verified the positive test FAILS if the .app parent-walk is wrong (e.g.
one too few .parent() hops), so it's a real guard against a regression
that would break the post-update relaunch target.
cargo test -> 17 passed.
The macOS self-update bundle swap (install_macos_app_update, added in
#38296) could leave the user with NO app installed. If moving the
existing /Applications/Hermes.app aside failed, the code deleted the
running app outright and set moved_old=false; if the subsequent move of
the freshly built bundle into place then also failed, the rollback was
gated on moved_old (now false) and skipped — leaving the target deleted
with no replacement.
Extract the swap into swap_in_new_bundle() with a strict invariant: on
ANY failure path the target is left pointing at a working bundle (either
the original, rolled back, or untouched) and is never deleted with no
replacement. Also clean up the staged .hermes-update-new copy on the
failure paths instead of orphaning it.
Add unit tests covering the happy path, the rollback-on-install-failure
path, and the catastrophic both-moves-fail path. The catastrophic-path
test was verified to FAIL against the old code ("original app must NOT
be deleted on failure") and pass against the fix.
* fix(packaging): ship locales/ i18n catalogs in wheel, sdist, and Nix
locales/ is a bare data dir (no __init__.py), invisible to packages.find
and package-data. Sealed installs (pip wheel, Nix store venv) dropped it,
so gateway/CLI commands rendered raw i18n keys like
gateway.reset.header_default.
- pyproject: [tool.setuptools.data-files] locales = ["locales/*.yaml"] (wheel)
- MANIFEST.in: graft locales (sdist)
- agent/i18n._locales_dir: env override -> source -> sysconfig data scheme
- nix/hermes-agent.nix: copy locales into the store + set HERMES_BUNDLED_LOCALES
as defense-in-depth. The wheel's data-files already materialize into the
uv2nix venv, so resolution works with no env var; the override pins the
store path against a future uv2nix change that could drop data-files.
- tests: metadata regression, wheel + sdist build-install smoke tests, and a
bundled-locales flake check that verifies BOTH the wrapper override and the
env-var-less data-files path. Smoke test wired into CI.
Closes#23943, #27632, #35374.
Supersedes #23966, #27716, #30261, #33841, #35429, #35494, #35735, #36697.
* test: cap locale e2e timeout, tighten catalog count guard
The two wheel/sdist e2e tests inherit the global --timeout=30 from
addopts; a cold-CI run (isolated build env + venv create + network pip
install) can plausibly exceed it. Add @pytest.mark.timeout(300) so they
don't ride the unit-test budget and flake intermittently.
Also assert the shipped catalog count equals len(SUPPORTED_LANGUAGES)
instead of a hardcoded >=16 floor, so the guard self-updates and trips
on a single dropped catalog (not just a fully-empty graft).
Avoid stale WebSocket events from an old reconnect attempt flipping the gateway state after a newer socket opens. Also limit session-search dedupe to compression edges so branch-specific hits still open the branch instead of collapsing to the parent.
Four related desktop session-management bugs:
- Pins lost until refresh: pinned sessions are joined against the
paginated in-memory session list, so a pinned chat that aged off the
most-recent page got evicted on the next refresh (every message.complete
triggers one) and the Pinned section went empty. mergeWorkingSessions ->
mergeSessionPage now also preserves pinned rows (matched by live id or
lineage root). Pin id checks in the chat header, command center, and
delete/archive are normalized to the durable sessionPinId so pins survive
auto-compression.
- Stuck on "Starting Hermes" after sleep: macOS sleep drops the renderer
WebSocket; nothing reconnected on wake so the composer stayed disabled.
The gateway boot hook now auto-reconnects with backoff on close/error and
on wake signals (powerMonitor resume/unlock-screen IPC, window online,
visibilitychange). connect() gains an open timeout so a hung reconnect
can't deadlock in 'connecting'. Composer placeholder distinguishes
"Reconnecting to Hermes" from a cold start.
- Loses chats from itself: the same hard-replace that dropped pins also
dropped loaded sessions; mergeSessionPage keeps them.
- Multiple copies/branches in search: /api/sessions/search deduped only by
raw session_id, so compression segments and branches surfaced as separate
hits. It now dedupes by lineage root and returns the live compression tip,
matching the session_search tool's behavior.
PR #38296 added four emit_log() calls using the old 3-arg signature, but
main had already changed emit_log to take a `stream: LogStream` argument
(#38312, "stop mislabeling stdout-style progress as stderr"). The two PRs
touched different lines, so the merge auto-resolved with no conflict and
left main unable to compile the bootstrap installer (E0061: 4 args expected,
3 supplied).
Supply the missing stream: Stdout for the update/install progress lines and
Stderr for the "could not auto-launch desktop" failure, matching the
convention from #38312. cargo check passes.
Co-authored-by: Cursor <cursoragent@cursor.com>
The Desktop App and Web Dashboard remote-connect instructions told users
to start the backend with `hermes dashboard --no-open --insecure --host
0.0.0.0`, omitting --tui. Without --tui the embedded-chat WebSockets
(/api/ws, /api/pty) are refused, so the desktop passes the /api/status
health check and reports the backend "ready" — but chat never works
because the socket is closed on connect.
- Add --tui to both backend command blocks (with an inline why-comment).
- Explain that the desktop chat runs over /api/ws + /api/pty and needs
the embedded-chat surface enabled; a plain dashboard/gateway is not
enough.
- Add a troubleshooting entry for the exact symptom (connects, says
ready, chat dead) on both pages.
Assert _exec_schtasks passes an explicit encoding and errors="replace" to
subprocess.run, and that _schtasks_encoding falls back to utf-8 when the
locale lookup is empty or raises (#38172).
_exec_schtasks ran schtasks.exe with text=True but no encoding/errors, so
localized Windows (e.g. Chinese) output in the console code page raised
UnicodeDecodeError tracebacks from subprocess' reader threads during
`hermes gateway status`. Decode with the locale's preferred encoding and
errors="replace" so non-UTF-8 status output is read cleanly.
Fixes#38172
* fix(dashboard): clamp PTY resize dimensions for WSL2 winsize garbage
WSL2 reports columns=131072, rows=1 from a broken winsize probe. The
dashboard /chat tab forwards xterm.js dimensions through PtyBridge.resize(),
which packs them as unsigned short via struct.pack. 131072 > 65535 raised
struct.error — uncaught (only OSError was handled) — breaking the resize
path and leaving the TUI laid out for a one-row, absurdly-wide screen, which
surfaces as blank/disappearing text.
Clamp cols/rows to a sane [1, 2000]x[1, 1000] range before packing.
Non-finite/non-integer probes fall back to the minimum so nothing can reach
struct.pack and raise.
* test(dashboard): de-flake pub/events broadcast test
test_pub_broadcasts_to_events_subscribers round-tripped a frame through
two nested Starlette TestClient WebSocket portals within a 10s wall-clock
budget. Under heavy parallel CI load a starved ASGI thread occasionally
blew that budget even though the server logic is correct, producing
intermittent 'broadcast not received within 10s' failures.
Drive _broadcast_event directly under asyncio with fake subscribers
instead. Same fan-out contract (verbatim delivery to every subscriber on
the channel, nothing to other channels), zero scheduling surface. Runs in
~0.3s, deterministic across 10 consecutive runs.
Both installers (Electron bootstrap-runner + Tauri) hardcoded a literal
`stderr: ` prefix onto every line that arrived on fd 2. Tools like
uv/pip/git/npm write normal progress to stderr by design, so routine
install output showed up tagged as "stderr" (and rendered red in the
Tauri progress UI), making a healthy install look like it was erroring.
Carry the stream as structured metadata (`stream: 'stdout' | 'stderr'`)
on the log event instead of mangling the line text. The UI now styles
stderr subtly (dimmed) rather than alarmingly, and the persistent
forensic logs keep their stdout/stderr distinction.
Chromium exposes the same pasted image on both DataTransfer.items and
.files as distinct Blob objects, which attached twice. Prefer items and
skip the files mirror when items already yielded images.
The macOS DMG / in-app update could leave Hermes unable to relaunch: the
staged updater rebuilt the desktop without managed Node on PATH ("npm not
found"), never installed the rebuilt bundle over the running app, and could
race itself on `git stash`. Child install scripts also inherited a deleted
cwd from the .app bundle replaced during self-update.
- update.rs: prepend $HERMES_HOME/node/bin + venv bin to the rebuild PATH;
read --branch / --target-app from args; add a macOS "install" stage that
dittos the rebuilt bundle over the target app, clears quarantine, and
relaunches via `open` (rolling back on a failed swap); guard start_update
with an AtomicBool so concurrent startUpdate() calls can't race git stash.
- main.cjs: pass --branch <configured> and --target-app <running bundle> to
the staged updater, and spawn it with HERMES_HOME + managed Node/venv on
PATH and cwd=HERMES_HOME.
- bootstrap.rs: launch the desktop via `open <App>.app` on macOS instead of
exec'ing Contents/MacOS/Hermes, avoiding cwd/quarantine issues post-rebuild.
- powershell.rs: pin child install scripts to a stable cwd so they don't emit
getcwd errors when the launching .app is replaced mid-install.
- failure.tsx: in update mode show "Update didn't finish" / "Retry update"
and retry via startUpdate() instead of re-running the installer bootstrap.
* fix(tui): save TUI /save snapshots under Hermes home with system prompt
The TUI gateway's session.save RPC wrote hermes_conversation_<ts>.json to
the workspace/project CWD via os.path.abspath(...) and only exported model
and messages. This diverged from the classic CLI /save (which writes under
the Hermes profile home) and from the dashboard save (which includes the
system prompt).
Write the snapshot under get_hermes_home()/sessions/saved/ and include
system_prompt, session_id, and session_start so the TUI export matches the
CLI and dashboard behavior.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(tui): prefer agent.session_start for /save export; assert it in test
Address review feedback: derive session_start from the agent's session_start
datetime (matching the classic CLI export) and fall back to the gateway
session's created_at only when unavailable. Assert session_start in the
regression test.
Co-authored-by: Cursor <cursoragent@cursor.com>
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(desktop): enrich profiles dashboard and de-dupe channel env vars
Add active-profile switching, role descriptions (manual + auto-generate
via the auxiliary LLM), per-profile model selection, and gateway-running
/ distribution badges to the GUI Profiles page. New profile creation
gains clone-all, optional description and model assignment.
Hide messaging-platform credentials (channel_managed) from the Keys/Env
page since the Channels page is the canonical surface for them, and
relabel the trimmed "messaging" category as "Gateway".
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(desktop): address review feedback on profiles/env changes
- ProfilesPage: scope the action-menu outside-click handler to the menu's
own container via a ref so opening one card's menu no longer leaves
others open.
- EnvPage: route the "Gateway" label and hint through i18n
(t.common.gateway / gatewayHint) instead of hard-coded English, with an
English fallback for untranslated locales.
- web_server: only report description_auto=true when auto-generation
actually succeeded.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(desktop): address second-round review on profiles
- ProfilesPage: treat describe-auto success by null-checking the
description and trust the response's description_auto flag instead of
assuming true; disable the model-editor Save button unless the selected
choice resolves to a real /api/model/options entry (avoids silent
no-op saves).
- tests: cover the new profile endpoints (active get/set + 404,
description round-trip + 404, model round-trip + 400 validation, and
describe-auto success/failure contracts).
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(desktop): more profiles review fixes (toggles, races, tests)
- ProfilesPage: use the canonical `active` returned by setActiveProfile;
make the SOUL/description/model action-menu items toggle their editor
closed when already open; guard description save/auto-describe against
stale responses via an activeDescRequest ref so a late reply can't
clobber a different open editor.
- tests: assert /api/env channel_managed classification matches
_channel_managed_env_keys().
Co-authored-by: Cursor <cursoragent@cursor.com>
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
The "Build desktop app" install step failed with an opaque "exit code 1"
on machines with an old Node, and nothing in the logs explained it.
Reproduced: on Node 20.5.1, `npm run pack`'s `vite build` crashes with
You are using Node.js 20.5.1. Vite requires Node.js version 20.19+ or 22.12+.
SyntaxError: The requested module 'node:util' does not provide an
export named 'styleText'
Vite 8 (rolldown) imports node:util.styleText, which doesn't exist before
Node 20.12, so the build dies before producing the app. The installer's
check_node / Test-Node accepted ANY pre-existing Node with no version
floor, so a too-old system Node was used for the build instead of the
bundled Node 22.
Add a version floor (^20.19 || >=22.12) to check_node (install.sh) and
Test-Node (install.ps1): a too-old system Node is replaced with the
Hermes-managed Node 22 LTS, and the desktop stage re-resolves Node so the
build always runs on a satisfying version. Declare the same range in
apps/desktop/package.json engines.
Verified: build succeeds on Node 22, fails on 20.5.1 with the error above;
the floor logic matches Vite's range across boundary versions (20.18/20.19,
21.x, 22.11/22.12).
Git for Windows defaults to core.autocrlf=true, which renormalizes the
repo's LF-only text files to CRLF in the working tree. On a managed,
never-user-edited clone this makes tracked files (.envrc, AGENTS.md,
agent/*.py, workflows) show as locally modified, so the update path's
bare git checkout aborts with 'Your local changes would be overwritten
by checkout' and the desktop bootstrap fails at stage=repository.
The bash installer already autostashes before checkout; the PowerShell
path had no dirty-tree handling at all and never pinned autocrlf.
Fix: (1) git reset --hard HEAD before fetch/checkout in the update path
to discard any pre-existing dirt, and (2) pin core.autocrlf=false on both
the update and fresh-clone paths so the dirt is never created again.
* fix(doctor): detect + repair stale HERMES_MAX_ITERATIONS .env ghost shadowing config.yaml
hermes doctor now flags when ~/.hermes/.env carries a HERMES_MAX_ITERATIONS
value that disagrees with agent.max_turns in config.yaml, and 'hermes doctor
--fix' removes the stale .env line so config.yaml is authoritative. 'hermes
config show' surfaces the same drift inline under Max turns.
The setup wizard stopped dual-writing this value, but users who edited only
config.yaml from a pre-fix install keep a .env ghost. The gateway bridge
normally overrides it at startup, but if the bridge bails on any earlier
config-parse error the ghost silently wins — config says 400 while the
gateway activity line reads N/90.
The detector reads the .env FILE directly (load_env), not get_env_value/
os.environ, since the startup bridge may already have overwritten os.environ
with the config value.
Closes#17534.
* fix(config): stop offering HERMES_MAX_ITERATIONS as an editable env var
Removes HERMES_MAX_ITERATIONS from OPTIONAL_ENV_VARS so the dashboard env
editor (PUT /api/env) and any env-var prompt no longer let a user write it
to .env — which would recreate the stale ghost that shadows config.yaml's
agent.max_turns (issue #17534). The iteration budget is configured only via
config.yaml; the env var stays a read-only backward-compat fallback in the
gateway/CLI, never a promoted write target.
Regression test asserts it is absent from OPTIONAL_ENV_VARS.
CI slice 3 caught that tests/test_transform_tool_result_hook.py monkeypatches
invoke_hook but not has_hook, so the new has_hook("transform_tool_result")
gate skipped the emit and the transform never ran. Stub has_hook=True in the
shared _run_handle_function_call helper whenever a custom invoke_hook is
supplied (the test intends hooks to fire). The no-hook-registered test keeps
the real has_hook=False path — that's the gate's intended behavior.
The salvaged observer contract gated the API-request hot path on has_hook()
but left the per-tool emit ungated: every tool call ran result-field
derivation + payload dict build + invoke_hook dispatch even with zero
plugins registered.
- _emit_post_tool_call_hook now short-circuits on has_hook("post_tool_call")
and derives status/error fields lazily (after the gate, only when a
listener will consume them). status defaults to None -> derived; explicit
blocked/cancelled callers still pass status through.
- transform_tool_result emit (pre-existing hook) likewise gated on
has_hook(); skips _tool_result_observer_fields when no listener.
- Removed the now-redundant _tool_result_observer_fields pre-computation at
the three ok-path call sites (model_tools, agent_runtime_helpers,
tool_executor) — the helper derives them, so the no-listener path costs
one dict lookup and the call sites shrink.
- Tests: stub has_hook=True where payload correctness is asserted; add a
no-listener regression proving post_tool_call/transform_tool_result emit
is skipped when nothing is registered.
The salvaged PR incidentally stripped a trailing blank line from two
unrelated test files (test_file_tools_cwd_resolution.py,
test_tool_search.py). Restore them to keep the salvage diff scoped to
the observability feature.
Adds backend-neutral observer hooks for plugins: session, turn, API
request, tool, approval, and subagent lifecycle events with stable
correlation IDs (session_id, task_id, turn_id, api_request_id,
tool_call_id, parent/child subagent ids). Extends VALID_HOOKS with
api_request_error and subagent_start.
Hot path is zero-cost when no plugin subscribes: has_hook()/presence
checks gate all payload construction, request payloads are returned
by reference when no middleware rewrites, and the sanitized response
payload no longer embeds raw response objects.
Bundles the optional NeMo-Relay observability plugin
(plugins/observability/nemo_relay) as an in-repo consumer of the new
hooks, peer to the existing langfuse plugin. Fails open when the
optional nemo-relay package is not installed.
Authored-by: Bryan Bednarski <bbednarski@nvidia.com>
Salvaged from #29722 onto current main.
A kanban worker that exhausted its retries purely on a provider rate
limit / quota wall (e.g. opencode-go's 5-hour window) exited with code 1.
The dispatcher counted that as a crash, and with DEFAULT_FAILURE_LIMIT=2
two quota-wall hits permanently blocked the card. Fanning out many
workers against one shared quota made this routine.
Now a rate-limited worker exits with EX_TEMPFAIL (75); the dispatcher
classifies that as a 'rate_limited' exit, releases the task back to
'ready' WITHOUT incrementing consecutive_failures (the breaker can't trip
on a transient throttle), and the respawn guard defers the next attempt
on a cooldown (default 5min, HERMES_KANBAN_RATE_LIMIT_COOLDOWN_SECONDS)
until the quota window clears. Genuine crashes still count and trip the
breaker as before. The 120s Retry-After cap is unchanged — no worker
parks for hours holding a slot.
- conversation_loop.py: surface failure_reason in the exhaustion return
- cli.py: kanban worker picks exit 75 on rate_limit/billing failure
- kanban_db.py: rate_limited exit kind, no-count requeue, cooldown guard
A heavy --tui session (browser snapshots, large tool outputs) silently
OOM-killed the Node parent within minutes — closing the gateway child's
stdin, which the user saw only as a bare "gateway exited" / stdin EOF.
CLI was immune. Root cause: each completed tool's verbose trail line
embedded up to 16KB of result_text, persisted in transcript Msg.tools[]
for the whole session and rendered EXPANDED by default, so an Ink
render-node tree was built for every one of up to 800 messages at once.
That tree blew past Node's heap at a few hundred MB — far below the 2.5GB
memory-monitor exit threshold, so the death was never even attributed.
- text.ts: persisted verbose tool-trail blocks now cap to a small preview
(VERBOSE_TRAIL_MAX_CHARS=800/12 lines), not the 16KB live-render budget.
Retained trail strings drop ~17x (12.2MB -> 0.7MB at 800 msgs); the live
streaming tail still uses the larger LIVE_RENDER budget.
- tui_gateway/server.py: lower the gateway-side verbose text cap to match
(1KB/16 lines) so we stop shipping output the TUI no longer renders.
- memoryMonitor.ts: derive critical/high thresholds from the real V8 heap
ceiling (~88%/70%) instead of the hardcoded 2.5GB that killed the process
at 31% of an 8GB ceiling; add a one-shot onWarn early-warning on fast
sub-threshold heap growth so the next such death is diagnosable, not silent.
- entry.tsx: wire onWarn to a crash-log breadcrumb + stderr line.
Full tool output is unchanged in the agent context and SQLite session — this
is display/transport only, no behavior or context change.
Fixes#34095. Related #27282.
Tests: ui-tui text + new memoryMonitor suites (33 pass), python verbose-cap
guard (5 pass); full ui-tui suite shows no new failures vs pristine main.
E2E repro confirms the retention drop.
The dashboard's update button ran 'hermes update' immediately with no
preview. Now the System page shows whether an update is available and
asks the user to confirm before applying it.
- New GET /api/hermes/update/check: reports install method, current
version, and commits-behind (via banner.check_for_updates, 6h-cached;
?force=1 busts the cache). Soft-fails to behind=null on network error;
marks docker/nix/homebrew as can_apply=false with the out-of-band cmd.
- System page: update-status badge on the Hermes version row (latest /
N behind), a Check-for-updates button, and an Update-now button that
opens a ConfirmDialog showing the commit count before POST /api/hermes/
update fires. Cached status loads with the rest of the page.
- Docs + 5 endpoint tests (git/up-to-date/docker/soft-failure + auth gate).
The thread scroll-anchor hook in apps/desktop/src/components/assistant-ui/
thread-virtualizer.tsx was disarming sticky-bottom whenever scrollTop
decreased by >1px between scroll events. That check was too eager: when
content height grows mid-frame (virtualizer measurement of a newly visible
turn, streaming token, Streamdown/Shiki re-tokenization, composer chip
toggle), the browser emits an interim 'scroll' event whose scrollTop is
smaller than the previous frame's because scrollHeight just jumped. The
rAF-scheduled pinToBottom hasn't run yet, so programmaticScrollPendingRef
is 0 and the disarm fired. With sticky-bottom disarmed the scroller stuck
~50px above bottom — the visible at-rest backward jump that #37997
describes (and the same root cause as the wheel-up variant in #37527).
Fix:
- Track scrollHeight per frame (lastHeightRef). Disarm on scrollTop
decrease ONLY when scrollHeight did not grow this frame. Real upward
user intent (scrollbar drag, keyboard PgUp, programmatic scrollIntoView)
still disarms because it moves scrollTop without growing the content.
Wheel-up and touchmove continue to disarm via their own listeners.
- Stop observing the scroller element itself in the ResizeObserver; only
observe its content child. Viewport-only resizes (window resize,
devtools panel toggle) no longer trigger spurious pins, matching the
intent of the auto-stick-to-bottom behavior.
Verified:
- apps/desktop `tsc -b` clean.
- apps/desktop `vitest run src/components/assistant-ui/streaming.test.tsx`
passes (9/9), including the existing wheel-up disarm regression test
that asserts scrollTop stays at 420 after a wheel-up + content growth.
Lockfile regeneration invalidated the flake's pinned npm-deps hash.
Hash taken from fetchNpmDeps' authoritative 'got:' line (the
prefetch-npm-deps Diagnose helper reports a different, wrong value
due to a fetcherVersion normalization discrepancy).
- Add @testing-library/dom to apps/desktop devDeps in package-lock.json
so npm ci validates against the manifest change (contributor left the
lockfile out of the PR intentionally).
- Removes stale 'peer: true' flags now that dom is an explicit devDep.
- AUTHOR_MAP: prostoandrei9@gmail.com -> vladkvlchk (CI author gate).
@testing-library/react@16 declares @testing-library/dom as a peerDependency
and re-exports waitFor/fireEvent/screen/within from it. Without dom installed
as a direct dependency, tsc -b fails with TS2305 in every test file that
imports those names — which breaks the apps/desktop build during installer
bootstrap (Hermes Setup → "INSTALL DIDN'T FINISH").
The Electron desktop app writes boot failures, backend spawn output, and
Python tracebacks to HERMES_HOME/logs/desktop.log, but debug-share only
captured agent/errors/gateway — so desktop boot issues never made it into
shared debug reports.
- logs.py: register desktop -> desktop.log (enables 'hermes logs desktop')
- debug.py: capture desktop snapshot, add to summary report, upload full
desktop.log in 'share', update privacy notice
- gateway /debug inherits the desktop tail via collect_debug_report()
- main.py + docs: help text and log-name table (also adds missing gui row)
- tests: desktop seed in fixture, new report test, three_pastes -> four_pastes
get_mcp_status() treated every non-connected server as a failure, so a
server configured with enabled: false rendered as red '— failed' in the
startup banner even though it was intentionally off. Add a 'disabled'
field derived from the enabled flag and render disabled servers dim as
'— disabled' instead.
The section explained why the Session token is hidden but punted the actual
setup steps to the web-dashboard page via a link — a bounce for someone on
the Desktop App page trying to connect. Inline the concrete steps instead:
backend command block (mint token -> .env -> hermes dashboard --insecure),
the in-app Remote gateway steps, the env-var override, Tailscale guidance,
and a troubleshooting list. Keep a short pointer to the web-dashboard page
for the same setup from that angle.
The Python half (#37538) reads HERMES_DESKTOP_CHILD_PID to exclude the
desktop-managed backend from _kill_stale_dashboard_processes, but nothing
set it. applyUpdatesPosixInApp now passes the live backend PID in the
`hermes update` env, completing the #37532 fix end-to-end.
The Desktop App page covered install, settings, and chat but not how to
connect the app to a backend on another machine — the exact thing
@PedjaDrazic asked about. Add a 'Connecting to a remote backend' section
that explains the Session token is the dashboard token Hermes never
surfaces (pin it via HERMES_DASHBOARD_SESSION_TOKEN + run --insecure),
and link to the web-dashboard page for the full backend setup rather than
duplicating it. Add a reciprocal link from the web-dashboard remote section
back to the Desktop App page.
Follow-up to the salvaged contributor commit:
- Underscore→hyphen tolerance now emits a resolvable token. Previously
the detect set accepted the hyphenated variant but emit returned the
raw token, so '!set_home' produced '/set_home' which the dispatcher
could not resolve. Now emits '/set-home'. Aliases are left as-is — the
gateway dispatcher canonicalizes them itself.
- Fix dead skill-command branch: skill command keys are stored
slash-prefixed (e.g. '/arxiv') in get_skill_commands(), but the check
compared the bare token, so '!arxiv' never normalized. Now compares
the '/candidate' form, making skill aliases (e.g. !gif-search) work.
- Re-run bang normalization after Matrix reply-fallback stripping so a
quoted reply whose content is a bang command reaches command parity
with the slash form.
- Replace silent 'except Exception: pass' with logger.debug(exc_info=True).
- Add AUTHOR_MAP entry for @nepenth.
Tests: +5 (underscore-alias, skill-command branch, quoted-reply bang +
slash parity). 162 Matrix tests pass.
The desktop Remote gateway field asks for a session token that Hermes never
surfaces — by default web_server.py mints an ephemeral token per boot and
injects it into the served HTML, so there is nothing in config.yaml, /gateway,
or env to copy. Document that you pin it yourself via
HERMES_DASHBOARD_SESSION_TOKEN, run the backend with --insecure (keeps the
legacy token auth path instead of engaging the OAuth gate), then paste that
value into the desktop app.
- web-dashboard.md: new 'Connecting Hermes Desktop to a remote backend' section
(backend + desktop steps, --insecure vs OAuth-gate nuance, HERMES_DESKTOP_*
env override, Tailscale guidance, troubleshooting).
- environment-variables.md: new 'Web Dashboard & Hermes Desktop' env-var table
(HERMES_DASHBOARD_SESSION_TOKEN, HERMES_DESKTOP_REMOTE_URL/TOKEN, the OAuth
and public-url vars) — none were previously documented.
Follow-up to #38089. The merged PR removed --recurse-submodules from the
installer, CI, and getting-started docs, but missed the same stale clause in:
- CONTRIBUTING.md (Prerequisites table)
- website/docs/developer-guide/contributing.md (table + clone command)
- zh-Hans mirror of the developer-guide contributing doc
git-lfs is kept in the Git requirement rows since it's a separate, real
prerequisite. No .gitmodules has existed since the Atropos RL submodule was
removed in #26106.
2026-06-03 03:11:19 -07:00
823 changed files with 66069 additions and 15941 deletions
@@ -283,6 +283,21 @@ The dashboard embeds the real `hermes --tui` — **not** a rewrite. See `hermes
**Structured React UI around the TUI is allowed when it is not a second chat surface.** Sidebar widgets, inspectors, summaries, status panels, and similar supporting views (e.g. `ChatSidebar`, `ModelPickerDialog`, `ToolCall`) are fine when they complement the embedded TUI rather than replacing the transcript / composer / terminal. Keep their state independent of the PTY child's session and surface their failures non-destructively so the terminal pane keeps working unimpaired.
### Electron Desktop Chat App (`apps/desktop/`)
A **separate** chat surface from both the classic CLI and the dashboard's embedded TUI. It is an Electron + React + nanostore renderer (`@assistant-ui/react`) that talks to a `tui_gateway` backend over JSON-RPC (`requestGateway(method, params)`). It does NOT embed `hermes --tui` — it has its own composer, transcript, and slash-command pipeline. Route desktop bugs to the `hermes-desktop-app-work` skill, not `hermes-dashboard-work`.
**Slash commands in the desktop app are curated client-side, then dispatched to the backend.** The pipeline:
- **Backend already provides everything.** `tui_gateway/server.py``commands.catalog` (empty-query list) and `complete.slash` (typed-query completions) both include built-in commands, user `quick_commands`, AND skill-derived commands (`scan_skill_commands()` / `get_skill_commands()`). The desktop app does not need a new RPC to see skills.
- **The renderer curates via `apps/desktop/src/lib/desktop-slash-commands.ts`.** This is the load-bearing file. It holds `DESKTOP_COMMANDS` (the ~19 built-ins shown in the palette) plus block-lists for terminal-only / messaging-only / picker-owned / settings-owned / advanced commands that should NOT clutter the desktop popover.
-`isDesktopSlashCommand(name)` — gates **execution**. Returns true for built-ins AND for any non-built-in (skill / quick command), so typed extension commands run.
-`isDesktopSlashSuggestion(name)` — gates **discovery/completion**. Used by BOTH completion paths in `app/chat/composer/hooks/use-slash-completions.ts` (empty-query catalog filter + typed-query `complete.slash` filter) and by `filterDesktopCommandsCatalog`.
-`isDesktopSlashExtensionCommand(name)` — true when the command is NOT a known Hermes built-in (i.e. a skill or user quick command). Both suggestion and catalog-filter paths allow extensions through so skill commands surface in the palette. (Added when fixing "skill commands missing from the desktop slash palette" — the curated allow-list was silently dropping every skill/quick command from completions even though they executed fine when typed.)
- **Dispatch** lives in `app/session/hooks/use-prompt-actions.ts` (`runSlash`): built-ins that the desktop owns (`/skin`, `/help`, `/new`, …) are handled locally or via `commands.catalog`; everything else goes to `slash.exec`, falling back to `command.dispatch` (which the gateway resolves into skill / alias / exec directives). A skill command resolves to `{type: "skill", message}` and is submitted as a normal prompt.
**Rule:** the desktop slash palette's curation is about hiding noise (terminal-only / messaging-only built-ins), NOT about hiding user-activated extensions. Skill commands and `quick_commands` are extensions the backend surfaces — they belong in completions. If you tighten `desktop-slash-commands.ts`, keep `isDesktopSlashExtensionCommand` flowing into both the suggestion and catalog-filter paths. Tests: `apps/desktop/src/lib/desktop-slash-commands.test.ts` (run via the repo-root `vitest`, since `apps/desktop` resolves deps from the root workspace install).
The installer handles everything: uv, Python 3.11, Node.js, ripgrep, ffmpeg, **and a portable Git Bash** (MinGit, unpacked to `%LOCALAPPDATA%\hermes\git` — no admin required, completely isolated from any system Git install). Hermes uses this bundled Git Bash to run shell commands.
@@ -52,7 +52,7 @@ If you already have Git installed, the installer detects it and uses that instea
> **Android / Termux:** The tested manual path is documented in the [Termux guide](https://hermes-agent.nousresearch.com/docs/getting-started/termux). On Termux, Hermes installs a curated `.[termux]` extra because the full `.[all]` extra currently pulls Android-incompatible voice dependencies.
>
> **Windows:** Native Windows is fully supported — the PowerShell one-liner above installs everything. If you'd rather use WSL2, the Linux command works there too. Native Windows install lives under `%LOCALAPPDATA%\hermes`; WSL2 installs under `~/.hermes` as on Linux. The only Hermes feature that currently needs WSL2 specifically is the browser-based dashboard chat pane (it uses a POSIX PTY — classic CLI and gateway both run natively).
> **Windows:** Native Windows is fully supported — the PowerShell one-liner above installs everything. If you'd rather use WSL2, the Linux command works there too. Native Windows install lives under `%LOCALAPPDATA%\hermes`; WSL2 installs under `~/.hermes` as on Linux. The only Hermes feature that currently needs WSL2 specifically is the browser-based dashboard chat pane (it uses a POSIX PTY — classic CLI and gateway both run natively).
After installation:
@@ -94,7 +94,7 @@ One command from a fresh install:
hermes setup --portal
```
That logs you in via OAuth, sets Nous as your provider, and turns on the Tool Gateway. Check what's wired up any time with `hermes portal status`. Full details on the [Tool Gateway docs page](https://hermes-agent.nousresearch.com/docs/user-guide/features/tool-gateway).
That logs you in via OAuth, sets Nous as your provider, and turns on the Tool Gateway. Check what's wired up any time with `hermes portal info`. Full details on the [Tool Gateway docs page](https://hermes-agent.nousresearch.com/docs/user-guide/features/tool-gateway).
You can still bring your own keys per-tool whenever you want — the gateway is per-backend, not all-or-nothing.
@@ -40,7 +34,7 @@ It builds and launches the GUI against your existing install — same config, ke
### Prebuilt installers
When a release ships desktop installers they're attached to its [releases page](https://github.com/NousResearch/hermes-agent/releases) — `.dmg` (macOS), `.exe` / `.msi` (Windows), `.AppImage` / `.deb` / `.rpm` (Linux). These are published manually, so the install-with-Hermes path above is the most reliable way to get the latest.
Prebuilt installers are built and distributed via [the Hermes Desktop website.](https://hermes-agent.nousresearch.com/desktop).
---
@@ -56,10 +50,7 @@ hermes update
## Requirements
The installer handles everything for you (Python 3.11+, a portable Git, ripgrep). The only thing worth knowing:
- **Windows** — the installer bundles its own Git and Python; no admin rights or system changes required.
- **macOS / Linux** — uses your system Python 3.11+ (installed automatically if missing).
The installer handles everything for you (Python 3.11+, a portable Git, ripgrep).
---
@@ -94,7 +85,7 @@ Installers are built and uploaded to GitHub Releases manually. macOS/Windows sig
### How it works
The packaged app ships only the Electron shell. On first launch it installs the Hermes Agent runtime into `HERMES_HOME` (`~/.hermes`, or `%LOCALAPPDATA%\hermes` on Windows) — the **same layout a CLI install uses**, so the two are interchangeable. The renderer (React, in `src/`) talks to a `hermes dashboard --tui` backend over the standard gateway APIs and reuses the embedded TUI rather than reimplementing chat. The install, backend-resolution, and self-update logic all live in `electron/main.cjs`.
The packaged app ships only the Electron shell. On first launch it installs the Hermes Agent runtime into `HERMES_HOME` (`~/.hermes`, or `%LOCALAPPDATA%\hermes` on Windows) — the **same layout a CLI install uses**, so the two are interchangeable. The renderer (React, in `src/`) talks to a `hermes dashboard` backend over the standard gateway APIs and reuses the embedded TUI rather than reimplementing chat. The install, backend-resolution, and self-update logic all live in `electron/main.cjs`.
'Module scripts are being served with the wrong MIME type. This usually means a static file server is serving a Vite/React app instead of the project dev server.',
description:copy.moduleMimeDescription,
url: webview.getURL?.()||target.url
})
setLoading(false)
@@ -566,13 +567,11 @@ export function PreviewPane({
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.