Compare commits

..

14 Commits

Author SHA1 Message Date
Teknium
fb5649f2e8 Merge remote-tracking branch 'origin/main' into fix/egress-review-fixes 2026-06-14 13:16:34 -07:00
Teknium
e433c41014 fix(egress): harden Docker proxy UX and enforcement 2026-06-14 13:07:22 -07:00
Teknium
4f65d5509f chore: empty commit to mint fresh merge SHA (CI runner tree-corruption on shard 5) 2026-06-10 09:15:41 -07:00
Teknium
e2c2e41137 fix(egress): v4 round — bridge bind on Linux, listener-role split, fallback gate, audit.log truth
Addresses GodsBoy's May 30 follow-up review on PR #30179.

P0 — Linux sandboxes could not reach the proxy:
- _default_http_listen now binds the docker bridge gateway on Linux
  (host.docker.internal resolves to the bridge gateway there; the old
  loopback-only bind was unreachable from containers). Loopback stays
  the default on Docker Desktop platforms; bridge-less Linux falls back
  to loopback with a warning.
- Live-testing the fix against the real v0.39 binary surfaced a second
  latent bug the host-side E2E had masked: v0.39's http_listen does NOT
  terminate CONNECT — tunnel_listen does. HTTPS_PROXY traffic through
  http_listen got 400s from upstream. build_proxy_config now binds
  tunnel_listen (CONNECT/MITM) on tunnel_port and http_listen
  (plain-HTTP forwards) on tunnel_port+1; docker.py points HTTPS_PROXY
  at tunnel_port and HTTP_PROXY at tunnel_port+1.
- Liveness probes (start_proxy poll loop, get_status) now probe the
  CONFIGURED bind host via _read_http_listen_from_config() instead of
  hardcoded loopback, which would have killed a healthy bridge-bound
  daemon as 'never came up'.

P2 — allow_env_fallback dead on the partial-secret path:
- The missing-secret branch in _build_proxy_subprocess_env now honors
  proxy.allow_env_fallback exactly as its own error message promises.
- cmd_start refuses (or warns, with the fallback flag) when
  credential_source=bitwarden but secrets.bitwarden is disabled/missing
  — closing the silent-degrade-to-host-env hole.

P2 — audit.log decoy on v0.39:
- Wizard pre-create failure downgraded to a warning (file is
  non-load-bearing until the version bump); success line qualified as
  'reserved'; user + dev docs stop telling operators to wire monitoring
  to the path today.

P3 — metrics comment no longer claims 'hermes egress status' surfaces
the ephemeral metrics port (it can't; :0 is random and unrecorded).

Validation: 180 unit tests pass (9 new), gated E2E passes, and a live
end-to-end run against the real binary verified CONNECT-MITM with
Authorization swap on tunnel_listen, plain-HTTP swap on tunnel_port+1,
403 on non-allowlisted hosts, and bind-host-aware status probing.
Docs gain a Linux firewall troubleshooting section (container→docker0
INPUT drops, e.g. ufw default-deny).
2026-06-10 05:12:37 -07:00
Teknium
42139c3cf0 Merge remote-tracking branch 'origin/main' into feat/iron-proxy
# Conflicts:
#	hermes_cli/main.py
#	tools/environments/docker.py
2026-06-10 04:38:13 -07:00
teknium1
bde66ef37a docs(egress): align user + dev docs with iron-proxy v0.39 actual behavior
The previous docs round (906b1da57) described the integration the way
we wanted it to work — `http_listens` plural with a docker bridge
bind, dedicated `audit.log` for per-request JSON records.  Live
testing against the real v0.39.0 binary in [905ce58a1] surfaced that
neither field exists in v0.39's config schema, and the docs were
making promises the daemon couldn't keep.

This commit walks every claim in the docs back to what the binary
actually does today, while keeping the upgrade path explicit so the
docs stay coherent when the pinned `_IRON_PROXY_VERSION` bumps:

website/docs/user-guide/egress/iron-proxy.md
- Bind policy section: rewritten.  Was "loopback + docker bridge IP
  on Linux"; now "loopback only" with an explicit explanation that
  v0.39 only supports one bind per daemon and that
  host.docker.internal -> host-gateway mapping is what sandboxes
  use to reach the loopback bind.
- Bind policy section adds a note on the metrics-port pin that the
  previous round of docs didn't even mention.
- State directory layout table: `audit.log` description rewritten
  to acknowledge it's a pre-created sentinel for future binary
  versions, NOT something the v0.39 daemon writes to.
- New section "Logging on iron-proxy v0.39" replaces the old
  "Audit log vs daemon log" section.  Explicitly tells operators
  the daemon log is the single source of truth for both audiences
  on v0.39, with the upgrade path called out.
- Data-flow diagram step 7: rewritten to send per-request records
  to `iron-proxy.log` on v0.39 with cross-link to the new logging
  section.
- Diagram caption updated.
- Security-model "allowlisted-host exfiltration" line: "audit log
  captures" -> "daemon log captures".
- Security-model "LAN peer leak" line: removed the docker-bridge
  claim.
- Troubleshooting section's per-request-inspection recipes:
  rewritten to use `iron-proxy.log` and explain when the split
  stream will land.
- Limitations list gets a new bullet calling out the
  single-bind + combined-log v0.39 constraints + the auto-upgrade
  posture.

website/docs/developer-guide/egress-internals.md
- Bind policy invariant: documents the singular `http_listen` v0.39
  schema constraint + dead-code-until-upgrade status of the
  bridge-bind path.
- New "Metrics port collision" invariant documenting why
  `metrics.listen: 127.0.0.1:0` is non-negotiable.
- Audit log fail-loud invariant adds the v0.39 schema constraint
  note + the new
  `test_audit_log_kwarg_does_not_inject_audit_path_v039` regression
  test.
- "Subscribing to per-request audit events" section updated to
  send watchers at `iron-proxy.log` for v0.39 with the upgrade
  pivot called out.

website/docs/reference/cli-commands.md
- Diagnostic shortcut for tailing the audit log: `tail audit.log |
  jq` -> `tail iron-proxy.log | jq` with the v0.39 note inline.

Build verification:
- `npx docusaurus build` succeeds across all three locales
  (en + zh-Hans + ko).
- New `#logging-on-iron-proxy-v039` anchor lands in the rendered
  HTML and the in-page cross-references resolve.
- No new broken anchors introduced (pre-existing warnings on
  unrelated zh-Hans pages are unchanged).
- No leftover stale `#audit-log-vs-daemon-log` or `#http_listens`
  references anywhere on the egress pages.
2026-05-29 00:34:36 -07:00
teknium1
905ce58a12 fix(egress): align proxy.yaml with iron-proxy v0.39 actual schema +
propagate handler exit codes

Live testing the full wizard against the real v0.39.0 binary
(downloaded + extracted via our own install_iron_proxy()) surfaced
three real bugs that the unit tests couldn't catch:

1. `proxy.http_listens` (plural) — NOT a field in v0.39's config struct.
   Our code emitted both `http_listen` (string) and `http_listens`
   (list) believing v0.39 accepts both forms.  The binary actually
   rejects with "field http_listens not found in type config.Proxy"
   at YAML unmarshal time, so the daemon fails to start.  Confirmed
   via strings(1) audit of the v0.39 binary — only `http_listen` is
   tagged.

2. `log.audit_path` — NOT a field in v0.39's config.Log struct.  Same
   class of error: "field audit_path not found in type config.Log".
   Per-request audit-log records are not separable from server-level
   logs at this binary version.

3. `metrics.listen` defaults to ":9090" — which is the SAME port as
   our default `tunnel_port: 9090`.  Result: every operator who runs
   `hermes egress setup` followed by `hermes egress start` gets
   "bind: address already in use" because the proxy listener and the
   metrics listener fight for port 9090.  We now explicitly pin
   `metrics.listen: 127.0.0.1:0` to give it an ephemeral loopback
   port that can never collide with tunnel_port regardless of what
   operator sets.

Plus a fourth bug — pre-existing but surfaced by the egress live
test — that affects every Hermes subcommand:

4. `hermes_cli/main.py` calls `args.func(args)` at the bottom of
   main() but discards the return value.  Every subcommand handler
   that returns a non-zero exit code (cmd_start refusing because
   `fail_on_uncovered_providers=true`, cmd_setup refusing because
   --from-bitwarden but BWS unreachable, etc.) was silently exiting 0.
   Fix: capture the handler's return value and `sys.exit(rc)` when
   it's a non-zero int.  Other subcommands' contracts unchanged
   because they either return 0/None or don't return at all.

Validation:
- 188/188 in test_iron_proxy.py + test_iron_proxy_cli.py +
  test_config.py pass post-fix.
- 5333/5337 in tests/hermes_cli/ pass; the 4 unrelated failures
  (test_managed_installs.py + test_update_hangup_protection.py)
  are pre-existing on main, not touched by this PR.
- Manual wizard run end-to-end with the v0.39.0 binary in an
  isolated HERMES_HOME:
    * `egress install` — downloads + SHA-256 verifies + extracts
    * `egress setup` — generates CA, mints tokens, writes
      proxy.yaml that the binary now accepts (no http_listens,
      no audit_path, metrics pinned to 127.0.0.1:0)
    * `egress start` — daemon binds 127.0.0.1:9090, listens=yes
    * `egress status` — shows pid + listening + mappings
    * `egress stop` — clean shutdown, pidfile + nonce removed
    * Idempotent re-start returns the running pid without spawning
    * curl through the proxy with the openrouter token gets
      forwarded; an attacker host gets HTTP 403 (allowlist works);
      169.254.169.254 gets HTTP 403 (deny CIDR works)
    * Refuse-start paths exit 1 with actionable messages:
      - `fail_on_uncovered_providers=true` + ANTHROPIC_API_KEY set
      - `credential_source=bitwarden` + BWS_ACCESS_TOKEN unset
    * `--rotate-tokens` confirmation gate fires via pty:
      typing 'cancel' aborts; typing 'rotate' proceeds and
      creates a mappings.json.rotated-<timestamp> backup

Test updates:
- `test_default_bind_is_loopback_not_zero_zero` — asserts the
  singular `http_listen` is loopback AND asserts `http_listens`
  (plural) is NOT in the rendered yaml.
- `test_default_bind_uses_loopback_on_linux` — replaces
  `test_default_bind_includes_docker_bridge_on_linux`.  v0.39
  only supports one bind per daemon process, so the docker bridge
  augmentation is dropped from the rendered config; sandboxes
  reach the daemon via host.docker.internal -> host-gateway
  mapping, so loopback-only is functional.
- `test_metrics_listener_pinned_to_loopback_ephemeral` — new
  regression test asserting `metrics.listen == "127.0.0.1:0"`.
- `test_audit_log_kwarg_does_not_inject_audit_path_v039` —
  replaces `test_audit_log_path_lands_in_yaml`.  audit_log kwarg
  is still accepted for forward compatibility but does NOT emit
  log.audit_path until upstream supports it.
2026-05-28 23:45:44 -07:00
teknium1
ec108c625e Merge origin/main into feat/iron-proxy
Single content conflict in hermes_cli/config.py — kept BOTH the
paste_collapse_threshold knobs from main and the proxy section from
this branch (they're independent additions to DEFAULT_CONFIG).

All 187 tests in test_iron_proxy.py + test_iron_proxy_cli.py +
test_config.py pass post-merge.
2026-05-25 18:37:06 -07:00
teknium1
906b1da57f docs(egress): comprehensive expansion — setup, config, troubleshooting,
internals reference

Pre-v3 the egress docs were 175 lines covering the basics: quick start,
slash commands, security model, failure modes.  After three rounds of
PR review we added a half-dozen new config knobs, two new flags, a
strict/warn tier split for uncovered providers, persisted-nonce
cross-process defense, audit-log + log-file separation, NODE_OPTIONS
append-merge, docker_env collision detection, etc. — none of which
the user-facing doc reflected.

This commit closes that gap end-to-end:

website/docs/user-guide/egress/iron-proxy.md (175 → 567 lines)
- Configuration section expanded with every new knob:
  fail_on_uncovered_providers, allow_env_fallback, upstream_deny_cidrs.
- Tables for default allowed hosts + default deny CIDRs.
- Bind policy section (loopback + docker bridge, NOT 0.0.0.0) with the
  operator-facing "why can't I hit the proxy from my LAN" answer.
- Uncovered providers section with the strict tier (Anthropic / Azure
  / Gemini — block when fail_on_uncovered_providers=true) vs warn tier
  (AWS, GCP appdefault — present on every dev laptop, never block).
- Bitwarden integration expanded: rotation semantics, fail-loud at
  start, the allow_env_fallback escape hatch, --no-bitwarden flag, the
  preserve-existing-source rule on plain re-setup.
- Slash commands section with --no-bitwarden, --rotate-tokens, and the
  token-rotation operator playbook (confirmation gate, backup file
  naming, restart-required caveat).
- State directory layout table covering all 9 files we create + their
  modes.
- Audit log vs daemon log distinction (the arshkumarsingh #2 fix that
  motivated the corrected diagram).
- CA distribution into the sandbox: full table of injected env vars,
  the Python/curl REPLACE vs Node ADD asymmetry caveat with the
  NODE_OPTIONS=--use-openssl-ca mitigation.
- docker_env collision detection: what gets blocked, what gets warned,
  the migration escape hatch.
- PID + nonce defense section explaining how iron-proxy.nonce works
  cross-CLI and the SIGKILL-suppress-on-recycle path.
- Security model expanded with the new defenses
  (IPv4-mapped-v6 IMDS bypass closure, env-var leakage prevention,
  LAN-peer-with-token-leak coverage).
- Failure modes extended for every new refuse-start path.
- Troubleshooting section (180 new lines) with grep-friendly error
  matchers for each common failure: BWS token missing, uncovered
  provider refused, port collision, slow bind, 403 from proxy, SSL
  verification errors inside the sandbox, 401 from upstreams, address-
  in-use orphan recovery, per-request audit log inspection.

website/docs/getting-started/quickstart.md
- One-paragraph mention of the egress proxy under "Sandboxed terminal"
  so operators discover the feature when they enable Docker isolation.

website/docs/reference/cli-commands.md
- Top-level command table now lists `hermes egress` alongside `hermes
  proxy` (different purpose, different direction — call it out).
- New `## hermes egress` section with full subcommand syntax, common
  flows (first-time setup, switching credential source, rotating
  tokens, adding upstream), and diagnostic shortcuts.

website/docs/reference/environment-variables.md
- New "Egress proxy (sandbox-injected)" section documenting every env
  var the Docker backend injects: HERMES_EGRESS_PROXY,
  HERMES_PROXY_TOKEN_<NAME>, HTTPS_PROXY/HTTP_PROXY/NO_PROXY,
  REQUESTS_CA_BUNDLE/SSL_CERT_FILE/CURL_CA_BUNDLE/NODE_EXTRA_CA_CERTS,
  NODE_OPTIONS append-merge, HERMES_IRON_PROXY_NONCE.
- Also fixes a stale layout issue with the Persistent Shell table that
  had two trailing rows getting orphaned in the v3 commit.

website/docs/developer-guide/egress-internals.md (NEW, 363 lines)
- Module layout map (which file owns what).
- Full lifecycle walkthrough for install / setup / start / stop with
  the actual function calls in order.
- "Security invariants" section enumerating every load-bearing property
  with the regression test name that guards it.  These are the rules
  contributors must preserve when touching the module:
  - filesystem perms (0o700 dir, 0o600 secrets, O_NOFOLLOW everywhere)
  - subprocess env minimisation (no os.environ.copy)
  - bind policy (loopback + docker bridge, never 0.0.0.0)
  - default deny CIDR coverage
  - audit log fail-loud
  - bitwarden fail-loud
  - docker_env collision detection
  - PID recycling defense
  - token preservation on re-setup
  - credential_source preservation
- Extension points: adding a bearer-token provider, adding a
  non-bearer provider, wiring iron-proxy into a non-Docker backend,
  subscribing to per-request audit events.
- Testing recipe (hermetic + E2E + CLI smoke).

website/sidebars.ts
- New `developer-guide/egress-internals` entry under Developer Guide
  → Internals (alongside acp-internals, cron-internals,
  trajectory-format).

Build verification
- `cd website && npm install && npx docusaurus build` succeeds locally.
- All three new pages render to static HTML in all three locales
  (en + zh-Hans + ko).
- No new broken links or broken anchors introduced (pre-existing
  warnings on translation stubs are unrelated).
2026-05-25 15:05:16 -07:00
teknium1
fa4e87b253 fix(egress): v3 round — GodsBoy/stephenschoettler/arshkumarsingh findings
GodsBoy 2nd-round P1 (all 4 addressed):
- _detect_docker_bridge_ip: replace `ip.count('.') == 3` heuristic with
  ipaddress.IPv4Address validation + reject unspecified/loopback/multicast/
  reserved/link-local/global addresses.  Hostile `ip` shim on PATH used to
  be able to inject 0.0.0.0 here and re-open INADDR_ANY binding.
- cmd_setup credential_source preservation: re-running `hermes egress
  setup` without --from-bitwarden no longer silently downgrades a previous
  bitwarden config back to env.  Require --no-bitwarden to switch
  explicitly; otherwise preserve the existing mode and surface the
  decision.
- fail_on_uncovered_providers docstring/default mismatch: docstring used
  to claim default=True; behavior was default=False.  Resolved by
  truth-in-advertising — docstring now correctly states default=False —
  AND splitting providers into a strict LLM-specific tier
  (_LLM_SPECIFIC_NON_BEARER_PROVIDERS, used by start blocking) and a
  generic uncovered tier (used by wizard warnings).  Generic cloud creds
  (AWS_*, GOOGLE_APPLICATION_CREDENTIALS) no longer trip refuse-start
  for operators using terraform/gcloud alongside Hermes.  New
  discover_blocked_providers() returns the strict subset.
- start_proxy poll-loop must verify listening before pidfile:
  previously fell through deadline-expired as success and wrote a
  pidfile for a non-listening daemon.  Refactored into a do-while
  shape, require `listening=True` for success, kill the child + unlink
  the pidfile on failure paths.

GodsBoy 2nd-round P2 (the worth-keeping subset):
- O_NOFOLLOW + 0o600 + st_uid check on iron-proxy.log open (symmetric
  with the pidfile and audit-log paths the same PR hardens).
- pidfile O_EXCL: refactored pidfile-write into _write_pidfile_safely
  which uses O_EXCL to detect concurrent starts.  EEXIST with a live
  pid means "another start in progress" — refuse with actionable
  message; EEXIST with a dead pid means "stale crash" — unlink and
  retry once.  Discriminates rather than racing.
- _VERSION_CACHE: invalidate on install_iron_proxy success;
  don't cache empty stdout (would poison `hermes egress status` for
  the lifetime of the process if first probe hit a corrupt binary).
- ensure_audit_log now RAISES on OSError instead of swallowing it as
  a warning.  Previous behavior let the daemon create the file under
  the default umask, exactly the world-readable scenario the helper
  was built to prevent.  cmd_setup catches the new RuntimeError and
  surfaces "✗" with the actionable message.
- SIGINT/SIGTERM handler scoped around the start_proxy poll loop:
  Ctrl-C while waiting for `hermes egress start` no longer leaks an
  orphan daemon with the port bound.  Handler kills the child +
  unlinks the pidfile before re-raising.
- pidfile written IMMEDIATELY after Popen, BEFORE the listening
  verification.  Parent dying during the poll loop now leaves a
  pidfile pointing at the orphan so the next `hermes egress stop` can
  clean up.  Failure paths in the poll loop explicitly unlink.
- _DEFAULT_UPSTREAM_DENY_CIDRS: add ::ffff:0:0/96 (IPv4-mapped IPv6 —
  closes the v6-resolved IMDS bypass), 100.64.0.0/10 (CGNAT / cloud
  overlays / K8s pod networks), 198.18.0.0/15 (RFC2544 benchmark).
- _NON_BEARER_PROVIDERS split into LLM-specific (Anthropic / Azure /
  Gemini — block when strict) vs generic-cloud (AWS_*, GCP appdefault
  — warn-only).
- docker.py except narrowing: load_config can raise yaml.YAMLError on
  a malformed config.yaml, not just ImportError.  Two callsites
  (collision check + precedence resolution) now catch yaml.YAMLError
  via a sentinel `import yaml` and fail-safe to enforced mode.

GodsBoy 2nd-round P3:
- _reset_for_tests: was a no-op claiming symmetry with bitwarden;
  now actually clears _VERSION_CACHE and _proxy_nonce so in-process
  callers (notebooks, pytest -p no:xdist) don't see state leakage.
- tests/test_iron_proxy_cli.py: replaced hardcoded Path("/tmp/...")
  with hermes_home/-derived fixtures.  Matches the same cleanup we
  did for test_iron_proxy.py in the previous round.
- --rotate-tokens confirmation gate: when there are existing tokens,
  prompt for "rotate" confirmation (skipped when stdin isn't a tty
  so CI/scripted use still works) AND back up the mappings to a
  timestamped sibling before overwriting.  Surface a no-op note when
  rotate is requested with no existing tokens.

stephenschoettler (runtime-boundary review):
- #1 BWS silent degrade at proxy start: when credential_source=bitwarden
  but the BWS access token or project_id is missing OR the fetch
  returns no values for mapped providers, raise instead of silently
  falling back to host env.  cmd_start also pre-checks at the wizard
  layer for actionable error messages.  Opt-in escape hatch via new
  `proxy.allow_env_fallback: true` config for migration scenarios.
- #2 docker_env collision detection extended: `docker_env:
  {OPENROUTER_API_KEY: sk-real}` in config.yaml with enforce_on_docker:
  true now raises just like an HTTPS_PROXY collision would.  The
  collision check pulls mapped provider names from load_mappings() at
  call time.
- #3 PID nonce persisted to disk: cross-CLI-invocation stale-pidfile
  defense now works.  start_proxy writes the nonce next to the pidfile
  (sibling 0o600), stop_proxy reads it back via _read_persisted_nonce()
  and uses it as a _pid_alive signal in the new process.  Falls back
  to argv0 basename matching when the file is missing (legacy install).

arshkumarsingh:
- #1 NODE_OPTIONS append-merge: egress dict no longer sets NODE_OPTIONS
  directly (would clobber the operator's --max-old-space-size etc.).
  Carry the egress flag in a sentinel key
  _HERMES_EGRESS_NODE_OPTIONS_APPEND; DockerEnvironment merges into the
  existing NODE_OPTIONS in env_args computation with de-duplication.
- #2 docs: structured per-request audit log is at audit.log, not
  iron-proxy.log (the latter is daemon stdout/stderr).  Diagram and
  step-7 text corrected; both file roles are now documented separately.

Tests
- Added 12 new tests in test_iron_proxy.py covering bridge-IP rejection
  (parametrized over 8 dangerous inputs), default deny-list adjacency
  (IPv4-mapped-v6 + CGNAT), blocked-providers strict-subset property,
  _pid_proc_starttime parser with paren-containing comm,
  stop_proxy SIGKILL suppression on starttime drift, _reset_for_tests
  clear behavior, iron_proxy_version don't-cache-empty, NODE_OPTIONS
  sentinel verification, ensure_audit_log raise-on-OSError, and
  persisted-nonce roundtrip.
- Added 1 new test in test_iron_proxy_cli.py covering cmd_start
  BWS-token-missing fail-loud.
- All 100 tests in test_iron_proxy + test_iron_proxy_cli pass; all 78
  tests in test_docker_environment + test_config still pass.

Acknowledged but not addressed:
- GodsBoy P3 dead-code `extra_env` kwarg: kept (removing is a breaking
  change for any out-of-tree caller; the kwarg is documented and works).
- Residual risks GodsBoy called out: iron-proxy in-memory secret
  zeroisation (Go-binary territory, out of scope); _PROXY_SUBPROCESS_ENV
  _ALLOWLIST cosmetic gaps (RUST_LOG, GOMAXPROCS); follow-up.
2026-05-24 04:22:53 -07:00
teknium1
4833acf046 fix(egress): silence CodeQL clear-text-logging on bws warning strings
The bws helper's warnings list contains non-secret status messages
('rate limited', 'project not found', etc.), but CodeQL's taint
analyzer can't distinguish those from the secrets dict returned by
the same call.  Log the count instead of the strings — the warnings
are still observable via 'hermes secrets bitwarden status'.
2026-05-23 23:13:03 -07:00
teknium1
128a6837b7 fix(egress): address PR review findings — P0/P1/P2/P3 + CI greens
P0 — must-fix
- iron_proxy: emit default upstream_deny_cidrs (loopback, IMDS
  169.254.0.0/16, RFC1918) when caller passes None.  Honours the docs
  promise that cloud-metadata IPs are refused regardless of allowlist.
- iron_proxy: bind 127.0.0.1 (+ docker0 bridge IP on Linux) instead of
  INADDR_ANY (':9090').  LAN peers with a leaked sandbox token could
  otherwise spend the operator's API quota against any allowlisted
  upstream.
- ensure_ca_cert: write the CA private key via os.open(..., 0o600)
  instead of shutil.copy2+os.chmod — closes the TOCTOU window where
  the key existed under the default umask.
- discover_uncovered_providers + proxy.fail_on_uncovered_providers
  config: refuse to start (when strict) if env vars for non-bearer
  providers (Anthropic native x-api-key, AWS SigV4, Azure OpenAI,
  etc.) are present.  Surfaces a wizard warning in non-strict mode.

P1 — should-fix
- start_proxy: build a minimal subprocess env (PATH/HOME/locale +
  only the env names referenced by mappings) instead of os.environ
  .copy().  Strips proxy-recursion vars (HTTPS_PROXY etc.).  Stops
  the proxy's /proc/<pid>/environ from leaking every host secret
  to same-uid local processes.
- start_proxy: optional Bitwarden refresh path
  (refresh_secrets_from_bitwarden=True, bitwarden_config=...).
  When credential_source=bitwarden, cmd_start wires it in — that's
  what delivers the rotation guarantee the docs make.
- build_proxy_config: wire audit_log into the rendered yaml
  (log.audit_path).  Parameter was accepted but never used.
- ensure_audit_log: pre-create the audit log with 0o600 perms so
  iron-proxy inherits tight permissions instead of relying on umask.
- Rename 'hermes proxy ...' → 'hermes egress ...' in user-facing
  strings (docstring, RuntimeError messages, post-setup banner).
- start_proxy: open log file with 0o600 perms and close the parent
  fd immediately after Popen — fixes the per-restart fd leak.
- DockerEnvironment: detect collisions between docker_env and the
  egress-controlling env vars (HTTPS_PROXY, SSL_CERT_FILE, etc.).
  When enforce_on_docker=true, fail loud rather than silently
  inverting the isolation; when false, warn and let docker_env win.
- proxy_cli: merge_mappings preserves existing tokens on re-setup;
  --rotate-tokens flag re-mints all of them.  Stops re-running
  `hermes egress setup` from invalidating tokens baked into
  already-running sandboxes.
- proxy_cli: --from-bitwarden fail-loud on disabled BW config,
  missing access token, or empty vault.  Previously fell through to
  the env path while still writing credential_source: bitwarden.
- docker.py: narrow `except Exception` → `except ImportError`;
  iron_proxy._read_tunnel_port_from_config: same.  Bare excepts
  were masking real config-load bugs.
- start_proxy: write pidfile via os.open with O_NOFOLLOW + 0o600
  + st_uid check.  Refuses to follow a pre-existing symlink at the
  pidfile path.
- mint_proxy_token docstring: document the 128-bit suffix entropy
  explicitly (sha256 truncated to 32 hex chars).

P2 — follow-up
- start_proxy: poll-with-timeout (100ms cadence on _port_listening)
  instead of an unconditional 5s sleep.  Saves several seconds per
  Docker container create when enforce_on_docker=true.
- docker.py: apply enforce_on_docker semantics when CA file vanishes
  between status.configured check and CA mount.  Previously returned
  empty args silently.
- docker.py: refuse to mount when mappings.json is empty/corrupt
  (was indistinguishable from upstream outage from inside the
  sandbox).
- install_iron_proxy: tarfile.extract(..., filter='data') to silence
  the PEP 706 deprecation and opt into the 3.14+ default.
- _proxy_state_dir: chmod 0o700 unconditionally; add
  _proxy_state_dir_ro() so read-only callers don't create the dir.
- stop_proxy: re-verify pid before SIGKILL via /proc/<pid>/stat
  starttime AND _pid_alive.  Prevents SIGKILL'ing a recycled pid.
- _pid_alive: tightened cmdline check — basename match on argv[0]
  plus an in-process nonce env var ('iron-proxy' in cmdline matched
  'tail iron-proxy.log' and editors with the log open).
- docker.py: NODE_OPTIONS=--use-openssl-ca so Node.js routes through
  the OpenSSL CA store SSL_CERT_FILE controls, narrowing the
  Python/curl-replace vs Node-add asymmetry waefrebeorn flagged.

P3 — polish
- proxy_cli: dest='egress_command' (was 'proxy_command' which
  collided lexically with the inbound OAuth subparser).
- iron_proxy_version: cache by binary path — get_status is called
  per Docker container create, version is constant per binary.
- Drop unused `import sys` from iron_proxy.
- proxy_cli: `is not None` check on --tunnel-port (was treating 0
  as falsy and silently substituting the default).
- proxy_cli cmd_disable: use get_status().pid instead of reaching
  into ip._read_pid() (stale pidfile from a crashed run would have
  fired a spurious "still running" warning).
- Tests: replace hardcoded /tmp/ca.* paths with tmp_path-derived
  fixtures so tests are hermetic across hosts.

CI
- Windows footguns scanner: os.kill(pid, 0) is now gated behind
  platform.system() != 'Windows' with a windows-footgun: ok marker;
  signal.SIGKILL falls back to SIGTERM on Windows via
  getattr(signal, 'SIGKILL', signal.SIGTERM).
- docs MDX compilation: replace bare `<https://…>` URLs with
  `[text](url)` syntax (MDX-jsx parser rejects the angle-bracket
  form).

Tests
- 32 new tests covering default deny CIDRs, bind policy, audit log
  wiring, subprocess env minimization, CA TOCTOU 0o600, state dir
  0o700, empty-mappings refusal, CA-vanished refusal, docker_env
  collision detection, token preservation/rotate, uncovered provider
  detection, and the proxy_cli command handlers + argparse wiring.
- All 156 tests in test_iron_proxy + test_iron_proxy_cli +
  test_docker_environment + test_config pass locally.

Acknowledged but not addressed in this revision
- E2E test for HTTPS CONNECT + TLS-MITM path: existing E2E exercises
  plain HTTP; full MITM coverage needs separate CI infra (real iron-
  proxy binary + curl with custom CA).  Tracked as follow-up.
- Cosign-style supply-chain verification for the binary checksum:
  upstream iron-proxy doesn't sign releases yet.  Accepted pattern
  (same as Bitwarden integration); tracked as follow-up.
- CA rotation CLI (`hermes egress rotate-ca`): scope-cut to a
  follow-up.

Reviewers: @annguyenNous @waefrebeorn @GodsBoy @erhnysr
2026-05-23 20:38:27 -07:00
Teknium
7a74492134 chore(infographic): add iron-proxy-egress bento-grid bold-graphic 2026-05-23 20:38:27 -07:00
Teknium
69ffb9cfd4 feat(egress): iron-proxy credential-injection firewall for sandboxes
Adds a TLS-intercepting egress proxy for remote terminal sandboxes (Docker
v1; Modal/SSH to follow).  When enabled, the sandbox holds opaque proxy
tokens; iron-proxy swaps them for real provider API keys at the egress
boundary.  Compromising the sandbox leaks tokens that only work from behind
the proxy.

Wraps ironsh/iron-proxy (Apache-2.0, Go binary).  Same lazy-install pattern
as the recently merged Bitwarden Secrets Manager integration — pinned
version, SHA-256 verified download into ~/.hermes/bin/iron-proxy, no apt
or sudo required.

Disabled by default.  Run `hermes egress setup` to mint tokens and
`hermes egress start` to launch.  The Docker backend then automatically
mounts the CA, sets HTTPS_PROXY + CA-bundle env vars, and adds the
host-gateway hostmap.

New surfaces:
  hermes egress install   — download the pinned iron-proxy binary
  hermes egress setup     — interactive wizard (supports --from-bitwarden)
  hermes egress start     — spawn the managed proxy daemon
  hermes egress stop      — SIGTERM (+SIGKILL after 5s grace)
  hermes egress status    — binary + config + pid + listening + mappings
  hermes egress disable   — flip proxy.enabled = false
  hermes egress config    — print the path to the generated proxy.yaml

Optional Bitwarden integration: `--from-bitwarden` sources the real
upstream credentials from a BSM project at proxy startup, so rotating a
key in the Bitwarden web app propagates to sandboxes on the next proxy
start without touching .env.

Hermes-side scope (v1):
  agent/proxy_sources/iron_proxy.py   — install + CA + config + lifecycle
  hermes_cli/proxy_cli.py             — `hermes egress` subcommand tree
  hermes_cli/config.py                — "proxy:" section in DEFAULT_CONFIG
  hermes_cli/main.py                  — argparse wiring (uses 'egress'
                                         because 'proxy' is the existing
                                         inbound OAuth reverse proxy)
  tools/environments/docker.py        — CA mount, HTTPS_PROXY, CA-bundle
                                         env vars, --add-host wiring

Hermetic tests cover the full lifecycle: token mint, mapping discovery,
config + mappings I/O, install pipeline (HTTP + tar + checksum all mocked),
subprocess lifecycle (Popen mocked), Docker backend arg builder.

A live E2E test (gated on HERMES_RUN_E2E=1) downloads the real iron-proxy
binary, spawns it, routes a curl request through it against a local fake
upstream, and verifies the Authorization header was swapped from the proxy
token to the real secret value (and the proxy token did NOT leak through
to upstream).

Failures (binary missing, port collision, bad token) never block agent
startup — they emit a warning and continue.  The Docker backend refuses to
start a sandbox when proxy.enabled=true but the daemon is dead, unless
proxy.enforce_on_docker is explicitly set to false.

Docs: website/docs/user-guide/egress/{index,iron-proxy}.md
Tests: tests/test_iron_proxy.py (35), tests/test_iron_proxy_e2e.py (1)
2026-05-23 20:38:27 -07:00
727 changed files with 20175 additions and 60755 deletions

View File

@@ -102,3 +102,6 @@ acp_registry/
.gitattributes
.hadolint.yaml
.mailmap
# Top-level LICENSE (not matched by *.md); not needed inside the container
LICENSE

Binary file not shown.

Before

Width:  |  Height:  |  Size: 138 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 148 KiB

View File

@@ -1,11 +1,12 @@
name: Contributor Attribution Check
on:
pull_request:
branches: [main]
# No paths filter — the job must always run so the required check
# reports a status (path-gated workflows leave checks "pending" forever
# when no matching files change, which blocks merge).
pull_request:
branches: [main]
permissions:
contents: read

View File

@@ -18,12 +18,13 @@ on:
- docker/**
- .hadolint.yaml
- .github/workflows/docker-lint.yml
# No paths filter — the job must always run so the required check
# reports a status (path-gated workflows leave checks "pending" forever
# when no matching files change, which blocks merge).
pull_request:
branches: [main]
paths:
- Dockerfile
- docker/**
- .hadolint.yaml
- .github/workflows/docker-lint.yml
permissions:
contents: read

View File

@@ -11,13 +11,16 @@ on:
- 'docker/**'
- '.github/workflows/docker-publish.yml'
- '.github/actions/hermes-smoke-test/**'
# No paths filter — the job must always run so the required check
# reports a status (path-gated workflows leave checks "pending" forever
# when no matching files change, which blocks merge).
pull_request:
branches: [main]
paths:
- '**/*.py'
- 'pyproject.toml'
- 'uv.lock'
- 'Dockerfile'
- 'docker/**'
- '.github/workflows/docker-publish.yml'
- '.github/actions/hermes-smoke-test/**'
release:
types: [published]

View File

@@ -1,12 +1,10 @@
name: Docs Site Checks
on:
# No paths filter — the job must always run so the required check
# reports a status (path-gated workflows leave checks "pending" forever
# when no matching files change, which blocks merge).
pull_request:
branches: [main]
paths:
- 'website/**'
- '.github/workflows/docs-site-checks.yml'
workflow_dispatch:
permissions:
@@ -16,9 +14,9 @@ jobs:
docs-site-checks:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4
- uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4
with:
node-version: 22
cache: npm
@@ -28,9 +26,9 @@ jobs:
run: npm ci
working-directory: website
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: "3.11"
python-version: '3.11'
- name: Install ascii-guard
run: python -m pip install ascii-guard==2.3.0 pyyaml==6.0.3

View File

@@ -14,9 +14,6 @@ name: History Check
# the PR head and main to be non-empty.
on:
# No paths filter — the job must always run so the required check
# reports a status (path-gated workflows leave checks "pending" forever
# when no matching files change, which blocks merge).
pull_request:
branches: [main]
@@ -27,9 +24,9 @@ jobs:
check-common-ancestor:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 0 # full history both sides for merge-base
fetch-depth: 0 # full history both sides for merge-base
- name: Reject PRs with no common ancestor on main
run: |

View File

@@ -15,12 +15,12 @@ on:
- "**/*.md"
- "docs/**"
- "website/**"
# No paths filter — the job must always run so the required check
# reports a status (path-gated workflows leave checks "pending" forever
# when no matching files change, which blocks merge).
pull_request:
branches: [main]
paths-ignore:
- "**/*.md"
- "docs/**"
- "website/**"
permissions:
contents: read
@@ -154,6 +154,7 @@ jobs:
});
}
ruff-blocking:
# Enforce the rules in pyproject.toml [tool.ruff.lint.select]. Currently
# PLW1514 (unspecified-encoding) — catches bare ``open()`` /

255
.github/workflows/nix-lockfile-fix.yml vendored Normal file
View File

@@ -0,0 +1,255 @@
name: Nix Lockfile Fix
on:
push:
branches: [main]
paths:
- 'package-lock.json'
- 'package.json'
- 'ui-tui/package.json'
- 'apps/desktop/package.json'
workflow_dispatch:
inputs:
pr_number:
description: 'PR number to fix (leave empty to run on the selected branch)'
required: false
type: string
issue_comment:
types: [edited]
permissions:
contents: write
pull-requests: write
concurrency:
group: nix-lockfile-fix-${{ github.event.issue.number || github.event.inputs.pr_number || github.ref }}
cancel-in-progress: false
jobs:
# ── Auto-fix on main ───────────────────────────────────────────────
# Fires when a push to main touches package.json or package-lock.json.
# Runs fix-lockfiles and pushes the hash update commit directly to main
# so Nix builds never stay broken.
#
# Safety invariants:
# 1. The fix commit only touches nix/*.nix files, which are NOT in
# the paths filter above, so this cannot re-trigger itself.
# 2. An explicit file-whitelist check before commit aborts if
# fix-lockfiles ever modifies unexpected files.
# 3. Job-level concurrency with cancel-in-progress: true ensures
# back-to-back pushes collapse to the newest; ref: main checkout
# always operates on the latest branch state.
# 4. Uses a GitHub App token (not GITHUB_TOKEN) so the fix commit
# triggers downstream nix.yml verification.
auto-fix-main:
if: github.event_name == 'push'
runs-on: ubuntu-latest
timeout-minutes: 25
concurrency:
group: auto-fix-main
cancel-in-progress: true
steps:
- name: Generate GitHub App token
id: app-token
uses: actions/create-github-app-token@7bfa3a4717ef143a604ee0a99d859b8886a96d00 # v1.9.3
with:
app-id: ${{ secrets.APP_ID }}
private-key: ${{ secrets.APP_PRIVATE_KEY }}
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: main
token: ${{ steps.app-token.outputs.token }}
- uses: ./.github/actions/nix-setup
with:
cachix-auth-token: ${{ secrets.CACHIX_AUTH_TOKEN }}
- name: Apply lockfile hashes
id: apply
run: nix run .#fix-lockfiles -- --apply
- name: Commit & push
if: steps.apply.outputs.changed == 'true'
shell: bash
run: |
set -euo pipefail
# Ensure only nix/lib.nix (home of the single npmDepsHash) was
# modified — prevents accidental self-triggering if fix-lockfiles
# ever touches package files.
unexpected="$(git diff --name-only | grep -Ev '^nix/lib\.nix$' || true)"
if [ -n "$unexpected" ]; then
echo "::error::Unexpected modified files: $unexpected"
exit 1
fi
# Record the base SHA before committing — used to detect package
# file changes if we need to rebase after a non-fast-forward push.
BASE_SHA="$(git rev-parse HEAD)"
git config user.name 'github-actions[bot]'
git config user.email '41898282+github-actions[bot]@users.noreply.github.com'
git add nix/lib.nix
git commit -m "fix(nix): auto-refresh npm lockfile hashes" \
-m "Source: $GITHUB_SHA" \
-m "Run: $GITHUB_SERVER_URL/$GITHUB_REPOSITORY/actions/runs/$GITHUB_RUN_ID"
# Retry push with rebase in case main advanced with an unrelated
# commit during the nix build. Without this, a non-fast-forward
# rejection silently loses the fix. If package files changed during
# the rebase, abort — a fresh auto-fix run will handle the new state.
for attempt in 1 2 3; do
if git push origin HEAD:main; then
exit 0
fi
echo "::warning::Push attempt $attempt failed (non-fast-forward?), rebasing…"
git fetch origin main
# If package files changed between our base and the new main,
# our computed hashes are stale. Abort and let the next triggered
# run recompute from the correct package-lock state.
pkg_changed="$(git diff --name-only "$BASE_SHA"..origin/main -- \
'package-lock.json' 'package.json' \
'ui-tui/package.json' 'apps/desktop/package.json' || true)"
if [ -n "$pkg_changed" ]; then
echo "::warning::Package files changed since hash computation — aborting; a fresh run will recompute"
exit 0
fi
git rebase origin/main
done
echo "::error::Failed to push after 3 rebase attempts"
exit 1
# ── PR fix (manual / checkbox) ─────────────────────────────────────
# Existing behavior: run on manual dispatch OR when a task-list
# checkbox in the sticky lockfile-check comment flips from [ ] to [x].
fix:
if: |
github.event_name == 'workflow_dispatch' ||
(github.event_name == 'issue_comment'
&& github.event.issue.pull_request != null
&& contains(github.event.comment.body, '[x] **Apply lockfile fix**')
&& !contains(github.event.changes.body.from, '[x] **Apply lockfile fix**'))
runs-on: ubuntu-latest
timeout-minutes: 25
steps:
- name: Authorize & resolve PR
id: resolve
uses: actions/github-script@60a0d83039c74a4aee543508d2ffcb1c3799cdea # v7.0.1
with:
script: |
// 1. Verify the actor has write access — applies to both checkbox
// clicks and manual dispatch.
const { data: perm } =
await github.rest.repos.getCollaboratorPermissionLevel({
owner: context.repo.owner,
repo: context.repo.repo,
username: context.actor,
});
if (!['admin', 'write', 'maintain'].includes(perm.permission)) {
core.setFailed(
`${context.actor} lacks write access (has: ${perm.permission})`
);
return;
}
// 2. Resolve which ref to check out.
let prNumber = '';
if (context.eventName === 'issue_comment') {
prNumber = String(context.payload.issue.number);
} else if (context.eventName === 'workflow_dispatch') {
prNumber = context.payload.inputs.pr_number || '';
}
if (!prNumber) {
core.setOutput('ref', context.ref.replace(/^refs\/heads\//, ''));
core.setOutput('repo', context.repo.repo);
core.setOutput('owner', context.repo.owner);
core.setOutput('pr', '');
return;
}
const { data: pr } = await github.rest.pulls.get({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: Number(prNumber),
});
core.setOutput('ref', pr.head.ref);
core.setOutput('repo', pr.head.repo.name);
core.setOutput('owner', pr.head.repo.owner.login);
core.setOutput('pr', String(pr.number));
# Wipe the sticky lockfile-check comment to a "running" state as soon
# as the job is authorized, so the user sees their click was picked up
# before the ~minute of nix build work.
- name: Mark sticky as running
if: steps.resolve.outputs.pr != ''
uses: marocchino/sticky-pull-request-comment@52423e01640425a022ef5fd42c6fb5f633a02728 # v2.9.1
with:
header: nix-lockfile-check
number: ${{ steps.resolve.outputs.pr }}
message: |
### 🔄 Applying lockfile fix…
Triggered by @${{ github.actor }} — [workflow run](${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}).
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
repository: ${{ steps.resolve.outputs.owner }}/${{ steps.resolve.outputs.repo }}
ref: ${{ steps.resolve.outputs.ref }}
token: ${{ secrets.GITHUB_TOKEN }}
fetch-depth: 0
- uses: ./.github/actions/nix-setup
with:
cachix-auth-token: ${{ secrets.CACHIX_AUTH_TOKEN }}
- name: Apply lockfile hashes
id: apply
run: nix run .#fix-lockfiles
- name: Commit & push
if: steps.apply.outputs.changed == 'true'
shell: bash
run: |
set -euo pipefail
git config user.name 'github-actions[bot]'
git config user.email '41898282+github-actions[bot]@users.noreply.github.com'
git add nix/lib.nix
git commit -m "fix(nix): refresh npm lockfile hashes"
git push
- name: Update sticky (applied)
if: steps.apply.outputs.changed == 'true' && steps.resolve.outputs.pr != ''
uses: marocchino/sticky-pull-request-comment@52423e01640425a022ef5fd42c6fb5f633a02728 # v2.9.1
with:
header: nix-lockfile-check
number: ${{ steps.resolve.outputs.pr }}
message: |
### ✅ Lockfile fix applied
Pushed a commit refreshing the npm lockfile hashes — [workflow run](${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}).
- name: Update sticky (already current)
if: steps.apply.outputs.changed == 'false' && steps.resolve.outputs.pr != ''
uses: marocchino/sticky-pull-request-comment@52423e01640425a022ef5fd42c6fb5f633a02728 # v2.9.1
with:
header: nix-lockfile-check
number: ${{ steps.resolve.outputs.pr }}
message: |
### ✅ Lockfile hashes already current
Nothing to commit — [workflow run](${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}).
- name: Update sticky (failed)
if: failure() && steps.resolve.outputs.pr != ''
uses: marocchino/sticky-pull-request-comment@52423e01640425a022ef5fd42c6fb5f633a02728 # v2.9.1
with:
header: nix-lockfile-check
number: ${{ steps.resolve.outputs.pr }}
message: |
### ❌ Lockfile fix failed
See the [workflow run](${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}) for logs.

105
.github/workflows/nix.yml vendored Normal file
View File

@@ -0,0 +1,105 @@
name: Nix
on:
push:
branches: [main]
pull_request:
permissions:
contents: read
pull-requests: write
concurrency:
group: nix-${{ github.ref }}
cancel-in-progress: true
jobs:
nix:
strategy:
matrix:
os: [ubuntu-latest, macos-latest]
runs-on: ${{ matrix.os }}
timeout-minutes: 30
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: ./.github/actions/nix-setup
with:
cachix-auth-token: ${{ secrets.CACHIX_AUTH_TOKEN }}
- name: Resolve head SHA
if: github.event_name == 'pull_request'
id: sha
shell: bash
run: |
FULL="${{ github.event.pull_request.head.sha || github.sha }}"
echo "full=$FULL" >> "$GITHUB_OUTPUT"
echo "short=${FULL:0:7}" >> "$GITHUB_OUTPUT"
- name: Check flake
id: flake
continue-on-error: true
run: nix flake check --print-build-logs
# When the flake check fails, run a targeted diagnostic to see if
# the failure is specifically a stale npm lockfile hash in one of the
# known npm subpackages (tui / web). This avoids surfacing a generic
# "build failed" message when the fix is a single known command.
- name: Diagnose npm lockfile hashes
id: hash_check
if: steps.flake.outcome == 'failure' && runner.os == 'Linux'
continue-on-error: true
env:
LINK_SHA: ${{ steps.sha.outputs.full }}
run: nix run .#fix-lockfiles -- --check
# If fix-lockfiles itself crashes (infrastructure blip, cache throttle,
# etc.) it won't set stale=true/false. Treat that as a distinct failure
# mode rather than silently ignoring it.
- name: Fail if hash check crashed without reporting
if: steps.hash_check.outcome == 'failure' && steps.hash_check.outputs.stale != 'true' && steps.hash_check.outputs.stale != 'false'
run: |
echo "::error::fix-lockfiles exited without reporting stale status — likely an infrastructure or script failure"
exit 1
- name: Post sticky PR comment (stale hashes)
if: steps.hash_check.outputs.stale == 'true' && github.event_name == 'pull_request'
uses: marocchino/sticky-pull-request-comment@52423e01640425a022ef5fd42c6fb5f633a02728 # v2.9.1
with:
header: nix-lockfile-check
message: |
### ⚠️ npm lockfile hash out of date
Checked against commit [`${{ steps.sha.outputs.short }}`](${{ github.server_url }}/${{ github.repository }}/commit/${{ steps.sha.outputs.full }}) (PR head at check time).
The `hash = "sha256-..."` line in these nix files no longer matches the committed `package-lock.json`:
${{ steps.hash_check.outputs.report }}
#### Apply the fix
- [ ] **Apply lockfile fix** — tick to push a commit with the correct hashes to this PR branch
- Or [run the Nix Lockfile Fix workflow](${{ github.server_url }}/${{ github.repository }}/actions/workflows/nix-lockfile-fix.yml) manually (pass PR `#${{ github.event.pull_request.number }}`)
- Or locally: `nix run .#fix-lockfiles` and commit the diff
# Clear the sticky comment when either the flake check passed outright (no
# hash check needed) or the hash check explicitly returned stale=false
# (check failed for a non-hash reason).
- name: Clear sticky PR comment (resolved)
if: |
github.event_name == 'pull_request' &&
(steps.hash_check.outputs.stale == 'false' ||
steps.flake.outcome == 'success')
uses: marocchino/sticky-pull-request-comment@52423e01640425a022ef5fd42c6fb5f633a02728 # v2.9.1
with:
header: nix-lockfile-check
delete: true
- name: Final fail if flake check failed
if: steps.flake.outcome == 'failure'
run: |
if [ "${{ steps.hash_check.outputs.stale }}" == "true" ]; then
echo "::error::Nix build failed due to stale npm lockfile hash. Run: nix run .#fix-lockfiles"
else
echo "::error::Nix flake check failed. See logs above."
fi
exit 1

View File

@@ -20,23 +20,29 @@ name: OSV-Scanner
# vulnerabilities in pinned deps that we may need to patch deliberately.
on:
# No paths filter — the job must always run so the required check
# reports a status (path-gated workflows leave checks "pending" forever
# when no matching files change, which blocks merge).
pull_request:
branches: [main]
paths:
- 'uv.lock'
- 'pyproject.toml'
- 'package.json'
- 'package-lock.json'
- 'ui-tui/package.json'
- 'website/package.json'
- 'website/package-lock.json'
- '.github/workflows/osv-scanner.yml'
push:
branches: [main]
paths:
- "uv.lock"
- "pyproject.toml"
- "package.json"
- "package-lock.json"
- "website/package-lock.json"
- 'uv.lock'
- 'pyproject.toml'
- 'package.json'
- 'package-lock.json'
- 'website/package-lock.json'
schedule:
# Weekly scan against main — catches CVEs published after merge for
# deps that haven't changed since.
- cron: "0 9 * * 1"
- cron: '0 9 * * 1'
workflow_dispatch:
permissions:
@@ -48,7 +54,7 @@ permissions:
jobs:
scan:
name: Scan lockfiles
uses: google/osv-scanner-action/.github/workflows/osv-scanner-reusable.yml@9a498708959aeaef5ef730655706c5a1df1edbc2 # v2.3.8
uses: google/osv-scanner-action/.github/workflows/osv-scanner-reusable.yml@9a498708959aeaef5ef730655706c5a1df1edbc2 # v2.3.8
with:
# Scan explicit lockfiles rather than recursing, so we only look at
# the three sources of truth and skip vendored / test / worktree dirs.

View File

@@ -1,11 +1,11 @@
name: Supply Chain Audit
on:
pull_request:
types: [opened, synchronize, reopened]
# No paths filter — the jobs must always run so required checks
# report a status (path-gated workflows leave checks "pending" forever
# when no matching files change, which blocks merge).
pull_request:
types: [opened, synchronize, reopened]
permissions:
pull-requests: write
@@ -32,7 +32,7 @@ jobs:
# True when the curated MCP catalog / bundled MCP manifests changed.
mcp_catalog: ${{ steps.filter.outputs.mcp_catalog }}
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 0
- name: Check for relevant file changes
@@ -72,7 +72,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 0
@@ -207,7 +207,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 0
@@ -286,7 +286,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 0

View File

@@ -6,11 +6,11 @@ on:
paths-ignore:
- "**/*.md"
- "docs/**"
# No paths filter — the job must always run so the required check
# reports a status (path-gated workflows leave checks "pending" forever
# when no matching files change, which blocks merge).
pull_request:
branches: [main]
paths-ignore:
- "**/*.md"
- "docs/**"
permissions:
contents: read
@@ -219,4 +219,4 @@ jobs:
env:
OPENROUTER_API_KEY: ""
OPENAI_API_KEY: ""
NOUS_API_KEY: ""
NOUS_API_KEY: ""

View File

@@ -4,9 +4,6 @@ name: Typecheck
on:
push:
branches: [main]
# No paths filter — the job must always run so the required check
# reports a status (path-gated workflows leave checks "pending" forever
# when no matching files change, which blocks merge).
pull_request:
branches: [main]
@@ -26,20 +23,3 @@ jobs:
cache: npm
- run: npm ci
- run: npm run --prefix ${{ matrix.package }} typecheck
# Production build of the desktop renderer. `typecheck` runs `tsc` only,
# which does NOT exercise Vite/Rolldown module resolution — so an
# unresolvable package export (e.g. a transitive @assistant-ui/tap that no
# longer exports "./react-shim") slips past typecheck and only explodes when
# users build apps/desktop from source on install/update. Run the real
# `vite build` here so that class of break fails in CI instead.
desktop-build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4
with:
node-version: 22
cache: npm
- run: npm ci
- run: npm run --prefix apps/desktop build

View File

@@ -47,15 +47,15 @@ on:
push:
branches: [main]
paths:
- "pyproject.toml"
- "uv.lock"
- ".github/workflows/uv-lockfile-check.yml"
# No paths filter — the job must always run so the required check
# reports a status (path-gated workflows leave checks "pending" forever
# when no matching files change, which blocks merge).
- 'pyproject.toml'
- 'uv.lock'
- '.github/workflows/uv-lockfile-check.yml'
pull_request:
branches: [main]
paths:
- 'pyproject.toml'
- 'uv.lock'
- '.github/workflows/uv-lockfile-check.yml'
permissions:
contents: read
@@ -71,10 +71,10 @@ jobs:
timeout-minutes: 5
steps:
- name: Checkout code
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Install uv
uses: astral-sh/setup-uv@d4b2f3b6ecc6e67c4457f6d3e41ec42d3d0fcb86 # v5
uses: astral-sh/setup-uv@d4b2f3b6ecc6e67c4457f6d3e41ec42d3d0fcb86 # v5
# `uv lock --check` re-resolves the project from pyproject.toml and
# compares the result to uv.lock, exiting non-zero if they disagree.

1
.gitignore vendored
View File

@@ -5,7 +5,6 @@
*.pyc*
__pycache__/
.venv/
.venv
.vscode/
.env
.env.local

View File

@@ -78,41 +78,7 @@ This isn't a quality bar — it's a coupling-and-maintenance decision. Memory pr
| **uv** | Fast Python package manager ([install](https://docs.astral.sh/uv/)) |
| **Node.js 20+** | Optional — needed for browser tools and WhatsApp bridge (matches root `package.json` engines) |
### Install with the standard installer
For most contributors, the best development bootstrap is the same path users
take: run the standard installer, then work inside the repository it cloned.
The installer creates the Hermes venv, wires the `hermes` command, stamps the
install method for `hermes update`, and clones the full git project into
`$HERMES_HOME/hermes-agent` (usually `~/.hermes/hermes-agent`). That keeps your
development environment on the same layout the CLI, updater, lazy dependency
installer, gateway, and docs assume.
```bash
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
cd "${HERMES_HOME:-$HOME/.hermes}/hermes-agent"
# Add dev/test extras on top of the standard install.
uv pip install -e ".[all,dev]"
# Optional: browser tools / docs site dependencies.
npm install
```
After that, create branches and run tests from that checkout:
```bash
git checkout -b fix/description
scripts/run_tests.sh
```
### Manual clone fallback
Use this only if you intentionally do not want Hermes' managed install layout
(for example, a throwaway clone inside a container or CI job). If you install
this way, make sure you run the `hermes` entrypoint from this venv; running the
system `python3 -m hermes_cli.main` can pick up unrelated system Python
packages.
### Clone and install
```bash
git clone https://github.com/NousResearch/hermes-agent.git
@@ -143,17 +109,13 @@ echo "OPENROUTER_API_KEY=***" >> ~/.hermes/.env
### Run
```bash
# The standard installer already put `hermes` on PATH.
hermes doctor
hermes chat -q "Hello"
```
If you used the manual clone fallback, run `./hermes` from the checkout or
symlink this clone's venv explicitly:
```bash
# Symlink for global access
mkdir -p ~/.local/bin
ln -sf "$(pwd)/venv/bin/hermes" ~/.local/bin/hermes
# Verify
hermes doctor
hermes chat -q "Hello"
```
### Run tests

View File

@@ -9,11 +9,8 @@ FROM ghcr.io/astral-sh/uv:0.11.6-python3.13-trixie@sha256:b3c543b6c4f23a5f2df228
FROM node:22-bookworm-slim@sha256:7af03b14a13c8cdd38e45058fd957bf00a72bbe17feac43b1c15a689c029c732 AS node_source
FROM debian:13.4
# Disable Python stdout buffering to ensure logs are printed immediately.
# Do not write .pyc files at runtime: /opt/hermes is immutable in the
# published container and writable state belongs under /opt/data.
# Disable Python stdout buffering to ensure logs are printed immediately
ENV PYTHONUNBUFFERED=1
ENV PYTHONDONTWRITEBYTECODE=1
# Store Playwright browsers outside the volume mount so the build-time
# install survives the /opt/data volume overlay at runtime.
@@ -189,40 +186,36 @@ RUN cd web && npm run build && \
# ---------- Source code ----------
# .dockerignore excludes node_modules, so the installs above survive.
# --link decouples this layer from parents for cache purposes; --chmod bakes
# the final read-only permissions at copy time so we skip the separate
# `chmod -R` pass that previously walked ~30k files across the venv +
# node_modules + source (21s amd64 / 222s arm64 — #49113). `a+rX,go-w`
# gives the non-root hermes user read + traverse but no write; root retains
# write so the build steps below don't need chmod u+w dances.
COPY --link --chmod=a+rX,go-w . .
COPY --chown=hermes:hermes . .
# ---------- Permissions ----------
# Link hermes-agent itself (editable). Deps are already installed in the
# cached layer above; `--no-deps` makes this a fast egg-link creation with no
# resolution or downloads.
RUN uv pip install --no-cache-dir --no-deps -e "."
# Wire the exec shim and install-method stamp. Files under /opt/hermes are
# already root-owned (COPY, uv sync, npm install all run as root) and
# read-only for the hermes user (go-w from the --chmod above), so the
# previous `chown -R` + `chmod -R` passes are gone.
# Make install dir world-readable so any HERMES_UID can read it at runtime.
# The venv needs to be traversable too.
# node_modules trees additionally need to be writable by the hermes user
# so the runtime `npm install` triggered by _tui_need_npm_install() in
# hermes_cli/main.py succeeds (see #18800). /opt/hermes/web is build-time
# only (HERMES_WEB_DIST points at hermes_cli/web_dist) and is intentionally
# not chowned here.
# /opt/hermes/gateway is runtime-writable: Python may create __pycache__ and
# gateway state artifacts beneath the package after services drop privileges,
# especially when the hermes UID is remapped at boot (#27221).
# The .venv MUST remain hermes-writable so lazy_deps.py can install
# remaining optional platform packages and future pin bumps at first use.
# Without this, `uv pip install` fails with EACCES and adapters silently
# fail to load. See tools/lazy_deps.py.
USER root
RUN mkdir -p /opt/hermes/bin && \
cp /opt/hermes/docker/hermes-exec-shim.sh /opt/hermes/bin/hermes && \
chmod 0755 /opt/hermes/bin/hermes && \
printf 'docker\n' > /opt/hermes/.install_method
# The ``.install_method`` stamp is baked next to the running code (the install
# tree), NOT into $HERMES_HOME. $HERMES_HOME (/opt/data) is a shared data
# volume that is commonly bind-mounted from the host and even shared with a
# host-side Desktop/CLI install; stamping it at boot used to clobber that
# host install's marker and wrongly block its ``hermes update``. A code-scoped
# stamp is read first by detect_install_method() and is immune to the share.
RUN chmod -R a+rX /opt/hermes && \
chown -R hermes:hermes /opt/hermes/.venv /opt/hermes/ui-tui /opt/hermes/gateway /opt/hermes/node_modules
# Start as root so the s6-overlay stage2 hook can usermod/groupmod and chown
# the data volume. Each supervised service then drops to the hermes user via
# `s6-setuidgid hermes` in its run script. If HERMES_UID is unset, services
# run as the default hermes user (UID 10000).
# ---------- Link hermes-agent itself (editable) ----------
# Deps are already installed in the cached layer above; `--no-deps` makes
# this a fast (~1s) egg-link creation with no resolution or downloads.
RUN uv pip install --no-cache-dir --no-deps -e "."
# ---------- Bake build-time git revision ----------
# .dockerignore excludes .git, so `git rev-parse HEAD` from inside the
# container always returns nothing — meaning `hermes dump` reports
@@ -242,7 +235,8 @@ RUN mkdir -p /opt/hermes/bin && \
# every published image has it.
ARG HERMES_GIT_SHA=
RUN if [ -n "${HERMES_GIT_SHA}" ]; then \
printf '%s\n' "${HERMES_GIT_SHA}" > /opt/hermes/.hermes_build_sha; \
printf '%s\n' "${HERMES_GIT_SHA}" > /opt/hermes/.hermes_build_sha && \
chown hermes:hermes /opt/hermes/.hermes_build_sha; \
fi
# ---------- s6-overlay service wiring ----------
@@ -288,8 +282,6 @@ ENV HERMES_WEB_DIST=/opt/hermes/hermes_cli/web_dist
# check. (A separate launcher hardening is tracked independently.)
ENV HERMES_TUI_DIR=/opt/hermes/ui-tui
ENV HERMES_HOME=/opt/data
ENV HERMES_WRITE_SAFE_ROOT=/opt/data
ENV HERMES_DISABLE_LAZY_INSTALLS=1
# `docker exec` privilege-drop shim. When operators run
# `docker exec <c> hermes ...` they default to root, and any file the
@@ -302,6 +294,7 @@ ENV HERMES_DISABLE_LAZY_INSTALLS=1
# Recursion is impossible because the shim exec's the venv binary by
# absolute path (/opt/hermes/.venv/bin/hermes). See the shim source for
# the opt-out env var (HERMES_DOCKER_EXEC_AS_ROOT=1).
COPY --chmod=0755 docker/hermes-exec-shim.sh /opt/hermes/bin/hermes
# Pre-s6 entrypoint.sh did `source .venv/bin/activate` which exported
# the venv bin onto PATH; Architecture B's main-wrapper.sh does the

View File

@@ -181,20 +181,16 @@ See `hermes claw migrate --help` for all options, or use the `openclaw-migration
We welcome contributions! See the [Contributing Guide](https://hermes-agent.nousresearch.com/docs/developer-guide/contributing) for development setup, code style, and PR process.
Quick start for contributors — use the standard installer, then work from the
full git checkout it creates at `$HERMES_HOME/hermes-agent` (usually
`~/.hermes/hermes-agent`). This matches the layout used by `hermes update`, the
managed venv, lazy dependencies, gateway, and docs tooling.
Quick start for contributors — clone and go with `setup-hermes.sh`:
```bash
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
cd "${HERMES_HOME:-$HOME/.hermes}/hermes-agent"
uv pip install -e ".[all,dev]"
scripts/run_tests.sh
git clone https://github.com/NousResearch/hermes-agent.git
cd hermes-agent
./setup-hermes.sh # installs uv, creates venv, installs .[all], symlinks ~/.local/bin/hermes
./hermes # auto-detects the venv, no need to `source` first
```
Manual clone fallback (for throwaway clones/CI where you intentionally do not
want the managed install layout):
Manual path (equivalent to the above):
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh

View File

@@ -164,18 +164,16 @@ hermes claw migrate --overwrite # 覆盖已有冲突
欢迎贡献!请参阅 [贡献指南](https://hermes-agent.nousresearch.com/docs/developer-guide/contributing) 了解开发设置、代码风格和 PR 流程。
贡献者快速开始——使用标准安装器,然后在它创建的完整 git checkout 中开发
`$HERMES_HOME/hermes-agent`(通常是 `~/.hermes/hermes-agent`)。这会匹配
`hermes update`、托管 venv、lazy dependencies、gateway 和 docs tooling 使用的布局。
贡献者快速开始——克隆并使用 `setup-hermes.sh`
```bash
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
cd "${HERMES_HOME:-$HOME/.hermes}/hermes-agent"
uv pip install -e ".[all,dev]"
scripts/run_tests.sh
git clone https://github.com/NousResearch/hermes-agent.git
cd hermes-agent
./setup-hermes.sh # 安装 uv、创建 venv、安装 .[all]、创建符号链接 ~/.local/bin/hermes
./hermes # 自动检测 venv无需先 source
```
手动克隆备用路径(用于一次性 clone / CI或你明确不想使用 managed install layout 时
手动安装(等效于上述命令
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh

View File

@@ -27,7 +27,7 @@ import threading
import time
import uuid
from datetime import datetime
from typing import Any, Callable, Dict, List, Optional
from typing import Any, Dict, List, Optional
from urllib.parse import urlparse, parse_qs, urlunparse
from agent.context_compressor import ContextCompressor
@@ -195,7 +195,6 @@ def init_agent(
status_callback: callable = None,
notice_callback: callable = None,
notice_clear_callback: callable = None,
event_callback: Optional[Callable[[str, dict], None]] = None,
max_tokens: int = None,
reasoning_config: Dict[str, Any] = None,
service_tier: str = None,
@@ -300,7 +299,6 @@ def init_agent(
# would mangle the escape sequences. None = use builtins.print.
agent._print_fn = None
agent.background_review_callback = None # Optional sync callback for gateway delivery
agent.memory_notifications = "on" # Memory update notifications: "off", "on", "verbose"
agent.skip_context_files = skip_context_files
agent.load_soul_identity = load_soul_identity
agent.pass_session_id = pass_session_id
@@ -427,7 +425,6 @@ def init_agent(
agent.status_callback = status_callback
agent.notice_callback = notice_callback
agent.notice_clear_callback = notice_clear_callback
agent.event_callback = event_callback
agent.tool_gen_callback = tool_gen_callback
@@ -599,7 +596,6 @@ def init_agent(
# (e.g. CLI voice mode adds a temporary prefix for the live call only).
agent._persist_user_message_idx = None
agent._persist_user_message_override = None
agent._persist_user_message_timestamp = None
# Cache anthropic image-to-text fallbacks per image payload/URL so a
# single tool loop does not repeatedly re-run auxiliary vision on the
@@ -1156,9 +1152,6 @@ def init_agent(
"hermes_home": str(get_hermes_home()),
"agent_context": "primary",
}
if _init_kwargs["platform"] == "cli":
_init_kwargs["warning_callback"] = agent._emit_warning
_init_kwargs["status_callback"] = agent._emit_status
# Thread session title for memory provider scoping
# (e.g. honcho uses this to derive chat-scoped session keys)
if agent._session_db:
@@ -1227,35 +1220,12 @@ def init_agent(
# targets.
agent._task_completion_guidance = bool(_agent_section.get("task_completion_guidance", True))
# Universal parallel-tool-call guidance toggle. Default True. Separate
# flag from task_completion_guidance because a user may want one but not
# the other. Steers the model to batch independent tool calls into a
# single turn; the runtime already executes such batches concurrently.
agent._parallel_tool_call_guidance = bool(_agent_section.get("parallel_tool_call_guidance", True))
# Local Python toolchain probe toggle. Default True. When False,
# the probe is skipped entirely (no subprocess calls, no system-prompt
# line). Useful for users on exotic setups where the probe heuristics
# are noisy.
agent._environment_probe = bool(_agent_section.get("environment_probe", True))
# Per-platform prompt-hint overrides (config.yaml → platform_hints).
# Lets an enterprise admin append to or replace Hermes' built-in
# platform hint for a single messaging platform (e.g. WhatsApp) without
# affecting other platforms. Shape:
# platform_hints:
# whatsapp:
# append: "When tabular output would help, invoke the ... skill."
# slack:
# replace: "Custom Slack hint that fully replaces the default."
# Stored verbatim; resolution happens in agent/system_prompt.py against
# the active platform. Invalid shapes are ignored defensively so a bad
# config entry can never break prompt assembly.
_platform_hints_cfg = _agent_cfg.get("platform_hints", {})
if not isinstance(_platform_hints_cfg, dict):
_platform_hints_cfg = {}
agent._platform_hint_overrides = _platform_hints_cfg
# App-level API retry count (wraps each model API call). Default 3,
# overridable via agent.api_max_retries in config.yaml. See #11616.
try:

View File

@@ -1217,23 +1217,12 @@ def dump_api_request_debug(
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
dump_file = agent.logs_dir / f"request_dump_{agent.session_id}_{timestamp}.json"
# Redact secrets before persisting/printing. This dump captures the
# full request body (system prompt, tool defs, context-embedded
# values), and this path fires unconditionally on API errors — so it
# otherwise lands any context-embedded secret in cleartext on disk.
# Run the serialized dump through the same scrubber used for logs/tool
# output, then hand the resulting payload back to the shared atomic
# JSON writer so request dumps keep the same write semantics as before.
from agent.redact import redact_sensitive_text
_serialized = json.dumps(dump_payload, ensure_ascii=False, indent=2, default=str)
_redacted_payload = json.loads(redact_sensitive_text(_serialized, force=True))
atomic_json_write(dump_file, _redacted_payload, default=str)
atomic_json_write(dump_file, dump_payload, default=str)
agent._vprint(f"{agent.log_prefix}🧾 Request debug dump written to: {dump_file}")
if env_var_enabled("HERMES_DUMP_REQUEST_STDOUT"):
print(json.dumps(_redacted_payload, ensure_ascii=False, indent=2, default=str))
print(json.dumps(dump_payload, ensure_ascii=False, indent=2, default=str))
return dump_file
except Exception as dump_error:
@@ -1839,42 +1828,28 @@ def invoke_tool(agent, function_name: str, function_args: dict, effective_task_i
elif function_name == "memory":
def _execute(next_args: dict) -> Any:
target = next_args.get("target", "memory")
operations = next_args.get("operations")
from tools.memory_tool import memory_tool as _memory_tool
result = _memory_tool(
action=next_args.get("action"),
target=target,
content=next_args.get("content"),
old_text=next_args.get("old_text"),
operations=operations,
store=agent._memory_store,
)
# Bridge: notify external memory provider of built-in memory writes.
# Covers both the single-op shape and each add/replace inside a batch.
if agent._memory_manager:
if operations:
_mem_ops = [
op for op in operations
if isinstance(op, dict) and op.get("action") in {"add", "replace"}
]
else:
_mem_ops = (
[{"action": next_args.get("action"), "content": next_args.get("content")}]
if next_args.get("action") in {"add", "replace"} else []
# Bridge: notify external memory provider of built-in memory writes
if agent._memory_manager and next_args.get("action") in {"add", "replace"}:
try:
agent._memory_manager.on_memory_write(
next_args.get("action", ""),
target,
next_args.get("content", ""),
metadata=agent._build_memory_write_metadata(
task_id=effective_task_id,
tool_call_id=tool_call_id,
),
)
for _op in _mem_ops:
try:
agent._memory_manager.on_memory_write(
_op.get("action", ""),
target,
_op.get("content", "") or "",
metadata=agent._build_memory_write_metadata(
task_id=effective_task_id,
tool_call_id=tool_call_id,
),
)
except Exception:
pass
except Exception:
pass
return _finish_agent_tool(result, next_args)
elif agent._memory_manager and agent._memory_manager.has_tool(function_name):
def _execute(next_args: dict) -> Any:

View File

@@ -372,7 +372,7 @@ def _detect_claude_code_version() -> str:
_CLAUDE_CODE_SYSTEM_PREFIX = "You are Claude Code, Anthropic's official CLI for Claude."
_MCP_TOOL_PREFIX = "mcp__"
_MCP_TOOL_PREFIX = "mcp_"
def _get_claude_code_version() -> str:
@@ -2349,46 +2349,25 @@ def build_anthropic_kwargs(
text = text.replace("Nous Research", "Anthropic")
block["text"] = text
# 3. Normalize tool names so NOTHING goes on the OAuth wire with a
# single-underscore ``mcp_`` prefix. Anthropic's subscription/OAuth
# billing classifier treats a single-underscore ``mcp_`` tool name as
# a third-party-app fingerprint and rejects the request with HTTP 400
# "Third-party apps now draw from extra usage, not plan limits"
# (verified empirically: a single ``mcp_foo`` tool flips a request
# from plan-billing to the extra-usage lane; ``mcp__foo`` is accepted).
#
# Two cases, both must land on the double-underscore ``mcp__`` form:
# a) bare Hermes-native tools (``read_file``) -> ``mcp__read_file``
# b) native MCP server tools registered under their full
# single-underscore ``mcp_<server>_<tool>`` name
# (``mcp_linear_get_issue``) -> ``mcp__linear_get_issue``
# Case (b) is the gap that the bare ``mcp_``->``mcp__`` constant swap
# left open: those tools were *skipped* and stayed single-underscore,
# so any session with an MCP server configured still tripped the
# classifier. normalize_response reverses both forms via registry
# lookup so the dispatcher still sees the original name. GH-25255.
def _to_oauth_wire_name(name: str) -> str:
if name.startswith("mcp__"):
return name # already correct, don't double-prefix
if name.startswith("mcp_"):
# single-underscore native MCP tool -> promote to double
return "mcp__" + name[len("mcp_"):]
return _MCP_TOOL_PREFIX + name # bare name -> mcp__<name>
# 3. Prefix tool names with mcp_ (Claude Code convention)
# Skip names that already begin with the marker — native MCP server
# tools (from mcp_servers: in config.yaml) are registered under their
# full mcp_<server>_<tool> name and would double-prefix otherwise,
# breaking round-trip registry lookup in normalize_response. GH-25255.
if anthropic_tools:
for tool in anthropic_tools:
if "name" in tool:
tool["name"] = _to_oauth_wire_name(tool["name"])
if "name" in tool and not tool["name"].startswith(_MCP_TOOL_PREFIX):
tool["name"] = _MCP_TOOL_PREFIX + tool["name"]
# 4. Apply the same normalization to tool names in message history
# (tool_use blocks) so replayed turns match the wire names above.
# 4. Prefix tool names in message history (tool_use and tool_result blocks)
for msg in anthropic_messages:
content = msg.get("content")
if isinstance(content, list):
for block in content:
if isinstance(block, dict):
if block.get("type") == "tool_use" and "name" in block:
block["name"] = _to_oauth_wire_name(block["name"])
if not block["name"].startswith(_MCP_TOOL_PREFIX):
block["name"] = _MCP_TOOL_PREFIX + block["name"]
elif block.get("type") == "tool_result" and "tool_use_id" in block:
pass # tool_result uses ID, not name
@@ -2535,56 +2514,3 @@ def sanitize_anthropic_kwargs(api_kwargs: Any, *, log_prefix: str = "") -> Any:
sorted(leaked),
)
return api_kwargs
def _is_stream_unavailable_error(exc: Exception) -> bool:
"""Return True when an Anthropic stream call should fall back to create()."""
err_lower = str(exc).lower()
if "stream" in err_lower and "not supported" in err_lower:
return True
if "invokemodelwithresponsestream" in err_lower:
from agent.bedrock_adapter import is_streaming_access_denied_error
return is_streaming_access_denied_error(exc)
return False
def create_anthropic_message(
client: Any,
api_kwargs: dict,
*,
log_prefix: str = "",
prefer_stream: bool = True,
) -> Any:
"""Create an Anthropic message, aggregating via stream when available.
Some Anthropic-compatible gateways are SSE-only: they ignore non-streaming
requests and return ``text/event-stream`` even for ``messages.create()``.
The SDK can surface that as raw text, so callers that expect a Message then
crash on ``.content``. Prefer ``messages.stream().get_final_message()`` to
match the main turn path, falling back to ``create()`` only for providers
that explicitly do not support streaming, such as restricted Bedrock roles.
"""
sanitize_anthropic_kwargs(api_kwargs, log_prefix=log_prefix)
messages_api = getattr(client, "messages", None)
stream_fn = getattr(messages_api, "stream", None)
if prefer_stream and callable(stream_fn):
stream_kwargs = dict(api_kwargs)
stream_kwargs.pop("stream", None)
try:
with stream_fn(**stream_kwargs) as stream:
return stream.get_final_message()
except Exception as exc:
if not _is_stream_unavailable_error(exc):
raise
logger.debug(
"%sAnthropic Messages stream unavailable; falling back to "
"messages.create(): %s",
log_prefix,
exc,
)
create_kwargs = dict(api_kwargs)
create_kwargs.pop("stream", None)
return messages_api.create(**create_kwargs)

View File

@@ -997,7 +997,7 @@ class _AnthropicCompletionsAdapter:
self._is_oauth = is_oauth
def create(self, **kwargs) -> Any:
from agent.anthropic_adapter import build_anthropic_kwargs, create_anthropic_message
from agent.anthropic_adapter import build_anthropic_kwargs
from agent.transports import get_transport
messages = kwargs.get("messages", [])
@@ -1041,7 +1041,7 @@ class _AnthropicCompletionsAdapter:
if not _forbids_sampling_params(model):
anthropic_kwargs["temperature"] = temperature
response = create_anthropic_message(self._client, anthropic_kwargs)
response = self._client.messages.create(**anthropic_kwargs)
_transport = get_transport("anthropic_messages")
_nr = _transport.normalize_response(
response, strip_tool_prefix=self._is_oauth
@@ -3079,20 +3079,23 @@ def _try_configured_fallback_chain(
if not fb_provider or fb_provider.lower() == skip:
continue
fb_model = str(entry.get("model", "")).strip() or None
fb_base_url = str(entry.get("base_url", "")).strip() or None
fb_api_key = str(entry.get("api_key", "")).strip() or None
label = f"fallback_chain[{i}]({fb_provider})"
try:
fb_client, resolved_model = _resolve_fallback_entry(entry)
fb_client = _resolve_single_provider(
fb_provider, fb_model, fb_base_url, fb_api_key)
except Exception:
fb_client, resolved_model = None, None
fb_client = None
if fb_client is not None:
logger.info(
"Auxiliary %s: %s on %s — configured fallback to %s (%s)",
task, reason, failed_provider, label, resolved_model or fb_model or "default",
task, reason, failed_provider, label, fb_model or "default",
)
return fb_client, resolved_model or fb_model, label
return fb_client, fb_model, label
tried.append(label)
if tried:
@@ -3103,103 +3106,6 @@ def _try_configured_fallback_chain(
return None, None, ""
def _fallback_entry_api_key(entry: Dict[str, Any]) -> Optional[str]:
"""Resolve inline or env-backed API key from a fallback-chain entry."""
explicit = str(entry.get("api_key") or "").strip()
if explicit:
return explicit
key_env = str(entry.get("key_env") or entry.get("api_key_env") or "").strip()
if key_env:
return os.getenv(key_env, "").strip() or None
return None
def _resolve_fallback_entry(entry: Dict[str, Any]) -> Tuple[Optional[Any], Optional[str]]:
"""Resolve one fallback entry through the central provider router."""
provider = str(entry.get("provider") or "").strip()
model = str(entry.get("model") or "").strip() or None
if not provider or not model:
return None, None
base_url = str(entry.get("base_url") or "").strip() or None
api_key = _fallback_entry_api_key(entry)
api_mode = str(entry.get("api_mode") or entry.get("transport") or "").strip() or None
return resolve_provider_client(
provider,
model=model,
explicit_base_url=base_url,
explicit_api_key=api_key,
api_mode=api_mode,
)
def _try_main_fallback_chain(
task: Optional[str],
failed_provider: str = "",
reason: str = "error",
) -> Tuple[Optional[Any], Optional[str], str]:
"""Try the top-level main-agent fallback chain for an auxiliary call.
``provider: auto`` auxiliary tasks should respect the user's declared
main fallback policy before dropping into Hermes' built-in discovery
chain. The top-level chain is read through ``get_fallback_chain`` so
both modern ``fallback_providers`` and legacy ``fallback_model`` entries
participate in the same order as the main agent.
"""
try:
from hermes_cli.config import load_config
from hermes_cli.fallback_config import get_fallback_chain
chain = get_fallback_chain(load_config())
except Exception as exc:
logger.debug("Auxiliary %s: could not load main fallback chain: %s", task or "call", exc)
return None, None, ""
if not chain:
return None, None, ""
failed_norm = (failed_provider or "").strip().lower()
main_norm = (_read_main_provider() or "").strip().lower()
skip = {p for p in (failed_norm, main_norm, "auto") if p}
tried: List[str] = []
for i, entry in enumerate(chain):
if not isinstance(entry, dict):
continue
fb_provider = str(entry.get("provider") or "").strip()
fb_model = str(entry.get("model") or "").strip()
if not fb_provider or not fb_model:
continue
fb_norm = fb_provider.lower()
label = f"fallback_providers[{i}]({fb_provider})"
if fb_norm in skip:
tried.append(f"{label} (skipped)")
continue
if _is_provider_unhealthy(fb_norm):
_log_skip_unhealthy(fb_norm, task)
tried.append(f"{label} (unhealthy)")
continue
try:
fb_client, resolved_model = _resolve_fallback_entry(entry)
except Exception as exc:
logger.debug("Auxiliary %s: main fallback %s failed to resolve: %s", task or "call", label, exc)
fb_client, resolved_model = None, None
if fb_client is not None:
logger.info(
"Auxiliary %s: %s on %s — main fallback chain to %s (%s)",
task or "call", reason, failed_provider or "auto", label,
resolved_model or fb_model,
)
return fb_client, resolved_model or fb_model, fb_provider
tried.append(label)
if tried:
logger.debug(
"Auxiliary %s: main fallback chain exhausted (tried: %s)",
task or "call", ", ".join(tried),
)
return None, None, ""
def _resolve_single_provider(
provider: str,
model: Optional[str] = None,
@@ -3210,19 +3116,16 @@ def _resolve_single_provider(
Uses the existing provider resolution infrastructure where possible.
"""
# Reuse resolve_provider_client which handles provider→client mapping.
# Reuse resolve_provider_client which handles provider→client mapping
client, resolved_model = resolve_provider_client(
provider=provider,
model=model,
explicit_base_url=base_url,
explicit_api_key=api_key,
base_url=base_url,
api_key=api_key,
)
return client
def _resolve_auto(
main_runtime: Optional[Dict[str, Any]] = None,
task: Optional[str] = None,
) -> Tuple[Optional[OpenAI], Optional[str]]:
def _resolve_auto(main_runtime: Optional[Dict[str, Any]] = None) -> Tuple[Optional[OpenAI], Optional[str]]:
"""Full auto-detection chain.
Priority:
@@ -3320,22 +3223,7 @@ def _resolve_auto(
main_provider, resolved or main_model)
return client, resolved or main_model
# ── Step 2: user-configured fallback policy ─────────────────────────
# In auto mode, respect the task-specific fallback chain first, then the
# main agent's top-level fallback_providers/fallback_model chain. The
# hardcoded provider discovery chain below is only the convenience default
# for users who have not declared a fallback policy.
if task:
fb_client, fb_model, _fb_label = _try_configured_fallback_chain(
task, main_provider or "auto", reason="main provider unavailable")
if fb_client is not None:
return fb_client, fb_model
fb_client, fb_model, _fb_label = _try_main_fallback_chain(
task, main_provider or "auto", reason="main provider unavailable")
if fb_client is not None:
return fb_client, fb_model
# ── Step 3: aggregator / fallback chain ──────────────────────────────
# ── Step 2: aggregator / fallback chain ──────────────────────────────
tried = []
for label, try_fn in _get_provider_chain():
if _is_provider_unhealthy(label):
@@ -3456,7 +3344,6 @@ def resolve_provider_client(
api_mode: str = None,
main_runtime: Optional[Dict[str, Any]] = None,
is_vision: bool = False,
task: Optional[str] = None,
) -> Tuple[Optional[Any], Optional[str]]:
"""Central router: given a provider name and optional model, return a
configured client with the correct auth, base URL, and API format.
@@ -3577,7 +3464,7 @@ def resolve_provider_client(
# ── Auto: try all providers in priority order ────────────────────
if provider == "auto":
client, resolved = _resolve_auto(main_runtime=main_runtime, task=task)
client, resolved = _resolve_auto(main_runtime=main_runtime)
if client is None:
return None, None
# When auto-detection lands on a non-OpenRouter provider (e.g. a
@@ -4470,16 +4357,11 @@ def _client_cache_key(
api_mode: Optional[str] = None,
main_runtime: Optional[Dict[str, Any]] = None,
is_vision: bool = False,
task: Optional[str] = None,
) -> tuple:
runtime = _normalize_main_runtime(main_runtime)
runtime_key = tuple(runtime.get(field, "") for field in _MAIN_RUNTIME_FIELDS) if provider == "auto" else ()
# `auto` can now resolve through task-specific or main fallback policy,
# so the task participates in the cache key. Non-auto providers keep the
# old cache shape because the explicit provider/model tuple is sufficient.
task_key = (task or "") if provider == "auto" else ""
pool_hint = _pool_cache_hint(provider, main_runtime=main_runtime)
return (provider, async_mode, base_url or "", api_key or "", api_mode or "", runtime_key, is_vision, task_key, pool_hint)
return (provider, async_mode, base_url or "", api_key or "", api_mode or "", runtime_key, is_vision, pool_hint)
def _store_cached_client(cache_key: tuple, client: Any, default_model: Optional[str], *, bound_loop: Any = None) -> None:
@@ -4672,7 +4554,6 @@ def _get_cached_client(
api_mode: str = None,
main_runtime: Optional[Dict[str, Any]] = None,
is_vision: bool = False,
task: Optional[str] = None,
) -> Tuple[Optional[Any], Optional[str]]:
"""Get or create a cached client for the given provider.
@@ -4710,7 +4591,6 @@ def _get_cached_client(
api_mode=api_mode,
main_runtime=main_runtime,
is_vision=is_vision,
task=task,
)
with _client_cache_lock:
if cache_key in _client_cache:
@@ -4755,7 +4635,6 @@ def _get_cached_client(
api_mode=api_mode,
main_runtime=runtime,
is_vision=is_vision,
task=task,
)
if client is not None:
# For async clients, remember which loop they were created on so we
@@ -5261,7 +5140,7 @@ def call_llm(
if not resolved_base_url:
logger.info("Auxiliary %s: provider %s unavailable, trying auto-detection chain",
task or "call", resolved_provider)
client, final_model = _get_cached_client("auto", main_runtime=main_runtime, task=task)
client, final_model = _get_cached_client("auto", main_runtime=main_runtime)
if client is None:
raise RuntimeError(
f"No LLM provider configured for task={task} provider={resolved_provider}. "
@@ -5587,19 +5466,14 @@ def call_llm(
# Fallback order (#26882, #26803):
# 1. User-configured fallback_chain (per-task) if set
# 2. For auto: top-level main fallback_providers/fallback_model
# 3. For auto: built-in auxiliary discovery chain
# 4. For explicit aux providers: main agent model safety net
# 2. Main agent model (last-resort safety net)
# For auto users (no explicit aux provider), use the full
# auto-detection chain instead — its Step 1 IS the main agent
# model, so users on `auto` already get main-model fallback.
fb_client, fb_model, fb_label = (None, None, "")
if is_auto:
fb_client, fb_model, fb_label = _try_configured_fallback_chain(
task, resolved_provider or "auto", reason=reason)
if fb_client is None:
fb_client, fb_model, fb_label = _try_main_fallback_chain(
task, resolved_provider or "auto", reason=reason)
if fb_client is None:
fb_client, fb_model, fb_label = _try_payment_fallback(
resolved_provider, task, reason=reason)
fb_client, fb_model, fb_label = _try_payment_fallback(
resolved_provider, task, reason=reason)
else:
fb_client, fb_model, fb_label = _try_configured_fallback_chain(
task, resolved_provider or "auto", reason=reason)
@@ -5762,7 +5636,7 @@ async def async_call_llm(
if not resolved_base_url:
logger.info("Auxiliary %s: provider %s unavailable, trying auto-detection chain",
task or "call", resolved_provider)
client, final_model = _get_cached_client("auto", async_mode=True, main_runtime=main_runtime, task=task)
client, final_model = _get_cached_client("auto", async_mode=True)
if client is None:
raise RuntimeError(
f"No LLM provider configured for task={task} provider={resolved_provider}. "
@@ -6030,19 +5904,13 @@ async def async_call_llm(
# Fallback order (#26882, #26803):
# 1. User-configured fallback_chain (per-task) if set
# 2. For auto: top-level main fallback_providers/fallback_model
# 3. For auto: built-in auxiliary discovery chain
# 4. For explicit aux providers: main agent model safety net
# 2. Main agent model (last-resort safety net)
# Auto users get the full auto-detection chain instead — its
# Step 1 IS the main agent model.
fb_client, fb_model, fb_label = (None, None, "")
if is_auto:
fb_client, fb_model, fb_label = _try_configured_fallback_chain(
task, resolved_provider or "auto", reason=reason)
if fb_client is None:
fb_client, fb_model, fb_label = _try_main_fallback_chain(
task, resolved_provider or "auto", reason=reason)
if fb_client is None:
fb_client, fb_model, fb_label = _try_payment_fallback(
resolved_provider, task, reason=reason)
fb_client, fb_model, fb_label = _try_payment_fallback(
resolved_provider, task, reason=reason)
else:
fb_client, fb_model, fb_label = _try_configured_fallback_chain(
task, resolved_provider or "auto", reason=reason)

View File

@@ -237,25 +237,18 @@ _COMBINED_REVIEW_PROMPT = (
def summarize_background_review_actions(
review_messages: List[Dict],
prior_snapshot: List[Dict],
notification_mode: str = "on",
) -> List[str]:
"""Build the human-facing action summary for a background review pass.
Walks the review agent's session messages and collects successful memory
and skill-management actions to surface to the user. Tool messages already
present in ``prior_snapshot`` are skipped so stale inherited results are
not re-surfaced as fresh background work (issue #14944).
Walks the review agent's session messages and collects "successful tool
action" descriptions to surface to the user (e.g. "Memory updated").
Tool messages already present in ``prior_snapshot`` are skipped so we
don't re-surface stale results from the prior conversation that the
review agent inherited via ``conversation_history`` (issue #14944).
``notification_mode`` controls display detail:
- ``off``: return no actions.
- ``on``: generic "Memory updated"/tool messages.
- ``verbose``: include compact content previews from tool-call arguments.
Matching is by ``tool_call_id`` when available, with a content-equality
fallback for tool messages that lack one.
"""
mode = str(notification_mode or "on").lower()
if mode == "off":
return []
verbose = mode == "verbose"
existing_tool_call_ids = set()
existing_tool_contents = set()
for prior in prior_snapshot or []:
@@ -269,43 +262,6 @@ def summarize_background_review_actions(
if isinstance(content, str):
existing_tool_contents.add(content)
# Map review-agent tool results back to the calls that produced them. The
# result JSON only says "Entry added"; the call arguments contain action,
# target, and content previews. Restricting to notify_tools also prevents
# helper tools from surfacing as memory work just because they succeeded.
notify_tools = {"memory", "skill_manage"}
all_tool_call_ids: set = set()
call_details: dict = {}
for msg in review_messages or []:
if not isinstance(msg, dict) or msg.get("role") != "assistant":
continue
for tc in msg.get("tool_calls", []) or []:
if not isinstance(tc, dict):
continue
fn = tc.get("function", {}) or {}
fn_name = fn.get("name", "")
tcid = tc.get("id")
if tcid:
all_tool_call_ids.add(tcid)
if fn_name not in notify_tools:
continue
try:
args = json.loads(fn.get("arguments", "{}"))
except (json.JSONDecodeError, TypeError):
args = {}
if tcid:
call_details[tcid] = {
"tool": fn_name,
"action": args.get("action", "?"),
"target": args.get("target", "memory"),
"content": args.get("content", ""),
"old_text": args.get("old_text", ""),
"operations": args.get("operations") or [],
"name": args.get("name", ""),
"old_string": args.get("old_string", ""),
"new_string": args.get("new_string", ""),
}
actions: List[str] = []
for msg in review_messages or []:
if not isinstance(msg, dict) or msg.get("role") != "tool":
@@ -317,8 +273,6 @@ def summarize_background_review_actions(
content_str = msg.get("content")
if isinstance(content_str, str) and content_str in existing_tool_contents:
continue
if tcid and all_tool_call_ids and tcid not in call_details:
continue
try:
data = json.loads(msg.get("content", "{}"))
except (json.JSONDecodeError, TypeError):
@@ -326,92 +280,19 @@ def summarize_background_review_actions(
if not isinstance(data, dict) or not data.get("success"):
continue
message = data.get("message", "")
detail = call_details.get(tcid, {})
target = data.get("target", "") or detail.get("target", "")
is_skill = detail.get("tool") == "skill_manage"
message_lower = message.lower()
if not verbose:
if "created" in message_lower:
actions.append(message)
continue
if "updated" in message_lower:
actions.append(message)
continue
if is_skill and "patched" in message_lower:
actions.append(message)
continue
if is_skill:
label = "Skill"
elif target:
target = data.get("target", "")
if "created" in message.lower():
actions.append(message)
elif "updated" in message.lower():
actions.append(message)
elif "added" in message.lower() or (target and "add" in message.lower()):
label = "Memory" if target == "memory" else "User profile" if target == "user" else target
actions.append(f"{label} updated")
elif "Entry added" in message:
label = "Memory" if target == "memory" else "User profile" if target == "user" else target
actions.append(f"{label} updated")
elif "removed" in message.lower() or "replaced" in message.lower():
label = "Memory" if target == "memory" else "User profile" if target == "user" else target
else:
continue
if verbose:
action = detail.get("action", "")
content = detail.get("content", "")
old_text = detail.get("old_text", "")
skill_name = detail.get("name", "")
operations = detail.get("operations") or []
max_preview = 120
if is_skill:
change = data.get("_change", {})
old_string = change.get("old", "") or detail.get("old_string", "")
new_string = change.get("new", "") or detail.get("new_string", "")
description = change.get("description", "")
if action == "patch" and (old_string or new_string):
old_preview = old_string[:80].replace("\n", " ") + (
"" if len(old_string) > 80 else ""
)
new_preview = new_string[:80].replace("\n", " ") + (
"" if len(new_string) > 80 else ""
)
actions.append(
f"📝 Skill '{skill_name}' patched: "
f"\"{old_preview}\"\"{new_preview}\""
)
elif action == "create" and description:
actions.append(f"📝 Skill '{skill_name}' created: {description}")
elif action == "edit" and description:
actions.append(f"📝 Skill '{skill_name}' rewritten: {description}")
else:
actions.append(f"📝 {message}" if message else f"Skill {action}")
elif operations:
for op in operations:
op = op or {}
op_act = op.get("action", "")
op_content = (op.get("content") or "")
op_old = (op.get("old_text") or "")
if op_act == "add" and op_content:
preview = op_content[:max_preview] + ("" if len(op_content) > max_preview else "")
actions.append(f"{label} {preview}")
elif op_act == "replace" and op_content:
preview = op_content[:max_preview] + ("" if len(op_content) > max_preview else "")
actions.append(f"{label} ✏️ {preview}")
elif op_act == "remove" and op_old:
preview = op_old[:60] + ("" if len(op_old) > 60 else "")
actions.append(f"{label} {preview}")
elif action == "add" and content:
preview = content[:max_preview] + ("" if len(content) > max_preview else "")
actions.append(f"{label} {preview}")
elif action == "replace" and content:
preview = content[:max_preview] + ("" if len(content) > max_preview else "")
actions.append(f"{label} ✏️ {preview}")
elif action == "remove" and old_text:
preview = old_text[:60] + ("" if len(old_text) > 60 else "")
actions.append(f"{label} {preview}")
else:
actions.append(f"{label} updated")
elif (
"added" in message_lower
or "replaced" in message_lower
or "removed" in message_lower
or "applied" in message_lower
or (target and "add" in message.lower())
or "Entry added" in message
):
actions.append(f"{label} updated")
return actions
@@ -641,7 +522,6 @@ def _run_review_in_thread(
actions = summarize_background_review_actions(
review_messages,
messages_snapshot,
notification_mode=getattr(agent, "memory_notifications", "on"),
)
if actions:

View File

@@ -58,34 +58,17 @@ _bedrock_runtime_client_cache: Dict[str, Any] = {}
_bedrock_control_client_cache: Dict[str, Any] = {}
_MIN_BOTO3_VERSION = (1, 34, 59)
def _require_boto3():
"""Import boto3, raising a clear error if not installed or too old."""
"""Import boto3, raising a clear error if not installed."""
try:
import boto3
return boto3
except ImportError:
raise ImportError(
"The 'boto3' package is required for the AWS Bedrock provider. "
"Install it with: pip install boto3\n"
"Or install Hermes with Bedrock support: pip install -e '.[bedrock]'"
)
# converse() / converse_stream() were added in boto3 1.34.59.
# When Hermes is installed editable into system Python, the system boto3
# (e.g. Ubuntu 24.04 ships 1.34.46) may take precedence over the venv
# version pinned in pyproject.toml.
try:
version = tuple(int(x) for x in boto3.__version__.split(".")[:3])
except (AttributeError, ValueError):
return boto3 # can't parse — don't block on version check
if version < _MIN_BOTO3_VERSION:
raise RuntimeError(
f"boto3 {boto3.__version__} does not support converse_stream "
f"(minimum 1.34.59 required). Upgrade with: "
f"pip install --upgrade boto3"
)
return boto3
def _get_bedrock_runtime_client(region: str):

View File

@@ -1,295 +0,0 @@
"""Surface-agnostic core for the Phase 2b terminal-billing screens.
One fetch/parse per concern, consumed identically by the CLI handler
(``cli.py::_show_billing``), the TUI JSON-RPC methods
(``tui_gateway/server.py``), and any other surface. Mirrors the proven
``agent/account_usage.py::build_credits_view`` pattern: parse the server payload
into a frozen dataclass; **fail open** — when not logged in or the portal is
unreachable, return a struct with ``logged_in=False`` and let the surface degrade
gracefully (never crash).
Money discipline: the server emits decimal STRINGS (``"142.5"``, not fixed 2dp).
We keep them as :class:`decimal.Decimal` end-to-end and only format for display.
"""
from __future__ import annotations
import logging
import uuid
from dataclasses import dataclass, field
from decimal import Decimal, InvalidOperation
from typing import Any, Optional
logger = logging.getLogger(__name__)
# =============================================================================
# Decimal money helpers
# =============================================================================
def parse_money(value: Any) -> Optional[Decimal]:
"""Parse a server money value (decimal string) into :class:`Decimal`.
Returns None for missing/invalid input. Never raises. Accepts str/int (and,
defensively, float — though the server always sends strings).
"""
if value is None:
return None
try:
# Decimal(str(...)) avoids binary-float artifacts if a float ever sneaks in.
return Decimal(str(value).strip())
except (InvalidOperation, ValueError, TypeError):
return None
def format_money(value: Optional[Decimal]) -> str:
"""Format a Decimal as ``$X`` / ``$X.YY`` for display.
Whole dollars show no decimals; any fractional amount shows exactly 2dp:
``Decimal("142.5")`` → ``"$142.50"``, ``Decimal("100")`` → ``"$100"``,
``Decimal("0.01")`` → ``"$0.01"``.
"""
if value is None:
return ""
if value == value.to_integral_value():
# Whole dollars — no decimal point. format(..., "f") avoids 1E+3 for 1000.
return f"${format(value.to_integral_value(), 'f')}"
# Fractional — always show 2dp.
return f"${format(value.quantize(Decimal('0.01')), 'f')}"
# =============================================================================
# Parsed sub-structures
# =============================================================================
@dataclass(frozen=True)
class CardInfo:
brand: str
last4: str
@property
def masked(self) -> str:
return f"{self.brand} ····{self.last4}"
@dataclass(frozen=True)
class MonthlyCap:
limit_usd: Optional[Decimal] = None
spent_this_month_usd: Optional[Decimal] = None
is_default_ceiling: bool = False
@dataclass(frozen=True)
class AutoReload:
enabled: bool = False
threshold_usd: Optional[Decimal] = None
reload_to_usd: Optional[Decimal] = None
@dataclass(frozen=True)
class BillingState:
"""Parsed ``GET /api/billing/state`` — the overview screen's data.
Fail-open: ``logged_in=False`` (and empty fields) when not logged in or the
portal is unreachable.
"""
logged_in: bool
org_id: Optional[str] = None
org_slug: Optional[str] = None
org_name: Optional[str] = None
role: Optional[str] = None # "OWNER" | "ADMIN" | "MEMBER"
balance_usd: Optional[Decimal] = None
cli_billing_enabled: bool = False
charge_presets: tuple[Decimal, ...] = ()
min_usd: Optional[Decimal] = None
max_usd: Optional[Decimal] = None
card: Optional[CardInfo] = None
monthly_cap: Optional[MonthlyCap] = None
auto_reload: Optional[AutoReload] = None
portal_url: Optional[str] = None
# When the fetch failed (vs cleanly not-logged-in), the message for the surface.
error: Optional[str] = None
@property
def is_admin(self) -> bool:
"""True for OWNER/ADMIN — the roles that can manage billing."""
return (self.role or "").upper() in ("OWNER", "ADMIN")
@property
def can_charge(self) -> bool:
"""True when the UI should offer charge/auto-reload actions.
Admin role AND the per-org kill-switch on. (The server still enforces;
this is just for graying out actions the user can't take.)
"""
return self.is_admin and self.cli_billing_enabled
def _parse_card(raw: Any) -> Optional[CardInfo]:
if not isinstance(raw, dict):
return None
brand = raw.get("brand")
last4 = raw.get("last4")
if isinstance(brand, str) and isinstance(last4, str):
return CardInfo(brand=brand, last4=last4)
return None
def _parse_monthly_cap(raw: Any) -> Optional[MonthlyCap]:
if not isinstance(raw, dict):
return None
return MonthlyCap(
limit_usd=parse_money(raw.get("limitUsd")),
spent_this_month_usd=parse_money(raw.get("spentThisMonthUsd")),
is_default_ceiling=bool(raw.get("isDefaultCeiling")),
)
def _parse_auto_reload(raw: Any) -> Optional[AutoReload]:
if not isinstance(raw, dict):
return None
return AutoReload(
enabled=bool(raw.get("enabled")),
threshold_usd=parse_money(raw.get("thresholdUsd")),
reload_to_usd=parse_money(raw.get("reloadToUsd")),
)
def billing_state_from_payload(
payload: dict[str, Any], *, portal_url: Optional[str] = None
) -> BillingState:
"""Map a raw ``/api/billing/state`` JSON dict into :class:`BillingState`."""
raw_org = payload.get("org")
org: dict[str, Any] = raw_org if isinstance(raw_org, dict) else {}
raw_bounds = payload.get("bounds")
bounds: dict[str, Any] = raw_bounds if isinstance(raw_bounds, dict) else {}
presets: list[Decimal] = []
for item in payload.get("chargePresets") or ():
parsed = parse_money(item)
if parsed is not None:
presets.append(parsed)
return BillingState(
logged_in=True,
org_id=org.get("id"),
org_slug=org.get("slug"),
org_name=org.get("name"),
role=org.get("role"),
balance_usd=parse_money(payload.get("balanceUsd")),
cli_billing_enabled=bool(payload.get("cliBillingEnabled")),
charge_presets=tuple(presets),
min_usd=parse_money(bounds.get("minUsd")),
max_usd=parse_money(bounds.get("maxUsd")),
card=_parse_card(payload.get("card")),
monthly_cap=_parse_monthly_cap(payload.get("monthlyCap")),
auto_reload=_parse_auto_reload(payload.get("autoReload")),
portal_url=portal_url,
)
# =============================================================================
# Fail-open builders (the surface front doors)
# =============================================================================
def build_billing_state(*, timeout: float = 15.0) -> BillingState:
"""Fetch + parse ``/api/billing/state``. Fail-open.
Returns ``BillingState(logged_in=False)`` when not logged in. On a portal/HTTP
failure, returns ``logged_in=False`` with ``error`` set so the surface can show
a clear message rather than crashing.
"""
try:
from hermes_cli.nous_billing import (
BillingAuthError,
BillingError,
_absolutize_portal_url,
get_billing_state,
resolve_portal_base_url,
)
except Exception:
return BillingState(logged_in=False, error="billing client unavailable")
try:
payload = get_billing_state(timeout=timeout)
except BillingAuthError:
return BillingState(logged_in=False)
except BillingError as exc:
logger.debug("billing ▸ /state fetch failed (fail-open)", exc_info=True)
return BillingState(logged_in=False, error=str(exc))
except Exception:
logger.debug("billing ▸ /state unexpected error (fail-open)", exc_info=True)
return BillingState(logged_in=False, error="could not load billing state")
# Prefer a server-supplied portalUrl if present (resolved to absolute in case
# it's relative); else build the standard one.
raw_portal = payload.get("portalUrl") if isinstance(payload, dict) else None
portal_url = _absolutize_portal_url(raw_portal) if raw_portal else None
if not portal_url:
try:
portal_url = _fallback_portal_url(resolve_portal_base_url())
except Exception:
portal_url = None
return billing_state_from_payload(payload, portal_url=portal_url)
def _fallback_portal_url(base: str) -> str:
"""Standard billing deep-link when the server omits ``portalUrl``."""
return f"{base.rstrip('/')}/billing?topup=open"
# =============================================================================
# Idempotency
# =============================================================================
def new_idempotency_key() -> str:
"""Fresh UUID for a user-confirmed purchase (reuse on retry of the SAME buy).
The ``Idempotency-Key`` header is mandatory on ``POST /charge``; generate one
per confirmed purchase and reuse it across retries so a double-submit collapses
to a single charge. Never reuse a key across different amounts (the server
returns 409 idempotency_conflict).
"""
return str(uuid.uuid4())
# =============================================================================
# Amount validation (Screen 3 custom input)
# =============================================================================
@dataclass(frozen=True)
class AmountValidation:
ok: bool
amount: Optional[Decimal] = None
error: Optional[str] = None
def validate_charge_amount(
raw: str, *, min_usd: Optional[Decimal], max_usd: Optional[Decimal]
) -> AmountValidation:
"""Validate a custom charge amount against bounds + 2dp (multipleOf 0.01).
Mirrors the server's accept/reject so the UI can give instant feedback rather
than round-tripping a sure-to-fail charge. The server is still authoritative.
"""
cleaned = (raw or "").strip().lstrip("$").strip()
amount = parse_money(cleaned)
if amount is None:
return AmountValidation(ok=False, error="Enter a dollar amount, e.g. 100")
if amount <= 0:
return AmountValidation(ok=False, error="Amount must be greater than $0")
# multipleOf 0.01 — reject sub-cent precision.
if amount != amount.quantize(Decimal("0.01")):
return AmountValidation(ok=False, error="Amount can't be smaller than a cent")
if min_usd is not None and amount < min_usd:
return AmountValidation(ok=False, error=f"Minimum is {format_money(min_usd)}")
if max_usd is not None and amount > max_usd:
return AmountValidation(ok=False, error=f"Maximum is {format_money(max_usd)}")
return AmountValidation(ok=True, amount=amount)

View File

@@ -262,26 +262,6 @@ def _responses_tools(tools: Optional[List[Dict[str, Any]]] = None) -> Optional[L
return converted or None
# Provider-executed built-in tool *declaration* types accepted on the
# Responses ``tools`` array. These are declared by ``type`` alone (no
# client-side name/parameters schema) and run server-side — the provider
# owns the implementation and reports progress via the matching ``*_call``
# output items. Hermes injects xAI's native ``web_search`` for the xAI
# transport (see agent/transports/codex.py); the rest are listed so the
# preflight validator passes them through rather than rejecting them as
# "unsupported type". Mirrors the ``*_call`` item-type set used in
# _normalize_codex_response.
_RESPONSES_BUILTIN_TOOL_TYPES = {
"web_search",
"web_search_preview",
"file_search",
"code_interpreter",
"image_generation",
"computer_use_preview",
"local_shell",
}
# ---------------------------------------------------------------------------
# Message format conversion
# ---------------------------------------------------------------------------
@@ -822,22 +802,7 @@ def _preflight_codex_api_kwargs(
for idx, tool in enumerate(tools):
if not isinstance(tool, dict):
raise ValueError(f"Codex Responses tools[{idx}] must be an object.")
tool_type = tool.get("type")
# Provider-executed built-in tools (xAI native web_search, code
# interpreter, etc.) are declared by ``type`` alone and carry no
# ``name``/``parameters`` schema — the provider owns the
# implementation. Pass them through verbatim instead of forcing
# them through the function-tool validation below (which would
# otherwise reject them with "unsupported type"). See
# agent/transports/codex.py for where xAI's native web_search is
# injected.
if tool_type in _RESPONSES_BUILTIN_TOOL_TYPES:
normalized_tools.append(dict(tool))
continue
if tool_type != "function":
if tool.get("type") != "function":
raise ValueError(f"Codex Responses tools[{idx}] has unsupported type {tool.get('type')!r}.")
name = tool.get("name")
@@ -1121,33 +1086,6 @@ def _normalize_codex_response(
saw_final_answer_phase = False
saw_reasoning_item = False
# Server-side built-in tool calls (xAI's native web_search, code
# interpreter, etc.) are executed by the provider and reported as
# discrete ``*_call`` output items. xAI's /v1/responses surface
# (e.g. grok-composer-2.5-fast on SuperGrok OAuth) routinely leaves
# these items at ``status="in_progress"`` even when the overall
# ``response.status == "completed"`` — the search ran to completion
# server-side, the per-item status simply isn't reconciled. These
# are NOT a signal that the model's turn is unfinished, so they must
# not flip ``has_incomplete_items``. Only the response-level status
# and genuine model output items (message/reasoning/function_call)
# govern the incomplete verdict. Without this guard, any turn where
# grok-composer invokes server-side search is misclassified as
# ``finish_reason="incomplete"`` and burns 3 fruitless continuation
# retries before failing with "Codex response remained incomplete
# after 3 continuation attempts". client-side function/custom tool
# calls keep their own in_progress handling below (they are skipped,
# not awaited).
_SERVER_SIDE_TOOL_CALL_TYPES = {
"web_search_call",
"file_search_call",
"code_interpreter_call",
"image_generation_call",
"computer_call",
"local_shell_call",
"mcp_call",
}
for item in output:
item_type = getattr(item, "type", None)
item_status = getattr(item, "status", None)
@@ -1156,10 +1094,7 @@ def _normalize_codex_response(
else:
item_status = None
if (
item_status in {"queued", "in_progress", "incomplete"}
and item_type not in _SERVER_SIDE_TOOL_CALL_TYPES
):
if item_status in {"queued", "in_progress", "incomplete"}:
has_incomplete_items = True
saw_streaming_or_item_incomplete = True

View File

@@ -290,7 +290,6 @@ def run_codex_app_server_turn(
original_user_message=original_user_message,
final_response=turn.final_text,
interrupted=False,
messages=messages,
)
except Exception:
logger.debug("external memory sync raised", exc_info=True)

View File

@@ -512,16 +512,6 @@ def compress_context(
old_title = agent._session_db.get_session_title(agent.session_id)
# Trigger memory extraction on the old session before it rotates.
agent.commit_memory_session(messages)
# Flush any un-persisted messages from the current turn to the
# old session *before* rotating. compress_context() can be
# called mid-turn (auto-compress when context exceeds threshold)
# at a point when _flush_messages_to_session_db() has not yet
# run. Without this, messages generated during the current turn
# are silently lost on session rotation (#47202).
try:
agent._flush_messages_to_session_db(messages)
except Exception:
pass # best-effort — don't block compression on a flush error
agent._session_db.end_session(agent.session_id, "compression")
old_session_id = agent.session_id
agent.session_id = f"{datetime.now().strftime('%Y%m%d_%H%M%S')}_{uuid.uuid4().hex[:6]}"
@@ -613,20 +603,6 @@ def compress_context(
force=True,
)
# Emit session:compress event so hooks (e.g. MemPalace sync) can ingest
# the completed old session before its details are lost.
_old_sid_for_event = locals().get("old_session_id")
if getattr(agent, "event_callback", None):
try:
agent.event_callback("session:compress", {
"platform": agent.platform or "",
"session_id": agent.session_id,
"old_session_id": _old_sid_for_event or "",
"compression_count": agent.context_compressor.compression_count,
})
except Exception as e:
logger.debug("event_callback error on session:compress: %s", e)
# Keep the post-compression rough estimate for diagnostics, but do not
# treat it as provider-reported prompt usage. Schema-heavy rough estimates
# can remain above threshold even after the next real API request fits.

View File

@@ -300,20 +300,11 @@ def _restore_or_build_system_prompt(agent, system_message, conversation_history)
agent.session_id, exc,
)
if stored_prompt and _stored_prompt_matches_runtime(agent, stored_prompt):
if stored_prompt:
# Continuing session — reuse the exact system prompt from the
# previous turn so the Anthropic cache prefix matches.
agent._cached_system_prompt = stored_prompt
return
if stored_prompt:
stored_state = "stale_runtime"
logger.info(
"Stored system prompt for session %s has stale runtime identity; "
"rebuilding for model=%s provider=%s.",
agent.session_id,
getattr(agent, "model", "") or "",
getattr(agent, "provider", "") or "",
)
if conversation_history and stored_state in ("null", "empty"):
# Continuing session whose stored prompt is unusable. The
@@ -375,30 +366,6 @@ def _restore_or_build_system_prompt(agent, system_message, conversation_history)
)
def _stored_prompt_matches_runtime(agent, prompt: str) -> bool:
"""Return False when the persisted Model/Provider lines are stale."""
def line_value(label: str) -> str:
prefix = f"{label}:"
value = ""
for line in prompt.splitlines():
if line.startswith(prefix):
value = line[len(prefix):].strip()
return value
stored_model = line_value("Model")
current_model = str(getattr(agent, "model", "") or "").strip()
if stored_model and current_model and stored_model != current_model:
return False
stored_provider = line_value("Provider")
current_provider = str(getattr(agent, "provider", "") or "").strip()
if stored_provider and current_provider and stored_provider != current_provider:
return False
return True
def _get_continuation_prompt(is_partial_stub: bool, dropped_tools: Optional[List[str]] = None) -> str:
if is_partial_stub and dropped_tools:
tool_list = ", ".join(dropped_tools[:3])
@@ -474,7 +441,6 @@ def run_conversation(
task_id: str = None,
stream_callback: Optional[callable] = None,
persist_user_message: Optional[str] = None,
persist_user_timestamp: Optional[float] = None,
) -> Dict[str, Any]:
"""
Run a complete conversation with tool calling until completion.
@@ -490,8 +456,6 @@ def run_conversation(
persist_user_message: Optional clean user message to store in
transcripts/history when user_message contains API-only
synthetic prefixes.
persist_user_timestamp: Optional platform event timestamp to store
as metadata on that persisted user message.
or queuing follow-up prefetch work.
Returns:
@@ -513,7 +477,6 @@ def run_conversation(
task_id,
stream_callback,
persist_user_message,
persist_user_timestamp,
restore_or_build_system_prompt=_restore_or_build_system_prompt,
install_safe_stdio=_install_safe_stdio,
sanitize_surrogates=_sanitize_surrogates,
@@ -3197,22 +3160,15 @@ def run_conversation(
# Terminal — flush buffered context so the user sees
# what was tried before the abort.
agent._flush_status_buffer()
# Summarize once: Cloudflare/proxy HTML challenge pages and
# other raw provider bodies must be collapsed to a short
# one-liner here, otherwise the full page leaks into the
# returned ``error`` field and downstream consumers deliver
# it verbatim (e.g. a cron failure notification dumped a
# ~60KB Cloudflare challenge page as 31 Discord messages).
_nonretryable_summary = agent._summarize_api_error(api_error)
if classified.reason == FailoverReason.content_policy_blocked:
agent._emit_status(
f"❌ Provider safety filter blocked this request: "
f"{_nonretryable_summary}"
f"{agent._summarize_api_error(api_error)}"
)
else:
agent._emit_status(
f"❌ Non-retryable error (HTTP {status_code}): "
f"{_nonretryable_summary}"
f"{agent._summarize_api_error(api_error)}"
)
agent._vprint(f"{agent.log_prefix}❌ Non-retryable client error (HTTP {status_code}). Aborting.", force=True)
agent._vprint(f"{agent.log_prefix} 🔌 Provider: {_provider} Model: {_model}", force=True)
@@ -3297,17 +3253,18 @@ def run_conversation(
else:
agent._persist_session(messages, conversation_history)
if classified.reason == FailoverReason.content_policy_blocked:
_summary = agent._summarize_api_error(api_error)
_policy_response = (
"⚠️ The model provider's safety filter blocked this request "
"(not a Hermes/gateway failure).\n\n"
f"Provider message: {_nonretryable_summary}\n\n"
f"Provider message: {_summary}\n\n"
f"{_CONTENT_POLICY_RECOVERY_HINT}"
)
return _content_policy_blocked_result(
messages,
api_call_count,
final_response=_policy_response,
error_detail=_nonretryable_summary,
error_detail=_summary,
)
return {
"final_response": None,
@@ -3315,7 +3272,7 @@ def run_conversation(
"api_calls": api_call_count,
"completed": False,
"failed": True,
"error": _nonretryable_summary,
"error": str(api_error),
}
if retry_count >= max_retries:
@@ -3762,30 +3719,8 @@ def run_conversation(
assistant_msg = agent._build_assistant_message(assistant_message, finish_reason)
messages.append(assistant_msg)
for tc in assistant_message.tool_calls:
_tc_name = tc.function.name
if _tc_name not in agent.valid_tool_names:
# A blank/whitespace-only name is not a typo the
# model can fuzzy-correct toward a real tool — it is
# almost always a weak open model echoing tool-call
# XML/JSON it saw in file or tool output (#47967:
# <tool_call>/<invoke name=...> payloads in a file
# prime mimo/nemotron-class models to emit empty
# structured calls). Dumping the full tool catalog
# in that case feeds the priming loop more names to
# mimic and inflates context 3-4x across retries, so
# send a terse error that tells the model in-context
# tool-call syntax is DATA, not a call to make.
if not (_tc_name or "").strip():
content = (
"Tool call rejected: the tool name was empty. "
"If tool-call XML or JSON appeared in file "
"contents or tool output, that is data — do "
"not re-emit it as a tool call. To call a "
"tool, use a valid name from your tool list; "
"otherwise reply in plain text."
)
else:
content = f"Tool '{_tc_name}' does not exist. Available tools: {available}"
if tc.function.name not in agent.valid_tool_names:
content = f"Tool '{tc.function.name}' does not exist. Available tools: {available}"
else:
content = "Skipped: another tool call in this turn used an invalid name. Please retry this tool call."
messages.append({

View File

@@ -15,7 +15,6 @@ from typing import Any, Dict, List, Optional, Set, Tuple
from hermes_constants import OPENROUTER_BASE_URL
from hermes_cli.config import load_env
from agent.secret_scope import get_secret as _get_secret
from agent.credential_persistence import (
is_borrowed_credential_source,
sanitize_borrowed_credential_payload,
@@ -1667,7 +1666,7 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
_env_file = load_env()
def _env_val(key: str) -> str:
return (_env_file.get(key) or _get_secret(key, "") or "").strip()
return (_env_file.get(key) or os.environ.get(key) or "").strip()
anthropic_api_key = _env_val("ANTHROPIC_API_KEY")
anthropic_oauth_env = (
@@ -1953,7 +1952,7 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
# changes to the .env file.
def _get_env_prefer_dotenv(key: str) -> str:
env_file = load_env()
val = env_file.get(key) or _get_secret(key, "") or ""
val = env_file.get(key) or os.environ.get(key) or ""
return val.strip()
# Honour user suppression — `hermes auth remove <provider> <N>` for an

View File

@@ -57,11 +57,6 @@ DEFAULT_INTERVAL_HOURS = 24 * 7 # 7 days
DEFAULT_MIN_IDLE_HOURS = 2
DEFAULT_STALE_AFTER_DAYS = 30
DEFAULT_ARCHIVE_AFTER_DAYS = 90
# Consolidation (the LLM umbrella-building fork) is OFF by default. The
# deterministic inactivity prune (apply_automatic_transitions) still runs
# whenever the curator is enabled; only the opinionated, aux-model-cost
# consolidation pass is opt-in.
DEFAULT_CONSOLIDATE = False
# ---------------------------------------------------------------------------
@@ -187,22 +182,6 @@ def get_prune_builtins() -> bool:
return bool(cfg.get("prune_builtins", True))
def get_consolidate() -> bool:
"""Whether the curator runs its LLM consolidation (umbrella-building) pass.
OFF by default. When off, a curator run does ONLY the deterministic
inactivity prune (mark stale / archive long-unused skills) and skips the
forked aux-model review entirely — no consolidation, no umbrella-building,
no aux-model cost. Set ``curator.consolidate: true`` to opt back into the
LLM pass that merges overlapping skills into class-level umbrellas.
The explicit ``hermes curator run --consolidate`` flag overrides this for
a single invocation regardless of the config value.
"""
cfg = _load_config()
return bool(cfg.get("consolidate", DEFAULT_CONSOLIDATE))
# ---------------------------------------------------------------------------
# Idle / interval check
# ---------------------------------------------------------------------------
@@ -1429,38 +1408,25 @@ def run_curator_review(
on_summary: Optional[Callable[[str], None]] = None,
synchronous: bool = False,
dry_run: bool = False,
consolidate: Optional[bool] = None,
) -> Dict[str, Any]:
"""Execute a single curator review pass.
Steps:
1. Apply automatic state transitions (pure, no LLM).
2. If consolidation is enabled AND there are agent-created skills, spawn
a forked AIAgent that runs the LLM review prompt against the current
candidate list.
2. If there are agent-created skills, spawn a forked AIAgent that runs
the LLM review prompt against the current candidate list.
3. Update .curator_state with last_run_at and a one-line summary.
4. Invoke *on_summary* with a user-visible description.
If *synchronous* is True, the LLM review runs in the calling thread; the
default is to spawn a daemon thread so the caller returns immediately.
*consolidate* gates the LLM umbrella-building pass. ``None`` (the default)
reads ``curator.consolidate`` from config (OFF by default). Passing
``True``/``False`` overrides the config for this invocation — used by the
``hermes curator run --consolidate`` flag. When consolidation is off, only
the deterministic inactivity prune runs and the forked aux-model review is
skipped entirely (no aux-model cost).
If *dry_run* is True, the automatic stale/archive transitions are SKIPPED
and the LLM review pass is instructed to produce a report only — no
skill_manage mutations, no terminal archive moves. The REPORT.md still
gets written and ``state.last_report_path`` still records it so users
can read what the curator WOULD have done. A dry-run also honors
*consolidate*: when consolidation is off, the preview only reports the
deterministic prune candidates.
can read what the curator WOULD have done.
"""
if consolidate is None:
consolidate = get_consolidate()
start = datetime.now(timezone.utc)
if dry_run:
# Count candidates without mutating state.
@@ -1523,53 +1489,6 @@ def run_curator_review(
before_report = []
before_names = {r.get("name") for r in before_report if isinstance(r, dict)}
# Consolidation gate. When off (the default), the curator does ONLY the
# deterministic inactivity prune above — no forked aux-model review, no
# umbrella-building, no aux-model cost. Record the run, write a report
# reflecting the prune-only outcome, and return without spawning a fork.
if not consolidate:
final_summary = (
f"{prefix}{auto_summary}; llm: skipped (consolidation off)"
)
llm_meta = {
"final": "",
"summary": "skipped (consolidation off)",
"model": "",
"provider": "",
"tool_calls": [],
"error": None,
}
elapsed = (datetime.now(timezone.utc) - start).total_seconds()
state2 = load_state()
state2["last_run_duration_seconds"] = elapsed
state2["last_run_summary"] = final_summary
try:
after_report = skill_usage.agent_created_report()
except Exception:
after_report = []
try:
report_path = _write_run_report(
started_at=start,
elapsed_seconds=elapsed,
auto_counts=counts,
auto_summary=auto_summary,
before_report=before_report,
before_names=before_names,
after_report=after_report,
llm_meta=llm_meta,
)
if report_path is not None:
state2["last_report_path"] = str(report_path)
except Exception as e:
logger.debug("Curator report write failed: %s", e, exc_info=True)
save_state(state2)
if on_summary:
try:
on_summary(f"curator: {final_summary}")
except Exception:
pass
return
llm_meta: Dict[str, Any] = {}
try:
candidate_list = _render_candidate_list()

View File

@@ -46,7 +46,7 @@ import shutil
import tarfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional, Set, Tuple
from typing import Any, Dict, List, Optional, Tuple
from hermes_constants import get_hermes_home
from agent.skill_utils import is_excluded_skill_path
@@ -208,17 +208,13 @@ def _write_manifest(dest: Path, reason: str, archive_path: Path,
)
def snapshot_skills(reason: str = "manual", *, protect_ids: Optional[Set[str]] = None) -> Optional[Path]:
def snapshot_skills(reason: str = "manual") -> Optional[Path]:
"""Create a tar.gz snapshot of ``~/.hermes/skills/`` and prune old ones.
Returns the snapshot directory path, or ``None`` if the snapshot was
skipped (backup disabled, skills dir missing, or an IO error occurred —
in which case we log at debug and return None so the curator never
aborts a pass because of a backup failure).
``protect_ids`` is forwarded to the prune step so callers can guarantee
specific snapshot ids survive even when they fall outside the keep
window (rollback passes the id it is about to restore from).
"""
if not is_enabled():
logger.debug("Curator backup disabled by config; skipping snapshot")
@@ -280,19 +276,15 @@ def snapshot_skills(reason: str = "manual", *, protect_ids: Optional[Set[str]] =
pass
return None
_prune_old(keep=get_keep(), protect=protect_ids)
_prune_old(keep=get_keep())
logger.info("Curator snapshot created: %s (%s)", snap_id, reason)
return dest
def _prune_old(keep: int, protect: Optional[Set[str]] = None) -> List[str]:
def _prune_old(keep: int) -> List[str]:
"""Delete regular snapshots beyond the newest *keep*. Returns deleted
ids. Snapshot ids in *protect* are never deleted even when they fall
outside the keep window — rollback() uses this so the mandatory
pre-rollback safety snapshot can never evict the very snapshot being
restored. Staging dirs (``.rollback-staging-*``) are implementation
detail and pruned independently on every call."""
protect = protect or set()
ids. Staging dirs (``.rollback-staging-*``) are implementation detail
and pruned independently on every call."""
backups = _backups_dir()
if not backups.exists():
return []
@@ -313,8 +305,6 @@ def _prune_old(keep: int, protect: Optional[Set[str]] = None) -> List[str]:
entries.sort(key=lambda t: t[0], reverse=True)
deleted: List[str] = []
for _, path in entries[keep:]:
if path.name in protect:
continue
try:
shutil.rmtree(path)
deleted.append(path.name)
@@ -464,16 +454,16 @@ def _restore_cron_skill_links(snapshot_dir: Path) -> Dict[str, Any]:
report["attempted"] = True # we tried but there was nothing to do
return report
# Load and rewrite the live jobs under the scheduler's cross-process lock.
# Load and rewrite the live jobs under the scheduler's lock.
try:
from cron.jobs import load_jobs, save_jobs, _jobs_lock
from cron.jobs import load_jobs, save_jobs, _jobs_file_lock
except ImportError as e:
report["error"] = f"cron module unavailable: {e}"
return report
report["attempted"] = True
try:
with _jobs_lock():
with _jobs_file_lock:
live_jobs = load_jobs()
changed = False
@@ -574,13 +564,7 @@ def rollback(backup_id: Optional[str] = None) -> Tuple[bool, str, Optional[Path]
# out before touching anything — otherwise a failed extract could leave
# the user with no skills.
try:
# Protect the target from this snapshot's prune step: at the steady
# keep limit, pruning the oldest snapshot would otherwise delete the
# very snapshot we are about to extract from.
snapshot_skills(
reason=f"pre-rollback to {target.name}",
protect_ids={target.name},
)
snapshot_skills(reason=f"pre-rollback to {target.name}")
except Exception as e:
return (False, f"pre-rollback safety snapshot failed: {e}", None)

View File

@@ -12,7 +12,6 @@ import time
from dataclasses import dataclass, field
from difflib import unified_diff
from pathlib import Path
from typing import Any
from utils import safe_json_loads
from agent.tool_result_classification import file_mutation_result_landed
@@ -169,27 +168,6 @@ def _oneline(text: str) -> str:
return " ".join(text.split())
def _truncate_preview(text: str, max_len: int | None) -> str:
if max_len and max_len > 0 and len(text) > max_len:
if max_len <= 3:
return "." * max_len
return text[:max_len - 3] + "..."
return text
def _delegate_task_goal_parts(tasks: Any, *, per_goal_len: int) -> tuple[int, list[str]]:
if not isinstance(tasks, list):
return 0, []
goals: list[str] = []
for task in tasks:
if not isinstance(task, dict):
continue
raw_goal = task.get("goal")
goal = "?" if raw_goal is None else _oneline(str(raw_goal))
goals.append(_truncate_preview(goal or "?", per_goal_len))
return len(goals), goals
def build_tool_preview(tool_name: str, args: dict, max_len: int | None = None) -> str | None:
"""Build a short preview of a tool call's primary argument for display.
@@ -213,22 +191,6 @@ def build_tool_preview(tool_name: str, args: dict, max_len: int | None = None) -
"clarify": "question", "skill_manage": "name",
}
# delegate_task: show goal (single) or individual task goals (batch)
if tool_name == "delegate_task":
tasks = args.get("tasks")
if tasks and isinstance(tasks, list):
task_count, goals = _delegate_task_goal_parts(tasks, per_goal_len=40)
preview = (
f"{task_count} tasks: " + " | ".join(goals)
if goals else f"{len(tasks)} parallel tasks"
)
return _truncate_preview(preview, max_len)
goal = args.get("goal", "")
if goal is None:
return None
preview = _oneline(str(goal))
return _truncate_preview(preview, max_len) if preview else None
if tool_name == "process":
action = args.get("action", "")
sid = args.get("session_id", "")
@@ -896,6 +858,20 @@ def _detect_tool_failure(tool_name: str, result: str | None) -> tuple[bool, str]
return False, ""
def _used_free_parallel(result: str | None) -> bool:
"""True when a web result came from Parallel's free Search MCP.
Only the keyless Parallel path tags its result with ``provider="parallel"``;
the paid REST path and every other provider omit it. Used to label the tool
line "Parallel search" / "Parallel fetch" exactly when the free MCP served
the call.
"""
if not isinstance(result, str) or '"provider"' not in result:
return False
data = safe_json_loads(result)
return isinstance(data, dict) and str(data.get("provider", "")).lower() == "parallel"
def get_cute_tool_message(
tool_name: str, args: dict, duration: float, result: str | None = None,
) -> str:
@@ -933,15 +909,17 @@ def get_cute_tool_message(
return f"{line}{failure_suffix}"
if tool_name == "web_search":
return _wrap(f"┊ 🔍 search {_trunc(args.get('query', ''), 42)} {dur}")
verb = "Parallel search" if _used_free_parallel(result) else "search"
return _wrap(f"┊ 🔍 {verb:<9} {_trunc(args.get('query', ''), 42)} {dur}")
if tool_name == "web_extract":
verb = "Parallel fetch" if _used_free_parallel(result) else "fetch"
urls = args.get("urls", [])
if urls:
url = urls[0] if isinstance(urls, list) else str(urls)
domain = url.replace("https://", "").replace("http://", "").split("/")[0]
extra = f" +{len(urls)-1}" if len(urls) > 1 else ""
return _wrap(f"┊ 📄 fetch {_trunc(domain, 35)}{extra} {dur}")
return _wrap(f"┊ 📄 fetch pages {dur}")
return _wrap(f"┊ 📄 {verb:<9} {_trunc(domain, 35)}{extra} {dur}")
return _wrap(f"┊ 📄 {verb:<9} pages {dur}")
if tool_name == "terminal":
return _wrap(f"┊ 💻 $ {_trunc(args.get('command', ''), 42)} {dur}")
if tool_name == "process":
@@ -1057,10 +1035,7 @@ def get_cute_tool_message(
if tool_name == "delegate_task":
tasks = args.get("tasks")
if tasks and isinstance(tasks, list):
task_count, goals = _delegate_task_goal_parts(tasks, per_goal_len=30)
detail = " | ".join(goals) if goals else "parallel"
count_label = task_count or len(tasks)
return _wrap(f"┊ 🔀 delegate {count_label}x: {_trunc(detail, 35)} {dur}")
return _wrap(f"┊ 🔀 delegate {len(tasks)} parallel tasks {dur}")
return _wrap(f"┊ 🔀 delegate {_trunc(args.get('goal', ''), 35)} {dur}")
preview = build_tool_preview(tool_name, args) or ""

View File

@@ -11,18 +11,6 @@ Providers live in ``<repo>/plugins/image_gen/<name>/`` (built-in, auto-loaded
as ``kind: backend``) or ``~/.hermes/plugins/image_gen/<name>/`` (user, opt-in
via ``plugins.enabled``).
Unified surface
---------------
One tool — ``image_generate`` — covers **text-to-image** and
**image-to-image / image editing**. The router is the presence of
``image_url`` (and/or ``reference_image_urls``): if any source image is
provided, the provider routes to its image-to-image / edit endpoint; if
omitted, the provider routes to text-to-image. Users pick one **model**
(e.g. nano-banana-pro, gpt-image-2, grok-imagine-image); the provider
handles which underlying endpoint to hit. This mirrors the ``video_gen``
provider design (``agent/video_gen_provider.py``) so the two surfaces
stay learnable together.
Response shape
--------------
All providers return a dict that :func:`success_response` / :func:`error_response`
@@ -33,7 +21,6 @@ produce. The tool wrapper JSON-serializes it. Keys:
model str provider-specific model identifier
prompt str echoed prompt
aspect_ratio str "landscape" | "square" | "portrait"
modality str "text" | "image" (which mode was used)
provider str provider name (for diagnostics)
error str only when success=False
error_type str only when success=False
@@ -140,51 +127,19 @@ class ImageGenProvider(abc.ABC):
return models[0].get("id")
return None
def capabilities(self) -> Dict[str, Any]:
"""Return what this provider supports.
Returned dict (all keys optional)::
{
"modalities": ["text", "image"], # which inputs the backend accepts
"max_reference_images": 9, # cap for reference_image_urls
}
``modalities`` declares whether the active backend/model supports
text-to-image (``"text"``), image-to-image / editing (``"image"``),
or both. The tool layer surfaces this in the dynamic schema so the
model knows when ``image_url`` is honored. Used by ``hermes tools``
for the picker too. Default: text-only (backward compatible — a
provider that doesn't override this advertises text-to-image only).
"""
return {
"modalities": ["text"],
"max_reference_images": 0,
}
@abc.abstractmethod
def generate(
self,
prompt: str,
aspect_ratio: str = DEFAULT_ASPECT_RATIO,
*,
image_url: Optional[str] = None,
reference_image_urls: Optional[List[str]] = None,
**kwargs: Any,
) -> Dict[str, Any]:
"""Generate an image from a text prompt, or edit/transform a source image.
Routing: if ``image_url`` (or any ``reference_image_urls``) is
provided, the provider should route to its image-to-image / edit
endpoint; otherwise text-to-image. ``image_url`` is the primary
source image to edit; ``reference_image_urls`` are additional
style/composition references (provider clamps to its declared
``max_reference_images``).
"""Generate an image.
Implementations should return the dict from :func:`success_response`
or :func:`error_response`. ``kwargs`` may contain forward-compat
parameters future versions of the schema will expose —
implementations MUST ignore unknown keys (no TypeError).
parameters future versions of the schema will expose — implementations
should ignore unknown keys.
"""
@@ -207,26 +162,6 @@ def resolve_aspect_ratio(value: Optional[str]) -> str:
return DEFAULT_ASPECT_RATIO
def normalize_reference_images(value: Any) -> Optional[List[str]]:
"""Coerce a reference-image argument into a clean list of URL/path strings.
Accepts a single string or a list; strips blanks and whitespace. Returns
``None`` when nothing usable remains so providers can treat "no refs" as a
single sentinel.
"""
if value is None:
return None
if isinstance(value, str):
value = [value]
if not isinstance(value, (list, tuple)):
return None
out: List[str] = []
for item in value:
if isinstance(item, str) and item.strip():
out.append(item.strip())
return out or None
def _images_cache_dir() -> Path:
"""Return ``$HERMES_HOME/cache/images/``, creating parents as needed."""
from hermes_constants import get_hermes_home
@@ -345,16 +280,13 @@ def success_response(
prompt: str,
aspect_ratio: str,
provider: str,
modality: str = "text",
extra: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
"""Build a uniform success response dict.
``image`` may be an HTTP URL or an absolute filesystem path (for b64
providers like OpenAI). ``modality`` is ``"text"`` (text-to-image) or
``"image"`` (image-to-image / editing) — indicates which endpoint was
actually hit, useful for diagnostics. Callers that need to pass through
additional backend-specific fields can supply ``extra``.
providers like OpenAI). Callers that need to pass through additional
backend-specific fields can supply ``extra``.
"""
payload: Dict[str, Any] = {
"success": True,
@@ -362,7 +294,6 @@ def success_response(
"model": model,
"prompt": prompt,
"aspect_ratio": aspect_ratio,
"modality": modality,
"provider": provider,
}
if extra:

View File

@@ -33,7 +33,6 @@ from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, List, Optional
from agent.memory_provider import MemoryProvider
from agent.skill_commands import extract_user_instruction_from_skill_message
from tools.registry import tool_error
logger = logging.getLogger(__name__)
@@ -431,37 +430,16 @@ class MemoryManager:
# -- Prefetch / recall ---------------------------------------------------
@staticmethod
def _strip_skill_scaffolding(text: str) -> Optional[str]:
"""Return memory-worthy user text, or None to skip the turn.
When a user invokes a /skill or /bundle, Hermes expands the turn into
a model-facing message that embeds the entire skill body. Feeding that
verbatim to memory providers pollutes their stores/embeddings with
prompt scaffolding instead of what the user actually asked. We recover
just the user's instruction here, once, for every provider — so this
is fixed for the whole provider fan-out, not per backend.
- Non-skill messages pass through unchanged.
- Skill turns with a user instruction return that instruction.
- Bare skill invocations (no instruction) return None → callers skip
the turn, since there is no user content worth remembering.
"""
return extract_user_instruction_from_skill_message(text)
def prefetch_all(self, query: str, *, session_id: str = "") -> str:
"""Collect prefetch context from all providers.
Returns merged context text labeled by provider. Empty providers
are skipped. Failures in one provider don't block others.
"""
clean_query = self._strip_skill_scaffolding(query)
if not clean_query:
return ""
parts = []
for provider in self._providers:
try:
result = provider.prefetch(clean_query, session_id=session_id)
result = provider.prefetch(query, session_id=session_id)
if result and result.strip():
parts.append(result)
except Exception as e:
@@ -482,14 +460,10 @@ class MemoryManager:
if not providers:
return
clean_query = self._strip_skill_scaffolding(query)
if not clean_query:
return
def _run() -> None:
for provider in providers:
try:
provider.queue_prefetch(clean_query, session_id=session_id)
provider.queue_prefetch(query, session_id=session_id)
except Exception as e:
logger.debug(
"Memory provider '%s' queue_prefetch failed (non-fatal): %s",
@@ -541,11 +515,6 @@ class MemoryManager:
if not providers:
return
clean_user_content = self._strip_skill_scaffolding(user_content)
if not clean_user_content:
return
user_content = clean_user_content
def _run() -> None:
for provider in providers:
try:

View File

@@ -1,50 +0,0 @@
from __future__ import annotations
from collections.abc import Mapping
from typing import Any
_NON_TEXT_PART_TYPES = {"image", "image_url", "input_image", "audio", "input_audio"}
_TEXT_KEYS = ("text", "content", "input_text", "output_text", "summary_text")
def _field(value: Any, key: str) -> Any:
if isinstance(value, Mapping):
return value.get(key)
return getattr(value, key, None)
def _text_from_part(part: Any) -> str:
if part is None:
return ""
if isinstance(part, str):
return part
part_type = str(_field(part, "type") or "").strip().lower()
if part_type in _NON_TEXT_PART_TYPES:
return ""
for key in _TEXT_KEYS:
text = _field(part, key)
if isinstance(text, str):
return text
return ""
def flatten_message_text(content: Any, *, sep: str = "\n") -> str:
"""Return the visible text from common chat/Responses message content shapes."""
if content is None:
return ""
if isinstance(content, str):
return content
if isinstance(content, list):
chunks = [_text_from_part(part) for part in content]
return sep.join(chunk for chunk in chunks if chunk)
text = _text_from_part(content)
if text:
return text
try:
return str(content)
except Exception:
return ""

View File

@@ -261,13 +261,7 @@ DEFAULT_CONTEXT_LENGTHS = {
# https://platform.minimax.io/docs/api-reference/text-chat-openai
"minimax-m3": 1000000,
"minimax": 204800,
# GLM — GLM-5.2 ships with a 1M context window (verified empirically:
# needle-in-a-haystack retrieval at 789K prompt tokens succeeded with
# zero errors on api.z.ai/api/coding/paas/v4). Older GLM models
# (5, 5.1, 5-turbo) are ~202K. Longest-key-first substring matching
# ensures "glm-5.2" resolves to 1M while older variants still hit the
# generic 202K fallback.
"glm-5.2": 1_048_576,
# GLM
"glm": 202752,
# xAI Grok — xAI /v1/models does not return context_length metadata,
# so these hardcoded fallbacks prevent Hermes from probing-down to
@@ -275,11 +269,6 @@ DEFAULT_CONTEXT_LENGTHS = {
# via a custom provider. Values sourced from models.dev (2026-04).
# Keys use substring matching (longest-first), so e.g. "grok-4.20"
# matches "grok-4.20-0309-reasoning" / "-non-reasoning" / "-multi-agent-0309".
# OAuth-only slug; absent from GET /v1/models. xAI publishes a 200k
# usable context window for Composer 2.5 on Grok Build (SuperGrok /
# Premium+); /v1/responses additionally enforces a ~262144 input+output
# budget, but the usable context (what we track here) is 200k.
"grok-composer": 200000, # grok-composer-2.5-fast (Grok Build CLI)
"grok-build": 256000, # grok-build-0.1
"grok-code-fast": 256000, # grok-code-fast-1
"grok-2-vision": 8192, # grok-2-vision, -1212, -latest

View File

@@ -8,7 +8,6 @@ import json
import logging
import os
import threading
import contextvars
from collections import OrderedDict
from pathlib import Path
@@ -305,47 +304,6 @@ TASK_COMPLETION_GUIDANCE = (
"is always better than inventing a result."
)
# Universal parallel-tool-call guidance — applied to ALL models.
#
# Why this matters for cost: every assistant turn resends the entire
# accumulated conversation (and, on cache-friendly providers, re-reads the
# cached prefix and pays for the newly-appended turn). A model that issues
# one tool call per turn multiplies the number of round-trips — and therefore
# the resent context — for any task that needs several independent reads,
# searches, or safe lookups. Batching independent calls into a single
# assistant response collapses N turns into one, cutting both latency and the
# resent-context cost that compounds over a long conversation.
#
# The hermes-agent runtime already executes a batch of tool calls
# concurrently when they are independent (read-only tools always; path-scoped
# file ops when their targets don't overlap — see
# run_agent._execute_tool_calls / tool_dispatch_helpers). The missing piece
# was telling the *model* to emit those calls together in the first place.
# Until now the only batching steer in the prompt lived in
# GOOGLE_MODEL_OPERATIONAL_GUIDANCE — Gemini/Gemma got it, every other model
# got nothing. This block makes the steer universal; the now-redundant
# Google-only bullet has been dropped so no model receives it twice.
#
# Short on purpose — shipped in the cached system prompt to every user, every
# session. Token cost is paid once at install and amortised across all
# sessions via prefix caching. Keep it tight.
#
# Ported from cline/cline#11514 ("encourage parallel tool calls"), adapted
# from Cline's TypeScript tool-surface guidance to hermes-agent's Python
# prompt-assembly architecture.
PARALLEL_TOOL_CALL_GUIDANCE = (
"# Parallel tool calls\n"
"When you need several pieces of information that don't depend on each "
"other, request them together in a single response instead of one tool "
"call per turn. Independent reads, searches, web fetches, and read-only "
"commands should be batched into the same assistant turn — the runtime "
"executes independent calls concurrently, and batching avoids resending "
"the whole conversation on every extra round-trip.\n"
"Only serialize calls when a later call genuinely depends on an earlier "
"call's result (e.g. you must read a file before you can patch it). When "
"in doubt and the calls are independent, batch them."
)
# OpenAI GPT/Codex-specific execution guidance. Addresses known failure modes
# where GPT models abandon work on partial results, skip prerequisite lookups,
# hallucinate instead of using tools, and declare "done" without verification.
@@ -427,10 +385,9 @@ GOOGLE_MODEL_OPERATIONAL_GUIDANCE = (
"package.json, requirements.txt, Cargo.toml, etc. before importing.\n"
"- **Conciseness:** Keep explanatory text brief — a few sentences, not "
"paragraphs. Focus on actions and results over narration.\n"
# Parallel-tool-call steering now lives in the universal
# PARALLEL_TOOL_CALL_GUIDANCE block (injected for all models), so it is no
# longer duplicated here — keeping it would send Gemini/Gemma the same
# instruction twice.
"- **Parallel tool calls:** When you need to perform multiple independent "
"operations (e.g. reading several files), make all the tool calls in a "
"single response rather than sequentially.\n"
"- **Non-interactive commands:** Use flags like -y, --yes, --non-interactive "
"to prevent CLI tools from hanging on prompts.\n"
"- **Keep going:** Work autonomously until the task is fully resolved. "
@@ -1000,80 +957,6 @@ CONTEXT_FILE_MAX_CHARS = 20_000
CONTEXT_TRUNCATE_HEAD_RATIO = 0.7
CONTEXT_TRUNCATE_TAIL_RATIO = 0.2
# Dynamic-cap parameters (used when no explicit context_file_max_chars is set).
# The cap scales with the model's context window so large-context models rarely
# truncate a project doc, while small-context models stay at the historical
# 20K floor. ~4 chars/token is the usual English heuristic; we spend a small
# slice of the window on context files since they share the cached prefix with
# the system prompt, tools, memory, and the whole conversation.
_CONTEXT_FILE_CHARS_PER_TOKEN = 4
_CONTEXT_FILE_WINDOW_FRACTION = 0.06
_CONTEXT_FILE_DYNAMIC_CEILING = 500_000
def _dynamic_context_file_max_chars(context_length: Optional[int]) -> int:
"""Derive a char cap from the model's context window.
Returns at least ``CONTEXT_FILE_MAX_CHARS`` (the historical 20K floor) and
at most ``_CONTEXT_FILE_DYNAMIC_CEILING``. When ``context_length`` is
unknown/invalid, returns the flat default so behavior is unchanged.
"""
if not isinstance(context_length, int) or context_length <= 0:
return CONTEXT_FILE_MAX_CHARS
budget = int(
context_length * _CONTEXT_FILE_CHARS_PER_TOKEN * _CONTEXT_FILE_WINDOW_FRACTION
)
return max(CONTEXT_FILE_MAX_CHARS, min(budget, _CONTEXT_FILE_DYNAMIC_CEILING))
def _get_context_file_max_chars(context_length: Optional[int] = None) -> int:
"""Return the context-file truncation limit.
Resolution order:
1. Explicit ``context_file_max_chars`` in config.yaml — user knows best,
always wins (including over the dynamic cap).
2. Dynamic cap derived from the model's ``context_length`` when provided
(scales the budget to the window; floor 20K, ceiling 500K).
3. ``CONTEXT_FILE_MAX_CHARS`` (20K) as the upstream-compatible fallback.
"""
try:
from hermes_cli.config import load_config
val = load_config().get("context_file_max_chars")
if isinstance(val, (int, float)) and val > 0:
return int(val)
except Exception as e:
logger.debug("Could not read context_file_max_chars from config: %s", e)
return _dynamic_context_file_max_chars(context_length)
# Collect truncation warnings so the caller (run_agent) can surface them.
# A ContextVar (not a module-global list) isolates accumulation per thread /
# per async task, so concurrent gateway-session prompt builds can't drain or
# clear each other's pending warnings (cross-session leak). Each build runs in
# its own context, collects its own warnings, and drains them synchronously.
_truncation_warnings: "contextvars.ContextVar[Optional[list]]" = contextvars.ContextVar(
"context_file_truncation_warnings", default=None
)
def _record_truncation_warning(msg: str) -> None:
"""Append a truncation warning to the current context's accumulator."""
warnings = _truncation_warnings.get()
if warnings is None:
warnings = []
_truncation_warnings.set(warnings)
warnings.append(msg)
def drain_truncation_warnings() -> list:
"""Return and clear any truncation warnings accumulated in this context."""
warnings = _truncation_warnings.get()
if not warnings:
return []
drained = list(warnings)
warnings.clear()
return drained
# =========================================================================
# Skills prompt cache
@@ -1580,47 +1463,19 @@ def build_nous_subscription_prompt(valid_tool_names: "set[str] | None" = None) -
# Context files (SOUL.md, AGENTS.md, .cursorrules)
# =========================================================================
def _truncate_content(
content: str,
filename: str,
max_chars: Optional[int] = None,
context_length: Optional[int] = None,
read_path: Optional[str] = None,
) -> str:
"""Head/tail truncation with a marker in the middle.
``filename`` is the human label used in warnings. ``read_path`` is the
concrete path the agent should ``read_file`` to recover the full content
(defaults to ``filename`` when not supplied). ``context_length`` lets the
cap scale to the model's window when no explicit config override is set.
"""
if max_chars is None:
max_chars = _get_context_file_max_chars(context_length)
def _truncate_content(content: str, filename: str, max_chars: int = CONTEXT_FILE_MAX_CHARS) -> str:
"""Head/tail truncation with a marker in the middle."""
if len(content) <= max_chars:
return content
target = read_path or filename
msg = (
f"⚠️ Context file {filename} TRUNCATED: "
f"{len(content)} chars exceeds limit of {max_chars}"
f"trim the file, pin a larger context_file_max_chars, or use a "
f"larger-context model!"
)
logger.warning(msg)
_record_truncation_warning(msg)
head_chars = int(max_chars * CONTEXT_TRUNCATE_HEAD_RATIO)
tail_chars = int(max_chars * CONTEXT_TRUNCATE_TAIL_RATIO)
head = content[:head_chars]
tail = content[-tail_chars:]
marker = (
f"\n\n[...truncated {filename}: kept {head_chars}+{tail_chars} of "
f"{len(content)} chars. The middle is omitted — if you need the full "
f"instructions, read the complete file with the read_file tool: "
f"{target}]\n\n"
)
marker = f"\n\n[...truncated {filename}: kept {head_chars}+{tail_chars} of {len(content)} chars. Use file tools to read the full file.]\n\n"
return head + marker + tail
def load_soul_md(context_length: Optional[int] = None) -> Optional[str]:
def load_soul_md() -> Optional[str]:
"""Load SOUL.md from HERMES_HOME and return its content, or None.
Used as the agent identity (slot #1 in the system prompt). When this
@@ -1641,17 +1496,14 @@ def load_soul_md(context_length: Optional[int] = None) -> Optional[str]:
if not content:
return None
content = _scan_context_content(content, "SOUL.md")
content = _truncate_content(
content, "SOUL.md", context_length=context_length,
read_path=str(soul_path),
)
content = _truncate_content(content, "SOUL.md")
return content
except Exception as e:
logger.debug("Could not read SOUL.md from %s: %s", soul_path, e)
return None
def _load_hermes_md(cwd_path: Path, context_length: Optional[int] = None) -> str:
def _load_hermes_md(cwd_path: Path) -> str:
""".hermes.md / HERMES.md — walk to git root."""
hermes_md_path = _find_hermes_md(cwd_path)
if not hermes_md_path:
@@ -1668,16 +1520,13 @@ def _load_hermes_md(cwd_path: Path, context_length: Optional[int] = None) -> str
pass
content = _scan_context_content(content, rel)
result = f"## {rel}\n\n{content}"
return _truncate_content(
result, ".hermes.md", context_length=context_length,
read_path=str(hermes_md_path),
)
return _truncate_content(result, ".hermes.md")
except Exception as e:
logger.debug("Could not read %s: %s", hermes_md_path, e)
return ""
def _load_agents_md(cwd_path: Path, context_length: Optional[int] = None) -> str:
def _load_agents_md(cwd_path: Path) -> str:
"""AGENTS.md — top-level only (no recursive walk)."""
for name in ["AGENTS.md", "agents.md"]:
candidate = cwd_path / name
@@ -1687,16 +1536,13 @@ def _load_agents_md(cwd_path: Path, context_length: Optional[int] = None) -> str
if content:
content = _scan_context_content(content, name)
result = f"## {name}\n\n{content}"
return _truncate_content(
result, "AGENTS.md", context_length=context_length,
read_path=str(candidate),
)
return _truncate_content(result, "AGENTS.md")
except Exception as e:
logger.debug("Could not read %s: %s", candidate, e)
return ""
def _load_claude_md(cwd_path: Path, context_length: Optional[int] = None) -> str:
def _load_claude_md(cwd_path: Path) -> str:
"""CLAUDE.md / claude.md — cwd only."""
for name in ["CLAUDE.md", "claude.md"]:
candidate = cwd_path / name
@@ -1706,16 +1552,13 @@ def _load_claude_md(cwd_path: Path, context_length: Optional[int] = None) -> str
if content:
content = _scan_context_content(content, name)
result = f"## {name}\n\n{content}"
return _truncate_content(
result, "CLAUDE.md", context_length=context_length,
read_path=str(candidate),
)
return _truncate_content(result, "CLAUDE.md")
except Exception as e:
logger.debug("Could not read %s: %s", candidate, e)
return ""
def _load_cursorrules(cwd_path: Path, context_length: Optional[int] = None) -> str:
def _load_cursorrules(cwd_path: Path) -> str:
""".cursorrules + .cursor/rules/*.mdc — cwd only."""
cursorrules_content = ""
cursorrules_file = cwd_path / ".cursorrules"
@@ -1742,17 +1585,10 @@ def _load_cursorrules(cwd_path: Path, context_length: Optional[int] = None) -> s
if not cursorrules_content:
return ""
return _truncate_content(
cursorrules_content, ".cursorrules", context_length=context_length,
read_path=str(cwd_path / ".cursorrules"),
)
return _truncate_content(cursorrules_content, ".cursorrules")
def build_context_files_prompt(
cwd: Optional[str] = None,
skip_soul: bool = False,
context_length: Optional[int] = None,
) -> str:
def build_context_files_prompt(cwd: Optional[str] = None, skip_soul: bool = False) -> str:
"""Discover and load context files for the system prompt.
Priority (first found wins — only ONE project context type is loaded):
@@ -1762,11 +1598,7 @@ def build_context_files_prompt(
4. .cursorrules / .cursor/rules/*.mdc (cwd only)
SOUL.md from HERMES_HOME is independent and always included when present.
Each context source is capped before injection. The cap defaults to the
model's context window (scaled — see ``_dynamic_context_file_max_chars``)
when *context_length* is provided, falling back to 20,000 chars otherwise.
An explicit ``context_file_max_chars`` in config.yaml always wins.
Each context source is capped at 20,000 chars.
When *skip_soul* is True, SOUL.md is not included here (it was already
loaded via ``load_soul_md()`` for the identity slot).
@@ -1779,17 +1611,17 @@ def build_context_files_prompt(
# Priority-based project context: first match wins
project_context = (
_load_hermes_md(cwd_path, context_length)
or _load_agents_md(cwd_path, context_length)
or _load_claude_md(cwd_path, context_length)
or _load_cursorrules(cwd_path, context_length)
_load_hermes_md(cwd_path)
or _load_agents_md(cwd_path)
or _load_claude_md(cwd_path)
or _load_cursorrules(cwd_path)
)
if project_context:
sections.append(project_context)
# SOUL.md from HERMES_HOME only — skip when already loaded as identity
if not skip_soul:
soul_content = load_soul_md(context_length)
soul_content = load_soul_md()
if soul_content:
sections.append(soul_content)

View File

@@ -0,0 +1,8 @@
"""Egress proxy integrations.
Currently ships an iron-proxy (ironsh/iron-proxy) wrapper that intercepts
outbound traffic from remote terminal sandboxes and swaps proxy tokens
for real upstream credentials at the network edge.
Design notes live in :mod:`agent.proxy_sources.iron_proxy`.
"""

File diff suppressed because it is too large Load Diff

View File

@@ -104,7 +104,6 @@ _PREFIX_PATTERNS = [
r"mem0_[A-Za-z0-9]{10,}", # Mem0 Platform API key
r"brv_[A-Za-z0-9]{10,}", # ByteRover API key
r"xai-[A-Za-z0-9]{30,}", # xAI (Grok) API key
r"ntn_[A-Za-z0-9]{10,}", # Notion internal integration token
]
# ENV assignment patterns: KEY=value where KEY contains a secret-like name

View File

@@ -1,205 +0,0 @@
"""Profile-scoped credential resolution for multi-profile gateway multiplexing.
The multiplexing gateway serves many profiles from one process. Each profile
has its own ``.env`` with its own provider keys and platform tokens, so we
**cannot** union them into the process-global ``os.environ`` (that would leak
profile A's keys to profile B's turns, and to every subprocess spawned with
``env=dict(os.environ)``).
This module provides a fail-closed, context-local secret scope:
- ``set_secret_scope(mapping)`` installs the active profile's secrets for the
current task (a contextvar, so it propagates into the agent's worker thread
via ``copy_context()`` exactly like the HERMES_HOME override).
- ``get_secret(name)`` reads from that scope. When multiplexing is **active**
and no scope is set, it RAISES rather than silently falling back to
``os.environ`` — an un-migrated or newly-added call site fails loud at that
exact line instead of leaking another profile's value. When multiplexing is
**off** (the default), it transparently reads ``os.environ`` so the
single-profile gateway and every non-gateway caller behave exactly as before.
Design rationale lives in ``docs/design/multiplexing-gateway.md`` (Workstream A).
"""
from __future__ import annotations
import os
from contextvars import ContextVar, Token
from pathlib import Path
from typing import Dict, Mapping, Optional
# ── multiplex-active flag ────────────────────────────────────────────────
# Process-global: set once at gateway startup when gateway.multiplex_profiles
# is true. Governs whether get_secret() fails closed on an unscoped read.
# A plain module global (not a contextvar): it describes the deployment mode,
# not a per-task value.
_MULTIPLEX_ACTIVE: bool = False
def set_multiplex_active(active: bool) -> None:
"""Mark whether the process is running as a profile multiplexer.
Called once at gateway startup. When True, ``get_secret`` fails closed on
an unscoped read instead of falling back to ``os.environ``.
"""
global _MULTIPLEX_ACTIVE
_MULTIPLEX_ACTIVE = bool(active)
def is_multiplex_active() -> bool:
"""Return whether the process is running as a profile multiplexer."""
return _MULTIPLEX_ACTIVE
# ── the secret scope contextvar ──────────────────────────────────────────
_SECRET_SCOPE: ContextVar[Optional[Mapping[str, str]]] = ContextVar(
"_SECRET_SCOPE", default=None
)
class UnscopedSecretError(RuntimeError):
"""Raised when a secret is read in multiplex mode with no scope installed.
This is the fail-closed signal: it means a credential read reached
``get_secret`` without a profile scope active, which in a multiplexer would
otherwise leak whichever profile's value happened to be in ``os.environ``.
The fix is to wrap the call path in ``set_secret_scope(...)`` (the per-turn
/ per-adapter profile scope), not to widen the allowlist.
"""
def set_secret_scope(secrets: Optional[Mapping[str, str]]) -> Token:
"""Install the active profile's secret mapping for the current context.
Returns a token for ``reset_secret_scope``. Pass ``None`` to clear.
"""
return _SECRET_SCOPE.set(secrets)
def reset_secret_scope(token: Token) -> None:
"""Restore the previous secret scope."""
_SECRET_SCOPE.reset(token)
def current_secret_scope() -> Optional[Mapping[str, str]]:
"""Return the active secret mapping, or None when no scope is installed."""
return _SECRET_SCOPE.get()
# ── genuinely-global env vars (NOT per-profile secrets) ──────────────────
# These are process/deployment-level settings, not profile credentials. They
# legitimately live in os.environ and must keep reading from it even in
# multiplex mode — routing them through the fail-closed path would wrongly
# crash. Anything matching is read from os.environ regardless of scope.
#
# Membership test is by exact name OR prefix (see _is_global_env). Keep this
# list tight: when in doubt a value is a profile secret, not a global.
_GLOBAL_ENV_EXACT = frozenset({
# Hermes runtime / deployment
"HERMES_HOME", "HERMES_PROFILE", "HERMES_GATEWAY_LOCK_DIR",
"HERMES_MAX_ITERATIONS", "HERMES_MAX_TOKENS", "HERMES_API_TIMEOUT",
"HERMES_REDACT_SECRETS", "HERMES_NOUS_TIMEOUT_SECONDS",
"_HERMES_GATEWAY",
# OS / interpreter
"PATH", "HOME", "USER", "LANG", "LC_ALL", "TZ", "PWD", "SHELL", "TMPDIR",
"VIRTUAL_ENV", "PYTHONPATH", "SSL_CERT_FILE",
# Kanban paths (per-board, not per-profile-secret)
"HERMES_KANBAN_DB", "HERMES_KANBAN_WORKSPACES_ROOT", "HERMES_KANBAN_BOARD",
})
_GLOBAL_ENV_PREFIXES = (
"HERMES_KANBAN_",
"HERMES_TELEGRAM_", # tuning knobs (batch delays, fallback toggles) — NOT the token
"TERMINAL_", # terminal/sandbox backend settings
)
def _is_global_env(name: str) -> bool:
"""Return True for genuinely process-global (non-profile-secret) env vars."""
if name in _GLOBAL_ENV_EXACT:
return True
return any(name.startswith(p) for p in _GLOBAL_ENV_PREFIXES)
def get_secret(name: str, default: Optional[str] = None) -> Optional[str]:
"""Resolve a credential by env-var name, honoring the active profile scope.
Resolution order:
1. Genuinely-global vars (``_is_global_env``) always read ``os.environ`` —
they are deployment settings, not profile secrets.
2. When a secret scope is installed (multiplexed turn), read from it; an
absent key returns ``default``. The scope is authoritative — we do NOT
fall through to ``os.environ``, because in a multiplexer ``os.environ``
may hold another profile's value.
3. No scope installed:
- multiplex INACTIVE (default deployment): read ``os.environ`` —
identical to the legacy ``os.getenv`` behavior every caller had before.
- multiplex ACTIVE: FAIL CLOSED. Raise ``UnscopedSecretError`` so the
missing scope is caught loudly instead of leaking a cross-profile value.
"""
if _is_global_env(name):
val = os.environ.get(name)
return val if val is not None else default
scope = _SECRET_SCOPE.get()
if scope is not None:
val = scope.get(name)
return val if val is not None else default
if _MULTIPLEX_ACTIVE:
raise UnscopedSecretError(
f"get_secret({name!r}) called with no profile secret scope active "
f"while multiplexing is on. This credential read must run inside a "
f"set_secret_scope(...) block (the per-turn / per-adapter profile "
f"scope). Reading os.environ here would risk leaking another "
f"profile's value. See docs/design/multiplexing-gateway.md "
f"(Workstream A)."
)
val = os.environ.get(name)
return val if val is not None else default
def load_env_file(env_path: Path) -> Dict[str, str]:
"""Parse a ``.env`` file into a plain dict WITHOUT touching ``os.environ``.
Used to load a profile's secrets into an isolated mapping for
``set_secret_scope``. Mirrors python-dotenv's basic parsing (KEY=VALUE,
``export`` prefix, ``#`` comments, optional matching quotes) but never
mutates the process environment — that isolation is the whole point.
"""
secrets: Dict[str, str] = {}
try:
text = env_path.read_text(encoding="utf-8")
except (FileNotFoundError, OSError, UnicodeDecodeError):
return secrets
for raw in text.splitlines():
line = raw.strip()
if not line or line.startswith("#"):
continue
if line.startswith("export "):
line = line[len("export "):].lstrip()
if "=" not in line:
continue
key, _, value = line.partition("=")
key = key.strip()
if not key:
continue
value = value.strip()
if len(value) >= 2 and value[0] == value[-1] and value[0] in ("'", '"'):
value = value[1:-1]
secrets[key] = value
return secrets
def build_profile_secret_scope(hermes_home: Path) -> Dict[str, str]:
"""Build a profile's secret mapping from its ``<home>/.env``.
Returns a fresh dict (safe to install via ``set_secret_scope``). Genuinely
global vars are intentionally NOT copied in — ``get_secret`` reads those
from ``os.environ`` directly, so the scope holds only profile secrets.
"""
return load_env_file(Path(hermes_home) / ".env")

View File

@@ -26,91 +26,6 @@ _skill_commands_platform: Optional[str] = None
_SKILL_INVALID_CHARS = re.compile(r"[^a-z0-9-]")
_SKILL_MULTI_HYPHEN = re.compile(r"-{2,}")
# ---------------------------------------------------------------------------
# Skill-scaffolding markers and the canonical extractor.
#
# When a user invokes a /skill (or /bundle), Hermes expands the turn into a
# model-facing message that embeds the full skill body plus scaffolding. That
# expanded text is what flows into the agent loop — and into memory providers
# via MemoryManager. Providers that store or embed the raw user turn (mem0,
# openviking, hindsight, retaindb, byterover, honcho, supermemory) would
# otherwise capture the entire skill body instead of what the user actually
# asked. ``extract_user_instruction_from_skill_message`` recovers just the
# user's instruction so memory stays clean.
#
# These markers MUST stay byte-identical to the builders below
# (``_build_skill_message`` here, ``build_bundle_invocation_message`` in
# agent/skill_bundles.py). They are co-located with the single-skill builder
# on purpose, and the bundle markers are asserted against the bundle builder in
# tests/openviking_plugin/test_openviking.py::test_skill_markers_match_hermes_scaffolding.
# ---------------------------------------------------------------------------
_SKILL_INVOCATION_PREFIX = "[IMPORTANT: The user has invoked the "
_SINGLE_SKILL_MARKER = "The full skill content is loaded below.]"
_SINGLE_SKILL_INSTRUCTION = (
"The user has provided the following instruction alongside the skill invocation: "
)
_RUNTIME_NOTE = "\n\n[Runtime note:"
_BUNDLE_MARKER = " skill bundle,"
_BUNDLE_USER_INSTRUCTION = "\nUser instruction: "
_BUNDLE_FIRST_SKILL_BLOCK = "\n\n[Loaded as part of the "
def extract_user_instruction_from_skill_message(content: Any) -> Optional[str]:
"""Recover the user's instruction from a slash-skill-expanded turn.
Returns:
- The original string unchanged when it is NOT skill scaffolding
(a normal user message passes straight through).
- The extracted user instruction when the scaffolding carried one.
- ``None`` when the content is skill scaffolding with no user
instruction (i.e. a bare ``/skill`` invocation). Callers that feed
memory providers should skip the turn in that case — there is no
user content worth storing.
"""
if not isinstance(content, str):
return None
if not content.startswith(_SKILL_INVOCATION_PREFIX):
return content
if _BUNDLE_MARKER in content:
return _extract_bundle_user_instruction(content)
if _SINGLE_SKILL_MARKER in content:
return _extract_single_skill_user_instruction(content)
return None
def _extract_single_skill_user_instruction(message: str) -> Optional[str]:
# Single-skill format appends the user instruction after the skill body, so
# the last occurrence is the user-provided one; the body may quote this text.
marker_idx = message.rfind(_SINGLE_SKILL_INSTRUCTION)
if marker_idx < 0:
return None
instruction = message[marker_idx + len(_SINGLE_SKILL_INSTRUCTION):]
runtime_idx = instruction.find(_RUNTIME_NOTE)
if runtime_idx >= 0:
instruction = instruction[:runtime_idx]
instruction = instruction.strip()
return instruction or None
def _extract_bundle_user_instruction(message: str) -> Optional[str]:
# Bundle format puts the user instruction before the loaded skills, so the
# first occurrence is the user-provided one.
marker_idx = message.find(_BUNDLE_USER_INSTRUCTION)
if marker_idx < 0:
return None
instruction = message[marker_idx + len(_BUNDLE_USER_INSTRUCTION):]
first_skill_idx = instruction.find(_BUNDLE_FIRST_SKILL_BLOCK)
if first_skill_idx >= 0:
instruction = instruction[:first_skill_idx]
instruction = instruction.strip()
return instruction or None
def _resolve_skill_commands_platform() -> Optional[str]:
"""Return the current platform scope used for disabled-skill filtering.

View File

@@ -43,20 +43,14 @@ EXCLUDED_SKILL_DIRS = frozenset(
)
)
# Supporting files live inside a skill package and are loaded explicitly via
# skill_view(skill, file_path=...). They are not standalone skills and must not
# be scanned for active SKILL.md/DESCRIPTION.md entries, even if a Curator or
# archive workflow preserves a complete old skill package under references/.
SKILL_SUPPORT_DIRS = frozenset(("references", "templates", "assets", "scripts"))
def is_excluded_skill_path(path) -> bool:
"""True if *path* should be skipped by active skill scanners.
"""True if any component of *path* is in EXCLUDED_SKILL_DIRS.
Use this on every ``SKILL.md`` path produced by direct ``rglob`` scans to
prune dependency, virtualenv, VCS, cache, and progressive-disclosure
support-package paths. Centralising the check here keeps every
skill-scanning site in sync with the shared exclusion set.
Use this on every SKILL.md path produced by ``rglob`` to prune
dependency, virtualenv, VCS, and cache directories. Centralising the
check here keeps every skill-scanning site in sync with the shared
exclusion set.
Accepts a Path or string.
"""
@@ -65,36 +59,7 @@ def is_excluded_skill_path(path) -> bool:
except AttributeError:
from pathlib import PurePath
parts = PurePath(str(path)).parts
return any(part in EXCLUDED_SKILL_DIRS for part in parts) or is_skill_support_path(
path
)
def is_skill_support_path(path) -> bool:
"""True if *path* is under a support dir of an actual skill root.
``references/``, ``templates/``, ``assets/``, and ``scripts/`` are
progressive-disclosure support areas when they sit directly inside a skill
directory containing ``SKILL.md``. They are not active discovery roots for
standalone skills. A preserved package such as
``some-skill/references/old-skill-package/SKILL.md`` is documentation data
unless the caller explicitly loads it via ``file_path``.
Legitimate categories or skill names such as ``skills/scripts/foo`` remain
discoverable because their ``scripts`` component is not directly under a
directory that contains ``SKILL.md``.
"""
path_obj = path if isinstance(path, Path) else Path(str(path))
parts = path_obj.parts
# Last component may be a file or candidate skill directory name. Only
# components before the leaf can be containing support directories.
for idx, part in enumerate(parts[:-1]):
if part not in SKILL_SUPPORT_DIRS or idx == 0:
continue
skill_root = Path(*parts[:idx])
if (skill_root / "SKILL.md").exists():
return True
return False
return any(part in EXCLUDED_SKILL_DIRS for part in parts)
# ── Lazy YAML loader ─────────────────────────────────────────────────────
@@ -696,21 +661,12 @@ def extract_skill_description(frontmatter: Dict[str, Any]) -> str:
def iter_skill_index_files(skills_dir: Path, filename: str):
"""Walk skills_dir yielding sorted paths matching *filename*.
Excludes Hermes metadata, VCS, virtualenv/dependency, cache, and skill
support directories. Support directories (references/templates/assets/
scripts) can contain arbitrary markdown and even archived package
``SKILL.md`` files, but they are progressive-disclosure data loaded through
``skill_view(..., file_path=...)`` rather than active skill roots.
Excludes Hermes metadata, VCS, virtualenv/dependency, and cache
directories so dependencies cannot register nested skills.
"""
matches = []
for root, dirs, files in os.walk(skills_dir, followlinks=True):
has_skill_md = "SKILL.md" in files
dirs[:] = [
d
for d in dirs
if d not in EXCLUDED_SKILL_DIRS
and not (has_skill_md and d in SKILL_SUPPORT_DIRS)
]
dirs[:] = [d for d in dirs if d not in EXCLUDED_SKILL_DIRS]
if filename in files:
matches.append(Path(root) / filename)
for path in sorted(matches, key=lambda p: str(p.relative_to(skills_dir))):

View File

@@ -33,7 +33,6 @@ from agent.prompt_builder import (
KANBAN_GUIDANCE,
MEMORY_GUIDANCE,
OPENAI_MODEL_EXECUTION_GUIDANCE,
PARALLEL_TOOL_CALL_GUIDANCE,
PLATFORM_HINTS,
SESSION_SEARCH_GUIDANCE,
SKILLS_GUIDANCE,
@@ -41,7 +40,6 @@ from agent.prompt_builder import (
TASK_COMPLETION_GUIDANCE,
TOOL_USE_ENFORCEMENT_GUIDANCE,
TOOL_USE_ENFORCEMENT_MODELS,
drain_truncation_warnings,
)
from agent.runtime_cwd import resolve_context_cwd
@@ -61,55 +59,6 @@ def _ra():
return run_agent
def _resolve_platform_hint(agent: Any, platform_key: str, default_hint: str) -> str:
"""Apply a per-platform prompt-hint override to the default hint.
Reads ``agent._platform_hint_overrides`` (populated from
``config.yaml`` ``platform_hints`` by ``agent_init``) and resolves the
effective hint for *platform_key*:
* ``replace`` — substitute the default hint entirely.
* ``append`` — keep the default and append the extra text.
* a bare string value — treated as ``append`` (convenience shorthand).
Precedence: ``replace`` wins over ``append`` if both are present.
Override text is added on top of (not instead of) the SOUL/context/
memory tiers — it only affects the platform-hint segment, so other
platforms are unaffected and general system instructions still apply.
Defensive: any malformed entry falls back to the unmodified default so
a bad config value can never break prompt assembly or leak across
platforms.
"""
if not platform_key:
return default_hint
overrides = getattr(agent, "_platform_hint_overrides", None)
if not isinstance(overrides, dict) or not overrides:
return default_hint
spec = overrides.get(platform_key)
if spec is None:
return default_hint
# Shorthand: a bare string is treated as append text.
if isinstance(spec, str):
extra = spec.strip()
return f"{default_hint}\n\n{extra}".strip() if extra else default_hint
if not isinstance(spec, dict):
return default_hint
replace_text = spec.get("replace")
if isinstance(replace_text, str) and replace_text.strip():
base = replace_text.strip()
else:
base = default_hint
append_text = spec.get("append")
if isinstance(append_text, str) and append_text.strip():
return f"{base}\n\n{append_text.strip()}".strip()
return base
def build_system_prompt_parts(agent: Any, system_message: Optional[str] = None) -> Dict[str, str]:
"""Assemble the system prompt as three ordered parts.
@@ -133,17 +82,6 @@ def build_system_prompt_parts(agent: Any, system_message: Optional[str] = None)
# we resolve through ``_ra()`` to honor those patches.
_r = _ra()
# Resolve the model's context window once so context-file caps can scale
# to it (dynamic cap — see prompt_builder._dynamic_context_file_max_chars).
# None falls back to the historical flat default. This value is stable for
# the life of the conversation, so it does not threaten prompt caching.
_ctx_len: Optional[int] = None
_cc = getattr(agent, "context_compressor", None)
if _cc is not None:
_cc_len = getattr(_cc, "context_length", None)
if isinstance(_cc_len, int) and _cc_len > 0:
_ctx_len = _cc_len
# ── Stable tier ────────────────────────────────────────────────
stable_parts: List[str] = []
@@ -152,7 +90,7 @@ def build_system_prompt_parts(agent: Any, system_message: Optional[str] = None)
# cwd project instructions disabled.
_soul_loaded = False
if agent.load_soul_identity or not agent.skip_context_files:
_soul_content = _r.load_soul_md(_ctx_len)
_soul_content = _r.load_soul_md()
if _soul_content:
stable_parts.append(_soul_content)
_soul_loaded = True
@@ -173,17 +111,6 @@ def build_system_prompt_parts(agent: Any, system_message: Optional[str] = None)
if getattr(agent, "_task_completion_guidance", True) and agent.valid_tool_names:
stable_parts.append(TASK_COMPLETION_GUIDANCE)
# Universal parallel-tool-call guidance. Tells the model to batch
# independent tool calls into one assistant turn rather than emitting one
# call per turn — the runtime already runs independent calls concurrently
# (read-only tools always; non-overlapping path-scoped file ops), so the
# only thing missing was steering the model to produce the batch. Cuts
# round-trips and the resent-context cost that compounds over a long
# conversation. Gated by config.yaml ``agent.parallel_tool_call_guidance``
# (default True) and only injected when tools are actually loaded.
if getattr(agent, "_parallel_tool_call_guidance", True) and agent.valid_tool_names:
stable_parts.append(PARALLEL_TOOL_CALL_GUIDANCE)
# Tool-aware behavioral guidance: only inject when the tools are loaded
tool_guidance = []
if "memory" in agent.valid_tool_names:
@@ -380,25 +307,18 @@ def build_system_prompt_parts(agent: Any, system_message: Optional[str] = None)
)
platform_key = (agent.platform or "").lower().strip()
# Resolve the built-in/plugin default hint for this platform, then apply
# any per-platform override from config (platform_hints.<platform>).
_default_hint = ""
if platform_key in PLATFORM_HINTS:
_default_hint = PLATFORM_HINTS[platform_key]
stable_parts.append(PLATFORM_HINTS[platform_key])
elif platform_key:
# Check plugin registry for platform-specific LLM guidance
try:
from gateway.platform_registry import platform_registry
_entry = platform_registry.get(platform_key)
if _entry and _entry.platform_hint:
_default_hint = _entry.platform_hint
stable_parts.append(_entry.platform_hint)
except Exception:
pass
_effective_hint = _resolve_platform_hint(agent, platform_key, _default_hint)
if _effective_hint:
stable_parts.append(_effective_hint)
# ── Context tier (cwd-dependent, may change between sessions) ─
context_parts: List[str] = []
@@ -413,8 +333,7 @@ def build_system_prompt_parts(agent: Any, system_message: Optional[str] = None)
# dir — the user's real cwd there, but the install dir for the gateway
# daemon, which is why the gateway sets TERMINAL_CWD.
context_files_prompt = _r.build_context_files_prompt(
cwd=resolve_context_cwd(), skip_soul=_soul_loaded,
context_length=_ctx_len)
cwd=resolve_context_cwd(), skip_soul=_soul_loaded)
if context_files_prompt:
context_parts.append(context_files_prompt)
@@ -481,14 +400,7 @@ def build_system_prompt(agent: Any, system_message: Optional[str] = None) -> str
warm across turns.
"""
parts = build_system_prompt_parts(agent, system_message=system_message)
joined = "\n\n".join(p for p in (parts["stable"], parts["context"], parts["volatile"]) if p)
# Surface context-file truncation warnings through the normal agent status
# channel so gateway/CLI users see them in chat instead of only in logs.
for warning in drain_truncation_warnings():
agent._emit_status(warning)
return joined
return "\n\n".join(p for p in (parts["stable"], parts["context"], parts["volatile"]) if p)
def invalidate_system_prompt(agent: Any) -> None:

View File

@@ -1012,42 +1012,28 @@ def execute_tool_calls_sequential(agent, assistant_message, messages: list, effe
elif function_name == "memory":
def _execute(next_args: dict) -> Any:
target = next_args.get("target", "memory")
operations = next_args.get("operations")
from tools.memory_tool import memory_tool as _memory_tool
result = _memory_tool(
action=next_args.get("action"),
target=target,
content=next_args.get("content"),
old_text=next_args.get("old_text"),
operations=operations,
store=agent._memory_store,
)
# Bridge: notify external memory provider of built-in memory writes.
# Covers both the single-op shape and each add/replace inside a batch.
if agent._memory_manager:
if operations:
_mem_ops = [
op for op in operations
if isinstance(op, dict) and op.get("action") in {"add", "replace"}
]
else:
_mem_ops = (
[{"action": next_args.get("action"), "content": next_args.get("content")}]
if next_args.get("action") in {"add", "replace"} else []
# Bridge: notify external memory provider of built-in memory writes
if agent._memory_manager and next_args.get("action") in {"add", "replace"}:
try:
agent._memory_manager.on_memory_write(
next_args.get("action", ""),
target,
next_args.get("content", ""),
metadata=agent._build_memory_write_metadata(
task_id=effective_task_id,
tool_call_id=getattr(tool_call, "id", None),
),
)
for _op in _mem_ops:
try:
agent._memory_manager.on_memory_write(
_op.get("action", ""),
target,
_op.get("content", "") or "",
metadata=agent._build_memory_write_metadata(
task_id=effective_task_id,
tool_call_id=getattr(tool_call, "id", None),
),
)
except Exception:
pass
except Exception:
pass
return result
function_result, function_args = _run_agent_tool_execution_middleware(
agent,

View File

@@ -88,7 +88,7 @@ class AnthropicTransport(ProviderTransport):
from agent.transports.types import ToolCall
strip_tool_prefix = kwargs.get("strip_tool_prefix", False)
_MCP_PREFIX = "mcp__"
_MCP_PREFIX = "mcp_"
text_parts = []
reasoning_parts = []
@@ -132,25 +132,17 @@ class AnthropicTransport(ProviderTransport):
elif block.type == "tool_use":
name = block.name
if strip_tool_prefix and name.startswith(_MCP_PREFIX):
# On the OAuth wire every tool carries a double-underscore
# ``mcp__`` prefix (added in build_anthropic_kwargs to avoid
# Anthropic's single-underscore third-party classifier).
# Reverse it back to the name the registry/dispatcher knows.
# Two original forms map onto the same ``mcp__`` wire name:
# ``mcp__read_file`` <- bare native tool ``read_file``
# ``mcp__linear_get_issue`` <- MCP server tool
# ``mcp_linear_get_issue``
# Resolve by registry lookup, preferring whichever original
# is actually registered; never rewrite a name the LLM used
# that already resolves natively. GH-25255.
stripped = name[len(_MCP_PREFIX):]
# Only strip the mcp_ prefix for OAuth-injected tools
# (where Hermes adds the prefix when sending to Anthropic
# and must remove it on the way back). Native MCP server
# tools (from mcp_servers: in config.yaml) are registered
# in the tool registry under their FULL mcp_<server>_<tool>
# name and must NOT be stripped. GH-25255.
from tools.registry import registry as _tool_registry
if not _tool_registry.get_entry(name):
bare = name[len(_MCP_PREFIX):] # read_file
single = "mcp_" + bare # mcp_read_file / mcp_linear_get_issue
if _tool_registry.get_entry(single):
name = single
elif _tool_registry.get_entry(bare):
name = bare
if (_tool_registry.get_entry(stripped)
and not _tool_registry.get_entry(name)):
name = stripped
tool_calls.append(
ToolCall(
id=block.id,

View File

@@ -531,7 +531,6 @@ class ChatCompletionsTransport(ProviderTransport):
supports_reasoning=params.get("supports_reasoning", False),
qwen_session_metadata=params.get("qwen_session_metadata"),
model=model,
base_url=params.get("base_url"),
ollama_num_ctx=params.get("ollama_num_ctx"),
session_id=params.get("session_id"),
)

View File

@@ -128,65 +128,6 @@ class ResponsesApiTransport(ProviderTransport):
reasoning_effort = _effort_clamp.get(reasoning_effort, reasoning_effort)
response_tools = _responses_tools(tools)
# xAI server-side web search.
#
# grok models on xAI's /v1/responses surface (notably
# grok-composer-2.5-fast on SuperGrok OAuth) have a *native*,
# server-executed web search. When the model is handed a
# client-side function literally named ``web_search``, it routes
# the intent to that native engine — but because the tool is
# declared as a plain ``function`` rather than xAI's first-class
# ``{"type": "web_search"}`` built-in, the server-side search is
# dispatched but never reconciled: the response streams reasoning
# + ``web_search_call`` progress items, the searches never reach
# ``status="completed"`` in the assembled output, no final
# message is emitted, and ``_normalize_codex_response`` correctly
# sees reasoning-with-no-answer and reports ``incomplete``. The
# turn then burns 3 continuation retries and fails with "Codex
# response remained incomplete after 3 continuation attempts".
# Verified live against grok-composer-2.5-fast (2026-06).
#
# Fix: when the agent HAS a client-side ``web_search`` function (i.e.
# the user enabled the web toolset), declare xAI's native
# ``web_search`` built-in instead so the search actually runs to
# completion server-side and the model streams a real answer. The
# Responses API rejects two tools sharing the name ``web_search``
# (HTTP 400 "Duplicate tool names"), so we drop the client-side
# ``web_search`` function for the xAI path and let the native tool
# satisfy it. All other client-side tools (read_file, terminal,
# web_extract, MCP tools, …) are untouched and continue to dispatch
# through Hermes's agent loop.
#
# Scope: we ONLY swap in the native built-in when the client
# ``web_search`` was actually present. We do NOT force-enable Grok
# server-side search on turns where the user never had web enabled —
# that would silently route around Hermes's web-provider config and
# tool-trace/citation plumbing for every xai-oauth turn. The swap is
# a 1:1 replacement of an already-requested capability, not an
# additive grant.
#
# NOTE: for the swapped case this routes ``web_search`` to Grok's
# native search engine for xAI sessions instead of Hermes's
# configured web provider (Tavily/etc.), and those results bypass
# Hermes's tool-trace / citation plumbing (they arrive baked into the
# model's answer rather than as a tool result the loop observes).
# Scoped to ``is_xai_responses`` deliberately; narrow to specific
# models if a future grok variant should keep the client-side
# function.
if is_xai_responses and response_tools:
has_client_web_search = any(
isinstance(t, dict) and t.get("name") == "web_search"
for t in response_tools
)
if has_client_web_search:
filtered = [
t for t in response_tools
if not (isinstance(t, dict) and t.get("name") == "web_search")
]
filtered.append({"type": "web_search"})
response_tools = filtered
# ``tools`` MUST be omitted entirely when there are no functions to
# expose: the openai SDK's ``responses.stream()`` / ``responses.parse()``
# eagerly call ``_make_tools(tools)`` which does ``for tool in tools``
@@ -277,28 +218,10 @@ class ResponsesApiTransport(ProviderTransport):
kwargs.pop("timeout", None)
if is_codex_backend:
# The Codex backend rejects body-level ``extra_headers`` with
# HTTP 400, but the OpenAI SDK's ``extra_headers`` kwarg maps
# to actual HTTP request headers (not body fields). We need
# these headers for cache-scope routing so prompt cache hits
# remain high. Send session_id / x-client-request-id as HTTP
# headers while keeping ``prompt_cache_key`` in the body for
# standard OpenAI routing as a belt-and-braces fallback.
cache_scope_id = str(session_id or "").strip()
if cache_scope_id:
existing_extra_headers = kwargs.get("extra_headers")
merged_extra_headers: Dict[str, str] = {}
if isinstance(existing_extra_headers, dict):
merged_extra_headers.update(
{
str(key): str(value)
for key, value in existing_extra_headers.items()
if key and value is not None
}
)
merged_extra_headers["session_id"] = cache_scope_id
merged_extra_headers["x-client-request-id"] = cache_scope_id
kwargs["extra_headers"] = merged_extra_headers
# chatgpt.com/backend-api/codex rejects body-level
# ``extra_headers`` with HTTP 400. Correlation/cache routing for
# this backend must not be sent through the Responses payload.
kwargs.pop("extra_headers", None)
max_tokens = params.get("max_tokens")
if max_tokens is not None and not is_codex_backend:

View File

@@ -69,7 +69,6 @@ def build_turn_context(
task_id: Optional[str],
stream_callback,
persist_user_message: Optional[str],
persist_user_timestamp: Optional[float] = None,
*,
restore_or_build_system_prompt,
install_safe_stdio,
@@ -122,7 +121,6 @@ def build_turn_context(
agent._stream_callback = stream_callback
agent._persist_user_message_idx = None
agent._persist_user_message_override = persist_user_message
agent._persist_user_message_timestamp = persist_user_timestamp
# Generate unique task_id if not provided to isolate VMs between tasks.
effective_task_id = task_id or str(uuid.uuid4())
agent._current_task_id = effective_task_id

View File

@@ -16,7 +16,7 @@
},
"dependencies": {
"@nous-research/ui": "0.16.0",
"@tailwindcss/vite": "^4.2.4",
"@tailwindcss/vite": "^4.2.1",
"@tailwindcss/typography": "^0.5.19",
"@tauri-apps/api": "^2.0.0",
"@tauri-apps/plugin-dialog": "^2.0.0",
@@ -40,8 +40,8 @@
"@tauri-apps/cli": "^2.0.0",
"@types/react": "^19.2.14",
"@types/react-dom": "^19.2.3",
"@vitejs/plugin-react": "^6.0.2",
"@vitejs/plugin-react": "^5.2.0",
"typescript": "^6.0.3",
"vite": "^8.0.16"
"vite": "^7.3.1"
}
}

View File

@@ -286,7 +286,7 @@ async fn run_update(app: AppHandle) -> Result<()> {
emit_stage(&app, "rebuild", StageState::Running, None, None);
let started = Instant::now();
let rebuild_args: Vec<String> = vec!["desktop".into(), "--build-only".into()];
let mut rebuild = run_streamed(
let rebuild = run_streamed(
&app,
&hermes,
&rebuild_args,
@@ -295,33 +295,6 @@ async fn run_update(app: AppHandle) -> Result<()> {
Some("rebuild"),
)
.await?;
// Retry-once: the first `--build-only` can return nonzero on a still-settling
// post-update tree or a network-blocked Electron fetch that our self-heal
// repaired mid-run. A second attempt then builds clean off the healed dist
// (the content-hash stamp makes it a near-no-op when the first actually
// succeeded). Without this the updater bails here and never reaches the
// relaunch below — the app updates but doesn't restart. Matches the
// retry-once `hermes update` already does above, and `hermes update`'s own
// desktop rebuild in cmd_update.
if rebuild_needs_retry(rebuild.exit_code) {
emit_log(
&app,
Some("rebuild"),
LogStream::Stdout,
"[rebuild] first desktop rebuild failed; retrying once (a self-healed \
Electron download builds clean on the second run)…",
);
rebuild = run_streamed(
&app,
&hermes,
&rebuild_args,
&install_root,
&child_env,
Some("rebuild"),
)
.await?;
}
let rebuild_ms = started.elapsed().as_millis() as u64;
if rebuild.exit_code != Some(0) {
@@ -560,14 +533,6 @@ fn is_locked(path: &Path) -> bool {
}
}
/// Whether the `desktop --build-only` rebuild should be retried once. Any
/// non-success exit qualifies: the common cause is a transient first-attempt
/// failure (still-settling tree / self-healed Electron download) that a clean
/// second run resolves.
fn rebuild_needs_retry(exit_code: Option<i32>) -> bool {
exit_code != Some(0)
}
/// Spawn `hermes <args>` from `cwd`, stream stdout/stderr as Log events on the
/// bootstrap channel, and return the exit code. Mirrors powershell::run_script
/// but for an arbitrary command (no install.ps1 -File wrapping).
@@ -1005,16 +970,6 @@ mod tests {
assert_eq!(update_branch_from_args(["--update"]), None);
}
#[test]
fn rebuild_retries_only_on_failure() {
assert!(!rebuild_needs_retry(Some(0)), "a clean rebuild must not retry");
assert!(rebuild_needs_retry(Some(1)), "a failed rebuild retries once");
assert!(
rebuild_needs_retry(None),
"a killed/signalled rebuild (no exit code) retries once"
);
}
#[test]
fn parses_only_app_targets() {
assert_eq!(

View File

@@ -1,8 +1,8 @@
{
"compilerOptions": {
"target": "ES2023",
"target": "ES2022",
"useDefineForClassFields": true,
"lib": ["ES2023", "DOM", "DOM.Iterable"],
"lib": ["ES2022", "DOM", "DOM.Iterable"],
"module": "ESNext",
"skipLibCheck": true,
"moduleResolution": "bundler",

View File

@@ -34,7 +34,7 @@ It builds and launches the GUI against your existing install — same config, ke
### Prebuilt installers
Prebuilt installers are built and distributed via [the Hermes Desktop website.](https://hermes-agent.nousresearch.com/).
Prebuilt installers are built and distributed via [the Hermes Desktop website.](https://hermes-agent.nousresearch.com/desktop).
---

View File

@@ -166,39 +166,6 @@ function profileRemoteOverride(config, profile) {
return { url, authMode: normAuthMode(entry.authMode), token: entry.token }
}
/**
* In global-remote mode one backend serves every Desktop profile, so REST calls
* that are scoped by renderer-side `request.profile` must carry that scope as a
* query parameter. Local pooled backends and per-profile remote overrides do not
* need this: they already run against a backend scoped to the target profile.
*/
function pathWithGlobalRemoteProfile(path, profile, opts = {}) {
const scopedProfile = connectionScopeKey(profile)
if (!scopedProfile || !opts.globalRemote || opts.profileRemoteOverride) {
return path
}
const rawPath = String(path || '')
if (!rawPath) {
return path
}
let parsed
try {
parsed = new URL(rawPath, 'http://hermes.local')
} catch {
return path
}
if (parsed.searchParams.has('profile')) {
return path
}
parsed.searchParams.set('profile', scopedProfile)
return `${parsed.pathname}${parsed.search}${parsed.hash}`
}
function tokenPreview(value) {
const raw = String(value || '')
@@ -280,7 +247,6 @@ module.exports = {
cookiesHaveLiveSession,
normAuthMode,
normalizeRemoteBaseUrl,
pathWithGlobalRemoteProfile,
profileRemoteOverride,
resolveAuthMode,
resolveTestWsUrl,

View File

@@ -24,7 +24,6 @@ const {
cookiesHaveLiveSession,
normAuthMode,
normalizeRemoteBaseUrl,
pathWithGlobalRemoteProfile,
profileRemoteOverride,
resolveAuthMode,
resolveTestWsUrl,
@@ -91,72 +90,6 @@ test('profileRemoteOverride tolerates a missing/!object profiles map', () => {
assert.equal(profileRemoteOverride(null, 'coder'), null)
})
// --- pathWithGlobalRemoteProfile ---
test('pathWithGlobalRemoteProfile appends profile in global remote mode', () => {
assert.equal(
pathWithGlobalRemoteProfile('/api/model/info', 'iris', {
globalRemote: true,
profileRemoteOverride: false
}),
'/api/model/info?profile=iris'
)
})
test('pathWithGlobalRemoteProfile preserves existing query params', () => {
assert.equal(
pathWithGlobalRemoteProfile('/api/model/options?force=1', 'iris', {
globalRemote: true,
profileRemoteOverride: false
}),
'/api/model/options?force=1&profile=iris'
)
})
test('pathWithGlobalRemoteProfile does not replace an explicit profile query', () => {
assert.equal(
pathWithGlobalRemoteProfile('/api/model/info?profile=default', 'iris', {
globalRemote: true,
profileRemoteOverride: false
}),
'/api/model/info?profile=default'
)
})
test('pathWithGlobalRemoteProfile skips local and per-profile remote override paths', () => {
assert.equal(
pathWithGlobalRemoteProfile('/api/model/info', 'iris', {
globalRemote: false,
profileRemoteOverride: false
}),
'/api/model/info'
)
assert.equal(
pathWithGlobalRemoteProfile('/api/model/info', 'iris', {
globalRemote: true,
profileRemoteOverride: true
}),
'/api/model/info'
)
})
test('pathWithGlobalRemoteProfile skips empty profile/path safely', () => {
assert.equal(
pathWithGlobalRemoteProfile('/api/model/info', '', {
globalRemote: true,
profileRemoteOverride: false
}),
'/api/model/info'
)
assert.equal(
pathWithGlobalRemoteProfile('', 'iris', {
globalRemote: true,
profileRemoteOverride: false
}),
''
)
})
// --- normalizeRemoteBaseUrl ---
test('normalizeRemoteBaseUrl strips trailing slashes, hash, and query', () => {

View File

@@ -28,7 +28,6 @@ const { detectRemoteDisplay, isWindowsBinaryPathInWsl, isWslEnvironment } = requ
const { runBootstrap } = require('./bootstrap-runner.cjs')
const {
buildSessionWindowUrl,
chatWindowWebPreferences,
createSessionWindowRegistry,
SESSION_WINDOW_MIN_HEIGHT,
SESSION_WINDOW_MIN_WIDTH
@@ -40,12 +39,10 @@ const { waitForDashboardPort } = require('./backend-ready.cjs')
const { serializeJsonBody, setJsonRequestHeaders } = require('./oauth-net-request.cjs')
const { fetchMarketplaceThemes, searchMarketplaceThemes } = require('./vscode-marketplace.cjs')
const { buildDesktopBackendEnv, normalizeHermesHomeRoot } = require('./backend-env.cjs')
const { readWindowsUserEnvVar } = require('./windows-user-env.cjs')
const { readDirForIpc } = require('./fs-read-dir.cjs')
const { gitRootForIpc } = require('./git-root.cjs')
const { worktreesForIpc } = require('./git-worktrees.cjs')
const { OFFICIAL_REPO_HTTPS_URL, isOfficialSshRemote } = require('./update-remote.cjs')
const { runRebuildWithRetry } = require('./update-rebuild.cjs')
const {
buildPosixCleanupScript,
buildWindowsCleanupScript,
@@ -65,7 +62,6 @@ const {
cookiesHaveLiveSession,
normAuthMode,
normalizeRemoteBaseUrl,
pathWithGlobalRemoteProfile,
profileRemoteOverride,
resolveAuthMode,
resolveTestWsUrl,
@@ -246,16 +242,6 @@ if (INSTALL_STAMP) {
function resolveHermesHome() {
if (process.env.HERMES_HOME) return normalizeHermesHomeRoot(process.env.HERMES_HOME)
if (USER_DATA_OVERRIDE) return path.join(path.resolve(USER_DATA_OVERRIDE), 'hermes-home')
if (IS_WINDOWS) {
// A GUI app launched from Explorer inherits the environment block captured
// at login, so a HERMES_HOME set via `setx` AFTER login is invisible in
// process.env even though the CLI (a fresh shell) sees it. Without this the
// backend silently falls back to %LOCALAPPDATA%\hermes and reports "No
// inference provider configured" despite a valid configured home (#45471).
// Consult the live User-scoped registry value before the default below.
const fromRegistry = readWindowsUserEnvVar('HERMES_HOME')
if (fromRegistry) return normalizeHermesHomeRoot(fromRegistry)
}
if (IS_WINDOWS && process.env.LOCALAPPDATA) {
const localappdata = path.join(process.env.LOCALAPPDATA, 'hermes')
const legacy = path.join(app.getPath('home'), '.hermes')
@@ -2010,14 +1996,10 @@ async function applyUpdatesPosixInApp() {
}
emitUpdateProgress({ stage: 'rebuild', message: 'Rebuilding the desktop app…', percent: 60 })
// Retry-once: a first rebuild can fail on a still-settling tree or a
// self-healed (network-blocked) Electron download; a second run builds clean
// off the healed dist so we reach the swap+relaunch below instead of bailing.
const rebuilt = await runRebuildWithRetry(attempt => {
if (attempt > 0) {
emitUpdateProgress({ stage: 'rebuild', message: 'Retrying the desktop rebuild…', percent: 60 })
}
return runStreamedUpdate(hermes, ['desktop', '--build-only'], { cwd: updateRoot, env, stage: 'rebuild' })
const rebuilt = await runStreamedUpdate(hermes, ['desktop', '--build-only'], {
cwd: updateRoot,
env,
stage: 'rebuild'
})
if (rebuilt.code !== 0) {
emitUpdateProgress({
@@ -5090,68 +5072,65 @@ function focusWindow(win) {
win.focus()
}
function spawnSecondaryWindow({ sessionId, watch, newSession } = {}) {
const icon = getAppIconPath()
const win = new BrowserWindow({
width: SESSION_WINDOW_MIN_WIDTH,
height: SESSION_WINDOW_MIN_HEIGHT,
minWidth: SESSION_WINDOW_MIN_WIDTH,
minHeight: SESSION_WINDOW_MIN_HEIGHT,
title: 'Hermes',
titleBarStyle: 'hidden',
titleBarOverlay: getTitleBarOverlayOptions(),
trafficLightPosition: IS_MAC ? WINDOW_BUTTON_POSITION : undefined,
vibrancy: IS_MAC ? 'sidebar' : undefined,
opacity: windowOpacity(),
icon,
// Don't show until the renderer's first themed paint is ready. macOS
// `vibrancy` ignores `backgroundColor` and paints a translucent OS
// material (which follows the OS appearance, not the app theme), so a
// dark-themed app on a light-mode Mac flashes white until the renderer
// covers it. ready-to-show fires after the boot-time paint in
// themes/context.tsx, so the window appears already themed.
show: false,
backgroundColor: getWindowBackgroundColor(),
webPreferences: chatWindowWebPreferences(path.join(__dirname, 'preload.cjs'))
})
if (IS_MAC) {
win.setWindowButtonPosition?.(WINDOW_BUTTON_POSITION)
}
win.once('ready-to-show', () => {
if (!win.isDestroyed()) win.show()
})
win.on('will-enter-full-screen', () => sendWindowStateChanged(true))
win.on('enter-full-screen', () => sendWindowStateChanged(true))
win.on('will-leave-full-screen', () => sendWindowStateChanged(false))
win.on('leave-full-screen', () => sendWindowStateChanged(false))
wireCommonWindowHandlers(win)
win.loadURL(
buildSessionWindowUrl(sessionId, {
devServer: DEV_SERVER,
rendererIndexPath: DEV_SERVER ? undefined : resolveRendererIndex(),
watch,
newSession
})
)
return win
}
// Open (or focus) a standalone window for a single chat session.
function createSessionWindow(sessionId, { watch = false } = {}) {
return sessionWindows.openOrFocus(sessionId, () => spawnSecondaryWindow({ sessionId, watch }))
}
return sessionWindows.openOrFocus(sessionId, () => {
const icon = getAppIconPath()
const win = new BrowserWindow({
width: SESSION_WINDOW_MIN_WIDTH,
height: SESSION_WINDOW_MIN_HEIGHT,
minWidth: SESSION_WINDOW_MIN_WIDTH,
minHeight: SESSION_WINDOW_MIN_HEIGHT,
title: 'Hermes',
titleBarStyle: 'hidden',
titleBarOverlay: getTitleBarOverlayOptions(),
trafficLightPosition: IS_MAC ? WINDOW_BUTTON_POSITION : undefined,
vibrancy: IS_MAC ? 'sidebar' : undefined,
opacity: windowOpacity(),
icon,
// Don't show until the renderer's first themed paint is ready. macOS
// `vibrancy` ignores `backgroundColor` and paints a translucent OS
// material (which follows the OS appearance, not the app theme), so a
// dark-themed app on a light-mode Mac flashes white until the renderer
// covers it. ready-to-show fires after the boot-time paint in
// themes/context.tsx, so the window appears already themed.
show: false,
backgroundColor: getWindowBackgroundColor(),
webPreferences: {
preload: path.join(__dirname, 'preload.cjs'),
contextIsolation: true,
webviewTag: true,
sandbox: true,
nodeIntegration: false,
devTools: true
}
})
// Open a fresh compact window on the new-session draft (#/). Not registry-keyed:
// like ⌘N in a browser, every press opens a new window — and a draft window that
// later converts to a real session must not get refocused as if it were blank.
function createNewSessionWindow() {
return spawnSecondaryWindow({ newSession: true })
if (IS_MAC) {
win.setWindowButtonPosition?.(WINDOW_BUTTON_POSITION)
}
win.once('ready-to-show', () => {
if (!win.isDestroyed()) win.show()
})
win.on('will-enter-full-screen', () => sendWindowStateChanged(true))
win.on('enter-full-screen', () => sendWindowStateChanged(true))
win.on('will-leave-full-screen', () => sendWindowStateChanged(false))
win.on('leave-full-screen', () => sendWindowStateChanged(false))
wireCommonWindowHandlers(win)
win.loadURL(
buildSessionWindowUrl(sessionId, {
devServer: DEV_SERVER,
rendererIndexPath: DEV_SERVER ? undefined : resolveRendererIndex(),
watch
})
)
return win
})
}
function createWindow() {
@@ -5179,11 +5158,23 @@ function createWindow() {
// material before the renderer paints the app theme. See createSessionWindow.
show: false,
backgroundColor: getWindowBackgroundColor(),
// Shared with the secondary session windows (chatWindowWebPreferences) so
// both keep `backgroundThrottling: false` — the chat transcript streams via
// a requestAnimationFrame-gated flush that Chromium pauses for blurred
// windows, stalling the live answer until refocus. See session-windows.cjs.
webPreferences: chatWindowWebPreferences(path.join(__dirname, 'preload.cjs'))
webPreferences: {
preload: path.join(__dirname, 'preload.cjs'),
contextIsolation: true,
webviewTag: true,
sandbox: true,
nodeIntegration: false,
devTools: true,
// Keep timers + requestAnimationFrame running at full speed when the
// window is blurred/occluded. The chat transcript streams to the screen
// through a requestAnimationFrame-gated flush (useSessionStateCache),
// so with Chromium's default background throttling the live answer
// stalls whenever this window isn't focused (e.g. you switch to your
// editor mid-turn, or open detached devtools) and only appears once you
// refocus or refresh. A streaming chat app must render in the
// background, so opt out — matching the secondary windows above.
backgroundThrottling: false
}
})
if (IS_MAC) {
@@ -5326,11 +5317,6 @@ ipcMain.handle('hermes:window:openSession', async (_event, sessionId, opts) => {
return { ok: true }
})
ipcMain.handle('hermes:window:openNewSession', async () => {
createNewSessionWindow()
return { ok: true }
})
ipcMain.handle('hermes:bootstrap:reset', async () => {
// Renderer's "Reload and retry" path. Clear the latched failure and
// reset connection state so the next startHermes() call restarts the
@@ -5600,14 +5586,9 @@ ipcMain.handle('hermes:api', async (_event, request) => {
await prepareProfileDeleteRequest(request)
const profile = request?.profile
const connection = await ensureBackend(profile)
const connection = await ensureBackend(request?.profile)
const timeoutMs = resolveTimeoutMs(request?.timeoutMs, DEFAULT_FETCH_TIMEOUT_MS)
const requestPath = pathWithGlobalRemoteProfile(request.path, profile, {
globalRemote: globalRemoteActive(),
profileRemoteOverride: profileHasRemoteOverride(profile)
})
const url = `${connection.baseUrl}${requestPath}`
const url = `${connection.baseUrl}${request.path}`
// OAuth gateways authenticate REST via the HttpOnly session cookie held in
// the OAuth partition — route through Electron's net stack bound to that
// session so the cookie attaches automatically. Token/local modes keep using
@@ -6551,12 +6532,6 @@ app.on('before-quit', () => {
flushDesktopLogBufferSync()
closePreviewWatchers()
// Kill open PTYs before environment teardown to avoid the node-pty#904
// ThreadSafeFunction SIGABRT race.
for (const id of [...terminalSessions.keys()]) {
disposeTerminalSession(id)
}
if (hermesProcess && !hermesProcess.killed) {
hermesProcess.kill('SIGTERM')
}

View File

@@ -6,7 +6,6 @@ contextBridge.exposeInMainWorld('hermesDesktop', {
touchBackend: profile => ipcRenderer.invoke('hermes:backend:touch', profile),
getGatewayWsUrl: profile => ipcRenderer.invoke('hermes:gateway:ws-url', profile),
openSessionWindow: (sessionId, opts) => ipcRenderer.invoke('hermes:window:openSession', sessionId, opts),
openNewSessionWindow: () => ipcRenderer.invoke('hermes:window:openNewSession'),
getBootProgress: () => ipcRenderer.invoke('hermes:boot-progress:get'),
getConnectionConfig: profile => ipcRenderer.invoke('hermes:connection-config:get', profile),
saveConnectionConfig: payload => ipcRenderer.invoke('hermes:connection-config:save', payload),

View File

@@ -10,41 +10,17 @@ const { pathToFileURL } = require('node:url')
const SESSION_WINDOW_MIN_WIDTH = 420
const SESSION_WINDOW_MIN_HEIGHT = 620
// Shared webPreferences for every window that renders the chat transcript — the
// primary window AND the secondary session windows. Keeping it in one place is
// the whole point: the two BrowserWindow definitions in main.cjs used to be
// hand-copied, and the secondary windows silently lost `backgroundThrottling:
// false`, so a streamed answer stalled until the window regained focus.
//
// `backgroundThrottling: false` is load-bearing: the transcript streams to the
// screen through a requestAnimationFrame-gated flush, which Chromium pauses for
// blurred/occluded windows. A streaming chat app must keep painting in the
// background, so every chat window opts out. The preload path is injected
// because it depends on the Electron entry's __dirname.
function chatWindowWebPreferences(preloadPath) {
return {
preload: preloadPath,
contextIsolation: true,
webviewTag: true,
sandbox: true,
nodeIntegration: false,
devTools: true,
backgroundThrottling: false
}
}
// Build the renderer URL for a secondary window. The renderer uses a
// HashRouter, so the session route lives after the '#'. The `?win=secondary`
// flag MUST sit in the query string BEFORE the '#': anything after the '#' is
// treated as the route by HashRouter and would break routeSessionId(). The
// renderer reads the flag from window.location.search to suppress the install /
// onboarding overlays and the global session sidebar. `new=1` marks the compact
// scratch window; `watch=1` marks a spectator window (e.g. a running subagent's
// session): the renderer resumes it lazily so the gateway never builds an agent
// just to stream into it.
function buildSessionWindowUrl(sessionId, { devServer, rendererIndexPath, watch, newSession } = {}) {
const query = `?win=secondary${newSession ? '&new=1' : ''}${watch ? '&watch=1' : ''}`
const route = newSession ? '#/' : `#/${encodeURIComponent(sessionId)}`
// onboarding overlays and the global session sidebar. `watch=1` marks a
// spectator window (e.g. a running subagent's session): the renderer resumes
// it lazily so the gateway never builds an agent just to stream into it.
function buildSessionWindowUrl(sessionId, { devServer, rendererIndexPath, watch } = {}) {
const query = `?win=secondary${watch ? '&watch=1' : ''}`
const route = `#/${encodeURIComponent(sessionId)}`
if (devServer) {
const base = devServer.endsWith('/') ? devServer.slice(0, -1) : devServer
@@ -117,7 +93,6 @@ function createSessionWindowRegistry() {
module.exports = {
buildSessionWindowUrl,
chatWindowWebPreferences,
createSessionWindowRegistry,
SESSION_WINDOW_MIN_HEIGHT,
SESSION_WINDOW_MIN_WIDTH

View File

@@ -1,11 +1,7 @@
const assert = require('node:assert/strict')
const test = require('node:test')
const {
buildSessionWindowUrl,
chatWindowWebPreferences,
createSessionWindowRegistry
} = require('./session-windows.cjs')
const { buildSessionWindowUrl, createSessionWindowRegistry } = require('./session-windows.cjs')
// A minimal fake BrowserWindow: tracks listeners + destroyed state and lets a
// test fire the 'closed' event, mirroring the slice of the Electron API the
@@ -86,12 +82,6 @@ test('buildSessionWindowUrl adds the watch flag for spectator windows, before th
assert.equal(url, 'http://localhost:5173/?win=secondary&watch=1#/abc')
})
test('buildSessionWindowUrl routes new-session windows to the draft (#/)', () => {
const url = buildSessionWindowUrl(null, { devServer: 'http://localhost:5173', newSession: true })
assert.equal(url, 'http://localhost:5173/?win=secondary&new=1#/')
})
test('registry opens one window per session and focuses on re-open', () => {
const registry = createSessionWindowRegistry()
let built = 0
@@ -179,21 +169,3 @@ test('registry trims the session id before keying', () => {
assert.equal(registry.has('s1'), true)
})
test('chatWindowWebPreferences disables background throttling so streaming paints while blurred', () => {
// Regression: secondary session windows used to omit this flag, so a streamed
// answer stalled until the window regained focus (Chromium pauses the
// requestAnimationFrame-gated transcript flush for backgrounded windows).
const prefs = chatWindowWebPreferences('/tmp/preload.cjs')
assert.equal(prefs.backgroundThrottling, false)
})
test('chatWindowWebPreferences passes the preload path through and keeps the hardened defaults', () => {
const prefs = chatWindowWebPreferences('/some/preload.cjs')
assert.equal(prefs.preload, '/some/preload.cjs')
assert.equal(prefs.contextIsolation, true)
assert.equal(prefs.sandbox, true)
assert.equal(prefs.nodeIntegration, false)
})

View File

@@ -1,29 +0,0 @@
'use strict'
/**
* Retry-once policy for the desktop `--build-only` rebuild during self-update.
*
* The first rebuild can return nonzero on a still-settling post-update tree or a
* network-blocked Electron fetch that the installer's self-heal repaired mid-run.
* A second attempt then builds clean off the healed dist (the content-hash stamp
* makes it a near-no-op when the first actually succeeded). Without the retry the
* updater bails before the relaunch step — the app updates but doesn't restart.
*/
function shouldRetryRebuild(code) {
return code !== 0
}
/**
* Run `rebuild()` (async, resolves `{ code, ... }`), retrying once on failure.
* Returns the final result.
*/
async function runRebuildWithRetry(rebuild) {
let result = await rebuild(0)
if (shouldRetryRebuild(result.code)) {
result = await rebuild(1)
}
return result
}
module.exports = { shouldRetryRebuild, runRebuildWithRetry }

View File

@@ -1,55 +0,0 @@
/**
* Tests for electron/update-rebuild.cjs — the retry-once policy for the desktop
* `--build-only` rebuild during self-update.
*
* Run with: node --test electron/update-rebuild.test.cjs
* (Wired into npm test:desktop:platforms in package.json.)
*
* Why this matters: a first rebuild can return nonzero on a still-settling tree
* or a self-healed (network-blocked) Electron download. Without a second attempt
* the updater bails before the relaunch step — the app updates but never restarts
* (the field report behind this fix). The retry must fire on failure, not on
* success, and must run at most twice.
*/
const test = require('node:test')
const assert = require('node:assert/strict')
const { shouldRetryRebuild, runRebuildWithRetry } = require('./update-rebuild.cjs')
test('shouldRetryRebuild retries only on a non-success exit', () => {
assert.equal(shouldRetryRebuild(0), false)
assert.equal(shouldRetryRebuild(1), true)
assert.equal(shouldRetryRebuild(null), true)
})
test('a clean first rebuild runs once and does not retry', async () => {
const codes = []
const result = await runRebuildWithRetry(attempt => {
codes.push(attempt)
return Promise.resolve({ code: 0 })
})
assert.deepEqual(codes, [0])
assert.equal(result.code, 0)
})
test('a failed first rebuild retries once and succeeds', async () => {
const codes = []
const result = await runRebuildWithRetry(attempt => {
codes.push(attempt)
return Promise.resolve({ code: attempt === 0 ? 1 : 0 })
})
assert.deepEqual(codes, [0, 1])
assert.equal(result.code, 0)
})
test('a rebuild that keeps failing runs at most twice and reports the failure', async () => {
const codes = []
const result = await runRebuildWithRetry(attempt => {
codes.push(attempt)
return Promise.resolve({ code: 1, error: 'rebuild-failed' })
})
assert.deepEqual(codes, [0, 1])
assert.equal(result.code, 1)
assert.equal(result.error, 'rebuild-failed')
})

View File

@@ -1,76 +0,0 @@
// windows-user-env.cjs
//
// Read a User-scoped environment variable straight from the Windows registry
// (HKCU\Environment).
//
// A GUI app launched from Explorer inherits the environment block captured at
// login, so a variable set via `setx` AFTER login is invisible in process.env
// even though a fresh shell — and the Hermes CLI — sees it immediately. The
// desktop's HERMES_HOME resolution relies on process.env, so that stale-snapshot
// gap silently sends the backend to the default %LOCALAPPDATA%\hermes. Reading
// the live registry value closes the gap. See #45471.
const { execFileSync } = require('node:child_process')
// Parse the output of `reg query HKCU\Environment /v <name>`, which looks like:
//
// HKEY_CURRENT_USER\Environment
// HERMES_HOME REG_SZ F:\Hermes\data
//
// Returns the raw value string (spaces inside the value preserved), or null when
// the requested value line isn't present.
function parseRegQueryValue(stdout, name) {
if (!stdout || !name) return null
const typePattern =
/^(\S+)\s+(?:REG_SZ|REG_EXPAND_SZ|REG_MULTI_SZ|REG_DWORD|REG_QWORD|REG_BINARY|REG_NONE)\s+(.*)$/
for (const rawLine of String(stdout).split(/\r?\n/)) {
const line = rawLine.trim()
const match = line.match(typePattern)
if (match && match[1].toLowerCase() === name.toLowerCase()) {
return match[2]
}
}
return null
}
// Expand %VAR% references against an env map. REG_EXPAND_SZ values store
// unexpanded references; plain REG_SZ paths have none, so this is a no-op for
// the common F:\... case. Unknown references are left verbatim.
function expandWindowsEnvRefs(value, env = process.env) {
if (!value) return value
return value.replace(/%([^%]+)%/g, (whole, name) => {
const key = Object.keys(env).find(k => k.toUpperCase() === String(name).toUpperCase())
return key != null && env[key] != null ? env[key] : whole
})
}
// Read a User-scoped env var from HKCU\Environment. Windows-only: returns null
// off-Windows (without spawning), on any spawn error, when `reg` exits non-zero
// (the value doesn't exist), or when the value is empty.
function readWindowsUserEnvVar(
name,
{ platform = process.platform, env = process.env, exec = execFileSync } = {}
) {
if (platform !== 'win32' || !name) return null
let stdout
try {
stdout = exec('reg', ['query', 'HKCU\\Environment', '/v', name], {
encoding: 'utf8',
windowsHide: true,
timeout: 5000
})
} catch {
// `reg` missing, or value absent (reg exits 1) — caller falls back.
return null
}
const raw = parseRegQueryValue(stdout, name)
if (raw == null) return null
const expanded = expandWindowsEnvRefs(raw, env).trim()
return expanded || null
}
module.exports = {
expandWindowsEnvRefs,
parseRegQueryValue,
readWindowsUserEnvVar
}

View File

@@ -1,90 +0,0 @@
const assert = require('node:assert/strict')
const { test } = require('node:test')
const {
expandWindowsEnvRefs,
parseRegQueryValue,
readWindowsUserEnvVar
} = require('./windows-user-env.cjs')
// ── parseRegQueryValue ─────────────────────────────────────────────────────
test('parseRegQueryValue extracts a REG_SZ value', () => {
const out = [
'',
'HKEY_CURRENT_USER\\Environment',
' HERMES_HOME REG_SZ F:\\Hermes\\data',
''
].join('\r\n')
assert.equal(parseRegQueryValue(out, 'HERMES_HOME'), 'F:\\Hermes\\data')
})
test('parseRegQueryValue matches the name case-insensitively', () => {
const out = 'HKEY_CURRENT_USER\\Environment\r\n Hermes_Home REG_EXPAND_SZ %USERPROFILE%\\h\r\n'
assert.equal(parseRegQueryValue(out, 'HERMES_HOME'), '%USERPROFILE%\\h')
})
test('parseRegQueryValue preserves spaces inside the value', () => {
const out = ' HERMES_HOME REG_SZ C:\\Program Files\\Hermes\r\n'
assert.equal(parseRegQueryValue(out, 'HERMES_HOME'), 'C:\\Program Files\\Hermes')
})
test('parseRegQueryValue returns null when the value line is absent', () => {
const out = 'HKEY_CURRENT_USER\\Environment\r\n Path REG_SZ C:\\x\r\n'
assert.equal(parseRegQueryValue(out, 'HERMES_HOME'), null)
assert.equal(parseRegQueryValue('', 'HERMES_HOME'), null)
assert.equal(parseRegQueryValue('garbage', 'HERMES_HOME'), null)
})
// ── expandWindowsEnvRefs ───────────────────────────────────────────────────
test('expandWindowsEnvRefs expands %VAR% case-insensitively', () => {
assert.equal(
expandWindowsEnvRefs('%UserProfile%\\h', { USERPROFILE: 'C:\\Users\\jeff' }),
'C:\\Users\\jeff\\h'
)
})
test('expandWindowsEnvRefs leaves literal paths and unknown refs intact', () => {
assert.equal(expandWindowsEnvRefs('F:\\Hermes\\data', {}), 'F:\\Hermes\\data')
assert.equal(expandWindowsEnvRefs('%NOPE%\\x', {}), '%NOPE%\\x')
})
// ── readWindowsUserEnvVar ──────────────────────────────────────────────────
test('readWindowsUserEnvVar returns null off Windows without spawning', () => {
let spawned = false
const exec = () => {
spawned = true
return ''
}
assert.equal(readWindowsUserEnvVar('HERMES_HOME', { platform: 'linux', exec }), null)
assert.equal(spawned, false)
})
test('readWindowsUserEnvVar queries HKCU\\Environment and expands the value', () => {
const calls = []
const exec = (cmd, args) => {
calls.push([cmd, args])
return 'HKEY_CURRENT_USER\\Environment\r\n HERMES_HOME REG_EXPAND_SZ %DRIVE%\\Hermes\r\n'
}
const value = readWindowsUserEnvVar('HERMES_HOME', {
platform: 'win32',
env: { DRIVE: 'F:' },
exec
})
assert.equal(value, 'F:\\Hermes')
assert.deepEqual(calls, [['reg', ['query', 'HKCU\\Environment', '/v', 'HERMES_HOME']]])
})
test('readWindowsUserEnvVar returns null when reg exits non-zero (value missing)', () => {
const exec = () => {
throw new Error('reg exited 1')
}
assert.equal(readWindowsUserEnvVar('HERMES_HOME', { platform: 'win32', exec }), null)
})
test('readWindowsUserEnvVar returns null for an empty value', () => {
const exec = () => ' HERMES_HOME REG_SZ \r\n'
assert.equal(readWindowsUserEnvVar('HERMES_HOME', { platform: 'win32', exec }), null)
})

View File

@@ -20,8 +20,7 @@
"start": "npm run build && electron .",
"build": "node scripts/assert-root-install.cjs && node scripts/write-build-stamp.cjs && node scripts/stage-native-deps.cjs && tsc -b && vite build && npm run postbuild",
"postbuild": "node scripts/assert-dist-built.cjs",
"prebuilder": "node scripts/patch-electron-builder-mac-binary.cjs",
"builder": "cross-env NODE_OPTIONS=--max-old-space-size=16384 node scripts/run-electron-builder.cjs",
"builder": "cross-env NODE_OPTIONS=--max-old-space-size=16384 electron-builder",
"pack": "npm run build && npm run builder -- --dir",
"dist": "npm run build && npm run builder",
"dist:mac": "npm run build && npm run builder -- --mac",
@@ -37,7 +36,7 @@
"test:desktop:nsis": "node scripts/test-desktop.mjs nsis",
"test:desktop:existing": "node scripts/test-desktop.mjs existing",
"test:desktop:fresh": "node scripts/test-desktop.mjs fresh",
"test:desktop:platforms": "node --test electron/bootstrap-platform.test.cjs electron/hardening.test.cjs electron/backend-env.test.cjs electron/backend-probes.test.cjs electron/bootstrap-runner.test.cjs electron/connection-config.test.cjs electron/dashboard-token.test.cjs electron/gateway-ws-probe.test.cjs electron/oauth-net-request.test.cjs electron/desktop-uninstall.test.cjs electron/session-windows.test.cjs electron/workspace-cwd.test.cjs electron/fs-read-dir.test.cjs electron/git-root.test.cjs electron/windows-child-process.test.cjs electron/update-remote.test.cjs electron/update-rebuild.test.cjs electron/windows-user-env.test.cjs",
"test:desktop:platforms": "node --test electron/bootstrap-platform.test.cjs electron/hardening.test.cjs electron/backend-env.test.cjs electron/backend-probes.test.cjs electron/bootstrap-runner.test.cjs electron/connection-config.test.cjs electron/dashboard-token.test.cjs electron/gateway-ws-probe.test.cjs electron/oauth-net-request.test.cjs electron/desktop-uninstall.test.cjs electron/session-windows.test.cjs electron/workspace-cwd.test.cjs electron/fs-read-dir.test.cjs electron/git-root.test.cjs electron/windows-child-process.test.cjs electron/update-remote.test.cjs",
"typecheck": "tsc -p . --noEmit",
"lint": "eslint src/ electron/",
"lint:fix": "eslint src/ electron/ --fix",
@@ -55,7 +54,7 @@
"@dnd-kit/sortable": "^10.0.0",
"@dnd-kit/utilities": "^3.2.2",
"@hermes/shared": "file:../shared",
"@icons-pack/react-simple-icons": "=13.11.1",
"@icons-pack/react-simple-icons": "^13.13.0",
"@nanostores/react": "^1.1.0",
"@nous-research/ui": "^0.13.0",
"@radix-ui/react-slot": "^1.2.4",
@@ -117,7 +116,7 @@
"@vitejs/plugin-react": "^6.0.1",
"concurrently": "^10.0.3",
"cross-env": "^10.1.0",
"electron": "40.10.2",
"electron": "^40.9.3",
"electron-builder": "^26.8.1",
"eslint": "^9.39.4",
"eslint-plugin-perfectionist": "^5.9.0",
@@ -134,7 +133,7 @@
"wait-on": "^9.0.5"
},
"build": {
"electronVersion": "40.10.2",
"electronVersion": "40.9.3",
"appId": "com.nousresearch.hermes",
"productName": "Hermes",
"executableName": "Hermes",

View File

@@ -1,64 +0,0 @@
const fs = require('node:fs')
const path = require('node:path')
if (process.platform !== 'darwin') {
process.exit(0)
}
const desktopRoot = path.resolve(__dirname, '..')
const repoRoot = path.resolve(desktopRoot, '..', '..')
const electronMacPath = path.join(repoRoot, 'node_modules', 'app-builder-lib', 'out', 'electron', 'electronMac.js')
const marker = 'hermes-macos-electron-binary-fallback'
const needle = ` await Promise.all([
doRename(path.join(contentsPath, "MacOS"), electronBranding.productName, appPlist.CFBundleExecutable),
(0, builder_util_1.unlinkIfExists)(path.join(appOutDir, "LICENSE")),
(0, builder_util_1.unlinkIfExists)(path.join(appOutDir, "LICENSES.chromium.html")),
]);`
const replacement = ` // ${marker}: electron-builder 26.8.x can sometimes copy
// Electron.app without its main MacOS/Electron binary before this rename.
// Restore it from the installed Electron runtime so local desktop installs
// do not fail with ENOENT during macOS arm64 packaging.
const macosDir = path.join(contentsPath, "MacOS");
const bundledElectronBinary = path.join(macosDir, electronBranding.productName);
if (!fs.existsSync(bundledElectronBinary)) {
const candidates = [
path.join(packager.info.framework.distMacOsAppName, "Contents", "MacOS", electronBranding.productName),
// npm may nest the workspace-only electron devDep under
// apps/desktop/node_modules (process.cwd() during pack), or hoist
// it to the repo root. Try the workspace-local install first, then
// the root hoist, so the fallback works under either layout.
path.join(process.cwd(), "node_modules", "electron", "dist", "Electron.app", "Contents", "MacOS", electronBranding.productName),
path.join(process.cwd(), "..", "..", "node_modules", "electron", "dist", "Electron.app", "Contents", "MacOS", electronBranding.productName),
];
const sourceBinary = candidates.find(candidate => fs.existsSync(candidate));
if (sourceBinary == null) {
throw new Error("Electron binary missing from packaged app and Electron runtime: " + bundledElectronBinary);
}
await (0, promises_1.copyFile)(sourceBinary, bundledElectronBinary);
await (0, promises_1.chmod)(bundledElectronBinary, 0o755);
}
await Promise.all([
doRename(macosDir, electronBranding.productName, appPlist.CFBundleExecutable),
(0, builder_util_1.unlinkIfExists)(path.join(appOutDir, "LICENSE")),
(0, builder_util_1.unlinkIfExists)(path.join(appOutDir, "LICENSES.chromium.html")),
]);`
if (!fs.existsSync(electronMacPath)) {
console.warn(`[patch-electron-builder] skipped: ${electronMacPath} not found`)
process.exit(0)
}
const source = fs.readFileSync(electronMacPath, 'utf8')
if (source.includes(marker)) {
console.log('[patch-electron-builder] macOS Electron binary fallback already applied')
process.exit(0)
}
if (!source.includes(needle)) {
console.warn('[patch-electron-builder] skipped: expected electronMac.js shape not found')
process.exit(0)
}
fs.writeFileSync(electronMacPath, source.replace(needle, replacement))
console.log('[patch-electron-builder] applied macOS Electron binary fallback')

View File

@@ -1,57 +0,0 @@
"use strict"
// Resolve electronDist at runtime (#38673, #47917): electron-builder 26.8.x can
// re-unpack a broken Electron.app; reusing the installed dist dodges that.
// npm workspace hoisting is non-deterministic — require.resolve finds electron
// wherever it landed. Dist present → -c.electronDist=<abs>/dist; absent → let
// electron-builder fetch via @electron/get (electronVersion + ELECTRON_MIRROR).
const fs = require("node:fs")
const path = require("node:path")
const { spawnSync } = require("node:child_process")
function electronDistDir() {
try {
return path.join(path.dirname(require.resolve("electron/package.json")), "dist")
} catch {
return null
}
}
function distBinary(dist) {
if (process.platform === "darwin") {
return path.join(dist, "Electron.app", "Contents", "MacOS", "Electron")
}
if (process.platform === "win32") {
return path.join(dist, "electron.exe")
}
return path.join(dist, "electron")
}
function electronBuilderCli() {
const pkgJson = require.resolve("electron-builder/package.json")
const bin = require(pkgJson).bin
const rel = typeof bin === "string" ? bin : bin["electron-builder"]
return path.join(path.dirname(pkgJson), rel)
}
const dist = electronDistDir()
const args = []
if (dist && fs.existsSync(distBinary(dist))) {
args.push(`-c.electronDist=${dist}`)
} else {
console.warn(
"[run-electron-builder] no local electron dist; electron-builder will fetch " +
"via @electron/get (electronVersion + ELECTRON_MIRROR)."
)
}
args.push(...process.argv.slice(2))
const result = spawnSync(process.execPath, [electronBuilderCli(), ...args], {
stdio: "inherit",
})
if (result.error) {
console.error(`[run-electron-builder] spawn failed: ${result.error.message}`)
process.exit(1)
}
process.exit(result.status == null ? 1 : result.status)

View File

@@ -357,7 +357,7 @@ function SubagentRow({ node, depth = 0, nowMs }: { node: SubagentNode; depth?: n
</button>
{visibleRows.length > 0 ? (
<div className="grid min-w-0 gap-1 pl-6" data-selectable-text="true">
<div className="grid min-w-0 gap-1 pl-6">
{visibleRows.map((entry, i) => (
<StreamLine
active={running && i === visibleRows.length - 1}
@@ -371,7 +371,7 @@ function SubagentRow({ node, depth = 0, nowMs }: { node: SubagentNode; depth?: n
) : null}
{open && fileLines.length > 0 ? (
<div className="grid min-w-0 gap-0.5 pl-6" data-selectable-text="true">
<div className="grid min-w-0 gap-0.5 pl-6">
<p className="text-[0.58rem] font-medium tracking-wider text-muted-foreground/60 uppercase">
{t.agents.files}
</p>

View File

@@ -23,7 +23,6 @@ import { type Translations, useI18n } from '@/i18n'
import { sessionTitle } from '@/lib/chat-runtime'
import { ExternalLink, ExternalLinkIcon, hostPathLabel, urlSlugTitleLabel, useLinkTitle } from '@/lib/external-link'
import { FileImage, FileText, FolderOpen, Link2 } from '@/lib/icons'
import { mediaExternalUrl } from '@/lib/media'
import { cn } from '@/lib/utils'
import { notifyError } from '@/store/notifications'
import type { SessionInfo, SessionMessage } from '@/types/hermes'
@@ -125,12 +124,17 @@ function artifactKind(value: string): ArtifactKind {
}
function artifactHref(value: string): string {
if (value.startsWith('http://') || value.startsWith('https://') || value.startsWith('data:')) {
if (
value.startsWith('http://') ||
value.startsWith('https://') ||
value.startsWith('file://') ||
value.startsWith('data:')
) {
return value
}
if (value.startsWith('file://') || value.startsWith('/')) {
return mediaExternalUrl(value)
if (value.startsWith('/')) {
return `file://${encodeURI(value)}`
}
return value

View File

@@ -9,7 +9,6 @@ import { formatCombo } from '@/lib/keybinds/combo'
import { cn } from '@/lib/utils'
import type { ConversationStatus } from './hooks/use-voice-conversation'
import { ModelPill } from './model-pill'
import type { ChatBarState, VoiceStatus } from './types'
export const ICON_BTN = 'size-(--composer-control-size) shrink-0 rounded-md'
@@ -67,7 +66,6 @@ export function ComposerControls({
const c = t.composer
const steerCombo = formatCombo('mod+enter')
const steerLabel = `${c.steer} (${steerCombo})`
const steerTip = (
<span className="inline-flex items-center gap-1.5">
{c.steer}
@@ -83,10 +81,8 @@ export function ComposerControls({
return (
<div className="ml-auto flex shrink-0 items-center gap-(--composer-control-gap)">
<ModelPill disabled={disabled} model={state.model} />
{/* While the agent runs and the user is typing, steer takes over the mic's
slot rather than crowding the row with an extra button. */}
{canSteer ? (
<DictationButton disabled={disabled} onToggle={onDictate} state={state.voice} status={voiceStatus} />
{canSteer && (
<Tip label={steerTip}>
<Button
aria-label={steerLabel}
@@ -100,8 +96,6 @@ export function ComposerControls({
<SteeringWheel size={16} />
</Button>
</Tip>
) : (
<DictationButton disabled={disabled} onToggle={onDictate} state={state.voice} status={voiceStatus} />
)}
{showVoicePrimary ? (
<Tip label={c.startVoice}>

View File

@@ -1,86 +0,0 @@
import { useStore } from '@nanostores/react'
import { useState } from 'react'
import { ModelMenuCloseContext } from '@/app/shell/model-menu-panel'
import { Button } from '@/components/ui/button'
import { DropdownMenu, DropdownMenuContent, DropdownMenuTrigger } from '@/components/ui/dropdown-menu'
import { GlyphSpinner } from '@/components/ui/glyph-spinner'
import { useI18n } from '@/i18n'
import { ChevronDown } from '@/lib/icons'
import { formatModelStatusLabel } from '@/lib/model-status-label'
import { cn } from '@/lib/utils'
import {
$currentFastMode,
$currentModel,
$currentProvider,
$currentReasoningEffort,
setModelPickerOpen
} from '@/store/session'
import type { ChatBarState } from './types'
const PILL = cn(
'h-(--composer-control-size) max-w-40 shrink-0 gap-1 rounded-md px-2 text-xs font-normal',
'text-(--ui-text-tertiary) hover:bg-(--chrome-action-hover) hover:text-foreground'
)
/**
* Composer model selector — the relocated status-bar pill. Reuses the live
* `model.options` dropdown (`modelMenuContent`) verbatim; falls back to the
* full picker when the gateway is closed and no live menu exists.
*/
export function ModelPill({ disabled, model }: { disabled: boolean; model: ChatBarState['model'] }) {
const copy = useI18n().t.shell.statusbar
const currentModel = useStore($currentModel)
const currentProvider = useStore($currentProvider)
const fastMode = useStore($currentFastMode)
const reasoningEffort = useStore($currentReasoningEffort)
const [open, setOpen] = useState(false)
// The model resolves a beat after the gateway/session comes up. Rather than
// flash a literal "No model", show a quiet loader (inherits the pill text
// color at half opacity) until a model lands.
const label = (
<>
{currentModel.trim() ? (
<span className="truncate">{formatModelStatusLabel(currentModel, { fastMode, reasoningEffort })}</span>
) : (
<GlyphSpinner className="opacity-50" spinner="braille" />
)}
<ChevronDown className="size-2.5 shrink-0 opacity-50" />
</>
)
const title = currentProvider ? copy.modelTitle(currentProvider, currentModel || copy.modelNone) : copy.switchModel
if (!model.modelMenuContent) {
return (
<Button
aria-label={copy.openModelPicker}
className={PILL}
disabled={disabled}
onClick={() => setModelPickerOpen(true)}
title={copy.openModelPicker}
type="button"
variant="ghost"
>
{label}
</Button>
)
}
return (
<DropdownMenu onOpenChange={setOpen} open={open}>
<DropdownMenuTrigger asChild>
<Button aria-label={title} className={PILL} disabled={disabled} title={title} type="button" variant="ghost">
{label}
</Button>
</DropdownMenuTrigger>
<DropdownMenuContent align="end" className="w-64 p-0" side="top" sideOffset={8}>
<ModelMenuCloseContext.Provider value={() => setOpen(false)}>
{model.modelMenuContent}
</ModelMenuCloseContext.Provider>
</DropdownMenuContent>
</DropdownMenu>
)
}

View File

@@ -1,5 +1,3 @@
import type { ReactNode } from 'react'
import type { HermesGateway } from '@/hermes'
import type { ComposerAttachment } from '@/store/composer'
@@ -24,8 +22,6 @@ export interface ChatBarState {
canSwitch: boolean
loading?: boolean
quickModels?: QuickModelOption[]
/** Reused status-bar dropdown (built with gateway + selectModel upstream). */
modelMenuContent?: ReactNode
}
tools: { enabled: boolean; label: string; suggestions?: ContextSuggestion[] }
voice: { enabled: boolean; active: boolean }

View File

@@ -15,9 +15,7 @@ import { Backdrop } from '@/components/Backdrop'
import { PromptOverlays } from '@/components/prompt-overlays'
import { Button } from '@/components/ui/button'
import { Codicon } from '@/components/ui/codicon'
import { ErrorState } from '@/components/ui/error-state'
import { getGlobalModelOptions, type HermesGateway } from '@/hermes'
import { useI18n } from '@/i18n'
import type { ChatMessage } from '@/lib/chat-messages'
import { quickModelOptions, sessionTitle, toRuntimeMessage } from '@/lib/chat-runtime'
import { useIncrementalExternalStoreRuntime } from '@/lib/incremental-external-store-runtime'
@@ -40,12 +38,10 @@ import {
$lastVisibleMessageIsUser,
$messages,
$messagesEmpty,
$resumeExhaustedSessionId,
$selectedStoredSessionId,
$sessions,
sessionPinId
} from '@/store/session'
import { isSecondaryWindow } from '@/store/windows'
import type { ModelOptionsResponse } from '@/types/hermes'
import { routeSessionId } from '../routes'
@@ -65,7 +61,6 @@ import { threadLoadingState } from './thread-loading'
interface ChatViewProps extends Omit<React.ComponentProps<'div'>, 'onSubmit'> {
gateway: HermesGateway | null
modelMenuContent?: React.ReactNode
onToggleSelectedPin: () => void
onDeleteSelectedSession: () => void
onCancel: () => Promise<void> | void
@@ -89,9 +84,7 @@ interface ChatViewProps extends Omit<React.ComponentProps<'div'>, 'onSubmit'> {
onEdit: (message: AppendMessage) => Promise<void>
onReload: (parentId: string | null) => Promise<void>
onRestoreToMessage?: (messageId: string) => Promise<void>
onRetryResume: (sessionId: string) => void
onTranscribeAudio?: (audio: Blob) => Promise<string>
onDismissError?: (messageId: string) => void
}
interface ChatHeaderProps {
@@ -126,10 +119,10 @@ function ChatHeader({
? pinnedSessionIds.includes(selectedSessionId)
: false
// Secondary windows (new-session scratch, subagent watch, cmd-click pop-out)
// are compact side panels — they drop the session-actions header + border
// entirely. A brand-new draft has nothing to pin/delete/rename either.
if (isSecondaryWindow() || (!selectedSessionId && !activeSessionId && !isRoutedSessionView)) {
// A brand-new session has no session to pin/delete/rename, so the header is
// just a dead "New session" label + chevron. Drop it (and its border)
// entirely until there's a real session to act on.
if (!selectedSessionId && !activeSessionId && !isRoutedSessionView) {
return null
}
@@ -256,7 +249,6 @@ function ChatRuntimeBoundary({
export function ChatView({
className,
gateway,
modelMenuContent,
onToggleSelectedPin,
onDeleteSelectedSession,
onCancel,
@@ -277,12 +269,9 @@ export function ChatView({
onEdit,
onReload,
onRestoreToMessage,
onRetryResume,
onTranscribeAudio,
onDismissError
onTranscribeAudio
}: ChatViewProps) {
const location = useLocation()
const { t } = useI18n()
const activeSessionId = useStore($activeSessionId)
const awaitingResponse = useStore($awaitingResponse)
const busy = useStore($busy)
@@ -304,7 +293,6 @@ export function ChatView({
const messagesEmpty = useStore($messagesEmpty)
const lastVisibleIsUser = useStore($lastVisibleMessageIsUser)
const selectedSessionId = useStore($selectedStoredSessionId)
const resumeExhaustedSessionId = useStore($resumeExhaustedSessionId)
const routedSessionId = routeSessionId(location.pathname)
const isRoutedSessionView = Boolean(routedSessionId)
@@ -314,31 +302,16 @@ export function ChatView({
// waiting for the resume effect (which paints a frame later) to clear them.
const routeSessionMismatch = isRoutedSessionView && routedSessionId !== selectedSessionId
// The compact new-session pop-out skips the wordmark/tagline intro — it's a
// scratch window, not the full-height empty state.
const showIntro =
!isSecondaryWindow() && freshDraftReady && !isRoutedSessionView && !selectedSessionId && !activeSessionId && messagesEmpty
const showIntro = freshDraftReady && !isRoutedSessionView && !selectedSessionId && !activeSessionId && messagesEmpty
// Session is still loading if the route references a session we haven't
// resumed yet. Once `activeSessionId` is set (runtime has resumed), the
// session exists — even if it has zero messages (a brand-new routed
// session). The flicker where `busy` flips true briefly during hydrate
// is handled by `threadLoadingState`'s last-visible-user gate.
//
// resumeExhausted: the bounded auto-retry in use-route-resume gave up on this
// routed session (gateway RPC + REST fallback failed through every attempt).
// Suppress the loader and show an explicit error + manual Retry instead of
// spinning forever. Gated on the route matching so a stale latch from another
// session can't blank the current one.
const resumeExhausted = isRoutedSessionView && resumeExhaustedSessionId === routedSessionId
const loadingSession =
!resumeExhausted && isRoutedSessionView && (routeSessionMismatch || (messagesEmpty && !activeSessionId))
const loadingSession = isRoutedSessionView && (routeSessionMismatch || (messagesEmpty && !activeSessionId))
const threadLoading = threadLoadingState(loadingSession, busy, awaitingResponse, lastVisibleIsUser)
// Hide the composer in the exhausted error state too: there's no live runtime
// to send to until a retry rebinds one.
const showChatBar = !loadingSession && !resumeExhausted
const showChatBar = !loadingSession
const threadKey = selectedSessionId || activeSessionId || (isRoutedSessionView ? location.pathname : 'new')
const modelOptionsQuery = useQuery<ModelOptionsResponse>({
@@ -369,7 +342,6 @@ export function ChatView({
provider: currentProvider,
canSwitch: gatewayOpen,
loading: !gatewayOpen || (!currentModel && !currentProvider),
modelMenuContent,
quickModels
},
tools: {
@@ -382,7 +354,7 @@ export function ChatView({
active: false
}
}),
[contextSuggestions, currentModel, currentProvider, gatewayOpen, modelMenuContent, quickModels]
[contextSuggestions, currentModel, currentProvider, gatewayOpen, quickModels]
)
// Drop files anywhere in the conversation area, not just on the composer
@@ -453,7 +425,6 @@ export function ChatView({
loading={threadLoading}
onBranchInNewChat={onBranchInNewChat}
onCancel={onCancel}
onDismissError={onDismissError}
onRestoreToMessage={onRestoreToMessage}
sessionId={activeSessionId}
sessionKey={threadKey}
@@ -487,21 +458,6 @@ export function ChatView({
</Suspense>
)}
</ChatRuntimeBoundary>
{resumeExhausted && routedSessionId && (
<div className="absolute inset-0 z-10 grid place-items-center bg-(--ui-chat-surface-background) px-8 py-10">
<ErrorState
className="max-w-sm"
description={t.desktop.resumeStrandedBody}
title={t.desktop.resumeStrandedTitle}
>
<div className="grid justify-items-center">
<Button onClick={() => onRetryResume(routedSessionId)} size="sm" variant="outline">
{t.desktop.resumeRetry}
</Button>
</div>
</ErrorState>
</div>
)}
{showChatBar && <ScrollToBottomButton />}
<ChatDropOverlay kind={dragKind} />
<ChatSwapOverlay profile={gatewaySwapTarget} />

View File

@@ -395,7 +395,7 @@ export function CommandCenterView({ initialSection, onClose, onDeleteSession, on
</div>
<div className="flex shrink-0 items-center gap-1.5 whitespace-nowrap">
<Button onClick={() => void runSystemAction('restart')} size="xs" variant="text">
{cc.restartGateway}
{cc.restartMessaging}
</Button>
<Button onClick={() => void runSystemAction('update')} size="xs" variant="textStrong">
{cc.updateHermes}
@@ -426,10 +426,7 @@ export function CommandCenterView({ initialSection, onClose, onDeleteSession, on
</span>
)}
</div>
<pre
className="min-h-0 flex-1 overflow-auto whitespace-pre-wrap wrap-break-word rounded-lg border border-(--ui-stroke-tertiary) bg-(--ui-bg-quinary) p-3 font-mono text-[0.65rem] leading-relaxed text-(--ui-text-tertiary)"
data-selectable-text="true"
>
<pre className="min-h-0 flex-1 overflow-auto whitespace-pre-wrap wrap-break-word rounded-lg border border-(--ui-stroke-tertiary) bg-(--ui-bg-quinary) p-3 font-mono text-[0.65rem] leading-relaxed text-(--ui-text-tertiary)">
{logs.length ? logs.join('\n') : cc.noLogs}
</pre>
</div>

View File

@@ -30,7 +30,6 @@ import {
Package,
Palette,
Plus,
RefreshCw,
Settings,
Settings2,
Sun,
@@ -42,7 +41,6 @@ import {
import { cn } from '@/lib/utils'
import { $commandPaletteOpen, closeCommandPalette, setCommandPaletteOpen } from '@/store/command-palette'
import { $bindings } from '@/store/keybinds'
import { runGatewayRestart } from '@/store/system-actions'
import { luminance } from '@/themes/color'
import { type ThemeMode, useTheme } from '@/themes/context'
import { isUserTheme, resolveTheme } from '@/themes/user-themes'
@@ -156,7 +154,7 @@ const NON_CONFIG_SETTINGS: ReadonlyArray<{
},
{
icon: KeyRound,
keywords: ['providers', 'api key', 'keys', 'secrets', 'tokens'],
keywords: ['providers', 'api key', 'keys', 'secrets', 'tokens', 'egress', 'iron proxy', 'sandbox proxy'],
labelKey: 'providerApiKeys',
tab: 'providers&pview=keys'
},
@@ -169,7 +167,7 @@ const NON_CONFIG_SETTINGS: ReadonlyArray<{
},
{
icon: Settings2,
keywords: ['gateway', 'proxy', 'server', 'webhook', 'env'],
keywords: ['gateway', 'proxy', 'server', 'webhook', 'env', 'egress proxy', 'iron proxy'],
labelKey: 'keysSettings',
tab: 'keys&kview=settings'
},
@@ -362,13 +360,6 @@ export function CommandPalette() {
keywords: ['command center', 'usage', 'tokens', 'cost'],
label: cc.sections.usage,
run: go(`${COMMAND_CENTER_ROUTE}?section=usage`)
},
{
icon: RefreshCw,
id: 'cc-restart-gateway',
keywords: ['gateway', 'restart', 'messaging', 'reconnect', 'system'],
label: cc.restartGateway,
run: () => void runGatewayRestart()
}
]
},

View File

@@ -13,8 +13,7 @@ import { useSkinCommand } from '@/themes/use-skin-command'
import { formatRefValue } from '../components/assistant-ui/directive-text'
import { getCronJobs, getSessionMessages, listAllProfileSessions, type SessionInfo, triggerCronJob } from '../hermes'
import { type ChatMessage, chatMessageText, preserveLocalAssistantErrors, toChatMessages } from '../lib/chat-messages'
import { storedSessionIdForNotification } from '../lib/session-ids'
import { preserveLocalAssistantErrors, toChatMessages } from '../lib/chat-messages'
import {
isMessagingSource,
LOCAL_SESSION_SOURCE_IDS,
@@ -53,10 +52,7 @@ import {
$currentCwd,
$freshDraftReady,
$gatewayState,
$messages,
$messagingSessions,
$resumeFailedSessionId,
$resumeExhaustedSessionId,
$selectedStoredSessionId,
$sessions,
$workingSessionIds,
@@ -81,7 +77,6 @@ import {
setSessionsLoading,
setSessionsTotal
} from '../store/session'
import { onSessionsChanged } from '../store/session-sync'
import { clearSessionTodos, setSessionTodos, todoListActive } from '../store/todos'
import { openUpdatesWindow, startUpdatePoller, stopUpdatePoller } from '../store/updates'
import { isSecondaryWindow } from '../store/windows'
@@ -203,8 +198,6 @@ export function DesktopController() {
const activeSessionId = useStore($activeSessionId)
const currentCwd = useStore($currentCwd)
const freshDraftReady = useStore($freshDraftReady)
const resumeFailedSessionId = useStore($resumeFailedSessionId)
const resumeExhaustedSessionId = useStore($resumeExhaustedSessionId)
const filePreviewTarget = useStore($filePreviewTarget)
const previewTarget = useStore($previewTarget)
const selectedStoredSessionId = useStore($selectedStoredSessionId)
@@ -277,20 +270,16 @@ export function DesktopController() {
}
}, [])
// Notification click: the main process already focused the window; jump to its
// session. Notifications are tagged with the gateway *runtime* session id, but
// the chat route is keyed by the *stored* id — navigating with the runtime id
// resumes a non-existent stored session ("session not found") and strands the
// user. Translate runtime -> stored before navigating.
// Notification click: the main process already focused the window; jump to its session.
useEffect(() => {
const unsubscribe = window.hermesDesktop?.onFocusSession?.(sessionId => {
if (sessionId) {
navigate(sessionRoute(storedSessionIdForNotification(sessionId, runtimeIdByStoredSessionIdRef.current)))
navigate(sessionRoute(sessionId))
}
})
return () => unsubscribe?.()
}, [navigate, runtimeIdByStoredSessionIdRef])
}, [navigate])
// Notification action button (Approve/Reject) — resolve in place, no navigation.
useEffect(() => {
@@ -475,17 +464,6 @@ export function DesktopController() {
void refreshSessions()
}, [refreshSessions])
// Another window mutated the shared session list (e.g. a chat started in the
// pop-out). Re-pull so the sidebar reflects it. Pop-outs have no sidebar, so
// only real windows bother.
useEffect(() => {
if (isSecondaryWindow()) {
return
}
return onSessionsChanged(() => void refreshSessions().catch(() => undefined))
}, [refreshSessions])
// ALL-profiles view pages one profile at a time: fetch that profile's next
// page and merge it in place, leaving every other profile's rows untouched.
const loadMoreSessionsForProfile = useCallback(async (profile: string) => {
@@ -721,9 +699,7 @@ export function DesktopController() {
}
lastGatewayProfileRef.current = activeGatewayProfile
// Force: the new profile has its own default, so reseed even if the composer
// already shows the previous profile's model.
void refreshCurrentModel(true)
void refreshCurrentModel()
void refreshActiveProfile()
}, [activeGatewayProfile, refreshCurrentModel])
@@ -746,49 +722,6 @@ export function DesktopController() {
[branchCurrentSession, refreshSessions]
)
// Clear a failed turn's red error banner from the transcript. Errors are
// renderer-local state (never persisted), so dismissing is purely a view +
// session-cache edit. A message that errored before emitting any visible
// text is a bare error placeholder → drop it entirely; one that streamed
// partial output then failed keeps its content and just sheds the error.
// Both the per-runtime cache AND the live $messages view must be updated:
// `preserveLocalAssistantErrors` re-grafts any still-errored message it
// finds in the view onto the next session.info flush, so clearing only the
// cache would let the heartbeat resurrect the banner.
const dismissError = useCallback(
(messageId: string) => {
const runtimeSessionId = activeSessionIdRef.current
if (!runtimeSessionId) {
return
}
const clearErrorIn = (messages: ChatMessage[]): ChatMessage[] =>
messages.flatMap(message => {
if (message.id !== messageId || !message.error) {
return [message]
}
if (!chatMessageText(message).trim() && !message.parts.some(part => part.type !== 'text')) {
return []
}
return [{ ...message, error: undefined, pending: false }]
})
// View first: the flush below reads $messages as the "current" baseline
// for error preservation, so the banner must be gone from it before the
// cache update triggers a re-sync.
setMessages(clearErrorIn($messages.get()))
updateSessionState(runtimeSessionId, state => ({
...state,
messages: clearErrorIn(state.messages)
}))
},
[activeSessionIdRef, updateSessionState]
)
const startSessionInWorkspace = useCallback(
(path: null | string) => {
startFreshSessionDraft()
@@ -898,8 +831,6 @@ export function DesktopController() {
gatewayState,
locationPathname: location.pathname,
resumeSession,
resumeFailedSessionId,
resumeExhaustedSessionId,
routedSessionId,
runtimeIdByStoredSessionIdRef,
selectedStoredSessionId,
@@ -916,6 +847,7 @@ export function DesktopController() {
gatewayLogLines,
gatewayState,
inferenceStatus,
modelMenuContent,
openAgents,
freshDraftReady,
openCommandCenterSection,
@@ -1037,7 +969,6 @@ export function DesktopController() {
<ChatView
gateway={gatewayRef.current}
maxVoiceRecordingSeconds={voiceMaxRecordingSeconds}
modelMenuContent={modelMenuContent}
onAddContextRef={composer.addContextRefAttachment}
onAddUrl={url => composer.addContextRefAttachment(`@url:${formatRefValue(url)}`, url)}
onAttachDroppedItems={composer.attachDroppedItems}
@@ -1049,7 +980,6 @@ export function DesktopController() {
void removeSession(selectedStoredSessionId)
}
}}
onDismissError={dismissError}
onEdit={editMessage}
onPasteClipboardImage={() => void composer.pasteClipboardImage()}
onPickFiles={() => void composer.pickContextPaths('file')}
@@ -1058,7 +988,6 @@ export function DesktopController() {
onReload={reloadFromMessage}
onRemoveAttachment={id => void composer.removeAttachment(id)}
onRestoreToMessage={restoreToMessage}
onRetryResume={sessionId => void resumeSession(sessionId, true)}
onSteer={steerPrompt}
onSubmit={submitText}
onThreadMessagesChange={handleThreadMessagesChange}

View File

@@ -37,7 +37,6 @@ import {
switcherActive,
switcherJustClosed
} from '@/store/session-switcher'
import { openNewSessionInNewWindow } from '@/store/windows'
import { useTheme } from '@/themes/context'
import { requestComposerFocus } from '../chat/composer/focus'
@@ -133,7 +132,6 @@ export function useKeybinds(deps: KeybindRuntimeDeps): void {
deps.startFreshSession()
window.dispatchEvent(new CustomEvent('hermes:new-session-shortcut'))
},
'session.newWindow': () => void openNewSessionInNewWindow(),
'session.next': () => stepSession(1),
'session.prev': () => stepSession(-1),
...sessionSlotHandlers,

View File

@@ -17,7 +17,6 @@ import { type Translations, useI18n } from '@/i18n'
import { AlertTriangle, ExternalLink, Save, Trash2 } from '@/lib/icons'
import { cn } from '@/lib/utils'
import { notify, notifyError } from '@/store/notifications'
import { runGatewayRestart } from '@/store/system-actions'
import { useRefreshHotkey } from '../hooks/use-refresh-hotkey'
import { useRouteEnumParam } from '../hooks/use-route-enum-param'
@@ -98,8 +97,6 @@ function fieldCopy(field: MessagingEnvVarInfo, m: Translations['messaging']) {
export function MessagingView({ setStatusbarItemGroup: _setStatusbarItemGroup, ...props }: MessagingViewProps) {
const { t } = useI18n()
const m = t.messaging
// Both save/toggle toasts offer the same one-click restart.
const restartGatewayAction = { label: t.commandCenter.restartGateway, onClick: () => void runGatewayRestart() }
const [platforms, setPlatforms] = useState<MessagingPlatformInfo[] | null>(null)
const [edits, setEdits] = useState<EditMap>({})
const [query, setQuery] = useState('')
@@ -200,8 +197,7 @@ export function MessagingView({ setStatusbarItemGroup: _setStatusbarItemGroup, .
notify({
kind: 'success',
title: enabled ? m.platformEnabled(platform.name) : m.platformDisabled(platform.name),
message: m.restartToApply,
action: restartGatewayAction
message: m.restartToApply
})
} catch (err) {
notifyError(err, m.failedUpdate(platform.name))
@@ -226,8 +222,7 @@ export function MessagingView({ setStatusbarItemGroup: _setStatusbarItemGroup, .
notify({
kind: 'success',
title: m.setupSaved(platform.name),
message: m.restartToReconnect,
action: restartGatewayAction
message: m.restartToReconnect
})
} catch (err) {
notifyError(err, m.failedSave(platform.name))

View File

@@ -1,75 +0,0 @@
import { cleanup, fireEvent, render, screen, waitFor } from '@testing-library/react'
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest'
import type { HermesReadDirResult } from '@/global'
import { $connection, setCurrentCwd } from '@/store/session'
import { resetProjectTreeState } from './files/use-project-tree'
import { RightSidebarPane } from './index'
const readDir = vi.fn<(path: string) => Promise<HermesReadDirResult>>()
const selectPaths = vi.fn()
function ok(entries: { name: string; path: string; isDirectory: boolean }[]): HermesReadDirResult {
return { entries }
}
function installBridge() {
;(
window as unknown as {
hermesDesktop: {
readDir: typeof readDir
selectPaths: typeof selectPaths
}
}
).hermesDesktop = { readDir, selectPaths }
}
describe('RightSidebarPane', () => {
beforeEach(() => {
$connection.set(null)
resetProjectTreeState()
setCurrentCwd('/repo')
readDir.mockReset()
selectPaths.mockReset()
readDir.mockResolvedValue(ok([{ name: 'README.md', path: '/repo/README.md', isDirectory: false }]))
selectPaths.mockResolvedValue(['/repo-next'])
installBridge()
})
afterEach(() => {
cleanup()
$connection.set(null)
setCurrentCwd('')
resetProjectTreeState()
delete (window as unknown as { hermesDesktop?: unknown }).hermesDesktop
})
it('refreshes the current tree without opening the folder picker', async () => {
const onChangeCwd = vi.fn()
render(<RightSidebarPane onActivateFile={vi.fn()} onActivateFolder={vi.fn()} onChangeCwd={onChangeCwd} />)
await waitFor(() => expect(screen.getByRole('button', { name: 'Refresh tree' }).hasAttribute('disabled')).toBe(false))
readDir.mockClear()
fireEvent.click(screen.getByRole('button', { name: 'Refresh tree' }))
await waitFor(() => expect(readDir).toHaveBeenCalledWith('/repo'))
expect(selectPaths).not.toHaveBeenCalled()
fireEvent.click(screen.getByRole('button', { name: 'Open folder' }))
await waitFor(() =>
expect(selectPaths).toHaveBeenCalledWith({
defaultPath: '/repo',
directories: true,
multiple: false,
title: 'Change working directory'
})
)
await waitFor(() => expect(onChangeCwd).toHaveBeenCalledWith('/repo-next'))
})
})

View File

@@ -126,12 +126,12 @@ interface FilesystemTabProps extends FileTreeBodyProps {
onRefresh: () => void
}
// Sidebar palette + hover-reveal: header actions stay reachable while moving
// from the project label to the action buttons.
// Sidebar palette + hover-reveal: refresh tracks label hover; collapse-all
// stays visible while any folder is expanded.
const HEADER_ACTION_CLASS =
'text-sidebar-foreground/70 hover:bg-sidebar-accent! hover:text-sidebar-accent-foreground! focus-visible:ring-sidebar-ring'
const HEADER_ACTION_LABEL_REVEAL = `${HEADER_ACTION_CLASS} pointer-events-none opacity-0 transition-opacity focus-visible:pointer-events-auto focus-visible:opacity-100 group-focus-within/project-header:pointer-events-auto group-focus-within/project-header:opacity-100 group-hover/project-header:pointer-events-auto group-hover/project-header:opacity-100`
const HEADER_ACTION_LABEL_REVEAL = `${HEADER_ACTION_CLASS} pointer-events-none opacity-0 transition-opacity focus-visible:pointer-events-auto focus-visible:opacity-100 peer-focus-visible/project-label:pointer-events-auto peer-focus-visible/project-label:opacity-100 peer-hover/project-label:pointer-events-auto peer-hover/project-label:opacity-100`
function FilesystemTab({
canCollapse,
@@ -158,7 +158,7 @@ function FilesystemTab({
return (
<div className="flex min-h-0 flex-1 flex-col">
<RightSidebarSectionHeader>
<div className="flex min-w-0 flex-1">
<div className="peer/project-label flex min-w-0 flex-1">
<button
className="flex w-full min-w-0 items-center rounded-md text-left hover:text-(--ui-text-secondary)"
onClick={() => void onChangeFolder()}
@@ -216,7 +216,7 @@ function FilesystemTab({
}
export function RightSidebarSectionHeader({ children }: { children: ReactNode }) {
return <div className="group/project-header flex h-7 shrink-0 items-center px-2.5">{children}</div>
return <div className="flex h-7 shrink-0 items-center px-2.5">{children}</div>
}
interface FileTreeBodyProps {

View File

@@ -9,22 +9,3 @@ export const $terminalTakeover = atom(storedBoolean(TAKEOVER_KEY, false))
$terminalTakeover.subscribe(active => persistBoolean(TAKEOVER_KEY, active))
export const setTerminalTakeover = (active: boolean) => $terminalTakeover.set(active)
/** A command queued to run in the embedded terminal. The terminal pane flushes
* (and clears) it once its session is live, so a value set before the pane
* mounts still runs. Cleared after flush so a later remount can't replay it. */
export const $terminalInjection = atom<null | string>(null)
/** Open the terminal pane and run a command in it. Used to disconnect external
* (CLI-managed) providers, which Hermes can't clear via the API — the user
* sees exactly what runs instead of Hermes silently deleting their creds. */
export const runInTerminal = (command: string) => {
const trimmed = command.trim()
if (!trimmed) {
return
}
setTerminalTakeover(true)
$terminalInjection.set(trimmed)
}

View File

@@ -10,8 +10,6 @@ import { triggerHaptic } from '@/lib/haptics'
import { $filePreviewTarget, $previewTarget } from '@/store/preview'
import { useTheme } from '@/themes/context'
import { $terminalInjection } from '../store'
import { makeTerminalReader, setActiveTerminalReader } from './buffer'
import {
isAddSelectionShortcut,
@@ -677,28 +675,6 @@ export function useTerminalSession({ cwd, onAddSelectionToChat }: UseTerminalSes
return () => cancelAnimationFrame(raf)
}, [activeTheme, themeName])
// Flush a queued command (e.g. a provider-disconnect) into the live session.
// Only active while open; the subscribe fires immediately, so a command set
// before this pane mounted runs as soon as the session is ready. Clearing the
// atom after writing stops a later remount from replaying a stale command.
useEffect(() => {
if (status !== 'open') {
return
}
return $terminalInjection.subscribe(command => {
const id = sessionIdRef.current
if (!command || !id) {
return
}
void window.hermesDesktop?.terminal?.write(id, `${command}\r`)
$terminalInjection.set(null)
termRef.current?.focus()
})
}, [status])
return {
addSelectionToChat,
hostRef,

View File

@@ -13,7 +13,6 @@ import {
type GatewayEventPayload,
reasoningPart,
renderMediaTags,
textPart,
upsertToolPart
} from '@/lib/chat-messages'
import { coerceGatewayText, coerceThinkingText, normalizePersonalityValue } from '@/lib/chat-runtime'
@@ -48,7 +47,6 @@ import {
setTurnStartedAt,
setYoloActive
} from '@/store/session'
import { broadcastSessionsChanged } from '@/store/session-sync'
import { clearSessionSubagents, pruneDelegateFallbackSubagents, upsertSubagent } from '@/store/subagents'
import { setSessionTodos } from '@/store/todos'
import { recordToolDiff } from '@/store/tool-diffs'
@@ -643,9 +641,6 @@ export function useMessageStream({
})
void refreshSessions().catch(() => undefined)
// Sync the freshly-titled row to other windows (e.g. main, when the turn
// ran in the pop-out).
broadcastSessionsChanged()
if (compactedTurnRef.current.delete(sessionId)) {
shouldHydrate = false
@@ -1081,32 +1076,6 @@ export function useMessageStream({
// completions / watch matches here — re-sync the status stack.
void refreshBackgroundProcesses(sessionId)
}
} else if (event.type === 'review.summary') {
// Self-improvement background review saved something to memory/skills
// and emitted a persistent summary (Python formats it as
// "💾 Self-improvement review: …"). The CLI prints this via
// prompt_toolkit and the Ink TUI renders it as a system line; the
// desktop has neither, so without this handler the skill/memory
// change happens silently. Surface it as a persistent system message
// in the transcript so the user is always informed — it must not be a
// transient toast that can be missed.
const text = coerceGatewayText(payload?.text).trim()
if (text && sessionId) {
flushQueuedDeltas(sessionId)
updateSessionState(sessionId, state => ({
...state,
messages: [
...state.messages,
{
id: `review-summary-${Date.now()}`,
role: 'system',
parts: [textPart(text)],
timestamp: Math.floor(Date.now() / 1000)
}
]
}))
}
} else if (event.type === 'error') {
const errorMessage = payload?.message || 'Hermes reported an error'
const looksLikeProviderSetup = isProviderSetupErrorMessage(errorMessage)
@@ -1129,13 +1098,8 @@ export function useMessageStream({
if (looksLikeProviderSetup) {
requestDesktopOnboarding(errorMessage)
} else {
// Toast globally, not just when the failing thread is focused: a
// turn-ending error (e.g. out of funds) blocks every thread, so the
// inline error alone is too easy to miss. The stable id collapses the
// same error from multiple blocked threads into one toast.
} else if (isActiveEvent) {
notify({
id: `gateway-error:${errorMessage}`,
kind: 'error',
title: 'Hermes error',
message: errorMessage

View File

@@ -1,5 +1,5 @@
import { renderHook } from '@testing-library/react'
import { QueryClient } from '@tanstack/react-query'
import { cleanup, render, renderHook } from '@testing-library/react'
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest'
import { getGlobalModelInfo } from '@/hermes'
@@ -13,51 +13,12 @@ import {
import { useModelControls } from './use-model-controls'
const setGlobalModel = vi.fn()
const notifyError = vi.fn()
vi.mock('@/hermes', () => ({
getGlobalModelInfo: vi.fn(),
setGlobalModel: (...args: Parameters<typeof setGlobalModel>) => setGlobalModel(...args)
setGlobalModel: vi.fn()
}))
vi.mock('@/i18n', () => ({
useI18n: () => ({
t: {
desktop: {
modelSwitchFailed: 'Model switch failed'
}
}
})
}))
vi.mock('@/store/notifications', () => ({
notifyError: (...args: Parameters<typeof notifyError>) => notifyError(...args)
}))
type Controls = ReturnType<typeof useModelControls>
function Harness({
activeSessionId,
onReady,
requestGateway
}: {
activeSessionId: string | null
onReady: (controls: Controls) => void
requestGateway: <T = unknown>(method: string, params?: Record<string, unknown>) => Promise<T>
}) {
const controls = useModelControls({
activeSessionId,
queryClient: new QueryClient(),
requestGateway
})
onReady(controls)
return null
}
describe('useModelControls', () => {
describe('useModelControls.refreshCurrentModel', () => {
beforeEach(() => {
$activeSessionId.set(null)
setCurrentModel('')
@@ -65,7 +26,6 @@ describe('useModelControls', () => {
})
afterEach(() => {
cleanup()
vi.restoreAllMocks()
$activeSessionId.set(null)
setCurrentModel('')
@@ -114,85 +74,4 @@ describe('useModelControls', () => {
expect($currentModel.get()).toBe('deepseek/deepseek-v4-pro')
expect($currentProvider.get()).toBe('deepseek')
})
it('routes active-session picker changes through config.set with an explicit provider', async () => {
const requestGateway = vi.fn(async () => ({ key: 'model', value: 'claude-sonnet-4.6' }) as never)
let controls!: Controls
render(
<Harness
activeSessionId="session-1"
onReady={value => (controls = value)}
requestGateway={requestGateway}
/>
)
await expect(
controls.selectModel({
model: 'claude-sonnet-4.6',
provider: 'anthropic'
})
).resolves.toBe(true)
expect(requestGateway).toHaveBeenCalledWith('config.set', {
session_id: 'session-1',
key: 'model',
value: 'claude-sonnet-4.6 --provider anthropic'
})
expect(requestGateway).not.toHaveBeenCalledWith('slash.exec', expect.anything())
})
it('stores a no-session pick as UI state with no gateway or global write', async () => {
const requestGateway = vi.fn()
let controls!: Controls
render(
<Harness
activeSessionId={null}
onReady={value => (controls = value)}
requestGateway={requestGateway}
/>
)
await expect(
controls.selectModel({
model: 'claude-sonnet-4.6',
provider: 'anthropic'
})
).resolves.toBe(true)
// The pick is plain UI state; session.create ships it later. Nothing touches
// the gateway or the profile default here.
expect($currentModel.get()).toBe('claude-sonnet-4.6')
expect($currentProvider.get()).toBe('anthropic')
expect(requestGateway).not.toHaveBeenCalled()
expect(setGlobalModel).not.toHaveBeenCalled()
})
it('seeds an empty composer model from global but never clobbers a pick', async () => {
vi.mocked(getGlobalModelInfo).mockResolvedValue({ model: 'openai/gpt-5.5', provider: 'openai-codex' })
const { result } = renderHook(() =>
useModelControls({
activeSessionId: null,
queryClient: new QueryClient(),
requestGateway: vi.fn()
})
)
// Empty → seeds the default.
await result.current.refreshCurrentModel()
expect($currentModel.get()).toBe('openai/gpt-5.5')
// A user pick must survive the lifecycle refreshes that fire on boot / fresh
// draft / session events.
setCurrentModel('anthropic/claude-sonnet-4.6')
setCurrentProvider('anthropic')
await result.current.refreshCurrentModel()
expect($currentModel.get()).toBe('anthropic/claude-sonnet-4.6')
// A profile swap forces a reseed to the new profile's default.
await result.current.refreshCurrentModel(true)
expect($currentModel.get()).toBe('openai/gpt-5.5')
})
})

View File

@@ -1,7 +1,7 @@
import { type QueryClient } from '@tanstack/react-query'
import { useCallback } from 'react'
import { getGlobalModelInfo } from '@/hermes'
import { getGlobalModelInfo, setGlobalModel } from '@/hermes'
import { useI18n } from '@/i18n'
import { notifyError } from '@/store/notifications'
import {
@@ -15,6 +15,7 @@ import type { ModelOptionsResponse } from '@/types/hermes'
interface ModelSelection {
model: string
persistGlobal: boolean
provider: string
}
@@ -27,7 +28,6 @@ interface ModelControlsOptions {
export function useModelControls({ activeSessionId, queryClient, requestGateway }: ModelControlsOptions) {
const { t } = useI18n()
const copy = t.desktop
const updateModelOptionsCache = useCallback(
(provider: string, model: string, includeGlobal: boolean) => {
const patch = (prev: ModelOptionsResponse | undefined) => ({ ...(prev ?? {}), provider, model })
@@ -41,24 +41,14 @@ export function useModelControls({ activeSessionId, queryClient, requestGateway
[activeSessionId, queryClient]
)
// Seed the composer's model state from the profile default. `force` reseeds
// for a profile swap (the new profile has its own default); otherwise this
// only fills an EMPTY selection so a user's pick (plain UI state in
// $currentModel) survives the lifecycle refreshes that fire on boot / fresh
// draft / session events. A live session owns the footer, so skip entirely.
const refreshCurrentModel = useCallback(async (force = false) => {
const refreshCurrentModel = useCallback(async () => {
try {
if ($activeSessionId.get()) {
return
}
if (!force && $currentModel.get()) {
return
}
const result = await getGlobalModelInfo()
if ($activeSessionId.get() || (!force && $currentModel.get())) {
// A resumed/live session owns the footer model state. Global config
// refreshes (gateway boot, profile swap, settings save) must not clobber
// the active chat's runtime model/provider in the status bar.
if ($activeSessionId.get()) {
return
}
@@ -74,14 +64,12 @@ export function useModelControls({ activeSessionId, queryClient, requestGateway
}
}, [])
// Returns whether the switch succeeded so callers can await it before applying
// follow-up changes. The composer model is plain UI state: with no live
// session it's just stored (and shipped on the next session.create); with one
// it's scoped to that session via config.set. It NEVER writes the profile
// default — that lives in Settings → Model — so picking a model here can't
// silently mutate global config.
// Returns whether the switch succeeded so callers can await it before
// applying follow-up changes (e.g. editing a model's reasoning/fast must land
// on the right active model — bail rather than write to the previous one).
const selectModel = useCallback(
async (selection: ModelSelection): Promise<boolean> => {
const includeGlobal = selection.persistGlobal || !activeSessionId
// Snapshot for rollback: the switch is applied optimistically, so a
// failure must restore the prior model/provider (store + query cache)
// rather than leave the UI showing a model the backend never selected.
@@ -90,34 +78,41 @@ export function useModelControls({ activeSessionId, queryClient, requestGateway
setCurrentModel(selection.model)
setCurrentProvider(selection.provider)
updateModelOptionsCache(selection.provider, selection.model, !activeSessionId)
// No live session yet: the pick is pure UI state. session.create reads
// $currentModel/$currentProvider and applies it as that session's override.
if (!activeSessionId) {
return true
}
updateModelOptionsCache(selection.provider, selection.model, includeGlobal)
try {
await requestGateway('config.set', {
session_id: activeSessionId,
key: 'model',
value: `${selection.model} --provider ${selection.provider}`
})
if (activeSessionId) {
await requestGateway('slash.exec', {
session_id: activeSessionId,
command: `/model ${selection.model} --provider ${selection.provider}${selection.persistGlobal ? ' --global' : ''}`
})
void queryClient.invalidateQueries({ queryKey: ['model-options', activeSessionId] })
if (selection.persistGlobal) {
void refreshCurrentModel()
}
void queryClient.invalidateQueries({
queryKey: selection.persistGlobal ? ['model-options'] : ['model-options', activeSessionId]
})
return true
}
await setGlobalModel(selection.provider, selection.model)
void refreshCurrentModel()
void queryClient.invalidateQueries({ queryKey: ['model-options'] })
return true
} catch (err) {
setCurrentModel(prevModel)
setCurrentProvider(prevProvider)
updateModelOptionsCache(prevProvider, prevModel, !activeSessionId)
updateModelOptionsCache(prevProvider, prevModel, includeGlobal)
notifyError(err, copy.modelSwitchFailed)
return false
}
},
[activeSessionId, copy.modelSwitchFailed, queryClient, requestGateway, updateModelOptionsCache]
[activeSessionId, copy.modelSwitchFailed, queryClient, refreshCurrentModel, requestGateway, updateModelOptionsCache]
)
return { refreshCurrentModel, selectModel, updateModelOptionsCache }

View File

@@ -32,7 +32,6 @@ import {
clearComposerAttachments,
type ComposerAttachment,
setComposerAttachmentUploadState,
setComposerDraft,
terminalContextBlocksFromDraft,
updateComposerAttachment
} from '@/store/composer'
@@ -59,7 +58,6 @@ import { clearSessionTodos } from '@/store/todos'
import type {
ClientSessionState,
BrowserManageResponse,
FileAttachResponse,
HandoffFailResponse,
HandoffRequestResponse,
@@ -952,26 +950,8 @@ export function usePromptActions({
return
}
// send / prefill carry an optional `notice` (e.g. "⊙ Goal set …")
// that the backend wants shown as a system line before the message
// is acted on. Mirrors the TUI's createSlashHandler — without it a
// `/goal <text>` looked like it did nothing.
if ((dispatch.type === 'send' || dispatch.type === 'prefill') && dispatch.notice?.trim()) {
renderSlashOutput(dispatch.notice.trim())
}
const message = ('message' in dispatch ? dispatch.message : '')?.trim() ?? ''
// /undo returns a prefill directive: drop the backed-up message into
// the composer for editing instead of submitting it immediately.
if (dispatch.type === 'prefill') {
if (message) {
setComposerDraft(message)
}
return
}
if (!message) {
renderSlashOutput(
`/${name}: ${dispatch.type === 'skill' ? 'skill payload missing message' : 'empty message'}`
@@ -1161,81 +1141,6 @@ export function usePromptActions({
} catch (err) {
renderSlashOutput(`error: ${err instanceof Error ? err.message : String(err)}`)
}
},
// /browser connect|disconnect|status manages the live CDP connection on
// the gateway host, mirroring the TUI's browser.manage RPC. It mutates
// BROWSER_CDP_URL (and may launch Chrome) in the gateway process — only
// meaningful when that process runs on this machine, so it's gated to
// local connections. A remote gateway would act on the wrong host.
browser: async ctx => {
const resolved = await withSlashOutput(ctx)
if (!resolved) {
return
}
const { render: renderSlashOutput, sessionId } = resolved
if ($connection.get()?.mode === 'remote') {
renderSlashOutput(
'/browser manages a Chromium-family browser on the gateway host — only available when connected to a local gateway.'
)
return
}
const [rawAction = 'status', ...rest] = ctx.arg.trim().split(/\s+/).filter(Boolean)
const cmdAction = rawAction.toLowerCase()
if (!['connect', 'disconnect', 'status'].includes(cmdAction)) {
renderSlashOutput(
'usage: /browser [connect|disconnect|status] [url] · persistent: set browser.cdp_url in config.yaml'
)
return
}
const url = cmdAction === 'connect' ? rest.join(' ').trim() || 'http://127.0.0.1:9222' : undefined
if (url) {
renderSlashOutput(`checking Chromium-family browser remote debugging at ${url}...`)
}
try {
const result = await requestGateway<BrowserManageResponse>('browser.manage', {
action: cmdAction,
session_id: sessionId,
...(url && { url })
})
// Without a streamed session subscription, the gateway bundles its
// progress lines into `messages` — flush them inline.
result?.messages?.forEach(message => renderSlashOutput(message))
if (cmdAction === 'status') {
renderSlashOutput(
result?.connected
? `browser connected: ${result.url || '(url unavailable)'}`
: 'browser not connected (try /browser connect <url> or set browser.cdp_url in config.yaml)'
)
return
}
if (cmdAction === 'disconnect') {
renderSlashOutput('browser disconnected')
return
}
if (result?.connected) {
renderSlashOutput('Browser connected to live Chromium-family browser via CDP')
renderSlashOutput(`Endpoint: ${result.url || '(url unavailable)'}`)
renderSlashOutput('next browser tool call will use this CDP endpoint')
}
} catch (err) {
renderSlashOutput(`error: ${err instanceof Error ? err.message : String(err)}`)
}
}
}

View File

@@ -2,8 +2,6 @@ import { cleanup, render } from '@testing-library/react'
import type { MutableRefObject } from 'react'
import { afterEach, describe, expect, it, vi } from 'vitest'
import { $resumeExhaustedSessionId, setResumeExhaustedSessionId } from '@/store/session'
import { useRouteResume } from './use-route-resume'
interface HarnessProps {
@@ -15,8 +13,6 @@ interface HarnessProps {
gatewayState: string
locationPathname: string
resumeSession: (sessionId: string, focus: boolean) => Promise<unknown>
resumeFailedSessionId?: null | string
resumeExhaustedSessionId?: null | string
routedSessionId: null | string
runtimeIdByStoredSessionIdRef: MutableRefObject<Map<string, string>>
selectedStoredSessionId: null | string
@@ -24,12 +20,8 @@ interface HarnessProps {
startFreshSessionDraft: (focus: boolean) => unknown
}
function RouteResumeHarness({
resumeFailedSessionId = null,
resumeExhaustedSessionId = null,
...props
}: HarnessProps) {
useRouteResume({ ...props, resumeExhaustedSessionId, resumeFailedSessionId })
function RouteResumeHarness(props: HarnessProps) {
useRouteResume(props)
return null
}
@@ -264,212 +256,3 @@ describe('useRouteResume', () => {
expect(resumeSession).toHaveBeenCalledWith('session-1', true)
})
})
describe('useRouteResume bounded auto-retry after a failed resume', () => {
afterEach(() => {
cleanup()
vi.useRealTimers()
vi.restoreAllMocks()
setResumeExhaustedSessionId(null)
})
// Common stranded-window props: gateway open, route on the session, no runtime
// yet, and the ref already synced to the route (resumeSession sets it at entry
// before failing) — the exact state that defeats the main effect's self-heal.
function strandedProps(resumeSession: (sid: string, focus: boolean) => Promise<unknown>) {
return {
activeSessionId: null,
activeSessionIdRef: { current: null } as MutableRefObject<null | string>,
creatingSessionRef: { current: false },
currentView: 'chat',
freshDraftReady: false,
gatewayState: 'open',
locationPathname: '/session-1',
resumeSession,
routedSessionId: 'session-1',
runtimeIdByStoredSessionIdRef: { current: new Map<string, string>() },
selectedStoredSessionId: 'session-1',
// Synced to the route by the failed resume's synchronous entry-write.
selectedStoredSessionIdRef: { current: 'session-1' } as MutableRefObject<null | string>,
startFreshSessionDraft: vi.fn()
}
}
it('retries the resume on backoff when the routed session is flagged as failed', () => {
vi.useFakeTimers()
const resumeSession = vi.fn(async () => undefined)
render(<RouteResumeHarness {...strandedProps(resumeSession)} resumeFailedSessionId="session-1" />)
// The main effect fires one resume on mount (pathname-changed). Clear it so
// we assert purely the bounded-retry effect's scheduled retry below.
resumeSession.mockClear()
// No immediate fire — the retry is scheduled behind the backoff timer.
expect(resumeSession).not.toHaveBeenCalled()
// First backoff window (1s) elapses → one retry.
vi.advanceTimersByTime(1_000)
expect(resumeSession).toHaveBeenCalledTimes(1)
expect(resumeSession).toHaveBeenCalledWith('session-1', true)
})
it('does NOT retry a failed session that is not the routed one', () => {
vi.useFakeTimers()
const resumeSession = vi.fn(async () => undefined)
// The failure flag points at a different session than the route.
render(<RouteResumeHarness {...strandedProps(resumeSession)} resumeFailedSessionId="other-session" />)
resumeSession.mockClear() // drop the mount resume
vi.advanceTimersByTime(10_000)
expect(resumeSession).not.toHaveBeenCalled()
})
it('skips the scheduled retry if the session already recovered when the timer fires', () => {
vi.useFakeTimers()
const resumeSession = vi.fn(async () => undefined)
const props = strandedProps(resumeSession)
render(<RouteResumeHarness {...props} resumeFailedSessionId="session-1" />)
resumeSession.mockClear() // drop the mount resume
// A resume landed while we waited: runtime is now bound.
props.activeSessionIdRef.current = 'runtime-1'
vi.advanceTimersByTime(8_000)
expect(resumeSession).not.toHaveBeenCalled()
})
it('stops retrying after MAX_RESUME_RETRIES consecutive failures', () => {
vi.useFakeTimers()
const resumeSession = vi.fn(async () => undefined)
const props = strandedProps(resumeSession)
// Model the real re-arm loop: resumeSession clears $resumeFailedSessionId at
// entry (null) and a repeat failure re-sets it ('session-1'). That null->id
// toggle is what re-runs the effect and advances the bounded counter. The
// routed session never changes, so the counter is NOT reset between cycles.
const { rerender } = render(<RouteResumeHarness {...props} resumeFailedSessionId="session-1" />)
resumeSession.mockClear() // drop the mount resume; count only the retries
for (let i = 0; i < 8; i += 1) {
vi.advanceTimersByTime(8_000) // fire the scheduled retry (if any)
rerender(<RouteResumeHarness {...props} resumeFailedSessionId={null} />) // cleared at entry
rerender(<RouteResumeHarness {...props} resumeFailedSessionId="session-1" />) // re-armed on failure
}
// Capped at MAX_RESUME_RETRIES (4): a persistently dead backend can't
// hot-loop the resume forever.
expect(resumeSession.mock.calls.length).toBe(4)
// Once auto-retry gives up, the exhausted latch is armed for the routed
// session so the chat view can swap the perpetual loader for an explicit
// error + manual Retry instead of spinning forever.
expect($resumeExhaustedSessionId.get()).toBe('session-1')
})
it('does not arm the exhausted latch while retries remain', () => {
vi.useFakeTimers()
const resumeSession = vi.fn(async () => undefined)
const props = strandedProps(resumeSession)
const { rerender } = render(<RouteResumeHarness {...props} resumeFailedSessionId="session-1" />)
resumeSession.mockClear()
// Two failure cycles — still under the 4-retry cap, so the latch must stay
// clear and the loader keeps spinning (auto-recovery hasn't given up yet).
for (let i = 0; i < 2; i += 1) {
vi.advanceTimersByTime(8_000)
rerender(<RouteResumeHarness {...props} resumeFailedSessionId={null} />)
rerender(<RouteResumeHarness {...props} resumeFailedSessionId="session-1" />)
}
expect($resumeExhaustedSessionId.get()).toBeNull()
})
it('clears a stale exhausted latch when the route moves off the stranded session', () => {
vi.useFakeTimers()
const resumeSession = vi.fn(async () => undefined)
const props = strandedProps(resumeSession)
// Pre-arm the latch as if this session had exhausted its retries.
setResumeExhaustedSessionId('session-1')
// Route is now on a different, healthy session that is not flagged as
// failed — the retry effect's "route moved off" branch clears the latch.
render(
<RouteResumeHarness
{...props}
activeSessionId="runtime-2"
activeSessionIdRef={{ current: 'runtime-2' }}
locationPathname="/session-2"
resumeFailedSessionId={null}
routedSessionId="session-2"
selectedStoredSessionId="session-2"
selectedStoredSessionIdRef={{ current: 'session-2' }}
/>
)
expect($resumeExhaustedSessionId.get()).toBeNull()
})
it('resets the retry counter for a fresh backoff cycle when the exhausted latch clears (manual retry, same session)', () => {
vi.useFakeTimers()
const resumeSession = vi.fn(async () => undefined)
const props = strandedProps(resumeSession)
// Phase A — exhaust the bounded auto-retry (counter → MAX) like a dead
// backend. The resumeExhaustedSessionId prop stays null here: the hook sets
// the store, which doesn't feed back into the prop in this harness.
const { rerender } = render(<RouteResumeHarness {...props} resumeFailedSessionId="session-1" />)
resumeSession.mockClear()
for (let i = 0; i < 8; i += 1) {
vi.advanceTimersByTime(8_000)
rerender(<RouteResumeHarness {...props} resumeFailedSessionId={null} />)
rerender(<RouteResumeHarness {...props} resumeFailedSessionId="session-1" />)
}
expect(resumeSession.mock.calls.length).toBe(4) // capped
expect($resumeExhaustedSessionId.get()).toBe('session-1')
// Phase B — user clicks Retry on the SAME stranded session. resumeSession
// clears both latches at entry; the exhausted latch's armed->cleared edge
// must reset the attempt counter so a fresh bounded cycle runs, not a single
// one-shot attempt that immediately re-arms the error. Model the prop
// transitions: reflect the armed latch, then clear it (retry), then re-arm
// the failure latch on the fresh failure.
resumeSession.mockClear()
rerender(<RouteResumeHarness {...props} resumeExhaustedSessionId="session-1" resumeFailedSessionId="session-1" />)
rerender(<RouteResumeHarness {...props} resumeExhaustedSessionId={null} resumeFailedSessionId={null} />)
rerender(<RouteResumeHarness {...props} resumeExhaustedSessionId={null} resumeFailedSessionId="session-1" />)
// A real retry fires again instead of staying pinned at MAX (which would
// dispatch nothing). Without the reset the counter stays >= MAX and this
// advance dispatches zero resumes.
vi.advanceTimersByTime(8_000)
expect(resumeSession.mock.calls.length).toBeGreaterThan(0)
})
it('does not burn retry attempts on unrelated re-renders during the backoff window', () => {
vi.useFakeTimers()
const props = strandedProps(vi.fn())
// Mount schedules the first backoff timer. Then re-render repeatedly with a
// fresh resumeSession identity (referential instability — a real dep change
// for the retry effect) WITHOUT ever letting the timer fire. The old code
// incremented the attempt counter at schedule time, so >= MAX re-renders
// armed the exhausted error with zero resumes actually dispatched. The fix
// only advances the counter when a timer truly fires, so the latch stays
// clear no matter how many spurious re-renders happen mid-backoff.
const { rerender } = render(
<RouteResumeHarness {...props} resumeFailedSessionId="session-1" resumeSession={vi.fn(async () => undefined)} />
)
for (let j = 0; j < 8; j += 1) {
rerender(
<RouteResumeHarness {...props} resumeFailedSessionId="session-1" resumeSession={vi.fn(async () => undefined)} />
)
}
expect($resumeExhaustedSessionId.get()).toBeNull()
})
})

View File

@@ -1,7 +1,6 @@
import { type MutableRefObject, useEffect, useRef } from 'react'
import { isNewChatRoute } from '@/app/routes'
import { setResumeExhaustedSessionId } from '@/store/session'
interface RouteResumeOptions {
activeSessionId: string | null
@@ -12,17 +11,6 @@ interface RouteResumeOptions {
gatewayState: string | undefined
locationPathname: string
resumeSession: (sessionId: string, focus: boolean) => Promise<unknown>
// Stored-session id whose most recent resume failed terminally (set by
// useSessionActions, mirrored from $resumeFailedSessionId). While this equals
// routedSessionId the window would otherwise latch on the loader forever, so
// the bounded-retry effect below re-attempts the resume.
resumeFailedSessionId: string | null
// Stored-session id whose bounded auto-retry has EXHAUSTED (mirrored from
// $resumeExhaustedSessionId). Only resumeSession clears this latch (manual
// Retry / reconnect / reselect) — the auto-retry loop never does — so its
// armed->cleared edge is an unambiguous "give me a fresh backoff cycle"
// signal the effect below uses to reset the attempt counter.
resumeExhaustedSessionId: string | null
routedSessionId: string | null
runtimeIdByStoredSessionIdRef: MutableRefObject<Map<string, string>>
selectedStoredSessionId: string | null
@@ -30,19 +18,6 @@ interface RouteResumeOptions {
startFreshSessionDraft: (focus: boolean) => unknown
}
// Bounded auto-retry for a stranded session window. A resume can fail terminally
// (gateway RPC reject + REST fallback failure) on a transiently wedged backend —
// dead provider key, a runaway turn hogging the dispatcher, flaky DNS. Without a
// retry the loader latches forever. We retry with backoff, capped, so a
// genuinely dead backend doesn't hot-loop the resume.
const MAX_RESUME_RETRIES = 4
const RESUME_RETRY_BASE_MS = 1_000
const RESUME_RETRY_MAX_MS = 8_000
function resumeRetryDelayMs(attempt: number): number {
return Math.min(RESUME_RETRY_MAX_MS, RESUME_RETRY_BASE_MS * 2 ** attempt)
}
// HashRouter boot edge case: pathname briefly reads `/` before the hash is
// parsed. If the hash references a real session, defer; resume picks it up
// next tick. Without this, ctrl+R on `#/:sessionId` flashes 5 loading states.
@@ -74,8 +49,6 @@ export function useRouteResume({
gatewayState,
locationPathname,
resumeSession,
resumeFailedSessionId,
resumeExhaustedSessionId,
routedSessionId,
runtimeIdByStoredSessionIdRef,
selectedStoredSessionId,
@@ -85,16 +58,6 @@ export function useRouteResume({
const lastPathnameRef = useRef<string | null>(null)
const seenGatewayStateRef = useRef(false)
const wasGatewayOpenRef = useRef(false)
// Per-session retry bookkeeping for the bounded auto-retry effect below. Keyed
// by the session id we're retrying so switching chats resets the counter.
const retrySessionIdRef = useRef<string | null>(null)
const retryAttemptRef = useRef(0)
// Tracks the previous exhausted-latch value so we can detect its armed->cleared
// edge. resumeSession clears $resumeExhaustedSessionId on a manual Retry /
// reconnect / reselect; that transition is our cue to reset the attempt counter
// for a fresh backoff cycle on the SAME session (the auto-retry loop itself
// never touches this latch, so it can't spuriously trigger the reset).
const prevResumeExhaustedRef = useRef<string | null>(null)
useEffect(() => {
const gatewayOpen = gatewayState === 'open'
@@ -176,111 +139,4 @@ export function useRouteResume({
selectedStoredSessionIdRef,
startFreshSessionDraft
])
// Bounded auto-retry: when the routed session's resume failed terminally
// (resumeFailedSessionId matches the route), schedule a backoff retry so the
// window recovers on its own instead of latching the loader forever. This is
// the safety net the main effect above can't provide: after a failed resume,
// selectedStoredSessionIdRef.current already equals the route (resumeSession
// sets it synchronously at entry) and the pathname/gateway are unchanged, so
// none of stuckOnRoutedSession / pathnameChanged / gatewayBecameOpen fire
// again. resumeSession clears resumeFailedSessionId on its next attempt; a
// success keeps it clear (the effect's guard then no-ops), a repeat failure
// re-arms it and we back off further, capped at MAX_RESUME_RETRIES.
useEffect(() => {
// Detect the exhausted-latch armed->cleared edge for the current route. Only
// resumeSession clears $resumeExhaustedSessionId (manual Retry / reconnect /
// reselect) — the auto-retry loop never touches it — so this transition
// uniquely means "the user asked for another go." Reset the attempt counter
// for a fresh bounded backoff cycle on the SAME session. Without this,
// retryAttemptRef stays pinned at MAX after exhaustion (the !stranded reset
// below only fires on a route CHANGE to a different session), so a manual
// retry on the same stranded session would get exactly ONE attempt and then
// immediately re-arm the exhausted error — never the renewed backoff cycle
// the store/session.ts + use-session-actions.ts comments promise. (Point 2)
const wasExhausted = prevResumeExhaustedRef.current
prevResumeExhaustedRef.current = resumeExhaustedSessionId
if (wasExhausted && wasExhausted === routedSessionId && resumeExhaustedSessionId !== wasExhausted) {
retrySessionIdRef.current = routedSessionId
retryAttemptRef.current = 0
}
if (currentView !== 'chat' || gatewayState !== 'open') {
return
}
const stranded =
Boolean(routedSessionId) &&
resumeFailedSessionId === routedSessionId &&
!creatingSessionRef.current
if (!stranded) {
// Route moved off the stranded session (or it recovered) — reset the
// counter so a future failure on another session starts fresh, and clear
// any exhausted-latch armed for a session we're no longer viewing (never
// the current route: that's the error state we want to keep showing).
// resumeSession also clears it on a fresh attempt; this covers a plain
// route-change away from the stranded window.
if (retrySessionIdRef.current !== routedSessionId) {
retrySessionIdRef.current = null
retryAttemptRef.current = 0
setResumeExhaustedSessionId(current => (current && current !== routedSessionId ? null : current))
}
return
}
// New stranded session id → reset the attempt counter.
if (retrySessionIdRef.current !== routedSessionId) {
retrySessionIdRef.current = routedSessionId
retryAttemptRef.current = 0
}
if (retryAttemptRef.current >= MAX_RESUME_RETRIES) {
// Give up auto-retrying a persistently dead backend; the user can still
// reconnect / reselect (which resets the counter via the branch above).
// Surface an explicit error + manual Retry in the chat view instead of
// spinning the loader forever — resumeSession (manual Retry / reconnect /
// reselect) clears this latch and resets the counter for a fresh cycle.
setResumeExhaustedSessionId(routedSessionId)
return
}
const attempt = retryAttemptRef.current
const sessionId = routedSessionId as string
const timer = setTimeout(() => {
// Re-check liveness at fire time: a resume may have landed while we waited.
if (
creatingSessionRef.current ||
selectedStoredSessionIdRef.current !== sessionId ||
activeSessionIdRef.current !== null
) {
return
}
// Consume an attempt ONLY now that a resume is actually dispatching.
// Incrementing at schedule time (the old behavior) let unrelated dep
// changes during the 1s8s backoff window — a transient gatewayState
// flip, a non-referentially-stable resumeSession — clear the pending
// timer and re-run the effect, burning an attempt without any resume
// having fired. A flapping backend could then hit MAX in a couple of
// re-renders with far fewer than MAX real attempts. (Point 3)
retryAttemptRef.current += 1
void resumeSession(sessionId, true)
}, resumeRetryDelayMs(attempt))
return () => clearTimeout(timer)
}, [
activeSessionIdRef,
creatingSessionRef,
currentView,
gatewayState,
resumeSession,
resumeFailedSessionId,
resumeExhaustedSessionId,
routedSessionId,
selectedStoredSessionIdRef
])
}

View File

@@ -3,9 +3,8 @@ import type { MutableRefObject } from 'react'
import { useEffect } from 'react'
import { afterEach, describe, expect, it, vi } from 'vitest'
import { getSessionMessages } from '@/hermes'
import { $activeGatewayProfile, $newChatProfile } from '@/store/profile'
import { $currentCwd, $messages, $resumeFailedSessionId, setMessages, setResumeFailedSessionId } from '@/store/session'
import { $currentCwd } from '@/store/session'
import type { ClientSessionState } from '../../types'
@@ -118,142 +117,3 @@ describe('createBackendSessionForSend profile routing', () => {
expect(params).toMatchObject({ profile: 'default' })
})
})
// ── Resume failure recovery (the "stuck loading session window" bug) ──────────
// When session.resume rejects AND the REST transcript fallback ALSO fails, the
// hook must (a) not throw out of the fallback (which stranded the loader), and
// (b) arm $resumeFailedSessionId so use-route-resume can retry. A resume that
// succeeds must NOT leave the flag armed.
function ResumeHarness({
onReady,
requestGateway
}: {
onReady: (resume: (storedSessionId: string, replaceRoute?: boolean) => Promise<unknown>) => void
requestGateway: <T>(method: string, params?: Record<string, unknown>) => Promise<T>
}) {
const ref = <T,>(value: T): MutableRefObject<T> => ({ current: value })
const actions = useSessionActions({
activeSessionId: null,
activeSessionIdRef: ref<string | null>(null),
busyRef: ref(false),
creatingSessionRef: ref(false),
ensureSessionState: () => ({}) as ClientSessionState,
getRouteToken: () => 'token',
navigate: vi.fn() as never,
requestGateway,
runtimeIdByStoredSessionIdRef: ref(new Map<string, string>()),
selectedStoredSessionId: null,
selectedStoredSessionIdRef: ref<string | null>(null),
sessionStateByRuntimeIdRef: ref(new Map<string, ClientSessionState>()),
syncSessionStateToView: vi.fn(),
updateSessionState: (_sessionId, updater) => updater({} as ClientSessionState)
})
useEffect(() => {
onReady(actions.resumeSession)
}, [actions.resumeSession, onReady])
return null
}
describe('resumeSession failure recovery', () => {
afterEach(() => {
cleanup()
setResumeFailedSessionId(null)
setMessages([])
vi.restoreAllMocks()
})
async function runResume(
requestGateway: <T>(method: string, params?: Record<string, unknown>) => Promise<T>
): Promise<void> {
let resume: ((storedSessionId: string, replaceRoute?: boolean) => Promise<unknown>) | null = null
render(<ResumeHarness onReady={r => (resume = r)} requestGateway={requestGateway} />)
await waitFor(() => expect(resume).not.toBeNull())
await resume!('stored-1', true)
}
it('arms $resumeFailedSessionId when resume RPC and REST fallback both fail', async () => {
// session.resume rejects (e.g. timeout against a wedged backend)...
const requestGateway = vi.fn(async (method: string) => {
if (method === 'session.resume') {
throw new Error('request timed out: session.resume')
}
return {} as never
})
// ...and the REST transcript fallback also rejects (backend unreachable).
vi.mocked(getSessionMessages).mockRejectedValue(new Error('network down'))
await runResume(requestGateway)
// The window is no longer silently stranded: the failure latch is armed for
// the stored session, which use-route-resume consumes to retry.
expect($resumeFailedSessionId.get()).toBe('stored-1')
})
it('does NOT arm the failure latch when the resume RPC fails but the REST fallback paints history', async () => {
// session.resume rejects, but the REST transcript fallback succeeds and
// hydrates a readable transcript — the window is NOT stranded.
const requestGateway = vi.fn(async (method: string) => {
if (method === 'session.resume') {
throw new Error('request timed out: session.resume')
}
return {} as never
})
vi.mocked(getSessionMessages).mockResolvedValue({
messages: [
{ content: 'hello', role: 'user', timestamp: 1 },
{ content: 'hi there', role: 'assistant', timestamp: 2 }
],
session_id: 'stored-1'
} as never)
await runResume(requestGateway)
// Arming here would auto-retry a window that already shows history and,
// on exhaustion, blank that transcript behind the error overlay — a
// regression vs. plain fallback-success. The latch must stay clear.
expect($resumeFailedSessionId.get()).toBeNull()
// The fallback transcript is visible.
expect($messages.get().length).toBeGreaterThan(0)
})
it('does NOT throw out of the fallback when REST also fails (no unhandled rejection)', async () => {
const requestGateway = vi.fn(async (method: string) => {
if (method === 'session.resume') {
throw new Error('request timed out: session.resume')
}
return {} as never
})
vi.mocked(getSessionMessages).mockRejectedValue(new Error('network down'))
// resumeSession must resolve (swallow the fallback failure), not reject.
await expect(runResume(requestGateway)).resolves.toBeUndefined()
})
it('leaves the failure latch clear when resume succeeds', async () => {
// Pre-arm to prove a successful resume clears it (entry-clear path).
setResumeFailedSessionId('stored-1')
const requestGateway = vi.fn(async (method: string, params?: Record<string, unknown>) => {
if (method === 'session.resume') {
return { session_id: 'runtime-1', resumed: params?.session_id, messages: [], info: {} } as never
}
return {} as never
})
vi.mocked(getSessionMessages).mockResolvedValue({ messages: [] } as never)
await runResume(requestGateway)
expect($resumeFailedSessionId.get()).toBeNull()
})
})

View File

@@ -15,10 +15,6 @@ import { requestDesktopOnboarding } from '@/store/onboarding'
import { $activeGatewayProfile, $newChatProfile, $profiles, ensureGatewayProfile, normalizeProfileKey } from '@/store/profile'
import {
$currentCwd,
$currentFastMode,
$currentModel,
$currentProvider,
$currentReasoningEffort,
$messages,
$sessions,
$yoloActive,
@@ -38,8 +34,6 @@ import {
setFreshDraftReady,
setIntroSeed,
setMessages,
setResumeExhaustedSessionId,
setResumeFailedSessionId,
setSelectedStoredSessionId,
setSessions,
setSessionStartedAt,
@@ -48,7 +42,6 @@ import {
setYoloActive,
workspaceCwdForNewSession
} from '@/store/session'
import { broadcastSessionsChanged } from '@/store/session-sync'
import { reportBackendContract } from '@/store/updates'
import { isWatchWindow } from '@/store/windows'
import type { SessionCreateResponse, SessionInfo, SessionResumeResponse, SessionRuntimeInfo, UsageStats } from '@/types/hermes'
@@ -413,13 +406,13 @@ export function useSessionActions({
})
setSessionStartedAt(null)
setTurnStartedAt(null)
// The composer's model/effort/fast is sticky UI state (persisted in
// localStorage) — a new chat FOLLOWS your last pick instead of snapping
// back to the profile default, so we deliberately don't reset it here. The
// profile default still owns first-run seeding and profile switches (see
// refreshCurrentModel). Only $currentServiceTier (a live-session mirror)
// is cleared.
// New chats start in the configured default project dir when set,
// otherwise the sticky last-used workspace (PR #37586).
setCurrentModel('')
setCurrentProvider('')
setCurrentReasoningEffort('')
setCurrentServiceTier('')
setCurrentFastMode(false)
setYoloActive(false)
setCurrentCwd(workspaceCwdForNewSession())
setCurrentBranch('')
@@ -449,23 +442,11 @@ export function useSessionActions({
const newChatProfile = $newChatProfile.get() ?? normalizeProfileKey($activeGatewayProfile.get())
await ensureGatewayProfile(newChatProfile)
const cwd = $currentCwd.get().trim() || workspaceCwdForNewSession()
// The composer's model/effort/fast is sticky UI state ($currentModel,
// $currentProvider, $currentReasoningEffort, $currentFastMode). Ship it
// with every session.create so the new chat opens on whatever the picker
// shows — applied as per-session overrides, never written to the profile
// default (that lives in Settings → Model).
const uiModel = $currentModel.get().trim()
const uiProvider = $currentProvider.get().trim()
const uiEffort = $currentReasoningEffort.get().trim()
const uiFast = $currentFastMode.get()
const created = await requestGateway<SessionCreateResponse>('session.create', {
cols: 96,
...(cwd && { cwd }),
...(newChatProfile ? { profile: newChatProfile } : {}),
...(uiModel ? { model: uiModel, ...(uiProvider ? { provider: uiProvider } : {}) } : {}),
...(uiEffort ? { reasoning_effort: uiEffort } : {}),
...(uiFast ? { fast: true } : {})
...(newChatProfile ? { profile: newChatProfile } : {})
})
const stored = created.stored_session_id ?? null
@@ -491,9 +472,6 @@ export function useSessionActions({
// server later returns its own preview/title and supersedes this.
upsertOptimisticSession(created, stored, null, preview?.trim() || null)
navigate(sessionRoute(stored), { replace: true })
// Other windows (e.g. the main window when this is the pop-out) can't
// see this session until they re-pull the shared list.
broadcastSessionsChanged()
}
setFreshDraftReady(false)
@@ -581,15 +559,6 @@ export function useSessionActions({
clearNotifications()
setSelectedStoredSessionId(storedSessionId)
selectedStoredSessionIdRef.current = storedSessionId
// Optimistically clear any prior resume-failure latch for this session:
// we're attempting a fresh resume, so the self-heal in use-route-resume
// must not keep treating it as stranded. It's re-armed below only if THIS
// attempt fails terminally (RPC reject + REST fallback failure).
setResumeFailedSessionId(current => (current === storedSessionId ? null : current))
// Also clear the exhausted-latch: a fresh attempt (manual Retry, reconnect,
// reselect) gives the bounded auto-retry counter a clean cycle, so the
// chat view drops the error state and shows the loader again.
setResumeExhaustedSessionId(current => (current === storedSessionId ? null : current))
const warmRuntimeId = runtimeIdByStoredSessionIdRef.current.get(storedSessionId)
@@ -780,41 +749,13 @@ export function useSessionActions({
return
}
// The gateway resume RPC failed. Try the REST transcript as a fallback
// so the window at least shows history. CRITICAL: this fallback must be
// wrapped in its own try — if it ALSO throws (wedged/unreachable backend,
// the common case when resume failed in the first place), an unguarded
// throw here skips setMessages AND leaves activeSessionId null with an
// empty transcript. That is the exact state the thread loader latches on
// forever (messagesEmpty && !activeSessionId) with no recovery path —
// the "open in new window stays stuck loading, even after a nap" bug.
try {
const fallback = await getSessionMessages(storedSessionId, sessionProfile)
const fallback = await getSessionMessages(storedSessionId, sessionProfile)
if (!isCurrentResume()) {
return
}
setMessages(preserveLocalAssistantErrors(toChatMessages(fallback.messages), $messages.get()))
} catch {
// Fallback also failed: nothing to paint. Leave whatever messages are
// already shown and fall through to arm the resume-failure latch so
// use-route-resume re-attempts the resume on the next render / window
// focus / gateway reconnect instead of stranding the loader.
}
if (isCurrentResume() && $messages.get().length === 0) {
// Arm the self-heal ONLY when the window is still empty: the gateway
// resume rejected AND the REST fallback failed to paint a transcript.
// That is the exact stranded state the loader latches on
// (messagesEmpty && !activeSessionId), and matches $resumeFailedSessionId's
// documented contract. If the REST fallback DID paint history, the
// window is readable — arming here would needlessly auto-retry and,
// once retries exhaust, blank that visible transcript behind the
// exhausted-state error overlay (a regression vs. plain fallback success).
setResumeFailedSessionId(storedSessionId)
if (!isCurrentResume()) {
return
}
setMessages(preserveLocalAssistantErrors(toChatMessages(fallback.messages), $messages.get()))
notifyError(err, copy.resumeFailed)
} finally {
if (isCurrentResume()) {

View File

@@ -2,14 +2,12 @@ import { act, cleanup, render } from '@testing-library/react'
import type { MutableRefObject } from 'react'
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest'
import type { ChatMessage } from '@/lib/chat-messages'
import {
$currentFastMode,
$currentModel,
$currentProvider,
$currentReasoningEffort,
$currentServiceTier,
$messages,
$turnStartedAt,
setCurrentFastMode,
setCurrentModel,
@@ -215,113 +213,3 @@ describe('useSessionStateCache — per-session turn timer', () => {
expect($currentFastMode.get()).toBe(false)
})
})
function userMessage(id: string, text: string): ChatMessage {
return { id, role: 'user', parts: [{ type: 'text', text }] }
}
function assistantText(id: string, text: string): ChatMessage {
return { id, role: 'assistant', parts: [{ type: 'text', text }] }
}
function assistantError(id: string, error: string): ChatMessage {
return { id, role: 'assistant', parts: [], error, pending: false }
}
interface ViewHarnessProps {
activeSessionId: string | null
onReady: (cache: Cache) => void
}
function ViewHarness({ activeSessionId, onReady }: ViewHarnessProps) {
const busyRef: MutableRefObject<boolean> = { current: false }
const cache = useSessionStateCache({
activeSessionId,
busyRef,
selectedStoredSessionId: null,
setAwaitingResponse: () => undefined,
setBusy: () => undefined,
// Wire the published view back into the real $messages atom the flush
// reads from, so the round-trip matches production.
setMessages: messages => $messages.set(messages)
})
onReady(cache)
return null
}
describe('useSessionStateCache — cross-thread error isolation', () => {
afterEach(() => {
cleanup()
$messages.set([])
})
it('does not leak a failed turn into another thread on switch', () => {
$messages.set([])
let cache!: Cache
const { rerender } = render(<ViewHarness activeSessionId="thread-A" onReady={c => (cache = c)} />)
// Thread A ends its turn with an out-of-funds error and is on screen.
act(() => {
cache.updateSessionState(
'thread-A',
state => ({
...state,
busy: false,
messages: [userMessage('user-a', 'do the thing'), assistantError('assistant-a-error', 'Out of funds')]
}),
'stored-A'
)
})
expect($messages.get().some(message => message.error === 'Out of funds')).toBe(true)
// Switch to thread B (which completed cleanly). Its cached state syncs to
// the view while $messages still holds thread A's transcript.
rerender(<ViewHarness activeSessionId="thread-B" onReady={c => (cache = c)} />)
act(() => {
cache.updateSessionState(
'thread-B',
state => ({
...state,
busy: false,
messages: [userMessage('user-b', 'hello'), assistantText('assistant-b', 'hi there')]
}),
'stored-B'
)
})
expect($messages.get().map(message => message.id)).toEqual(['user-b', 'assistant-b'])
expect($messages.get().some(message => message.error === 'Out of funds')).toBe(false)
})
it('still preserves a same-session local error a heartbeat dropped', () => {
$messages.set([])
let cache!: Cache
render(<ViewHarness activeSessionId="thread-A" onReady={c => (cache = c)} />)
// First paint establishes thread A as the on-screen session.
act(() => {
cache.updateSessionState(
'thread-A',
state => ({ ...state, busy: false, messages: [userMessage('user-a', 'do the thing')] }),
'stored-A'
)
})
// A local error lands in the view (e.g. failAssistantMessage wrote it).
$messages.set([userMessage('user-a', 'do the thing'), assistantError('assistant-a-error', 'OpenRouter 403')])
// A later same-session heartbeat carries cached state that lost the error.
act(() => {
cache.updateSessionState('thread-A', state => ({
...state,
busy: false,
messages: [userMessage('user-a', 'do the thing')]
}))
})
expect($messages.get().some(message => message.error === 'OpenRouter 403')).toBe(true)
})
})

View File

@@ -79,9 +79,6 @@ export function useSessionStateCache({
const runtimeIdByStoredSessionIdRef = useRef(new Map<string, string>())
const pendingViewStateRef = useRef<{ sessionId: string; state: ClientSessionState } | null>(null)
const viewSyncRafRef = useRef<number | null>(null)
// Runtime id whose transcript currently occupies `$messages` — lets the
// flush below tell a same-session refresh from a thread switch.
const viewSessionIdRef = useRef<string | null>(null)
useEffect(() => {
activeSessionIdRef.current = activeSessionId
@@ -145,22 +142,12 @@ export function useSessionStateCache({
// jerks the scroll position while the user is reading. Skip the publish when
// the merged result is content-identical to what's already on screen.
const currentMessages = $messages.get()
// On a thread switch `$messages` still holds the *previous* thread, so
// preserving its local errors would graft that thread's failed turn (e.g.
// an out-of-funds error) onto this one — then cascade it everywhere as the
// polluted view becomes the next switch's baseline. Only carry errors
// across a same-session refresh; our cached state already keeps its own.
const nextMessages =
viewSessionIdRef.current === pending.sessionId
? preserveLocalAssistantErrors(pending.state.messages, currentMessages)
: pending.state.messages
const nextMessages = preserveLocalAssistantErrors(pending.state.messages, currentMessages)
if (!sameMessageList(nextMessages, currentMessages)) {
setMessages(nextMessages)
}
viewSessionIdRef.current = pending.sessionId
syncRuntimeMetadataToView(pending.state)
setBusy(pending.state.busy)
setMutableRef(busyRef, pending.state.busy)

View File

@@ -23,7 +23,6 @@ import { fieldCopyForSchemaKey } from './field-copy'
import { enumOptionsFor, getNested, prettyName, setNested } from './helpers'
import { ModelSettings } from './model-settings'
import { EmptyState, ListRow, LoadingState, SettingsContent } from './primitives'
import { ProviderConfigPanel } from './provider-config-panel'
function ConfigField({
schemaKey,
@@ -369,9 +368,6 @@ export function ConfigSettings({
schemaKey={key}
value={getNested(config, key)}
/>
{key === 'memory.provider' && typeof getNested(config, key) === 'string' && getNested(config, key) ? (
<ProviderConfigPanel provider={String(getNested(config, key))} />
) : null}
</div>
))}
</div>

View File

@@ -239,7 +239,7 @@ export const ENUM_OPTIONS: Record<string, string[]> = {
'code_execution.mode': ['project', 'strict'],
'context.engine': ['compressor', 'default', 'custom'],
'delegation.reasoning_effort': ['', 'minimal', 'low', 'medium', 'high', 'xhigh'],
'memory.provider': ['', 'builtin', 'hindsight', 'honcho'],
'memory.provider': ['', 'builtin', 'honcho'],
// Terminal execution backends — kept in sync with the dispatch ladder in
// tools/terminal_tool.py::_create_environment (local/docker/singularity/
// modal/daytona/ssh). Remote backends need extra env (image, tokens, host).

View File

@@ -6,12 +6,6 @@ import { defineFieldCopy, fieldCopyForSchemaKey, schemaKeyToFieldCopyKey } from
import { enumOptionsFor, getNested, providerGroup, setNested, stripToolsetLabel, toolsetDisplayLabel } from './helpers'
describe('settings helpers', () => {
it('lists Hindsight as a built-in desktop memory provider option', () => {
const options = enumOptionsFor('memory.provider', '', {})
expect(options).toContain('hindsight')
})
describe('defineFieldCopy', () => {
it('flattens nested field copy paths', () => {
const copy = defineFieldCopy({

View File

@@ -228,7 +228,7 @@ export function SettingsView({ gateway, onClose, onConfigSaved, onMainModelChang
onMainModelChanged={onMainModelChanged}
/>
) : activeView === 'providers' ? (
<ProvidersSettings onClose={onClose} onViewChange={setProviderView} view={providerView} />
<ProvidersSettings onViewChange={setProviderView} view={providerView} />
) : activeView === 'keys' ? (
<KeysSettings view={keysView} />
) : activeView === 'mcp' ? (

View File

@@ -16,8 +16,6 @@ const getAuxiliaryModels = vi.fn()
const setModelAssignment = vi.fn()
const getRecommendedDefaultModel = vi.fn()
const setEnvVar = vi.fn()
const getHermesConfigRecord = vi.fn()
const saveHermesConfig = vi.fn()
const startManualProviderOAuth = vi.fn()
vi.mock('@/hermes', () => ({
@@ -26,9 +24,7 @@ vi.mock('@/hermes', () => ({
getAuxiliaryModels: () => getAuxiliaryModels(),
setModelAssignment: (body: unknown) => setModelAssignment(body),
getRecommendedDefaultModel: (slug: string) => getRecommendedDefaultModel(slug),
setEnvVar: (key: string, value: string) => setEnvVar(key, value),
getHermesConfigRecord: () => getHermesConfigRecord(),
saveHermesConfig: (config: unknown) => saveHermesConfig(config)
setEnvVar: (key: string, value: string) => setEnvVar(key, value)
}))
vi.mock('@/store/onboarding', () => ({
@@ -39,13 +35,7 @@ beforeEach(() => {
getGlobalModelInfo.mockResolvedValue({ provider: 'nous', model: 'hermes-4' })
getGlobalModelOptions.mockResolvedValue({
providers: [
{
name: 'Nous',
slug: 'nous',
models: ['hermes-4', 'hermes-4-mini'],
authenticated: true,
capabilities: { 'hermes-4': { reasoning: true, fast: true } }
},
{ name: 'Nous', slug: 'nous', models: ['hermes-4', 'hermes-4-mini'], authenticated: true },
// An unconfigured api_key provider — surfaced by the full-universe payload.
{ name: 'DeepSeek', slug: 'deepseek', models: [], authenticated: false, auth_type: 'api_key', key_env: 'DEEPSEEK_API_KEY' }
]
@@ -57,8 +47,6 @@ beforeEach(() => {
setModelAssignment.mockResolvedValue({ provider: 'nous', model: 'hermes-4', gateway_tools: [] })
getRecommendedDefaultModel.mockResolvedValue({ provider: 'deepseek', model: 'deepseek-chat', free_tier: null })
setEnvVar.mockResolvedValue({ ok: true })
getHermesConfigRecord.mockResolvedValue({ agent: { reasoning_effort: 'medium', service_tier: 'normal' } })
saveHermesConfig.mockResolvedValue({ ok: true })
})
afterEach(() => {
@@ -112,31 +100,6 @@ describe('ModelSettings', () => {
await waitFor(() => expect(setEnvVar).toHaveBeenCalledWith('DEEPSEEK_API_KEY', 'sk-test-123'))
})
it('writes the profile default speed (service_tier) when the fast switch is toggled', async () => {
await renderModelSettings()
await waitFor(() => expect(getHermesConfigRecord).toHaveBeenCalled())
const fastSwitch = await screen.findByRole('switch')
fireEvent.click(fastSwitch)
await waitFor(() =>
expect(saveHermesConfig).toHaveBeenCalledWith(
expect.objectContaining({ agent: expect.objectContaining({ service_tier: 'fast' }) })
)
)
})
it('hides the reasoning/speed defaults when the main model reports no capabilities', async () => {
getGlobalModelOptions.mockResolvedValueOnce({
providers: [{ name: 'Nous', slug: 'nous', models: ['hermes-4'], authenticated: true, capabilities: { 'hermes-4': { reasoning: false, fast: false } } }]
})
await renderModelSettings()
await waitFor(() => expect(getHermesConfigRecord).toHaveBeenCalled())
expect(screen.queryByRole('switch')).toBeNull()
})
it('renders the auxiliary task rows', async () => {
await renderModelSettings()

Some files were not shown because too many files have changed in this diff Show More