Compare commits

382 Commits

Author SHA1 Message Date
ethernet
576be55ca2 asdfasdf 2026-06-29 17:24:14 -04:00
ethernet
1930bf1e63 wip ts-ify 2026-06-29 16:31:36 -04:00
Ruzzgar
313a8c6833 fix(skills): replace string prefix check with strict path containment 2026-06-28 21:14:01 -07:00
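The subject line of 313a8c6833 does not show the code, so the following is only a minimal sketch of the general technique, assuming the check is done with `pathlib`. The helper name `is_within` and the example paths are hypothetical and not taken from the commit.

```python
from pathlib import Path


def is_within(base: Path, candidate: Path) -> bool:
    """Return True only if candidate resolves to base or to a path beneath it."""
    base_resolved = base.resolve()
    candidate_resolved = candidate.resolve()
    # Compare whole path components rather than characters.
    return (
        candidate_resolved == base_resolved
        or base_resolved in candidate_resolved.parents
    )


# A string prefix check accepts a sibling directory that merely shares a prefix.
base = Path("/srv/skills")
sibling = Path("/srv/skills-evil/payload.py")
prefix_check = str(sibling).startswith(str(base))  # accepted (the bug class)
strict_check = is_within(base, sibling)            # rejected
```

Resolving both sides first also collapses `..` segments, so a candidate such as `/srv/skills/../etc/passwd` is compared by where it actually points.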
Ben Barclay
0943e2a272 fix(cron): don't report a false 'gateway not running' on external-provider instances (#54600)
`hermes cron status` (and the create/list 'gateway not running' nag)
judge whether cron will fire purely from the in-process ticker's
heartbeat file + a live gateway PID. That heuristic is correct for the
built-in ticker but WRONG for an external provider like Chronos:

Chronos arms exactly one external one-shot per job and is fired by a
NAS-mediated webhook (POST /api/cron/fire). Its `start()` returns
immediately and it deliberately runs no 60s loop and writes no ticker
heartbeat — that's the whole point of scale-to-zero (the machine is at
zero between fires). So on a perfectly healthy Chronos instance,
`cron status` always printed '✗ Gateway is not running — cron jobs will
NOT fire' (or a STALLED-ticker warning), and `cron create` always
appended the 'jobs won't fire automatically' nag — both false.

Verified live on a staging Chronos instance: jobs fired and completed on
schedule via the relay while `cron status` insisted the gateway wasn't
running and the heartbeat was 370s+ stale.

Fix: resolve the active provider (offline — `resolve_cron_scheduler`,
whose `is_available()` contract forbids network) and, for any non-builtin
provider, report the managed-scheduler state instead of the ticker
heuristics, and suppress the ticker-only 'gateway not running' warning.
The built-in path is byte-unchanged. Active-job summary is factored into
a shared helper so both paths print it identically.

New tests prove both directions (chronos: no false negative even with no
gateway PID / no heartbeat; builtin: historical warning preserved) and
fail without the fix.
2026-06-29 14:03:02 +10:00
Teknium
e20ff352b9 test(matrix): authorize inviter in DM-invite fixture for new invite-auth gate
_on_invite now rejects auto-joins from users not on the allow-list. The
DM-recording tests invite @alice and expect a join, so the shared
_make_adapter fixture now puts @alice on _allowed_user_ids.
2026-06-28 20:47:33 -07:00
aaronagent
d836b2bac4 fix(matrix,mattermost): invite auth check + API path traversal guard
Two platform-security hardenings:

- Matrix: _on_invite now checks the inviter against the existing
  allow-list (_allowed_user_ids / GATEWAY_ALLOW_ALL_USERS) before
  auto-joining. Without this any federated Matrix user could invite
  the bot into arbitrary rooms, exposing its presence and metadata.
  The message and reaction paths already enforce this allow-list; the
  invite path bypassed it.

- Mattermost: _api_get / _api_post / _api_put reject any path
  containing '..'. WebSocket-event values (channel_id, post_id,
  file_id) are interpolated directly into API paths, so a malicious or
  compromised server could craft traversal payloads to make the bot
  issue authenticated requests to arbitrary endpoints with its bearer
  token.

The configurable-E2EE-passphrase change from the original PR is dropped:
the matrix adapter was rewritten onto mautrix and the passphrase-protected
key-export file no longer exists.
2026-06-28 20:47:33 -07:00
Teknium
9cf9d3a28f chore(release): add AUTHOR_MAP entry for PR #53295 salvage 2026-06-28 20:46:44 -07:00
lkevincc
163562bf88 fix: normalize lmstudio base urls 2026-06-28 20:46:44 -07:00
Teknium
43eaf79ae6 chore: remove committed PR infographics and gitignore the path (#54564)
PR infographics are rendered locally and embedded in PR descriptions via
the image-provider (fal.media) URL — they were never meant to live in the
repo. The intended .gitignore enforcement (documented as added back in May
2026) was never actually committed, so 35 PNGs (~54MB) accumulated under
infographic/ via 'docs: add PR infographic for X' commits.

- Remove all 35 tracked infographic/*.png files.
- Add infographic/ to .gitignore so git add on the path is now a no-op.

The PR body remains the archive for these images.
2026-06-28 20:46:35 -07:00
teknium1
14204b0646 test(agent): cover .hermes.md no-git-root cwd-only behavior
Regression tests for the injection fix: outside a git repo only cwd is
checked (planted ancestor .hermes.md is ignored), a cwd-local .hermes.md
is still found, and inside a git repo the parent walk to the git root
still works.
2026-06-28 20:46:32 -07:00
aaronagent
306b6615cf fix(agent): limit .hermes.md parent walk to git repos only
_find_hermes_md walks parent directories looking for .hermes.md/HERMES.md,
stopping at the git root. But when there is no git repo (_find_git_root
returns None), the stop guard never fires and the loop walks all the way
to /. On shared systems (CI runners, multi-tenant servers), a .hermes.md
planted at /tmp, /home, or / would be loaded into the system prompt of any
agent session not inside a git repo — a cross-user prompt-injection vector.

Fix: when there is no git root, only check cwd; do not walk parents.

Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
2026-06-28 20:46:32 -07:00
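The rule described in 306b6615cf (walk parents only inside a git repo, otherwise check cwd alone) can be sketched roughly as follows. This is an illustration under assumptions, not the body of `_find_hermes_md`: the function name `find_context_file`, its signature, and the extra guard for a git root that is not an ancestor of cwd are hypothetical.

```python
from pathlib import Path
from typing import List, Optional

CANDIDATE_NAMES = (".hermes.md", "HERMES.md")


def find_context_file(cwd: Path, git_root: Optional[Path]) -> Optional[Path]:
    """Look in cwd, and walk up to git_root only when cwd is inside that repo."""
    cwd = cwd.resolve()
    root = git_root.resolve() if git_root is not None else None

    search_dirs: List[Path] = [cwd]
    if root is not None and root in cwd.parents:
        # Inside a repo: walk upward and stop at the git root (inclusive).
        for parent in cwd.parents:
            search_dirs.append(parent)
            if parent == root:
                break
    # With no git root, search_dirs stays [cwd]; ancestors are never read.

    for directory in search_dirs:
        for name in CANDIDATE_NAMES:
            candidate = directory / name
            if candidate.is_file():
                return candidate
    return None
```

With this shape, a file planted in a shared ancestor such as `/tmp` is only reachable when that ancestor lies at or below the git root of the session's own repository.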
brooklyn!
4488fe134b Merge pull request #54517 from NousResearch/bb/desktop-multiterminal
feat(desktop): multi-terminal panel with read-only agent terminals
2026-06-28 21:52:14 -05:00
Brooklyn Nicholson
ae465e9fb8 Merge branch 'main' of github.com:NousResearch/hermes-agent into bb/desktop-multiterminal 2026-06-28 21:37:52 -05:00
Brooklyn Nicholson
1a1e00f37e fix(desktop): stop injecting ctrl-l into terminal startup
Remove the prompt-gap cleanup that sent Ctrl-L into the user's shell; it could
render as literal ^L and create the exact top-line gap it was meant to hide.
Keep first-prompt cleanup renderer-side only, and parse short ESC charset
sequences so the initial newline stripper does not disarm early.

Also add a Close all action to the terminal tab context menu.
2026-06-28 21:33:20 -05:00
brooklyn!
83f09f52f9 Merge pull request #54558 from NousResearch/bb/overlay-panels
feat(desktop): shared overlay Panel primitive for cron/profiles/agents
2026-06-28 21:32:14 -05:00
Brooklyn Nicholson
5a2906a11b chore(desktop): keep the diff surgical
Revert the repo-wide prettier churn the earlier fmt pass pulled into files
unrelated to this work; run prettier/eslint scoped to the touched files only.
2026-06-28 21:30:14 -05:00
Brooklyn Nicholson
6776b2f9b5 feat(desktop): live gateway popout + statusbar/command-center polish
- Gateway status popout: flatten the header to stacked connection + inference
  statuses with system-panel and restart actions (reusing the shared
  runGatewayRestart helper). The recent-activity tail is now live while the
  popout is open via the shared LogView (WS connection churn filtered), and the
  icon / "View all logs" link dismiss the popover.
- Statusbar "menu" items accept a menuContent(close) render fn over a now
  controlled DropdownMenu, so popover content can close itself.
- Drop the always-on gateway-log poll from useStatusSnapshot (logs are fetched
  by the popout only while open).
- SearchField → text-xs to match Input/Select (controlVariants).
- Command center: remove the usage/system section dividers, swap the sessions
  nav icon (Pin → MessageCircle), small padding tweaks.
2026-06-28 21:26:15 -05:00
Brooklyn Nicholson
6c52e4a318 fix(desktop): match agent terminal scrollback to user tabs
Keep read-only agent terminal tabs visually and behaviorally aligned with normal
terminal tabs by using the same 1,000-line scrollback cap.
2026-06-28 21:22:17 -05:00
Brooklyn Nicholson
adacb16d62 fix(desktop): make agent terminal tabs fully readable
Register read-only agent terminals with the same renderer-side terminal reader
as user terminals so read_terminal works on whichever tab is active.

Also bring agent xterm rendering closer to user-terminal parity (unicode 11,
web links, font weights/spacing) and make the gateway sink wiring resilient if
only one terminal event sink was already installed.
2026-06-28 21:18:49 -05:00
Ben
dee41d0716 feat(dashboard): catalogue all memory-provider API keys in OPTIONAL_ENV_VARS
The dashboard Keys page and `hermes setup` render API-key rows from
OPTIONAL_ENV_VARS, but only Honcho had an entry — so Hindsight,
Supermemory, Mem0, RetainDB, ByteRover, and OpenViking read their keys
straight from os.environ yet had no place to set them in the GUI.

Add catalog entries (category=tool, password-masked, with get-key URLs
and the tool each powers) for all six, plus the relevant base-URL/endpoint
companions. Pure declaration: the generic GET /api/env endpoint, the
save/reveal write path, and the sandbox env blocklist (which auto-derives
from tool-category OPTIONAL_ENV_VARS) all pick these up with no further
wiring.

Adds a behavior-contract test asserting every memory provider's primary
credential key is catalogued, tool-categorised, and password-masked.
2026-06-28 19:17:02 -07:00
Brooklyn Nicholson
e117cfdff0 feat(desktop): live agent terminals + agent-driven tab close
Make the read-only agent terminal mirrors stream in real time and give
the agent a desktop-only way to dismiss its own tabs.

- Stream background output live: the local reader used a blocking
  read(4096) that buffered small periodic output until EOF, so agent
  tabs only "filled in" at process exit. Switch to buffer.read1(4096)
  (decoded) for incremental chunks.
- Route agent.terminal.output / terminal.close to the window that owns
  the process (its gateway session) instead of an empty session id, so
  events actually reach the desktop renderer.
- Add close_terminal: a HERMES_DESKTOP-gated tool (sibling of
  read_terminal) that drops a process's read-only tab WITHOUT killing it
  via process_registry.on_close; output keeps buffering and the user can
  reopen from the status stack.
- ⌘W now closes a focused agent tab: mark the agent instance
  data-terminal and focus it on activation so isFocusWithin routes there.
- ensureTerminal() no longer spawns an extra user shell when a tab
  already exists (e.g. opening a background task from the status stack).
2026-06-28 21:15:14 -05:00
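The `read(4096)` versus `read1(4096)` difference from e117cfdff0 can be illustrated with a small standalone reader. This is a sketch under assumptions: `stream_output` is a hypothetical name, and the incremental UTF-8 decoder is one way to do the "(decoded)" step, not necessarily how the registry does it.

```python
import codecs
import subprocess
from typing import Iterator, List


def stream_output(argv: List[str], chunk_size: int = 4096) -> Iterator[str]:
    """Yield decoded output chunks from a child process as they arrive."""
    # An incremental decoder copes with multi-byte characters split across chunks.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    with subprocess.Popen(
        argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    ) as proc:
        assert proc.stdout is not None
        while True:
            # read1 returns after at most one underlying read, so small
            # periodic writes surface right away. A blocking read(chunk_size)
            # would wait for chunk_size bytes or EOF.
            chunk = proc.stdout.read1(chunk_size)
            if not chunk:
                break
            text = decoder.decode(chunk)
            if text:
                yield text
        tail = decoder.decode(b"", final=True)
        if tail:
            yield tail
```

Each yielded string could then be handed to whatever sink forwards output to the UI; the loop ends when the child closes its stdout.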
Brooklyn Nicholson
9f02eea1d2 style(desktop): prettier + eslint pass
Repo-wide `npm run fmt` + `eslint --fix`; also drop two unused destructured
params in titlebar-overlay-width.cjs so the lint run is clean.
2026-06-28 21:04:43 -05:00
Teknium
c8fd47be14 docs: add PR infographic for approval mode validation 2026-06-28 19:04:18 -07:00
LIC99
dda3268d09 fix(approvals): warn and default to manual on unknown approvals.mode
_normalize_approval_mode() previously accepted any string, so an unknown
value like 'auto' fell through every downstream mode check (off/smart) and
silently behaved like manual with no signal. Validate against the known
modes (manual/smart/off), emit a warning for anything else, and default to
manual to match the config default and the rest of the function.

Bug 1 from the original PR (/approve & /deny bypassing the running-agent
guard) already landed on main independently, so only the mode-validation
fix is salvaged here.

Fixes #4261

Co-authored-by: Hermes Agent <agent@nousresearch.com>
2026-06-28 19:04:18 -07:00
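A minimal sketch of the validate, warn, and default behaviour described in dda3268d09. The set of modes comes from the commit message; the function body, the handling of `None`, and the warning text are assumptions rather than the actual `_normalize_approval_mode()`.

```python
import logging

logger = logging.getLogger(__name__)

KNOWN_APPROVAL_MODES = ("manual", "smart", "off")


def normalize_approval_mode(value: object) -> str:
    """Return a known approval mode, falling back to 'manual' with a warning."""
    mode = str(value).strip().lower() if value is not None else ""
    if mode in KNOWN_APPROVAL_MODES:
        return mode
    # Unknown values no longer fall through silently.
    logger.warning("Unknown approvals.mode %r; defaulting to 'manual'", value)
    return "manual"
```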
Brooklyn Nicholson
317b94871b chore(desktop): drop dead overlay primitives
Remove zero-consumer overlay code surfaced while auditing the primitive set:
OverlayNewButton (orphaned once "New" moved into PanelAddButton), OverlayCard /
overlayCardClass, and the unused overlay-search-input module. Leaves three
intentional layers: OverlayView (base), Panel (master/detail), and
OverlaySplitLayout (settings/command-center nav→content).
2026-06-28 21:03:04 -05:00
Brooklyn Nicholson
991220747f feat(desktop): unify non-settings overlays under a shared Panel primitive
Extract the agents/trace overlay chrome into overlays/panel.tsx and adopt it
across the Cron, Profiles, and Agents overlays so they share one layout
(centered card, header, master/detail list with built-in search, kebab row
actions, big "+" footer, empty state) instead of three ad-hoc split layouts.

Also in this pass:
- OverlayView insets equidistantly on every side (was top/left-only, which
  left a large left gutter on narrow windows).
- Form-control chrome: input border/background/recessed-inset are now
  per-mode theme-var knobs (--dt-input-border/-bg/-inset) — resting borders
  blend in, strengthen on hover, and go solid on focus / while a Select is open.
- Thread-timeline popover reuses the shared dropdown surface (1:1 with the
  kebab menus) and scrolls the hovered prompt into view.
2026-06-28 20:56:52 -05:00
Teknium
11183e8332 fix(profiles): validate custom alias names to prevent path traversal
`hermes profile alias <profile> --name <custom>` accepted arbitrary
strings and used them verbatim as a filename under ~/.local/bin. Because
normalize_profile_name only lowercases/strips (no regex gate), a value
like `../../.bashrc` escaped the wrapper directory and clobbered
arbitrary user-writable files. remove_wrapper_script had the same sink.

Add validate_alias_name (reusing the profile-id regex, which forbids
`/`, `.`, and `..`) and wire it into check_alias_collision,
create_wrapper_script, remove_wrapper_script, and the CLI alias action so
the rejection surfaces a clear "Invalid alias name" error instead of
silently writing or unlinking outside the wrapper dir.

Co-authored-by: Gutslabs <gutslabsxyz@gmail.com>
Co-authored-by: Xowiek <xowiekk@gmail.com>
2026-06-28 18:53:33 -07:00
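The allow-list approach from 11183e8332 might look roughly like this. The commit does not quote the profile-id regex, so the pattern below is an assumed stand-in that merely shares the stated property of forbidding `/`, `.`, and `..`; `wrapper_path` is a hypothetical helper.

```python
import re
from pathlib import Path

# Assumed stand-in for the profile-id regex: no slashes, no dots.
ALIAS_NAME_RE = re.compile(r"[a-z0-9][a-z0-9_-]{0,63}")


def validate_alias_name(name: str) -> str:
    """Return name unchanged if it is a safe filename, else raise ValueError."""
    if not ALIAS_NAME_RE.fullmatch(name):
        raise ValueError(f"Invalid alias name: {name!r}")
    return name


def wrapper_path(wrapper_dir: Path, name: str) -> Path:
    """Build the wrapper script path only from a validated alias name."""
    return wrapper_dir / validate_alias_name(name)
```

Validating at the point where the path is built means both the create and the remove code paths share one gate.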
aaronagent
27ddd8fd80 fix(gateway): sanitize agent error messages, validate webhook gh args
Two of the three fixes from PR #6660 (the cli.py reopen_session change is
moot — that raw _conn.execute reopen block no longer exists on main).

- gateway/run.py: stop sending raw type(e).__name__ and str(e)[:300] to
  end users on chat platforms. Exception text from LLM providers can leak
  API URLs, file paths, and partial credentials. Return a generic message;
  keep curated status hints for known HTTP codes; full detail stays in logs.
- gateway/platforms/webhook.py: validate pr_number (positive int) and repo
  (owner/name regex) before passing to the 'gh pr comment' subprocess.
  Payload-controlled values could otherwise inject gh flags (--help, a
  different --repo). List-form subprocess means this is arg injection, not
  shell injection, but validation is still correct.

Co-authored-by: aaronagent <1115117931@qq.com>
2026-06-28 18:53:26 -07:00
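The webhook half of 27ddd8fd80 validates payload values before they become `gh` arguments. A sketch of that idea, assuming a helper named `build_gh_comment_args` and an illustrative owner/name pattern (neither is taken from `gateway/platforms/webhook.py`):

```python
import re
from typing import List

# Illustrative owner/name pattern. Neither part may start with "-",
# so a payload value cannot be read as a gh flag.
REPO_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9_.-]*/[A-Za-z0-9_.][A-Za-z0-9_.-]*")


def build_gh_comment_args(repo: object, pr_number: object, body: str) -> List[str]:
    """Return a list-form argv for `gh pr comment`, or raise ValueError."""
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(pr_number, bool) or not isinstance(pr_number, int) or pr_number <= 0:
        raise ValueError(f"invalid pr_number: {pr_number!r}")
    if not isinstance(repo, str) or not REPO_RE.fullmatch(repo):
        raise ValueError(f"invalid repo: {repo!r}")
    return ["gh", "pr", "comment", str(pr_number), "--repo", repo, "--body", body]
```

The returned list is meant for a list-form `subprocess.run(...)` call, which is why the risk here is argument injection rather than shell injection.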
aaronlab
ec148f5d31 fix(agent): guard Anthropic interrupt, cap vision data-URL size
Two independent agent-loop hardening fixes:

- anthropic: when the streaming loop breaks on _interrupt_requested,
  return None instead of calling stream.get_final_message() on the
  partially-drained stream — the SDK may hang draining remaining events
  or return a Message with incomplete tool_use blocks. The outer poll
  loop raises InterruptedError, so the return value is discarded anyway.

- vision: add a 20 MB cap on base64 data-URL payloads before
  base64.b64decode() in _materialize_data_url_for_vision. A 100MB+
  payload creates ~275MB of memory pressure; gateway users sharing the
  process can trivially OOM it. Oversized payloads return ("", None).

The third change from the original PR (streaming tool-name `+=` to
assignment dedup) had already landed independently on main.

Co-authored-by: aaronlab <1115117931@qq.com>
2026-06-28 18:53:20 -07:00
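The vision cap in ec148f5d31 amounts to checking the encoded length before decoding. A rough sketch, assuming the 20 MB limit applies to the base64 text; `decode_data_url` and its `(bytes, mime)` return shape are hypothetical and differ from `_materialize_data_url_for_vision`, which returns `("", None)` for oversized input.

```python
import base64
from typing import Optional, Tuple

MAX_DATA_URL_B64_LEN = 20 * 1024 * 1024  # assumed to count encoded characters


def decode_data_url(
    data_url: str, max_b64_len: int = MAX_DATA_URL_B64_LEN
) -> Tuple[bytes, Optional[str]]:
    """Decode a base64 data URL, refusing oversized payloads before decoding."""
    header, sep, payload = data_url.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        return b"", None
    # Cheap length check first: no decode buffer is allocated for oversized input.
    if len(payload) > max_b64_len:
        return b"", None
    mime = header[len("data:"):-len(";base64")] or None
    try:
        return base64.b64decode(payload, validate=True), mime
    except ValueError:  # binascii.Error is a ValueError subclass
        return b"", None
```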
Teknium
490f215a19 test: cover export-prefix stripping in .env parsers (PR #6659) 2026-06-28 18:53:00 -07:00
aaronagent
5c1ac6c70d fix(config): strip export prefix in .env parsers across three modules
All three .env parsers use `line.partition("=")` without stripping the
bash-compatible `export ` prefix first. A line like `export API_KEY=sk-...`
produces key `"export API_KEY"` instead of `"API_KEY"`, silently ignoring
the variable and causing auth failures for users who copy-paste from
bash profiles or follow tutorials that include `export`.

- tools/skills_tool.py: `load_env()` for skill environment
- hermes_cli/config.py: `load_env()` for core config
- hermes_cli/main.py: `_has_any_provider_configured()` inline parser

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-06-28 18:53:00 -07:00
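The parsing bug in 5c1ac6c70d is easy to reproduce in isolation. The parser below is a simplified sketch, not one of the three `load_env()` implementations; the function name and the quote handling are assumptions.

```python
from typing import Dict


def parse_env_lines(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, tolerating a leading bash-style `export `."""
    env: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Without this step, partition("=") yields the key "export API_KEY".
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        key, sep, value = line.partition("=")
        if not sep:
            continue
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        env[key.strip()] = value
    return env
```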
Teknium
f1cbe4308f fix(gateway): log error-notification failures instead of silently swallowing (#54472)
* fix(gateway): log error-notification failures instead of silently swallowing

The last-resort exception handler in _process_message_background() that
sends an error notice to the user caught all exceptions with a bare pass,
leaving zero trace when the notification itself failed. Upgrade to
logger.error(..., exc_info=True) so a failed error-notification send is
debuggable post-mortem.

Salvaged from #6499 by @BongSuCHOI (the logging-upgrade portion only).

* docs: add PR infographic for gateway error-notify logging
2026-06-28 18:52:51 -07:00
Teknium
3483424aaa fix(security): redact bare-token credentials in URL userinfo (#6396) (#54475)
git remote set-url with an embedded password (https://PASSWORD@github.com)
leaked the credential into agent output — the redaction engine only masked
user:pass@ DB connection strings, never the colon-less bare-token userinfo
form a git remote uses.

Add _URL_BARE_TOKEN_RE: scheme://TOKEN@host for web/transport schemes
(http/https/wss/git/ssh/ftp), 8+ char floor to skip short usernames, token
class forbidding /:@ so an @ in a path/query is never treated as userinfo.

Deliberately scoped to the bare-token form only. The user:pass@ colon form
and query-string tokens stay passing through (#34029, 'pass web URLs through
unchanged') so magic-link / OAuth round-trip skills keep working — a bare
credential in userinfo is never a workflow token (those live in the query
string), so masking it can't break a skill.
2026-06-28 18:52:42 -07:00
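A pattern with the properties listed in 3483424aaa (scheme allow-list, 8+ character floor, token class forbidding `/`, `:`, `@`) could be sketched as below. The regex is an approximation written from the commit message, not the actual `_URL_BARE_TOKEN_RE`, and `redact_bare_tokens` and the `***` mask are hypothetical.

```python
import re

# Approximation: scheme://TOKEN@ where TOKEN is 8+ chars and has no / : @.
URL_BARE_TOKEN_RE = re.compile(
    r"\b(https?|wss|git|ssh|ftp)://([^\s/:@]{8,})@"
)


def redact_bare_tokens(text: str) -> str:
    """Mask a colon-less credential in URL userinfo, leaving other URLs alone."""
    return URL_BARE_TOKEN_RE.sub(r"\1://***@", text)
```

Because the token class excludes `:`, the `user:pass@` form never matches, and because it excludes `/`, an `@` that appears later in a path or query string is not treated as userinfo.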
Teknium
9860d93f2a fix(terminal): require approval for host-bound Docker commands (#54483)
* fix(terminal): require approval for host-bound Docker commands

The Docker terminal backend blanket-skips dangerous-command approval on
the assumption that the container is isolated from the host. That holds
only when nothing is bind-mounted in. Once a host path is exposed (via
TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE or a host-path entry in
TERMINAL_DOCKER_VOLUMES), a command like `rm -rf /workspace` reaches
real host files but is still auto-approved.

Detect host bind mounts and route those sessions through the normal
approval flow. Isolated Docker keeps the fast path. The same gating is
applied to the execute_code guard, which had the identical blanket skip.

Co-authored-by: Hermes Agent <agent@nousresearch.com>

* chore: add AUTHOR_MAP entry for PR #6436 salvage (Kolektori)

* test: accept has_host_access kwarg in _check_all_guards mocks

The host-bound Docker approval fix adds a has_host_access kwarg to the
_check_all_guards wrapper. Six pre-existing tests monkeypatch it with a
fixed (command, env_type) / (cmd, env) lambda signature, which now
raises TypeError when terminal_tool passes the new kwarg. Widen those
mock signatures to accept **kwargs.

---------

Co-authored-by: Kolektori <256073454+Kolektori@users.noreply.github.com>
Co-authored-by: Hermes Agent <agent@nousresearch.com>
2026-06-29 11:35:41 +10:00
Ben Barclay
7cfa2fa13f fix(docker): gate resource limit flags on cgroup controller availability (#54516)
On hosts where the cgroup v2 cpu/memory/pids controllers are not delegated
to the docker/podman process (unprivileged Proxmox LXCs, some rootless and
nested setups), --pids-limit/--cpus/--memory cause every container start to
fail with OCI runtime error / exit 126, breaking terminal + execute_code.

- Add _cgroup_limits_available(image): one-shot, host-wide cached probe that
  spawns a throwaway container from the sandbox image itself (sleep 0) with
  all three flags together, mirroring the existing _storage_opt_supported
  probe-and-degrade pattern.
- Remove --pids-limit from static _BASE_SECURITY_ARGS; apply it (default 256
  via _DEFAULT_PIDS_LIMIT) in resource_args gated on the probe.
- Gate --cpus and --memory on the same probe.

Behavior unchanged on cgroup-capable hosts; graceful degradation with a
one-time warning where controllers aren't delegated.

Fixes #6568.

(cherry picked from commit c933880b7e)

Co-authored-by: angelos <angelos@oikos.lan.home.malaiwah.com>
2026-06-29 11:01:08 +10:00
Brooklyn Nicholson
5d661a3ad7 fix(desktop): show the agent command before terminal output arrives
Seed read-only agent terminal tabs with the background command immediately, so
they never open as a blank pane while stdout is pending or a live stream races
startup. Snapshot fallback now preserves that command header and appends only
missing output without duplicating live chunks.
2026-06-28 19:39:20 -05:00
Brooklyn Nicholson
6ac9ba9fc4 fix(desktop): seed agent terminal tabs from process snapshots
Read-only agent terminal tabs now consume both live agent.terminal.output chunks
and the process-list/status snapshot. The snapshot seeds tabs opened after output
already exists and acts as a fallback if the live stream races startup, so agent
background tabs don't sit blank while the status stack already knows the tail.
2026-06-28 19:35:54 -05:00
Brooklyn Nicholson
520212cc59 feat(desktop): stream agent terminal output live instead of polling
Replace the 5s output_tail poll (which often showed nothing) with a real push
stream. The process registry gains an on_output sink called from its reader
threads with each chunk; the tui_gateway wires it to emit agent.terminal.output
{process_id, chunk} (write_json is _stdout_lock-guarded, so emitting from the
reader thread is safe). The desktop routes chunks by process id straight into
the read-only agent xterm via a small writer registry, with a capped backlog so
a tab opened mid-stream (or reopened) replays what it missed.

Drops the fragile poll/tail path: no session-key matching, no truncation, no
lag — full-fidelity ANSI, env-agnostic (local/docker/ssh).
2026-06-28 19:33:43 -05:00
Brooklyn Nicholson
ad831dd492 feat(desktop): mirror agent background terminals as read-only tabs
When the agent runs terminal(background=true) — Hermes's equivalent of
Cursor's is_background — surface it as a read-only "agent" tab in the rail
(distinct sparkle icon), alongside the glanceable status-stack row, which now
links to the tab. The tab is a write-only xterm (no PTY, no input) fed by the
process output tail, appended live (faster poll while a tab is open) and
env-agnostic (works for local/docker/ssh shells alike).

- terminals.ts: TerminalEntry gains kind ('user'|'agent') + procId; agent tabs
  auto-surface once (closing one doesn't resurrect it) and the status row can
  reopen/focus them. ensureTerminal now guarantees a user shell specifically.
- use-agent-terminal.ts: slim read-only xterm hook, delta-appended.
- workspace: render user vs agent instances; auto-surface from the background
  store; tail faster while an agent tab exists.
- composer-status: $backgroundOutputByProc selector; status row links to the tab
  instead of an inline disclosure.
2026-06-28 19:26:21 -05:00
Brooklyn Nicholson
6e12f8ce4a fix(desktop): force a repaint when a terminal is re-activated
A WebGL terminal doesn't paint while visibility:hidden, so switching to it
(e.g. after closing the active tab) revealed a stale/garbled frame. On
activation, clear the glyph atlas and force a full term.refresh against the
live buffer (after the refit), then focus.
2026-06-28 19:13:58 -05:00
Brooklyn Nicholson
b02f453496 refactor(desktop): generalize focus check to isFocusWithin primitive
Replace the one-off isTerminalFocused with isFocusWithin(selector) in the
keybinds lib (beside isEditableTarget) — the reusable primitive for any
focus-scoped shortcut. The terminal marks itself data-terminal and the ⌘W
handler routes via isFocusWithin('[data-terminal]'); future surfaces just add
their own marker.
2026-06-28 19:11:48 -05:00
Brooklyn Nicholson
2d55ff8fca feat(desktop): ⌘W closes the focused terminal
Fold terminal close into the existing ⌘/Ctrl+W handler so focus decides the
target: a focused terminal takes ⌘W (closes the active tab) and otherwise the
keystroke closes the active preview tab as before. Only the ⌘ gesture is
intercepted — Ctrl+W stays the shell's werase — and a focused terminal never
lets ⌘/Ctrl+W close a preview out from under it.
2026-06-28 19:10:06 -05:00
Brooklyn Nicholson
c1bb34d5e8 fix(desktop): keep inactive terminals sized so switching doesn't garble
Hide inactive terminal tabs with `visibility` (absolute-stacked at full size)
instead of `display:none`. A display:none host is 0×0, so its ResizeObserver
fit bails and the terminal stops tracking pane resizes — re-showing it at a
changed size reflowed the buffer into a garbled prompt. Visibility-hidden
hosts keep their layout size, stay in sync, and switch instantly.
2026-06-28 19:08:25 -05:00
Brooklyn Nicholson
6875d6cd3e feat(desktop): multi-terminal panel with side tab rail
Multiple persistent in-app terminals managed by a thin VS Code-style icon
rail docked on the terminal pane's outer edge. Each tab is its own live
xterm+PTY that survives tab switches, session switches, and hiding the pane
(VS Code parity: only an explicit close or `exit` kills a shell). Terminals
own their state independent of the session — the sole thing they inherit is
an initial cwd snapshotted at creation.

- Rail: icon-only tabs (name + live hotkey on hover), +/hide controls,
  context menu. Sits at z-40 above the collapsed sidebars' hover-reveal
  triggers and marks itself data-suppress-pane-reveal, so reaching for a tab
  can't summon the file-browser/review panel.
- Lifecycle: PersistentTerminal latches mounted on first open so shells stay
  alive while hidden; ensureTerminal re-creates one on reopen.
- Agent reader: id-keyed registry drives read_terminal off the active tab.
- Keybinds (Ctrl-family, OS-aware): toggle Ctrl+`, new Ctrl+Shift+`,
  next/prev Ctrl+Shift+Down/Up, close Ctrl+Shift+W.
2026-06-28 19:06:55 -05:00
brooklyn!
10043c6d0c Merge pull request #54503 from NousResearch/bb/fix-desktop-cross-wired-resume
fix(desktop): restore cross-wired runtime-id guard on session resume
2026-06-28 18:26:01 -05:00
Brooklyn Nicholson
cd5fb760a5 fix(desktop): restore cross-wired runtime-id guard on session resume
resumeSession's warm-cache fast-path once again trusted the
storedSessionId -> runtimeId -> ClientSessionState mapping without
checking the cached state still BELONGS to the session being resumed. A
pooled profile backend that gets idle-reaped and respawned re-mints
runtime ids, so a recycled id resolves to a live-but-DIFFERENT session's
cache entry and paints the wrong transcript under the current route:
click thread A, a totally different thread (often from another worktree)
loads. The session.usage 404 guard only catches a fully-dead id; a
recycled-live id 200s, so the fast-path happily served the stale cache.

Straight regression, not a new bug. f7bf74064 ("reject cross-wired
runtime-id cache on session resume") landed takeWarmCache() + its
regression test; 62af32efe ("keep active sessions aligned with cwd"),
rebased off a stale branch, restructured resumeSession and silently
reverted both 29 minutes later -- the exact stale-branch squash clobber
AGENTS.md warns about ("Squash merges from stale branches silently
revert recent fixes").

Re-apply the whole-class fix on top of the current cwd-aligned code:
takeWarmCache() validates state.storedSessionId === storedSessionId at
BOTH cache reads (the early transcript-keep decision and the fast-path),
purging a cross-wired mapping on a miss so it falls through to a full
resume that rebinds a correct runtime id. Restore the two regression
tests guarding it.

Tests: resumeSession warm-cache mapping integrity -- a cross-wired
mapping is rejected + purged (the bug), a correctly-wired cache is still
served with no needless refetch (no perf regression).

Co-authored-by: professorpalmer <professorpalmer@users.noreply.github.com>
2026-06-28 18:23:09 -05:00
brooklyn!
65d45a0013 Merge pull request #53386 from NousResearch/bb/elevenlabs-voices-401-spam
fix(dashboard): stop ElevenLabs voice-list 401 log spam
2026-06-28 18:10:47 -05:00
Brooklyn Nicholson
f34cf7e3a4 test(gmi): stub profile fetch_models in static-fallback test
The fallback test only mocked fetch_api_models; CI still hit the real GMI
/v1/models endpoint via ProviderProfile.fetch_models and merged live
models into the result.
2026-06-28 18:05:28 -05:00
Brooklyn Nicholson
27f03243a0 fix(dashboard): stop ElevenLabs voice-list 401 log spam
The /api/audio/elevenlabs/voices endpoint logged a WARNING on every
failure, and the desktop re-polls it on each settings open/focus — a
bad/expired/scoped ELEVENLABS_API_KEY floods agent/gui logs with
identical "voice list failed: HTTP Error 401" lines indefinitely.

Treat 401/403 as a persistent "integration unavailable" state: return
{available: false, error: "unauthorized"} with a 200 (the dropdown
already handles available:false) instead of a 502, and collapse repeated
identical failures to a single log line via a small re-arming latch
(logs again on recovery or when the error changes). Non-auth errors keep
the 502 but are throttled the same way.
2026-06-28 17:59:28 -05:00
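The "small re-arming latch" in 27f03243a0 is not shown in the message; one way such a latch could work is sketched here. The class name `RepeatLatch`, its methods, and the reading of "logs again on recovery" as "a success re-arms the latch" are all assumptions.

```python
import logging
from typing import Optional


class RepeatLatch:
    """Collapse runs of identical failures into a single log line."""

    def __init__(self, logger: logging.Logger) -> None:
        self._logger = logger
        self._last_error: Optional[str] = None

    def failure(self, error: str) -> bool:
        """Log the error unless it repeats the previous one. Return True if logged."""
        if error == self._last_error:
            return False
        self._last_error = error
        self._logger.warning("voice list failed: %s", error)
        return True

    def success(self) -> None:
        """Re-arm the latch so the next failure is logged again."""
        self._last_error = None
```

A polling endpoint would call `failure(...)` on each error and `success()` on each good response, so a persistently bad key produces one line instead of one per poll.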
brooklyn!
d0d2cf1c2f Merge pull request #54492 from NousResearch/bb/windows-hide-checkpoint-skills-git
fix(windows): hide console flash on checkpoint git + skills_hub gh probes
2026-06-28 17:49:37 -05:00
Brooklyn Nicholson
cb1bb1a48d refactor(windows): unify windowless spawn form across the touched sites
windows_hide_flags() already returns 0 on POSIX (and creationflags=0 is
the no-op default there, exactly how server.py::_list_repo_files does it),
so drop the IS_WINDOWS import + ternary/one-use-dict gating and just pass
creationflags=windows_hide_flags() directly. Tests lose the now-pointless
IS_WINDOWS monkeypatch.
2026-06-28 17:44:47 -05:00
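The unified spawn form from cb1bb1a48d relies on `creationflags=0` being the no-op default on POSIX. A minimal sketch, assuming `windows_hide_flags()` returns `CREATE_NO_WINDOW` on Windows (the real helper's body is not shown in the commit):

```python
import subprocess
import sys


def windows_hide_flags() -> int:
    """Return CREATE_NO_WINDOW on Windows and 0 elsewhere (sketch of the helper)."""
    if sys.platform == "win32":
        return subprocess.CREATE_NO_WINDOW
    return 0


# No IS_WINDOWS ternary needed at the call site: 0 is accepted on POSIX.
result = subprocess.run(
    [sys.executable, "-c", "print('ok')"],
    capture_output=True,
    text=True,
    creationflags=windows_hide_flags(),
)
```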
Brooklyn Nicholson
ee22d853eb fix(windows): hide pdftoppm console flash on PDF attach
server.py's PDF-attach handler shells out to `pdftoppm` from the
console-less desktop/gateway backend; on Windows that pops a conhost
window each attach. Route it through windows_hide_flags() like the
sibling _list_repo_files git calls (no-op on POSIX).
2026-06-28 17:43:27 -05:00
Brooklyn Nicholson
32087e4bc9 fix(windows): hide console flash on checkpoint git + skills_hub gh probes
The #54236/#54417 backend git/gh sweep routed git_probe, the repo-file
picker, coding_context, context_references, copilot_auth, and the gateway
process scans through CREATE_NO_WINDOW, but two sibling spawn legs that
also run inside the console-less desktop/gateway backend were missed:

- tools/checkpoint_manager.py `_run_git` (and the one-shot `git init
  --bare` in `_init_store`) — when checkpoints are enabled, every
  file-mutating turn fires multiple bare `git` calls (status, add,
  write-tree/commit-tree, update-ref). Spawned from a parent with no
  console (Electron spawns the backend with windowsHide → CREATE_NO_WINDOW),
  each one allocates its own conhost window → a flurry of terminal popups.
- tools/skills_hub.py `GitHubAuth._try_gh_cli` — `gh auth token`, the same
  bug class as the already-fixed copilot_auth gh probe.

Route both through `windows_hide_flags()` (no-op on POSIX), matching the
established per-site pattern. Tests added to
tests/test_windows_subprocess_no_window_flags.py.
2026-06-28 17:41:47 -05:00
Teknium
980622d0ec perf(startup): parse config + plugin manifests with libyaml CSafeLoader (#54486)
The startup config/manifest reads used PyYAML's pure-Python SafeLoader,
which is ~8x slower than the libyaml-backed CSafeLoader C extension.
config.yaml is parsed several times during launch (cli config, raw
config, early interface/redaction bridge, logging config) and every
plugin manifest is parsed once — all on the slow path.

Add utils.fast_safe_load (CSafeLoader-preferring, pure-Python fallback,
true drop-in for safe_load) and route the hot startup parse sites
through it: hermes_cli/config.py (config + manifest reads),
hermes_cli/plugins.py (manifest parse), env_loader, cli.load_cli_config,
hermes_logging, and the two pre-config early YAML bridges in main.py.

Behavior is identical (same restricted safe tag set); only speed changes.
safe_load calls on the startup path drop from ~79 to ~0, cutting the
YAML parse cost from ~0.9s to ~0.15s under profiling.

Adds tests/test_fast_safe_load.py asserting equivalence with safe_load
across input shapes, empty-doc falsiness, C-loader preference, and that
python/object tags are still rejected (safe, not full loader).
2026-06-28 15:38:39 -07:00
Teknium
d65468e7ff fix(security): SSRF guard yuanbao media download_url (#54470)
yuanbao_media.download_url() fetched model-supplied (outbound) and inbound
image/file URLs server-side via httpx with follow_redirects=True and no
SSRF check. A model response containing <img src="http://169.254.169.254/...">
routed through ImageUrlHandler -> download_url and would fetch cloud-metadata
endpoints; same for inbound media.

Add an is_safe_url() pre-flight plus an async redirect event-hook that
re-validates every 30x target, matching the cache_image_from_url() guard in
gateway/platforms/base.py. The other gateway adapters already guard their
URL-fetch paths; this was the remaining unguarded one.
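
A pre-flight check plus per-redirect re-validation can be sketched as follows. This is a simplified, stdlib-only illustration under stated assumptions: the real `is_safe_url()` and the httpx event hook are not shown in this log, the `fetch` callable is hypothetical, and this version does not resolve DNS names, so a hostname pointing at a private address is not caught.

```python
import ipaddress
from urllib.parse import urljoin, urlparse

REDIRECT_CODES = (301, 302, 303, 307, 308)


def is_safe_url(url: str) -> bool:
    """Reject non-HTTP(S) URLs and literal IP hosts that are not globally routable."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        ip = ipaddress.ip_address(parsed.hostname)
    except ValueError:
        # Not an IP literal. A production guard would resolve the name and
        # check every resulting address; this sketch skips that step.
        return True
    return ip.is_global


def fetch_with_redirect_checks(url, fetch, max_hops=5):
    """Follow redirects by hand, re-validating every hop before fetching it.

    `fetch(url)` is a hypothetical callable returning (status, location, body).
    """
    for _ in range(max_hops + 1):
        if not is_safe_url(url):
            raise ValueError(f"blocked unsafe URL: {url}")
        status, location, body = fetch(url)
        if status in REDIRECT_CODES and location:
            url = urljoin(url, location)  # resolve relative Location headers
            continue
        return body
    raise ValueError("too many redirects")
```

Validating only the initial URL is not enough, because a public host can redirect to `http://169.254.169.254/...`; checking each hop closes that gap.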
2026-06-28 15:29:59 -07:00
brooklyn!
16ff1a3b93 Merge pull request #54457 from NousResearch/bb/windows-console-launcher-repair
fix(windows): repair missing console script launchers
2026-06-28 17:15:56 -05:00
Hermes Agent
c8b86963d0 docs: add PR infographic for anthropic stale base_url guard 2026-06-28 15:12:03 -07:00
奥森木
e7d4ade8cf fix(anthropic): ignore stale non-Anthropic base_url across all resolution paths
A config left with `provider: anthropic` but a leftover
`base_url: https://openrouter.ai/api/v1` (e.g. after a provider switch)
would route Anthropic OAuth/setup-token traffic to OpenRouter and 404.

Add `_anthropic_base_url_override_ok()` and gate the three native-Anthropic
resolution branches (pool, explicit, native) on it. The guard honors a
configured `model.base_url` only when it plausibly speaks the Anthropic
Messages protocol — official `*.anthropic.com` / `*.claude.com` hosts, Azure
Foundry endpoints, and `/anthropic`-suffixed or Kimi `/coding` proxies — and
falls back to `https://api.anthropic.com` otherwise. Aggregator URLs like
openrouter.ai / api.openai.com are treated as stale.

Reconstructed from @clovericbot's PR #3661 onto current main: the original
patched one branch with an anthropic-only allow-list, which would have broken
Azure-via-anthropic; widened to all three sites and made Azure/proxy-safe.
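
The shape of such a guard might look like the sketch below. It is illustrative only: the function names are hypothetical, and it covers just the official-host and `/anthropic`-suffix cases named above. The Azure Foundry and Kimi `/coding` rules are omitted because their exact patterns are not given in this log.

```python
from urllib.parse import urlparse

DEFAULT_ANTHROPIC_BASE_URL = "https://api.anthropic.com"
OFFICIAL_ROOTS = ("anthropic.com", "claude.com")


def base_url_override_ok(base_url: str) -> bool:
    """True when the URL plausibly speaks the Anthropic Messages protocol."""
    parsed = urlparse(base_url)
    host = (parsed.hostname or "").lower()
    # Official hosts: the root domain itself or any subdomain of it.
    if any(host == root or host.endswith("." + root) for root in OFFICIAL_ROOTS):
        return True
    # Proxies that expose an Anthropic-compatible route under /anthropic.
    return parsed.path.rstrip("/").endswith("/anthropic")


def effective_base_url(configured):
    """Honor the configured URL only when it passes the guard."""
    if configured and base_url_override_ok(configured):
        return configured
    return DEFAULT_ANTHROPIC_BASE_URL
```

With this shape, a leftover `https://openrouter.ai/api/v1` falls back to the default instead of receiving Anthropic traffic.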
2026-06-28 15:12:03 -07:00
Teknium
95f2919f91 perf(startup): lazy-load gateway platform adapters (#54448)
Bundled platform plugins (telegram, discord, feishu, teams, ...) were
eagerly imported at plugin-discovery time on every `hermes` invocation,
including plain `hermes chat` which never touches a gateway platform.
Their modules import heavy platform SDKs at module level (lark_oapi,
microsoft_teams, discord.py, slack_bolt, ...) — feishu alone pulled in
lark_oapi (~2.6s), teams pulled microsoft_teams (~1.9s).

Discovery now registers a cheap deferred loader per platform in the
platform_registry; the adapter module is imported only when the gateway
/ cron / setup / send_message path actually asks for that platform.
is_registered() and the iterate-all accessors stay correct (deferred
counts as registered; plugin_entries()/all_entries() materialize all
deferred loaders, since those paths genuinely need every adapter).

Cold start: ~4.4s -> ~2.45s to banner. discover_and_load: 2.0s -> 0.3s
(warm), and the heavy SDKs are no longer imported at all in CLI mode.
Every shipped platform remains available out of the box — it just loads
on first use.
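
The deferred-loader idea can be illustrated with a small registry. This is a sketch under assumptions: the class and method names other than `is_registered()` and `all_entries()` are hypothetical, and the real `platform_registry` may be structured differently.

```python
from typing import Callable, Dict


class PlatformRegistry:
    """Registry that stores cheap loaders and imports adapters on first use."""

    def __init__(self) -> None:
        self._loaded: Dict[str, object] = {}
        self._deferred: Dict[str, Callable[[], object]] = {}

    def register_deferred(self, name: str, loader: Callable[[], object]) -> None:
        # Discovery only records the loader; no heavy SDK import happens here.
        self._deferred[name] = loader

    def is_registered(self, name: str) -> bool:
        # A deferred platform counts as registered.
        return name in self._loaded or name in self._deferred

    def get(self, name: str) -> object:
        # Materialize on first request; raises KeyError for unknown platforms.
        if name not in self._loaded:
            loader = self._deferred.pop(name)
            self._loaded[name] = loader()
        return self._loaded[name]

    def all_entries(self) -> Dict[str, object]:
        # Paths that need every adapter force all deferred loaders to run.
        for name in list(self._deferred):
            self.get(name)
        return dict(self._loaded)
```

A loader would typically be something like `lambda: importlib.import_module("some_adapter_module")`, so the import cost is paid only by the code path that asks for that platform.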
2026-06-28 15:11:59 -07:00
Mibayy
b0b7ff0d75 fix(provider): auto+base_url bypasses cloud API when custom endpoint configured (#3846)
When config.yaml has `provider: auto` and a non-cloud `base_url` (e.g. Ollama
at localhost:11434), requests were silently sent to https://api.anthropic.com
whenever ANTHROPIC_API_KEY was present in the environment, ignoring the
configured local endpoint and returning HTTP 401 / "credit balance too low".

Root cause: resolve_provider("auto") scans env vars and returns "anthropic"
when ANTHROPIC_API_KEY is set, before config.model.base_url is ever consulted.

In resolve_runtime_provider(), before calling resolve_provider(), short-circuit
to the OpenAI-compatible resolver when no explicit creds were passed, provider
is "auto"/unset, and a non-cloud base_url is configured. Well-known cloud roots
(openrouter.ai, anthropic.com, openai.com) are matched on HOST (not substring)
so look-alike hosts can't evade the bypass and leak a cloud credential.
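
The difference between host matching and substring matching can be shown in a few lines. This is a minimal sketch: the function names and the `explicit_creds` flag are hypothetical, and only the three cloud roots named above are assumed.

```python
from urllib.parse import urlparse

CLOUD_ROOTS = ("openrouter.ai", "anthropic.com", "openai.com")


def is_cloud_base_url(base_url: str) -> bool:
    """Match the parsed host against known cloud roots (exact or subdomain)."""
    host = (urlparse(base_url).hostname or "").lower()
    return any(host == root or host.endswith("." + root) for root in CLOUD_ROOTS)


def should_bypass_to_custom_endpoint(provider, base_url, explicit_creds=False):
    """Short-circuit to the custom endpoint only for auto/unset providers."""
    if explicit_creds or provider not in (None, "", "auto") or not base_url:
        return False
    return not is_cloud_base_url(base_url)
```

A substring test such as `"openai.com" in base_url` would classify `https://openai.com.evil.example/v1` as a cloud endpoint; comparing the parsed host does not.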

Co-authored-by: Hermes Agent <hermes@nousresearch.com>
2026-06-28 15:11:55 -07:00
Teknium
86e64900b9 fix(gateway): preserve sessions across restarts (#54442) 2026-06-28 15:10:39 -07:00
Teknium
4c2961c511 fix(curator): never archive cron-referenced skills + floor use=0 pruning (#54443)
The curator's inactivity prune archived any non-pinned agent-created
skill whose activity was older than archive_after_days (90d). A skill
loaded only by a cron job had its usage bumped solely when the job
fired, so paused jobs, infrequent (quarterly/annual) schedules, and
far-future one-shots aged their skills out from under them — the next
run then failed to load the now-archived skill.

- cron/jobs.py: add referenced_skill_names() returning skills used by
  ANY job (incl. paused/disabled).
- curator.apply_automatic_transitions(): skip cron-referenced skills
  like pinned; add a use=0 grace floor so a never-used skill is not
  marked stale/archived until it is at least stale_after_days old.
- LLM review pass: candidate list marks cron=yes; prompt forbids
  pruning cron-referenced skills and never-used skills under 30 days.

Tested E2E against a real cron job + real usage records and with 4 new
unit tests.
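
The pruning rules above can be condensed into one predicate. This is a sketch under assumptions: the skill record's keys are hypothetical, and the 30-day `stale_after_days` default is assumed for illustration (only the 90-day `archive_after_days` value is stated in this log).

```python
from datetime import datetime, timedelta


def should_archive(skill, now, cron_referenced,
                   archive_after_days=90, stale_after_days=30):
    """Decide whether the inactivity prune may archive `skill`.

    `skill` is a dict with hypothetical keys: name, pinned, use_count,
    created_at, last_used_at (None when never used).
    """
    # Pinned and cron-referenced skills are never archived automatically.
    if skill["pinned"] or skill["name"] in cron_referenced:
        return False
    # Grace floor: a never-used skill must reach stale_after_days of age first.
    if skill["use_count"] == 0 and now - skill["created_at"] < timedelta(days=stale_after_days):
        return False
    last_activity = skill["last_used_at"] or skill["created_at"]
    return now - last_activity > timedelta(days=archive_after_days)
```

Here `cron_referenced` stands in for the set returned by `referenced_skill_names()`, which includes skills used by paused or disabled jobs.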
2026-06-28 15:10:21 -07:00
Gille
df8e2523fa fix(windows): verify launchers after primary install 2026-06-28 17:02:05 -05:00
HexLab98
76bb8f46a0 test(cli): cover Windows console script repair (#52931)
Add unit tests for missing-shim detection and repair trigger in
_verify_console_scripts_installed.
2026-06-28 17:01:31 -05:00
HexLab98
95994bbc56 fix(windows): repair missing hermes.exe after pip install (#52931)
On Windows, uv pip install -e . can register hermes.exe in package metadata
while the launcher never lands on disk. Detect missing [project.scripts]
shims and reinstall entry points under the existing quarantine path in
hermes update and install.ps1.
2026-06-28 17:01:31 -05:00
brooklyn!
28097d9cd9 Merge pull request #54385 from NousResearch/bb/project-folder-picker-remote
feat(desktop): remote-gateway-aware folder picker + git cockpit (status, review, worktrees)
2026-06-28 16:35:57 -05:00
Teknium
e5d22ab80d fix(daytona): quote single-upload mkdir parent path (#54440)
* fix(daytona): quote single-upload mkdir parent path

The single-file _daytona_upload() path shelled out 'mkdir -p {parent}'
with the remote parent interpolated unquoted, so shell metacharacters in
the path could break the command or inject arbitrary commands into the
sandbox. The bulk-upload, bulk-download, and delete paths were already
hardened with shlex-quoting helpers; this single-upload path was missed.

Route it through the existing quoted_mkdir_command() helper and add a
regression test covering a path with shell metacharacters.

Reported by @Gutslabs (#3960); the original branch predated the
file_sync refactor, so the fix is re-applied to the current code path.
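
The quoting fix amounts to passing the parent path through `shlex.quote` before interpolation. A minimal sketch, assuming a hypothetical helper name and signature (the real `quoted_mkdir_command()` is not shown in this log):

```python
import posixpath
import shlex


def mkdir_parent_command(remote_path: str) -> str:
    """Build a `mkdir -p` command for the parent of `remote_path`, shell-quoted."""
    parent = posixpath.dirname(remote_path)
    # shlex.quote wraps the path so the shell treats it as a single literal
    # argument, even when it contains spaces, semicolons, or other metacharacters.
    return f"mkdir -p {shlex.quote(parent)}"
```

For a path like `/work/a b; rm -rf ~/x.txt`, the unquoted form would be parsed by the shell as two commands; the quoted form stays a single `mkdir` with one argument.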

* docs(infographic): daytona quote-sync fix
2026-06-28 14:33:03 -07:00
Brooklyn Nicholson
f9b469d7de test(web_git): assert default branch invariant, not hardcoded main
CI git init defaults to master on some runners; compare branch to
defaultBranch instead of pinning a branch name.
2026-06-28 16:29:52 -05:00
teknium1
c648ecdca5 fix(telegram): reject unauthorized users before event construction (#40863)
Removed/unauthorized Telegram users could inject prompt content before the
per-user auth gate fired. The adapter ran `_should_process_message`,
`_build_message_event`, and text/photo batching — and dispatched to the
runner — before `_is_user_authorized()` (gateway/authz_mixin.py) rejected
the sender. Unmentioned group chatter from a removed user was also
persisted into the session transcript via `_observe_unmentioned_group_message`,
leaking into the agent's observed context independent of dispatch.

Add `_is_user_authorized_from_message()` as an intake prefilter that runs
in `_handle_text_message`, `_handle_command`, `_handle_location_message`,
and `_handle_media_message` BEFORE batching, event construction, and the
unmentioned-group observe branch. It reuses the runner's
`_is_user_authorized()` with a correctly-shaped SessionSource (group vs
forum vs dm, real chat_id for TELEGRAM_GROUP_ALLOWED_* allowlists),
falls back to env allowlists, and only rejects when an allowlist actually
exists — unknown DMs with no allowlist still reach the pairing flow.
Channel posts authorize via `sender_chat` identity when `from_user` is
absent.

Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>
Co-authored-by: Carlos Manuel Cejas <carlosmcejas@gmail.com>
2026-06-28 14:25:15 -07:00
srojk34
61210097a5 fix(browser): extend private-network guard to browser_get_images
The SSRF cluster (7a6fe9bb, 48f5c425, 7ef04ae7) sealed
browser_snapshot, browser_vision, and _browser_eval against
eval-navigated private pages, but browser_get_images bypasses
_browser_eval and calls _run_browser_command("eval", ...) directly.
An eval-driven navigation to a private address followed by
browser_get_images would leak image src URLs and alt text from the
private page.

Add the same _eval_ssrf_guard_active + _current_page_private_url
recheck before returning image data, matching the pattern established
by the sibling guards.

5 new tests cover: block on private page, allow on public page, skip
for local backend, skip when private URLs allowed, no guard needed on
failed eval.
2026-06-28 14:25:10 -07:00
Brooklyn Nicholson
c7542358f2 fix(desktop): remote project picker UX and profile-scoped fs/git routing
Route FS/git REST through the active profile, mount the remote folder picker
at app root, keep the project dialog open while picking, show a first-run
blank state, flip into grouped view on create, and constrain the picker scroll
area so Select stays reachable.
2026-06-28 16:23:39 -05:00
Teknium
9a0010fd46 fix(windows): cover remaining console-flash spawn legs (#54417) 2026-06-28 13:49:08 -07:00
Teknium
b31b0b9d95 docs: reconcile docs with code across last 3 releases (#54254)
Audited the last 3 releases (v2026.5.28..main) against the docs site and
fixed code-vs-docs drift:

- slash-commands: add /moa, /prompt, /pet, /hatch, /timestamps
- cli-commands: add hermes pets / project / desktop / whatsapp-cloud +
  dashboard register; correct --insecure (now a deprecated no-op);
  add gateway migrate-legacy + enroll --wake-url + dashboard --skip-build
- environment-variables: document the remaining ~48 env vars (SimpleX,
  Photon, Teams adapter, per-platform *_ALLOW_ALL_USERS, home-channel vars,
  IRC, Brave/Krea/Notion/Linear/Airtable/Tenor keys, QQ_SANDBOX) — full
  OPTIONAL_ENV_VARS (265) now covered
- configuration: document tool_loop_guardrails, goals, prompt_caching,
  network, onboarding, dashboard config blocks
- toolsets/tools-reference + tools.md: add coding/project toolsets and
  read_terminal/project_* tools; remove the stale messaging toolset and
  send_message agent tool (removed in #47856); drop stale RL-training prose
- messaging: new IRC channel page (adapter shipped without docs) + index
  row + sidebar + env vars
- pets: document the /hatch AI generation pipeline + Nous/OpenRouter image
  backend
- web-dashboard: document the bearer-token / TokenPrincipal service auth path
- purge agent-callable send_message references across guides/features and
  the research-paper-writing skill (tool removed in #47856)

Verified: docusaurus build succeeds; all authored internal links resolve.
2026-06-28 12:47:50 -07:00
Brooklyn Nicholson
19bae1b9e0 test(desktop): assert new backend sessions carry workspace cwd
Pin the desktop-to-gateway cwd handoff: createBackendSessionForSend must pass
the current workspace cwd into session.create so the backend registers the
session cwd before the agent/tools run.
2026-06-28 14:44:28 -05:00
Brooklyn Nicholson
8d8c7111d9 refactor(desktop): keep remote fs routing inside the fs facade
Let UI callers ask for folders/files without knowing remote-picker limits:
selectDesktopPaths now normalizes remote directory selection to a single folder
inside the facade. Project creation and composer context picking no longer branch
on remote mode; they route through desktop-fs helpers just like git callers route
through desktopGit(). Behavior unchanged except remote folder context now works
through the same backend picker path.
2026-06-28 14:39:33 -05:00
Brooklyn Nicholson
453f134b3b refactor(desktop): centralize remote git REST routing
Keep the remote git mirror as a thin facade: route all GETs through gitGet,
all mutations through gitPost, and keep consumers on desktopGit(). On the
backend, route git paths through a single _git_path helper instead of repeating
str(_fs_path(...)) in every endpoint. Behavior unchanged.
2026-06-28 14:37:36 -05:00
Brooklyn Nicholson
4e9439cc3b fix(desktop): route composer context picking through remote-aware fs
Second pass on the remote-project flow: the project dialog and git cockpit were
remote-aware, but the composer's Add file/folder context picker still called the
native Electron picker directly. Route it through selectDesktopPaths so remote
sessions use the backend-aware picker instead of local disk paths; preserve local
multi-select behavior and keep remote folder selection single because the in-app
remote picker only supports one directory.

Also use readDesktopFileDataUrl for image previews so an already-known backend
image path can be read through /api/fs/read-data-url, and add focused coverage
for backend file-diff routing plus the plain-folder git init/worktree path.
2026-06-28 14:35:23 -05:00
Brooklyn Nicholson
9b71221187 fix(desktop): write project IDEA.md through the remote-aware fs path
writeProjectIdea used the local-only Electron writeTextFile, so on a remote
gateway IDEA.md never landed on the backend (where the project folder lives).
Route it through writeDesktopFileText (local Electron / POST /api/fs/write-text).
2026-06-28 14:31:58 -05:00
Brooklyn Nicholson
e4cf3a2e9d refactor(web_git): unify porcelain-v2 parsing into one walker
Collapse the two near-duplicate status parsers (_parse_status_v2 +
_iter_status_entries) into a single _walk_entries generator feeding the rail,
review list, and commit flow; share the staged predicate; hoist `import re`.
Behavior unchanged.
2026-06-28 14:29:59 -05:00
Brooklyn Nicholson
fc86e35764 feat(desktop): make the git cockpit work over a remote gateway
After the folder picker fix, an added remote folder was still half-usable:
the desktop's git GUI (coding-rail status, worktree lanes, review pane,
branch switch, file diff) all ran Electron-local git on the USER's machine,
so against a remote-gateway repo they silently degraded to empty.

Mirror the whole surface over the dashboard REST API so it acts on the
BACKEND repo where sessions actually run:

- hermes_cli/web_git.py: git/gh logic (status, worktrees, branches, review
  list/diff/stage/unstage/revert/commit/commit-context/push/ship-info/
  create-pr, file-diff, worktree add/remove, branch switch) shelling to the
  system git, mirroring the Electron ops' shapes.
- web_server.py: /api/git/* routes (same auth gate + _fs_path hardening as
  /api/fs, executor-offloaded, mutations -> 400).
- apps/desktop desktop-git.ts: remote-aware facade exposing the same shape as
  window.hermesDesktop.git; coding-status / review / projects / model /
  desktop-fs route through desktopGit() so local stays Electron, remote hits
  /api/git/*.

Tests: tests/hermes_cli/test_web_server_git.py (real repo: status counts,
review classification, diff incl. untracked all-add, stage+commit roundtrip,
worktree/branch lifecycle, commit-context, gh-absent ship-info, auth) and
desktop-git.test.ts (local vs remote routing, envelope unwrap, POST bodies).
2026-06-28 14:26:09 -05:00
Brooklyn Nicholson
304f0650c4 style(desktop): tighten pickProjectFolder comment 2026-06-28 14:13:36 -05:00
Brooklyn Nicholson
4526fccdbe fix(desktop): make project "Add folder" picker remote-gateway aware
The new-project / add-folder dialog (PR #49037) picked folders via the
native Electron dialog (pickDefaultProjectDir), which only browses the
LOCAL machine. On a remote gateway that picks a path that doesn't exist
on the backend where sessions actually run.

Route pickProjectFolder() through selectDesktopPaths({directories,
multiple:false}) — the same remote-aware path the retired right-sidebar
picker used: local mode opens the native directory dialog, remote mode
browses the backend filesystem via the in-app RemoteFolderPicker. Seed
it with the backend's default cwd on remote so it opens somewhere useful.
2026-06-28 13:49:45 -05:00
brooklyn!
b699d27a4a Merge pull request #54357 from NousResearch/bb/browser-chromium-autoinstall
feat(browser): auto-install Chromium binary on local cold-start failure
2026-06-28 12:36:22 -05:00
brooklyn!
27868e5b55 Merge pull request #54353 from NousResearch/bb/browser-first-open-timeout
fix(browser): extend first-open timeout & surface daemon errors on Linux (salvage #52575)
2026-06-28 12:32:41 -05:00
Brooklyn Nicholson
70292596ef feat(browser): auto-install Chromium binary on local cold-start failure
When a local browser_navigate (or any browser command) fails fast because
Chromium isn't on disk, attempt a one-shot binary download via
`agent-browser install` and retry instead of only printing a hint.

Scope is narrow on purpose:
- binary only, never `--with-deps` (that shells apt/needs root, so missing
  system libraries stay a user action)
- gated by `security.allow_lazy_installs` (same opt-out as every lazy install)
- skipped in Docker (Chromium ships in the image)
- attempted once per process

Follow-up to #54353, which made the cold-start failure legible; this closes
the "doesn't actually install the missing browser" gap for the common case.
2026-06-28 12:25:15 -05:00
Brooklyn Nicholson
1ab5c3cdda refactor(browser): drop redundant sandbox-hint substring check 2026-06-28 12:14:47 -05:00
infinitycrew39
7bb8aa3bd5 test(browser): cover open timeout diagnostics and failed navigate title
Add regression tests for open-command timeout floors, sandbox bypass,
stderr capture formatting, first-navigation timeout wiring, and desktop
failed-navigate labeling.
2026-06-28 12:14:21 -05:00
infinitycrew39
a10727a555 fix(browser): extend first-open timeout and surface daemon errors
Local browser_navigate cold-starts the agent-browser daemon and Chromium;
60s was too short on slow Linux hosts and timeouts discarded stderr,
leaving users with a generic failure. Use a 120s floor on first open,
inject --no-sandbox in Docker, include captured daemon output plus install
hints when commands time out, and show "Failed to open" in the desktop
tool chip when navigation returns success=false.
2026-06-28 12:14:21 -05:00
brooklyn!
23021be26e Merge pull request #52656 from helix4u/fix-desktop-empty-resume-view
fix(desktop): retry empty resumed transcripts
2026-06-28 11:57:57 -05:00
ygd58
3e16176ba4 fix(tools): reconcile agent.disabled_toolsets when a toolset is enabled
_get_platform_tools() applies agent.disabled_toolsets as a final
override AFTER reading platform_toolsets.<platform>, so a toolset
listed there stays permanently OFF no matter what the toggle write
path saves. Blank Slate installs pre-populate this list with ~27
toolsets, making most of the desktop Toolsets UI un-enableable
(issue #49995).

Fix: _save_platform_tools() now removes any toolset the user just
explicitly enabled FOR THIS PLATFORM from agent.disabled_toolsets.
Toolsets the user did not touch, or that remain disabled on other
platforms, are left alone -- disabled_toolsets keeps working as a
cross-platform suppression list for anything not actively re-enabled.
Disabling a toolset (unchecking it) does not touch disabled_toolsets
at all -- only enables reconcile it.

Verified end-to-end with the exact repro from the issue: Blank Slate
config (disabled_toolsets=['todo','memory','browser'], cli=['file',
'terminal']) -> enable 'todo' via the toggle -> _get_platform_tools()
now resolves 'todo' as enabled while 'memory'/'browser' (untouched)
remain disabled.
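
The reconcile step and the repro above can be expressed as two small functions. This is a sketch with hypothetical names and plain lists; the real `_save_platform_tools()` and `_get_platform_tools()` operate on the config file and are not reproduced here.

```python
def reconcile_disabled_toolsets(disabled, previously_enabled, now_enabled):
    """Drop from `disabled` only the toolsets the user just enabled."""
    newly_enabled = set(now_enabled) - set(previously_enabled)
    # Untouched entries stay in the suppression list, in their original order.
    return [name for name in disabled if name not in newly_enabled]


def resolve_platform_tools(platform_toolsets, disabled):
    """Apply the disabled list as a final override on the platform's toolsets."""
    blocked = set(disabled)
    return [name for name in platform_toolsets if name not in blocked]


# Repro from the issue: Blank Slate config, then enable 'todo' for the CLI.
disabled = ["todo", "memory", "browser"]
cli_before = ["file", "terminal"]
cli_after = ["file", "terminal", "todo"]

disabled = reconcile_disabled_toolsets(disabled, cli_before, cli_after)
resolved = resolve_platform_tools(cli_after, disabled)
```

After the reconcile, `disabled` holds only `memory` and `browser`, so `todo` survives the final override.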

Added 4 regression tests. Full tools_config suite: 101 passed
(97 existing + 4 new), no regressions.

Fixes #49995
2026-06-28 21:59:03 +05:30
brooklyn!
020966574d Merge pull request #53892 from NousResearch/bb/windows-popup-spawn-legs 2026-06-28 11:16:35 -05:00
Brooklyn Nicholson
eeca59f489 fix(windows): hide remaining backend console-flash legs missed on main
main (cb982ad99) wired windows_hide_flags() into the auxiliary git/gh/wmic/
bash/powershell/taskkill legs but left two it didn't reach, plus the Electron
backend-launch leg it explicitly deferred. Cover them the same way:

- apps/desktop/electron/main.cjs: getNoConsoleVenvPython resolves the BASE
  pythonw.exe instead of the venv Scripts\pythonw.exe shim, which re-execs a
  console python.exe and flashes a conhost the desktop backend can't suppress.
  Both backend creators put the venv site-packages on PYTHONPATH so imports
  still resolve under the base interpreter. (main's commit said this Electron
  leg "needs a Windows-tested change of its own".)
- tools/tts_tool.py, tools/transcription_tools.py, plugins/platforms/discord:
  ffmpeg conversions (voice notes / TTS / STT) via windows_hide_flags().
- plugins/platforms/whatsapp: netstat + taskkill bridge-port cleanup via
  windows_hide_flags().

All no-ops on POSIX. Tests assert the base-pythonw preference and the ffmpeg
legs pass CREATE_NO_WINDOW.
2026-06-28 10:19:21 -05:00
Teknium
0c2e6c0049 test: make active session cross-process race deterministic (#54248) 2026-06-28 05:49:21 -07:00
teknium1
1ffa01f35f test(windows): cover no-window backend subprocess flags 2026-06-28 05:28:45 -07:00
Teknium
cb982ad997 fix(windows): hide console-window flash on backend git/gh/wmic/bash subprocess spawns
The Windows desktop GUI runs its backend headless via pythonw.exe. Several
auxiliary subprocess sites that run inside that windowless backend spawned
console-subsystem children (git, gh, wmic, powershell, bash, rg, taskkill)
WITHOUT CREATE_NO_WINDOW, so Windows allocated a fresh conhost per call and
flashed a black window on screen — sometimes continuously (the dashboard
Projects-tree git probe alone fired ~118 spawns in 60s on startup).

The terminal tool, cron, browser, code_execution, and gateway-spawn paths
already carry windows_hide_flags(); these auxiliary probe/scan/launcher legs
were missed. Wire the existing helper into them:

- tui_gateway/git_probe.py: run_git (+ encoding=utf-8/errors=replace, fixes the
  cp950 UnicodeDecodeError on CJK paths from the same site)
- agent/coding_context.py: _git (per-turn git status/log/diff)
- agent/context_references.py: _run_git + _rg_files (@file/@ref resolution)
- hermes_cli/copilot_auth.py: gh auth token probe (auxiliary provider:auto)
- hermes_cli/gateway.py: wmic + PowerShell Get-CimInstance PID scan
- hermes_cli/main.py: wmic stale-dashboard PID scan
- gateway/status.py: taskkill /T /F force-kill

windows_hide_flags() returns 0 on POSIX, so every changed call is a no-op on
Linux/macOS (verified: real git/rg probes still work; Windows-simulated calls
all pass creationflags=CREATE_NO_WINDOW).

Scoped to the windowless-backend paths that cause the reported flashing. The
Electron updater-handoff leg (main.cjs windowsHide:false) and the
interactive-CLI banner probes (cli.py) are intentionally NOT touched here —
the former needs a Windows-tested change of its own, the latter runs in a
visible console anyway.

Tracking: #54220
Refs: #53178 #53631 #53781 #53957 #49602 #52982 #53424 #53053 #53016
2026-06-28 05:28:45 -07:00
teknium1
f25f235722 chore: map salvaged PR #49845 author email for AUTHOR_MAP 2026-06-28 04:47:39 -07:00
homelab-ha-agent
d05cc8f4d6 fix(mcp): skip preflight content-type probe for OAuth servers
OAuth-protected MCP servers (e.g. Hospitable) return 200 text/html on an
unauthenticated HEAD probe: a login/landing page, since the server cannot
return a real MCP response without a Bearer token. The preflight cannot
distinguish this from a misconfigured URL, so it raises NonMcpEndpointError
before the OAuth browser flow has a chance to run.

Add `and self._auth_type != "oauth"` to the preflight condition in
MCPServerTask.run().  The probe is inapplicable to OAuth servers: their URL
legitimacy is established by .well-known/oauth-protected-resource during the
OAuth handshake, not by a GET content-type check.

Concrete repro: Hospitable (https://mcp.hospitable.com/mcp) returns
`200 text/html` to an unauthenticated httpx HEAD.  Without the guard:
  ✗ NonMcpEndpointError at `hermes mcp test`
With the guard:
  ✓ Connected (1487ms) — 63 tools discovered

Relation to open PRs:
- #37598 adds a POST probe fallback for POST-only non-OAuth servers (e.g.
  DocuSeal), but only passes when POST returns 2xx + MCP content-type.
  Hospitable returns 401 on the POST probe (Bearer challenge), so #37598
  does not cover this case.
- #49463 extends the POST probe to also pass on non-2xx auth challenges
  (making it OAuth-aware), but is labeled duplicate of #37598 and may not
  land independently.
This fix is complementary: it handles OAuth servers with zero extra
round-trips rather than adding a POST probe step.

Tests:
- test_oauth_server_html_response_raises_without_skip: documents that
  _preflight_content_type raises NonMcpEndpointError for 200 text/html
  (the underlying issue), with an OAuth-server docstring.
- test_run_skips_preflight_for_oauth: verifies that run() does NOT invoke
  _preflight_content_type when auth_type=="oauth", using class-level
  monkeypatching so the gate is exercised without a live MCP transport.

23 passed  tests/tools/test_mcp_preflight_content_type.py
2026-06-28 04:47:39 -07:00
liuhao1024
9d919daf44 fix(gateway): mark platform lock failure as retryable instead of permanently fatal
When a stale lock file survives a gateway crash, `acquire_scoped_lock()`
may return `(False, existing_dict)` even after detecting the stale lock and
trying to delete it (e.g. if the unlink fails or a race condition occurs).

Previously, `_acquire_platform_lock()` called
`_set_fatal_error(..., retryable=False)`, which permanently killed the
platform — the reconnect watcher never retries a non-retryable fatal
error.

Change to `retryable=True` so the platform enters the "retrying"
state and the reconnect watcher can attempt acquisition again after the
standard backoff delay.

Fixes #54167
2026-06-28 04:35:37 -07:00
teknium1
61622bb56a fix(tui): use role=user for model switch marker to avoid HTTP 400 on strict providers (#48338)
_append_model_switch_marker() appended the post-/model-switch context marker
to session history as {"role": "system"}. The cached system prompt is
prepended to the API message list (conversation_loop.py), so this marker
became a SECOND system message mid-array after prior user/assistant turns.
Strict OpenAI-compatible providers (vLLM, Qwen) reject any system message
that is not at the beginning of the array, returning HTTP 400 and killing
the conversation on the next turn.

Flip the marker to role="user" (history entry + both session-DB persist
sites), matching the existing personality-overlay marker which already uses
role="user". repair_message_sequence() then coalesces it with adjacent user
turns as needed.
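
The ordering problem is easy to see on a toy message list. The sketch below is illustrative only: the helper names and the marker text are hypothetical, and it models just the rule that strict providers accept a system message only at index 0.

```python
def build_api_messages(system_prompt, history):
    """Prepend the cached system prompt to the session history."""
    return [{"role": "system", "content": system_prompt}, *history]


def misplaced_system_indexes(messages):
    """Indexes of system messages that appear anywhere but the first slot."""
    return [i for i, m in enumerate(messages) if m["role"] == "system" and i > 0]


def history_with_marker(marker_role):
    """A short history ending in a model-switch marker with the given role."""
    return [
        {"role": "user", "content": "hello"},
        {"role": "assistant", "content": "hi"},
        {"role": "marker_placeholder", "content": ""},
    ][:2] + [{"role": marker_role, "content": "[model switched]"}]


old = build_api_messages("You are helpful.", history_with_marker("system"))
new = build_api_messages("You are helpful.", history_with_marker("user"))
```

With the old role, `misplaced_system_indexes(old)` reports the marker at index 3; with `role="user"` the list is empty and the array passes a strict provider's check.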

Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>
Co-authored-by: Lucas Nicolas <lucas.nicolas@proton.me>
2026-06-28 04:34:55 -07:00
Brad Hallett
376d021fee fix(desktop): force app exit after update/uninstall handoff on macOS
On macOS app.quit() closes windows but window-all-closed deliberately keeps
the process alive (Dock convention). Every detached hand-off (update swap,
relaunch, Windows bootstrap recovery, uninstall cleanup) waits for the
desktop PID to exit before replacing/removing the bundle. Because the process
never dies, the script spins through its full PID-wait, and the user sees a
blank app or an uninstall that appears to do nothing.

Add a module-level isQuittingForHandoff flag, set before every hand-off
app.quit(); window-all-closed then quits on all platforms when it's set.

Covers all five hand-off sites including the Linux relaunch path.
2026-06-28 04:30:14 -07:00
teknium1
e54bedd8ea docs: add infographic for #42006 launchd bootout fix 2026-06-28 04:17:13 -07:00
izumi0uu
c4719aa51c fix(gateway): boot out stale launchd registration before restart bootstrap
launchd restart can leave the gateway job stopped but still registered after
update-time drain logic, so a direct bootstrap hits exit 5 and falls back to a
detached process. Booting the stale registration out before bootstrap keeps the
launchd-managed restart path intact and locks it with a regression test.

Constraint: Keep upstream-facing conventional commit style while preserving local decision context
Rejected: Treat bootstrap exit 5 as expected | Leaves macOS launchd restart outside launchd supervision after update
Confidence: high
Scope-risk: narrow
Directive: Keep launchd start/restart recovery flows aligned when changing launchctl handling
Tested: pytest -q tests/hermes_cli/test_gateway_service.py -k "launchd_restart_boots_out_stale_registration_before_bootstrap or launchd_restart_falls_back_to_detached_on_error_5 or launchd_restart_drains_running_gateway_before_kickstart or launchd_restart_self_requests_graceful_restart_without_kickstart"
Tested: pytest -q tests/hermes_cli/test_gateway_service.py -k launchd
Not-tested: Manual macOS launchctl restart after hermes update
2026-06-28 04:17:13 -07:00
Teknium
52a853f5c3 fix(test): pin monotonic clock in spinner-elapsed test to fix CI flake (#54203)
test_spinner_elapsed_format_is_fixed_width_to_reduce_wrap_jitter derived
_tool_start_time from the live time.monotonic() clock (now - 65.2 / now - 9.2).
monotonic()'s epoch is arbitrary — on a host where monotonic() < 65.2 (fresh
subprocess on a freshly-booted CI runner) the start time went negative, the
(t0 > 0) guard in _render_spinner_text() dropped the '(elapsed)' suffix, and
short.split('(',1)[1] raised IndexError: list index out of range. The failure is
deterministic whenever the clock is that small, so it would recur on such
runners rather than clear on rerun.

Pin time.monotonic to a fixed 1000.0 and offset _tool_start_time from it so both
the <60s and >=60s paths always render the elapsed suffix regardless of the
runner's monotonic epoch.
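
The failure mode and the pin can be reproduced with a stand-in renderer. This is a sketch, not the project's test: `render_spinner_text` here is a simplified substitute that keeps only the `(t0 > 0)` guard described above, and `render_with_clock` is a hypothetical helper.

```python
import time
from unittest.mock import patch


def render_spinner_text(tool_start_time: float) -> str:
    """Stand-in renderer: appends '(elapsed)' only when the start time is > 0."""
    if tool_start_time > 0:
        elapsed = time.monotonic() - tool_start_time
        return f"running ({elapsed:5.1f}s)"
    return "running"


def render_with_clock(clock_value: float, age: float) -> str:
    """Render with time.monotonic() pinned and a start time `age` seconds ago."""
    with patch("time.monotonic", return_value=clock_value):
        return render_spinner_text(time.monotonic() - age)
```

With the clock pinned to `1000.0`, a start time 65.2 seconds in the past stays positive and the suffix renders; with a small clock such as `3.0`, the start time goes negative and the suffix is dropped, which is the condition that made the original `split('(', 1)[1]` fail.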

Pre-existing main flake (surfaced in CI test slice 1/8).
2026-06-28 04:16:25 -07:00
Teknium
8e356eccea docs(readme): trim provider list to a few names plus docs link (#54169)
The README line enumerated 11 providers inline, which dilutes the point
and goes stale as providers come and go. Replace with Nous Portal,
OpenRouter, OpenAI, your own endpoint, and a 'many others' link to the
canonical AI Providers docs page that already lists them all.
2026-06-28 04:14:59 -07:00
teknium1
f22b9d3867 docs: add infographic for MCP WS discovery fix (#38945) 2026-06-28 04:14:12 -07:00
Cornna
5c2c85c545 fix(tui): start MCP discovery for websocket sessions
The desktop app and dashboard chat reach the agent through the /api/ws
JSON-RPC sidecar (tui_gateway.ws.handle_ws), NOT through
tui_gateway.entry.main() — the stdio-TUI path that spawns the background
MCP discovery thread. In the WS process discovery was therefore never
started: _make_agent only *waits* (wait_for_mcp_discovery), which no-ops
when the thread was never created, so the agent snapshotted an MCP-less
tool list. The only discovery trigger reachable was a manual /reload-mcp,
which is why tools appeared after a reload but vanished on restart.

Start the shared, idempotent, config-gated background discovery in
handle_ws right after accept() and before gateway.ready, so the first
agent build picks up already-spawning servers (and the existing
late-binding refresh handles slow ones).

Fixes #38945.
2026-06-28 04:14:12 -07:00
teknium1
091ce825fe test(redact): fix file_read regression-guard for current-main YAML collapse
The salvaged #35519 regression guard asserted that default (non-file_read)
mode keeps a head/tail `ghp_S1...Pn2T` mask for a `token: <key>` line. On
current main the YAML config pass (`_YAML_ASSIGN_RE`, key `token`) re-masks
the already-prefix-masked value to `***`, so the assertion was stale. Switch
to a bare-token context so the guard isolates what it claims (prefix-mask
head/tail shape in default mode) without depending on the YAML collapse.
2026-06-28 04:13:20 -07:00
kshitijk4poor
de928bccde fix(redact): non-reusable sentinel for prefix secrets in file reads (#35519)
When security.redact_secrets is on (default), read_file/search_files/cat
applied redact_sensitive_text(code_file=True) to file content, which still
ran prefix masking. An API key in config.yaml (ghp_..., sk-..., xai-..., etc.)
came back as a head/tail mask like `ghp_S1...Pn2T` — a plausible-looking
truncated key. When an agent read that and wrote it back to config, the masked
value replaced the real credential, silently breaking auth (401). Production
evidence: a config.yaml was found containing the exact 13-char masked GitHub PAT.

The two community PRs (#35529, #35534) fixed the corruption by NOT redacting
prefixes for config reads — but that exposes the user's real keys to the agent
context, model, and logs (a security regression). This takes the safer route:
keep redacting, but for file content emit a NON-REUSABLE sentinel.

- New `_mask_token_nonreusable`: prefix secrets -> `«redacted:ghp_…»` (vendor
  label preserved for debuggability; zero secret bytes; angle-bracket/ellipsis
  wrapper is syntactically invalid as a token so it can't be mistaken for or
  written back as a usable key).
- New `redact_sensitive_text(file_read=True)` routes prefix matches through it
  (implies code_file=True). Default/log/display mode is UNCHANGED — `_mask_token`
  still keeps head/tail (fine for logs, never written back).
- Wired the 3 file_tools.py call sites (read_file / search_files / cat) to
  file_read=True.

Fixes both the corruption AND avoids the secret-exposure of the un-redact
approach. 6 new tests (sentinel shape, no-leak, not-a-plausible-key, default
mode unchanged, file_read implies code_file, sk- prefix); 88 redact tests pass;
mutation-verified (reverting to the old mask fails the sentinel/leak tests).
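
A rough sketch of the sentinel idea, under stated assumptions: the prefix list and
the regex below are illustrative only (the real pattern set is not shown here),
and mask_token_nonreusable is a simplified stand-in for _mask_token_nonreusable.

```python
import re

# Assumed prefix list for illustration; the real vendor-prefix set is larger.
_PREFIX_RE = re.compile(r"\b(ghp_|sk-|xai-)[A-Za-z0-9_\-]{8,}")


def mask_token_nonreusable(text: str) -> str:
    """Replace prefix secrets with a sentinel that keeps only the vendor prefix."""
    # No secret bytes survive, and the wrapper characters make the result
    # invalid as a token, so it cannot be written back as a usable key.
    return _PREFIX_RE.sub(lambda m: f"«redacted:{m.group(1)}…»", text)
```

For example, a line such as `token: ghp_S1abcdefghijkPn2T` would come back as
`token: «redacted:ghp_…»` in this sketch.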

Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>
Co-authored-by: adammatski1972 <289282750+adammatski1972@users.noreply.github.com>

Closes #35519. Supersedes #35529, #35534.
2026-06-28 04:13:20 -07:00
teknium1
19cbbe304a docs: add infographic for clarify typed-replies fix 2026-06-28 04:13:19 -07:00
tymrtn
d7f655f370 fix: accept typed clarify choice replies 2026-06-28 04:13:19 -07:00
teknium1
9bb5a809b5 fix(gateway): make zombie check defensive against partial psutil stubs
The zombie status probe referenced psutil.Process/NoSuchProcess/Error
unconditionally, which raised AttributeError when psutil is a partial
stub that only defines pid_exists (as in test_windows_native_support's
fallback tests). Guard the probe so any failure to read status degrades
to the authoritative pid_exists() instead of raising.
2026-06-28 04:11:14 -07:00
MorAlekss
acca526286 fix(gateway): treat zombie PIDs as dead in _pid_exists to unblock --replace (closes #42126)
Under systemd Restart=always, the old gateway becomes a zombie (in the
process table, awaiting reap) when the replacement starts. _pid_exists()
reported the zombie as alive, so --replace waited on a PID that never
dies, then aborted with exit 1 — a silent crash loop. Standalone runs are
unaffected because nothing respawns the gateway into a zombie.

The live path is psutil.pid_exists(), which returns True for zombies, so
the check is added there (Process.status() == STATUS_ZOMBIE -> dead). The
psutil-less POSIX fallback also reads /proc/<pid>/stat (state Z) with a ps
state= fallback for macOS/BSD, before the os.kill(pid, 0) liveness probe.

Diagnosis and the /proc + ps POSIX fallback by MorAlekss (PR #44898);
extended to cover the psutil hot path so the fix applies on normal installs.
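
The /proc-based part of the fallback might look roughly like this. It is a sketch
only: is_zombie_stat is a hypothetical helper name, and it takes the text of
/proc/<pid>/stat as input rather than reading the file itself.

```python
def is_zombie_stat(stat_line: str) -> bool:
    """Return True if a /proc/<pid>/stat line reports state 'Z' (zombie)."""
    # The comm field is parenthesised and may itself contain spaces or ')',
    # so split after the LAST closing parenthesis; the next field is the state.
    _, _, rest = stat_line.rpartition(")")
    fields = rest.split()
    return bool(fields) and fields[0] == "Z"
```

A caller would treat a True result as "dead" before falling through to the
os.kill(pid, 0) liveness probe.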

Co-authored-by: MorAlekss <mor.aleksandr@yahoo.com>
2026-06-28 04:11:14 -07:00
teknium1
463225caf1 fix(gateway): bypass legacy-unit prompt in non-TTY systemd install
Folds in PR #42124 (kyssta-exe): systemd_install gained a non_interactive
flag so the 'Remove the legacy unit(s)?' prompt — the second hidden prompt
not guarded by --start-now/--start-on-login — is also skipped in headless
contexts. Updates systemd_install test mocks to accept the new kwarg and
adds coverage for the legacy-unit-skip path.
2026-06-28 04:09:54 -07:00
liuhao1024
831d443b03 fix(gateway): honor --start-now/--start-on-login flags and support non-TTY headless installs
When running `hermes gateway install` on Linux/systemd, the command
unconditionally prompts with two `prompt_yes_no` questions, breaking
headless installs (SSH, CI, provisioning scripts) and ignoring the
existing --start-now / --start-on-login CLI flags that the Windows
branch already respects.

The fix mirrors the Windows path: read CLI flags first, prompt only
when flags are not provided AND stdin is a TTY, and fall back to True
defaults for non-TTY contexts. The argparse help strings are promoted
from SUPPRESS to visible so users can discover the flags.
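
The decision order described above can be sketched as a small helper. The name
resolve_choice and its signature are assumptions made for illustration; they are
not the names used in the install code.

```python
from typing import Callable, Optional


def resolve_choice(flag: Optional[bool], is_tty: bool,
                   prompt: Callable[[], bool]) -> bool:
    """Explicit flag wins; prompt only on a TTY; otherwise default to True."""
    if flag is not None:
        return flag          # --start-now / --start-on-login was given
    if is_tty:
        return prompt()      # interactive: ask the user
    return True              # headless (SSH, CI): never block on stdin
```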

Fixes #42065
2026-06-28 04:09:54 -07:00
Teknium
5e7bca95d9 fix(tui): coalesce render frames while stdout backpressure is unresolved (#31486) (#54171)
When the previous frame's stdout.write has not drained (the outer terminal
parser is overwhelmed by a wide CR+LF burst — CJK + ANSI tool output on a
high-context session), the renderer kept writing a new frame every tick. That
piled writes onto an already-backed-up pipe and kept the macrotask queue hot,
starving the stdin 'readable' callback — the observed stdin freeze where the
agent loop keeps running but keystrokes/Ctrl-C are dead.

onRender now coalesces: while pendingWriteStart is non-null (prior write's
drain callback hasn't fired) it skips the frame and retries on the drain tick
instead of writing. A MAX_COALESCED_BACKPRESSURE_FRAMES ceiling forces a write
through after N skips so a terminal whose drain callback never fires (OSError
EIO on flush) self-heals once the pipe recovers rather than wedging forever.
TTY-only; piped stdout has no flow control. Coalesce counter resets on every
real write.

This is the stdout-backpressure strand left open after #54046 fixed the
swallowed-exception strand.
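
A language-neutral model of the coalescing rule, written in Python for brevity
(the renderer itself is not Python). FrameCoalescer and max_skips are hypothetical
names, and the actual value of MAX_COALESCED_BACKPRESSURE_FRAMES is not given here.

```python
class FrameCoalescer:
    """Skip frames while a write is undrained, up to a ceiling."""

    def __init__(self, max_skips: int) -> None:
        self.max_skips = max_skips  # assumed ceiling, value chosen by the caller
        self.pending = False        # True while the previous write has not drained
        self.skipped = 0

    def should_write(self) -> bool:
        if self.pending and self.skipped < self.max_skips:
            self.skipped += 1       # coalesce: drop this frame
            return False
        self.skipped = 0            # counter resets on every real write
        self.pending = True
        return True

    def on_drain(self) -> None:
        self.pending = False        # drain callback fired
```

The ceiling is what lets a terminal whose drain callback never fires recover: after
max_skips dropped frames, one write is forced through regardless.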
2026-06-28 04:00:22 -07:00
Teknium
a06d0198cd fix(dashboard): reap PTY bridge on child EOF, not only in writer finally (#54190)
The /api/pty handler only closed the PtyBridge in the writer loop's finally.
On child EOF the reader task closes the WebSocket, but if the handler task is
cancelled the instant the socket closes, the writer's finally can be skipped
and the PTY fds leak (#54028) — the FD-leak the regression test guards. Under
dashboard auto-reconnect this stacks orphaned PTYs until fds are exhausted.

Reap the bridge in the reader's EOF finally too (close() is idempotent), so
the PTY is reaped independently of the writer-loop cancellation race. Harden
the regression test to poll for teardown instead of asserting on the same
tick. The test was flaky on main (2 failures in 20 runs); it now passes 25 of 25.
2026-06-28 03:58:18 -07:00
Teknium
7968c90318 test(install): track run_with_timeout extraction after #39219 refactor (#54185)
PR #39219 split run_browser_install_with_timeout into a thin wrapper that
delegates to a new run_with_timeout helper (and parameterized the timeout
binary as $timeout_bin for macOS gtimeout support), but did not update
tests/test_install_sh_browser_install.py. The behavioral harness extracted
only the now-empty wrapper, so the install command never ran (runs==[]),
failing all 8 behavioral cases; two text assertions also still expected the
old literal 'timeout' invocation.

Fix the stale test: extract run_with_timeout alongside the wrapper, and match
the $timeout_bin-parameterized GNU-timeout strings. Behavior unchanged.
2026-06-28 03:58:01 -07:00
Christian Persico
135f235165 docs: fix incorrect web search instructions 2026-06-28 02:54:27 -07:00
kshitijk4poor
546193aa6d fix(install): time-box desktop + node-deps installs so a stalled download self-heals (#39219)
The desktop install step ran npm ci / npm run pack with no wall-clock cap, and
the sibling browser-tools / TUI / agent-browser dependency installs had the same
gap. The Electron binary (~150MB) is fetched from GitHub during the pack; on a
throttled or region-blocked link that download can *stall* rather than fail —
npm never errors and never exits, so the installer sits on "Build desktop app"
(step 9/11) indefinitely with only harmless 'npm warn deprecated' lines visible.
The existing self-heal escalation (cache purge -> dist restore -> npmmirror
fallback) only fires when pack returns non-zero, so a stall bypassed it.

- run_with_timeout (generalized from run_browser_install_with_timeout): GNU
  timeout --foreground -k 10 (Ctrl+C-aware, #35166) / gtimeout for external
  commands, else a pure-shell process-group watchdog so stock macOS (neither
  binary present) is protected. Shell functions (_desktop_pack) always take the
  pure-shell path — the timeout binary can't exec a function. Integer-normalized
  budget + a boundary recheck so a command finishing in the final poll second
  isn't mislabeled 124. The internal wait is guarded so set -e can't abort
  mid-function before the real exit code is computed.
- Wrap the desktop npm ci/install (sharing ONE budget via a computed deadline so
  a stall can't cost 2x DESKTOP_BUILD_TIMEOUT) + all three _desktop_pack attempts
  (DESKTOP_BUILD_TIMEOUT, default 900s), and the browser-tools / TUI / agent-
  browser registry installs (NODE_DEPS_TIMEOUT, default 600s).

A stall now converts to a bounded non-zero exit that feeds the existing mirror
self-heal instead of hanging the whole install.
2026-06-28 02:47:47 -07:00
Teknium
c1c179a239 fix(security): redact secrets in background process + foreground env-dump output (#43025) (#54149)
* fix(security): redact secrets in background process + foreground env-dump output

Terminal-output redaction was incomplete (#43025):

- Gap 1: process(action=poll/log/wait) returned background stdout verbatim —
  no redaction at all. A background printenv/server/test emitting a key leaked
  raw to the model, session.db, and CLI display. Same for the gateway
  background-process watcher's completion/progress notifications.
- Gap 2: the foreground terminal path hardcoded code_file=True, which skips the
  ENV-assignment pass, so an opaque token (no vendor prefix) from env/printenv
  leaked even there.

Adds agent.redact.redact_terminal_output(output, command) as the single policy
for ALL terminal-output surfaces: env-dump commands (env/printenv/set/export/
declare) get the ENV-assignment pass (code_file=False) to mask opaque tokens;
other commands stay on code_file=True to avoid false positives on source dumps.
Wired into terminal_tool, process_registry (_handle_process boundary), and the
gateway watcher. Respects security.redact_secrets (no force) — opt-out preserved.

* docs: add infographic for #43025 terminal-output redaction fix
2026-06-28 02:44:21 -07:00
teknium1
d5ba374c03 fix(telegram): detect wedged getUpdates consumer via pending_update_count
The merged CLOSE-WAIT heartbeat (#52744) only probes get_me(), which uses the
general request path and stays healthy while PTB's getUpdates consumer is
silently wedged (updater.running=True but the long-poll task is stuck, observed
on WSL2). DMs then queue in the Bot API and never reach handlers (#42909).

Augment the existing _polling_heartbeat_loop to also probe
get_webhook_info().pending_update_count. After two consecutive probes that see a
non-draining queue while the updater claims to be running, escalate into the
existing _handle_polling_network_error recovery ladder — no new restart
machinery. No-ops in webhook mode, when the updater is not running, or when a
reconnect is already in flight.

Credit to @gazzumatteo, whose PR #42959 identified the pending_update_count
signal as the missing liveness probe. This reuses the existing heartbeat +
recovery path rather than adding a parallel watchdog.

Fixes #42909.
2026-06-28 02:44:17 -07:00
teknium1
822b71cbf8 docs: add infographic for #43083 secret-redaction fix 2026-06-28 02:44:06 -07:00
teknium1
bbe1bf4045 fix(agent): stop redacting tool-call args in history; fix auth-header quote-eating
Two related redaction bugs from #43083:

1. build_assistant_message redacted tool-call arguments in-memory. That dict
   feeds both the replayed conversation history and state.db (which is itself
   replayed verbatim on session resume), so the model read back its own
   PGPASSWORD='***' psql call and copied the placeholder, breaking every
   credential-dependent command on the second turn. The masking gave no real
   protection either — the same secret still leaks through tool OUTPUT. Remove
   it. Keeping secrets out of the replayable store is a separate
   tokenization/vault concern (security.redact_secrets still governs
   storage-time redaction elsewhere).

2. _AUTH_HEADER_RE's greedy \S+ credential class ate a closing quote when the
   token sat flush against it (Authorization: Bearer sk-.."), turning value
   corruption into syntax corruption (unterminated quote -> shell EOF /
   SyntaxError). Exclude " and ' from the token class; real credentials never
   contain them.
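
The quote-eating in item 2 can be reproduced with two simplified patterns. Both
regexes below are illustrative assumptions, not the project's _AUTH_HEADER_RE.

```python
import re

# Hypothetical patterns for illustration only.
GREEDY = re.compile(r"(Authorization:\s*Bearer\s+)\S+")
QUOTE_SAFE = re.compile(r"(Authorization:\s*Bearer\s+)[^\s\"']+")

cmd = 'curl -H "Authorization: Bearer sk-abc123" https://example.com'

# \S+ swallows the closing quote that sits flush against the token.
greedy_out = GREEDY.sub(r"\1***", cmd)
# Excluding quote characters from the token class leaves the quote in place.
safe_out = QUOTE_SAFE.sub(r"\1***", cmd)
```

Here greedy_out ends up with an unterminated double quote, while safe_out keeps
the command syntactically intact.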

Closes #43083.
2026-06-28 02:44:06 -07:00
yoniebans
204a67f0c8 fix(kanban): retry write_txn on transient SQLITE_BUSY 2026-06-28 02:44:04 -07:00
yoniebans
90c1dc0493 test(kanban): cover write_txn BUSY retry (currently failing) 2026-06-28 02:44:04 -07:00
teknium1
9844243b18 fix(gateway): gate quick_commands through slash access policy
Config-backed quick_commands bypassed the admin-only slash gate. The
early gate in _handle_message only fires for registry-known commands
(is_gateway_known_command), but quick_commands are never in the gateway
registry, so they reached the type:exec dispatch sink unchecked. An
allowlisted non-admin gateway user could invoke admin-only quick
commands — including shell exec in the gateway process — even when the
operator set allow_admin_from / user_allowed_commands to lock them out.

Apply _check_slash_access(source, command) at the quick_commands
dispatch site (the single exec chokepoint, cold-path only) using the
raw typed name. Admins and users with the command in
user_allowed_commands still run it; backward-compat (no policy set)
is unaffected.

Fixes #44727.

Co-authored-by: maxpetrusenko <max.petrusenko.agent@gmail.com>
Co-authored-by: zapabob <1920071390@campus.ouj.ac.jp>
2026-06-28 02:43:23 -07:00
Teknium
6d879d486b fix(dashboard): close PTY WebSocket on child EOF to stop FD leak (#54028) (#54123)
* fix(dashboard): close PTY WebSocket on child EOF to stop FD leak

The /api/pty handler's reader task returns on child EOF, but the writer
loop stayed blocked on ws.receive() until the browser sent a disconnect.
When the browser socket is half-open (no FIN delivered — common on
macOS/launchd), that disconnect never arrives, so the handler never
reaches its finally and the PTY master fd + child process leak. With
dashboard auto-reconnect (#52962), every dropped socket then spawns a
fresh PTY on top of the orphaned one, exhausting file descriptors within
hours (EMFILE / Errno 24).

Fix: the reader task now closes the WebSocket in a finally when the child
EOFs or the send side breaks, which unblocks ws.receive() so the existing
finally runs bridge.close(). The writer loop also guards ws.receive()
against the RuntimeError Starlette raises once the socket is closed.

Reported by @fifteenzhang.

Fixes #54028

* docs: add infographic for #54028 PTY FD leak fix
2026-06-28 02:42:21 -07:00
teknium1
7ef04ae7a7 fix(browser): close eval return-value SSRF bypass (sibling of #44731)
The snapshot/vision guards re-check the page URL before returning content,
but browser_console(expression=...) -> _browser_eval returns arbitrary JS
results directly, leaving two same-class bypasses open:

  1. Direct fetch: fetch('http://127.0.0.1/secret').then(r=>r.text()) reads
     a private endpoint and returns the body — the page URL stays public so
     the post-eval recheck never sees it.
  2. Navigate-then-read: location.href='http://127.0.0.1/' then a later eval
     reads document.body.innerText.

Guard _browser_eval on the same condition as navigate/snapshot/vision
(not local backend, not local sidecar, not allow_private_urls):
  - pre-scan the expression for private/always-blocked URL literals
  - re-check window.location.href after the eval at both success-return
    sites (supervisor fast-path + subprocess fallback)

Probe failures fail-open (matching the snapshot/vision guards).
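
A minimal sketch of the pre-scan step, assuming a simple URL regex and the
standard ipaddress module. has_private_url_literal is a hypothetical name; the
real guard reuses _is_safe_url(), whose rules are not reproduced here.

```python
import ipaddress
import re
from urllib.parse import urlsplit

_URL_RE = re.compile(r"https?://[^\s'\"`)]+", re.IGNORECASE)


def has_private_url_literal(expression: str) -> bool:
    """Return True if a JS expression contains a literal private/loopback URL."""
    for url in _URL_RE.findall(expression):
        try:
            host = urlsplit(url).hostname or ""
        except ValueError:
            return True  # unparseable literal: treated as unsafe in this sketch
        if host == "localhost":
            return True
        try:
            addr = ipaddress.ip_address(host)
        except ValueError:
            continue  # a DNS name; resolving it is out of scope here
        if addr.is_private or addr.is_loopback or addr.is_link_local:
            return True
    return False
```

A literal scan like this misses URLs assembled at runtime (for example by string
concatenation), which is why it is paired with the post-eval location recheck.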
2026-06-28 02:42:01 -07:00
liuhao1024
0ae6196087 fix(browser): allow local sidecar sessions to bypass SSRF guard
The private-network guard in browser_snapshot() and browser_vision()
blocked all private URLs, including those accessed via local sidecar
sessions (hybrid routing). Local sidecar sessions intentionally access
private URLs — the cloud provider never sees the URL in that case.

Add `_is_local_sidecar_key(effective_task_id)` check to both guards,
matching the existing pattern in browser_navigate().

Fixes #45101 review feedback from egilewski.
2026-06-28 02:42:01 -07:00
liuhao1024
48f5c42599 fix(browser): extend private-network guard to browser_vision
The SSRF bypass in #44731 was only patched for browser_snapshot(), but
browser_vision() exposes the same vulnerability — it takes a screenshot
and sends it to the vision model without checking if eval-driven
navigation moved the page to a private/internal URL.

Add the same current-page URL safety check to browser_vision() before
any screenshot is captured, encoded, or forwarded to the vision model.
This covers both the normal screenshot path and the Lightpanda Chrome
fallback path.

7 new tests: blocks private URL, allows public URL, skips in local
backend, skips when private URLs allowed, handles eval failure/empty/exception.
2026-06-28 02:42:01 -07:00
liuhao1024
7a6fe9bbfa fix(browser): block snapshot from eval-navigated private pages
browser_snapshot() now checks the current page URL before returning
content. When browser_console() changes location.href to a private or
internal address (e.g., http://127.0.0.1:8080/), the snapshot returns
an error instead of exposing the private page content.

This closes the SSRF bypass where an attacker could:
1. Navigate to a public page
2. Use browser_console to eval location.href = 'http://127.0.0.1:port/'
3. Use browser_snapshot to read the private page content

The fix reuses the existing _is_safe_url() and _allow_private_urls()
infrastructure, and fails open if the URL check itself fails.

Fixes #44731
2026-06-28 02:42:01 -07:00
Teknium
7c0a5def58 fix(memory/holographic): close DB connection on shutdown instead of leaking to GC (#54133)
HolographicMemoryProvider.shutdown() dropped its MemoryStore reference
without calling the existing MemoryStore.close(). Since the connection is
opened check_same_thread=False (one per session), its fd was released by
refcount/GC at a non-deterministic time on a non-deterministic thread,
churning a DB fd through the kernel free pool on every session teardown.
Call close() so the fd is released deterministically.

Reported by @alfranli123 (#44037), who pinpointed the exact code location.
Note: the report's TLS-fd-recycle corruption attribution could not be
reproduced from the code — dropping a sqlite connection flushes valid
SQLite pages via the VFS, never TLS framing, and the provider is at most a
releaser of DB fds, not a TLS-flushing socket owner. This change is correct
resource hygiene that removes per-session fd churn regardless.
2026-06-28 02:41:52 -07:00
Teknium
00d8c2c915 fix(gateway): prune stale sessions.json entries on startup
A hard gateway crash (exit code 1) skips the graceful shutdown path, so
sessions.json is never cleared and is left pointing at sessions already
ended in state.db. On the next startup get_or_create_session() reuses
those stale entries as long as the time/policy reset checks pass — it
never consults end_reason — so every incoming message is silently routed
into a closed session, with no log or error (#52804).

SessionStore._ensure_loaded_locked() now calls a new
_prune_stale_sessions_locked() that drops any entry whose session_id has
end_reason IS NOT NULL in state.db. Idempotent, _db=None / legacy-absent
safe, DB errors non-fatal, sessions.json rewritten only when something
was pruned. Self-heals into a fresh session on the next message.
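
The pruning rule can be sketched against an in-memory SQLite database. The table
layout (sessions with id and end_reason columns) and the prune_stale helper are
assumptions for illustration; only the end_reason IS NOT NULL criterion comes
from the description above.

```python
import sqlite3
from typing import Dict


def prune_stale(entries: Dict[str, str], db: sqlite3.Connection) -> Dict[str, str]:
    """Drop routing entries whose session already has an end_reason."""
    kept = {}
    for key, session_id in entries.items():
        row = db.execute(
            "SELECT end_reason FROM sessions WHERE id = ?", (session_id,)
        ).fetchone()
        if row is not None and row[0] is not None:
            continue  # session ended: do not route new messages into it
        kept[key] = session_id  # live, or unknown to the DB (left untouched)
    return kept
```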

Reported and diagnosed by @terry197913 (#52808).
2026-06-28 02:41:47 -07:00
teknium1
c38dfba3a7 docs: add infographic for #53175 gateway cleanup off-loop fix 2026-06-28 02:41:36 -07:00
teknium1
ea5aaa7a22 fix(gateway): offload remaining inline agent cleanup off the event loop (#53175)
#35994 moved /new reset cleanup off the loop, but _cleanup_agent_resources
(agent.close() subprocess teardown; shutdown_memory_provider() plugin IO) was
still called INLINE on the event loop from three other sites:

  - _session_expiry_watcher (5-min idle sweep) — live loop
  - _handle_message_with_agent cache-hygiene re-eviction — live loop
  - _finalize_shutdown_agents / stop() idle-cache loop — shutdown

A wedged memory provider on any of these froze the loop: bot goes silent,
runtime-status updated_at heartbeat stops advancing, and SIGTERM can't be
serviced (requires kill -9) — exactly the #53175 zombie pattern.

Adds _cleanup_agent_resources_off_loop: a bounded (30s) worker-thread offload
mirroring the #35994 reset fix, and routes all four sites through it.
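
A bounded worker-thread offload of this kind might look like the sketch below.
cleanup_off_loop and wedged_provider_shutdown are hypothetical names; only the
30 s bound is taken from the description.

```python
import asyncio
import time
from typing import Callable


async def cleanup_off_loop(cleanup: Callable[[], None],
                           timeout: float = 30.0) -> bool:
    """Run a blocking cleanup callable in a worker thread, bounded by timeout."""
    try:
        await asyncio.wait_for(asyncio.to_thread(cleanup), timeout)
        return True
    except asyncio.TimeoutError:
        # The worker thread may still be running, but the event loop is free
        # to keep serving messages, heartbeats and signals.
        return False


def wedged_provider_shutdown() -> None:
    time.sleep(0.2)  # stands in for plugin IO that blocks
```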
2026-06-28 02:41:36 -07:00
teknium1
aa50c1ba5d fix(prompt): repair backend probe import (get_environment never existed)
The system-prompt backend probe imported a nonexistent symbol —
`from tools.environments import get_environment` — which always raised
ImportError: cannot import name 'get_environment'. The exception is caught
and only drops the live backend description to a static fallback, so it is
cosmetic, but it broke the live OS/user/cwd probe for every non-local
backend (docker/singularity/modal/daytona/ssh).

The real factory is `_create_environment` in tools.terminal_tool. Build the
environment the same way the live terminal path does (select backend image,
assemble ssh/container config from _get_env_config()), then run the probe.

Note: this does NOT affect tool loading — tool selection runs each tool's
check_fn and never consults this probe. Regression from #52147 (2026-06-25).

Closes #53667 (probe import); the 'cronjob-only' tool-collapse symptom is
not reproducible — tool selection has no probe dependency and memory's
check_fn is unconditionally True.
2026-06-28 02:41:31 -07:00
Teknium
b508d4296e test(ci): raise per-file timeout 140s → 300s to stop false timeouts (#54143)
* test(ci): raise per-file timeout 140s to 300s to stop false timeouts

The per-file parallel runner caps each test-file subprocess at a flat
wall-clock budget. Combined with per-test subprocess isolation (a fresh
Python process per test), a large-collection file pays N x (interpreter
startup + import) of overhead before any test logic runs. That overhead
dilates under load on shared CI runners, so a file that finishes in
~100s on a quiet box can blow the old 140s cap purely from scheduling
jitter, surfacing as a false 'no tests ran' timeout (rc=124) with zero
actual test failures.

Raise the default to 300s (5 min). The Docker build matrix jobs already
take 7-10 min, so this headroom costs nothing on total CI wall time
while still bounding a genuinely hung file.

* docs: add infographic for CI per-file timeout bump
2026-06-28 02:41:07 -07:00
teknium1
dcc6cd1b42 docs: add infographic for #52378 Windows update-loop salvage 2026-06-28 02:40:37 -07:00
teknium1
fe89ce0694 chore(release): map Cossackx in AUTHOR_MAP for #52528 salvage 2026-06-28 02:40:37 -07:00
Cossackx
ba37c910e0 fix(desktop/windows): resolve real hermes over extensionless shim + prefer --update on recovery
Two Windows-only desktop boot bugs that caused spurious reinstall/repair loops:

1. findOnPath() searched the empty extension BEFORE PATHEXT, so an
   extensionless Git-Bash `hermes` shim shadowed the real hermes.cmd/.exe.
   The shim then failed the shell:false --version probe and the resolver
   fell through to bootstrap/repair even though a working CLI was on PATH.
   Fix: try PATHEXT extensions first, keep the empty entry LAST so callers
   that already include the extension (py.exe, pwsh.exe) still resolve.

2. handOffWindowsBootstrapRecovery() chose the destructive --repair over the
   gentle --update by checking only venv\Scripts\hermes.exe -- the setuptools
   console-script shim, written at the END of venv setup and absent in
   interrupted/quarantined states. Fix: take --update when ANY real-install
   signal is present (venv python, the shim, or .hermes-bootstrap-complete).

Adds windows-hermes-resolution.test.cjs (source-assertion pattern, wired into
test:desktop:platforms) guarding both regressions.
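
The lookup order from fix 1 can be modelled in a few lines. This is a Python
sketch of the ordering only (the desktop code is JavaScript), and candidate_names
is a hypothetical helper.

```python
from typing import List


def candidate_names(command: str, pathext: str) -> List[str]:
    """Build lookup names: PATHEXT extensions first, bare name last."""
    exts = [ext.lower() for ext in pathext.split(";") if ext]
    # The empty extension goes LAST so an extensionless shim cannot shadow
    # hermes.cmd / hermes.exe, while names like py.exe still resolve.
    return [command + ext for ext in exts] + [command]
```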

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 02:40:37 -07:00
Cornna
0229246ab8 fix(desktop): probe venv runtime health before trusting bootstrap marker
A broken/empty Windows launcher venv can see the source tree via PYTHONPATH
but lack PyYAML, so 'import hermes_cli' succeeds while the first real CLI
import dies — the desktop then trusts the bootstrap marker, spawns a dead
backend, and loops on 'gateway offline' (#52378).

- backend-probes.cjs: canImportHermesCli now runs 'import yaml; import
  hermes_cli.config' (extracted as hermesRuntimeImportProbe) and accepts an
  env override, so a dependency regression is caught without a real broken
  venv fixture.
- main.cjs: isBootstrapComplete() routes through new isActiveRuntimeUsable(),
  which requires the venv python to pass the runtime import probe (with
  ACTIVE_HERMES_ROOT on PYTHONPATH) — not just exist on disk.

Salvaged from PR #38179. The PR's install.ps1 reset/clean + autocrlf changes
and their tests are dropped: current main already preserves dirty checkouts
via stash (the data-loss-safe #38542 path) rather than the PR's older
reset-based Repair-ManagedCheckoutBeforeUpdate approach.
2026-06-28 02:40:37 -07:00
teknium1
7c9cdad9fd test(cli): cover Windows self-lock recovery guard + cmd-quote its hint
Add two tests for the self-lock guard in _recover_from_interrupted_install:
one asserting it clears the marker and skips install when hermes.exe is a
process ancestor (breaking the #52378/#45542 loop), one asserting it falls
through to a normal recovery install when the shim is NOT an ancestor.

The guard's manual-recovery hint runs only inside the Windows branch, so
quote it for cmd.exe (cd /d, double-quoted paths) — the cross-platform
fallback hint at the end of the function is left POSIX-correct.

Map Icather in scripts/release.py AUTHOR_MAP for the salvage.
2026-06-28 02:40:37 -07:00
灵越羽毛
b6f592dbdc fix(cli): detect self-lock in update recovery to break infinite retry loop on Windows 2026-06-28 02:40:37 -07:00
liuhao1024
14baeefe1d fix(matrix): record DM rooms in m.direct on invite to prevent group misclassification
Rebase onto plugins/platforms/matrix/adapter.py (code moved from
gateway/platforms/matrix.py). Same logic: _on_invite checks is_direct
on invite events and calls _record_dm_room to persist in m.direct
account data.

Fixes #44679
2026-06-28 02:37:52 -07:00
Teknium
fde1c8570f fix(tui_gateway): suppress WS peer-hangup teardown error flood (#50005) (#54126)
When the Desktop forcibly closes its WebSocket mid-write, asyncio logs a
full traceback for every pending connection-lost callback — 50+ identical
WinError 10054 (ConnectionResetError) lines per disconnect on Windows, the
equivalent ConnectionResetError/BrokenPipeError on POSIX. These are not
actionable: they are the expected side effect of the peer hanging up before
our writes drained.

Install a loop exception handler on the gateway serving loop that collapses
exactly this teardown class (ConnectionResetError/ConnectionAbortedError/
BrokenPipeError originating from _call_connection_lost) to a single debug
line, forwarding every other loop error to the existing/default handler
unchanged so genuine loop bugs still surface. Idempotent per loop.
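
One way such a handler could be shaped is sketched below. It assumes the callback
name _call_connection_lost is visible in the context message, which is how
asyncio formats "Exception in callback ..." reports; install_teardown_filter is a
hypothetical name and the idempotency guard is omitted.

```python
import asyncio
import logging

log = logging.getLogger("ws_teardown_sketch")
_PEER_HANGUP = (ConnectionResetError, ConnectionAbortedError, BrokenPipeError)


def install_teardown_filter(loop: asyncio.AbstractEventLoop) -> None:
    """Collapse peer-hangup teardown errors to a debug line; forward the rest."""
    previous = loop.get_exception_handler()

    def handler(loop: asyncio.AbstractEventLoop, context: dict) -> None:
        exc = context.get("exception")
        # Assumption: the originating callback name appears in the message.
        from_teardown = "_call_connection_lost" in str(context.get("message", ""))
        if isinstance(exc, _PEER_HANGUP) and from_teardown:
            log.debug("peer hung up during teardown: %r", exc)
            return
        if previous is not None:
            previous(loop, context)
        else:
            loop.default_exception_handler(context)

    loop.set_exception_handler(handler)
```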
2026-06-28 02:35:01 -07:00
teknium1
6eec0d4f08 docs: add infographic for #53107 gateway force-exit fix 2026-06-28 02:34:23 -07:00
LeonSGP43
9f0e64cedd fix(gateway): force exit after graceful shutdown
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-06-28 02:34:23 -07:00
teknium1
dddaea0c98 chore(release): map yungchentang author for #53622 salvage 2026-06-28 02:34:17 -07:00
yungchentang
7e2ca7f68d fix(telegram): reset send pool after pool timeouts 2026-06-28 02:34:17 -07:00
kshitij
f3d8f20a59 Merge pull request #54116 from kshitijk4poor/fix/36658-gateway-drain-microtask
fix(tui): defer buffered gateway events to stop dashboard chat #301 (#36658)
2026-06-28 14:50:07 +05:30
Teknium
f646b82ff0 docs: add infographic for #38249 atomic env-snapshot fix 2026-06-28 02:08:57 -07:00
Teknium
9f17f16c66 fix(environments): use $BASHPID for atomic snapshot temp + harden failure path
The atomic mv approach (kyssta-exe's commit) narrows but does not close the
#38249 race: the temp name used $$ (parent shell PID), which is identical
across &-launched concurrent subshells. Two concurrent writers pick the same
temp file, clobber each other mid-write, and mv then publishes a torn snapshot
— a reader sourcing it absorbs declare-x/export fragments into PATH.

- Use $BASHPID (actual per-subshell PID) so concurrent writers never collide.
- Chain mv on export success (&&) and rm the temp on failure so a partial dump
  never replaces a good snapshot; apply the same to the init_session bootstrap.
- shlex-quote the static temp-path portion (Windows/spaces), $BASHPID outside.
- LocalEnvironment.cleanup sweeps orphaned snap.tmp.* temps.
- Regression tests: string-shape + a behavioral concurrent writers/readers test
  that proves the snapshot never tears (would still tear with $$).
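
The same write-then-rename discipline, shown as a Python analog rather than the
shell used by the fix: a unique temp file per writer, publish only on success, and
remove the temp on failure. write_atomic and the ".snap.tmp." prefix are
illustrative choices.

```python
import os
import tempfile


def write_atomic(path: str, data: str) -> None:
    """Write data to a unique temp file, then atomically replace path."""
    directory = os.path.dirname(os.path.abspath(path))
    # mkstemp gives every writer its own temp name, so concurrent writers
    # never clobber each other's partial output.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".snap.tmp.")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(data)
        os.replace(tmp, path)  # atomic within one filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)     # never leave or publish a partial dump
        raise
```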
2026-06-28 02:08:57 -07:00
kyssta-exe
6a2958a521 fix(environments): use atomic file replacement for snapshot writes
Fix race condition in terminal environment snapshots that could corrupt
PATH with declare -x entries. When concurrent terminal calls share the
same snapshot file, the non-atomic 'export -p > snapshot.sh' write could
be read mid-write by another process, causing partial/corrupted env vars
to be sourced and mixed into PATH.

The fix uses atomic file replacement:
- Write to a temp file: export -p > snapshot.sh.tmp.303651
- Atomically replace: mv -f snapshot.sh.tmp.303651 snapshot.sh

On POSIX, mv within the same filesystem is atomic, so source() will
either see the old complete snapshot or the new complete one, never a
partial/truncated file.

Fixes #38249
2026-06-28 02:08:57 -07:00
teknium1
c23f394eb8 fix: satisfy ruff encoding + windows-footgun lints for cgroup reaper
- read_text(encoding='utf-8') (PLW1514)
- # windows-footgun: ok on signal.SIGKILL — module is Linux-only (reads
  /proc, /sys/fs/cgroup; runs from a systemd unit)
- test lambda accepts the new encoding kwarg
2026-06-28 02:05:50 -07:00
teknium1
86ec979f66 chore(release): map PRATHAMESH75 author for #37550 salvage 2026-06-28 02:05:50 -07:00
PRATHAMESH75
e551da6ddb fix(gateway): reap cgroup orphans via ExecStopPost to unblock restart
Long-lived helpers spawned indirectly by tool calls (adb, platform
bridges) were left in the service cgroup after the gateway's main
process exited. When the kernel rejected the deferred cgroup-wide kill
with EINVAL, systemd blocked Restart=always for 6+ minutes, taking
down all platforms and cron windows (#37454).

Add a small ExecStopPost helper (gateway.cgroup_cleanup) that walks
cgroup.procs and sends per-PID SIGKILLs — a different kernel code path
than cgroup.kill, so it succeeds where the cgroup-wide write failed.
KillMode=mixed is preserved so the gateway still reaps its own
tool-call children before systemd intervenes (#8202).
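
A minimal sketch of the per-PID walk, assuming the helper is handed the text
of cgroup.procs (one PID per line). The function names and the injectable
kill parameter are hypothetical and exist only to make the sketch testable;
the real gateway.cgroup_cleanup module may be structured differently.

```python
import os
import signal
from typing import Callable, List


def parse_cgroup_procs(text: str) -> List[int]:
    """Return the PIDs listed one per line in a cgroup.procs file."""
    return [int(token) for token in text.split() if token.isdigit()]


def reap_cgroup(
    procs_text: str,
    kill: Callable[[int, int], None] = os.kill,
) -> List[int]:
    """SIGKILL every listed PID except our own; return the PIDs signalled."""
    signalled = []
    for pid in parse_cgroup_procs(procs_text):
        if pid == os.getpid():
            continue  # never signal the cleanup helper itself
        try:
            kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            continue  # process exited between the read and the kill
        signalled.append(pid)
    return signalled
```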

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-28 02:05:50 -07:00
Coy Geek
d7a1052424 fix(env-passthrough): fail closed when provider blocklist import fails
When tools.environments.local can't be imported (partial install,
import-time error), _is_hermes_provider_credential() returned False —
fail-open. A skill could then register a Hermes provider credential
(ANTHROPIC_API_KEY, etc.) as env passthrough; _scrub_child_env lets
passthrough vars bypass the secret-substring net (rule 1), so the
operator's real key would land in the execute_code child. Reopens the
GHSA-rhgp-j443-p4rf bypass.

Fail closed instead: on import failure, treat the name as a protected
provider credential and refuse passthrough. Regression test exercises
the full register -> scrub path under a simulated import failure.
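
The fail-closed rule can be illustrated with a small sketch. The function
name and the load_blocklist callable are assumptions for the example; they
stand in for _is_hermes_provider_credential() and the import of
tools.environments.local, whose real signatures are not shown here.

```python
from typing import Callable, FrozenSet


def is_protected_credential(
    name: str,
    load_blocklist: Callable[[], FrozenSet[str]],
) -> bool:
    """Return True when `name` must not be passed through to a child env."""
    try:
        blocklist = load_blocklist()
    except Exception:
        # Fail closed: if the blocklist cannot be loaded, treat every name as
        # a protected provider credential instead of letting it through.
        return True
    return name in blocklist
```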

Co-authored-by: Hermes Agent <noreply@nousresearch.com>
2026-06-28 02:05:43 -07:00
teknium1
58c36b1798 fix(api-server): widen error redaction to cron-endpoint + SSE sites
Follow-up to the salvaged #37733 fix. The contributor centralized
redaction at _openai_error and the chat/responses failure paths, which
covers the OpenAI-compatible envelopes transitively. Two sibling classes
crossed the same authenticated HTTP boundary unredacted:

- 8x cron-management endpoints returning {"error": str(e)} on 500
- the session-chat SSE error event ({"message": str(exc)})

Route both through the same _redact_api_error_text(force=True) helper.
Add AUTHOR_MAP entry for coygeek and a TestRedactApiErrorText guard
covering mask/force/limit/passthrough behavior.
2026-06-28 02:05:38 -07:00
Coy Geek
5e774de76e fix(api-server): redact provider errors at HTTP boundary
Force API-server error text through the existing secret redactor before
returning OpenAI-compatible errors, response fallback text, response
snapshots, and run failure events. This prevents credential-shaped
provider failure text from crossing the API-server boundary while
preserving debuggable sanitized messages.
2026-06-28 02:05:38 -07:00
HexLab98
d2fda5925d test(gateway): cover Discord/Slack compression status suppression (#39293) 2026-06-28 14:35:32 +05:30
HexLab98
d2ea948bc0 fix(gateway): suppress compression status noise on Discord and other chats (#39293)
Extend the gateway noisy-status filter beyond Telegram so internal
compression lifecycle messages stay in logs instead of spamming Discord,
Slack, and other messaging channels.
2026-06-28 14:35:32 +05:30
teknium1
9f7d520caf docs: add infographic for #36664 WhatsApp LID session-path fix 2026-06-28 02:05:26 -07:00
teknium1
3aaa98dd01 test(whatsapp): cover LID allowlist match on modern session layout
Add an _is_user_authorized E2E for the platforms/whatsapp/session layout
on top of fesalfayed's resolver fix (#36665) — guards the actual
silently-dropped-LID-sender path from #36664.
2026-06-28 02:05:26 -07:00
fesalfayed
263ffec1b0 fix(whatsapp): resolve LID aliases on modern platforms/ session layout
expand_whatsapp_aliases hardcoded get_hermes_home()/whatsapp/session, but
the adapter writes lid-mapping files via get_hermes_dir("platforms/whatsapp/
session", "whatsapp/session"). On installs without the legacy directory the
two paths diverge, so the resolver finds no mappings and returns the bare LID,
which misses the allowlist and silently drops the message. Resolve through the
same helper so both sides stay in lockstep on new and legacy layouts.
2026-06-28 02:05:26 -07:00
teknium1
d0f087e7f9 docs: add infographic for #36109 empty-400 diagnostics 2026-06-28 02:05:20 -07:00
xxxigm
093f567f0d fix(agent,cli): surface empty-body API errors and fail oneshot exit code
When an LLM API call returns HTTP 4xx with an empty parsed SDK `body` ({}),
`_summarize_api_error` fell through to a bare `str(error)`, so users saw only
"HTTP 400" with no provider detail (reported on Windows in #36109). The SDK
leaves `body` empty in this case, but the httpx `response` still carries the
payload in `.text`.

- run_agent.py `_summarize_api_error`: when `body` is empty, fall back to
  `response.text` — parse a JSON `error.message`/`message` when present, else
  surface the raw (truncated) body. Platform-agnostic diagnostics.
- hermes_cli/oneshot.py: `hermes -z` now runs via `run_conversation` and returns
  exit code 2 when the run is failed/partial with no usable final response, so
  scripts can detect LLM failures (still 0 when a response — incl. an error
  summary as output — is produced).
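
The empty-body fallback described in the first bullet might look like the
sketch below. The function name, the None return for "nothing to add", and
the 500-character limit are assumptions for illustration, not the actual
_summarize_api_error code.

```python
import json
from typing import Any, Mapping, Optional


def summarize_empty_body_error(
    body: Optional[Mapping[str, Any]],
    response_text: str,
    limit: int = 500,
) -> Optional[str]:
    """Recover provider detail from raw response text when `body` is empty."""
    if body:
        return None  # the parsed body has content; the caller keeps using it
    text = (response_text or "").strip()
    if not text:
        return None
    try:
        payload = json.loads(text)
    except ValueError:
        payload = None
    if isinstance(payload, dict):
        error = payload.get("error")
        if isinstance(error, dict) and isinstance(error.get("message"), str):
            return error["message"]
        if isinstance(payload.get("message"), str):
            return payload["message"]
    # Not JSON, or no recognizable message field: surface the raw text.
    return text[:limit]
```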

Tests: new tests/run_agent/test_summarize_api_error.py (empty-body JSON + raw
text, RED/GREEN verified) + oneshot exit-code/`run_conversation` wiring tests.

NOTE: #36109's original root cause (Windows "all providers return empty 400")
is not reproducible on current main (heavy provider-transport churn since
v0.15.1). This change does not claim to fix that root cause — it makes any
empty-body API error LEGIBLE so a future occurrence shows the real provider
message instead of a bare HTTP 400. Relates to #36109 (does not close it).
2026-06-28 02:05:20 -07:00
teknium1
c0b4a3438a fix(install): scope Playwright override to too-new apt releases + keep step interruptible
Follow-up on #54032 for #35166:
- Gate the PLAYWRIGHT_HOST_PLATFORM_OVERRIDE retry on the host being an apt
  release newer than Playwright recognizes (Ubuntu >24.04 / Debian >13) via
  playwright_host_unrecognized(), instead of retrying on ANY install failure.
  A network/disk/permission failure on a supported host now surfaces unchanged
  rather than getting a mismatched-glibc build forced onto it.
- detect_os() now captures DISTRO_VERSION from os-release.
- Fold in the interruptibility fix (was PR #35304, self-closed): wrap the
  download in 'timeout --foreground -k 10' (probed, with plain-timeout
  fallback) so a terminal Ctrl+C reaches the child and a wedged download is
  force-killed after the deadline.
- Add behavioral tests that source the helpers and assert the retry fires only
  on Ubuntu 26.04 / Debian 14, not on supported hosts, non-apt distros,
  native-success, operator-pinned override, or unsupported arch.
2026-06-28 02:05:18 -07:00
kshitijk4poor
a28fe788a6 fix(install): retry Playwright install with platform override on unrecognized host (#35166)
On apt releases newer than the bundled Playwright recognizes (Ubuntu 26.04,
Debian 14, and future distros), 'npx playwright install --with-deps chromium'
hangs uninterruptibly at 'Installing Playwright Chromium with system
dependencies' because Playwright's resolver maps the host to a platform with
no download build (#35166).

Wrap every installer Playwright call in run_playwright_install(), which tries
the native install first and, only if it fails or times out, retries once with
PLAYWRIGHT_HOST_PLATFORM_OVERRIDE pinned to the newest known build
(ubuntu24.04-<arch>). This is the escape hatch Playwright's maintainers bless
for unrecognized platforms (microsoft/playwright#33434).

Try-native-first (not a hardcoded distro/version table) is deliberate:
- Self-correcting — when Playwright already supports the host (e.g. Ubuntu
  26.04 on Playwright >=1.61) the first attempt succeeds and the override is
  never applied, so we never force a mismatched-glibc build onto a release
  Playwright handles correctly (microsoft/playwright#35114).
- Zero-maintenance — new distro releases work the moment Playwright adds them.
- Covers Debian 14+ and future releases, not just Ubuntu 26.04.

An operator-set PLAYWRIGHT_HOST_PLATFORM_OVERRIDE is always respected (applied
to the first attempt; retry skipped). Non-x64/arm64 arches have no fallback
build and skip the retry.

Refs #35166
2026-06-28 02:05:18 -07:00
teknium1
64972b6403 fix(config): canonicalize model.name/model.model to model.default (#34500)
A custom_providers config that names the model under model.name (or
model.model) resolved to an empty model, so the API request went out
with model= — HTTP 400 from OpenAI-compatible backends. Display paths
(hermes status/dump) already read model.name and showed the model,
making the failure silent.

The model id was read via 'default or model' at ~14 independent sites
(cli, gateway, cron, curator, oneshot, fallback, profiles, ...), none
of which honored 'name'. Rather than patch every site, canonicalize at
the single load/save chokepoint: _normalize_root_model_keys() now
promotes model.model/model.name -> model.default (precedence
default > model > name) and drops the stale alias, so every reader —
present and future — sees a populated default and config.yaml is
migrated canonical on next save. The gateway, which bypasses
load_config(), replays the same normalization in _load_gateway_config().
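
A sketch of the promotion rule, assuming the config is a plain dict with a
nested "model" mapping. The function name is illustrative and the real
_normalize_root_model_keys() may handle more cases.

```python
from typing import Any, Dict


def normalize_model_keys(config: Dict[str, Any]) -> Dict[str, Any]:
    """Promote model.model / model.name to model.default and drop the aliases."""
    model = config.get("model")
    if not isinstance(model, dict):
        return config  # nothing to normalize (for example, a bare string)
    normalized = dict(model)
    # Precedence: default > model > name.
    for key in ("default", "model", "name"):
        value = normalized.get(key)
        if isinstance(value, str) and value.strip():
            normalized["default"] = value
            # Drop the stale aliases so only the canonical key is saved.
            normalized.pop("model", None)
            normalized.pop("name", None)
            break
    return {**config, "model": normalized}
```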

Co-authored-by: Bartok9 <danielrpike9@gmail.com>

Credit: root-cause analysis and fix direction from @Bartok9 (#34502,
first) and @v86861062 (#34527).
2026-06-28 02:05:13 -07:00
kshitijk4poor
f64d15ccb7 fix(tui): defer buffered gateway events to stop dashboard chat #301 (#36658)
Dashboard /chat spawns the TUI attached to the dashboard's in-memory
gateway via HERMES_TUI_GATEWAY_URL. In that attach mode the already-running
gateway replays `gateway.ready` (and `session.info`) the instant the socket
connects, so those events land in GatewayClient.bufferedEvents *before* the
consumer's mount-time subscribe effect (useMainApp.ts) calls drain().

drain() then emitted the buffered events synchronously, so the
`gateway.ready` handler's patchUiState / setHistoryItems cascade ran while
React was still inside the first commit — tripping "Too many re-renders"
(Minified React error #301) and breaking Dashboard chat after `hermes update`.
Spawn / inline / sidecar modes never hit this: their `gateway.ready` only
arrives after the Python child boots, on a later async tick.

Fix: drain() defers the replay to the next microtask AND keeps `subscribed`
false until that microtask runs. Keeping `subscribed` false in the gap means
any live event arriving before the flush keeps buffering (publish() pushes
when !subscribed) instead of emitting synchronously and jumping ahead of the
chronologically-earlier replayed events — the flush re-drains the buffer
right after flipping `subscribed`, preserving FIFO order. A drainGeneration
token (bumped in resetStartupState) makes a queued flush a no-op if the
transport was reset/killed in the meantime, avoiding use-after-teardown and
duplicate/reordered exits.

Regression tests: (1) drain() does not dispatch buffered events synchronously;
(2) a live event arriving in the post-drain / pre-microtask window still
delivers BEHIND the earlier-buffered event (FIFO). Both are red against the
old synchronous behavior, green with this fix. Same class of fix as #44528.

Closes #36658
2026-06-28 14:18:47 +05:30
Teknium
2ecb6f7fe6 fix(telegram): clear send_path_degraded on successful reconnect (#35205) (#54076)
* fix(telegram): clear send_path_degraded on successful reconnect

_send_path_degraded was cleared only in _verify_polling_after_reconnect,
60s after reconnect and only if scheduled. A clean start_polling() reconnect
left the flag stuck True, short-circuiting send() and blocking all outbound
messages until the deferred probe ran (or forever if it never did).

Clear the flag the moment start_polling() succeeds — that is the recovery
signal. The deferred probe remains a defensive re-check that re-enters the
reconnect ladder (re-setting the flag) if it detects a silent wedge.

Fixes #35205.

* docs: add infographic for #35205 telegram send-path fix
2026-06-28 01:38:17 -07:00
Teknium
674e16e7c6 fix(redact): stop DB-connstr redaction from corrupting code output (#33801) (#54061)
Secret redaction is display/output-scoped on main — write_file writes
content verbatim, terminal/execute_code redact only output not the
command/source. The real bug is in displayed tool OUTPUT (read_file,
terminal, execute_code):

_DB_CONNSTR_RE's password group [^@]+ was greedy across newlines, so on a
multi-line block it scanned past the DSN line to the next stray '@' (a
Python @decorator), replacing every intervening character — including line
breaks — with ***. That dropped lines and concatenated the next line onto
the f-string line, making read_file output look corrupted (the file on disk
was always correct). Reported in #33801.

Fix:
- Forbid whitespace in the userinfo/password groups ([^:\s]+ / [^@\s]+) so
  the match can never span a line break. A real DSN password never contains
  whitespace. This alone kills the catastrophic line-dropping.
- Under code_file=True, preserve a password group that is a pure {...} brace
  expression — f"postgresql://{user}:{pass}@{host}" is an f-string template,
  not a live credential. Literal passwords are still masked.
- Pass code_file=True at the terminal and execute_code output redaction call
  sites (file_tools already did) so code-execution output isn't corrupted by
  ENV/JSON/template false positives. Real prefixes, auth headers, JWTs, and
  private keys are still redacted.
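
The effect of forbidding whitespace in the userinfo and password groups can
be reproduced with two simplified patterns. Both regexes below are
hypothetical stand-ins written for this example; the real _DB_CONNSTR_RE is
not shown in this log, and the code_file brace handling is left out.

```python
import re

# Old shape: the password group may run across line breaks to a later '@'.
GREEDY = re.compile(r"(\w+://[^:]+:)([^@]+)(@)")
# New shape: neither group may contain whitespace, so a match stays on one line.
STRICT = re.compile(r"(\w+://[^:\s]+:)([^@\s]+)(@)")


def redact(pattern: "re.Pattern[str]", text: str) -> str:
    """Mask the password group of every connection-string match."""
    return pattern.sub(r"\1***\3", text)


SOURCE = 'url = "postgresql://localhost:5432/app"\n\n@decorator\ndef f(): ...\n'
```

With the greedy pattern, redact(GREEDY, SOURCE) swallows the text between
the port colon and the decorator's '@', including the line breaks. With the
strict pattern SOURCE is returned unchanged, while a literal DSN such as
postgresql://admin:pw@host/db still has its password masked.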

Verified E2E against the reporter's exact pydantic-settings module: file
written verbatim, read_file shows the DSN f-string + @model_validator intact
with zero *** corruption, while a literal postgresql://admin:pw@host DSN and
a real sk- key are still masked.

Reported-by: koishi70
Reported-by: pfrenssen
2026-06-28 01:15:39 -07:00
Teknium
de6e9ac760 docs(discord): document bot-to-bot comms as unsupported (#32791) (#54063)
* docs(discord): document bot-to-bot comms as unsupported (#32791)

Multi-profile bot-to-bot conversation is not a supported topology.
DISCORD_ALLOW_BOTS=none (the default) blocks all bot-originated
messages; setting mentions/all across multiple Hermes profiles to make
them reply to each other ack-loops because Discord's reply auto-mention
satisfies the mention gate every turn. Document the safe default and
the loop hazard so operators don't wire it up.

* docs(discord): infographic for bot-to-bot unsupported stance (#32791)
2026-06-28 01:15:34 -07:00
teknium1
4f16950e9a docs: add infographic for #32421 content-filter fallback fix 2026-06-28 01:15:21 -07:00
teknium1
578e3989d4 fix(agent): route content-filter stream stalls to fallback chain (#32421)
When a provider's output-layer safety filter (MiniMax "output new_sensitive
(1027)", Azure content_filter, etc.) kills a streaming response after deltas
were already sent, interruptible_streaming_api_call swallows the raw error
into a finish_reason=length partial-stream stub. The conversation loop then
burned 3 continuation retries against the SAME primary — re-hitting the
content-deterministic filter every time — and gave up with "Response remained
truncated after 3 continuation attempts", never consulting fallback_providers.

Builds on @595650661's classifier change (cherry-picked) so error_classifier
recognizes the filter; then:
- chat_completion_helpers: run the swallowed error through error_classifier at
  the stub-creation point and stamp _content_filter_terminated on the stub
  (single source of truth — no parallel pattern list).
- conversation_loop: read the tag and activate the fallback chain BEFORE
  burning any continuation retries; roll partial content back to the last
  clean turn and re-issue against the new provider (restart_with_rebuilt_messages).
  Plain network stalls are unaffected (only content_policy_blocked is tagged).

Credits #32479 (@sweetcornna) and #33845 (@Tranquil-Flow) which fixed the
same issue via the stub-tag and loop-escalation approaches respectively.

Live E2E confirmed: before, _try_activate_fallback called 0x; after, fallback
fires on the first stub and the fallback provider completes the turn.
2026-06-28 01:15:21 -07:00
595650661
b8e2268628 fix(agent): add MiniMax 'new_sensitive' to content_policy_blocked patterns
The MiniMax output-layer safety filter surfaces the error verbatim as
`output new_sensitive (1027)` (sometimes with additional provider
wrapping like 'Stream stalled mid tool-call: output new_sensitive (1027)').
When the model emits a large tool-call argument block, the upstream
filter trips and the SSE stream is truncated mid-flight, producing
'stream stalled mid tool-call' errors. Until now this case was
misclassified and retried 3x on the same provider, reproducing the same
refusal and burning paid attempts.

Adding `new_sensitive` to `_CONTENT_POLICY_BLOCKED_PATTERNS` routes
it through the existing is_client_error path: skip 3x retry, activate
configured fallback model immediately, surface a clear provider-safety
message to the user.

Refs #32421
2026-06-28 01:15:21 -07:00
Teknium
c9df4bc094 fix(gateway): default restart_drain_timeout to 0 to kill systemd crash loop (#54066)
A restart now interrupts in-flight agents immediately rather than holding
the gateway open for a grace window. The previous 180s default coupled two
independently-set timers: the gateway's own drain timer and systemd's
TimeoutStopSec. On a stale unit where TimeoutStopSec < drain, systemd
SIGKILLed the gateway mid-cleanup, leaving a stale lock that made the next
startup exit immediately ('already running') — an infinite crash loop under
Restart=on-failure (#31981).

Setting drain to 0 makes the mismatch structurally impossible: with drain 0
the generated unit gets TimeoutStopSec=90 against a near-instant drain, so
systemd never kills mid-cleanup. Contract: restart the gateway, in-flight
work stops. A grace window large enough to 'save' a long agent turn would
have to outlast an unbounded task, which is impossible.

Also fixes the stale-unit warning's suggested command
(hermes gateway service install --replace -> hermes gateway install --force);
the former subcommand does not exist.

Closes #31981
2026-06-28 01:14:34 -07:00
teknium1
0800f1c28b infographic: whatsapp send-queue serialization (#33360) 2026-06-28 01:10:14 -07:00
teknium1
cb9f855c2b test(whatsapp-bridge): drop structural send-queue integration test
The .integration.test.mjs greps bridge.js source text for the queue
wiring — a change-detector that breaks on any benign refactor of the
same code. The behavioral unit test (bridge.sendqueue.test.mjs) already
covers FIFO ordering, error isolation, timeout propagation, and
single-consumer concurrency, which is the contract that matters.
2026-06-28 01:10:14 -07:00
Tranquil-Flow
c393a8e55f fix(whatsapp-bridge): serialize sendMessage to prevent cross-chat contamination (#33360)
Concurrent sock.sendMessage() calls on a single Baileys socket can cause
the WhatsApp protocol-level routing to misdeliver messages — responses
intended for one chat appear in another.

Add a promise-based send queue that serialises all sendMessage() calls
across concurrent HTTP /send, /edit, and /send-media handlers so only
one send is in-flight at a time.

Includes unit tests for queue ordering, error isolation, timeout
propagation, and single-consumer concurrency semantics, plus an
integration check that the queue is wired into sendWithTimeout.
2026-06-28 01:10:14 -07:00
teknium1
1f72ad9be9 refactor(cli): extract interrupt recovery to a testable helper
Pull the #33271 post-interrupt recovery (flush_stdin + _force_full_redraw)
out of process_loop's finally block into _recover_terminal_after_interrupt(),
and replace the inline-logic-copy tests with ones that exercise the real
helper plus a source guard that process_loop still invokes it behind the
_last_turn_interrupted gate.
2026-06-28 01:08:09 -07:00
zccyman
f3aaba7f85 fix(cli): recover terminal state after interrupt to prevent raw control sequence freeze
When the agent is interrupted during processing, prompt_toolkit's
renderer and VT100 input parser can be left in an inconsistent state.
CSI 6n cursor position report responses leak as literal text
(^[[19;1R) and the terminal stops accepting keyboard input.

Fix: in process_loop's finally block, after an interrupted turn:
- flush_stdin() to drain stray escape bytes from the OS input buffer
- _force_full_redraw() to reset prompt_toolkit's renderer cache

Closes #33271
2026-06-28 01:08:09 -07:00
teknium1
2e1b48ed31 chore: map kurlyk local email → skabartem for PR #32867 salvage 2026-06-28 01:08:04 -07:00
kurlyk
def97bcd96 fix: eliminate race condition in OpenAI client replacement
Make check-and-replace atomic in _ensure_primary_openai_client by
keeping both operations under the same lock acquisition. Previously,
the lock was released between detecting a closed client and replacing
it, allowing two threads to simultaneously replace the client.
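
The atomic check-and-replace can be sketched as below. ClientHolder, the
factory callable, and the closed attribute are assumptions for the example;
the actual _ensure_primary_openai_client code is not reproduced here.

```python
import threading
from typing import Any, Callable


class ClientHolder:
    """Hold a client and replace it at most once per closure."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        self._factory = factory
        self._lock = threading.Lock()
        self._client = factory()

    def ensure_client(self) -> Any:
        # The closed check and the replacement share one lock acquisition, so
        # a second thread always sees the client the first thread installed.
        with self._lock:
            if self._client.closed:
                self._client = self._factory()
            return self._client
```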

Fixes #32846

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-28 01:08:04 -07:00
teknium1
4a0fe4e54a docs: add PR infographic for #32762 clarify-expiry fix 2026-06-28 01:07:53 -07:00
teknium1
aacc15b2c9 fix(clarify): raise default clarify_timeout to 3600s (#32762)
The 600s default evicted the gateway clarify entry while users were
still away (meeting/AFK); a later button tap then landed on a dead
entry and the agent hung on 'running: clarify'. Raise the default to
1h in DEFAULT_CONFIG and the get_clarify_timeout() code-level fallback,
documenting the running-agent-guard tradeoff. User overrides still win.
2026-06-28 01:07:53 -07:00
konsisumer
3f543229f2 fix(telegram): notify user when clarify button tap arrives after expiry 2026-06-28 01:07:53 -07:00
Teknium
90d25adc9e fix(gateway): deliver profile-scoped cache media on symlinked HERMES_HOME (#54060)
Generated images under a profile gateway's cache (profiles/<name>/cache/
images/...) were silently dropped from Telegram/Discord delivery when
HERMES_HOME is symlinked under a denied prefix (e.g. /opt/data ->
/root/.hermes) and $HOME is not that prefix. The resolved path lands
under /root (a system denylist prefix), the root-home exception only
fires when the denied prefix IS $HOME, and the static safe-roots list
only covers the active HERMES_HOME's top-level cache — not per-profile
cache dirs. Both gates fail, so validate_media_delivery_path returns
None and the gateway logs 'Skipping unsafe MEDIA directive path'.

_media_delivery_allowed_roots() now also enumerates per-profile cache
roots (<root>/profiles/*/cache/{images,audio,videos,documents,
screenshots}) at check time. Allowlist match runs before the denylist,
so the profile artifact delivers regardless of the /root interaction;
profile-dir credentials (auth.json) stay blocked since they aren't
under a cache subdir.

Reopened regression of #34485/#38108, neither of which covered the
profile-scoped symlink case. Fixes #31733.
2026-06-28 01:07:28 -07:00
sweetcornna
2701ea2f0c fix(agent): reopen fallback chain after primary recovery 2026-06-28 00:57:42 -07:00
teknium1
7b9ff310b6 fix: salvage #33830 for current main — relocate allow_bots bridge to telegram plugin hook, fix stale adapter import in test 2026-06-28 00:57:03 -07:00
sweetcornna
fc70d023d8 fix(telegram): apply bot auth policy to Telegram sources
2026-06-28 00:57:03 -07:00
sweetcornna
002357a83f fix(tui): repump stdin after readable handler errors 2026-06-28 00:53:29 -07:00
teknium1
3a03d03bdc docs: add infographic for #30636 macOS state.db fix 2026-06-28 00:53:19 -07:00
teknium1
52d774f0f9 fix(state): F_FULLFSYNC barrier at WAL checkpoints on macOS (#30636)
On Darwin, synchronous=FULL (the WAL default) only issues a plain
fsync(), which Apple documents does NOT guarantee writes reach stable
storage or stay ordered. SQLite's WAL corruption-safety guarantee
assumes the OS honors the fsync barrier; macOS does not unless the app
uses F_FULLFSYNC. During a launchd *system* shutdown the page cache is
dropped (effectively power-loss for in-flight pages), so a WAL
checkpoint whose fsync 'reported' durable may never hit the platter —
corrupting state.db with a malformed image. That is the trigger in
#30636 ('SIGTERM during launchd shutdown under high load').

Apply PRAGMA checkpoint_fullfsync=1 (macOS-guarded) in
apply_wal_with_fallback. It forces the F_FULLFSYNC barrier only at
checkpoint boundaries (where WAL frames land in the main DB), so cost
amortizes to ~+0.1ms/commit vs ~+4ms for the broader fullfsync=1.
No-op off Darwin (F_FULLFSYNC is macOS-only).
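
A minimal sketch of the platform guard using the standard sqlite3 module.
The function name and the platform parameter are assumptions added so the
guard can be exercised on any host; apply_wal_with_fallback itself is not
shown here.

```python
import sqlite3
import sys


def apply_checkpoint_fullfsync(
    conn: sqlite3.Connection,
    platform: str = sys.platform,
) -> bool:
    """Enable the checkpoint-only F_FULLFSYNC barrier on macOS; no-op elsewhere."""
    if platform != "darwin":
        return False
    # Affects only checkpoint syncs, not every commit (unlike PRAGMA fullfsync).
    conn.execute("PRAGMA checkpoint_fullfsync=1")
    return True
```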

Root-cause analysis by @catapreta on #30636. Supersedes #30654, whose
synchronous=FULL is a no-op (already FULL in WAL mode) and whose
TRUNCATE-on-close is already on main.

Co-authored-by: catapreta <catapreta@users.noreply.github.com>
2026-06-28 00:53:19 -07:00
Gille
9229d0db17 fix(moa): preserve Nous provider identity for references 2026-06-28 00:47:15 -07:00
Teknium
7c38249c79 feat(moa): references see full tool state + fire on every user/tool response (#54016)
The advisory reference view stripped all tool calls and tool results, so
reference models judged a task whose actions and results they never saw — and
references only fired once per user turn, never re-running as the agent's
state advanced through the tool loop.

Two fixes:
- _reference_messages() now PRESERVES the agent's tool calls and tool results,
  rendering them inline as text ([called tool: ...] / [tool result: ...]) so a
  reference gives an informed judgement on the real current state. Still emits
  zero tool-role messages and zero tool_calls arrays (strict providers reject
  those), and large tool results are previewed head+tail (4000-char budget).
  The required end-on-user shape is met by APPENDING a synthetic advisory user
  turn — not by deleting the agent's latest context (which the prior fix did).
- References now re-run on every state change — each new user message AND each
  new tool result — instead of once per user turn. The state-sensitive advisory
  signature drives the cache: new tool result = miss (re-run), identical-state
  re-call = hit (no re-run, no re-emit).
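
The head+tail preview of large tool results could be sketched like this.
The function name, the marker text, and the reading of the 4000-character
budget as a cap on total output length are all assumptions for illustration.

```python
def preview_tool_result(text: str, budget: int = 4000) -> str:
    """Keep the start and end of `text`, eliding the middle past `budget` chars."""
    if len(text) <= budget:
        return text
    marker = "\n[... truncated ...]\n"
    keep = max(budget - len(marker), 0)
    head = keep - keep // 2  # head gets the extra character when keep is odd
    tail = keep // 2
    return text[:head] + marker + (text[-tail:] if tail else "")
```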

The acting aggregator still receives the full, untrimmed transcript.
2026-06-28 00:30:11 -07:00
kshitijk4poor
fc7a01b6cb test+harden: modernize salvaged Matrix path for current plugin layout
Two follow-ups on top of the salvaged #46365 fix:

1. Tests: the salvaged tests injected the ephemeral MatrixAdapter via
   sys.modules["gateway.platforms.matrix"], but Matrix migrated to a plugin
   (#41112) and the fallback now imports from plugins.platforms.matrix.adapter.
   Point the three sys.modules patches at the current module path so the
   ephemeral-fallback tests actually exercise the injected fake adapter.

2. Harden the live-adapter lookup: split the gateway import guard from the
   adapter lookup and log (instead of silently swallowing) when a runner
   exists but adapters.get() raises. A silent fall-through there would
   re-introduce the per-send reconnect/OTK-exhaustion storm this fix exists
   to prevent (#46310). Documented that the live adapter is gateway-owned and
   must not be disconnected, and why the ephemeral finally never touches it.
2026-06-28 12:48:08 +05:30
liuhao1024
a7fd62d824 fix(send_message): reuse live gateway adapter for Matrix media sends
When a live gateway adapter is available (i.e. the tool runs inside a
running gateway), reuse the persistent connection instead of creating a
new MatrixAdapter per call. This eliminates per-message E2EE re-init
storms that exhaust recipient OTKs and silently drop messages.

The fix follows the same pattern as _send_to_platform (line 618):
gateway_runner_ref → runner.adapters[Platform.MATRIX]. Falls back to
the ephemeral connect/disconnect cycle for standalone contexts.

Also extracts the shared send logic into _send_via_matrix_adapter()
to avoid duplicating the media dispatch code between the two paths.

Fixes #46310
2026-06-28 12:48:08 +05:30
Ben Barclay
1466eab4ee test(docker): wait for cont-init to finish before privilege-drop shim tests (#54026)
The docker-exec privilege-drop shim tests started a sleep container and
released the fixture as soon as `docker exec <c> true` returned 0. On
s6-overlay that succeeds almost immediately — ~0.05s in measurement —
long before the `01-hermes-setup` cont-init hook (docker/stage2-hook.sh)
has finished seeding + `chown hermes:hermes` config.yaml and running the
Python config migration (cont-init only fully settles at ~9.8s under
arm64 QEMU emulation).

`test_shim_opt_out_keeps_root` wipes config.yaml, writes it as root with
HERMES_DOCKER_EXEC_AS_ROOT=1, and asserts root:root ownership. When the
fixture released the test inside that ~10s window, stage2-hook's
boot-time `chown hermes:hermes config.yaml` raced the root-written file
and reset it to hermes:hermes — failing the assertion. The window is
invisible on native amd64 (stage2-hook completes in a blink) but wide
open under the arm64 build's QEMU emulation, which is why only build-arm64
flaked while build-amd64 stayed green.

Replace the responsiveness poll with a wait on the canonical
'cont-init finished' signal: $HERMES_HOME/logs/container-boot.log gaining
a `profile=default` line, written by 02-reconcile-profiles which s6 runs
strictly after 01-hermes-setup. Mirrors the readiness pattern already
used in test_container_restart.py. Also bumps the readiness timeout 20s->60s
to cover slow emulation.

No production code change — test-only hardening of a timing race.
2026-06-28 17:06:26 +10:00
Jeffrey Quesnelle
2c9b017696 Merge pull request #54000 from NousResearch/fix/desktop-main-cjs-clobber-stage-simple-git
fix(desktop): stop hermes desktop from clobbering tracked main.cjs
2026-06-28 01:56:51 -04:00
Teknium
4f61d48aef test(cron): deterministically wait for ticker, fix wall-clock flake (#54010)
tests/cron/test_scheduler_provider.py spawned a background ticker thread,
slept a fixed 0.2s, then asserted the loop had called tick()/heartbeat() at
least N times. Under loaded CI the worker thread isn't always scheduled
within that window, so the loop hadn't ticked yet — flaking with 'provider
never called tick()' (assert 0 >= 1).

Add a _wait_until(predicate, timeout) helper and replace all five fixed
time.sleep(0.2) sites with a poll on the actual predicate (calls/beats count
reached). Same contract assertions, no wall-clock dependence.
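
A polling helper in the spirit of _wait_until(predicate, timeout) might look
like the sketch below. The interval parameter and the default values are
assumptions; the helper in the test file may differ.

```python
import time
from typing import Callable


def wait_until(
    predicate: Callable[[], bool],
    timeout: float = 5.0,
    interval: float = 0.01,
) -> bool:
    """Poll `predicate` until it is true or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    # One final check so a condition met right at the deadline still counts.
    return predicate()
```

A test then asserts wait_until(lambda: provider.calls >= 1) instead of
sleeping a fixed 0.2s, where provider.calls is a hypothetical counter.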
2026-06-27 22:52:29 -07:00
Teknium
1fa44180b0 fix(moa): advisory references end on a user turn + get a reference-role system prompt (#54007)
* fix(moa): reference advisory view must end with a user turn

MoA reference calls failed with Anthropic models that don't support
assistant prefill (e.g. Claude Opus 4.8): '400 ... must end with a user
message'. The advisory view built by _reference_messages() kept the last
assistant turn's text while dropping the following tool result, leaving a
trailing assistant turn — which Anthropic (and OpenRouter->Anthropic)
interpret as an assistant prefill to continue. References are advisory and
must end on the user turn they answer.

Strip trailing assistant turns from the advisory view (preserving
intervening ones). Update the existing test that encoded the buggy shape
and add a mid-tool-loop regression test.

* feat(moa): give reference models an advisory-role system prompt

Reference models received the bare trimmed conversation with no role
framing, so they assumed they were the acting agent and refused ("I can't
access repositories/URLs from here") or tried to call tools they don't have.

Prepend a dedicated advisory system prompt to every reference call: the
model is an analyst, not the actor — it cannot execute, should not
apologize for lacking tools, and should reason about the presented state to
advise the aggregator/orchestrator on approach, next steps, tool-use
strategy, risks, and anything the acting agent missed. Its output is private
guidance for the aggregator, not a user-facing answer.
2026-06-27 22:52:25 -07:00
Teknium
2523917680 fix(tests): bare pytest flags pass through run_tests.sh without a '--' separator (#54008)
The parallel runner only forwarded pytest args after a literal '--', so a
bare 'scripts/run_tests.sh tests/foo.py -q' (or -v/-x/-k/--tb=long) errored
out with 'unrecognized arguments'. This contradicted the docstring's
promise that common pytest flags pass through, and forced a retry on every
run that used pytest muscle-memory.

Now any token starting with '-' that isn't one of the runner's own options
(-j/--jobs, --paths, --slice, --file-timeout, --generate-slices, --files,
--include-integration) is routed to each per-file pytest invocation
automatically. Value-taking flags given space-separated (-k expr, -m mark,
-p plugin, -o name=val, etc.) keep their value instead of having it stolen
by positional-path discovery. The explicit '--' separator still works and
stacks with bare flags.

- scripts/run_tests_parallel.py: argv splitter routes bare unknown flags to
  pytest; value-flag lookahead; updated docstring.
- scripts/run_tests.sh: usage comment reflects bare-flag passthrough.
- tests/test_run_tests_parallel.py: 4 behavior-contract tests (bare -q runs,
  -k keeps its value/filters, '--' still works, positional path stays a root).
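
A rough sketch of the splitting rule described above. Which runner options take a value is an assumption here, as is the short list of value-taking pytest flags; the actual splitter in scripts/run_tests_parallel.py may differ.

```python
# Assumption: these runner options consume the next token as their value.
RUNNER_VALUE_OPTS = {"-j", "--jobs", "--paths", "--slice", "--file-timeout", "--files"}
RUNNER_BOOL_OPTS = {"--generate-slices", "--include-integration"}
# Assumption: an illustrative subset of pytest flags that take a separate value.
PYTEST_VALUE_FLAGS = {"-k", "-m", "-p", "-o"}


def split_argv(argv):
    """Split argv into (runner_args, pytest_args, positional_paths)."""
    runner_args, pytest_args, paths = [], [], []
    i = 0
    while i < len(argv):
        tok = argv[i]
        if tok == "--":
            # Everything after the explicit separator goes to pytest.
            pytest_args.extend(argv[i + 1:])
            break
        name = tok.split("=", 1)[0]
        if name in RUNNER_VALUE_OPTS:
            runner_args.append(tok)
            if "=" not in tok and i + 1 < len(argv):
                i += 1
                runner_args.append(argv[i])
        elif name in RUNNER_BOOL_OPTS:
            runner_args.append(tok)
        elif tok.startswith("-"):
            pytest_args.append(tok)
            # Lookahead: keep the flag's value away from path discovery.
            if tok in PYTEST_VALUE_FLAGS and i + 1 < len(argv):
                i += 1
                pytest_args.append(argv[i])
        else:
            paths.append(tok)
        i += 1
    return runner_args, pytest_args, paths


runner, forwarded, roots = split_argv(
    ["tests/foo.py", "-q", "-k", "cron and not slow", "-j", "4", "--", "--tb=long"]
)
```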
2026-06-27 22:43:26 -07:00
emozilla
2d206a3a42 fix(desktop): stop hermes desktop from clobbering tracked main.cjs (#52735)
`npm run build` ended with `bundle-electron-main.mjs`, which esbuild-bundled
electron/main.cjs and renamed the bundle on top of the tracked source file.
Because every `hermes desktop` runs `npm run build`, each launch rewrote a
checked-in source file (~7.5k-line source -> ~14.8k-line bundle), dirtying the
working tree with a build artifact that `git restore` couldn't keep out (the next
launch re-clobbered it) and forcing autostash/restore conflicts on update.

The bundle only existed to inline `simple-git` so the packaged app.asar (which
ships no node_modules) wouldn't crash at launch with "Cannot find module
'simple-git'". Replace it with the mechanism the repo already uses for the
other hoisted runtime dep (node-pty): stage the dependency closure and resolve
it from process.resourcesPath at runtime.

- stage-native-deps.cjs: resolve simple-git's runtime closure (walking
  dependencies + optionalDependencies, so a version bump that adds a transitive
  dep can't silently reintroduce the crash) and stage it under
  build/native-deps/vendor/node_modules/. The `vendor/` nesting is load-bearing:
  electron-builder drops a node_modules dir at the ROOT of an extraResources
  copy but keeps a nested one.
- git-review-ops.cjs: fall back to the staged
  native-deps/vendor/node_modules/simple-git when the hoisted require() fails;
  dev runs resolve the hoisted copy and never hit the fallback.
- package.json: drop the bundler from the `build` script so main.cjs is never a
  build target again.
- nix/desktop.nix: drop the direct bundler call (the closure rides the existing
  `cp -rn native-deps` into $out) and patch process.resourcesPath in
  git-review-ops.cjs alongside main.cjs.
- delete scripts/bundle-electron-main.mjs.

Verified: electron-builder's own file filter keeps the full staged closure
(0 dropped), and a packaged win-unpacked build launches with the git-review
pane resolving simple-git from the staged vendor path.
2026-06-28 01:30:09 -04:00
teknium1
c918d42d88 feat(desktop): config-driven Electron launch flags + GPU policy
Adds a desktop: section to config.yaml so headless/VM users can make
`hermes desktop` launch correctly without a wrapper command:

- desktop.electron_flags: extra Electron CLI flags (e.g. --ozone-platform=x11)
  appended to every launch. Accepts a list or a shell-split string.
- desktop.disable_gpu: auto|true|false, bridged to the HERMES_DESKTOP_DISABLE_GPU
  env var the Electron app already reads. An explicit env var still wins.

cmd_gui() reads these via _desktop_launch_options() and applies them. This is
the config.yaml form of the capability proposed as a raw env var in #38934
(@1RB) — behavioral settings belong in config.yaml, not a new HERMES_* env var.
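
The "list or shell-split string" handling for desktop.electron_flags might be normalized along these lines. The helper name is hypothetical and is not the `_desktop_launch_options()` implementation.

```python
import shlex


def normalize_electron_flags(value):
    """Accept None, a shell-style string, or a list; return a list of flags."""
    if value is None:
        return []
    if isinstance(value, str):
        # shlex.split honours quoting, e.g. '--foo="a b"' stays one flag.
        return shlex.split(value)
    return [str(flag) for flag in value]


from_string = normalize_electron_flags("--ozone-platform=x11 --no-sandbox")
from_list = normalize_electron_flags(["--ozone-platform=x11"])
```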

Co-authored-by: ray <86501179+1RB@users.noreply.github.com>
2026-06-27 22:26:43 -07:00
Teknium
1b70a91844 docs: third-party-product plugins ship standalone, not into core tree (#54001)
* docs: third-party-product plugins ship standalone, not into core tree

Generalizes the closed-set memory-provider policy to any plugin that
integrates someone else's product/project (observability backends,
vendor SaaS, analytics dashboards, paid-service tie-ins). These create
an open-ended maintenance burden on us for backends we don't own, so
they ship as standalone plugin repos installed into ~/.hermes/plugins/
and are promoted in #plugins-skills-and-skins — not merged into core.

- AGENTS.md: new 'what we don't want' bullet + generalized policy note
  beside the memory-provider closed-set rule
- CONTRIBUTING.md: new 'Third-Party Product Integrations' section
- build-a-hermes-plugin.md: caution callout at the top of the guide

It's a coupling decision, not a quality bar: a plugin can clear review
and still be closed unmerged.

* docs: add infographic for standalone-plugin policy
2026-06-27 22:23:50 -07:00
Rafael Millan
54ea059919 fix: fall back to no-sandbox for desktop launch on restricted Linux hosts 2026-06-27 22:16:20 -07:00
teknium1
97640fd9ad fix(desktop): reserve WCO width on plain Linux + author map
The plain-Linux overlay re-enable (#53185) left nativeOverlayWidth() at 0
for plain Linux, so the native min/max/close buttons painted on top of the
app's right-edge titlebar tools. Reserve the fallback width everywhere the
WCO overlay is painted (Windows, WSLg, plain Linux); macOS still reserves 0
since it uses traffic lights.
2026-06-27 22:05:33 -07:00
Chris Wesley
8194dbf612 fix(desktop): re-enable titleBarOverlay on plain Linux
Commit da5484b61 disabled the Window Controls Overlay on all Linux
(non-Windows, non-WSL) with the note that WCO is a Windows/macOS-only
Electron feature. However, several Linux compositors (KDE/KWin,
GNOME/Mutter) do support it — plain Electron titleBarOverlay paints
native min/max/close buttons that were working before that change.

Narrow the exclusion to only WSLg, where the RDP host draws its own
window controls and an Electron overlay would leave a dead gap.

Fixes: da5484b61 ("fix(desktop): WSL2 clipboard image paste + Linux titlebar overlay")
2026-06-27 22:05:33 -07:00
teknium1
9c7f9f9502 infographic: partial-stream recovery fix (salvage #41498) 2026-06-27 22:03:14 -07:00
infinitycrew39
1fa46570fb test(agent,gateway): cover partial-stream recovery and restart helper salvage 2026-06-27 22:03:14 -07:00
infinitycrew39
e860a40e14 fix(agent,gateway): surface partial-stream recovery and bound detached restart
Salvage of NousResearch/hermes-agent#41498 (0-CYBERDYNE-SYSTEMS-0).

- Leave response_previewed false on partial_stream_recovery so gateway
  fallback delivery can send the recovered fragment plus explanation.
- Always append the turn-completion explainer for partial_stream_recovery,
  not only for empty or very short fragments (#34452 gap).
- Launch the detached /restart helper before drain, idempotently, with a
  bounded wait of restart_drain_timeout + 5s.
2026-06-27 22:03:14 -07:00
Teknium
e3c9924b8b fix(cli): correct stale hermes auth login nous hints to hermes auth add nous (#53929)
* fix(cli): correct stale `hermes auth login nous` hints to `hermes auth add nous`

There is no `hermes auth login` subcommand — valid auth verbs are
add/list/remove/reset/status/logout/spotify. Six user-facing strings told
users to run `hermes auth login nous`, which fails with
`invalid choice: 'login'` — the same broken-hint class reported in #28089
for the proxy flow (already fixed there to `hermes auth add nous`).

Sites corrected to `hermes auth add nous`:
- hermes_cli/dashboard_register.py (401 retry hint, not-logged-in hint)
- hermes_cli/gateway_enroll.py (401 retry hint, not-logged-in hint)
- cli-config.yaml.example (two provider-requirement comments)

* docs(infographic): auth login nous hint fix
2026-06-27 21:30:37 -07:00
Teknium
4626ceb747 fix(gateway): only offer system-scope gateway install to root sessions (#53975)
Non-root users picking 'System service' in the setup wizard were handed a
'sudo hermes gateway install --system --run-as-user <you>' recipe that fails
on most distros: sudo's secure_path strips ~/.local/bin (pipx/uv installs),
so 'sudo hermes' is command-not-found. Worse, it funnels a non-root user
toward a system install they shouldn't be doing from a user session.

Now prompt_linux_gateway_install_scope() only offers system scope when
os.geteuid()==0. Non-root sessions get user-service or skip, with a tip to
re-run as root for a boot service. The non-root branch in
install_linux_gateway_from_setup becomes a defensive guard that refuses
without printing any self-elevation recipe. Gated the matching deferral hint
in setup.py behind root too.
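
As an illustration of the gating only (the real prompt_linux_gateway_install_scope() also handles the interactive prompt), the offered choices could be computed like this; the function name and scope labels are assumptions:

```python
import os


def offered_gateway_scopes(euid=None):
    """Return the install scopes to offer; system scope is root-only."""
    if euid is None:
        euid = os.geteuid()  # POSIX only; callers may pass euid explicitly
    scopes = ["user", "skip"]
    if euid == 0:
        scopes.insert(0, "system")
    return scopes
```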
2026-06-27 21:24:08 -07:00
teknium1
b304023fc6 docs(infographic): model picker fixes (#49129 + #51488) 2026-06-27 21:23:25 -07:00
teknium1
c72d68715f chore(release): map salvaged contributor emails for #49129 and #51488 2026-06-27 21:23:25 -07:00
Priyanshu Sharma
f6deabca0d fix(gateway): clear stale base_url on model switches 2026-06-27 21:23:25 -07:00
teknium1
f54c52800a fix(models): scope live-first picker merge to opencode aggregators only
Follow-up to the salvaged #49129 commit. The original change flipped the
shared generic-provider merge in provider_model_ids() to live-first
unconditionally, which regressed curated-first for single providers
(kimi/zai, #46309) — and the PR encoded that regression by flipping the
kimi-coding and zai test assertions to expect live-first.

Gate live-first on an explicit _LIVE_FIRST_PICKER_PROVIDERS set
({opencode-zen, opencode-go}); every other provider keeps curated-first.
Also widen the uncapped picker + live-first sets to opencode-go, which has
the same 70+ model catalog problem as opencode-zen. Restore the
kimi-coding curated-first test and rewrite the merge-order test to assert
the per-provider contract.
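
The per-provider merge contract can be sketched as follows. The set contents come from the message above; the function name, the dedupe strategy, and the sample model ids are illustrative assumptions.

```python
_LIVE_FIRST_PICKER_PROVIDERS = {"opencode-zen", "opencode-go"}


def merge_model_ids(provider, curated, live):
    """Merge curated and live model ids, ordering by the provider's contract."""
    if provider in _LIVE_FIRST_PICKER_PROVIDERS:
        first, second = live, curated
    else:
        first, second = curated, live
    merged, seen = [], set()
    for model_id in [*first, *second]:
        if model_id not in seen:
            seen.add(model_id)
            merged.append(model_id)
    return merged


aggregator = merge_model_ids("opencode-zen", ["a", "b"], ["c", "a"])
single = merge_model_ids("kimi-coding", ["a", "b"], ["c", "a"])
```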
2026-06-27 21:23:25 -07:00
Afnath Ahamed
f98ffbc246 fix(models): live-first merge + update opencode-zen catalog + uncap aggregator picker 2026-06-27 21:23:25 -07:00
teknium1
2e7e600eaa chore(release): map HexLab98 author for PR #53863 salvage 2026-06-27 21:22:49 -07:00
HexLab98
04ff4d9b54 test(auxiliary): cover env-only proxy policy for auxiliary clients (#53702) 2026-06-27 21:22:49 -07:00
HexLab98
073847c0f2 fix(auxiliary): use env-only proxy policy for OpenAI SDK clients (#53702)
Auxiliary clients now inject a keepalive httpx transport with explicit
HTTPS_PROXY/NO_PROXY resolution, matching the main agent. This avoids
macOS system proxy settings (which omit the ExceptionsList) breaking
vision and other auxiliary calls to internal provider endpoints.
2026-06-27 21:22:49 -07:00
Teknium
3b23a984b5 feat(kanban): stamp handoff freshness so workers don't read stale state as current (#53973)
Multi-agent boards leak staleness: a sibling worker's parent handoff,
comment, or prior-attempt summary gets read by the next worker as live
truth even when it's a day old. build_worker_context surfaced the text
with (at best) a bare absolute timestamp, which an LLM reads as fact
regardless of age — parent results had no timestamp at all.

Adds a coarse relative-age stamp (just now / 18h ago / 3d ago) to every
recalled-state line and a one-line 'point-in-time snapshot, re-verify
against source' frame on the parent-results section, so the worker sees
when handoffs were produced and re-checks stale ones before acting.
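
A coarse relative-age stamp of this kind could be computed as below. The bucket boundaries and the minutes bucket are assumptions; only the "just now / 18h ago / 3d ago" shapes come from the message.

```python
def relative_age(age_seconds):
    """Render an age in seconds as a coarse label such as '18h ago'."""
    if age_seconds < 60:
        return "just now"
    if age_seconds < 3600:
        return f"{int(age_seconds // 60)}m ago"
    if age_seconds < 86400:
        return f"{int(age_seconds // 3600)}h ago"
    return f"{int(age_seconds // 86400)}d ago"
```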
2026-06-27 21:21:54 -07:00
Teknium
131c9c542c test(tui-gateway): stop deferred-resume build thread leaking into next test
test_session_resume_uses_parent_lineage_for_display resumes via the
deferred (non-eager) path, which fires a 50ms background Timer
(_schedule_agent_build) calling whatever server._make_agent is patched
in at that moment. The timer outlived the test and landed in the next
test's (_follows_compression_tip) _make_agent mock, racily setting
agent_session_id='tip' and flaking 'assert tip == cont_tip' on CI.

Root-cause fix: stub _schedule_agent_build to a no-op in the leaking
test (it only asserts display history). Defense in depth: the victim's
fake_make_agent now setdefault()s so a stray late build can't overwrite
the synchronous eager build's captured id.
2026-06-27 21:07:53 -07:00
Teknium
e418605450 test(24996): freeze monotonic clock to de-flake fallback cooldown timing
The exhaustion-cooldown timing assertions relied on a wall-clock budget
(before + window + 1.0s). On loaded CI runners the activation calls could
exceed the 1s slack, flaking 'Run tests slice 4/8'. Freeze
chat_completion_helpers.time.monotonic so the cooldown math is exact and
load-independent across all four tests.
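
A small sketch of the clock-freezing technique using unittest.mock; `arm_cooldown` is a stand-in for the real cooldown code, which is not shown here.

```python
import time
from unittest import mock


def arm_cooldown(window_seconds):
    """Stand-in for cooldown code: return the monotonic deadline."""
    return time.monotonic() + window_seconds


# With the clock frozen, the deadline is exact regardless of runner load.
with mock.patch("time.monotonic", return_value=1000.0):
    before = time.monotonic()
    until = arm_cooldown(5.0)
```

Because `before` and `until` are computed from the same frozen value, the test can assert equality instead of a slack window.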
2026-06-27 21:07:53 -07:00
teknium1
1ad8b44413 docs(infographic): skill sync external_dirs shadow fix 2026-06-27 21:07:53 -07:00
zccyman
db11849c9d fix(skills): skip shadowing when external_dirs provides the skill
Fixes #28126. sync_skills() was unconditionally writing bundled skills
into the local <profile_home>/skills/ tree even when the profile's
config.yaml delegated skill resolution to an external directory
via skills.external_dirs. The skill loader then saw two candidates
for the same name (local shadow + external canonical), refused to
resolve on collision, and every worker that auto-loaded such a skill
crashed with 'Unknown skill(s): <name>'.

Changes:
- _build_external_skill_index() indexes skills available in external
  dirs (by directory name and frontmatter name)
- sync_skills() skips writing a bundled skill when it finds the same
  name in the external index; records the hash in the manifest so
  subsequent syncs treat it as already handled
- Self-healing: removes stale local shadows left by prior buggy syncs
  (only when origin_hash == bundled_hash == user_hash, i.e. we wrote
  it and user didn't touch it)
- New 'shadowed_by_external' key in sync_skills() return dict

3 new tests in TestExternalDirsIndexing (all passing).
All 48 tests in test_skills_sync.py pass.

Closes #28126
2026-06-27 21:07:53 -07:00
Teknium
a8c862900b fix(tui): sanitize replay history on WebUI/TUI session resume (#29086) (#53939)
A WebUI/TUI session whose last turn died mid-tool-loop (stale-timeout kill,
interrupt, or process restart before the tool result was written) persists a
dangling assistant(tool_calls) or interrupted assistant->tool tail. The
messaging gateway already strips these tails before replay (the #49201 fix),
but the TUI/WebUI resume path fed db.get_messages_as_conversation() straight
in as the agent's conversation_history with no cleanup. The model re-issued
the unanswered call on every resume -- including after a full WebUI + Gateway
restart, since the poison lives in the SessionDB, not memory -- leaving the
session permanently 'thinking'. Only deleting the session recovered it.

- Extract the two strippers + helper from gateway/run.py into a shared
  agent/replay_cleanup.py (sanitize_replay_history wraps both).
- gateway/run.py re-exports under the historical private names; messaging
  behavior unchanged.
- Both TUI cold-resume sites now sanitize the model-fed history while leaving
  the display transcript untouched, so the user still sees their full history.

Verified E2E against a real SessionDB: dangling and interrupted tails are
stripped from the model feed, healthy mid-progress tool sequences are
preserved, and the display transcript is always the full raw history.
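
To illustrate the dangling-tail case only, here is a simplified sketch. It is not the agent/replay_cleanup.py implementation: the function name and the OpenAI-style message shape (`tool_calls` entries with `id`, tool results with `tool_call_id`) are assumptions.

```python
def strip_dangling_tool_tail(messages):
    """Drop a trailing assistant(tool_calls) turn whose calls were never answered.

    Only the tail is inspected; complete tool sequences are left untouched.
    """
    cleaned = list(messages)
    for idx in range(len(cleaned) - 1, -1, -1):
        msg = cleaned[idx]
        if msg.get("role") == "assistant" and msg.get("tool_calls"):
            expected = {call["id"] for call in msg["tool_calls"]}
            answered = {
                m.get("tool_call_id")
                for m in cleaned[idx + 1:]
                if m.get("role") == "tool"
            }
            # Truncate at the assistant turn if any call lacks a result.
            return cleaned if expected <= answered else cleaned[:idx]
        if msg.get("role") != "tool":
            return cleaned  # tail ends on a normal turn; nothing to strip
    return cleaned


poisoned = [
    {"role": "user", "content": "run the check"},
    {"role": "assistant", "tool_calls": [{"id": "call_1"}]},
]
healthy = poisoned + [{"role": "tool", "tool_call_id": "call_1", "content": "ok"}]
```

The display transcript would still be built from the raw history; only the copy fed to the model is passed through a cleanup like this.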
2026-06-27 20:56:49 -07:00
Teknium
f03823014b fix(telegram): kill 409 polling conflict loop by disarming PTB retry synchronously (#53941)
Telegram polling entered a self-inflicted ~31s loop of 409 Conflict ->
retry -> resume -> Conflict. The error_callback PTB invokes synchronously
inside its internal network_retry_loop only scheduled our async recovery
task (loop.create_task) and returned, so PTB kept polling getUpdates on its
own while our handler concurrently ran stop -> sleep -> start_polling. The
two polling sessions overlapped and Telegram returned a fresh 409.

Fix: in the conflict branch of the error_callback, synchronously set PTB's
private polling stop_event before scheduling recovery. PTB's loop exits on
its next tick (it races that event in do_action), so our handler owns
polling alone. The handler's await updater.stop() drains the task and PTB
clears the event, so the subsequent start_polling() builds a fresh event
and is not poisoned.

Keeps the existing reconnect ladder intact (option B) — fixes only the
race. Defensive: probes mangled + unmangled stop_event spellings and no-ops
(prior behaviour) if neither exists; never flips _running, which would make
the handler skip stop() and leave the loop wedged.
2026-06-27 20:46:08 -07:00
Teknium
d43e0cf304 fix(agent): config-driven intent-ack continuation for all api_modes (#27881) (#53943)
* fix(agent): config-driven intent-ack continuation for all api_modes (#27881)

The agent could end a turn after only stating intent ('I will run a health
check...') without executing the announced tool call, forcing the user to
re-prompt. A continuation guard that catches this and nudges the model to
proceed already existed but was hard-gated to the codex_responses api_mode,
so Gemini/Claude/OpenRouter turns never benefited.

- New agent.intent_ack_continuation config (default 'auto' = codex-only,
  byte-stable for existing conversations). 'true'/model-list opts every
  api_mode in; 'false' disables. Mirrors agent.tool_use_enforcement's shape.
- looks_like_codex_intermediate_ack gains require_workspace (default True).
  The opted-in path drops the codebase/filesystem requirement so general
  autonomous workflows (server ops, deploys, API calls) are caught, not just
  coding tasks. Future-ack + action-verb + short-content + no-prior-tool
  guards still apply; the 2-nudge-per-turn cap is unchanged.
- Resolution centralized in intent_ack_continuation_mode (off/codex_only/all).
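
A hedged sketch of how the config value might map onto the three modes. The function names and signatures are hypothetical, and the model-list form is left out because its matching rules are not spelled out in this message.

```python
def resolve_intent_ack_mode(setting):
    """Map a config value to 'off', 'codex_only', or 'all' (list form omitted)."""
    if setting is None:
        return "codex_only"  # default is 'auto'
    text = str(setting).strip().lower()
    if text == "auto":
        return "codex_only"
    if text == "true":
        return "all"
    if text == "false":
        return "off"
    raise ValueError(f"unsupported intent_ack_continuation value: {setting!r}")


def guard_applies(mode, api_mode):
    """Decide whether the continuation guard runs for this api_mode."""
    if mode == "all":
        return True
    return mode == "codex_only" and api_mode == "codex_responses"
```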

* docs(infographic): intent-ack continuation (#27881)
2026-06-27 20:46:00 -07:00
Teknium
56abbaeac3 fix(curator): fail closed on unverified skill deletes during consolidation (#53935)
The curator's LLM consolidation pass could archive whole clusters of
active skills with zero verified consolidations (#29912): a bare prune
(skill_manage delete with absorbed_into empty/omitted) from the forked
review agent was accepted, removing the skill's name from lookup even
though counts.consolidated_this_run was 0.

- _delete_skill now fails closed during the curator/background-review
  pass: a delete is only allowed when it declares a verified
  consolidation (absorbed_into=<umbrella>, umbrella must exist). A prune
  with no forwarding target is refused; the skill stays active. The
  deterministic inactivity prune (archive_skill) is unaffected.
- A verified consolidation delete during the curator pass now routes
  through the recoverable archive primitive instead of shutil.rmtree, so
  a misjudged consolidation can be undone with hermes curator restore.
  The usage record is kept (state=archived) rather than forgotten.
- Foreground, user-directed deletes keep their existing hard-delete
  semantics.
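
The fail-closed rule can be summarized in a few lines. This is a sketch with a hypothetical function name and arguments, not the `_delete_skill` code itself.

```python
def curator_delete_allowed(absorbed_into, existing_skills, curator_pass):
    """Return True if a delete may proceed under the rule described above."""
    if not curator_pass:
        return True  # foreground, user-directed deletes are unaffected
    # During the curator pass a delete must name an umbrella skill that exists.
    return bool(absorbed_into) and absorbed_into in existing_skills


skills = {"git-workflows", "docker-basics"}
```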
2026-06-27 20:45:57 -07:00
konsisumer
11b0be8d15 fix(gateway): avoid Matrix pending invite boot loops 2026-06-27 20:45:51 -07:00
teknium1
a1ac6baac4 fix(gateway): make bg-process reset TTL configurable + surface session-scoped processes
Follow-up to the cherry-picked #29212 (#29177):

- Promote the 24h stale-process threshold to config.yaml
  (session_reset.bg_process_max_age_hours) instead of a hardcoded
  constant. 0 disables the cutoff (legacy: any live process blocks reset).
  Wired through GatewayConfig.default_reset_policy in gateway/run.py.
- Bug 2: process(action=list) now resolves the gateway session_key from
  the contextvar and surfaces session-scoped background processes (a
  forgotten preview server under a different task), flagged
  session_scoped — so the agent/user can discover and kill the blocker.
  Previously the task-scoped list returned [] and the blocker was invisible.
- Tests: config round-trip for the new field, cross-task list visibility.
- Docs: messaging session-reset section.
2026-06-27 20:45:43 -07:00
annguyenNous
33d8b66d5b fix: stale background processes no longer permanently block session reset
Background processes (e.g. http.server preview) that Hermes starts and
forgets about previously blocked session idle/daily reset indefinitely.
The reset guard in session.py checked has_active_for_session() with no
max age — a 3-day-old preview server blocked reset the same as a task
started 30 seconds ago.

Changes:
- Add max_active_age parameter to has_active_for_session() in
  process_registry.py. Processes older than this threshold are ignored.
- Add MAX_ACTIVE_PROCESS_AGE constant (24h / 86400s).
- Wire max_active_age into the gateway's session store callback in
  run.py so stale processes no longer block session lifecycle.
- Add debug logging when reset is skipped due to active processes.
- Add 3 tests covering recent, stale, and legacy (None) max age.
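
A simplified sketch of the age cutoff. The `TrackedProcess` record, the `processes` and `now` parameters, and the free-function form are assumptions; the real has_active_for_session() lives on the process registry.

```python
import time
from dataclasses import dataclass

MAX_ACTIVE_PROCESS_AGE = 24 * 60 * 60  # 24h, in seconds


@dataclass
class TrackedProcess:
    session_key: str
    started_at: float
    running: bool = True


def has_active_for_session(processes, session_key, max_active_age=None, now=None):
    """True if the session has a running process younger than the cutoff.

    max_active_age=None keeps the legacy behaviour: any live process counts.
    """
    now = time.time() if now is None else now
    for proc in processes:
        if proc.session_key != session_key or not proc.running:
            continue
        if max_active_age is not None and now - proc.started_at > max_active_age:
            continue  # stale: ignore for reset purposes
        return True
    return False


procs = [TrackedProcess("s1", started_at=0.0)]
three_days = 3 * 86400.0
```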

Fixes #29177
2026-06-27 20:45:43 -07:00
teknium1
8c8967a50b fix: defer hermes_subprocess_env import in browser_tool
The module-level import broke tests/tools/test_managed_browserbase_and_modal.py,
which loads browser_tool.py via spec_from_file_location against a stubbed
'tools' package that does not include tools.environments.local. Move the import
into a _build_browser_env() helper called at the two agent-browser spawn sites,
matching the lazy-import pattern already used by lazy_deps.py.
2026-06-27 20:45:31 -07:00
teknium1
9c6229ce24 fix(security): centralize credential-safe subprocess env (#29157)
Subprocesses spawned outside the terminal/execute_code path (agent-browser,
copilot ACP, dep-ensure, lazy_deps uv install, TUI Node host, cli.exec)
inherited the operator's full credential environment via os.environ.copy().
The terminal path was already scrubbed by _HERMES_PROVIDER_ENV_BLOCKLIST
(#1002/#1264/#32314); these spawn sites bypassed it.

Adds hermes_subprocess_env(inherit_credentials=) in tools/environments/local.py
reusing the existing dynamic blocklist as the single source of truth:

  - Tier 1 (_ALWAYS_STRIP_KEYS): gateway bot tokens, GitHub auth, infra
    secrets -- stripped even for credential-inheriting children.
  - Tier 2 (_HERMES_PROVIDER_ENV_BLOCKLIST): provider/tool keys -- stripped
    unless inherit_credentials=True. The opt-in is grep-able for audit.

Browser worker keeps a _BROWSER_PASSTHROUGH_KEYS allowlist (BROWSERBASE/
FIRECRAWL) re-added after the strip. Model-driving children (ACP, TUI Node
host, cli.exec) use inherit_credentials=True so they still get provider keys
while losing Tier-1 secrets. Installers (dep-ensure, lazy_deps) inherit
nothing sensitive. cua_backend already routed through _sanitize_subprocess_env
on main -- left as-is. Gateway adapter utility spawns (gh pr comment, ffmpeg)
are left inheriting env: gh needs GH_TOKEN by design, ffmpeg is a trusted
system binary -- no untrusted-dependency exposure.
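
The two-tier policy can be sketched as below. The key names in both sets are illustrative placeholders rather than the real blocklist contents, and `sketch_subprocess_env` with its `passthrough` and `base_env` parameters is a hypothetical stand-in for hermes_subprocess_env().

```python
import os

# Illustrative placeholders; the real sets are defined in the codebase.
_ALWAYS_STRIP_KEYS = {"TELEGRAM_BOT_TOKEN", "GH_TOKEN"}
_PROVIDER_BLOCKLIST = {"OPENAI_API_KEY", "ANTHROPIC_API_KEY", "BROWSERBASE_API_KEY"}


def sketch_subprocess_env(inherit_credentials=False, passthrough=(), base_env=None):
    """Build a child environment with tiered credential stripping."""
    source = dict(os.environ if base_env is None else base_env)
    # Tier 1 is always removed, even for credential-inheriting children.
    blocked = set(_ALWAYS_STRIP_KEYS)
    if not inherit_credentials:
        blocked |= _PROVIDER_BLOCKLIST  # Tier 2
    env = {k: v for k, v in source.items() if k not in blocked}
    # Allowlisted keys are re-added after the strip.
    for key in passthrough:
        if key in source:
            env[key] = source[key]
    return env


parent = {
    "PATH": "/usr/bin",
    "GH_TOKEN": "x",
    "OPENAI_API_KEY": "y",
    "BROWSERBASE_API_KEY": "z",
}
installer_env = sketch_subprocess_env(base_env=parent)
model_env = sketch_subprocess_env(inherit_credentials=True, base_env=parent)
browser_env = sketch_subprocess_env(passthrough=("BROWSERBASE_API_KEY",), base_env=parent)
```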

This is defense-in-depth (personal-assistant trust model: same-user spawns),
making the existing scrub policy uniform across the spawn surface; the main
real payoff is shrinking the blast radius if a transitive npm dep in
agent-browser is compromised.

Reconstructed on current main from the design in #31959 (Tranquil-Flow);
also credits #39003 (rodboev), #37843 (coygeek), #35769 (egilewski).

Co-authored-by: Tranquil-Flow <tranquil_flow@protonmail.com>
Co-authored-by: rodboev <rod.boev@gmail.com>
Co-authored-by: egilewski <egilewski@egilewski.com>
2026-06-27 20:45:31 -07:00
Hermes Agent
88b3d8638e test: de-flake SIGKILL-tree, compression-tip resume, and fallback-cooldown tests
Three CI flakes hit while landing the credential-pool restore fix; all three
were timing/wall-clock races in the tests, not product bugs (each passes
locally and the assertions are correct):

- test_entire_tree_is_sigkilled_not_just_parent: _terminate_host_pid SIGKILLs
  synchronously, but the test's 4s budget after a 1s in-function SIGTERM grace
  left almost no slack for the kernel to tear down 3 processes + reparent the
  children to zombies under loaded-CI scheduling. Widen the wait to 15s and
  make the liveness predicate tolerant of vanished-pid / zombie races. The
  assertion never weakens: every tree member must end up dead or zombie.

- test_session_resume_follows_compression_tip: appended messages got
  time.time() timestamps (~now) while the test forced session started_at into
  the past, so the get_compression_tip MAX(m.timestamp) tiebreaker depended on
  wall-clock ordering. Pass explicit, well-separated message timestamps so the
  chain resolution is deterministic by construction.

- test_non_retryable_exhaustion_arms_cooldown: asserted the short (5s)
  exhaustion cooldown with a tight +1.0s slack, which false-fails when
  wall-clock jitter between the 'before' snapshot and the cooldown computation
  exceeds a second on a loaded runner. Widen to +30s — still cleanly below the
  60s rate-limit window it must distinguish from.
2026-06-27 20:04:45 -07:00
Jack Maloney
f0de4c6a47 fix(pool): re-select from credential pool on primary runtime restore
_restore_primary_runtime restored the construction-time api_key snapshot and
never consulted the credential pool. After the pool rotated away from a
revoked/exhausted entry mid-session, every new turn restored the dead key,
re-failed instantly, burned the remaining entries, and fell through to
cross-provider fallback.

After restoring the snapshot, re-select the pool's current best entry and
swap the live credential in via _swap_credential (which already rebuilds the
OpenAI/Anthropic client, reapplies base-url headers, and carries the #33163
base_url / OAuth-detection fixes). Falls back to the snapshot key when the
pool is absent, empty, or the entry has no usable key.

Salvaged from #25206 onto current main: the original targeted the pre-refactor
monolithic method in run_agent.py; the logic now lives in
agent/agent_runtime_helpers.py and is collapsed onto _swap_credential instead
of re-inlining the client rebuild.

Fixes #25205
2026-06-27 20:04:45 -07:00
teknium1
a590c5efdc docs: add infographic for provider-precedence fix (#29285) 2026-06-27 19:49:02 -07:00
kshitijk4poor
2af1678bfc fix(auth): explicit provider intent beats stale OAuth active_provider (#29285)
`resolve_provider("auto")` checked `auth.json` `active_provider` BEFORE the
config.yaml `model.provider` and env-var API-key checks. So a user who was
OAuth-logged-into one provider (e.g. Anthropic) but had set an explicit
`model.provider` or exported an API key (e.g. `OPENAI_API_KEY`) was silently
routed to the stale OAuth provider — the override was invisible and surprising.

Reorder the auto-path so explicit intent wins (the order the issue asks for):

  1. explicit CLI api_key/base_url
  2. config.yaml `model.provider`            (safety net — see below)
  3. OPENAI_API_KEY / OPENROUTER_API_KEY env
  4. OpenRouter credential pool
  5. provider-specific API-key env vars
  6. auth.json `active_provider` (OAuth)      ← demoted to last-resort
  7. AWS Bedrock credential chain
  8. error

`active_provider` is still honored — it's just a last-resort fallback chosen
only when the user expressed no other preference, instead of overriding one.
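
The ordering above can be expressed as a first-match chain. This sketch returns which source won rather than a provider name, since the mapping from source to provider is not given here; the function name and all parameters are hypothetical.

```python
def resolve_auto_source(
    cli_api_key=None,
    cli_base_url=None,
    config_provider=None,
    env=None,
    openrouter_pool_ready=False,
    provider_env_keys=(),
    active_provider=None,
    bedrock_ready=False,
):
    """Return (source, detail) for the first precedence step that matches."""
    env = env or {}
    if cli_api_key or cli_base_url:
        return ("cli", None)
    if config_provider:
        return ("config", config_provider)
    for key in ("OPENAI_API_KEY", "OPENROUTER_API_KEY"):
        if env.get(key):
            return ("env", key)
    if openrouter_pool_ready:
        return ("pool", "openrouter")
    for key in provider_env_keys:
        if env.get(key):
            return ("provider-env", key)
    if active_provider:
        return ("oauth", active_provider)  # last-resort fallback
    if bedrock_ready:
        return ("bedrock", None)
    raise RuntimeError("no provider could be resolved")


stale_oauth = resolve_auto_source(env={"OPENAI_API_KEY": "sk"}, active_provider="anthropic")
oauth_only = resolve_auto_source(active_provider="anthropic")
```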

The normal chat/gateway/TUI/ACP/status path already resolves config.provider
upstream in `resolve_requested_provider()` before "auto" is reached, so this
duplicate config check is the safety net for the lone direct caller
(`main.py` `resolve_provider("auto")`) and any future bypass. Because every
surface funnels through this one resolver, the fix propagates everywhere with
a single edit — no sibling path re-implements precedence.

Also add a one-shot WARN when resolution lands on `active_provider` while a
populated `model` config dict lacks a `provider` key — surfacing the silent
override the issue reported without breaking first-install.

Synthesizes the two competing PRs: #29615 (LifeJiggy — config-before-auth +
the silent-override framing) and #29809 (Minksgo — the env-before-auth
reorder). #29809 could not be merged directly (bundled unrelated, un-opt-in
cost-tagging telemetry); its reorder idea is incorporated here and credited.

Tests: tests/hermes_cli/test_provider_precedence.py — config/env beat stale
OAuth, OAuth still used as last resort, explicit request short-circuits, WARN
fires on silent fall-through. Full provider-resolution suites: 374 passed.

Fixes #29285

Co-authored-by: LifeJiggy <141562589+LifeJiggy@users.noreply.github.com>
Co-authored-by: Minksgo <153416856+Minksgo@users.noreply.github.com>
2026-06-27 19:49:02 -07:00
teknium1
2b73dd1ca6 fix(gateway): namespace --replace takeover marker by HERMES_HOME to stop cross-profile flap (#29092)
Two profile gateway services sharing the default ~/.hermes resolve the
takeover marker to the same path. A --replace from profile B could land
in profile A's marker, match on PID + start_time by coincidence of a
shared PID namespace, and make profile A exit 0 — only to be revived by
systemd Restart=always, which races the replacer again, flapping
indefinitely.

write_takeover_marker now stamps replacer_hermes_home; the shared
consume path rejects markers written under a different HERMES_HOME and
leaves them in place for the correct profile. Absent field (older
markers) is treated as same-home, so single-profile and mixed old/new
deployments are unaffected.
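
A sketch of the home-scoping check on the consume side. The function name and the normpath comparison are assumptions; only the replacer_hermes_home field and the absent-field rule come from the message.

```python
import os


def marker_applies_to(marker, current_home):
    """Return True if this takeover marker should be consumed by this profile."""
    owner = marker.get("replacer_hermes_home")
    if owner is None:
        return True  # older markers carry no field: treated as same-home
    return os.path.normpath(owner) == os.path.normpath(current_home)


legacy = {"pid": 123}
foreign = {"pid": 123, "replacer_hermes_home": "/home/b/.hermes"}
```

A marker that does not apply would be left in place for the profile that owns it.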

Salvaged from #31414 by @CryptoByz onto current main (branch was ~3962
commits behind; the consume function had since been refactored for
issue #34597). Co-authored-by: CryptoByz.
2026-06-27 19:43:02 -07:00
Teknium
28ed883959 docs: add PR infographic for config-defaults fix 2026-06-27 19:38:11 -07:00
Teknium
45b2e4dd6b fix(config): opt newer migrations out of default-stripping
The salvaged #27354 fix made save_config strip schema-default leaves by
default. Five migration sites added to main after the PR was authored
still called bare save_config(config) and intentionally materialize a
(often default-valued) key: model_catalog.ttl_hours, write_approval,
curator.consolidate, agent.verify_on_stop, and the suspicious-MCP-server
disable. Pass strip_defaults=False so those one-time deliberate writes
survive, matching the opt-out the PR applied to the other migrations.
2026-06-27 19:38:11 -07:00
郝鹏宇
98488c4be4 fix(config): prevent save_config from materialising schema defaults
Fixes #27354

Root cause:  called during init (or by any code path
that saves ) wrote injected schema defaults into
config.yaml as if the user had authored them.  Two fix layers:

1.  now only injects
    when the user actually set
    somewhere (root or agent).  A user who never set
    keeps it absent, so 's explicit-path
   detection won't treat it as user-authored.

2.  gains a  parameter and a
   new  pass that removes keys matching
    unless those paths were explicitly present in the
   **raw** (pre-normalization) config on disk.  Explicit-path detection
   uses  on  *before* any
   normalisation runs — preventing injected-in defaults from being
   mistaken for user-set values.

All migration and edit-config call sites pass
to preserve their intentional default-seeding behaviour.

New helpers:
-   — collects leaf-key paths from a raw dict
-    — removes keys matching schema defaults

Test coverage: 4 new regression tests (59 total, all passing).
2026-06-27 19:38:11 -07:00
teknium1
6dcc579bcb test(streaming): repoint anthropic stream-cleanup test to close+rebuild path
The existing test_anthropic_stream_parser_valueerror_retries_before_delivery
asserted mock_replace.call_count == 1 — i.e. it passed precisely because the
buggy OpenAI rebuild was invoked on the Anthropic path. Repoint it to assert
the corrected close+rebuild-Anthropic behavior (#28161).
2026-06-27 19:37:33 -07:00
EloquentBrush0x
a0b9663c7c fix(streaming): rebuild Anthropic client on stream cleanup instead of OpenAI client
interruptible_streaming_api_call() has three connection-pool cleanup
sites that called _replace_primary_openai_client() unconditionally.
For api_mode=anthropic_messages this has two consequences:

1. _replace_primary_openai_client() fails (OPENAI_API_KEY unset on
   Anthropic-only configs), so dead connections are never purged.
2. The stale-stream detector's outer-poll site (L1977) is the only
   mechanism that can interrupt the worker thread while it blocks in
   for event in stream:. Because the Anthropic client is never closed,
   the thread stays blocked until the 900 s httpx read-timeout fires,
   producing a visible 15-minute hang for Telegram/gateway users on
   claude-opus-4-7.

Fix: mirror the existing interrupt-path pattern (L1989-1997) at all
three cleanup sites — if api_mode == "anthropic_messages", call
_anthropic_client.close() + _rebuild_anthropic_client() instead of
_replace_primary_openai_client(). _rebuild_anthropic_client() handles
both direct Anthropic and Bedrock-hosted Claude correctly, unlike the
inline build_anthropic_client() calls in open PR #14430.

PR #14430 (open) covers only the outer stale-detector site (L1977).
PR #23678 (open) covers only the inner retry sites (L1774, L1833).
This PR covers all three sites and uses _rebuild_anthropic_client()
for Bedrock parity.

Fixes #28161
2026-06-27 19:37:33 -07:00
xxxigm
6f1a176b33 fix(gateway/discord): REST liveness probe to detect zombie clients (#26656)
The Discord adapter could enter a silent zombie state after a network
outage / proxy stall: the process is alive, _client looks open, but the
underlying socket is dead. discord.py's WebSocket reconnect never sees a
RST through a wedged proxy/NAT, so client.start() spins forever without
exiting — which means the bot-task done callback (which only fires on
task completion) never trips either. The bot stays "offline" in Discord
until a manual `hermes gateway restart`. Reports describe the bot staying
offline for 13-17h.

Adds an out-of-band REST liveness probe in DiscordAdapter. Every
`discord.liveness_interval_seconds` (default 60s) the adapter issues a
cheap fetch_user(bot_id) — the same REST path as message delivery, so it
fails when the proxy/NAT is wedged. After
`discord.liveness_failure_threshold` consecutive failures (default 3) the
probe closes the wedged client and surfaces a retryable fatal error,
which trips the gateway's existing _platform_reconnect_watcher and
rebuilds the adapter. Operators disable it by setting either knob to 0.
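
The consecutive-failure accounting described above can be sketched roughly as
follows. This is a minimal illustration, not the adapter's code: the
`LivenessTracker` class and its method names are hypothetical, and only the
defaults (60s interval, threshold 3, "0 disables") come from the message.

```python
class LivenessTracker:
    """Hypothetical helper: counts consecutive failed liveness probes."""

    def __init__(self, interval_seconds: float = 60.0, failure_threshold: int = 3) -> None:
        self.interval_seconds = interval_seconds
        self.failure_threshold = failure_threshold
        self._consecutive_failures = 0

    @property
    def enabled(self) -> bool:
        # Setting either knob to 0 disables the probe.
        return self.interval_seconds > 0 and self.failure_threshold > 0

    def record(self, probe_ok: bool) -> bool:
        """Record one probe result; return True when a reconnect is due."""
        if not self.enabled:
            return False
        if probe_ok:
            # Any success resets the streak, so only back-to-back failures count.
            self._consecutive_failures = 0
            return False
        self._consecutive_failures += 1
        return self._consecutive_failures >= self.failure_threshold
```

A single successful probe between failures resets the streak, so a brief
network blip would not trigger a rebuild under this sketch.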

Config lives in config.yaml (discord.liveness_*) per the .env-is-secrets
policy; _apply_yaml_config bridges it to internal env vars the adapter
reads, matching the existing HERMES_DISCORD_TEXT_BATCH_* pattern.

Co-authored-by: Hermes Agent <agent@nousresearch.com>
2026-06-27 19:30:32 -07:00
teknium1
457c8a0a7c fix(file-ops): keep worktree isolation when restoring preserved cwd (#26211)
The durable _last_known_cwd anchor is keyed by the shared 'default' container,
so a non-owning worktree session could inherit the owning session's cwd through
it — breaking the wrong-worktree-routing fix (test_file_tools_cwd_resolution::
test_resolution_routes_to_resolving_sessions_worktree).

Reorder _authoritative_workspace_root so the session-specific registered cwd
override (keyed by raw session id) is checked BEFORE the shared-container
_last_known_cwd fallback. A non-owning session now resolves into its own
registered worktree; the durable anchor only fills in when there's no
session-specific override (the #26211 single-session case). Adds a regression
test covering the owner-mirrors-then-other-session-resolves interaction.
2026-06-27 19:29:06 -07:00
teknium1
b2faeba182 fix(file-ops): make preserved cwd reachable at write-time resolution (#26211)
Belt-and-suspenders on top of the cherry-picked cwd-preservation fix:

- Proactively mirror every live terminal cwd into _last_known_cwd on each
  successful read, so the durable anchor survives even when the cleanup
  thread pops both _file_ops_cache and _active_environments before
  _get_file_ops' stale-cache save branch can fire.
- Fall back to _last_known_cwd in _authoritative_workspace_root. write_file_tool
  resolves the path (via _resolve_path_for_task) BEFORE _get_file_ops rebuilds
  the env, so restoring only the rebuilt env's cwd was insufficient — the
  resolution that decides where the file lands runs first. This closes that gap.

The local env's persisted _cwd_file can't serve this role: it's keyed by a
random per-session uuid and deleted on cleanup (the same cleanup that triggers
the bug). The in-memory _last_known_cwd registry is the durable anchor instead.

Adds a real-IO E2E regression (TestSilentFileMisplacementE2E) exercising the
actual write_file_tool path after env cleanup.
2026-06-27 19:29:06 -07:00
zccyman
adeba1d7a8 fix(file-ops): preserve CWD across terminal environment re-creation (#26211)
Root cause: when the terminal environment (`_active_environments` entry) is
cleaned up and re-created during a long conversation, the new environment
always starts with the default config CWD (typically `~/.hermes/hermes-agent`)
instead of preserving the user's last-known working directory. Subsequent
relative-path writes (`write_file`, `execute_code`, shell commands) silently
land in the default CWD, making files appear to be "created but absent."

Fix: add `_last_known_cwd` dict that preserves the old environment's CWD
before the stale cache entry is invalidated. When a new environment is
created for the same task_id, we check `_last_known_cwd` first and use the
preserved CWD instead of the config default.

Changes:
- tools/file_tools.py: add `_last_known_cwd` dict, save CWD before stale
  cache invalidation, restore CWD on env recreation
- tests/tools/test_file_tools.py: add `TestLastKnownCwd` with 2 tests
  verifying CWD preservation and fallback behavior

Fixes #26211
2026-06-27 19:29:06 -07:00
teknium1
926a1b915d fix(tools): suppress transient check_fn flakes so subagents keep file/terminal tools
A flaky external probe in a tool's check_fn (e.g. check_terminal_requirements
running `docker version` with a 5s timeout, momentarily timing out under load)
would return False for a single get_tool_definitions() call. Because file
tools delegate their check_fn to the terminal check, that one flake silently
stripped read_file/write_file/patch/search_files AND terminal from whatever
agent was being constructed at that instant — most visibly a delegate_task
subagent, which then reported "Tool read_file does not exist". This explains
both the intermittent (~80% success) user-session failures and the
deterministic cron failures in #21658 / #5304.

The existing _check_fn TTL cache made this worse: it cached the transient
False for the full 30s window, poisoning every subagent spawned in that span.

Fix: remember the last time each check_fn returned True; when a fresh probe
fails within a short grace window of that success, treat it as a flake —
serve the last-good True and do NOT cache the failure (so the next call
re-probes). A failure with no recent success, or past the grace window, is
honored normally so a backend that genuinely went down stops advertising its
tools. Probe failures now log at WARNING regardless of quiet mode, making the
previously-silent tool loss diagnosable in subagent (quiet) sessions.
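
One way to picture the grace-window rule is the sketch below. It assumes a
simple per-name cache; `CheckCache`, `GRACE_SECONDS`, and the injected clock
are hypothetical, and only the 30s TTL is taken from the message (the real
grace-window length is not stated here).

```python
import time
from typing import Callable, Dict, Tuple

TTL_SECONDS = 30.0    # cache window mentioned in the commit message
GRACE_SECONDS = 60.0  # hypothetical value; the real grace window is not stated


class CheckCache:
    """Caches check_fn results but never caches a probable flake."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._cached: Dict[str, Tuple[bool, float]] = {}  # name -> (result, stored_at)
        self._last_ok: Dict[str, float] = {}              # name -> time of last True

    def check(self, name: str, check_fn: Callable[[], bool]) -> bool:
        now = self._clock()
        hit = self._cached.get(name)
        if hit is not None and now - hit[1] < TTL_SECONDS:
            return hit[0]

        try:
            ok = bool(check_fn())
        except Exception:
            ok = False

        if ok:
            self._last_ok[name] = now
            self._cached[name] = (True, now)
            return True

        last_ok = self._last_ok.get(name)
        if last_ok is not None and now - last_ok <= GRACE_SECONDS:
            # Probable flake: serve the last-good True and drop any cache
            # entry so the next call probes again.
            self._cached.pop(name, None)
            return True

        # No recent success: honor the failure and cache it for the TTL.
        self._cached[name] = (False, now)
        return False
```

Injecting the clock keeps the sketch testable without sleeping.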

Co-authored-by: Stuart Horner <5261694+djstunami@users.noreply.github.com>
2026-06-27 19:29:00 -07:00
Shashwat Gokhe
505bc27d8d fix(gateway): classify mixed attachments per-attachment + transcode uncommon image formats
A document attached alongside an image in the same Discord message was
swept into the vision pipeline and 400'd the whole turn ("Could not
process image"), and was simultaneously never surfaced to the agent as a
readable file. Restores the "any file type works" contract for mixed
messages and fixes the HTTP 400.

Bug 1 — mixed attachments: the inbound routing loop keyed image/audio/video
classification off the message-level type (PHOTO/VOICE/AUDIO), so a doc in
a PHOTO message landed in image_paths and poisoned the vision call. The
document context-note path was gated on message_type == DOCUMENT, so that
same doc never reached the agent at all. Now classification is
per-attachment (trust each attachment's own MIME; fall back to the
message-level type only when MIME is unknown), via shared _event_media_is_*
helpers used by both _build_media_placeholder and the main inbound loop.
The document note now fires for any non-image/audio/video attachment
regardless of message-level type.
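
As a rough illustration of per-attachment classification, the sketch below
trusts an attachment's own MIME type and only falls back to the message-level
type when MIME is unknown. The function name, the return labels, and the
`VIDEO` fallback entry are assumptions made for the example; they are not the
`_event_media_is_*` helpers themselves.

```python
from typing import Optional

# Message-level fallback, used only when the attachment has no MIME type.
# PHOTO/VOICE/AUDIO appear in the commit message; VIDEO is assumed here.
_MESSAGE_TYPE_FALLBACK = {
    "PHOTO": "image",
    "VOICE": "audio",
    "AUDIO": "audio",
    "VIDEO": "video",
}


def classify_attachment(mime: Optional[str], message_type: str) -> str:
    """Return 'image', 'audio', 'video', or 'document' for one attachment."""
    if mime:
        major = mime.split("/", 1)[0].strip().lower()
        if major in ("image", "audio", "video"):
            return major
        # Known MIME that is not media: treat as a readable document,
        # whatever the message-level type says.
        return "document"
    return _MESSAGE_TYPE_FALLBACK.get(message_type, "document")
```

With this shape, a PDF attached to a PHOTO message is classified as a
document instead of being routed to the vision call.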

Bug 2 — uncommon formats: AVIF/HEIC/BMP/TIFF/ICO produced the same generic
400 because providers only accept PNG/JPEG/GIF/WEBP. image_routing now
transcodes those to PNG via Pillow before declaring media_type, skipping
cleanly (logged) if Pillow/plugins are missing. SVG is vector — Pillow
can't rasterize it — so it's skipped rather than transcoded.

Closes #25935.

Co-authored-by: LeonSGP43 <cine.dreamer.one@gmail.com>
Co-authored-by: cypres0099 <74935762+cypres0099@users.noreply.github.com>
2026-06-27 19:26:04 -07:00
teknium1
0c372274cd fix(agent): disable OpenAI SDK auto-retry that double-fires inside the rate-limit loop
Same bug class as the Anthropic fix (#26293): the OpenAI/aggregator client is
built without max_retries, so the SDK default of 2 applies. The SDK's own 1-2s
backoff ignores Retry-After and retries inside hermes's outer conversation loop,
burning request slots against a rate-limited bucket. Set max_retries=0 at the
single create_openai_client chokepoint (covers init, switch_model, recovery,
restore, request-scoped). auxiliary_client builds its own clients and is not
wrapped by the loop, so it keeps SDK retries.
2026-06-27 19:23:15 -07:00
konsisumer
1ab35ba25d fix(anthropic): stop SDK auto-retry double-firing and raise Retry-After cap to 600s
The Anthropic SDK clients were built without max_retries, so the SDK
default (max_retries=2) retried 429/5xx with its own backoff that ignores
Retry-After — double-retrying inside hermes's outer loop and burning
request slots against a bucket that won't refill for minutes. Set
max_retries=0 on all Anthropic/AnthropicBedrock client constructions so
the outer conversation loop (which already honors Retry-After) owns retry.

Also raise the Retry-After cap in the conversation loop from 120s to 600s.
Anthropic Tier 1 input-token buckets reset in ~171s, so the 120s cap made
hermes retry before the reset window and re-trip the limit.
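
The effect of the cap can be shown with a small worked example. The helper
below is a hypothetical sketch of clamping a server-provided Retry-After
value; only the 120s and 600s caps and the ~171s reset figure come from the
message.

```python
OLD_CAP_SECONDS = 120.0
NEW_CAP_SECONDS = 600.0


def capped_wait(retry_after_seconds: float, cap_seconds: float) -> float:
    """Clamp a Retry-After value to the range [0, cap_seconds]."""
    return min(max(retry_after_seconds, 0.0), cap_seconds)


# A bucket that resets in about 171s:
early = capped_wait(171.0, OLD_CAP_SECONDS)   # waits 120s, retries before the reset
honored = capped_wait(171.0, NEW_CAP_SECONDS) # waits the full 171s
```

Under the old cap the retry lands roughly 51s before the bucket refills,
which matches the re-trip behavior described above.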

Refs #26293
2026-06-27 19:23:15 -07:00
LeonSGP43
32732a8f83 fix(agent): cap same-entry credential refreshes so fallback can activate (#26080)
A persistent upstream 401 on a single-entry OAuth pool (common for Claude
Max subscribers) made the credential-pool recovery spin forever:
try_refresh_current() re-mints a fresh token and reports success on every
401, so recover_with_credential_pool returned True and the retry loop
continue'd without ever incrementing retry_count or reaching the
auth-failover block. The configured fallback_model never activated and the
agent appeared to hang.

Cap consecutive successful same-entry refreshes (keyed by provider +
pool-entry id) at 2; once exceeded, treat the credential as unrecoverable
and return not-recovered so the loop falls through to
_try_activate_fallback. The 429/billing paths already rotate-or-fall-through
correctly (mark_exhausted_and_rotate returns None on a single entry), so
only the auth-refresh branch needed the cap.
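
A minimal sketch of the cap, assuming a counter keyed by provider and
pool-entry id. `RefreshGuard` and its methods are hypothetical names; the
limit of 2 is the value stated above, and the reset-on-success behavior is an
assumption about what "consecutive" implies.

```python
from collections import defaultdict
from typing import DefaultDict, Tuple

MAX_SAME_ENTRY_REFRESHES = 2  # cap stated in the commit message


class RefreshGuard:
    """Hypothetical helper: bounds back-to-back refreshes of one credential."""

    def __init__(self) -> None:
        self._counts: DefaultDict[Tuple[str, str], int] = defaultdict(int)

    def allow_refresh(self, provider: str, entry_id: str) -> bool:
        """Return False once the same entry has been refreshed too often."""
        key = (provider, entry_id)
        self._counts[key] += 1
        return self._counts[key] <= MAX_SAME_ENTRY_REFRESHES

    def reset(self, provider: str, entry_id: str) -> None:
        """Call after a request succeeds, so only consecutive refreshes count."""
        self._counts.pop((provider, entry_id), None)
```

When `allow_refresh` returns False the caller would report "not recovered",
letting the retry loop fall through to its fallback path.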

Co-authored-by: Hermes Agent <hermes@nousresearch.com>
2026-06-27 19:20:07 -07:00
Teknium
fae920642a fix(agent): throttle cross-turn fallback-switch replay storm (#24996) (#53909)
When every provider in the fallback chain fails non-retryably back-to-back
(e.g. HTTP 400/402/429 across distinct providers), the within-turn walk is
already bounded — _fallback_index advances monotonically and the loop aborts
when the chain exhausts. The damaging mode is cross-turn: restore_primary_
runtime resets _fallback_index=0 every turn, so a client that re-submits
immediately replays the entire chain, re-marshaling the full (potentially
80k-token) context once per provider every turn with no throttle on the
non-rate-limit path. On constrained hosts this exhausts memory/swap.

Rate-limit/billing failures already arm a 60s cooldown via _rate_limited_until;
the gap was the non-rate-limit case. Now, when the chain exhausts on a non-
rate-limit failure with a non-empty chain, arm a short (5s) cooldown on the
same _rate_limited_until gate (max(), never shrinking an existing window).
The next turn's restore stays gated and does NOT reset the index, so the
chain isn't replayed until the cooldown clears. No new state, no thread sleep,
no false-trip on legitimately long chains (those walk normally within a turn).
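
The gate logic can be sketched as below. This is an illustration under
assumptions: `FallbackGate` and its attribute names are stand-ins for the
agent's `_rate_limited_until` / `_fallback_index` state, and the injected
clock exists only to make the example deterministic.

```python
import time
from typing import Callable

RATE_LIMIT_COOLDOWN = 60.0  # existing rate-limit/billing cooldown
EXHAUSTION_COOLDOWN = 5.0   # short cooldown for non-rate-limit exhaustion


class FallbackGate:
    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self.rate_limited_until = 0.0
        self.fallback_index = 0

    def arm(self, seconds: float) -> None:
        # max() means a short cooldown never shrinks a longer one already armed.
        self.rate_limited_until = max(self.rate_limited_until, self._clock() + seconds)

    def restore_primary(self) -> bool:
        """Reset the chain index only once the cooldown has cleared."""
        if self._clock() < self.rate_limited_until:
            return False  # still gated: keep the index, do not replay the chain
        self.fallback_index = 0
        return True
```

Because the cooldown is a timestamp comparison, no thread ever sleeps; a turn
arriving inside the window simply skips the reset.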

Tests: tests/run_agent/test_24996_fallback_exhaustion_cooldown.py
2026-06-27 19:15:40 -07:00
Chaz Dinkle
1dde7e2f2a fix(anthropic): adopt Claude Code's already-refreshed token before racing refresh
Claude Code OAuth refresh tokens are single-use; Claude Code refreshes on
its own schedule, so by the time Hermes notices an expired token Claude
Code may have already rotated it. Re-read live credential sources first and
adopt a valid token rather than POSTing a possibly-stale refresh token.

Ports the _refresh_oauth_token hardening from PR #40107 (chazmaniandinkle)
on top of the keychain/file reconciliation from PR #21112 (nodejun).
Adds AUTHOR_MAP entry for nodejun.
2026-06-27 19:14:43 -07:00
jun
5a5396aecb fix(anthropic): reconcile keychain/file credentials when one is expired
read_claude_code_credentials() previously returned the macOS Keychain
entry as soon as one existed, even if its OAuth token was already
expired. Callers then ran is_claude_code_token_valid() on the result
and got False, so resolve_anthropic_token() returned None — surfacing
the misleading 'No Anthropic credentials found' error even when
~/.claude/.credentials.json held a perfectly valid token.

Now reads both sources and prefers the non-expired one. When both are
valid (or both expired), prefers the later expiresAt so any subsequent
refresh uses the freshest refresh_token.
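
The reconciliation rule reads roughly like the sketch below. The
`pick_credential` function and the dict shape are hypothetical; `expiresAt`
is the field named above, and treating it as an epoch-milliseconds integer is
an assumption.

```python
from typing import Optional, TypedDict


class Credential(TypedDict):
    source: str     # "keychain" or "file" (labels used only in this sketch)
    expiresAt: int  # assumed to be epoch milliseconds


def pick_credential(
    keychain: Optional[Credential],
    file_cred: Optional[Credential],
    now_ms: int,
) -> Optional[Credential]:
    """Prefer the non-expired source; otherwise the later expiresAt."""
    if keychain is None or file_cred is None:
        return keychain if keychain is not None else file_cred

    keychain_valid = keychain["expiresAt"] > now_ms
    file_valid = file_cred["expiresAt"] > now_ms
    if keychain_valid != file_valid:
        return keychain if keychain_valid else file_cred

    # Both valid or both expired: take the later expiry. The >= keeps the
    # keychain entry on a tie, matching the "keychain wins" priority test.
    return keychain if keychain["expiresAt"] >= file_cred["expiresAt"] else file_cred
```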

Adds TestReadClaudeCodeCredentialsDesync covering the four reconciliation
cases. The existing 'keychain wins' priority test still passes because
both fixtures share the same expiresAt and the tiebreaker is >=.
2026-06-27 19:14:43 -07:00
Teknium
db16854f34 fix(telegram): surface failed media downloads to user and agent, not a silent empty turn (#53912)
When a Telegram attachment download/cache fails (typically a transient
httpx.ConnectError to Telegram's CDN), the except handler logged a warning
and fell through to handle_message() with empty media and no text — the user
thought the file was delivered, the agent saw a content-less turn with no
signal an attachment was attempted, and the only record was a buried log line.

Adds _surface_media_cache_failure(): replies to the user in Telegram so they
know to retry, and appends an agent-visible notice to event.text via the
existing _append_observed_note channel so the agent knows an attachment was
attempted and failed. No new event fields (structured-event refactor is out
of scope per #23045). Wired into all five cache-failure sites — photo, voice,
audio, video, document — since they shared the identical silent fall-through.

Bug 1 from #23045 (unsupported types routed as fake user messages) no longer
exists on main: the document handler now accepts any file type, so there is no
rejection branch to fix.

Closes #23045
2026-06-27 19:12:57 -07:00
teknium1
6514be5a28 chore(release): add AUTHOR_MAP entry for linyubin (#50228 salvage) 2026-06-27 19:12:21 -07:00
teknium1
4133cd9fbf docs(infographic): eager fallback on persistent transport failures 2026-06-27 19:12:21 -07:00
linyubin
c946e6709f fix(agent): activate fallback on persistent transport failures (#22277)
Eager fallback previously fired only on rate_limit/billing. A stale-
detector-killed hung stream classifies as FailoverReason.timeout
(retryable=True) and the retry loop re-hit the same dead primary until
the budget exhausted -- 3 x ~180-300s stale kills compounding into a
15+ min silent hang while the configured fallback chain sat idle.

Extend the existing eager-fallback gate to also cover timeout and
overloaded, but only after one real retry (retry_count >= 2) so genuine
transient hiccups still recover on the primary. Reuses the same
pool-recovery guard and state-reset as the rate_limit branch -- no new
config flag, no change to the rate-limit intent.
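
A minimal sketch of the extended gate, assuming failure reasons are plain
strings. The function and set names are hypothetical; the reason names and
the `retry_count >= 2` threshold are taken from the message.

```python
# Reasons that trigger fallback immediately (existing behavior).
EAGER_IMMEDIATE = frozenset({"rate_limit", "billing"})
# Reasons that trigger fallback only after one real retry on the primary.
EAGER_AFTER_RETRY = frozenset({"timeout", "overloaded"})


def should_fallback_eagerly(reason: str, retry_count: int) -> bool:
    """Decide whether to activate the fallback chain instead of retrying."""
    if reason in EAGER_IMMEDIATE:
        return True
    return reason in EAGER_AFTER_RETRY and retry_count >= 2
```

The threshold lets a single transient timeout recover on the primary while
stopping repeated stale kills from consuming the whole retry budget.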

Salvaged from PR #50228 by @linyubin. Closes #22277.

Co-authored-by: Hermes Agent <127238744+teknium1@users.noreply.github.com>
2026-06-27 19:12:21 -07:00
bykim0119
851f75d4df fix(discord): honor "*" wildcard in DISCORD_ALLOWED_USERS (#22334)
DISCORD_ALLOWED_USERS="*" now means "allow everyone", matching the
SIGNAL_ALLOWED_USERS / DISCORD_ALLOWED_CHANNELS wildcard convention and
the value `claw migrate` emits. Previously _is_allowed_user did exact
ID matching only, so "*" matched no user and blocked every non-self
sender — a P1 with no workaround.

Three sites, all required for the fix to hold at runtime:
- _is_allowed_user: short-circuit when "*" is in the allowlist.
- connect(): exclude "*" from the intents.members trigger so the
  wildcard does not request the privileged Server Members intent
  (which can block the bot from coming online).
- _resolve_allowed_usernames: preserve "*" verbatim; otherwise it lands
  in the username-resolution bucket, matches no member, and is silently
  dropped from the set and env var on the first on_ready — quietly
  undoing the fix.
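
The three sites can be illustrated with the sketch below. All function names
are hypothetical, and the rule that non-numeric entries are usernames needing
the members intent is an assumption made for the example, not something the
message states.

```python
from typing import Set

WILDCARD = "*"


def parse_allowlist(raw: str) -> Set[str]:
    """Split a comma-separated env value into trimmed, non-empty entries."""
    return {part.strip() for part in raw.split(",") if part.strip()}


def is_allowed_user(user_id: str, allowlist: Set[str]) -> bool:
    # Short-circuit: the wildcard allows everyone.
    return WILDCARD in allowlist or user_id in allowlist


def needs_members_intent(allowlist: Set[str]) -> bool:
    # Assumption: only non-numeric entries (usernames) need member lookup.
    # The wildcard is excluded so it never requests the privileged intent.
    return any(e != WILDCARD and not e.isdigit() for e in allowlist)


def split_for_resolution(allowlist: Set[str]) -> tuple:
    """Return (entries kept verbatim, usernames still to resolve)."""
    keep = {e for e in allowlist if e == WILDCARD or e.isdigit()}
    return keep, allowlist - keep
```

Keeping the wildcard in the verbatim bucket is what prevents it from being
dropped when username resolution finds no matching member.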

Slash auth delegates to _is_allowed_user (auto-covered); component auth
already honors "*" on main.
2026-06-27 19:11:30 -07:00
Teknium
1207d81eed fix(gateway): unify outbound chat redaction onto authoritative redactor (#23810) (#53907)
The gateway banner promises 'chat responses are scrubbed before delivery',
but _redact_gateway_user_facing_secrets used a divergent 6-pattern subset that
leaked credential shapes the comprehensive agent.redact catches — notably the
GitHub fine-grained PAT (github_pat_...) and the Telegram bot-token shape
(bot<digits>:<token>), the gateway's own credential type.

_redact_gateway_user_facing_secrets now delegates to
agent.redact.redact_sensitive_text(force=True) — the same Tirith-grade redactor
already applied to logs, tool output, and approval-command prompts — so the
outbound LLM-response path (final_response -> _sanitize_gateway_final_response)
masks the full credential set. The narrow local pattern set is kept as a
fail-soft second pass. force=True honors redaction even when
security.redact_secrets is off, matching _redact_approval_command.

Test: regression guard parametrizing all 5 issue shapes x every chat surface;
asserts secret body never reaches the user and surrounding prose survives. The
existing bearer-token test's marker assertion is loosened from the literal
'[REDACTED]' to mask-agnostic (the redactor masks as '***'/partial) — it
asserts the security invariant, not the implementation's mask string.
2026-06-27 19:09:41 -07:00
LeonSGP43
c56b39c11e fix(auxiliary): fall back to OPENROUTER_API_KEY when credential pool exhausted
_try_openrouter() returned (None, None) whenever an OpenRouter credential
pool existed but was exhausted (_select_pool_entry -> (True, None)), making
the OPENROUTER_API_KEY env-var fallback unreachable. Auxiliary tasks
(compression, vision, web_extract) silently failed even with a valid env key.

Now the pool-present branch only returns early when it successfully builds a
client; an exhausted pool falls through to the env-var path. The final
failure (pool exhausted AND no env var) still marks the provider unhealthy.

Fixes #23452.

Co-authored-by: ambition0802 <noreply@github.com>
2026-06-27 19:09:27 -07:00
qWaitCrypto
46e18804ad fix(auxiliary): fall back on 401 auth errors in auto mode (#21165)
When the primary provider returns 401 and the auth-refresh path is
unavailable or fails, both call_llm() and async_call_llm() reached the
should_fallback gate without _is_auth_error in the condition, so the
auxiliary task (e.g. compression) was dropped silently — losing message
history. Add _is_auth_error to should_fallback (NOT is_capacity_error) in
both sync and async paths, plus an 'auth error' reason branch.

Auth stays a non-capacity error: it falls back in auto mode via the
is_auto gate, but on an explicitly-configured provider it still respects
the user's choice and raises rather than silently switching providers.
2026-06-27 19:07:04 -07:00
Teknium
1a570dae00 fix(image-routing): unblock message queue on OpenRouter 'no endpoints' image 404 (#53901)
The agent's image-rejection fallback strips images and retries text-only when
a provider rejects image content, which is what lets the gateway drain its
queued messages. The fallback only fires on a hardcoded phrase list, and the
OpenRouter wording — HTTP 404 'No endpoints found that support image input' —
was missing. For OpenRouter-routed non-vision models the fallback never fired,
the retry loop re-sent the same rejected request until exhaustion, and every
subsequent message (including plain text) stayed queued behind the stuck turn.

Add the phrase to _IMAGE_REJECTION_PHRASES (the 404 already passes the 4xx
gate). Add a positive test and a guard test so the sibling OpenRouter
'no endpoints ... data policy / guardrail' 404s do NOT get their images
stripped.

Fixes #21160. Reported by @liu14goal14-ux; PR #21198 by @ygd58.
2026-06-27 19:07:02 -07:00
Teknium
a94f657a50 fix(tui): route completion RPCs to the pool so they can't freeze the TUI (#53895)
complete.path and complete.slash ran inline on the tui_gateway stdin
reader thread. complete.path spawns git ls-files and fuzzy-ranks the
whole repo; complete.slash does first-call prompt_toolkit imports plus a
skill-dir scan. While either ran, prompt.submit / session.interrupt sat
unread in the stdin pipe, freezing the TUI until the 120s RPC timeout
fired — most reliably reproduced by typing @ on a large repo / WSL2 mount.

Add both to _LONG_HANDLERS so completion runs on the existing thread
pool (write_json is already _stdout_lock-guarded). Root-cause fix:
covers any slow completion, not just the bare-@ trigger.

Fixes #21123
2026-06-27 19:06:01 -07:00
teknium1
ccf526964a fix(gateway): bound adapter teardown awaits on the stop path (#14128)
The main stop loop in _stop_impl() awaited adapter.cancel_background_tasks()
and adapter.disconnect() with no timeout, for both the primary and the
secondary-profile (multiplex) adapter maps. A half-dead platform — a wedged
Feishu/Lark WebSocket thread blocked on network I/O is the reported case —
makes one of those awaits block forever, so the process never exits. systemd
then SIGKILLs it after TimeoutStopSec, skipping atexit PID-file cleanup, and
the next start dies with 'PID file race lost' and enters a restart loop.

The per-adapter timeout infra already existed on main
(_adapter_disconnect_timeout_secs / HERMES_GATEWAY_ADAPTER_DISCONNECT_TIMEOUT,
default 5s) but was only wired into _safe_adapter_disconnect, which the
teardown path never calls.

Add _bounded_adapter_teardown(): wraps BOTH cancel_background_tasks() and
disconnect() in the existing timeout budget, logs and forces forward progress
on timeout, and never raises. Both teardown loops now route through it, so the
stop sequence always completes regardless of any adapter's internal behavior
and PID-file cleanup runs.
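
A bounded teardown along these lines might look like the sketch below. It is
an illustration built on `asyncio.wait_for`, not the gateway's
`_bounded_adapter_teardown()` itself; the function name and signature are
assumptions, while the 5s default mirrors the timeout mentioned above.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def bounded_teardown(adapter, timeout: float = 5.0) -> None:
    """Run both teardown steps with a time budget; never raise."""
    for step in ("cancel_background_tasks", "disconnect"):
        try:
            await asyncio.wait_for(getattr(adapter, step)(), timeout=timeout)
        except asyncio.TimeoutError:
            # A wedged adapter must not block shutdown: log and move on.
            logger.warning("adapter %s() timed out after %.1fs", step, timeout)
        except Exception:
            logger.exception("adapter %s() failed during teardown", step)
```

Each step gets its own budget here, so a hang in the first step still leaves
the second step a chance to run before the stop sequence continues.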

Original report + fix direction by @happy5318 (#14128, #14130); this widens it
to cover cancel_background_tasks(), the multiplex loop, and the config knob.

Co-authored-by: happy5318 <happy5318@users.noreply.github.com>
2026-06-27 19:05:04 -07:00
Teknium
6717cfc805 docs(gateway): warn against custom ExecStopPost kill drop-in (restart loop) (#53903)
A user-added systemd drop-in like ExecStopPost=/bin/kill -9 $MAINPID fires
on every stop, including clean restarts — it SIGKILLs the freshly spawned
gateway before it stabilizes and Restart=always respawns it, producing an
infinite restart loop (issue #23272). The unit Hermes installs already shuts
down cleanly via KillMode=mixed + KillSignal=SIGTERM with Restart=always +
RestartForceExitStatus, so no extra kill is needed. Document this as a danger
callout in the gateway service-management section.
2026-06-27 19:04:29 -07:00
teknium1
ea8facee81 chore(release): add konsisumer to AUTHOR_MAP for PR #19608 salvage 2026-06-27 19:01:37 -07:00
konsisumer
8b4c29f0f0 fix(auth): preserve concurrently-added credentials on pool rewrite 2026-06-27 19:01:37 -07:00
Teknium
163cb24d45 feat(moa): render reference-model blocks in TUI and desktop, not just CLI (#53855)
The MoA reference-block display (each reference model's output shown as a
labelled thinking block before the aggregator responds) previously existed
only in the classic CLI. The facade already emits moa.reference / moa.aggregating
through tool_progress_callback; this wires the TUI and desktop consumers.

- tui_gateway/server.py: _on_tool_progress relays moa.reference (label / text /
  index / count) and moa.aggregating to the Ink/desktop client as their own
  events.
- ui-tui: gatewayTypes adds the two event shapes; createGatewayEventHandler
  routes them; turnController.recordMoaReference pushes a committed
  thinking-style segment tagged with the source model. Shown regardless of
  showReasoning — references ARE the mixture-of-agents process the user opted
  into, not ordinary reasoning. moa.aggregating is a status-only transition
  (no transcript entry).
- apps/desktop: use-message-stream appends each reference as a labelled
  reasoning chunk via the existing reasoning disclosure; GatewayEventPayload
  gains label/index/aggregator.

Tests: tui_gateway emit (3), Ink handler render + showReasoning-independence +
aggregating-no-segment (3). TUI typecheck/lint clean; desktop typecheck/lint
clean.
2026-06-27 18:46:20 -07:00
Teknium
d3d621f7c3 revert(windows): roll back terminal-popup PRs #53791 #53810 #53829 (#53853)
* Revert "fix(windows): capture is not a no-window boundary; route flashing spawns through chokepoint (#53829)"

This reverts commit 2ecca1e7d3.

* Revert "fix(windows): stop terminal-window popups from background spawns (#53810)"

This reverts commit 5db1430af9.

* Revert "fix(windows): stop subprocess console-window popups + add CI guard (#53791)"

This reverts commit ef17cd204d.
2026-06-27 15:59:00 -07:00
Teknium
1d32e5d98c fix(gateway): relay _thinking bubbles when thinking_progress is on but tool_progress is off (#53849)
display.thinking_progress is documented as independent of tool_progress —
users can keep tool progress quiet while opting into mid-turn assistant
scratch-text bubbles. But two gates were keyed on tool_progress_enabled alone,
so with tool_progress:off the _thinking relay was silently dead even when
thinking_progress:true:

1. agent.tool_progress_callback was set to None unless tool_progress_enabled,
   so the callback that queues _thinking text never fired.
2. The send_progress_messages drain task was only started when
   tool_progress_enabled, so even queued messages had no consumer.

Both now gate on needs_progress_queue (tool_progress OR thinking_progress) —
the same condition that already decides whether to create the progress queue
at all. No effect when both are off (queue is None) or when tool_progress is
on (unchanged).

Tests: _thinking relays with thinking_progress:on/tool_progress:off, and is
suppressed when thinking_progress:off. Full progress-topics suite: 35 pass.
2026-06-27 15:48:20 -07:00
Teknium
2ecca1e7d3 fix(windows): capture is not a no-window boundary; route flashing spawns through chokepoint (#53829)
Follow-up to #53791 addressing review feedback: the footgun checker treated
capture_output=/stdout=/stderr=/check_output as proof a subprocess can't pop a
Windows console. That invariant is false — stream redirection controls where a
child's output goes, not whether a console is allocated. From a console-less
parent (Desktop/Electron, pythonw.exe, detached gateway/cron) a console-subsystem
child still flashes a window even when fully captured.

- check-windows-footguns.py: capture/redirect/check_output is no longer a blanket
  safe-pass. Added _WINDOWS_FLASHING_PROGRAMS (git/gh/npm/node/python/uv/ffmpeg/
  docker/powershell/…); calls to those are flagged even when captured. Non-flashing
  programs keep the capture exemption (no 271-site noise). _subprocess_compat.run/
  popen calls are inherently safe (wrapper injects CREATE_NO_WINDOW).
- Routed the 35 genuine flashing git/gh/npm/uv/ffmpeg/docker spawns through the
  _subprocess_compat.run/popen chokepoint (Brooklyn's wrapper from #53810) — the
  durable fix, not per-site annotations. cmd.exe /c start stays # ok (intentional).
- Updated tests + CONTRIBUTING.md rule #17 to the corrected invariant.
2026-06-27 14:49:41 -07:00
Teknium
3ac96d3308 fix(moa): resolve auxiliary tasks to the aggregator, not the preset name (#53827)
On a MoA session, auxiliary tasks (title generation, compression, vision, …)
ran through _resolve_auto with provider='moa' / model='<preset>', which sent
the preset name (e.g. 'opus-gpt') as the model id to resolve_provider_client —
producing 'HTTP 400: opus-gpt is not a valid model ID' on every turn (visible
as the title-generation warning).

MoA is a virtual provider with no real HTTP endpoint; aux tasks don't need the
reference fan-out. _resolve_auto now resolves a 'moa' main provider to the
preset's aggregator slot (its acting model) and continues Step 1 with that real
provider+model, dropping the virtual moa://local base_url + placeholder key so
the aggregator resolves via its own provider credentials. Mirrors the MoA
context-length resolution.

Verified live: a MoA turn no longer emits the 'not a valid model ID' warning.
Test: tests/agent/test_auxiliary_main_first.py (19 pass).
2026-06-27 14:21:26 -07:00
Gille
e7bb67332d fix(moa): preserve Codex slot routing 2026-06-27 14:20:51 -07:00
Gille
66aeda3550 fix(moa): keep virtual provider on MoA client 2026-06-27 14:20:51 -07:00
brooklyn!
5db1430af9 fix(windows): stop terminal-window popups from background spawns (#53810)
* fix(windows): stop terminal-window popups from background spawns

Native-Windows desktop/gateway users saw cmd/conhost windows flash on
gateway restart, image paste, the dashboard Projects tree, voice notes,
and ~5 min after closing the app (detached cron). Two root causes:

- Console-subsystem exes (taskkill, schtasks, wmic, netstat, tasklist,
  agent-browser, git, ffmpeg, powershell, git-bash) spawned via raw
  subprocess allocate a fresh console when the launching process has
  none (pythonw desktop backend / detached gateway) - even with output
  captured.
- uv venv pythonw shims re-exec console python.exe, so Python children
  get a console regardless of how they're launched.

Fixes:
- Single hidden-spawn primitive (_subprocess_compat.run/.popen) that ORs
  CREATE_NO_WINDOW on Windows, no-op on POSIX. Route every Hermes-owned
  console-exe spawn through it.
- FreeConsole() catch-all in hermes_bootstrap: any Python child that
  exclusively owns an auto-allocated console detaches it at startup
  (GetConsoleProcessList()==1 gate leaves shared interactive consoles
  untouched).
- Replace PowerShell/wmic gateway PID scans with in-process psutil.
- Skip schtasks queries on non-interactive desktop restarts.
- Prefer native agent-browser .exe over .cmd shims.
- Guard test bans raw subprocess spawns of the Windows-only console
  tools repo-wide so the popup class can't regress.

* fix(windows): scope FreeConsole to background entry points; fix merge fallout

Console detach review (per #53810 feedback): GetConsoleProcessList()==1 can't
tell a uv pythonw->python phantom console apart from a user opening the
interactive CLI/TUI in its own fresh console (double-click, shortcut, ConPTY) —
both report a single attached process with a tty. Running FreeConsole() in the
import-time bootstrap therefore risked detaching a legitimately-interactive
terminal.

- Extract FreeConsole into explicit hermes_bootstrap.detach_orphan_console();
  remove it from apply_windows_utf8_bootstrap() (import side effect).
- Call it only from known background mains: gateway run, dashboard backend
  (start_server, what the desktop spawns), cron standalone, tui_gateway entry,
  slash worker. Interactive CLI/TUI never calls it.
- Behavior-contract tests: frees only when solo owner, leaves shared console,
  no-op without console / on POSIX, and asserts it's not an import side effect.

Merge fallout from origin/main (#53791):
- local.py: 3-way merge left a dangling **_popen_kwargs (NameError crashing
  every terminal init). _subprocess_compat.popen already hides the window, so
  drop it.
- discord adapter: merge stacked an undefined windows_hide_flags() onto the
  primitive call; drop the redundant arg.
- test_gateway: scan now goes psutil-first (zero spawn); rewrite the
  case-variant test to drive that production path.

* test(claw): mock _subprocess_compat.run seam for Windows process scan

claw.py's Windows tasklist/powershell scan routes through the hidden-spawn
primitive; the tests still patched claw_mod.subprocess, so on win32 the mock
was never hit and real spawns returned nothing. Patch the actual seam.
2026-06-27 14:02:24 -07:00
Teknium
ef17cd204d fix(windows): stop subprocess console-window popups + add CI guard (#53791)
* fix(windows): stop subprocess console-window popups + add CI guard

The single biggest source of Windows 'terminal popup' bug reports was bare
subprocess.run/Popen calls spawning a console window. The compat helpers
(windows_hide_flags / windows_detach_popen_kwargs) already existed but the
footgun checker had no rule to stop new bare calls from reintroducing the flash.

- scripts/check-windows-footguns.py: new AST-based rule flagging subprocess
  calls that can create a new console — output-redirection-aware (capture/
  redirect/check_output exempt) and POSIX-only-program-aware (launchctl/
  systemctl/brew/etc. exempt). Comprehensive on real popups, no annotation
  burden on calls that can't flash.
- Swept all genuine window-spawning sites through windows_hide_flags()/
  windows_detach_popen_kwargs(); marked intentionally-visible launches
  (editor/terminal/foreground re-exec) with '# windows-footgun: ok'.
- tests/scripts/test_windows_footgun_subprocess_rule.py: behavior-contract
  tests + full-repo cleanliness invariant.
- CONTRIBUTING.md: documents the rule + the helper pattern.

* test: accept creationflags kwarg in psutil_android fake_subprocess_run

The Windows no-window sweep added creationflags=windows_hide_flags() to
install_psutil_android.py's subprocess.run call; the test's fake stub had a
fixed (cmd) signature and raised TypeError on the new kwarg.
2026-06-27 13:03:51 -07:00
Teknium
3b44a3c8bb feat(moa): show each reference model's output as a labelled block before the aggregator (#53793)
When a MoA preset is selected, each reference model's answer now renders in the
CLI as a thinking-style block labelled with its source model, BEFORE the
aggregator responds — so the mixture-of-agents process is visible instead of a
silent pause. The aggregator's response (and its tool actions) follow as normal.

Mechanism (shared seam, all surfaces):
- MoAChatCompletions/MoAClient take an optional reference_callback and emit
  'moa.reference' (index/count/label/text) per reference, then 'moa.aggregating'
  (aggregator label) once. agent_init wires this to the agent's
  tool_progress_callback, which every surface already consumes — so the events
  reach CLI/TUI/desktop/gateway with no new plumbing.
- CLI _on_tool_progress renders 'moa.reference' as a labelled '┊ ◇ Reference
  i/n — <model>' header + a thinking-style preview (reusing _emit_reasoning_
  preview), and 'moa.aggregating' as a spinner transition. Display-only; never
  touches message history (cache-safe).

Turn-scoped reference cache: the agent loop calls the facade once per tool-loop
iteration, but the advisory message view is identical across iterations within a
turn, so references are now run AND displayed once per user turn (keyed by the
advisory view's signature) instead of re-running/re-spamming on every iteration.
This also cuts reference API cost from O(iterations) back to O(turns).
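
A rough sketch of a turn-scoped cache keyed by the advisory view's signature. The class name `TurnReferenceCache` and the SHA-256-over-JSON signature are assumptions for illustration; the commit does not say how the signature is computed.

```python
import hashlib
import json


class TurnReferenceCache:
    """Hypothetical cache: run reference models once per advisory view."""

    def __init__(self, run_references):
        # run_references is any callable(messages) -> list of reference texts.
        self._run_references = run_references
        self._signature = None
        self._results = None

    @staticmethod
    def _sign(messages):
        # Stable digest of the message view; identical views hash the same.
        payload = json.dumps(messages, sort_keys=True, ensure_ascii=False)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get(self, messages):
        signature = self._sign(messages)
        if signature != self._signature:
            # New turn (the view changed): run the references again.
            self._results = self._run_references(messages)
            self._signature = signature
        return self._results
```

Repeated tool-loop iterations with an unchanged view hit the cache, so the reference cost scales with turns rather than iterations.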

Verified live via interactive PTY on the opus-gpt preset (gpt-5.5 + opus refs):
reference blocks render once per turn, labelled by model, before the aggregator;
fresh blocks on each new turn; aggregator tool actions still execute.

Follow-up: TUI/desktop rich rendering + gateway batched-summary already receive
the events via tool_progress_callback; their surface-specific renderers are a
separate change.
2026-06-27 12:45:23 -07:00
Dale Nguyen
dbbf102b8e fix(terminal): strip VIRTUAL_ENV/CONDA_PREFIX from terminal subprocess env
The Hermes gateway runs inside its own venv, so its process environment
carries VIRTUAL_ENV (and possibly CONDA_PREFIX). The terminal tool spawned
subprocesses inheriting those markers. When the agent ran `uv sync`,
`uv pip install`, `poetry install`, etc. in ANY other project directory,
those tools honored the inherited VIRTUAL_ENV and rebuilt/synced that
project's dependencies into the Hermes venv path — wiping Hermes' own runtime
deps (and, when the other project pinned a different Python, replacing the
interpreter), bricking the gateway on the next restart (#23473).

Strip VIRTUAL_ENV/CONDA_PREFIX in both subprocess-env construction points in
tools/environments/local.py — `_sanitize_subprocess_env` and `_make_run_env`
— via a shared `_ACTIVE_VENV_MARKER_VARS` constant. The Hermes venv stays
reachable because its bin dir is already first on PATH, so removing the
active-environment markers is safe and only prevents the cross-project clobber.
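
A minimal sketch of the marker stripping, assuming the constant holds exactly the two variables named in this message. The helper name `strip_active_venv_markers` is hypothetical; the real logic lives in `_sanitize_subprocess_env` and `_make_run_env`.

```python
import os

# Assumed contents; the commit names the constant but not its exact value.
_ACTIVE_VENV_MARKER_VARS = ("VIRTUAL_ENV", "CONDA_PREFIX")


def strip_active_venv_markers(env=None):
    """Return a copy of env without the active-environment marker variables."""
    source = os.environ if env is None else env
    return {
        key: value
        for key, value in source.items()
        if key not in _ACTIVE_VENV_MARKER_VARS
    }
```

PATH is left untouched, which is why the Hermes venv's executables remain reachable in the child process.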

Adds TestActiveVenvMarkerStripping: end-to-end (markers in os.environ don't
reach the spawned subprocess) and unit coverage for both functions, plus a
guard on the marker constant.

Also adds the AUTHOR_MAP entry for the salvaged contributor.

Closes #23473
2026-06-28 01:04:20 +05:30
Teknium
d470ed0c4c fix(cli): commit tool scrollback lines in verbose mode (non-streaming/MoA) (#53785)
In the interactive CLI, the aggregator's tool calls under a MoA preset (or
any non-streaming model call, e.g. copilot-acp) appeared to overwrite each
other instead of building scrollable history. Each tool only updated the
transient spinner line; no committed scrollback line was printed.

Root cause: persistent tool lines in _on_tool_progress's tool.completed
branch were gated on tool_progress_mode in {all, new}, omitting 'verbose'.
Streaming models hid the bug because _on_tool_gen_start commits a 'preparing'
line per tool during streaming; non-streaming calls (MoA forces
_use_streaming=False) never emit that, so under 'verbose' there was no
committed line at all — only the self-overwriting spinner.

'verbose' shows strictly more than 'all', so it now commits the same scrollback
line. Verified live via interactive PTY on the MoA opus-gpt preset: three
terminal calls in turn 1 and two in turn 2 each render as separate persistent
lines.
2026-06-27 12:29:55 -07:00
Teknium
227e6c0143 fix(moa): resolve context window from the aggregator, not the 256K default (#53780)
A MoA session's model is the preset name (e.g. 'opus-gpt') and its base_url is
the virtual local endpoint, so get_model_context_length() missed every probe
and fell through to the 256K fallback — even when the aggregator is a 1M-context
model. The acting model in MoA IS the aggregator, so resolve the context window
from the aggregator slot's real provider+model.

- model_metadata.get_model_context_length: when provider=='moa', resolve the
  preset's aggregator slot through resolve_runtime_provider and recurse with the
  aggregator's real provider/model/base_url. Explicit model.context_length still
  wins (checked first); falls through to the generic default if resolution fails.

Tests: opus-gpt preset now reports 1M (the aggregator window), config override
still honored.
2026-06-27 12:08:09 -07:00
ailthrim
25ec01f79f fix(desktop): don't purge Electron cache / mirror-retry after a late build failure
`hermes desktop` / `hermes update` recover from a corrupt Electron download by
purging the cached zip + re-downloading and retrying the pack, and then by
falling back to a public mirror. That recovery is only meaningful when the
packaged executable is MISSING — the signature of a partial/corrupt unpack.

A LATE failure such as macOS code signing (#40187) leaves
`Hermes.app/Contents/MacOS/Hermes` (or the platform equivalent) in place.
Re-downloading Electron can't repair a signing failure, so the purge +
slow mirror retry just grind through another identical failure before the
build finally errors out.

Gate both recovery blocks on `_desktop_packaged_executable(desktop_dir) is None`
so a build that already produced the executable fails fast instead of
triggering the destructive download recovery. The corrupt-download path
(executable missing) is unchanged.

Salvage of #42782, re-applied onto current main (the surrounding recovery was
refactored to `_electron_dist_ok` / `_redownload_electron_dist` since the PR
was opened). Adds a regression test asserting no purge / mirror retry runs when
the executable exists, and updates the existing retry/mirror tests to model the
corrupt-download case (executable absent) the recovery is actually for.

Related to #40187 (the residual cache-purge sub-issue; the signing failure
itself is fixed by #52591).
2026-06-28 00:29:34 +05:30
teknium1
1ef19bad90 fix(model): show MoA preset picker on selection and label MoA in the banner
Selecting 'Mixture of Agents' in the `hermes model` provider picker fell
through silently — select_provider_and_model had no moa branch, so it just
reprinted the current model/provider summary and exited. And the CLI session
banner rendered the bare preset name (e.g. 'opus-gpt · Nous Research'),
which is meaningless out of context.

- Add _model_flow_moa: always lists the available presets (even one), then
  prints the full reference-models + aggregator breakdown for the selection
  and persists model.provider=moa / model.default=<preset> (dropping stale
  base_url + endpoint creds, since moa is a virtual local provider).
- Wire the branch into select_provider_and_model.
- build_welcome_banner takes provider; when 'moa' it renders
  'MoA: <preset> · agg <aggregator>' instead of a bare slug. Both CLI call
  sites pass self.provider.

Tests: 2 new banner tests (moa + non-moa unchanged); E2E verified the picker
persists the preset and clears stale base_url/api_key.
2026-06-27 11:45:07 -07:00
konsisumer
1b6ebb24c0 fix(agent): validate OpenRouter provider sort before request dispatch 2026-06-27 11:43:08 -07:00
Teknium
27322612b4 fix(update): route loud build/installer output to update.log instead of the terminal (#53616)
* fix(update): route loud build/installer output to update.log instead of the terminal

hermes update flooded the terminal with the full vite asset dump,
electron-builder logs, npm deprecation warnings from the desktop build,
and the cua-driver installer's 'Next steps' wall. All of that is
low-signal noise the user doesn't need on a successful update.

- Capture the desktop --build-only subprocess (vite + electron-builder)
  into ~/.hermes/logs/update.log; print a one-line status, and on
  failure surface the last 15 lines + a pointer to the full log.
- Capture the cua-driver installer's output when verbose=False (the
  hermes update refresh path); concise upgrade line is unchanged.
- Add _log_only_write() / _run_logged_subprocess() helpers that write to
  the update.log handle without echoing to the terminal.

The repo-root npm install keeps streaming (capture_output=False) — that
is the deliberate #18840 guard so a slow postinstall download doesn't
look hung. The desktop npm install is a separate Electron process with
no such progress concern and is captured.

* fix(update): persist full cua-driver installer output to update.log

The captured cua-driver installer output was only sent to logger.debug
(agent.log) on failure, so the 'Next steps' wall was lost from
update.log entirely on success. Write the full captured output straight
to the update.log handle (sys.stdout._log) on both success and failure,
matching the desktop-build capture, so update.log keeps the complete
record of everything an update did.
2026-06-27 11:43:01 -07:00
ethernet
f53b184c48 fix(ci): pass secrets down to docker workflows 2026-06-27 09:53:28 -07:00
Teknium
190e1ffac9 fix(redact): mask passwords in lowercase/dotted config keys (#53590)
The secret redactor only matched uppercase env-style keys ([A-Z0-9_]),
so config-file assignments like spring.datasource.password=secret,
app.api.key=xyz, and YAML password: secret leaked verbatim when the
agent ran cat/grep on application.properties or .env files (issue #16413).

Adds three case-insensitive config-key matchers that run only in a
config-file context, preserving the existing #4367 (lowercase code/prose)
and web-URL-passthrough carve-outs:
  - _CFG_DOTTED_RE: namespaced keys (contain a dot) — unambiguously config
  - _CFG_ANCHORED_RE: bare secret-word keys at line start (incl. export)
  - _YAML_ASSIGN_RE: unquoted colon config (password: value)
Value capture stops at whitespace and '&' so form bodies stay pair-wise;
the '://' guard keeps intentional web-URL query-param passthrough intact.
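
As an illustration of the dotted-key case only, a deliberately crude sketch follows. The patterns here are hypothetical and much simpler than the real `_CFG_DOTTED_RE`; in particular the secret-word list is an assumption and will over-match keys such as `monkey.name`.

```python
import re

# Hypothetical patterns; the real matchers may differ substantially.
_SECRET_WORD_RE = re.compile(r"password|passwd|secret|token|key", re.IGNORECASE)
_DOTTED_ASSIGN_RE = re.compile(
    # group 1: a key containing at least one dot, plus the '=' separator
    # group 2: the value, stopping at whitespace or '&'
    r"^(\s*[\w-]+(?:\.[\w-]+)+\s*=\s*)([^\s&]+)",
    re.IGNORECASE | re.MULTILINE,
)


def redact_dotted_config(text):
    """Mask values of dotted keys whose name contains a secret word."""

    def _mask(match):
        key_part = match.group(1)
        if _SECRET_WORD_RE.search(key_part):
            return key_part + "***"
        return match.group(0)

    return _DOTTED_ASSIGN_RE.sub(_mask, text)
```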

Reported-by: Murtaza1211
2026-06-27 04:43:28 -07:00
Teknium
917f6bdb00 fix(tools): let vision pick any provider+model, not just OpenRouter (#53606)
* fix(tools): let vision pick any provider+model, not just OpenRouter

hermes tools → configure → vision no longer forces an OPENROUTER_API_KEY.
It now offers the same any-provider surface as the model command: Auto
(use main model / aggregator fallback), pick any authenticated provider +
model, or a custom OpenAI-compatible endpoint. Selections persist to
auxiliary.vision.{provider,model,base_url} — the keys the vision resolver
already reads. Custom endpoint pins provider=custom so base_url routes
correctly. Reconfigure path uses the same picker instead of re-prompting
for OPENROUTER_API_KEY.

* docs: add PR infographic for vision any-provider picker
2026-06-27 04:41:42 -07:00
Brandon Zarnitz
9c81c938d3 fix(approval): honour tirith_fail_open=false on Tirith ImportError (#20733)
check_all_command_guards() swallowed ImportError from tools.tirith_security
with an unconditional pass, leaving tirith_result["action"] as "allow"
regardless of security.tirith_fail_open.  When an operator sets
tirith_fail_open: false they have explicitly opted into fail-closed
behaviour; a missing or broken Tirith module must not silently permit
command execution.

Inside the except ImportError handler, read the live security config.
When tirith_enabled is true and tirith_fail_open is false, synthesise a
"warn"-action Tirith result so the command flows through the normal
approval path (prompt the user, or block in cron/gateway contexts)
instead of bypassing it.  The default tirith_fail_open: true behaviour
is unchanged.
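
The decision made inside the ImportError handler can be sketched as a small pure function. The function name and the shape of the returned dict are assumptions, as is the default of `tirith_enabled` when the key is absent; only the `tirith_fail_open: true` default is stated in this message.

```python
def tirith_result_on_import_error(security_cfg):
    """Pick the Tirith result to use when the Tirith module cannot be imported."""
    enabled = security_cfg.get("tirith_enabled", True)  # assumed default
    fail_open = security_cfg.get("tirith_fail_open", True)
    if enabled and not fail_open:
        # Fail closed: route the command through the normal approval path.
        return {"action": "warn", "reason": "tirith unavailable (fail-closed)"}
    # Fail open (default) or Tirith disabled: allow as before.
    return {"action": "allow"}
```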

Adds three regression tests to tests/tools/test_approval.py:
- fail_open=true  + ImportError → silently allowed (no regression)
- fail_open=false + ImportError → approval callback invoked, command denied
- tirith_enabled=false           → always allowed regardless of fail_open

Fixes #20733

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-06-27 04:41:24 -07:00
Teknium
fe1c1c1121 fix(session_search): demote cron below interactive sessions in discover ranking (#53597)
Cron jobs accumulate large volumes of repetitive vocabulary (recurring
project names, dates, summaries) and out-number a user's interactive
sessions. Under bare BM25 they dominate the top FTS rows, so discover's
early-exit-at-N dedup collects only cron sessions and the user's own
conversations never surface — "recall blindness" (#19434).

- _order_for_recall() stable-sorts FTS rows so interactive sources rank
  above cron before lineage dedup; within each class BM25/recency order
  is preserved. Cron is demoted, not excluded, so it still surfaces when
  it is the only match.
- raise discover scan limit 50 -> 300 so buried interactive matches are
  in hand for the demotion pass.
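
The demotion pass relies on sort stability. A minimal sketch, assuming each FTS row is a dict with a `source` field whose value is `"cron"` for cron sessions (the row shape and the name `order_for_recall` are hypothetical):

```python
def order_for_recall(rows):
    """Move cron rows after interactive rows without reordering either class."""
    # sorted() is stable: rows with equal keys keep their incoming
    # (BM25/recency) order. False sorts before True, so non-cron comes first.
    return sorted(rows, key=lambda row: row.get("source") == "cron")
```

Because cron rows are only moved, not dropped, a query whose sole matches are cron sessions still returns them.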

Fixes the cron-flooding sub-bug of #19434. The split-brain sub-bug is
covered by #52798; the child-session sub-bug is superseded by in-place
compaction.
2026-06-27 04:41:22 -07:00
Teknium
cd592c105c feat(send_message): native WhatsApp media delivery via Baileys bridge (#53598)
send_message with MEDIA:/path to a WhatsApp target previously dropped the
attachment: the WhatsApp branch never passed media_files, the plugin's
_standalone_send accepted the param but only POSTed text, and WhatsApp was
absent from the media-supported platform list.

- send_message_tool: add a Platform.WHATSAPP media block (mirrors Feishu) that
  routes media_files through the whatsapp plugin's standalone_sender_fn, and
  add whatsapp to the supported-media list strings.
- whatsapp adapter: _standalone_send now sends text first (skipped when the
  chunk is media-only), then uploads each file via the bridge /send-media
  endpoint with a mediaType derived from extension/is_voice/force_document, so
  images/videos/voice arrive as native bubbles instead of documents.
- _bridge_media_type classifier maps ext -> image|video|audio|document.
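
A sketch of an extension-based classifier in the spirit of `_bridge_media_type`. The extension sets and the precedence of `force_document` over `is_voice` are assumptions; the commit only states the four output categories and the three inputs.

```python
from pathlib import Path

# Assumed extension sets, for illustration only.
_IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp", ".gif"}
_VIDEO_EXTS = {".mp4", ".mov", ".mkv", ".webm"}
_AUDIO_EXTS = {".mp3", ".ogg", ".opus", ".m4a", ".wav"}


def bridge_media_type(path, is_voice=False, force_document=False):
    """Map a file path to image|video|audio|document."""
    if force_document:
        return "document"
    if is_voice:
        return "audio"
    ext = Path(path).suffix.lower()
    if ext in _IMAGE_EXTS:
        return "image"
    if ext in _VIDEO_EXTS:
        return "video"
    if ext in _AUDIO_EXTS:
        return "audio"
    return "document"
```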

Closes #19105 (remaining send_message gap). Other items in the report
(inbound video paths, image_generate auto-deliver, history dedup, native
gateway bubbles) already landed on main.
2026-06-27 04:40:05 -07:00
Teknium
88c02469cc fix(mcp): never permanently wedge the circuit breaker on a dead transport (#53599)
A long-running gateway session could permanently lose an MCP server: once a
stdio subprocess died (or transient drops accumulated over the session), the
run loop exhausted its reconnect budget and returned, orphaning the task. With
no listener for _reconnect_event, the circuit breaker's half-open probe could
never revive the server — every probe hit a dead/absent session, re-armed the
60s cooldown, and looped forever until a full gateway restart (#16788).

Root cause was split ownership of transport liveness between the run loop and
the tool handler, plus a permanent give-up path. Fixed by one invariant: a
non-shutdown server task is always reconnectable.

- run loop parks (deregisters phantom tools, then awaits _reconnect_event)
  instead of returning when the reconnect budget is exhausted, so the task
  stays alive as a dormant listener
- retry budget resets on every successful (re)connect, so a healthy
  long-lived server can't accumulate lifetime drops into a death sentence
- half-open probe with no live session signals a reconnect (reviving a
  parked/dead task and respawning a dead stdio subprocess) and returns a
  clean 'reconnecting' error instead of writing into a dead pipe
- breaker resets on successful session init across all transports
  (stdio/HTTP/SSE) — fully transport-agnostic, no PID/pipe polling

Builds on the closed-PR cluster for this issue: keeps #49255's deregister-on-
exhaustion insight and #21006's signal-don't-probe insight, discards the racy
os.kill PID machinery.

Co-authored-by: LeonSGP43 <LeonSGP43@users.noreply.github.com>
Co-authored-by: srojk34 <srojk34@users.noreply.github.com>
2026-06-27 04:39:54 -07:00
r266-tech
dbc925b755 Guard oversized Telegram video downloads 2026-06-27 04:39:48 -07:00
Teknium
02b32e2d7c fix(moa): call reference + aggregator models through their provider's real route (#53580)
MoA was calling reference and aggregator models through a bare
call_llm(provider=slot["provider"], model=slot["model"]) with a forced
temperature and a forced max_tokens (the preset's hardcoded 4096). That left
base_url/api_key/api_mode unresolved — so the auxiliary auto-detector guessed
the API surface instead of using the provider's real runtime, and the 4096 cap
truncated long aggregator syntheses.

A MoA slot is just a model selection and must be called the same way any model
is called elsewhere. Each slot is now resolved through resolve_runtime_provider
(the canonical provider→api_mode/base_url/api_key resolver the CLI, gateway, and
delegate_task all use) via a new _slot_runtime() helper, and the resolved
endpoint is passed into call_llm. So a reference/aggregator gets its provider's
actual API surface — MiniMax → anthropic_messages, GPT-5/o-series →
max_completion_tokens, custom endpoints → their base_url — identical to how that
model is handled as the acting model.

MoA also no longer imposes its own output cap: max_tokens defaults to None
(omitted → the model's real maximum) for references and is passed through from
the caller for the aggregator. The preset's hardcoded 4096 is gone. The
max_tokens preset config field is left in place (config/web/desktop unchanged);
it is simply no longer applied as a forced cap.

Tests: slots route through resolve_runtime_provider with resolved base_url/
api_key; resolution errors fall back to bare provider/model; neither call
carries an output cap even when the preset config still contains max_tokens.
2026-06-27 04:39:42 -07:00
herbalizer404
3fe16e3cd5 fix(fallback): attach credential pool after provider switch
When automatic fallback activates a provider that differs from the
primary, try_activate_fallback() cleared the primary's pool (to avoid
cross-provider base_url contamination, #33163) but never loaded the
fallback provider's own pool. The fallback then ran with no pool, so
rate_limit/billing/auth recovery couldn't rotate its credentials.

After clearing a mismatched pool, load_pool(fb_provider) and attach it
when it has credentials, so provider-specific rotation continues to
work on the fallback target.
2026-06-27 04:39:26 -07:00
Tranquil-Flow
635841d210 fix(agent): reload credential pool on switch_model provider change (#52727)
switch_model() swapped model/provider/base_url/api_key but never
refreshed agent._credential_pool, which stays bound to the original
provider. recover_with_credential_pool() then sees a pool.provider !=
agent.provider mismatch and short-circuits — so a 429/401 on the new
provider gets no rotation and falls through to fallback instead.

Reload load_pool(new_provider) inside switch_model when the provider
changes (or the pool is missing). The reload is inside the protected
swap block and the pool is added to the rollback snapshot, so a failed
client rebuild restores the original pool.

Fixes #16678, #52727.
2026-06-27 04:39:26 -07:00
Teknium
2002bb49a7 test(telegram): make config-bridge tests immune to ambient .env pollution (#53594)
test_config_bridges_telegram_group_settings and
test_config_bridges_telegram_user_allowlists asserted the YAML→env bridge
via os.environ. A developer's real ~/.hermes/.env can repopulate TELEGRAM_*
vars during load_gateway_config(): the microsoft_teams plugin runs
load_dotenv(find_dotenv(usecwd=True)) at import time, which walks up from the
cwd (under ~/.hermes/ in worktrees) and reloads the user's .env, defeating the
env-over-YAML bridge for any key present there (e.g. TELEGRAM_GROUP_ALLOWED_CHATS).

Assert the returned PlatformConfig.extra instead — it is parsed straight from
the test's config.yaml and is immune to that ambient leak. free_response_chats
is bridged to the env var only (not extra), and TELEGRAM_FREE_RESPONSE_CHATS
doesn't appear in developer .env files, so it stays a deterministic os.environ
assertion.
2026-06-27 04:36:45 -07:00
Teknium
d4c2217e87 fix(gateway): offload /model switch off the event loop (#53603)
The Telegram/Discord /model command's actual switch calls switch_model()
directly on the asyncio event loop. switch_model() can fall through to a
synchronous models.dev HTTP fetch (requests.get, 15s timeout) on a cold or
expired cache, freezing the gateway for up to 15s and dropping the Telegram
connection while a user switches models.

The picker provider-list and fallback text-list sites were already offloaded
(#41289), but the two _switch_model() calls — the picker callback and the
direct /model <name> path — were not. Wrap both in asyncio.to_thread.
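
The offload pattern, sketched with a stand-in for the blocking call. `blocking_switch_model` and `handle_model_command` are hypothetical names; the sleep simulates the synchronous HTTP fetch. `asyncio.to_thread` requires Python 3.9 or later.

```python
import asyncio
import time


def blocking_switch_model(name):
    """Stand-in for a synchronous switch that may block on network I/O."""
    time.sleep(0.2)
    return f"switched to {name}"


async def handle_model_command(name):
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_switch_model, name)


async def demo():
    ticks = 0

    async def heartbeat():
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    beat = asyncio.create_task(heartbeat())
    result = await handle_model_command("example-model")
    beat.cancel()
    return result, ticks
```

The heartbeat keeps ticking while the switch runs, which is the property the gateway needs to keep its platform connections alive.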

Closes #20525.
2026-06-27 04:36:22 -07:00
Teknium
caf4dcc7ad fix(whatsapp): resolve phone↔LID aliases in adapter DM/group allowlist (#53588)
The adapter-level intake gate (_is_dm_allowed / _is_group_allowed, reached
via _should_process_message) did a raw set-membership check against the
configured allowlist. WhatsApp now delivers inbound DM senders in LID form
(<id>@lid) while operators configure allowlists with phone numbers, so the
check never matched and every DM from an allowed contact was silently
dropped before the gateway authz layer ran.

Route both gates through the existing gateway.whatsapp_identity.
expand_whatsapp_aliases helper (already used by gateway authz and session
keys), which walks the bridge's lid-mapping-*.json session files. Phone and
LID forms now resolve to each other in both directions; exact JID matches,
wildcard, disabled/open policies, and empty-allowlist fail-closed behavior
are all preserved.

Fixes #14486
2026-06-27 04:17:12 -07:00
teknium1
38e7bd8a08 fix(agent): classify 429 'overloaded' bodies as overloaded, not rate_limit
Z.AI / Zhipu reuse HTTP 429 for server-wide overload. The 429 status
path classified these unconditionally as rate_limit with
should_rotate_credential=True, so an overloaded provider exhausted the
credential pool after two errors — fatal for a single-key user, who has
nothing to rotate to.

The credential is valid; the server is just busy. Disambiguate the 429
body against a shared _OVERLOADED_PATTERNS list and route overload
language to FailoverReason.overloaded (retryable, no rotation), matching
the existing 503/529 path and the message-only path (#52890). Genuine
rate limits (no overload language) still rotate.
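
A minimal sketch of the 429 body disambiguation. The pattern strings and the returned dict are assumptions for illustration; the real `_OVERLOADED_PATTERNS` list and `FailoverReason` type are not reproduced here.

```python
# Assumed patterns; the actual shared list may contain different phrases.
_OVERLOADED_PATTERNS = ("overloaded", "server is busy", "temporarily unavailable")


def classify_429(body):
    """Classify an HTTP 429 response body as overload or a genuine rate limit."""
    text = (body or "").lower()
    if any(pattern in text for pattern in _OVERLOADED_PATTERNS):
        # The credential is fine; retry without rotating it.
        return {"reason": "overloaded", "retryable": True,
                "should_rotate_credential": False}
    return {"reason": "rate_limit", "retryable": True,
            "should_rotate_credential": True}
```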

Extracted the inline overloaded tuple that #52890 added into the shared
_OVERLOADED_PATTERNS constant so the status-code and message paths use
one list.

Closes #14038.
2026-06-27 04:16:54 -07:00
ms-alan
16192103f4 fix(config): accept placeholder base_url in custom provider validation
_normalize_custom_provider_entry() ran urlparse() on base_url and dropped
any entry whose value was an un-expanded placeholder, so a caller reaching
the normalizer with raw config (e.g. the Dockerized gateway path) silently
skipped the provider with a 'not a valid URL' warning. Skip URL validation
when the candidate contains a placeholder token — both ${ENV_VAR} env-refs
and bare {region}-style templates — since those are expanded at runtime.
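
One way to express the placeholder check, as a sketch. The regex and the function name `base_url_is_acceptable` are hypothetical, and the fallback URL validation shown here is a simplified assumption rather than what `_normalize_custom_provider_entry()` does.

```python
import re
from urllib.parse import urlparse

# Matches ${ENV_VAR} env-refs and bare {region}-style templates.
_PLACEHOLDER_RE = re.compile(r"\$\{[^{}]+\}|\{[^{}]+\}")


def base_url_is_acceptable(base_url):
    """Accept un-expanded placeholders; otherwise require an http(s) URL."""
    if _PLACEHOLDER_RE.search(base_url):
        # Expanded at runtime, so it cannot be validated as a URL yet.
        return True
    parsed = urlparse(base_url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```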

Closes #14457
2026-06-27 04:15:27 -07:00
HiddenPuppy
b34771fc06 fix(cli): disable prompt_toolkit CPR queries to stop escape-sequence leak (#13870)
prompt_toolkit's renderer sends ESC[6n cursor-position queries before
painting in non-fullscreen mode; the terminal replies ESC[<row>;<col>R.
Over SSH/cloudflared tunnels and slow PTYs these replies race past the
input parser and land in the display as raw '20;1R21;1R' text, and the
pending-CPR future can stall the renderer so the prompt freezes after the
agent's final answer.

Build the prompt_toolkit output with enable_cpr=False so CPR is marked
NOT_SUPPORTED up front and ESC[6n is never sent. This is the root-cause
counterpart to the existing input-side _strip_leaked_terminal_responses
scrubbing. Vt100_Output.from_pty() does not expose enable_cpr in
prompt_toolkit 3.x, so _build_cpr_disabled_output() reproduces its
get_size setup and calls the constructor directly; it returns None on any
failure so startup falls back to the default output.

Verified in a real PTY: baseline emits 1 ESC[6n query, the fix emits 0,
banner/UI render identically. Layout is unaffected — with CPR off the
renderer sizes the prompt to its preferred height (the same fallback
prompt_toolkit uses on any terminal that doesn't answer CPR).

Co-authored-by: Hermes Agent <noreply@nousresearch.com>
2026-06-27 04:15:20 -07:00
LeonSGP43
e7c013494d fix(agent): preserve nested API error bodies 2026-06-27 04:13:53 -07:00
Teknium
5ab4136631 fix(webui): switch provider when Config-page model field changes (#53583)
The dashboard Config tab's Model field is a flat string with no provider
info. _denormalize_config_from_web only updated model.default and kept the
stale provider, so picking an OpenRouter model while the default provider was
ollama-local left provider=ollama-local and every call 404'd.

When the model string actually changes, infer the serving provider — curated
catalog first, then a vendor/model-slug heuristic for non-aggregator providers
— and route the switch through the existing _normalize_main_model_assignment /
_apply_main_model_assignment chokepoints so stale base_url/api_mode/api_key are
cleared on a provider change and preserved on a same-provider re-pick. Saving
an unchanged model never re-detects, so unrelated config saves keep an explicit
provider.

Closes #14058
2026-06-27 04:13:44 -07:00
teknium1
7ee0b68973 fix(gateway,feishu): refuse executor resurrection during real shutdown
Add an explicit _closing guard to both owned executors so the
recreate-on-shutdown path only recovers from an *external* teardown of
the loop default — never resurrects a pool the gateway/adapter itself
stopped. _shutdown_*executor() sets the flag; _get_*executor() raises if
closing; feishu connect() re-arms on reconnect. Updates the gateway
recreate test to assert the refusal contract and adds feishu coverage.
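
The recreate-unless-closing contract can be sketched like this. `OwnedExecutor` is a hypothetical stand-in for the gateway and Feishu executors; it detects an external teardown by catching the RuntimeError that a shut-down ThreadPoolExecutor raises on submit.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class OwnedExecutor:
    """Hypothetical owned pool: recreated after external teardown, never after close."""

    def __init__(self, max_workers=2):
        self._max_workers = max_workers
        self._lock = threading.Lock()
        self._executor = None
        self._closing = False

    def _get(self):
        with self._lock:
            if self._closing:
                raise RuntimeError("executor is shutting down")
            if self._executor is None:
                self._executor = ThreadPoolExecutor(max_workers=self._max_workers)
            return self._executor

    def submit(self, fn, *args):
        try:
            return self._get().submit(fn, *args)
        except RuntimeError:
            with self._lock:
                if self._closing:
                    raise  # real shutdown: refuse to resurrect the pool
                # The pool was shut down by someone else: recreate it.
                self._executor = ThreadPoolExecutor(max_workers=self._max_workers)
                return self._executor.submit(fn, *args)

    def shutdown(self):
        with self._lock:
            self._closing = True
            executor, self._executor = self._executor, None
        if executor is not None:
            executor.shutdown(wait=True)
```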
2026-06-27 04:13:09 -07:00
teknium1
b296915c82 fix(feishu): route blocking SDK calls through an adapter-owned executor
Feishu SDK calls ran on asyncio's shared default executor, so a torn-down
default executor wedged every send with 'Executor shutdown has been called'
and left the gateway a zombie (#10849). The adapter now owns a
ThreadPoolExecutor recreated on demand if shut down, mirroring the
gateway-owned executor change. Routes all 17 self._client SDK calls through
_run_blocking; shuts the pool down on disconnect.
2026-06-27 04:13:09 -07:00
konsisumer
1011c07966 fix(gateway): use owned executor for agent work 2026-06-27 04:13:09 -07:00
LeonSGP43
52a09d8faf fix(byterover): honor auto extract config 2026-06-27 04:04:15 -07:00
teknium1
f062cf076b fix(agent): also treat provider=ollama as an Ollama GLM backend
Follow-up to the #13971 fix: a genuine native Ollama provider reached
through a reverse proxy carries no ollama/:11434 URL signature, so the
restricted detection would miss it. Add provider=="ollama" as an
explicit True case (idea from #14789, @Tranquil-Flow) and cover both it
and the #13971 LiteLLM-proxy-to-zai false-positive with E2E tests.
2026-06-27 04:03:07 -07:00
YuShu
266521b55f refactor(agent): trim docstring per review feedback
Remove commentary about the previous is_local_endpoint() approach
from _is_ollama_glm_backend() — git history suffices.
2026-06-27 04:03:07 -07:00
YuShu
00a8252b7d fix(agent): scope Ollama/GLM stop-to-length heuristic to Ollama only
The _is_ollama_glm_backend() function was too broad: any local endpoint
running a GLM model was treated as Ollama, triggering the stop->length
misreport heuristic introduced in 8011aa3. This caused false truncation
detection on sglang, vLLM, LM Studio, and other non-Ollama servers that
correctly report finish_reason.

When a GLM model on sglang/vLLM returned finish_reason='stop', the agent
mistakenly reclassified it as 'length' if the response didn't end with
a whitelisted punctuation character (ASCII or CJK). This particularly
affected Chinese-language responses and Markdown-formatted text.

Root cause: the is_local_endpoint() fallback assumed any local GLM
endpoint = Ollama. But many non-Ollama servers also run on localhost.

Fix: remove the is_local_endpoint() catch-all. Only detect Ollama via
its distinctive signatures (port 11434, 'ollama' in URL). All other
local servers are assumed to report finish_reason correctly.

This is the correct tradeoff because:
- False negatives (Ollama at custom port, heuristic not triggered) only
  mean the user sees a truncated response — same as having no heuristic
- False positives (non-Ollama server, heuristic wrongly triggered) inject
  spurious continuation messages into the conversation — strictly worse

Adds two tests:
- sglang GLM response is NOT reclassified as truncated
- Ollama GLM on port 11434 still triggers the heuristic as before

Co-authored-by: Hermes Agent <hermes@nousresearch.com>
2026-06-27 04:03:07 -07:00
teknium1
ab1f9b94c5 fix(telegram): accept @username chat_id in delivery paths (#13206)
TELEGRAM_HOME_CHANNEL set to an @username (not a numeric chat ID) crashed
all webhook/cron->Telegram home-channel delivery with 'ValueError: invalid
literal for int()'. The Telegram Bot API accepts both a numeric chat_id and
an @username string; Hermes was force-coercing every chat_id with int().

Add normalize_telegram_chat_id() (returns int for numeric values, passes
@username strings through) and apply it at the Bot API send/edit sites in
the Telegram adapter and the send_message tool. Username targets are now
recognized as explicit targets in _parse_target_ref.
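
A minimal sketch of the normalization described above. The function name comes from this commit; the body is an assumption, and in particular the fallback for strings that are neither numeric nor @username is a guess.

```python
from typing import Union


def normalize_telegram_chat_id(chat_id: Union[int, str]) -> Union[int, str]:
    """Return int for numeric chat IDs; pass @username targets through."""
    if isinstance(chat_id, int):
        return chat_id
    text = str(chat_id).strip()
    if text.startswith("@"):
        return text  # the Bot API accepts @username as chat_id
    try:
        return int(text)  # handles negative supergroup/channel IDs too
    except ValueError:
        # Assumption: leave anything else untouched and let the API reject it.
        return text
```

The point is that int() is only applied where it can succeed, so an @username home channel no longer raises ValueError before the request is sent.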

Reapplies the approach from #13274 (season179), whose branch predated the
gateway/platforms/telegram.py -> plugins/platforms/telegram/adapter.py
relocation. Dupes: #13535 (Tranquil-Flow), #37572 (chewkaah).

Co-authored-by: season179 <season.saw@gmail.com>
2026-06-27 04:01:58 -07:00
teknium1
f2ca3e3d84 fix(gateway): hold _run_restart on _restart_task + explicit cancel-loop skip
Follow-up on the cherry-picked #13173 fix. Holds the _run_restart task in
self._restart_task (a bare asyncio.create_task keeps only a weak reference,
so a still-pending task can be GC'd mid-flight) and explicitly skips it in
the _stop_impl cancel loop alongside _stop_task. Adds AUTHOR_MAP entry for
the contributor and a regression test that fails when the task is cancellable.
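
The two parts of this follow-up (strong reference plus explicit skip in the cancel loop) can be illustrated with a toy model. `MiniGateway` and its methods are hypothetical stand-ins, not the gateway's real classes.

```python
import asyncio


class MiniGateway:
    """Toy stand-in for the gateway; only the task bookkeeping is modeled."""

    def __init__(self) -> None:
        self._restart_task = None
        self._background_tasks = set()
        self.restarted = False

    async def _run_restart(self) -> None:
        await asyncio.sleep(0)  # stand-in for awaiting the stop task
        self.restarted = True

    def request_restart(self) -> None:
        # Keep a strong reference: the event loop only holds weak
        # references to tasks, so an unreferenced task may be collected.
        self._restart_task = asyncio.create_task(self._run_restart())

    def cancel_background_tasks(self) -> None:
        for task in list(self._background_tasks):
            if task is self._restart_task:
                continue  # never cancel the restart task itself
            task.cancel()


async def demo():
    gw = MiniGateway()
    worker = asyncio.create_task(asyncio.sleep(3600))
    gw._background_tasks.add(worker)
    gw.request_restart()
    gw._background_tasks.add(gw._restart_task)  # worst case: tracked anyway
    gw.cancel_background_tasks()
    await asyncio.gather(worker, return_exceptions=True)
    await gw._restart_task
    return gw.restarted, worker.cancelled()
```

Running `asyncio.run(demo())` cancels the ordinary worker while the restart task runs to completion.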

Refs #12875
2026-06-27 03:57:31 -07:00
zeapsu
1ce5d6d974 fix(gateway): exclude _run_restart from _background_tasks to prevent zombie on /restart
When request_restart() adds _run_restart to _background_tasks, _stop_impl
later cancels all entries in that set.  Since _run_restart is awaiting
_stop_task at that point, the CancelledError propagates into _stop_impl,
interrupting cleanup before _shutdown_event.set() and _exit_code = 75
execute.  This leaves the gateway as a zombie (alive but disconnected) or
exiting with code 0 instead of 75, preventing systemd Restart=on-failure
from restarting the service.

Fix: don't add _run_restart to _background_tasks — it self-terminates in
~50ms and needs no lifecycle management.

Fixes #12875
2026-06-27 03:57:31 -07:00
teknium1
08e131f77c test(telegram): cover bot self-message ingestion guard (#11905)
Regression tests for the self-author guard added in the salvaged fix:
- bot-authored DM-topic watcher echo is dropped (the exact #11905 symptom)
- bot self-messages dropped in groups/supergroups too
- other bots in the same chat are still processed (self-id, not is_bot)
- observe-unmentioned sibling path also rejects self-messages
- missing from_user does not crash
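
The behavior these cases pin down can be sketched as follows. The real `_is_own_message()` lives in the Telegram adapter; this standalone version and its signature are assumptions.

```python
from types import SimpleNamespace


def is_own_message(message, bot_id: int) -> bool:
    """Hypothetical guard: True only when the sender is this bot."""
    sender = getattr(message, "from_user", None)
    if sender is None:
        return False  # channel posts etc. carry no from_user; do not crash
    # Compare against our own ID, not is_bot, so other bots still get through.
    return getattr(sender, "id", None) == bot_id


# Illustrative messages (IDs are made up).
own = SimpleNamespace(from_user=SimpleNamespace(id=111, is_bot=True))
other_bot = SimpleNamespace(from_user=SimpleNamespace(id=222, is_bot=True))
anonymous = SimpleNamespace(from_user=None)
```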

Test scaffolding ported from @cola-runner's PR #12817 and adapted to the
current plugins/platforms/telegram/adapter.py and _is_own_message().
2026-06-27 03:56:52 -07:00
Sahil-SS9
6fb25f86ac fix(telegram): filter out bot's own messages from inbound processing (#52363) 2026-06-27 03:56:52 -07:00
Teknium
68a65ed7a1 fix(agent_init): correct misleading sub-64K context_length error message (#53569)
The error raised when a model's context window is below the 64K minimum
advertised "or set model.context_length in config.yaml to override" — but
the guard intentionally has no sub-64K escape hatch. Sub-64K models are
rejected by design (tool schemas + system prompt need the headroom).

The misleading clause invited a cluster of dup PRs (#11097, #11110, #8962,
#9142, #37548) all trying to wire an override that we don't want. Reword to
state the real options: pick a >=64K model, or — if your local server
under-reports its true window — declare the real value (which must itself
be >=64K). Guard behavior is unchanged.
2026-06-27 03:56:25 -07:00
Teknium
d73078e7b0 fix(cron): make per-profile cron isolation intentional and tested (#4707) (#53570)
A profile's cron jobs now provably live in AND execute under that profile's
HERMES_HOME. A job authored under profile `coder` is stored at
`~/.hermes/profiles/coder/cron/jobs.json` and runs with coder's .env,
config.yaml, scripts and skills — never the default root's.

This was the de-facto behavior on main but only by accident: PR #50112 had
re-anchored cron storage at the shared default root, and a later stale-branch
squash merge (#52147) silently reverted it back to the profile home. Neither
direction was guarded by a test, so it could flip again on the next stale merge.

Changes:
- cron/jobs.py: document the per-profile storage anchor (get_hermes_home, NOT
  get_default_hermes_root) and why anchoring at the root leaks
  config/credentials/skills across profiles — the #4707 security boundary.
- cron/scheduler.py, cron/suggestions.py: same intent documented at the
  dynamic resolution helper and the suggestions store.
- tests/cron/test_cron_profile_isolation.py: pin storage, lock-path, and
  execution-home resolution to the active profile so a re-anchor can't regress.

Verified E2E: jobs created under two profiles land in separate per-profile
stores with zero cross-profile leakage and no shared-root store; scheduler
execution-home follows the active profile. Full cron suite: 576/576.
2026-06-27 03:55:01 -07:00
Bartok
864d5521ad test(curator): join straggler curator-review thread on fixture teardown
The curator_env fixture left async review threads (synchronous=False spawns
a daemon 'curator-review' thread that calls save_state() on completion)
running past test teardown. save_state() resolves the state path from
HERMES_HOME at write time, so a straggler could write into the next test's
tmp home, corrupting test_state_file_survives_corrupt_read (and others)
under CI load. Join the thread on teardown while HERMES_HOME is still
pinned to this test's home.
2026-06-27 03:52:52 -07:00
Bartok9
45ce35ed72 fix(agent): classify message-only 'overloaded' as server overload
Salvage of #14261 by @ms-alan — rebased onto current main, scoped to the
overloaded-classification fix, with a regression test that fails without it.
2026-06-27 03:52:52 -07:00
teknium1
151ae1e937 test(api-server): cover SSE failure finish_reason for both failure modes
Lock the contract that a clean stream-queue termination followed by an
agent failure never reports finish_reason: "stop". Covers the raised-
exception case (#12422 repro), the flagged failed-result case, truncation
(length), and the success happy path.

Follow-up to the salvaged #12504 fix from @flobo3.
2026-06-27 03:52:44 -07:00
flobo3
b8b695e2cd fix(api): surface agent crash in SSE chat completions stream 2026-06-27 03:52:44 -07:00
Teknium
f67c0b3e60 docs(hermes-agent skill): cover v0.13–v0.17 features, fix stale claims, tighten (#53566)
Refresh the hermes-agent skill against the last 5 major releases and the
current codebase, and cut verbose prose.

Coverage added (v0.13.0–v0.17.0):
- New gateway platforms: iMessage (Photon), Teams, LINE, SimpleX, ntfy,
  Google Chat, Raft, official WhatsApp Business Cloud API (now 20+).
- New surfaces section: desktop app, web dashboard admin panel,
  hermes proxy (OpenAI-compatible OAuth proxy), Automation Blueprints.
- delegate_task(background=true) async subagents; memory-tool atomic
  batch operations; session_search three-mode shape; x_search/video_analyze
  toolsets; image_gen image-to-image; xAI Grok via SuperGrok OAuth.
- display.interface (cli/tui), curator.consolidate opt-in, PyPI install.

Accuracy fixes:
- Adding-a-Tool is two files (auto-discovery), not three.
- Testing uses scripts/run_tests.sh (canonical runner), not bare pytest.
- Dropped change-detector test count and a dangling references/ pointer.
- Refreshed overview (Windows-native, 20+ providers, many surfaces).

Conciseness: trimmed over-explained Windows keybinding/sandbox/test prose
and deep prompt-builder internals to pointers.
2026-06-27 03:51:25 -07:00
Teknium
d3db73210c chore(release): map blaryx@gmail.com → Blaryxoff for PR #32602 salvage 2026-06-27 03:48:18 -07:00
blaryx
76af2456a2 fix(dashboard): merge PUT /api/config with existing on-disk config
The dashboard form is built from CONFIG_SCHEMA, which doesn't enumerate
every root-level key the YAML supports. Most visibly, `custom_providers`
is in `_KNOWN_ROOT_KEYS` but is absent from the schema — so the frontend
never sends it in the PUT body. The previous full-replace save() then
silently wiped the key from disk every time the user clicked anything
that triggered a save. Other casualties (less visible because defaults
re-mask them on load) include `agent.personalities`,
`agent.reasoning_effort`, `terminal.lifetime_seconds`, etc.

Fix: read the raw on-disk config and deep-merge the incoming PUT body
on top of it before saving. The frontend can only overwrite what it
explicitly sends; everything else is preserved verbatim.

Reuses the existing `_deep_merge` helper from `hermes_cli.config`.
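
For illustration, a deep merge of this kind might look like the sketch below. It is not the actual `_deep_merge` from `hermes_cli.config`; treating lists as whole values and the `max_turns` key in the example are assumptions.

```python
import copy


def deep_merge(base: dict, override: dict) -> dict:
    """Return base with override layered on top, recursing into dicts."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists are replaced wholesale (assumption).
            merged[key] = copy.deepcopy(value)
    return merged


# On-disk config has keys the dashboard form never sends.
on_disk = {
    "custom_providers": [{"name": "my-proxy"}],
    "agent": {"personalities": {"terse": "..."}, "max_turns": 10},
}
put_body = {"agent": {"max_turns": 20}}
saved = deep_merge(on_disk, put_body)
```

Here `saved` keeps `custom_providers` and `agent.personalities` while picking up the new `agent.max_turns`, which is the round-trip property the two tests assert.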

Tests:
- `test_round_trip_preserves_custom_providers` exercises the exact bug:
  seed config with custom_providers, GET → drop the key → PUT,
  assert it's still on disk.
- `test_round_trip_preserves_schema_invisible_nested_keys` covers the
  shallow-vs-deep-merge case for nested dicts under `agent` etc.
Both fail on current main; both pass with this patch.
2026-06-27 03:48:18 -07:00
Teknium
ec769e49d2 fix(gateway): WhatsApp/Signal hints affirm markdown instead of forbidding it (#53564)
The 'whatsapp' and 'signal' PLATFORM_HINTS told the agent 'Please do not
use markdown as it does not render' — factually wrong. Both adapters
actively convert markdown to native formatting:

- whatsapp_common.format_message(): **bold**, ~~strike~~, # headers,
  links, code blocks -> WhatsApp native syntax
- signal_format.markdown_to_signal(): same conversions via bodyRanges,
  plus '- item' / '* item' bullets -> '• ' Unicode bullets

The wrong hint made the agent strip bullets and bold the adapter would
have rendered (#12224). Rewrote both hints to mirror whatsapp_cloud:
markdown is auto-converted, bullet lists work, tables are not supported.
Added a contract test asserting markdown-converting platforms never
forbid markdown in their hint.
2026-06-27 03:46:41 -07:00
teknium1
a5d1f68c74 refactor(moa): share one virtual-provider row builder across pickers
Follow-up on the gateway-picker salvage: the cherry-picked change added a
second copy of the MoA virtual-provider row in model_switch.py, duplicating
inventory._moa_provider_row (same slug/name/preset-models, identical extra
fields). Make _moa_provider_row take a bare current_provider string and reuse
it from the gateway picker path so the row shape lives in one place and the
two surfaces can't drift.
2026-06-27 03:43:38 -07:00
dodo-reach
ed54469d06 fix(gateway): show MoA presets in model picker 2026-06-27 03:43:38 -07:00
Teknium
789f8b7dc2 docs(webhook): clarify authenticated != trusted-content trust model (#53562)
HMAC validation authenticates the webhook sender, not the business
fields inside the payload (PR titles, commit messages, issue bodies),
which are authored by untrusted third parties. Expand the prompt-
injection section to make the trust boundary explicit: the agent's
capability surface, not the input channel. Document the hardening
levers (sandbox the runtime, scope the toolset, keep approvals on,
template narrowly) instead of pretending to sanitize untrusted text.

Refs #8820.
2026-06-27 03:43:33 -07:00
teknium1
4e0788783b refactor(gateway): extract MoA one-shot restore helper; restore #28686 comment; real-method tests
Follow-up on the salvaged MoA restore fix:
- Extract the finally-block restore into _restore_moa_one_shot() so the
  behavior is unit-testable without re-implementing it, and so the gateway
  /moa handler and the finally block share one implementation.
- Restore the load-bearing #28686 zombie-eviction comment above
  _release_running_agent_state that the original diff dropped.
- Rewrite the tests to call the real _restore_moa_one_shot helper (the
  originals re-implemented the restore logic inline, so they passed
  regardless of the production code).
2026-06-27 03:43:28 -07:00
srojk34
2f29e3cfc5 fix(gateway): restore MoA one-shot model override on failed turns
The MoA one-shot restore ran inside the try block after
_handle_message_with_agent returned. When that call raised an
exception (agent init failure, interpreter shutdown, OOM), the
restore was skipped and the MoA model override stayed permanently
on _session_model_overrides — silently routing all subsequent
messages through the MoA reference fan-out with no user-visible
indication.

Move the restore to the finally block so it fires on every exit
path (success, exception, interrupt). The restore data lives on
the per-turn event object and would be lost if not consumed here.
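
The shape of the fix, as a sketch: the names `run_turn` and `moa_restore`, and the dict-based event, are hypothetical simplifications of the gateway's real objects.

```python
_MISSING = object()


def run_turn(overrides: dict, session_id: str, event: dict, run_agent):
    """Run one turn; always undo a one-shot model override afterwards."""
    try:
        return run_agent(event)
    finally:
        # Runs on success, exception, and interrupt alike.
        prior = event.pop("moa_restore", _MISSING)
        if prior is not _MISSING:
            if prior is None:
                overrides.pop(session_id, None)  # there was no override before
            else:
                overrides[session_id] = prior


def failing_agent(event):
    raise RuntimeError("agent init failure")


def demo() -> dict:
    overrides = {"s1": "moa/default"}
    event = {"moa_restore": "previous-model"}
    try:
        run_turn(overrides, "s1", event, failing_agent)
    except RuntimeError:
        pass
    return overrides
```

With the restore inside the try block instead, the failing turn above would leave `"moa/default"` in place for every later message.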
2026-06-27 03:43:28 -07:00
briandevans
17cb829991 test(moa): cover non-list/bare-dict reference_models normalization 2026-06-27 03:43:16 -07:00
briandevans
8dd4e576d0 fix(moa): tolerate non-list reference_models in hand-edited MoA preset config 2026-06-27 03:43:16 -07:00
Teknium
60f58a2b95 feat(verify-on-stop): default OFF, one-time migration, skip doc-only edits (#53552)
The verify-on-stop guard fired too eagerly — including on doc/markdown/skill
edits with nothing to verify, where it pushed a pointless /tmp verification
script. Three changes:

1. Default OFF for new installs: agent.verify_on_stop defaults to false
   (was the "auto" surface-aware sentinel). _config_version bumped 30 -> 31.
2. One-time migration (v30 -> v31): existing installs are switched off once,
   but only when the value is missing or still the "auto" sentinel — an
   explicit true/false the user set is preserved.
3. Path filter: build_verify_on_stop_nudge() now drops documentation/prose
   paths (.md/.mdx/.rst/.txt/LICENSE/CHANGELOG/...) so even when explicitly
   enabled, a doc-only turn never nudges. Mixed doc+code turns still nudge on
   the code paths.
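
A rough sketch of the path filter in point 3. Only the suffixes and basenames spelled out above are included; the real list is longer, and `code_paths_only` is a hypothetical name, not part of `build_verify_on_stop_nudge()`.

```python
from pathlib import PurePosixPath

# Only the entries named in the commit message; the real list continues.
DOC_SUFFIXES = {".md", ".mdx", ".rst", ".txt"}
DOC_BASENAMES = {"LICENSE", "CHANGELOG"}


def code_paths_only(paths):
    """Drop documentation/prose paths; keep everything else."""
    kept = []
    for raw in paths:
        path = PurePosixPath(raw)
        if path.suffix.lower() in DOC_SUFFIXES:
            continue
        if path.name.upper() in DOC_BASENAMES:
            continue
        kept.append(raw)
    return kept
```

A doc-only turn yields an empty list (no nudge), while a mixed turn keeps just its code paths.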

The legacy "auto" sentinel is still honored when set explicitly (ON for
interactive coding surfaces, OFF for messaging). HERMES_VERIFY_ON_STOP env
override unchanged.
2026-06-27 03:23:22 -07:00
teknium1
29ee4bbff6 refactor(dashboard): tighten cron-job form helpers
Collapse the three near-identical optional-text helpers
(optionalText/optionalBaseUrl/listToText) into one optionalText with a
strip-trailing-slash flag, route listToText + toolsets through the
existing splitCronList, and replace the repeated
typeof x === 'string' ? x : '' ladders with a single asString helper.
Behavior-identical; all 16 vitest cases pass.
2026-06-27 03:20:32 -07:00
Versun
c655cdf2c1 feat(dashboard): expose cron job execution fields 2026-06-27 03:20:32 -07:00
teknium1
50f6855217 feat(moa): make /moa one-shot only; route preset switching through the model picker
/moa no longer does a sticky model switch. It now always runs a single
prompt through the default MoA preset and restores the prior model
afterward; the whole argument is the prompt (no preset-name matching).
To switch to a MoA preset for the session, select it from the model
picker, where presets already surface under a virtual Mixture of Agents
provider on every model-selection surface.

Also fixes #53444: the TUI one-shot only set session[model_override],
which the already-built cached agent ignored, so MoA silently never ran
and the turn used the original model. The TUI now does a real in-place
agent.switch_model() via _apply_model_switch() when a live agent exists
(with a proper restore after the turn), and falls back to a model_override
for lazy/unbuilt sessions.

Removes the redundant sticky-switch branch from the CLI, gateway, and TUI
/moa handlers; updates the command description, usage string, and docs.
2026-06-27 03:09:09 -07:00
teknium1
3cd4693494 chore: add DiamondEyesFox to AUTHOR_MAP for PR #53351 salvage 2026-06-27 03:04:26 -07:00
diamondeyesfox
8df231c941 fix(agent): rebaseline in-place compression flushes 2026-06-27 03:04:26 -07:00
Mahesh Sanikommu
1b75b3fd90 feat(memory): add Supermemory setup connection summary
Add post_setup() and get_status_config() to the Supermemory memory
provider so `hermes memory setup` and `hermes memory status` print a
one-line connection summary (container, profile fact count,
auto_recall/auto_capture). Point API-key onboarding at the Hermes
connect URL (app.supermemory.ai/integrations?connect=hermes).

Salvage of #52988. Two fixes folded in:

- Test isolation: the new probe/status tests mocked _SupermemoryClient
  but not the __import__("supermemory") guard inside
  _probe_supermemory_connection, so they passed only where the optional
  supermemory package was installed and failed on a clean checkout / CI
  (the PR shipped with red CI). Added _stub_supermemory_importable()
  mirroring the existing test_is_available_false_when_import_missing
  pattern; the suite now passes with supermemory absent.

- post_setup: `if api_key and api_key not in os.environ` checked whether
  the key's *value* named an env var (always false in practice). Fixed to
  compare the value: `os.environ.get("SUPERMEMORY_API_KEY") != api_key`.
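
The post_setup bug is easy to reproduce in isolation. The helper names below are hypothetical and `env` is a plain dict standing in for `os.environ`; the key value is made up.

```python
def needs_update_buggy(api_key: str, env: dict) -> bool:
    # `in` on a mapping tests KEYS, so this asks whether an env var
    # *named* like the key's value exists.
    return bool(api_key) and api_key not in env


def needs_update_fixed(api_key: str, env: dict) -> bool:
    # Compare against the value actually stored under the variable name.
    return bool(api_key) and env.get("SUPERMEMORY_API_KEY") != api_key


env = {"SUPERMEMORY_API_KEY": "sm_example_key"}
```

With the key already exported, the buggy check still reports that an update is needed, while the fixed one does not.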

Verified: 38/38 in test_supermemory_provider.py and the full
tests/plugins/memory/ suite green with supermemory not installed.

Closes #52988
2026-06-27 15:07:34 +05:30
underthestars-zhy
8827300267 fix(photon): correlate tapbacks to bot message context
Populate `reply_to_message_id`, `reply_to_text`, and
`reply_to_is_own_message` on reaction events so the gateway injects
`[Replying to your previous message: "..."]` when the agent receives
a tapback.

The sidecar now extracts a capped text preview from the hydrated
reaction target (plain text and mixed group messages; null for
attachment/voice-only targets), emitting it as `targetText` in the
NDJSON reaction payload. The Python adapter reads this field and sets
the reply correlation fields on the `MessageEvent`.
2026-06-27 00:51:34 -07:00
underthestars-zhy
4345b3e767 fix(photon): upgrade spectrum-ts sidecar to v8.0.0
v8 made `richlink` outbound-only; inbound rich links now arrive as
plain `text`. Remove the `getBalloonBundleId`/`toRichlinkMessage`
branches from the iMessage mapper patch and update the fixture,
lockfile, and README accordingly.
2026-06-27 00:51:34 -07:00
underthestars-zhy
5636c22828 feat(photon): upgrade spectrum-ts sidecar to v7.0.0
Update the Photon platform plugin's Node.js sidecar from spectrum-ts
3.1.0 to 7.0.0, which splits the SDK into scoped `@spectrum-ts/*`
packages with `spectrum-ts` as the umbrella re-export.

- Bump exact pin in package.json/package-lock.json to 7.0.0
- Update mixed-attachments patch script to target the new
  `@spectrum-ts/imessage/dist/index.js` path and tab-indented output
- Rewrite test fixture to match v7.x mapper shape (tab-indented,
  `const ... = async` declarations, single-line builder calls) and
  point at `@spectrum-ts/imessage/dist/index.js`
- Update README upgrade guide to document the v5 package split and
  the postinstall patch validation step
- Update comments in cli.py and index.mjs to reference v5/v7 changes
2026-06-27 00:51:34 -07:00
Teknium
d712a7fd73 fix(model-picker): surface the current custom/uncurated model in picker rows (#53457)
A model selected via the CLI (e.g. /model openrouter/<uncurated-name>) was
absent from every model picker — the main picker AND the MoA reference/
aggregator slot pickers — because each provider row only carried its curated
catalog. Inject the current model at the front of its provider's row so it is
selectable and shown everywhere.
2026-06-27 00:06:34 -07:00
Ben Barclay
fbf748b282 fix(dashboard-auth): follow redirects on self-hosted OIDC discovery (#53399)
The self-hosted OIDC provider fetched the discovery document with a bare
httpx.get(). httpx defaults to follow_redirects=False (unlike curl -L or
the requests library), so when an IDP answers GET
/.well-known/openid-configuration with a 3xx — Authentik canonicalises the
.well-known path, and any IDP behind a reverse proxy doing an http→https
upgrade redirects too — the bare redirect (empty body) tripped the
status != 200 guard and raised 'OIDC discovery returned 302', which
routes.py maps to the provider_unreachable audit event and a 503. The
browser surfaced 'Auth provider self-hosted unreachable'.

The user's smoking gun (curl -o writing zero bytes from inside the
container) is exactly a redirect with no body — the same wall the code hit.

Add follow_redirects=True to the discovery GET only. It's safe: the
issuer-pin check and _require_https_or_loopback still validate the resolved
document and every endpoint, so a redirect can't smuggle in a bad issuer or
a cleartext endpoint. The token/revocation POSTs deliberately keep the
no-follow default (they carry an auth code / refresh token and the endpoint
is already the canonical absolute URL).

Existing discovery tests mocked httpx.get with a canned 200 and never
exercised a real 3xx. Add a regression test that runs a real loopback
server returning a 302 on the .well-known path — fails without the fix
(ProviderError: discovery returned 302), passes with it.
2026-06-27 14:14:51 +10:00
ethernet
dd0e4ab81a change(ci): slice files in matrix job
avoid duplicating work, avoid file discovery on each job
2026-06-26 19:15:18 -07:00
ethernet
1a75387fa8 change(ci): log json decode error in durations 2026-06-26 19:15:18 -07:00
ethernet
707ae6e623 change(tests): don't count with pytest collect
it's way too slow. just grep files lol
2026-06-26 19:15:18 -07:00
ethernet
bcc3eb3419 fix(ci): rip out some xdist legacy stuff... how did these ever work?? 2026-06-26 19:15:18 -07:00
ethernet
2fa66950e8 change(ci): upload-artifact from v4 -> v7 2026-06-26 19:15:18 -07:00
ethernet
4b0a2040e7 change(ci): use run_tests in docker 2026-06-26 19:15:18 -07:00
ethernet
18f7ad49ab change(ci): update all UV installs 2026-06-26 19:15:18 -07:00
ethernet
f0cb049217 change(ci): migrate docker smoketests to real tests 2026-06-26 19:15:18 -07:00
ethernet
2bd17221b7 change(ci): pretty names 2026-06-26 19:15:18 -07:00
ethernet
9a861cd0ab change(tests): don't pass pytest args when counting tests 2026-06-26 19:15:18 -07:00
ethernet
447f9e7c89 change(nix): simpler dev setup 2026-06-26 19:15:18 -07:00
ethernet
8ae793d3de change(nix): ship fat hermes agent by default 2026-06-26 19:15:18 -07:00
ethernet
fb1dd1bf91 change(ci): docker-publish.yml -> docker.yml 2026-06-26 19:15:18 -07:00
ethernet
35dfe7b58f change(ci): docker runs again on PRs 2026-06-26 19:15:18 -07:00
ethernet
4cf69f0da4 refactor(ci): more test slices 2026-06-26 19:15:18 -07:00
ethernet
d4aec4e92f refactor(ci): run tests thru run_tests.sh 2026-06-26 19:15:18 -07:00
ethernet
c918d07b50 refactor(ci): rewrite docker tests to check built container 2026-06-26 19:15:18 -07:00
ethernet
638243726e refactor(ci): faster docker builds via --link and chmod removal 2026-06-26 19:15:18 -07:00
brooklyn!
f6e815e378 Merge pull request #53357 from helix4u/fix/desktop-titlebar-overlay 2026-06-26 20:42:11 -05:00
Gille
1bff85cf66 fix(desktop): keep titlebar overlay off session title 2026-06-26 19:18:26 -06:00
Nacho Avecilla
dbe734beff fix(dashboard-auth): exclude non-interactive providers from interactive login surfaces (#53239)
* Return None instead of erroring on drain login failure

* Fix login on drain

* Remove login for drained endpoints flow and clean the code

* chore: drop unrelated credits changes from this PR

* Remove extra comments that were not really necessary
2026-06-27 10:08:13 +10:00
brooklyn!
7a38d64a85 Merge pull request #53335 from NousResearch/bb/desktop-custom-model-blank-selector
fix(desktop): show custom (non-curated) model in Settings model pickers
2026-06-26 18:59:43 -05:00
Brooklyn Nicholson
a6ae179f43 fix(desktop): show custom (non-curated) model in Settings model pickers
A Radix <Select> renders a blank trigger when its `value` matches no
<SelectItem>. The Settings model pickers built their options solely from
each provider's curated `models` list, so a model added via config that
isn't in that list (e.g. anthropic/claude-opus-4.7 on nous) selected
nothing and showed an empty selector.

Union the active value into the options via a small `withActive` helper,
applied to the main, auxiliary, MoA reference, and MoA aggregator model
selects so the configured model always stays visible and selectable.
2026-06-26 18:55:44 -05:00
kshitijk4poor
7475d125d2 test(mcp): stub mcp_oauth in backgrounding test to deflake CI
The backgrounding-contract test (test_prepare_agent_startup_backgrounds_
blocking_mcp_for_chat) failed intermittently on loaded CI shards: it stubs
tools.mcp_tool.discover_mcp_tools but NOT tools.mcp_oauth, so the background
discovery thread paid the real, cold ~0.75s 'import tools.mcp_oauth' (added by
this PR's _discover_mcp_tools_without_interactive_oauth) before calling the
stubbed discovery. On a slow/loaded runner that import plus thread scheduling
exceeded the 1.0s polling deadline, leaving calls['mcp'] == 0.

Fix: stub tools.mcp_oauth with a nullcontext suppress_interactive_oauth (the
same no-op production falls back to when mcp_oauth is unavailable), so the
test exercises the backgrounding contract without paying an unrelated cold
import in its timing window. Bumped the poll deadline 1.0s -> 3.0s as
belt-and-suspenders. Production behaviour is unchanged; the import cost was
always off the main thread.

Verified: 5/5 pass repeatedly via scripts/run_tests.sh (per-file isolation,
matching CI), ruff clean.
2026-06-27 04:59:23 +05:30
zapabob
e55ddc3e33 fix(mcp): suppress interactive OAuth stdin prompts during background discovery (#35927)
When an MCP server requires OAuth, the interactive `hermes` TUI froze on
startup: background MCP discovery hit the OAuth flow, which on an interactive
TTY spawns a daemon thread doing a blocking `sys.stdin.readline()` (the
"paste the redirect URL" fallback in mcp_oauth._wait_for_callback). That
thread competes with the TUI's own stdin reader for the same terminal, so
keystrokes get swallowed and the TUI appears frozen (up to the 300s OAuth
timeout). Reported symptom: "MCP OAuth: authorization required / Open this URL
... the tui is freezing, not respond to typing."

Add a thread-local `suppress_interactive_oauth()` context manager in
tools/mcp_oauth.py; `_is_interactive()` returns False while it's active, so the
stdin paste-thread and prompt are never created. Background discovery
(hermes_cli/mcp_startup.py, tui_gateway/entry.py) now runs discovery inside
that context, so OAuth-requiring servers soft-skip (raise
OAuthNonInteractiveError, already handled) instead of stealing the TUI's stdin.
A real `hermes mcp login` on the main thread is unaffected (thread-local).

Salvaged from #35945 by @zapabob (authorship preserved via cherry-pick;
resolved a conflict against main's new mcp_discovery_timeout / wait_for_mcp_
discovery refactor, keeping both). Verified E2E: with suppression the paste
prompt is NOT printed and no stdin thread spawns (raises OAuthNonInteractiveError,
a soft-skip); without it the prompt shows (the freeze). Mutation-verified
(removing the suppress check in _is_interactive fails the regression test).
76 tests pass, ruff clean.

Closes #35927.

SELF-REVIEW FIX: the original #35945 used threading.local(), which does NOT
propagate to the dedicated mcp-event-loop thread where OAuth actually runs
(discover_mcp_tools dispatches the connect via run_coroutine_threadsafe), so
the suppression was a NO-OP in production (the tests passed only by stubbing
out the cross-thread dispatch). Converted to a contextvars.ContextVar, which
asyncio copies onto the scheduled coroutine — empirically verified suppression
now holds on the mcp-event-loop thread through the real _run_on_mcp_loop path.
Added a cross-thread regression test (fails on threading.local, passes on the
ContextVar) so the no-op can't regress.
2026-06-27 04:59:23 +05:30
briandevans
2d8c44ac87 fix(hermes-home): only honour legacy dir layout when it has content
get_hermes_dir(new_subpath, old_name) returned the legacy <old_name>/
location as soon as it existed on disk — even when empty. When an empty
legacy stub is created on a profile that already has populated data at
the new consolidated <new_subpath>/ (install scaffolds, profile init, a
stray mkdir, or ensure_hermes_home() recreating legacy dirs), the
resolver silently flipped to the empty legacy dir and the real data
became invisible. No log, no error — the feature behaved as if state was
wiped. Reproduced as a Discord pairing store losing every approved user
when an empty pairing/ shadowed the populated platforms/pairing/.

Resolve the legacy path only when it has content: a populated directory
(any entry) or a non-directory file counts; an empty directory falls
through to the new layout. Inspection failures (PermissionError on
lstat/iterdir, or any OSError short of FileNotFoundError) are treated as
"occupied" so a transient error never orphans legacy data — only a
genuine FileNotFoundError counts as absent. The lstat()-based gate also
fixes the prior exists()/is_dir() path swallowing PermissionError and
mis-reading an unreadable legacy dir as absent.
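
The resolution rule can be sketched as below. `resolve_dir` takes an explicit `home` argument for the sake of a self-contained example, unlike the real `get_hermes_dir(new_subpath, old_name)`, and both helper names are assumptions.

```python
import stat
import tempfile
from pathlib import Path


def legacy_has_content(path: Path) -> bool:
    """True if the legacy path is populated, a non-directory, or uninspectable."""
    try:
        mode = path.lstat().st_mode
    except FileNotFoundError:
        return False  # genuinely absent
    except OSError:
        return True  # cannot inspect: treat as occupied, never orphan data
    if not stat.S_ISDIR(mode):
        return True
    try:
        return next(path.iterdir(), None) is not None
    except FileNotFoundError:
        return False
    except OSError:
        return True


def resolve_dir(home: Path, new_subpath: str, old_name: str) -> Path:
    legacy = home / old_name
    return legacy if legacy_has_content(legacy) else home / new_subpath


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        home = Path(tmp)
        (home / "pairing").mkdir()  # empty legacy stub
        new_dir = home / "platforms" / "pairing"
        new_dir.mkdir(parents=True)
        (new_dir / "approved.json").write_text("{}")
        before = resolve_dir(home, "platforms/pairing", "pairing")
        (home / "pairing" / "approved.json").write_text("{}")
        after = resolve_dir(home, "platforms/pairing", "pairing")
        return (before.relative_to(home).as_posix(),
                after.relative_to(home).as_posix())
```

In `demo()` the empty stub falls through to the new layout, and the legacy directory wins again only once it actually holds an entry.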

This hardens all 11+ call sites that share the resolver (pairing,
image/audio/video/document caches, matrix/whatsapp session stores,
vision/credential/tts/browser dirs).

Adds TestGetHermesDir regression coverage (empty/populated/subdir/file/
unreadable/unstatable cases) and updates test_credential_files to
populate its legacy dirs so they still count as content.

Closes #27602
Closes #27715
2026-06-27 04:57:15 +05:30
briandevans
c377e954fb test(gateway): isolate secret-redaction layer from provider-error rewrite
The existing test_chat_gateways_redact_secret_in_provider_error feeds a
provider-error envelope (HTTP 401), which _sanitize_gateway_final_response
rewrites wholesale to a generic category string. That rewrite strips the
secret regardless of whether the redaction layer works, so the test cannot
on its own prove _redact_gateway_user_facing_secrets is exercised.

Add test_chat_gateways_redact_secret_in_non_error_body: ordinary assistant
prose that echoes a bearer token but is NOT a provider-error envelope, so
the rewrite path does not fire and secret redaction is the only defense.
Verified fail-before (token leaks when _GATEWAY_SECRET_PATTERNS is emptied)
and pass-after across whatsapp/slack/signal/matrix, while non-secret prose
is preserved intact.
2026-06-27 04:47:10 +05:30
briandevans
57864d07ed fix(gateway): suppress operational status/error noise on all chat gateways, not just Telegram (#39293)
The Telegram noise/secret filter added in #28533 gated its work on
`_gateway_platform_value(platform) != "telegram"`, so
`_sanitize_gateway_final_response` and `_prepare_gateway_status_message`
only ran for Telegram. Every other human-facing chat surface
(WhatsApp, Discord, Slack, Signal, Matrix, plugin platforms, etc.)
received raw provider-error bodies verbatim — including any leaked
credentials the secret-redaction pass (`sk-…`, `Bearer …`, `gh[pousr]_…`,
`xox[baprs]-…`, `hf_…`, `glpat-…`) was meant to strip.

Invert the gate from a one-platform allowlist into a small
programmatic-surface denylist: only `local`, `api_server`, `webhook`,
and `msgraph_webhook` consume gateway text programmatically and keep raw
status/error text. Every other (chat) surface — including unknown/empty
platform values and on-demand plugin pseudo-members — fails closed to
the redacted, noise-filtered, sanitized path. This widens the same
root-cause fix to both call sites: status callbacks and final replies.
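
The inverted gate amounts to something like the sketch below. The four surface names come from this commit; `should_sanitize` and its handling of enum-like values are assumptions.

```python
# Surfaces that consume gateway text programmatically and keep raw output.
PROGRAMMATIC_SURFACES = frozenset(
    {"local", "api_server", "webhook", "msgraph_webhook"}
)


def should_sanitize(platform) -> bool:
    """Fail closed: sanitize unless the platform is a known programmatic surface."""
    # Accept either a plain string or an enum-like object with .value.
    value = str(getattr(platform, "value", platform) or "").strip().lower()
    return value not in PROGRAMMATIC_SURFACES
```

Unknown, empty, or plugin-provided platform values land on the sanitized path by default, which a Telegram-only allowlist could not guarantee.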
2026-06-27 04:47:10 +05:30
kshitijk4poor
244a6f2ceb fix(desktop): broken "Open setup guide" button for plugin platforms
On the desktop Channels / Messaging page, the "Open setup guide" button was
rendered as a bare <a href={platform.docs_url} target="_blank"> with no guard.
Plugin-provided platforms (Microsoft Teams, Google Chat, Line, Raft, Yuanbao,
…) ship an empty docs_url, so the anchor's href was "".

In a packaged build, Electron resolves an empty href against the current
document — the app's own index.html inside the asar bundle — and
shell.openPath then fails with an OS "file not found" dialog. This is exactly
the Windows error reported for Messaging → Teams → Open guide.

Fix (3 changes):

1. fix(desktop) — Only render the "Open setup guide" button when docs_url is
   non-empty, and route clicks through openExternalLink so a relative/empty
   value can never be treated as a local bundle path. Fixes the whole class
   (every plugin platform), not just Teams.

2. fix(messaging) — Give the Teams platform plugin a real docs_url (Microsoft
   Teams setup guide) so its card shows a working button instead of nothing.

3. fix(messaging) — Give the Google Chat platform plugin a real docs_url
   (Google Chat setup guide) so its card shows a working button instead of
   nothing. Originally from #48940; folded in here because that PR's test
   was broken (it queried the HTTP endpoint, but google_chat is a dynamic
   enum member that only appears after the adapter module is imported).

Test plan:
- apps/desktop — new src/app/messaging/index.test.tsx: button is hidden when
  docs_url is empty; a real URL opens via the validated external opener (does
  not navigate).
- apps/desktop typecheck (tsc --noEmit) clean.
- backend — test_teams_messaging_metadata_links_setup_guide: the Teams catalog
  entry exposes the setup-guide docs_url.
- backend — test_google_chat_messaging_metadata_links_setup_guide: the Google
  Chat catalog entry exposes the setup-guide docs_url.

Co-authored-by: xxxigm <tuancanhnguyen706@gmail.com>
Co-authored-by: p-andhika <andhika.prakasiwi@gmail.com>
2026-06-27 04:34:08 +05:30
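The failure mode in 244a6f2ceb follows from standard URL reference resolution: an empty reference resolves to the current document. The sketch below shows that behavior with Python's `urllib.parse` and a guard in the spirit of the fix. It is an illustration only; the actual change lives in the desktop app's TypeScript (`openExternalLink`), and the bundle path and the helper name `is_openable_external_url` are hypothetical.

```python
from urllib.parse import urljoin, urlparse

# Hypothetical location of the packaged app's document (assumption).
DOCUMENT_URL = "file:///opt/app/resources/app.asar/index.html"


def is_openable_external_url(url) -> bool:
    """Accept only absolute http(s) URLs; empty or relative values are rejected."""
    if not url:
        return False
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


# An empty docs_url resolves to the document itself, i.e. a local bundle path.
resolved_empty = urljoin(DOCUMENT_URL, "")
```

Hiding the button when `docs_url` is empty and validating the scheme before opening covers both halves of the fix: no dead button, and no relative value treated as a local path.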
kshitijk4poor
58919f68ab fix: also preserve provider selection on Esc-clear-filter path
The back() handler had the same filtered-index drift bug as the Enter
and Ctrl+D transitions: when the user presses Esc to clear an active
filter on the provider stage, providerIdx was reset to 0, losing the
highlighted provider. Apply the same providerIndexAfterClearingFilter
fix as the other three transition paths.

Also adds edge-case tests for the helper: undefined provider, slug not
found, empty rows, and duplicate slug first-match behavior.

Found by hermes-pr-review Phase 2 + hermes-agent-dev 3-agent review.
2026-06-27 04:33:48 +05:30
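A minimal sketch of the index-preserving helper described in 58919f68ab, written in Python for illustration (the real `providerIndexAfterClearingFilter` lives in the TUI code and is not shown here). The row shape and the fall-back to index 0 for the edge cases are assumptions; the commit only names the edge cases, not their results.

```python
def provider_index_after_clearing_filter(rows, selected_slug):
    """Find the highlighted provider's position in the unfiltered row list.

    rows: unfiltered provider rows, each a dict with a "slug" key (assumed shape).
    selected_slug: slug of the provider highlighted while the filter was active.
    Falls back to 0 when nothing can be matched (assumed behavior).
    """
    if selected_slug is None:
        return 0
    for index, row in enumerate(rows):
        if row["slug"] == selected_slug:
            return index  # first match wins when slugs are duplicated
    return 0


rows = [{"slug": "alpha"}, {"slug": "beta"}, {"slug": "gamma"}, {"slug": "beta"}]
```

Looking the provider up by slug rather than reusing the filtered index is what avoids the drift: position 0 in a filtered list is rarely position 0 in the full list.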
Donovan Yohan
386478211b fix(tui): preserve filtered model provider selection 2026-06-27 04:33:48 +05:30
kshitijk4poor
b0f44d3fad fix(gateway): remove process-global HERMES_SESSION_KEY write that misroutes approval prompts across concurrent sessions
GatewayRunner._run_agent's run_sync() wrote the per-turn session key to
the process-global os.environ["HERMES_SESSION_KEY"]. Because os.environ
is shared across the whole process, concurrent gateway sessions (e.g.
two Discord threads) clobbered each other's value. A tool worker thread
whose approval contextvar was unset then fell back to os.environ via
get_current_session_key() and read whichever session ran run_sync()
last — routing "Command Approval Required" prompts to the wrong thread.

Session routing is already concurrency-safe via contextvars:
- gateway/session_context.py _SESSION_KEY (set in set_session_vars)
- tools/approval.py _approval_session_key (set via set_current_session_key
  right before the agent runs, inherited by tool worker threads)

The only non-test readers of HERMES_SESSION_KEY (tools/approval.py,
tools/terminal_tool.py, tools/kanban_tools.py) all prefer the contextvar
with os.environ as a mere fallback. CLI/cron/TUI set their own os.environ
via separate export paths (e.g. the TUI parent exporting it into the
agent subprocess), so removing this in-process write does not affect them.

Adds regression tests asserting the resolver prefers the contextvar and
does not leak a concurrent session's cleared/clobbered os.environ value.

Closes #24100

Co-authored-by: Yosapol Jitrak <yosapol@jitrak.dev>
2026-06-27 04:31:37 +05:30
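The race fixed in b0f44d3fad can be reproduced in miniature. The sketch below runs concurrent "turns" in threads and compares a process-global environment variable with a contextvar. `DEMO_SESSION_KEY`, `resolve_session_key`, and `run_turns` are names invented for this example; only the resolver's shape (contextvar first, environment as fallback) follows the commit message.

```python
import contextvars
import os
import threading

ENV_NAME = "DEMO_SESSION_KEY"  # stand-in for HERMES_SESSION_KEY (assumption)
_session_key = contextvars.ContextVar("session_key", default="")


def resolve_session_key() -> str:
    """Prefer the contextvar; fall back to the process-global environment."""
    return _session_key.get() or os.environ.get(ENV_NAME, "")


def run_turns(keys, use_contextvar):
    """Run one thread per session key and record what each thread resolves."""
    barrier = threading.Barrier(len(keys))
    seen = {}

    def turn(key):
        if use_contextvar:
            _session_key.set(key)       # visible only in this thread's context
        else:
            os.environ[ENV_NAME] = key  # shared by the whole process
        barrier.wait()                  # every writer finishes before any read
        seen[key] = resolve_session_key()

    threads = [threading.Thread(target=turn, args=(k,)) for k in keys]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    os.environ.pop(ENV_NAME, None)
    return seen


keys = ["discord:thread-a", "discord:thread-b", "discord:thread-c"]
via_env = run_turns(keys, use_contextvar=False)
via_ctx = run_turns(keys, use_contextvar=True)
```

In the environment-variable run every thread resolves the same key (whichever was written last), which is the misrouting the commit describes. In the contextvar run each thread resolves its own key.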
helix4u
5191ebba22 fix(desktop): retry empty resumed transcripts 2026-06-25 13:36:19 -06:00
600 changed files with 39610 additions and 7404 deletions

2
.envrc
View File

@@ -1,5 +1,5 @@
watch_file pyproject.toml uv.lock
watch_file package-lock.json package.json web/package.json ui-tui/package.json website/package.json apps/shared/package.json apps/desktop/package.json ui-tui/packages/hermes-ink/package.json
watch_file flake.nix flake.lock nix/devShell.nix nix/tui.nix nix/package.nix nix/python.nix
watch_file flake.nix flake.lock nix/devShell.nix nix/tui.nix nix/package.nix nix/python.nix nix/hermes-agent.nix nix/desktop.nix
use flake

View File

@@ -1,50 +0,0 @@
name: Hermes smoke test
description: >
Run the image's built-in entrypoint against `--help` and `dashboard --help`
to catch basic runtime regressions before publishing. Requires the image
to already be loaded into the local Docker daemon under `image`.
Works identically on amd64 and arm64 runners.
inputs:
image:
description: Fully-qualified image tag (e.g. nousresearch/hermes-agent:test)
required: true
runs:
using: composite
steps:
- name: Ensure /tmp/hermes-test is hermes-writable
shell: bash
run: |
# The image runs as the hermes user (UID 10000). GitHub Actions
# creates /tmp/hermes-test root-owned by default, which hermes
# can't write to — chown it to match the in-container UID before
# bind-mounting. Real users doing `docker run -v ~/.hermes:...`
# with their own UID hit the same issue and have their own
# remediations (HERMES_UID env var, or chown locally).
mkdir -p /tmp/hermes-test
sudo chown -R 10000:10000 /tmp/hermes-test
- name: hermes --help
shell: bash
run: |
# Use the image's real ENTRYPOINT (/init + main-wrapper.sh) so
# this exercises the actual production startup path. PR #30136
# review caught that an --entrypoint override here had been
# silently neutered by the s6-overlay migration — stage2-hook
# ignores its CMD args, so the smoke test was a no-op.
docker run --rm \
-v /tmp/hermes-test:/opt/data \
"${{ inputs.image }}" --help
- name: hermes dashboard --help
shell: bash
run: |
# Regression guard for #9153: dashboard was present in source but
# missing from the published image. If this fails, something in
# the Dockerfile is excluding the dashboard subcommand from the
# installed package.
docker run --rm \
-v /tmp/hermes-test:/opt/data \
"${{ inputs.image }}" dashboard --help

View File

@@ -20,6 +20,7 @@ permissions:
pull-requests: write # needed by lint (PR comment) + supply-chain (PR comment)
actions: read # needed by osv-scanner (SARIF upload)
security-events: write # needed by osv-scanner (SARIF upload)
packages: write # needed by docker build
concurrency:
group: ci-${{ github.ref }}
@@ -32,6 +33,7 @@ jobs:
# (all lanes true) so post-merge validation is never weakened.
# ─────────────────────────────────────────────────────────────────────
detect:
name: Detect affected areas
runs-on: ubuntu-latest
outputs:
python: ${{ steps.classify.outputs.python }}
@@ -53,11 +55,15 @@ jobs:
# Skipped workflows (if condition is false) don't spin up runners.
# ─────────────────────────────────────────────────────────────────────
tests:
name: Python tests
needs: detect
if: needs.detect.outputs.python == 'true'
uses: ./.github/workflows/tests.yml
with:
slice_count: 8
lint:
name: Python lints
needs: detect
if: needs.detect.outputs.python == 'true'
uses: ./.github/workflows/lint.yml
@@ -65,35 +71,49 @@ jobs:
event_name: ${{ needs.detect.outputs.event_name }}
typecheck:
name: TypeScript
needs: detect
if: needs.detect.outputs.frontend == 'true'
uses: ./.github/workflows/typecheck.yml
docs-site:
name: Docs Site
needs: detect
if: needs.detect.outputs.site == 'true'
uses: ./.github/workflows/docs-site-checks.yml
history-check:
name: Deny unrelated histories
needs: detect
if: needs.detect.outputs.event_name == 'pull_request'
uses: ./.github/workflows/history-check.yml
contributor-check:
name: Check contributors
needs: detect
if: needs.detect.outputs.python == 'true'
uses: ./.github/workflows/contributor-check.yml
uv-lockfile:
name: Check uv.lock
needs: detect
uses: ./.github/workflows/uv-lockfile-check.yml
docker-lint:
name: Lint Docker scripts
needs: detect
if: needs.detect.outputs.docker_meta == 'true'
uses: ./.github/workflows/docker-lint.yml
docker:
name: Build&Test Docker image
needs: detect
if: needs.detect.outputs.python == 'true' || needs.detect.outputs.frontend == 'true' || needs.detect.outputs.docker_meta == 'true'
uses: ./.github/workflows/docker.yml
secrets: inherit
supply-chain:
name: Supply-chain scan
needs: detect
if: needs.detect.outputs.event_name == 'pull_request' && (needs.detect.outputs.scan == 'true' || needs.detect.outputs.deps == 'true' || needs.detect.outputs.mcp_catalog == 'true')
uses: ./.github/workflows/supply-chain-audit.yml
@@ -104,7 +124,7 @@ jobs:
mcp_catalog: ${{ needs.detect.outputs.mcp_catalog == 'true' }}
osv-scanner:
needs: detect
name: OSV scan
uses: ./.github/workflows/osv-scanner.yml
# ─────────────────────────────────────────────────────────────────────
@@ -127,6 +147,8 @@ jobs:
- docker-lint
- supply-chain
- osv-scanner
# we don't require docker to pass rn because it's so slow lol
# - docker
if: always()
runs-on: ubuntu-latest
steps:

View File

@@ -2,7 +2,7 @@ name: Docker / shell lint
# Lints the container build inputs: Dockerfile (via hadolint) and any shell
# scripts under docker/ (via shellcheck). These catch the class of regression
# the behavioral docker-publish smoke test can't — unquoted variable
# the behavioral docker smoke test can't — unquoted variable
# expansions, silently-failing RUN commands, etc.
#
# Rules and ignores are documented in .hadolint.yaml at the repo root.

View File

@@ -1,24 +1,9 @@
name: Docker Build and Publish
name: Docker Build, Test, and Publish
on:
push:
branches: [main]
paths:
- '**/*.py'
- 'pyproject.toml'
- 'uv.lock'
- 'Dockerfile'
- 'docker/**'
- '.github/workflows/docker-publish.yml'
- '.github/actions/hermes-smoke-test/**'
# No paths filter — the job must always run so the required check
# reports a status (path-gated workflows leave checks "pending" forever
# when no matching files change, which blocks merge).
pull_request:
release:
types: [published]
workflow_call:
permissions:
contents: read
@@ -39,11 +24,7 @@ env:
IMAGE_NAME: nousresearch/hermes-agent
jobs:
# ---------------------------------------------------------------------------
# Build amd64 natively. This job also runs the smoke tests (basic --help
# and the dashboard subcommand regression guard from #9153), because amd64
# is the only arch we can `load` into the local daemon on an amd64 runner.
# ---------------------------------------------------------------------------
# Build, test, and optionally push the amd64 image.
build-amd64:
# Only run on the upstream repository, not on forks
if: github.repository == 'NousResearch/hermes-agent'
@@ -53,24 +34,19 @@ jobs:
digest: ${{ steps.push.outputs.digest }}
steps:
- name: Checkout code
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
# The image build + smoke test + integration tests run ONLY on
# push-to-main and release — never on PRs. They are the heaviest jobs
# in CI (~15-45 min) and a broken build surfaces on the main push (and
# is gated pre-merge by docker-lint + uv-lockfile-check). Every step
# below is skipped on PRs, so the job still reports green and the
# required check never hangs.
# The image build + integration tests run on every event
# (PRs, push-to-main, release). Publish steps below are gated to
# push-to-main / release only.
- name: Set up Docker Buildx
if: github.event_name != 'pull_request'
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3
# Build once, load into the local daemon for smoke testing. Cached
# Build once, load into the local daemon for testing. Cached
# to gha with a per-arch scope; the push step below reuses every
# layer from this build.
- name: Build image (amd64, smoke test)
if: github.event_name != 'pull_request'
uses: docker/build-push-action@bcafcacb16a39f128d818304e6c9c0c18556b85f # v7.1.0
- name: Build image (amd64)
uses: docker/build-push-action@bcafcacb16a39f128d818304e6c9c0c18556b85f # v7.1.0
with:
context: .
file: Dockerfile
@@ -82,25 +58,12 @@ jobs:
cache-from: type=gha,scope=docker-amd64
cache-to: type=gha,mode=max,scope=docker-amd64
- name: Smoke test image
if: github.event_name != 'pull_request'
uses: ./.github/actions/hermes-smoke-test
with:
image: ${{ env.IMAGE_NAME }}:test
# ---------------------------------------------------------------------
# Run the docker-integration test suite against the freshly-built
# image already loaded into the local daemon (`:test`). These tests
# are excluded from the sharded `tests.yml :: test` matrix on purpose
# (see `_SKIP_PARTS` in scripts/run_tests_parallel.py) because each
# shard would otherwise reach the session-scoped ``built_image``
# fixture in ``tests/docker/conftest.py`` and start a 3-7min
# ``docker build`` — guaranteed to
# die in fixture setup.
# image already loaded into the local daemon (`:test`).
#
# Piggybacking here avoids a second image build: the smoke test
# already proved the image loads + runs, so the daemon has it under
# `${IMAGE_NAME}:test` and we just point ``HERMES_TEST_IMAGE`` at
# Piggybacking here avoids a second image build: the build step
# already loaded the image into the daemon under
# `${IMAGE_NAME}:test`, so we just point ``HERMES_TEST_IMAGE`` at
# that. The fixture's ``HERMES_TEST_IMAGE`` branch (see
# tests/docker/conftest.py:62-63) short-circuits the rebuild.
#
@@ -110,26 +73,20 @@ jobs:
# cheapest path to coverage on every PR that touches docker code.
# ---------------------------------------------------------------------
- name: Install uv (for docker tests)
if: github.event_name != 'pull_request'
uses: astral-sh/setup-uv@d4b2f3b6ecc6e67c4457f6d3e41ec42d3d0fcb86 # v5
uses: astral-sh/setup-uv@fac544c07dec837d0ccb6301d7b5580bf5edae39 # 8.2.0
- name: Set up Python 3.11 (for docker tests)
if: github.event_name != 'pull_request'
run: uv python install 3.11
- name: Install Python dependencies (for docker tests)
if: github.event_name != 'pull_request'
run: |
uv venv .venv --python 3.11
source .venv/bin/activate
# ``dev`` extra pulls in pytest, pytest-asyncio —
# everything tests/docker/ needs. We deliberately avoid ``all``
# here because the docker tests only drive the container via
# subprocess and don't import hermes_agent's optional deps.
uv pip install -e ".[dev]"
uv sync --locked --python 3.11 --extra dev
- name: Run docker integration tests
if: github.event_name != 'pull_request'
env:
# Skip rebuild; use the image already loaded by the build step.
HERMES_TEST_IMAGE: ${{ env.IMAGE_NAME }}:test
@@ -139,12 +96,11 @@ jobs:
OPENAI_API_KEY: ""
NOUS_API_KEY: ""
run: |
source .venv/bin/activate
python -m pytest tests/docker/ -v --tb=short
scripts/run_tests.sh tests/docker/ --file-timeout 600
- name: Log in to Docker Hub
if: github.event_name == 'push' && github.ref == 'refs/heads/main' || github.event_name == 'release'
uses: docker/login-action@4907a6ddec9925e35a0a9e82d7399ccc52663121 # v4.1.0
uses: docker/login-action@4907a6ddec9925e35a0a9e82d7399ccc52663121 # v4.1.0
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
@@ -155,7 +111,7 @@ jobs:
- name: Push amd64 by digest
id: push
if: github.event_name == 'push' && github.ref == 'refs/heads/main' || github.event_name == 'release'
uses: docker/build-push-action@bcafcacb16a39f128d818304e6c9c0c18556b85f # v7.1.0
uses: docker/build-push-action@bcafcacb16a39f128d818304e6c9c0c18556b85f # v7.1.0
with:
context: .
file: Dockerfile
@@ -179,7 +135,7 @@ jobs:
- name: Upload digest artifact
if: github.event_name == 'push' && github.ref == 'refs/heads/main' || github.event_name == 'release'
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4
uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7
with:
name: digest-amd64
path: /tmp/digests/*
@@ -187,10 +143,7 @@ jobs:
retention-days: 1
# ---------------------------------------------------------------------------
# Build arm64 natively on GitHub's free arm64 runner. This replaces the
# previous QEMU-emulated arm64 build, which was ~5-10x slower and shared
# a cache scope with amd64. Matches the amd64 job's shape: build+load,
# smoke test, then on push/release push by digest.
# Build, test, and optionally push the arm64 image.
# ---------------------------------------------------------------------------
build-arm64:
if: github.repository == 'NousResearch/hermes-agent'
@@ -200,29 +153,26 @@ jobs:
digest: ${{ steps.push.outputs.digest }}
steps:
- name: Checkout code
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
# arm64 build runs only on push-to-main and release (see build-amd64).
- name: Set up Docker Buildx
if: github.event_name != 'pull_request'
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3
# Log in to ghcr.io so the registry-backed build cache below can be
# read (cache-from) on every event and written (cache-to) on
# push/release. Uses the workflow's GITHUB_TOKEN, which is valid for
# the whole job — unlike the gha cache backend's short-lived Azure SAS
# token, which expired mid-build on slow cold-cache arm64 runs and
# crashed the build before the smoke test (the reason the gha cache
# crashed the build before the tests ran (the reason the gha cache
# was removed from arm64 PRs in the first place).
- name: Log in to ghcr.io (build cache)
if: github.event_name != 'pull_request'
uses: docker/login-action@4907a6ddec9925e35a0a9e82d7399ccc52663121 # v4.1.0
uses: docker/login-action@4907a6ddec9925e35a0a9e82d7399ccc52663121 # v4.1.0
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
# Build once, load into the local daemon for smoke testing, then push
# Build once, load into the local daemon for testing, then push
# by digest below. Reads AND writes the registry-backed cache so the
# push reuses layers from this build and the next build starts warm.
#
@@ -230,9 +180,8 @@ jobs:
# cache that previously broke here: its credential is the job-lifetime
# GITHUB_TOKEN, not a short-lived SAS token, so the cold-build-outlives-
# token failure mode cannot recur.
- name: Build image (arm64, smoke test, cached publish)
if: github.event_name != 'pull_request'
uses: docker/build-push-action@bcafcacb16a39f128d818304e6c9c0c18556b85f # v7.1.0
- name: Build image (arm64, cached publish)
uses: docker/build-push-action@bcafcacb16a39f128d818304e6c9c0c18556b85f # v7.1.0
with:
context: .
file: Dockerfile
@@ -244,15 +193,29 @@ jobs:
cache-from: type=registry,ref=ghcr.io/nousresearch/hermes-agent:buildcache-arm64
cache-to: type=registry,ref=ghcr.io/nousresearch/hermes-agent:buildcache-arm64,mode=max
- name: Smoke test image
if: github.event_name != 'pull_request'
uses: ./.github/actions/hermes-smoke-test
with:
image: ${{ env.IMAGE_NAME }}:test
- name: Install uv for docker tests
uses: astral-sh/setup-uv@fac544c07dec837d0ccb6301d7b5580bf5edae39 # 8.2.0
- name: Set up Python 3.11 for docker tests
run: uv python install 3.11
- name: Install Python dependencies for docker tests
run: |
uv sync --locked --python 3.11 --extra dev
- name: Run docker tests
env:
# Skip rebuild; use the image already loaded by the build step.
HERMES_TEST_IMAGE: ${{ env.IMAGE_NAME }}:test
OPENROUTER_API_KEY: ""
OPENAI_API_KEY: ""
NOUS_API_KEY: ""
run: |
scripts/run_tests.sh tests/docker/ --file-timeout 600
- name: Log in to Docker Hub
if: github.event_name == 'push' && github.ref == 'refs/heads/main' || github.event_name == 'release'
uses: docker/login-action@4907a6ddec9925e35a0a9e82d7399ccc52663121 # v4.1.0
uses: docker/login-action@4907a6ddec9925e35a0a9e82d7399ccc52663121 # v4.1.0
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
@@ -260,7 +223,7 @@ jobs:
- name: Push arm64 by digest
id: push
if: github.event_name == 'push' && github.ref == 'refs/heads/main' || github.event_name == 'release'
uses: docker/build-push-action@bcafcacb16a39f128d818304e6c9c0c18556b85f # v7.1.0
uses: docker/build-push-action@bcafcacb16a39f128d818304e6c9c0c18556b85f # v7.1.0
with:
context: .
file: Dockerfile
@@ -282,7 +245,7 @@ jobs:
- name: Upload digest artifact
if: github.event_name == 'push' && github.ref == 'refs/heads/main' || github.event_name == 'release'
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4
uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7
with:
name: digest-arm64
path: /tmp/digests/*
@@ -304,17 +267,17 @@ jobs:
timeout-minutes: 10
steps:
- name: Download digests
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4
with:
path: /tmp/digests
pattern: digest-*
merge-multiple: true
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3
- name: Log in to Docker Hub
uses: docker/login-action@4907a6ddec9925e35a0a9e82d7399ccc52663121 # v4.1.0
uses: docker/login-action@4907a6ddec9925e35a0a9e82d7399ccc52663121 # v4.1.0
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}

View File

@@ -37,7 +37,7 @@ jobs:
fetch-depth: 0 # need full history for merge-base + worktree
- name: Install uv
uses: astral-sh/setup-uv@d4b2f3b6ecc6e67c4457f6d3e41ec42d3d0fcb86 # v5
uses: astral-sh/setup-uv@fac544c07dec837d0ccb6301d7b5580bf5edae39 # 8.2.0
- name: Install ruff + ty
uses: ./.github/actions/retry
@@ -110,7 +110,7 @@ jobs:
cat .lint-reports/summary.md >> "$GITHUB_STEP_SUMMARY"
- name: Upload reports as artifact
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4
uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7
with:
name: lint-reports
path: .lint-reports/
@@ -164,7 +164,7 @@ jobs:
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Install uv
uses: astral-sh/setup-uv@d4b2f3b6ecc6e67c4457f6d3e41ec42d3d0fcb86 # v5
uses: astral-sh/setup-uv@fac544c07dec837d0ccb6301d7b5580bf5edae39 # 8.2.0
- name: Install ruff
uses: ./.github/actions/retry

View File

@@ -3,17 +3,17 @@ name: Build Skills Index
on:
schedule:
# Run twice daily: 6 AM and 6 PM UTC
- cron: '0 6,18 * * *'
workflow_dispatch: # Manual trigger
- cron: "0 6,18 * * *"
workflow_dispatch: # Manual trigger
push:
branches: [main]
paths:
- 'scripts/build_skills_index.py'
- '.github/workflows/skills-index.yml'
- "scripts/build_skills_index.py"
- ".github/workflows/skills-index.yml"
permissions:
contents: read
actions: write # to trigger deploy-site.yml on schedule
actions: write # to trigger deploy-site.yml on schedule
jobs:
build-index:
@@ -21,11 +21,11 @@ jobs:
if: github.repository == 'NousResearch/hermes-agent'
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: '3.11'
python-version: "3.11"
- name: Install dependencies
run: pip install httpx==0.28.1 pyyaml==6.0.2
@@ -36,7 +36,7 @@ jobs:
run: python scripts/build_skills_index.py
- name: Upload index artifact
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4
uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7
with:
name: skills-index
path: website/static/api/skills-index.json

View File

@@ -2,6 +2,11 @@ name: Tests
on:
workflow_call:
inputs:
slice_count:
description: Number of parallel test slices
type: number
default: 8
permissions:
contents: read
@@ -12,13 +17,11 @@ concurrency:
cancel-in-progress: true
jobs:
test:
generate:
name: "Generate slices"
runs-on: ubuntu-latest
timeout-minutes: 30
strategy:
fail-fast: false
matrix:
slice: [1, 2, 3, 4, 5, 6]
outputs:
matrix: ${{ steps.matrix.outputs.matrix }}
steps:
- name: Checkout code
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
@@ -27,13 +30,26 @@ jobs:
uses: actions/cache/restore@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
with:
path: test_durations.json
# main always writes a new suffix, but jobs pick the latest one with the same prefix
# quote from https://docs.github.com/en/actions/reference/workflows-and-actions/dependency-caching#cache-hits-and-misses
# If you provide restore-keys, the cache action sequentially searches for any caches that match the list of restore-keys.
# If there are no exact matches, the action searches for partial matches of the restore keys.
# When the action finds a partial match, the most recent cache is restored to the path directory.
key: test-durations
- name: Generate test slices
id: matrix
run: |
MATRIX=$(python3 scripts/run_tests_parallel.py --generate-slices ${{ inputs.slice_count }})
echo "matrix=$MATRIX" >> "$GITHUB_OUTPUT"
test:
name: Run tests slice ${{ matrix.slice.index }}/${{ inputs.slice_count }}
needs: generate
runs-on: ubuntu-latest
timeout-minutes: 30
strategy:
fail-fast: false
matrix: ${{ fromJSON(needs.generate.outputs.matrix) }}
steps:
- name: Checkout code
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Install ripgrep (prebuilt binary)
run: |
set -euo pipefail
@@ -49,7 +65,7 @@ jobs:
rg --version
- name: Install uv
uses: astral-sh/setup-uv@d4b2f3b6ecc6e67c4457f6d3e41ec42d3d0fcb86 # v5
uses: astral-sh/setup-uv@fac544c07dec837d0ccb6301d7b5580bf5edae39 # 8.2.0
with:
# Persist uv's download/wheel cache (~/.cache/uv) across runs.
# Keyed on the dependency manifests, so the cache is reused until
@@ -78,33 +94,19 @@ jobs:
# re-download, keeping the persisted cache small and fast to restore.
run: uv cache prune --ci
- name: Run tests (slice ${{ matrix.slice }}/6)
# Per-file isolation via scripts/run_tests_parallel.py: discovers
# every test_*.py file under tests/ (excluding integration/ + e2e/),
# then runs `python -m pytest <file>` in a freshly-spawned subprocess
- name: Run tests (slice ${{ matrix.slice.index }}/${{ inputs.slice_count }})
# Per-file isolation via scripts/run_tests.sh: each test file runs
# in its own freshly-spawned `python -m pytest <file>` subprocess
# with bounded parallelism. No xdist, no shared workers, no
# module-level state leakage between files.
#
# Why per-file (not per-test): per-test spawn cost (~250ms × 17k
# tests = 70min CPU minimum) blew the wall-clock budget. Per-file
# spawn (~250ms × ~850 files = ~3.5min) fits while still giving
# every file a fresh interpreter — the only isolation boundary
# that matters in practice (cross-file leakage was the original
# flake source; intra-file is the test author's responsibility).
#
# Why drop xdist entirely: xdist's persistent workers accumulate
# state across files, which is exactly the leakage we wanted to
# fix. ThreadPoolExecutor + subprocess.run is ~60 lines and does
# the job with cleaner semantics.
#
# Matrix slicing (--slice I/N): files are distributed across 6
# jobs by cached duration (LPT algorithm) so each job gets
# roughly equal wall time. Without a cache, files default to 2s
# estimate and get split roughly evenly by count — still correct,
# just not perfectly balanced.
# File list is pre-computed by the generate job (--generate-slices)
# which runs LPT distribution once and passes the file list to each
# matrix job via --files. Previously each job re-discovered files and
# re-ran LPT independently — redundant N times.
run: |
source .venv/bin/activate
python scripts/run_tests_parallel.py --slice ${{ matrix.slice }}/6
scripts/run_tests.sh --files '${{ matrix.slice.files }}'
env:
# Ensure tests don't accidentally call real APIs
OPENROUTER_API_KEY: ""
@@ -114,7 +116,7 @@ jobs:
- name: Upload per-slice durations
uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7.0.1
with:
name: test-durations-slice-${{ matrix.slice }}
name: test-durations-slice-${{ matrix.slice.index }}
path: test_durations.json
retention-days: 1
@@ -173,7 +175,7 @@ jobs:
rg --version
- name: Install uv
uses: astral-sh/setup-uv@d4b2f3b6ecc6e67c4457f6d3e41ec42d3d0fcb86 # v5
uses: astral-sh/setup-uv@fac544c07dec837d0ccb6301d7b5580bf5edae39 # 8.2.0
with:
# Persist uv's download/wheel cache (~/.cache/uv) across runs.
# Keyed on the dependency manifests, so the cache is reused until
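
The slice generation described in the tests.yml comments above (LPT distribution computed once, then handed to each matrix job) can be sketched as follows. This is not `scripts/run_tests_parallel.py`. The function name, the space-joined `files` string, and the exact JSON shape are assumptions inferred from the `matrix.slice.index` and `matrix.slice.files` references in the workflow; the 2-second default for files with no cached duration comes from the removed comment.

```python
import heapq
import json

DEFAULT_SECONDS = 2.0  # estimate for files with no cached duration


def generate_slices(files, durations, slice_count):
    """Longest-processing-time-first: assign each file, slowest first,
    to the slice with the smallest accumulated duration."""
    ordered = sorted(files, key=lambda f: (-durations.get(f, DEFAULT_SECONDS), f))
    heap = [(0.0, index) for index in range(1, slice_count + 1)]
    heapq.heapify(heap)
    buckets = {index: [] for index in range(1, slice_count + 1)}
    for path in ordered:
        load, index = heapq.heappop(heap)  # currently lightest slice
        buckets[index].append(path)
        heapq.heappush(heap, (load + durations.get(path, DEFAULT_SECONDS), index))
    # Assumed matrix shape: one object per slice with an index and a file list.
    return {
        "slice": [
            {"index": index, "files": " ".join(buckets[index])}
            for index in sorted(buckets)
        ]
    }


files = ["tests/test_a.py", "tests/test_b.py", "tests/test_c.py", "tests/test_d.py"]
durations = {"tests/test_a.py": 10.0, "tests/test_b.py": 6.0, "tests/test_c.py": 5.0}
matrix = generate_slices(files, durations, slice_count=2)
matrix_json = json.dumps(matrix)  # what a generate job could write to its output
```

Computing the assignment once in a generate job means every test job receives a fixed file list instead of rediscovering files and rerunning the same distribution.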

View File

@@ -6,6 +6,7 @@ on:
jobs:
typecheck:
name: Check TypeScript
runs-on: ubuntu-latest
strategy:
matrix:
@@ -22,8 +23,7 @@ jobs:
# native builds. Skipping install scripts drops node-pty's node-gyp
# header fetch — the transient flake that killed this job pre-`tsc` — and
# is faster. retry covers the remaining registry blips.
-
uses: ./.github/actions/retry
- uses: ./.github/actions/retry
with:
command: npm ci --ignore-scripts
- run: npm run --prefix ${{ matrix.package }} typecheck
@@ -35,6 +35,7 @@ jobs:
# users build apps/desktop from source on install/update. Run the real
# `vite build` here so that class of break fails in CI instead.
desktop-build:
name: Build desktop app
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
@@ -44,8 +45,7 @@ jobs:
cache: npm
# Keep install scripts here: the production build may need node-pty's
# native binary. retry handles the transient install-time fetch flakes.
-
uses: ./.github/actions/retry
- uses: ./.github/actions/retry
with:
command: npm ci
- run: npm run --prefix apps/desktop build

View File

@@ -5,11 +5,11 @@ name: Publish to PyPI
on:
push:
tags:
- 'v20*' # CalVer tags: v2026.5.15, v2026.5.15.2, etc.
- "v20*" # CalVer tags: v2026.5.15, v2026.5.15.2, etc.
workflow_dispatch:
inputs:
confirm_tag:
description: 'Tag to publish (e.g. v2026.5.15). Must already exist.'
description: "Tag to publish (e.g. v2026.5.15). Must already exist."
required: true
type: string
@@ -27,7 +27,7 @@ jobs:
name: Build distribution 📦
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
# On workflow_dispatch, check out the confirmed tag.
@@ -43,17 +43,17 @@ jobs:
fi
- name: Set up Python
uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: '3.13'
python-version: "3.13"
- name: Install uv
uses: astral-sh/setup-uv@d0cc045d04ccac9d8b7881df0226f9e82c39688e # v6
uses: astral-sh/setup-uv@fac544c07dec837d0ccb6301d7b5580bf5edae39 # 8.2.0
- name: Set up Node.js
uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4
uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4
with:
node-version: '22'
node-version: "22"
- name: Build web dashboard
run: cd web && npm ci && npm run build
@@ -81,7 +81,7 @@ jobs:
run: uv build --sdist --wheel
- name: Upload distribution artifacts
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4
uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7
with:
name: python-package-distributions
path: dist/
@@ -94,17 +94,17 @@ jobs:
name: pypi
url: https://pypi.org/p/hermes-agent
permissions:
id-token: write # OIDC trusted publishing
id-token: write # OIDC trusted publishing
steps:
- name: Download distribution artifacts
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4
with:
name: python-package-distributions
path: dist/
- name: Publish to PyPI
uses: pypa/gh-action-pypi-publish@cef221092ed1bacb1cc03d23a2d87d1d172e277b # v1.14.0
uses: pypa/gh-action-pypi-publish@cef221092ed1bacb1cc03d23a2d87d1d172e277b # v1.14.0
with:
skip-existing: true
@@ -116,12 +116,12 @@ jobs:
needs: publish
runs-on: ubuntu-latest
permissions:
contents: write # attach assets to the existing release
id-token: write # sigstore signing
contents: write # attach assets to the existing release
id-token: write # sigstore signing
steps:
- name: Download distribution artifacts
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4
with:
name: python-package-distributions
path: dist/
@@ -145,7 +145,7 @@ jobs:
- name: Sign with Sigstore
if: env.skip_sign != 'true'
uses: sigstore/gh-action-sigstore-python@04cffa1d795717b140764e8b640de88853c92acc # v3.3.0
uses: sigstore/gh-action-sigstore-python@04cffa1d795717b140764e8b640de88853c92acc # v3.3.0
with:
inputs: >-
./dist/*.tar.gz

View File

@@ -4,7 +4,7 @@ name: uv.lock check
# that modify pyproject.toml without regenerating uv.lock (or vice versa)
# must not merge, because the Docker build's `uv sync --frozen` step will
# fail on a stale lockfile and we'd rather catch it here than in the
# docker-publish workflow on main.
# docker workflow on main.
#
# ─────────────────────────────────────────────────────────────────────────
# IMPORTANT: this check runs against the MERGED state, not just your branch
@@ -63,7 +63,7 @@ jobs:
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Install uv
uses: astral-sh/setup-uv@d4b2f3b6ecc6e67c4457f6d3e41ec42d3d0fcb86 # v5
uses: astral-sh/setup-uv@fac544c07dec837d0ccb6301d7b5580bf5edae39 # 8.2.0
# `uv lock --check` re-resolves the project from pyproject.toml and
# compares the result to uv.lock, exiting non-zero if they disagree.
@@ -100,7 +100,7 @@ jobs:
This check is blocking because the Docker image build uses
`uv sync --frozen --extra all`, which rejects stale lockfiles
— catching it here avoids a ~15 min failed docker-publish run
— catching it here avoids a ~15 min failed docker run
on `main` post-merge.
EOF
echo "::error title=uv.lock out of sync::Run \`uv lock\` locally and commit the result. If on a PR, sync with main first."

6
.gitignore vendored
View File

@@ -137,3 +137,9 @@ RELEASE_v*.md
# Desktop demo-run scratch output (hermes writes demo/*.txt during recorded
# walkthroughs). Throwaway artifacts, never part of the app.
apps/desktop/demo/
# PR infographics are rendered locally and embedded in PR descriptions via the
# image-provider (fal.media) URL — they are NEVER committed to the repo. The
# PR body is the archive. See the hermes-agent-dev skill's
# pr-infographic-workflow reference (storage rule + lapse #8 / #COMMIT-1).
infographic/

View File

@@ -123,6 +123,17 @@ conservative at the waist.
without E2E proof, and plugins that touch core files.** Plugins live in their
own directory and work within the ABCs/hooks we provide; if a plugin needs
more, widen the generic plugin surface, don't special-case it in core.
- **Third-party products / other people's projects integrated into the core
tree.** Observability backends, vendor SaaS integrations, analytics dashboards,
and similar "someone else's product" plugins do NOT land under `plugins/` in
this repo. They place an ongoing maintenance burden on us to keep them working
against a fast-moving core, for a backend we don't own. Ship them as a
**standalone plugin repo** users install into `~/.hermes/plugins/` (or via a
pip entry point), and promote them in the Nous Research Discord
(`#plugins-skills-and-skins`). This is a coupling-and-maintenance decision, not
a quality bar; the plugin can be excellent and still be closed. PRs that add
such a directory to the tree are closed with a pointer to publish it as its own
repo.
### Before you call it a bug — verify the premise (and when NOT to close)
@@ -783,6 +794,24 @@ landing in this tree. PRs that add a new directory under
provider as its own repo. Existing in-tree providers stay; bug fixes
to them are welcome.
**No new third-party-product plugins in-tree (policy, June 2026):** the
same rule applies beyond memory providers. Plugins that integrate
someone else's product or project — observability/metrics backends,
vendor SaaS connectors, analytics dashboards, paid-service tie-ins —
must ship as **standalone plugin repos** that users install into
`~/.hermes/plugins/` (or via pip entry points). They register through
the existing plugin discovery path and use the ABCs/hooks/ctx surface
we expose; nothing special is needed in core. The reason is
maintenance load: every product we absorb into the tree becomes our
burden to keep working against a fast-moving core, for a backend we
don't own. Promote standalone plugins in the Nous Research Discord
(`#plugins-skills-and-skins`). PRs that add such a directory under
`plugins/` are closed with a pointer to publish it as its own repo —
this is a coupling decision, not a quality judgment. (The
`observability/`, `kanban/`, `disk-cleanup/`, etc. directories already
in the tree are existing precedent, not an invitation to add more
third-party-product plugins alongside them.)
### Model-provider plugins (`plugins/model-providers/<name>/`)
Every inference backend (openrouter, anthropic, gmi, deepseek, nvidia, …)

View File

@@ -85,6 +85,23 @@ This isn't a quality bar — it's a coupling-and-maintenance decision. Memory pr
---
## Third-Party Product Integrations: Ship as a Standalone Plugin
The same rule extends to **any plugin that integrates someone else's product or project** — observability/metrics backends, vendor SaaS connectors, analytics dashboards, paid-service tie-ins, and similar third-party integrations. **These do not land in this repo.**
The reason is maintenance load, not quality. Every external product absorbed into the core tree becomes ours to keep working against a fast-moving codebase, for a backend we don't own and can't control. Hermes ships a lot and the core moves quickly; coupling third-party products into it creates an open-ended burden on the maintainers.
Publish these as a **standalone plugin repo** instead:
- Implement the relevant ABC and use the existing plugin discovery path (`~/.hermes/plugins/`, project `.hermes/plugins/`, or a pip entry point) — see [Build a Hermes Plugin](https://hermes-agent.nousresearch.com/docs/guides/build-a-hermes-plugin)
- Register lifecycle hooks (`pre_tool_call`, `post_tool_call`, `pre_llm_call`, `post_llm_call`, `on_session_start`, `on_session_end`), tools (`ctx.register_tool`), and CLI subcommands (`ctx.register_cli_command`) through the surface we already expose — no core changes needed
- If your plugin needs a capability the framework doesn't expose, that's a feature request to **widen the generic plugin surface** (a new hook or `ctx` method) — never special-case your plugin in core
- Promote it in the [Nous Research Discord](https://discord.gg/NousResearch) `#plugins-skills-and-skins` channel so users can find and install it
A well-built third-party-product plugin can clear automated review and still be closed for this reason — it's a placement decision, not a verdict on the code. PRs that add such a directory under `plugins/` will be closed with a pointer to publish it as its own repo.
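As a rough illustration of the standalone layout described above, a plugin module might expose one entry function that receives `ctx` and registers everything through it. The `register` entry-function name, the argument shapes of `ctx.register_tool` and `ctx.register_cli_command`, the `ctx.register_hook` method, and the `FakeCtx` stand-in are all assumptions made for this sketch; the Build a Hermes Plugin guide is the authority on the real signatures.

```python
from typing import Any, Callable, Dict, List


class FakeCtx:
    """Hypothetical stand-in for the plugin context (NOT the real Hermes class)."""

    def __init__(self) -> None:
        self.tools: Dict[str, Callable[..., Any]] = {}
        self.cli_commands: Dict[str, Callable[..., Any]] = {}
        self.hooks: Dict[str, List[Callable[..., Any]]] = {}

    # Assumed signature: (name, handler). The real method may take more.
    def register_tool(self, name: str, handler: Callable[..., Any]) -> None:
        self.tools[name] = handler

    # Assumed signature: (name, handler).
    def register_cli_command(self, name: str, handler: Callable[..., Any]) -> None:
        self.cli_commands[name] = handler

    # Hypothetical method: the docs name the hooks but not how they attach.
    def register_hook(self, event: str, handler: Callable[..., Any]) -> None:
        self.hooks.setdefault(event, []).append(handler)


def register(ctx: Any) -> None:
    """Entry function for an imaginary 'acme_metrics' plugin."""
    seen = {"tool_calls": 0}

    def on_post_tool_call(**_kwargs: Any) -> None:
        # Count every finished tool call; a real plugin would ship this
        # to its own backend instead of keeping it in memory.
        seen["tool_calls"] += 1

    def metrics_tool(**_kwargs: Any) -> Dict[str, int]:
        return {"tool_calls_seen": seen["tool_calls"]}

    def metrics_cli(_argv: List[str]) -> str:
        return f"tool calls seen: {seen['tool_calls']}"

    ctx.register_hook("post_tool_call", on_post_tool_call)
    ctx.register_tool("acme_metrics", metrics_tool)
    ctx.register_cli_command("acme-metrics", metrics_cli)
```

Everything the plugin needs goes through `ctx`, so nothing under the core tree has to change; if a capability is missing, the request is for a new generic hook or `ctx` method rather than a special case.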
---
## Development Setup
### Prerequisites

View File

@@ -189,7 +189,13 @@ RUN cd web && npm run build && \
# ---------- Source code ----------
# .dockerignore excludes node_modules, so the installs above survive.
COPY . .
# --link decouples this layer from parents for cache purposes; --chmod bakes
# the final read-only permissions at copy time so we skip the separate
# `chmod -R` pass that previously walked ~30k files across the venv +
# node_modules + source (21s amd64 / 222s arm64 — #49113). `a+rX,go-w`
# gives the non-root hermes user read + traverse but no write; root retains
# write so the build steps below don't need chmod u+w dances.
COPY --link --chmod=a+rX,go-w . .
# ---------- Permissions ----------
# Link hermes-agent itself (editable). Deps are already installed in the
@@ -197,19 +203,15 @@ COPY . .
# resolution or downloads.
RUN uv pip install --no-cache-dir --no-deps -e "."
# Keep /opt/hermes immutable for the runtime hermes user. Hosted/container
# instances must not be able to self-edit the installed source or venv; user
# data, skills, plugins, config, logs, and dashboard uploads live under
# /opt/data instead. Root can still repair the image during build/boot, but
# supervised Hermes processes drop to the non-root hermes user.
# Wire the exec shim and install-method stamp. Files under /opt/hermes are
# already root-owned (COPY, uv sync, npm install all run as root) and
# read-only for the hermes user (go-w from the --chmod above).
USER root
RUN mkdir -p /opt/hermes/bin && \
cp /opt/hermes/docker/hermes-exec-shim.sh /opt/hermes/bin/hermes && \
chmod 0755 /opt/hermes/bin/hermes && \
printf 'docker\n' > /opt/hermes/.install_method && \
chown -R root:root /opt/hermes && \
chmod -R a+rX /opt/hermes && \
chmod -R a-w /opt/hermes
printf 'docker\n' > /opt/hermes/.install_method
# The ``.install_method`` stamp is baked next to the running code (the install
# tree), NOT into $HERMES_HOME. $HERMES_HOME (/opt/data) is a shared data
# volume that is commonly bind-mounted from the host and even shared with a
@@ -236,13 +238,11 @@ RUN mkdir -p /opt/hermes/bin && \
#
# The arg is optional — local `docker build` without --build-arg simply
# omits the file, and the runtime falls back to live-git lookup. CI
# (.github/workflows/docker-publish.yml) passes ${{ github.sha }} so
# (.github/workflows/docker.yml) passes ${{ github.sha }} so
# every published image has it.
ARG HERMES_GIT_SHA=
RUN if [ -n "${HERMES_GIT_SHA}" ]; then \
chmod u+w /opt/hermes && \
printf '%s\n' "${HERMES_GIT_SHA}" > /opt/hermes/.hermes_build_sha && \
chmod a-w /opt/hermes /opt/hermes/.hermes_build_sha; \
printf '%s\n' "${HERMES_GIT_SHA}" > /opt/hermes/.hermes_build_sha; \
fi
# ---------- s6-overlay service wiring ----------

View File

@@ -18,7 +18,7 @@
**The self-improving AI agent built by [Nous Research](https://nousresearch.com).** It's the only agent with a built-in learning loop — it creates skills from experience, improves them during use, nudges itself to persist knowledge, searches its own past conversations, and builds a deepening model of who you are across sessions. Run it on a $5 VPS, a GPU cluster, or serverless infrastructure that costs nearly nothing when idle. It's not tied to your laptop — talk to it from Telegram while it works on a cloud VM.
Use any model you want — [Nous Portal](https://portal.nousresearch.com), [OpenRouter](https://openrouter.ai) (200+ models), [NovitaAI](https://novita.ai) (AI-native cloud for Model API, Agent Sandbox, and GPU Cloud), [NVIDIA NIM](https://build.nvidia.com) (Nemotron), [Xiaomi MiMo](https://platform.xiaomimimo.com), [z.ai/GLM](https://z.ai), [Kimi/Moonshot](https://platform.moonshot.ai), [MiniMax](https://www.minimax.io), [Hugging Face](https://huggingface.co), OpenAI, or your own endpoint. Switch with `hermes model` — no code changes, no lock-in.
Use any model you want — [Nous Portal](https://portal.nousresearch.com), OpenRouter, OpenAI, your own endpoint, and [many others](https://hermes-agent.nousresearch.com/docs/integrations/providers). Switch with `hermes model` — no code changes, no lock-in.
<table>
<tr><td><b>A real terminal interface</b></td><td>Full TUI with multiline editing, slash-command autocomplete, conversation history, interrupt-and-redirect, and streaming tool output.</td></tr>

View File

@@ -722,10 +722,50 @@ def init_agent(
elif agent.provider == "moa":
from agent.moa_loop import MoAClient
agent.api_mode = "chat_completions"
agent.client = MoAClient(agent.model or "default")
# Route reference-model outputs to the agent's tool_progress_callback so
# every surface that already consumes it (CLI spinner/scrollback, TUI,
# desktop, gateway) can show each reference's answer as a labelled block
# before the aggregator acts. The facade emits "moa.reference" and
# "moa.aggregating" events; we forward them through the same callback
# the tool lifecycle uses. Best-effort and cache-safe — these are
# display-only events, they never touch the message history.
def _moa_reference_relay(event: str, **kwargs: Any) -> None:
cb = getattr(agent, "tool_progress_callback", None)
if cb is None:
return
try:
if event == "moa.reference":
label = str(kwargs.get("label") or "")
text = str(kwargs.get("text") or "")
idx = kwargs.get("index")
count = kwargs.get("count")
cb(
"moa.reference",
label,
text,
None,
moa_index=idx,
moa_count=count,
)
elif event == "moa.aggregating":
cb(
"moa.aggregating",
str(kwargs.get("aggregator") or ""),
None,
None,
moa_ref_count=kwargs.get("ref_count"),
)
except Exception:
pass
agent.client = MoAClient(
agent.model or "default",
reference_callback=_moa_reference_relay,
)
agent._client_kwargs = {}
agent.api_key = api_key or "moa-virtual-provider"
agent.base_url = base_url or "moa://local"
agent.base_url = "moa://local"
if not agent.quiet_mode:
print(f"🤖 AI Agent initialized with MoA preset: {agent.model}")
elif agent.api_mode == "bedrock_converse":
@@ -1267,6 +1307,12 @@ def init_agent(
_agent_section = {}
agent._tool_use_enforcement = _agent_section.get("tool_use_enforcement", "auto")
# Intent-ack continuation config: "auto" (default — codex_responses only,
# the historical gate), true (all api_modes), false (never), or a list of
# model-name substrings. Resolved against the active api_mode/model in the
# conversation loop's intent-ack block.
agent._intent_ack_continuation = _agent_section.get("intent_ack_continuation", "auto")
# Universal task-completion guidance toggle. Default True. Surfaced
# as a separate flag from tool_use_enforcement because the guidance
# applies to ALL models, not just the model families enforcement
@@ -1630,8 +1676,10 @@ def init_agent(
f"Model {agent.model} has a context window of {_ctx:,} tokens, "
f"which is below the minimum {MINIMUM_CONTEXT_LENGTH:,} required "
f"by Hermes Agent. Choose a model with at least "
f"{MINIMUM_CONTEXT_LENGTH // 1000}K context, or set "
f"model.context_length in config.yaml to override."
f"{MINIMUM_CONTEXT_LENGTH // 1000}K context. If your server "
f"reports a window smaller than the model's true window, set "
f"model.context_length in config.yaml to the real value "
f"(this must be at least {MINIMUM_CONTEXT_LENGTH // 1000}K)."
)
# Inject context engine tool schemas (e.g. lcm_grep, lcm_describe, lcm_expand).

View File

@@ -42,6 +42,14 @@ from utils import base_url_host_matches, base_url_hostname, env_var_enabled, ato
logger = logging.getLogger(__name__)
# Max consecutive successful credential-pool token refreshes of the SAME entry
# on a persistent auth failure before we give up and let the fallback chain
# activate. A single-entry OAuth pool can re-mint a fresh token indefinitely
# even when the upstream keeps rejecting it, so without this cap the retry loop
# spins forever and never reaches ``_try_activate_fallback``. See #26080.
_MAX_AUTH_REFRESH_ATTEMPTS = 2
def _ra():
"""Lazy ``run_agent`` reference for test-patch routing."""
import run_agent
@@ -775,6 +783,30 @@ def recover_with_credential_pool(
return False, has_retried_429
refreshed = pool.try_refresh_current()
if refreshed is not None:
# ``try_refresh_current()`` re-mints a fresh OAuth token and reports
# success even when the upstream keeps rejecting it — a single-entry
# pool (common for OAuth/Max subscribers) has nothing to rotate to,
# so a bare "refreshed → retry" loop spins forever on the same dead
# token and the configured fallback never activates. Cap consecutive
# same-entry refreshes and fall through to fallback once exceeded.
# See #26080.
refreshed_id = getattr(refreshed, "id", None)
if refreshed_id is not None:
refresh_counts = getattr(agent, "_auth_pool_refresh_counts", None)
if refresh_counts is None:
refresh_counts = {}
agent._auth_pool_refresh_counts = refresh_counts
refresh_key = (agent.provider, refreshed_id)
refresh_counts[refresh_key] = refresh_counts.get(refresh_key, 0) + 1
if refresh_counts[refresh_key] > _MAX_AUTH_REFRESH_ATTEMPTS:
_ra().logger.warning(
"Credential auth failure persists after %s refreshes for "
"pool entry %s — treating as unrecoverable and allowing "
"fallback to activate.",
refresh_counts[refresh_key] - 1,
refreshed_id,
)
return False, has_retried_429
_ra().logger.info(f"Credential auth failure — refreshed pool entry {getattr(refreshed, 'id', '?')}")
agent._swap_credential(refreshed)
return True, has_retried_429
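The counting logic in this hunk can be isolated as a small sketch. The helper name `should_retry_after_refresh` and the plain-dict counter are illustrative only; the real code stores the counts on the agent as `_auth_pool_refresh_counts` and returns `(False, has_retried_429)` once the cap is exceeded.

```python
from typing import Dict, Tuple

MAX_AUTH_REFRESH_ATTEMPTS = 2  # same value as _MAX_AUTH_REFRESH_ATTEMPTS in the diff


def should_retry_after_refresh(
    counts: Dict[Tuple[str, str], int],
    provider: str,
    entry_id: str,
    max_attempts: int = MAX_AUTH_REFRESH_ATTEMPTS,
) -> bool:
    """Record one more refresh of (provider, entry_id).

    Returns True while the refreshed token is still worth retrying, and
    False once the same entry has been refreshed more than ``max_attempts``
    times, which is the point where the caller should let fallback activate.
    """
    key = (provider, entry_id)
    counts[key] = counts.get(key, 0) + 1
    return counts[key] <= max_attempts
```

With the cap at 2, the first two refreshes of a given entry are retried and the third one gives up. The key includes the provider, so a different provider or pool entry starts from its own count. This excerpt does not show whether or where the counter is cleared after a successful request.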
@@ -1046,6 +1078,34 @@ def restore_primary_runtime(agent) -> bool:
api_mode=rt.get("compressor_api_mode", ""),
)
# ── Re-select from the credential pool if one is available ──
# The snapshot's api_key was captured at construction time. Across
# turns the pool may have rotated (token revocation, billing/rate-limit
# exhaustion, cooldown), leaving the snapshot key stale. Restoring it
# blindly re-fails on the first request and burns through the remaining
# pool entries before cross-provider fallback even gets a chance. Ask
# the pool for its current best entry and swap the live credential in.
# When the pool is absent, empty, or the entry has no usable key, we
# keep the snapshot key (the existing behavior). Fixes #25205.
pool = getattr(agent, "_credential_pool", None)
if pool is not None and pool.has_available():
entry = pool.select()
if entry is not None:
entry_key = (
getattr(entry, "runtime_api_key", None)
or getattr(entry, "access_token", "")
)
if entry_key:
# ``_swap_credential`` rebuilds the OpenAI/Anthropic client,
# reapplies base-url-scoped headers, and carries the
# accumulated base_url / OAuth-detection fixes (#33163).
agent._swap_credential(entry)
logger.info(
"Restore re-selected pool entry %s (%s)",
getattr(entry, "id", "?"),
getattr(entry, "label", "?"),
)
# ── Reset fallback chain for the new turn ──
agent._fallback_activated = False
agent._fallback_index = 0
@@ -1420,6 +1480,15 @@ def create_openai_client(agent, client_kwargs: dict, *, reason: str, shared: boo
keepalive_http = agent._build_keepalive_http_client(client_kwargs.get("base_url", ""))
if keepalive_http is not None:
client_kwargs["http_client"] = keepalive_http
# Delegate all rate-limit / 5xx retry to hermes's outer conversation loop,
# which honors Retry-After and applies adaptive/jittered backoff. The OpenAI
# SDK default (max_retries=2) uses its own 1-2s backoff that ignores
# Retry-After and double-retries inside our loop — the same deadlock the
# Anthropic clients hit (#26293). This is the single chokepoint every primary
# OpenAI/aggregator client passes through (init, switch_model, recovery,
# restore, request-scoped); auxiliary_client builds its own clients and keeps
# SDK retries because it is NOT wrapped by the conversation loop.
client_kwargs.setdefault("max_retries", 0)
# Uses the module-level `OpenAI` name, resolved lazily on first
# access via __getattr__ below. Tests patch via `run_agent.OpenAI`.
client = _ra().OpenAI(**client_kwargs)
@@ -1499,6 +1568,10 @@ def switch_model(agent, new_model, new_provider, api_key='', base_url='', api_mo
# _client_kwargs is a dict — snapshot a shallow copy so mutating the
# live dict doesn't poison the rollback target.
_snapshot["_client_kwargs"] = dict(getattr(agent, "_client_kwargs", {}) or {})
# Snapshot the credential pool reference so a failed client rebuild can
# restore the original pool (issue #52727: pool reload is part of this
# switch and must be reversible on rollback).
_snapshot["_credential_pool"] = getattr(agent, "_credential_pool", _MISSING)
try:
# Clear the per-config context_length override so the new model's
@@ -1523,8 +1596,36 @@ def switch_model(agent, new_model, new_provider, api_key='', base_url='', api_mo
if api_key:
agent.api_key = api_key
# ── Reload credential pool for the new provider (issue #52727) ──
# Without this, ``recover_with_credential_pool`` sees a
# ``pool.provider != agent.provider`` mismatch and short-circuits,
# leaving the new provider with no rotation/recovery on 401/429 and
# burning the original pool's entries. Only reload when the provider
# actually changed (or the pool was missing) — re-selecting the same
# provider must not churn the pool reference. A reload failure is
# logged + swallowed: the switch itself must still complete.
old_norm = (old_provider or "").strip().lower()
new_norm = (new_provider or "").strip().lower()
if old_norm != new_norm or getattr(agent, "_credential_pool", None) is None:
try:
from agent.credential_pool import load_pool
agent._credential_pool = load_pool(new_provider)
except Exception as _pool_exc: # noqa: BLE001
logger.warning(
"switch_model: credential pool reload failed for %s (%s); "
"continuing without pool rotation this turn",
new_provider, _pool_exc,
)
# ── Build new client ──
if api_mode == "anthropic_messages":
if (new_provider or "").strip().lower() == "moa":
from agent.moa_loop import MoAClient
agent.api_key = api_key or "moa-virtual-provider"
agent.base_url = "moa://local"
agent._client_kwargs = {}
agent.client = MoAClient(agent.model or "default")
elif api_mode == "anthropic_messages":
from agent.anthropic_adapter import (
build_anthropic_client,
resolve_anthropic_token,
@@ -2104,8 +2205,21 @@ def looks_like_codex_intermediate_ack(
user_message: str,
assistant_content: str,
messages: List[Dict[str, Any]],
require_workspace: bool = True,
) -> bool:
"""Detect a planning/ack message that should continue instead of ending the turn."""
"""Detect a planning/ack message that should continue instead of ending the turn.
``require_workspace`` (default True) keeps the original codex-coding scope:
the ack must reference a filesystem/repo workspace. The conversation loop
passes ``require_workspace=False`` when the user has explicitly opted into
intent-ack continuation for all api_modes (``agent.intent_ack_continuation``
is ``true`` or a model-list), so general autonomous workflows ("I'll run a
health check on the server", "I'll start the deployment") — which carry a
future-ack and an action verb but no filesystem reference — are caught too.
The future-ack + short-content + no-prior-tools + action-verb requirements
always apply, which is what keeps conversational "I'll help you brainstorm"
replies from tripping it.
"""
if any(isinstance(msg, dict) and msg.get("role") == "tool" for msg in messages):
return False
@@ -2158,17 +2272,67 @@ def looks_like_codex_intermediate_ack(
"path",
)
assistant_mentions_action = any(marker in assistant_text for marker in action_markers)
if not assistant_mentions_action:
return False
# Opted-in (all-api_mode) path: a future-ack + action verb + no prior tool
# call is enough — the user asked us to keep going when the model only
# announces intent, regardless of whether a filesystem is involved.
if not require_workspace:
return True
user_text = (user_message or "").strip().lower()
user_targets_workspace = (
any(marker in user_text for marker in workspace_markers)
or "~/" in user_text
or "/" in user_text
)
assistant_mentions_action = any(marker in assistant_text for marker in action_markers)
assistant_targets_workspace = any(
marker in assistant_text for marker in workspace_markers
)
return (user_targets_workspace or assistant_targets_workspace) and assistant_mentions_action
return user_targets_workspace or assistant_targets_workspace
def intent_ack_continuation_mode(agent) -> str:
"""Classify the resolved intent-ack continuation mode for this turn.
Returns one of:
* ``"off"`` — never continue.
* ``"codex_only"`` — historical scope: continue only on the
``codex_responses`` api_mode, and only for codebase/workspace acks
(``require_workspace=True``).
* ``"all"`` — user opted in for every api_mode; continue on any
future-ack + action verb (``require_workspace=False``).
Mirrors the four-mode shape of ``agent.tool_use_enforcement``: ``"auto"``
(default) → codex_only; ``True``/"true"/"always"/"yes"/"on" → all;
``False``/"false"/"never"/"no"/"off" → off; ``list`` → all when a substring
matches the active model name, else off.
"""
mode = getattr(agent, "_intent_ack_continuation", "auto")
if mode is True or (isinstance(mode, str) and mode.lower() in {"true", "always", "yes", "on"}):
return "all"
if mode is False or (isinstance(mode, str) and mode.lower() in {"false", "never", "no", "off"}):
return "off"
if isinstance(mode, list):
model_lower = (agent.model or "").lower()
return "all" if any(p.lower() in model_lower for p in mode if isinstance(p, str)) else "off"
# "auto" or any unrecognised value — historical codex-only behavior.
return "codex_only" if agent.api_mode == "codex_responses" else "off"
def intent_ack_continuation_enabled(agent) -> bool:
"""Whether intent-ack continuation should fire at all for this turn.
The ``codex_ack_continuations < 2`` per-turn cap and the
``looks_like_codex_intermediate_ack`` detector are applied by the caller;
this only decides the on/off gate. Callers that also need to know whether
the workspace requirement applies should use ``intent_ack_continuation_mode``
directly (``"codex_only"`` ⇒ require_workspace=True, ``"all"`` ⇒ False).
"""
return intent_ack_continuation_mode(agent) != "off"
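To make the resolution rules easier to check at a glance, here is a condensed, standalone copy of the mode logic together with a tiny fake agent. `resolve_mode` and `make_agent` are names invented for this sketch, and the model string is a placeholder rather than a claim about any real model id.

```python
from types import SimpleNamespace
from typing import Any

_ON = {"true", "always", "yes", "on"}
_OFF = {"false", "never", "no", "off"}


def resolve_mode(agent: Any) -> str:
    """Condensed copy of the three-way resolution shown in the hunk above."""
    mode = getattr(agent, "_intent_ack_continuation", "auto")
    if mode is True or (isinstance(mode, str) and mode.lower() in _ON):
        return "all"
    if mode is False or (isinstance(mode, str) and mode.lower() in _OFF):
        return "off"
    if isinstance(mode, list):
        model_lower = (agent.model or "").lower()
        matched = any(p.lower() in model_lower for p in mode if isinstance(p, str))
        return "all" if matched else "off"
    # "auto" or anything unrecognised: historical codex-only gate.
    return "codex_only" if agent.api_mode == "codex_responses" else "off"


def make_agent(setting: Any, model: str = "example-codex-model",
               api_mode: str = "chat_completions") -> SimpleNamespace:
    """Hypothetical minimal agent carrying only the attributes the resolver reads."""
    return SimpleNamespace(
        _intent_ack_continuation=setting, model=model, api_mode=api_mode
    )
```

Under these rules `"auto"` only yields `"codex_only"` when `api_mode` is `codex_responses`, while a list setting ignores `api_mode` entirely and turns on for any model whose name contains one of the substrings.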

View File

@@ -673,6 +673,9 @@ def _build_anthropic_client_with_bearer_hook(
kwargs = {
"timeout": timeout_obj,
"http_client": http_client,
# Delegate retry to hermes's outer loop (honors Retry-After); the SDK
# default max_retries=2 ignores it and double-retries. (#26293)
"max_retries": 0,
# The SDK requires *something* for api_key/auth_token. Our
# event hook overrides Authorization per request so this value
# is never sent. The sentinel string makes accidental leaks
@@ -757,6 +760,12 @@ def build_anthropic_client(
_read_timeout = timeout if (isinstance(timeout, (int, float)) and timeout > 0) else 900.0
kwargs = {
"timeout": Timeout(timeout=float(_read_timeout), connect=10.0),
# Delegate all rate-limit / 5xx retry to hermes's outer conversation
# loop, which honors Retry-After. The SDK default (max_retries=2) uses
# its own 1-2s backoff that ignores Retry-After and double-retries
# inside our loop — burning request slots against a bucket that won't
# refill for minutes. (#26293)
"max_retries": 0,
}
if normalized_base_url:
# Azure Anthropic endpoints require an ``api-version`` query parameter.
@@ -852,6 +861,9 @@ def build_anthropic_bedrock_client(region: str):
return _anthropic_sdk.AnthropicBedrock(
aws_region=region,
timeout=Timeout(timeout=900.0, connect=10.0),
# Delegate retry to hermes's outer loop (honors Retry-After); the SDK
# default max_retries=2 ignores it and double-retries. (#26293)
max_retries=0,
default_headers={"anthropic-beta": ",".join([*_COMMON_BETAS, _CONTEXT_1M_BETA])},
)
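The `max_retries=0` changes above hand all retry timing to the outer loop, which the comments say honors `Retry-After` and otherwise applies jittered backoff. The outer loop itself is not part of this diff, so the following is only a sketch of that idea: the function name `next_delay`, the base and cap values, and the full-jitter formula are assumptions, not the actual Hermes implementation.

```python
import math
import random
from typing import Optional


def next_delay(
    retry_after: Optional[str],
    attempt: int,
    *,
    base: float = 1.0,
    cap: float = 60.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Pick how long to sleep before retry number ``attempt`` (0-based).

    A numeric Retry-After header wins outright. Otherwise fall back to
    exponential backoff with full jitter, bounded by ``cap`` seconds.
    """
    if retry_after is not None:
        try:
            seconds = float(retry_after)
            if math.isfinite(seconds) and seconds >= 0:
                return seconds
        except ValueError:
            # The HTTP-date form of Retry-After is not handled in this sketch.
            pass
    rng = rng or random.Random()
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0, ceiling)
```

If the SDK retried on its own 1-2s schedule inside such a loop, each outer attempt would silently become several requests, which is the double-retry the comments describe.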
@@ -914,44 +926,72 @@ def _read_claude_code_credentials_from_keychain() -> Optional[Dict[str, Any]]:
return None
def _read_claude_code_credentials_from_file() -> Optional[Dict[str, Any]]:
"""Read Claude Code OAuth credentials from ~/.claude/.credentials.json.
Returns dict with {accessToken, refreshToken?, expiresAt?, source} or None.
"""
cred_path = Path.home() / ".claude" / ".credentials.json"
if not cred_path.exists():
return None
try:
data = json.loads(cred_path.read_text(encoding="utf-8"))
except (json.JSONDecodeError, OSError, IOError) as e:
logger.debug("Failed to read ~/.claude/.credentials.json: %s", e)
return None
oauth_data = data.get("claudeAiOauth")
if not (oauth_data and isinstance(oauth_data, dict)):
return None
access_token = oauth_data.get("accessToken", "")
if not access_token:
return None
return {
"accessToken": access_token,
"refreshToken": oauth_data.get("refreshToken", ""),
"expiresAt": oauth_data.get("expiresAt", 0),
"source": "claude_code_credentials_file",
}
def read_claude_code_credentials() -> Optional[Dict[str, Any]]:
"""Read refreshable Claude Code OAuth credentials.
Checks two sources in order:
Reads from two possible sources and reconciles them:
1. macOS Keychain (Darwin only) — "Claude Code-credentials" entry
2. ~/.claude/.credentials.json file
Selection rules when both are present:
- If exactly one is non-expired, prefer that one. (Handles the case
where Claude Code refreshes one source but not the other — observed
in the wild on Claude Code 2.1.x.)
- Otherwise, prefer the source with the later ``expiresAt`` so that
any subsequent refresh uses the most recent ``refreshToken``.
This intentionally excludes ~/.claude.json primaryApiKey. Opencode's
subscription flow is OAuth/setup-token based with refreshable credentials,
and native direct Anthropic provider usage should follow that path rather
than auto-detecting Claude's first-party managed key.
Returns dict with {accessToken, refreshToken?, expiresAt?} or None.
Returns dict with {accessToken, refreshToken?, expiresAt?, source} or None.
"""
# Try macOS Keychain first (covers Claude Code >=2.1.114)
kc_creds = _read_claude_code_credentials_from_keychain()
if kc_creds:
return kc_creds
file_creds = _read_claude_code_credentials_from_file()
# Fall back to JSON file
cred_path = Path.home() / ".claude" / ".credentials.json"
if cred_path.exists():
try:
data = json.loads(cred_path.read_text(encoding="utf-8"))
oauth_data = data.get("claudeAiOauth")
if oauth_data and isinstance(oauth_data, dict):
access_token = oauth_data.get("accessToken", "")
if access_token:
return {
"accessToken": access_token,
"refreshToken": oauth_data.get("refreshToken", ""),
"expiresAt": oauth_data.get("expiresAt", 0),
"source": "claude_code_credentials_file",
}
except (json.JSONDecodeError, OSError, IOError) as e:
logger.debug("Failed to read ~/.claude/.credentials.json: %s", e)
if kc_creds and file_creds:
kc_valid = is_claude_code_token_valid(kc_creds)
file_valid = is_claude_code_token_valid(file_creds)
if kc_valid and not file_valid:
return kc_creds
if file_valid and not kc_valid:
return file_creds
# Both valid or both expired: prefer the later expiresAt so the
# downstream refresh path uses the freshest refresh_token.
kc_exp = kc_creds.get("expiresAt", 0) or 0
file_exp = file_creds.get("expiresAt", 0) or 0
return kc_creds if kc_exp >= file_exp else file_creds
return None
return kc_creds or file_creds
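The reconciliation rules in the docstring can be exercised in isolation with a pure function. `pick_credentials` and the injected `is_valid` callable are illustrative names; the real code calls `is_claude_code_token_valid` directly and reads the two sources itself.

```python
from typing import Any, Callable, Dict, Optional

Creds = Dict[str, Any]


def pick_credentials(
    kc: Optional[Creds],
    file: Optional[Creds],
    is_valid: Callable[[Creds], bool],
) -> Optional[Creds]:
    """Choose between keychain and file credentials.

    - Only one source present: use it.
    - Exactly one source valid: use the valid one.
    - Both valid or both expired: use the later ``expiresAt``,
      with the keychain winning ties.
    """
    if kc and file:
        kc_ok, file_ok = is_valid(kc), is_valid(file)
        if kc_ok != file_ok:
            return kc if kc_ok else file
        kc_exp = kc.get("expiresAt", 0) or 0
        file_exp = file.get("expiresAt", 0) or 0
        return kc if kc_exp >= file_exp else file
    return kc or file
```

Passing the validity check in as a parameter keeps the sketch independent of the clock, so each of the three selection rules can be tested with fixed timestamps.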
def is_claude_code_token_valid(creds: Dict[str, Any]) -> bool:
@@ -1034,8 +1074,40 @@ def refresh_anthropic_oauth_pure(refresh_token: str, *, use_json: bool = False)
def _refresh_oauth_token(creds: Dict[str, Any]) -> Optional[str]:
"""Attempt to refresh an expired Claude Code OAuth token."""
refresh_token = creds.get("refreshToken", "")
"""Attempt to refresh an expired Claude Code OAuth token.
Claude Code's OAuth refresh tokens are single-use: a successful refresh
rotates the pair and invalidates the old refresh token. Claude Code itself
also refreshes on its own schedule (IDE/CLI activity), so by the time
Hermes notices an expired token, Claude Code may have already rotated it.
POSTing our now-stale refresh token in that window races Claude Code and
fails with ``invalid_grant``.
So before refreshing, re-read the live credential sources. If Claude Code
has already produced a valid token, adopt it and skip the POST entirely.
Only fall back to refreshing ourselves when no fresh credential is found.
"""
# Claude Code may have already refreshed — adopt its token rather than
# racing it with our (possibly already-rotated) refresh token. Only adopt
# when the live re-read produced a DIFFERENT token with a real future
# expiry: re-adopting the same credential we were just handed would be a
# no-op, and a 0/absent ``expiresAt`` means "managed key / unknown expiry"
# (see is_claude_code_token_valid) which must NOT be treated as a fresh
# refresh here.
current = read_claude_code_credentials()
if current:
current_token = current.get("accessToken", "")
current_exp = current.get("expiresAt", 0) or 0
if (
current_token
and current_token != creds.get("accessToken", "")
and current_exp > 0
and is_claude_code_token_valid(current)
):
logger.debug("Adopted Claude Code's already-refreshed OAuth token")
return current_token
refresh_token = (current or {}).get("refreshToken", "") or creds.get("refreshToken", "")
if not refresh_token:
logger.debug("No refresh token available — cannot refresh")
return None
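
The selection and adopt-before-refresh rules in this hunk can be sketched in isolation. This is a minimal sketch under stated assumptions, not the adapter's actual code: `is_valid` is a hypothetical stand-in for `is_claude_code_token_valid`, `pick_credentials` and `adoptable_token` are illustrative names, and `expiresAt` is assumed to be an epoch timestamp where 0 means "unknown expiry".

```python
from typing import Any, Dict, Optional

Creds = Dict[str, Any]


def is_valid(creds: Creds, now: int) -> bool:
    """Hypothetical validity check: a zero/absent expiresAt counts as valid."""
    exp = creds.get("expiresAt", 0) or 0
    return bool(creds.get("accessToken")) and (exp == 0 or exp > now)


def pick_credentials(kc: Optional[Creds], fc: Optional[Creds], now: int) -> Optional[Creds]:
    """Prefer the only valid source; on a tie, prefer the later expiresAt."""
    if kc and fc:
        kc_ok, fc_ok = is_valid(kc, now), is_valid(fc, now)
        if kc_ok != fc_ok:
            return kc if kc_ok else fc
        kc_exp = kc.get("expiresAt", 0) or 0
        fc_exp = fc.get("expiresAt", 0) or 0
        return kc if kc_exp >= fc_exp else fc
    return kc or fc


def adoptable_token(current: Optional[Creds], stale: Creds, now: int) -> Optional[str]:
    """Return a token to adopt only if it differs from ours and has a real future expiry."""
    if not current:
        return None
    token = current.get("accessToken", "")
    exp = current.get("expiresAt", 0) or 0
    if token and token != stale.get("accessToken", "") and exp > 0 and is_valid(current, now):
        return token
    return None
```

When `adoptable_token` returns `None`, the caller would fall back to POSTing a refresh token, preferring the one from the live re-read over the one it was handed.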


@@ -102,6 +102,7 @@ OpenAI = _OpenAIProxy() # module-level name, resolves lazily on call/isinstance
from agent.credential_pool import load_pool
from agent.model_metadata import MINIMUM_CONTEXT_LENGTH, get_model_context_length
from agent.process_bootstrap import build_keepalive_http_client
from hermes_cli.config import get_hermes_home
from hermes_constants import OPENROUTER_BASE_URL
from utils import base_url_host_matches, base_url_hostname, env_float, model_forces_max_completion_tokens, normalize_proxy_env_vars
@@ -109,6 +110,23 @@ from utils import base_url_host_matches, base_url_hostname, env_float, model_for
logger = logging.getLogger(__name__)
def _openai_http_client_kwargs(
base_url: Optional[str],
*,
async_mode: bool = False,
) -> Dict[str, Any]:
"""Inject keepalive httpx client with env-only proxy (not macOS system proxy)."""
client = build_keepalive_http_client(str(base_url or ""), async_mode=async_mode)
if client is None:
return {}
return {"http_client": client}
def _create_openai_client(*, api_key: str, base_url: str, **kwargs: Any) -> Any:
kwargs = {**_openai_http_client_kwargs(base_url), **kwargs}
return OpenAI(api_key=api_key, base_url=base_url, **kwargs)
# ── Interrupt protection for atomic auxiliary tasks ──────────────────────
# Some auxiliary tasks must NOT be aborted mid-flight by a gateway interrupt
# (e.g. an incoming user message while the agent is busy). Context
@@ -1614,7 +1632,7 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
_merged_aux = _apply_user_default_headers(extra.get("default_headers"))
if _merged_aux:
extra["default_headers"] = _merged_aux
_client = OpenAI(api_key=api_key, base_url=base_url, **extra)
_client = _create_openai_client(api_key=api_key, base_url=base_url, **extra)
_client = _maybe_wrap_anthropic(_client, model, api_key, raw_base_url)
return _client, model
@@ -1654,7 +1672,7 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
_merged_aux2 = _apply_user_default_headers(extra.get("default_headers"))
if _merged_aux2:
extra["default_headers"] = _merged_aux2
_client = OpenAI(api_key=api_key, base_url=base_url, **extra)
_client = _create_openai_client(api_key=api_key, base_url=base_url, **extra)
_client = _maybe_wrap_anthropic(_client, model, api_key, raw_base_url)
return _client, model
@@ -1669,20 +1687,21 @@ def _try_openrouter(explicit_api_key: str = None, model: str = None) -> Tuple[Op
pool_present, entry = _select_pool_entry("openrouter")
if pool_present:
or_key = explicit_api_key or _pool_runtime_api_key(entry)
if not or_key:
_mark_provider_unhealthy("openrouter", ttl=60)
return None, None
base_url = _pool_runtime_base_url(entry, OPENROUTER_BASE_URL) or OPENROUTER_BASE_URL
logger.debug("Auxiliary client: OpenRouter via pool")
return OpenAI(api_key=or_key, base_url=base_url,
default_headers=build_or_headers()), model or _OPENROUTER_MODEL
if or_key:
base_url = _pool_runtime_base_url(entry, OPENROUTER_BASE_URL) or OPENROUTER_BASE_URL
logger.debug("Auxiliary client: OpenRouter via pool")
return _create_openai_client(api_key=or_key, base_url=base_url,
default_headers=build_or_headers()), model or _OPENROUTER_MODEL
# Pool exists but is exhausted (no usable runtime key) — fall through to
# the OPENROUTER_API_KEY env-var path rather than failing outright.
logger.debug("Auxiliary client: OpenRouter pool exhausted, trying OPENROUTER_API_KEY")
or_key = explicit_api_key or os.getenv("OPENROUTER_API_KEY")
if not or_key:
_mark_provider_unhealthy("openrouter", ttl=60)
return None, None
logger.debug("Auxiliary client: OpenRouter")
return OpenAI(api_key=or_key, base_url=OPENROUTER_BASE_URL,
return _create_openai_client(api_key=or_key, base_url=OPENROUTER_BASE_URL,
default_headers=build_or_headers()), model or _OPENROUTER_MODEL
@@ -1775,7 +1794,7 @@ def _try_nous(vision: bool = False) -> Tuple[Optional[OpenAI], Optional[str]]:
return None, None
base_url = str((nous or {}).get("inference_base_url") or _nous_base_url()).rstrip("/")
return (
OpenAI(
_create_openai_client(
api_key=api_key,
base_url=base_url,
),
@@ -2052,7 +2071,7 @@ def _try_custom_endpoint() -> Tuple[Optional[Any], Optional[str]]:
if _custom_headers:
_extra["default_headers"] = _custom_headers
if custom_mode == "codex_responses":
real_client = OpenAI(api_key=custom_key, base_url=_clean_base, **_extra)
real_client = _create_openai_client(api_key=custom_key, base_url=_clean_base, **_extra)
return CodexAuxiliaryClient(real_client, model), model
if custom_mode == "anthropic_messages":
# Third-party Anthropic-compatible gateway (MiniMax, Zhipu GLM,
@@ -2066,14 +2085,14 @@ def _try_custom_endpoint() -> Tuple[Optional[Any], Optional[str]]:
"Custom endpoint declares api_mode=anthropic_messages but the "
"anthropic SDK is not installed — falling back to OpenAI-wire."
)
return OpenAI(api_key=custom_key, base_url=_clean_base, **_extra), model
return _create_openai_client(api_key=custom_key, base_url=_clean_base, **_extra), model
return (
AnthropicAuxiliaryClient(real_client, model, custom_key, custom_base, is_oauth=False),
model,
)
# URL-based anthropic detection for custom endpoints that didn't set
# api_mode explicitly (e.g. kimi.com/coding reached via custom config).
_fallback_client = OpenAI(api_key=custom_key, base_url=_clean_base, **_extra)
_fallback_client = _create_openai_client(api_key=custom_key, base_url=_clean_base, **_extra)
_fallback_client = _maybe_wrap_anthropic(
_fallback_client, model, custom_key, custom_base, custom_mode,
)
@@ -2102,7 +2121,7 @@ def _build_xai_oauth_aux_client(model: str) -> Tuple[Optional[Any], Optional[str
return None, None
api_key, base_url = resolved
logger.debug("Auxiliary client: xAI OAuth (%s via Responses API)", model)
real_client = OpenAI(api_key=api_key, base_url=base_url)
real_client = _create_openai_client(api_key=api_key, base_url=base_url)
return CodexAuxiliaryClient(real_client, model), model
@@ -2139,7 +2158,7 @@ def _build_codex_client(model: str) -> Tuple[Optional[Any], Optional[str]]:
return None, None
base_url = _CODEX_AUX_BASE_URL
logger.debug("Auxiliary client: Codex OAuth (%s via Responses API)", model)
real_client = OpenAI(
real_client = _create_openai_client(
api_key=codex_token,
base_url=base_url,
default_headers=_codex_cloudflare_headers(codex_token),
@@ -2239,7 +2258,7 @@ def _try_azure_foundry(
if _dq:
extra["default_query"] = _dq
client = OpenAI(api_key=api_key, base_url=_clean_base, **extra)
client = _create_openai_client(api_key=api_key, base_url=_clean_base, **extra)
if runtime_api_mode == "codex_responses":
# GPT-5.x / o-series / codex models on Azure Foundry are
@@ -3624,6 +3643,37 @@ def _resolve_auto(
# config.yaml (auxiliary.<task>.provider) still win over this.
main_provider = str(runtime_provider or _read_main_provider() or "")
main_model = str(runtime_model or _read_main_model() or "")
# MoA virtual provider: the "model" is a preset name (e.g. "opus-gpt") and
# there is no real "moa" HTTP endpoint, so resolving an aux client against
# provider="moa"/model=<preset> sends the preset name as the model id and
# the provider 400s ("opus-gpt is not a valid model ID"). Auxiliary tasks
# (title generation, compression, vision, …) don't need the reference
# fan-out — they should run on the aggregator, which is the preset's acting
# model. Resolve the MoA preset to its aggregator slot and continue Step 1
# with that real provider+model. Mirrors the MoA context-length resolution.
if main_provider == "moa":
try:
from hermes_cli.config import load_config
from hermes_cli.moa_config import resolve_moa_preset
_preset = resolve_moa_preset(load_config().get("moa") or {}, main_model)
_agg = _preset.get("aggregator") or {}
_agg_provider = str(_agg.get("provider") or "").strip()
_agg_model = str(_agg.get("model") or "").strip()
if _agg_provider and _agg_model and _agg_provider.lower() != "moa":
main_provider = _agg_provider
main_model = _agg_model
# The MoA virtual runtime carries a non-HTTP base_url
# ("moa://local") and a placeholder api_key; they belong to the
# facade, not the aggregator's real provider. Drop them so the
# aggregator resolves through its own provider credentials.
runtime_base_url = ""
runtime_api_key = ""
runtime_api_mode = ""
except Exception:
logger.debug("MoA aux resolution to aggregator failed", exc_info=True)
if (main_provider and main_model
and main_provider not in {"auto", ""}):
resolved_provider = main_provider
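
The aggregator substitution described in the comment above can be sketched as a pure function. This is a hedged illustration: `resolve_aux_target` and the `presets` mapping shape are assumptions made for the example, and the real code goes through `resolve_moa_preset` and the loaded config instead.

```python
from typing import Any, Dict, Tuple


def resolve_aux_target(
    provider: str, model: str, presets: Dict[str, Dict[str, Any]]
) -> Tuple[str, str]:
    """Map a virtual ("moa", preset) pair to the preset's aggregator, else pass through."""
    if provider != "moa":
        return provider, model
    agg = (presets.get(model) or {}).get("aggregator") or {}
    agg_provider = str(agg.get("provider") or "").strip()
    agg_model = str(agg.get("model") or "").strip()
    # Refuse an empty or self-referential aggregator and keep the original pair.
    if agg_provider and agg_model and agg_provider.lower() != "moa":
        return agg_provider, agg_model
    return provider, model
```

Returning the original pair on any unusable preset mirrors the diff's behaviour of logging and continuing rather than raising.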
@@ -3770,6 +3820,10 @@ def _to_async_client(sync_client, model: str, is_vision: bool = False):
_merged_async = _apply_user_default_headers(async_kwargs.get("default_headers"))
if _merged_async:
async_kwargs["default_headers"] = _merged_async
async_kwargs = {
**_openai_http_client_kwargs(sync_base_url, async_mode=True),
**async_kwargs,
}
return AsyncOpenAI(**async_kwargs), model
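
Both `_create_openai_client` and this async hunk rely on dict-merge ordering so that an injected keepalive client is only a default. A minimal sketch of that precedence, using hypothetical helper names (`http_client_kwargs`, `merge_client_kwargs`) rather than the module's own:

```python
from typing import Any, Dict, Optional


def http_client_kwargs(client: Optional[object]) -> Dict[str, Any]:
    """Return an injectable default, or nothing when no client could be built."""
    return {} if client is None else {"http_client": client}


def merge_client_kwargs(defaults: Dict[str, Any], caller: Dict[str, Any]) -> Dict[str, Any]:
    """Later entries win in a dict display, so caller kwargs override defaults."""
    return {**defaults, **caller}
```

Because the caller's mapping is unpacked last, an explicit `http_client` passed by a call site replaces the injected one instead of raising a duplicate-keyword error.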
@@ -3980,7 +4034,7 @@ def resolve_provider_client(
"but no Codex OAuth token found (run: hermes model)")
return None, None
final_model = _normalize_resolved_model(model, provider)
raw_client = OpenAI(
raw_client = _create_openai_client(
api_key=codex_token,
base_url=_CODEX_AUX_BASE_URL,
default_headers=_codex_cloudflare_headers(codex_token),
@@ -4061,7 +4115,7 @@ def resolve_provider_client(
_merged_custom = _apply_user_default_headers(extra.get("default_headers"))
if _merged_custom:
extra["default_headers"] = _merged_custom
client = OpenAI(api_key=custom_key, base_url=_clean_base, **extra)
client = _create_openai_client(api_key=custom_key, base_url=_clean_base, **extra)
client = _wrap_if_needed(client, final_model, custom_base, custom_key)
return (_to_async_client(client, final_model, is_vision=is_vision) if async_mode
else (client, final_model))
@@ -4165,7 +4219,7 @@ def resolve_provider_client(
_fb_headers = _apply_user_default_headers(_fb_extra.get("default_headers"))
if _fb_headers:
_fb_extra["default_headers"] = _fb_headers
client = OpenAI(api_key=custom_key, base_url=_fb_clean, **_fb_extra)
client = _create_openai_client(api_key=custom_key, base_url=_fb_clean, **_fb_extra)
return (_to_async_client(client, final_model, is_vision=is_vision) if async_mode
else (client, final_model))
sync_anthropic = AnthropicAuxiliaryClient(
@@ -4174,7 +4228,7 @@ def resolve_provider_client(
if async_mode:
return AsyncAnthropicAuxiliaryClient(sync_anthropic), final_model
return sync_anthropic, final_model
client = OpenAI(api_key=custom_key, base_url=_clean_base2, **_extra2)
client = _create_openai_client(api_key=custom_key, base_url=_clean_base2, **_extra2)
# codex_responses or inherited auto-detect (via _wrap_if_needed).
# _wrap_if_needed reads the closed-over `api_mode` (the task-level
# override). Named-provider entry api_mode=codex_responses also
@@ -4316,7 +4370,7 @@ def resolve_provider_client(
_merged_main = _apply_user_default_headers(headers)
if _merged_main:
headers = _merged_main
client = OpenAI(api_key=api_key, base_url=base_url,
client = _create_openai_client(api_key=api_key, base_url=base_url,
**({"default_headers": headers} if headers else {}))
# Copilot GPT-5+ models (except gpt-5-mini) require the Responses
@@ -4852,7 +4906,7 @@ def _refresh_nous_auxiliary_client(
return None, model
fresh_key, fresh_base_url = runtime
sync_client = OpenAI(api_key=fresh_key, base_url=fresh_base_url)
sync_client = _create_openai_client(api_key=fresh_key, base_url=fresh_base_url)
final_model = model
current_loop = None
@@ -5962,8 +6016,17 @@ def call_llm(
# When the provider returns a 429 rate-limit (not billing), fall
# back to an alternative provider instead of exhausting retries
# against the same rate-limited endpoint.
#
# ── Auth error fallback (#21165) ─────────────────────────────
# When the resolved provider returns 401 and neither the Nous
# refresh path nor explicit provider credential refresh applies,
# fall back to an alternative provider instead of dropping the
# auxiliary task on the floor (silent compression failure /
# message loss). Auth is NOT a capacity error: it only bypasses
# the explicit-provider gate when the user is in auto mode.
should_fallback = (
_is_payment_error(first_err)
_is_auth_error(first_err)
or _is_payment_error(first_err)
or _is_connection_error(first_err)
or _is_rate_limit_error(first_err)
or _is_model_incompatible_error(first_err)
@@ -5993,7 +6056,9 @@ def call_llm(
or _is_invalid_aux_response_error(first_err)
)
if should_fallback and (is_auto or is_capacity_error):
if _is_payment_error(first_err):
if _is_auth_error(first_err):
reason = "auth error"
elif _is_payment_error(first_err):
reason = "payment error"
# Resolve the actual provider label (resolved_provider may be
# "auto"; the client's base_url tells us which backend got the
@@ -6442,8 +6507,13 @@ async def async_call_llm(
raise
# ── Payment / connection / rate-limit fallback (mirrors sync call_llm) ──
# Auth error fallback (#21165): a 401 that survived the refresh path
# falls back in auto mode just like the sync call_llm() path. Auth is
# NOT a capacity error, so on an explicit provider it still respects
# the user's choice (handled by the is_auto/is_capacity_error gate).
should_fallback = (
_is_payment_error(first_err)
_is_auth_error(first_err)
or _is_payment_error(first_err)
or _is_connection_error(first_err)
or _is_rate_limit_error(first_err)
or _is_model_incompatible_error(first_err)
@@ -6465,7 +6535,9 @@ async def async_call_llm(
or _is_invalid_aux_response_error(first_err)
)
if should_fallback and (is_auto or is_capacity_error):
if _is_payment_error(first_err):
if _is_auth_error(first_err):
reason = "auth error"
elif _is_payment_error(first_err):
reason = "payment error"
_mark_provider_unhealthy(
_recoverable_pool_provider(resolved_provider, client) or resolved_provider
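
The gate added in both `call_llm` and `async_call_llm` can be sketched with plain booleans. This is an assumption-laden illustration: the full definition of `is_capacity_error` is not visible in the diff, so the sketch treats payment, connection and rate-limit errors as capacity errors and, per the comments, excludes auth.

```python
def should_use_fallback(
    *, auth: bool, payment: bool, connection: bool, rate_limit: bool, is_auto: bool
) -> bool:
    """Decide whether an auxiliary call may switch providers after a failure."""
    eligible = auth or payment or connection or rate_limit
    # Assumed capacity set; auth is deliberately not part of it.
    is_capacity = payment or connection or rate_limit
    return eligible and (is_auto or is_capacity)


def fallback_reason(*, auth: bool, payment: bool) -> str:
    """Auth is checked first, matching the order of the new if/elif chain."""
    if auth:
        return "auth error"
    if payment:
        return "payment error"
    return "other"
```

Under these assumptions a 401 on an explicitly chosen provider does not trigger fallback, while the same 401 in auto mode does.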


@@ -37,6 +37,18 @@ from tools.terminal_tool import is_persistent_env
from utils import base_url_host_matches, base_url_hostname, env_float, env_int
logger = logging.getLogger(__name__)
_OPENROUTER_PROVIDER_SORT_VALUES = {"throughput", "latency", "price"}
# When the fallback chain is fully exhausted on a non-rate-limit failure
# (e.g. every provider returns a non-retryable client error like HTTP 400),
# arm a short cooldown so the NEXT turn's restore_primary_runtime stays gated
# and does not reset _fallback_index=0 to replay the entire chain again.
# Without this, a client/gateway that re-submits immediately would re-marshal
# the full (potentially 80k-token) context once per provider every turn and
# can drive a constrained host into memory/swap exhaustion. Rate-limit /
# billing reasons keep their own 60s cooldown (set above); this is the
# narrower non-rate-limit case. See issue #24996.
_FALLBACK_EXHAUSTED_COOLDOWN_S = 5.0
def _ra():
@@ -115,6 +127,23 @@ def _is_openai_codex_backend(agent) -> bool:
)
def _validated_openrouter_provider_sort(raw_sort: Any) -> Optional[str]:
"""Return a normalized OpenRouter provider.sort value or None."""
if not isinstance(raw_sort, str):
return None
sort_value = raw_sort.strip().lower()
if not sort_value:
return None
if sort_value in _OPENROUTER_PROVIDER_SORT_VALUES:
return sort_value
logger.warning(
"Ignoring invalid OpenRouter provider.sort value %r (allowed: %s)",
raw_sort,
", ".join(sorted(_OPENROUTER_PROVIDER_SORT_VALUES)),
)
return None
def _env_float(name: str, default: float) -> float:
try:
return float(os.getenv(name, str(default)))
@@ -229,6 +258,11 @@ def interruptible_api_call(agent, api_kwargs: dict):
invalidate_runtime_client(region)
raise
result["response"] = normalize_converse_response(raw_response)
elif agent.provider == "moa":
# MoA is a virtual chat-completions provider backed by the
# in-process MoAClient facade. Do not rebuild a request-local
# OpenAI client from the virtual runtime metadata.
result["response"] = agent.client.chat.completions.create(**api_kwargs)
else:
request_client = _set_request_client(
agent._create_request_openai_client(
@@ -698,8 +732,9 @@ def build_api_kwargs(agent, api_messages: list) -> dict:
_prefs["ignore"] = agent.providers_ignored
if agent.providers_order:
_prefs["order"] = agent.providers_order
if agent.provider_sort:
_prefs["sort"] = agent.provider_sort
_provider_sort = _validated_openrouter_provider_sort(agent.provider_sort)
if _provider_sort:
_prefs["sort"] = _provider_sort
if agent.provider_require_parameters:
_prefs["require_parameters"] = True
if agent.provider_data_collection:
@@ -1015,18 +1050,23 @@ def build_assistant_message(agent, assistant_message, finish_reason: str) -> dic
"arguments": tool_call.function.arguments
},
}
# Defence-in-depth: redact credentials from tool call arguments
# before they enter conversation history. Tool execution uses the
# raw API response object, not this dict, so redacting the
# persisted shape is safe and only affects storage. Catches the
# case where a model accidentally inlines a secret into a tool
# call (e.g. `terminal(command="curl -H 'Authorization: Bearer
# sk-...'")`). (#19798)
if isinstance(tc_dict["function"]["arguments"], str):
from agent.redact import redact_sensitive_text
tc_dict["function"]["arguments"] = redact_sensitive_text(
tc_dict["function"]["arguments"]
)
# Tool-call arguments are intentionally NOT redacted here. This
# dict enters the in-memory conversation history that is replayed
# to the model on every subsequent turn AND persisted to state.db,
# which is itself replayed verbatim on session resume
# (get_messages_as_conversation). Masking a credential to `***`
# here poisons that replay: the model reads back its own
# `PGPASSWORD='***' psql ...` call and copies the placeholder into
# the next tool call, breaking every credential-dependent command
# on the second turn (#43083). The masking also provided no real
# protection — the same secret still leaks verbatim through tool
# OUTPUT (file contents, command output, diffs, the compaction
# block), none of which this pass ever touched. Keeping secrets
# out of the replayable store is a separate tokenization/vault
# concern, not something arg-redaction can deliver without
# breaking replay. Storage-time redaction remains governed by the
# `security.redact_secrets` toggle. (#19798 introduced this;
# #43083 removed it.)
# Preserve extra_content (e.g. Gemini thought_signature) so it
# is sent back on subsequent API calls. Without this, Gemini 3
# thinking models reject the request with a 400 error.
@@ -1093,8 +1133,22 @@ def try_activate_fallback(agent, reason: "FailoverReason | None" = None) -> bool
if (not fallback_already_active) or (primary_provider and current_provider == primary_provider):
agent._rate_limited_until = time.monotonic() + 60
if agent._fallback_index >= len(agent._fallback_chain):
# Chain exhausted. If we actually walked a non-empty chain and the
# failure was NOT a rate-limit/billing event (those already armed
# their own 60s cooldown above), arm a short cooldown so the next
# turn's restore_primary_runtime stays gated instead of resetting
# _fallback_index=0 and re-marshaling the whole context across every
# provider again. Guards the cross-turn replay storm in #24996.
if (
len(agent._fallback_chain) > 0
and reason not in {FailoverReason.rate_limit, FailoverReason.billing}
):
_existing_cooldown = getattr(agent, "_rate_limited_until", 0) or 0
agent._rate_limited_until = max(
_existing_cooldown,
time.monotonic() + _FALLBACK_EXHAUSTED_COOLDOWN_S,
)
return False
fb = agent._fallback_chain[agent._fallback_index]
agent._fallback_index += 1
fb_provider = (fb.get("provider") or "").strip().lower()
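
The exhaustion cooldown above reduces to one small rule, sketched here as a pure function. The function name and the string reason labels are hypothetical stand-ins for the agent attributes and `FailoverReason` members used in the diff; only the 5-second constant comes from the source.

```python
from typing import Optional

FALLBACK_EXHAUSTED_COOLDOWN_S = 5.0  # same value as the constant in the diff
RATE_LIMIT_REASONS = {"rate_limit", "billing"}  # stand-ins for FailoverReason members


def arm_exhaustion_cooldown(
    existing_until: float, chain_len: int, reason: Optional[str], now: float
) -> float:
    """Return the cooldown deadline after the fallback chain is exhausted."""
    if chain_len > 0 and reason not in RATE_LIMIT_REASONS:
        # max() never shortens a longer cooldown that is already armed.
        return max(existing_until, now + FALLBACK_EXHAUSTED_COOLDOWN_S)
    return existing_until
```

Taking `now` as a parameter keeps the sketch deterministic; the real code reads `time.monotonic()` directly.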
@@ -1210,14 +1264,16 @@ def try_activate_fallback(agent, reason: "FailoverReason | None" = None) -> bool
agent._transport_cache.clear()
agent._fallback_activated = True
# Clear the credential pool when the fallback provider doesn't match
# the pool's provider. The pool was seeded for the primary provider;
# leaving it attached means downstream recovery (rate_limit / billing /
# auth) calls ``_swap_credential`` with a primary entry which overwrites
# the agent's ``base_url`` back to the primary's endpoint — every
# fallback request then 404s against the wrong host. See #33163.
# Rebind the credential pool to the fallback provider when the provider
# changes. Keeping the primary pool attached would make downstream
# recovery (rate_limit / billing / auth) mutate the wrong credential
# set and can overwrite the fallback's base_url back to the primary
# endpoint. See #33163.
#
# When the fallback shares the pool's provider (e.g. both openrouter
# entries with different routing) the pool is preserved.
# entries with different routing) the pool is preserved. When the
# providers differ, load the fallback provider's own pool if one exists
# so provider-specific rotation continues to work after the switch.
_existing_pool = getattr(agent, "_credential_pool", None)
if _existing_pool is not None:
_pool_provider = (getattr(_existing_pool, "provider", "") or "").strip().lower()
@@ -1228,6 +1284,22 @@ def try_activate_fallback(agent, reason: "FailoverReason | None" = None) -> bool
fb_provider, fb_model, _pool_provider,
)
agent._credential_pool = None
if getattr(agent, "_credential_pool", None) is None:
try:
from agent.credential_pool import load_pool
fallback_pool = load_pool(fb_provider)
if fallback_pool and fallback_pool.has_credentials():
agent._credential_pool = fallback_pool
logger.info(
"Fallback to %s/%s: attached fallback credential pool",
fb_provider, fb_model,
)
except Exception as exc:
logger.debug(
"Fallback to %s/%s: could not attach credential pool: %s",
fb_provider, fb_model, exc,
)
# Honor per-provider / per-model request_timeout_seconds for the
# fallback target (same knob the primary client uses). None = use
@@ -1458,8 +1530,9 @@ def handle_max_iterations(agent, messages: list, api_call_count: int) -> str:
provider_preferences["ignore"] = agent.providers_ignored
if agent.providers_order:
provider_preferences["order"] = agent.providers_order
if agent.provider_sort:
provider_preferences["sort"] = agent.provider_sort
_provider_sort = _validated_openrouter_provider_sort(agent.provider_sort)
if _provider_sort:
provider_preferences["sort"] = _provider_sort
if provider_preferences and (
(agent.provider or "").strip().lower() == "openrouter"
or agent._is_openrouter_url()
@@ -2246,7 +2319,15 @@ def interruptible_streaming_api_call(agent, api_kwargs: dict, *, on_first_delta=
_fire_first_delta()
agent._fire_reasoning_delta(thinking_text)
# Return the native Anthropic Message for downstream processing
# Return the native Anthropic Message for downstream processing.
# If the stream was interrupted (the event loop broke out above on
# agent._interrupt_requested), do NOT call get_final_message() — on
# a partially-consumed stream the SDK may hang draining remaining
# events or return a Message with incomplete tool_use blocks (partial
# JSON in `input`). The outer poll loop raises InterruptedError, so
# this return value is discarded anyway.
if agent._interrupt_requested:
return None
return stream.get_final_message()
def _call():
@@ -2391,12 +2472,19 @@ def interruptible_streaming_api_call(agent, api_kwargs: dict, *, on_first_delta=
diag=request_client_holder.get("diag"),
)
_close_request_client_once("stream_mid_tool_retry_cleanup")
try:
agent._replace_primary_openai_client(
reason="stream_mid_tool_retry_pool_cleanup"
)
except Exception:
pass
if agent.api_mode == "anthropic_messages":
try:
agent._anthropic_client.close()
agent._rebuild_anthropic_client()
except Exception:
pass
else:
try:
agent._replace_primary_openai_client(
reason="stream_mid_tool_retry_pool_cleanup"
)
except Exception:
pass
continue
# SSE error events from proxies (e.g. OpenRouter sends
@@ -2444,12 +2532,19 @@ def interruptible_streaming_api_call(agent, api_kwargs: dict, *, on_first_delta=
_close_request_client_once("stream_retry_cleanup")
# Also rebuild the primary client to purge
# any dead connections from the pool.
try:
agent._replace_primary_openai_client(
reason="stream_retry_pool_cleanup"
)
except Exception:
pass
if agent.api_mode == "anthropic_messages":
try:
agent._anthropic_client.close()
agent._rebuild_anthropic_client()
except Exception:
pass
else:
try:
agent._replace_primary_openai_client(
reason="stream_retry_pool_cleanup"
)
except Exception:
pass
continue
# Retries exhausted. Log the final failure with
# full diagnostic detail (chain, headers,
@@ -2620,10 +2715,17 @@ def interruptible_streaming_api_call(agent, api_kwargs: dict, *, on_first_delta=
pass
# Rebuild the primary client too — its connection pool
# may hold dead sockets from the same provider outage.
try:
agent._replace_primary_openai_client(reason="stale_stream_pool_cleanup")
except Exception:
pass
if agent.api_mode == "anthropic_messages":
try:
agent._anthropic_client.close()
agent._rebuild_anthropic_client()
except Exception:
pass
else:
try:
agent._replace_primary_openai_client(reason="stale_stream_pool_cleanup")
except Exception:
pass
# Reset the timer so we don't kill repeatedly while
# the inner thread processes the closure.
last_chunk_time["t"] = time.time()
@@ -2699,7 +2801,30 @@ def interruptible_streaming_api_call(agent, api_kwargs: dict, *, on_first_delta=
role="assistant", content=_partial_text, tool_calls=None,
reasoning_content=None,
)
return SimpleNamespace(
# Detect provider output-layer content filtering (e.g. MiniMax
# "output new_sensitive (1027)", Azure/OpenAI content_filter,
# Anthropic safety refusal). The raw error is about to be
# swallowed into a finish_reason=length stub, so classify it HERE
# while we still have it and stamp the stub. Retrying such a
# content-deterministic filter on the same primary just re-hits
# the filter — the conversation loop reads this tag and activates
# the fallback chain instead of burning continuation retries.
# error_classifier is the single source of truth for "what counts
# as a content filter" (#32421).
_content_filter_terminated = False
try:
from agent.error_classifier import classify_api_error, FailoverReason
_cls = classify_api_error(
result["error"],
provider=str(getattr(agent, "provider", "") or ""),
model=str(getattr(agent, "model", "") or ""),
)
_content_filter_terminated = (
_cls.reason == FailoverReason.content_policy_blocked
)
except Exception:
_content_filter_terminated = False
_stub = SimpleNamespace(
id=PARTIAL_STREAM_STUB_ID,
model=getattr(agent, "model", "unknown"),
choices=[SimpleNamespace(
@@ -2708,6 +2833,9 @@ def interruptible_streaming_api_call(agent, api_kwargs: dict, *, on_first_delta=
usage=None,
_dropped_tool_names=_partial_names or None,
)
if _content_filter_terminated:
_stub._content_filter_terminated = True
return _stub
raise result["error"]
return result["response"]
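
The stub-tagging pattern in the last two hunks can be sketched without the agent. This is a minimal illustration, assuming a caller-supplied `classify` callable that returns a string label; the real code calls `classify_api_error` and compares against `FailoverReason.content_policy_blocked`.

```python
from types import SimpleNamespace
from typing import Callable


def build_partial_stub(
    error: Exception, classify: Callable[[Exception], str], partial_text: str
) -> SimpleNamespace:
    """Build a finish_reason=length stub and tag it if a content filter ended the stream."""
    stub = SimpleNamespace(
        choices=[SimpleNamespace(
            message=SimpleNamespace(role="assistant", content=partial_text),
            finish_reason="length",
        )],
        usage=None,
    )
    try:
        blocked = classify(error) == "content_policy_blocked"
    except Exception:
        blocked = False  # classification is best-effort and must not mask the stub
    if blocked:
        stub._content_filter_terminated = True
    return stub
```

A consumer can then read the tag defensively with `getattr(stub, "_content_filter_terminated", False)`, which is how the conversation loop in this diff checks it.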


@@ -60,6 +60,8 @@ from dataclasses import dataclass
from pathlib import Path
from typing import Any, Optional
from hermes_cli._subprocess_compat import IS_WINDOWS, windows_hide_flags
logger = logging.getLogger("hermes.coding_context")
CODING_TOOLSET = "coding"
@@ -647,12 +649,14 @@ def _enabled_mcp_servers(config: Optional[dict[str, Any]]) -> list[str]:
def _git(cwd: Path, *args: str) -> str:
_popen_kwargs = {"creationflags": windows_hide_flags()} if IS_WINDOWS else {}
try:
out = subprocess.run(
["git", "-C", str(cwd), *args],
capture_output=True,
text=True,
timeout=_GIT_TIMEOUT,
**_popen_kwargs,
)
except (OSError, subprocess.SubprocessError):
return ""
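
The `_popen_kwargs` pattern keeps console windows from flashing on Windows while leaving other platforms untouched. The sketch below is an assumption about what such a helper might do: it uses the standard `subprocess.CREATE_NO_WINDOW` flag as a stand-in, since the actual return value of `windows_hide_flags()` is not shown in the diff, and `hidden_window_kwargs` and `run_quiet` are hypothetical names.

```python
import subprocess
import sys
from typing import Any, Dict, List


def hidden_window_kwargs() -> Dict[str, Any]:
    """Hypothetical equivalent of the diff's `_popen_kwargs` expression."""
    if sys.platform == "win32":
        # CREATE_NO_WINDOW is only defined by the subprocess module on Windows.
        return {"creationflags": subprocess.CREATE_NO_WINDOW}
    return {}


def run_quiet(cmd: List[str]) -> str:
    """Run a command without a console window; return stdout or "" on failure."""
    try:
        out = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=10,
            stdin=subprocess.DEVNULL,
            **hidden_window_kwargs(),
        )
    except (OSError, subprocess.SubprocessError):
        return ""
    return out.stdout.strip()
```

Building the kwargs conditionally avoids passing `creationflags` on POSIX, where `subprocess` rejects non-zero values.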


@@ -12,6 +12,7 @@ from pathlib import Path
from typing import Awaitable, Callable
from agent.model_metadata import estimate_tokens_rough
from hermes_cli._subprocess_compat import IS_WINDOWS, windows_hide_flags
_QUOTED_REFERENCE_VALUE = r'(?:`[^`\n]+`|"[^"\n]+"|\'[^\'\n]+\')'
REFERENCE_PATTERN = re.compile(
@@ -290,6 +291,7 @@ def _expand_git_reference(
args: list[str],
label: str,
) -> tuple[str | None, str | None]:
_popen_kwargs = {"creationflags": windows_hide_flags()} if IS_WINDOWS else {}
try:
result = subprocess.run(
["git", *args],
@@ -298,6 +300,7 @@ def _expand_git_reference(
text=True,
timeout=30,
stdin=subprocess.DEVNULL,
**_popen_kwargs,
)
except subprocess.TimeoutExpired:
return f"{ref.raw}: git command timed out (30s)", None
@@ -483,6 +486,7 @@ def _iter_visible_entries(path: Path, cwd: Path, limit: int) -> list[Path]:
def _rg_files(path: Path, cwd: Path, limit: int) -> list[Path] | None:
_popen_kwargs = {"creationflags": windows_hide_flags()} if IS_WINDOWS else {}
try:
result = subprocess.run(
["rg", "--files", str(path.relative_to(cwd))],
@@ -491,6 +495,7 @@ def _rg_files(path: Path, cwd: Path, limit: int) -> list[Path] | None:
text=True,
timeout=10,
stdin=subprocess.DEVNULL,
**_popen_kwargs,
)
except (FileNotFoundError, OSError, subprocess.TimeoutExpired):
return None


@@ -288,6 +288,29 @@ def replay_compression_warning(agent: Any) -> None:
pass
def conversation_history_after_compression(agent: Any, messages: list) -> Optional[list]:
"""Return the correct flush baseline after a compression boundary.
Legacy compression rotates to a fresh child session. That child has not
seen the compacted transcript through the normal same-turn flush path yet,
so callers must clear ``conversation_history`` to ``None`` and let the next
persistence call write the whole compacted list.
In-place compaction is different: ``archive_and_compact()`` has already
soft-archived the previous active rows and inserted ``messages`` as the new
active live transcript under the same session id. If the same agent turn
continues with ``conversation_history=None``, the identity-based flush path
treats those already-persisted compacted dicts as new and appends them a
second time, doubling the active context and retriggering compression.
A shallow copy is intentional: it captures the current compacted dict
identities as history while allowing later same-turn appends to remain new.
"""
if bool(getattr(agent, "_last_compaction_in_place", False)):
return list(messages)
return None
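
The docstring's point about shallow copies can be demonstrated with a toy identity-based flush. `new_messages` below is a hypothetical stand-in for the real flush path, which is not shown in this diff; the sketch only illustrates why `list(messages)` is the right baseline after in-place compaction.

```python
from typing import List, Optional


def new_messages(messages: List[dict], history: Optional[List[dict]]) -> List[dict]:
    """Toy identity-based flush: anything not already in history (by id) is new."""
    seen = {id(m) for m in (history or [])}
    return [m for m in messages if id(m) not in seen]


compacted = [{"role": "user", "content": "summary"}]
baseline = list(compacted)  # shallow copy: same dict objects, separate list
compacted.append({"role": "assistant", "content": "next step"})

only_new = new_messages(compacted, baseline)   # just the appended message
double_write = new_messages(compacted, None)   # everything looks new again
```

With the shallow-copy baseline only the later append is flushed; with `None` the already-persisted compacted rows would be written a second time, which is the duplication the function guards against.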
def compress_context(
agent: Any,
messages: list,


@@ -28,6 +28,7 @@ import uuid
from typing import Any, Dict, List, Optional
from agent.codex_responses_adapter import _summarize_user_message_for_log
from agent.conversation_compression import conversation_history_after_compression
from agent.display import KawaiiSpinner
from agent.error_classifier import FailoverReason, classify_api_error
from agent.iteration_budget import IterationBudget
@@ -587,6 +588,13 @@ def run_conversation(
compression_attempts = 0
_turn_exit_reason = "unknown" # Diagnostic: why the loop ended
# Per-turn tally of consecutive successful credential-pool token refreshes,
# keyed by (provider, pool-entry-id). A persistent upstream 401 lets
# ``try_refresh_current()`` "succeed" forever on a single-entry OAuth pool,
# so this tally caps same-entry refreshes and lets the fallback chain take
# over instead of spinning. Reset here so each turn starts fresh. See #26080.
agent._auth_pool_refresh_counts = {}
# Optional opt-in runtime: if api_mode == codex_app_server, hand the
# turn to the codex app-server subprocess (terminal/file ops/patching
# all run inside Codex). Default Hermes path is bypassed entirely.
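
The per-turn refresh tally introduced above could be consumed along these lines. This is a sketch only: the cap value and the `allow_refresh` helper are hypothetical, since the diff shows the counter being reset but not where it is checked.

```python
from typing import Dict, Tuple

MAX_SAME_ENTRY_REFRESHES = 2  # hypothetical cap; the real limit is not shown in the diff


def allow_refresh(counts: Dict[Tuple[str, str], int], provider: str, entry_id: str) -> bool:
    """Permit a refresh of one pool entry only until the per-turn cap is reached."""
    key = (provider, entry_id)
    if counts.get(key, 0) >= MAX_SAME_ENTRY_REFRESHES:
        return False  # stop spinning; let the fallback chain take over
    counts[key] = counts.get(key, 0) + 1
    return True
```

Keying by `(provider, entry_id)` means a multi-entry pool can still rotate to a different credential after one entry hits the cap.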
@@ -827,7 +835,6 @@ def run_conversation(
aggregator=moa_config.get("aggregator") or {},
temperature=float(moa_config.get("reference_temperature", 0.6) or 0.6),
aggregator_temperature=float(moa_config.get("aggregator_temperature", 0.4) or 0.4),
max_tokens=int(moa_config.get("max_tokens", 4096) or 4096),
)
if _moa_context:
for _msg in reversed(api_messages):
@@ -1692,6 +1699,56 @@ def run_conversation(
if agent.api_mode in {"chat_completions", "bedrock_converse", "anthropic_messages"}:
assistant_message = _trunc_msg
# ── Content-filter stream stall → fallback (#32421) ──
# When the provider's output-layer safety filter (e.g.
# MiniMax "output new_sensitive (1027)", Azure
# content_filter) kills the stream mid-delivery, the
# raw error was classified at the swallow point and the
# stub tagged ``_content_filter_terminated``. This
# filter is content-deterministic — continuation
# retries against the SAME primary just re-hit it and
# burn paid attempts (the loop used to give up with
# "Response remained truncated after 3 continuation
# attempts" and never consult the fallback chain).
# Escalate to the configured fallback BEFORE retrying.
_cf_terminated = getattr(
response, "_content_filter_terminated", False
)
if (
_cf_terminated
and agent._fallback_index < len(agent._fallback_chain)
):
agent._vprint(
f"{agent.log_prefix}🛡️ Content filter terminated "
f"stream — activating fallback provider...",
force=True,
)
agent._emit_status(
"Content filter terminated stream; switching to fallback..."
)
if agent._try_activate_fallback():
# Roll the partial content (if any was already
# appended in a prior continuation pass) back to
# the last clean turn so the fallback provider
# gets a coherent continuation point.
if truncated_response_parts:
messages = agent._get_messages_up_to_last_assistant(messages)
agent._session_messages = messages
length_continue_retries = 0
truncated_response_parts = []
retry_count = 0
compression_attempts = 0
_retry.primary_recovery_attempted = False
_retry.restart_with_rebuilt_messages = True
break
# No fallback available — fall through to normal
# continuation (best-effort, may loop).
agent._vprint(
f"{agent.log_prefix}⚠️ No fallback provider "
f"configured — retrying with same provider "
f"(may re-hit filter)...",
force=True,
)
if assistant_message is not None and not _trunc_has_tool_calls:
length_continue_retries += 1
interim_msg = agent._build_assistant_message(assistant_message, finish_reason)
@@ -2259,6 +2316,15 @@ def run_conversation(
# "unknown variant `image_url`, expected `text`".
"unknown variant `image_url`, expected `text`",
"unknown variant image_url, expected text",
# OpenRouter routes a request to upstream endpoints and,
# when none of the candidate endpoints for the model accept
# image input, returns HTTP 404 "No endpoints found that
# support image input". Without this phrase the agent never
# strips the images, the retry loop re-sends the same
# rejected request until exhaustion, and the gateway leaves
# every subsequent message queued behind the stuck turn —
# the P1 in issue #21160. The 404 passes the 4xx gate below.
"no endpoints found that support image input",
)
_err_lower = _err_body.lower()
_looks_like_image_rejection = any(
@@ -2830,10 +2896,9 @@ def run_conversation(
approx_tokens=approx_tokens,
task_id=effective_task_id,
)
- # Compression created a new session — clear history
- # so _flush_messages_to_session_db writes compressed
- # messages to the new session, not skipping them.
- conversation_history = None
+ conversation_history = conversation_history_after_compression(
+ agent, messages
+ )
if len(messages) < original_len or old_ctx > _reduced_ctx:
agent._buffer_status(
f"🗜️ Context reduced to {_reduced_ctx:,} tokens "
@@ -2845,15 +2910,25 @@ def run_conversation(
# Fall through to normal error handling if compression
# is exhausted or didn't help.
- # Eager fallback for rate-limit errors (429 or quota exhaustion).
- # When a fallback model is configured, switch immediately instead
- # of burning through retries with exponential backoff -- the
- # primary provider won't recover within the retry window.
+ # Eager fallback for rate-limit errors (429 or quota exhaustion)
+ # and transport errors (connection failure / timeout / provider
+ # overloaded). Rate limits and billing: switch immediately —
+ # the primary provider won't recover within the retry window.
+ # Transport errors: allow 1 retry first (transient hiccups
+ # recover), then fall back if the provider is truly unreachable.
is_rate_limited = classified.reason in {
FailoverReason.rate_limit,
FailoverReason.billing,
}
- if is_rate_limited and agent._fallback_index < len(agent._fallback_chain):
+ _is_transport_failure = classified.reason in {
+ FailoverReason.timeout,
+ FailoverReason.overloaded,
+ }
+ _should_fallback = (
+ is_rate_limited
+ or (_is_transport_failure and retry_count >= 2)
+ )
+ if _should_fallback and agent._fallback_index < len(agent._fallback_chain):
# Don't eagerly fallback if credential pool rotation may
# still recover. See _pool_may_recover_from_rate_limit
# for the single-credential-pool and CloudCode-quota
@@ -2868,6 +2943,10 @@ def run_conversation(
agent._buffer_status(
"⚠️ Billing or credits exhausted — switching to fallback provider..."
)
elif _is_transport_failure:
agent._buffer_status(
"⚠️ Provider unreachable — switching to fallback provider..."
)
else:
agent._buffer_status("⚠️ Rate limited — switching to fallback provider...")
if agent._try_activate_fallback(reason=classified.reason):
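The eager-fallback condition from the two hunks above can be read as a single predicate. The sketch below is a minimal restatement under stated assumptions: `Reason` is a hypothetical stand-in for `FailoverReason`, `should_fallback` is not a function in the repository, and the credential-pool recovery check that follows in the real code is left out.

```python
from enum import Enum


class Reason(Enum):  # hypothetical stand-in for FailoverReason
    rate_limit = "rate_limit"
    billing = "billing"
    timeout = "timeout"
    overloaded = "overloaded"
    other = "other"


def should_fallback(reason: Reason, retry_count: int,
                    fallback_index: int, chain_len: int) -> bool:
    """Restates the hunk's condition; pool-recovery checks are omitted."""
    is_rate_limited = reason in {Reason.rate_limit, Reason.billing}
    is_transport_failure = reason in {Reason.timeout, Reason.overloaded}
    # Rate limits and billing switch immediately; transport failures
    # must first reach the retry threshold used in the hunk (>= 2).
    wants_fallback = is_rate_limited or (is_transport_failure and retry_count >= 2)
    return wants_fallback and fallback_index < chain_len
```

Keeping the chain-exhaustion test (`fallback_index < chain_len`) separate from the reason test mirrors the diff, where only the first operand of the `if` changed.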
@@ -3042,10 +3121,9 @@ def run_conversation(
messages, system_message, approx_tokens=approx_tokens,
task_id=effective_task_id,
)
- # Compression created a new session — clear history
- # so _flush_messages_to_session_db writes compressed
- # messages to the new session, not skipping them.
- conversation_history = None
+ conversation_history = conversation_history_after_compression(
+ agent, messages
+ )
# Re-estimate tokens after compression. Same-message-count
# compression (tool-result pruning, in-place summarization)
@@ -3209,10 +3287,9 @@ def run_conversation(
messages, system_message, approx_tokens=approx_tokens,
task_id=effective_task_id,
)
- # Compression created a new session — clear history
- # so _flush_messages_to_session_db writes compressed
- # messages to the new session, not skipping them.
- conversation_history = None
+ conversation_history = conversation_history_after_compression(
+ agent, messages
+ )
# Re-estimate tokens after compression. Same-message-count
# compression (tool-result pruning, in-place summarization)
@@ -3474,6 +3551,13 @@ def run_conversation(
):
_retry.primary_recovery_attempted = True
retry_count = 0
# Primary transport recovery starts a fresh attempt
# cycle. Re-open fallback state so a follow-on 429 can
# still activate fallback_providers after stale
# pre-recovery fallback/credential-pool bookkeeping.
_retry.has_retried_429 = False
agent._fallback_index = 0
agent._fallback_activated = False
continue
# Try fallback before giving up entirely
if agent._has_pending_fallback():
@@ -3661,7 +3745,12 @@ def run_conversation(
_ra_raw = _resp_headers.get("retry-after") or _resp_headers.get("Retry-After")
if _ra_raw:
try:
- _retry_after = min(float(_ra_raw), 120) # Cap at 2 minutes
+ # Cap at 10 minutes. Anthropic Tier 1 input-token
+ # buckets reset in ~171s, so a 120s cap caused us to
+ # retry before the actual reset window and re-trip the
+ # limit. 600s covers all realistic provider reset
+ # windows while still rejecting pathological values. (#26293)
+ + _retry_after = min(float(_ra_raw), 600)
except (TypeError, ValueError):
pass
wait_time = _retry_after if _retry_after else jittered_backoff(retry_count, base_delay=2.0, max_delay=60.0)
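The capped `Retry-After` handling in this hunk can be isolated into a small function for illustration. `parse_retry_after` and `RETRY_AFTER_CAP_SECONDS` are hypothetical names; the sketch only covers the numeric (delay-seconds) form that the hunk parses with `float()`.

```python
from typing import Optional

RETRY_AFTER_CAP_SECONDS = 600.0  # mirrors the new cap in the hunk above


def parse_retry_after(raw: Optional[str],
                      cap: float = RETRY_AFTER_CAP_SECONDS) -> Optional[float]:
    """Return the capped delay in seconds, or None if the header is unusable."""
    if not raw:
        return None
    try:
        return min(float(raw), cap)
    except (TypeError, ValueError):
        # HTTP-date values and garbage are ignored, so the caller
        # falls back to its own jittered backoff.
        return None
```

With the old 120-second cap, a header of `171` would have been shortened to 120 seconds, which is the early-retry problem the comment describes; with the 600-second cap it passes through unchanged.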
@@ -3742,6 +3831,17 @@ def run_conversation(
_retry.restart_with_compressed_messages = False
continue
if _retry.restart_with_rebuilt_messages:
# A content-filter stream stall (#32421) was escalated to the
# fallback chain and the partial content rolled back. Re-issue
# the API call against the now-active fallback provider. Refund
# the budget/count for the stalled attempt so the fallback gets a
# fair turn.
api_call_count -= 1
agent.iteration_budget.refund()
_retry.restart_with_rebuilt_messages = False
continue
if _retry.restart_with_length_continuation:
# Progressively boost the output token budget on each retry.
# Retry 1 → 2× base, retry 2 → 3× base, capped at 32 768.
@@ -4316,10 +4416,9 @@ def run_conversation(
approx_tokens=agent.context_compressor.last_prompt_tokens,
task_id=effective_task_id,
)
- # Compression created a new session — clear history so
- # _flush_messages_to_session_db writes compressed messages
- # to the new session (see preflight compression comment).
- conversation_history = None
+ conversation_history = conversation_history_after_compression(
+ agent, messages
+ )
# Save session log incrementally (so progress is visible even if interrupted)
agent._session_messages = messages
@@ -4361,7 +4460,11 @@ def run_conversation(
"as final response"
)
final_response = _recovered
- agent._response_was_previewed = True
+ # Streaming delivered a fragment, not a confirmed
+ # final preview. Leave response_previewed false so
+ # gateway fallback delivery can send the recovered
+ # text plus the abnormal-turn explanation.
+ agent._response_was_previewed = False
break
# If the previous turn already delivered real content alongside
@@ -4606,14 +4709,20 @@ def run_conversation(
# status from earlier failed attempts in this turn.
agent._clear_status_buffer()
+ from agent.agent_runtime_helpers import (
+ intent_ack_continuation_mode,
+ )
+ _ack_mode = intent_ack_continuation_mode(agent)
if (
- agent.api_mode == "codex_responses"
+ _ack_mode != "off"
and agent.valid_tool_names
and codex_ack_continuations < 2
and agent._looks_like_codex_intermediate_ack(
user_message=user_message,
assistant_content=final_response,
messages=messages,
+ require_workspace=(_ack_mode == "codex_only"),
)
):
codex_ack_continuations += 1


@@ -23,6 +23,7 @@ from typing import Any
from agent.file_safety import get_read_block_error, is_write_denied
from agent.redact import redact_sensitive_text
from tools.environments.local import hermes_subprocess_env
ACP_MARKER_BASE_URL = "acp://copilot"
_DEFAULT_TIMEOUT_SECONDS = 900.0
@@ -94,7 +95,10 @@ def _resolve_home_dir() -> str:
def _build_subprocess_env() -> dict[str, str]:
- env = os.environ.copy()
+ # Copilot ACP is a model-driving CLI executor: it legitimately needs LLM
+ # provider credentials. Route through the central helper so Tier-1 secrets
+ # (gateway bot tokens, GitHub auth, infra) are still stripped (#29157).
+ env = hermes_subprocess_env(inherit_credentials=True)
home = _resolve_home_dir()
env["HOME"] = home
from hermes_constants import apply_subprocess_home_env


@@ -537,10 +537,11 @@ class CredentialPool:
self._entries[idx] = new
return
- def _persist(self) -> None:
+ def _persist(self, *, removed_ids: Optional[List[str]] = None) -> None:
write_credential_pool(
self.provider,
[entry.to_dict() for entry in self._entries],
+ removed_ids=removed_ids,
)
def _is_terminal_auth_failure(
@@ -1124,13 +1125,17 @@ class CredentialPool:
logger.debug(
"Failed to clear terminal xAI OAuth state: %s", clear_exc
)
+ removed_ids = [
+ item.id for item in self._entries
+ if item.source == "loopback_pkce"
+ ]
self._entries = [
item for item in self._entries
if item.source != "loopback_pkce"
]
if self._current_id == entry.id:
self._current_id = None
- self._persist()
+ self._persist(removed_ids=removed_ids)
return None
# For openai-codex: same race as xAI/nous — another Hermes process
# may have consumed the refresh token between our proactive sync
@@ -1190,13 +1195,17 @@ class CredentialPool:
logger.debug(
"Failed to clear terminal Codex OAuth state: %s", clear_exc
)
+ removed_ids = [
+ item.id for item in self._entries
+ if item.source == "device_code"
+ ]
self._entries = [
item for item in self._entries
if item.source != "device_code"
]
if self._current_id == entry.id:
self._current_id = None
- self._persist()
+ self._persist(removed_ids=removed_ids)
return None
# For nous: another process may have consumed the refresh token
# between our proactive sync and the HTTP call. Re-sync from
@@ -1253,13 +1262,17 @@ class CredentialPool:
auth_mod.NOUS_DEVICE_CODE_SOURCE,
f"manual:{auth_mod.NOUS_DEVICE_CODE_SOURCE}",
}
+ removed_ids = [
+ item.id for item in self._entries
+ if item.source in singleton_sources
+ ]
self._entries = [
item for item in self._entries
if item.source not in singleton_sources
]
if self._current_id == entry.id:
self._current_id = None
- self._persist()
+ self._persist(removed_ids=removed_ids)
return None
self._mark_exhausted(entry, None)
return None
@@ -1421,7 +1434,7 @@ class CredentialPool:
pruned_ids = set(entries_to_prune)
self._entries = [e for e in self._entries if e.id not in pruned_ids]
if cleared_any:
- self._persist()
+ self._persist(removed_ids=entries_to_prune)
return available
def _select_unlocked(self) -> Optional[PooledCredential]:
@@ -1595,7 +1608,11 @@ class CredentialPool:
replace(entry, priority=new_priority)
for new_priority, entry in enumerate(self._entries)
]
- self._persist()
+ write_credential_pool(
+ self.provider,
+ [entry.to_dict() for entry in self._entries],
+ removed_ids=[removed.id],
+ )
if self._current_id == removed.id:
self._current_id = None
return removed
@@ -2257,6 +2274,11 @@ def _seed_custom_pool(pool_key: str, entries: List[PooledCredential]) -> Tuple[b
def load_pool(provider: str) -> CredentialPool:
provider = (provider or "").strip().lower()
raw_entries = read_credential_pool(provider)
disk_ids = {
entry.get("id")
for entry in raw_entries
if isinstance(entry, dict) and entry.get("id")
}
raw_needs_sanitization = any(
isinstance(payload, dict)
and sanitize_borrowed_credential_payload(payload, provider) != payload
@@ -2285,8 +2307,10 @@ def load_pool(provider: str) -> CredentialPool:
changed |= _normalize_pool_priorities(provider, entries)
if changed:
new_ids = {entry.id for entry in entries}
write_credential_pool(
provider,
[entry.to_dict() for entry in sorted(entries, key=lambda item: item.priority)],
removed_ids=disk_ids - new_ids,
)
return CredentialPool(provider, entries)
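The `removed_ids` plumbing in this file suggests that the writer preserves on-disk entries it was not handed, unless the caller names them as removed. This page does not show `write_credential_pool`, so the merge rule below is an assumption, and `merge_pool` is a hypothetical function used only to show why `disk_ids - new_ids` matters.

```python
from typing import Iterable


def merge_pool(disk: list[dict], ours: list[dict],
               removed_ids: Iterable[str] = ()) -> list[dict]:
    """Hypothetical merge: keep our entries, plus on-disk entries we were
    not handed, except those explicitly named as removed."""
    removed = set(removed_ids)
    our_ids = {e["id"] for e in ours}
    survivors = [
        e for e in disk
        if e["id"] not in our_ids and e["id"] not in removed
    ]
    return list(ours) + survivors


# We loaded "a" and "b"; another process added "c"; we then pruned "b".
disk = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
ours = [{"id": "a"}]
removed_ids = {"a", "b"} - {e["id"] for e in ours}  # ids we saw minus ids we kept

merged = merge_pool(disk, ours, removed_ids)
resurrected = merge_pool(disk, ours)  # without removed_ids, "b" comes back
```

Under this reading, passing the removed ids is what distinguishes "entry I deliberately deleted" from "entry another process wrote that I never saw".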


@@ -273,6 +273,21 @@ def should_run_now(now: Optional[datetime] = None) -> bool:
# Automatic state transitions (pure function, no LLM)
# ---------------------------------------------------------------------------
def _cron_referenced_skills() -> Set[str]:
"""Skill names referenced by any cron job (incl. paused/disabled).
Best-effort: a cron-module import error or corrupt jobs store must never
break the curator, so any failure yields an empty set (no protection,
but no crash).
"""
try:
from cron.jobs import referenced_skill_names as _refs
return _refs()
except Exception as e:
logger.debug("Curator could not read cron skill references: %s", e, exc_info=True)
return set()
def apply_automatic_transitions(now: Optional[datetime] = None) -> Dict[str, int]:
"""Walk every curator-managed skill and move active/stale/archived based on
the latest real activity timestamp. Pinned skills are never touched.
@@ -292,6 +307,8 @@ def apply_automatic_transitions(now: Optional[datetime] = None) -> Dict[str, int
stale_cutoff = now - timedelta(days=get_stale_after_days())
archive_cutoff = now - timedelta(days=get_archive_after_days())
cron_referenced = _cron_referenced_skills()
counts = {"marked_stale": 0, "archived": 0, "reactivated": 0, "checked": 0, "seeded": 0}
for row in _u.agent_created_report():
@@ -300,6 +317,15 @@ def apply_automatic_transitions(now: Optional[datetime] = None) -> Dict[str, int
if row.get("pinned"):
continue
# A skill referenced by any cron job (incl. paused/disabled) is in
# use by definition — resuming or the next fire must find it. The
# scheduler only bumps usage when a job actually fires, so jobs that
# fire less often than archive_after_days, paused jobs, and far-future
# one-shots would otherwise have their skills aged out from under
# them. Treat referenced skills like pinned: never auto-transition.
if name in cron_referenced:
continue
# First sight of a curation-eligible skill with no persisted record
# (e.g. a newly-eligible built-in): anchor its clock to now and defer.
if not row.get("_persisted", True):
@@ -316,6 +342,18 @@ def apply_automatic_transitions(now: Optional[datetime] = None) -> Dict[str, int
current = row.get("state", _u.STATE_ACTIVE)
# Never-used skills (use_count == 0) get a grace floor: don't archive
# one until it is at least stale_after_days old. A use=0 skill is
# absence of evidence, not evidence of staleness — a skill created
# recently may simply not have had its trigger come up yet.
never_used = int(row.get("use_count", 0) or 0) == 0
if never_used and anchor > stale_cutoff:
# Younger than the stale window — leave it alone entirely.
if current == _u.STATE_STALE:
_u.set_state(name, _u.STATE_ACTIVE)
counts["reactivated"] += 1
continue
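The grace floor above can be condensed into a pure decision function for illustration. `next_state` is a hypothetical name, the 30/90-day defaults are assumptions, and the stale rule is assumed because that branch is not shown on this page; only the never-used check and the archive check come from the hunk.

```python
from datetime import datetime, timedelta


def next_state(current: str, anchor: datetime, use_count: int, now: datetime,
               stale_after_days: int = 30, archive_after_days: int = 90) -> str:
    """Hypothetical condensed form of the per-skill decision."""
    stale_cutoff = now - timedelta(days=stale_after_days)
    archive_cutoff = now - timedelta(days=archive_after_days)
    if use_count == 0 and anchor > stale_cutoff:
        # Grace floor: too young to judge; undo a premature "stale" mark.
        return "active" if current == "stale" else current
    if anchor <= archive_cutoff:
        return "archived"
    if anchor <= stale_cutoff:
        return "stale"  # assumed rule; the stale branch is not shown here
    return "active"


now = datetime(2026, 6, 29)
```

A never-used skill anchored five days ago stays active (and is reactivated if it had been marked stale), while one anchored 120 days ago is still archived, so the floor delays archiving rather than preventing it.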
if anchor <= archive_cutoff and current != _u.STATE_ARCHIVED:
ok, _msg = _u.archive_skill(name)
if ok:
@@ -390,10 +428,19 @@ CURATOR_REVIEW_PROMPT = (
"back load-bearing UX (slash-command entry points referenced in docs and "
"tips) and are filtered out of the candidate list below — never resurrect "
"one as an archive or absorb target.\n"
+ "3c. DO NOT archive or prune any skill marked `cron=yes` in the candidate "
+ "list. A cron job depends on it and will fail to load it on its next "
+ "run. You MAY still consolidate it into an umbrella — but only because "
+ "the curator rewrites cron job skill references to follow consolidations; "
+ "never simply prune it.\n"
"4. DO NOT use usage counters as a reason to skip consolidation. The "
"counters are new and often mostly zero. Judge overlap on CONTENT, "
"not on use_count. 'use=0' is not evidence a skill is valuable; it's "
- "absence of evidence either way.\n"
+ "absence of evidence either way. Corollary: 'use=0' is ALSO not a "
+ "reason to PRUNE a skill. Never archive a never-used skill (use=0) "
+ "unless it is at least 30 days old (check last_activity / created date) "
+ "AND its content is genuinely obsolete or fully absorbed elsewhere — a "
+ "recently-created skill simply may not have had its trigger come up yet.\n"
"5. DO NOT reject consolidation on the grounds that 'each skill has "
"a distinct trigger'. Pairwise distinctness is the wrong bar. The "
"right bar is: 'would a human maintainer write this as N separate "
@@ -1413,12 +1460,14 @@ def _render_candidate_list() -> str:
rows = skill_usage.agent_created_report()
if not rows:
return "No agent-created skills to review."
cron_referenced = _cron_referenced_skills()
lines = [f"Agent-created skills ({len(rows)}):\n"]
for r in rows:
lines.append(
f"- {r['name']} "
f"state={r['state']} "
f"pinned={'yes' if r.get('pinned') else 'no'} "
f"cron={'yes' if r['name'] in cron_referenced else 'no'} "
f"activity={r.get('activity_count', 0)} "
f"use={r.get('use_count', 0)} "
f"view={r.get('view_count', 0)} "


@@ -133,6 +133,31 @@ _RATE_LIMIT_PATTERNS = [
"servicequotaexceededexception",
]
# Patterns that indicate provider-side overload, NOT a per-credential rate
# limit or billing problem. The credential is valid — the server is just
# busy — so the correct recovery is "back off and retry the same key", never
# "rotate the credential" (rotating exhausts the pool while the endpoint is
# still busy; a single-key user has nothing to rotate to). Some providers
# (notably Z.AI / Zhipu) reuse HTTP 429 for server-wide overload, so the 429
# status path matches the body against this list before falling through to
# the rate_limit default. Phrases are kept narrow and overload-flavoured so a
# normal rate-limit message ("you have been rate-limited") doesn't hit this
# bucket. (#14038, #15297)
_OVERLOADED_PATTERNS = [
"overloaded",
"temporarily overloaded",
"service is temporarily overloaded",
"service may be temporarily overloaded",
"server is overloaded",
"server overloaded",
"service overloaded",
"service is overloaded",
"upstream overloaded",
"currently overloaded",
"at capacity",
"over capacity",
]
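A minimal sketch of how a 429 body might be disambiguated against this list. `classify_429` and the abbreviated `OVERLOADED_PATTERNS` tuple are hypothetical, and lowercasing the message before matching is an assumption based on the patterns being all lowercase.

```python
OVERLOADED_PATTERNS = ("overloaded", "at capacity", "over capacity")  # abbreviated


def classify_429(error_msg: str) -> str:
    """Return 'overloaded' for server-busy bodies, else 'rate_limit'."""
    msg = error_msg.lower()  # assumed: the real caller normalises case
    if any(pattern in msg for pattern in OVERLOADED_PATTERNS):
        return "overloaded"
    return "rate_limit"
```

Because matching is by substring, the bare `"overloaded"` entry already matches every longer phrase in the list that contains it; only `"at capacity"` and `"over capacity"` extend the reach.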
# Usage-limit patterns that need disambiguation (could be billing OR rate_limit)
_USAGE_LIMIT_PATTERNS = [
"usage limit",
@@ -330,6 +355,14 @@ _CONTENT_POLICY_BLOCKED_PATTERNS = [
# echo back; the underscore form is provider-specific enough.
"content_filter",
"responsibleaipolicyviolation",
# MiniMax output-layer safety filter. The error string is surfaced
# verbatim by MiniMax SDK / OpenAI-compatible endpoints, usually in the
# form "output new_sensitive (1027)" when the model's *output* (often a
# large tool-call argument block) trips the upstream safety filter and
# the SSE stream is truncated mid-flight. ``new_sensitive`` is the
# filter name and is narrow enough that billing / format / auth error
# strings will not collide. See #32421.
"new_sensitive",
]
# Auth patterns (non-status-code signals)
@@ -863,7 +896,19 @@ def _classify_by_status(
)
if status_code == 429:
- # Already checked long_context_tier above; this is a normal rate limit
+ # Already checked long_context_tier above. Some providers (notably
+ # Z.AI / Zhipu) reuse HTTP 429 for server-wide overload — same status
+ # code as a true per-credential rate limit, but the credential is
+ # valid and the correct recovery is "back off and retry the same key",
+ # NOT "rotate the credential" (which exhausts the pool while the
+ # endpoint is still busy, and does nothing for a single-key user).
+ # Disambiguate on the error body so an overload 429 takes the
+ # transient-overload path instead of burning the pool. (#14038)
+ if any(p in error_msg for p in _OVERLOADED_PATTERNS):
+ return result_fn(
+ FailoverReason.overloaded,
+ retryable=True,
+ )
return result_fn(
FailoverReason.rate_limit,
retryable=True,
@@ -1214,6 +1259,17 @@ def _classify_by_message(
should_fallback=True,
)
# Overloaded / server-busy patterns — must come BEFORE the rate_limit and
# billing checks so that a message-only "overloaded" (no 503/529 status,
# e.g. some Anthropic-compatible proxies) classifies as a transient
# overload (backoff + retry) instead of falling through to `unknown` or
# incorrectly triggering credential rotation.
if any(p in error_msg for p in _OVERLOADED_PATTERNS):
return result_fn(
FailoverReason.overloaded,
retryable=True,
)
# Billing patterns
if any(p in error_msg for p in _BILLING_PATTERNS):
return result_fn(
@@ -1303,19 +1359,25 @@ def _extract_status_code(error: Exception) -> Optional[int]:
def _extract_error_body(error: Exception) -> dict:
- """Extract the structured error body from an SDK exception."""
- body = getattr(error, "body", None)
- if isinstance(body, dict):
- return body
- # Some errors have .response.json()
- response = getattr(error, "response", None)
- if response is not None:
- try:
- json_body = response.json()
- if isinstance(json_body, dict):
- return json_body
- except Exception:
- pass
+ """Extract the structured error body from an SDK exception or its cause chain."""
+ current = error
+ for _ in range(5): # Match _extract_status_code() traversal depth.
+ body = getattr(current, "body", None)
+ if isinstance(body, dict):
+ return body
+ # Some errors have .response.json()
+ response = getattr(current, "response", None)
+ if response is not None:
+ try:
+ json_body = response.json()
+ if isinstance(json_body, dict):
+ return json_body
+ except Exception:
+ pass
+ cause = getattr(current, "__cause__", None) or getattr(current, "__context__", None)
+ if cause is None or cause is current:
+ break
+ current = cause
return {}
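The cause-chain walk can be exercised with a wrapped exception. This is a simplified sketch: `extract_error_body` omits the `.response.json()` branch, and `SdkError` is a hypothetical exception class standing in for a provider SDK error.

```python
def extract_error_body(error: BaseException, max_depth: int = 5) -> dict:
    """Walk __cause__ / __context__ until a dict-valued .body is found."""
    current = error
    for _ in range(max_depth):
        body = getattr(current, "body", None)
        if isinstance(body, dict):
            return body
        cause = current.__cause__ or current.__context__
        if cause is None or cause is current:
            break
        current = cause
    return {}


class SdkError(Exception):  # hypothetical SDK exception carrying a body
    def __init__(self, message: str, body: dict) -> None:
        super().__init__(message)
        self.body = body


try:
    try:
        raise SdkError("429", {"error": {"message": "overloaded"}})
    except SdkError as inner:
        raise RuntimeError("request failed") from inner
except RuntimeError as outer:
    wrapped = outer  # keep a reference; `outer` is unbound after the block
```

Before the change, only the outermost exception was inspected, so a body carried by the wrapped `SdkError` would have been missed.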


@@ -388,14 +388,98 @@ def _sniff_mime_from_bytes(raw: bytes) -> Optional[str]:
# BMP: "BM"
if raw.startswith(b"BM"):
return "image/bmp"
- # HEIC/HEIF: ftypheic / ftypheix / ftypmif1 / ftypmsf1 etc.
- if len(raw) >= 12 and raw[4:8] == b"ftyp" and raw[8:12] in {
- b"heic", b"heix", b"hevc", b"hevx", b"mif1", b"msf1", b"heim", b"heis",
- }:
- return "image/heic"
+ # ISO-BMFF family (HEIC/HEIF/AVIF): bytes 4..8 == 'ftyp', major brand at 8..12
+ if len(raw) >= 12 and raw[4:8] == b"ftyp":
+ brand = raw[8:12]
+ if brand in {b"avif", b"avis"}:
+ return "image/avif"
+ if brand in {
+ b"heic", b"heix", b"hevc", b"hevx",
+ b"mif1", b"msf1", b"heim", b"heis",
+ }:
+ return "image/heic"
+ # TIFF: II*\0 (little-endian) or MM\0* (big-endian)
+ if raw[:4] in {b"II*\x00", b"MM\x00*"}:
+ return "image/tiff"
+ # ICO: 00 00 01 00 (reserved=0, type=1=icon)
+ if raw[:4] == b"\x00\x00\x01\x00":
+ return "image/x-icon"
+ # SVG: text-based, look for an <svg tag near the start (skip BOM/whitespace)
+ head = raw[:512].lstrip().lower()
+ if head.startswith(b"<?xml") or head.startswith(b"<svg"):
+ if b"<svg" in head:
+ return "image/svg+xml"
return None
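The ISO-BMFF branch of the sniffer can be tried on synthetic headers. The sketch below reproduces only that branch; `sniff_isobmff` is a hypothetical name, and the sample bytes are hand-built 12-byte prefixes rather than real image files.

```python
from typing import Optional

AVIF_BRANDS = {b"avif", b"avis"}
HEIC_BRANDS = {b"heic", b"heix", b"hevc", b"hevx",
               b"mif1", b"msf1", b"heim", b"heis"}


def sniff_isobmff(raw: bytes) -> Optional[str]:
    """Classify an ISO-BMFF file by the major brand in its ftyp box."""
    if len(raw) < 12 or raw[4:8] != b"ftyp":
        return None
    brand = raw[8:12]  # bytes 0..4 hold the box size, 8..12 the major brand
    if brand in AVIF_BRANDS:
        return "image/avif"
    if brand in HEIC_BRANDS:
        return "image/heic"
    return None


avif_header = b"\x00\x00\x00\x1c" + b"ftyp" + b"avif"
heic_header = b"\x00\x00\x00\x18" + b"ftyp" + b"heic"
mp4_header = b"\x00\x00\x00\x18" + b"ftyp" + b"isom"
```

An `isom` brand (typical of MP4 video) falls through to `None`, which is why the brand sets are checked explicitly instead of treating every `ftyp` box as an image.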
# Formats every major vision provider (Anthropic, OpenAI, Gemini, Bedrock)
# accepts natively. Anything outside this set has to be transcoded to PNG
# before we declare media_type, otherwise the provider returns HTTP 400
# ("Could not process image" / "Unsupported image media type") and the
# whole turn fails with no salvage path.
#
# Discord (and a few other chat platforms) freely accept attachments in
# formats outside this set -- AVIF screenshots from Chromium, HEIC from
# iPhones, TIFF from scanners, BMP from old Windows tools, ICO -- so users
# do hit this in practice. SVG is vector and Pillow cannot rasterize it;
# it is skipped (logged) rather than transcoded.
_UNIVERSALLY_SUPPORTED_MIMES = frozenset({
"image/png", "image/jpeg", "image/gif", "image/webp",
})
def _transcode_to_png(raw: bytes) -> Optional[bytes]:
"""Decode arbitrary image bytes with Pillow and re-encode as PNG.
Returns None if Pillow isn't installed or can't decode the input
(rare formats, corrupted bytes, missing optional decoder plugin for
HEIC/AVIF, or vector formats like SVG). Caller falls back to skipping
the image so the rest of the turn still works.
HEIC/HEIF and AVIF need optional Pillow plugins; we try to register
them on demand and swallow ImportError so a missing plugin just
looks like 'Pillow can't decode this' rather than crashing.
"""
try:
from PIL import Image
except ImportError:
logger.info(
"image_routing: Pillow not installed; cannot transcode "
"non-standard image format to PNG. Install with `pip install Pillow` "
"(and `pillow-heif` / `pillow-avif-plugin` for those formats)."
)
return None
# Optional plugin registration. Silent on failure: an unsupported
# format will just fall through to Image.open raising below.
try:
import pillow_heif # type: ignore
pillow_heif.register_heif_opener()
except Exception:
pass
try:
import pillow_avif # type: ignore # noqa: F401 -- registers AVIF on import
except Exception:
pass
try:
from io import BytesIO
with Image.open(BytesIO(raw)) as im:
# Pick an output mode PNG can serialise. Anything other than
# the standard set gets normalised to RGBA so transparency is
# preserved where the source had it.
if im.mode not in {"RGB", "RGBA", "L", "LA", "P"}:
im = im.convert("RGBA")
buf = BytesIO()
im.save(buf, format="PNG", optimize=False)
return buf.getvalue()
except Exception as exc:
logger.info(
"image_routing: Pillow could not transcode image to PNG -- %s", exc
)
return None
def _guess_mime(path: Path, raw: Optional[bytes] = None) -> str:
"""Return image MIME type for *path*.
@@ -431,8 +515,18 @@ def _file_to_data_url(path: Path) -> Optional[str]:
accept large images (OpenAI 49 MB+, Gemini 100 MB) don't pay a silent
quality tax just because one other provider is stricter.
- Returns None only if the file can't be read (missing, permission
- denied, etc.); the caller reports those paths in ``skipped``.
+ Format compatibility IS handled here: if the sniffed MIME isn't one
+ of ``_UNIVERSALLY_SUPPORTED_MIMES`` (i.e. it's something like AVIF,
+ HEIC, BMP, TIFF, or ICO that some providers reject outright), we
+ transcode to PNG with Pillow before declaring media_type. This fixes
+ the user-visible "Could not process image" HTTP 400 from Anthropic on
+ Discord-attached AVIF/HEIC/BMP files.
+ Returns None if the file can't be read OR if the format isn't
+ universally supported AND Pillow can't transcode it (Pillow missing,
+ HEIC/AVIF plugin missing, vector format like SVG, corrupt bytes). The
+ caller reports those paths in ``skipped`` and the rest of the turn
+ proceeds.
"""
try:
raw = path.read_bytes()
@@ -440,6 +534,22 @@ def _file_to_data_url(path: Path) -> Optional[str]:
logger.warning("image_routing: failed to read %s%s", path, exc)
return None
mime = _guess_mime(path, raw=raw)
if mime not in _UNIVERSALLY_SUPPORTED_MIMES:
transcoded = _transcode_to_png(raw)
if transcoded is None:
logger.warning(
"image_routing: %s is %s which is not accepted by all major "
"vision providers and could not be transcoded to PNG; "
"skipping this attachment.",
path, mime,
)
return None
logger.info(
"image_routing: transcoded %s (%s) -> image/png for provider compatibility",
path.name, mime,
)
raw = transcoded
mime = "image/png"
b64 = base64.b64encode(raw).decode("ascii")
return f"data:{mime};base64,{b64}"


@@ -8,6 +8,7 @@ iteration.
from __future__ import annotations
import hashlib
import logging
from concurrent.futures import ThreadPoolExecutor
from typing import Any
@@ -25,20 +26,112 @@ logger = logging.getLogger(__name__)
# opening dozens of sockets at once.
_MAX_REFERENCE_WORKERS = 8
# Per-tool-result character budget for the advisory reference view. Tool
# results can be huge (a full diff, a 5000-line file dump); replaying them
# verbatim per reference per tool-loop step would blow the reference model's
# context window and cost. We keep the agent's *actions* (tool calls) in full —
# they are cheap, high-signal, and tell the reference what the agent did — but
# preview each tool *result* head+tail so the reference still sees what came
# back without replaying megabytes. The acting aggregator always gets the full,
# untrimmed transcript; this budget only shapes the advisory copy.
_REFERENCE_TOOL_RESULT_BUDGET = 4000
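A head+tail preview of the kind described in the comment above might look like the following. The function that actually applies `_REFERENCE_TOOL_RESULT_BUDGET` is not shown on this page, so `preview_tool_result`, the even head/tail split, and the omission marker are all assumptions.

```python
def preview_tool_result(text: str, budget: int = 4000) -> str:
    """Keep the first and last budget/2 characters of an oversized result."""
    if len(text) <= budget:
        return text
    half = budget // 2
    omitted = len(text) - 2 * half
    # The marker tells the reference model that content was elided.
    return f"{text[:half]}\n[... {omitted} chars omitted ...]\n{text[-half:]}"


long_result = "a" * 3000 + "b" * 3000
preview = preview_tool_result(long_result)
```

The marker line makes the preview slightly longer than the budget; a stricter variant could subtract the marker length from the two halves.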
# System prompt prepended to every reference-model call. References are
# advisory — they do NOT act, call tools, or own the task. Without this
# framing a reference receives the bare trimmed conversation and assumes it is
# the acting agent: it then refuses ("I can't access repositories / URLs from
# here") or tries to call tools it doesn't have. The prompt reframes the model
# as an analyst whose job is to reason about the presented state and hand its
# best thinking to the aggregator/orchestrator that will actually act.
_REFERENCE_SYSTEM_PROMPT = (
"You are a reference advisor in a Mixture of Agents (MoA) process. You are "
"NOT the acting agent and you do NOT execute anything: you cannot call "
"tools, run commands, browse, or access files, repositories, or URLs, and "
"you should not try to or apologize for being unable to. A separate "
"aggregator/orchestrator model holds those capabilities and will take the "
"actual actions.\n\n"
"The conversation below is the current state of a task handled by that "
"acting agent. Your job is to give your most intelligent analysis of that "
"state: understand the goal, reason about the problem, and advise on what "
"to do next. Surface the best approach, concrete next steps and tool-use "
"strategy, likely pitfalls and risks, and anything the acting agent may "
"have missed or gotten wrong. Assume any referenced files, URLs, or "
"systems exist and reason about them from the context given rather than "
"asking for access.\n\n"
"Respond with your advice directly — no preamble, no disclaimers about "
"tools or access. Your response is private guidance handed to the "
"aggregator, not an answer shown to the user."
)
def _slot_label(slot: dict[str, str]) -> str:
return f"{slot.get('provider', '').strip()}:{slot.get('model', '').strip()}"
def _slot_runtime(slot: dict[str, str]) -> dict[str, Any]:
"""Resolve a reference/aggregator slot to real runtime call kwargs.
A MoA slot is just a model selection — it must be called the same way any
model is called elsewhere, not through a bare ``call_llm(provider=...,
model=...)`` that leaves base_url/api_key/api_mode unresolved and lets the
auxiliary auto-detector guess. We route the slot's provider through
``resolve_runtime_provider`` (the canonical provider→api_mode/base_url/
api_key resolver the CLI, gateway, and delegate_task all use), so the slot
gets its provider's real API surface — e.g. MiniMax → anthropic_messages,
GPT-5/o-series → max_completion_tokens, custom endpoints → their base_url.
Returns the kwargs to pass through to ``call_llm`` (provider/model plus the
resolved base_url/api_key when available). Falls back to the bare
provider/model on any resolution error so a misconfigured slot still
attempts the call rather than aborting the whole MoA turn.
"""
provider = str(slot.get("provider") or "").strip()
model = str(slot.get("model") or "").strip()
out: dict[str, Any] = {"provider": provider, "model": model}
try:
from hermes_cli.runtime_provider import resolve_runtime_provider
rt = resolve_runtime_provider(requested=provider, target_model=model)
resolved_provider = str(rt.get("provider") or provider).strip().lower()
# call_llm treats an explicit base_url as a custom endpoint. That is
# correct for ordinary OpenAI-compatible targets, but wrong for OAuth /
# provider-backed targets whose provider branch adds auth refresh,
# request metadata, or request-shape adapters. Keep those providers
# identified by name.
if resolved_provider in {"nous", "openai-codex", "xai-oauth"}:
return out
# Pass the resolved endpoint through so call_llm builds the request for
# the provider's actual API surface instead of auto-detecting. base_url
# routes call_llm to the right adapter (incl. anthropic_messages mode);
# api_key is the resolved credential for that provider.
if rt.get("base_url"):
out["base_url"] = rt["base_url"]
if rt.get("api_key"):
out["api_key"] = rt["api_key"]
except Exception as exc: # pragma: no cover - defensive
logger.debug("MoA slot runtime resolution failed for %s: %s", _slot_label(slot), exc)
return out
def _run_reference(
slot: dict[str, str],
ref_messages: list[dict[str, Any]],
*,
temperature: float,
max_tokens: int,
temperature: float | None = None,
max_tokens: int | None = None,
) -> tuple[str, str]:
"""Call one reference model and return ``(label, text)``.
The slot is resolved to its provider's real runtime (via ``_slot_runtime``)
and called through the same ``call_llm`` request-building path any model
uses, so per-model wire-format handling (anthropic_messages,
max_completion_tokens, fixed/forbidden temperature) applies identically to
a reference as it would if that model were the acting model. MoA imposes no
cap of its own (``max_tokens`` defaults to ``None`` → omitted → the model's
real maximum); ``temperature`` is only the user's configured preset value,
which call_llm may still override per model.
Never raises: a failed reference becomes a labelled note so the aggregator
can still act with partial context. Designed to run inside a thread pool —
``call_llm`` is synchronous/blocking, so threads (not asyncio) are the right
@@ -46,13 +139,17 @@ def _run_reference(
"""
label = _slot_label(slot)
try:
# Prepend the advisory-role system prompt so the reference understands
# it is analyzing state for an aggregator, not acting on the task. The
# trimmed view (_reference_messages) already strips the agent's own
# system prompt, so this is the only system message the reference sees.
messages = [{"role": "system", "content": _REFERENCE_SYSTEM_PROMPT}, *ref_messages]
response = call_llm(
task="moa_reference",
provider=slot["provider"],
model=slot["model"],
messages=ref_messages,
messages=messages,
temperature=temperature,
max_tokens=max_tokens,
**_slot_runtime(slot),
)
return label, _extract_text(response) or "(empty response)"
except Exception as exc:
@@ -64,8 +161,8 @@ def _run_references_parallel(
reference_models: list[dict[str, str]],
ref_messages: list[dict[str, Any]],
*,
temperature: float,
max_tokens: int,
temperature: float | None = None,
max_tokens: int | None = None,
) -> list[tuple[str, str]]:
"""Fan out all reference models in parallel, returning outputs in order.
@@ -106,40 +203,140 @@ def _run_references_parallel(
return [r for r in results if r is not None]
def _reference_messages(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
"""Build an advisory-safe view of the conversation for reference models.
def _truncate_tool_result(text: str, budget: int = _REFERENCE_TOOL_RESULT_BUDGET) -> str:
"""Head+tail preview of a tool result for the advisory view.
Reference calls are advisory: they never call tools and never emit the
``tool_calls`` the main model did. Replaying the full transcript verbatim
(a) re-bills the ~8K-token Hermes system prompt per reference per
iteration and (b) risks 400s from strict providers (Mistral, Fireworks)
that reject orphan ``tool`` messages or ``tool_calls`` the reference never
produced. We keep only the user/assistant *text* turns, dropping the
system prompt, any ``tool``-role messages, and any ``tool_calls`` payloads.
Keeps the first and last halves of the budget with a ``[... N chars
omitted ...]`` marker between them, so a reference sees both how the result
started and how it ended without replaying the whole payload.
"""
trimmed: list[dict[str, Any]] = []
if not text or len(text) <= budget:
return text
half = budget // 2
omitted = len(text) - 2 * half
return f"{text[:half]}\n[... {omitted} chars omitted ...]\n{text[-half:]}"
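A minimal standalone sketch of the head+tail preview described above. The function name `head_tail_preview` and the budget of 40 are illustrative assumptions; the real default, `_REFERENCE_TOOL_RESULT_BUDGET`, is not shown in this diff.

```python
def head_tail_preview(text: str, budget: int = 40) -> str:
    """Return text unchanged if it fits, else its head and tail around a marker."""
    if not text or len(text) <= budget:
        return text
    half = budget // 2
    # Characters dropped from the middle of the payload.
    omitted = len(text) - 2 * half
    return f"{text[:half]}\n[... {omitted} chars omitted ...]\n{text[-half:]}"


sample = "a" * 30 + "b" * 30 + "c" * 30  # 90 characters, hypothetical tool output
preview = head_tail_preview(sample, budget=40)
print(preview)
```

One edge worth keeping in mind: with a budget below 2, `half` is 0 and `text[-0:]` is the whole string, so the "tail" would repeat the full payload. A caller that allows tiny budgets may want to guard against that.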
def _render_tool_calls(tool_calls: Any) -> str:
"""Render an assistant turn's tool_calls as readable text lines.
The advisory view cannot carry real ``tool_calls`` payloads (strict
providers reject tool_calls the reference never produced), so the agent's
actions are flattened to text the reference can read and reason about.
"""
lines: list[str] = []
for tc in tool_calls or []:
fn = (tc.get("function") or {}) if isinstance(tc, dict) else {}
name = fn.get("name") or (tc.get("name") if isinstance(tc, dict) else "") or "tool"
args = fn.get("arguments")
if isinstance(args, str):
args_text = args
elif args is not None:
try:
import json
args_text = json.dumps(args, ensure_ascii=False)
except Exception:
args_text = str(args)
else:
args_text = ""
lines.append(f"[called tool: {name}({args_text})]" if args_text else f"[called tool: {name}]")
return "\n".join(lines)
def _reference_messages(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
"""Build an advisory view of the conversation for reference models.
A reference gives an INFORMED judgement on the current state, so it must
see what the agent actually did — its tool calls AND the tool results that
came back — not just the agent's narration. We therefore preserve the whole
conversation flow, but flatten it into clean user/assistant *text* turns:
- system prompt: dropped (8K of Hermes boilerplate, not advisory signal).
- assistant turns: kept; any ``tool_calls`` are rendered inline as
``[called tool: name(args)]`` text lines appended to the turn's text.
- ``tool``-role results: NOT dropped. Each is folded (head+tail preview,
see ``_truncate_tool_result``) into the *preceding* assistant turn as a
``[tool result: ...]`` block, so the reference sees what came back.
This emits ZERO ``tool``-role messages and ZERO ``tool_calls`` arrays — only
plain user/assistant text — so strict providers (Mistral, Fireworks) that
reject orphan tool messages / unproduced tool_calls don't 400, while the
reference still has the full picture.
The view MUST end with a ``user`` turn. Anthropic (and OpenRouter→Anthropic)
interpret a trailing assistant turn as an assistant *prefill* to continue,
and no-prefill models (e.g. Claude Opus 4.8) reject it with
``400 ... must end with a user message``. Rather than DELETE the agent's
latest context to satisfy that (which would blind the reference to the
current state), we APPEND a synthetic user turn asking the reference to
judge the state above. End-on-user is satisfied and no context is lost.
The acting aggregator always receives the full, untrimmed transcript; this
function only shapes the disposable advisory copy.
"""
advisory_instruction = (
"[The conversation above is the current state of the task. Give your "
"most intelligent judgement: what is going on, what should happen next, "
"what risks or mistakes you see, and how the acting agent should "
"proceed.]"
)
rendered: list[dict[str, Any]] = []
last_user_content: str | None = None
for msg in messages:
role = msg.get("role")
if role not in ("user", "assistant"):
# Drop system prompt and tool-result messages.
continue
content = msg.get("content")
if not isinstance(content, str):
# Skip non-text (multimodal/tool-call-only) assistant turns.
if not content:
continue
text = content if isinstance(content, str) else ""
if role == "assistant" and not text.strip():
# Assistant turn that was purely tool calls — nothing advisory.
if role == "system":
continue
trimmed.append({"role": role, "content": text})
if not trimmed:
# Degenerate case (e.g. first turn was stripped): fall back to a
# minimal user turn so the reference still has something to answer.
if role == "user":
if text.strip():
last_user_content = text
rendered.append({"role": "user", "content": text})
elif role == "assistant":
parts: list[str] = []
if text.strip():
parts.append(text.strip())
calls_text = _render_tool_calls(msg.get("tool_calls"))
if calls_text:
parts.append(calls_text)
# Empty assistant turns (no text, no calls) carry nothing advisory.
if parts:
rendered.append({"role": "assistant", "content": "\n".join(parts)})
elif role == "tool":
# Fold the tool result into the preceding assistant turn as text so
# the reference sees what came back, without emitting a tool-role
# message a reference never produced.
result_text = _truncate_tool_result(text)
block = f"[tool result: {result_text}]"
if rendered and rendered[-1].get("role") == "assistant":
rendered[-1]["content"] = rendered[-1]["content"] + "\n" + block
else:
# No assistant turn to attach to (e.g. a leading tool result);
# keep it as advisory context on its own assistant-role line.
rendered.append({"role": "assistant", "content": block})
# Any other role is ignored.
# End on a user turn: append a synthetic advisory request rather than
# deleting the agent's latest assistant context. This satisfies Anthropic's
# no-trailing-assistant-prefill rule while preserving full state.
if rendered and rendered[-1].get("role") == "assistant":
rendered.append({"role": "user", "content": advisory_instruction})
elif rendered and rendered[-1].get("role") == "user":
# Already ends on a user turn (fresh user prompt, no agent action yet).
# Leave it — the reference answers that prompt directly.
pass
if not rendered:
# Degenerate case: nothing rendered. Fall back to the latest user turn.
if last_user_content is not None:
return [{"role": "user", "content": last_user_content}]
for msg in reversed(messages):
if msg.get("role") == "user" and isinstance(msg.get("content"), str):
return [{"role": "user", "content": msg["content"]}]
return trimmed
return rendered
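The flattening rules in the docstring above can be sketched in isolation as follows. This is a simplified illustration, not the function in this diff: `flatten_for_advisor` and `ADVISORY_REQUEST` are hypothetical names, tool results are not truncated, and the degenerate-case fallback is omitted.

```python
import json
from typing import Any

ADVISORY_REQUEST = "[Judge the state above and advise on next steps.]"  # assumed wording


def flatten_for_advisor(messages: list[dict[str, Any]]) -> list[dict[str, str]]:
    """Flatten a transcript into plain user/assistant text turns."""
    rendered: list[dict[str, str]] = []
    for msg in messages:
        role = msg.get("role")
        content = msg.get("content")
        text = content if isinstance(content, str) else ""
        if role == "user":
            if text.strip():
                rendered.append({"role": "user", "content": text})
        elif role == "assistant":
            parts = [text.strip()] if text.strip() else []
            for tc in msg.get("tool_calls") or []:
                fn = tc.get("function") or {}
                name = fn.get("name") or "tool"
                args = fn.get("arguments")
                if args is not None and not isinstance(args, str):
                    args = json.dumps(args, ensure_ascii=False)
                parts.append(f"[called tool: {name}({args})]" if args else f"[called tool: {name}]")
            if parts:
                rendered.append({"role": "assistant", "content": "\n".join(parts)})
        elif role == "tool":
            # Fold the result into the preceding assistant turn as text.
            block = f"[tool result: {text}]"
            if rendered and rendered[-1]["role"] == "assistant":
                rendered[-1]["content"] += "\n" + block
            else:
                rendered.append({"role": "assistant", "content": block})
        # system and any other role are dropped.
    # End on a user turn without deleting the latest assistant context.
    if rendered and rendered[-1]["role"] == "assistant":
        rendered.append({"role": "user", "content": ADVISORY_REQUEST})
    return rendered


transcript = [
    {"role": "system", "content": "You are Hermes."},
    {"role": "user", "content": "List the files."},
    {"role": "assistant", "content": None,
     "tool_calls": [{"function": {"name": "ls", "arguments": {"path": "."}}}]},
    {"role": "tool", "content": "a.txt\nb.txt"},
]
view = flatten_for_advisor(transcript)
for turn in view:
    print(turn["role"], "->", turn["content"])
```

The resulting view carries no `tool` role and no `tool_calls` key, which is the property the docstring relies on for strict providers, while the tool call and its result remain readable as text.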
@@ -169,12 +366,18 @@ def aggregate_moa_context(
aggregator: dict[str, str],
temperature: float = 0.6,
aggregator_temperature: float = 0.4,
max_tokens: int = 4096,
max_tokens: int | None = None,
) -> str:
"""Run configured reference models and synthesize their advice.
Failures are returned as model-specific notes instead of aborting the normal
agent loop; the main model can still act with partial context.
``max_tokens`` is ``None`` by default: MoA does not cap reference or
aggregator output, so each model uses its own maximum. ``call_llm`` omits
the parameter entirely when it is ``None`` (see its docstring), which also
sidesteps providers that reject ``max_tokens`` outright. A hardcoded cap
here previously truncated long aggregator syntheses.
"""
reference_outputs: list[tuple[str, str]] = []
ref_messages = _reference_messages(api_messages)
@@ -203,11 +406,10 @@ def aggregate_moa_context(
try:
response = call_llm(
task="moa_aggregator",
provider=aggregator["provider"],
model=aggregator["model"],
messages=[{"role": "user", "content": synth_prompt}],
temperature=aggregator_temperature,
max_tokens=max_tokens,
**_slot_runtime(aggregator),
)
synthesis = _extract_text(response)
except Exception as exc:
@@ -230,8 +432,38 @@ def aggregate_moa_context(
class MoAChatCompletions:
"""OpenAI-chat-compatible facade where the aggregator is the acting model."""
def __init__(self, preset_name: str):
def __init__(self, preset_name: str, reference_callback: Any = None):
self.preset_name = preset_name or "default"
# Optional display hook. Called as reference outputs become available so
# frontends can show each reference model's answer as a labelled block
# before the aggregator acts. Signature:
# reference_callback(event, **kwargs)
# where event is one of:
# "moa.reference" kwargs: index, count, label, text
# "moa.aggregating" kwargs: aggregator (label), ref_count
# Never raises into the model call — display is best-effort.
self.reference_callback = reference_callback
# State-scoped reference cache. The agent loop calls create() once per
# tool-loop iteration; references should re-run whenever the task STATE
# advances — i.e. on every new user message AND every new tool result —
# so each reference judges the latest state. The advisory view
# (_reference_messages) now renders tool calls + results as text, so its
# signature changes on every new tool response; the cache key is that
# signature, so a new tool result is a cache MISS (references re-run)
# while a redundant create() call with identical state is a HIT (no
# re-run, no re-emit). This gives "fire on every user/tool response"
# for free, without re-firing on a pure no-op re-call.
self._ref_cache_key: tuple | None = None
self._ref_cache_outputs: list[tuple[str, str]] = []
def _emit(self, event: str, **kwargs: Any) -> None:
cb = self.reference_callback
if cb is None:
return
try:
cb(event, **kwargs)
except Exception as exc: # pragma: no cover - display must never break the turn
logger.debug("MoA reference_callback failed for %s: %s", event, exc)
def create(self, **api_kwargs: Any) -> Any:
from hermes_cli.config import load_config
@@ -241,7 +473,10 @@ class MoAChatCompletions:
messages = list(api_kwargs.get("messages") or [])
reference_models = preset.get("reference_models") or []
aggregator = preset.get("aggregator") or {}
max_tokens = int(preset.get("max_tokens", api_kwargs.get("max_tokens") or 4096) or 4096)
# MoA does not cap reference or aggregator output: each model uses its
# own maximum. Passing max_tokens=None makes call_llm omit the parameter
# (it never caps by default), so a long aggregator synthesis is never
# truncated and providers that reject max_tokens don't 400.
temperature = float(preset.get("reference_temperature", 0.6) or 0.6)
aggregator_temperature = float(preset.get("aggregator_temperature", api_kwargs.get("temperature") or 0.4) or 0.4)
@@ -253,12 +488,52 @@ class MoAChatCompletions:
reference_outputs: list[tuple[str, str]] = []
ref_messages = _reference_messages(messages)
reference_outputs = _run_references_parallel(
reference_models,
ref_messages,
temperature=temperature,
max_tokens=max_tokens,
)
# State-scoped cache: only run + display references when the advisory
# view changed (a new user turn or a new tool result). A repeated
# create() call with an identical advisory view reuses the cached
# outputs and skips both the re-run and the re-emit.
_sig = hashlib.sha256(
"\u0000".join(
f"{m.get('role')}:{m.get('content')}" for m in ref_messages
).encode("utf-8", "replace")
).hexdigest()
_cache_key = (self.preset_name, _sig, tuple(_slot_label(s) for s in reference_models))
_refs_from_cache = _cache_key == self._ref_cache_key and bool(self._ref_cache_outputs)
if _refs_from_cache:
reference_outputs = list(self._ref_cache_outputs)
else:
reference_outputs = _run_references_parallel(
reference_models,
ref_messages,
temperature=temperature,
max_tokens=None,
)
self._ref_cache_key = _cache_key
self._ref_cache_outputs = list(reference_outputs)
# Surface each reference model's answer to the display BEFORE the
# aggregator acts, once per state change (only on the iteration that
# actually ran them). The user sees one labelled block per
# reference (rendered like a thinking block) so the MoA process is
# visible rather than a silent pause. Best-effort: never blocks the
# turn.
_ref_count = len(reference_outputs)
for _idx, (_label, _text) in enumerate(reference_outputs, start=1):
self._emit(
"moa.reference",
index=_idx,
count=_ref_count,
label=_label,
text=_text,
)
if _ref_count:
self._emit(
"moa.aggregating",
aggregator=_slot_label(aggregator),
ref_count=_ref_count,
)
agg_messages = [dict(m) for m in messages]
if reference_outputs:
@@ -286,21 +561,26 @@ class MoAChatCompletions:
raise RuntimeError("MoA aggregator cannot be another MoA preset")
agg_kwargs = dict(api_kwargs)
agg_kwargs["messages"] = agg_messages
agg_kwargs["model"] = aggregator.get("model")
agg_kwargs["temperature"] = aggregator_temperature
# The aggregator is the acting model. Resolve its slot to the provider's
# real runtime (base_url/api_key/api_mode) and call it through the same
# request-building path any model uses — so per-model wire-format
# handling (anthropic_messages, max_completion_tokens, fixed/forbidden
# temperature) applies identically to it. MoA imposes no output cap:
# max_tokens is passed through from the caller (normally None → omitted
# → the model's real maximum). The preset's old hardcoded 4096 default
# is gone — it truncated long syntheses.
return call_llm(
task="moa_aggregator",
provider=aggregator.get("provider"),
model=aggregator.get("model"),
messages=agg_messages,
temperature=aggregator_temperature,
max_tokens=agg_kwargs.get("max_tokens"),
tools=agg_kwargs.get("tools"),
extra_body=agg_kwargs.get("extra_body"),
**_slot_runtime(aggregator),
)
class MoAClient:
def __init__(self, preset_name: str):
def __init__(self, preset_name: str, reference_callback: Any = None):
self.chat = type("_MoAChat", (), {})()
self.chat.completions = MoAChatCompletions(preset_name)
self.chat.completions = MoAChatCompletions(preset_name, reference_callback=reference_callback)
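The state-scoped reference cache in `MoAChatCompletions.create` can be sketched on its own as below. `ReferenceCache`, `advisory_signature`, and `fake_run` are hypothetical names introduced for illustration; the hashing scheme mirrors the one in the diff (SHA-256 over NUL-joined `role:content` pairs).

```python
import hashlib
from typing import Any, Callable, Optional


def advisory_signature(ref_messages: list[dict[str, Any]]) -> str:
    """Hash the advisory view so any new text (e.g. a tool result) changes the key."""
    joined = "\u0000".join(f"{m.get('role')}:{m.get('content')}" for m in ref_messages)
    return hashlib.sha256(joined.encode("utf-8", "replace")).hexdigest()


class ReferenceCache:
    """Single-entry cache: reuse outputs only while the advisory state is identical."""

    def __init__(self) -> None:
        self.key: Optional[tuple] = None
        self.outputs: list[tuple[str, str]] = []

    def get_or_run(
        self,
        preset: str,
        ref_messages: list[dict[str, Any]],
        labels: list[str],
        run: Callable[[], list[tuple[str, str]]],
    ) -> tuple[list[tuple[str, str]], bool]:
        key = (preset, advisory_signature(ref_messages), tuple(labels))
        if key == self.key and self.outputs:
            return list(self.outputs), True  # hit: no re-run
        outputs = run()
        self.key = key
        self.outputs = list(outputs)
        return list(outputs), False


calls: list[int] = []


def fake_run() -> list[tuple[str, str]]:
    calls.append(1)
    return [("prov:model-a", "advice")]


cache = ReferenceCache()
state1 = [{"role": "user", "content": "fix the bug"}]
state2 = state1 + [
    {"role": "assistant", "content": "[called tool: ls]\n[tool result: a.txt]"},
    {"role": "user", "content": "[advise]"},
]
_, hit_first = cache.get_or_run("default", state1, ["prov:model-a"], fake_run)
_, hit_repeat = cache.get_or_run("default", state1, ["prov:model-a"], fake_run)
_, hit_new_state = cache.get_or_run("default", state2, ["prov:model-a"], fake_run)
```

As in the diff, an empty output list is never treated as a hit, so a state whose references all produced nothing would be re-run on the next call.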

@@ -478,6 +478,16 @@ def _infer_provider_from_url(base_url: str) -> Optional[str]:
return None
def _lmstudio_server_root(base_url: str) -> str:
"""Return the LM Studio server root for native ``/api/v1`` endpoints."""
root = _normalize_base_url(base_url).rstrip("/")
for suffix in ("/api/v1", "/api", "/v1"):
if root.endswith(suffix):
root = root[: -len(suffix)].rstrip("/")
break
return root
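A small sketch of why the suffix list matters, assuming `_normalize_base_url` does little more than trim the string (its body is not shown in this diff). `old_root` reproduces the earlier `/v1`-only strip used at the call sites; both function names are hypothetical.

```python
def lmstudio_server_root(base_url: str) -> str:
    """Strip a trailing /api/v1, /api, or /v1 so /api/v1/models can be appended once."""
    root = base_url.strip().rstrip("/")  # stand-in for _normalize_base_url
    for suffix in ("/api/v1", "/api", "/v1"):
        if root.endswith(suffix):
            root = root[: -len(suffix)].rstrip("/")
            break
    return root


def old_root(base_url: str) -> str:
    """Earlier behaviour: only a trailing /v1 was removed."""
    server_url = base_url.rstrip("/")
    if server_url.endswith("/v1"):
        server_url = server_url[:-3]
    return server_url


native = "http://localhost:1234/api/v1"
print(old_root(native) + "/api/v1/models")              # doubled /api segment
print(lmstudio_server_root(native) + "/api/v1/models")  # single /api/v1/models
```

Checking the longest suffix first is what keeps `/api/v1` from being reduced to `/api` by the `/v1` rule.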
def _is_known_provider_base_url(base_url: str) -> bool:
return _infer_provider_from_url(base_url) is not None
@@ -549,6 +559,7 @@ def detect_local_server_type(base_url: str, api_key: str = "") -> Optional[str]:
server_url = normalized
if server_url.endswith("/v1"):
server_url = server_url[:-3]
lmstudio_url = _lmstudio_server_root(base_url)
headers = _auth_headers(api_key)
@@ -556,7 +567,7 @@ def detect_local_server_type(base_url: str, api_key: str = "") -> Optional[str]:
with httpx.Client(timeout=2.0, headers=headers) as client:
# LM Studio exposes /api/v1/models — check first (most specific)
try:
r = client.get(f"{server_url}/api/v1/models")
r = client.get(f"{lmstudio_url}/api/v1/models")
if r.status_code == 200:
return "lm-studio"
except Exception:
@@ -774,7 +785,7 @@ def fetch_endpoint_model_metadata(
if is_local_endpoint(normalized):
try:
if detect_local_server_type(normalized, api_key=api_key) == "lm-studio":
server_url = normalized[:-3].rstrip("/") if normalized.endswith("/v1") else normalized
server_url = _lmstudio_server_root(normalized)
response = requests.get(
server_url.rstrip("/") + "/api/v1/models",
headers=headers,
@@ -1297,6 +1308,7 @@ def _query_local_context_length(model: str, base_url: str, api_key: str = "") ->
server_url = base_url.rstrip("/")
if server_url.endswith("/v1"):
server_url = server_url[:-3]
lmstudio_url = _lmstudio_server_root(base_url)
headers = _auth_headers(api_key)
@@ -1340,7 +1352,7 @@ def _query_local_context_length(model: str, base_url: str, api_key: str = "") ->
# Use _model_id_matches for fuzzy matching: LM Studio stores models as
# "publisher/slug" but users configure only "slug" after "local:" prefix.
if server_type == "lm-studio":
resp = client.get(f"{server_url}/api/v1/models")
resp = client.get(f"{lmstudio_url}/api/v1/models")
if resp.status_code == 200:
data = resp.json()
for m in data.get("models", []):
@@ -1646,6 +1658,34 @@ def get_model_context_length(
if config_context_length is not None and isinstance(config_context_length, int) and config_context_length > 0:
return config_context_length
# 0a. MoA virtual provider — ``model`` is a preset name, not a real model,
# and ``base_url`` is the local virtual endpoint, so every probe below would
# miss and fall through to the 256K default. The aggregator is the acting
# model, so resolve the context window from the aggregator slot's real
# provider+model instead. References are advisory-only and never bound the
# acting context, so they're ignored here.
if (provider or "").strip().lower() == "moa":
try:
from hermes_cli.config import load_config
from hermes_cli.moa_config import resolve_moa_preset
from hermes_cli.runtime_provider import resolve_runtime_provider
preset = resolve_moa_preset(load_config().get("moa") or {}, model)
agg = preset.get("aggregator") or {}
agg_provider = str(agg.get("provider") or "").strip()
agg_model = str(agg.get("model") or "").strip()
if agg_model and agg_provider and agg_provider.lower() != "moa":
rt = resolve_runtime_provider(requested=agg_provider, target_model=agg_model)
return get_model_context_length(
agg_model,
base_url=rt.get("base_url", "") or "",
api_key=rt.get("api_key", "") or "",
provider=agg_provider,
)
except Exception:
logger.debug("MoA aggregator context-length resolution failed", exc_info=True)
# Fall through to the generic default if aggregator resolution failed.
# 0b. custom_providers per-model override — check before any probe.
# This closes the gap where /model switch and display paths used to fall
# back to 128K despite the user having a per-model context_length set.

@@ -26,7 +26,7 @@ from __future__ import annotations
import os
import sys
import urllib.request
from typing import Optional
from typing import Any, Optional
from utils import base_url_hostname, normalize_proxy_url
@@ -142,6 +142,46 @@ def _get_proxy_for_base_url(base_url: Optional[str]) -> Optional[str]:
return proxy
def build_keepalive_http_client(
base_url: str = "",
*,
async_mode: bool = False,
) -> Optional[Any]:
"""Build an httpx client for OpenAI SDK calls with env-only proxy policy.
Uses explicit ``HTTPS_PROXY`` / ``NO_PROXY`` env vars via
``_get_proxy_for_base_url``. A custom transport disables httpx's default
``trust_env`` path, so macOS system proxy settings from
``urllib.request.getproxies()`` (which omit the ExceptionsList) are not
applied. Mirrors ``AIAgent._build_keepalive_http_client``.
"""
try:
import httpx
import socket
if "api.githubcopilot.com" in str(base_url or "").lower():
client_cls = httpx.AsyncClient if async_mode else httpx.Client
return client_cls()
sock_opts = [(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)]
if hasattr(socket, "TCP_KEEPIDLE"):
sock_opts.append((socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 30))
sock_opts.append((socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10))
sock_opts.append((socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 3))
elif hasattr(socket, "TCP_KEEPALIVE"):
sock_opts.append((socket.IPPROTO_TCP, socket.TCP_KEEPALIVE, 30))
proxy = _get_proxy_for_base_url(base_url)
transport_cls = httpx.AsyncHTTPTransport if async_mode else httpx.HTTPTransport
client_cls = httpx.AsyncClient if async_mode else httpx.Client
return client_cls(
transport=transport_cls(socket_options=sock_opts),
proxy=proxy,
)
except Exception:
return None
def _install_safe_stdio() -> None:
"""Wrap stdout/stderr so best-effort console output cannot crash the agent."""
for stream_name in ("stdout", "stderr"):
@@ -164,4 +204,5 @@ __all__ = [
"_install_safe_stdio",
"_get_proxy_from_env",
"_get_proxy_for_base_url",
"build_keepalive_http_client",
]

@@ -88,12 +88,15 @@ def _find_hermes_md(cwd: Path) -> Optional[Path]:
stop_at = _find_git_root(cwd)
current = cwd.resolve()
for directory in [current, *current.parents]:
# When there is no git root, only check cwd itself; walking parents
# could pick up a .hermes.md planted in /tmp, /home, etc.
search_dirs = [current, *current.parents] if stop_at else [current]
for directory in search_dirs:
for name in _HERMES_MD_NAMES:
candidate = directory / name
if candidate.is_file():
return candidate
# Stop walking at the git root (or filesystem root).
if stop_at and directory == stop_at:
break
return None
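The bounded search can be exercised with a throwaway directory tree, as in the sketch below. To keep it deterministic, `stop_at` is passed in rather than discovered via `_find_git_root`, and the `NAMES` tuple is an assumption (the real `_HERMES_MD_NAMES` is not shown here).

```python
import tempfile
from pathlib import Path
from typing import Optional

NAMES = (".hermes.md", "HERMES.md")  # assumed file names


def find_hermes_md(cwd: Path, stop_at: Optional[Path]) -> Optional[Path]:
    """Walk up to stop_at when a git root is known; otherwise check cwd only."""
    current = cwd.resolve()
    search_dirs = [current, *current.parents] if stop_at else [current]
    for directory in search_dirs:
        for name in NAMES:
            candidate = directory / name
            if candidate.is_file():
                return candidate
        # Stop walking at the git root.
        if stop_at and directory == stop_at:
            break
    return None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    deep = root / "pkg" / "sub"
    deep.mkdir(parents=True)
    (root / ".hermes.md").write_text("project notes")

    found_with_root = find_hermes_md(deep, stop_at=root)    # parent walk allowed
    found_without_root = find_hermes_md(deep, stop_at=None)  # cwd only
    found_name = found_with_root.name if found_with_root else None
```

Without a git root the file two levels up is ignored, which is the behaviour the change is after: a stray `.hermes.md` in a shared parent directory is no longer picked up.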
@@ -617,7 +620,12 @@ DEVELOPER_ROLE_MODELS = ("gpt-5", "codex")
PLATFORM_HINTS = {
"whatsapp": (
"You are on a text messaging communication platform, WhatsApp. "
"Please do not use markdown as it does not render. "
"Standard markdown (**bold**, *italic*, ~~strike~~, # headers, "
"`code`, ```code blocks```, [links](url)) is auto-converted to "
"WhatsApp's native syntax (*bold*, _italic_, ~strike~, monospace) — "
"feel free to write in markdown, and use bullet lists ('- item') "
"freely. Tables are NOT supported — prefer bullet lists or labeled "
"key:value pairs. "
"You can send media files natively: to deliver a file to the user, "
"include MEDIA:/absolute/path/to/file in your response. The file "
"will be sent as a native WhatsApp attachment — images (.jpg, .png, "
@@ -682,7 +690,11 @@ PLATFORM_HINTS = {
),
"signal": (
"You are on a text messaging communication platform, Signal. "
"Please do not use markdown as it does not render. "
"Standard markdown (**bold**, *italic*, ~~strike~~, # headers, "
"`code`, ```code blocks```) is auto-converted to Signal's native "
"rich formatting — feel free to write in markdown, and use bullet "
"lists ('- item') freely (they render as • bullets). Tables are NOT "
"supported — prefer bullet lists or labeled key:value pairs. "
"You can send media files natively: to deliver a file to the user, "
"include MEDIA:/absolute/path/to/file in your response. Images "
"(.png, .jpg, .webp) appear as photos, audio as attachments, and other "
@@ -917,8 +929,7 @@ def _probe_remote_backend(env_type: str) -> str | None:
try:
# Import locally: tools/ imports are heavy and only relevant when a
# non-local backend is actually configured.
from tools.terminal_tool import _get_env_config # type: ignore
from tools.environments import get_environment # type: ignore
from tools.terminal_tool import _create_environment, _get_env_config # type: ignore
except Exception as e:
logger.debug("Backend probe unavailable (import failed): %s", e)
_BACKEND_PROBE_CACHE[cache_key] = ""
@@ -926,7 +937,59 @@ def _probe_remote_backend(env_type: str) -> str | None:
try:
config = _get_env_config()
env = get_environment(config)
# Build the environment the same way tools/terminal_tool.py does for a
# live command: select the backend image, then assemble ssh/container
# config from the env-derived dict. (There is no `get_environment`
# factory — the real entry point is `_create_environment`.)
if env_type == "docker":
image = config.get("docker_image", "")
elif env_type == "singularity":
image = config.get("singularity_image", "")
elif env_type == "modal":
image = config.get("modal_image", "")
elif env_type == "daytona":
image = config.get("daytona_image", "")
else:
image = ""
ssh_config = None
if env_type == "ssh":
ssh_config = {
"host": config.get("ssh_host", ""),
"user": config.get("ssh_user", ""),
"port": config.get("ssh_port", 22),
"key": config.get("ssh_key", ""),
"persistent": config.get("ssh_persistent", False),
}
container_config = None
if env_type in {"docker", "singularity", "modal", "daytona"}:
container_config = {
"container_cpu": config.get("container_cpu", 1),
"container_memory": config.get("container_memory", 5120),
"container_disk": config.get("container_disk", 51200),
"container_persistent": config.get("container_persistent", True),
"modal_mode": config.get("modal_mode", "auto"),
"docker_volumes": config.get("docker_volumes", []),
"docker_mount_cwd_to_workspace": config.get("docker_mount_cwd_to_workspace", False),
"docker_forward_env": config.get("docker_forward_env", []),
"docker_env": config.get("docker_env", {}),
"docker_run_as_host_user": config.get("docker_run_as_host_user", False),
"docker_extra_args": config.get("docker_extra_args", []),
"docker_persist_across_processes": config.get("docker_persist_across_processes", True),
"docker_orphan_reaper": config.get("docker_orphan_reaper", True),
}
env = _create_environment(
env_type=env_type,
image=image,
cwd=config.get("cwd", ""),
timeout=config.get("timeout", 180),
ssh_config=ssh_config,
container_config=container_config,
task_id="prompt-backend-probe",
host_cwd=config.get("host_cwd"),
)
# Single-line POSIX probe — works on any Unixy backend. Wrapped in
# `2>/dev/null` so a missing binary doesn't pollute the output.
probe_cmd = (

@@ -10,6 +10,7 @@ the first 6 and last 4 characters for debuggability.
import logging
import os
import re
import shlex
logger = logging.getLogger(__name__)
@@ -107,12 +108,60 @@ _PREFIX_PATTERNS = [
r"ntn_[A-Za-z0-9]{10,}", # Notion internal integration token
]
# ENV assignment patterns: KEY=value where KEY contains a secret-like name
# ENV assignment patterns: KEY=value where KEY contains a secret-like name.
# Uppercase keys tolerate spaces around "=" (e.g. ``FOO_SECRET = bar``) because
# an all-caps key is almost never prose/code.
_SECRET_ENV_NAMES = r"(?:API_?KEY|TOKEN|SECRET|PASSWORD|PASSWD|CREDENTIAL|AUTH)"
_ENV_ASSIGN_RE = re.compile(
rf"([A-Z0-9_]{{0,50}}{_SECRET_ENV_NAMES}[A-Z0-9_]{{0,50}})\s*=\s*(['\"]?)(\S+)\2",
)
# Lowercase / dotted / hyphenated config keys from config files
# (application.properties, .env, YAML-ish dumps): ``spring.datasource.password=secret``,
# ``app.api.key=xyz``, ``password=secret``. The uppercase _ENV_ASSIGN_RE above
# never matched these, so config-file passwords leaked verbatim (issue #16413).
#
# These run only in a config-file context, NOT in prose, code, or URLs — three
# carve-outs preserved from the original design (#4367 + the documented
# web-URL passthrough below):
# 1. The value is bounded by ``[^\s&]`` (stops at whitespace AND ``&``) so
# form-urlencoded bodies are handled pair-by-pair (by _redact_form_body),
# not greedily swallowed.
# 2. _CFG_DOTTED_RE only matches when the key is NAMESPACED (contains a dot),
# which is unambiguously a config key — never a prose word.
# 3. _CFG_ANCHORED_RE matches a bare secret-word key only at line start
# (optionally after ``export``), so conversational ``I have password=foo``
# mid-sentence is left alone.
# The colon-form URL guard (skip when ``://`` present) lives at the call site.
_SECRET_CFG_NAMES = r"(?:api[ _.\-]?key|token|secret|passwd|password|credential|auth)"
_CFG_VALUE = r"(['\"]?)([^\s&]+?)\2(?=[\s&]|$)"
# Namespaced (dotted) key: the secret word may sit anywhere in a dotted path.
_CFG_DOTTED_RE = re.compile(
rf"((?:[A-Za-z0-9_\-]+\.)+[A-Za-z0-9_.\-]*{_SECRET_CFG_NAMES}[A-Za-z0-9_.\-]*"
rf"|[A-Za-z0-9_.\-]*{_SECRET_CFG_NAMES}[A-Za-z0-9_.\-]*\.[A-Za-z0-9_.\-]+)"
rf"={_CFG_VALUE}",
re.IGNORECASE,
)
# Line-anchored bare key: ``password=…`` / ``export api_key=…`` at start of line.
_CFG_ANCHORED_RE = re.compile(
rf"(^[ \t]*(?:export[ \t]+)?[A-Za-z0-9_\-]*{_SECRET_CFG_NAMES}[A-Za-z0-9_\-]*)={_CFG_VALUE}",
re.IGNORECASE | re.MULTILINE,
)
# Unquoted YAML / colon config (e.g. ``password: secret``,
# ``spring.datasource.password: hunter2``). The secret keyword must be part of
# the KEY (anchored to the start of the line/indent), and the value is a single
# whitespace-free token — so prose like ``note: secret meeting`` (keyword in the
# value) and ``error: token expired`` are left alone. Bare ``auth`` is excluded
# from the key set so ``Authorization:`` / ``author:`` don't match (the former
# is masked by _AUTH_HEADER_RE); ``auth_token``/``auth-token`` still match via
# the ``token`` keyword. Quoted values defer to _JSON_FIELD_RE via the lookahead.
_YAML_CFG_NAMES = r"(?:api[ _.\-]?key|token|secret|passwd|password|credential)"
_YAML_ASSIGN_RE = re.compile(
rf"(^[ \t]*[A-Za-z0-9_.\-]*{_YAML_CFG_NAMES}[A-Za-z0-9_.\-]*)(:[ \t]*)(?!['\"])([^\s&]+)",
re.IGNORECASE | re.MULTILINE,
)
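A quick check of the colon-form pattern against config lines and prose. The two regex definitions are copied from the lines above; `mask_yaml` and its `***` replacement are hypothetical stand-ins for the module's real masking, which is not shown in this hunk.

```python
import re

YAML_CFG_NAMES = r"(?:api[ _.\-]?key|token|secret|passwd|password|credential)"
YAML_ASSIGN_RE = re.compile(
    rf"(^[ \t]*[A-Za-z0-9_.\-]*{YAML_CFG_NAMES}[A-Za-z0-9_.\-]*)(:[ \t]*)(?!['\"])([^\s&]+)",
    re.IGNORECASE | re.MULTILINE,
)


def mask_yaml(text: str) -> str:
    """Replace the value of a secret-named key; keep the key and separator."""
    return YAML_ASSIGN_RE.sub(lambda m: m.group(1) + m.group(2) + "***", text)


config = "spring.datasource.password: hunter2\nauth_token: abc123\n"
prose = "note: secret meeting\nerror: token expired\nAuthorization: opaque\n"
quoted = 'password: "s3cret"\n'

print(mask_yaml(config))
print(mask_yaml(prose))
print(mask_yaml(quoted))
```

The prose lines survive because the keyword has to appear in the key, which is anchored to the start of the line. The quoted value is skipped by the negative lookahead, leaving it to the JSON-field rule as the comment describes.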
# JSON field patterns: "apiKey": "value", "token": "value", etc.
_JSON_KEY_NAMES = r"(?:api_?[Kk]ey|token|secret|password|access_token|refresh_token|auth_token|bearer|secret_value|raw_secret|secret_input|key_material)"
_JSON_FIELD_RE = re.compile(
@@ -125,8 +174,15 @@ _JSON_FIELD_RE = re.compile(
# while the header name and scheme word are preserved for debuggability. The
# previous rule only matched ``Bearer``, so ``Basic <base64 user:pass>`` and
# ``token <pat>`` leaked verbatim into logs/transcripts.
#
# The credential class excludes quote characters (``"`` / ``'``): a token sitting
# flush against a closing quote (``"Authorization: Bearer sk-..."``) must not pull
# that quote into the match, or masking turns value corruption into *syntax*
# corruption — the closing quote vanishes and the command/string no longer parses
# (unterminated quote → shell EOF / Python SyntaxError). Real credentials never
# contain ``"`` or ``'``, so excluding them is safe. See #43083.
_AUTH_HEADER_RE = re.compile(
r"((?:Proxy-)?Authorization:\s*)([A-Za-z][\w.+-]*\s+)?(\S+)",
r"((?:Proxy-)?Authorization:\s*)([A-Za-z][\w.+-]*\s+)?([^\s\"']+)",
re.IGNORECASE,
)
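To make the quote-exclusion change concrete, the sketch below runs the old and new patterns from this hunk against a quoted header. `mask` is a hypothetical stand-in for the real substitution callback, which is not shown here.

```python
import re

# Old and new patterns, copied from the hunk above.
OLD_AUTH_RE = re.compile(
    r"((?:Proxy-)?Authorization:\s*)([A-Za-z][\w.+-]*\s+)?(\S+)",
    re.IGNORECASE,
)
NEW_AUTH_RE = re.compile(
    r"((?:Proxy-)?Authorization:\s*)([A-Za-z][\w.+-]*\s+)?([^\s\"']+)",
    re.IGNORECASE,
)


def mask(pattern: re.Pattern, text: str) -> str:
    """Hypothetical callback: keep header name and scheme, mask the credential."""
    return pattern.sub(lambda m: f"{m.group(1)}{m.group(2) or ''}***", text)


cmd = 'curl -H "Authorization: Bearer sk-abc123" https://example.com'
print(mask(OLD_AUTH_RE, cmd))  # closing quote is swallowed
print(mask(NEW_AUTH_RE, cmd))  # closing quote survives
```

With the old `\S+` class the closing `"` is part of the credential group and disappears with it; the new class stops at the quote.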
@@ -154,9 +210,37 @@ _PRIVATE_KEY_RE = re.compile(
)
# Database connection strings: protocol://user:PASSWORD@host
# Catches postgres, mysql, mongodb, redis, amqp URLs and redacts the password
# Catches postgres, mysql, mongodb, redis, amqp URLs and redacts the password.
# The userinfo and password groups forbid whitespace ([^:\s]+ / [^@\s]+) so the
# match can never span a line break. A real DSN password never contains
# whitespace; without this bound the greedy [^@]+ would scan past the end of a
# code line to the next stray "@" (e.g. a Python decorator), swallowing
# intervening lines and corrupting tool OUTPUT for any source containing a
# postgresql:// f-string template. See issue #33801.
_DB_CONNSTR_RE = re.compile(
r"((?:postgres(?:ql)?|mysql|mongodb(?:\+srv)?|redis|amqp)://[^:]+:)([^@]+)(@)",
r"((?:postgres(?:ql)?|mysql|mongodb(?:\+srv)?|redis|amqp)://[^:\s]+:)([^@\s]+)(@)",
re.IGNORECASE,
)
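The multi-line swallow described in the comment can be reproduced with a small sketch. Both patterns are copied from this hunk; the sample source text and the `redact` helper are illustrative assumptions.

```python
import re

# Old and new patterns, copied from the hunk above.
OLD_DB_RE = re.compile(
    r"((?:postgres(?:ql)?|mysql|mongodb(?:\+srv)?|redis|amqp)://[^:]+:)([^@]+)(@)",
    re.IGNORECASE,
)
NEW_DB_RE = re.compile(
    r"((?:postgres(?:ql)?|mysql|mongodb(?:\+srv)?|redis|amqp)://[^:\s]+:)([^@\s]+)(@)",
    re.IGNORECASE,
)


def redact(pattern: re.Pattern, text: str) -> str:
    """Illustrative helper: replace the password group with ***."""
    return pattern.sub(lambda m: f"{m.group(1)}***{m.group(3)}", text)


# A DSN with a port but no userinfo, followed later by a decorator "@".
src = 'DSN = "postgresql://localhost:5432/app"\n\n@cache\ndef load():\n    pass\n'

print(redact(OLD_DB_RE, src))  # lines between ":" and "@cache" are eaten
print(redact(NEW_DB_RE, src))  # unchanged
print(redact(NEW_DB_RE, "postgres://bob:s3cret@db/app"))
```

The old password group has no whitespace bound, so it runs from the port to the decorator's `@`. The new one cannot cross the newline, while a literal single-line password is still masked.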
# Bare-token credential in a web/transport URL: ``scheme://TOKEN@host``.
# This is the ``git remote set-url origin https://PASSWORD@github.com/...``
# shape from issue #6396 — a single opaque credential in the userinfo position
# with NO ``user:pass`` colon. It is unambiguously a secret: legitimate
# round-trip URLs (OAuth callbacks, magic links, pre-signed shares — see the
# "Web-URL redaction is intentionally OFF" note in redact_sensitive_text) carry
# their tokens in the QUERY STRING, never in bare userinfo. The colon form
# ``user:pass@`` is deliberately left to pass through (commit "pass web URLs
# through unchanged", #34029) and is NOT matched here — the token class forbids
# ``:``. DB schemes are handled by _DB_CONNSTR_RE above and excluded here.
#
# Guards against false positives:
# - 8+ char floor skips short usernames (git, admin, root, deploy, ubuntu).
# - The token class ``[^\s:@/]`` cannot cross ``/``, so an ``@`` sitting in a
# path or query (e.g. ``?q=user@example.com``) is never treated as userinfo.
_URL_BARE_TOKEN_RE = re.compile(
r"((?:https?|wss?|git|ssh|ftp|ftps|sftp)://)" # scheme
r"([^\s:@/]{8,})" # bare token (no colon/slash/@), 8+ chars
r"(@[^\s]+)", # @host...
re.IGNORECASE,
)
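A short sketch of the false-positive guards listed above. The pattern is copied from this hunk; `mask_bare_token` is a hypothetical helper that replaces the token with `***` instead of calling the real `_mask_token`.

```python
import re

# Copied from the hunk above.
_URL_BARE_TOKEN_RE = re.compile(
    r"((?:https?|wss?|git|ssh|ftp|ftps|sftp)://)"  # scheme
    r"([^\s:@/]{8,})"                              # bare token, 8+ chars
    r"(@[^\s]+)",                                  # @host...
    re.IGNORECASE,
)


def mask_bare_token(text: str) -> str:
    """Hypothetical helper: mask only the colon-less userinfo token."""
    return _URL_BARE_TOKEN_RE.sub(lambda m: f"{m.group(1)}***{m.group(3)}", text)


for url in (
    "https://ghp_abcdefgh1234@github.com/o/r.git",    # bare token: masked
    "https://git@github.com/o/r.git",                 # short username: kept
    "https://user:pass1234@host/x",                   # colon form: kept
    "https://example.com/search?q=user@example.com",  # "@" in query: kept
):
    print(mask_bare_token(url))
```

Only the first URL changes. The other three fail on the length floor, the excluded `:`, and the excluded `/` respectively.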
@@ -340,7 +424,40 @@ def _redact_form_body(text: str) -> str:
return _redact_query_string(text.strip())
def redact_sensitive_text(text: str, *, force: bool = False, code_file: bool = False) -> str:
def _mask_token_nonreusable(token: str) -> str:
"""Redact a prefix-matched credential to a NON-REUSABLE sentinel.
Unlike :func:`_mask_token` (which keeps head/tail chars — fine for logs
that are never fed back into a config), this emits a marker that:
* cannot be mistaken for a usable-but-truncated key, so an agent that
reads it from a config file and writes it back does NOT corrupt the
stored credential into a dead 13-char string (issue #35519); and
* still does not leak the secret material (no head/tail chars).
The vendor prefix label is preserved for debuggability so the agent can
still tell *which* credential is present (e.g. a GitHub PAT vs an OpenAI
key) without seeing any of its bytes.
"""
if not token:
return "«redacted-secret»"
# Preserve only the recognizable vendor prefix label (e.g. "ghp_", "sk-"),
# never any of the random secret body.
label = ""
for sub in _PREFIX_SUBSTRINGS:
if token.startswith(sub):
label = sub
break
return f"«redacted:{label}…»" if label else "«redacted-secret»"
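As a rough illustration of the sentinel format, the function body below mirrors the one in this hunk. `_PREFIX_SUBSTRINGS` is defined elsewhere in the module and is not shown in the diff, so the two-entry tuple here is an assumption made only for the example.

```python
# Assumed subset; the real _PREFIX_SUBSTRINGS is not shown in this diff.
_PREFIX_SUBSTRINGS = ("ghp_", "sk-")


def mask_token_nonreusable(token: str) -> str:
    """Return a sentinel that keeps only the vendor prefix label."""
    if not token:
        return "«redacted-secret»"
    label = ""
    for sub in _PREFIX_SUBSTRINGS:
        if token.startswith(sub):
            label = sub
            break
    return f"«redacted:{label}…»" if label else "«redacted-secret»"


print(mask_token_nonreusable("ghp_S1abcdefghPn2T"))  # vendor label only
print(mask_token_nonreusable("opaque-value"))        # no known prefix
```

Neither output contains any byte of the secret body, and the guillemets make the result unusable as a token if it is written back to a config file.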
def redact_sensitive_text(
text: str,
*,
force: bool = False,
code_file: bool = False,
file_read: bool = False,
) -> str:
"""Apply all redaction patterns to a block of text.
Safe to call on any string -- non-matching text passes through unchanged.
@@ -353,6 +470,17 @@ def redact_sensitive_text(text: str, *, force: bool = False, code_file: bool = F
constants, "apiKey": "test" fixtures). Prefix patterns, auth headers,
private keys, DB connstrings, JWTs, and URL secrets are still redacted.
Set file_read=True for file *content* returned to the agent (read_file /
search_files / cat). Secrets are STILL redacted — they are never exposed —
but prefix-matched credentials are replaced with a non-reusable sentinel
(``«redacted:ghp_…»``) instead of a head/tail-preserving mask
(``ghp_S1...Pn2T``). The old mask looked like a real-but-truncated key, so
an agent reading it from config.yaml and writing it back silently corrupted
the stored credential into a dead 13-char value → 401 (issue #35519). The
sentinel is syntactically invalid as a token, so it can't be mistaken for a
usable key or written back as one. Implies code_file=True (config/data
files shouldn't trigger the source-code ENV/JSON false-positive paths).
Performance: each regex pattern is gated behind a cheap substring
pre-check (e.g. ``"=" in text`` for ENV assignments, ``"://" in text``
for URLs, ``"eyJ" in text`` for JWTs). On a typical hermes log line
@@ -371,9 +499,15 @@ def redact_sensitive_text(text: str, *, force: bool = False, code_file: bool = F
if not (force or _REDACT_ENABLED):
return text
# file_read content shouldn't hit the source-code ENV/JSON false-positive
# paths either (it's config/data, not log lines).
if file_read:
code_file = True
# Known prefixes (sk-, ghp_, etc.) — gate on substring presence
if _has_known_prefix_substring(text):
text = _PREFIX_RE.sub(lambda m: _mask_token(m.group(1)), text)
_prefix_sub = _mask_token_nonreusable if file_read else _mask_token
text = _PREFIX_RE.sub(lambda m: _prefix_sub(m.group(1)), text)
# ENV assignments: OPENAI_API_KEY=*** (skip for code files — false positives)
if not code_file:
@@ -382,6 +516,13 @@ def redact_sensitive_text(text: str, *, force: bool = False, code_file: bool = F
name, quote, value = m.group(1), m.group(2), m.group(3)
return f"{name}={quote}{_mask_token(value)}{quote}"
text = _ENV_ASSIGN_RE.sub(_redact_env, text)
# Lowercase/dotted config keys (issue #16413). Skip URLs entirely —
# web-URL query params are intentionally passed through (see note
# near the bottom of this function); _DB_CONNSTR_RE still guards
# connection-string passwords.
if "://" not in text:
text = _CFG_DOTTED_RE.sub(_redact_env, text)
text = _CFG_ANCHORED_RE.sub(_redact_env, text)
# JSON fields: "apiKey": "***" (skip for code files — false positives)
if ":" in text and '"' in text:
@@ -390,6 +531,15 @@ def redact_sensitive_text(text: str, *, force: bool = False, code_file: bool = F
return f'{key}: "{_mask_token(value)}"'
text = _JSON_FIELD_RE.sub(_redact_json, text)
# Unquoted YAML / colon config: password: *** (after JSON so quoted
# values are handled there; the lookahead in _YAML_ASSIGN_RE skips
# quotes). Skip URLs — web-URL query params pass through by design.
if ":" in text and "://" not in text:
def _redact_yaml(m):
key, sep, value = m.group(1), m.group(2), m.group(3)
return f"{key}{sep}{_mask_token(value)}"
text = _YAML_ASSIGN_RE.sub(_redact_yaml, text)
# Authorization headers — _AUTH_HEADER_RE matches any scheme after
# "[Proxy-]Authorization:" case-insensitively, so "uthorization" is the
# cheapest substring gate that covers every casing without a casefold().
@@ -419,9 +569,32 @@ def redact_sensitive_text(text: str, *, force: bool = False, code_file: bool = F
if "BEGIN" in text and "-----" in text:
text = _PRIVATE_KEY_RE.sub("[REDACTED PRIVATE KEY]", text)
# Database connection string passwords
# Database connection string passwords. With code_file=True, a password
# group that is a pure ``{...}`` brace expression is an f-string template
# reference (e.g. f"postgresql://{user}:{pass}@{host}"), not a literal
# credential — preserve it. Literal passwords are still redacted. The regex
# forbids whitespace in the password group, so a single-line template's
# group(2) is exactly the brace expression. See issue #33801.
if "://" in text:
text = _DB_CONNSTR_RE.sub(lambda m: f"{m.group(1)}***{m.group(3)}", text)
if code_file:
def _redact_db(m):
pw = m.group(2)
if pw.startswith("{") and pw.endswith("}"):
return m.group(0)
return f"{m.group(1)}***{m.group(3)}"
text = _DB_CONNSTR_RE.sub(_redact_db, text)
else:
text = _DB_CONNSTR_RE.sub(lambda m: f"{m.group(1)}***{m.group(3)}", text)
# Bare-token userinfo in web/transport URLs: ``scheme://TOKEN@host``.
# The git-remote-with-embedded-password shape from #6396. Only the
# colon-less bare-token form is redacted — ``user:pass@`` and
# query-string tokens are left to pass through (see the web-URL note
# below). See _URL_BARE_TOKEN_RE for the false-positive guards.
text = _URL_BARE_TOKEN_RE.sub(
lambda m: f"{m.group(1)}{_mask_token(m.group(2))}{m.group(3)}",
text,
)
# JWT tokens (eyJ... — base64-encoded JSON headers)
if "eyJ" in text:
@@ -434,7 +607,12 @@ def redact_sensitive_text(text: str, *, force: bool = False, code_file: bool = F
# blanket-redacting param values by name breaks those skills mid-flow.
# Known credential shapes (sk-, ghp_, JWTs, etc.) inside URLs are still
# caught by _PREFIX_RE and _JWT_RE above. DB connection-string passwords
# are still caught by _DB_CONNSTR_RE.
# are still caught by _DB_CONNSTR_RE. The ONE userinfo case still redacted
# is the colon-less bare-token form ``scheme://TOKEN@host`` (#6396, handled
# by _URL_BARE_TOKEN_RE in the ``://`` block above): a bare credential in
# userinfo is never a round-trip workflow token (those live in the query
# string), so masking it can't break a skill. The ``user:pass@`` form is
# left to pass through per #34029.
# Form-urlencoded bodies (only triggers on clean k=v&k=v inputs).
if "&" in text and "=" in text:
@@ -452,6 +630,66 @@ def redact_sensitive_text(text: str, *, force: bool = False, code_file: bool = F
return text
# Commands whose stdout is an environment-variable dump (KEY=value lines),
# NOT source code. For these, terminal-output redaction must run the
# ENV-assignment pass (code_file=False) so opaque tokens with no recognized
# vendor prefix (e.g. ``MY_SERVICE_TOKEN=abc123randomstring``) are still
# masked. For all other commands, code_file=True is used to avoid mangling
# legitimate source/config dumps (``MAX_TOKENS=100``, ``"apiKey": "x"``
# fixtures, ``postgresql://{user}`` f-string templates). See issue #43025.
_ENV_DUMP_COMMANDS = frozenset({"env", "printenv", "set", "export", "declare"})
def is_env_dump_command(command: str | None) -> bool:
"""Return True if ``command`` dumps environment variables to stdout.
Detects ``env`` / ``printenv`` / ``set`` / ``export`` / ``declare`` as the
first token of any segment in a pipeline or sequence (``;`` / ``&&`` /
``||`` / ``|``). Conservative: a parse failure or anything unrecognized
returns False (callers then fall back to the safer code_file=True path,
which still masks prefix-shaped keys).
"""
if not command or not isinstance(command, str):
return False
# Split on shell separators, then inspect the first token of each segment.
segments = re.split(r"[|;&]+", command)
for seg in segments:
seg = seg.strip()
if not seg:
continue
try:
tokens = shlex.split(seg)
except ValueError:
tokens = seg.split()
if tokens and tokens[0] in _ENV_DUMP_COMMANDS:
return True
return False
def redact_terminal_output(
output: str, command: str | None = None, *, force: bool = False
) -> str:
"""Redact secrets from terminal/process stdout.
Single redaction policy for ALL terminal-output surfaces — foreground
``terminal`` results AND background ``process(action=poll/log/wait)``
output — so they can't diverge. Picks ``code_file`` based on whether
``command`` is an environment dump:
- env-dump command (``env``/``printenv``/``set``/``export``/``declare``)
→ ``code_file=False`` so the ENV-assignment pass masks opaque tokens.
- anything else (or unknown command) → ``code_file=True`` to avoid
false positives on source/config dumps.
``force=True`` bypasses the global ``security.redact_secrets`` preference
for safety boundaries that must never emit raw credentials.
"""
if not output:
return output
code_file = not is_env_dump_command(command or "")
return redact_sensitive_text(output, force=force, code_file=code_file)
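The command classification above can be exercised in isolation. This sketch restates `is_env_dump_command` from the hunk (with `Optional[str]` in place of `str | None` so it also runs on older interpreters); `pick_code_file` is a hypothetical name for the one-line decision `redact_terminal_output` makes.

```python
import re
import shlex
from typing import Optional

_ENV_DUMP_COMMANDS = frozenset({"env", "printenv", "set", "export", "declare"})


def is_env_dump_command(command: Optional[str]) -> bool:
    """True if any pipeline/sequence segment starts with an env-dump command."""
    if not command or not isinstance(command, str):
        return False
    for seg in re.split(r"[|;&]+", command):
        seg = seg.strip()
        if not seg:
            continue
        try:
            tokens = shlex.split(seg)
        except ValueError:
            # Unbalanced quotes: fall back to plain whitespace splitting.
            tokens = seg.split()
        if tokens and tokens[0] in _ENV_DUMP_COMMANDS:
            return True
    return False


def pick_code_file(command: Optional[str]) -> bool:
    """Hypothetical name for the flag redact_terminal_output derives."""
    return not is_env_dump_command(command or "")


for cmd in ("env | grep TOKEN", "cd /tmp && printenv HOME", "cat .env", None):
    print(repr(cmd), is_env_dump_command(cmd), pick_code_file(cmd))
```

Only the first token of each segment is inspected, so `cat .env` is not treated as a dump and keeps the `code_file=True` path.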
# Substrings used to gate ``_PREFIX_RE`` execution. If none of these appear in
# the input string, the prefix regex cannot match anything, so we skip it.
# False positives are fine (they just run the regex, which then matches

agent/replay_cleanup.py (new file, 140 lines)

@@ -0,0 +1,140 @@
"""Replay-history sanitization shared across resume code paths.
When a session's last turn dies mid-tool-loop — the process is killed by a
restart/shutdown command, a stale-timeout fires, or an interrupt lands before
the tool result is written — the persisted transcript can end with a dangling
``assistant(tool_calls)`` (no matching ``tool`` answer) or an interrupted
``assistant→tool`` block. On resume the model sees that broken tail and
re-issues the unanswered call, producing an endless "thinking"/reboot loop
(#49201, #29086).
These pure helpers strip those tails before the history is replayed to the
model. They were originally local to ``gateway/run.py`` (which fixed the
messaging-gateway path) and are extracted here so every resume surface — the
messaging gateway AND the TUI/WebUI gateway — shares the same cleanup instead
of the WebUI path silently skipping it.
"""
from __future__ import annotations
import logging
from typing import Any, Dict, List
logger = logging.getLogger(__name__)
def is_interrupted_tool_result(content: Any) -> bool:
"""Return True if a tool result indicates the tool was interrupted."""
if not isinstance(content, str):
return False
lowered = content.lower()
if "[command interrupted]" in lowered:
return True
if "exit_code" in lowered and ("130" in lowered or "-1" in lowered):
return "interrupt" in lowered
return False
def strip_interrupted_tool_tails(
agent_history: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
"""Strip interrupted assistant→tool sequences from replay history.
Older interrupted gateway turns can be followed by a queued real user
message, so the interrupted assistant/tool block is not necessarily the
final tail by the time we rebuild replay history. Remove any contiguous
assistant(tool_calls) + tool-result block that contains an interrupted tool
result, while preserving successful tool-call sequences intact.
"""
if not agent_history:
return agent_history
cleaned: List[Dict[str, Any]] = []
i = 0
n = len(agent_history)
while i < n:
msg = agent_history[i]
if msg.get("role") == "assistant" and "tool_calls" in msg:
j = i + 1
tool_results: List[Dict[str, Any]] = []
while j < n and agent_history[j].get("role") == "tool":
tool_results.append(agent_history[j])
j += 1
if tool_results and any(
is_interrupted_tool_result(m.get("content", ""))
for m in tool_results
):
logger.debug(
"Stripping interrupted assistant→tool replay block "
"(indices %d-%d, tool_results=%d)",
i, j - 1, len(tool_results),
)
i = j
continue
if msg.get("role") == "tool" and is_interrupted_tool_result(msg.get("content", "")):
logger.debug("Stripping orphan interrupted tool result from replay history")
i += 1
continue
cleaned.append(msg)
i += 1
return cleaned
def strip_dangling_tool_call_tail(
agent_history: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
"""Strip a trailing ``assistant(tool_calls)`` block left with NO answers.
When a tool call itself kills the gateway process (``docker restart``,
``systemctl restart``, ``kill``, ``hermes gateway restart``), the process
is terminated by SIGKILL *mid-call* — before the tool result is ever
written and before the orderly shutdown rewind
(``_drop_trailing_empty_response_scaffolding``) can run. The last thing
persisted is the ``assistant`` message that issued the ``tool_calls``,
with zero matching ``tool`` rows.
On resume the model sees an unanswered tool call at the tail and naturally
re-issues it — which restarts the gateway again, producing the infinite
reboot loop in #49201. ``strip_interrupted_tool_tails`` does not catch
this because there is no tool result to inspect for an interrupt marker.
This strips that dangling tail at the source so there is nothing for the
model to re-execute. It only acts when the tail is an
``assistant(tool_calls)`` whose calls have NO corresponding ``tool``
results — a completed assistant→tool pair (any tool answers present) is
left untouched so genuine mid-progress tool loops still resume.
"""
if not agent_history:
return agent_history
last = agent_history[-1]
if not (
isinstance(last, dict)
and last.get("role") == "assistant"
and last.get("tool_calls")
):
return agent_history
logger.debug(
"Stripping dangling unanswered assistant(tool_calls) tail "
"(%d call(s)) — process likely killed mid-tool-call by a "
"restart/shutdown command (#49201)",
len(last.get("tool_calls") or []),
)
return agent_history[:-1]
def sanitize_replay_history(
agent_history: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
"""Apply both replay-tail strippers in the canonical order.
Convenience entry point for resume code paths: removes interrupted
assistant→tool blocks anywhere in the history, then removes a dangling
unanswered ``assistant(tool_calls)`` tail. An empty history is returned
as-is; otherwise ``strip_interrupted_tool_tails`` builds a new list even
when there is nothing to strip.
"""
if not agent_history:
return agent_history
return strip_dangling_tool_call_tail(strip_interrupted_tool_tails(agent_history))
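The two-pass cleanup is easier to follow on a concrete transcript. The sketch below is a condensed restatement of the helpers in this file (logging dropped, names shortened) so it runs on its own; the sample history is invented for illustration.

```python
from typing import Any, Dict, List

Message = Dict[str, Any]


def is_interrupted(content: Any) -> bool:
    if not isinstance(content, str):
        return False
    lowered = content.lower()
    if "[command interrupted]" in lowered:
        return True
    if "exit_code" in lowered and ("130" in lowered or "-1" in lowered):
        return "interrupt" in lowered
    return False


def strip_interrupted(history: List[Message]) -> List[Message]:
    cleaned: List[Message] = []
    i, n = 0, len(history)
    while i < n:
        msg = history[i]
        if msg.get("role") == "assistant" and "tool_calls" in msg:
            j = i + 1
            while j < n and history[j].get("role") == "tool":
                j += 1
            # Drop the whole assistant+tool block if any result was interrupted.
            if any(is_interrupted(m.get("content", "")) for m in history[i + 1:j]):
                i = j
                continue
        if msg.get("role") == "tool" and is_interrupted(msg.get("content", "")):
            i += 1  # orphan interrupted tool result
            continue
        cleaned.append(msg)
        i += 1
    return cleaned


def strip_dangling(history: List[Message]) -> List[Message]:
    if history and history[-1].get("role") == "assistant" and history[-1].get("tool_calls"):
        return history[:-1]
    return history


def sanitize(history: List[Message]) -> List[Message]:
    return strip_dangling(strip_interrupted(history))


history = [
    {"role": "user", "content": "restart the gateway"},
    {"role": "assistant", "content": "", "tool_calls": [{"id": "c1"}]},
    {"role": "tool", "tool_call_id": "c1", "content": "[Command interrupted]"},
    {"role": "user", "content": "hello?"},
    {"role": "assistant", "content": "", "tool_calls": [{"id": "c2"}]},  # never answered
]
print([m["content"] for m in sanitize(history)])
```

The interrupted block in the middle is removed by the first pass and the unanswered call at the tail by the second, leaving only the two user messages. A completed assistant/tool pair passes through both passes unchanged.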


@@ -122,6 +122,8 @@ from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Callable, Dict, Iterator, List, Optional, Set, Tuple
from hermes_cli._subprocess_compat import IS_WINDOWS, windows_hide_flags
try:
import fcntl # POSIX only; Windows falls back to best-effort without flock.
except ImportError: # pragma: no cover
@@ -441,6 +443,7 @@ def _spawn(spec: ShellHookSpec, stdin_json: str) -> Dict[str, Any]:
return result
t0 = time.monotonic()
_popen_kwargs = {"creationflags": windows_hide_flags()} if IS_WINDOWS else {}
try:
proc = subprocess.run(
argv,
@@ -449,6 +452,7 @@ def _spawn(spec: ShellHookSpec, stdin_json: str) -> Dict[str, Any]:
timeout=spec.timeout,
text=True,
shell=False,
**_popen_kwargs,
)
except subprocess.TimeoutExpired:
result["timed_out"] = True


@@ -5,6 +5,8 @@ import re
import subprocess
from pathlib import Path
from hermes_cli._subprocess_compat import IS_WINDOWS, windows_hide_flags
logger = logging.getLogger(__name__)
# Matches ${HERMES_SKILL_DIR} / ${HERMES_SESSION_ID} tokens in SKILL.md.
@@ -66,6 +68,7 @@ def run_inline_shell(command: str, cwd: Path | None, timeout: int) -> str:
Failures return a short ``[inline-shell error: ...]`` marker instead of
raising, so one bad snippet can't wreck the whole skill message.
"""
_popen_kwargs = {"creationflags": windows_hide_flags()} if IS_WINDOWS else {}
try:
completed = subprocess.run(
["bash", "-c", command],
@@ -75,6 +78,7 @@ def run_inline_shell(command: str, cwd: Path | None, timeout: int) -> str:
timeout=max(1, int(timeout)),
check=False,
stdin=subprocess.DEVNULL,
**_popen_kwargs,
)
except subprocess.TimeoutExpired:
return f"[inline-shell timeout after {timeout}s: {command}]"


@@ -28,6 +28,7 @@ import uuid
from dataclasses import dataclass
from typing import Any, Dict, List, Optional
from agent.conversation_compression import conversation_history_after_compression
from agent.iteration_budget import IterationBudget
from agent.model_metadata import (
estimate_messages_tokens_rough,
@@ -400,7 +401,9 @@ def build_turn_context(
_orig_len, len(messages), _orig_tokens, _preflight_tokens
):
break # Cannot compress further: neither rows nor tokens moved
conversation_history = None
conversation_history = conversation_history_after_compression(
agent, messages
)
agent._empty_content_retries = 0
agent._thinking_prefill_retries = 0
agent._last_content_with_tools = None


@@ -289,7 +289,14 @@ def finalize_turn(
and len(_stripped) <= 24
and _stripped[-1:] not in {".", "!", "?", "", "", "", "`", ")"}
)
if _is_empty_terminal or _is_partial_fragment:
_is_partial_stream_recovery = (
str(_turn_exit_reason) == "partial_stream_recovery"
)
if (
_is_empty_terminal
or _is_partial_fragment
or _is_partial_stream_recovery
):
_explanation = agent._format_turn_completion_explanation(
_turn_exit_reason
)


@@ -67,6 +67,11 @@ class TurnRetryState:
# ── Restart signals (read by the outer loop after the attempt) ───────
restart_with_compressed_messages: bool = False
restart_with_length_continuation: bool = False
# Set when a content-filter stream stall (e.g. MiniMax "new_sensitive")
# has been escalated to the fallback chain: the partial-stream content
# was rolled back off ``messages`` and the loop should re-issue the API
# call against the newly-activated provider (#32421).
restart_with_rebuilt_messages: bool = False
def __iter__(self):
# Convenience for debugging / tests: iterate (name, value) pairs.


@@ -15,6 +15,63 @@ from typing import Any, Iterable
_MAX_CHANGED_PATHS_IN_NUDGE = 8
# Non-code file extensions whose edits carry no verifiable runtime behavior:
# documentation, prose, and data/markup that no test/build exercises. When a
# turn touches ONLY these, verify-on-stop has nothing to check, so the nudge is
# suppressed (this is fix "C" for the doc/markdown/skill false-positive — a
# SKILL.md or README edit must never demand a /tmp verification script). A turn
# that edits any non-listed path (a real source/code/config file) still nudges.
_NON_CODE_VERIFY_EXTENSIONS = frozenset(
{
".md",
".markdown",
".mdx",
".rst",
".txt",
".text",
".adoc",
".asciidoc",
".org",
".log",
".csv",
".tsv",
}
)
# Extension-less filenames (matched case-insensitively) that are pure prose
# even without a recognized doc extension.
_NON_CODE_VERIFY_FILENAMES = frozenset(
{
"license",
"licence",
"notice",
"authors",
"contributors",
"changelog",
"codeowners",
}
)
def _is_non_code_path(raw: str) -> bool:
"""Return True when a changed path is documentation/prose with nothing to verify."""
try:
p = Path(str(raw))
except Exception:
return False
suffix = p.suffix.lower()
if suffix in _NON_CODE_VERIFY_EXTENSIONS:
return True
if not suffix and p.name.lower() in _NON_CODE_VERIFY_FILENAMES:
return True
return False
def _filter_verifiable_paths(paths: Iterable[str]) -> list[str]:
"""Drop documentation/prose paths; keep paths that could have verifiable behavior."""
return [p for p in paths if p and not _is_non_code_path(p)]
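A small sketch of the path filter, using trimmed copies of the two sets from this hunk (the real sets list more extensions and filenames). The function names drop the leading underscore to mark them as illustrative.

```python
from pathlib import Path
from typing import Iterable, List

# Trimmed subsets of the sets defined in the hunk above.
DOC_EXTENSIONS = frozenset({".md", ".rst", ".txt"})
PROSE_FILENAMES = frozenset({"license", "changelog"})


def is_non_code_path(raw: str) -> bool:
    """True for documentation/prose paths that have nothing to verify."""
    p = Path(str(raw))
    suffix = p.suffix.lower()
    if suffix in DOC_EXTENSIONS:
        return True
    # The filename check only applies to extension-less names.
    return not suffix and p.name.lower() in PROSE_FILENAMES


def filter_verifiable_paths(paths: Iterable[str]) -> List[str]:
    return [p for p in paths if p and not is_non_code_path(p)]


changed = ["README.md", "src/app.py", "LICENSE", "", "Makefile"]
print(filter_verifiable_paths(changed))
```

A turn that touched only `README.md` and `LICENSE` would filter down to an empty list, which is what makes `build_verify_on_stop_nudge` return `None`.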
# Session identities (platform or source) that are NOT human conversational
# messaging surfaces: interactive coding surfaces (CLI, TUI, desktop, codex,
# local, gateway) and programmatic callers (API server, webhooks, tools).
@@ -79,12 +136,13 @@ def verify_on_stop_enabled(config: dict[str, Any] | None = None) -> bool:
"""Return whether edit -> verify-before-finish behavior is enabled.
Precedence: an explicit ``HERMES_VERIFY_ON_STOP`` env var wins, then an
explicit boolean ``agent.verify_on_stop`` config value, then a surface-aware
default. The config default is the sentinel ``"auto"`` (see
``DEFAULT_CONFIG``), which resolves to ON for interactive coding surfaces
explicit ``agent.verify_on_stop`` config value. The config default is
``False`` (see ``DEFAULT_CONFIG``) — verify-on-stop is OFF unless the user
opts in. The legacy ``"auto"`` sentinel is still honored for anyone who
sets it explicitly: it resolves to ON for interactive coding surfaces
(CLI, TUI, desktop) and programmatic callers, and OFF for conversational
messaging surfaces (Telegram, Discord, etc.) where the verification
narrative would otherwise reach a human as chat noise.
messaging surfaces (Telegram, Discord, etc.). A missing/unknown value
falls back to OFF.
"""
env = os.environ.get("HERMES_VERIFY_ON_STOP")
if env is not None:
@@ -106,8 +164,11 @@ def verify_on_stop_enabled(config: dict[str, Any] | None = None) -> bool:
return True
if token in {"0", "false", "no", "off"}:
return False
# "auto", missing, or any other value -> surface-aware default.
return not _session_is_messaging_surface()
if token == "auto":
# Explicit opt-in to the legacy surface-aware behavior.
return not _session_is_messaging_surface()
# Missing or unknown value -> OFF (the new default).
return False
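The precedence described in the docstring can be sketched as a pure function. This is an assumption-laden outline, not the module's code: `resolve_verify_on_stop` and its parameters are hypothetical, the truthy token set is assumed to mirror the falsy one shown in the hunk, and an unrecognized env value is assumed to fall through to the config value.

```python
from typing import Any, Optional

_TRUE = {"1", "true", "yes", "on"}    # assumed; only the falsy set is visible
_FALSE = {"0", "false", "no", "off"}  # matches the hunk above


def resolve_verify_on_stop(
    env_value: Optional[str],
    config_value: Any,
    is_messaging_surface: bool,
) -> bool:
    """Hypothetical restatement: env var first, then config, else OFF."""
    for raw in (env_value, config_value):
        if raw is None:
            continue
        token = str(raw).strip().lower()
        if token in _TRUE:
            return True
        if token in _FALSE:
            return False
        if token == "auto":
            # Legacy opt-in: ON except on conversational messaging surfaces.
            return not is_messaging_surface
    return False  # missing or unknown value: OFF (the new default)


print(resolve_verify_on_stop(None, None, False))    # nothing set
print(resolve_verify_on_stop(None, "auto", False))  # legacy sentinel, CLI
print(resolve_verify_on_stop(None, "auto", True))   # legacy sentinel, chat
print(resolve_verify_on_stop("1", False, True))     # env overrides config
```

The behavioral change in this hunk is the last line of the function: a missing or unknown value used to resolve through the surface check and now resolves to OFF.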
def _candidate_cwds(paths: Iterable[str]) -> list[Path]:
@@ -190,7 +251,10 @@ def build_verify_on_stop_nudge(
max_attempts: int = 2,
) -> str | None:
"""Return a synthetic follow-up when edited code lacks fresh verification."""
paths = sorted({str(p) for p in changed_paths if p})
# Drop documentation/prose paths (markdown, skills, README, LICENSE, ...) —
# they carry no verifiable behavior, so a turn that touched only those has
# nothing to verify and must not nudge.
paths = sorted({str(p) for p in _filter_verifiable_paths(changed_paths)})
if not paths or attempts >= max_attempts:
return None


@@ -1,15 +1,13 @@
const test = require('node:test')
const assert = require('node:assert/strict')
const path = require('node:path')
import assert from 'node:assert/strict'
import path from 'node:path'
import test from 'node:test'
const {
POSIX_SANE_PATH_ENTRIES,
appendUniquePathEntries,
import { appendUniquePathEntries,
buildDesktopBackendEnv,
buildDesktopBackendPath,
normalizeHermesHomeRoot,
pathEnvKey
} = require('./backend-env.cjs')
pathEnvKey,
POSIX_SANE_PATH_ENTRIES } from './backend-env'
test('desktop backend PATH adds Hermes-managed bins and missing POSIX sane entries', () => {
const result = buildDesktopBackendPath({


@@ -1,4 +1,4 @@
const path = require('node:path')
import path from 'node:path'
// Match the POSIX fallback surface used by the Python terminal environment.
// macOS apps launched from Finder/Dock often inherit only /usr/bin:/bin:/usr/sbin:/sbin,
@@ -23,12 +23,14 @@ function pathModuleForPlatform(platform = process.platform) {
}
function pathEnvKey(env = process.env, platform = process.platform) {
if (platform !== 'win32') return 'PATH'
if (platform !== 'win32') {return 'PATH'}
return Object.keys(env || {}).find(key => key.toUpperCase() === 'PATH') || 'PATH'
}
function currentPathValue(env = process.env, platform = process.platform) {
const key = pathEnvKey(env, platform)
return env?.[key] || ''
}
@@ -37,10 +39,11 @@ function appendUniquePathEntries(entries, { delimiter = path.delimiter } = {}) {
const ordered = []
for (const entry of entries) {
if (!entry) continue
if (!entry) {continue}
const parts = Array.isArray(entry) ? entry : String(entry).split(delimiter)
for (const part of parts) {
if (!part || seen.has(part)) continue
if (!part || seen.has(part)) {continue}
seen.add(part)
ordered.push(part)
}
@@ -55,7 +58,7 @@ function buildDesktopBackendPath({
currentPath = '',
platform = process.platform,
pathModule = pathModuleForPlatform(platform)
} = {}) {
}: any = {}) {
const delimiter = delimiterForPlatform(platform)
const hermesNodeBin = hermesHome ? pathModule.join(hermesHome, 'node', 'bin') : null
const venvBin = venvRoot ? pathModule.join(venvRoot, platform === 'win32' ? 'Scripts' : 'bin') : null
@@ -64,13 +67,15 @@ function buildDesktopBackendPath({
return appendUniquePathEntries([hermesNodeBin, venvBin, currentPath, saneEntries], { delimiter })
}
function normalizeHermesHomeRoot(hermesHome, { pathModule = pathModuleForPlatform(process.platform) } = {}) {
if (!hermesHome) return hermesHome
function normalizeHermesHomeRoot(hermesHome, { pathModule = pathModuleForPlatform(process.platform) }: any = {}) {
if (!hermesHome) {return hermesHome}
const resolved = pathModule.resolve(String(hermesHome))
const parent = pathModule.dirname(resolved)
if (pathModule.basename(parent).toLowerCase() === 'profiles') {
return pathModule.dirname(parent)
}
return resolved
}
@@ -81,7 +86,7 @@ function buildDesktopBackendEnv({
currentEnv = process.env,
platform = process.platform,
pathModule = pathModuleForPlatform(platform)
} = {}) {
}: any = {}) {
const delimiter = delimiterForPlatform(platform)
const currentPythonPath = currentEnv?.PYTHONPATH || ''
const key = pathEnvKey(currentEnv, platform)
@@ -98,12 +103,10 @@ function buildDesktopBackendEnv({
}
}
module.exports = {
POSIX_SANE_PATH_ENTRIES,
appendUniquePathEntries,
export { appendUniquePathEntries,
buildDesktopBackendEnv,
buildDesktopBackendPath,
delimiterForPlatform,
normalizeHermesHomeRoot,
pathEnvKey
}
pathEnvKey,
POSIX_SANE_PATH_ENTRIES }


@@ -5,13 +5,13 @@
* (Wired into npm test:desktop:platforms in package.json.)
*/
const test = require('node:test')
const assert = require('node:assert/strict')
const fs = require('node:fs')
const os = require('node:os')
const path = require('node:path')
import assert from 'node:assert/strict'
import fs from 'node:fs'
import os from 'node:os'
import path from 'node:path'
import test from 'node:test'
const { canImportHermesCli, verifyHermesCli } = require('./backend-probes.cjs')
import { canImportHermesCli, hermesRuntimeImportProbe, verifyHermesCli } from './backend-probes'
// Resolve the host's own Node binary -- guaranteed to be on disk and
// runnable. We use it as both a stand-in for "a python that doesn't
@@ -40,6 +40,12 @@ test('canImportHermesCli returns false when binary does not exist', () => {
assert.equal(canImportHermesCli(ghost), false)
})
test('hermes runtime import probe checks config dependencies', () => {
const probe = hermesRuntimeImportProbe()
assert.match(probe, /\bimport yaml\b/)
assert.match(probe, /\bimport hermes_cli\.config\b/)
})
test('verifyHermesCli returns false when command is falsy', () => {
assert.equal(verifyHermesCli(''), false)
assert.equal(verifyHermesCli(null), false)
@@ -57,6 +63,7 @@ test('verifyHermesCli returns true when --version exits 0', () => {
// verifyHermesCli only cares about the exit code.
const scriptPath = path.join(os.tmpdir(), `hermes-probes-ok-${Date.now()}-${process.pid}.cjs`)
fs.writeFileSync(scriptPath, 'process.exit(0)\n')
try {
// Use node as the launcher and our script as the "command". Pass
// shell:false (default) -- node is a real binary, no shim.


@@ -32,12 +32,23 @@
* as bootstrap-platform.cjs and hardening.cjs).
*/
const { execFileSync } = require('node:child_process')
import { execFileSync } from 'node:child_process'
const PROBE_TIMEOUT_MS = 5000
/**
* Return true iff `python -c "import hermes_cli"` exits 0.
* Return the Python snippet used to verify Hermes can import far enough to
* launch the CLI. Kept exported for tests so dependency regressions are
* caught without needing a real broken venv fixture.
*
* @returns {string}
*/
function hermesRuntimeImportProbe() {
return 'import yaml; import hermes_cli.config'
}
/**
* Return true iff the Hermes runtime import probe exits 0.
*
* Used to gate the "fallback to system Python with hermes_cli installed"
* rung of resolveHermesBackend. Without this, a system Python 3.11-3.13
@@ -46,17 +57,25 @@ const PROBE_TIMEOUT_MS = 5000
* site-packages -- and the resolver returns a backend that immediately
* dies on spawn.
*
* The probe intentionally imports hermes_cli.config, not just the top-level
* package: a broken/empty Windows launcher venv can still see the source tree
* through PYTHONPATH but lack PyYAML, then die on the first real CLI import.
*
* @param {string} pythonPath - Absolute path to a python.exe / python.
* @param {object} [opts.env] - Additional environment for the probe.
* @returns {boolean}
*/
function canImportHermesCli(pythonPath) {
if (!pythonPath) return false
function canImportHermesCli(pythonPath: string, opts:{env?: Record<string, string>} = {}) {
if (!pythonPath) {return false}
try {
execFileSync(pythonPath, ['-c', 'import hermes_cli'], {
execFileSync(pythonPath, ['-c', hermesRuntimeImportProbe()], {
env: { ...process.env, ...(opts.env || {}) },
stdio: 'ignore',
timeout: PROBE_TIMEOUT_MS,
windowsHide: true
})
return true
} catch {
return false
@@ -77,30 +96,30 @@ function canImportHermesCli(pythonPath) {
*
* @param {string} hermesCommand - Resolved absolute path to a hermes
* executable (or an interpreter+script wrapper).
* @param {object} [opts]
* @param {boolean} [opts.shell] - Whether to run through a shell. For
* .cmd/.bat shims on Windows execFileSync needs shell:true to find
* the cmd interpreter; mirrors the same flag isCommandScript() drives
* in resolveHermesBackend.
* @returns {boolean}
*/
function verifyHermesCli(hermesCommand, opts = {}) {
if (!hermesCommand) return false
function verifyHermesCli(hermesCommand: string, opts?: {shell?: boolean}) {
if (!hermesCommand) {return false}
try {
execFileSync(hermesCommand, ['--version'], {
stdio: 'ignore',
timeout: PROBE_TIMEOUT_MS,
shell: Boolean(opts.shell),
shell: Boolean(opts?.shell),
windowsHide: true
})
return true
} catch {
return false
}
}
module.exports = {
canImportHermesCli,
verifyHermesCli,
PROBE_TIMEOUT_MS
}
export { canImportHermesCli,
hermesRuntimeImportProbe,
PROBE_TIMEOUT_MS,
verifyHermesCli }
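Both probes in this file follow the same pattern: run a command with a timeout, discard its output, and treat any thrown error as failure. A minimal sketch of that pattern in isolation, assuming a hypothetical helper name `commandSucceeds` (not part of this change) and a default timeout that mirrors `PROBE_TIMEOUT_MS`:

```typescript
import { execFileSync } from 'node:child_process'

// Hypothetical helper, not part of the diff. Returns true iff the command
// exits 0 within the timeout. execFileSync throws on a non-zero exit, on a
// timeout, and on a spawn failure, so every failure mode collapses to false.
function commandSucceeds(command: string, args: string[], timeoutMs = 5000): boolean {
  if (!command) {
    return false
  }

  try {
    execFileSync(command, args, {
      stdio: 'ignore',
      timeout: timeoutMs,
      windowsHide: true
    })

    return true
  } catch {
    return false
  }
}
```

For example, `commandSucceeds(process.execPath, ['-e', 'process.exit(0)'])` probes the running Node binary itself, which makes the helper testable without a Python install.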

View File

@@ -11,29 +11,32 @@
* HERMES_DESKTOP_PORT_ANNOUNCE_TIMEOUT_MS, clamped to a 45s floor.
*/
const test = require('node:test')
const assert = require('node:assert/strict')
const { EventEmitter } = require('node:events')
const fs = require('node:fs')
const os = require('node:os')
const path = require('node:path')
import assert from 'node:assert/strict'
import { EventEmitter } from 'node:events'
import fs from 'node:fs'
import os from 'node:os'
import path from 'node:path'
import test from 'node:test'
const {
import { DEFAULT_PORT_ANNOUNCE_TIMEOUT_MS,
MIN_PORT_ANNOUNCE_TIMEOUT_MS,
readDashboardReadyFile,
resolvePortAnnounceTimeoutMs,
waitForDashboardPort,
waitForDashboardPortAnnouncement,
waitForDashboardReadyFile,
resolvePortAnnounceTimeoutMs,
DEFAULT_PORT_ANNOUNCE_TIMEOUT_MS,
MIN_PORT_ANNOUNCE_TIMEOUT_MS
} = require('./backend-ready.cjs')
waitForDashboardReadyFile } from './backend-ready'
type FakeChildProcess = EventEmitter & {
stdout: EventEmitter
}
// A minimal stand-in for a spawned child process: an EventEmitter with a
// stdout EventEmitter, matching the surface waitForDashboardPort consumes
// (child.stdout.on('data'), child.on('exit'|'error') + the .off() teardown).
function makeFakeChild() {
const child = new EventEmitter()
function makeFakeChild(): FakeChildProcess {
const child = new EventEmitter() as FakeChildProcess
child.stdout = new EventEmitter()
return child
}
@@ -132,6 +135,7 @@ test('a late announcement after timeout does not throw (listeners torn down)', a
function mkTmpReadyFile() {
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'hermes-ready-test-'))
return {
dir,
file: path.join(dir, 'ready.json'),
@@ -141,6 +145,7 @@ function mkTmpReadyFile() {
test('readDashboardReadyFile returns a valid port from JSON', () => {
const tmp = mkTmpReadyFile()
try {
fs.writeFileSync(tmp.file, JSON.stringify({ port: 4567 }))
assert.equal(readDashboardReadyFile(tmp.file), 4567)
@@ -151,6 +156,7 @@ test('readDashboardReadyFile returns a valid port from JSON', () => {
test('readDashboardReadyFile ignores missing, malformed, or invalid files', () => {
const tmp = mkTmpReadyFile()
try {
assert.equal(readDashboardReadyFile(tmp.file), null)
fs.writeFileSync(tmp.file, '{')
@@ -165,6 +171,7 @@ test('readDashboardReadyFile ignores missing, malformed, or invalid files', () =
test('waitForDashboardReadyFile resolves when the ready file appears', async () => {
const tmp = mkTmpReadyFile()
const child = makeFakeChild()
try {
const p = waitForDashboardReadyFile(tmp.file, child, 1000)
setTimeout(() => fs.writeFileSync(tmp.file, JSON.stringify({ port: 8765 })), 20)
@@ -177,6 +184,7 @@ test('waitForDashboardReadyFile resolves when the ready file appears', async ()
test('waitForDashboardPortAnnouncement uses ready file when provided', async () => {
const tmp = mkTmpReadyFile()
const child = makeFakeChild()
try {
const p = waitForDashboardPortAnnouncement(child, { readyFile: tmp.file, timeoutMs: 1000 })
setTimeout(() => fs.writeFileSync(tmp.file, JSON.stringify({ port: 9876 })), 20)
@@ -189,6 +197,7 @@ test('waitForDashboardPortAnnouncement uses ready file when provided', async ()
test('waitForDashboardReadyFile rejects when the child exits before file readiness', async () => {
const tmp = mkTmpReadyFile()
const child = makeFakeChild()
try {
const p = waitForDashboardReadyFile(tmp.file, child, 1000)
child.emit('exit', 1, null)

View File

@@ -1,4 +1,4 @@
const fs = require('node:fs')
import fs from 'node:fs'
const _READY_RE = /^HERMES_DASHBOARD_READY port=(\d+)/m
@@ -23,9 +23,11 @@ const MIN_PORT_ANNOUNCE_TIMEOUT_MS = 45_000
*/
function resolvePortAnnounceTimeoutMs(env = process.env) {
const parsed = Number(env.HERMES_DESKTOP_PORT_ANNOUNCE_TIMEOUT_MS)
if (Number.isFinite(parsed) && parsed > 0) {
return Math.max(MIN_PORT_ANNOUNCE_TIMEOUT_MS, Math.round(parsed))
}
return DEFAULT_PORT_ANNOUNCE_TIMEOUT_MS
}
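The clamping rule above can be exercised on its own. This is a sketch under stated assumptions: `resolveTimeoutMs` is a hypothetical name, the 45,000 ms floor comes from `MIN_PORT_ANNOUNCE_TIMEOUT_MS` in this diff, and the default is passed in as a parameter because the diff does not show its value.

```typescript
// Floor taken from MIN_PORT_ANNOUNCE_TIMEOUT_MS in the diff.
const MIN_TIMEOUT_MS = 45_000

// Hypothetical standalone version of the clamp. `defaultMs` is a parameter
// here because the real default's value is not visible in this diff.
function resolveTimeoutMs(raw: string | undefined, defaultMs: number): number {
  const parsed = Number(raw)

  // Positive finite values are rounded, then raised to the floor if needed.
  if (Number.isFinite(parsed) && parsed > 0) {
    return Math.max(MIN_TIMEOUT_MS, Math.round(parsed))
  }

  // Unset, empty, non-numeric, zero, or negative input falls back.
  return defaultMs
}
```

Values below the floor are raised rather than rejected, so a too-small override still yields a usable timeout.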
@@ -52,7 +54,7 @@ function waitForDashboardPort(child, timeoutMs = resolvePortAnnounceTimeoutMs())
let done = false
function cleanup() {
if (done) return
if (done) {return}
done = true
clearTimeout(timer)
child.stdout.off('data', onData)
@@ -63,13 +65,16 @@ function waitForDashboardPort(child, timeoutMs = resolvePortAnnounceTimeoutMs())
function onData(chunk) {
buf += chunk.toString()
let nl
while ((nl = buf.indexOf('\n')) !== -1) {
const line = buf.slice(0, nl)
buf = buf.slice(nl + 1)
const m = line.match(_READY_RE)
if (m) {
cleanup()
resolve(parseInt(m[1], 10))
return
}
}
@@ -96,11 +101,13 @@ function waitForDashboardPort(child, timeoutMs = resolvePortAnnounceTimeoutMs())
})
}
function readDashboardReadyFile(readyFile) {
if (!readyFile) return null
function readDashboardReadyFile(readyFile: fs.PathOrFileDescriptor) {
if (!readyFile) {return null}
try {
const parsed = JSON.parse(fs.readFileSync(readyFile, 'utf8'))
const port = Number(parsed?.port)
return Number.isInteger(port) && port > 0 ? port : null
} catch {
return null
@@ -113,16 +120,18 @@ function waitForDashboardReadyFile(readyFile, child, timeoutMs = resolvePortAnno
let interval = null
function cleanup() {
if (done) return
if (done) {return}
done = true
clearTimeout(timer)
if (interval) clearInterval(interval)
if (interval) {clearInterval(interval)}
child.off('exit', onExit)
child.off('error', onError)
}
function check() {
const port = readDashboardReadyFile(readyFile)
if (port) {
cleanup()
resolve(port)
@@ -147,25 +156,29 @@ function waitForDashboardReadyFile(readyFile, child, timeoutMs = resolvePortAnno
child.on('exit', onExit)
child.on('error', onError)
interval = setInterval(check, 50)
if (typeof interval.unref === 'function') interval.unref()
if (typeof interval.unref === 'function') {interval.unref()}
check()
})
}
function waitForDashboardPortAnnouncement(child, options = {}) {
function waitForDashboardPortAnnouncement(child, options: {
readyFile?: fs.PathOrFileDescriptor,
timeoutMs?: number
} = {}) {
const timeoutMs = options.timeoutMs ?? resolvePortAnnounceTimeoutMs()
if (options.readyFile) {
return waitForDashboardReadyFile(options.readyFile, child, timeoutMs)
}
return waitForDashboardPort(child, timeoutMs)
}
module.exports = {
waitForDashboardPort,
waitForDashboardPortAnnouncement,
waitForDashboardReadyFile,
export { DEFAULT_PORT_ANNOUNCE_TIMEOUT_MS,
MIN_PORT_ANNOUNCE_TIMEOUT_MS,
readDashboardReadyFile,
resolvePortAnnounceTimeoutMs,
DEFAULT_PORT_ANNOUNCE_TIMEOUT_MS,
MIN_PORT_ANNOUNCE_TIMEOUT_MS
}
waitForDashboardPort,
waitForDashboardPortAnnouncement,
waitForDashboardReadyFile }
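The stdout path in `waitForDashboardPort` depends on line buffering: a chunk boundary can fall in the middle of the ready line, so matching happens only on complete lines. A minimal sketch of that buffering, with the regular expression copied from this diff and `createReadyLineParser` as a hypothetical name:

```typescript
// Pattern copied from the diff.
const READY_RE = /^HERMES_DASHBOARD_READY port=(\d+)/m

// Hypothetical helper: feed it stdout chunks as they arrive. It returns the
// announced port once a complete line matches, and null until then.
function createReadyLineParser(): (chunk: string) => number | null {
  let buf = ''

  return chunk => {
    buf += chunk
    let nl: number

    // Consume only complete lines; a partial tail stays in the buffer.
    while ((nl = buf.indexOf('\n')) !== -1) {
      const line = buf.slice(0, nl)
      buf = buf.slice(nl + 1)
      const m = line.match(READY_RE)

      if (m) {
        return Number(m[1])
      }
    }

    return null
  }
}
```

Keeping the parser free of timers and listeners makes the split-chunk case easy to test; the real function layers timeout and child-exit handling on top.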

View File

@@ -1,14 +1,13 @@
const assert = require('node:assert/strict')
const fs = require('node:fs')
const path = require('node:path')
const test = require('node:test')
import assert from 'node:assert/strict'
import fs from 'node:fs'
import path from 'node:path'
import test from 'node:test'
import { fileURLToPath } from 'node:url'
const {
bundledRuntimeImportCheck,
import { bundledRuntimeImportCheck,
detectRemoteDisplay,
isWindowsBinaryPathInWsl,
isWslEnvironment
} = require('./bootstrap-platform.cjs')
isWslEnvironment } from './bootstrap-platform'
test('isWslEnvironment detects WSL2 env vars on linux', () => {
assert.equal(isWslEnvironment({ WSL_DISTRO_NAME: 'Ubuntu' }, 'linux'), true)
@@ -87,8 +86,8 @@ test('detectRemoteDisplay honors the HERMES_DESKTOP_DISABLE_GPU override both wa
})
test('packaged electron entrypoints do not require unpackaged npm modules', () => {
const electronDir = __dirname
const entrypoints = ['main.cjs', 'preload.cjs', 'bootstrap-platform.cjs']
const electronDir = path.dirname(fileURLToPath(import.meta.url))
const entrypoints = ['main.ts', 'preload.ts', 'bootstrap-platform.ts']
// - electron: provided by the electron runtime, always resolvable in packaged builds.
// - node-pty: hoisted by workspace dedup AND shipped via extraResources to
// resources/native-deps/node-pty (see scripts/stage-native-deps.cjs). main.cjs
@@ -100,6 +99,7 @@ test('packaged electron entrypoints do not require unpackaged npm modules', () =
for (const entrypoint of entrypoints) {
const source = fs.readFileSync(path.join(electronDir, entrypoint), 'utf8')
const bareRequires = Array.from(source.matchAll(requirePattern))
.map(match => match[1])
.filter(specifier => !specifier.startsWith('node:'))

View File

@@ -1,20 +1,23 @@
const fs = require('node:fs')
import fs from 'node:fs'
function isWslEnvironment(env = process.env, platform = process.platform, kernelRelease = null) {
if (platform !== 'linux') return false
if (env.WSL_DISTRO_NAME || env.WSL_INTEROP) return true
if (platform !== 'linux') {return false}
if (env.WSL_DISTRO_NAME || env.WSL_INTEROP) {return true}
try {
const release = kernelRelease ?? fs.readFileSync('/proc/sys/kernel/osrelease', 'utf8')
return /microsoft|wsl/i.test(release)
} catch {
return false
}
}
function isWindowsBinaryPathInWsl(filePath, options = {}) {
function isWindowsBinaryPathInWsl(filePath, options: {isWsl?: boolean, env?: NodeJS.ProcessEnv, platform?: NodeJS.Platform} = {}) {
const isWsl = options.isWsl ?? isWslEnvironment(options.env, options.platform)
if (!isWsl) return false
if (!isWsl) {return false}
const normalized = String(filePath || '')
.replace(/\\/g, '/')
@@ -48,19 +51,21 @@ const GPU_OVERRIDE_OFF = new Set(['0', 'false', 'no', 'off'])
*
* Pure + dependency-free so it can be unit-tested and called before app ready.
*/
function detectRemoteDisplay(options = {}) {
function detectRemoteDisplay(options: {env?: NodeJS.ProcessEnv, platform?: NodeJS.Platform} = {}) {
const env = options.env ?? process.env
const platform = options.platform ?? process.platform
const override = String(env.HERMES_DESKTOP_DISABLE_GPU || '')
.trim()
.toLowerCase()
if (GPU_OVERRIDE_ON.has(override)) return 'override (HERMES_DESKTOP_DISABLE_GPU)'
if (GPU_OVERRIDE_OFF.has(override)) return null
if (GPU_OVERRIDE_ON.has(override)) {return 'override (HERMES_DESKTOP_DISABLE_GPU)'}
if (GPU_OVERRIDE_OFF.has(override)) {return null}
// Launched from an SSH session → the display is X11-forwarded or otherwise
// remote. Covers the common `ssh user@box` + GUI-forwarding case.
if (env.SSH_CONNECTION || env.SSH_CLIENT || env.SSH_TTY) return 'ssh-session'
if (env.SSH_CONNECTION || env.SSH_CLIENT || env.SSH_TTY) {return 'ssh-session'}
if (platform === 'linux') {
// X11 forwarding sets DISPLAY to "<host>:N" (e.g. "localhost:10.0"); a
@@ -68,6 +73,7 @@ function detectRemoteDisplay(options = {}) {
// NB: WSLg deliberately isn't treated as remote — it reports
// GPU-accelerated vGPU surfaces locally and doesn't show the flicker.
const display = String(env.DISPLAY || '')
if (display.includes(':') && display.split(':')[0]) {
return `x11-forwarding (DISPLAY=${display})`
}
@@ -77,15 +83,14 @@ function detectRemoteDisplay(options = {}) {
// RDP sessions report SESSIONNAME like "RDP-Tcp#7"; the local console is
// "Console".
const sessionName = String(env.SESSIONNAME || '')
if (/^rdp-/i.test(sessionName)) return `rdp (SESSIONNAME=${sessionName})`
if (/^rdp-/i.test(sessionName)) {return `rdp (SESSIONNAME=${sessionName})`}
}
return null
}
module.exports = {
bundledRuntimeImportCheck,
export { bundledRuntimeImportCheck,
detectRemoteDisplay,
isWindowsBinaryPathInWsl,
isWslEnvironment
}
isWslEnvironment }
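The X11-forwarding branch of `detectRemoteDisplay` reduces to one string check on `DISPLAY`. A minimal sketch of just that check, assuming a hypothetical function name `x11ForwardingReason` and an injected environment object so it stays pure:

```typescript
// Hypothetical standalone version of the DISPLAY check from the diff.
function x11ForwardingReason(env: Record<string, string | undefined>): string | null {
  const display = String(env.DISPLAY || '')

  // A non-empty host part before the colon ("localhost:10.0") indicates a
  // forwarded display; a bare ":0" has an empty host part and is local.
  if (display.includes(':') && display.split(':')[0]) {
    return `x11-forwarding (DISPLAY=${display})`
  }

  return null
}
```

Passing the environment in, as the functions in this file do with `options.env`, means the check can be tested without mutating `process.env`.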

View File

@@ -1,15 +1,13 @@
const assert = require('node:assert/strict')
const test = require('node:test')
const fs = require('node:fs')
const os = require('node:os')
const path = require('node:path')
import assert from 'node:assert/strict'
import fs from 'node:fs'
import os from 'node:os'
import path from 'node:path'
import test from 'node:test'
const {
runBootstrap,
resolveInstallScript,
import { cachedScriptPath,
installedAgentInstallScript,
cachedScriptPath
} = require('./bootstrap-runner.cjs')
resolveInstallScript,
runBootstrap } from './bootstrap-runner'
const SCRIPT_NAME = process.platform === 'win32' ? 'install.ps1' : 'install.sh'
@@ -22,6 +20,7 @@ test('runBootstrap bails immediately when the signal is already aborted', async
controller.abort()
const events = []
const result = await runBootstrap({
installStamp: null,
activeRoot: '/tmp/hermes-runner-test',
@@ -42,6 +41,7 @@ test('runBootstrap bails immediately when the signal is already aborted', async
test('installedAgentInstallScript resolves the installer in the agent checkout', () => {
const home = mkTmpHome()
try {
assert.equal(installedAgentInstallScript(home), null, 'absent before the checkout exists')
@@ -59,6 +59,7 @@ test('installedAgentInstallScript resolves the installer in the agent checkout',
test('resolveInstallScript prefers a cached script without touching the network', async () => {
const home = mkTmpHome()
try {
const commit = 'a'.repeat(40)
const cached = cachedScriptPath(home, commit)
@@ -66,6 +67,7 @@ test('resolveInstallScript prefers a cached script without touching the network'
fs.writeFileSync(cached, '#!/bin/sh\necho cached\n')
const logs = []
const result = await resolveInstallScript({
installStamp: { commit },
sourceRepoRoot: null,
@@ -82,6 +84,7 @@ test('resolveInstallScript prefers a cached script without touching the network'
test('resolveInstallScript falls back to the installed agent checkout on a 404', async () => {
const home = mkTmpHome()
try {
const commit = 'a'.repeat(40)
// Seed the installed agent checkout so the fallback has something to resolve.
@@ -91,6 +94,7 @@ test('resolveInstallScript falls back to the installed agent checkout on a 404',
fs.writeFileSync(installed, '#!/bin/sh\necho fallback\n')
const logs = []
const result = await resolveInstallScript({
installStamp: { commit },
sourceRepoRoot: null,
@@ -117,6 +121,7 @@ test('resolveInstallScript falls back to the installed agent checkout on a 404',
test('resolveInstallScript rethrows when the 404 fallback is unavailable', async () => {
const home = mkTmpHome()
try {
const commit = 'a'.repeat(40)
// No installed agent checkout seeded -> nothing to fall back to.

View File

@@ -1,5 +1,3 @@
'use strict'
/**
* bootstrap-runner.cjs
*
@@ -34,11 +32,11 @@
* no UI consumes them yet)
*/
const fs = require('node:fs')
const fsp = require('node:fs/promises')
const path = require('node:path')
const https = require('node:https')
const { spawn } = require('node:child_process')
import { spawn } from 'node:child_process'
import fs from 'node:fs'
import fsp from 'node:fs/promises'
import https from 'node:https'
import path from 'node:path'
const IS_WINDOWS = process.platform === 'win32'
@@ -46,6 +44,7 @@ function hiddenWindowsChildOptions(options = {}) {
if (!IS_WINDOWS || Object.prototype.hasOwnProperty.call(options, 'windowsHide')) {
return options
}
return { ...options, windowsHide: true }
}
@@ -71,10 +70,12 @@ function installScriptKind() {
}
function resolveLocalInstallScript(sourceRepoRoot) {
if (!sourceRepoRoot) return null
if (!sourceRepoRoot) {return null}
const candidate = path.join(sourceRepoRoot, 'scripts', installScriptName())
try {
fs.accessSync(candidate, fs.constants.R_OK)
return candidate
} catch {
return null
@@ -90,10 +91,12 @@ function bootstrapCacheDir(hermesHome) {
// the pinned commit can't be fetched from GitHub (e.g. a locally-built desktop
// app stamped to an unpushed HEAD).
function installedAgentInstallScript(hermesHome) {
if (!hermesHome) return null
if (!hermesHome) {return null}
const candidate = path.join(hermesHome, 'hermes-agent', 'scripts', installScriptName())
try {
fs.accessSync(candidate, fs.constants.R_OK)
return candidate
} catch {
return null
@@ -110,6 +113,7 @@ function downloadInstallScript(commit, destPath) {
// verification beyond "did the file we wrote pass a syntax probe."
const scriptName = installScriptName()
const url = `https://raw.githubusercontent.com/NousResearch/hermes-agent/${commit}/scripts/${scriptName}`
return new Promise((resolve, reject) => {
fs.mkdirSync(path.dirname(destPath), { recursive: true })
const tmpPath = destPath + '.tmp'
@@ -129,8 +133,10 @@ function downloadInstallScript(commit, destPath) {
`Failed to download ${scriptName}: HTTP ${res2.statusCode} from redirect ${res.headers.location}`
)
)
return
}
const out2 = fs.createWriteStream(tmpPath)
res2.pipe(out2)
out2.on('finish', () => {
@@ -141,18 +147,24 @@ function downloadInstallScript(commit, destPath) {
out2.on('error', reject)
})
.on('error', reject)
return
}
if (res.statusCode !== 200) {
out.close()
try {
fs.unlinkSync(tmpPath)
} catch {
void 0
}
reject(new Error(`Failed to download ${scriptName}: HTTP ${res.statusCode} from ${url}`))
return
}
res.pipe(out)
out.on('finish', () => {
out.close()
@@ -165,6 +177,7 @@ function downloadInstallScript(commit, destPath) {
} catch {
void 0
}
reject(err)
})
})
@@ -174,6 +187,7 @@ function downloadInstallScript(commit, destPath) {
} catch {
void 0
}
reject(err)
})
})
@@ -190,8 +204,10 @@ async function resolveInstallScript({
// without pushing. SOURCE_REPO_ROOT comes from main.cjs (path.resolve
// of APP_ROOT/../..).
const localScript = resolveLocalInstallScript(sourceRepoRoot)
if (localScript) {
emit({ type: 'log', line: `[bootstrap] using local ${installScriptName()} at ${localScript}` })
return { path: localScript, source: 'local', kind: installScriptKind() }
}
@@ -204,12 +220,14 @@ async function resolveInstallScript({
}
const cached = cachedScriptPath(hermesHome, installStamp.commit)
try {
await fsp.access(cached, fs.constants.R_OK)
emit({
type: 'log',
line: `[bootstrap] using cached ${installScriptName()} for ${installStamp.commit.slice(0, 12)}`
})
return { path: cached, source: 'cache', commit: installStamp.commit, kind: installScriptKind() }
} catch {
// not cached; download
@@ -219,9 +237,11 @@ async function resolveInstallScript({
type: 'log',
line: `[bootstrap] fetching ${installScriptName()} for ${installStamp.commit.slice(0, 12)} from GitHub`
})
try {
await _download(installStamp.commit, cached)
emit({ type: 'log', line: `[bootstrap] saved to ${cached}` })
return { path: cached, source: 'download', commit: installStamp.commit, kind: installScriptKind() }
} catch (err) {
// The pinned commit may not be fetchable from GitHub -- most commonly a
@@ -230,6 +250,7 @@ async function resolveInstallScript({
// ships inside the already-installed agent checkout so dev/self-builds can
// still bootstrap instead of dying with a fatal 404.
const installed = installedAgentInstallScript(hermesHome)
if (installed) {
emit({
type: 'log',
@@ -237,15 +258,18 @@ async function resolveInstallScript({
`[bootstrap] GitHub fetch failed (${err.message}); ` +
`falling back to installed agent ${installScriptName()} at ${installed}`
})
try {
fs.mkdirSync(path.dirname(cached), { recursive: true })
fs.copyFileSync(installed, cached)
return { path: cached, source: 'installed-agent', commit: installStamp.commit, kind: installScriptKind() }
} catch {
// Cache copy failed (read-only FS, etc.) -- use the source path directly.
return { path: installed, source: 'installed-agent', commit: installStamp.commit, kind: installScriptKind() }
}
}
throw err
}
}
@@ -271,31 +295,37 @@ function powershellUnderRoot(root) {
function resolveWindowsPowerShell() {
for (const v of ['SystemRoot', 'windir']) {
const root = process.env[v]
if (root) {
const candidate = powershellUnderRoot(root)
try {
if (fs.statSync(candidate).isFile()) return candidate
if (fs.statSync(candidate).isFile()) {return candidate}
} catch {
void 0
}
}
}
const pathDirs = (process.env.PATH || process.env.Path || '').split(path.delimiter).filter(Boolean)
for (const exe of ['powershell.exe', 'pwsh.exe']) {
for (const dir of pathDirs) {
const candidate = path.join(dir, exe)
try {
if (fs.statSync(candidate).isFile()) return candidate
if (fs.statSync(candidate).isFile()) {return candidate}
} catch {
void 0
}
}
}
return 'powershell.exe'
}
function spawnPowerShell(scriptPath, args, { emit, stageName, abortSignal, hermesHome } = {}) {
return new Promise((resolve, reject) => {
function spawnPowerShell(scriptPath, args, { emit, stageName, abortSignal, hermesHome }: any = {}) {
return new Promise<any>((resolve, reject) => {
const ps = process.platform === 'win32' ? resolveWindowsPowerShell() : 'pwsh'
const fullArgs = ['-NoProfile', '-ExecutionPolicy', 'Bypass', '-File', scriptPath, ...args]
@@ -319,12 +349,14 @@ function spawnPowerShell(scriptPath, args, { emit, stageName, abortSignal, herme
const onAbort = () => {
killed = true
try {
child.kill('SIGTERM')
} catch {
void 0
}
}
if (abortSignal) {
if (abortSignal.aborted) {
onAbort()
@@ -342,10 +374,12 @@ function spawnPowerShell(scriptPath, args, { emit, stageName, abortSignal, herme
stdout += chunk
stdoutBuf += chunk
let nl
while ((nl = stdoutBuf.indexOf('\n')) !== -1) {
const line = stdoutBuf.slice(0, nl).replace(/\r$/, '')
stdoutBuf = stdoutBuf.slice(nl + 1)
if (line) emit && emit({ type: 'log', stage: stageName, line, stream: 'stdout' })
if (line) {emit && emit({ type: 'log', stage: stageName, line, stream: 'stdout' })}
}
})
@@ -354,30 +388,34 @@ function spawnPowerShell(scriptPath, args, { emit, stageName, abortSignal, herme
stderr += chunk
stderrBuf += chunk
let nl
while ((nl = stderrBuf.indexOf('\n')) !== -1) {
const line = stderrBuf.slice(0, nl).replace(/\r$/, '')
stderrBuf = stderrBuf.slice(nl + 1)
if (line) emit && emit({ type: 'log', stage: stageName, line, stream: 'stderr' })
if (line) {emit && emit({ type: 'log', stage: stageName, line, stream: 'stderr' })}
}
})
child.on('error', err => {
if (abortSignal) abortSignal.removeEventListener('abort', onAbort)
if (abortSignal) {abortSignal.removeEventListener('abort', onAbort)}
reject(err)
})
child.on('close', (code, signal) => {
if (abortSignal) abortSignal.removeEventListener('abort', onAbort)
if (abortSignal) {abortSignal.removeEventListener('abort', onAbort)}
// Flush any trailing bytes
if (stdoutBuf) emit && emit({ type: 'log', stage: stageName, line: stdoutBuf, stream: 'stdout' })
if (stderrBuf) emit && emit({ type: 'log', stage: stageName, line: stderrBuf, stream: 'stderr' })
resolve({ stdout, stderr, code, signal, killed })
if (stdoutBuf) {emit && emit({ type: 'log', stage: stageName, line: stdoutBuf, stream: 'stdout' } as any)}
if (stderrBuf) {emit && emit({ type: 'log', stage: stageName, line: stderrBuf, stream: 'stderr' } as any)}
resolve({ stdout, stderr, code, signal, killed } as any)
})
})
}
function spawnBash(scriptPath, args, { emit, stageName, abortSignal, hermesHome } = {}) {
return new Promise((resolve, reject) => {
function spawnBash(scriptPath, args, { emit, stageName, abortSignal, hermesHome }: any = {}) {
return new Promise<any>((resolve, reject) => {
const child = spawn('bash', [scriptPath, ...args], {
stdio: ['ignore', 'pipe', 'pipe'],
env: {
@@ -392,12 +430,14 @@ function spawnBash(scriptPath, args, { emit, stageName, abortSignal, hermesHome
const onAbort = () => {
killed = true
try {
child.kill('SIGTERM')
} catch {
void 0
}
}
if (abortSignal) {
if (abortSignal.aborted) {
onAbort()
@@ -414,10 +454,12 @@ function spawnBash(scriptPath, args, { emit, stageName, abortSignal, hermesHome
stdout += chunk
stdoutBuf += chunk
let nl
while ((nl = stdoutBuf.indexOf('\n')) !== -1) {
const line = stdoutBuf.slice(0, nl).replace(/\r$/, '')
stdoutBuf = stdoutBuf.slice(nl + 1)
if (line) emit && emit({ type: 'log', stage: stageName, line, stream: 'stdout' })
if (line) {emit && emit({ type: 'log', stage: stageName, line, stream: 'stdout' })}
}
})
@@ -426,22 +468,26 @@ function spawnBash(scriptPath, args, { emit, stageName, abortSignal, hermesHome
stderr += chunk
stderrBuf += chunk
let nl
while ((nl = stderrBuf.indexOf('\n')) !== -1) {
const line = stderrBuf.slice(0, nl).replace(/\r$/, '')
stderrBuf = stderrBuf.slice(nl + 1)
if (line) emit && emit({ type: 'log', stage: stageName, line, stream: 'stderr' })
if (line) {emit && emit({ type: 'log', stage: stageName, line, stream: 'stderr' })}
}
})
child.on('error', err => {
if (abortSignal) abortSignal.removeEventListener('abort', onAbort)
if (abortSignal) {abortSignal.removeEventListener('abort', onAbort)}
reject(err)
})
child.on('close', (code, signal) => {
if (abortSignal) abortSignal.removeEventListener('abort', onAbort)
if (stdoutBuf) emit && emit({ type: 'log', stage: stageName, line: stdoutBuf, stream: 'stdout' })
if (stderrBuf) emit && emit({ type: 'log', stage: stageName, line: stderrBuf, stream: 'stderr' })
if (abortSignal) {abortSignal.removeEventListener('abort', onAbort)}
if (stdoutBuf) {emit && emit({ type: 'log', stage: stageName, line: stdoutBuf, stream: 'stdout' })}
if (stderrBuf) {emit && emit({ type: 'log', stage: stageName, line: stderrBuf, stream: 'stderr' })}
resolve({ stdout, stderr, code, signal, killed })
})
})
@@ -456,48 +502,60 @@ function spawnBash(scriptPath, args, { emit, stageName, abortSignal, hermesHome
// instead of falling back to install.ps1's default ($Branch = "main").
function buildPinArgs(installStamp) {
const args = []
if (installStamp && installStamp.commit) {
args.push('-Commit', installStamp.commit)
}
if (installStamp && installStamp.branch) {
args.push('-Branch', installStamp.branch)
}
return args
}
function buildPosixPinArgs({ installStamp, activeRoot, hermesHome }) {
const args = ['--dir', activeRoot, '--hermes-home', hermesHome]
if (installStamp && installStamp.branch) {
args.push('--branch', installStamp.branch)
}
if (installStamp && installStamp.commit) {
args.push('--commit', installStamp.commit)
}
return args
}
async function fetchManifest({ scriptPath, installerKind, emit, hermesHome, activeRoot, installStamp }) {
const isPosix = installerKind === 'posix'
const args = isPosix
? ['--manifest', ...buildPosixPinArgs({ installStamp, activeRoot, hermesHome })]
: ['-Manifest', ...buildPinArgs(installStamp)]
const result = await (isPosix ? spawnBash : spawnPowerShell)(scriptPath, args, {
emit,
stageName: '__manifest__',
hermesHome
})
if (result.code !== 0) {
throw new Error(
`${isPosix ? 'install.sh --manifest' : 'install.ps1 -Manifest'} failed: exit ${result.code}\n${result.stderr || result.stdout}`
)
}
// The manifest is the LAST JSON line on stdout (install.ps1 may print
// banner / info lines first depending on Console.OutputEncoding effects).
// Find the last line that parses as JSON with a `stages` field.
const lines = result.stdout.split(/\r?\n/).filter(Boolean)
for (let i = lines.length - 1; i >= 0; i--) {
try {
const parsed = JSON.parse(lines[i])
if (parsed && Array.isArray(parsed.stages)) {
return parsed
}
@@ -505,6 +563,7 @@ async function fetchManifest({ scriptPath, installerKind, emit, hermesHome, acti
void 0
}
}
throw new Error(
`${isPosix ? 'install.sh --manifest' : 'install.ps1 -Manifest'} produced no parseable JSON payload\n${result.stdout}`
)
@@ -515,9 +574,11 @@ async function fetchManifest({ scriptPath, installerKind, emit, hermesHome, acti
// for the double-emit bug we addressed in the install.ps1 PR).
function parseStageResult(stdout) {
const lines = stdout.split(/\r?\n/).filter(Boolean)
for (let i = lines.length - 1; i >= 0; i--) {
try {
const parsed = JSON.parse(lines[i])
if (parsed && typeof parsed.ok === 'boolean' && typeof parsed.stage === 'string') {
return parsed
}
@@ -525,6 +586,7 @@ function parseStageResult(stdout) {
void 0
}
}
return null
}
@@ -533,6 +595,7 @@ async function runStage({ scriptPath, installerKind, stage, emit, hermesHome, ac
emit({ type: 'stage', name: stage.name, state: 'running' })
const isPosix = installerKind === 'posix'
const args = isPosix
? [
'--stage',
@@ -542,6 +605,7 @@ async function runStage({ scriptPath, installerKind, stage, emit, hermesHome, ac
...buildPosixPinArgs({ installStamp, activeRoot, hermesHome })
]
: ['-Stage', stage.name, '-NonInteractive', '-Json', ...buildPinArgs(installStamp)]
const result = await (isPosix ? spawnBash : spawnPowerShell)(scriptPath, args, {
emit,
stageName: stage.name,
@@ -554,6 +618,7 @@ async function runStage({ scriptPath, installerKind, stage, emit, hermesHome, ac
if (result.killed) {
const ev = { type: 'stage', name: stage.name, state: 'failed', durationMs, error: 'cancelled by user' }
emit(ev)
return ev
}
@@ -568,20 +633,26 @@ async function runStage({ scriptPath, installerKind, stage, emit, hermesHome, ac
error: `${isPosix ? 'install.sh --stage' : 'install.ps1 -Stage'} ${stage.name} produced no JSON result frame (exit=${result.code})`,
json: null
}
emit(ev)
return ev
}
if (json.ok && json.skipped) {
const ev = { type: 'stage', name: stage.name, state: 'skipped', durationMs, json }
emit(ev)
return ev
}
if (json.ok) {
const ev = { type: 'stage', name: stage.name, state: 'succeeded', durationMs, json }
emit(ev)
return ev
}
const ev = {
type: 'stage',
name: stage.name,
@@ -590,7 +661,9 @@ async function runStage({ scriptPath, installerKind, stage, emit, hermesHome, ac
json,
error: json.reason || `exit code ${result.code}`
}
emit(ev)
return ev
}
@@ -603,6 +676,7 @@ function openRunLog(logRoot) {
const ts = new Date().toISOString().replace(/[:.]/g, '-')
const logPath = path.join(logRoot, `bootstrap-${ts}.log`)
const stream = fs.createWriteStream(logPath, { flags: 'a' })
return { path: logPath, stream }
}
@@ -633,6 +707,7 @@ async function runBootstrap(opts) {
void 0
}
}
return { ok: false, cancelled: true }
}
@@ -646,8 +721,9 @@ async function runBootstrap(opts) {
} catch {
void 0
}
try {
if (typeof onEvent === 'function') onEvent(ev)
if (typeof onEvent === 'function') {onEvent(ev)}
} catch (err) {
// Don't let a subscriber bug crash the bootstrap
runLog.stream.write(`emit error: ${err && err.message}\n`)
@@ -677,6 +753,7 @@ async function runBootstrap(opts) {
activeRoot,
installStamp
})
emit({
type: 'manifest',
stages: manifest.stages,
@@ -690,8 +767,10 @@ async function runBootstrap(opts) {
for (const stage of manifest.stages) {
if (abortSignal && abortSignal.aborted) {
emit({ type: 'failed', error: 'bootstrap cancelled by user' })
return { ok: false, cancelled: true }
}
const ev = await runStage({
scriptPath: scriptInfo.path,
installerKind,
@@ -702,9 +781,11 @@ async function runBootstrap(opts) {
abortSignal,
installStamp
})
if (ev.state === 'failed') {
emit({ type: 'failed', stage: stage.name, error: ev.error || 'stage failed' })
return { ok: false, failedStage: stage.name, error: ev.error }
emit({ type: 'failed', stage: stage.name, error: (ev as any).error || 'stage failed' })
return { ok: false, failedStage: stage.name, error: (ev as any).error }
}
}
@@ -713,11 +794,14 @@ async function runBootstrap(opts) {
pinnedCommit: installStamp ? installStamp.commit : null,
pinnedBranch: installStamp ? installStamp.branch : null
}
const marker = typeof writeMarker === 'function' ? writeMarker(markerPayload) : markerPayload
emit({ type: 'complete', marker })
return { ok: true, marker }
} catch (err) {
emit({ type: 'failed', error: err.message || String(err) })
return { ok: false, error: err.message || String(err) }
} finally {
try {
@@ -728,12 +812,10 @@ async function runBootstrap(opts) {
}
}
module.exports = {
runBootstrap,
export { cachedScriptPath,
installedAgentInstallScript,
// Exposed for testability
parseStageResult,
resolveLocalInstallScript,
resolveInstallScript,
installedAgentInstallScript,
cachedScriptPath
}
resolveLocalInstallScript,
runBootstrap }
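`fetchManifest` and `parseStageResult` share one technique: scan stdout from the bottom up and take the last line that parses as JSON with the expected shape, ignoring banner and log lines. A minimal sketch of the stage-result variant, where `lastStageResult` and the `StageResult` interface are hypothetical names introduced for illustration:

```typescript
// Hypothetical shape: only `ok` and `stage` are checked, as in the diff.
interface StageResult {
  ok: boolean
  stage: string
  [key: string]: unknown
}

// Returns the last stdout line that is JSON with a boolean `ok` and a
// string `stage`, or null when no line qualifies.
function lastStageResult(stdout: string): StageResult | null {
  const lines = stdout.split(/\r?\n/).filter(Boolean)

  for (const line of lines.reverse()) {
    try {
      const parsed = JSON.parse(line)

      if (parsed && typeof parsed.ok === 'boolean' && typeof parsed.stage === 'string') {
        return parsed as StageResult
      }
    } catch {
      // Not JSON (banner or log line): keep scanning upward.
    }
  }

  return null
}
```

Scanning backward means a stage that prints two result frames resolves to the later one, and non-JSON output after the frame does not hide it.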

View File

@@ -1,20 +0,0 @@
'use strict'
/**
* build-mode.cjs — pure helper for the desktop's thin-vs-thick build mode.
*
* The desktop ships in two shapes:
* - thick (default): bundles the first-launch bootstrap installer, can
* spawn a local Hermes backend, and supports in-app self-update.
* - thin: no bootstrap, no local backend, no self-update. Connects ONLY
* to a remote gateway. Used for sandboxed/package-managed deployments
* (Flatpak, Snap, etc.) where the agent lives elsewhere.
*
* The esbuild bundler bakes this env var into the source code, so it's read at build time, not runtime.
*/
function isThinClient() {
return process.env.HERMES_DESKTOP_BUILD_MODE === 'thin'
}
module.exports = { isThinClient }

@@ -1,41 +0,0 @@
'use strict'
const test = require('node:test')
const assert = require('node:assert/strict')
// We test build-mode.cjs by controlling process.env directly. The module
// reads process.env.HERMES_DESKTOP_BUILD_MODE at call time (not import time),
// so we can mutate the env and re-require to exercise both modes.
function freshModule() {
// Bust the require cache so the module re-evaluates with the current env.
delete require.cache[require.resolve('./build-mode.cjs')]
return require('./build-mode.cjs')
}
test('isThinClient returns false by default (thick mode)', () => {
const prev = process.env.HERMES_DESKTOP_BUILD_MODE
delete process.env.HERMES_DESKTOP_BUILD_MODE
const { isThinClient } = freshModule()
assert.equal(isThinClient(), false)
process.env.HERMES_DESKTOP_BUILD_MODE = prev
})
test('isThinClient returns true when HERMES_DESKTOP_BUILD_MODE=thin', () => {
const prev = process.env.HERMES_DESKTOP_BUILD_MODE
process.env.HERMES_DESKTOP_BUILD_MODE = 'thin'
const { isThinClient } = freshModule()
assert.equal(isThinClient(), true)
process.env.HERMES_DESKTOP_BUILD_MODE = prev
})
test('isThinClient returns false for non-thin values', () => {
const prev = process.env.HERMES_DESKTOP_BUILD_MODE
process.env.HERMES_DESKTOP_BUILD_MODE = 'thick'
const { isThinClient } = freshModule()
assert.equal(isThinClient(), false)
process.env.HERMES_DESKTOP_BUILD_MODE = 'thick-client'
const { isThinClient: isThin2 } = freshModule()
assert.equal(isThin2(), false)
process.env.HERMES_DESKTOP_BUILD_MODE = prev
})
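The three tests above restore the variable with `process.env.HERMES_DESKTOP_BUILD_MODE = prev`. Node coerces any value assigned to a `process.env` key to a string, so when `prev` is `undefined` the key ends up holding the string `'undefined'` instead of being removed. A minimal sketch of a delete-aware restore, assuming a hypothetical `withEnv` helper (not part of this repository) that takes the env object as a parameter so it can be exercised without touching the real environment:

```typescript
type Env = Record<string, string | undefined>

// Hypothetical helper: set (or unset) one key, run fn, then put the key back
// exactly as it was, deleting it if it did not exist before.
function withEnv<T>(env: Env, key: string, value: string | undefined, fn: () => T): T {
  const had = Object.prototype.hasOwnProperty.call(env, key)
  const prev = env[key]

  if (value === undefined) {
    delete env[key]
  } else {
    env[key] = value
  }

  try {
    return fn()
  } finally {
    if (had) {
      env[key] = prev
    } else {
      delete env[key]
    }
  }
}

// Same check as isThinClient(), but reading from the env object passed in.
const isThinClientIn = (env: Env): boolean => env.HERMES_DESKTOP_BUILD_MODE === 'thin'
```

In a test this could be called as `withEnv(process.env, 'HERMES_DESKTOP_BUILD_MODE', 'thin', () => ...)`; the `try`/`finally` also restores the key when an assertion inside the callback throws.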

@@ -10,26 +10,24 @@
* and the OAuth session-cookie detector.
*/
const test = require('node:test')
const assert = require('node:assert/strict')
import assert from 'node:assert/strict'
import test from 'node:test'
const {
AT_COOKIE_VARIANTS,
RT_COOKIE_VARIANTS,
import { AT_COOKIE_VARIANTS,
authModeFromStatus,
buildGatewayWsUrl,
buildGatewayWsUrlWithTicket,
connectionScopeKey,
cookiesHaveSession,
cookiesHaveLiveSession,
normAuthMode,
cookiesHaveSession,
normalizeRemoteBaseUrl,
normAuthMode,
pathWithGlobalRemoteProfile,
profileRemoteOverride,
resolveAuthMode,
resolveTestWsUrl,
tokenPreview
} = require('./connection-config.cjs')
RT_COOKIE_VARIANTS,
tokenPreview } from './connection-config'
// --- connectionScopeKey / normAuthMode ---
@@ -73,6 +71,7 @@ test('profileRemoteOverride returns the per-profile remote with defaulted auth m
coder: { mode: 'remote', url: ' https://coder.example.com/hermes ', token: { value: 'sek' } }
}
}
assert.deepEqual(profileRemoteOverride(config, 'coder'), {
url: 'https://coder.example.com/hermes',
authMode: 'token',
@@ -365,6 +364,7 @@ test('resolveTestWsUrl (oauth, mint ok) builds a ?ticket= URL', async () => {
const url = await resolveTestWsUrl('https://gw.example.com', 'oauth', null, {
mintTicket: async () => 'tkt-9'
})
assert.equal(url, 'wss://gw.example.com/api/ws?ticket=tkt-9')
})
@@ -376,13 +376,14 @@ test('resolveTestWsUrl (oauth, mint FAILS) throws — must NOT skip WS validatio
throw new Error('401 ticket mint failed')
}
}),
err => {
(err: any) => {
// Actionable, points the user at re-auth, and preserves the cause + flag
// the boot overlay uses to offer a sign-in prompt.
assert.match(err.message, /WebSocket ticket/i)
assert.match(err.message, /sign in again/i)
assert.equal(err.needsOauthLogin, true)
assert.ok(err.cause instanceof Error)
return true
}
)

@@ -45,6 +45,7 @@ function normalizeRemoteBaseUrl(rawUrl) {
}
let parsed
try {
parsed = new URL(value)
} catch (error) {
@@ -105,13 +106,16 @@ function buildGatewayWsUrlWithTicket(baseUrl, ticket) {
* @param {{ mintTicket: (baseUrl: string) => Promise<string> }} deps
* @returns {Promise<string|null>}
*/
async function resolveTestWsUrl(baseUrl, authMode, token, deps = {}) {
async function resolveTestWsUrl(baseUrl, authMode, token, deps: any = {}) {
if (authMode === 'oauth') {
const mintTicket = deps.mintTicket
if (typeof mintTicket !== 'function') {
throw new Error('resolveTestWsUrl: a mintTicket function is required in OAuth mode.')
}
let ticket
try {
ticket = await mintTicket(baseUrl)
} catch (error) {
@@ -119,15 +123,19 @@ async function resolveTestWsUrl(baseUrl, authMode, token, deps = {}) {
'Reached the gateway over HTTP, but could not mint a WebSocket ticket for the OAuth session ' +
'(it may have expired). Open Settings → Gateway and sign in again.'
)
err.needsOauthLogin = true
;(err as any).needsOauthLogin = true
err.cause = error
throw err
}
return buildGatewayWsUrlWithTicket(baseUrl, ticket)
}
if (!token) {
return null
}
return buildGatewayWsUrl(baseUrl, token)
}
@@ -154,11 +162,13 @@ function normAuthMode(mode) {
function profileRemoteOverride(config, profile) {
const key = connectionScopeKey(profile)
const entry = key ? config?.profiles?.[key] : null
if (!entry || typeof entry !== 'object' || entry.mode !== 'remote') {
return null
}
const url = String(entry.url || '').trim()
if (!url) {
return null
}
@@ -172,18 +182,21 @@ function profileRemoteOverride(config, profile) {
* query parameter. Local pooled backends and per-profile remote overrides do not
* need this: they already run against a backend scoped to the target profile.
*/
function pathWithGlobalRemoteProfile(path, profile, opts = {}) {
function pathWithGlobalRemoteProfile(path, profile, opts: any = {}) {
const scopedProfile = connectionScopeKey(profile)
if (!scopedProfile || !opts.globalRemote || opts.profileRemoteOverride) {
return path
}
const rawPath = String(path || '')
if (!rawPath) {
return path
}
let parsed
try {
parsed = new URL(rawPath, 'http://hermes.local')
} catch {
@@ -224,9 +237,12 @@ function authModeFromStatus(statusBody) {
* Returns 'oauth' | 'token'.
*/
function resolveAuthMode(inputAuthMode, existingAuthMode) {
if (inputAuthMode === 'oauth') return 'oauth'
if (inputAuthMode === 'token') return 'token'
if (existingAuthMode === 'oauth') return 'oauth'
if (inputAuthMode === 'oauth') {return 'oauth'}
if (inputAuthMode === 'token') {return 'token'}
if (existingAuthMode === 'oauth') {return 'oauth'}
return 'token'
}
@@ -242,7 +258,8 @@ function resolveAuthMode(inputAuthMode, existingAuthMode) {
* need to know whether an unexpired access token is present right now.
*/
function cookiesHaveSession(cookies) {
if (!Array.isArray(cookies)) return false
if (!Array.isArray(cookies)) {return false}
return cookies.some(c => c && AT_COOKIE_VARIANTS.includes(c.name) && c.value)
}
@@ -260,24 +277,23 @@ function cookiesHaveSession(cookies) {
* the RT is also dead/revoked).
*/
function cookiesHaveLiveSession(cookies) {
if (!Array.isArray(cookies)) return false
if (!Array.isArray(cookies)) {return false}
return cookies.some(c => c && c.value && (AT_COOKIE_VARIANTS.includes(c.name) || RT_COOKIE_VARIANTS.includes(c.name)))
}
module.exports = {
AT_COOKIE_VARIANTS,
RT_COOKIE_VARIANTS,
export { AT_COOKIE_VARIANTS,
authModeFromStatus,
buildGatewayWsUrl,
buildGatewayWsUrlWithTicket,
connectionScopeKey,
cookiesHaveSession,
cookiesHaveLiveSession,
normAuthMode,
cookiesHaveSession,
normalizeRemoteBaseUrl,
normAuthMode,
pathWithGlobalRemoteProfile,
profileRemoteOverride,
resolveAuthMode,
resolveTestWsUrl,
tokenPreview
}
RT_COOKIE_VARIANTS,
tokenPreview }
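The migrated `resolveTestWsUrl` attaches `needsOauthLogin` through an `(err as any)` cast, and the test reads it back through `(err: any)`. One way to keep the flag and the cause typed is a small error subclass. This is a sketch only: `OauthTicketError` and `mintTicketOrExplain` are hypothetical names, and the message is shortened from the original.

```typescript
// Hypothetical error type: carries the flag the boot overlay checks plus the
// original failure, without casting through `any`.
class OauthTicketError extends Error {
  readonly needsOauthLogin: boolean = true
  readonly cause: unknown

  constructor(message: string, cause: unknown) {
    super(message)
    this.name = 'OauthTicketError'
    this.cause = cause
  }
}

// Hypothetical wrapper around the injected mintTicket dependency: any failure
// is rethrown as an actionable, typed error.
async function mintTicketOrExplain(
  baseUrl: string,
  mintTicket: (baseUrl: string) => Promise<string>
): Promise<string> {
  try {
    return await mintTicket(baseUrl)
  } catch (error) {
    throw new OauthTicketError(
      'Could not mint a WebSocket ticket for the OAuth session. Open Settings → Gateway and sign in again.',
      error
    )
  }
}
```

A caller can then branch on `err instanceof Error && (err as OauthTicketError).needsOauthLogin` or, more simply, on the two fields the existing test already asserts.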

@@ -5,17 +5,15 @@
* (Wired into npm test:desktop:platforms in package.json.)
*/
const test = require('node:test')
const assert = require('node:assert/strict')
import assert from 'node:assert/strict'
import test from 'node:test'
const {
adoptServedDashboardToken,
import { adoptServedDashboardToken,
dashboardIndexUrl,
extractInjectedDashboardToken,
fetchPublicText,
isForeignBackendToken,
resolveServedDashboardToken
} = require('./dashboard-token.cjs')
resolveServedDashboardToken } from './dashboard-token'
test('extractInjectedDashboardToken reads the JSON-encoded dashboard token', () => {
const html = '<script>window.__HERMES_SESSION_TOKEN__="served-token";window.__HERMES_BASE_PATH__=""</script>'
@@ -39,9 +37,11 @@ test('dashboardIndexUrl preserves dashboard path prefixes', () => {
test('resolveServedDashboardToken uses the served token and logs when it differs', async () => {
const logs = []
const token = await resolveServedDashboardToken('http://127.0.0.1:9120', 'spawn-token', {
fetchText: async url => {
assert.equal(url, 'http://127.0.0.1:9120/')
return '<script>window.__HERMES_SESSION_TOKEN__="served-token";</script>'
},
rememberLog: line => logs.push(line)
@@ -100,8 +100,9 @@ test('isForeignBackendToken only flags a mismatched token from a dead child', ()
[{ servedToken: null, spawnToken: 'mine', childAlive: false }, false],
[{ servedToken: '', spawnToken: 'mine', childAlive: false }, false]
]
for (const [input, expected] of cases) {
assert.equal(isForeignBackendToken(input), expected, JSON.stringify(input))
assert.equal(isForeignBackendToken(input as any), expected, JSON.stringify(input))
}
})
@@ -128,6 +129,7 @@ test('adoptServedDashboardToken refuses a foreign token when our child is dead',
test('adoptServedDashboardToken falls back to the spawn token when the fetch fails', async () => {
const logs = []
const token = await adoptServedDashboardToken('http://127.0.0.1:9120', 'spawn-token', {
childAlive: () => true,
fetchText: async () => {

@@ -9,29 +9,35 @@
const DEFAULT_TOKEN_FETCH_TIMEOUT_MS = 3_000
async function fetchPublicText(url, options = {}) {
async function fetchPublicText(url, options: any = {}) {
const { protocol } = new URL(url)
if (protocol !== 'http:' && protocol !== 'https:') {
throw new Error(`Unsupported Hermes backend URL protocol: ${protocol}`)
}
const timeoutMs = options.timeoutMs ?? DEFAULT_TOKEN_FETCH_TIMEOUT_MS
const res = await fetch(url, { signal: AbortSignal.timeout(timeoutMs) }).catch(error => {
if (error.name === 'TimeoutError') {
throw new Error(`Timed out connecting to Hermes backend after ${timeoutMs}ms`)
}
throw error
})
const text = await res.text()
if (!res.ok) throw new Error(`${res.status}: ${text || res.statusText}`)
if (!res.ok) {throw new Error(`${res.status}: ${text || res.statusText}`)}
return text
}
function extractInjectedDashboardToken(html) {
const match = /window\.__HERMES_SESSION_TOKEN__\s*=\s*("(?:\\.|[^"\\])*")/.exec(String(html || ''))
if (!match) return null
if (!match) {return null}
try {
return JSON.parse(match[1])
} catch {
@@ -43,11 +49,13 @@ function dashboardIndexUrl(baseUrl) {
return `${String(baseUrl || '').replace(/\/+$/, '')}/`
}
async function resolveServedDashboardToken(baseUrl, fallbackToken, options = {}) {
async function resolveServedDashboardToken(baseUrl, fallbackToken, options: any = {}) {
const fetchText = options.fetchText || fetchPublicText
const html = await fetchText(dashboardIndexUrl(baseUrl), {
timeoutMs: options.timeoutMs ?? DEFAULT_TOKEN_FETCH_TIMEOUT_MS
})
const servedToken = extractInjectedDashboardToken(html)
if (servedToken && servedToken !== fallbackToken && typeof options.rememberLog === 'function') {
@@ -76,6 +84,7 @@ function isForeignBackendToken({ servedToken, spawnToken, childAlive }) {
async function adoptServedDashboardToken(baseUrl, spawnToken, { childAlive, label = 'Hermes backend', ...options }) {
const servedToken = await resolveServedDashboardToken(baseUrl, spawnToken, options).catch(error => {
options.rememberLog?.(`[boot] could not read served dashboard token (${label}): ${error.message}`)
return spawnToken
})
@@ -88,12 +97,10 @@ async function adoptServedDashboardToken(baseUrl, spawnToken, { childAlive, labe
return servedToken
}
module.exports = {
DEFAULT_TOKEN_FETCH_TIMEOUT_MS,
adoptServedDashboardToken,
export { adoptServedDashboardToken,
dashboardIndexUrl,
DEFAULT_TOKEN_FETCH_TIMEOUT_MS,
extractInjectedDashboardToken,
fetchPublicText,
isForeignBackendToken,
resolveServedDashboardToken
}
resolveServedDashboardToken }
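For reference, the token extraction in this file matches a JSON string literal assigned to `window.__HERMES_SESSION_TOKEN__` and decodes it with `JSON.parse`. A self-contained sketch of that step follows. The regular expression is copied from the diff; the function name `extractToken` and the `return null` in the `catch` branch are assumptions, since the diff elides that branch.

```typescript
const SESSION_TOKEN_RE = /window\.__HERMES_SESSION_TOKEN__\s*=\s*("(?:\\.|[^"\\])*")/

function extractToken(html: unknown): string | null {
  const match = SESSION_TOKEN_RE.exec(String(html || ''))

  if (!match) {
    return null
  }

  try {
    // Group 1 is the quoted literal, including its surrounding double quotes.
    const parsed: unknown = JSON.parse(match[1])

    return typeof parsed === 'string' ? parsed : null
  } catch {
    // Assumption: a literal the regex accepts but JSON rejects counts as "no token".
    return null
  }
}

const servedPage = '<script>window.__HERMES_SESSION_TOKEN__="served-token";window.__HERMES_BASE_PATH__=""</script>'
console.log(extractToken(servedPage)) // served-token
```

Decoding through `JSON.parse` rather than stripping the quotes by hand means escaped characters inside the token (for example `\"`) come back as their literal values.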

@@ -9,19 +9,17 @@
* cleanup-script builders (POSIX + Windows).
*/
const test = require('node:test')
const assert = require('node:assert/strict')
import assert from 'node:assert/strict'
import test from 'node:test'
const {
UNINSTALL_MODES,
buildPosixCleanupScript,
import { buildPosixCleanupScript,
buildWindowsCleanupScript,
modeRemovesAgent,
modeRemovesUserData,
resolveRemovableAppPath,
shouldRemoveAppBundle,
uninstallArgsForMode
} = require('./desktop-uninstall.cjs')
UNINSTALL_MODES,
uninstallArgsForMode } from './desktop-uninstall'
// --- uninstallArgsForMode ---
@@ -132,6 +130,7 @@ test('buildPosixCleanupScript waits for the PID, runs the uninstall module, remo
appPath: '/opt/hermes/linux-unpacked',
hermesHome: '/home/x/.hermes'
})
assert.match(script, /^#!\/bin\/bash/)
assert.match(script, /pid=4321/)
assert.match(script, /kill -0 "\$pid"/)
@@ -152,6 +151,7 @@ test('buildPosixCleanupScript exports PYTHONPATH when pythonPath is set (lite/fu
appPath: null,
hermesHome: '/home/x/.hermes'
})
// System python + source on PYTHONPATH so import hermes_cli works while the
// venv is torn down.
assert.match(script, /export PYTHONPATH='\/home\/x\/\.hermes\/hermes-agent'/)
@@ -168,6 +168,7 @@ test('buildPosixCleanupScript omits PYTHONPATH when pythonPath is null (gui)', (
appPath: null,
hermesHome: '/h'
})
assert.doesNotMatch(script, /export PYTHONPATH/)
})
@@ -181,6 +182,7 @@ test('buildPosixCleanupScript omits the bundle rm when appPath is null', () => {
appPath: null,
hermesHome: '/h'
})
assert.doesNotMatch(script, /rm -rf '\//)
// Still runs the uninstall.
assert.match(script, /'-m' 'hermes_cli\.uninstall' '--mode' 'lite'/)
@@ -196,6 +198,7 @@ test('buildPosixCleanupScript single-quote-escapes paths with apostrophes', () =
appPath: null,
hermesHome: '/h'
})
// The apostrophe is closed-escaped-reopened so the shell sees the literal.
assert.match(script, /'\/home\/o'\\''brien\/python'/)
})
@@ -212,6 +215,7 @@ test('buildWindowsCleanupScript waits (bounded) for PID, runs uninstall, rmdir b
appPath: 'C:\\Users\\x\\AppData\\Local\\Programs\\Hermes',
hermesHome: 'C:\\Users\\x\\AppData\\Local\\hermes'
})
assert.match(script, /@echo off/)
assert.match(script, /set "PID=9988"/)
// PYTHONPATH set so a system python can import hermes_cli from source.
@@ -238,6 +242,7 @@ test('buildWindowsCleanupScript omits PYTHONPATH + rmdir when not needed (gui, n
appPath: null,
hermesHome: 'C:\\h'
})
assert.doesNotMatch(script, /rmdir/)
assert.doesNotMatch(script, /set "PYTHONPATH=/)
})

@@ -26,7 +26,7 @@
* shape as the self-update swap-and-relaunch flow already in main.cjs.
*/
const path = require('node:path')
import path from 'node:path'
const UNINSTALL_MODES = ['gui', 'lite', 'full']
@@ -41,6 +41,7 @@ function uninstallArgsForMode(mode) {
if (!UNINSTALL_MODES.includes(mode)) {
throw new Error(`Unknown uninstall mode: ${mode}`)
}
return ['-m', 'hermes_cli.uninstall', '--mode', mode]
}
@@ -65,9 +66,10 @@ function modeRemovesUserData(mode) {
* Returns null when we can't confidently identify a removable bundle (e.g.
* running from a dev checkout, or a system-package install we must not rmtree).
*/
function resolveRemovableAppPath(execPath, platform, env = {}) {
function resolveRemovableAppPath(execPath, platform, env: any = {}) {
const exe = String(execPath || '')
if (!exe) return null
if (!exe) {return null}
// Use the path flavor that matches the TARGET platform, not the host running
// this code — so the Windows branch parses backslash paths correctly even
@@ -79,22 +81,28 @@ function resolveRemovableAppPath(execPath, platform, env = {}) {
const macOsDir = p.dirname(exe) // …/Contents/MacOS
const contents = p.dirname(macOsDir) // …/Contents
const appBundle = p.dirname(contents) // …/Hermes.app
if (appBundle.endsWith('.app')) return appBundle
if (appBundle.endsWith('.app')) {return appBundle}
return null
}
if (platform === 'win32') {
// NSIS per-user installs Hermes.exe directly in the install dir.
const dir = p.dirname(exe)
if (/[\\/]Hermes$/i.test(dir) || /[\\/]hermes-desktop$/i.test(dir)) return dir
if (/[\\/]Hermes$/i.test(dir) || /[\\/]hermes-desktop$/i.test(dir)) {return dir}
return null
}
// Linux: an AppImage exposes its own path via the APPIMAGE env var.
if (env.APPIMAGE) return env.APPIMAGE
if (env.APPIMAGE) {return env.APPIMAGE}
// Unpacked electron-builder tree: …/linux-unpacked/hermes
const dir = p.dirname(exe)
if (/-unpacked$/.test(dir)) return dir
if (/-unpacked$/.test(dir)) {return dir}
return null
}
@@ -121,6 +129,7 @@ function shouldRemoveAppBundle(isPackaged, appPath) {
*/
function buildPosixCleanupScript({ desktopPid, pythonExe, pythonPath, agentRoot, uninstallArgs, appPath, hermesHome }) {
const q = s => `'${String(s).replace(/'/g, `'\\''`)}'`
const lines = [
'#!/bin/bash',
'set -u',
@@ -135,16 +144,21 @@ function buildPosixCleanupScript({ desktopPid, pythonExe, pythonPath, agentRoot,
'fi',
`export HERMES_HOME=${q(hermesHome)}`
]
if (pythonPath) {
lines.push(`export PYTHONPATH=${q(pythonPath)}\${PYTHONPATH:+:$PYTHONPATH}`)
}
lines.push(`cd ${q(agentRoot)} 2>/dev/null || true`, `${q(pythonExe)} ${uninstallArgs.map(q).join(' ')} || true`)
if (appPath) {
lines.push(`rm -rf ${q(appPath)} || true`)
}
// Self-delete the script.
lines.push('rm -f "$0" 2>/dev/null || true')
lines.push('')
return lines.join('\n')
}
@@ -180,15 +194,18 @@ function buildWindowsCleanupScript({
// under %LOCALAPPDATA% never contain them). `&`/`^` in a path would still be
// a problem, but Hermes install paths don't use them.
const q = s => `"${String(s).replace(/"/g, '')}"`
const lines = [
'@echo off',
'setlocal enableextensions',
`set "HERMES_HOME=${String(hermesHome).replace(/"/g, '')}"`,
`set "PID=${pid}"`
]
if (pythonPath) {
lines.push(`set "PYTHONPATH=${String(pythonPath).replace(/"/g, '')};%PYTHONPATH%"`)
}
lines.push(
'set /a waited=0',
':waitloop',
@@ -206,6 +223,7 @@ function buildWindowsCleanupScript({
`cd /d ${q(agentRoot)}`,
`${q(pythonExe)} ${uninstallArgs.map(q).join(' ')}`
)
if (appPath) {
lines.push(
'set /a tries=0',
@@ -220,18 +238,18 @@ function buildWindowsCleanupScript({
':rmdone'
)
}
lines.push('del "%~f0"')
lines.push('')
return lines.join('\r\n')
}
module.exports = {
UNINSTALL_MODES,
buildPosixCleanupScript,
export { buildPosixCleanupScript,
buildWindowsCleanupScript,
modeRemovesAgent,
modeRemovesUserData,
resolveRemovableAppPath,
shouldRemoveAppBundle,
uninstallArgsForMode
}
UNINSTALL_MODES,
uninstallArgsForMode }
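The POSIX builder relies on the close-escape-reopen idiom for single quotes: each `'` inside a value becomes `'\''`, and the whole value is wrapped in single quotes. A standalone sketch of that quoting step, using the same expression as the `q` helper in the diff (the name `shellQuote` is an assumption):

```typescript
// Wrap a value in single quotes for a POSIX shell. An embedded apostrophe
// closes the quoted run, emits an escaped apostrophe, then reopens the run.
function shellQuote(value: unknown): string {
  return `'${String(value).replace(/'/g, `'\\''`)}'`
}

const quotedPython = shellQuote("/home/o'brien/python")
const quotedArgs = ['-m', 'hermes_cli.uninstall', '--mode', 'lite'].map(shellQuote).join(' ')

console.log(quotedPython) // '/home/o'\''brien/python'
console.log(quotedArgs) // '-m' 'hermes_cli.uninstall' '--mode' 'lite'
```

Inside single quotes a POSIX shell performs no expansion at all, so this one substitution is enough to pass paths containing spaces, `$`, or backticks through unchanged.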

@@ -1,9 +1,8 @@
'use strict'
const { session } = require('electron')
import { session } from 'electron'
const EMBED_SESSION_PARTITION = 'persist:hermes-embed'
const EMBED_REFERER = 'https://www.youtube.com/'
const YOUTUBE_REFERER_HOST_RE =
/(^|\.)(youtube\.com|youtube-nocookie\.com|googlevideo\.com|ytimg\.com|youtubei\.googleapis\.com)$/i
@@ -23,6 +22,7 @@ function installEmbedRefererForSession(embedSession) {
if (!YOUTUBE_REFERER_HOST_RE.test(host)) {
callback({ requestHeaders: details.requestHeaders })
return
}
@@ -45,4 +45,4 @@ function installEmbedReferer() {
}
}
module.exports = { installEmbedReferer }
export { installEmbedReferer }

@@ -1,19 +1,17 @@
'use strict'
import assert from 'node:assert/strict'
import fs from 'node:fs'
import os from 'node:os'
import path from 'node:path'
import test from 'node:test'
import { pathToFileURL } from 'node:url'
const assert = require('node:assert/strict')
const fs = require('node:fs')
const os = require('node:os')
const path = require('node:path')
const test = require('node:test')
const { pathToFileURL } = require('node:url')
const { readDirForIpc } = require('./fs-read-dir.cjs')
import { readDirForIpc } from './fs-read-dir'
function mkTmpDir() {
return fs.mkdtempSync(path.join(os.tmpdir(), 'hermes-fs-read-dir-'))
}
function fakeDirent(name, flags = {}) {
function fakeDirent(name, flags: any = {}) {
return {
name,
isDirectory: () => Boolean(flags.directory),
@@ -109,10 +107,12 @@ test('readDirForIpc accepts file URLs for directories', async () => {
test('readDirForIpc returns invalid-path for blank or non-string input', async () => {
let readdirCalls = 0
const fsImpl = {
promises: {
readdir: async () => {
readdirCalls += 1
return []
}
}
@@ -126,10 +126,12 @@ test('readDirForIpc returns invalid-path for blank or non-string input', async (
test('readDirForIpc rejects Windows device paths before readdir', async () => {
let readdirCalls = 0
const fsImpl = {
promises: {
readdir: async () => {
readdirCalls += 1
return []
}
}
@@ -224,6 +226,7 @@ test('readDirForIpc allows expanding symlink or junction directories outside the
fs.writeFileSync(path.join(outside, 'outside.txt'), 'ok')
const linkPath = path.join(root, 'outside-link')
try {
fs.symlinkSync(outside, linkPath, process.platform === 'win32' ? 'junction' : 'dir')
} catch (error) {
@@ -252,6 +255,7 @@ test('readDirForIpc stats symbolic links and unknown entries without dropping th
const input = path.join('virtual-root')
const resolved = path.resolve(input)
const statCalls = []
const fsImpl = {
promises: {
readdir: async () => [
@@ -266,9 +270,11 @@ test('readDirForIpc stats symbolic links and unknown entries without dropping th
}
statCalls.push(fullPath)
if (fullPath.endsWith(`${path.sep}linked-dir`)) {
return { isDirectory: () => true }
}
throw Object.assign(new Error('gone'), { code: 'ENOENT' })
}
}
@@ -301,12 +307,15 @@ test('readDirForIpc bounds concurrent stats while preserving complete sorted out
let peak = 0
let releaseStats
let markFirstStatStarted
const statsReleased = new Promise(resolve => {
releaseStats = resolve
})
const firstStatStarted = new Promise(resolve => {
markFirstStatStarted = resolve
})
const fsImpl = {
promises: {
readdir: async () => [
@@ -326,6 +335,7 @@ test('readDirForIpc bounds concurrent stats while preserving complete sorted out
active -= 1
const name = path.basename(fullPath)
if (name === failedName) {
throw Object.assign(new Error('gone'), { code: 'ENOENT' })
}

@@ -1,8 +1,7 @@
'use strict'
import fs from 'node:fs'
import path from 'node:path'
const fs = require('node:fs')
const path = require('node:path')
const { resolveDirectoryForIpc } = require('./hardening.cjs')
import { resolveDirectoryForIpc } from './hardening'
const FS_READDIR_STAT_CONCURRENCY = 16
@@ -37,7 +36,7 @@ function direntIsSymbolicLink(dirent) {
}
function shouldStatDirent(dirent) {
if (direntIsDirectory(dirent)) return false
if (direntIsDirectory(dirent)) {return false}
return direntIsSymbolicLink(dirent) || !direntIsFile(dirent)
}
@@ -70,13 +69,13 @@ async function mapWithStatConcurrency(items, mapper) {
}
const workerCount = Math.min(FS_READDIR_STAT_CONCURRENCY, items.length)
const workers = Array.from({ length: workerCount }, () => runWorker())
const workers = Array.from({ length: workerCount } as any, () => runWorker())
await Promise.all(workers)
return results
}
async function readDirForIpc(dirPath, options = {}) {
async function readDirForIpc(dirPath, options: any = {}) {
const fsImpl = options.fs || fs
let resolved
@@ -102,6 +101,4 @@ async function readDirForIpc(dirPath, options = {}) {
}
}
module.exports = {
readDirForIpc
}
export { readDirForIpc }
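`mapWithStatConcurrency` caps the number of in-flight `stat` calls by starting a fixed number of workers that pull from a shared queue. The diff elides the worker body, so the following is only a sketch of that pattern under assumed names (`mapWithConcurrency`, `runWorker`, a shared `next` cursor), not the file's actual implementation:

```typescript
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  mapper: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  const results = new Array<R>(items.length)
  let next = 0

  // Each worker claims the next unprocessed index until none are left.
  // Claiming is synchronous, so two workers never take the same index.
  async function runWorker(): Promise<void> {
    while (next < items.length) {
      const index = next
      next += 1
      results[index] = await mapper(items[index], index)
    }
  }

  const workerCount = Math.min(Math.max(1, limit), items.length)
  await Promise.all(Array.from({ length: workerCount }, () => runWorker()))

  // Results are written by index, so output order matches input order.
  return results
}
```

Because results are stored by index rather than pushed on completion, the output stays in input order even when individual calls finish out of order. In this sketch a rejected `mapper` call rejects the whole result; the tests above suggest the real helper instead tolerates per-entry `ENOENT` failures inside its mapper.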

@@ -9,16 +9,20 @@
* outcome (open, frame, error, early close, never-opens) without a network.
*/
const test = require('node:test')
const assert = require('node:assert/strict')
import assert from 'node:assert/strict'
import test from 'node:test'
const { probeGatewayWebSocket } = require('./gateway-ws-probe.cjs')
import { probeGatewayWebSocket } from './gateway-ws-probe'
// Minimal WebSocket double: records listeners synchronously (the probe attaches
// them in its executor) and exposes emit() so the test can replay events.
function makeFakeWs() {
function makeFakeWs(): { FakeWs: new (url: string) => any; instances: any[] } {
const instances = []
class FakeWs {
url: string
closed = false
listeners: Record<string, any[]> = {}
constructor(url) {
this.url = url
this.listeners = {}
@@ -32,9 +36,12 @@ function makeFakeWs() {
this.closed = true
}
emit(type, event) {
for (const fn of this.listeners[type] || []) fn(event)
for (const fn of this.listeners[type] || []) {
fn(event)
}
}
}
return { FakeWs, instances }
}
@@ -51,11 +58,13 @@ test('probe resolves ok when the socket opens and stays open', async () => {
test('probe resolves ok immediately when a frame arrives', async () => {
const { FakeWs, instances } = makeFakeWs()
const promise = probeGatewayWebSocket('ws://host/api/ws?token=t', {
WebSocketImpl: FakeWs,
connectTimeoutMs: 1_000,
readyGraceMs: 10_000 // long grace: success must come from the frame, not the timer
})
instances[0].emit('open')
instances[0].emit('message', { data: '{"jsonrpc":"2.0"}' })
const result = await promise
@@ -95,11 +104,13 @@ test('probe fails when the gateway accepts then immediately closes (auth rejecte
test('probe times out when the socket never opens', async () => {
const { FakeWs } = makeFakeWs()
const result = await probeGatewayWebSocket('ws://host/api/ws?token=t', {
WebSocketImpl: FakeWs,
connectTimeoutMs: 20,
readyGraceMs: 10
})
assert.equal(result.ok, false)
assert.match(result.reason, /Timed out/)
})

@@ -36,13 +36,13 @@ const DEFAULT_READY_GRACE_MS = 750
* Attempt a live WebSocket connection and classify the outcome.
*
* @param {string} wsUrl - Fully-formed ws(s):// URL including the credential.
* @param {object} [options]
* @param {new (url: string) => any} [options.WebSocketImpl] - WebSocket ctor.
* @param {number} [options.connectTimeoutMs]
* @param {number} [options.readyGraceMs]
* @returns {Promise<{ ok: boolean, reason?: string }>}
*/
function probeGatewayWebSocket(wsUrl, options = {}) {
function probeGatewayWebSocket<T>(wsUrl: string, options:{
WebSocketImpl?: any,
connectTimeoutMs?: number
readyGraceMs?: number
} = {}) {
const WebSocketImpl = options.WebSocketImpl
const connectTimeoutMs = options.connectTimeoutMs ?? DEFAULT_CONNECT_TIMEOUT_MS
const readyGraceMs = options.readyGraceMs ?? DEFAULT_READY_GRACE_MS
@@ -54,7 +54,7 @@ function probeGatewayWebSocket(wsUrl, options = {}) {
})
}
return new Promise(resolve => {
return new Promise<any>(resolve => {
let settled = false
let opened = false
let connectTimer = null
@@ -66,6 +66,7 @@ function probeGatewayWebSocket(wsUrl, options = {}) {
clearTimeout(connectTimer)
connectTimer = null
}
if (graceTimer !== null) {
clearTimeout(graceTimer)
graceTimer = null
@@ -73,14 +74,16 @@ function probeGatewayWebSocket(wsUrl, options = {}) {
}
const finish = result => {
if (settled) return
if (settled) {return}
settled = true
clearTimers()
try {
socket?.close?.()
} catch {
// ignore — best effort teardown
}
resolve(result)
}
@@ -91,11 +94,12 @@ function probeGatewayWebSocket(wsUrl, options = {}) {
ok: false,
reason: error instanceof Error ? error.message : String(error)
})
return
}
const onOpen = () => {
if (settled) return
if (settled) {return}
opened = true
// Upgrade accepted. Give the server a brief window to reject the
// credential post-handshake (early close) before declaring success.
@@ -118,7 +122,8 @@ function probeGatewayWebSocket(wsUrl, options = {}) {
}
const onClose = event => {
if (settled) return
if (settled) {return}
if (opened) {
// Opened, then closed inside the grace window: the upgrade was accepted
// but the session was refused (e.g. ws-ticket/token rejected, or a
@@ -127,8 +132,10 @@ function probeGatewayWebSocket(wsUrl, options = {}) {
ok: false,
reason: closeReason(event, 'The gateway accepted the connection then closed it (credential rejected?).')
})
return
}
finish({
ok: false,
reason: closeReason(event, 'The gateway closed the WebSocket before it opened.')
@@ -154,8 +161,10 @@ function probeGatewayWebSocket(wsUrl, options = {}) {
function addListener(socket, type, handler) {
if (typeof socket.addEventListener === 'function') {
socket.addEventListener(type, handler)
return
}
// Node's global WebSocket implements addEventListener; this fallback keeps the
// helper usable with the `ws` package's EventEmitter shape too.
if (typeof socket.on === 'function') {
@@ -164,25 +173,31 @@ function addListener(socket, type, handler) {
}
function extractErrorReason(event) {
if (!event) return ''
if (event instanceof Error) return event.message
if (!event) {return ''}
if (event instanceof Error) {return event.message}
const err = event.error || event.message
if (err instanceof Error) return err.message
if (typeof err === 'string') return err
if (err instanceof Error) {return err.message}
if (typeof err === 'string') {return err}
return ''
}
function closeReason(event, fallback) {
const code = event && typeof event.code === 'number' ? event.code : null
const reason = event && typeof event.reason === 'string' ? event.reason.trim() : ''
if (code && reason) return `${fallback} (code ${code}: ${reason})`
if (code) return `${fallback} (code ${code})`
if (reason) return `${fallback} (${reason})`
if (code && reason) {return `${fallback} (code ${code}: ${reason})`}
if (code) {return `${fallback} (code ${code})`}
if (reason) {return `${fallback} (${reason})`}
return fallback
}
module.exports = {
DEFAULT_CONNECT_TIMEOUT_MS,
export { DEFAULT_CONNECT_TIMEOUT_MS,
DEFAULT_READY_GRACE_MS,
probeGatewayWebSocket
}
probeGatewayWebSocket }
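The JSDoc for `probeGatewayWebSocket` documents the result as `Promise<{ ok: boolean, reason?: string }>`, while the migrated signature resolves `Promise<any>` and carries an unused type parameter. A possible tightening is sketched below with hypothetical names (`ProbeResult`, `CloseLike`, `describeClose`, `failedClose`); the formatting logic mirrors `closeReason` from the diff, and the two fallback messages are the ones `onClose` uses.

```typescript
interface ProbeResult {
  ok: boolean
  reason?: string
}

// Only the two close-event fields the probe reads; both are untrusted.
interface CloseLike {
  code?: unknown
  reason?: unknown
}

function describeClose(event: CloseLike | null | undefined, fallback: string): string {
  const code = event && typeof event.code === 'number' ? event.code : null
  const reason = event && typeof event.reason === 'string' ? event.reason.trim() : ''

  if (code && reason) {
    return `${fallback} (code ${code}: ${reason})`
  }

  if (code) {
    return `${fallback} (code ${code})`
  }

  if (reason) {
    return `${fallback} (${reason})`
  }

  return fallback
}

// Classify a close event the way onClose does: a close after the socket
// opened points at a rejected credential, a close before it at the handshake.
function failedClose(event: CloseLike | null | undefined, opened: boolean): ProbeResult {
  const fallback = opened
    ? 'The gateway accepted the connection then closed it (credential rejected?).'
    : 'The gateway closed the WebSocket before it opened.'

  return { ok: false, reason: describeClose(event, fallback) }
}
```

With a result type like this, the promise inside the probe could be declared as `new Promise<ProbeResult>(...)`, and test code reading `result.ok` and `result.reason` would be type-checked.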

@@ -1,14 +1,12 @@
'use strict'
// Repo-first discovery: walk bounded roots for git repos using only Node's `fs`
// — no native addon, so it just works for anyone who pulls main (no
// electron-rebuild). Mirrors how GitHub Desktop scans: stop at the first `.git`
// (don't descend into a repo), cap depth, and skip heavy non-repo trees so the
// first scan stays fast. Results are cached by the backend after the first run.
const fs = require('node:fs')
const os = require('node:os')
const path = require('node:path')
import fs from 'node:fs'
import os from 'node:os'
import path from 'node:path'
const fsp = fs.promises
@@ -36,14 +34,14 @@ async function mapLimit(items, limit, fn) {
}
}
await Promise.all(Array.from({ length: Math.min(limit, items.length) }, worker))
await Promise.all(Array.from({ length: Math.min(limit, items.length) } as any, worker))
}
/**
* Scan `roots` (default: the home dir) for git repositories. Returns deduped
* `{ root, label }` entries. `options.maxDepth` caps recursion (default 3).
*/
async function scanGitRepos(roots, options = {}) {
async function scanGitRepos(roots, options: any = {}) {
const maxDepth = Number(options.maxDepth) || DEFAULT_MAX_DEPTH
const searchRoots = Array.isArray(roots) && roots.length > 0 ? roots : [os.homedir()]
const found = new Map()
@@ -54,6 +52,7 @@ async function scanGitRepos(roots, options = {}) {
}
let entries
try {
entries = await fsp.readdir(dir, { withFileTypes: true })
} catch {
@@ -73,6 +72,7 @@ async function scanGitRepos(roots, options = {}) {
}
const subdirs = []
for (const entry of entries) {
// Real directories only (skip symlinks to avoid loops), no hidden dirs, no
// known heavy trees.
@@ -93,4 +93,4 @@ async function scanGitRepos(roots, options = {}) {
return [...found.entries()].map(([root, label]) => ({ label, root }))
}
module.exports = { scanGitRepos }
export { scanGitRepos }

@@ -1,9 +1,7 @@
'use strict'
import assert from 'node:assert/strict'
import test from 'node:test'
const assert = require('node:assert/strict')
const test = require('node:test')
const { resolveRenamePath } = require('./git-review-ops.cjs')
import { resolveRenamePath } from './git-review-ops'
test('resolveRenamePath: plain path is unchanged', () => {
assert.equal(resolveRenamePath('src/a.ts'), 'src/a.ts')

@@ -1,18 +1,38 @@
'use strict'
// Git ops backing the coding rail + Codex-style review pane. Built on `simple-git`
// (a maintained wrapper around the system git binary — same git the rest of the
// app shells to, no native build) so we read structured status()/diffSummary()
// results instead of hand-parsing porcelain. Reads degrade to null/empty on a
// non-repo / remote backend; mutations reject so the renderer can toast.
const { execFile } = require('node:child_process')
const fs = require('node:fs/promises')
const path = require('node:path')
import { execFile } from 'node:child_process'
import fs from 'node:fs/promises'
import path from 'node:path'
const simpleGit = require('simple-git')
import simpleGitFn from 'simple-git'
const { resolveRequestedPathForIpc } = require('./hardening.cjs')
import { resolveRequestedPathForIpc } from './hardening'
// `simple-git` is a pure-JS runtime dep that workspace dedup hoists into the
// repo-root node_modules. Packaged builds set `files:` in package.json, which
// excludes node_modules from the asar, so a normal import fails at launch
// (issue #52735: "Cannot find module 'simple-git'"). We ship the dep's
// closure under resources/native-deps/vendor/node_modules/ via extraResources
// + scripts/stage-native-deps.mjs, and resolve from there when the hoisted
// import isn't reachable. The `vendor/` nesting matters: electron-builder
// drops a node_modules dir at the root of an extraResources copy but keeps a
// nested one. Dev mode never hits the fallback -- Node's normal lookup finds
// the hoisted copy.
let simpleGit = simpleGitFn
if (!simpleGit) {
const resourcesPath = (process as any).resourcesPath
if (!resourcesPath) {
throw new Error("git-review IPC: 'simple-git' not found and no resourcesPath to fall back to")
}
simpleGit = require(path.join(resourcesPath, 'native-deps', 'vendor', 'node_modules', 'simple-git'))
}
const COMMIT_CONTEXT_DIFF_MAX_CHARS = 120_000
const COMMIT_CONTEXT_UNTRACKED_MAX = 80
@@ -33,7 +53,7 @@ function ghEnv(ghBin) {
// Run the `gh` CLI in a repo. Resolves { ok, stdout } so callers branch on
// availability/auth without a throw. gh missing/unauthed → ok:false.
function runGh(args, cwd, ghBin) {
function runGh(args, cwd, ghBin): Promise<{ok: boolean, stdout: string}> {
return new Promise(resolve => {
execFile(
ghBin || 'gh',
@@ -241,10 +261,11 @@ async function reviewList(repoPath, scope, baseRef, gitBin) {
const range = scope === 'branch' ? `${base}...HEAD` : base
const summary = await git.diffSummary([range])
const files = summary.files.map(file => ({
path: resolveRenamePath(file.file),
added: file.binary ? 0 : file.insertions,
removed: file.binary ? 0 : file.deletions,
added: 'insertions' in file ? file.insertions : 0,
removed: 'deletions' in file ? file.deletions : 0,
status: 'M',
staged: false
}))
@@ -272,6 +293,7 @@ async function reviewList(repoPath, scope, baseRef, gitBin) {
git.diffSummary(['--cached']),
git.diffSummary([])
])
const stagedCounts = countsByPath(staged)
const unstagedCounts = countsByPath(unstaged)
@@ -476,6 +498,7 @@ async function reviewCommitContext(repoPath, gitBin) {
const safe = args => git.diff(args).catch(() => '')
let status
try {
status = await git.status()
} catch {
@@ -491,9 +514,11 @@ async function reviewCommitContext(repoPath, gitBin) {
// Untracked files have no diff — list them so new files aren't invisible.
const untracked = status.not_added || []
if (untracked.length > 0) {
const visible = untracked.slice(0, COMMIT_CONTEXT_UNTRACKED_MAX)
const omitted = untracked.length - visible.length
const note =
`\n# New (untracked) files:\n${visible.map(p => `# ${p}`).join('\n')}\n` +
(omitted > 0 ? `# ... ${omitted} more omitted\n` : '')
@@ -588,6 +613,7 @@ async function repoStatus(repoPath, gitBin) {
// fail soft and hide the coding rail instead of spamming IPC handler errors.
try {
const stat = await fs.stat(cwd)
if (!stat.isDirectory()) {
return null
}
@@ -596,11 +622,13 @@ async function repoStatus(repoPath, gitBin) {
}
let git
try {
git = gitFor(cwd, gitBin)
} catch {
return null
}
let status
try {
@@ -611,6 +639,7 @@ async function repoStatus(repoPath, gitBin) {
}
const detached = typeof status.detached === 'boolean' ? status.detached : !status.current
const files = status.files.map(file => ({
path: file.path,
staged: isStaged(file),
@@ -652,10 +681,12 @@ async function repoStatus(repoPath, gitBin) {
// can't stall the probe.
try {
const untracked = status.not_added.slice(0, 500)
for (let i = 0; i < untracked.length; i += UNTRACKED_LINE_COUNT_CONCURRENCY) {
const batch = await Promise.all(
untracked.slice(i, i + UNTRACKED_LINE_COUNT_CONCURRENCY).map(path => untrackedInsertions(cwd, path))
)
result.added += batch.reduce((sum, n) => sum + n, 0)
}
} catch {
@@ -665,8 +696,7 @@ async function repoStatus(repoPath, gitBin) {
return result
}
module.exports = {
branchBase,
export { branchBase,
fileDiffVsHead,
repoStatus,
resolveRenamePath,
@@ -676,9 +706,8 @@ module.exports = {
reviewDiff,
reviewList,
reviewPush,
reviewRevParse,
reviewRevert,
reviewRevParse,
reviewShipInfo,
reviewStage,
reviewUnstage
}
reviewUnstage }
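
Review note on the `simple-git` fallback in this file: as far as I can tell, a static `import` either resolves or throws before the module body runs, so the `if (!simpleGit)` check is unlikely to ever see a missing module. A minimal sketch of a load-then-fall-back helper follows. `loadWithFallback` and the `Loader` type are hypothetical names for illustration, not part of this change; the real code would pass `require` (or an equivalent) as the loaders.

```typescript
// Hypothetical helper, not part of the diff: try the normal lookup first and
// only fall back to the vendored location when that lookup throws.
type Loader = (id: string) => unknown

function loadWithFallback(id: string, primary: Loader, fallback: Loader | null): unknown {
  try {
    // Dev mode: Node's normal resolution finds the hoisted copy.
    return primary(id)
  } catch {
    if (!fallback) {
      // Mirrors the diff's error path when no resourcesPath is available.
      throw new Error(`'${id}' not found and no fallback location is available`)
    }
    // Packaged build: resolve from the vendored node_modules instead.
    return fallback(id)
  }
}
```

In the packaged app the fallback loader would be built only when `process.resourcesPath` is set, which matches the error message the diff already throws.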

View File

@@ -1,13 +1,11 @@
'use strict'
import assert from 'node:assert/strict'
import fs from 'node:fs'
import os from 'node:os'
import path from 'node:path'
import test from 'node:test'
import { pathToFileURL } from 'node:url'
const assert = require('node:assert/strict')
const fs = require('node:fs')
const os = require('node:os')
const path = require('node:path')
const test = require('node:test')
const { pathToFileURL } = require('node:url')
const { gitRootForIpc } = require('./git-root.cjs')
import { gitRootForIpc } from './git-root'
function mkTmpDir() {
return fs.mkdtempSync(path.join(os.tmpdir(), 'hermes-git-root-'))

View File

@@ -1,8 +1,7 @@
'use strict'
import fs from 'node:fs'
import path from 'node:path'
const fs = require('node:fs')
const path = require('node:path')
const { resolveRequestedPathForIpc } = require('./hardening.cjs')
import { resolveRequestedPathForIpc } from './hardening'
function findGitRoot(start, fsImpl = fs) {
let dir = start
@@ -28,7 +27,7 @@ function findGitRoot(start, fsImpl = fs) {
return null
}
async function gitRootForIpc(startPath, options = {}) {
async function gitRootForIpc(startPath, options: {fs?: typeof fs} = {}) {
const fsImpl = options.fs || fs
let resolved
@@ -48,7 +47,5 @@ async function gitRootForIpc(startPath, options = {}) {
}
}
module.exports = {
findGitRoot,
gitRootForIpc
}
export { findGitRoot,
gitRootForIpc }

View File

@@ -1,20 +1,16 @@
'use strict'
import assert from 'node:assert/strict'
import { execFileSync } from 'node:child_process'
import fs from 'node:fs'
import os from 'node:os'
import path from 'node:path'
import test from 'node:test'
const assert = require('node:assert/strict')
const { execFileSync } = require('node:child_process')
const fs = require('node:fs')
const os = require('node:os')
const path = require('node:path')
const test = require('node:test')
const {
addWorktree,
import { addWorktree,
ensureGitRepo,
listBranches,
parseWorktrees,
sanitizeBranch,
switchBranch
} = require('./git-worktree-ops.cjs')
switchBranch } from './git-worktree-ops'
test('sanitizeBranch: spaces → hyphens, forbidden chars dropped, edges trimmed', () => {
assert.equal(sanitizeBranch('beach vibes'), 'beach-vibes')

View File

@@ -1,16 +1,14 @@
'use strict'
// Git-driven worktree operations for the desktop "Start work" flow: spin up a
// fresh worktree the lightest way (`git worktree add -b`), list real worktrees,
// and remove them. Git is the source of truth; the renderer just drives these.
const path = require('node:path')
const fs = require('node:fs')
const { execFile } = require('node:child_process')
import { execFile } from 'node:child_process'
import fs from 'node:fs'
import path from 'node:path'
const { resolveRequestedPathForIpc } = require('./hardening.cjs')
import { resolveRequestedPathForIpc } from './hardening'
function runGit(gitBin, args, cwd) {
function runGit(gitBin, args, cwd): Promise<string> {
return new Promise((resolve, reject) => {
execFile(
gitBin,
@@ -306,6 +304,7 @@ async function listBranches(repoPath, gitBin) {
['for-each-ref', '--format=%(refname:short)', '--sort=-committerdate', 'refs/heads'],
resolved
)
const trees = await listWorktrees(resolved, gitBin)
const pathByBranch = new Map(trees.filter(tree => tree.branch).map(tree => [tree.branch, tree.path]))
const trunk = await defaultBranch(gitBin, resolved)
@@ -338,13 +337,11 @@ async function switchBranch(repoPath, branch, gitBin) {
return { branch: target }
}
module.exports = {
addWorktree,
export { addWorktree,
ensureGitRepo,
listBranches,
listWorktrees,
parseWorktrees,
removeWorktree,
sanitizeBranch,
switchBranch
}
switchBranch }

View File

@@ -1,23 +1,22 @@
const assert = require('node:assert/strict')
const fs = require('node:fs')
const os = require('node:os')
const path = require('node:path')
const test = require('node:test')
const { pathToFileURL } = require('node:url')
import assert from 'node:assert/strict'
import fs from 'node:fs'
import os from 'node:os'
import path from 'node:path'
import test from 'node:test'
import { pathToFileURL } from 'node:url'
const {
DEFAULT_FETCH_TIMEOUT_MS,
import { DEFAULT_FETCH_TIMEOUT_MS,
encryptDesktopSecret,
resolveDirectoryForIpc,
resolveReadableFileForIpc,
resolveRequestedPathForIpc,
resolveTimeoutMs,
sensitiveFileBlockReason
} = require('./hardening.cjs')
sensitiveFileBlockReason } from './hardening'
async function rejectsWithCode(promise, code) {
await assert.rejects(promise, error => {
async function rejectsWithCode(promise, code: string) {
await assert.rejects(promise, (error: any) => {
assert.equal(error?.code, code)
return true
})
}
@@ -76,8 +75,9 @@ test('path helpers reject blank non-string NUL and Windows device syntax', async
for (const devicePath of devicePaths) {
assert.throws(
() => resolveRequestedPathForIpc(devicePath, { purpose: 'File preview' }),
error => {
(error: any) => {
assert.equal(error?.code, 'device-path')
return true
}
)
@@ -86,8 +86,9 @@ test('path helpers reject blank non-string NUL and Windows device syntax', async
assert.throws(
() => resolveRequestedPathForIpc('file:///%E0%A4%A', { purpose: 'File preview' }),
error => {
(error: any) => {
assert.equal(error?.code, 'invalid-path')
return true
}
)
@@ -131,19 +132,23 @@ test('resolveReadableFileForIpc validates existence type size and sensitivity',
maxBytes: 256,
purpose: 'File preview'
})
assert.equal(fromRelative.resolvedPath, textPath)
assert.equal(fromRelative.stat.size, 11)
const fromFileUrl = await resolveReadableFileForIpc(pathToFileURL(textPath).toString(), {
purpose: 'File preview'
})
assert.equal(fromFileUrl.resolvedPath, textPath)
const spacedPath = path.join(tempDir, 'notes with spaces.txt')
fs.writeFileSync(spacedPath, 'space ok', 'utf8')
const fromSpacedFileUrl = await resolveReadableFileForIpc(pathToFileURL(spacedPath).toString(), {
purpose: 'File preview'
})
assert.equal(fromSpacedFileUrl.resolvedPath, spacedPath)
await assert.rejects(
@@ -184,9 +189,11 @@ test('resolveReadableFileForIpc validates existence type size and sensitivity',
const envTemplatePath = path.join(tempDir, '.env.example')
fs.writeFileSync(envTemplatePath, 'EXAMPLE_TOKEN=value', 'utf8')
const envTemplate = await resolveReadableFileForIpc(envTemplatePath, {
purpose: 'File preview'
})
assert.equal(envTemplate.resolvedPath, envTemplatePath)
})
@@ -229,8 +236,10 @@ test('resolveReadableFileForIpc blocks symlinks whose realpath is sensitive', as
} catch (error) {
if (error?.code === 'EPERM' || error?.code === 'EACCES') {
t.skip(`symlink creation is not permitted on this platform (${error.code})`)
return
}
throw error
}
@@ -268,8 +277,10 @@ test('resolveDirectoryForIpc accepts directory symlinks or junctions', async t =
} catch (error) {
if (error?.code === 'EPERM' || error?.code === 'EACCES') {
t.skip(`directory symlink creation is not permitted on this platform (${error.code})`)
return
}
throw error
}

View File

@@ -1,7 +1,7 @@
const fs = require('node:fs')
const os = require('node:os')
const path = require('node:path')
const { fileURLToPath } = require('node:url')
import fs from 'node:fs'
import os from 'node:os'
import path from 'node:path'
import { fileURLToPath } from 'node:url'
const DEFAULT_FETCH_TIMEOUT_MS = 15_000
const DATA_URL_READ_MAX_BYTES = 16 * 1024 * 1024
@@ -13,6 +13,7 @@ const SENSITIVE_EXTENSIONS = new Set(['.kdbx', '.p12', '.pem', '.pfx'])
function resolveTimeoutMs(timeoutMs, fallbackMs = DEFAULT_FETCH_TIMEOUT_MS) {
const fallback =
Number.isFinite(fallbackMs) && Number(fallbackMs) > 0 ? Math.round(Number(fallbackMs)) : DEFAULT_FETCH_TIMEOUT_MS
const parsed = Number(timeoutMs)
if (Number.isFinite(parsed) && parsed > 0) {
@@ -62,6 +63,7 @@ function sensitiveFileBlockReason(filePath) {
const normalized = String(filePath || '')
.replace(/\\/g, '/')
.toLowerCase()
const basename = path.basename(normalized)
const ext = path.extname(basename)
@@ -87,6 +89,7 @@ function sensitiveFileBlockReason(filePath) {
if (basename.startsWith('.env.')) {
const suffix = basename.slice('.env.'.length)
if (!SAFE_ENV_SUFFIXES.has(suffix)) {
return `${basename} is blocked because it appears to contain environment secrets.`
}
@@ -107,9 +110,10 @@ function sensitiveFileBlockReason(filePath) {
return null
}
function ipcPathError(code, message) {
const error = new Error(message)
error.code = code
function ipcPathError(code: any, message: string): Error & {code: any} {
const error = new Error(message) as Error & {code: any}
error.code = code
return error
}
@@ -129,6 +133,7 @@ function rejectUnsafePathSyntax(filePath, purpose = 'File read') {
}
const normalized = raw.replace(/\\/g, '/').toLowerCase()
if (
normalized.startsWith('//?/') ||
normalized.startsWith('//./') ||
@@ -141,7 +146,7 @@ function rejectUnsafePathSyntax(filePath, purpose = 'File read') {
return raw
}
function resolveRequestedPathForIpc(filePath, options = {}) {
function resolveRequestedPathForIpc(filePath, options: {purpose?: string, baseDir?: fs.PathOrFileDescriptor} = {}) {
const purpose = String(options.purpose || 'File read')
let raw = rejectUnsafePathSyntax(filePath, purpose)
@@ -154,17 +159,21 @@ function resolveRequestedPathForIpc(filePath, options = {}) {
if (/^file:/i.test(raw)) {
let resolvedPath
try {
const parsed = new URL(raw)
if (parsed.protocol !== 'file:') {
throw new Error('not a file URL')
}
resolvedPath = fileURLToPath(parsed)
} catch {
throw ipcPathError('invalid-path', `${purpose} failed: file URL is invalid.`)
}
rejectUnsafePathSyntax(resolvedPath, purpose)
return path.resolve(resolvedPath)
}
@@ -178,14 +187,16 @@ function resolveRequestedPathForIpc(filePath, options = {}) {
return resolvedPath
}
async function statForIpc(fsImpl, resolvedPath, purpose, typeLabel) {
async function statForIpc(fsImpl: {promises: {stat: typeof fs.promises.stat}}, resolvedPath, purpose, typeLabel) {
try {
return await fsImpl.promises.stat(resolvedPath)
} catch (error) {
const code = error && typeof error === 'object' ? error.code : ''
if (code === 'ENOENT' || code === 'ENOTDIR') {
throw ipcPathError(code || 'ENOENT', `${purpose} failed: ${typeLabel} does not exist.`)
}
throw ipcPathError(
code || 'read-error',
`${purpose} failed: ${error instanceof Error ? error.message : String(error)}`
@@ -201,6 +212,7 @@ async function realpathForIpc(fsImpl, resolvedPath, purpose) {
try {
const realPath = await fsImpl.promises.realpath(resolvedPath)
rejectUnsafePathSyntax(realPath, purpose)
return realPath
} catch (error) {
const code = error && typeof error === 'object' ? error.code : ''
@@ -213,12 +225,13 @@ async function realpathForIpc(fsImpl, resolvedPath, purpose) {
function rejectSensitiveFilePath(filePath, purpose) {
const blockReason = sensitiveFileBlockReason(filePath)
if (blockReason) {
throw ipcPathError('sensitive-file', `${purpose} blocked for sensitive file: ${blockReason}`)
}
}
async function resolveDirectoryForIpc(dirPath, options = {}) {
async function resolveDirectoryForIpc(dirPath, options: {purpose?: string, baseDir?: fs.PathOrFileDescriptor, fs?: {promises: {stat: typeof fs.promises.stat}}} = {}) {
const purpose = String(options.purpose || 'Directory read')
const fsImpl = options.fs || fs
const resolvedPath = resolveRequestedPathForIpc(dirPath, { baseDir: options.baseDir, purpose })
@@ -233,7 +246,7 @@ async function resolveDirectoryForIpc(dirPath, options = {}) {
return { realPath, resolvedPath, stat }
}
async function resolveReadableFileForIpc(filePath, options = {}) {
async function resolveReadableFileForIpc(filePath, options: {purpose?: string, baseDir?: fs.PathOrFileDescriptor, fs?: typeof fs, blockSensitive?: boolean, maxBytes?: number} = {}) {
const purpose = String(options.purpose || 'File read')
const fsImpl = options.fs || fs
const resolvedPath = resolveRequestedPathForIpc(filePath, { baseDir: options.baseDir, purpose })
@@ -253,11 +266,13 @@ async function resolveReadableFileForIpc(filePath, options = {}) {
}
const realPath = await realpathForIpc(fsImpl, resolvedPath, purpose)
if (options.blockSensitive !== false) {
rejectSensitiveFilePath(realPath, purpose)
}
const maxBytes = Number.isFinite(options.maxBytes) && Number(options.maxBytes) > 0 ? Number(options.maxBytes) : null
if (maxBytes && stat.size > maxBytes) {
throw ipcPathError('EFBIG', `${purpose} failed: file is too large (${stat.size} bytes; limit ${maxBytes} bytes).`)
}
@@ -271,15 +286,13 @@ async function resolveReadableFileForIpc(filePath, options = {}) {
return { realPath, resolvedPath, stat }
}
module.exports = {
DATA_URL_READ_MAX_BYTES,
export { DATA_URL_READ_MAX_BYTES,
DEFAULT_FETCH_TIMEOUT_MS,
TEXT_PREVIEW_SOURCE_MAX_BYTES,
encryptDesktopSecret,
rejectUnsafePathSyntax,
resolveDirectoryForIpc,
resolveReadableFileForIpc,
resolveRequestedPathForIpc,
resolveTimeoutMs,
sensitiveFileBlockReason
}
sensitiveFileBlockReason,
TEXT_PREVIEW_SOURCE_MAX_BYTES }
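
Review note on `ipcPathError` and the `catch (error)` blocks in this file: once the code is type-checked, reading `error.code` from a caught value may need narrowing. The sketch below shows one possible shape. `CodedError` and `errorCode` are hypothetical names, and typing `code` as `string` is an assumption (the diff uses `any`).

```typescript
// Assumption: codes are strings such as 'invalid-path' or 'ENOENT'.
type CodedError = Error & { code: string }

// Same behaviour as the diff's ipcPathError, without a cast or a second statement.
function ipcPathError(code: string, message: string): CodedError {
  return Object.assign(new Error(message), { code })
}

// Hypothetical helper: read `.code` from an unknown caught value, or '' if absent.
function errorCode(error: unknown): string {
  if (error && typeof error === 'object' && 'code' in error) {
    const code = (error as { code: unknown }).code
    return typeof code === 'string' ? code : ''
  }
  return ''
}
```

With a helper like this, `statForIpc` and `realpathForIpc` could call `errorCode(error)` instead of repeating the `error && typeof error === 'object'` check inline.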

View File

@@ -1,10 +1,11 @@
const assert = require('node:assert/strict')
const test = require('node:test')
import assert from 'node:assert/strict'
import test from 'node:test'
const { createLinkTitleWindow, linkTitleWindowOptions } = require('./link-title-window.cjs')
import { createLinkTitleWindow, linkTitleWindowOptions } from './link-title-window'
function makeFakeBrowserWindow() {
const calls = { audioMuted: [] }
const FakeBrowserWindow = function (options) {
this.options = options
this.webContents = {

View File

@@ -1,5 +1,3 @@
'use strict'
// Hidden BrowserWindow used by tier-2 link-title resolution: when curl can't
// read a page <title> (bot walls, JS-rendered pages), we briefly load the URL
// in an offscreen window and read its title. That window loads arbitrary
@@ -39,4 +37,4 @@ function createLinkTitleWindow(BrowserWindow, partitionSession) {
return window
}
module.exports = { createLinkTitleWindow, linkTitleWindowOptions }
export { createLinkTitleWindow, linkTitleWindowOptions }

View File

@@ -4,10 +4,10 @@
* Run with: node --test electron/oauth-net-request.test.cjs
*/
const test = require('node:test')
const assert = require('node:assert/strict')
import assert from 'node:assert/strict'
import test from 'node:test'
const { serializeJsonBody, setJsonRequestHeaders } = require('./oauth-net-request.cjs')
import { serializeJsonBody, setJsonRequestHeaders } from './oauth-net-request'
test('serializeJsonBody returns undefined for absent bodies', () => {
assert.equal(serializeJsonBody(undefined), undefined)
@@ -21,6 +21,7 @@ test('serializeJsonBody JSON-encodes request bodies', () => {
test('setJsonRequestHeaders does not set Electron-restricted Content-Length', () => {
const headers = []
const request = {
setHeader(name, value) {
headers.push([name, value])

View File

@@ -14,7 +14,5 @@ function setJsonRequestHeaders(request) {
request.setHeader('Content-Type', 'application/json')
}
module.exports = {
serializeJsonBody,
setJsonRequestHeaders
}
export { serializeJsonBody,
setJsonRequestHeaders }

View File

@@ -1,4 +1,4 @@
const { contextBridge, ipcRenderer, webUtils } = require('electron')
import { contextBridge, ipcRenderer, webUtils } from 'electron'
contextBridge.exposeInMainWorld('hermesDesktop', {
getConnection: profile => ipcRenderer.invoke('hermes:connection', profile),
@@ -24,12 +24,14 @@ contextBridge.exposeInMainWorld('hermesDesktop', {
onState: callback => {
const listener = (_event, payload) => callback(payload)
ipcRenderer.on('hermes:pet-overlay:state', listener)
return () => ipcRenderer.removeListener('hermes:pet-overlay:state', listener)
},
// Main renderer subscribes to overlay control messages.
onControl: callback => {
const listener = (_event, payload) => callback(payload)
ipcRenderer.on('hermes:pet-overlay:control', listener)
return () => ipcRenderer.removeListener('hermes:pet-overlay:control', listener)
}
},
@@ -120,64 +122,76 @@ contextBridge.exposeInMainWorld('hermesDesktop', {
const channel = `hermes:terminal:${id}:data`
const listener = (_event, payload) => callback(payload)
ipcRenderer.on(channel, listener)
return () => ipcRenderer.removeListener(channel, listener)
},
onExit: (id, callback) => {
const channel = `hermes:terminal:${id}:exit`
const listener = (_event, payload) => callback(payload)
ipcRenderer.on(channel, listener)
return () => ipcRenderer.removeListener(channel, listener)
}
},
onClosePreviewRequested: callback => {
const listener = () => callback()
ipcRenderer.on('hermes:close-preview-requested', listener)
return () => ipcRenderer.removeListener('hermes:close-preview-requested', listener)
},
onOpenUpdatesRequested: callback => {
const listener = () => callback()
ipcRenderer.on('hermes:open-updates', listener)
return () => ipcRenderer.removeListener('hermes:open-updates', listener)
},
onDeepLink: callback => {
const listener = (_event, payload) => callback(payload)
ipcRenderer.on('hermes:deep-link', listener)
return () => ipcRenderer.removeListener('hermes:deep-link', listener)
},
signalDeepLinkReady: () => ipcRenderer.invoke('hermes:deep-link-ready'),
onWindowStateChanged: callback => {
const listener = (_event, payload) => callback(payload)
ipcRenderer.on('hermes:window-state-changed', listener)
return () => ipcRenderer.removeListener('hermes:window-state-changed', listener)
},
onFocusSession: callback => {
const listener = (_event, sessionId) => callback(sessionId)
ipcRenderer.on('hermes:focus-session', listener)
return () => ipcRenderer.removeListener('hermes:focus-session', listener)
},
onNotificationAction: callback => {
const listener = (_event, payload) => callback(payload)
ipcRenderer.on('hermes:notification-action', listener)
return () => ipcRenderer.removeListener('hermes:notification-action', listener)
},
onPreviewFileChanged: callback => {
const listener = (_event, payload) => callback(payload)
ipcRenderer.on('hermes:preview-file-changed', listener)
return () => ipcRenderer.removeListener('hermes:preview-file-changed', listener)
},
onBackendExit: callback => {
const listener = (_event, payload) => callback(payload)
ipcRenderer.on('hermes:backend-exit', listener)
return () => ipcRenderer.removeListener('hermes:backend-exit', listener)
},
onPowerResume: callback => {
const listener = () => callback()
ipcRenderer.on('hermes:power-resume', listener)
return () => ipcRenderer.removeListener('hermes:power-resume', listener)
},
onBootProgress: callback => {
const listener = (_event, payload) => callback(payload)
ipcRenderer.on('hermes:boot-progress', listener)
return () => ipcRenderer.removeListener('hermes:boot-progress', listener)
},
// First-launch bootstrap progress -- emitted by the install.ps1 stage
@@ -192,6 +206,7 @@ contextBridge.exposeInMainWorld('hermesDesktop', {
onBootstrapEvent: callback => {
const listener = (_event, payload) => callback(payload)
ipcRenderer.on('hermes:bootstrap:event', listener)
return () => ipcRenderer.removeListener('hermes:bootstrap:event', listener)
},
getVersion: () => ipcRenderer.invoke('hermes:version'),
@@ -208,6 +223,7 @@ contextBridge.exposeInMainWorld('hermesDesktop', {
onProgress: callback => {
const listener = (_event, payload) => callback(payload)
ipcRenderer.on('hermes:updates:progress', listener)
return () => ipcRenderer.removeListener('hermes:updates:progress', listener)
}
},

View File

@@ -1,11 +1,9 @@
const assert = require('node:assert/strict')
const test = require('node:test')
import assert from 'node:assert/strict'
import test from 'node:test'
const {
buildSessionWindowUrl,
import { buildSessionWindowUrl,
chatWindowWebPreferences,
createSessionWindowRegistry
} = require('./session-windows.cjs')
createSessionWindowRegistry } from './session-windows'
// A minimal fake BrowserWindow: tracks listeners + destroyed state and lets a
// test fire the 'closed' event, mirroring the slice of the Electron API the
@@ -96,6 +94,7 @@ test('registry opens one window per session and focuses on re-open', () => {
const registry = createSessionWindowRegistry()
let built = 0
const win = makeFakeWindow()
const factory = () => {
built += 1
@@ -145,6 +144,7 @@ test('registry rebuilds a fresh window after the previous one was destroyed', ()
let built = 0
const second = makeFakeWindow()
const result = registry.openOrFocus('s1', () => {
built += 1
@@ -158,6 +158,7 @@ test('registry rebuilds a fresh window after the previous one was destroyed', ()
test('registry ignores empty / non-string session ids', () => {
const registry = createSessionWindowRegistry()
let built = 0
const factory = () => {
built += 1

View File

@@ -3,7 +3,7 @@
// here so they can be unit-tested with node --test (mirroring how the rest of
// electron/*.cjs splits testable logic out of the main.cjs monolith).
const { pathToFileURL } = require('node:url')
import { pathToFileURL } from 'node:url'
// Secondary windows open at the minimum usable size — a compact side panel for
// subagent watch / cmd-click session pop-out, not a second full desktop.
@@ -42,7 +42,7 @@ function chatWindowWebPreferences(preloadPath) {
// scratch window; `watch=1` marks a spectator window (e.g. a running subagent's
// session): the renderer resumes it lazily so the gateway never builds an agent
// just to stream into it.
function buildSessionWindowUrl(sessionId, { devServer, rendererIndexPath, watch, newSession } = {}) {
function buildSessionWindowUrl(sessionId: string, { devServer, rendererIndexPath, watch, newSession }: any = {}) {
const query = `?win=secondary${newSession ? '&new=1' : ''}${watch ? '&watch=1' : ''}`
const route = newSession ? '#/' : `#/${encodeURIComponent(sessionId)}`
@@ -115,10 +115,8 @@ function createSessionWindowRegistry() {
}
}
module.exports = {
buildSessionWindowUrl,
export { buildSessionWindowUrl,
chatWindowWebPreferences,
createSessionWindowRegistry,
SESSION_WINDOW_MIN_HEIGHT,
SESSION_WINDOW_MIN_WIDTH
}
SESSION_WINDOW_MIN_WIDTH }

View File

@@ -1,11 +0,0 @@
// Pre-layout fallback for WCO right-edge reservation (--titlebar-tools-right).
// Live width comes from navigator.windowControlsOverlay in the renderer.
const OVERLAY_FALLBACK_WIDTH = 144
/** @param {{ isWindows?: boolean, isWsl?: boolean }} opts */
function nativeOverlayWidth({ isWindows = false, isWsl = false } = {}) {
return isWindows || isWsl ? OVERLAY_FALLBACK_WIDTH : 0
}
module.exports = { OVERLAY_FALLBACK_WIDTH, nativeOverlayWidth }

View File

@@ -1,7 +1,7 @@
const assert = require('node:assert/strict')
const test = require('node:test')
import assert from 'node:assert/strict'
import test from 'node:test'
const { OVERLAY_FALLBACK_WIDTH, nativeOverlayWidth } = require('./titlebar-overlay-width.cjs')
import { nativeOverlayWidth, OVERLAY_FALLBACK_WIDTH } from './titlebar-overlay-width'
// This static reservation is only the pre-layout FALLBACK. Once laid out the
// renderer reads the exact width from navigator.windowControlsOverlay
@@ -18,10 +18,17 @@ test('WSLg paints the same WCO, so it reserves the same fallback width', () => {
assert.equal(nativeOverlayWidth({ isWsl: true }), OVERLAY_FALLBACK_WIDTH)
})
test('plain Linux and macOS reserve nothing', () => {
assert.equal(nativeOverlayWidth({ isWindows: false, isWsl: false }), 0)
assert.equal(nativeOverlayWidth(), 0)
assert.equal(nativeOverlayWidth({}), 0)
test('plain Linux paints the WCO too, so it reserves the fallback width', () => {
// Regression #53185: re-enabling the overlay on plain Linux (KDE/GNOME)
// without reserving its width left the native min/max/close buttons painting
// on top of the app's right-edge titlebar tools.
assert.equal(nativeOverlayWidth({ isWindows: false, isWsl: false }), OVERLAY_FALLBACK_WIDTH)
assert.equal(nativeOverlayWidth(), OVERLAY_FALLBACK_WIDTH)
assert.equal(nativeOverlayWidth({}), OVERLAY_FALLBACK_WIDTH)
})
test('macOS uses traffic lights, not a WCO overlay, so it reserves nothing', () => {
assert.equal(nativeOverlayWidth({ isMac: true }), 0)
})
test('the fallback width is a sane positive pixel value', () => {

View File

@@ -0,0 +1,23 @@
const OVERLAY_FALLBACK_WIDTH = 144
/**
* Static pre-layout reservation (px) for the right-side native window-controls
* overlay (min/max/close). Only a FALLBACK — once laid out the renderer reads
* the exact width from navigator.windowControlsOverlay
* (use-window-controls-overlay-width.ts) and uses this value only when the WCO
* API is unavailable.
*
* macOS uses traffic lights positioned via trafficLightPosition, not a WCO
* overlay, so it reserves nothing here. Every other desktop platform now paints
* the Electron overlay (Windows, WSLg, and plain Linux KDE/GNOME), so they all
* reserve the fallback width.
*
* @param {{ isWindows?: boolean, isWsl?: boolean, isMac?: boolean }} opts
*/
function nativeOverlayWidth({ isWindows = false, isWsl = false, isMac = false } = {}) {
if (isMac) {return 0}
return OVERLAY_FALLBACK_WIDTH
}
export { nativeOverlayWidth, OVERLAY_FALLBACK_WIDTH }
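
Review note: a short sketch of how a caller might use the fallback described in the doc comment above. `nativeOverlayWidth` mirrors the function in this file; `overlayOptsFor` and `titlebarToolsRight` are hypothetical helpers added only for illustration, and the platform strings assume Node's `process.platform` values.

```typescript
const OVERLAY_FALLBACK_WIDTH = 144

interface OverlayOpts {
  isWindows?: boolean
  isWsl?: boolean
  isMac?: boolean
}

// Mirrors the diff: only macOS opts out of the reservation.
function nativeOverlayWidth({ isMac = false }: OverlayOpts = {}): number {
  return isMac ? 0 : OVERLAY_FALLBACK_WIDTH
}

// Hypothetical: map a Node-style platform string to the options object.
function overlayOptsFor(platform: string, isWsl = false): OverlayOpts {
  return { isMac: platform === 'darwin', isWindows: platform === 'win32', isWsl }
}

// Hypothetical: prefer the live WCO width, use the static value only when it is unavailable.
function titlebarToolsRight(liveWidth: number | undefined, platform: string): number {
  return liveWidth ?? nativeOverlayWidth(overlayOptsFor(platform))
}
```

Since `isWindows` and `isWsl` no longer affect the result, they could be dropped from the signature in a follow-up; they are kept here to match the documented parameter shape.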

View File

@@ -1,7 +1,7 @@
'use strict'
const test = require('node:test')
const assert = require('node:assert/strict')
const { resolveBehindCount, shouldCountCommits } = require('./update-count.cjs')
import assert from 'node:assert/strict'
import test from 'node:test'
import { resolveBehindCount, shouldCountCommits } from './update-count'
// FAIL-BEFORE: pre-fix the function did `Number.parseInt(countStr) || 0`
// unconditionally, so a shallow checkout with no merge-base surfaced the bogus

View File

@@ -1,5 +1,3 @@
'use strict'
// Whether `git rev-list HEAD..origin/<branch> --count` produces a meaningful
// number worth computing. On a SHALLOW checkout (installer clones with
// --depth 1) the local history often shares no merge-base with the freshly
@@ -19,10 +17,12 @@ function shouldCountCommits({ isShallow, hasMergeBase }) {
// (developers / Docker dev images) keep the exact count path unchanged.
function resolveBehindCount({ countStr, currentSha, targetSha, isShallow, hasMergeBase }) {
if (!shouldCountCommits({ isShallow, hasMergeBase })) {
if (currentSha && targetSha && currentSha === targetSha) return 0
if (currentSha && targetSha && currentSha === targetSha) {return 0}
return 1 // behind by an unknown amount — show a generic "update available"
}
return Number.parseInt(countStr, 10) || 0
}
module.exports = { resolveBehindCount, shouldCountCommits }
export { resolveBehindCount, shouldCountCommits }
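
Review note: a typed sketch of the behind-count logic, to make the three outcomes easy to see. The body of `resolveBehindCount` follows the hunk above. The body of `shouldCountCommits` is not shown in this diff, so the version here (count unless the checkout is shallow with no merge-base) is an assumption inferred from the comments, and `BehindInput` is a hypothetical type name.

```typescript
interface BehindInput {
  countStr?: string
  currentSha?: string
  targetSha?: string
  isShallow: boolean
  hasMergeBase: boolean
}

// Assumption: the count is meaningful unless the checkout is shallow AND lacks a merge-base.
function shouldCountCommits({ isShallow, hasMergeBase }: Pick<BehindInput, 'isShallow' | 'hasMergeBase'>): boolean {
  return !isShallow || hasMergeBase
}

function resolveBehindCount({ countStr, currentSha, targetSha, isShallow, hasMergeBase }: BehindInput): number {
  if (!shouldCountCommits({ isShallow, hasMergeBase })) {
    // Same commit on both sides: nothing to update.
    if (currentSha && targetSha && currentSha === targetSha) {
      return 0
    }
    // Behind by an unknown amount: show a generic "update available".
    return 1
  }
  // Full checkouts keep the exact count; unparsable input falls back to 0.
  return Number.parseInt(countStr ?? '', 10) || 0
}
```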

View File

@@ -12,16 +12,17 @@
* strand future launches, and (c) self-heal by deleting a stale marker file.
*/
const test = require('node:test')
const assert = require('node:assert/strict')
const fs = require('fs')
const os = require('os')
const path = require('path')
import fs from 'fs'
import assert from 'node:assert/strict'
import test from 'node:test'
import os from 'os'
import path from 'path'
const { markerPath, isPidAlive, readLiveUpdateMarker, UPDATE_MARKER_MAX_AGE_MS } = require('./update-marker.cjs')
import { isPidAlive, markerPath, readLiveUpdateMarker, UPDATE_MARKER_MAX_AGE_MS } from './update-marker'
function tmpHome(tag) {
const dir = fs.mkdtempSync(path.join(os.tmpdir(), `hermes-marker-${tag}-`))
return dir
}
@@ -29,10 +30,11 @@ function writeMarker(home, pid, startedAtSec) {
fs.writeFileSync(markerPath(home), `${pid}\n${startedAtSec}`)
}
const ALIVE = () => true // injected kill that "succeeds" => pid alive
const DEAD = () => {
const err = new Error('no such process')
err.code = 'ESRCH'
const ALIVE: typeof process.kill = () => true // injected kill that "succeeds" => pid alive
const DEAD: typeof process.kill = () => {
const err = new Error('no such process');
(err as any).code = 'ESRCH'
throw err
}
@@ -84,9 +86,10 @@ test('isPidAlive: own pid is alive, impossible pid is dead', () => {
test('isPidAlive: EPERM counts as alive (process owned by another user)', () => {
const eperm = () => {
const err = new Error('operation not permitted')
err.code = 'EPERM'
const err = new Error('operation not permitted');
(err as any).code = 'EPERM'
throw err
}
assert.equal(isPidAlive(4242, eperm), true)
})

@@ -20,8 +20,8 @@
* log sinks are.
*/
const fs = require('fs')
const path = require('path')
import fs from 'fs'
import path from 'path'
// Even with a live-looking PID, never treat a marker older than this as a live
// update. A full update (git pull + pip + desktop rebuild) is minutes, not tens
@@ -37,10 +37,12 @@ function markerPath(hermesHome) {
// not deliver a signal — it just probes existence/permission. ESRCH => dead;
// EPERM => alive but owned by another user (still "alive" for our purposes).
// Injectable `kill` keeps it unit-testable.
function isPidAlive(pid, kill = process.kill.bind(process)) {
if (!Number.isInteger(pid) || pid <= 0) return false
function isPidAlive(pid, kill: typeof process.kill = process.kill.bind(process)) {
if (!Number.isInteger(pid) || pid <= 0) {return false}
try {
kill(pid, 0)
return true
} catch (err) {
return Boolean(err && err.code === 'EPERM')
@@ -59,9 +61,12 @@ function isPidAlive(pid, kill = process.kill.bind(process)) {
* Pure-ish: file I/O against the given path, plus an injectable pid probe and
* clock for tests.
*/
function readLiveUpdateMarker(hermesHome, { kill, now = Date.now, maxAgeMs = UPDATE_MARKER_MAX_AGE_MS } = {}) {
function readLiveUpdateMarker(hermesHome, { kill, now = Date.now, maxAgeMs = UPDATE_MARKER_MAX_AGE_MS }: {
now?: () => number, maxAgeMs?: number, kill?: typeof process.kill
} = {}) {
const file = markerPath(hermesHome)
let raw
try {
raw = fs.readFileSync(file, 'utf8')
} catch {
@@ -80,14 +85,14 @@ function readLiveUpdateMarker(hermesHome, { kill, now = Date.now, maxAgeMs = UPD
} catch {
void 0
}
return null
}
return { pid, ageMs }
}
module.exports = {
UPDATE_MARKER_MAX_AGE_MS,
export { isPidAlive,
markerPath,
isPidAlive,
readLiveUpdateMarker
}
readLiveUpdateMarker,
UPDATE_MARKER_MAX_AGE_MS }
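
The injectable `kill` probe can be exercised without touching real processes. The sketch below restates `isPidAlive` with a required probe argument (the diff defaults it to `process.kill`); the `KillProbe` type and the `failWith` helper are hypothetical names introduced for illustration.

```typescript
// Hypothetical probe type: anything that either returns or throws an error with a `code`.
type KillProbe = (pid: number, signal: number) => unknown

// Same logic as the diff, minus the process.kill default so the sketch has no Node dependency.
function isPidAlive(pid: number, kill: KillProbe): boolean {
  if (!Number.isInteger(pid) || pid <= 0) { return false }
  try {
    kill(pid, 0) // signal 0 only probes existence/permission
    return true
  } catch (err) {
    // EPERM: the process exists but belongs to another user, so it still counts as alive.
    return (err as { code?: string } | null)?.code === 'EPERM'
  }
}

// Hypothetical helper: builds a probe that throws with the given errno-style code.
const failWith = (code: string): KillProbe => () => {
  throw Object.assign(new Error(code), { code })
}

const alive = isPidAlive(4242, () => true)
const dead = isPidAlive(4242, failWith('ESRCH'))
const foreign = isPidAlive(4242, failWith('EPERM'))
const invalid = isPidAlive(-1, () => true)
console.log(alive, dead, foreign, invalid) // true false true false
```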

@@ -12,10 +12,10 @@
* success, and must run at most twice.
*/
const test = require('node:test')
const assert = require('node:assert/strict')
import assert from 'node:assert/strict'
import test from 'node:test'
const { shouldRetryRebuild, runRebuildWithRetry } = require('./update-rebuild.cjs')
import { runRebuildWithRetry, shouldRetryRebuild } from './update-rebuild'
test('shouldRetryRebuild retries only on a non-success exit', () => {
assert.equal(shouldRetryRebuild(0), false)
@@ -25,30 +25,39 @@ test('shouldRetryRebuild retries only on a non-success exit', () => {
test('a clean first rebuild runs once and does not retry', async () => {
const codes = []
const result = await runRebuildWithRetry(attempt => {
codes.push(attempt)
return Promise.resolve({ code: 0 })
})
assert.deepEqual(codes, [0])
assert.equal(result.code, 0)
})
test('a failed first rebuild retries once and succeeds', async () => {
const codes = []
const result = await runRebuildWithRetry(attempt => {
codes.push(attempt)
return Promise.resolve({ code: attempt === 0 ? 1 : 0 })
})
assert.deepEqual(codes, [0, 1])
assert.equal(result.code, 0)
})
test('a rebuild that keeps failing runs at most twice and reports the failure', async () => {
const codes = []
const result = await runRebuildWithRetry(attempt => {
codes.push(attempt)
return Promise.resolve({ code: 1, error: 'rebuild-failed' })
})
assert.deepEqual(codes, [0, 1])
assert.equal(result.code, 1)
assert.equal(result.error, 'rebuild-failed')

@@ -1,5 +1,3 @@
'use strict'
/**
* Retry-once policy for the desktop `--build-only` rebuild during self-update.
*
@@ -20,10 +18,12 @@ function shouldRetryRebuild(code) {
*/
async function runRebuildWithRetry(rebuild) {
let result = await rebuild(0)
if (shouldRetryRebuild(result.code)) {
result = await rebuild(1)
}
return result
}
module.exports = { shouldRetryRebuild, runRebuildWithRetry }
export { runRebuildWithRetry, shouldRetryRebuild }
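
As a rough illustration of the retry-once policy, the sketch below drives `runRebuildWithRetry` with a fake rebuild that fails on the first attempt. The `shouldRetryRebuild` body is an assumption (the diff only shows that exit code 0 does not retry), and `RebuildResult`, `Rebuild`, and `demo` are hypothetical names.

```typescript
interface RebuildResult { code: number; error?: string }
type Rebuild = (attempt: number) => Promise<RebuildResult>

// ASSUMPTION: the real body is outside the hunk; the tests only establish that 0 => no retry.
function shouldRetryRebuild(code: number): boolean {
  return code !== 0
}

// Mirrors the function shown in the diff: at most two attempts, last result wins.
async function runRebuildWithRetry(rebuild: Rebuild): Promise<RebuildResult> {
  let result = await rebuild(0)
  if (shouldRetryRebuild(result.code)) {
    result = await rebuild(1)
  }
  return result
}

// Fake rebuild: fails once, then succeeds, recording each attempt index.
async function demo(): Promise<{ attempts: number[]; result: RebuildResult }> {
  const attempts: number[] = []
  const result = await runRebuildWithRetry(async attempt => {
    attempts.push(attempt)
    return { code: attempt === 0 ? 1 : 0 }
  })
  return { attempts, result }
}

demo().then(({ attempts, result }) => console.log(attempts, result.code))
```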

@@ -17,24 +17,22 @@
* (keep a working window) unless a non-interactive fallback applies.
*/
const test = require('node:test')
const assert = require('node:assert/strict')
const fs = require('node:fs')
const os = require('node:os')
const path = require('node:path')
const { execFileSync } = require('node:child_process')
import assert from 'node:assert/strict'
import { execFileSync } from 'node:child_process'
import fs from 'node:fs'
import os from 'node:os'
import path from 'node:path'
import test from 'node:test'
const {
unpackedDirName,
resolveUnpackedRelease,
decideRelaunchOutcome,
sandboxPreflight,
sandboxFallbackFromEnv,
import { buildRelaunchScript,
collectRelaunchArgs,
collectRelaunchEnv,
buildRelaunchScript,
shellQuote
} = require('./update-relaunch.cjs')
decideRelaunchOutcome,
resolveUnpackedRelease,
sandboxFallbackFromEnv,
sandboxPreflight,
shellQuote,
unpackedDirName } from './update-relaunch'
const ROOT = '/home/u/.hermes/hermes-agent'
const UNPACKED = path.join(ROOT, 'apps', 'desktop', 'release', 'linux-unpacked')
@@ -91,6 +89,7 @@ test('decideRelaunchOutcome: only under-unpacked + sandbox-ok relaunches', () =>
// ---------------------------------------------------------------------------
const fakeStat = (uid, mode) => () => ({ uid, mode })
const throwStat = () => {
throw Object.assign(new Error('ENOENT'), { code: 'ENOENT' })
}
@@ -150,6 +149,7 @@ test('collectRelaunchArgs drops Electron internals, keeps user/launcher args', (
'--profile=work', // app flag — keep
'--remote-debugging-port=9222' // internal — drop
]
assert.deepEqual(collectRelaunchArgs(argv), ['--no-sandbox', 'hermes://open/agent/42', '--profile=work'])
assert.deepEqual(collectRelaunchArgs(undefined), [])
})
@@ -165,6 +165,7 @@ test('collectRelaunchEnv preserves HERMES_HOME + HERMES_DESKTOP_* + sandbox opt-
HOME: '/home/u', // not preserved
UNRELATED: 'x'
}
assert.deepEqual(collectRelaunchEnv(env), {
HERMES_HOME: '/home/u/.hermes',
HERMES_DESKTOP_REMOTE_URL: 'http://box:9119',
@@ -207,6 +208,7 @@ test('buildRelaunchScript embeds pid/exec/args/env/cwd and is valid bash', () =>
// It must be syntactically valid bash (`bash -n`). Write to a temp file and lint.
const tmp = path.join(os.tmpdir(), `hermes-relaunch-test-${Date.now()}.sh`)
fs.writeFileSync(tmp, script)
try {
execFileSync('bash', ['-n', tmp], { stdio: 'pipe' })
} finally {
@@ -222,13 +224,16 @@ test('buildRelaunchScript with no args/env still lints clean', () => {
env: {},
cwd: ''
})
const tmp = path.join(os.tmpdir(), `hermes-relaunch-test2-${Date.now()}.sh`)
fs.writeFileSync(tmp, script)
try {
execFileSync('bash', ['-n', tmp], { stdio: 'pipe' })
} finally {
fs.rmSync(tmp, { force: true })
}
// exec line has no trailing args.
assert.match(script, /exec '\/opt\/Hermes\/Hermes'\n/)
})

@@ -1,5 +1,3 @@
'use strict'
/**
* update-relaunch.cjs: pure decision + script-generation helpers for the
* Linux in-app update relaunch (#45205).
@@ -37,12 +35,14 @@
* the closeable manual-restart terminal state instead.
*/
const path = require('node:path')
import path from 'node:path'
// Map process.platform → electron-builder's `release/<dir>-unpacked` name.
function unpackedDirName(platform) {
if (platform === 'darwin') return 'mac-unpacked' // not used (mac swaps bundles)
if (platform === 'win32') return 'win-unpacked'
if (platform === 'darwin') {return 'mac-unpacked'} // not used (mac swaps bundles)
if (platform === 'win32') {return 'win-unpacked'}
return 'linux-unpacked'
}
@@ -56,15 +56,17 @@ function unpackedDirName(platform) {
* `.../release/linux-unpacked-evil` can't masquerade as `.../release/linux-unpacked`.
*/
function resolveUnpackedRelease(execPath, updateRoot, platform) {
if (!execPath || !updateRoot) return null
if (!execPath || !updateRoot) {return null}
const releaseDir = path.join(updateRoot, 'apps', 'desktop', 'release')
const unpacked = path.join(releaseDir, unpackedDirName(platform))
const normalizedExec = path.resolve(String(execPath))
// execPath must be the unpacked dir itself or a descendant of it.
const withSep = unpacked.endsWith(path.sep) ? unpacked : unpacked + path.sep
if (normalizedExec === unpacked || normalizedExec.startsWith(withSep)) {
return unpacked
}
return null
}
@@ -81,8 +83,10 @@ function resolveUnpackedRelease(execPath, updateRoot, platform) {
* app. Closeable manual-restart terminal state.
*/
function decideRelaunchOutcome({ underUnpacked, sandboxOk }) {
if (!underUnpacked) return 'guiSkew'
if (!sandboxOk) return 'manual'
if (!underUnpacked) {return 'guiSkew'}
if (!sandboxOk) {return 'manual'}
return 'relaunch'
}
@@ -99,9 +103,10 @@ function decideRelaunchOutcome({ underUnpacked, sandboxOk }) {
* `statSync` is injectable so this is testable without a real setuid file.
*/
function sandboxPreflight(unpackedDir, statSync) {
if (!unpackedDir) return { ok: false, reason: 'no-unpacked-dir', path: null }
if (!unpackedDir) {return { ok: false, reason: 'no-unpacked-dir', path: null }}
const sandboxPath = path.join(unpackedDir, 'chrome-sandbox')
let st
try {
st = statSync(sandboxPath)
} catch {
@@ -109,15 +114,20 @@ function sandboxPreflight(unpackedDir, statSync) {
// sandbox; nothing to block the relaunch.
return { ok: true, reason: 'no-sandbox-helper', path: sandboxPath }
}
const ownedByRoot = st.uid === 0
const hasSetuid = (st.mode & 0o4000) !== 0
if (ownedByRoot && hasSetuid) {
return { ok: true, reason: 'launchable', path: sandboxPath }
}
if (!ownedByRoot && !hasSetuid) {
return { ok: false, reason: 'not-root-not-setuid', path: sandboxPath }
}
if (!ownedByRoot) return { ok: false, reason: 'not-root', path: sandboxPath }
if (!ownedByRoot) {return { ok: false, reason: 'not-root', path: sandboxPath }}
return { ok: false, reason: 'not-setuid', path: sandboxPath }
}
@@ -137,8 +147,11 @@ function sandboxPreflight(unpackedDir, statSync) {
*/
function sandboxFallbackFromEnv(env, launchArgs) {
const disable = String((env && env.ELECTRON_DISABLE_SANDBOX) || '').trim()
if (disable === '1' || disable.toLowerCase() === 'true') return true
if (Array.isArray(launchArgs) && launchArgs.some(a => a === '--no-sandbox')) return true
if (disable === '1' || disable.toLowerCase() === 'true') {return true}
if (Array.isArray(launchArgs) && launchArgs.some(a => a === '--no-sandbox')) {return true}
return false
}
@@ -176,9 +189,11 @@ const INTERNAL_ARG_PREFIXES = [
* the exec path itself; there is no entry-script arg as in a dev run).
*/
function collectRelaunchArgs(argv) {
if (!Array.isArray(argv)) return []
if (!Array.isArray(argv)) {return []}
return argv.filter(arg => {
if (typeof arg !== 'string' || arg.length === 0) return false
if (typeof arg !== 'string' || arg.length === 0) {return false}
return !INTERNAL_ARG_PREFIXES.some(prefix =>
prefix.endsWith('=') ? arg.startsWith(prefix) : arg === prefix || arg.startsWith(prefix + '=')
)
@@ -197,13 +212,17 @@ const PRESERVED_ENV_PREFIXES = ['HERMES_DESKTOP_']
function collectRelaunchEnv(env) {
const out = {}
if (!env || typeof env !== 'object') return out
if (!env || typeof env !== 'object') {return out}
for (const [key, value] of Object.entries(env)) {
if (value == null) continue
if (value == null) {continue}
if (PRESERVED_ENV_KEYS.includes(key) || PRESERVED_ENV_PREFIXES.some(p => key.startsWith(p))) {
out[key] = String(value)
}
}
return out
}
@@ -223,8 +242,10 @@ function buildRelaunchScript({ pid, execPath, args, env, cwd }) {
const exports = Object.entries(env || {})
.map(([k, v]) => `export ${k}=${shellQuote(v)}`)
.join('\n')
const quotedArgs = (args || []).map(shellQuote).join(' ')
const cwdLine = cwd ? `cd ${shellQuote(cwd)} 2>/dev/null || true` : ''
// NOTE: `exec` replaces the watcher process with the relaunched app, so the
// re-exec inherits exactly the env/cwd we set above.
return `#!/bin/bash
@@ -249,17 +270,15 @@ exec ${shellQuote(execPath)}${quotedArgs ? ' ' + quotedArgs : ''}
`
}
module.exports = {
unpackedDirName,
resolveUnpackedRelease,
decideRelaunchOutcome,
sandboxPreflight,
sandboxFallbackFromEnv,
export { buildRelaunchScript,
collectRelaunchArgs,
collectRelaunchEnv,
buildRelaunchScript,
shellQuote,
decideRelaunchOutcome,
INTERNAL_ARG_PREFIXES,
PRESERVED_ENV_KEYS,
PRESERVED_ENV_PREFIXES
}
PRESERVED_ENV_PREFIXES,
resolveUnpackedRelease,
sandboxFallbackFromEnv,
sandboxPreflight,
shellQuote,
unpackedDirName }
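
The sandbox preflight and the relaunch decision compose as sketched below. This is a simplified restatement, not the module itself: the path is joined by hand with `/` instead of `path.join`, and the `StatLike`, `StatSync`, `Preflight`, and `fakeStat` names are hypothetical.

```typescript
interface StatLike { uid: number; mode: number }
type StatSync = (path: string) => StatLike
interface Preflight { ok: boolean; reason: string; path: string | null }

function sandboxPreflight(unpackedDir: string | null, statSync: StatSync): Preflight {
  if (!unpackedDir) { return { ok: false, reason: 'no-unpacked-dir', path: null } }
  // Simplification: POSIX join by hand (the diff uses path.join).
  const sandboxPath = `${unpackedDir}/chrome-sandbox`
  let st: StatLike
  try {
    st = statSync(sandboxPath)
  } catch {
    // No helper on disk: nothing to block the relaunch.
    return { ok: true, reason: 'no-sandbox-helper', path: sandboxPath }
  }
  const ownedByRoot = st.uid === 0
  const hasSetuid = (st.mode & 0o4000) !== 0
  if (ownedByRoot && hasSetuid) { return { ok: true, reason: 'launchable', path: sandboxPath } }
  if (!ownedByRoot && !hasSetuid) { return { ok: false, reason: 'not-root-not-setuid', path: sandboxPath } }
  if (!ownedByRoot) { return { ok: false, reason: 'not-root', path: sandboxPath } }
  return { ok: false, reason: 'not-setuid', path: sandboxPath }
}

function decideRelaunchOutcome({ underUnpacked, sandboxOk }: { underUnpacked: boolean; sandboxOk: boolean }): 'guiSkew' | 'manual' | 'relaunch' {
  if (!underUnpacked) { return 'guiSkew' }
  if (!sandboxOk) { return 'manual' }
  return 'relaunch'
}

// Hypothetical stat stub, in the spirit of the test file's fakeStat.
const fakeStat = (uid: number, mode: number): StatSync => () => ({ uid, mode })

const good = sandboxPreflight('/opt/app', fakeStat(0, 0o104755))
const userOwned = sandboxPreflight('/opt/app', fakeStat(1000, 0o100755))
const outcomes = [
  decideRelaunchOutcome({ underUnpacked: true, sandboxOk: good.ok }),
  decideRelaunchOutcome({ underUnpacked: true, sandboxOk: userOwned.ok }),
  decideRelaunchOutcome({ underUnpacked: false, sandboxOk: good.ok })
]
console.log(good.reason, userOwned.reason, outcomes)
```

Injecting `statSync` keeps the setuid check testable without creating a root-owned file.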

@@ -15,16 +15,14 @@
* never prompts and should keep the normal fetch path).
*/
const test = require('node:test')
const assert = require('node:assert/strict')
import assert from 'node:assert/strict'
import test from 'node:test'
const {
OFFICIAL_REPO_HTTPS_URL,
OFFICIAL_REPO_CANONICAL,
canonicalGitHubRemote,
import { canonicalGitHubRemote,
isOfficialSshRemote,
isSshRemote,
isOfficialSshRemote
} = require('./update-remote.cjs')
OFFICIAL_REPO_CANONICAL,
OFFICIAL_REPO_HTTPS_URL } from './update-remote'
test('canonicalGitHubRemote normalizes SSH and HTTPS forms to the same value', () => {
assert.equal(canonicalGitHubRemote('git@github.com:NousResearch/hermes-agent.git'), OFFICIAL_REPO_CANONICAL)

@@ -19,8 +19,9 @@ const OFFICIAL_REPO_CANONICAL = 'github.com/nousresearch/hermes-agent'
// no trailing slash, no .git suffix) so SSH and HTTPS forms of the same repo
// compare equal.
function canonicalGitHubRemote(url) {
if (!url) return ''
if (!url) {return ''}
let value = String(url).trim()
if (value.startsWith('git@github.com:')) {
value = `github.com/${value.slice('git@github.com:'.length)}`
} else if (value.startsWith('ssh://git@github.com/')) {
@@ -28,13 +29,17 @@ function canonicalGitHubRemote(url) {
} else {
try {
const parsed = new URL(value)
if (parsed.hostname && parsed.pathname) value = `${parsed.hostname}${parsed.pathname}`
if (parsed.hostname && parsed.pathname) {value = `${parsed.hostname}${parsed.pathname}`}
} catch {
// Leave non-URL forms unchanged.
}
}
value = value.trim().replace(/\/+$/, '')
if (value.endsWith('.git')) value = value.slice(0, -4)
if (value.endsWith('.git')) {value = value.slice(0, -4)}
return value.toLowerCase()
}
@@ -42,6 +47,7 @@ function isSshRemote(url) {
const value = String(url || '')
.trim()
.toLowerCase()
return value.startsWith('git@') || value.startsWith('ssh://')
}
@@ -49,10 +55,8 @@ function isOfficialSshRemote(url) {
return isSshRemote(url) && canonicalGitHubRemote(url) === OFFICIAL_REPO_CANONICAL
}
module.exports = {
OFFICIAL_REPO_HTTPS_URL,
OFFICIAL_REPO_CANONICAL,
canonicalGitHubRemote,
export { canonicalGitHubRemote,
isOfficialSshRemote,
isSshRemote,
isOfficialSshRemote
}
OFFICIAL_REPO_CANONICAL,
OFFICIAL_REPO_HTTPS_URL }
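
A sketch of the remote normalization, to show why SSH and HTTPS forms compare equal. The `ssh://` branch body falls outside the visible hunk, so the version here (mirroring the scp-style branch) is an assumption; everything else follows the lines shown in the diff.

```typescript
const OFFICIAL_REPO_CANONICAL = 'github.com/nousresearch/hermes-agent'

function canonicalGitHubRemote(url?: string | null): string {
  if (!url) { return '' }
  let value = String(url).trim()
  const scpPrefix = 'git@github.com:'
  const sshPrefix = 'ssh://git@github.com/'
  if (value.startsWith(scpPrefix)) {
    value = `github.com/${value.slice(scpPrefix.length)}`
  } else if (value.startsWith(sshPrefix)) {
    // ASSUMPTION: this branch body is elided in the diff; it is written to mirror the scp form.
    value = `github.com/${value.slice(sshPrefix.length)}`
  } else {
    try {
      const parsed = new URL(value)
      if (parsed.hostname && parsed.pathname) { value = `${parsed.hostname}${parsed.pathname}` }
    } catch {
      // Leave non-URL forms unchanged.
    }
  }
  value = value.trim().replace(/\/+$/, '')
  if (value.endsWith('.git')) { value = value.slice(0, -4) }
  return value.toLowerCase()
}

function isSshRemote(url?: string | null): boolean {
  const value = String(url || '').trim().toLowerCase()
  return value.startsWith('git@') || value.startsWith('ssh://')
}

function isOfficialSshRemote(url?: string | null): boolean {
  return isSshRemote(url) && canonicalGitHubRemote(url) === OFFICIAL_REPO_CANONICAL
}

const fromScp = canonicalGitHubRemote('git@github.com:NousResearch/hermes-agent.git')
const fromHttps = canonicalGitHubRemote('https://github.com/NousResearch/hermes-agent/')
const httpsIsOfficialSsh = isOfficialSshRemote('https://github.com/NousResearch/hermes-agent.git')
console.log(fromScp === fromHttps, httpsIsOfficialSsh) // true false
```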

@@ -1,9 +1,7 @@
'use strict'
import assert from 'node:assert'
import test from 'node:test'
const assert = require('node:assert')
const test = require('node:test')
const { __testing, extractThemes, readCentralDirectory } = require('./vscode-marketplace.cjs')
import { __testing, extractThemes, readCentralDirectory } from './vscode-marketplace'
// Build a minimal zip with stored (uncompressed) entries so the test controls
// the bytes exactly — exercises the central-directory reader + theme extraction
@@ -72,6 +70,7 @@ test('extractThemes reads contributed color themes (resolving ./ paths)', () =>
themes: [{ label: 'Dracula', uiTheme: 'vs-dark', path: './themes/dracula.json' }]
}
})
const themeJson = JSON.stringify({ name: 'Dracula', type: 'dark', colors: { 'editor.background': '#282a36' } })
const zip = makeZip([

@@ -1,5 +1,3 @@
'use strict'
/**
* VS Code Marketplace color-theme fetcher (main process).
*
@@ -14,8 +12,8 @@
* zip library into the desktop bundle for a feature this small.
*/
const https = require('node:https')
const zlib = require('node:zlib')
import https from 'node:https'
import zlib from 'node:zlib'
const GALLERY_QUERY_URL = 'https://marketplace.visualstudio.com/_apis/public/gallery/extensionquery'
const VSIX_ASSET_TYPE = 'Microsoft.VisualStudio.Services.VSIXPackage'
@@ -30,7 +28,7 @@ function request(
url,
{ method = 'GET', headers = {}, body = null, maxBytes = MAX_VSIX_BYTES } = {},
redirectsLeft = MAX_REDIRECTS
) {
): Promise<Buffer<ArrayBuffer>> {
return new Promise((resolve, reject) => {
const req = https.request(url, { method, headers }, res => {
const status = res.statusCode ?? 0
@@ -102,6 +100,7 @@ async function resolveExtension(id) {
// IncludeCategoryAndTags | IncludeLatestVersionOnly = 914.
flags: 914
})
const extension = json?.results?.[0]?.extensions?.[0]
if (!extension) {
@@ -127,6 +126,7 @@ async function resolveExtension(id) {
/** POST an ExtensionQuery payload and return the parsed gallery response. */
async function queryGallery(payload, { maxBytes = 4 * 1024 * 1024 } = {}) {
const body = JSON.stringify(payload)
const raw = await request(GALLERY_QUERY_URL, {
method: 'POST',
headers: {
@@ -332,10 +332,12 @@ async function fetchMarketplaceThemes(id) {
return { extensionId: trimmed, displayName, themes }
}
module.exports = {
fetchMarketplaceThemes,
searchMarketplaceThemes,
const __testing = { themeEntryName, looksLikeIconTheme }
export {
__testing,
extractThemes,
fetchMarketplaceThemes,
readCentralDirectory,
__testing: { themeEntryName, looksLikeIconTheme }
searchMarketplaceThemes
}

@@ -4,19 +4,17 @@
* clamping, and the debounce that collapses mid-drag write storms.
*/
const test = require('node:test')
const assert = require('node:assert/strict')
import assert from 'node:assert/strict'
import test from 'node:test'
const {
DEFAULT_WIDTH,
import { computeWindowOptions,
debounce,
DEFAULT_HEIGHT,
MIN_WIDTH,
DEFAULT_WIDTH,
MIN_HEIGHT,
sanitizeWindowState,
MIN_WIDTH,
onScreen,
computeWindowOptions,
debounce
} = require('./window-state.cjs')
sanitizeWindowState } from './window-state'
// A single 1920×1080 monitor (work area trimmed for the taskbar).
const PRIMARY = [{ workArea: { x: 0, y: 0, width: 1920, height: 1040 } }]
@@ -121,6 +119,7 @@ test('computeWindowOptions does not clamp when displays are unknown', () => {
test('debounce coalesces a burst into one trailing run', t => {
t.mock.timers.enable({ apis: ['setTimeout'] })
let calls = 0
const d = debounce(() => {
calls += 1
}, 250)
@@ -138,6 +137,7 @@ test('debounce coalesces a burst into one trailing run', t => {
test('debounce.flush runs now and cancels the pending timer', t => {
t.mock.timers.enable({ apis: ['setTimeout'] })
let calls = 0
const d = debounce(() => {
calls += 1
}, 250)

@@ -21,41 +21,59 @@ const MIN_VISIBLE = 48
const finite = v => typeof v === 'number' && Number.isFinite(v)
const clamp = (v, lo, hi) => Math.max(lo, Math.min(v, hi))
interface SanitizedWindowState {
width: number
height: number
isMaximized: boolean
x?: number
y?: number
}
// Parse raw JSON → clean state, or null if garbage. width/height are required
// and floored; x/y survive only as a finite pair; isMaximized is strict.
function sanitizeWindowState(raw) {
if (!raw || typeof raw !== 'object' || !finite(raw.width) || !finite(raw.height)) return null
function sanitizeWindowState(raw?: any): SanitizedWindowState | null
const state = {
{
if (!raw || typeof raw !== 'object' || !finite(raw.width) || !finite(raw.height)) {return null}
const state: SanitizedWindowState = {
width: Math.max(MIN_WIDTH, Math.round(raw.width)),
height: Math.max(MIN_HEIGHT, Math.round(raw.height)),
isMaximized: raw.isMaximized === true
isMaximized: raw.isMaximized === true,
}
if (finite(raw.x) && finite(raw.y)) {
state.x = Math.round(raw.x)
state.x = Math.round(raw.x);
state.y = Math.round(raw.y)
}
return state
}
// True when `bounds` overlaps some display's work area by ≥ MIN_VISIBLE on both
// axes. `displays` is Electron's screen.getAllDisplays() shape.
function onScreen(bounds, displays) {
if (!Array.isArray(displays)) return false
if (!Array.isArray(displays)) {return false}
return displays.some(({ workArea: a } = {}) => {
if (!a) return false
if (!a) {return false}
const x = Math.min(bounds.x + bounds.width, a.x + a.width) - Math.max(bounds.x, a.x)
const y = Math.min(bounds.y + bounds.height, a.y + a.height) - Math.max(bounds.y, a.y)
return x >= MIN_VISIBLE && y >= MIN_VISIBLE
})
}
interface WindowOptions {
width: number
height: number
x?: number
y?: number
}
// Sanitized state (or null) → BrowserWindow size/position options. Always sets
// width/height, capped to the largest current display so a size saved on a
// since-disconnected bigger monitor can't exceed any screen the user now has.
// Sets x/y only when still on-screen; otherwise Electron centers the window.
function computeWindowOptions(state, displays) {
const opts = {
function computeWindowOptions(state, displays): WindowOptions {
const opts: WindowOptions = {
width: finite(state?.width) ? state.width : DEFAULT_WIDTH,
height: finite(state?.height) ? state.height : DEFAULT_HEIGHT
}
@@ -67,6 +85,7 @@ function computeWindowOptions(state, displays) {
: m,
{ width: 0, height: 0 }
)
if (cap.width && cap.height) {
opts.width = clamp(opts.width, MIN_WIDTH, cap.width)
opts.height = clamp(opts.height, MIN_HEIGHT, cap.height)
@@ -78,9 +97,10 @@ function computeWindowOptions(state, displays) {
finite(state.y) &&
onScreen({ x: state.x, y: state.y, width: opts.width, height: opts.height }, displays)
) {
opts.x = state.x
opts.x = state.x;
opts.y = state.y
}
return opts
}
@@ -89,6 +109,7 @@ function computeWindowOptions(state, displays) {
// cancels the pending timer — used on close, before the window is gone.
function debounce(fn, delayMs) {
let timer = null
const debounced = () => {
clearTimeout(timer)
timer = setTimeout(() => {
@@ -96,22 +117,22 @@ function debounce(fn, delayMs) {
fn()
}, delayMs)
}
debounced.flush = () => {
clearTimeout(timer)
timer = null
fn()
}
return debounced
}
module.exports = {
DEFAULT_WIDTH,
export { computeWindowOptions,
debounce,
DEFAULT_HEIGHT,
MIN_WIDTH,
DEFAULT_WIDTH,
MIN_HEIGHT,
MIN_VISIBLE,
sanitizeWindowState,
MIN_WIDTH,
onScreen,
computeWindowOptions,
debounce
}
sanitizeWindowState }
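
The on-screen test is an axis-wise overlap check, which a short worked example may make easier to follow. The sketch below restates `onScreen` with explicit types; `Rect` and `DisplayLike` are hypothetical names, while `MIN_VISIBLE = 48` and the `PRIMARY` display come from the diff.

```typescript
const MIN_VISIBLE = 48

interface Rect { x: number; y: number; width: number; height: number }
interface DisplayLike { workArea?: Rect }

// True when `bounds` overlaps some display's work area by at least MIN_VISIBLE on both axes.
function onScreen(bounds: Rect, displays: DisplayLike[] | undefined): boolean {
  if (!Array.isArray(displays)) { return false }
  return displays.some(({ workArea: a }) => {
    if (!a) { return false }
    // Overlap length per axis: min of the far edges minus max of the near edges.
    const x = Math.min(bounds.x + bounds.width, a.x + a.width) - Math.max(bounds.x, a.x)
    const y = Math.min(bounds.y + bounds.height, a.y + a.height) - Math.max(bounds.y, a.y)
    return x >= MIN_VISIBLE && y >= MIN_VISIBLE
  })
}

// A single 1920x1080 monitor with the work area trimmed for the taskbar.
const PRIMARY: DisplayLike[] = [{ workArea: { x: 0, y: 0, width: 1920, height: 1040 } }]

const inside = onScreen({ x: 100, y: 100, width: 800, height: 600 }, PRIMARY)    // overlap 800 x 600
const sliver = onScreen({ x: 1900, y: 100, width: 800, height: 600 }, PRIMARY)   // only 20px visible on x
const offscreen = onScreen({ x: 3000, y: 100, width: 800, height: 600 }, PRIMARY) // negative overlap
const unknown = onScreen({ x: 100, y: 100, width: 800, height: 600 }, undefined)
console.log(inside, sliver, offscreen, unknown) // true false false false
```

A negative overlap value means the rectangles do not intersect on that axis, so the same comparison covers both the "barely visible" and "fully off-screen" cases.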

@@ -1,11 +1,10 @@
'use strict'
import assert from 'node:assert/strict'
import fs from 'node:fs'
import path from 'node:path'
import test from 'node:test'
import { fileURLToPath } from 'node:url'
const test = require('node:test')
const assert = require('node:assert/strict')
const fs = require('node:fs')
const path = require('node:path')
const ELECTRON_DIR = __dirname
const ELECTRON_DIR = path.dirname(fileURLToPath(import.meta.url))
function readElectronFile(name) {
return fs.readFileSync(path.join(ELECTRON_DIR, name), 'utf8').replace(/\r\n/g, '\n')
@@ -24,7 +23,7 @@ function requireHiddenChildOptions(source, needle) {
}
test('desktop background child processes opt into hidden Windows consoles', () => {
const source = readElectronFile('main.cjs')
const source = readElectronFile('main.ts')
assert.match(source, /function hiddenWindowsChildOptions\(options = \{\}\)/)
@@ -53,8 +52,25 @@ test('desktop background child processes opt into hidden Windows consoles', () =
assert.match(source, /args: \['-m', 'hermes_cli\.main', \.\.\.dashboardArgs\]/)
})
test('getNoConsoleVenvPython prefers base pythonw over the uv re-exec shim', () => {
const source = readElectronFile('main.ts')
const body = source.slice(
source.indexOf('function getNoConsoleVenvPython(venvRoot)'),
source.indexOf('function getVenvSitePackagesEntries(venvRoot)')
)
// The venv Scripts\pythonw.exe re-execs a console python.exe (flashes a
// conhost); the base pythonw must be resolved first so it never runs.
const baseIdx = body.indexOf('basePythonw')
const shimIdx = body.indexOf("'Scripts', 'pythonw.exe'")
assert.notEqual(baseIdx, -1, 'base pythonw resolution missing')
assert.notEqual(shimIdx, -1, 'venv shim fallback missing')
assert.ok(baseIdx < shimIdx, 'base pythonw must be preferred before the venv Scripts shim')
})
test('intentional or interactive desktop child processes stay documented', () => {
const source = readElectronFile('main.cjs')
const source = readElectronFile('main.ts')
assert.match(source, /windowsHide: false/)
assert.match(source, /handOffWindowsBootstrapRecovery/)
@@ -65,7 +81,7 @@ test('intentional or interactive desktop child processes stay documented', () =>
})
test('bootstrap PowerShell runner hides Windows console children', () => {
const source = readElectronFile('bootstrap-runner.cjs')
const source = readElectronFile('bootstrap-runner.ts')
assert.match(source, /function hiddenWindowsChildOptions\(options = \{\}\)/)
requireHiddenChildOptions(source, 'spawn(ps, fullArgs')

Some files were not shown because too many files have changed in this diff.