Compare commits

...

637 Commits

Author SHA1 Message Date
alt-glitch
946d3eaf95 Merge remote-tracking branch 'origin/main' into feat/opentui-native-engine 2026-06-15 15:59:30 +05:30
alt-glitch
bf45aa3a45 opentui(v6): WIP — todo panel + memoryMonitor + mouse/startup-prompt env
Preservation snapshot of three in-progress OpenTUI threads before merging main
(gate-green together: npm run check 821 tests, 0 lint/type errors):
- todo panel (todoPanel/todoTool + App/statusBar/transcript/store wiring,
  ☑ done/total status chip, latestTodos snapshot)
- OpenTUI memoryMonitor (boundary+logic+test; the inverse of the Ink memlog port)
- env: resolveMouseEnabled/envToggle/startupPrompt, HERMES_TUI_MOUSE +
  HERMES_TUI_PROMPT aliases, startup-image attach in main.tsx
- termChrome refactor

Not yet split per-feature; commit boundary is the pre-merge clean point.
2026-06-15 15:59:11 +05:30
alt-glitch
16e408f3f0 fix(clarify): docstring — put options in choices[] only, never enumerate in question text
The model was enumerating options inside the question string (dead prose the UI
can't render as pickable rows). Schema description now spells out: choices[] is
REQUIRED for selectable options; question holds ONLY the question.
2026-06-15 15:58:06 +05:30
alt-glitch
4108fe6014 tui(diag): Ink 1Hz memwatch collector — OpenTUI-compatible memory trace
Ink had no continuous memory trace (only point-in-time heapdumps + a threshold
monitor), so HERMES_TUI_DIAGNOSTICS=1 gave OpenTUI dogfood data with no Ink
equivalent. Port OpenTUI's memlog collector to Ink so both engines emit
byte-identical ~/.hermes/logs/memwatch/<boot>-<pid>.jsonl traces feeding one
memwatch-report.mjs.

- lib/memlog.ts: 1Hz unref'd sampler, {t,rss_kb,heap_used_kb,external_kb}
  (no mounted — Ink has no windowing), 14-day prune, silent-disable on error
- gated by HERMES_TUI_MEMLOG defaulting to the HERMES_TUI_DIAGNOSTICS master
  switch (same as OpenTUI — one export covers both engines)
- wired into entry.tsx alongside the existing monitor; stop on beforeExit
- lib/memlog.test.ts: gate/schema/retention/silent-disable (7 tests)
- docs/ink-env-flags.md (new) + docs/opentui-env-flags.md updated
2026-06-15 15:58:02 +05:30
alt-glitch
b0fb2b8b05 opentui(v6): skins/theming parity — live /skin switch, animated spinner, tool_emojis, status-bar colors
- server resolve_skin() now serializes spinner + tool_emojis (were dropped on
  the wire; neither native engine could use them)
- GatewaySkin schema gains optional spinner/tool_emojis (additive, back-compat)
- theme.ts: SpinnerConfig + parseSpinner (crash-proof) threaded through fromSkin;
  status_bar_* keys now drive statusBg/Fg/Bad/Critical (were hardcoded — all
  dark skins looked identical in the status bar)
- /skin <name> CLIENT slash handler -> config.set -> skin.changed -> live retheme
- composer: imperative ta.textColor/cursorColor on theme change (uncontrolled
  textarea recolor) + slash-token SyntaxStyle re-register
- statusLine: animated face via bounded setInterval armed on running/cleared on stop
- registry/toolPart: skin tool_emojis override tool glyphs
2026-06-15 15:57:57 +05:30
kshitij
6cb88a0874 Merge pull request #46552 from kshitijk4poor/salvage/file-tools-session-cwd
fix(tools): respect session cwd in file tools (salvage of #46460)
2026-06-15 14:13:15 +05:30
kshitijk4poor
8fce54499f refactor(tools): extract shared sentinel-free abs cwd validator
_configured_terminal_cwd and _registered_task_cwd_override carried a
byte-identical sentinel + expanduser + isabs validation tail. Extract it
into _sentinel_free_abs_cwd(raw) so the relative/sentinel rejection rule
lives in one place. Behaviour unchanged (the str() coercion the override
path relied on is preserved in the helper).
2026-06-15 14:03:41 +05:30
kshitijk4poor
b0c99c12dd docs(tools): document registered-cwd step in resolver docstrings
The session-cwd fix inserted a registered task/session cwd override step
between the live-cwd and $TERMINAL_CWD fallbacks, but three docstrings still
described the old two-step order — _resolve_base_dir's numbered list was
outright wrong. Update _authoritative_workspace_root, _resolve_base_dir, and
_path_resolution_warning to reflect the actual four-step resolution order.
No behaviour change.
2026-06-15 14:02:54 +05:30
kshitijk4poor
ddf7c7af81 refactor(tools): consolidate task-override lookup into one helper
The raw-key-first-then-collapsed override lookup was hand-rolled in three
places with subtly different spellings: terminal_tool's command setup, and
both file_tools._registered_task_cwd_override and _get_file_ops. Since that
exact raw-vs-collapsed invariant is what the session-cwd fix depends on,
keeping three copies invites the drift that caused the original bug.

Add terminal_tool.resolve_task_overrides(task_id) as the single source and
route all three sites through it. Behaviour is unchanged (verified
byte-equivalent across raw/collapsed/isolation/None/subagent inputs).
2026-06-15 14:02:17 +05:30
Gille
d6a8d9dcab fix(tools): respect session cwd in file tools 2026-06-15 14:00:42 +05:30
Ben Barclay
95715dcb03 fix(s6): reserved default gateway must not follow sticky active_profile (#46483)
The supervised `gateway-default` s6 slot runs bare `hermes gateway run`
(no -p) to mean "the root HERMES_HOME profile". But `_apply_profile_override`
falls through its #22502 HERMES_HOME guard for the container root
(/opt/data, whose parent is not `profiles`) and reads the sticky
`active_profile` file. If the user set another profile active (e.g. via
the dashboard), the reserved default gateway gets redirected into that
profile — producing a duplicate gateway for the active profile and no
real default gateway. The profile page and `gateway status` then
correctly report default as "not running" because there genuinely isn't
one.

Guard step 2 (the sticky active_profile fallback) with the existing
HERMES_S6_SUPERVISED_CHILD sentinel that the container run-script already
exports. Supervised named-profile slots pass -p explicitly (step 1, never
reaches step 2); only the bare default slot was affected. Inert outside
the s6 container — the sentinel is never set elsewhere.

Reported in the 'Docker & Profiles & Dashboard' support thread.
2026-06-15 05:36:20 +00:00
Ben Barclay
80f8ffc74c fix(dashboard): pin machine-dashboard reroute to the machine root, not $HOME/.hermes (#46487)
The unified machine-dashboard reroute (cmd_dashboard) re-execs a named-profile
dashboard launch as the machine dashboard and dropped HERMES_HOME from the
child env with the comment "so the child binds the machine root". That holds
for a standard install (root == ~/.hermes) but breaks the Docker layout: the
published image sets `ENV HERMES_HOME=/opt/data`, so once HERMES_HOME is unset
the child falls back to $HOME/.hermes = /opt/data/.hermes — an empty,
auto-seeded home.

Two user-visible symptoms, one root cause (reported via support):

1. Dashboard Profiles page shows only an empty `default` — the real
   default/oracle/saga profiles live under /opt/data/profiles, but the
   rerouted child resolves _get_profiles_root() to /opt/data/.hermes/profiles.

2. The "Update Hermes" button runs `hermes update` inside the container
   repeatedly instead of bailing with the docker-update guidance. The Docker
   guard keys off detect_install_method(), which reads
   $HERMES_HOME/.install_method; the image stamps that at /opt/data, but the
   misresolved home has no stamp, no HERMES_MANAGED, and no .git → falls
   through to "pip", so the guard never fires.

The reporter's workaround was to bind-mount the host dir at both /opt/data and
/opt/data/.hermes so the two paths converge (at the cost of a self-referential
recursion).

Fix: resolve the machine root explicitly with get_default_hermes_root() and set
it on the child env instead of popping HERMES_HOME. That helper returns the
root for both layouts — ~/.hermes for a standard install, and /opt/data for
Docker (it strips a trailing profiles/<name>). Falls back to the old pop
behaviour only if root resolution raises, so the reroute is never blocked.

Regression tests in test_dashboard_unified_launch.py: the existing standard-
install test now asserts the child carries HERMES_HOME == get_default_hermes_root()
(not absent), and a new test_reexec_pins_docker_machine_root covers the Docker
layout (HERMES_HOME=/opt/data/profiles/oracle → child gets /opt/data). Both
fail against the pre-fix pop behaviour (mutation-verified).
2026-06-15 15:33:15 +10:00
Teknium
c2b7669ad3 fix(s6): clear stale log lock before startup (#46289)
* fix(cli): clear stale s6-log lock file before startup on virtiofs

* chore(release): map salvaged contributor

---------

Co-authored-by: zxcasongs <35259607+zxcasongs@users.noreply.github.com>
Co-authored-by: Ben <ben@nousresearch.com>
2026-06-15 14:10:51 +10:00
Teknium
b770967263 fix(s6): persist profile gateway desired state (#46292)
* fix: persist s6 gateway desired state

* chore(release): map salvaged contributor

---------

Co-authored-by: Alfred Smith <alfred@my-cloud.me>
Co-authored-by: Ben <ben@nousresearch.com>
2026-06-15 14:02:10 +10:00
Teknium
61ee2dbfdb fix(s6): make profile gateway log parent writable (#46291)
* fix(gateway): chown logs/gateways parent so late-added profiles can log

The per-profile log service script created $HERMES_HOME/logs/gateways/
via 'mkdir -p' but only chowned the leaf logs/gateways/<profile>. When
the first log service boots in root context, the gateways/ parent stays
root:root; every profile registered later runs its log service as the
dropped hermes user, 'mkdir -p' fails with EACCES, and s6-log enters a
sub-second fatal crash-loop flooding the container log. The stage2
recursive heal does not catch it either: it is gated on needs_chown,
which is false when the top-level $HERMES_HOME is already hermes-owned.

Two complementary fixes:

- service_manager._render_log_run: chown the gateways/ parent
  (non-recursively) before the leaf chown. Runs on every root-context
  boot, so it also heals volumes already poisoned by older images.
- docker/stage2-hook.sh: seed logs/gateways in the as_hermes mkdir -p
  block; cont-init runs before any service starts, so the parent
  already exists hermes-owned when the first log/run does 'mkdir -p'.

The needs_chown repair loop needs no twin entry: it already chowns
logs/ recursively, which covers logs/gateways.

Fixes #45258

* chore(release): map salvaged contributor

---------

Co-authored-by: tangtaizhong666 <tangtaizhong792@gmail.com>
2026-06-15 13:47:05 +10:00
leo4226
f795513782 fix(windows): kill hermes before recreating venv to release _bcrypt.pyd lock (#45120)
On Windows, native Python extensions such as _bcrypt.pyd are loaded as
DLLs by any running hermes process. When the installer tries to recreate
the venv (Remove-Item -Recurse -Force "venv"), Windows denies the delete
because the DLL is still mapped into the running process.

Add a taskkill /F /T /IM hermes.exe call before the Remove-Item so any
hermes process tree is stopped first, releasing the file lock. A short
sleep gives the OS time to unload the image before deletion proceeds.

This mirrors the existing force_kill_other_hermes() guard already present
in the --update flow (update.rs), applying the same pattern to the full
reinstall/repair path through install.ps1.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-15 02:27:00 +00:00
Alli
8fe334b056 fix(desktop): inset hover-reveal trigger past the adjacent scrollbar (#44159)
The collapsed-pane hover-reveal trigger strip (14px wide, 6px edge
gutter) overlapped the neighboring scroller's 8px .scrollbar-dt
scrollbar, which sits flush with the window edge when the rail panes
are collapsed. Hovering the scrollbar revealed the file browser over
it, and clicks on the overlapped band hit the trigger instead of the
scrollbar thumb.

Widen the edge gutter to calc(0.5rem + 2px) so the strip clears the
scrollbar (rem-coupled to the .scrollbar-dt width) while still
covering the OS window-resize grab area inset.

Part of #44140 (item 2).

Co-authored-by: AIalliAI <285906080+AIalliAI@users.noreply.github.com>
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
2026-06-14 21:23:38 -05:00
Teknium
40d7c264f0 fix(s6): register profile gateways without auto-starting (#46266)
* fix(s6): prevent profile create from auto-starting gateway service

When hermes profile create runs inside an s6 container,
_maybe_register_gateway_service() calls register_profile_gateway()
which creates the service directory and triggers s6-svscanctl -a.
Previously the service always started immediately, causing profiles
that share the main gateway's bot token (e.g. Kanban worker profiles)
to fail with a token-lock conflict and persist gateway_state: running
— becoming zombies that resurrect on every container restart.

Wire the existing start_now parameter through the S6 implementation:
when start_now=False, write a  marker file (same pattern as
container_boot.py _register_gateway_slot) so s6-supervise leaves the
service stopped until the user explicitly runs hermes -p <profile>
gateway start.

4 files, +61/-6, 4 new tests (all passing).

* test(docker): wait for gateway running state before restart

---------

Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>
2026-06-15 11:43:23 +10:00
Teknium
4eb0ff639b Remove is_container check when restarting over dashboard (#46290)
Co-authored-by: IAvecilla <ignacio.avecilla@lambdaclass.com>
2026-06-15 11:09:23 +10:00
Teknium
f3fe99863d revert(web): remove keyless Parallel search fallback (#46350)
Remove the free Parallel Search MCP path and restore the keyed Parallel backend behavior from before it was introduced.

Also drops the keyless fallback registration/display labeling tests and returns the Parallel SDK pin to the prior version.
2026-06-14 16:47:57 -07:00
Teknium
a829e04d62 fix: migrate cloned profile configs (#46345) 2026-06-14 16:30:23 -07:00
Teknium
2a14e8957d fix(kimi): surface K2.7 Code in native picker (#46309) 2026-06-14 14:01:03 -07:00
mr-r0b0t
bff78a34dc feat(zai): add GLM-5.2 with verified 1M context window
GLM-5.2 ships with a 1M (1,048,576) token context window. Without this
entry, Hermes falls through to the generic 'glm' key (202,752 tokens),
under-reporting the context bar and prematurely compressing conversations.

The 1M limit was verified empirically via needle-in-a-haystack retrieval
at 789,240 prompt tokens on api.z.ai/api/coding/paas/v4 — zero errors,
zero truncation, correct retrieval at every tested size (25K through 789K).

Changes:
- agent/model_metadata.py: add 'glm-5.2': 1_048_576 before 'glm' fallback
- hermes_cli/models.py: add glm-5.2 to zai curated models
- hermes_cli/setup.py: add glm-5.2 to setup wizard zai list
- hermes_cli/auth.py: add glm-5.2 to coding plan endpoint probes
- plugins/model-providers/zai/__init__.py: add glm-5.2 to fallback_models
- tests/agent/test_model_metadata.py: context resolution + vendor-prefix tests
2026-06-14 13:50:36 -07:00
Teknium
4e6d05c6a5 perf(skills): share raw config cache in skill utils (#46149) 2026-06-14 11:14:58 -07:00
Teknium
a1f51feb72 fix(telegram): avoid rich final duplicate previews (#46206) 2026-06-14 11:13:38 -07:00
kshitij
6c34088a17 Merge pull request #46237 from kshitijk4poor/salvage/46095-cross-process-cache
fix(gateway): cross-process agent-cache coherence (#45966) + preserve prompt caching
2026-06-14 23:05:17 +05:30
kshitij
fc2b8b3d31 Merge pull request #46236 from kshitijk4poor/salvage/disabled-skills-union
fix(skills): platform-disabled skills still appear in <available_skills> + unify all resolution sites (#46201)
2026-06-14 23:00:11 +05:30
kshitijk4poor
3bc4a2ff78 fix(gateway): re-baseline agent-cache message_count after each turn
The #45966 cross-process coherence guard snapshots a session's on-disk
message_count next to the cached agent and rebuilds the agent when the
count changes.  But the snapshot is taken at agent-BUILD time — before
the turn writes its own user + assistant (+ tool) rows — and the cache
entry is never rewritten on a reuse.  So this process's OWN turn grows
message_count, and the very next turn sees a mismatch and rebuilds the
agent.  That happens every turn, for every conversation, silently
destroying the per-conversation prompt caching the cache exists to
protect (AGENTS.md: prompt caching is sacred).

Add _refresh_agent_cache_message_count(): after a turn completes and the
agent has flushed its rows to the SessionDB, re-baseline the stored count
to the now-current value.  The guard then fires ONLY when a DIFFERENT
process changes the transcript — preserving the #45966 fix while keeping
the cache warm for normal single-process operation.

Tests drive the real SessionDB + the real guard condition: 5 consecutive
same-process turns now all REUSE the cached agent (0 before the fix); a
cross-process append still invalidates; and the re-baseline is fail-safe
(no DB, falsy session_id, raising probe, legacy 2-tuple, pending sentinel
all no-op).
2026-06-14 22:58:55 +05:30
kshitijk4poor
ce19fdb7ce fix(skills): apply global|platform disabled union to all resolution sites
The platform-disabled fix landed only in agent.skill_utils.get_disabled_skill_names
(the system-prompt path). Two sibling resolvers still used the old
replace-not-union semantics, so the same skill could be hidden from the
<available_skills> prompt yet reported enabled elsewhere:

- hermes_cli/skills_config.get_disabled_skills (the 'hermes skills config' UI)
  returned only the platform list, so a globally-disabled skill showed as
  enabled (unchecked) on any platform with a platform_disabled entry.
- tools/skills_tool._is_skill_disabled (gates whether skill_view loads a skill)
  ignored the global list when a platform list existed, so a globally-disabled
  skill could still be loaded on such a platform.

Both now union the global list with the platform list, matching
get_disabled_skill_names. An explicit empty platform list no longer re-enables
a globally-disabled skill — global disables hold on every platform (#46201).

Also: fix the now-stale get_disabled_skill_names docstring and drop a stray
blank line. Regression tests added for both sites (proven to fail on the old
replace semantics).
2026-06-14 22:54:54 +05:30
kyssta-exe
7f245b0035 fix(gateway): invalidate agent cache on cross-process session writes (#45966)
(cherry picked from commit 6d0f79defe)
2026-06-14 22:54:39 +05:30
ibrahim özsaraç
7bbe7024c2 fix: filter platform-disabled skills from <available_skills> prompt (#46201)
build_skills_system_prompt() already resolved _platform_hint but called
get_disabled_skill_names() with no argument, so the resolved platform never
reached the filter and the prompt cache_key varied by platform while the
disabled set did not. Pass _platform_hint or None.

get_disabled_skill_names() also fully ignored the global 'disabled' list once
a platform-specific list was found. Return the union (global | platform) so a
globally-disabled skill stays disabled on every platform.

Salvaged from #46203 by @iborazzi; the unrelated apps/shared/tsconfig.json
ES2023 bump is intentionally dropped (one concern per PR).
2026-06-14 22:52:57 +05:30
Teknium
7433d5f0eb fix(gateway): scope early duplicate guard to pid file 2026-06-14 08:42:06 -07:00
konsisumer
1436793051 fix(gateway): block shell gateway run when a service supervises the profile 2026-06-14 08:42:06 -07:00
brooklyn!
08d89e7aba fix(desktop): limit thinking shimmer to the disclosure label (#46197)
Reasoning body text was inheriting tw-shimmer while streaming even though
the "Thinking" header already pulses — keep shimmer on the label only.
2026-06-14 10:14:58 -05:00
Teknium
2c174bce24 fix(gateway): preserve new input on interrupted replay cleanup 2026-06-14 05:10:39 -07:00
Arnaud L
5191c1c2ce fix(gateway): stop replaying interrupted tool-call tails and auto-continue notes
Three changes to prevent infinite re-execution loops when a user sends
a new message while long-running tools are executing:

1. Filter interrupted tool results in _build_gateway_agent_history:
   skip tool messages whose content contains [Command interrupted] or
   exit_code 130 — they represent partial execution, not valid results.

2. Don't replay auto-continue notes as user messages: detect
   gateway-injected [System note: ...] / [IMPORTANT: ...] prefixes
   and skip them in _build_gateway_agent_history so the LLM doesn't
   see 4+ messages from 'the user' telling it to finish old work.

3. Fix the wording: the system note now instructs the model to
   address the user's NEW message FIRST, IGNORE pending results,
   and NOT re-execute old tool calls.

Closes #45230
2026-06-14 05:10:39 -07:00
Teknium
0f3670ba79 chore(release): map Diyoncrz18 author email 2026-06-14 04:52:54 -07:00
Diyon18
288f7026e3 fix(messaging): correct Weixin personal account labeling 2026-06-14 04:52:54 -07:00
Teknium
efbe1635dd fix(gateway): include replied-to media attachments (#46107) 2026-06-14 04:51:50 -07:00
Teknium
a27d7e68cc fix(mcp): block suspicious stdio configs before probe (#46112) 2026-06-14 04:46:54 -07:00
Teknium
13a1bd0f83 perf(model-metadata): persist OpenRouter metadata cache (#46114) 2026-06-14 04:45:46 -07:00
Teknium
0e22bf6439 docs(gateway): document exact silence tokens (#46105) 2026-06-14 04:37:18 -07:00
Teknium
972a9885ee fix(mcp): block exfil-shaped stdio server configs (#46083) 2026-06-14 04:24:14 -07:00
Teknium
9459057d7f fix(telegram): guard rich details math crash (#46102) 2026-06-14 04:22:22 -07:00
Teknium
cf7d5932f8 fix(email): make IPv4 SMTP fallback use supported sockets 2026-06-14 04:16:26 -07:00
liuhao1024
04d4471d79 fix(email): use SMTP_SSL for port 465 and fall back to IPv4 on timeout
Port 465 expects implicit TLS (SMTP_SSL) from the first byte. The email
adapter always used SMTP() + starttls(), which is correct for port 587
but hangs/fails on port 465 providers (e.g., Swiss ISPs).

Additionally, when the SMTP host has AAAA DNS records but IPv6 is
unreachable, socket.create_connection() tries IPv6 first and hangs
until timeout. Add an IPv4 fallback via AF_INET socket.

Extract _connect_smtp() helper to consolidate the 4 duplicate SMTP
connection sites into a single method with correct protocol selection
and IPv6 fallback logic.
2026-06-14 04:16:26 -07:00
alt-glitch
1ddf7a1021 opentui(v6): bench fixture — HERMES_BENCH_TOOL_BODY_LINES knob for fat tool output
The default lumpy-turn fixture's tool bodies are tiny (2/7/18 short lines), so
W3 (HERMES_TUI_TOOL_OUTPUTS=off) shows no RSS delta at realistic sizes — the
saved bytes sit below the ~20MB run-to-run noise floor. This adds an env knob
to scale every tool result body to N lines, making the retention asymmetry
measurable in the bench (a `find /` / big-file-read class fixture).

UNSET = the original tiny bodies, byte-identical — existing benches and the
determinism digest are unaffected. Set HERMES_BENCH_TOOL_BODY_LINES=N (the
bench harness inherits it at fixture-generation time) for a fat run.

Measured with this (mem300, ~100KB/tool, ~27MB retained tool text): OpenTUI
OFF 244-259MB vs ON 260-261MB vs Ink 269-279MB — i.e. W3 OFF is a real
~5-15MB win at heavy output, and OpenTUI's windowing actually beats Ink at
scale (Ink mounts every row; OpenTUI windows).
2026-06-14 16:12:10 +05:30
Teknium
5105c3651a perf(api-server): normalize chat content linearly (#46079) 2026-06-14 03:25:49 -07:00
Aldo
293c04fef6 fix(gateway): suppress exact silence tokens without mutating history 2026-06-14 03:25:08 -07:00
Teknium
10bad2faf1 fix(gateway): serialize startup auto-resume before inbound (#46074)
Gateway startup now queues real inbound messages until restart-interrupted auto-resume turns have completed, preventing duplicate agents for the same session after a restart.
2026-06-14 03:21:06 -07:00
Teknium
2b4873f7fb fix(agent): persist repaired-turn responses (#46071) 2026-06-14 03:20:25 -07:00
Teknium
723c2331bd fix: make profile subprocess HOME policy explicit 2026-06-14 03:20:21 -07:00
zccyman
b00060ce54 fix(agent): expose HERMES_REAL_HOME in subprocess envs for profile isolation
When profile isolation activates ({HERMES_HOME}/home/ exists), child
processes receive HOME={HERMES_HOME}/home/ for tool config isolation
(git, ssh, gh). However, scripts using Path.home() to locate
~/.hermes/ would incorrectly resolve to the isolated profile home,
breaking helpers that rely on the real user home directory.

New get_real_home() helper in hermes_constants resolves the actual
user home independently of profile isolation. All four subprocess
spawners now inject HERMES_REAL_HOME alongside the profile HOME:

- tools/code_execution_tool.py (execute_code)
- tools/environments/local.py (terminal background, run_env)
- agent/copilot_acp_client.py (Copilot ACP)

Child scripts can now use:
  Path(os.environ.get("HERMES_REAL_HOME", os.environ.get("HOME", "")))

to reliably find the real user home regardless of profile isolation.

Closes #25114
2026-06-14 03:20:21 -07:00
Teknium
0428945b5b fix(desktop): keep profile homes out of bootstrap (#46073) 2026-06-14 03:08:52 -07:00
Teknium
afc8615509 perf(webhook): prune request caches incrementally (#46065) 2026-06-14 02:40:54 -07:00
LeonSGP43
89bdb1e546 fix: read dashboard spa assets as utf-8
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-06-14 02:31:04 -07:00
Teknium
7b9dc7cd0a test(gateway): align web profile wrapper expectation 2026-06-14 02:20:55 -07:00
helix4u
d76a58bd15 fix(gateway): resolve sudo profile system installs 2026-06-14 02:20:55 -07:00
Teknium
1f5eef8093 test(tui): tolerate resume init kwargs in protocol tests 2026-06-14 02:15:33 -07:00
Teknium
9f33d673e9 fix(tui): persist resumed profile cwd updates to profile db 2026-06-14 02:15:33 -07:00
dsad
d842155da1 Keep resumed profile cwd scoped to profile DB 2026-06-14 02:15:33 -07:00
helix4u
4936a49a0c fix(mcp): preserve loop during probes 2026-06-14 02:09:45 -07:00
helix4u
85e6232a07 fix(providers): support anthropic proxy v1 endpoints 2026-06-14 02:09:16 -07:00
Teknium
81e42335a1 fix(file-safety): relax user-write deny policy (#45947)
Allow file tools to edit shell startup files, user package-manager configs, and Hermes control files that the user can already modify directly. Keep hard blocks for SSH keys, .env/OAuth token stores, mcp-tokens, pairing files, and system privilege files.
2026-06-14 02:07:32 -07:00
alt-glitch
a70f7f3b7b opentui(v6): proactive idle GC, gated on the low-mem heap knob (W2)
OpenTUI-only by design (verified: Ink never calls global.gc proactively — it
only exposes it for heapdumps; spec D5 sanctions the divergence since this is
opt-in). Default / unconstrained sessions do NOTHING.

boundary/proactiveGc.ts: a low-frequency watcher that calls global.gc() only
when ALL hold: (a) the low-mem opt-in is active — HERMES_TUI_HEAP_MB set at or
below 4096 (the same W1 knob; HERMES_TUI_PROACTIVE_GC can force on/off), AND
global.gc is exposed (W1's --expose-gc — else a silent no-op); (b) a turn is
NOT streaming (reads the store's info.running) so it never collects mid-reply;
(c) a full idle window has elapsed since the last activity. Eagerness: once RSS
crosses 400MB the idle window tightens (8s→3s) but STILL waits for idle — never
a mid-stream pause (the jank the campaign fought). Reuses
process.memoryUsage().rss (same read as memlog); the timer is unref'd and every
failure path disables silently. Wired in entry/main.tsx next to startMemlog,
scoped acquire→release.

Tests: gating (low-cap-on / high-cap-off / no-gc-off / explicit on|off) +
timing (fires after the idle window, never while streaming, stop() halts).
proactiveGc.test.ts 10 passed. npm run check OK (785 tests). Verified live that
the gate enables under HERMES_TUI_HEAP_MB=256 and global.gc is callable.
2026-06-14 14:17:50 +05:30
alt-glitch
6e3c393ef9 opentui(v6): configurable V8 heap (HERMES_TUI_HEAP_MB) + --expose-gc (W1)
Low-mem enabler — default unchanged (8192 for both engines), low blast radius.

- Heap knob honored by BOTH engines via the shared NODE_OPTIONS injection:
  HERMES_TUI_HEAP_MB env (highest precedence, matches the HERMES_TUI_ENGINE
  env-first pattern) > display.tui_heap_mb config (minimal early YAML read,
  mirrors _config_tui_engine_early) > the existing cgroup-aware default. The
  override REPLACES the 8192 default inside _resolve_tui_heap_mb (D3): low =
  the low-mem opt-in, high = raise the ceiling. The cgroup-fit 75% clamp still
  applies on top, so an override never exceeds the container. A non-secret
  behavioral setting → config.yaml, NOT the denylisted NODE_OPTIONS bridge.

- --expose-gc added to the OpenTUI argv in _make_opentui_argv (D4, parity with
  Ink which already has it). Must be an argv flag — Node rejects --expose-gc in
  NODE_OPTIONS. Makes global.gc() a real call so the engine's GC hooks
  (/heapdump; W2's proactive idle GC) work instead of silent no-ops. Verified:
  `node --expose-gc -e 'typeof global.gc'` → "function" (vs "undefined").

Tests: TestHeapOverride (env>config precedence, cgroup clamp on a too-high
override, low override honored under a big container, garbage/non-positive
fall-through) + TestExposeGcOnOpenTuiArgv. test_tui_heap_sizing.py 29 passed.
2026-06-14 14:15:32 +05:30
alt-glitch
c7e5215b50 opentui(v6): HERMES_TUI_TOOL_OUTPUTS flag — drop tool-body retention (W3)
The biggest real memory lever: OpenTUI retained full resultText + the raw
result dict + the args dict per tool call, while Ink discards tool bodies
(keeps only a short context line). That retention asymmetry is the bulk of the
Ink-vs-OpenTUI memory gap.

New `HERMES_TUI_TOOL_OUTPUTS` flag (toolOutputsEnabled() in env.ts, default
ON — the rich tool cards are OpenTUI's differentiator). When OFF, the
tool.complete reducer (store.ts) neither BUILDS nor STORES the body: skip the
whole result_text/result stringify+envelope-strip work and suppress
part.resultText / part.result / part.args / part.argsText / part.lineCount /
part.omittedNote. KEPT either way: name, state, duration, error, summary,
argsPreview (the redaction-safe one-liner from tool.start context = Ink's
context line), and the file-edit diff (diffUnified/diffStats — a diff is a
high-value surface, not generic "output"). Render is automatic: with no
resultText/result, defaultRenderer.expandable() is false → header-only row =
Ink parity, no extra view gating needed.

This powers the bench's fair Ink-vs-OpenTUI comparison (D8 — launch OpenTUI
with outputs off so both engines are body-less = pure engine overhead) and the
low-mem mode.

Tests: store retains rich outputs by default; OFF drops the bodies but keeps
name/duration/error/argsPreview/diff. npm run check OK (770 tests).
2026-06-14 13:56:08 +05:30
alt-glitch
25686feebf opentui(v6): bump @opentui 0.4.0 -> 0.4.1 + openConsoleOnError:false (W5)
Bump the three @opentui pins (core/keymap/solid) 0.4.0 -> 0.4.1 + lockfile.
The headline upstream change is native-yoga (#1126); per the locked spec
decision D11 this is NOT a memory-floor lever at typical sizes, and a fresh
bench on this branch confirms it — OpenTUI capped VmHWM is within run-to-run
noise across mem50/100/300 (0.4.0: 193/200/220 vs 0.4.1: 192/218/217 MB).
The value is tail-session layout wins + upstream alignment, not the floor.

D14 (ffiSafe re-verify): the FFI signatures ffiSafe.ts clamps (OptimizedBuffer
fillRect/drawText/setCell/setCellWithAlphaBlending/drawChar + TextBufferView
setViewport) are byte-identical across 0.4.0->0.4.1 (still u32, still crash on
negatives under node:ffi), so the shim stays as-is and remains load-bearing.
Verified live: scrolled a 300-msg transcript with syntax-highlighted code +
tool cards past the viewport top (the negative-y fillRect trigger) on 0.4.1 —
no ERR_INVALID_ARG_VALUE loop, clean render.

Also pick up openConsoleOnError:false (public createCliRenderer option): stops
core's uncaught-error handler from calling the ALLOCATING console.show(), which
exit-7-masks the original error under native-handle exhaustion (the bench
mem3000 postmortem). guardRendererErrorHandlers stays as belt-and-suspenders.

Gate: npm run check OK (768 tests). Determinism gate green both engines.
2026-06-14 13:53:11 +05:30
brooklyn!
526a1e24b5 Merge pull request #46029 from NousResearch/bb/summarize-gui
fix(desktop): show summarizing indicator during auto-compaction
2026-06-14 02:53:14 -05:00
Brooklyn Nicholson
1eb13744b4 fix(desktop): polish compaction indicator and preserve scrollback
Show a shimmering "Summarizing thread" label during auto-compaction, skip
the post-turn hydrate when compaction fired so the live transcript does not
collapse to the stored summary-only session.
2026-06-14 02:48:48 -05:00
brooklyn!
49dd91d682 fix(desktop): show copied checkmark on session Copy ID (#46030)
Route sidebar Copy ID through CopyButton so dropdown and context menus
get the same checkmark feedback as every other copy action.
2026-06-14 07:38:55 +00:00
Brooklyn Nicholson
715b691723 fix(desktop): show summarizing indicator during auto-compaction
Auto-compression rewrites history mid-turn, which made long threads look
like they reset. Re-tag the gateway lifecycle status as compacting and
surface it in the desktop thread loading indicators.
2026-06-14 02:28:07 -05:00
alt-glitch
8e3b320eb8 opentui(v6): credits/usage notice chrome banner (Ink parity)
The OpenTUI engine received the gateway's credits/usage notices but
mis-rendered them as scrolling inline transcript cards with no lifecycle.
Render them instead as a persistent, level-tinted chrome banner pinned
directly above the status bar, matching the Ink engine — no gateway/agent
changes (the wire + credits policy stay the source of truth).

- backgroundActivity.ts: widen level to include `success` (was silently
  dropped to info) + add isChromeNotice() (kind sticky|ttl) discriminator.
- store.ts: port the Ink turnController notice lifecycle — showNotice/
  applyNotice/clearNotice/flushPendingNotice/clearNoticeState, a single TTL
  timer (latest-wins, id-guarded), mid-turn hold + turn-end reveal (the
  three end sites: message.complete, gateway.exited, error), flash-and-yield
  for credits.usage/grant_spent at message.start, and a notice reset on
  clearTranscript + commitSnapshot so it can't bleed across sessions. Route
  notification.show by kind: sticky|ttl -> chrome banner, everything else
  (process/background completion cards) -> existing inline path, unchanged.
  Distinct clones for notice vs lastNotification (createStore aliasing).
- noticeBanner.tsx + App.tsx: a single sticky row above the status bar,
  text rendered verbatim (already glyphed by the policy), tinted by level,
  width-truncated so it can never wrap and push the composer.

Tests: statusNotice.test.ts (lifecycle/routing/TTL/flash-and-yield),
noticeBanner.test.tsx (render/color/truncation), backgroundActivity +
render additions. npm run check OK (768 tests).
2026-06-14 12:56:21 +05:30
alt-glitch
f5823277dc opentui(v6): clarify markdown + cold slash-highlight + @-mention race fix
Three user-reported TUI fixes:

- clarify prompt rendered raw markdown (literal **bold** / `code`). The
  question + each choice now go through the native <markdown> renderable
  (same engine as the transcript) in a flex column so wrapping + the
  selection accent are preserved. Tests assert structural chrome since
  tree-sitter markdown doesn't paint in the headless renderer (same
  limitation as render.test.tsx); painted markdown verified in a live smoke.

- a leading `/path` first message broke @-mention completion afterward:
  onType fired completion RPCs per keystroke with no out-of-order guard,
  and the transport doesn't guarantee in-order delivery, so a slow orphaned
  complete.slash could land after a later @-mention complete.path and blank
  the dropdown. Add createCompletionGate (pure) — claim() per keystroke,
  isCurrent(token) drops any superseded response.

- slash-command highlighting was hit-or-miss (only highlighted a /command
  if its completion batch had been browsed earlier): LEARNED_NAMES started
  empty and grew lazily. Seed it once at boot from the full uncapped
  commands.catalog via seedLearnedNames, so a cold /command highlights on
  the first keystroke.

npm run check OK (768 tests).
2026-06-14 12:56:21 +05:30
brooklyn!
9cbb91abd3 fix(desktop): clarify UX — loading, enter-to-send, radio align (#46014)
* fix(desktop): clarify enter-to-send and top-align choice radios

Match the composer keyboard contract in clarify freeform answers and align choice-row radio dots to the start of wrapped labels.

* fix(desktop): clarify loading spinner until request is ready

Hold the clarify panel on a centered Loader2 until clarify.request arrives instead of showing disabled choices or a loading-question stub.

* refactor(desktop): dedupe clarify shell and drop stale ready gates

Extract the shared clarify panel wrapper and remove disabled-state checks that loading already makes unreachable.
2026-06-14 07:06:40 +00:00
kshitij
c8ad2ca997 Merge pull request #46013 from kshitijk4poor/salvage/refusal-content-filter
fix(agent): surface model refusals as content_filter (salvage #43108 + edge-case fix)
2026-06-14 12:28:51 +05:30
kshitijk4poor
10bd01972b refactor(agent): share the content_policy_blocked result builder + recovery hint
The HTTP-200 refusal handler (finish_reason=content_filter) and the
exception-path handler (a provider moderation error classified as
content_policy_blocked) independently built the same terminal turn result —
the same {final_response, messages, api_calls, completed:False, failed:True,
error:'content_policy_blocked: ...'} dict — and ended their user-facing
message with the same 'Try rephrasing... hermes fallback add' trailer, copied
verbatim. The two copies could drift.

Funnel both through a shared _content_policy_blocked_result() builder and a
shared _CONTENT_POLICY_RECOVERY_HINT constant. Also collapse the HTTP-200
path's two near-identical with/without-explanation templates into one (compute
the detail fragment once) and pass reason=FailoverReason.content_policy_blocked
.value to the error hook instead of a hand-written string literal, matching the
sibling hook call.

Behavior-preserving: the provider/refusal lead-in wording stays distinct (a
provider safety filter vs the model declining are genuinely different signals),
the with-text and exception messages are byte-identical to before, and the
no-explanation case only gains a paragraph break for consistency. Surfaced by
the simplify-code reuse/quality reviewers.

The efficiency reviewer's 'redundant normalize_response' flag was deliberately
NOT applied: that branch is cold (refusal-only) and pure-CPU, and reusing the
sibling-branch normalized locals would risk a NameError on the codex_responses
path (which sets finish_reason without normalizing) — re-normalizing is the
robust choice.
2026-06-14 12:19:19 +05:30
kshitijk4poor
12c84d6c77 fix(transports): only treat a refusal as terminal when it is the sole payload
A chat-completions response that carries real text or tool calls *alongside*
a `message.refusal` note is a normal, usable turn — the model did work. The
prior logic flipped finish_reason to `content_filter` whenever a refusal
string was present, so the conversation loop reframed a content-bearing turn
as a *failed* safety refusal (failed=True) and buried the model's actual
output inside the "model declined" template, or dropped tool calls entirely.

Only promote to a terminal `content_filter` when the refusal is the sole
payload (no visible text AND no tool calls). The refusal explanation is still
recorded in provider_data in every case for observability. Refusal-only
responses (the bug this feature targets) are unaffected and still surface
terminally; the empty+refusal, bare content_filter passthrough, and no-refusal
common cases are byte-identical to before.

Updates the partial-content test to the corrected contract and adds a
tool_calls-alongside-refusal regression guard.
2026-06-14 12:12:52 +05:30
SHL0MS
ab26541b9a test(transports): lock in content_filter passthrough for OpenRouter
OpenRouter (and every other OpenAI-compatible provider) uses the default
chat_completions transport, so it is already covered by the refusal fix:
an upstream Claude / moderation refusal arrives as
finish_reason="content_filter" (often empty content, no message.refusal).
Add a regression test asserting the transport passes that finish reason
straight through to the loop's content_filter handler.

(cherry picked from commit 60168a513b)
2026-06-14 12:10:08 +05:30
SHL0MS
bb46bf8ce4 fix(agent): surface model refusals instead of retrying them as errors
A Claude refusal (HTTP 200, stop_reason="refusal", empty content) was
laundered into a generic retry loop and surfaced as a misleading
"rate limited / invalid response" or "no content after retries" error,
burning paid attempts reproducing a deterministic refusal.

This hit two distinct paths:

- Direct Anthropic (anthropic_messages): validate_response rejected the
  empty-content refusal *before* normalize_response mapped refusal ->
  content_filter, so it fell into the invalid-response retry loop.
- Nous Portal / OpenAI-compatible (chat_completions): the portal surfaces
  a Claude refusal via message.refusal with empty content, which sailed
  past validation and died in the empty-response retry loop.

Fix (one unified content_filter dispatch for all backends):
- AnthropicTransport.validate_response: accept empty content when
  stop_reason == "refusal" so it flows to normalize_response.
- ChatCompletionsTransport.normalize_response: promote message.refusal to
  content + a content_filter finish reason.
- conversation_loop: handle finish_reason == "content_filter" - fire the
  api_request_error hook (content_policy_blocked), try a configured
  fallback once, else return a clear terminal refusal message. Never retry
  a deterministic refusal.

Supersedes #43084, which fixed only the direct-Anthropic path and could
not reach the chat_completions/portal path.

Tests: transport-level (validate_response refusal, message.refusal
promotion) + end-to-end loop (refusal surfaced, exactly one API call).

(cherry picked from commit 01f546f92c)
2026-06-14 12:10:08 +05:30
brooklyn!
4b5ba112ad fix: shrink images to reported provider dimension limit (#45979)
Parse provider-reported image pixel ceilings so many-image Anthropic requests can recover by shrinking Retina screenshots below the stricter limit instead of retrying the same rejected payload.
2026-06-14 01:07:43 -05:00
brooklyn!
cdf30a7ac6 Merge pull request #45866 from NousResearch/bb/desktop-notifications
feat(desktop): native OS notifications with per-type toggles
2026-06-14 00:36:38 -05:00
Brooklyn Nicholson
b0288ae9b6 feat(desktop): move completion-sound picker into Notifications settings
The turn-end sound is a notification concern, not an appearance one — relocate
the variant picker + preview from the Appearance tab to the Notifications tab
(its i18n keys move from settings.appearance to settings.notifications with it).
2026-06-14 00:31:09 -05:00
Brooklyn Nicholson
630a4ef03c feat(desktop): native OS notifications with per-type toggles
Adds a native OS notification system (Electron Notification, routed cross-OS)
distinct from the in-app toast feed. Before this, one hardcoded cue existed
(message.complete while document.hidden) with no settings or event coverage.

- Engine (store/native-notifications.ts): localStorage-backed prefs (master
  switch + per-kind toggles) and a gated dispatcher over five kinds — approval,
  input, turnDone, turnError, backgroundDone — with a 1s per-(kind,session)
  self-evicting throttle.
- Gating: "backgrounded" = document.hidden OR !document.hasFocus(), so an
  alt-tabbed window still counts as away. Completion kinds fire only when
  backgrounded and for the active session (no spam from a busy gateway);
  attention kinds (approval/input) also break through for off-screen sessions.
- Wired into real event sites (use-message-stream.ts): message.complete, error,
  approval/clarify/sudo/secret.request; backgroundDone from composer-status at
  the running -> exited transition.
- Click focuses the window and jumps to the originating session; approval
  notifications carry Approve/Reject buttons that resolve in place over
  approval.respond, mirroring the in-app Run/Reject bar.
- Settings: new Notifications panel (master + per-kind switches, test button
  with real OS-result feedback). Full i18n (en/ja/zh/zh-hant).
2026-06-14 00:31:03 -05:00
alt-glitch
33924c074c docs(opentui): point bench references at the tui-bench repo
bench/ moved to github.com/NousResearch/tui-bench; repoint the lingering
references in dev-handoff, env-flags, memory-story, ui-opentui README, and
the memlog/memSampler/reconciler source comments.
2026-06-14 10:56:08 +05:30
brooklyn!
b4ba3f5e3b feat(desktop): add curated completion cue for agent turn completion (#42480)
* feat(desktop): add curated completion sound bank for turn completion

Replace the prior haptic-only completion cue with a curated Web Audio completion sound flow, defaulting to the minimal two-note comfort preset while keeping alternate presets available for quick iteration. Play the cue on every message completion event (including background sessions) so turn-end feedback is consistent across active and non-active chats.

* refactor(desktop): drop done1 byte sample from completion bank

Keep the curated Web Audio presets only; the embedded sample added bulk without shipping as the default cue.

* feat(desktop): expand completion sounds and add Appearance picker

Add fourteen synthesized turn-end presets with preview in settings, persisted variant selection, and softer default mixing for late-night use.

Co-authored-by: Cursor <cursoragent@cursor.com>

* refactor(desktop): dedupe completion-sound resolver, trim audio comments

Make the store the single source of truth for the variant default + range
validation and have the sound lib import it (one-way lib→store edge, no
cycle), instead of two divergent copies. Extract the shared white-noise
buffer used by the air/whoosh voices and cut the synth comments down to
why-only notes.

---------

Co-authored-by: Austin Pickett <pickett.austin@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-14 00:21:40 -05:00
Teknium
8f278403d1 perf(execute-code): stop waiting on idle RPC accept (#45948) 2026-06-13 21:57:15 -07:00
Teknium
1b16c48170 fix: guard OAuth account removal 2026-06-13 21:47:13 -07:00
Flownium
e986e3fc68 fix: add provider account removal 2026-06-13 21:47:13 -07:00
Justin Sunseri
12682d96b9 feat(telegram): restore rich messages opt-out
Salvages PR #45840's client-compatibility opt-out while keeping rich messages enabled by default via telegram.extra.rich_messages: true.
2026-06-13 21:45:49 -07:00
aimable100
8d5d36d793 fix(dispatch): forward session_id into registry.dispatch (#28479)
Both the regular and execute_code dispatch paths forward task_id into
registry.dispatch via middleware _dispatch lambdas but silently dropped
session_id. Dispatch-layer hooks (e.g. set_enforcement_fn) that correlate
calls with the active session received "" for every invocation.

Pass session_id=session_id at both _dispatch call sites inside
handle_function_call, matching the existing task_id pattern. Hooks
already received session_id; this closes the registry.dispatch gap.

Rebased onto current main where dispatch is wrapped by
run_tool_execution_middleware — the old direct-dispatch sites from
#28479 no longer exist.

test(dispatch): add tests for session_id forwarding (NousResearch#28479)

Covers standard and execute_code paths through the middleware wrapper.
Verifies task_id forwarding is not broken by the change.
2026-06-14 00:27:59 -04:00
Teknium
7aaae7acd0 fix(ssl): align guard docs and escape hatch 2026-06-13 21:14:32 -07:00
Teknium
73d1357747 style(agent): keep run_agent import order stable 2026-06-13 21:14:32 -07:00
Teknium
af1995a838 chore(release): map chromalinx noreply author 2026-06-13 21:14:32 -07:00
Teknium
dc90ca4e17 fix(ssl): run CA guard during agent initialization 2026-06-13 21:14:32 -07:00
Teknium
af5b526472 fix(ssl): validate CA bundle paths before provider calls 2026-06-13 21:14:32 -07:00
chromalinx
b42c5bf652 test(ssl_guard): fix macOS fallback test that passed for the wrong reason
The previous test patched ssl.create_default_context globally with a bare
SSLContext that has zero CA certs. Both verify_ca_bundle() and the macOS
fallback got the same mocked context, so the test verified nothing useful:
both paths produced empty get_ca_certs() and the assertion that no
exception escaped was vacuously satisfied.

Only mock the fallback call (no cafile) — let the certifi call hit the
real SSL stack and fail with SSLError on the broken PEM. The mock
fallback returns a context with load_default_certs() so the test now
verifies the real scenario: broken certifi → SSLConfigurationError,
macOS system trust store → success.

Also pads the broken PEM past the 1 KB size guard so the size check
doesn't short-circuit before ssl.create_default_context(cafile=...) runs.

Reported by @liuhao1024 in PR review.
2026-06-13 21:14:32 -07:00
chromalinx
a218a0f156 fix(agent,gateway,doctor): add SSL CA cert bundle fail-fast guard
A stale certifi CA bundle after a partial `hermes update` used to crash
the agent on the first outbound HTTPS call with a raw traceback and
trap the gateway in a retry loop.

This patch:

* Adds `agent/errors.py` with a typed `SSLConfigurationError`
* Adds `agent/ssl_guard.py` with a `verify_ca_bundle()` pre-flight
  that asserts the bundle exists, is non-trivial in size, and can build
  a working SSLContext. On macOS, it falls back to the system trust
  store when the bundle is empty but the system store is healthy
  (covers corporate proxies / MDM setups).
* Wires the guard into `run_agent.py` and `gateway/run.py` right
  after the `hermes_bootstrap` import, inside a try/except so a bug
  in the guard itself can never prevent startup.
* Adds a `SSL / CA Certificates` section to `hermes_cli doctor` so
  users can detect the failure with one command.
* Adds unit tests covering the healthy, missing, empty, skip-env, and
  macOS-fallback paths.
* Adds an RCA document describing the failure mode and the recovery
  path (`pip install -e .`).

When the bundle is broken the user sees:

    \u26a0\ufe0f SSL certificate bundle issue detected.
       Run: pip install -e .

`HERMES_SKIP_SSL_GUARD=1` disables the check for sandboxed
environments that ship their own trust store.
2026-06-13 21:14:32 -07:00
Teknium
1106879147 perf(process): wake waiters on background completion (#45831) 2026-06-13 21:11:19 -07:00
alt-glitch
21c64f90aa feat(tui_gateway): emit notification.show on background-process completion (Option B)
A notify_on_complete background process (the agent's terminal tool, proc_*) only
reached the TUI as the AGENT'S NARRATION — the completion is fed to the model as a
synthetic prompt (_run_prompt_submit) and the gateway emitted only message.start +
the reply, never the completion itself. So the OpenTUI card mechanism had nothing
to render and the synthetic turn read as a context-less line.

Additive fix (glitch-approved relaxation of the no-core rule for this one emit):
new _emit_process_completion_card() fires a `notification.show` (text "<cmd> exited
<code>", level info/warn by exit code, kind process.complete, key proc:<id>) at the
two completion-delivery sites, just before the agent turn. The OpenTUI engine
renders it as a distinct inline card (P1) + an OSC ping; Ink treats it as a notice.
Completion events only (watch matches skipped); call-site dedup → one card per exit.
No existing behavior changes — the agent turn still happens.

Gateway suite green (321 passed incl. 5 new TestProcessCompletionCard cases).
2026-06-14 08:55:29 +05:30
brooklyn!
6b76284c77 fix(desktop): surface off-screen approvals via the jump-to-bottom control (#45853)
* fix(desktop): jump-to-approval pill for off-screen approvals

A blocked approval's only response surface is the inline Run/Reject bar on
the pending tool row. When that row is scrolled out of view the session looks
stalled with no visible action. Surface a composer-anchored "Approval needed"
pill only when an approval is pending AND its inline bar is scrolled away;
clicking scrolls the bar back into view. Preserves the deliberate inline (not
modal) approval design — the pill never duplicates the approve/reject controls.

The inline bar mirrors its own viewport visibility via IntersectionObserver
(tracks scroll/resize/layout) and registers a scroll-into-view handler the pill
fires, mirroring the existing thread-scroll jump-button bridge.

Supersedes #45828.

* fix(desktop): morph jump-to-bottom into approval prompt; drop scroll bridge

Collapse the separate "jump to approval" pill into the existing
scroll-to-bottom control: when scrolled away from the bottom while an approval
is pending, it relabels to "Approval needed". A parked approval's inline
Run/Reject bar is always the bottom-most content, so the existing
scroll-to-bottom action lands the user right on it — one control, no collision.

This also fixes the layout corruption from the first cut: the pill called
native el.scrollIntoView(), which scrolls every scrollable ancestor including
the overflow:hidden chat shell containers. Those have no scrollbar to scroll
back and don't remount on session switch, so the composer stayed shoved and
the breakage persisted across sessions. Reusing requestScrollToBottom() (the
use-stick-to-bottom path) only touches the one designated scroll container.

Removes the now-unused approval-scroll store + IntersectionObserver wiring.
2026-06-13 23:07:22 +00:00
Teknium
4026f526d5 chore(release): map MaxFreedomPollard author email 2026-06-13 15:01:42 -07:00
Max Pollard
9a2b976326 test(skills): add regression tests for bundled-update backup recovery
Three tests covering: a stale .bak poisoning a failed update's move/restore, an orphaned .bak misread as a user deletion, and a partially written dest blocking restore-on-failure. All three fail on current main without the fix.

Refs #44942
2026-06-13 15:01:42 -07:00
Max Pollard
3581131e7d fix(skills): make bundled-update backup handling crash-safe and idempotent
Recover an orphaned .bak before classification (interrupted updates no longer read as user deletions), clear a stale .bak before shutil.move (replace, not nest), and clear a partial dest before restore so restore-on-failure actually runs.

Fixes #44942
2026-06-13 15:01:42 -07:00
Teknium
bf8effad02 fix(utils): copy fallback for atomic replace across devices (#43852)
Fallback from `os.replace` on EXDEV/EBUSY using copy+fsync+unlink while preserving symlink target semantics and metadata.
2026-06-13 14:50:05 -07:00
Teknium
817f392311 feat(read): extract notebook and office documents (#37082)
Add stdlib-only extraction for `.ipynb`, `.docx`, and `.xlsx` in read_file with lazy integration and malformed-document fallback.
2026-06-13 14:42:51 -07:00
Teknium
2b67e96aec fix(approval): gate in-place edits to sensitive user files
Cover sed, perl, and ruby in-place mutations against shell rc, SSH, and credential files so terminal approvals pair the redirection and copy guards.
2026-06-13 14:35:27 -07:00
helix4u
abd69b8117 fix(approval): detect absolute home shell rc writes 2026-06-13 14:35:27 -07:00
briandevans
da28d5d113 fix(security): gate cp/mv/install into ~/.ssh, credential, and shell-rc files
tools/approval.py already denies tee/redirection writes to every
_SENSITIVE_WRITE_TARGET (~/.ssh/*, ~/.netrc/.pgpass/.npmrc/.pypirc, shell
rc files, ~/.hermes/config.yaml/.env) via the DANGEROUS_PATTERNS tee/`>`
rules, but cp/mv/install were only paired for _SYSTEM_CONFIG_PATH (/etc) and
the project-relative env/config target. So `cp evil ~/.ssh/authorized_keys`
(SSH-key implant / persistence), `cp creds ~/.netrc`, and `cp evil ~/.bashrc`
(login-time command injection) auto-approved while the equivalent tee/`>`
forms were denied — an unpaired write deny is theater (same rationale as
#14639 / commit 4e9d886d, which paired the terminal side for
~/.hermes/config.yaml writes but did not touch these cp/mv/install verbs on
the broader sensitive set).

Add one (cp|mv|install) DANGEROUS_PATTERNS entry reusing the existing
_SENSITIVE_WRITE_TARGET fragment, anchored via _COMMAND_TAIL so it fires on
the destination (last arg) only: reading OUT of a sensitive path
(`cp ~/.ssh/config /tmp/x`) stays auto-approved. Description differs from the
system-config cp entry so the two keep distinct approval keys (no silent
cross-approval). Additive — does not subsume the /etc or project-config rules.

Adds TestSensitiveCopyMovePattern: 5 positive cases (ssh authorized_keys,
ssh private key via mv, netrc via install, bashrc, ~/.hermes/config.yaml) +
2 negative guards (copy FROM ssh, unrelated copy). The ssh/netrc/bashrc
positives fail on main and pass on this branch; the negatives stay green
both ways.
2026-06-13 14:35:27 -07:00
Teknium
1fa761f8de fix(search): keep partial results on search timeout (#36142)
Treat search command budget timeouts as soft truncation so partial results survive, while real search failures still return structured errors.
2026-06-13 14:35:21 -07:00
Teknium
069bfd6545 fix(agent): keep Codex reasoning replay on Codex path 2026-06-13 14:35:00 -07:00
briandevans
1d584a301e fix(agent): treat Codex reasoning items as thinking-only 2026-06-13 14:35:00 -07:00
ITheEqualizer
57c2a55be4 fix(telegram): harden rich message fallback handling
Carry forward focused follow-ups from PR #45741: treat PTB's raw Bot API 10.1 response shapes safely, recognize real missing-endpoint errors, preserve link preview settings on rich sends, and lock the rich limit to Telegram's character-based cap.
2026-06-13 14:34:53 -07:00
brooklyn!
0a865e5948 fix(desktop): bypass Chromium editing pipeline for large paste & select-delete (#45812)
Large paste and Ctrl+A → Delete froze the composer for seconds — both routed
through Chromium's contenteditable editing pipeline (~O(n²) on multiline DOM).

- insertPlainTextAtCaret: Range + text/<br> fragment (paste path)
- deleteSelectionInEditor: range.deleteContents for non-collapsed Backspace/Delete
- Shared composerSelectionRange helper; both flush via flushEditorToDraft

Profiled live (47 KB / 122 paragraphs): paste 4474 ms → 13 ms; select-delete
1304 ms → 4 ms. Collapsed-caret deletes still native.
2026-06-13 20:49:58 +00:00
Teknium
c8e5f34f24 fix(gemini): strip native self prefixes before generateContent (#36141)
Strip `google/` and `gemini/` self-prefixes before native Gemini generateContent calls, and keep provider-normalization expectations aligned.
2026-06-13 13:47:08 -07:00
briandevans
7d11fa4e9e fix(codex-responses): let final_answer complete top-level incomplete responses 2026-06-13 13:45:29 -07:00
ITheEqualizer
7c0605bf22 fix(telegram): preserve rich formatting on stream final 2026-06-13 13:44:45 -07:00
achaljhawar
819def44c7 fix(agent): scope Nous tags to Nous auxiliary calls 2026-06-13 13:24:40 -07:00
Teknium
08890d77e6 fix(plugins): normalize browser-pasted GitHub repo URLs (#33539)
Accept common GitHub web URLs in `hermes plugins install` by normalizing repository views back to cloneable `.git` URLs, with focused parser coverage.
2026-06-13 13:23:59 -07:00
brooklyn!
425e777f54 fix(desktop): polish slash command completion (space/tab/click + typed args) (#45760)
* fix(desktop): accept slash command on space at command stage

Pressing space on a no-arg slash command (e.g. /hermes-agent) fell
through to the arg-completion stage and dead-ended on "No matches"
instead of inserting the directive. Space now mirrors Tab/Enter while
the command name is still being typed: no-arg commands commit the chip,
arg-taking commands expand to their options step.

* fix(desktop): suppress arg popover for no-arg slash commands

Committing a no-arg command (`/hermes-agent `) re-detected the chip+space
as an arg query and re-opened the popover on "No matches". The arg-stage
menu now only opens when the command actually takes args.

* fix(desktop): polish slash arg completion (space/tab/click + typed args)

Unify Enter/Tab/Space accept of the highlighted item at both the command
and arg stages: no-arg commands commit a chip, arg commands expand to
options, and an arg option commits the full `/cmd arg` chip. A fully-typed
arg (which the backend completer drops from suggestions) now commits on
Space/Tab via the verbatim text instead of dead-ending, and the "No
matches" empty state is suppressed past a command's name. Space stays
slash-only so @ mentions keep a literal space.
2026-06-13 18:43:52 +00:00
kshitij
7be22e37e1 Merge pull request #45753 from kshitijk4poor/salvage/gateway-auto-resume-duplicate-agent
fix(gateway): claim session slot before auto-resume task to prevent duplicate agents (#45456)
2026-06-13 23:46:17 +05:30
kshitijk4poor
28902dc890 chore: map liuhao1024 contributor email for attribution 2026-06-13 23:39:49 +05:30
kshitijk4poor
63097ee0d7 test(gateway): cover auto-resume full-path no-regression; clarify guard docstring
The salvaged fix's two regression tests mock adapter.handle_message, so
they only assert the pre-claimed sentinel is set/cleaned around a stub —
they never drive the real dispatch chain. Add a full-path test that
exercises _schedule_resume_pending_sessions -> _guarded_handle_message ->
adapter.handle_message -> _process_message_background -> _handle_message
and asserts the resumed session's agent runs EXACTLY ONCE: not zero (the
pre-claim must not self-bounce the resume into a queued no-op) and not
twice (the duplicate-agent bug #45456 the fix targets). Also assert no
leaked sentinel and no orphaned pending event after the drain settles.

Tighten the _guarded_handle_message docstring: on current main the real
sentinel is taken over inside _handle_message (not _process_message_background),
and note the `is _AGENT_PENDING_SENTINEL` guard only releases the slot we
ourselves placed, never one a live run owns.
2026-06-13 23:39:35 +05:30
liuhao1024
6e2fd955ca fix(gateway): claim session slot before auto-resume task to prevent duplicate agents
When the gateway restarts and auto-resumes an interrupted session, an
inbound message arriving in the window between `asyncio.create_task()`
and the task's first await could spin up a second AIAgent for the same
session.  Both agents would then process messages concurrently,
producing interleaved duplicate responses (#45456).

Fix: set `_AGENT_PENDING_SENTINEL` in `_running_agents` immediately
after the "already running" check, before creating the task.  This
closes the race window — any inbound message sees the slot as occupied
and queues behind the auto-resume.

A `_guarded_handle_message` wrapper ensures the pre-claimed sentinel is
always released, even if `handle_message` raises before reaching
`_process_message_background` (whose `finally` block handles normal
cleanup).

(cherry picked from commit 85150c976b)
2026-06-13 23:36:51 +05:30
helix4u
78c11d99e3 fix(update): stop Windows gateways before mutating install 2026-06-13 10:46:08 -07:00
ashishpatel26
957a8ffa88 fix(bedrock): omit sampling params for restricted Claude models
Bedrock Converse rejects non-default sampling parameters for Opus 4.7 and 4.8 with a ValidationException. Reuse the Anthropic-native sampling-param guard in the Bedrock kwargs builder so those models omit temperature/topP while older Claude and non-Claude models keep existing behavior.

Includes the stop-sequence regression from the parallel fix to ensure stopSequences still pass through for restricted Opus models.

Co-authored-by: Tranquil-Flow <tranquil_flow@protonmail.com>
2026-06-13 10:45:56 -07:00
alt-glitch
8e853e3ff8 opentui(v6): fix /bg — it launches a background PROMPT, not the process panel
I conflated two "background" concepts. In hermes, /bg (aliases /background, /btw)
launches a background PROMPT (prompt.background → background.complete), and the
`bg: N` badge counts in-flight prompt tasks — but P3 hijacked /bg + bg: for the
OS-process registry and never handled background.complete (so completions were
silent). Corrected:

- /bg <prompt> now launches a background prompt (Ink parity): prompt.background,
  echoes "bg <id> started", tracks the task in store.bgTasks.
- background.complete → drops the task + renders a distinct inline completion
  CARD with the result (the missing completion notification; a completion-ish
  kind also fires the OSC desktop ping).
- `bg: N` badge counts store.bgTasks (in-flight background prompts), not OS procs.
- the OS-process panel moved off /bg to /processes (+ /procs); its header count
  uses runningCount. Dropped the ambient agents.list poll — the badge is now
  event-driven and the panel fetches on open.

Gate green; new store test for background.complete (card + badge decrement).
2026-06-13 22:44:02 +05:30
alt-glitch
76eab10b14 opentui(v6): simplify the background-activity work (dedup + dead-code removal)
/simplify pass over this session's diff (4 cleanup agents → applied the high-value
findings):
- reuse: extract the duplicated truncRight/truncLeft (statusBar/agentsDashboard/
  backgroundPanel) to logic/truncate.ts; export DONE_STATUSES/procIsRunning from
  backgroundActivity and drop backgroundPanel's re-declared copy.
- simplification: remove dead/speculative exports (upsertNotification,
  clearNotificationsByKey, BackgroundRun) — the campaign models notifications as
  Message rows, not a notifications array — and drop runningCount's always-empty
  `runs` param; collapse notificationDispatcher's always-true `card` field into a
  plain notificationOsc(): TermNotification | undefined.
- efficiency: the bg-process poll now idles at 30s (was a flat 8s) and tightens to
  8s only when something is running — most sessions have zero background processes.
- altitude: the ` agents` chip joins the statusSegments width-ladder (was an inline
  width gate) so all bar segments share one drop policy; refreshed the stale
  statusBar header (bg: is wired now, not "reserved").

Net −95 LOC. Gate green. Skipped (noted): statusColor merge (divergent domains),
the pushNotification double-clone (necessary — guards the Solid aliasing footgun),
moving the notification-card dispatch out of messageLine, and the Message-row model
(deliberate "inline in transcript" design).
2026-06-13 22:30:27 +05:30
alt-glitch
5c5a1fec4b opentui(v6): background-activity P4 — fold the agents tray line into a status-bar chip
Input-zone density (rpiw): the agents tray kept a persistent collapsed line under
the composer (` N agents running — ↓ to inspect`), stacking with the status bar +
composer. The running count now lives in a status-bar ` N` chip (next to bg:/mcp:),
and the tray renders NOTHING when collapsed — one fewer persistent line under the
transcript. The tray stays mounted + focusable, so composer-Down still hands focus
over and expands it into the rows (focus-routing tests unchanged + green).

Gate green; agentsTray tests repointed from the removed tray line to the chip; the
Down/Esc/printable focus-routing coverage is intact.
2026-06-13 22:18:08 +05:30
alt-glitch
7016fa4902 opentui(v6): background-activity P3 — /bg process panel + ambient bg badge (no core)
The OS process registry (the qxpe "Claude Code background process" gap) had NO
surface in the TUI. Now:
- /bg (aliases /background, /jobs) opens a Background Processes panel listing the
  registry from agents.list — per-process command + uptime + status, running
  count, and a single STOP-ALL action (x → process.stop; the gateway exposes
  kill_all only, so there's no per-row kill — noted in the panel).
- the reserved status-bar `bg: N` badge (A) now shows the running-process count,
  fed by a slow (8s) scoped poll of agents.list so it stays live with the panel
  closed; hidden at zero.

Background *runs* are intentionally NOT duplicated here — they're already the
resume picker's active-sessions tab; this panel targets the process registry,
the actual gap. All TUI-only (agents.list + process.stop already exist). Gate
green; new backgroundPanel.test (parse + list + running-count + empty state).
2026-06-13 22:09:11 +05:30
alt-glitch
74cb03423e opentui(v6): background-activity P2 — de-crowd the agents dashboard + typed trace
The agents dashboard (rplj pain) dumped each subagent's full multi-line prompt
into the master list, wrapping it into a wall of text and squeezing the trace.
Now each master row is ONE line: status + a width-budgeted truncated goal + model.
The detail pane still shows the full goal (the inspect half) and renders the
activity as a TYPED transcript instead of flat lines: SubagentInfo.trace is now
TraceEntry[] ({kind:'start'|'tool'|'progress'|'summary'}), drawn with per-kind
glyph+color (▶ start /  tool accent / · progress muted / ✓ summary green).

No foregrounding (kept subagent UX unchanged — inspection only, per the brainstorm).
TUI-only. Gate green; new agentsDashboard.test.tsx asserts one-line truncation +
typed-trace render; store.test trace assertion updated to the typed shape.
2026-06-13 21:56:02 +05:30
alt-glitch
5988e21ed7 opentui(v6): background-activity P1 — inline notification cards + OSC (no core change)
The TUI now consumes notification.show/clear gateway events (it dropped them
before — they leaked into the transcript as plain model-output-looking lines,
the qxpe pain). They render as a distinct, level-tinted inline card (role
'notification'): gold ◆ for info, amber for warn, red for error — clearly chrome,
not the agent. Important ones (error/warn/'complete'|'done'|'finish' kinds) also
fire the EXISTING focus-gated OSC desktop notification via termChrome.

Shared substrate (pure, unit-tested) for the rest of the campaign:
- logic/backgroundActivity.ts — parse notification.show/agents.list payloads,
  dedupe-by-id upsert, clear-by-key, runningCount (badge, used in P3).
- logic/notificationDispatcher.ts — card-always + OSC-when-important decision.
- store: notification.show → pushNotification (distinct clones to avoid Solid
  createStore reference-aliasing), notification.clear → drop matching cards,
  lastNotification → OSC seam (terminalChrome).

All TUI-layer; builds only on events/RPCs the gateway already emits. Gate green
(737 tests incl. new unit + frame coverage); verified live in tmux.
2026-06-13 21:44:27 +05:30
alt-glitch
965226fd52 docs: spec for OpenTUI background-activity (agents inspection + background panel + notifications)
Brainstormed design (glitch 2026-06-13). TUI-only, no core gateway/agent changes —
builds on existing events (notification.show/clear, background.complete, subagent.*,
agents.list, process.stop, session.interrupt). Approach 1: shared substrate +
two surfaces + multi-channel notifications (inline card + ambient badge + OSC) +
input-density pass. Phased P1–P4 with per-phase gates.
2026-06-13 21:27:06 +05:30
alt-glitch
353a8c1c8f opentui(v6): composer/transcript UX polish from dogfood feedback (glitch 2026-06-13)
Three Tier-1 fixes from live use:

- bare `/` hydrates the full command menu again (reverses F1's "name char
  first" gate). The lead-token grammar still rejects `/abs/path` (F2) and a
  `/ ` trailing-space is still not arg-completion on an empty name.
- `!cmd` shell mode now reads unmistakably: the composer glyph flips ❯ → `$`
  in the alert (warn) color and an amber "shell mode — Enter runs this in your
  shell (no model turn)" note rides the slot the slash/path dropdown would use
  (they never coexist). New optional `brand.shellPrompt` ($), skin-overridable.
- the transcript scrollbox reserves a 1-cell right gutter (contentOptions
  paddingRight) so the vertical scrollbar no longer paints OVER hard-width
  content — markdown table / code-block right borders were clipped under it.

Gate green (714 tests); F1/F2/F7/F8 slash specs + the slashMenu frame test
updated to the new bare-/ behavior. Verified live in tmux.
2026-06-13 20:43:16 +05:30
Teknium
cc14b74718 docs(profile): update clone-from references 2026-06-13 07:33:58 -07:00
Teknium
9b5f7b63c6 fix(profile): make clone-from a full source selector 2026-06-13 07:33:58 -07:00
Teknium
d146b85173 chore(release): map WompaJango author 2026-06-13 07:33:58 -07:00
WompaJango
28bf8fb47d feat(dashboard): clone profiles from any source 2026-06-13 07:33:58 -07:00
Que0x
3380563d94 fix(security): stop /api/status leaking host paths and PID on gated binds
The dashboard's public /api/status liveness endpoint is in PUBLIC_API_PATHS
and bypasses dashboard auth, yet it returned absolute hermes_home,
config_path, env_path, the gateway PID, and the internal gateway health URL.
That exceeds the shape its own allowlist documents as public ("version,
gateway state, active session count, and the dashboard auth-gate shape. No
bodies, no session content, no secrets"), leaking deployment recon to any
unauthenticated caller on a network-exposed (gated) bind.

Withhold host-local detail unless the bind is loopback / --insecure, where
the dashboard is local-only and the caller is already inside the trust
envelope -- the same split should_require_auth draws. The NAS liveness probe
and the auth-gate badge are unaffected.

Adds invariant tests for both modes (gated withholds, loopback keeps).
2026-06-13 07:18:59 -07:00
Teknium
ad7436a5d9 fix(gateway): preserve WeCom per-group sender allowlists
Keep the own-policy fail-closed hardening from PR #45444, but still trust WeCom groups.<id>.allow_from because the adapter already checked that sender allowlist before dispatching to gateway auth.
2026-06-13 07:18:54 -07:00
Que0x
fc46354580 fix(security): fail closed when an own-policy gateway adapter has no allowlist
Own-policy adapters (WhatsApp, WeCom, Weixin, QQBot, Yuanbao) default dm_policy/group_policy to "open", which forwards every sender. The gateway's adapter-trust shortcut in _is_user_authorized blanket-trusted those platforms when no env allowlist was set, so an operator who enabled one with only credentials authorized the entire external network -- the fail-open SECURITY.md section 2.6 forbids ("an allowlist is required for every enabled network-exposed adapter").

Trust the adapter only when its effective policy for the chat type is an actual "allowlist" restriction (the case #34515 was protecting). "open"/"pairing"/anything else falls through to default-deny, where {PLATFORM}_ALLOW_ALL_USERS / GATEWAY_ALLOW_ALL_USERS and the pairing flow remain the explicit opt-ins.
2026-06-13 07:18:54 -07:00
Teknium
1185dfd773 test: cover legacy Office document extensions 2026-06-13 07:18:37 -07:00
Clayton Chew
f82cb48120 fix(platform): add .xls, .doc, .ppt to SUPPORTED_DOCUMENT_TYPES
Old Office formats (.xls, .doc, .ppt) were missing from the
SUPPORTED_DOCUMENT_TYPES dict in gateway/platforms/base.py while their
newer counterparts (.xlsx, .docx, .pptx) were included.

Sending an .xls file via Telegram triggers 'Unsupported document type'
and the file is silently dropped instead of being cached and forwarded
to the agent.

Add the three legacy MIME types so these files are handled the same way
as their modern equivalents.
2026-06-13 07:18:37 -07:00
Tranquil-Flow
4fd9397ae3 fix(codex): drop extra_headers for chatgpt.com backend 2026-06-13 07:13:24 -07:00
Sarvesh
45f9099e51 fix(matrix): preserve markdown table structure 2026-06-13 06:57:08 -07:00
alt-glitch
5747d9a2d8 docs: mark opentui composer-ux batch SHIPPED (F1–F10, decisions D1/D2) 2026-06-13 19:17:41 +05:30
alt-glitch
e01b04de46 fix(tui): chrome cost from Nous portal headers only (F3)
The status-bar cost segment must show cost ONLY when running against the Nous
portal — per-model cache/input/output pricing is unreliable across the model
long tail, so a guessed figure is worse than none.

- New nous_header_cost_usd(agent): the chrome cost source, derived ONLY from the
  x-nous-credits-* header delta (deliberately ignores the OpenRouter usage.cost
  accumulator). _get_usage now uses it for cost_usd, so a non-Nous session
  reports no cost and the TUI hides the segment.
- The /usage accounting page is unchanged in spirit: it now reads
  real_session_cost_usd(agent) directly (both provider-reported sources) instead
  of the chrome-narrowed _get_usage cost_usd, so OpenRouter cost still shows there.

Tests: new TestNousHeaderCost (header-only, OR-accumulator ignored, clamp,
no-method); updated gateway _get_usage tests for the chrome narrowing; /usage
page test still asserts the full provider-reported figure. 316 gateway + 25 cost
tests green.
2026-06-13 19:17:41 +05:30
alt-glitch
ef9232a2f7 opentui(v6): paste-while-unfocused + clarify prompt rewrite (F4/F5/F6)
- F4: a paste while the composer is unfocused (transcript scrollbox grabbed
  focus) now lands — a renderer-level paste listener focuses the textarea and
  applies the bytes; the focused path stays the textarea's own onPaste (no
  double insert). Paste logic shared via applyPaste(text, native).
- F5: clarify prompt rewritten off the native <select> onto a custom list —
  long options WRAP instead of clipping, options are numbered, the selected
  row gets a real background + accent (three signals), and the custom answer
  is an always-present inline <input> in the same screen.
- F6: Up/Down/Enter are preventDefault'd so arrows drive selection and never
  leak to the transcript scrollbox.

Verified live via tmux screenshot (wrapping + numbering + highlight + inline
input all correct). 714 tests green; new clarifyPrompt.test.tsx covers wrap,
numbering, selection, inline custom input, no-choices, Esc cancel.
2026-06-13 19:17:41 +05:30
alt-glitch
5268027e6b opentui(v6): composer UX batch — slash trigger, @-mentions, !bash, right-pinned cwd
- F1: slash menu opens only after a name char (bare / no longer fires)
- F2: /abs/path is no longer mistaken for a slash command (lead token must match NAME_RE)
- F7/F8: completion survives newlines — computed at the cursor token, not whole-buffer bail
- F8b: @ is the only file/dir mention trigger (~ / ./ / bare paths dropped)
- F9: !cmd runs a shell command via gateway shell.exec (Ink parity), output as a system line
- F10: cwd is right-pinned on the chrome bar so dirname+branch hug the right edge

planCompletion is now cursor-aware (onType threads ta.cursorOffset). classifySubmit
extracted as a pure, tested router. 708 tests green.
2026-06-13 19:17:41 +05:30
Teknium
4373e802a1 fix(docs): reuse healthy skills index during Pages deploys (#45616) 2026-06-13 06:46:07 -07:00
alt-glitch
b6598017c8 docs(handoff): concrete tmux-pane-screenshot usage + note skills are TUI-reachable 2026-06-13 19:11:06 +05:30
Teknium
d206e1f51d fix(dashboard): keep local file browser on home 2026-06-13 06:39:38 -07:00
konsisumer
16fb573bae fix(gateway): clear bloated compression binding on compression-exhaustion auto-reset
After compression exhaustion the auto-reset created a fresh session but
discarded reset_session()'s return value and left the Telegram topic
binding pointing at the oversized compressed child. The next inbound
message in that topic healed the binding forward and switch_session'd the
freshly-reset lane back onto the bloated transcript, re-triggering
compression exhaustion in a loop with a new session id each time.

Capture the fresh entry and re-sync the topic binding to it so the next
message starts clean. No-op on non-topic lanes.

Regression of the #9893/#10063 auto-reset fix.

Fixes #35809
2026-06-13 06:38:29 -07:00
alt-glitch
5af3a81490 docs: OpenTUI dev handoff — base operating manual for continuing memory+UX on the canonical branch 2026-06-13 18:43:30 +05:30
Teknium
6f43ff5572 chore(release): map Gemini schema contributor 2026-06-13 06:12:52 -07:00
Henrik Bentel
eed61a1251 fix(gemini): add role field to systemInstruction 2026-06-13 06:12:52 -07:00
Teknium
74c5158b10 fix(model): show bare custom endpoints in gateway picker (#45597)
Surface direct model.provider=custom endpoints in /model picker output and keep explicit bare custom switches on the current endpoint instead of requiring a named providers/custom_providers row.
2026-06-13 06:05:30 -07:00
Teknium
6724daa2c2 fix: keep CLI idle timer ticking (#45592) 2026-06-13 05:55:04 -07:00
Teknium
aa53a78d67 fix(desktop): hand off Windows bootstrap recovery (#45594) 2026-06-13 05:54:32 -07:00
Teknium
0333a99925 fix: merge session-only model analytics rows (#45582) 2026-06-13 05:52:42 -07:00
Tranquil-Flow
5acd185f7c fix(moonshot): handle union type arrays in tool schemas 2026-06-13 05:51:41 -07:00
Teknium
39a35b784f chore(release): map custom provider resume contributors 2026-06-13 05:51:05 -07:00
Adalsteinn Helgason
2667601c05 fix(tui): keep reasoning-only assistant turns visible on session resume
A thinking-only assistant turn (reasoning present, empty visible text) is
persisted with its reasoning fields and stays recallable from the transcript,
but `_history_to_messages` dropped it as "empty" before its reasoning was
attached. On desktop/TUI resume or reload the turn therefore vanished from the
session view while the agent could still recall it from a fresh session --
exactly the "messages disappear when the LLM uses its thinking block, but a new
session can recall them" symptom reported on #44022.

Keep an assistant turn when it carries reasoning, even with empty text, so the
desktop "Thinking…" disclosure has something to render. Genuinely empty turns
(no text, no reasoning, no tool calls) are still filtered out.

Refs #44022

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-13 05:51:05 -07:00
Adalsteinn Helgason
643dc82793 Fix custom provider identity loss in session persistence
_runtime_model_config persisted the live agent's RESOLVED provider into
the session row's model_config JSON. For any named providers:/
custom_providers: entry, agent.provider is the literal string "custom",
so the entry name was lost (and the api_key is deliberately never
persisted). On session.resume or _reset_session_agent the stored
provider="custom" fed resolve_runtime_provider(requested="custom"),
which cannot match a named entry — the rebuild either raised "No LLM
provider configured" or silently resolved placeholder credentials
against the patched-back base_url.

Persist the REQUESTED/entry identity instead: a new reverse lookup
find_custom_provider_identity(base_url) maps the endpoint URL back to
the canonical custom:<name> menu key. _runtime_model_config stores that
key; _make_agent performs the same recovery for rows persisted before
the fix, falling back to passing the stored base_url as
explicit_base_url so the direct-alias branch still targets the
session's endpoint when no entry matches.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 05:51:05 -07:00
Haozhe Zhang
e256f4aae4 fix(gateway): don't restore a bare billing provider as the resumed session's provider
`_stored_session_runtime_overrides` restored the session provider from
`billing_provider` when `model_config` had no explicit provider. For a
`custom:<name>` endpoint that only ran normal turns (no `/model` switch), the
persisted `billing_provider` is the bare billing bucket `"custom"`, which
`agent_init` treats as non-routable, so `session.resume` failed with
"No LLM provider configured" even though new chats and CLI `--resume` work.

Only restore an explicit `model_config.provider`; skip a bare billing bucket
(`auto`/`openrouter`/`custom`) so resume falls back to the configured default,
matching the CLI path.

Fixes #44022
2026-06-13 05:51:05 -07:00
Teknium
cb125c2b3f fix(kanban): pin assigned profile toolsets for workers (#45590) 2026-06-13 05:50:09 -07:00
Teknium
a59d5e37e8 feat(telegram): make rich messages always on (#45584)
Remove the rich_messages config toggle entirely so Telegram replies always try the Bot API 10.1 rich-message path first, with the existing MarkdownV2 fallback/latch behavior for unsupported endpoints and per-message failures.

Restore the Telegram platform hint to encourage rich Markdown tables/task lists/math now that the rich path is the default, and remove the config/docs surface for the old toggle.
2026-06-13 05:45:11 -07:00
Teknium
4b646bc21e fix(auxiliary): preserve main provider base url (#45587) 2026-06-13 05:44:18 -07:00
Teknium
62b4618e9a fix(dashboard): scope sessions and analytics to selected profile (#45598) 2026-06-13 05:42:38 -07:00
H-Ali13381
2abcae9678 fix(cli): preserve renderer state on resize 2026-06-13 05:40:18 -07:00
xxxigm
c814d3d1dd test(installer): regression for unmerged-index update failure
Functional bash test drives install.sh's autostash block against a throwaway
repo with a real conflicted index and asserts the stash now succeeds and the
unmerged entries are cleared (previously `git stash` failed with "could not
write index"). Source-order assertions cover both scripts to ensure the
`git reset` clear runs before `git stash push` (a no-op otherwise).
2026-06-13 05:19:44 -07:00
xxxigm
573b964dc7 fix(installer): clear an unmerged git index before stashing on update
When an existing install at $INSTALL_DIR has an unmerged index (files in a
"needs merge" state left by a previously interrupted update), the update path
ran `git stash` then `git checkout <branch>`. On a conflicted index `git stash`
aborts with "could not write index" and `git checkout` then aborts with "you
need to resolve your current index first" — surfacing to desktop/bootstrap
users as `git checkout main failed (exit 1)` and failing the whole install at
the repository stage.

Mirror the `hermes update` Python path (#4735): detect unmerged entries with
`git ls-files --unmerged` and clear the conflict state with `git reset` before
stashing. Working-tree changes are still captured by the subsequent stash, so
nothing is discarded; only the index-level conflict markers are dropped, which
lets the checkout proceed.

Fixed in both installers (install.sh and install.ps1) so the Windows GUI
installer and the POSIX one share the same recovery behavior.
2026-06-13 05:19:44 -07:00
Teknium
aa0798352a fix(auth): self-heal missing Codex access tokens
Recover Codex singleton auth entries that have a refresh token but no access token by adopting a valid Codex CLI token pair, matching the cron-time failure mode before falling back to the credential pool.
2026-06-13 05:15:26 -07:00
Kennedy Umege
311ff967de review: validate refresh_token, path-agnostic recovery log, map author email
Addresses PR review feedback:
- Validate refresh_token (not only access_token) before persisting the
  re-imported Codex token, so a half-token payload can't silently break the
  next refresh cycle.
- Make the recovery log path-agnostic ("Codex CLI auth.json") since
  _import_codex_cli_tokens can read $CODEX_HOME, not only ~/.codex.
- Add regression test: relogin-required + imported token missing refresh_token
  -> re-raise and persist nothing.
- Map kenmege@yahoo.com -> Kenmege in scripts/release.py AUTHOR_MAP
  (fixes the check-attribution job).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 05:15:26 -07:00
Kennedy Umege
bd66e7e3fb fix(auth): self-heal Codex refresh_token rotation by reimporting from ~/.codex
Hermes keeps its own copy of the Codex OAuth token per profile and at the
top level, separate from the Codex CLI's ~/.codex/auth.json. OAuth
refresh_tokens are single-use, so when the Codex CLI (or another Hermes
process) rotates the shared token, the frozen copy's refresh_token goes
stale and refresh_codex_oauth_pure fails with a relogin-required error
(invalid_grant / refresh_token_reused / 401). Today that surfaces as a hard
401 on the turn — idle profiles and desktop sessions 401 "token_expired"
until a manual re-auth — even though ~/.codex/auth.json holds a fresh token.

_refresh_codex_auth_tokens now falls back to _import_codex_cli_tokens() (the
canonical Codex CLI store) when the stored refresh_token is rejected, adopts
and persists the fresh token, and lets the in-flight retry succeed. This
complements PR #6525 (force relogin on 401/403): we attempt automatic
recovery before surfacing a relogin prompt. Transient failures (e.g. 429
quota, relogin_required=False) are never self-healed — the stored token is
still valid there — so they re-raise unchanged, and the happy path is
untouched.

Adds tests/hermes_cli/test_auth_codex_self_heal.py covering: self-heal on
invalid_grant, no self-heal on 429 quota, re-raise when ~/.codex is absent,
and happy-path-unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 05:15:26 -07:00
Teknium
2681c5a12d fix(photon): correct gateway start command (#45566) 2026-06-13 05:14:59 -07:00
xxxigm
fa2aba90b4 docs(docker): explain per-profile gateway ports for multi-profile setups
The Multi-profile section never explained how to reach more than one
profile from outside the container, and distinguishes the two surfaces
that people conflate:

- Hermes Desktop's Remote Gateway connects to a `hermes dashboard`
  backend (port 9119), and a single dashboard serves every co-located
  profile via its profile switcher (the target profile is sent per
  request; the backend opens that profile's HERMES_HOME). No per-profile
  port or second connection is needed for Desktop.
- OpenAI-compatible API clients (Open WebUI, LobeChat, /v1) talk to each
  profile's API server, which binds 8642 for every profile with no
  auto-allocation. Reaching a second profile from such a client needs a
  distinct `API_SERVER_PORT` in that profile's own `.env` (and the port
  must NOT go in the container-wide `environment:` block, or every
  profile collides on it).

Adds the create -> set port -> restart flow, the bridge port-publishing
note, and clarifies the default profile's connection is untouched.
2026-06-13 05:13:25 -07:00
xxxigm
5b857201b7 fix(profiles): correct misleading per-profile gateway port docstrings
The s6 profile-gateway docstrings claimed the bind port comes from a
`[gateway] port` key in config.yaml ("the single source of truth"). No such
key exists or is read anywhere — the API server port is resolved by
gateway/config.py from `API_SERVER_PORT` (or `platforms.api_server.extra.port`)
and defaults to 8642. The wrong reference actively misled a Docker user into
setting a non-functional `gateway.port`.

Point both docstrings (`S6ServiceManager._render_run_script`,
`_maybe_register_gateway_service`) at the real knob, and note the practical
consequence: since each supervised profile gateway loads its own HERMES_HOME,
two profiles left at the default both try to bind 8642 — each needs a distinct
`API_SERVER_PORT` in its own `.env`.
2026-06-13 05:13:25 -07:00
Teknium
905ed413d1 fix(doctor): avoid unsafe npm audit fallback
Root-level npm audit fix can crash with isDescendantOf on the same monorepo tree, so workspace audit advisories should explain the lockfile-bump path instead of recommending another manual npm fix command.
2026-06-13 05:09:56 -07:00
xxxigm
bea6c1c01f test(doctor): assert audit-fix hint avoids crashing form and explains build-tool advisories 2026-06-13 05:09:56 -07:00
xxxigm
a5e9b17ce3 fix(doctor): stop recommending the npm-crashing audit fix, explain build-tool advisories
`hermes doctor` flagged the web/ui-tui workspaces and told the user to run
`npm audit fix --workspace <name>`, which crashes current npm with
"Cannot read properties of null (reading 'edgesOut')" (an arborist bug with
workspace-filtered audit fix). Recommend the root-level `npm audit fix`
instead.

Even the root form can hit a known npm arborist crash (edgesOut /
isDescendantOf) on this monorepo tree, so add a note that these workspace
advisories are build-time tooling (esbuild/vite, etc.) — not runtime code —
and clear via a lockfile bump rather than a manual fix. This keeps doctor
from handing users a command that errors out and from implying a broken
Hermes install.
2026-06-13 05:09:56 -07:00
xxxigm
5d6c16e972 test(desktop): cover the inline command expander on the approval bar
Asserts the full command is absent until the Command toggle is clicked, then
rendered in full — guarding the long-command reveal path.
2026-06-13 05:08:37 -07:00
xxxigm
266b5a19f1 feat(desktop): expand the full command inline from the approval bar
The native desktop approval bar deliberately omits the command because the
pending tool row "already shows it" — but that row only renders a single
truncated line, and a pending row can't be expanded (it has no result yet). So
the full command was only reachable by opening the "Always allow" dropdown,
reading the modal, cancelling, then clicking Run — 4-5 clicks just to see what
you're approving.

Add a "Command" toggle to the approval bar that reveals the full
`request.command` inline (reusing the dialog's pre styling), default collapsed.
Approving a long command is now "expand, Run". Gated on a non-empty command so
zero-command approvals are unaffected.
2026-06-13 05:08:37 -07:00
Black-Kylin
202e318cb1 fix(gateway): sync compression session splits before failures
Salvages PR #25747 by preserving gateway session rotation even when a post-compression model call fails before returning final content.

Co-authored-by: Hermes <127238744+teknium1@users.noreply.github.com>
2026-06-13 04:51:59 -07:00
helix4u
2d474e39c7 fix(acp): preserve memory provider tools 2026-06-13 04:51:44 -07:00
Teknium
2a5dc0ef3d fix(slack): make video attachments available to agents (#45512) 2026-06-13 03:33:27 -07:00
Teknium
197337cc47 fix(gateway): suppress duplicate final stream sends (#45517) 2026-06-13 03:23:44 -07:00
Teknium
8cf9d8689d fix(desktop): keep composer usable during reconnect (#45488)
* feat(cli): add --safe-mode troubleshooting flag

Inspired by Claude Code v2.1.169 (June 2026): run Hermes with all
customizations disabled to isolate setup problems from product bugs.

--safe-mode implies --ignore-user-config and --ignore-rules, and
additionally skips plugin discovery (hermes_cli/plugins.py) and MCP
server loading (tools/mcp_tool.py) via the internal HERMES_SAFE_MODE
env bridge.

* fix(desktop): keep composer usable during reconnect
2026-06-13 02:36:09 -07:00
brooklyn!
b62e57b2f4 Merge pull request #45445 from NousResearch/bb/desktop-stick-to-bottom
fix(desktop): stabilize thread scrolling and session switching
2026-06-13 04:14:49 -05:00
Teknium
bc060c7c1c fix(models): remove unavailable claude-fable-5 (#45492) 2026-06-13 02:03:50 -07:00
Teknium
3803e5fc28 fix(agent): don't treat custom:<name> pools as cross-provider mismatch (#45289)
Custom endpoints carry two naming conventions for the same provider: the
agent's provider attribute is the generic 'custom' label while the pool
is keyed 'custom:<normalized-name>'. The defensive guard in
recover_with_credential_pool compared them literally, logged
'Credential pool provider mismatch: pool=custom:<name>, agent=custom',
and skipped recovery — so 401 refresh and 429 rotation never ran for
ANY custom-provider user (seen in the field on a Fireworks setup whose
dead key burned full retry cycles every turn with the skip warning on
each one).

Accept the pair only when the agent's CURRENT base_url resolves to the
same pool key via get_custom_provider_pool_key, preserving the guard's
original purpose (#33088/#33163): a fallback provider or a different
custom endpoint still skips pool mutation.
2026-06-13 02:01:09 -07:00
brooklyn!
bdd3868b57 fix(desktop): keep profile color picker open from the context menu (#45489)
Right-click → Color flashed open then closed: on dismiss the context menu
refocuses its trigger, which doubles as the popover anchor, so the picker
read it as a focus-outside event and closed itself. Suppress the menu's
close auto-focus so the picker survives. Long-press already worked since
it bypasses the menu lifecycle.
2026-06-13 04:00:09 -05:00
xxxigm
b6c7ebf028 fix(tui): honor provider_routing config in the desktop/TUI backend (#44953)
* fix(tui): honor provider_routing config in the desktop/TUI backend

The messaging gateway and classic CLI both read `provider_routing` from
config.yaml and pass the OpenRouter routing prefs (only / ignore / order /
sort / require_parameters / data_collection) into the agent. The tui_gateway
backend that powers the desktop app and TUI never did, so it built agents
with every routing pref left at its default — OpenRouter then selected
providers freely (effectively at random), ignoring the user's config.

Load `provider_routing` in `_make_agent` and forward the same six prefs the
gateway does, restoring parity across CLI / gateway / desktop. Background
subagent kwargs already propagate these from the parent agent, so they now
inherit correctly too.

* test(tui): cover provider_routing forwarding in _make_agent

Asserts the six OpenRouter routing prefs flow from config.yaml into AIAgent,
and that an absent provider_routing section forwards None/False (unchanged
behavior for users who never configured routing).
2026-06-13 02:58:15 -05:00
alt-glitch
7b7ab279f2 fix(tui): opentui launches when fnm default is older than 26.3; chrome bar reads the real cwd
Two reasons the local TUI stopped running OpenTUI / showed the wrong directory:

1. Node resolution. OpenTUI needs Node >= 26.3 (node:ffi floor), but
   _node26_bin_or_none only checked HERMES_NODE + `which node`. When fnm's
   default flips to an older line (e.g. v25.9) the active node fails the gate
   and the engine silently falls back to Ink even though a usable v26.3 sits
   installed. _fnm_node26_candidates now discovers fnm's installed versions
   (FNM_DIR / XDG_DATA_HOME/fnm / ~/.local/share/fnm / macOS Library path),
   newest first, version-probed — so the engine launches without the user
   re-aliasing their global default.

2. Launch cwd. The launcher runs the engine with cwd=<engine package dir> so
   its build/resolution works; the gateway it spawns then auto-detected THAT
   dir as the workspace (chrome bar showed 'ui-opentui (feat/opentui-native-
   engine)' regardless of where you ran hermes). TERMINAL_CWD — the gateway's
   canonical launch-dir channel — was only exported in worktree mode; now it's
   set to the real cwd for every launch (worktree mode still overrides to the
   worktree path). The TUI's session.create no longer sends process.cwd() (the
   engine dir) — a new launchCwd() reads the launcher's HERMES_CWD/TERMINAL_CWD,
   falling back to process.cwd() only for standalone smokes.

Together: session cwd, chrome bar, terminal-tool cwd, and /sessions grouping
all anchor to where you actually ran hermes. Verified live — chrome bar shows
'/tmp/cwd-probe (my-feature)' launched from there with fnm default on v25.9.

8 new tests (fnm discovery order/precedence/empty-safety; launchCwd env
precedence).
2026-06-13 13:24:54 +05:30
Brooklyn Nicholson
b2bc48cd5e Merge branch 'main' into bb/desktop-stick-to-bottom
# Conflicts:
#	apps/desktop/src/components/assistant-ui/thread.tsx
2026-06-13 02:52:03 -05:00
brooklyn!
9cd3d8a6ac Merge pull request #45466 from NousResearch/bb/fix-image-generation-placement
fix(desktop): keep generated images in the tool slot, not inline
2026-06-13 02:50:59 -05:00
Brooklyn Nicholson
b82d2e549f fix(desktop): keep the diffusion placeholder circular at any aspect
Normalise the radial bloom by the shorter side so portrait/square
placeholders aren't squished into an ellipse.
2026-06-13 02:45:34 -05:00
Brooklyn Nicholson
b15dc58064 fix(desktop): keep generated images in the tool slot, not inline
The image-generate tool showed a placeholder, then the model echoed a
(often different) image inline in its prose — a second, jarring copy in
the wrong place, dimmed as tool scaffolding, with a misplaced download
button.

Now the generated image lives only in the tool slot:
- Strip every embedded image/media link from the assistant prose of a
  message that produced an image (the model frequently restates the
  remote URL while the result holds the local path), preserving the
  agent's words. Applied on hydration, live deltas, and completion.
- One stable frame sized from the aspect_ratio arg up front, so the
  diffusion placeholder and the decoded image share the same box and
  crossfade with no layout shift; the box derives its height from the
  true ratio on load (no letterboxing).
- Exempt generated images from the tool-block dim-until-hover rule.
- Extract a shared useImageDownload hook + ImageLightbox so the tool
  image and markdown images share one implementation.
2026-06-13 02:42:15 -05:00
Brooklyn Nicholson
acd4278c8a fix(nix): use fetchNpmDeps hash from flake check
prefetch-npm-deps returned a different digest than the actual
fetchNpmDeps build; use the CI-reported hash.
2026-06-13 02:34:25 -05:00
Brooklyn Nicholson
be6713c536 fix(nix): refresh npm deps hash 2026-06-13 02:16:13 -05:00
Brooklyn Nicholson
77687156b4 fix(desktop): tighten multiline user prompt spacing 2026-06-13 02:16:13 -05:00
Brooklyn Nicholson
45ceee8a32 Merge branch 'main' of github.com:NousResearch/hermes-agent into bb/desktop-stick-to-bottom 2026-06-13 02:08:10 -05:00
brooklyn!
0a7a81835b Merge pull request #45255 from NousResearch/bb/desktop-stuck-tool-rows
fix(desktop): dismiss settled tool rows (persistent, caret-safe)
2026-06-13 02:08:01 -05:00
Brooklyn Nicholson
76b93869d8 fix(desktop): rebuild thread autoscroll on use-stick-to-bottom 2026-06-13 01:57:30 -05:00
brooklyn!
a856276124 Merge pull request #45414 from NousResearch/bb/fix-desktop-queue-drain-strand
fix(desktop): stop stranding queued prompts across backend bounces
2026-06-13 00:39:13 -05:00
Gille
1e755ff556 fix(desktop): keep recents sorted unless manually reordered (#45404) 2026-06-13 00:38:10 -05:00
Brooklyn Nicholson
7f302c91b2 chore: uptick 2026-06-13 00:33:44 -05:00
Brooklyn Nicholson
18916376f1 fix(desktop): never surface "session busy" — retry every submit past it
"Session busy" (4009) is the gateway's concurrency guard, not a user-facing
error. The queue already covers the deliberate "type while busy" case, so
the only leak was a submit racing the settle edge. Generalize the rewind
path's busy-retry into a shared `withSessionBusyRetry` and wrap every
`prompt.submit` (fresh send, session-resume resubmit, and rewind) so a
transient busy is ridden out within a bounded deadline and the call lands
silently. The fromQueue swallow stays as a backstop for the pathological
>deadline case.
2026-06-13 00:26:34 -05:00
Brooklyn Nicholson
f23a4b7bb3 fix(desktop): keep queued drains quiet on transient "session busy"
A queued drain firing on the settle edge can race a not-yet-wound-down
turn and get a transient 4009 "session busy". Previously that appended a
red "session busy" error bubble (and toast) per attempt. For fromQueue
submits, swallow the busy error: release busy, keep the entry queued, and
let the composer's bounded auto-drain retry on the next idle.
2026-06-13 00:23:51 -05:00
Brooklyn Nicholson
bf090deed3 fix(desktop): stop stranding queued prompts across backend bounces
A prompt typed mid-turn ("ghost bubble") could stick forever and never
send when the backend restarted/reconnected during the turn. Two fragile
assumptions in the composer queue drain caused it:

1. Drain fired ONLY on an observed busy true→false edge. A remount/
   reconnect resets `previousBusyRef` to the current busy value, so the
   settle edge is swallowed and the queue never drains. Replace
   `shouldAutoDrainOnSettle` with the edge-independent `shouldAutoDrain`
   (idle + non-empty), driven on the settle edge, on mount/reconnect, and
   after a re-key. The drain lock still serializes sends.

2. The queue is keyed by `queueSessionKey || sessionId`. When a backend
   resume mints a new runtime session id for the same conversation, the
   entry strands under the dead key. Pass the *stable* stored id as
   `queueSessionKey` so the composer can tell runtime churn from a real
   session switch, and `migrateQueuedPrompts` re-keys pending entries on a
   runtime-id change only (never on a deliberate switch).

Also make the drain resilient to a thrown/rejected onSubmit (e.g. a stale-
session 404): the entry stays queued and is retried on the next idle, with
a per-entry attempt cap (MAX_AUTO_DRAIN_ATTEMPTS) to avoid spin-loops and a
quiet toast once it gives up. A manual send clears the backoff.

Tests: composer-queue covers edge-free drain + re-key migration;
use-prompt-actions covers rejected-drain-keeps-entry + idle retry sends.
2026-06-13 00:20:51 -05:00
brooklyn!
7d183f6497 fix(desktop): theme the image-gen placeholder instead of a white square (#45354)
The diffusion placeholder read `--dt-*` tokens via
`getComputedStyle().getPropertyValue()`, but those resolve through `var()`
chains into `color-mix(in srgb, …)` — returned verbatim and unparseable, so
every token fell to a hardcoded light fallback (white card). In dark mode the
placeholder rendered as a white square.

Resolve each token through a throwaway probe element's `color` so the browser
computes it to a concrete color, and teach `parseColor` Chromium's
`color(srgb r g b / a)` serialization. Re-resolve on theme repaint via a
MutationObserver rather than per animation frame.
2026-06-12 21:45:24 -05:00
brooklyn!
492c402774 perf(desktop): cut GUI streaming & interaction lag (#45343)
* perf(desktop): isolate streaming re-renders & cut layout thrash

During a token stream $messages is replaced ~30x/s. Subscribing the whole
chat view to it re-rendered the composer, runtime boundary, and every
message on every delta.

- Derive coarse facts (empty thread? tail is user?) via nanostores
  `computed` atoms so per-token flushes don't re-render their consumers.
- Move the $messages subscription + runtime wiring into a dedicated
  ChatRuntimeBoundary; the composer reads $messages imperatively.
- Drive message rows off stable useAuiState selectors and a lazy
  getMessageText getter instead of eagerly materialized text.
- Feed ResizeObserver entry sizes into measureClamp / FadeText and dedupe
  the style writes, killing the read-write-read reflow cascade.

* perf(desktop): incremental markdown rendering during streams

Re-parsing the full message markdown every reveal frame is O(N^2) over a
long answer and dominated stream CPU.

- Throttle useSmoothReveal commits to ~1 frame (REVEAL_MIN_COMMIT_MS).
- Memoize block parsing with an LRU keyed on source text so only changed
  blocks re-parse.
- Replace Streamdown's full-text parseIncompleteMarkdown with a
  tail-bounded remend: scan to the last top-level boundary outside
  fences/math and repair only the trailing open block. New remend-tail.ts
  is proven render-equivalent to full remend at every streaming prefix
  (remend-tail.test.ts), minus an intentional, documented divergence on
  cross-block dangling openers.

* perf(desktop): faster session resume & warm AudioContext at idle

- Resume: fire the REST transcript prefetch and the session.resume RPC in
  parallel, and skip the redundant message conversion + reconciliation
  when the prefetch already hydrated the transcript.
- Haptics: web-haptics builds its AudioContext lazily on first trigger,
  paying the ~850ms CoreAudio spin-up on the first streamStart haptic as
  the first token paints. Open/close a throwaway context at idle so the
  real one connects to an already-warm audio service.
2026-06-12 21:22:39 -05:00
Brooklyn Nicholson
d62e9b7592 build(nix): refresh npmDepsHash for the remend dependency
Adding remend changed package-lock.json, so the flake's pinned npm deps
hash went stale and `nix flake check` failed. Bump it to match.
2026-06-12 21:17:22 -05:00
Brooklyn Nicholson
3cf7d43262 perf(desktop): faster session resume & warm AudioContext at idle
- Resume: fire the REST transcript prefetch and the session.resume RPC in
  parallel, and skip the redundant message conversion + reconciliation
  when the prefetch already hydrated the transcript.
- Haptics: web-haptics builds its AudioContext lazily on first trigger,
  paying the ~850ms CoreAudio spin-up on the first streamStart haptic as
  the first token paints. Open/close a throwaway context at idle so the
  real one connects to an already-warm audio service.
2026-06-12 21:07:40 -05:00
Brooklyn Nicholson
edc36f3a45 perf(desktop): incremental markdown rendering during streams
Re-parsing the full message markdown every reveal frame is O(N^2) over a
long answer and dominated stream CPU.

- Throttle useSmoothReveal commits to ~1 frame (REVEAL_MIN_COMMIT_MS).
- Memoize block parsing with an LRU keyed on source text so only changed
  blocks re-parse.
- Replace Streamdown's full-text parseIncompleteMarkdown with a
  tail-bounded remend: scan to the last top-level boundary outside
  fences/math and repair only the trailing open block. New remend-tail.ts
  is proven render-equivalent to full remend at every streaming prefix
  (remend-tail.test.ts), minus an intentional, documented divergence on
  cross-block dangling openers.
2026-06-12 21:07:36 -05:00
Brooklyn Nicholson
7c226cc57f perf(desktop): isolate streaming re-renders & cut layout thrash
During a token stream $messages is replaced ~30x/s. Subscribing the whole
chat view to it re-rendered the composer, runtime boundary, and every
message on every delta.

- Derive coarse facts (empty thread? tail is user?) via nanostores
  `computed` atoms so per-token flushes don't re-render their consumers.
- Move the $messages subscription + runtime wiring into a dedicated
  ChatRuntimeBoundary; the composer reads $messages imperatively.
- Drive message rows off stable useAuiState selectors and a lazy
  getMessageText getter instead of eagerly materialized text.
- Feed ResizeObserver entry sizes into measureClamp / FadeText and dedupe
  the style writes, killing the read-write-read reflow cascade.
2026-06-12 21:07:33 -05:00
brooklyn!
a86b7b314b Merge pull request #45273 from NousResearch/bb/sidebar-workspace-dedup
feat(desktop): worktree-aware sidebar grouping + composer/sidebar UX fixes
2026-06-12 20:03:43 -05:00
Brooklyn Nicholson
d14f6c9563 fix(desktop): stop streaming autoscroll bounce; move attachments below user bubble
Streaming auto-follow chased content growth while parked at the bottom,
which rubber-banded — the tail pin and the virtualizer's own measurement
adjustments fought for scrollTop. Drop it; the one-time new-turn jump
already lands a fresh message in view and the viewport stays put after.

Attachments rendered inside the editable user bubble and were collapsed
via an IntersectionObserver + [data-stuck] CSS hack while the bubble was
pinned. Render them as a flow sibling BELOW the sticky bubble instead, so
they scroll away behind it naturally — no observer, no collapse. Image
refs still render as thumbnails, file refs as chips; no border. Removes
the now-unused useStuckToTop hook and its CSS.
2026-06-12 19:58:25 -05:00
Brooklyn Nicholson
a1c6349c1f Merge branch 'main' of github.com:NousResearch/hermes-agent into bb/sidebar-workspace-dedup 2026-06-12 19:40:24 -05:00
Brooklyn Nicholson
78ce91750e fix(desktop): crisp terminal text via opaque xterm canvas
The terminal looked soft/heavy on every platform because the xterm
Terminal was built with allowTransparency: true, which drops the WebGL
renderer's opaque fast-path and bakes glyphs as grayscale-alpha coverage
for compositing over a see-through canvas. Our surface (--ui-bg-chrome)
is opaque and withSurface already paints it, so transparency was pure
blur for no benefit — VS Code keeps it off too. Also drop the Medium
(500) base weight for normal/bold (400/700) to match VS Code's metrics,
and remove the now-unused JetBrains Mono Medium face + woff2.
2026-06-12 19:36:30 -05:00
Brooklyn Nicholson
1a3cd3d436 refactor(desktop): collapse sidebar drag-reorder into one generic ReorderableList
Every reorderable surface (repos, worktrees, sessions, pins) now drops in a
single ReorderableList that owns its own DndContext, so a drag only ever
collides with that list's own items — nesting "just works" without leaking
into the lists around or inside it. This replaces the shared DndContext +
id-prefix dispatch (parent:/group:) whose closestCenter collisions resolved
to a different-typed droppable and silently no-op'd worktree/repo drags.

- Delete groupDndId/parentDndId/parse* helpers and the monolithic
  handleAgentDragEnd/handlePinnedDragEnd; each list persists its new id order
  via a direct typed write (reorderParents/reorderWorktree/reorderSessions/
  reorderPinned).
- Sessions inside repos/worktrees are date-ordered and static (no drag),
  matching the "never reorder on new messages" rule.
- Add setPinnedSessionOrder; drop now-unused reorderPinnedSession.
2026-06-12 18:59:54 -05:00
Teknium
9688c1a94f chore: add Kimi K2.7 code catalog slug (#45283) 2026-06-12 16:55:40 -07:00
Teknium
7e46533d9f test: compressed-summary metadata flag set in-process, stripped on wire 2026-06-12 16:47:15 -07:00
kyssta-exe
956af7f3c3 fix(agent): add metadata flag to context compression summary messages (#38389)
Summary messages (standalone insertion and merge-into-tail) now carry a
metadata flag so frontends (CLI, Desktop, gateway, TUI) can distinguish
them from real assistant/user messages without content-prefix heuristics.

Re-applied from PR #38434 onto current main (conflicted with the
_SUMMARY_END_MARKER hoist). Key renamed from the PR's
'is_compressed_summary' to '_compressed_summary': the wire sanitizers
strip underscore-prefixed message keys, so the flag stays in-process and
can never reach strict gateways (Fireworks/Mistral/Kimi reject unknown
keys with 'Extra inputs are not permitted').
2026-06-12 16:47:15 -07:00
helix4u
1899c8f507 fix(skills): run youtube transcript helper through uv 2026-06-12 16:33:46 -07:00
Brooklyn Nicholson
dd12a5403d refactor(desktop): extract shared WorkspaceHeader for repo + worktree rows
The repo and worktree header rows were ~identical after the handle move.
Fold them into one WorkspaceHeader (emphasis flag for the repo level) plus
a small WorkspaceAddButton, so the toggle/handle/count/+ wiring lives in
one place.
2026-06-12 18:30:49 -05:00
Teknium
8905ee6b8a fix(agent): rewind flush cursor exactly when repair compacts before the cursor
Follow-up to the #44837 clamp: a min() clamp only fixes cursor overshoot
past the new end of the list. When repair_message_sequence drops/merges
messages at indexes below the cursor, the clamp leaves the cursor pointing
past unflushed rows and the turn-end flush silently skips them.

Extract repair_message_sequence_with_cursor(): snapshot the flushed prefix
by object identity before repair, then recompute the cursor as the count
of surviving flushed messages. Falls back to the clamp when no snapshot is
available. Keeps the safety guard in _flush_messages_to_session_db.

Adds targeted tests for overshoot, before-cursor compaction, no-repair,
bare-agent, and the flush guard.
2026-06-12 16:29:01 -07:00
kyssta-exe
5d0408d9fe fix(agent): clamp flush cursor after repair_message_sequence compaction (#44837) 2026-06-12 16:29:01 -07:00
konsisumer
aec38855b5 fix(agent): preserve recent turns during compression 2026-06-12 16:26:58 -07:00
Brooklyn Nicholson
0595af0ad1 feat(desktop): move workspace/worktree drag handle into the leading icon
Mirror the session row: the repo/worktree header's leading glyph (repo
mark, or a new git-branch mark for worktrees) swaps to a grabber on
hover/drag instead of carrying a separate handle on the right — freeing
header width for the label and + button.
2026-06-12 18:26:38 -05:00
Brooklyn Nicholson
e90672696e feat(desktop): worktree-aware sidebar grouping + composer/sidebar UX fixes
Group recents as parent-repo → worktree → sessions using local git
metadata (probed over IPC, with a path-name heuristic fallback for
remote backends). Single-worktree repos collapse to one level. Sessions
order by creation time and never reshuffle on new messages.

Also: fuse the status stack to the composer border, restore icon actions
in the queue panel, fix sidebar label truncation and drag styling, hide
sticky-message attachments while pinned, and bump the terminal font.
2026-06-12 18:18:39 -05:00
brooklyn!
bbf020e709 feat(desktop): follow streaming output at bottom + jump-to-bottom button (#45263)
Strict sticky-bottom autoscroll for the chat thread: while the viewport is
parked at the bottom, the tail follows content growth (streaming tokens, late
measurement, Shiki re-highlight) via a useLayoutEffect keyed on the
virtualizer's own size signal, pinned in the same pre-paint pass as its
scrollToFn so the two never rubber-band. The gate is a single boolean — one
upward pixel (scroll/wheel/touch) disarms follow until the user returns to the
bottom.

Adds a floating jump-to-bottom control that appears once scrolled ~10px away
(above the dim threshold so a sub-pixel settle never flashes it), positioned
above the composer with respect to the status stack, with a subtle
scale + slide in/out animation that honours prefers-reduced-motion. The button
bridges to the virtualizer's re-arm + pin path through a small nanostore
emitter.

Supersedes #43624.
2026-06-12 23:00:11 +00:00
Teknium
135fe90166 fix(profiles): backfill .env for pre-existing profiles on hermes update (#45247)
Profiles created before #44792 have no .env. Now that the Channels/Keys
endpoints are profile-scoped (no os.environ fallback), those profiles
would show everything as unconfigured. hermes update now copies the
default install's .env into each named profile that lacks one (0600,
never overwrites, placeholder fallback when the root has no .env), so
existing users keep the credentials they were effectively running with.
2026-06-12 15:42:14 -07:00
xxxigm
68536d4375 test(compressor): regression coverage for assistant-tail anchor + compaction rollup (#29824)
21 cases pinning the new ``_ensure_last_assistant_message_in_tail``
anchor and its interaction with the existing tail-cut path:

* ``TestFindLastAssistantMessageIdx`` — helper contract: prefers a
  content-bearing assistant message, skips ``tool_calls``-only
  stubs, multimodal text-block content counts, falls back to
  "any assistant" when no content-bearing reply exists, honours
  ``head_end``, returns -1 when there's none.

* ``TestEnsureLastAssistantMessageInTail`` — direct: no-op when
  already in the tail, walks ``cut_idx`` back when the reply is
  in the compressed middle, never crosses into the head region,
  re-aligns through a preceding ``tool_call`` / ``tool_result``
  group instead of orphaning it.

* ``TestFindTailCutByTokensAnchorsAssistant`` — integration:
  reporter repro (long tool-output run after the visible reply)
  now preserves the reply; user and assistant anchors compose
  in a single tail-cut call; a soft-ceiling-overrunning oversized
  tool result no longer strands the prior reply.

* ``TestCompactionRollupReproduction`` — end-to-end through
  ``compress()`` with a stubbed ``_generate_summary``: the
  visible reply text survives either as its own standalone
  assistant message (normal path) or concatenated onto the
  merged summary tail (double-collision path the WebUI then
  re-splits). The standalone-summary case is asserted strictly
  (exactly one summary row, exactly one separate assistant
  row carrying the reply) — that's the dominant path and any
  drift there reintroduces the original bug.

* ``TestSourceGuardrail`` — static asserts on
  ``agent/context_compressor.py``: the helper exists, the
  anchor is wired into ``_find_tail_cut_by_tokens`` AFTER the
  user-message anchor (so chaining is monotonic), the
  content-bearing preference is preserved, and the issue
  number is referenced so future bisects can find this fix.
2026-06-12 15:41:57 -07:00
xxxigm
2fef3e2df2 fix(webui): split merge-into-tail compaction so reply renders as its own bubble (#29824)
The compressor has a "double-collision" fallback path: when the
chosen ``summary_role`` collides with the first tail message AND
the flipped role would collide with the last head message, it can't
emit a standalone summary turn (consecutive same-role messages
break Anthropic and friends). It instead prepends the summary +
end-of-summary marker to the first tail message's content via
``_merge_summary_into_tail``.

With the matching anchor from the previous commit, that first tail
message is now usually the user's previously-visible assistant
reply — so the persisted assistant turn ends up shaped as
``[CONTEXT COMPACTION ...] ... --- END OF CONTEXT SUMMARY --- ...
THE ACTUAL REPLY``. Without splitting it, the session viewer
renders one big "Context handoff" bubble and the reply text is
buried inside the metadata blob — which is exactly the
"can't see the last reply" experience #29824 reports, just one
layer deeper.

Added ``splitCompactionContent`` that detects the merge marker
(kept in sync with ``--- END OF CONTEXT SUMMARY — respond to the
message below, not the summary above ---`` in
``agent/context_compressor.py``) and ``MessageBubble`` now
recurses on the two halves: the prefix half renders as the muted
"Context handoff" row, the remainder half renders with the
original assistant styling. Pure (non-merged) summary messages
hit the no-remainder branch and still render as a single
"Context handoff" row, preserving the original behaviour.
2026-06-12 15:41:57 -07:00
xxxigm
691ff7c188 fix(compressor): keep last visible assistant reply out of compaction summary + label handoffs in WebUI (#29824)
Two-pronged fix for the WebUI "context compaction block in place of
last assistant response" regression.

Agent layer (the real fix). ``_find_tail_cut_by_tokens`` already had
``_ensure_last_user_message_in_tail`` to keep the most recent user
request out of the compressed middle (#10896), but no symmetric
anchor for the assistant side. When the conversation has an
oversized recent tool result or a long stretch of tool-call/result
pairs *after* the assistant's last visible reply, the token-budget
walk can stop with the previously-visible reply on the wrong side
of ``cut_idx``. The summariser then rolls it into the single
``[CONTEXT COMPACTION — REFERENCE ONLY]`` block persisted as
``role="user"`` or ``role="assistant"``, and from the operator's
perspective the WebUI session viewer
(``web/src/pages/SessionsPage.tsx``) and the TUI chat panel both
suddenly show the opaque "Context compaction" block in the slot
where they were just reading the actual answer:

    User:  "i cant see the output of the last message you sent,
            i did see it previously, however now see 'context
            compaction'"

Added ``_ensure_last_assistant_message_in_tail`` mirror of the
user-side anchor. It looks for the most recent assistant message
with non-empty text content (skipping tool-call-only assistant
"stubs" which the UI renders as small "calling tool X" indicators
rather than a readable bubble) and walks ``cut_idx`` back through
the standard ``_align_boundary_backward`` so we don't split a
tool_call/result group that immediately precedes it. The two
anchors are chained — each only walks ``cut_idx`` backward, so
the tail can only grow.

Falls back to "most recent assistant of any kind" only when no
content-bearing reply exists in the compressible region (fresh
multi-step tool sequence with no prior reply) — in that case the
agent-side fix is effectively a no-op and the existing
user-message anchor carries the load.

WebUI layer (clarity). Added ``isCompactionMessage`` detector that
recognises the ``[CONTEXT COMPACTION — REFERENCE ONLY]`` (current)
and ``[CONTEXT SUMMARY]:`` (legacy) prefixes from
``agent/context_compressor.py``, and a new ``compaction`` entry
in ``MessageBubble``'s ``ROLE_STYLES`` map. Compaction blocks
now render as muted, italicised system-style rows labelled
``Context handoff`` — clearly metadata, not the assistant's
actual reply — so an operator scrolling back through a long
session can't mistake the summary for a real answer.

Keeping the detected prefixes inline (rather than importing them)
because the WebUI bundle has no Python interop. A guardrail comment
points readers at the source-of-truth constants in
``agent/context_compressor.py``.
2026-06-12 15:41:57 -07:00
Teknium
7a318aae22 fix(profiles): exclude session history, backups, and snapshots from --clone-all (#45246)
--clone-all copied the source profile's state.db, sessions/, backups/,
state-snapshots/, and checkpoints/ into the new profile. These are
per-profile history: a 49GB copy in practice (15GB snapshots + 11GB
backup archives + 16GB state.db + 6.4GB sessions), and restoring a
copied backup inside the clone would resurrect the SOURCE profile's
state. A clone is a fresh workspace; history stays with the source.

New _CLONE_ALL_HISTORY_EXCLUDE_ROOT set, applied at root level for ANY
source profile (named profiles accumulate the same artifacts), unlike
the default-gated infrastructure excludes. Nested same-name dirs still
copy. Docs and the post-create CLI message updated to match; profile
export / hermes backup remain the full-history paths.
2026-06-12 15:41:50 -07:00
Brooklyn Nicholson
b16e22b8f2 fix(desktop): persist tool-row dismissal across virtualization; keep caret hittable
Salvage of #45240. The dismiss-settled-tool-rows affordance was correct in
intent but had two issues against current main:

- The thread is virtualized, so a row's component unmounts/remounts as it
  scrolls. Component-local `useState` dismissal was forgotten on remount and
  the row popped back. Move dismissal into a session-scoped nanostore keyed by
  the stable disclosure id (mirrors $toolDisclosureOpen), so a dismissed row
  stays gone while scrolling but a reload restores real history instead of
  permanently rewriting it.
- The dismiss button lived in DisclosureRow's absolute `trailing` slot — the
  exact "opacity-0-but-clickable control fights the caret" pattern the trailing
  comment warns against. Add an in-flow `action` slot that lays out at the far
  right so an interactive control never overlaps the caret's hit-target,
  regardless of title length, and move the dismiss button into it.

Adds a remount regression test alongside the existing dismissal coverage.
2026-06-12 17:34:48 -05:00
helix4u
2e874ef879 fix(desktop): allow dismissing settled tool rows 2026-06-12 17:22:30 -05:00
Teknium
0db5cb8e75 refactor(agent): hoist summary end marker to _SUMMARY_END_MARKER; strip it on rehydration
Follow-up to the #33346 cherry-pick:
- the marker string was duplicated at both insertion sites (standalone +
  merged-into-tail); hoist to a module constant
- _strip_summary_prefix now also strips a trailing end marker so a
  rehydrated handoff body doesn't leak the boundary directive into the
  iterative-update summarizer prompt (it is re-appended on insertion)
2026-06-12 15:05:00 -07:00
Tranquil-Flow
749b7219c4 fix(compression): always append END OF CONTEXT SUMMARY marker to standalone summaries regardless of role
When the compression summary lands as an assistant-role message (head ends
with user), the end marker was not appended. Models may regurgitate the
summary text as their own visible output when there's no clear boundary
signal (#33256).

The end marker was already appended for user-role summaries (#11475, #14521)
but the assistant-role path was missed in the original fix. This ensures ALL
standalone summary messages carry the boundary marker, preventing summary
text from leaking into user-visible chat output.
2026-06-12 15:05:00 -07:00
Teknium
a118b94a85 fix(dashboard): skill installs from the dashboard silently auto-cancel (#45150)
The dashboard's /api/skills/hub/install (and the new-profile hub_skills
path) spawned `hermes skills install <id>` with stdin=DEVNULL but
without --yes. do_install()'s 'Confirm [y/N]' prompt hit EOF, defaulted
to 'n', and printed 'Installation cancelled.' into a background log the
user never sees — every dashboard install no-opped.

Pass --yes on both spawn sites, matching the uninstall endpoint which
already passed --yes. The dashboard install button is the explicit user
consent, same as the TUI/slash-command skip_confirm rationale.

Repro: spawned the exact argv with stdin=DEVNULL against a temp
HERMES_HOME — without --yes it cancels, with --yes the skill installs.
2026-06-12 12:58:36 -07:00
Teknium
bba9b519aa fix(delegation): remove the default subagent wall-clock timeout (#45149)
Subagents doing legitimate heavy work (deep code reviews, research
fan-outs, slow reasoning models) were routinely killed at the blanket
600s child_timeout_seconds cap while making steady progress (e.g. 36
API calls completed when the axe fell). Failures should come from what
the child is actually doing — API errors, tool errors, iteration
budget — not a delegation-level stopwatch.

- DEFAULT_CHILD_TIMEOUT: 600 -> None; Future.result(timeout=None)
  blocks until the child finishes
- config default delegation.child_timeout_seconds: 600 -> 0
  (0/negative = disabled; positive opts back in, floor 30s unchanged)
- stuck-child protection unchanged: the heartbeat staleness monitor
  still stops refreshing parent activity so the gateway inactivity
  timeout fires on a truly wedged worker; the 0-API-call diagnostic
  dump still works when a cap is configured
- docs updated (EN + zh-Hans)
2026-06-12 12:58:25 -07:00
Teknium
9b01c4d193 fix(update): never spawn an interactive polkit prompt when restarting a system-scope gateway (#45145)
When hermes update restarts a hermes-gateway system service as a
non-root user, the systemctl reset-failed/start/restart calls trigger
polkit's org.freedesktop.systemd1.manage-units TTY authentication
agent. That prompt runs inside a captured subprocess with a 10-15s
timeout, so it flashes and dies before the user can answer, and the
resulting TimeoutExpired was swallowed silently by the loop's blanket
except — the restart phase just vanished with no output.

- Resolve a manage-units command prefix up front: plain systemctl as
  root, sudo -n systemctl as non-root (with a targeted reset-failed
  probe so least-privilege sudoers entries scoped to hermes-gateway*
  qualify), or None when no non-interactive privilege path exists.
- Add --no-ask-password to every manage-units call in the update
  restart path so polkit can never prompt inside a captured subprocess.
- When unprivileged: after a graceful drain, rely on systemd's own
  RestartSec auto-restart (needs no privileges) with a message about
  the wait; skip the force-restart fallback with clear manual
  instructions instead of racing a doomed polkit prompt.
- Surface TimeoutExpired in the restart loop instead of passing
  silently, and add sudo to the system-scope recovery hints.
- Docs: headless-VM note recommending user service + enable-linger,
  or sudo updates / a scoped NOPASSWD sudoers entry for system
  services.
2026-06-12 12:38:15 -07:00
Teknium
fca84fe20b test: regression guard for Nous 429 fallback re-entry; AUTHOR_MAP entry 2026-06-12 12:21:29 -07:00
Aðalsteinn Helgason
2714fc8396 fix(agent): re-enter retry loop on genuine Nous 429 so fallback guard runs
The genuine-rate-limit branch set retry_count = max_retries before
continue, intending the top-of-loop Nous guard to handle fallback or
bail cleanly. But the loop condition is retry_count < max_retries, so
the guard never ran: no fallback activation, no clean rate-limit
message — just the generic retry-exhaustion error.

Set retry_count = max(0, max_retries - 1) so the loop body runs exactly
once more and the guard sees the breaker state recorded moments earlier.

Extracted from the #44061 bugfix rollup by @AIalliAI.
2026-06-12 12:21:29 -07:00
Teknium
dc467488a7 test: assert typing-stop-before-callback as an invariant, not a call count
The shared _stop_typing_refresh cleanup makes up to two bounded
stop_typing attempts; the old assertion pinned exactly one
typing-stopped event before callback-start.
2026-06-12 12:02:41 -07:00
Teknium
c2326bc3be chore: add itsflownium to AUTHOR_MAP 2026-06-12 12:02:41 -07:00
Flownium
331cb38e21 fix: stop Discord typing after replies 2026-06-12 12:02:41 -07:00
Teknium
fa5e98facb fix(send): helpful error when --file gets a binary; document MEDIA: attachments (#45116)
A user passing an image to `hermes send --file` got a raw
UnicodeDecodeError ('utf-8 codec can't decode byte 0x89...') with no
hint that media delivery goes through the MEDIA:<path> directive.

- send_cmd: catch UnicodeDecodeError separately and print a usage error
  explaining --file is for text bodies, with copy-pasteable MEDIA: and
  [[as_document]] examples using the user's own path
- --file help text + epilog now mention MEDIA:
- docs: new 'Sending images and other media' section on the hermes send
  reference page
2026-06-12 11:48:06 -07:00
Teknium
652dd9c9f2 fix: rich messages follow-ups — reply_parameters, send latch, opt-in default
- Use reply_parameters per the sendRichMessage spec instead of the
  undocumented reply_to_message_id scalar (silently ignored -> reply
  anchor quietly dropped).
- Latch rich sends off after an endpoint-capability failure (old PTB /
  server without sendRichMessage) so every later reply doesn't pay a
  doomed extra roundtrip; per-message BadRequests do NOT latch.
- Default rich_messages to OFF (opt-in) while the day-old Bot API 10.1
  endpoint is validated live; revert the prompt-hint table guidance
  until the default flips on.
- Tests: reply_parameters shape, send-latch behavior, BadRequest
  non-latch; rich tests opt in explicitly via extra.
2026-06-12 11:47:54 -07:00
ITheEqualizer
05b9c84ca4 Add Telegram Bot API 10.1 rich message support
Introduce opportunistic support for Telegram Bot API 10.1 rich messages by sending raw agent Markdown via sendRichMessage and streaming previews via sendRichMessageDraft. Implements a rich-path fast‑path in gateway/platforms/telegram.py (RICH_MESSAGE_MAX_BYTES=32768, feature gate platforms.telegram.extra.rich_messages, bot capability checks, routing/thread handling, and conservative fallback rules: permanent/capability errors fall back to the legacy MarkdownV2 path, transient/network errors are surfaced without legacy-resend). Also add a latch for draft capability failures (_rich_draft_disabled) and preserve legacy chunking and draft behavior when needed. Update agent prompt hints (telegram encourages rich Markdown/tables), add CLI config example option, update English and Chinese docs to describe rich messages and fallbacks, and add/adjust tests for rich send and draft behavior.
2026-06-12 11:47:54 -07:00
alt-glitch
01669f2f12 opentui(v6): /sessions groups this directory's sessions first + TUI persists its cwd
The resume picker never had cwd grouping — deliberately deferred in the v1
spec because TUI session rows had no cwd to group by: the TUI's
session.create sent only {cols}, so explicit_cwd stayed false and
_ensure_session_db_row skipped cwd stamping by design (the desktop's launch
dir is meaningless — 'No workspace' grouping is its desired default).

In a terminal the launch directory IS the workspace choice, so the entry now
passes cwd: process.cwd() at session.create — the existing explicit-workspace
machinery persists it to the session row on first message (covered by
test_ensure_session_db_row_persists_explicit_cwd; zero gateway changes).

Picker: while browsing (no search), sessions whose cwd matches the TUI's
current directory order first under a '▾ this directory (N)' caption, the
rest under '▾ other directories' — one flat reordered list, so selection/
windowing/load-more math is untouched, and captions are pure render
decoration keyed off hereCount. During search the fuzzy score keeps owning
the order. Trailing-slash-normalized comparison, no fs calls.

Old sessions can't be backfilled (their cwd was never recorded); coverage
accumulates from here. 6 new tests (pure ordering edges + grouped frames,
search-drops-grouping, no-cwd passthrough).
2026-06-12 23:50:59 +05:30
teknium1
6b4073648e fix(tui): config.yaml wins over env model seed in per-turn sync
Hosted instances set HERMES_INFERENCE_MODEL as a provision-time seed in
the container env. _config_model_target() previously went through
_resolve_model() (env-first), so on hosted VPS the sync target stayed
pinned to the seed and dashboard model changes never reached an open
chat -- the exact scenario the sync exists to fix. The sync target now
reads config.yaml first and only falls back to the env vars when config
has no model. Startup resolution (_resolve_model) is unchanged.
2026-06-12 11:03:44 -07:00
IAvecilla
bc3f4ed70f Skip redundant model switch 2026-06-12 11:03:44 -07:00
IAvecilla
8c3c08c50b Update implementation to make it cleaner 2026-06-12 11:03:44 -07:00
IAvecilla
c61815232a Update model correctly when updating from dashboard 2026-06-12 11:03:44 -07:00
ethernet
1e25358a8f refactor(desktop): use port 0 for ephemeral port discovery instead of PortPool reservation
Replace the PortPool-based port reservation system (9120-9199 range) with OS-assigned ephemeral ports via --port 0.

Before: Desktop probed a hardcoded port range, reserved ports in-process to close TOCTOU races, and passed the chosen port to the dashboard via CLI arg.

After: Desktop spawns dashboard with --port 0, parses the actual port from a stdout announcement line (HERMES_DASHBOARD_READY port=<N>), and uses that for WebSocket connections.

Changes:
- web_server.py: add --port 0 support with SO_REUSEADDR pre-bind + announcement; add EADDRINUSE preflight for explicit ports
- main.cjs: remove PortPool, PORT_FLOOR/CEILING, pickPort(), isPortAvailable(); add waitForDashboardPort() stdout parser
- Delete port-pool.cjs and port-pool.test.cjs (106 lines removed)

Net effect: eliminates the entire TOCTOU-mitigation reservation infrastructure and arbitrary port range constraints. OS handles port allocation natively.
2026-06-12 14:02:19 -04:00
ethernet
8044bf0206 fix(ci): only save test durations when tests pass
The save-durations job used `if: always()` which meant it would
run even when the test matrix failed, potentially caching duration
data from a failed/incomplete run. Changed to check
needs.test.result == 'success' so durations are only cached when
all test slices pass cleanly.
2026-06-12 13:50:52 -04:00
ethernet
4d68984ec7 fix(tests): remove no-longer-needed forensics 2026-06-12 13:42:42 -04:00
ethernet
6ff39c31ad fix(tests): guard against real 'hermes update' subprocess spawns in conftest
Extends _live_system_guard in tests/conftest.py to block any subprocess
call that would run 'hermes update' (or 'python -m hermes_cli.main update')
against the real checkout.

These commands run git fetch origin + git pull, overwriting repo files
like pyproject.toml mid-test-run and corrupting every subsequent
subprocess that reads them. The spawned process uses setsid /
start_new_session=True so it's invisible to pytest's process tree
(PPid=1) — the corruption was essentially undetectable without
explicit inotify/SHA watchdogs.

Root cause of #43703 CI failures: tests in TestUpdateCommandPlatformGate
called _handle_update_command() with HERMES_MANAGED='' and no Popen mock,
causing the code to fall through and spawn a real 'hermes update --gateway'
that overwrote pyproject.toml with origin/main's content (which still
had '--timeout=30 --timeout-method=thread' in addopts while the PR had
already removed pytest-timeout).

The guard covers all three invocation patterns:
- 'hermes update' / 'hermes update --gateway' (direct or via setsid bash -c)
- 'python -m hermes_cli.main update --gateway'
- '.venv/bin/hermes update' (absolute path variant)

Does not false-positive on: git update-index, apt-get update,
pip install --upgrade, or any command lacking 'hermes'/'hermes_cli'.
2026-06-12 13:42:42 -04:00
ethernet
c41a6534cf fix(tests): mock subprocess.Popen in all _handle_update_command tests 2026-06-12 13:42:42 -04:00
ethernet
2f9d18711f fix(ci): remove pytest-timeout, use per-file timeout only
fix(ci): write a new cache for test durations every time
change(ci): rip out error 4 retries because we found the real bug
2026-06-12 13:42:42 -04:00
brooklyn!
46d758bb3e feat(desktop): window translucency slider in Appearance settings (#45086)
A see-through-window control (0–100, off by default) that maps to the
native window opacity via setOpacity — the desktop shows through the whole
window, the same effect as the Windows shift-scroll trick. macOS + Windows;
a no-op on Linux (no runtime window opacity).

Renderer owns the value (persisted, nanostore) and mirrors it to the main
process over IPC; main persists it to translucency.json so a cold launch
applies it at window creation before the renderer reports in.
2026-06-12 12:02:38 -05:00
SHL0MS
7d4e60e44a docs(website): redirect old automation-templates URL to automation-blueprints
The Automation Blueprints rebrand (#44470) renamed the guide page from
guides/automation-templates to guides/automation-blueprints, leaving the
old URL 404ing. The site deploys to static hosting, so server-side
redirects aren't available.

Add @docusaurus/plugin-client-redirects (pinned 3.9.2, same as the other
Docusaurus packages) and a redirect entry for the old slug. The plugin
emits a static HTML page at the old path that meta-refresh/JS-redirects
to the new page, preserving query string and hash, with a canonical link
for SEO. Localized routes are handled automatically (zh-Hans verified).
2026-06-12 09:46:27 -07:00
brooklyn!
79c3ed3cc9 fix(desktop): new chat honours the active profile instead of rubberbanding to default (#45057)
The top "New Session" button (and /new, the keyboard shortcut) cleared
$newChatProfile to null, meaning "use the live gateway context". But
createBackendSessionForSend turned a null into an omitted `profile` param on
session.create. In global-remote mode one backend serves every profile, so an
omitted profile silently binds the new chat to the launch (default) profile's
home/state.db — the session "rubberbands back to default" even though the rail
still shows the selected profile. The per-profile "+" worked because it sets
$newChatProfile explicitly.

Resolve a null $newChatProfile to the active gateway profile at the single
session-creation chokepoint so session.create always carries the live profile.
Harmless for single-profile and local-pooled users: a backend resolves its own
launch profile to None (_profile_home), so passing it changes nothing.
2026-06-12 16:38:56 +00:00
brooklyn!
d62979a6f3 feat(desktop): composer status stack, live subagent windows, editable prompts (#44630)
* feat(desktop): session-scoped status stack + kill new-window theme flash

Stack subagents, background tasks, and the queue into one collapsible
"sink" above the composer, reusing the queue's chrome so every status
reads as one piece. Extracts shared StatusSection / StatusRow /
TerminalOutput primitives and a unified $statusItemsBySession store
(subagents mirrored, background owned here, merged + grouped for render).
Renames BrailleSpinner → GlyphSpinner now that it drives more than braille.

Separately, fix the white flash on every new/cmd-clicked window: macOS
`vibrancy` paints an NSVisualEffectView that follows the OS appearance and
ignores `backgroundColor`, so a dark app on a light-mode Mac flashed white
until the renderer painted over it. Pin `nativeTheme.themeSource` to the
app theme (persisted to userData so cold launches paint right before the
renderer loads), hold windows with `show:false` until `ready-to-show`, and
pre-paint the themed background via an inline script before the bundle runs.

* feat(desktop): dock the slash popover to the composer via one shared fill var

The slash·@ popover (and ? help) now docks onto the composer's edge with the
same chrome as the queue/status stack — rounded outer corners, fused borderless
edge, no shadow — but keeps its own narrow width.

Surface + drawer paint a single --composer-fill var; the state ladder
(rest / scrolled / focused / drawer-open) lives once in styles.css on
[data-slot='composer-root']. The :has() drawer-open rule is last and forces an
opaque fill, since translucent glass sampling different backdrops (thread vs
fade gradient) can never match. This replaces the focus-within !important
override that repainted the surface behind every previous matching attempt.

Also drop the chevron column from the project file tree — the folder open/closed
icon already carries the expand state.

* feat(desktop): base inset for file tree rows (post-chevron alignment)

* feat(desktop): wire the status stack's background tasks to the real process registry

The background group was UI-only (dev-mock seeded). Now it's live e2e:

- tui_gateway: new session-scoped `process.list` (registry snapshot filtered
  by the session's session_key, plus a 4KB output tail for the inline
  terminal viewer) and `process.kill` (single process, ownership-checked —
  unlike process.stop's kill_all).
- Renderer: `reconcileBackgroundProcesses` syncs snapshots into the store
  layout-stably — rows keep their position when state flips (never re-sort),
  new processes append, unchanged rows keep object identity so memoised rows
  skip re-rendering, and a dismissed-set stops the registry's retained
  finished procs from resurrecting X-ed rows.
- Refresh triggers: session open, terminal/process tool.complete,
  status.update(kind=process) from the gateway's notification poller, and a
  5s poll armed only while a running row is visible (catches silent exits).
- Stop = real `process.kill` + optimistic dismiss; Dismiss = client-side
  with resurrection guard.
- Re-keyed the stack to the RUNTIME session id: it was keyed by the stored
  session id, where neither subagent events nor process.list would ever land.
- Deleted dev-status-mocks.ts (__hermesStatusMocks) — no more seed shit.

Reconcile invariants covered in store/composer-status.test.ts.

* feat(desktop): todos + openable subagents in the status stack, self-healing file tree

- todo lists move out of the inline chat panel into the composer status stack
  (checklist icon, dashed ring = pending, spinner = in progress, check = done),
  fed live from todo tool events and seeded from history on session open
- subagent rows carry the child's real session id end-to-end
  (delegate_tool → gateway → renderer) so clicking one opens ITS session window
- status stack publishes its measured height so the thread's bottom clearance
  grows with it; card paints the shared --composer-fill so focused/scrolled
  states match the composer exactly
- file tree self-heals: ENOENT roots retry on a 3s cadence + Try again button,
  and the main process expands ~ in IPC paths (gateway cwds arrive as ~/...)
- composer drag-drop of tree entries inserts inline refs instead of attachments

* fix(desktop): file tree falls back to the workspace dir when a session's cwd is gone

Sessions record their launch cwd; deleted worktrees leave that path dead,
so opening such a session swapped the tree from the default workspace to a
directory that ENOENTs forever — the 3s retry just spun on it. On a root
read error the tree now asks main to sanitize the cwd (prefers the
configured default project dir), displays that fallback, and quietly
re-probes the original path so it switches back if the dir reappears.

* feat(desktop): working restore-checkpoint button on past user prompts

The discard icon on hover of a past user bubble was decorative — clicking
did nothing. It's now a real control: a confirmation dialog explains that
everything after the prompt is removed, then the session rewinds to that
turn and reruns the same prompt (prompt.submit with
truncate_before_user_ordinal, the same mechanism the edit composer uses).
Failures rethrow into the dialog's inline error instead of toasting.

* fix(desktop): show the restore-checkpoint button on the latest user prompt too

Restoring the most recent prompt is just 'retry this turn' — no reason to
exclude it. Stop still takes the slot while the turn is running.

* fix(desktop): finished todo lists clear themselves out of the status stack

A list whose every item is completed/cancelled lingers ~4s so the final
checkmark is visible, then the todo group drops out of the stack. A fresh
active list arriving within the linger cancels the scheduled clear.

* chore(desktop): drop dead editableCheckpoint copy, terser restore confirm

* fix(desktop): rewind clears the abandoned timeline's todos + background

Restoring to (or editing) an earlier prompt rewinds the conversation, but
the todos and background processes spawned by the now-discarded turns kept
showing in the status stack — and the real background processes kept
running. Both rewind paths now clear the session's todo rows and kill +
drop its background processes before the fresh run repopulates them. Also
drops the click-to-edit clamp transition, which flashed a half-expanded
bubble on the way into the edit composer.

* feat(desktop): user messages are always editable; edit/restore revert mid-stream

The bubble is now always click-to-edit — even while a turn streams — instead
of going inert during a run. Sending an edit acts like restore: it rewinds to
that prompt and re-runs with the new text. Both edit and restore can fire
mid-stream now; the gateway refuses prompt.submit while a turn runs (4009
"session busy"), so they interrupt the live turn first and retry the submit
until the cooperative interrupt winds it down. Restore (re-run as-is) shows on
every prompt except the latest running one, which keeps the Stop button.

* fix(desktop): label preview-pane ⌘L selections with the filename, not "zsh"

The terminal owns a global ⌘/Ctrl+L "send selection to composer" shortcut, so
selecting text in the file preview pane and hitting it fell through to the
terminal handler — which imported the right text but labelled the composer ref
"zsh:N lines" off the shell name. When the selection isn't an xterm selection,
label it with the previewed file instead.

* fix(desktop): ⌘L on a preview line selection inserts the @line ref, like dragging

The source preview lets you select lines in the gutter and drag them into the
composer as an @line:path:start-end ref. ⌘/Ctrl+L now does the same when a line
selection is active — it drops the identical ref instead of falling through to
the terminal's global handler (which grabbed the native text selection and sent
a bogus terminal block). Capture-phase + stopPropagation so it wins; with a line
selection there's no native selection, so the terminal handler stays out of it.

* chore: gitignore apps/desktop/demo/ scratch output

The desktop demo prompt writes demo/*.txt during recorded walkthroughs; it's
throwaway, never part of the app. Ignore it so it stops cluttering git status.

* feat(desktop): subagent watch windows, hard stop, sidebar hygiene

Child-session mirror for live subagent windows, delegate sessions tagged
and excluded from the sidebar, composer focus/stop polish, and WS stall
resilience on the gateway transport.

* refactor: DRY delegate SQL + trim status-stack noise

Extract shared listable-child and delegate-delete helpers in hermes_state,
collapse cancelRun busy release, and cut comment bloat in resume/status paths.

* fix(desktop): hide orphaned subagent sessions in sidebar

Cascade-delete all ephemeral children on parent delete (not just tagged rows),
run v16 backfill to tag legacy orphans, and record new delegates as source=subagent.

* fix: restore orphan contract for untagged children + lazy session eviction

Cascade-delete only _delegate_from-tagged rows (v16 backfill covers legacy),
walk marker chains recursively with FK-safe orphaning, gate lazy watch
sessions out of the still-starting eviction exemption via an explicit flag,
pass session_id to _make_agent only when resuming, and hide source=subagent
from session search.

* fix(gateway): gate child mirror off upgraded sessions + age out stale run entries

Review findings: the mirror could interleave synthetic events with a real
native stream once a watch window upgrades (prompt.submit builds an agent),
and a lost subagent.complete left _active_child_runs pinning running=true
forever. Mirror now stops when the live session owns an agent; liveness
reads ignore entries older than an hour.

* fix(gateway): reject prompt.submit into a watch session while its child runs

A lazy watch session's running flag is False (the run lives in the parent
turn), so typing mid-run sailed past the busy guard and built a second agent
racing the in-flight child on the same stored session. Busy error until the
run completes; afterwards the submit upgrades into a normal conversation.

* refactor(gateway): DRY watch-resume payload + compose listable-child SQL

Fold the duplicated child-run busy overlay into one _reuse_live_payload
helper across both resume reuse paths, collapse the twin mirror early-returns,
and build _LISTABLE_CHILD_SQL from _BRANCH_CHILD_SQL instead of restating it.

* fix(desktop): clip horizontal overflow on sidebar scroll areas

Add overflow-x-hidden alongside overflow-y-auto on session list scrollers
and the shared SidebarContent primitive — vertical scroll unchanged.
2026-06-12 08:30:06 -05:00
y0shualee
9c50521704 fix(desktop): complete backend PATH for Homebrew Codex
macOS Desktop backend processes can still miss Apple Silicon Homebrew paths even after adding Hermes-managed Node and venv bins. That leaves `/codex-runtime on` unable to find a Homebrew-installed `codex` binary at `/opt/homebrew/bin/codex`.

Add a small testable backend env helper that builds the dashboard subprocess environment in one place. It prepends Hermes-managed Node and venv bins, appends missing POSIX sane PATH entries individually, preserves caller precedence without duplicates, and keeps Windows PATH casing/delimiters intact.

Wire both source-checkout and active-install backend descriptors through the helper, and add Node regression coverage to the desktop platform test suite.
2026-06-12 03:03:44 -07:00
Teknium
88dbf95105 fix(dashboard): profile-scope Channels endpoints and seed per-profile .env (#44792)
Two halves of the same community report (dashboard Profile Builder):

1. A fresh dashboard/CLI-created profile got no .env file unless cloned,
   so it silently inherited API keys and messaging tokens from the shell
   environment / root install. create_profile() now seeds a placeholder
   .env (0600) for non-clone profiles, matching the SOUL.md seeding.

2. The Channels endpoints (/api/messaging/platforms GET/PUT/test) were
   not profile-scoped: they read/wrote the dashboard process's own .env
   via load_env()/save_env_value() regardless of the global profile
   switcher. They now accept the standard optional profile param (body
   beats query on the PUT, matching other scoped writes) and run inside
   _profile_scope(). When scoped, the payload no longer falls back to
   os.environ or load_gateway_config()'s env-override layer — both carry
   the ROOT install's credentials and would misreport them as the
   profile's. /api/messaging/platforms added to PROFILE_SCOPED_PREFIXES
   so the sidebar switcher scopes the Channels page automatically.
2026-06-12 02:09:28 -07:00
loongfay
e20e0bd744 feat(Yuanbao): support wechat forward msg (#43508)
* feat(yuanbao): support wechat forward msg

* feat(yuanbao): support wechat forward msg

---------

Co-authored-by: loongfay <izhaolongfei@gmail.com>
2026-06-12 02:06:47 -07:00
Teknium
0fd34e8c5a fix(teams): cache document/video/audio attachments and classify as DOCUMENT (#44778)
The Teams adapter only handled image/* attachments — documents (the
application/vnd.microsoft.teams.file.download.info consent-free download
payload and any direct-URL non-image attachment) never reached media_urls
at all, so run.py's document-context injection had nothing to surface.
Completes the class-wide sweep from PR #44695 (Signal/Email/SimpleX).

- download.info attachments: fetch the pre-authed SharePoint downloadUrl
  (SSRF-guarded, same guard chain as base.py cache_*_from_url) and route
  through cache_media_bytes
- direct-URL non-image attachments: same fetch + classify path
- skip Teams' text/html message-body mirror and adaptive-card attachments
- DOCUMENT > PHOTO > VIDEO > AUDIO precedence for mixed attachments,
  matching the Email precedence rationale from #44695
2026-06-12 02:05:41 -07:00
Siddharth Balyan
7ba5df0d52 feat(billing): /credits command — balance + portal top-up handoff (#44776)
* feat(billing): /usage → portal top-up browser handoff

Add the terminal side of the billing slice (phase 2a): start a top-up by
throwing the user to the portal billing page with the top-up modal open. The
terminal does not confirm, poll, or track payment — checkout completes in the
browser and the next /usage shows the new balance.

- nous_account.py: parse organisation.slug/name from /api/oauth/account into
  NousPortalAccountInfo; add nous_portal_topup_url() building the org-pinned
  {base}/orgs/{slug}/billing?topup=open with a null-slug fallback to the legacy
  {base}/billing?topup=open (never /orgs/None/...).
- portal_cli.py: 'hermes portal topup' — fresh account fetch, identity line
  (Topping up as <email> / org <name>), browser open with printed-URL fallback,
  no-wait closing copy. No polling/confirmation (deferred to 2b).
- account_usage.py: the shared /usage credits block now links the org-pinned
  top-up URL (auto-opens the modal) + points to the command.

Depends on NAS #409 (organisation.slug/name + ?topup=open). Do not merge until
that is live on the target env; until then /api/oauth/account returns
organisation: { id } only and the URL falls back to legacy.

* feat(billing): /credits command for balance + top-up handoff

Replace the standalone `hermes portal topup` subcommand with an in-session
/credits slash command — a focused money surface (balance in, top-up out) that
works in the CLI, TUI, and every messaging platform from one registry entry.

- commands.py: register /credits (Info category). Slack is at its 50-slash cap,
  so /credits is routed via /hermes credits on Slack only (new
  _SLACK_VIA_HERMES_ONLY set) to avoid clamping a canonical command off the
  native list and breaking Telegram parity; native everywhere else.
- account_usage.py: build_credits_view() — one portal fetch → balance lines +
  identity line + org-pinned top-up URL + depleted flag, consumed by all
  surfaces. Reuses the same snapshot/URL builder as /usage so numbers match.
- cli.py: _show_credits() — balance block + identity line + 3-button panel
  (Open top-up / Copy link / Cancel) via the existing prompt_toolkit modal.
  ASK, never auto-launch; headless falls back to printing the URL.
- gateway/slash_commands.py: _handle_credits_command() — renders the block +
  tappable top-up URL + no-wait copy; works on button and plain-text platforms.
- /usage credits line now points to /credits.
- Retire `hermes portal topup` (portal_cli.py back to baseline); the engine
  (slug/name parse + nous_portal_topup_url) stays as the shared core.

No polling, no payment confirmation (billing phase 2a). Depends on NAS #409.

* fix(credits): /credits works in the TUI slash-worker (non-interactive)

In the TUI, /credits runs in the slash-worker subprocess where there is no
live prompt_toolkit app and stdin is the JSON-RPC pipe. _show_credits called
the 3-button modal unconditionally, which fell back to reading stdin →
exception → slash.exec rejected → the command produced no output (only the
pre-existing 'Credit access paused' banner showed).

- _show_credits: when self._app is None (TUI worker / piped / non-interactive),
  render the text variant — balance block + tappable top-up URL + no-wait line,
  same affordance as the messaging surfaces — and skip the modal entirely. The
  3-button panel still renders in the interactive CLI.
- Depleted banner copy: 'run /usage for balance' → 'run /credits to top up'
  now that /credits is the dedicated money surface (+ tests).
- Regression tests: _show_credits with self._app=None renders text and never
  invokes the modal; logged-out path.

* feat(tui): credits.view RPC for the /credits tappable top-up button

Add a credits.view JSON-RPC method returning the structured CreditsView
(logged_in, balance_lines, identity_line, topup_url, depleted) so the TUI can
render a clickable <Link> top-up button instead of plain text. Account-
independent (portal fetch gated on a logged-in Nous account), fail-open to
{logged_in: false} on any hiccup. Mirrors session.usage's credits-block pattern.

Frontend (TUI-local /credits command + Ink component) lands separately.

* feat(tui): /credits command with keyboard-driven top-up confirm

TUI-local /credits: fetches the structured balance via the credits.view RPC,
prints the balance + identity + top-up URL, then arms the EXISTING confirm
overlay (Enter = open top-up in browser via openExternalUrl, Esc = cancel).
Reuses ConfirmReq — no new overlay component/state/input handler. Headless
(openExternalUrl returns false) falls back to printing the URL.

- gatewayTypes.ts: CreditsViewResponse.
- commands/credits.ts: the command (mirrors /status's rpc+guarded pattern).
- registry.ts: register creditsCommands.
- test: balance+overlay armed, headless fallback, no-url, logged-out (4 cases).

Matches the CLI /credits 'Enter to open' affordance. Phase 2a: no polling.
2026-06-12 08:51:10 +00:00
Teknium
4474873d2c feat(cli): persist resolved approval/clarify prompts in scrollback (#44702)
Modal prompt panels (dangerous-command approval, clarify questions)
live in the prompt_toolkit layout and vanish on the next repaint,
leaving no trace of the question or the decision in chat history.

Emit a dim one-line summary after each prompt resolves:
  ⚠ Approval: <command> → allowed for session
  ? Clarify: <question> → <answer>

Gated on display.persist_prompts (default true). Detail and outcome
are whitespace-collapsed and capped at 120 chars.
2026-06-12 01:14:35 -07:00
Teknium
8e5b7592f8 refactor(agent): hoist MEDIA-directive regex to module level
Avoid recompiling the pattern on every _serialize_for_summary call; name it
beside _PATH_MENTION_RE with the #14665 rationale.
2026-06-12 01:14:28 -07:00
Tranquil-Flow
286ecd26d8 fix(agent): strip MEDIA directives from compressor summarizer input (#14665) 2026-06-12 01:14:28 -07:00
Teknium
8b2a3c9c51 chore: add kdunn926 to AUTHOR_MAP 2026-06-12 01:07:50 -07:00
Teknium
74180ebf0b fix(gateway): classify SimpleX non-image/non-audio files as DOCUMENT
SimpleX tagged unknown files application/octet-stream in media_types
but classification only handled audio/image, leaving msg_type TEXT —
run.py never injected the document context. Same bug class as #12845.
2026-06-12 01:07:50 -07:00
Teknium
f03f161b39 fix(gateway): classify email document attachments as DOCUMENT
Email cached document attachments and placed them in media_urls, but
msg_type only flipped on image attachments — documents stayed TEXT and
run.py's document-context injection (gated on MessageType.DOCUMENT)
silently dropped them. Same bug class as Signal #12845. DOCUMENT wins
over PHOTO for mixed attachments since image handling keys off per-path
mime types while document injection gates strictly on message_type.
2026-06-12 01:07:50 -07:00
Teknium
1e29ab38c7 fix(gateway): classify Signal video attachments + catch-all DOCUMENT fallback
Widen the salvaged #12851 fix to match the established classification
pattern (WhatsApp/Slack/BlueBubbles/Mattermost): video/* -> VIDEO, and
any remaining MIME type falls through to DOCUMENT instead of TEXT, so
exotic types still trigger run.py's document-context injection.
2026-06-12 01:07:50 -07:00
Kyle Dunn
8e821cd2f5 test(gateway): verify Signal inbound text attachment sets MessageType.DOCUMENT 2026-06-12 01:07:50 -07:00
Kyle Dunn
ffef9da9b7 test(gateway): verify Signal inbound PDF attachment sets MessageType.DOCUMENT 2026-06-12 01:07:50 -07:00
Kyle Dunn
8207ae888d fix(gateway): add Signal message type classification for documents 2026-06-12 01:07:50 -07:00
teknium1
05470aa1b6 feat(messaging): expose action='unreact' in send_message + react dispatch tests
Follow-up for salvaged PR #44486: the adapter shipped remove_reaction but
the tool only exposed 'react'. Generalize _handle_react(remove=) and add
tool-level dispatch tests for react/unreact (missing from the original PR).
2026-06-12 01:07:38 -07:00
underthestars-zhy
b4e95a2efe fix(photon): add clarifying comments for Windows-safe os.kill usage 2026-06-12 01:07:38 -07:00
underthestars-zhy
23305cfeab fix(photon): normalize DM chat keys in last-inbound reaction tracker
Inbound events key the tracker by the DM chat GUID (any;-;+1555...),
but home-channel react calls address the same space by bare E.164 —
normalize both to the phone so add_reaction's last-inbound default
resolves regardless of which form the caller uses (mirrors the
sidecar's phoneTargetFromSpaceId).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-12 01:07:38 -07:00
underthestars-zhy
156f4fba92 feat(photon): add agent-facing emoji reaction support
Add `action='react'` to `send_message` tool and expose `add_reaction`/
`remove_reaction` on the Photon adapter.

- Track latest inbound message id per chat (`_last_inbound_by_chat`,
  bounded to 200 entries) so the agent can react without threading
  message ids through tool calls
- New `add_reaction`/`remove_reaction` public methods on PhotonAdapter;
  unlike the lifecycle tapbacks, these are not gated by PHOTON_REACTIONS
- `send_message` gains `action='react'` with `emoji` and optional
  `message_id` params; resolves target via existing channel-directory
  and home-channel logic; requires a live gateway adapter
2026-06-12 01:07:38 -07:00
underthestars-zhy
a23c0b378c fix(photon): use per-call httpx client in _sidecar_call
Prevents "Future attached to a different loop" errors when
_sidecar_call is invoked from a worker thread via _run_async in
send_message_tool. The persistent _http_client remains in use for
the inbound streaming loop, which always runs on the gateway's loop.
2026-06-12 01:07:38 -07:00
underthestars-zhy
9bfff6e16c chore(photon): bump spectrum-ts to 3.1.0 2026-06-12 01:07:38 -07:00
underthestars-zhy
a652131c42 fix(photon): stop gateway restarts from orphaning the sidecar on its port
A hard gateway exit (crash, SIGKILL, supervisor restart) left the
detached Node sidecar running with a token the next gateway run doesn't
know, so it could never be told to /shutdown. Every replacement spawn
then died on EADDRINUSE, failing each 30→300s reconnect attempt while
the orphan kept consuming the inbound gRPC stream.

Two layers:
- Lifetime binding: the adapter now holds the sidecar's stdin as a
  pipe, and the sidecar (PHOTON_SIDECAR_WATCH_STDIN=1) shuts down on
  stdin EOF — fired by the OS on any parent death, including SIGKILL.
- Startup reaping: before spawning, the adapter probes the port and
  terminates a stale listener, but only after verifying its command
  line is a Photon sidecar; a foreign listener raises a clear error
  instead of being signalled.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-12 01:07:38 -07:00
underthestars-zhy
573c4e6511 feat(photon): upgrade to spectrum-ts 3.0.0 (pinned) with markdown + reactions
Pin spectrum-ts to exactly 3.0.0 (was ^1.18.0 plus an `npm install
spectrum-ts@latest` on every setup) so breaking SDK majors can't take
down fresh installs silently; `hermes photon setup` now runs `npm ci`.
Upgrade procedure documented in the README.

Migrate resolveSpace to the v3 namespace API: `im.space.create(phone)`
for DMs and `im.space.get(id)` for everything else — group spaces are
now rehydratable from their persisted id after a sidecar restart, which
v1 could not do.

Markdown: replies go out via the v3 `markdown()` builder (iMessage
renders natively; other Spectrum platforms degrade to plain text).
`PHOTON_MARKDOWN=false` reverts to the stripped plain-text path.

Reactions, behind PHOTON_REACTIONS (default off): lifecycle tapbacks
(👀 while processing, 👍/👎 on completion) via new sidecar /react and
/unreact endpoints with per-target reaction-handle tracking, and user
tapbacks on bot-sent messages routed to the agent as synthetic
`reaction:added:<emoji>` events.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-12 01:07:38 -07:00
underthestars-zhy
0a963d8c9a feat(photon): add telemetry toggle via hermes photon telemetry 2026-06-12 01:07:38 -07:00
Teknium
c196269d8d fix(credits): suppress usage gauge when top-up funds exist + add display.credits_notices toggle (#44716)
The subscription-cap usage gauge (50/75/90% bands) ignored purchased
(top-up) credits: a sub user with top-up funds got a sticky warn banner
at 90% of their cap — permanently at >=100%, alongside grant_spent —
despite being fully able to keep inferencing. The cap is the wrong
denominator for an account that can keep spending.

- evaluate_credits_notices: purchased_micros > 0 suppresses the usage
  band (grant_spent already covers the cap-reached + top-up case with
  the remaining balance). A top-up landing mid-session clears any
  showing band; spending top-up down to 0 resumes the gauge.
- New display.credits_notices config (default true): false silences all
  credits notices. State capture and /usage are unaffected. Read once
  per agent (cached) in _emit_credits_notices, fail-open true.
- Docs: configuration.md display block.
2026-06-12 01:06:46 -07:00
alt-glitch
338b5275be opentui(v6): syntax highlighting for 10 more languages (vendored tree-sitter grammars)
@opentui/core@0.4.0 bundles only 5 grammars (ts/js/markdown/markdown_inline/
zig) and Hermes registered none of its own — Python/Rust/Go/bash/JSON/C/HTML/
CSS/YAML/TOML tool bodies and fences rendered plain text (never a regression:
no addDefaultParsers existed anywhere in branch history).

Now: parsers/manifest.json curates the 10 grammars (cpp deliberately dropped —
3.28MB alone); scripts/update-parsers.mjs vendors wasm+highlights.scm with
magic/content validation (plain Node fetch — core's update-assets generator is
Bun-flavored and its import-module won't bundle under esbuild, so registration
skips it and points at the vendored files by runtime-resolved path instead);
boundary/parsers.ts registers via the public addDefaultParsers() at entry
module load, before the first <code>/<markdown> mount initializes the global
tree-sitter client. ~4MB vendored, committed (build inputs, offline-safe).

Markdown fence injections need no infoStringMap: fence labels resolve as
filetype ids and core's ext maps already normalize py→python, zsh→bash, h→c.
Live-smoked in a real renderer: python tool body draws 6 distinct token
colors; ```python and ```yaml fences inside markdown highlight too. 6 new
tests pin the wiring (vendored assets valid, registration set, filetype
routing); visuals stay live-smoke territory per codeBlock.tsx.
2026-06-12 13:26:17 +05:30
ethernet
906bee9cf7 fix(nix): natively compile and correctly stage node-pty for desktop app
- Add ELECTRON_SKIP_BINARY_DOWNLOAD=1 to nix/lib.nix to prevent offline download failures.
- Manually trigger native compilation of node-pty via npm rebuild --build-from-source in buildPhase.
- Run stage-native-deps.cjs to copy the natively compiled binary into build/native-deps.
- Flatten native-deps and install-stamp.json to the root of the output derivation in installPhase, matching electron-builder's extraResources behavior so main.cjs can find it at process.resourcesPath + '/native-deps/node-pty'.
- Add doCheck=true and a strict checkPhase to fail fast if the staged native binary is missing.
2026-06-12 03:55:09 -04:00
kshitij
046f444ddc Merge pull request #44738 from kshitijk4poor/salvage/memory-sync-multimodal-content
fix(memory): flatten multimodal content before provider sync
2026-06-12 00:40:31 -07:00
alt-glitch
ef94562125 opentui(v6): terminal window title (OSC 0/2) + waiting-on-you notifications (OSC 9/99/777)
Window title: a render-nothing <TerminalChrome> tracks session.info — the
native renderer.setTerminalTitle (frame-safe, zig-side OSC emit) shows
'{session title} — Hermes' once the session is titled, 'Hermes Agent' until
then. The user's previous title is bracketed with the XTWINOPS title stack
(save on boot, best-effort restore on quit). Gateway: _session_info now
carries the live title (DB row, pending_title fallback) and a session.info
refresh follows every title change — pending-title application, the
auto-title worker landing (via maybe_auto_title's title_callback), and
session.title renames — so the window retitles without waiting for the
next turn.

Notifications: when the TUI starts waiting on the user — any blocking
prompt (clarify/approval/sudo/secret/confirm) or turn completion — three
dialect sequences go out through renderer.writeOut: OSC 9 (iTerm2/wezterm),
OSC 99 (kitty), OSC 777 (urxvt/foot); terminals ignore what they don't
speak. Suppressed while the terminal reports focused (core's mode-1004
focus/blur events; until a first blur proves reporting works, notify
unconditionally). HERMES_TUI_NOTIFY=0/false/off kills notifications; the
title is not gated. All text is OSC-sanitized (control chars stripped,
777's semicolon fields spliced-proof, length-capped).

13 new TUI tests (pure shaping/sequences/env gate + store-edge wiring via
an injected seam) and 2 gateway tests (title resolution order, thread-safe
refresh emitter). Live-smoked: tmux pane_title shows 'Hermes Agent' from
the native title path.
2026-06-12 13:09:23 +05:30
alt-glitch
3616b813ec opentui(v6): fleet memory self-sampling — HERMES_TUI_MEMLOG + memwatch-report aggregator
Instead of an external watcher chasing 5-10 concurrent session pids,
every TUI samples ITSELF at 1Hz (rss/heap/external + windowing
mounted/peak-mounted counters) into ~/.hermes/logs/memwatch/, gated by
HERMES_TUI_MEMLOG (defaults to the HERMES_TUI_DIAGNOSTICS master
switch) — one shell-rc export covers every session a dev ever starts.
Unref'd timer, every failure path silently disables, 14-day retention.
bench/memwatch-report.mjs aggregates the fleet: per-session
baseline/peak/last, steady-state MB/h slope, peak mounted rows, and
SLOPE/PEAK/MOUNTED anomaly flags. Verified live: two fake-gateway
smoke sessions logged and aggregated (102MB base, mounted ≤60).
2026-06-12 19:24:00 +05:30
kshitijk4poor
15439bee47 refactor(memory): reuse _summarize_user_message_for_log instead of forking it
The original fix added agent/memory_manager.py:flatten_message_content, but
that helper was a near-exact duplicate of
agent/codex_responses_adapter.py:_summarize_user_message_for_log — same
None/str/list dispatch, same {text,input_text,output_text}/{image_url,input_image}
part sets, the identical [N image(s)] marker, and the same str() fallback. The
only difference was the join separator (newline for memory vs space for the
log/trajectory previews the existing helper already serves), and that helper is
already imported into agent/turn_finalizer.py — the same file whose call site the
memory fix touches.

Parameterize the existing helper with sep=' ' (default preserves every current
logging/trajectory caller byte-for-byte) and call it with sep='\n' at the memory
boundary; drop the forked flatten_message_content. Repoints the unit tests to the
consolidated helper and adds a case locking the default space-join.

Single source of truth for multimodal-content flattening; no behavior change for
the fix or for existing callers.
2026-06-12 12:49:18 +05:30
Erosika
87893fe4cb fix(memory): flatten multimodal content before provider sync
Multimodal turns carry message content as a list of typed parts
({type: "text"|"image_url", ...}). _sync_external_memory_for_turn
passed that list straight into MemoryManager.sync_all, and providers
feed it to regexes — Honcho's sync_turn calls sanitize_context, where
re.sub raised 'expected string or bytes-like object, got list'. Every
turn with an attached image silently never synced.

Flatten to plain text at the boundary: text parts joined, images noted
as an [N image(s)] marker so the attachment isn't erased from recall.
Fixing here covers all providers instead of patching each plugin.

(cherry picked from commit 705bdb6ffe)
2026-06-12 12:46:28 +05:30
brooklyn!
d810f2b262 Merge pull request #44676 from NousResearch/bb/fix-schema-ref-default
fix(tools): strip default from $ref nodes in tool schemas
2026-06-12 01:21:14 -05:00
teknium1
b3f5e17bb9 fix(tui): wrap long approval commands in the Ink overlay
Sibling site of the CLI approval-panel fix: the TUI ApprovalPrompt
rendered each command line with wrap="truncate-end", so a long
single-line command lost its tail at terminal width. Wrap to the
panel width via wrapAnsi before applying the 10-line preview cap.
2026-06-11 23:05:08 -07:00
墨綠BG
81cdbbddc8 🐛 fix(cli): wrap approval preview hints 2026-06-11 23:05:08 -07:00
墨綠BG
d6df38bb6b 🐛 fix(cli): wrap long approval commands in prompt 2026-06-11 23:05:08 -07:00
Teknium
c7bee8f961 refactor(agent): drop unused tail_start param from _derive_auto_focus_topic
The parameter was reserved-but-unused (del'd immediately); YAGNI. Test
call site updated.
2026-06-11 23:03:52 -07:00
konsisumer
434c684bfa fix(agent): focus automatic compression on recent user turns 2026-06-11 23:03:52 -07:00
Teknium
db7714d5f1 Merge pull request #44331 from NousResearch/hermes/hermes-6b48295e
feat(whatsapp): WhatsApp Business Cloud API adapter (salvage #43921)
2026-06-11 22:48:06 -07:00
Kyssta
343803b23c fix(cli): use subprocess on Windows for dashboard profile re-exec (#44282) (#44446)
Co-authored-by: kyssta-exe <kyssta-exe@users.noreply.github.com>
2026-06-11 22:41:39 -07:00
Kyssta
a942bfd9cc fix(gateway): reset _last_flushed_db_idx when reusing cached agent (#44327) (#44518)
Co-authored-by: kyssta-exe <kyssta-exe@users.noreply.github.com>
2026-06-11 22:41:34 -07:00
kshitij
a35b370284 Merge pull request #44674 from kshitijk4poor/fix/slack-reactions-plugin-registry-bookkeeping
fix(plugins,slack): registry bookkeeping fixes + ack reaction events (salvage #42561)
2026-06-11 22:32:59 -07:00
Brooklyn Nicholson
b2d151abe2 fix(tools): strip default from $ref nodes in tool schemas
Fireworks-hosted Kimi rejects tool requests when nullable MCP/Pydantic
schemas collapse to {"$ref": "...", "default": null}. Strip that sibling
during global schema sanitization so gateway and CLI calls succeed again.
2026-06-12 00:30:51 -05:00
kshitijk4poor
44bd478039 fix(plugins): credit shared hook/middleware/tool names to every plugin
list_plugins() attribution diffed registry names against all already-loaded
plugins, so when a plugin registered a hook / middleware / tool name an
earlier plugin had already used, the shared name was credited to the first
plugin only and later plugins under-reported (0 hooks) in hermes plugins
list. commands_registered right beside it already attributed correctly by
plugin ownership.

Snapshot per-registry counts before register() and attribute the entries
this plugin's register() actually added (per-registration delta). Add a
regression test: two plugins registering the same hook name are each
credited with 1 hook.
2026-06-12 10:57:25 +05:30
kshitijk4poor
889a13696b fix(plugins): clear _plugin_platform_names on force-rediscover
discover_and_load(force=True) cleared every per-plugin registry except
_plugin_platform_names, which register_platform() populates. A platform
plugin disabled between force-rediscovers left a stale name behind, so the
set diverged from the real platform_registry / _plugins state and never
shrank across repeated force passes.

Add the missing clear() and a regression test that seeds every per-plugin
registry, forces a rediscover, and asserts they all empty (so a future
registry addition can't silently leak across a force pass either).
2026-06-12 10:55:44 +05:30
Veritas-7
82d570165e fix(slack): ack reaction lifecycle events
Register no-op Slack event handlers for inbound reaction_added and reaction_removed events so Slack Bolt does not log unhandled-request warnings for events Hermes does not consume.
2026-06-12 10:54:07 +05:30
kshitij
c574170050 Merge pull request #44664 from kshitijk4poor/salvage/slack-plugin-action-handlers
feat(plugins): expose register_slack_action_handler API (salvage #20589)
2026-06-11 22:14:44 -07:00
kshitijk4poor
e4c168b1f4 chore: map bcsmith528 contributor email for attribution 2026-06-12 10:39:05 +05:30
Brad Smith
08e8bedae8 fix(gateway): keep plugin action wrapper signature to (ack, body, action)
The previous implementation captured loop vars via default arguments::

    async def _wrapped(ack, body, action, _cb=_cb, _plugin_name=_plugin_name):

slack_bolt's ``kwargs_injection`` introspects each listener's signature
via ``inspect.signature`` and passes ``None`` for any parameter name it
doesn't recognise (see ``slack_bolt/kwargs_injection/async_utils.py``
``build_async_required_kwargs``). That clobbered ``_cb`` to ``None`` at
dispatch time, so the wrapped plugin handler became ``NoneType`` —
``await _cb(...)`` then raised ``'NoneType' object is not callable`` and
no plugin action handler ever fired.

Replace the default-arg trick with a small closure factory so the
wrapper's public signature is exactly ``(ack, body, action)``. Add a
regression test that introspects the wrapped function's signature.

Found via real Slack click on a Block Kit button registered through
``ctx.register_slack_action_handler`` — gateway log showed
``[Slack] Plugin 'None' action handler raised: 'NoneType' object is
not callable`` despite the registration log line confirming the
handler was wired.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-12 10:36:14 +05:30
Brad Smith
62e937bf2b feat(plugins): expose register_slack_action_handler API
Plugins that post Block Kit messages with interactive elements (buttons,
overflow menus, datepickers, etc.) had no documented way to receive the
resulting click events. The plugin API exposed register_tool, register_hook,
register_command, register_platform, and register_context_engine, but
nothing for slack_bolt action handlers. The only workaround was to
monkey-patch SlackAdapter.connect from inside register(), which is
fragile and breaks on every Hermes update.

This change adds:

* PluginContext.register_slack_action_handler(action_id, callback) —
  validates inputs and queues the handler on the PluginManager.
  action_id accepts whatever slack_bolt.App.action() accepts (literal
  string, compiled re.Pattern, or constraint dict).
* PluginManager.get_slack_action_handlers() — accessor used by the
  Slack adapter at connect time.
* SlackAdapter.connect — after wiring its built-in approval and
  slash-confirm buttons, iterates the plugin-registered handlers
  and registers each via self._app.action(matcher)(callback). Each
  callback is wrapped defensively so a misbehaving plugin cannot
  crash slack_bolt's dispatch loop, with a best-effort ack on
  exception so Slack stops retrying the click.
* Defensive fallback when the plugin layer is unhealthy: a
  RuntimeError from get_plugin_manager() is logged and swallowed
  rather than blocking the gateway from starting.
* Test coverage in tests/gateway/test_slack_plugin_action_handlers.py
  for input validation, multi-plugin registration, the connect-time
  wiring, defensive exception handling, and the plugin-loader-
  failure fallback path.
* Documentation in website/docs/guides/build-a-hermes-plugin.md
  describing the new API alongside the existing register_command /
  dispatch_tool documentation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-12 10:36:14 +05:30
alt-glitch
e1067dbbe5 tests: pin ink engine in _make_tui_argv npm-bootstrap tests (post-merge semantic fix)
Main's rewritten test_tui_npm_install.py tests call _make_tui_argv expecting
the Ink/npm flow unconditionally; with the dual-engine dispatch merged in,
_resolve_tui_engine() auto-selects opentui whenever ui-opentui/dist is built
in the repo, routing the call away from the path under test (first subprocess
became 'node --version' instead of 'npm run build'). Pin the engine to ink
via an autouse fixture, mirroring the existing pinning precedent in
test_tui_resume_flow.py.
2026-06-12 10:32:40 +05:30
brooklyn!
24f74eb888 fix(desktop): make file-preview source + markdown selectable (#44648)
body sets user-select:none for native feel and opts text back in only via
[data-selectable-text='true']; the preview's source and rendered-markdown
panes never set it, so code couldn't be selected or copied. Tag the Shiki
code column and the markdown root. The attribute stays off the SourceView
grid root so the gutter keeps its select-none and line numbers don't bleed
into copied text.
2026-06-12 04:15:06 +00:00
brooklyn!
6e41ca956b fix(desktop): bundle JetBrains Mono for the terminal pane (#44642)
The terminal listed JetBrains Mono only as a late fallback and shipped no
webfont, so on machines without SF Mono/Menlo xterm measured the grid on the
regular system face while styled SGR spans fell back to a font with different
advances — glyphs squeezed and overlapped.

Bundle the regular/bold/italic woff2 (Apache-2.0, the faces the dashboard
already ships), put the family first in the xterm stack, pin the weights, and
warm every face before mount (fonts.ready only settles already-requested
faces; bold/italic aren't asked for until styled output paints, past atlas
init). Vite emits them as hashed assets under dist/** with base './', so the
fonts ship in the asar and every install path inherits them.
2026-06-12 04:11:51 +00:00
alt-glitch
ab37440ce6 opentui(v6): diagnostics-gate review nits — completion-mechanism precision + client-only design note 2026-06-12 09:28:15 +05:30
alt-glitch
cf3002664b opentui(v6): HERMES_TUI_DIAGNOSTICS master switch — gate /mem, /heapdump + window-stats default
Regular users get zero diagnostic surface by default: /mem and /heapdump
disappear from /help and completion, and invoking them prints the
one-line enable hint (relaunch with HERMES_TUI_DIAGNOSTICS=1) instead of
executing — an enable switch, not a secret. With the switch on, the
commands work as before and HERMES_TUI_WINDOW_STATS defaults on (still
individually settable either way). Full env-flag ledger (master switch /
user config / dev tuning / internal plumbing) in docs/opentui-env-flags.md.
672 tests exit 0.
2026-06-12 09:17:25 +05:30
brooklyn!
6db65e687c Merge pull request #44627 from NousResearch/bb/desktop-tool-row-copy-affordance
fix(desktop): move tool-row copy control into expanded body
2026-06-11 22:32:52 -05:00
Brooklyn Nicholson
09bcf5a937 fix(desktop): move tool-row copy control into expanded body
The per-row copy control lived in the header's trailing slot as a 24px
button that depended on a `group-hover/tool-row` group that exists nowhere
in the tree. It therefore stayed `opacity-0` yet remained clickable — an
invisible hit-target straddling the disclosure caret and duration, making
the caret hard to click without firing a copy.

Move copy into the expanded body's top-right (matching the code-block
convention) where it can't fight the caret for the right edge, and make it
actually visible (subtle at rest, full on hover/focus). The header right
edge now belongs solely to the duration label + caret.

Tradeoff: copy is only reachable once a row is expanded; rows with no
expandable body no longer surface a copy control.
2026-06-11 22:27:39 -05:00
alt-glitch
fc8d5f203a opentui(v6): post-rebase fixups — dedup probe mouse, .demo lint ignore, cap tests to windowing-aware contract
Rebasing onto fcf49f313 (multi-click selection) collided in the test
probe (both sides added 'mouse' — deduped) and surfaced two test debts
the cap-restore commit (3cc56517a) had shipped masked by a piped exit
code: store cap tests still asserted the 1000 default (now: 3000
windowed, 1000 with HERMES_TUI_WINDOWING=0, both covered), and the
burst-interplay test relied on the old cap trimming a 1500-row burst
(now pins HERMES_TUI_MAX_MESSAGES=1000 explicitly for both stores).
Also: .demo/ build artifacts excluded from typed linting. 669 tests
exit 0 (verified unpiped). Multi-click selections flow through the
same renderer selection seam, so windowing's drag-freeze + row-pinning
covers them with no changes.
2026-06-12 08:40:53 +05:30
alt-glitch
c5806b9ad9 docs: upstream alignment playbook — forkless invariant, shim ledger, upgrade contract
Maintainer signals (native yoga next release, 2x layout, opencode's
100-cap was a legacy perf workaround): what changes for us (WASM ratchet
dies), what doesn't (the 65k handle table still makes windowing
load-bearing at 3000 rows), the boundary/ shim ledger with
delete-on-upstream-fix criteria, and the per-release upgrade playbook
that uses the bench suite as the acceptance contract.
2026-06-12 08:29:24 +05:30
alt-glitch
b145607029 docs: the OpenTUI memory story — ELI5 walkthrough of the 686MB→300MB campaign
Shareable explainer: every primitive at play (native handle table, Yoga
WASM grow-only memory, renderables, Solid surgical unmount, V8 GC
laziness, scrollbox draw-only culling) and every decision (windowing vs
store-cull, exact-heights-at-unmount vs Ink's estimate-correct, the
correctionIsLegal zero-jank law, append-time adjudication, never-window
rules, windowing-aware cap restore, heap right-sizing), with the
measured scoreboard and honest open items.
2026-06-12 08:29:24 +05:30
alt-glitch
4a3b755162 opentui(v6): restore scrollback cap 1000 → 3000 under windowing (#27 payoff)
With transcript windowing (S1+S2) the mounted set no longer scales with
the store (peak 31 rows over a 1500-row burst), so the handle-table
clamp that forced 1000 rows is unnecessary when windowing is on. The
ceiling is now windowing-aware: 3000 rows (the originally-shipped
default, regression documented in opentui-fixes-audit.md §2) with
windowing, 1000 with HERMES_TUI_WINDOWING=0 (every row mounts again).

Measured at the restored cap (full 3000-msg store): mem3000 360MB peak
styled end-to-end (pre-campaign: ~870MB + unstyled past ~1,400 rows;
before that: crash). scroll3000 p50=2 p90=3 p99=8 max=17ms (Ink same
workload: p90=35 p99=96). Gate digest unchanged.
2026-06-12 08:29:24 +05:30
alt-glitch
16dbcbe85d opentui: Node 26 onboarding — scoped .node-version, engines floor, README setup guide
Pins Node 26.3 to ui-opentui/ only (fnm/mise auto-switch on cd; leaving
the directory restores whatever the dev had — no global default change).
engines.node >= 26.3 makes a wrong-Node npm ci warn. README covers
install paths (fnm/mise/nvm/absolute-binary), the ABI-locked
node_modules gotcha, and build/run commands.
2026-06-12 08:29:24 +05:30
alt-glitch
c04aaecb51 opentui(v6): windowing S2 — pin selected rows instead of freezing on a lingering highlight
Adversarial-review follow-up to the S2 slice. The S1 rule froze ALL window
recomputes while renderer.getSelection()?.isActive — but a finished mouse
selection persists by design (boundary/renderer.ts keeps the highlight so
Ctrl+C can re-copy), so a long streaming turn behind a lingering highlight
ballooned the mounted set exactly like pre-windowing (and permanently
ratcheted the Yoga-WASM high-water).

Refinement:
- full freeze only while selection.isDragging (the native walk touches the
  live tree on every drag update — destroying a row mid-walk corrupts the
  highlight; unchanged from S1 where it matters),
- a finished highlight instead PINS the rows containing
  selection.selectedRenderables (parent-climb to the row wrapper via a
  WeakMap) as neverWindow — the highlight and a later Ctrl+C copy stay
  byte-exact while everything else keeps windowing,
- an active highlight counts as activity (no idle measure churn under it).

Test (headless mock-mouse drag): finished selection persists (isActive,
!isDragging) → 300-row burst keeps peakMounted < 120 AND
getSelectedText() returns the identical text afterward, the selected rows
having been pinned while long scrolled past the margin.

Verified on this build: gate digest otui-capped d5e9558583159eac… (2/2),
mem2000 otui-capped windowing-ON vmhwm 312MB (target ≤ 350), scroll2000
otui-capped p50 2.0ms / p99 6.0ms (gate ≤ 17ms). check exit 0 (648 tests).

Review verdicts on the remaining findings (verified against core source):
- "scrollTop compensation race": rejected — scrollTop is an imperative
  scrollbar property (no signal staleness); records fire in document order,
  each compensation immediately visible to the next.
- "heights map leak on /new": rejected — the countChanged cleanup prunes
  every per-key map against the live key set (test-verified).
- remount-in-viewport estimate shift: only reachable when one frame jumps
  past the margin (> 1 viewport); the design's accepted "remounted for
  view" path — documented in the header.
- expanded tool/reasoning re-collapse on far-remount: S1-accepted,
  deferred (component-local state; out of S2 file ownership).
2026-06-12 08:29:24 +05:30
alt-glitch
eaa069e322 opentui(v6): transcript windowing S2 — append-time adjudication + windowed resume + edge measure
S2 of docs/plans/opentui-transcript-windowing.md (#27), behind
HERMES_TUI_WINDOWING (OFF path renders the byte-identical legacy tree).

Append-time adjudication: the window now recomputes on transcript GROWTH,
not just scroll — a createComputed on messages.length re-windows
synchronously per append, and while pinned at the bottom computeWindow
anchors to the cumulative content BOTTOM (pinnedBottom) instead of the
stale pre-layout scrollTop, so burst-appended rows are spacer-swapped the
moment they pass the margin. The frame driver additionally treats a
≥ ¼-viewport scrollHeight change (streaming growth) like scroll movement.
Unseen-row default changed from "always mounted" to "mounted iff created
streaming or within the bottom-30" — live rows still paint instantly with
zero added latency; a bulk commitSnapshot (resume) mounts ONLY the bottom
window and everything above starts as line-count-estimate spacers (chip-
and-spacing-aware estimateMessageHeight).

Spacer corrections (zero-jank rule): when a measure lands a height
different from what the spacer occupied, the wrapper's onSizeChange fires
inside the layout traversal, pre-paint. Pinned at bottom the scrollbox's
own sticky re-pin (content onSizeChange runs before the row wrappers')
already compensated — verified by test; otherwise scrollTop is compensated
same-frame for rows fully above the viewport (correctionIsLegal). Frames
stay byte-stable across corrections in both pinned and mid-history tests.

Lazy exact-measure (design §4 — the simple choice, documented): no true
offscreen layout exists in @opentui/core, so an idle pulse (no appends,
no scroll, no turn, no selection for HERMES_TUI_WINDOW_IDLE_MS≈1s) mounts
MEASURE_BATCH_ROWS=10 never-measured rows nearest the bottom window edge
(edgeMeasureBatch), records exact heights (incl. a direct post-layout pull
for rows whose mount changed nothing — no onSizeChange fires), and the
next recompute swaps them back to now-exact spacers. Scrolling itself
still measures the margin band.

DEV counter: windowRowStats (current/peak simultaneously-mounted rows),
exposed on globalThis behind HERMES_TUI_WINDOW_STATS; tests assert it.

Measured (this build, 39f9f433e+S2):
- check: exit 0 (647 tests / 39 files; +11 pure window cases, +4 headless)
- peak mounted: 31 rows over a 1500-row burst; 30 rows on a 600-row
  resume snapshot (bound asserted < 120)
- gate digest: otui-capped d5e9558583159eac… — byte-identical, 2/2 reps
- mem2000 (otui-capped, windowing ON, 8GB heap): vmhwm 300MB
  (S1 same-heap 518MB; S1 right-sized-heap 427MB; Ink 229-239MB;
  target ≤ 350MB)
- scroll2000 otui-capped: p50 2.0ms / p99 5.0ms / max 18ms
  (gate ≤ 17ms p99; S1 baseline p99 15ms)

Known S2 limits (deferred to S3, design §5): /compact·/details toggles and
width resizes leave out-of-window spacer heights stale until remount or
the idle march; expanded-body state above the window may re-collapse on
remount (S1-accepted).
2026-06-12 08:29:24 +05:30
alt-glitch
2d7616121b opentui(v6): transcript windowing S1 — exact-height spacers behind HERMES_TUI_WINDOWING
Core machinery of docs/plans/opentui-transcript-windowing.md (#27): rows
outside [scrollTop − viewport, scrollTop + 2·viewport) swap to an exact-height
empty <box> (1 yoga node, no text buffers / native handles), so the mounted
set stays ~3 viewports regardless of transcript length.

Flag: HERMES_TUI_WINDOWING — unset → ON; 0/false/no/off → OFF (envFlag
semantics, the bench A/B + one-env escape hatch). OFF renders the exact
legacy tree (no wrapper boxes).

Pieces:
- logic/window.ts (pure, table-tested): computeWindow (viewport ± 1-viewport
  margin intersection over cumulative exact heights; null heights fall back
  to a per-row line-count estimate), hysteresisFor/shouldRecompute (≥ ¼
  viewport between recomputes), correctionIsLegal (the jank rule: corrections
  only fully above the viewport with same-frame scrollTop compensation, or
  fully below it), estimateMessageHeight (line-count estimate; wrong values
  are fixed by remount only — S1 never corrects a spacer in place).
- view/transcript.tsx: per-row measuring wrapper records exact heights via
  onSizeChange (only while the real row is mounted); window driver is a
  renderer frame callback (setFrameCallback — scroll always renders, so no
  extra timer) publishing the mounted set through one signal + createSelector
  so only flipped rows re-render. Stable row keys via WeakMap<Message, n>
  (messages have no id; store proxies are reference-stable). Solid <Show>
  unmount destroys the row's renderables (@opentui/solid _removeNode →
  destroyRecursively).

Never-window rules:
- streaming rows (remount would restart native markdown streaming),
- the last row while a turn is running (deltas land there),
- the bottom 30 rows (fixed K — sticky-bottom region; rows under
  viewport+margin are mounted by the window calc anyway),
- rows the window has never adjudicated default to MOUNTED (new live rows
  paint instantly),
- the whole window FREEZES while a mouse selection is active
  (renderer.getSelection()?.isActive — a swap would destroy highlighted
  renderables under the native selection walk).

Tests: 30 pure window.test.ts cases + 2 headless integration cases
(transcriptWindow.test.tsx) pinning the zero-jank invariant (scrollHeight
identical ON vs OFF), the renderable shedding, and remount-on-scroll-back.
2026-06-12 08:29:24 +05:30
brooklyn!
4d67ac6172 Merge pull request #44596 from NousResearch/bb/desktop-rtl-bidi
feat(desktop): auto-detect RTL/bidi text direction in chat
2026-06-11 21:44:13 -05:00
Brooklyn Nicholson
6c00077d38 feat(desktop): auto-detect RTL/bidi text direction in chat
Arabic/Hebrew/Persian/Urdu chat text rendered left-to-right and
left-aligned, and mixed RTL/English technical messages (the common case)
read backwards. Resolve each chat block's base direction from its own
first strong character (UAX#9) with pure CSS, scoped to the chat
surfaces only:

- `unicode-bidi: plaintext` + `text-align: start` on assistant prose
  blocks (p, h1-h6, li, blockquote), the user bubble's text lines, and
  both composers (main + edit share the composer-rich-input slot). RTL
  blocks read and right-align RTL; English stays LTR; mixed
  conversations resolve per block. `text-align: start` is required
  because the user bubble hardcodes `text-left`.
- Inline `code` and KaTeX are pinned `direction: ltr; unicode-bidi:
  isolate`, so the bidi first-strong heuristic skips them: a sentence
  that *starts* with a command (`./run.sh ...`) followed by Arabic
  still resolves RTL, and the command's own neutrals keep their order.
- Fenced code surfaces (code-card, user fences) are pinned LTR so they
  never mirror or right-align inside an RTL list item or blockquote.

`direction` is never forced, so app chrome, layout, and list indent
stay LTR per the issue's request not to flip the whole UI. English-only
content is byte-for-byte unchanged.

Salvaged and unified from #44065 and #44169; verified in Chromium that
isolate removes inline code from the paragraph direction vote (the
code-first case), making the JS dir-resolution in #44065 unnecessary.

Fixes #44150

Co-authored-by: Adolanium <Adolanium@users.noreply.github.com>
Co-authored-by: Adalsteinn Helgason <AIalliAI@users.noreply.github.com>
2026-06-11 21:06:26 -05:00
brooklyn!
9e484f052a Merge pull request #44559 from NousResearch/bb/persistent-terminal-env
fix(terminal): advertise persistent env state
2026-06-11 20:07:11 -05:00
Brooklyn Nicholson
ab06ef8ed6 fix(coding): teach agents terminal env state persists
Tell coding agents to activate shell setup once per session instead of re-sourcing it before every command, and pin the existing LocalEnvironment env-snapshot behavior with regression tests.
2026-06-11 19:50:08 -05:00
brooklyn!
afe53708ee Merge pull request #44545 from NousResearch/hermes-worktree-code
fix(coding): don't expose primary worktree path in coding context
2026-06-11 19:35:18 -05:00
Teknium
5affecb443 fix(mcp): capability-gate tools/list so prompt-only MCP servers can connect (#44550)
Port from anomalyco/opencode#31271: only call tools/list when the server
advertises the 'tools' capability in InitializeResult.capabilities.

Previously, _discover_tools() unconditionally called session.list_tools()
right after initialize. Prompt-only / resource-only servers (which omit
the tools capability per the MCP spec) raise McpError(-32601 Method not
found), which aborted the connection — burning all 3 initial-connect
retries and permanently failing the server even though its prompts and
resources were perfectly usable. The 180s keepalive had the same problem:
it probed with list_tools(), so even a successfully connected prompt-only
server would be torn down on the first keepalive cycle.

Changes:
- MCPServerTask._advertises_tools(): capability check with a legacy
  fallback (no captured InitializeResult -> behave as before)
- _discover_tools(): skip tools/list for non-tool servers
- keepalive: use the universal ping request for non-tool servers
- _refresh_tools(): guard against tools/list_changed from non-tool servers

E2E verified with a real stdio prompt-only FastMCP-style server: on main
it fails all 3 connection attempts with Method-not-found; with this fix
it connects, lists prompts, answers ping keepalives, and shuts down
cleanly.
2026-06-11 17:34:49 -07:00
ethernet
96cc7ee1e3 fix(coding): don't provide worktree root in context
this makes the agent frequently edit files in the wrong worktree.
what the agent doesn't know can't hurt it.
2026-06-11 20:27:06 -04:00
brooklyn!
880107ab24 Merge pull request #44529 from NousResearch/bb/desktop-profile-fallout
fix(desktop): close out the multi-profile desktop fallout — WS auth + cross-profile session reads
2026-06-11 19:06:00 -05:00
brooklyn!
4ddb03390a fix(desktop): collect + persist API key for custom OpenAI endpoints (#43896)
The desktop "Local / custom endpoint" onboarding never collected an API
key and /api/model/set silently dropped one, so an auth-gated endpoint
(e.g. a hosted vLLM behind a key) could never enumerate models — and
Settings' "Set up custom endpoint" routed `custom` into a non-existent
OAuth flow, booting the user back to the first screen (the reported loop).

Backend (web_server.py):
- /api/providers/validate accepts an optional api_key and sends it as a
  Bearer header when probing a custom endpoint's /v1/models.
- /api/model/set accepts api_key, persists it to model.api_key (same
  switch/preserve lifecycle as base_url), and registers a named
  custom_providers entry via _save_custom_provider — matching the
  `hermes model` CLI flow so the endpoint shows up as a ready picker row.

Desktop:
- ApiKeyForm shows an optional API key field for the local/custom option;
  the key is threaded through saveOnboardingLocalEndpoint → validate +
  setModelAssignment.
- New onboarding `localEndpoint` intent + startManualLocalEndpoint(); the
  Settings "Set up custom endpoint" button now opens the local-endpoint
  form (URL + key) instead of the OAuth dead-end.
- Added localApiKeyPlaceholder i18n key (en + types + zh).

Tests: api_key lifecycle on _apply_main_model_assignment, key persistence
+ custom_providers registration on /api/model/set, Bearer-header probe;
onboarding store forwards + persists the key.
2026-06-12 00:03:55 +00:00
brooklyn!
c6007e5c1a Merge pull request #44534 from NousResearch/bb/approval-allow-permanent
fix(approval): carry allow_permanent to TUI + desktop approval prompts
2026-06-11 18:49:58 -05:00
Austin Pickett
e2145a5c9c fix(ui-tui): stabilize embedded dashboard chat gateway (#44528)
Cherry-picked from #39840 by @flyinhigh and rebased cleanly on main.

- Defer config fetch in createGatewayEventHandler until gateway.ready to
  avoid render-phase RPC that can mutate transcript state and trigger
  React error 301 in embedded dashboard PTYs.
- Use undici WebSocket fallback when globalThis.WebSocket is unavailable
  (Node attach mode and sidecar mirror sockets).
- Add regression tests for both fixes.

Co-authored-by: flyinhigh <flyinhigh@users.noreply.github.com>
2026-06-11 19:47:53 -04:00
Brooklyn Nicholson
55a18e6860 chore(approval): tighten allow_permanent comments + DRY the no-always opt set
Collapse the verbose multi-line rationale comments across the TUI/desktop/
backend approval surfaces into single-line "why" notes, and derive
APPROVAL_OPTS_NO_ALWAYS from APPROVAL_OPTS instead of re-listing it.
No behavior change.
2026-06-11 18:42:59 -05:00
Brooklyn Nicholson
b097d7b033 refactor(desktop): use native fetch in dashboard-token
Node >=18 / Electron 40 ship fetch; the hand-rolled http/https.request
plumbing buys nothing. AbortSignal.timeout replaces the socket timeout,
protocol guard and >=400 rejection semantics preserved. 13/13 unit
tests and the live web_server.py repro both green over the new
transport.
2026-06-11 18:41:16 -05:00
Brooklyn Nicholson
cc726aad68 refactor(desktop): fold served-token adoption + foreign-backend refusal into one helper
Both spawn paths (startHermes, spawnPoolBackend) duplicated the same
resolve -> log-fallback -> foreign-check -> throw dance. Collapse it into
adoptServedDashboardToken(baseUrl, spawnToken, {childAlive, label}) in
dashboard-token.cjs; childAlive is a thunk so liveness is sampled after
the fetch. Drop the redundant backendPool.delete in the pool's throw
path (the child exit/error handlers already own pool eviction).

Validated end-to-end against a real web_server.py backend, not just
units: token-injection regex vs the actual served index.html, foreign
refusal (dead child + live squatter), benign drift adoption, and the
401-vs-200 token auth split on /api/sessions.
2026-06-11 18:33:05 -05:00
Brooklyn Nicholson
81436e143e fix(approval): carry allow_permanent to TUI + desktop approval prompts
When a tirith content-security warning is present the approval backend
forces allow_permanent=False and silently downgrades an "always" choice to
session scope (the persistence loop in check_all_command_guards only honors
"always" → permanent when no tirith finding exists). But the gateway notify
payload that drives the TUI and the Electron desktop app never carried that
flag, so both surfaces always rendered "Always allow" — offering a permanent
allow the backend would quietly refuse to persist.

Plumb allow_permanent end-to-end:
- tools/approval.py: include `allow_permanent: not has_tirith` in the gateway
  approval_data the notify callback emits as `approval.request`.
- ui-tui: thread `allowPermanent` through the event handler, gateway types,
  and ApprovalReq; ApprovalPrompt drops the "always" option (and renumbers the
  quick-pick keys) when it's false.
- apps/desktop: thread `allow_permanent` through the gateway payload type, the
  per-session approval store, and the inline ApprovalBar, which now hides the
  "Always allow…" dropdown item when permanent allow is disallowed — reusing
  the existing DropdownMenu / confirm-Dialog UI.

The desktop/TUI render path for approvals already landed in #38578 (the root
cause of approvals not surfacing in the GUI); this completes the salvage of
#37856 by carrying allow_permanent across both surfaces. #37856's original
thread-local _block() approach is dropped: desktop/TUI approvals resolve via
approval.respond → resolve_gateway_approval (the per-session queue), not the
_block()/request_id correlation, so a worker-thread callback waiting on _block
would never be released by the real UI.

Tests: gateway notify payload carries allow_permanent (True without tirith,
False with a tirith warning); ui-tui approvalAction reduced option set +
event-handler allowPermanent propagation; desktop store round-trip + the
ApprovalBar showing/hiding "Always allow".

Supersedes #37856
Closes #37812

Co-authored-by: LeonSGP43 <cine.dreamer.one@gmail.com>
2026-06-11 18:23:59 -05:00
Mani Saint-Victor, MD
9ff0ba0827 fix(desktop): prevent backend port-squat boot loop and pickPort self-collision
Two fixes to the Electron desktop launch path, with the port-reservation logic extracted into a unit-tested module:

1. hermes:bootstrap:reset ("Reload and retry") only cleared connectionPromise, leaving the live backend alive; the orphan kept binding PORT_FLOOR (9120) so the next startHermes() hit EADDRINUSE / "Object has been destroyed" and the window looped. Await teardownPrimaryBackendAndWait() so the reset stops the old backend before restarting.

2. pickPort() probes-then-closes a socket before the real bind happens in a separate Python child, so two concurrent spawns (primary + pool backend) could both be handed PORT_FLOOR and one died with EADDRINUSE. The reservation bookkeeping is extracted into electron/port-pool.cjs (PortPool): pickPort() reserves the chosen port until the child exits and releases it on every exit/error/throw-before-spawn path, closing the TOCTOU window.

PortPool is dependency-injected (probe passed in) and socket-free, unit-tested in electron/port-pool.test.cjs (8 cases) and wired into the test:desktop:platforms script.

(cherry picked from commit d4133945b9)
2026-06-11 18:22:54 -05:00
Brooklyn Nicholson
e3ed7722b5 fix(desktop): refuse a foreign backend's session token after readiness
The served-token fallback adopts whatever token the dashboard HTML
injects. That is correct when our own child regenerated the token (env
pin lost across a shell-wrapped spawn), but wrong when the readiness
probe answered from a process we did not spawn: /api/status is public,
so an orphaned dashboard squatting the port passes waitForHermes while
our child dies on the bind conflict. Silently adopting that process's
token would authenticate the renderer against a foreign backend,
possibly on the wrong profile.

Discriminate on child liveness: the desktop pins
HERMES_DASHBOARD_SESSION_TOKEN on every spawn, so a live child always
serves our token. Served-token mismatch + dead child = foreign backend;
fail the boot loudly instead of connecting. Mismatch + live child keeps
the adopt-served-token salvage from #43720.
2026-06-11 18:18:22 -05:00
Evis
7a2d498b9d fix(desktop): route profile session reads
(cherry picked from commit 64aaf58f5e)
2026-06-11 18:09:24 -05:00
Jeff
e96fe06e49 fix(desktop): use served dashboard token for websocket auth
(cherry picked from commit f8209f91d3)
(cherry picked from commit 72290f0809)
2026-06-11 18:07:19 -05:00
Gille
9102d4a588 fix(dashboard): show Windows 11 in host panel (#44511) 2026-06-11 19:06:29 -04:00
Andrew Fiebert
d221e369b8 fix(desktop): recover from transient assistant-ui index-lookup crash (#44493)
`@assistant-ui/store`'s index-keyed child-scope lookup (`tapClientLookup`)
throws — rather than returning undefined — when a subscriber reads an index
the message/parts list no longer has. During high-frequency store replacement
(switching sessions mid-stream, gateway reconnect replay) a subscriber from
the previous, longer list is still in React's notification queue and reads one
slot past the new, shorter array before it can unmount. The throw
(`Index N out of bounds (length: N)`, the classic index === length off-by-one)
unwinds all the way to the root error boundary and blanks the entire window,
even though the store self-heals on the very next consistent snapshot.

Wrap each virtualized message group in a tiny boundary that swallows ONLY this
transient lookup race and auto-recovers when the message signature changes
(the existing list-mutation key). Any other error re-throws to the root
boundary, so genuine bugs still surface.

Upstream-tracked and unresolved: assistant-ui/assistant-ui#4051, #3652.

Co-authored-by: mollusk <mollusk@users.noreply.github.com>
2026-06-11 22:52:37 +00:00
brooklyn!
b1fe2107d6 fix(desktop): keep named-profile desktop backends per-profile (#44510)
Desktop spawns its dashboard backend with `--profile <name>` and
`HERMES_DESKTOP=1`. cmd_dashboard's unified-launch routing treats any
named profile as a request for the shared machine dashboard: it re-execs
as the default profile (dropping HERMES_HOME) or, when one is already
listening, prints "Machine dashboard already running ... Managing profile
'<name>'" and exits 0. Either way the desktop-spawned child exits before
the app sees a ready backend, so Desktop retries forever — the Windows
named-profile boot loop in the post-mortem.

Skip the machine-dashboard reroute when HERMES_DESKTOP=1 so desktop pool
backends stay per-profile (which is what the pool expects). Carved out of
#44478.

Co-authored-by: AJ <yspdev@gmail.com>
2026-06-11 22:47:28 +00:00
brooklyn!
73969771a5 fix(desktop): discover MCP tools for dashboard /api/ws backends (#44512)
The desktop chat surface talks to the dashboard's in-process /api/ws
gateway, which builds agents through tui_gateway.server._make_agent. That
path only snapshots the existing tool registry — MCP discovery is started
by tui_gateway/entry.py (the stdio TUI), which the dashboard process never
runs. So a profile's configured MCP servers never connect under the
desktop app and sessions show no MCP tools.

Start a shared background MCP discovery thread at dashboard startup (via
hermes_cli.mcp_startup, bounded so a slow/dead server can't block boot),
and have _make_agent briefly join that thread in addition to the existing
entry-owned TUI thread before snapshotting tools.

Carved out of #44478.

Co-authored-by: AJ <yspdev@gmail.com>
2026-06-11 22:45:45 +00:00
Austin Pickett
2ee69d0579 fix(skills): let ClawHub index build walk past the 12s browse budget (#44500)
The deploy-site skills index crawl was capped at ~3k ClawHub entries
because CATALOG_WALK_BUDGET_SECONDS applied to max_items=0 walks too.
Only enforce the wall-clock budget for bounded browse requests and pass
limit=0 from build_skills_index so CI walks the full catalog.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-11 18:03:11 -04:00
Austin Pickett
021ed69141 docs: finish Automation Blueprints terminology rebrand (#44470)
* docs: finish Automation Blueprints terminology rebrand

Replace leftover "Automation Templates" wording from the Cron Recipes
rebrand, rename the copy-paste cookbook guide to Automation Recipes, and
point the marketing gallery link at the blueprints catalog.

Co-authored-by: Cursor <cursoragent@cursor.com>

* docs: use Automation Blueprints instead of Recipes in guide

Rename the cookbook guide from automation-recipes to
automation-blueprints so sidebar and copy match the product term.

Co-authored-by: Cursor <cursoragent@cursor.com>

* docs: rename automation-blueprints-catalog to automation-blueprints

Drop the -catalog suffix from the reference page slug and title, and
move the copy-paste cookbook to automation-blueprint-examples so the
main Automation Blueprints doc is unambiguous.

Co-authored-by: Cursor <cursoragent@cursor.com>

* Revert "docs: rename automation-blueprints-catalog to automation-blueprints"

This reverts commit 605f1eeab5.

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-11 17:22:22 -04:00
Teknium
6c752ca3a5 refactor(agent): tighten SUMMARY_PREFIX wording and fix stale doc references
Legibility pass on the consolidated prefix: collapse the topic-overlap rule
from three overlapping sentences into one WINS sentence + one discard/no-wrap-up
sentence (same constraints, less dilution), fix the module docstring to
describe the headings that actually shipped, and correct the #10896 comment's
heading name (Historical Pending User Asks).
2026-06-11 13:57:13 -07:00
Teknium
acb2954d82 fix(agent): freeze carveout-era SUMMARY_PREFIX for renormalization
The prompt consolidation above retires the carveout-era prefix. Without a
frozen copy in _HISTORICAL_SUMMARY_PREFIXES, summaries persisted by
pre-upgrade builds would lose detection (_is_context_summary_content) and
renormalization (_strip_summary_prefix) — the exact regression class the
tuple exists to prevent. Adds contract tests covering every frozen prefix.

Refs #41607 #38364 #42812
2026-06-11 13:57:13 -07:00
kyssta-exe
8f8cad7ec5 fix(agent): strengthen compression preamble against stale task execution (#41607) 2026-06-11 13:57:13 -07:00
konsisumer
d5e2fbf244 fix(agent): frame compaction handoff sections as historical context 2026-06-11 13:57:13 -07:00
brooklyn!
484f484c25 fix(desktop): carve sidebar nav rows out of the titlebar drag region (#44453)
A WSL2 user reported the top two left-sidebar items being unclickable
while the rest of the UI works. That symptom shape matches an
-webkit-app-region:drag hit-test band eating clicks, not GPU/compositing:
the shell's titlebar drag strips (app-shell.tsx) span the top 34px and
the nav group clears them by only 6px, and drag regions win hit-testing
over DOM regardless of pointer-events. Linux WCO (Electron >=32) is the
newest implementation and has known region quirks (electron#43030).

Apply the same no-drag carve-out the codebase already uses for sticky
user bubbles (USER_BUBBLE_BASE_CLASS in thread.tsx) to the sidebar nav
buttons. Harmless on every platform: the rows were never meant to be
draggable surface.
2026-06-11 15:10:09 -05:00
teknium1
114e265737 fix(plugins): don't cache a failed discovery sweep as discovered
Root-cause hardening for the stranded-empty-registry failure behind
'No web search/extract provider configured': discover_and_load() set
_discovered=True before scanning, so a sweep that raised partway was
swallowed by callers as a warning and every later call early-returned
against an empty registry for the process lifetime. The flag now acts
only as a re-entrancy guard and is reset when the sweep raises, so the
next call retries discovery.
2026-06-11 12:56:44 -07:00
xxxigm
32a73010bb test(web): cover keyless default surviving a failed plugin sweep
Pins the invariant that _ensure_web_plugins_loaded registers the keyless
Parallel default (and the wider bundled set) even when the general plugin
discovery raises, that the direct-registration fallback honors plugins.disabled,
and that it stays a no-op on the healthy path.
2026-06-11 12:56:44 -07:00
xxxigm
93764b9303 fix(web): guarantee the keyless web default registers even if discovery doesn't
web_search/web_extract are documented to work with zero setup via the bundled
keyless Parallel free-MCP backend, but that only holds when the bundled
plugins/web/* providers are registered. The dispatch relied entirely on the
general plugin sweep to do that; when the sweep finishes without registering
them (its exception swallowed as a warning, a packaged layout where it ran
before the bundled tree was importable, or a stale empty-discovery cache), the
registry is empty and BOTH tools dead-end on "No web {search,extract} provider
configured" — despite needing no setup at all.

_ensure_web_plugins_loaded now verifies the keyless default landed after the
sweep and, if not, registers the bundled web providers directly against the
registry. Idempotent, a no-op on the healthy path (one dict lookup), and honors
an explicit plugins.disabled entry.
2026-06-11 12:56:44 -07:00
Austin Pickett
c3464ecf45 fix(discord): recover from runtime gateway task exits (#44383)
* fix(discord): recover from runtime gateway task exits

Salvaged from #39416 (AMEOBIUS) — cherry-picked only the task-exit
recovery; the original PR was 1081 commits behind with 28 unrelated
commits.

A post-ready discord.py WebSocket crash left the gateway split-brained:
producers stayed active while Discord stopped responding. After this fix
the adapter calls _set_fatal_error(retryable=True) + _notify_fatal_error()
so the existing GatewayRunner reconnect watcher replaces the dead adapter.

Also adds _wait_for_ready_or_bot_exit() so startup failures (SOCKS/proxy
errors, invalid tokens) surface fast instead of burning the full ready
timeout. Because connect() no longer waits via asyncio.wait_for on that
path, test_connect_releases_token_lock_on_timeout is updated to trigger
the timeout through the new helper (same lock-release contract).

3 tests pass (2 new runtime-failure tests + the updated timeout test);
test_discord_connect.py and test_discord_slash_commands.py green.

Co-Authored-By: ameobius <ameobius@local.host>

* fix(test): patch _wait_for_ready_or_bot_exit in timeout cancel test

connect() no longer uses asyncio.wait_for for the ready handshake, so
test_connect_timeout_cancels_bot_task was hanging for 30s in CI.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: ameobius <ameobius@local.host>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-11 15:39:01 -04:00
ethernet
e080365a7a fix(tui): new weird typeerror 2026-06-11 15:36:39 -04:00
ethernet
5e5308d34d fix(node): fix @types/node version
TODO lock to a specific node/npm version.
this is a fix for a diff between 10 and 11.
2026-06-11 15:36:39 -04:00
teknium1
08b1c44a53 fix(discord): extend bot-task cancellation to connect()'s generic exception branch
Follow-up to #44389: the generic 'except Exception' branch in connect()
had the same orphaned-task hazard as the timeout branch. Extract the
cancel-and-await logic into _cancel_bot_task() and call it from all
three sites (timeout branch, exception branch, disconnect()).

Also adds deaneeth to AUTHOR_MAP.
2026-06-11 12:09:18 -07:00
Dineth Hettiarachchi
020ef76cf1 fix(discord): cancel _bot_task on connect() timeout to prevent zombie client
When connect() times out waiting for the Discord ready event, the background
asyncio.Task running client.start() was not cancelled. discord.py's internal
reconnect loop can ignore client.close() while a WebSocket handshake is in
flight, so the orphaned task eventually completes and fires on_ready.

A later successful reconnect then leaves two live Discord clients in the same
process — each with its own on_message handler and MessageDeduplicator instance
— so every @mention creates two threads because the per-adapter dedup caches
cannot catch cross-client duplicates.

Fix: explicitly cancel and await _bot_task in two places:
1. The asyncio.TimeoutError handler inside connect() — catches the case where
   the adapter's own inner wait_for fires before the gateway's outer timeout.
2. The start of disconnect() — the load-bearing path, always reached via
   _dispose_unused_adapter regardless of which timeout fired first.

Root cause confirmed from production logs: a Jun 8 network outage caused three
consecutive connect() timeouts. The first attempt's bot_task completed its
handshake 4 minutes later ("Connected as") with no preceding watcher line,
then the watcher's real reconnect also connected 90 seconds after that. The two
clients ran continuously for 41+ hours, confirmed by the same user message
appearing as two separate inbound events in two different thread IDs 357ms apart.

Regression tests added to tests/gateway/test_discord_connect.py:
- test_connect_timeout_cancels_bot_task: simulates a connect() timeout with a
  NeverReadyBot and asserts _bot_task is None afterward
- test_disconnect_cancels_running_bot_task: injects a live zombie task, calls
  disconnect(), and asserts the task is cancelled and the attribute cleared
2026-06-11 12:09:18 -07:00
Teknium
13650ab7f8 fix(gateway): audio attachment note no longer steers the agent into punting
Sibling site of the PDF/DOCX note fixed in PR #44175: the audio file
attachment context note led with "Ask the user what they'd like you to
do with it", steering the model into asking instead of transcribing.
Rewritten to instruct the agent to transcribe/process the file itself
when the request involves its content, only asking when intent is
genuinely unclear. Contract assertion added to the existing audio
attachment note test.
2026-06-11 11:58:19 -07:00
xxxigm
4e9be3ee32 test(gateway): cover document context note for PDF/DOCX vs text
Pin the contract for _build_document_context_note: text documents confirm the
inlined content and record the path; binary documents (PDF/DOCX/XLSX/octet-
stream) tell the agent to extract the text itself and never instruct it to ask
the user to paste the contents.
2026-06-11 11:58:19 -07:00
xxxigm
e7ae145ac4 fix(gateway): guide the agent to read attached PDF/DOCX instead of punting
When a user attached a binary document (PDF, DOCX, XLSX, …) in chat, the
context note prepended to the turn said "Ask the user what they'd like you to
do with it." That steered the model into asking the user to paste the
contents rather than extracting the text it is fully capable of reading — so
attached PDFs/DOCX appeared "unreadable" to the agent.

Rewrite the binary-document note to tell the agent the file is a non-text
format saved at the given path and to extract its text itself (e.g. via the
terminal tool or the ocr-and-documents skill) before answering. Text
documents (whose content is already inlined by the platform adapter) keep
their existing note. The note construction is pulled into a small
`_build_document_context_note` helper so it is unit-testable.
2026-06-11 11:58:19 -07:00
Austin Pickett
ce99a81123 fix(dashboard): suppress unicode-animations postinstall during npm ci
Set CI=1 in _run_npm_install_deterministic so the package's /dev/tty
postinstall demo is skipped during hermes dashboard web UI builds.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-11 11:49:08 -07:00
xxxigm
743c55efa3 fix(desktop): stop file tree throwing "Cannot have two HTML5 backends" on remount (#43541)
* fix(desktop): stop file tree throwing "two HTML5 backends" on remount

The Agent Workspace file tree (react-arborist) shows a permanent "TREE ERROR"
with `[error-boundary:file-tree] Cannot have two HTML5 backends at the same
time.` react-arborist mounts its own react-dnd DndProvider + HTML5Backend per
<Tree>. react-dnd v14 keeps that manager on a global, ref-counted singleton
context and nulls it when the count reaches 0. The tree is keyed on
`${cwd}:${collapseNonce}`, so changing folder / collapsing forces a fresh
<Tree>; during the remount the singleton can be torn down and recreated while
the previous HTML5Backend still owns `window.__isReactDndHtml5Backend`, so the
new backend's setup() throws. The error boundary then sticks, because "Try
again" just remounts into the same race.

Pass arborist a stable, app-lifetime `dndManager` (new getFileTreeDndManager
singleton) so it reuses one backend for the life of the app and never
double-claims the window flag. Drag/drop is already disabled on this tree;
this only changes how the (unused) dnd backend is provisioned.

Promotes dnd-core and react-dnd-html5-backend to explicit deps (already present
transitively via react-arborist's react-dnd 14.x line, so they dedupe to one
instance).

* fix(nix): bump npmDepsHash for desktop dnd deps

Adding dnd-core / react-dnd-html5-backend changed the workspace
package-lock.json, so the single workspace-root npmDepsHash in
nix/lib.nix was stale and the nix build failed. Regenerate it
(hash from the failing nix CI job's 'got:' value).

* fix(nix): update npmDepsHash for merged lockfile

After merging main, the workspace lockfile combined main's dep
changes with the desktop dnd additions, so the npmDepsHash needed
recomputing again. Hash from the nix lockfile-check job.

* fix(nix): use fetchNpmDeps hash for desktop dnd lockfile

prefetch-npm-deps reported sha256-lVnybH9RE/... but fetchNpmDeps
wants sha256-mYgKXE/FL4hnkrEvpVv+ULM/oeyIfO2AM9Ol8OrfWm0= for the
merged workspace lockfile. Use the nix build 'got:' hash so CI passes.

---------

Co-authored-by: Brooklyn Nicholson <brooklyn.bb.nicholson@gmail.com>
2026-06-11 11:47:34 -07:00
liuhao1024
93a2f680fd fix(desktop): preserve explicit hide-all choice in model visibility dialog (#43496)
When a user toggles off the last visible model for a provider group, the
effectiveVisibleKeys() function treated the missing provider prefix as
'never customized' and re-added the default models on the next render,
causing all models to snap back to enabled.

Fix: store a sentinel key (e.g. 'provider::') when the last model for a
provider is toggled off. The sentinel distinguishes 'user hid everything'
from 'user never customized', preventing the default-fallback path from
re-adding models the user explicitly chose to hide.

Fixes #43485
2026-06-11 13:27:38 -05:00
brooklyn!
8505e9d669 fix(desktop): disable spellcheck on composer inputs (#44415)
Turn off browser spellcheck, autocorrect, and autocomplete on the main chat composer and message-edit composer so code, paths, and slash commands are not flagged or altered.
2026-06-11 18:03:23 +00:00
brooklyn!
a4f179c509 fix(agent): steer GPT/Codex family to V4A for single-file edits too (#44411)
The coding-posture brief told GPT/Codex models to use patch mode='patch'
(V4A) for structured/multi-file changes but mode='replace' "for a single
small swap". That second nudge points those models at a format their
first-party harness never taught them.

Verified against openai/codex (current main): apply_patch is the ONLY file
editor in codex-rs — zero occurrences of str_replace/old_string anywhere in
the repo; the grammar (core/src/tools/handlers/apply_patch.lark) is exactly
the V4A dialect our patch_parser implements; the shipped model prompts
(gpt_5_codex, gpt-5.2-codex, gpt-5.1-codex-max + instruction templates)
explicitly say to use apply_patch "for single file edits"; and the tool is
gated per model via ModelInfo.apply_patch_tool_type, i.e. OpenAI ships
V4A-for-everything as model metadata.

The GPT-family line now steers to mode='patch' for all edits, single-file
included. The replace-family line (Claude + open-weight) is unchanged —
Claude Code's FileEdit is old_string/new_string/replace_all exact string
replacement (confirmed from Anthropic's shipped sdk-tools.d.ts, the only
file editor in its tool union), matching our mode='replace'.
2026-06-11 17:52:52 +00:00
Teknium
cb29e8a82e refactor(cron): rebrand Cron Recipes -> Automation Blueprints
Product rename across every surface: module/file names (blueprint_catalog,
tools/blueprints, blueprint_cmd), slash command /cron-recipe -> /blueprint
(alias /bp), dashboard API /api/cron/blueprints, desktop deep-link
hermes://blueprint/<key>, docs catalog page + extract script, and the
skill frontmatter block metadata.hermes.blueprint. No behavior change.
2026-06-11 10:49:47 -07:00
Teknium
3c489fda81 fix(commands): unpin /reset from Slack priority aliases — registry hit the 50-cap
CI tests the PR merged with current main, where the new /memory canonical
command filled Slack's 50-slash cap: with btw/bg/reset all pinned ahead of
canonicals, the last canonical (/debug) got clamped and the Telegram-parity
test failed. Canonical commands must win slots over alias spellings — /new
keeps its native slot and 'reset' stays reachable via /hermes reset.

Also updates test_includes_aliases_as_first_class_slashes to assert the
pinned-alias contract (_SLACK_PRIORITY_ALIASES survive) instead of a
specific unpinned alias's survival, which was the same change-detector
pattern the docstring already warned about.
2026-06-11 10:49:47 -07:00
Teknium
e8b757845d fix(cron-recipes): pre-release hardening — honest cadences, strict slot names, surface-aware UX
Review fixes for the Cron Recipes stack before release:

- hydration-move: */90 in the cron minute field silently wraps to hourly
  (croniter-verified) — 90/120-minute options never fired at their stated
  cadence. Replaced with an hour-field step (0 9-17/2 * * 1-5) and an
  interval_hours slot whose options (1/2/3h) all fire as labeled.
- fill_recipe: reject unknown slot names. A typo'd 'tiem=07:15' used to
  silently create the job at the 08:00 default; now it 422s on the dashboard
  form and errors on the slash/deep-link paths with the valid slot list.
- deliver slot: non-strict enum (options are suggestions, scheduler
  validates downstream) so slack/whatsapp/etc. users aren't locked out;
  GET /api/cron/recipes rewrites its options from cron_delivery_targets()
  so the dashboard form only offers configured platforms; help text no
  longer claims dashboard-created jobs deliver to 'the chat you set this
  up from' (the endpoint strips origin — they go to the home channel).
- gateway: success/accept messages no longer point at /cron (cli_only);
  surface-aware hint instead. Conversational fill now sends the
  'Setting up X — I'll ask you a couple of things…' ack before the agent
  turn, matching the CLI experience.
- important-mail catalog entry: reference the urgency classifier by module
  path (python3 -m cron.scripts.classify_items) instead of baking an
  absolute host path into the job prompt — stale after relocation and
  nonexistent on remote terminal backends. cron/scripts is now a real
  package and ships in the wheel (pyproject packages.find).
- export_recipe: interval schedules round-trip again — parse_schedule
  stores 'minutes' but the renderer only read 'seconds', so every interval
  job exported as the silent '0 9 * * *' fallback.
- skills_hub install: say so when a recipe suggestion is dropped
  (latched dedup or pending cap) instead of printing nothing.

Targeted tests: 58 cron/recipe + 261 web_server pass; E2E-validated all
14 recipes fill+parse, hydration cadences via croniter, typo rejection on
slash + endpoint paths, surface-aware hints, and interval export round-trip.
2026-06-11 10:49:47 -07:00
teknium1
e976faac7a feat(cron-recipes): /cron-recipe <name> seeds a conversational fill
Reworks the chat-line UX: pick a recipe by name and the agent asks you for
what it needs, one question at a time, instead of forcing you to hand-type a
slot=val command line.

- /cron-recipe                  -> lists the catalog
- /cron-recipe <name>           -> forgiving name match (exact/prefix/substring/
                                   fuzzy; ambiguous lists candidates), then seeds
                                   the agent with a natural-language fill request
                                   built from the recipe's typed slots + schedule
                                   and prompt templates. The agent asks for each
                                   value one at a time and calls the EXISTING
                                   cronjob tool. No new tool.
- /cron-recipe <name> slot=val  -> unchanged deterministic path (fill_recipe ->
                                   create_job) for the dashboard/docs/power user.

Mechanism (no new plumbing, invariant-safe — the seed enters as a normal user
turn, never a synthetic injection):
- shared handler returns RecipeCommandResult{text, agent_seed}; match_recipe()
  and build_recipe_seed() are the new shared pieces.
- gateway: dispatch rewrites event.text to the seed and falls through to the
  agent (the same pattern /steer uses).
- CLI: handler sets a one-shot self._pending_agent_seed; the interactive loop
  consumes it right after process_command() and runs it as the next turn.

The typed-slot schema stays the single source of truth (still validates the
form/inline path via fill_recipe); the agent path just renders those slots into
the questions to ask. Docs updated to lead with the name-then-ask flow.
2026-06-11 10:49:47 -07:00
teknium1
1593ca5406 feat(cron): Cron Recipes — parameterized automation templates across every surface
A 'recipe' is a one-place definition of an automation that every surface
renders natively. The slot schema (cron/recipe_catalog.py) is the single
source of truth; four renderers consume it, and all paths end at the same
cron.jobs.create_job — no second job engine.

Form where there's a screen, conversation where there's a chat line:
- Dashboard / GUI app: a Recipes sub-tab on the Cron page renders each
  recipe's typed slots as a form (time-picker, enum dropdown, free-text);
  submit POSTs /api/cron/recipes/instantiate which fills + creates the job.
- CLI / TUI / messengers: /cron-recipe lists the catalog, shows a recipe's
  fields, or fills + creates from a pasted 'key slot=val' command. The shared
  handler (hermes_cli/cron_recipe_cmd.py) names any missing/invalid slot so
  the agent can ask a targeted follow-up.
- Docs: a generated Cron Recipes catalog page (website, .mdx + React cards)
  shows each recipe with a copy-paste command and a 'Send to App' button.
- Desktop: a hermes:// URL scheme (Electron single-instance lock +
  setAsDefaultProtocolClient + open-url/second-instance) routes
  hermes://cron-recipe/<key>?slot=val into the chat composer pre-filled.

Typed slots (time/enum/text/weekdays) with defaults: users never type raw
cron — recipes parameterize time-of-day and weekday sets and translate to
cron expressions; a free-text 'schedule' slot is the full-flexibility escape
hatch. Consent-first throughout: nothing schedules without an explicit submit
or send.

Core:
- cron/recipe_catalog.py — CronRecipe + RecipeSlot, 5 curated recipes,
  recipe_form_schema / recipe_slash_command / recipe_deeplink /
  recipe_catalog_entry renderers, fill_recipe (validate + translate to
  create_job kwargs).
- hermes_cli/cron_recipe_cmd.py — shared /cron-recipe handler (CLI + TUI +
  gateway never drift). CommandDef + dispatch in commands.py / cli.py /
  gateway/run.py.

Dashboard: GET /api/cron/recipes + POST /api/cron/recipes/instantiate
(web_server.py), CronRecipes.tsx gallery+form, Segmented sub-tab on CronPage,
api.ts methods + types.

Desktop: hermes:// scheme end to end (main.cjs deep-link router + ready-queue,
preload onDeepLink/signalDeepLinkReady, global.d.ts types, desktop-controller
composer prefill, electron-builder protocols key).

Docs: extract-cron-recipes.py generator wired into prebuild.mjs,
cron-recipes-catalog.mdx + CronRecipesCatalog React component, sidebar entry.
Generated index json gitignored like skills.json.

Tests: 23 core (catalog/slots/schedule-resolution/validation/renderers/command
handler/generator) + 5 web_server endpoint tests. E2E verified end to end:
slot fill -> create_job -> persisted job with correct schedule/deliver/origin.
2026-06-11 10:49:47 -07:00
teknium1
9a09ea69fb feat(cron): Suggested Cron Jobs — one surface for proposed automations
Hermes can propose automations and let the user accept them with one tap
via /suggestions, instead of making them assemble cron jobs by hand. Every
proposal — wherever it originates — flows through one surface.

Sources (the 'where suggestions come from'):
- catalog: curated starter automations (daily briefing, important-mail
  monitor, weekly review, workday-start reminder) via /suggestions catalog
- recipe: installing a skill that carries a metadata.hermes.recipe block
  registers a suggestion instead of auto-scheduling
- usage / integration: reserved for the background-review detector and
  account-connect triggers (sources defined; emitters land next)

Pieces:
- cron/suggestions.py — the store. add/list/accept/dismiss, dedup+latch by
  key (dismissed proposals never re-offered), pending cap so it can't become
  a nag wall. Accepting calls the existing cron.jobs.create_job — there is
  NO second job engine. Mirrors jobs.py storage (atomic writes, lock, 0600).
- cron/suggestion_catalog.py — the curated set. The important-mail monitor
  entry is where the old proactive-monitor poll->classify->surface engine
  lives now (cron/scripts/classify_items.py + the 'monitor' aux task), as ONE
  catalog automation rather than a standalone feature.
- tools/recipes.py — recipe<->job bridge; register_recipe_suggestion() makes
  a recipe source 'recipe' of this surface. recipe_to_job_spec() is the single
  translation both the direct and suggestion paths share.
- hermes_cli/suggestions_cmd.py — shared /suggestions handler (CLI + gateway
  never drift); /suggestions [accept N|dismiss N|catalog|clear].
- Wired: CommandDef + CLI dispatch (cli.py) + gateway dispatch (gateway/run.py)
  + aux 'monitor' task (config.py) + recipe-install hook (skills_hub.py).

Consent-first throughout: nothing auto-schedules; acceptance is always
explicit; dismissals latch.

Supersedes #41122 (proactive-monitor) and #41127 (recipes): both fold in here
as a catalog entry and a suggestion source respectively.

Tests: store (dedup/cap/accept/dismiss/latch), catalog seeding+idempotency,
recipe->suggestion bridge, command handler, aux config. E2E: recipe SKILL.md
-> parsed -> suggested -> accepted -> real cron job persisted to jobs.json.
2026-06-11 10:49:47 -07:00
Teknium
4d6a133a9f fix(agent): gate skill-index demotion behind the opt-in focus mode (#44387)
The coding posture's names-only demotion of non-coding skill categories
(#44342) applied under the default auto mode, silently changing the skill
index for every user in a git repo. Index changes must be opt-in: demotion
now only fires under agent.coding_context=focus, alongside the toolset
collapse. auto/on leave the skill index untouched; focus semantics are
unchanged (demoted, never hidden; deny-list keeps coding-adjacent and
custom categories at full entries).
2026-06-11 10:00:57 -07:00
Teknium
c7bfc938d5 fix(dashboard): Config page header shows the switched profile's config.yaml path (#44374)
The Config page read config_path from /api/status, which is machine-global
and always reports the profile the dashboard process was started under.
After switching profiles with the global switcher, the header kept showing
the old profile's path (e.g. /root/.hermes/profiles/worker_1/config.yaml)
even though reads/writes correctly targeted the new profile.

Fix: /api/config/raw now returns the resolved path alongside the YAML
(resolved inside _profile_scope, so it follows ?profile=). ConfigPage
prefers that scoped path and only falls back to /api/status for old
servers. ProfileKeyedRoutes already remounts the page on switch, so the
header refreshes immediately.
2026-06-11 09:46:15 -07:00
yoniebans
9121834b31 fix(desktop): scope remote workspace defaults 2026-06-11 09:41:35 -07:00
yoniebans
56a0f48ba6 fix(desktop): tighten remote filesystem wiring 2026-06-11 09:41:35 -07:00
yoniebans
8878484f85 feat(desktop): wire remote filesystem browsing 2026-06-11 09:41:35 -07:00
yoniebans
db79e90130 feat(desktop): add filesystem routing facade 2026-06-11 09:41:35 -07:00
yoniebans
51f47f9a97 feat(desktop): add read-only remote filesystem API 2026-06-11 09:41:35 -07:00
helix4u
e71d746820 fix(mcp): avoid false failed startup status 2026-06-11 09:01:52 -07:00
Teknium
5508f4bc54 fix(cli): utf-8 decode for whatsapp-bridge npm install capture (sibling of #43790) 2026-06-11 09:00:55 -07:00
helix4u
b2043cf157 fix(tui): decode startup subprocess output as utf-8 2026-06-11 09:00:55 -07:00
helix4u
dca11b6650 fix(mcp): preserve stdio argv passthrough 2026-06-11 08:59:55 -07:00
brooklyn!
ee1a744ace fix(agent): demote non-coding skill categories to names-only — never hide skills (#44342)
Real-world failure with the original index pruning: under the default auto
posture, an agent-created ops skill in a demoted category vanished from the
prompt's skill index mid-project, and the agent silently fell back to a
stale sibling skill instead. The "discovery-only" premise didn't hold —
models do not reach for skills_list to rediscover what the index stops
showing them, and agent-created skills are the model's accumulated project
memory (runbooks, pitfalls, operating rules).

Gating pruning behind the opt-in focus mode was the wrong fix too: users
opening a worktree don't know the config exists, so the index-noise win
would effectively never ship.

Instead, the coding posture now DEMOTES non-coding categories rather than
hiding them: each demoted category renders as a single names-only line
("gaming [names only]: allthemons10-ops, mc-backup") with a footer note
explaining the omitted descriptions. Every skill name stays in the prompt,
so memory-anchored recall ("load <name>") keeps working in every mode,
while the description noise is still cut. Applies in auto/on/focus alike;
the general posture demotes nothing. Deny-list semantics unchanged —
unknown/custom categories and coding-adjacent ones keep full entries.

API renamed to match the honest semantics: hidden_skill_categories →
compact_skill_categories, build_skills_system_prompt(hidden_categories=) →
compact_categories=.
2026-06-11 10:25:42 -05:00
teknium1
52c7976f40 fix(whatsapp-cloud): review follow-ups for #43921
- nous_subscription: gate the STT managed-default flip on openai-audio
  entitlement and skip when a local backend (faster-whisper or custom
  command) works; new _local_stt_backend_available() helper + tests
- whatsapp_cloud: WHATSAPP_CLOUD_{DM_POLICY,ALLOW_FROM,GROUP_POLICY,
  GROUP_ALLOW_FROM} env overrides so both adapters can run in parallel;
  normalize allowlist entries (JID/punctuation) to bare wa_id
- whatsapp_cloud: wrap per-message event build in try/except (dedup-marked
  wamids would be silently dropped on Meta's batch retry otherwise)
- whatsapp_cloud: validate media_id before URL/filename interpolation,
  delete transient .ogg after voice upload, FIFO-cap interactive-button
  state dicts and per-chat wamid cache
- whatsapp_common: '# **Title**' headers no longer double-wrap asterisks
- setup wizard: read access token / app secret via getpass on TTYs
- docs: new WHATSAPP_CLOUD_* gating env vars
2026-06-11 07:51:01 -07:00
Teknium
2ecb4e62bb Merge remote-tracking branch 'origin/main' into hermes/hermes-6b48295e 2026-06-11 07:38:25 -07:00
Teknium
9c051f57c3 fix(dashboard): Anthropic API Key entry checks ANTHROPIC_API_KEY, not Claude Code creds; hide deprecated tool-progress env vars (#44286)
Two dashboard fixes:

1. The 'Anthropic API Key' OAuth catalog entry's status fn read
   ~/.claude/.credentials.json (which has its own dedicated claude-code
   entry) and never checked ANTHROPIC_API_KEY at all. It now checks the
   Hermes PKCE file, then the registry env-var order (ANTHROPIC_API_KEY
   -> ANTHROPIC_TOKEN -> CLAUDE_CODE_OAUTH_TOKEN) via get_env_value, so
   keys from .env, the shell, or Bitwarden (injected into the process
   env by load_hermes_dotenv) are all reported, with a '(from Bitwarden)'
   source suffix when applicable.

2. Deprecated HERMES_TOOL_PROGRESS / HERMES_TOOL_PROGRESS_MODE removed
   from OPTIONAL_ENV_VARS so the keys page and setup checklists stop
   offering them. Moved to _EXTRA_ENV_KEYS so .env sanitization and
   reload_env still recognize them for existing users (gateway back-compat
   fallback unchanged).
2026-06-11 07:18:15 -07:00
Teknium
e24c935cf3 fix(bedrock): fall back to non-streaming InvokeModel when IAM denies InvokeModelWithResponseStream (#44293)
IAM policies scoped to bedrock:InvokeModel only (a common least-privilege
setup) reject converse_stream() with AccessDeniedException. The agent loop
hard-prefers streaming and the denial never matched the 'stream not
supported' auto-fallback, so InvokeModel-only users looped on AccessDenied
forever.

- agent/bedrock_adapter.py: new is_streaming_access_denied_error()
  detector (ClientError code check + wrapped-SDK message match);
  call_converse_stream() falls back to converse() on denial.
- agent/chat_completion_helpers.py: bedrock_converse streaming branch
  retries inline via converse() and sets _disable_streaming so later
  turns skip the doomed stream attempt; the chat-completions retry
  block also recognizes the denial for the AnthropicBedrock SDK path
  (message pre-check avoids importing bedrock_adapter — and its lazy
  boto3 install — for unrelated providers).

Both paths print a one-line notice telling the user which IAM action
restores streaming.
2026-06-11 07:15:30 -07:00
b1af653bf6 fix(desktop): Harden local file tree paths (#43618)
* fix(desktop): Harden local file tree paths

Normalize Electron local path handling across file tree, preview, media, and git-root flows. Reject malformed and Windows device paths, recheck sensitive files after realpath resolution, and preserve external symlink traversal with stable renderer errors.

* fix(desktop): Address file tree review feedback
2026-06-11 10:05:59 -04:00
Omar Baradei
e372803554 fix(desktop): refresh session model metadata on switch (#43977)
Co-authored-by: Omar Baradei <omar@kostudios.io>
2026-06-11 10:05:32 -04:00
Austin Pickett
d0e017bac8 fix(gateway): gate oversized Telegram voice/audio before download (#44245)
* fix(gateway): gate oversized Telegram voice/audio before download

Adds a pre-download size check to the Telegram voice and audio inbound
paths. Files that exceed _max_doc_bytes (default 20 MB) are rejected
before get_file() is called, preventing silent OOM-style stalls on large
uploads. A human-readable note is appended to the event text so the
model can explain the limit to the user.

Also extends 403 entitlement detection in recover_with_credential_pool
to cover two additional cases: 'oauth authentication is currently not
allowed for this organization' and Anthropic anthropic_messages-mode 403s,
both of which should be treated as entitlement failures rather than
transient errors.

Tests: 7 new cases in test_telegram_voice_v0_regressions.py covering
the size gate (accept, reject, note text) and the STT-failure notice path.

Salvaged from #40487 (cryptopafi) — cherry-picked the Telegram voice
policy and 403 entitlement fixes; LiveKit/Discord/uv.lock workstreams
left for separate PRs.

* test(gateway): drop orphaned voice tests not backed by this PR

The cherry-picked test file from #40487 included 3 tests for STT-failure
notice and voice-mode (_handle_voice_command 'on' -> voice_only) behavior
that this PR intentionally does NOT salvage (those belong to the LiveKit/
voice-policy workstreams left in #40487). They fail on both this branch
and clean main because the feature code isn't present.

Keep only the 2 tests backed by code actually in this PR:
- test_telegram_audio_size_gate_rejects_oversized_media_before_download
  (covers the _telegram_media_size_allowed guard this PR adds)
- test_voice_tts_is_explicit_audio_reply_opt_in (matches current main)

Removed now-unused imports (MessageEvent, MessageType, AsyncMock).
2026-06-11 10:01:51 -04:00
Teknium
a09343cc96 feat(dashboard): SKILL.md editor on Skills page + attach-skill selector in cron modals (#44231)
Headless/VPS users (dashboard-over-Tailscale, no comfortable SSH) could
list/toggle/install skills and create/edit cron jobs, but not author a
custom skill or link one to a cron job — the UI set WHEN a job runs, but
not WHICH skill it uses.

- Skills page: 'New skill' button + per-row edit pencil open a SKILL.md
  editor dialog (frontmatter + body, server-side validation via the same
  _create_skill/_edit_skill path as the agent's skill_manage tool).
- New endpoints: GET /api/skills/content, POST /api/skills,
  PUT /api/skills/content — all profile-scoped via _profile_scope(),
  which now also retargets tools.skill_manager_tool's import-time
  SKILLS_DIR binding.
- Cron page: skills multi-select in both create and edit modals (parity
  with hermes cron --skill / edit --add-skill); CronJobCreate gains a
  skills field; job cards show an attached-skills badge. update_job
  already accepted skills in updates.
- Tests: 17 new endpoint tests (content read, create/edit validation +
  profile scoping + auth gate, cron skills round-trip).
2026-06-11 06:10:27 -07:00
Teknium
f456f302df fix(gateway): refuse to write service definitions with a temp-dir HERMES_HOME (#44267)
* fix(gateway): refuse to write service definitions with a temp-dir HERMES_HOME

A test/E2E harness that exports HERMES_HOME=/tmp/... and touches any
gateway service write path (install, start self-heal, restart's
refresh_systemd_unit_if_needed) bakes the throwaway home into the
production systemd unit / launchd plist. The gateway then restarts
'healthy' but pointed at an empty temp home — no platforms enabled,
deaf to every message (live incident 2026-06-11: /tmp/hermes-e2e-41264
poisoned the unit during a PR-review E2E probe; the post-update restart
produced a 7-hour zombie gateway).

The existing safety belt only sniffed pytest-shaped markers
(/pytest-of-, /hermes_test). Add a structural guard:
_temp_home_in_service_definition() extracts HERMES_HOME from the
generated systemd unit or launchd plist and refuses the write (with
actionable guidance) when it resolves under tempfile.gettempdir(),
/tmp, /var/tmp, or the macOS /private variants. Wired into all five
write sites: systemd refresh + install, launchd refresh + install +
start self-heal.

* test: patch unit generator in install tests tripped by temp-home guard

CI runs hermetic with HERMES_HOME under a tmp dir, so the real
generate_systemd_unit() output now (correctly) trips the new temp-home
write guard in three install tests. Patch the generator with synthetic
non-temp content — same pattern the existing pytest-marker guard tests
use.
2026-06-11 06:10:08 -07:00
Teknium
8972a151a4 feat(cli,tui): show time since last final agent response on the status bar (#44265)
Adds an idle clock to the context/status bar in both the prompt_toolkit CLI
and the Ink TUI: once a turn completes, a dim '✓ <elapsed>' segment shows how
long the session has been idle since the last final agent response. Hidden
while a turn is live (the per-prompt elapsed timer covers that) and before
the first turn completes.

- cli.py: track _last_turn_finished_at when the agent thread exits, surface
  it via _format_idle_since() in the snapshot, render in both the wide
  fragments path and the plain-text fallback.
- ui-tui: stamp lastTurnEndedAt when busy flips false after a live turn,
  thread it through appStatus -> StatusRule, render via a ticking IdleSince
  segment sharing the duration breakpoint/width budget.
2026-06-11 06:06:19 -07:00
Teknium
a2d7f538d4 fix(delegate): stop subagent tool completion lines leaking into parent CLI display (#44223)
Commit 550b72dd8 changed the concurrent-path tool-result rendering gate
from 'not agent.quiet_mode' to 'tool_progress_mode != off'. Subagents are
constructed with quiet_mode=True but inherit the default
tool_progress_mode='all', so every child tool call during delegate_task
started printing raw ' Tool N completed in Xs - {json...}' lines into
the parent's display, bypassing the curated tree-view relay in
_build_child_progress_callback.

Fix: require BOTH gates — quiet_mode must be off AND tool_progress_mode
must not be 'off' — restoring subagent silence while preserving the
#33860 fix (CLI verbose + tool-progress off stays suppressed). The same
combined gate is applied to the three sibling print sites in
tool_executor.py (concurrent header/args, sequential args, sequential
completion) so the whole class is consistent.
2026-06-11 05:10:10 -07:00
Teknium
9c16ca8790 fix(dashboard): normalize model assignments + confirm-modal for backup import (#44237)
Two beta-reported dashboard bugs:

1. Models page: 'Use as -> Main model' on an analytics card sends
   entry.provider, which falls back to the model's VENDOR prefix
   (modelVendor('anthropic/claude-opus-4.6') == 'anthropic') when the
   session row has no billing_provider. That persisted
   provider: anthropic + default: anthropic/claude-opus-4.6 — a
   vendor-prefixed OpenRouter slug on the NATIVE Anthropic provider.
   New sessions then 400 against api.anthropic.com and the user reads
   it as 'changing models does nothing'. Unknown vendors (moonshotai,
   poolside, ...) were worse: a provider that can never resolve
   credentials.

   Fix: _normalize_main_model_assignment() at the single write
   chokepoint — maps non-provider vendor names back to the user's
   current aggregator (else openrouter), and runs the model through
   normalize_model_for_provider() so the persisted name matches the
   target provider's API format. Wired into both /api/model/set and
   the profile-scoped _write_profile_model.

2. System page: 'Restore from backup' spawns hermes import with
   stdin=DEVNULL, so the CLI's interactive 'Continue? [y/N]' overwrite
   prompt hits EOF and auto-aborts whenever a config already exists
   (always, when the dashboard is running). Fix: ConfirmDialog in the
   dashboard owns the consent, then the endpoint passes --force so the
   restore runs non-interactively.

Validated live: dashboard on a temp HERMES_HOME, repro'd both failure
modes pre-fix (vendor-slug write verified via config.yaml + tui
session.create; import 'Aborted.' in action-import.log), then verified
post-fix (normalized writes, modal -> --force -> restored marker file).
2026-06-11 05:07:58 -07:00
Chris
4717989c10 fix(matrix): isolate room context and restore reliable inbound dispatch (#18505)
* fix(matrix): isolate room context and inbound dispatch

* test(matrix): cover room isolation and dispatch regressions

* docs(matrix): document room isolation and session scope

* fix(matrix): stabilize CI requirement checks

* test(matrix): isolate mautrix stubs in requirements tests

* fix(matrix): port room-scoped status and resume to slash commands mixin

Move Matrix /status scope output and /resume same-room guards from the
pre-refactor gateway/run.py into gateway/slash_commands.py so PR #18505
foundation behavior survives the upstream god-file decomposition.

Uses i18n keys for Matrix resume/status messages. Preserves upstream
session.py fixes (role_authorized, DM user_id isolation).

* docs(matrix): explain inbound dispatch via handle_sync loop

Document why Hermes uses an explicit sync loop with handle_sync() rather than
client.start(), aligning with upstream #7914 diagnostics while preserving
Hermes background maintenance tasks.

* fix(i18n): add Matrix resume/status keys to all locale catalogs

The Matrix /resume and /status slash-command keys added in the foundation
PR must exist in every supported locale file. tests/agent/test_i18n.py
asserts key and placeholder parity across catalogs.

Non-English locales use English strings as interim placeholders until
community translators can localize them.

* fix(matrix): restore gateway authz for allowed_users; honor config require_mention

Revert the early MATRIX_ALLOWED_USERS gate in _on_room_message so inbound
sender authorization stays in gateway authz like main. Parse require_mention
from config.extra (platforms.matrix / top-level matrix yaml) with env fallback,
matching thread_require_mention and fixing Forge when require_mention is set
only in profile config.yaml.

* fix(matrix): harden status scope and allowlisted DMs

* fix(matrix): use session store lookup for resume scope
2026-06-11 07:41:43 -04:00
Teknium
73dd584995 fix(mcp): propagate HERMES_HOME override onto the MCP event loop (#44220)
* fix(mcp): propagate HERMES_HOME override onto the MCP event loop

Closes the known limit documented in #44007: tasks scheduled via
run_coroutine_threadsafe are created INSIDE the MCP loop thread, so they
copy that thread's context — a per-request profile scope (dashboard
?profile= endpoints, e.g. the MCP 'Test server' probe) silently vanished
for anything resolving get_hermes_home() inside the coroutine. Most
visible symptom: OAuth token-store paths (HERMES_HOME/mcp-tokens/)
resolved against the process home instead of the selected profile, so
testing an OAuth MCP cross-profile read the wrong tokens.

_run_on_mcp_loop now wraps scheduled coroutines with the caller's
context-local override (_wrap_with_home_override): set inside the task's
own context on the loop, reset on completion — task-local, so concurrent
calls carrying different scopes don't interfere, and the loop thread's
default context stays untouched. No-op (coroutine passes through
unwrapped) when no override is active, i.e. every non-dashboard caller.

web_server's probe comment updated from 'known limit' to 'covered'.

Tests: override propagation (direct + factory form), OAuth token-path
resolution on the loop, loop-context cleanliness after scoped calls,
no-op passthrough. 225 green across mcp_tool + unification suites.

* test(mcp): concurrent different-scope calls don't interfere
2026-06-11 04:37:01 -07:00
Teknium
3edd09a46f fix(whatsapp): restart stale bridge processes instead of silently reusing them (#44205)
A long-lived Baileys bridge survives gateway restarts AND hermes update:
connect() adopted any bridge already listening with status connected, and
disconnect() only kills bridges the adapter spawned itself. Users who
updated to get inbound media support kept talking to a bridge process
serving months-old bridge.js — images and voice notes still arrived as
placeholders with no cached file path (refs #19105 follow-up reports).

Three fixes in the same stale-bridge class:

- Staleness handshake: bridge.js reports a sha256 self-hash in /health
  (scriptHash); connect() compares it against bridge.js on disk and
  restarts the bridge on mismatch. Pre-handshake bridges report no hash
  and are treated as stale, so every existing stale bridge gets recycled
  exactly once on the next gateway start.
- npm dep refresh: deps reinstall when package.json changes (stamp file
  in node_modules), not only when node_modules is missing — a Baileys
  pin bump now actually lands.
- Cache-dir passthrough: the gateway passes profile-aware
  HERMES_{IMAGE,AUDIO,DOCUMENT}_CACHE_DIR to the bridge instead of the
  bridge hardcoding ~/.hermes/image_cache etc., fixing media paths under
  HERMES_HOME overrides, profiles, and the new cache/ layout.
2026-06-11 03:47:29 -07:00
Teknium
875aa8f162 feat(dashboard): unify multi-profile management — one machine dashboard, global profile switcher (#44007)
* feat(dashboard): unify multi-profile management — one machine dashboard, global profile switcher

The dashboard becomes a machine-level management surface with one
write-target selector, replacing per-profile dashboard fragmentation.

Backend:
- profile param (query or body) on /api/config (get/put/raw), /api/env
  (get/put/delete/reveal), /api/mcp/servers (list/add/remove/test/enabled),
  /api/mcp/catalog (list/install), /api/model/info, /api/model/set —
  all scoped through the existing _profile_scope() context manager
- model/set restructured: expensive-model warning (await) runs before the
  scope; the config write runs sync inside the scope in a worker thread
- MCP catalog installs + git-bootstrap entries spawn 'hermes -p <profile>'
- chat PTY: ?profile= on /api/pty points the child's HERMES_HOME at the
  profile dir (its own gateway subprocess, config/skills/memory/state.db
  all profile-bound); in-process gateway attach skipped when scoped

CLI launch unification:
- '<profile> dashboard' routes to the machine dashboard: attach (open
  browser at ?profile=) when one is listening, else re-exec pinned to the
  default profile with --open-profile preselecting the launcher
- --isolated preserves the old dedicated per-profile server behavior
- start_server(initial_profile=...) appends ?profile= to the auto-open URL

Frontend:
- ProfileProvider + sidebar ProfileSwitcher: ONE global selector, URL-
  persisted (?profile=), mirrored into fetchJSON which auto-appends the
  param to the scoped endpoint families (explicit params win)
- app-wide amber banner names the managed profile
- SkillsPage's page-local selector (from the skills-scoping PR) folded
  into the global context — single source of truth
- ChatPage threads the scope into the PTY WS URL; switching profiles
  remounts the terminal into a fresh scoped session

Omitted profile keeps legacy behavior everywhere.

* docs(dashboard): document machine-level multi-profile management

- web-dashboard.md: 'Managing multiple profiles' section (switcher, URL
  deep-links, unified launch, --isolated, scoped Chat, what stays
  per-profile) + --isolated in the options table
- profiles.md: 'From the dashboard' subsection + set-as-active vs
  switcher clarification
- cli-commands.md: --isolated flag + profile-alias launch example

* fix(dashboard): address profile-unification review findings

Review findings (dev review on PR #44007):

1. HIGH — stale page state on profile switch: pages load data on mount
   and didn't consume the profile scope, so a page opened under profile A
   kept showing A's state while writes silently targeted the newly
   selected B. Fixed structurally: ProfileKeyedRoutes wraps the routed
   page tree and keys it by the selected profile, remounting every page
   (fresh state + refetch) on switch. ChatPage keeps its own remount
   (channel keyed on scopedProfile).

2. HIGH — /api/model/auxiliary read was unscoped while /api/model/set
   wrote scoped (Models page could show default's aux pins while editing
   worker's). Endpoint now takes profile + _profile_scope, added to
   PROFILE_SCOPED_PREFIXES, HTTPException re-raise so ghost profiles 404
   instead of 500. Regression test asserts read/write symmetry with
   differing worker/default aux config.

3. MEDIUM — tools post-setup spawned unscoped from the profile-aware
   drawer. Now spawns 'hermes -p <profile> tools post-setup <key>'
   (same mechanism as hub installs); drawer threads its profile prop.
   Most hooks install machine-level artifacts where the scope is inert,
   but hooks reading config/env now see the drawer's HERMES_HOME.

4. LOW — ty warnings: env Optional asserts before subscript/membership,
   fastapi import replaced with web_server.HTTPException re-use.

298 tests green across the four affected suites; tsc -b + vite build
green; aux scoping E2E-verified with real imports.

* fix(dashboard): address second profile-unification review (gille)

1. BLOCKER — profile scope dropped on sidebar navigation: ProfileProvider
   derived the selection from the current URL, and nav links are bare
   paths, so clicking Config from /skills?profile=worker silently reset
   the write target. State is now the source of truth; an effect
   re-asserts ?profile= onto the new location after every navigation
   (URL stays a synchronized projection for deep links/refresh), and an
   incoming URL param (e.g. 'Manage skills & tools' links) still wins.

2. BLOCKER — /api/model/options unscoped while model/set wrote scoped:
   the picker context (current model/provider, custom providers,
   per-profile .env auth state) now loads inside _profile_scope; added
   to PROFILE_SCOPED_PREFIXES. Test: a worker-only current-model pin
   appears in the scoped payload and not the unscoped one.

3. BLOCKER — MCP test-server probe escaped the scope after the config
   read: the probe now re-enters _profile_scope inside the worker thread
   so env-placeholder expansion resolves against the selected profile's
   .env. Known limit (documented): the probe's dedicated MCP event-loop
   thread doesn't inherit the contextvar (OAuth token paths). Test
   asserts get_hermes_home() inside the probe == the worker profile dir.

4. BLOCKER — broad excepts swallowed unknown-profile 404s: /api/model/info
   degraded to 200-with-empty-model-info and /api/mcp/catalog to a
   silently-empty catalog. Both re-raise HTTPException; 404 regression
   tests added for info/options/catalog.

Polish: scope banner clears the fixed mobile header (mt-14 lg:mt-0);
--open-profile hidden via argparse.SUPPRESS (internal re-exec flag);
attach-path test now asserts the opened ?profile= URL.

(Stale-page-state + /api/model/auxiliary findings from this review were
already fixed in 92bcd1568 — the review ran against e600f6951.)

35 tests in the two new suites + 274 in the adjacent ones, all green;
tsc -b + vite build green; scoping E2E-verified with real imports.

* docs(dashboard)+fix: self-review pass — Profiles page section, REST profile-param tip, body-beats-query precedence

Docs:
- web-dashboard.md: add the missing 'Profiles' subsection to Pages
  (cards, create/builder, manage-skills jump, set-as-active vs switcher
  distinction, editors); REST API section gets a profile-scoped-endpoints
  tip documenting ?profile= / body profile / 404 semantics / /api/pty
- (profiles.md + cli-commands.md were already updated in e600f6951)

Precedence fix: scoped endpoints taking BOTH a query param and a body
field now resolve body.profile first. The SPA's fetchJSON injects the
query param from the GLOBAL switcher; an explicit body.profile (e.g.
Profile Builder flows writing into a specific new profile) is the more
specific intent and must not be overridden by whatever the sidebar
happens to be set to. Matches the documented 'explicit beats global'
contract in api.ts.

Verified: 304 tests green across the four suites; tsc -b + vite build
green; docusaurus build green (only pre-existing broken-link warnings,
none from this PR's pages).
2026-06-11 03:29:33 -07:00
alt-glitch
8afb7bc570 opentui(v6): double-click word / triple-click line selection with held drag-extend
Editor-grade mouse selection parity with the Ink TUI (hermes-ink selection.ts):
a second click in the 500ms/1-cell chain selects the same-class character run
under the cursor (iTerm2 word set, wide-glyph aware), a third selects the line,
and dragging with the button held extends word-by-word / line-by-line while the
clicked span stays selected — anchor flips across the span on direction change.

Core knows only press-drag char selection, so this is a boundary shim
(multiClickSelect.ts) wrapping the renderer's startSelection/updateSelection
seam; word bounds read the presented frame's char grid. Native quirks probed
and pinned: per-renderable selection anchors are fixed at set time (anchor
flips restart the selection) and forward selections exclude the focus cell
(inclusive spans seed focus at hi+1). Pure scanning logic in logic/multiClick.ts;
20 new tests (pure + real-mouse-path frames); demo.tsx installs the seam for
tmux smokes.
2026-06-11 15:58:33 +05:30
Teknium
85503dceca Merge pull request #44038 from NousResearch/hermes/hermes-fb4ee8ce
fix(cli): show quick commands in /help output
2026-06-11 03:04:30 -07:00
kshitij
955fa40062 Merge pull request #44085 from kshitijk4poor/review/pr-43754-ssh-update
fix(update): avoid SSH auth for passive official checks
2026-06-11 01:12:03 -07:00
liuhao1024
0d3e2cc539 fix(desktop): deduplicate sidebar rows by compression lineage in mergeSessionPage (#43487)
When auto-compression rotates the session tip (old #4 → new #5), the
incoming page carries the new tip but the previous list still holds the
old one. The old tip's id differs from the new tip's id, so the existing
id-only dedup in mergeSessionPage() preserves both as separate sidebar
rows.

Add lineage-level dedup: build a set of incoming lineage keys
(`_lineage_root_id ?? id`) and filter survivors whose lineage key
matches any incoming row. This mirrors the existing sessionPinId()
logic used for pin stability.

Fixes #43483
2026-06-11 01:02:27 -07:00
kshitij
c94e93a648 Merge pull request #44084 from kshitijk4poor/salvage/windows-winget-stale-reg
fix(install/windows): repair stale winget registration + refresh/merge PATH after every package manager
2026-06-11 00:25:15 -07:00
kshitij
39f40ece70 Merge pull request #44074 from kshitijk4poor/fix/archive-compressed-session-lineages-salvage
fix(sessions): archive compressed conversation lineages
2026-06-11 00:24:00 -07:00
kshitijk4poor
0edeee14c6 test(desktop): cover official-SSH remote detection for passive updates
Extract the remote-detection helpers (canonicalGitHubRemote, isSshRemote,
isOfficialSshRemote) from main.cjs into a testable update-remote.cjs sibling
module and add a node:test suite, wired into test:desktop:platforms.

main.cjs requires('electron') at load, so its inline helpers weren't unit
testable. The Python side of #43754 shipped a regression test; this gives the
desktop side the same coverage for the security-critical detection that keeps
passive update checks off the SSH origin (avoiding FIDO2/passkey touch
prompts). Tests assert SSH/HTTPS forms canonicalize equal, official SSH is
detected case-insensitively, and forks / other hosts / the HTTPS remote are
NOT misclassified.
2026-06-11 12:53:19 +05:30
kshitij
b4fbf7b93c Merge pull request #44082 from kshitijk4poor/fix/backup-staging-and-nested-skill-dirs
fix(backup): stage SQLite snapshots beside output zip (all paths) and stop excluding nested hermes-agent skill dirs
2026-06-11 00:20:52 -07:00
kshitijk4poor
9662b76d59 fix(install/windows): merge PATH in Update-ProcessPathForPackages instead of overwriting
Follow-up to the winget stale-registration fix. Update-ProcessPathForPackages
rebuilt $env:Path wholesale from the persisted User+Machine hives (plus winget's
Links dir), discarding any process-only PATH entries added earlier in the
installer run. Since the helper now runs after every package manager, that
wholesale replace is more likely to clobber a process-local entry than the
original winget-branch-only version was.

Merge instead: seed from the current process PATH, then append hive and
winget-Links entries not already present, with a case-insensitive,
order-preserving dedupe. Behaviour on a clean box is unchanged (the hive entries
are simply appended); the difference is that pre-existing process-only entries
now survive the refresh.
2026-06-11 12:49:58 +05:30
xxxigm
899acfe42f fix(install/windows): repair stale winget registration; refresh PATH after every package manager
When ripgrep/ffmpeg is missing, `winget install <id>` on a package winget
already has registered is treated as an upgrade: it finds no newer version and
exits 0x8A15002B (-1978335189, APPINSTALLER_CLI_ERROR_UPDATE_NOT_APPLICABLE)
without ensuring the binary is actually present. The installer only logged that
code and judged success by `Get-Command rg`, so a stale registration (files
removed outside winget, or a missing alias shim) became a permanent dead-end —
winget kept reporting "already installed" and the user could never reinstall.

Detect that exit code and retry once with `--force` to repair the registration
so the shim reappears.

Also refresh the process PATH after the choco and scoop fallbacks (not just
winget) via a shared helper, so a successful fallback install — or any install
on a box without winget — is no longer misreported as "not installed".
2026-06-11 12:47:59 +05:30
kshitijk4poor
ed2b9e43c8 fix(backup): stage SQLite snapshots beside output zip in pre-update path too
The pre-update / pre-migration backup path (_write_full_zip_backup) had the
same /tmp staging bug as run_backup: a small tmpfs at the default tempfile
location silently drops large *.db files from the archive. Route its SQLite
staging temp files to the output zip's directory as well, and add regression
tests (mutation-verified) for both staging paths.

Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>
2026-06-11 12:45:40 +05:30
helix4u
cedd9b6d47 fix(update): avoid SSH auth for passive official checks 2026-06-11 12:45:07 +05:30
liuhao1024
dd40600e0a fix(backup): stage SQLite snapshots alongside output zip and stop excluding nested hermes-agent skill dirs
Two bugs in the backup routine:

1. SQLite safe-copy used tempfile.NamedTemporaryFile() which defaults to
   the system temp directory (/tmp).  When /tmp is a small tmpfs and the
   database is large, the copy silently fails and the resulting zip is
   missing state.db, kanban.db, and response_store.db.

   Fix: pass dir=out_path.parent so the temp file is staged alongside the
   output zip on the same filesystem.

2. _EXCLUDED_DIRS contained "hermes-agent" which matched at ANY path
   depth, accidentally excluding the Hermes Agent skill directory at
   skills/autonomous-ai-agents/hermes-agent/.

   Fix: special-case "hermes-agent" to only match when it is the first
   path component (the root-level code checkout).  All other excluded dir
   names continue to match at any depth.

Regression tests added for both fixes.
2026-06-11 12:43:39 +05:30
kshitijk4poor
5e81113d09 chore: map dschnurbusch contributor email for attribution 2026-06-11 12:34:12 +05:30
Dan Schnurbusch
04b3f19538 fix(sessions): archive compressed conversation lineages 2026-06-11 12:31:10 +05:30
Teknium
b8e2c16579 Merge origin/main into salvage branch (resolve AUTHOR_MAP conflict) 2026-06-10 23:25:54 -07:00
kshitij
4829f8d2c5 Merge pull request #44047 from kshitijk4poor/salvage/desktop-stop-stale-session
fix(desktop): recover stale session before stop
2026-06-10 23:23:38 -07:00
teknium1
cb2c13055e fix(gateway): scrub _HERMES_GATEWAY from POSIX detached restart watcher too
Follow-up to the salvaged #41264 (Windows watcher): the setsid/bash detached
restart watcher on Linux/macOS inherits _HERMES_GATEWAY=1 the same way, so
the CLI's self-restart loop guard silently refuses 'hermes gateway restart'
and the gateway never comes back. Scrub the marker from the watcher env on
the POSIX branch as well, and extend the setsid test to assert it.
2026-06-10 23:22:43 -07:00
鼬君夏纪
264ac72b67 fix(gateway,windows): preserve restart watcher env 2026-06-10 23:22:43 -07:00
helix4u
f38f7a3870 fix(desktop): recover stale session before stop
Desktop already recovers from a stale runtime session id when
`prompt.submit` returns `session not found` after a gateway restart or
sleep/wake. The stop path did not have the same recovery: `cancelRun`
called `session.interrupt` once with the stale runtime id, then surfaced
`Stop failed / session not found`.

This makes stop/cancel mirror the prompt recovery path. If
`session.interrupt` reports `session not found` and the selected stored
session id is available, Desktop resumes that durable session, updates
the active runtime ref with the recovered id, and retries
`session.interrupt` once against the recovered runtime id.

Salvaged from #43941 — rebased onto current main, dropping the unrelated
`package-lock.json` (@types/node 24.13.1->24.13.2) and `nix/lib.nix`
hash churn. That bump is a local npm 11 re-resolution artifact, not a CI
requirement: repo CI runs node 22 (npm 10) and main is green at
@types/node 24.13.1, so the lockfile and nix hash do not need to change.

Co-authored-by: helix4u <4317663+helix4u@users.noreply.github.com>
2026-06-11 11:45:08 +05:30
Teknium
2450fd7066 chore: add mvanhorn to AUTHOR_MAP 2026-06-10 22:56:17 -07:00
Matt Van Horn
0b5b7ddfd2 fix(cli): show quick commands in /help output
User-defined quick_commands from config.yaml now appear in the /help
output under a "Quick Commands" section, between skill commands and tips.

Fixes https://github.com/NousResearch/hermes-agent/issues/4090

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-10 22:55:52 -07:00
Shannon Sands
fa7f24e898 Enable webhooks from dashboard page 2026-06-10 22:55:06 -07:00
Teknium
13f1efdd15 fix(gateway): collapse repeated terminal headers in consecutive tool progress blocks (#43968)
When the agent runs several terminal commands back-to-back, each
progress line repeated the '💻 terminal' header above its fenced code
block, cluttering the progress bubble. Now only the first terminal call
in a streak emits the header; subsequent consecutive terminal calls
render adjacent code blocks. Any other tool (or non-block preview)
resets the streak so the next terminal call gets a fresh header.
2026-06-10 22:30:27 -07:00
brooklyn!
4d22b82933 Merge pull request #43959 from NousResearch/hermes/salvage-composer-drafts
fix(desktop): per-thread composer drafts on decoupled lifecycle (salvage #43660, supersedes #43939)
2026-06-11 00:12:23 -05:00
Brooklyn Nicholson
419c8a98a9 Merge remote-tracking branch 'origin/main' into hermes/salvage-composer-drafts 2026-06-11 00:07:07 -05:00
brooklyn!
975edd4140 fix(cli): omit --workspace when subpackage has its own package-lock.json (#42973) (#43986)
* fix(cli): omit --workspace when subpackage has its own package-lock.json

When ui-tui/ (or web/) contains its own package-lock.json, _workspace_root()
returns the subpackage directory itself.  Passing --workspace ui-tui in that
case fails because npm cannot find a workspace named 'ui-tui' inside ui-tui/.

Fix: skip the --workspace flag when npm_cwd equals the target directory,
running a plain 'npm install' from the standalone project root instead.

Applies the same fix to both _make_tui_argv (TUI) and _build_web_ui (web).

Fixes #42973

* test(cli): fix web workspace-scope fixture + cover own-lockfile fallback (#42973)

The web half of the #42977 fix broke test_npm_install_uses_workspace_web_scope,
which built its fixture with no lockfile anywhere. Without a root lockfile,
_workspace_root(web_dir) already returns web_dir, so the new
"() if npm_cwd == web_dir" branch correctly drops --workspace and the
assertion failed. Model a real workspace checkout instead: the single
package-lock.json lives at the root, so --workspace web scopes the install.

Also add the symmetric web regression test (web/ carrying its own lockfile =>
--workspace must be dropped and the install runs plainly from web_dir via
npm ci), matching the TUI coverage already in test_tui_npm_install.py.

---------

Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>
2026-06-11 05:01:25 +00:00
Brooklyn Nicholson
d7d281fa37 feat(desktop): strict per-thread drafts on decoupled composer
Keyed draft stash (Map + localStorage mirror) behind the live composer:
switching threads stashes the departing draft and restores the entering
one; empty threads show an empty box. Session lifecycle never clears
composer state — the scope swap is the only coupling.

Co-authored-by: mollusk <roger@roger.local>
Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
2026-06-11 00:01:06 -05:00
Brooklyn Nicholson
292192f7d7 refactor(desktop): tidy composer draft persistence
- DRY the duplicated submit-restore blocks into dispatchSubmit()
- inline localStorage access (drop browserStorage indirection);
  clearPersistedComposerDraft delegates to write('')
- drop stale per-scope-stash comment in use-session-actions
2026-06-10 23:47:32 -05:00
Brooklyn Nicholson
c710868fbc refactor(desktop): decouple composer from session lifecycle entirely
The composer is a single global surface that sits ABOVE the thread: its
contents follow the user across session switches and are never touched
by session lifecycle. Switching threads doesn't change the render.

Replaces the per-scope draft choreography (scoped storage keys, attachment
stash map, skip-sentinel, restore-on-scope-change effect) with:
- one global localStorage key so an unsent draft survives app reloads
- a one-shot restore on mount
- nothing else — session switches simply don't touch the composer

Verified E2E via CDP with real sidebar clicks + real keystrokes:
typed draft survives A->B->A switching and a full page reload.
2026-06-10 23:39:35 -05:00
brooklyn!
3e74f75e41 feat(agent): coding-context posture across CLI/TUI/desktop/ACP (#43316)
* feat(agent): coding-context posture with per-model edit-format tuning

Hermes detects when it's running in a coding context — an interactive
surface (CLI, TUI, ACP, desktop) sitting in a code workspace (git repo or
recognised project root) — and shifts into a coding posture. Outside that
(chat platforms, non-workspaces) nothing changes.

The posture is modelled as a frozen RuntimeMode selected from a small
ContextProfile registry (coding/general). A profile is data: the toolset to
collapse to, the operating brief to inject, and seams for model routing and
memory. Every domain reads the same resolved object instead of re-probing
git/config on its own:

- System prompt — RuntimeMode.system_blocks(): an operating brief (gather
  context before editing, edit through tools not chat, verify with terminal,
  cap retry loops) plus a live git/workspace snapshot, built once and baked
  into the stable prompt tier so per-conversation caching is preserved.
- Per-model edit-format tuning — the brief nudges each model family toward
  the patch mode it handles best: OpenAI/Codex toward mode='patch' (V4A
  multi-file diffs), Anthropic toward mode='replace' (string replacement).
  The model id rides on RuntimeMode; unknown families keep neutral wording.
- Skill index — non-coding skill categories are pruned from the prompt's
  skill index (discovery-only; skills_list/skill_view still reach the full
  catalog, with a disclosure note).
- Toolset — only under the opt-in 'focus' mode does the posture collapse to
  the coding toolset + enabled MCP servers; the default posture is
  prompt-only and never overrides configured toolsets.

Activation via agent.coding_context: auto (default), focus, on, off.
Subagents inherit the posture for free via toolset inheritance + the shared
prompt builder. Detection is not memoized so a long-lived gateway/TUI
process can't pin a stale posture across working directories.

* feat(agent): cover new-file authoring in the coding edit-format nudge

The per-model edit-format guidance only addressed editing existing code
(patch mode='patch' vs 'replace'), but authoring a brand-new file —
write_file, not patch — is a large fraction of real coding work and the
nudge was silent on it. Surfaced when building a single-file artifact where
the dominant operation was write_file and the steering offered no guidance.

Both family lines now lead with "author new files with write_file; for
edits to existing code prefer ...". Tests assert write_file appears in each
family's brief; unknown families still get neutral wording.

* docs(agent): correct memoization docstring + clarify TUI config-load asymmetry

* feat(agent): sharpen the coding posture — verify-loop facts, wider edit steering, $HOME guard

Tuning pass on the coding posture from dogfooding it as a harness:

- Workspace snapshot now hands the model its verify loop up front:
  detected manifests + package manager (lockfile sniff), the exact
  verify commands (package.json scripts, Makefile targets,
  scripts/run_tests.sh, pytest config), and which context files
  (AGENTS.md / CLAUDE.md / .cursorrules) exist at the root. Marker-only
  (non-git) projects get the snapshot too instead of nothing. The
  "verify before claiming done" brief line was the highest-value piece
  in evals — this turns it from advice into an executable loop instead
  of making the model rediscover the test command every session. Still
  stat-cheap, size-guarded reads, built once at prompt time.

- Edit-format steering covers the families Hermes actually serves:
  Gemini and open-weight coding models (DeepSeek, Qwen, Kimi, GLM,
  Grok, Hermes, Llama, Mistral, Devstral, MiniMax) steer to
  mode='replace' — their RL scaffolds use str_replace-style editors.
  Previously only GPT/Codex and Claude families got steering; the
  models Hermes users disproportionately run all fell to neutral.

- Operating brief gains four behaviors elite harnesses encode: batch
  independent reads/searches in one turn; fix root causes and the bug
  class (sibling call paths), not the reported site; no drive-by
  refactors/renames/reformatting; never read, print, or commit secrets.
  Plus a patch-failure escalation ladder: after the same region fails
  twice, rewrite the enclosing function/file with write_file instead of
  a third patch attempt.

- $HOME dotfiles guard: a git repo rooted exactly at the home directory
  (or a marker sitting in it, e.g. a global ~/AGENTS.md) is user config,
  not a code workspace — without the guard, every session anywhere under
  a dotfiles-managed home silently flipped to the coding posture. Real
  projects under such a home still detect via their own markers/repos;
  'on' mode bypasses the guard.
2026-06-10 23:06:44 -05:00
Brooklyn Nicholson
fdc0d19566 fix(desktop): make draft persistence actually fire — new-chat sentinel, reload flush, session-switch clears
Manual testing of the salvaged draft persistence showed none of it worked
end-to-end. Three distinct bugs, all invisible to the store-level unit
tests:

1. New-chat drafts were never written. The skip-one-persist sentinel was
   reset to null after consuming, but null IS a real scope (the unsaved
   new-session draft) — so in a new chat every persist run matched the
   "consumed" sentinel and bailed. This silently killed the headline
   #38498 fix. Use undefined as the no-skip sentinel, which can never
   collide with a scope.

2. Cmd+R inside the debounce window dropped the trailing text. React does
   not run effect cleanups on a page reload, so the flush-on-unmount
   never fired; with the 400ms debounce that meant type-then-reload lost
   the draft every time. Flush pending writes on pagehide.

3. Session switch/new/resume/branch paths in use-session-actions cleared
   the composer stores synchronously with the session-id updates. React
   batches those, so by the time ChatBar's scope-change cleanup ran to
   stash the departing session's attachments, the store was already
   empty — the stash recorded [] and the chips were lost anyway. The
   composer's per-scope restore now owns composer contents wholesale on
   scope change, so drop the upstream clears (clearComposerDraft only
   touched the vestigial $composerDraft atom nothing reads).

Co-authored-by: mollusk <roger@roger.local>
Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
2026-06-10 22:58:50 -05:00
Teknium
7d8d000b19 revert(cron): remove per-job profile support (PR #28124) (#43956)
Fully removes the cron per-job 'profile' arg added in #28124: the
cronjob tool schema field, CLI --profile flags on cron create/edit,
job-record storage/validation, the scheduler's _job_profile_context
wrapper, and the script-runner env override. Sequential-partition
logic reverts to workdir-only.

The context-local HERMES_HOME override in hermes_constants and the
subprocess bridging in tools/environments/local.py are kept — they
now have other consumers (dashboard multi-profile, TUI gateway).
2026-06-10 20:46:17 -07:00
teknium1
68ffedb6a9 chore(release): map Spaceman-Spiffy for #35586 salvage 2026-06-10 20:45:16 -07:00
teknium1
efcbbde48c refactor: keep anthropic_content_blocks in-memory only (no state.db column)
Drop the hermes_state.py column + persistence plumbing from the salvaged
interleaved-thinking fix. The ordered-block channel covers the failure
window in-memory (turn replayed within the live conversation loop). A
session reloaded from disk after a crash falls back to reconstruction;
if that replay 400s, the thinking-signature recovery (#43667) strips
reasoning_details and retries — one degraded call in a rare resume path
instead of a schema column. Replaces the DB-roundtrip test with a
fallback-shape test.
2026-06-10 20:45:16 -07:00
RaumfahrerSpiffy
7a1eed8268 fix(anthropic): redact replayed tool inputs and broaden thinking-replay 400 recovery
Two additive hardening changes on the interleaved-thinking replay path
introduced by this PR's anthropic_content_blocks channel. Both are scoped
to that channel's blast radius; neither changes correct behavior.

1. Replay-time tool-input re-sourcing (credential safety).
   The ordered-block channel captures each tool_use `input` from the RAW
   API response in normalize_response, which is NOT credential-redacted.
   The parallel tool_calls[].function.arguments IS redacted at storage
   time (build_assistant_message, #19798). The verbatim-replay fast path
   in _convert_assistant_message replayed the raw block input, so a secret
   a model inlined into a tool call (e.g. an Authorization header value
   passed inside a terminal command) would ride back onto the wire even
   though it is redacted everywhere else in history. Re-source tool_use
   input from the redacted tool_calls map by
   sanitized id; interleave order (the reason this channel exists) is
   unaffected. Adapted from #36071, which re-sources tool inputs the same
   way on its replay path.

2. Broaden the thinking-replay 400 classifier (defense-in-depth).
   error_classifier only matched "signature" + "thinking", so the
   frozen-block variant — "thinking ... blocks in the latest assistant
   message cannot be modified. These blocks must remain as they were in
   the original response." — carried no "signature" token and fell through
   to a non-retryable abort. The anthropic_content_blocks channel prevents
   the reorder that triggers this 400 at the source, but if any future
   mutator reintroduces it, the turn now self-heals via the existing
   strip-reasoning-and-retry recovery instead of crash-looping. A negative
   case ensures an unrelated "cannot be modified" 400 (no "thinking") is
   not swept in. Mirrors the classifier broadening in #36087 and #36071.

Tests
- tests/agent/test_anthropic_thinking_block_order.py: a replay test
  asserting an inlined secret is redacted on the wire while interleave
  order is preserved.
- tests/agent/test_error_classifier.py: three cases — frozen-block 400
  native and via OpenRouter route to thinking_signature/retryable; an
  unrelated "cannot be modified" 400 does not.
Both grafts verified RED (tests fail with the change reverted) then GREEN.
Full adapter, transport, classifier and output-field-leak suites pass.

Co-authored-by: AlexanderBFoley <92330381+AlexanderBFoley@users.noreply.github.com>
2026-06-10 20:45:16 -07:00
RaumfahrerSpiffy
529bb1c3d5 fix(anthropic): strip output-only SDK fields from replayed content blocks
HTTP 400 "messages.N.content.M.text.parsed_output: Extra inputs are not
permitted" on the native Anthropic transport. Anthropic SDK 0.87.0 response
blocks carry output-only attributes the Messages *input* schema forbids: text
blocks get `parsed_output` and `citations=None`, tool_use blocks get `caller`.
normalize_response captured blocks verbatim via _to_plain_data and replayed
them as request input on the next turn, so the forbidden fields leaked back ->
400. Like the earlier thinking-block bug, one poisoned turn wedges every
subsequent request in the session (even the diagnostic turn), recoverable only
by switching models or deleting the session.

This is a defect in the anthropic_content_blocks channel added for the
interleaved-thinking fix: it preserved block ORDER correctly but copied every
SDK attribute, including output-only ones.

Fix — whitelist input-permitted fields per block type at all three leak points:
- agent/transports/anthropic.py normalize_response: sanitize at CAPTURE so the
  poison never persists to state.db (defence-in-depth).
- agent/anthropic_adapter.py _sanitize_replay_block (new): whitelist used on the
  ordered-blocks replay path; also recovers already-poisoned stored sessions.
- agent/anthropic_adapter.py _convert_content_part_to_anthropic: a stored
  `text` part is rebuilt from whitelisted fields instead of dict(part) verbatim
  (this was the exact content.N.text.parsed_output failure locus).

Whitelist not blacklist, so future SDK output-only fields can't reintroduce it.
Block order and thinking-block signatures are preserved (the reason the channel
exists). Adds tests/agent/test_anthropic_output_field_leak.py; full adapter
suite green (163 tests). Existing poisoned state.db rows scrubbed out-of-band.
2026-06-10 20:45:16 -07:00
RaumfahrerSpiffy
aaccaada28 fix(anthropic): preserve interleaved thinking/tool_use block order on replay
Interleaved-thinking turns (adaptive thinking, Claude 4.6+/Opus 4.8) emit
content blocks like:

    thinking_1(signed) tool_use_1 thinking_2(signed) tool_use_2

Anthropic signs each thinking block against the turn content preceding it
at its position. normalize_response split the turn into two parallel lists
(reasoning_details + tool_calls), discarding cross-type order, and
_convert_assistant_message rebuilt it as [all thinking][text][all tool_use].
That moved thinking_2 ahead of tool_use_1, invalidating its signature, so
Anthropic rejected the latest assistant message with HTTP 400:

    messages.N.content.M: `thinking` or `redacted_thinking` blocks in the
    latest assistant message cannot be modified.

Observed repeatedly in agent.conversation_loop against api.anthropic.com /
claude-opus-4-8, recurring across sessions on multi-thinking-block turns.

Fix: carry a verbatim, order-preserving copy of the turn's content blocks
(anthropic_content_blocks) end-to-end - capture in normalize_response,
persist/restore through state.db, and replay unchanged for the latest
assistant message. Gated to turns that actually interleave signed thinking
with tool_use, so normal turns are unaffected.

Adds 3 regression tests including a SQLite round-trip covering the
crash-recovery reload path.
2026-06-10 20:45:16 -07:00
Brooklyn Nicholson
65ddc7c4a1 fix(desktop): retain composer attachments per session scope + guard programmatic drafts
The salvaged draft persistence scoped text per session but reset the
composer's attachments to [] on every scope change, so a staged image or
file was silently dropped when you switched sessions and never restored on
return — inconsistent with the "drafts survive session switches" promise
and a real paper-cut given remote staging cost.

Retain attachments per scope in an in-memory map (keyed by the same scope
as the text draft) since blobs / object URLs / live upload state can't be
serialized to localStorage. Entering a scope restores its stashed chips;
leaving stashes the current ones; an accepted submit clears the scope.
This survives session switches (the case users hit) without pretending to
survive a full reload, which attachments fundamentally can't.

Also guard the debounced text write so browsing sent-message history or
editing a queued prompt (both swap the composer to recalled text via
loadIntoComposer) no longer clobbers the genuine in-progress draft in
storage.

Co-authored-by: mollusk <roger@roger.local>
Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
2026-06-10 22:41:34 -05:00
Teknium
ad9012097b fix(dashboard): dedupe useNavigate import/declaration in ProfilesPage
tsc -b (run by the Docker image build, unlike local vite-only checks)
rejected the duplicate identifier.
2026-06-10 20:34:53 -07:00
Teknium
914befa9aa feat(dashboard): profile-scoped skills & toolsets management
'Set as active' on the Profiles page only flips the sticky active_profile
file (future CLI/gateway runs) — it never retargets the running dashboard
process. The skills/toolsets endpoints called bare load_config()/
save_config(), so after 'activating' a profile in the web UI, deactivating
a skill silently wrote into the dashboard's own profile and the activated
profile was untouched.

Backend:
- _profile_scope() context manager on the skills/toolsets endpoints:
  context-local HERMES_HOME override for call-time config resolution +
  cron-style locked swap of tools.skills_tool's import-time SKILLS_DIR
- profile param on /api/skills, /api/skills/toggle, /api/tools/toolsets*
  (list/toggle/config/provider/env), hub sources/search installed-state
- hub install/uninstall/update spawn 'hermes -p <profile> skills ...' so
  the child rebinds skills_hub.SKILLS_DIR at import (the override cannot
  reach import-time globals); profile validated -> 404/400 before spawn

Frontend:
- Skills page: profile selector (deep-linkable /skills?profile=<name>),
  amber banner naming the managed profile, threaded through skill toggles,
  toolset drawer, and hub browser
- Profiles page: 'Manage skills & tools' action per card; 'Set as active'
  toast now says it applies to new CLI/gateway runs only

Omitted profile keeps legacy behavior (dashboard's own profile).
2026-06-10 20:34:53 -07:00
Teknium
3d14f01fd6 fix(desktop): debounce per-keystroke draft persistence writes
The salvaged draft-persistence effect wrote to localStorage on every
keystroke — the composer's per-keystroke path was deliberately slimmed
down previously, so debounce the write (400ms) and flush pending text on
scope change/unmount so a fast session switch can't drop trailing
keystrokes. Also add AUTHOR_MAP entry for the salvaged commit.
2026-06-10 22:34:30 -05:00
Roger
18d61bd06e fix(desktop): persist composer drafts across reloads
Save in-progress composer text to browser localStorage per chat session and restore it when the desktop composer remounts. Keep the draft when submit is rejected or throws, and clear it only after the prompt is accepted.
2026-06-10 22:34:13 -05:00
Teknium
acd7932c0f docs: cross-link write-approval gate from skills, configuration, and slash-command docs (#43801)
The memory/skill write-approval gate (#38199, #43354, #43452) was only
documented inside features/memory.md. Surface it everywhere users will
actually look:

- features/skills.md: new 'Gating agent skill writes' section under
  skill_manage, with the staging semantics, review commands, and the
  distinction from skills.guard_agent_created
- configuration.md: memory.write_approval added to the Memory
  Configuration block; new 'Write approval for skill writes' subsection
  next to the guard_agent_created scanner
- reference/slash-commands.md: /memory and /skills review subcommands in
  both the CLI and messaging tables; Notes updated since /skills
  pending/approve/reject/diff/approval now works on the gateway
- features/memory.md: cross-link to the new skills section
2026-06-10 19:54:44 -07:00
Teknium
0a5762c78d fix(web): genericize free-MCP client identity per telemetry policy
Replace the hermes-identifying clientInfo/User-Agent/session-id prefix on
the keyless Parallel Search MCP path with a neutral 'mcp-web-client'
identity. Project policy forbids third-party usage attribution without an
explicit user opt-in (see telemetry PR policy); MCP requires a clientInfo,
so a generic one satisfies the spec without attributing traffic.

Also adds the contributor AUTHOR_MAP entry and refreshes uv.lock against
current main (parallel-web 0.6.0).
2026-06-10 19:54:38 -07:00
Matt Harris
e0e2571711 feat(web): Parallel-backed web search & extract — free Search MCP when keyless, v1 REST when keyed
Make Parallel the web search/extract backend with a zero-setup free tier:

- Keyless (no PARALLEL_API_KEY): web_search/web_extract work out of the box via
  Parallel's free hosted Search MCP (search.parallel.ai/mcp), and parallel
  becomes the default backend when no other web credentials are configured
  (ahead of ddgs, which is search-only). A small hand-rolled Streamable-HTTP
  JSON-RPC client speaks the MCP's web_search/web_fetch tools; the existing
  web_search/web_extract tools are the only tools registered.
- Keyed (PARALLEL_API_KEY set): uses the Parallel v1 REST endpoints
  (client.search / client.extract with advanced_settings.full_content) — no beta.
  Bumps parallel-web 0.4.2 -> 0.6.0.
- Attribution: on the free path only, results carry provider/attribution and the
  CLI tool line reads "Parallel search" / "Parallel fetch"; the paid path is
  unbranded.
- Selection/registration: web tools register unconditionally (free MCP backstop)
  while check_web_api_key remains a real usability probe; explicit per-capability
  backends are honored (so misconfig surfaces) rather than masked by the fallback.

Tested: live web_search/web_extract against search.parallel.ai in keyless and
keyed modes; unit suites for the MCP client, backend selection, and display
labeling; full agent run shows the "Parallel search" label on the free path.
2026-06-10 19:54:38 -07:00
alt-glitch
94765e48ff cli: worktree lock + dirty-tree preservation — stop pruning uncommitted work
Three behavior changes to the hermes -w worktree lifecycle:

1. Git-native locks. _setup_worktree now locks its worktree
   (git worktree lock --reason "hermes session pid=<pid>"), and
   _prune_stale_worktrees skips locked worktrees at ANY age — a lock
   from a live or crashed session means "do not touch". New helpers
   _lock_worktree / _unlock_worktree / _worktree_is_locked (fail-safe:
   any error reads as locked) / _worktree_is_dirty (fail-safe: any
   error reads as dirty).

2. Dirty trees are preserved. _cleanup_worktree previously destroyed
   worktrees with uncommitted changes if there were no unpushed
   commits; it now keeps the worktree, branch, and lock when the tree
   is dirty OR has unpushed commits, and prints manual cleanup hints
   (git worktree unlock + remove --force). The >72h "force remove
   regardless" prune tier is removed: pruning may only ever delete
   clean, unlocked, fully-pushed worktrees.

3. Branch deletion is gated on removal success. Both cleanup and
   prune previously deleted the branch without checking the
   git worktree remove returncode, dropping easy reachability of the
   commits even when removal failed; the branch is now only deleted
   after a successful remove.
2026-06-11 08:10:55 +05:30
brooklyn!
fe54960142 desktop: un-truncate the active slash/@ row so long descriptions stay readable (#43926)
Follow-up to #42351. Slash command rows render the command label and
description with `truncate`, so skill commands and longer blurbs were
clipped with no way to read the full text. Rather than add a floating
tooltip (which overlaps the popover and only helps the mouse), the active
row — the one reached by keyboard arrows or hover, since onMouseEnter
already sets activeIndex — now drops truncation and wraps inline
(whitespace-normal break-words). Idle rows stay single-line/truncated so
the list reads compact.
2026-06-11 02:35:38 +00:00
brooklyn!
3ffbdfbcc0 desktop: registry-driven slash commands + first-class /resume & /handoff (#42351)
* desktop: surface /tools, /save, /personality and fix /help skill count

Move /tools and /save out of TERMINAL_ONLY_COMMANDS and /personality out of
ADVANCED_COMMANDS so they appear in the desktop slash palette and execute via
the existing slash.exec → command.dispatch fallback. The backend gateway already
accepts these through slash.exec (none are in _PENDING_INPUT_COMMANDS or the
skill list), so no backend change is required.

Recompute skill_count in filterDesktopCommandsCatalog from the filtered pairs.
Previously the /help footer echoed the unfiltered backend total — e.g. "60
skill commands available" while only ~29 actually appeared in the rendered
list, because the desktop hides terminal-only, picker-owned, and advanced
commands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* desktop: keep slash popover live while typing args

The trigger regex `(?:^|[\s])([@/])([^\s@/]*)$` stopped matching the moment
the user typed a space after a slash command, so the popover never showed arg
completions for `/personality`, `/tools`, etc. — even though the backend's
`complete.slash` already returns them with a `replace_from` indicator.

Split the trigger detection so `/` allows args (`/cmd arg1 arg2`) while `@`
keeps the strict no-space behavior. Restrict the slash command name to
`[a-zA-Z][\w-]*` so file paths like `src/foo/bar` don't accidentally trigger
the popover.

Rewrite arg-completion items in useSlashCompletions to insert the full
`/personality alice` token instead of stranding `/alice`: when `replace_from`
is past the command base, prepend the existing prefix to each item's text so
the chip serializer produces a coherent replacement.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* cli: complete toolset names after /tools enable|disable

SlashCommandCompleter previously only auto-derived the first subcommand level
from args_hint, so `/tools enable <tab>` yielded nothing — the user had to
remember every toolset key (web, file, spotify, …) and every MCP server prefix.

Add `_tools_completions` that handles both stages: subcommand (list|disable|enable)
and tool name. Filter by current enable state so `/tools enable <tab>` only
offers disabled toolsets and `/tools disable <tab>` only offers enabled ones —
no point suggesting a no-op. MCP server prefixes (server:) come from the
saved mcp_servers config; per-tool completion under a server would require
runtime MCP introspection and is left as follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* desktop: registry-driven slash commands with first-class pickers

Collapse the if/else slash dispatch into one DESKTOP_COMMAND_SPECS table
that drives popover suggestions, per-type composer pills, and execution.

- /resume, /sessions, /switch: inline session completions (like /skin) plus
  a "Browse all sessions…" entry that opens a dedicated session picker overlay
- /handoff: inline platform completion + handoff.request/handoff.state
  gateway bridge so desktop reaches CLI parity
- colored per-type pills (command/skill/theme) in the composer
- strip ANSI and fix width/alignment of slash output in the chat panel

* desktop: fold repeated slash session/output boilerplate into one helper

runExec, /title, /help and the unavailable case each re-derived the same
ensure-session → bail-with-notify → build-renderSlashOutput dance.
withSlashOutput() returns {sessionId, render} or null, so each handler is
a two-line resolve instead of an eight-line preamble.

* desktop: keep backend meta on slash arg completions

Arg suggestions (/personality <name>, /tools enable <toolset>, /handoff
<platform>) were having their meta overwritten with the parent command's
registry description: desktopSlashDescription("/personality none") canonicalizes
back to /personality and returns its blurb. Skip the lookup for arg rows so the
backend's own display_meta ("clear personality overlay", etc.) survives.

* cli: list real personalities in /personality completion

_personality_completions resolved load_config().agent.personalities — but that
schema has no agent.personalities key, so completion always returned just
`none` even though the runtime (load_cli_config().agent.personalities) ships a
dozen built-ins (helpful, kawaii, pirate, …). Read from the same source the
command actually applies, so `/personality ` surfaces the real options.

* desktop: expand bare arg-commands to their options on pick

Picking a command like /personality from the slash popover committed it
immediately instead of advancing to its argument list. Mark arg-taking
commands (/skin, /resume, /handoff, /personality, /tools) in the registry
and, when one is picked bare, insert "/cmd " as plain text and re-open the
popover on its inline options — mirroring typing "/cmd " by hand. Arg picks
(serialized text already contains a space) still commit a single pill.

Also realign trigger-popover loading test with the redesigned popover (the
/help empty-state hint shows when resolved, not while the spinner is up);
the merge from main reintroduced the pre-redesign expectation.

* tui_gateway: fold session-db close into a context manager

Both handoff RPCs repeated the same `db, close_db = _session_db_handle()`
+ `finally: if close_db: db.close()` dance. Turn the helper into a
`_session_db` contextmanager that owns the close, so callers just
`with _session_db(session) as db:`.

* desktop: unblock handoff retries and exact resume ids

Clear timed-out desktop handoffs through the gateway so retries are not stuck behind a pending row, and let typed /resume session ids bypass the loaded sidebar cache.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-11 01:49:24 +00:00
emozilla
bfcc9f92b4 Merge commit '6110aed9b' into feat/whatsapp-cloud-api 2026-06-10 21:39:22 -04:00
xxxigm
615ad97928 fix(streaming): stop socket read timeout from preempting stale-stream detector (#43570)
* fix(streaming): stop socket read timeout from preempting stale-stream detector

The stale-stream detector is deliberately scaled to 180-300s so reasoning
models (e.g. Opus) can pause mid-stream during extended thinking. But the
httpx socket read timeout stayed at a flat 120s for cloud providers and fired
first, tearing down healthy reasoning streams before the detector (which owns
retry + diagnostics) could act. Symptom: every Copilot/Opus turn dies with
ReadTimeout at a consistent ~125s and never completes.

Floor the cloud socket read timeout at the stale-stream timeout so it can no
longer fire before the detector. Local providers and explicit
HERMES_STREAM_READ_TIMEOUT / request_timeout_seconds overrides are unchanged.

* test(streaming): pin read-timeout >= stale-stream invariant for cloud reasoning streams

Cover the contract that the httpx socket read timeout is never shorter than
the stale-stream detector for cloud providers on the default: small contexts
floor to 180s, >=50K to 240s, >=100K to 300s; explicit overrides win; local
providers and the unresolved-value fallback are unaffected.
2026-06-10 20:21:38 -05:00
Austin Pickett
9dd9ef0ec9 fix(web): profiles page modal (#43858)
* fix(web): profiles page modal

* chore: drop unrelated package-lock.json changes

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-10 20:43:22 -04:00
Teknium
4490c7cf8d fix: in-memory transcript blocks empty-session prune
CI caught tests/cli/test_cli_new_session.py asserting that /new keeps
the old session row when conversation history exists in memory. The
live transcript is authoritative: a session whose messages haven't
flushed to the DB yet (or whose flush failed) must not be pruned.
Guard _discard_session_if_empty on self.conversation_history and pin
the behavior with a test.
2026-06-10 17:37:34 -07:00
Teknium
e96ca1a0d3 feat(sessions): drop empty sessions on CLI exit and session rotation
Port from google-gemini/gemini-cli#27770: starting the CLI and
immediately quitting (or rotating with /new, /clear) left an empty
untitled session row behind. These ghost rows pile up in /resume,
`hermes sessions list`, and the in-chat recent-sessions browser.

- SessionDB.delete_session_if_empty(): transactional check-and-delete
  that only removes rows with no messages, no title, and no child
  sessions (delegate subagent parents are preserved). Also removes
  on-disk transcript files via the existing _remove_session_files.
- HermesCLI._discard_session_if_empty(): thin wrapper, wired into the
  cli_close shutdown path and the new_session() rotation path.
  Skipped when /exit --delete already handles removal.

Unlike the one-shot prune_empty_ghost_sessions migration (TUI-only,
24h-old rows), this prevents new ghost rows from accumulating at the
moment they would be created.
2026-06-10 17:22:27 -07:00
alt-glitch
31916539af opentui(v6): degrade SyntaxStyle exhaustion, unmask the exit-7 crash, clamp the cap to the 65k native handle table
Root cause of the bench-suite crash (every otui mem3000/slope cell died at
~3000 lumpy fixture msgs, exit 7, ~880MB RSS — not a cgroup kill):

- @opentui/core 0.4.0 routes EVERY native object through ONE global handle
  registry with 16-bit slot indices (core src/zig/handles.zig: INDEX_BITS=16,
  MAX_SLOTS=65535, slot 0 reserved). Measured on this install: exactly 65,534
  live handles; the next createSyntaxStyle() fails. destroy() DOES recycle
  slots — exhaustion means LIVE objects.
- Every TextBufferRenderable burns THREE slots in its constructor
  (TextBufferRenderable.ts:77-80: TextBuffer + TextBufferView + SyntaxStyle),
  so the mount-everything transcript hits the wall at ~1,400 store rows
  (~16 text renderables/row x 3 ~ 47 handles/row): "Failed to create
  SyntaxStyle" (zig.ts:4554) throws out of a Solid mount effect.
- The crash was MASKED: CliRenderer's own uncaughtException handler
  (handleError -> console.show()) allocates the console-overlay
  OptimizedBuffer — another handle — so the handler itself threw "Failed to
  create optimized buffer: WxH" and Node died with exit 7 (fatal error in
  the uncaughtException handler), hiding the real error.

Why not share one SyntaxStyle (the obvious 3->2): the per-buffer style is
load-bearing — native setStyledText (text-buffer.zig) registers each chunk's
color by NAME ("chunk{i}") into the buffer's OWN style, and registration is
name-keyed-overwrite (syntax-style.zig putStyle), so a shared style would
cross-corrupt chunk colors between every styled <text>. Pooling is unsound
at our layer in core 0.4.0.

The fix, at the seams that are ours:
- boundary/nativeHandles.ts (ffiSafe.ts sibling): SyntaxStyle.create() on a
  full table DEGRADES to a detached style (native handle 0) instead of
  throwing — JS-side styleDefs/mergeStyles (what markdown/code chunk colors
  actually use) keep working; all native calls on handle 0 are inert no-ops.
- boundary/renderer.ts: guard the process error listeners createCliRenderer
  installs so an exception INSIDE the handler can never exit-7-mask the
  original error again (logged honestly; original error stays the story).
- logic/store.ts: HERMES_TUI_MAX_MESSAGES clamped to a handle-safe ceiling
  (1000 rows ~ 47k handles ~ 72% of the table on the realistic fixture).
  The old default of 3000 was unreachable — the TUI crashed at ~1,400 rows,
  before the cap ever bound. Renderable-weight-aware capping is #27's
  (virtualization) to do properly; until then the degrade shim backstops
  pathological rows.

TODO(upstream) — issue-shaped, for the OpenTUI repo:
  (a) a global 64k handle table with a 3-slot cost per text renderable is
      too small for transcript-style TUIs (61k renderables ~ 3k messages);
  (b) native allocation failures throw out of the render loop with no
      degrade path;
  (c) handleError allocates (console overlay buffer) and so crashes on the
      very condition it is reporting, masking the root cause with exit 7.

Also: eslint now ignores ui-opentui/.bench/** (bench `nodes`-cell build
artifact broke the lint gate) and .gitignore covers it.

Gate: npm run check green, 599 tests (595 baseline + 3 degrade-path tests
+ 1 cap-clamp test).
2026-06-11 04:06:19 +05:30
teknium1
d1383a6b14 fix(skills): widen HERMES_HOME-aware .env resolution to all sibling skills
Follow-up to the GitHub-skills fix: the same hardcoded ~/.hermes/.env
pattern existed across other bundled and optional skills. Under the
official Docker setup (HERMES_HOME=/opt/data, subprocess HOME=/opt/data/home)
those paths point at a nonexistent file.

- kanban-video-orchestrator setup.sh.tmpl + docs: resolve via
  ${HERMES_HOME:-$HOME/.hermes}/.env in check_key()
- telephony.py / canvas_api.py / hyperliquid_client.py: error and
  save messages now report the real resolved env path instead of a
  hardcoded literal (path resolution itself was already correct)
- godmode SKILL.md: load_dotenv snippet resolves via HERMES_HOME
- watch_github.py + ~20 SKILL.md prose mentions: document the env file
  as ${HERMES_HOME:-~/.hermes}/.env so Docker users edit the right file
2026-06-10 15:10:11 -07:00
xxxigm
0a593f132c fix(skills/github): resolve .env via HERMES_HOME, not hardcoded ~/.hermes
The GitHub skills' auth-detection fell back to reading GITHUB_TOKEN from a
hardcoded ~/.hermes/.env. In the official Docker layout HERMES_HOME=/opt/data
while tool subprocesses run with HOME=/opt/data/home, so `~/.hermes/.env`
expands to /opt/data/home/.hermes/.env — a path that does not exist — while the
real secrets file is /opt/data/.env. Result: the agent reports GITHUB_TOKEN as
"not set" even though it is present and the dashboard Keys page shows it.

Resolve the file as ${HERMES_HOME:-$HOME/.hermes}/.env (HERMES_HOME is bridged
into tool subprocess env, falling back to ~/.hermes when unset) across all six
auth-detection sites: github-auth (SKILL.md + scripts/gh-env.sh), github-issues,
github-repo-management, github-pr-workflow, github-code-review.
2026-06-10 15:10:11 -07:00
Teknium
3b4c715e1c fix(telegram): stripped-text fallbacks, re-finalize skip, and tail-only delete guard
Follow-ups on top of the two salvaged GodsBoy commits, all live-validated
against the real Telegram Bot API:

- _edit_overflow_split finalize fallbacks degrade to _strip_mdv2() clean
  text instead of putting raw **markdown** markers on screen (salvaged
  from PR #43463 minus its format-first sizing — live probes show
  Telegram's 4096 limit counts PARSED text, so MarkdownV2 escape
  inflation cannot cause MESSAGE_TOO_LONG and sizing against formatted
  wire length only causes premature splits and fragment messages).
- Skip the redundant requires-finalize edit after a got_done edit that
  split-and-delivered (salvaged from PR #43463): re-finalizing re-splits
  the full text into the adopted continuation and duplicates chunks.
- _send_fallback_final only deletes the stale partial message when the
  fallback re-sent the COMPLETE final text. When the prefix dedup sent
  only the missing tail, the partial IS the head of the answer; deleting
  it left users with only the second half of long responses (live-
  reproduced: flood-control during a long stream -> head deleted,
  ratio 0.54 of content visible). This is the third bug behind the
  'Telegram cut messages' reports and was present on main and both PRs.
2026-06-10 15:09:35 -07:00
GodsBoy
da818510ec fix(gateway): finalize best-effort delivery when stream consumer is cancelled 2026-06-10 15:09:35 -07:00
GodsBoy
590b3c0d7e fix(gateway): recover partial Telegram overflow streams 2026-06-10 15:09:35 -07:00
xxxigm
88fcf0c8c0 docs(memory): clarify that memory does not auto-compact when full
The "Persistent Memory" callout said "when memory is full, the agent
consolidates or replaces entries to make room," which reads as if the
store self-compacts automatically. It does not: the `memory` tool
returns an overflow error and the agent does the consolidation in-turn
(the design from #41755). Also note that `replace` is bound by the same
limit — swapping in a longer entry can still overflow — which is the
exact case that confused a user (replace rejected near the cap even
though the math was correct).
2026-06-10 14:39:50 -07:00
xxxigm
f7a6d6a6a1 test(cron): cover provider "custom" → providers.custom resolution
Add execution-time coverage that bare `provider="custom"` resolves a literal
providers.custom endpoint (and still falls through when none exists), plus
creation-time coverage that `_resolve_model_override` keeps a resolvable
"custom" and only pins the main provider when it is unresolvable.
2026-06-10 14:39:03 -07:00
xxxigm
acd4f34e65 fix(cron): resolve per-job provider "custom" to providers.custom instead of codex
A cron job stored with `provider: "custom"` and a matching `providers.custom`
entry in config failed at execution with `auth_unavailable: providers=codex`.
Two layers conspired:

- `_get_named_custom_provider` returned None for bare "custom" *before*
  scanning config, so a literal `providers.custom` entry was never matched and
  resolution fell through to the global default (codex). Now it scans config
  for an entry literally named "custom"; with none it still returns None,
  preserving the legacy model.base_url trust path.
- `_resolve_model_override` blindly stripped bare "custom" at job creation and
  pinned `model.provider` (e.g. codex). It now keeps "custom" when a configured
  custom endpoint resolves, pinning the main provider only when it doesn't.
2026-06-10 14:39:03 -07:00
helix4u
1e7316ced2 fix(desktop): use sudo callback without interactive env 2026-06-10 14:29:56 -07:00
alt-glitch
79d1b58afe ui-tui: env-gated yoga-node sampler for bench instrumentation (dark by default) 2026-06-11 02:29:48 +05:30
alt-glitch
b091b4eaeb opentui(v6): tier-A latex — unicode math with fence-aware preprocessing 2026-06-11 01:44:48 +05:30
alt-glitch
7e01a96e53 opentui(v6): status chrome v3 — one left-aligned labeled line; copy chip off the scrollbar edge 2026-06-11 01:42:59 +05:30
Tranquil-Flow
a8f404b29f fix(gateway): probe launchd domain instead of hardcoding user/<uid> (#40831)
The previous fix for #23387 changed _launchd_domain() from gui/<uid> to
user/<uid> to support Background/SSH sessions on macOS 26+. However, this
broke Aqua sessions where gui/<uid> is the only working domain and
user/<uid> cannot bootstrap or manage the service.

Now _launchd_domain() probes which domain actually contains the loaded
service:
1. Try gui/<uid> first (Aqua sessions)
2. Fall back to user/<uid> (Background/SSH sessions)
3. Use launchctl managername as heuristic when neither has the service
4. Cache the result for the process lifetime

Regression tests cover all four paths plus caching behavior.
2026-06-10 12:39:48 -07:00
teknium1
2d75833abe chore(release): map ianculling for #36087 salvage 2026-06-10 12:39:44 -07:00
0xyg3n
9f95f72b98 fix(agent): strip api_messages in thinking-signature recovery so the retry actually omits thinking blocks
The thinking-signature recovery in agent/conversation_loop.py popped
reasoning_details from messages, then continued to retry. That had two
defects.

First, the strip never reached the wire payload. api_messages is built
once at the start of the turn by shallow-copying every entry in messages
(line 919 area). Each api_messages entry has its own reference to the
same reasoning_details list. When build_api_kwargs runs on every retry
iteration of the inner while-loop, it consumes api_messages, not
messages. Popping reasoning_details from messages left api_messages
untouched, so the retry's request still carried the same thinking
blocks Anthropic had just rejected. The classifier latched
thinking_sig_retry_attempted = True after the first attempt, and the
loop terminated with max_retries_exhausted on the same 400.

Second, the pop mutated the canonical message list. messages is the
same list _persist_session writes to state.db and the session
transcript, so a single recovery permanently wiped every signed
thinking block from the stored conversation. Subsequent turns reloaded
the stripped state, hit the same 400 ('invalid signature' or 'cannot
be modified', see #24107), and the agent stopped responding entirely.
Cascading compaction-ended sessions then chained off the corrupted
parent and the affected chat could not produce a response on any
future turn.

Move the strip onto api_messages, which is the API-call-time list
rebuilt into kwargs on every retry. messages is no longer touched, so
disk I/O stays clean and the recovery actually reaches the wire.

Observed against the native Anthropic Messages API on claude-opus-4-7
and claude-opus-4-8 with the interleaved-thinking-2025-05-14 beta on
hermes-agent 0.12.0 and 0.14.0. PR #24107 narrows the trigger; this
change makes the recovery do what it always claimed to do, and
prevents the destructive aftermath.

Tests cover the api_messages strip in isolation: pop on a shallow copy
does not affect the source, the canonical messages list survives the
strip, idempotency on a duplicate firing path, and a no-op when no
reasoning_details exist on the messages.

Related: #24107, #26959, #17861.
2026-06-10 12:39:44 -07:00
Ian Culling
86e10dd874 fix(agent): route 'thinking blocks cannot be modified' 400 to recovery
Anthropic returns a 400 when the thinking/redacted_thinking blocks in the
latest assistant message are mutated upstream: 'thinking or redacted_thinking
blocks in the latest assistant message cannot be modified. These blocks must
remain as they were in the original response.'

The classifier's thinking_signature branch only matched on the substring
'signature', so this variant fell through to a non-retryable client error
and hard-aborted the turn -- even though the existing strip-reasoning_details
-and-retry recovery would have healed it.

Broaden the 400 match to also catch 'cannot be modified' / 'must remain as
they were' (still gated on 'thinking'), routing it to the same recovery.
Adds a negative-case test so unrelated 'cannot be modified' 400s are not
swept in.

Defense-in-depth, orthogonal to the root-cause work in #35975 / #17861
(which prevent the block mutation in the first place). Only changes a
terminal-failure into a one-shot recovery.

Signed-off-by: Ian Culling <ian@culling.ca>
2026-06-10 12:39:44 -07:00
alt-glitch
31e0adc681 opentui(v6): code-token scopes in the shared syntax style (highlighting was parsing but painting monochrome) 2026-06-11 00:56:45 +05:30
alt-glitch
8445995321 opentui(v6): composer — shift+enter newline (kitty), height cap + internal scroll, line navigation 2026-06-11 00:34:54 +05:30
alt-glitch
abba43eb63 tests: pin envelope fragment-peel guards incl. the known tail-shape tradeoff 2026-06-11 00:24:51 +05:30
alt-glitch
4a0991c1d2 opentui(v6): ink-budget follow-up — transparent root canvas; muted stops borrowing banner_dim 2026-06-11 00:16:00 +05:30
alt-glitch
364b93a4b9 gateway: compact /usage with current-session per-model costs
The OpenTUI /usage went through the slash-worker subprocess, which
resumes the session WITHOUT a live agent — so it could never show
current-session tokens or costs, and what it did show landed as a
full-screen page.

- slash.exec now answers /usage in-process from the live agent:
  per-model rows (requests, tokens in/out, cache, provider-reported
  cost when present), session totals/context, a one-line 30-day
  summary (SessionDB.usage_totals, real costs only) and a one-line
  Nous credits gauge (nous_credits_compact_line, refactored out of
  nous_credits_lines). ~8 lines instead of a page.
- Unreported costs render as 'not reported by provider' — never
  $0.00 — and the 30d summary omits cost when no session in the
  window has a provider-reported figure.
- /usage full keeps the detailed legacy CLI page via the worker.
2026-06-11 00:15:00 +05:30
alt-glitch
85546bb9e2 gateway: capture real provider-reported cost (openrouter usage accounting)
Cost displays were estimates from a pricing table; on OpenRouter the
status bar never reflected what was actually charged. Now cost is
provider-REPORTED only, end to end:

- OpenRouter requests carry usage:{include:true} (profile + legacy
  transport paths); the response usage.cost field (credits, 1:1 USD)
  is captured per call into agent.session_actual_cost_usd and
  persisted to the sessions DB actual_cost_usd column (NULL-safe:
  unreported calls never touch the stored value).
- Nous keeps its x-nous-credits-* header capture; the header delta
  now surfaces as the session's real cost via real_session_cost_usd.
- Providers that report nothing accumulate NOTHING: cost fields stay
  absent/None (the TUI hides its cost segment), never a fabricated
  $0.00 and never an estimate. _get_usage, gateway /usage and the
  CLI usage page all switched off estimate_usage_cost for display.
- Per-model session accumulator (session_model_usage) records real
  per-call counts and provider-reported cost per model.
2026-06-11 00:14:21 +05:30
alt-glitch
ba3fe7027c opentui(v6): responsive two-line chrome at wide widths 2026-06-11 00:09:31 +05:30
alt-glitch
5999cd2848 opentui(v6): per-block copy affordance 2026-06-10 23:58:55 +05:30
rob-maron
6110aed9be Suppress "Credit access paused" notice on free models (#43669)
* don't show credits message on free model

* PR comments
2026-06-10 23:55:06 +05:30
alt-glitch
639a9cb9a7 opentui(v6): ink budget — earned gold, blue machinery, neutral muted (design pass) 2026-06-10 23:55:04 +05:30
brooklyn!
6de3963e37 fix(desktop): keep model runtime state per session (#43702)
* fix(desktop): keep model runtime state per session

(cherry picked from commit f72ee87d99ee38cb7b5badeb9a8af869bb92073a)

* fix(desktop): keep footer model state scoped to active session

(cherry picked from commit d91942ebd4671ff857b5c8526dbf133f04782ecb)

* fix(desktop): restore stored runtime when resuming sessions

(cherry picked from commit 32b3793418257617b8da57e26151f079c2620d00)

* fix(desktop): persist live runtime changes for resume

(cherry picked from commit c58467779436dcef44a80ad55b52664752dc0837)

* fix(desktop): persist resumed endpoint runtime

* chore(attribution): map pinguarmy's commit email in AUTHOR_MAP

The salvaged commits on this branch preserve @pinguarmy's authorship
(郝鹏宇 / peterhao@Peters-MacBook-Air.local). Add the mapping so the
check-attribution CI gate resolves the email to the GitHub username.

---------

Co-authored-by: 郝鹏宇 <peterhao@Peters-MacBook-Air.local>
2026-06-10 18:16:50 +00:00
alt-glitch
a09fa9df42 opentui(v6): resume picker — tabbed /sessions with peek preview (supersedes switcher) 2026-06-10 23:46:07 +05:30
alt-glitch
4e69fdb3be opentui(v6): per-tool content fixes — clarify/skill_view/read/search/exec + tree-sitter outputs 2026-06-10 23:34:03 +05:30
alt-glitch
b3efafcc73 opentui(v6): dedupe model.options prefetch with /model open 2026-06-10 23:10:11 +05:30
alt-glitch
036e863e4a opentui(v6): model picker provider tabs (nous-first chip strip) 2026-06-10 23:09:41 +05:30
alt-glitch
e3cdedbf0f opentui(v6): kill expand/collapse scroll jitter (suspend stickyScroll across the toggle)
User feedback: tool/thinking rows did a "v small quick lil jump up and
down" when toggled, worst on the bottom rows.

Root cause (verified live with 10ms tmux capture sampling): the
transcript scrollbox's sticky-bottom re-pin and the scroll anchor fought
AFTER paint. On a toggle near the bottom, the content-height change runs
ScrollBox.recalculateBarProps -> applyStickyStart("bottom") (the user is
at the sticky position, so _hasManualScroll is false), which paints a
fully bottom-pinned frame; the anchor's 4x16ms scrollTo re-asserts then
yanked the viewport back up. The capture burst shows the transient
pinned frame between two anchored ones on every expand — the visible
down-up flick.

Fix at the cause instead of correcting after the effect: suspend
stickyScroll (a runtime get/set property on ScrollBoxRenderable) BEFORE
running the toggle and restore it ~100ms later, once the content height
has settled. With sticky off, the toggle's layout pass leaves scrollTop
untouched — the clicked header's document position is unchanged (content
grows/shrinks below it), so nothing moves and there is nothing left to
flicker; a collapse past the new bottom clamps naturally via the
ScrollBar scrollSize setter. Restoring recomputes the manual-scroll
state from the actual position: still at the bottom -> keeps pinning for
new content; mid-content -> manual-scroll semantics until the user
returns (the same end state the old anchor produced). Rapid re-toggles
inside the window keep the ORIGINAL saved value.

The far-from-bottom anchor guarantee is unchanged (scrollTop is simply
never touched), pinned headlessly in scrollAnchor.test.tsx along with
the suspension sequencing, the clamp-then-re-pin collapse path, and the
double-toggle restore. ffiSafe's tall-diff scroll-cut regression now
drives the negative-y condition explicitly via wheel scrolls (the old
anchor exercised it through the very transient sticky-bottom frames this
fix removes).

Verified live (tmux, real gateway): before — toggling the bottom rows
painted a transient bottom-pinned frame (f141 of a 10ms burst); after —
three toggle bursts produce ONLY the clean before/after states (4
distinct frames in 458 samples), headers hold their row, including the
bottom-most rows.
2026-06-10 22:39:12 +05:30
alt-glitch
38eb9bb19a opentui(v6): tool output uncapped by default (env restores a cap)
User feedback: "for all tools, i'd want all their output viewing enabled
to be infinite by default."

Flip envOutputLines (HERMES_TUI_TOOL_OUTPUT_LINES): unset -> Infinity
(was 200); a positive integer RESTORES a cap (e.g. =200); 0 stays
Infinity for back-compat with the old opt-in-unlimited value; garbage ->
Infinity (unrecognized = no cap asked for). The semantic is now "cap
only when the user asked for one".

The store's raw-result preference follows the same rule: envOutputLinesSet
becomes envOutputUnlimited — whenever the cap is unlimited (the default
now) and a gateway tail-capped result_text (omittedNote) arrives with the
always-full raw result on the wire, the raw result wins, since an
uncapped view of a tail would silently miss the head. With an explicit
finite cap the gateway tail + honest omitted note are kept.

Memory safety is unchanged: tool bodies mount only while EXPANDED (rows
default collapsed and free their Yoga nodes on collapse/unmount), and the
rolling HERMES_TUI_MAX_MESSAGES cap bounds the transcript's high-water
mark.

Tests: env.test.ts expectations flipped (unset/garbage -> Infinity, 0
documented as back-compat); tools.test.tsx "flag unset caps at 200"
becomes "unset renders all 250 lines", plus an explicit =50 cap (+note)
test and =200 restored-cap test; the store preference matrix covers
unset/0 (raw wins), =50 (tail+note kept), and no-raw (tail+note, no
crash). Verified live: seq 1 220 expanded renders rows 201-220 with no
"+N more lines" note.
2026-06-10 22:38:49 +05:30
Teknium
07ac185904 fix(ci): exit-4 forensics for vanishing test files in run_tests_parallel.py (#43646)
* fix(ci): append filesystem forensics when a per-file pytest run exhausts exit-4 retries

A PR-added test file (tests/test_iron_proxy.py, PR #30179) repeatedly
failed exactly one CI shard with 'ERROR: file or directory not found'
across 4 runs (including a fresh merge SHA on fresh runners), while the
identical slice passes locally against the same merge commit and a
tree-integrity watcher confirms no sibling test mutates the repo. Three
unrelated branches showed the same one-shard signature the same day.

We currently cannot attribute these because the log only carries
pytest's exit-4 line. This adds a forensics block to the captured
output when exit-4 survives the retry loop:

- does the file exist NOW (post-retries)
- parent dir entry count + similarly-named entries
- git status --porcelain dirty-entry count + first 10 entries

Zero behavior change: rc stays 4, retries unchanged, forensics wrapped
in a broad try/except so they can never mask the failure.

Two new tests cover the exhausted-retries and genuinely-missing paths.

* chore: drop the two forensics tests — ship the runner change only
2026-06-10 10:04:17 -07:00
Shannon Sands
3acf73161f Move folder creation into dialog 2026-06-10 09:53:12 -07:00
Shannon Sands
dd60c49bb8 Add dashboard file drop upload panel 2026-06-10 09:53:12 -07:00
Shannon Sands
6fe4821926 Add dashboard file browser paths 2026-06-10 09:53:12 -07:00
alt-glitch
0bb58b65ec opentui(v6): fix popup-boot latency regression (model.options prefetch blocked the gateway dispatcher)
The native TUI prefetches model.options right after session.create (91df32545,
picker instant-open). The handler is network-bound (~3.7s: pricing fetch + Nous
tier check in build_models_payload) and ran on the gateway's main dispatcher
thread, so every fast-path RPC issued in the first seconds after launch —
complete.slash for the '/' dropdown, session.list, config.get — sat unread
behind it. Measured: first '/' dropdown 1718ms at HEAD vs 53ms at 394f45a3d
(pre-prefetch baseline); 52ms after routing model.options onto the existing
RPC thread pool (_LONG_HANDLERS). The /model picker keeps its 29ms cached open.
2026-06-10 22:20:09 +05:30
alt-glitch
fb30ff218d tests: align dropdown-hint + wrap expectations with arrows-everywhere menus 2026-06-10 22:15:12 +05:30
alt-glitch
e1edbb0e89 opentui(v6): arrows + enter navigate every completion menu (paths, args) 2026-06-10 22:14:11 +05:30
alt-glitch
72118b049f tests(cli): align tui argv prebuild test with the node-probe launcher 2026-06-10 22:09:43 +05:30
alt-glitch
c007d08419 opentui(v6): port utility commands — compact, details, replay, heapdump, mem 2026-06-10 22:09:39 +05:30
alt-glitch
73b261b94f opentui(v6): monotonic double-press clock + consume the viewer's closing Esc 2026-06-10 22:08:39 +05:30
alt-glitch
f4bb617f62 opentui(v6): tray-exit Esc never arms the prompt-history double-press 2026-06-10 21:58:15 +05:30
alt-glitch
4c630d3e7b opentui(v6): Esc+Esc session prompt history — rollback/undo confirm 2026-06-10 21:49:17 +05:30
Teknium
d986bb0c6d feat(dashboard): full-featured profile builder (model + skills + MCPs) (#39084)
* feat(profiles): extend create endpoint for full profile-builder (model + MCPs + skills)

Backend foundation for the dashboard profile builder. Extends POST /api/profiles
to accept, in one call, everything a profile needs beyond name/clone:

- mcp_servers[]  -> written into the new profile's config.yaml
- keep_skills[]  -> replace-semantics: disable every seeded skill not kept
- hub_skills[]   -> async install via 'hermes -p <name> skills install <id>'

All applied best-effort AFTER the profile dir exists, so a hiccup in any one
never 500s the create. Model/MCP/keep-skills writes are profile-scoped via the
HERMES_HOME context override (same mechanism as the existing _write_profile_model).
Hub installs go through a subprocess scoped with -p because skills_hub.SKILLS_DIR
is import-time-bound and the runtime override can't redirect it.

Adds two helpers (_write_profile_mcp_servers, _disable_unselected_skills) and a
TestClient test asserting all four paths land in the NEW profile's config and
the hub spawn is scoped to it. Design doc at docs/design/profile-builder.md.

* feat(dashboard): full-featured profile builder page

Adds a dedicated /profiles/new builder that composes everything a profile
needs into one stepped create flow, reusing the existing Models/Skills/MCP
data paths instead of duplicating them:

- Identity   name + description
- Model      provider+model picker (api.getModelOptions)
- Skills     keep-which-built-in/optional (replace semantics, default = full
             bundle) + skills-hub search/add (api.getSkills, searchSkillsHub)
- MCPs       add HTTP/stdio servers inline
- Review     blueprint -> single POST /api/profiles create

Nothing writes until Create; the one call commits model+MCPs+skill selection
and spawns hub-skill installs (reported in the success toast). ProfilesPage
header gets a 'Build' button (full builder) alongside 'Create' (quick modal).
Route is page-only (not in the sidebar nav). Verified with vite build (2258
modules, green).
2026-06-10 09:18:32 -07:00
alt-glitch
bc71c57ba9 tui_gateway: session.list reports scan-cap truncation honestly 2026-06-10 21:44:17 +05:30
alt-glitch
ab5d422835 tui_gateway+cli: session.list filters + session.peek + bare --resume picker sentinel 2026-06-10 21:31:59 +05:30
alt-glitch
72ee55ed53 opentui(v6): picker v2.1 — provider search, availability toggle, native input, manual refresh 2026-06-10 21:30:55 +05:30
ethernet
4cecb1a13a change(tooling): npm audit fix in website/ 2026-06-10 11:59:34 -04:00
ethernet
90f4b3040d change(tooling): remove react-compiler eslint, update concurrently
concurrently 9 had a critical vuln dependency,
react-compiler eslint plugin is built into react-hooks eslint plugin as
of https://react.dev/blog/2025/10/07/react-compiler-1
2026-06-10 11:59:34 -04:00
ethernet
3bfbb3f2a0 change(tooling): typecheck in CI, update ts to 6
fix(ui-tui): fix ts 6 real type errors

change(tooling): use new node everywhere
2026-06-10 11:59:34 -04:00
alt-glitch
df4bdc9d58 opentui(v6): skill highlighting + one-edit autocorrect (anti-jank) 2026-06-10 21:27:36 +05:30
alt-glitch
eaad47a6f6 opentui(v6): header chrome — dense status bar (Variant A) 2026-06-10 21:24:58 +05:30
alt-glitch
8c3060342f opentui(v6): standardize fuzzy search on fuzzysort (adapter keeps our API) 2026-06-10 21:04:22 +05:30
alt-glitch
c9c6cfc0ee opentui(v6): background-agents tray — down-arrow focus + enter to dashboard 2026-06-10 21:00:59 +05:30
alt-glitch
6438acec60 opentui(v6): model picker v2 — fuzzy search + provider groups + instant open 2026-06-10 20:50:17 +05:30
alt-glitch
76a8bba15f tui_gateway: blocking prompts wait for the human (drop _block timeouts) 2026-06-10 20:23:25 +05:30
alt-glitch
ad220b9d93 opentui(v6): slash menu — arrow navigation + enter accept 2026-06-10 20:05:39 +05:30
alt-glitch
ca791f4000 opentui(v6): trust gateway payload.error — drop client-side result sniffing 2026-06-10 19:34:27 +05:30
alt-glitch
0dafcdd9e3 opentui(v6): tool-name emphasis, thought styling, HERMES_TUI_TOOL_OUTPUT_LINES 2026-06-10 19:31:45 +05:30
alt-glitch
6e62489d9e tui_gateway: surface tool failure as payload.error (result convention) 2026-06-10 19:27:49 +05:30
alt-glitch
076aebc7e6 opentui(v6): tool lifecycle states — live elapsed tick + failed glyph 2026-06-10 19:12:45 +05:30
alt-glitch
b6dc49200d opentui(v6): suppress redundant JSON/diff-echo output under rendered diffs
A patch tool's result is a JSON record whose payload IS the diff. In a verbose
session the gateway redacts + TAIL-caps result_text (_cap_tui_verbose_text),
so the echo arrived under the native diff in two broken shapes: truncated
mid-JSON (unparseable, so the old JSON.parse check failed open), or — for tall
edits — capped PAST the JSON head, which the store's normalizeOutput then
un-escapes into plain lines that duplicate the diff. North star: no raw JSON
in the transcript, ever.

Three layers:
- gateway: when diff_unified ships, result_text drops the in-JSON diff echo
  (_result_sans_diff_echo) — small, parseable, carries only the non-diff
  signal (success/files_modified/warnings/lsp_diagnostics).
- fileTool diffOutputPlan: anything starting with '{' under a rendered diff is
  suppressed regardless of parseability; parseable JSON with real non-diff
  signal (error/warning/lsp_diagnostics) renders JUST those as labeled notes;
  a non-JSON fragment whose lines echo the rendered diff is suppressed too
  (guards older emitters). Plain-text results (lint tails) still render.
2026-06-10 18:49:03 +05:30
alt-glitch
0a5b0780f5 opentui(v6): clamp negative draw coords at the node:ffi seam (diff crash fix)
Expanding a tall <diff showLineNumbers> pinned to the scrollbox bottom froze
the TUI with ERR_INVALID_ARG_VALUE looping out of CliRenderer.loop every
frame. Root cause: @opentui/core 0.4.0 marshals OptimizedBuffer
fillRect/drawText/setCell* coordinates as u32 in the FFI table while
LineNumberRenderable.renderSelf passes raw screen coordinates — NEGATIVE when
the diff is partially scrolled above the viewport. Bun's FFI silently wraps
negatives (native side bounds-checks them into a no-op); Node's experimental
node:ffi rejects them. bufferDrawBox already uses i32, which is why ordinary
boxes/text scroll fine and only the diff line-background path crashed.

Fix at the seam we own: boundary/ffiSafe.ts patches OptimizedBuffer to clip
fillRect to the non-negative quadrant and skip negative-origin
drawText/setCell*/drawChar before the FFI call (Bun parity). Installed from
boundary/renderer.ts (live) and test/lib/render.ts (headless). TODO(upstream):
widen those FFI params to i32 so this shim can be deleted.
2026-06-10 18:48:51 +05:30
alt-glitch
c4348480f3 opentui(v6): file tool renderer — relative path + full native diff 2026-06-10 16:48:15 +05:30
alt-glitch
f76df0688c tui_gateway: send full unified diff (diff_unified) on file-edit tool.complete 2026-06-10 16:42:14 +05:30
alt-glitch
84cbf5c1f3 opentui(v6): prefer gateway-redacted args_text over raw args in tool renderers 2026-06-10 16:24:16 +05:30
alt-glitch
0f92a3cf63 opentui(v6): bash tool renderer — command + full output 2026-06-10 16:18:08 +05:30
alt-glitch
60cbc4c68b opentui(v6): tool renderer registry + labeled-args default (no raw JSON) 2026-06-10 16:15:37 +05:30
Davy
a72bb03757 fix(docker): optimize image size — .dockerignore, drop dev deps, split build layers (#38749)
* fix(docker): optimize image size with .dockerignore, drop dev deps, split build layers

Three changes to reduce the Docker image size and speed up rebuilds:

1. .dockerignore — exclude ~69 MB of files that are never needed inside
   the container: apps/ (desktop Tauri source), tests/, website/
   (Docusaurus), docs/, infographic/, nix/, plans/, packaging/, and
   various dotfiles (.envrc, .hadolint.yaml, .mailmap, etc.).  The
   existing .dockerignore already covered node_modules and .git; these
   additions prevent the remaining non-runtime content from inflating
   both the build context and the final image (COPY . .).

2. pyproject.toml — add a [docker] extra that mirrors [all] but omits
   [dev] (debugpy, pytest, pytest-asyncio, pytest-timeout, ty, ruff,
   setuptools).  The published image doesn't need test/debug tooling.
   Estimated savings: ~30-50 MB of Python packages.

3. Dockerfile — use --extra docker instead of --extra all in the
   uv sync layer.  Also split the COPY + npm run build so that the
   web/ and ui-tui/ frontend builds are cached independently from
   Python source changes (COPY . .).  A Python-only commit no longer
   invalidates the (slower) frontend build layer.

Note: the build-only apt packages (gcc, python3-dev, libffi-dev,
libolm-dev) are still installed in the final image.  Removing them
requires a true multi-stage build (builder → runtime), which is a
larger refactor tracked separately.

* fix(docker): remove redundant [docker] extra, revert to --extra all

The [docker] extra was identical to [all] on main — the PR had added [dev]
to [all] then created [docker] as [all] minus [dev], a no-op round-trip.
Revert [all] to its original form and drop the [docker] extra.

Keep the .dockerignore additions and frontend build layer reordering.
2026-06-10 03:08:00 -07:00
Gille
47e77ae166 fix(curator): use shared atomic state writer 2026-06-10 03:04:54 -07:00
Gille
4c797d0e23 fix(desktop): hide Windows console children launched by GUI 2026-06-10 03:04:54 -07:00
teknium1
189ffe7362 test: port voice-reply suffix assertions, fix change-detector cap test, add AUTHOR_MAP entry
- Add output_path suffix assertions (.ogg Telegram / .mp3 non-Telegram) to
  _send_voice_reply tests, covering the OGG voice-note path that landed on
  main in ae82eed2b (the PR's third commit was redundant with it).
- Convert test_gemini_default_is_32000 back to an invariant against
  PROVIDER_MAX_TEXT_LENGTH instead of a hardcoded literal.
- Map barronlroth@gmail.com -> barronlroth in scripts/release.py.
2026-06-10 02:57:39 -07:00
Barron Roth
2c19208224 feat(tts): add Gemini audio tag rewrite 2026-06-10 02:57:39 -07:00
Barron Roth
5718811de0 feat(tts): add Gemini persona prompt file 2026-06-10 02:57:39 -07:00
Teknium
af3c8b80b5 fix(tests): close pid-file read race in test_grandchild_reaped_via_pgroup (#43447)
The grandchild wrote its pid with open('w').write(...), so the polling
reader in the test could observe the file after creation but before the
write flushed, parsing '' -> ValueError: invalid literal for int().
Write to a temp file and os.replace() it into place so the pid file only
ever appears fully written.
2026-06-10 02:57:27 -07:00
Teknium
70d5d7e39b fix(memory,skills): repair write-approval inline prompt, gateway staging, and gateway /skills review (#43452)
Follow-ups to #38199/#43354 found in post-merge review:

- Inline CLI memory approval never worked: the per-thread approval callback
  was not passed to prompt_dangerous_approval, so the prompt_toolkit
  fail-closed guard (#15216) denied every gated foreground write without
  showing a prompt. Now invokes the registered callback directly; a crashed
  prompt falls back to staging instead of a silent deny.
- Gateway sessions claimed inline support but prompt_dangerous_approval has
  no gateway round-trip (that lives in the pending-approval queue), so gated
  gateway memory writes hit the input() fallback and denied. Gateway
  contexts now stage for /memory pending review.
- /skills pending|approve|reject|diff|approval now works on the gateway
  (gateway_config_gate on skills.write_approval), so skills staged from a
  messaging session can be reviewed there. Diff output truncated for chat.
- memory_tool validates required params before the gate so invalid writes
  are rejected immediately instead of staged and failing at approve time.
- Stale tri-state write_mode docstrings updated to the boolean gate; docs
  table corrected (inline prompt is interactive-CLI-only).
- 6 new tests covering the interactive approve/deny/error paths, gateway
  staging, skills never-prompt invariant, and pre-gate validation.
2026-06-10 02:57:15 -07:00
Teknium
a5c32cdf30 fix(update): self-heal a venv left half-built by an interrupted install (#42172)
* fix(update): self-heal a venv left half-built by an interrupted install

An update killed mid dependency-install (Ctrl-C, terminal close, WSL OOM)
could leave the venv with pip wiped and core deps (e.g. Pillow) missing,
with no automatic recovery — the user had to manually run ensurepip +
reinstall.

Drop an install-scoped .update-incomplete breadcrumb right before the dep
install and clear it only after core-dependency verification passes. On the
next launch (any command except 'update' itself), if the marker is present,
unconditionally bootstrap pip via ensurepip then re-run the .[all] install +
verification, then clear the marker. Failure leaves the marker for retry and
prints the manual recovery command. Never raises — recovery cannot block
launch.

* fix(update): address review — stderr-only recovery output, single-flight lock, gitignore marker

- Route all recovery output (status lines + streamed pip/uv install via
  fd-level dup2) to stderr so protocol-on-stdout launches (hermes acp)
  never get install noise on the JSON-RPC stream.
- Single-flight O_EXCL lockfile (.update-incomplete.lock) so a gateway
  start + CLI launch (or two profiles) can't run concurrent installs
  into the shared venv; stale locks (>1h) are broken for the next launch.
- gitignore .update-incomplete + lock so source-tree installs keep a
  clean git status and update's autostash skips them.
- Document why the loose 'update' argv substring match is intentional
  (over-match defers one launch; under-match would race the real update).
- 4 new tests: lock held → skip, stale lock broken, lock released,
  output lands on stderr only.
2026-06-10 02:57:05 -07:00
Ben Barclay
15813336cc fix(config): preserve original .env file mode in remove_env_value too (#43349)
#33699 fixed save_env_value so an operator-set .env mode (e.g. 0640 on a
Docker bind-mount) survives a config write instead of being re-tightened
to 0600 by the unconditional _secure_file() call. The sibling
remove_env_value() had the identical bug: it restores original_mode and
then unconditionally called _secure_file(env_path), clobbering the mode
back to 0600 on every `hermes config remove KEY`.

Apply the same fix: move _secure_file() into the else branch so it only
runs when no original mode was captured (a freshly created .env still
gets 0600 hardening; existing operator-set modes survive).

Added test_remove_env_value_preserves_existing_file_mode_on_posix, which
fails on the unfixed remove path (expected 0o640, got 0o600) and passes
with the fix.
2026-06-10 19:53:07 +10:00
Siddharth Balyan
183d86b3e0 fix(openrouter): route reasoning_effort to verbosity for adaptive Anthropic models (#43436)
* fix(openrouter): route reasoning_effort to verbosity for adaptive Anthropic models

Reasoning-mandatory Anthropic models (Claude 4.6+/fable/mythos-class) over
OpenRouter ignore reasoning.effort and use adaptive thinking. #42991 correctly
stopped Hermes from sending a reasoning field to them (it 400s), but put nothing
in its place — leaving agent.reasoning_effort a silent no-op on the OpenRouter
path: the model always ran at its adaptive default (high) regardless of config.

OpenRouter honors the requested effort on the top-level verbosity field instead
(maps to Anthropic output_config.effort). Route the existing
reasoning_config[effort] there for these models while still never emitting a
reasoning field, preserving the #42991 fix. No new config arg — the value the
user already sets via agent.reasoning_effort now flows to verbosity.

- low/medium/high/xhigh/max pass through verbatim (OpenRouter accepts the
  extended scale for Claude; verified live HTTP 200 + monotonic token spend).
- effort unset/none/disabled omits verbosity so the model keeps its default.
- native Anthropic transport already correct; unchanged.

Fixes #43432

* test(openrouter): cover real effort range (add minimal, frame max as passthrough)

Adversarial review noted the verbosity tests looped over 'max' — a value
parse_reasoning_effort can never produce — while omitting 'minimal', which it
can. Align the routing test with the real config range
(VALID_REASONING_EFFORTS = minimal/low/medium/high/xhigh) and keep a separate
value-agnostic passthrough test that documents why xhigh/max must survive
verbatim (TypedDict, no runtime literal validation; OpenRouter accepts the
extended scale for Claude).

* docs: explain reasoning_effort -> verbosity routing for adaptive Anthropic models

Document that reasoning_effort transparently maps to OpenRouter's verbosity
field for adaptive-thinking Anthropic models (Claude 4.6+/Fable/Mythos), where
reasoning.effort is ignored. Note xhigh is the configurable ceiling (max is wire-
only). Add verbosity as a top-level-kwarg example in the provider-plugin guide.
2026-06-10 15:03:01 +05:30
Teknium
cd9a9cd8e5 fix(gateway): Slack approval UX in threads — block-size overflow + typed-prefix instruction text (#43444)
Two fixes for the reported Slack thread approval UX:

1. Slack Block Kit approval/confirm sends silently overflowed the
   3000-char section-block cap (flat 2900-char truncation + header +
   reason), so long execute_code approvals failed with invalid_blocks
   and fell back to the plain-text prompt with no buttons. Budget the
   command preview against the rendered fixed parts so blocks never
   exceed the cap (send_exec_approval + send_slash_confirm).

2. The text fallbacks told users to reply /approve — which Slack blocks
   inside threads and Matrix clients reserve client-side. Add a
   typed_command_prefix capability flag on BasePlatformAdapter
   (default "/"; Slack and Matrix set "!" to match their existing
   bang-prefix rewrite) and use it in the shared fallback prompt
   builders (exec approval, update prompt, destructive slash confirm,
   expensive-model confirm) plus Matrix's reaction-prompt text.
   The slash-confirm text-intercept now also accepts bang-prefixed
   replies (!always, !cancel) since those keywords aren't registered
   commands and the adapters' rewrite doesn't touch them.
2026-06-10 02:30:01 -07:00
Evi Nova
5d8c44a393 fix(docker): pre-install matrix deps in Docker image (#30399) (#42413)
The Matrix gateway requires mautrix[encryption] which pulls in
python-olm. While python-olm was removed from [all] due to missing
Windows/macOS wheels, it has binary manylinux wheels for Linux
amd64/arm64. The Docker image only runs on Linux, so adding --extra
matrix to the uv sync line is safe.

libolm-dev is already in the apt-get install line for runtime linking.

Fixes: #30399
2026-06-10 19:23:06 +10:00
kshitij
2f19512341 fix(cli): repair non-UTF-8 stdout/stderr on all platforms, not just Windows (#43439)
`hermes setup` (and other banner-printing commands) crash with an unhandled
UnicodeEncodeError on Linux hosts whose locale selects a non-UTF-8 codec —
e.g. a fresh Raspberry Pi / minimal Debian with a latin-1 or C/POSIX locale.
The setup wizard prints box-drawing characters (┌│├└─) and the ⚕ glyph before
any stream repair runs, so the command dies before it can start.

The existing _ensure_utf8() shim already knew how to re-wrap the standard
streams as UTF-8, but it returned early on `sys.platform != "win32"`, so the
identical crash class on Linux was never covered.

- Drop the win32 gate: repair any stdout/stderr whose encoding is not UTF-8.
- Prefer TextIOWrapper.reconfigure() so the stream object is fixed in place
  (cached sys.stdout references keep working); fall back to reopening the fd
  with closefd=False (the CPython-recommended safe variant).
- Use errors="replace" — matching the sibling hermes_cli/stdio.py shim — so a
  stray un-encodable byte degrades gracefully instead of crashing.
- Only set the PYTHONUTF8/PYTHONIOENCODING child-process hints when a repair
  actually happened, so a healthy UTF-8 host sees zero footprint (no stream
  swap, no env mutation).

This is intentionally the earliest, platform-agnostic guard, running at import
time before any banner prints. hermes_cli/stdio.py::configure_windows_stdio()
still runs later from the entry points for the Windows-only extras (console
code-page flip, EDITOR default, PATH augmentation); it early-returns on
non-Windows and its stream reconfigure is an idempotent no-op once we've
already repaired the streams here.

Add regression tests covering latin-1 and ascii/POSIX streams, the reconfigure
fallback, already-UTF-8 no-op (identity preserved + no env mutation), the
repair-sets-env and respects-explicit-env contracts, and hostile/None streams.
2026-06-10 02:21:00 -07:00
brooklyn!
f222bd26e7 Merge pull request #43430 from NousResearch/bb/desktop-tool-codicons-filled
style(desktop): filled glyphs for in-thread tool icons
2026-06-10 03:52:04 -05:00
Brooklyn Nicholson
38273676ea fix(desktop): carve sticky user bubbles out of the titlebar drag region
Sticky human bubbles park at --sticky-human-top (~4px), sliding under the
titlebar's -webkit-app-region:drag strips. Electron resolves drag regions at
the compositor level — z-index and pointer-events don't apply — so clicking a
stuck bubble dragged the window instead of opening the edit composer. Add
no-drag to the shared bubble base class (read-only bubble + edit composer).

Covers the runtime side with a test: clicking a user bubble opens the inline
edit composer through both the incremental external-store runtime and the
stock one.

(cherry picked from commit db4e1f4f3e)
2026-06-10 03:46:03 -05:00
Brooklyn Nicholson
c1308ebf3f style(desktop): filled SVG glyphs for in-thread tool icons
Replace the earlier text-stroke approach (which only bolds outline
codicons — a font glyph has no fillable region) with dedicated solid
SVG glyphs for tool rows. Adds ToolIcon, keyed by the same names as
TOOL_META, with a codicon fallback for uncovered tools.
2026-06-10 03:41:55 -05:00
Brooklyn Nicholson
e80754647c style(desktop): render in-thread tool codicons as filled glyphs
Outline codicons read too thin at conversation-tool scale; a scoped
filled modifier thickens tool-row and code-card icons without changing
icon semantics elsewhere in the shell.
2026-06-10 03:30:25 -05:00
alt-glitch
ae11a636dc feat(tui): run on Node 26 (one runtime), finalize copy UX, rename to ui-opentui
Ports the engine off the second JS runtime onto Node 26.3 (node:ffi) so the
repo ships a single JavaScript runtime: child_process for the gateway, vitest
for tests, an esbuild + Solid build step. Mouse selection copies the rendered
text you highlight, and the clipboard path is crash-proofed (a broken copy
pipe no longer quits the UI). Renames the engine dir ui-tui-opentui-v2/ ->
ui-opentui/ and updates the launcher/installer/Docker references.
2026-06-09 16:16:48 +00:00
alt-glitch
25567919ea opentui(bench): scripts/demo.tsx — view the fixture in a real attachable TUI
Dev demo (not a test): seeds the bench fixture into the store via the resume path
and renders <App> under a real CliRenderer (no gateway) so you can attach over
tmux, scroll, and eyeball the transcript + the rolling-cap truncation notice.
Run: DEMO_TOTAL=240 HERMES_TUI_MAX_MESSAGES=80 bun scripts/demo.tsx
2026-06-09 10:41:13 +00:00
alt-glitch
dcd8ba2a0d opentui(memory): cap default 1500→3000 + honest truncation notice
Bench (realistic fat-turn fixture) put numbers on the cap tradeoff: ~0.65 MB/msg,
~20.4 renderables/msg → 3000 ≈ 2 GB steady RSS, the highest cap within a sane TUI
budget (that ceiling only hit by marathon 3000+-msg sessions; typical cost a
fraction). 1500 was too little scrollback. Tunable via HERMES_TUI_MAX_MESSAGES.

Adds a store `dropped` counter (live overflow in capMessages + the resume slice in
commitSnapshot; reset on clearTranscript) and a dim, selectable=false top-of-
transcript notice — '⤒ N earlier messages — scroll-back capped; full transcript on
the dashboard · session <id>' — so display truncation is visible + points to the
deep-history surface. Display-only: never touches the model's gateway-side context.
2026-06-09 10:25:16 +00:00
alt-glitch
f205dc2a3b opentui(bench): realistic heavy-session fixture (fat tool-turns) + multi-cap matrix
Replaces the synthetic ~5.5-node/msg pushes with a deterministic generator
(scripts/fixture.ts): lorem-ipsum user turns + fat assistant turns (markdown +
reasoning + 1-15 tool parts with multi-line results) driven through the real
apply()/commitSnapshot paths. mem-bench.tsx pumps it + checks the resume path.
Realistic cost is ~20.4 renderables/msg (3.7x synthetic); informed the cap tune.
2026-06-09 10:25:16 +00:00
alt-glitch
c40d3172ac opentui(harden): slice the resume snapshot before mounting (no transient over-cap)
commitSnapshot set the full fetched history then trimmed — briefly handing the
whole transcript to <For>. Since Yoga (WASM) layout memory is grow-only, even a
transient over-cap mount permanently ratchets the high-water mark, partly
defeating the cap when resuming a large session (a real one has ~1980 messages).
Slice to MESSAGE_CAP BEFORE the first setState so resume mounts at most the cap.
2026-06-09 09:41:37 +00:00
alt-glitch
52533bea09 opentui(bench): headless memory bench proving the cap bounds Yoga-node growth
Dev bench (not a test, not in the gate suite): mounts <App> under the Solid
test renderer, pushes N streamed turns, samples RSS + mounted-renderable count
with Bun.gc. Demonstrates HERMES_TUI_MAX_MESSAGES=400 pins mounted renderables
at ~2218 vs an unbounded climb to ~55k at 10k messages (RSS flat ~350MB vs
1.3GB). Run: bun scripts/mem-bench.tsx (MEM_BENCH_TOTAL/SAMPLE tunable).
2026-06-09 08:44:30 +00:00
alt-glitch
9eb36fd697 docs: OpenTUI is the default engine on supported hosts; Ink is the fallback
Note in the README CLI section that the terminal UI defaults to the native
OpenTUI engine on Linux/macOS with Bun (provisioned by the installer), and
that the legacy Ink engine remains the automatic fallback (Windows, Termux,
no Bun) and can be selected explicitly with HERMES_TUI_ENGINE=ink. Ink is
not removed — it's the kept fallback.

No in-repo config example documents display.tui_engine (the published config
reference lives on the docs site, not the repo), so there was nothing to
annotate there.
2026-06-09 08:35:43 +00:00
alt-glitch
f8f4b3044a install: provision Bun + OpenTUI engine (best-effort, Ink fallback on failure)
Add an opt-in-safe `install_opentui` stage that provisions the native
OpenTUI TUI engine: it resolves/installs Bun (~/.bun/bin/bun) and runs
`bun install` in ui-tui-opentui-v2 so the launcher's _opentui_available()
probe (Bun + node_modules/@opentui) passes and OpenTUI becomes the default.

Strictly best-effort: skipped on Windows/Termux/Android and when the v2
package is absent; any sub-step failure (no network, Bun install fails,
`bun install` fails) logs a warning via log_warn and returns 0. The stage
never `exit`s and never returns non-zero, so it can't abort the install — a
failed/skipped setup simply leaves the user on the kept Ink fallback.

Registered after node-deps in all three drivers: the monolithic main()
flow, the run_stage_body case dispatcher (opentui-engine), and the
emit_manifest staged-installer JSON.
2026-06-09 08:35:38 +00:00
alt-glitch
0dc257d610 tui: default to the OpenTUI engine when the host can run it (Ink fallback)
Flip the default engine: with no explicit HERMES_TUI_ENGINE env / display.tui_engine
config, resolve to 'opentui' when this host is genuinely set up for it (Bun resolves +
the v2 package's entry + node_modules present + not Windows/Termux), else 'ink'. An
explicit env/config choice still wins, and 'ink' remains the universal opt-out. Hosts
without the OpenTUI setup are unaffected (stay on Ink), so nothing strands a user.

- _config_tui_engine_early() now returns None (not 'ink') when unset, so the caller
  distinguishes 'explicitly ink' from 'unset' and applies the availability-gated default.
- _bun_bin() split: _bun_bin_or_none() is the non-fatal probe; _bun_bin() still exit(1)s
  on the explicit launch path. New _opentui_available() gates the default.
- Verified the full resolution matrix (7 cases) + that the platform/availability gates hold.
2026-06-09 08:31:30 +00:00
alt-glitch
20865a2653 opentui(ts): shared envFlag parser
Extract one boolean env-flag parser (src/logic/env.ts: envFlag + the shared
TRUE_RE/FALSE_RE) instead of per-file regexes. Rewire entry/main.tsx
(HERMES_TUI_FAKE → envFlag(…, false); HERMES_TUI_MOUSE → envFlag(…, true)) and
logic/theme.ts (detectLightMode's HERMES_TUI_LIGHT tri-state now uses the
shared regexes; the lowercased-input + /i regex is behaviorally identical to
the prior lowercased-input + non-/i regex). Semantics are byte-identical.
Adds src/test/env.test.ts (true/false/unset/garbage→fallback).
2026-06-09 08:26:43 +00:00
alt-glitch
da07e67efd opentui(ts): collapse prompt accessors into a generic narrow()
Replace the ~5 near-identical `as*()` accessors in
view/prompts/promptOverlay.tsx (one per ActivePrompt kind) with one generic
`narrow(kind)` helper that narrows the discriminated union via a typed type
guard (`p is Extract<ActivePrompt, { kind: K }>`) — no `as`. Each <Match>
branch keeps its precise typed payload. Behavior is identical.
2026-06-09 08:26:36 +00:00
alt-glitch
b36001940a opentui(ts): deferClose helper for overlay-close defers
Extract the repeated `setTimeout(() => …close…, 0)` overlay/prompt-close
pattern into a single `deferClose(fn)` helper (src/logic/defer.ts) so the
"why deferred" rationale (let the closing keystroke finish dispatching before
the composer remounts/refocuses) lives in one place.

Rewires the 5 close-defer sites: closePager/closeDashboard/closeSwitcher/
closePicker in view/App.tsx and clearSoon in view/prompts/promptOverlay.tsx.
Timing is unchanged (0ms). Other setTimeout uses (quit window, flashHint,
scroll re-anchor, resize debounce, transport) are NOT close-defers and are
left untouched.
2026-06-09 08:26:24 +00:00
alt-glitch
f4d944c49c opentui(ts): enforce no-unsafe-* + require-await as errors (prod .ts), exempt JSX views + tests
Production boundary/logic .ts is clean of the no-unsafe-* family (gateway
payloads are Schema-decoded), so promote it from warn to error. *.tsx is
exempted: @opentui/solid's JSX namespace types every component return as
error/unknown — a framework limitation, not unsafe app code. Test helpers/
mocks (loose fixtures + async signatures) are exempted too. Remaining warns
are no-unnecessary-condition: intentional defensive guards on untrusted
runtime/gateway data that TS's narrowing can't model.
2026-06-09 08:21:13 +00:00
alt-glitch
e4652b99e2 opentui(ts): decode SessionInfo + Catalog via Schema (drop the as-casts)
Replace the two ad-hoc as-cast loose readers in src/logic/store.ts with
effect Schema decode-at-boundary. New src/boundary/schema/SessionInfo.ts
defines SessionInfoPatchSchema + CatalogSchema (decodeUnknownOption),
mirroring GatewayEvent.ts. readInfoPatch + setCatalog now decode once and
build the typed patch/Catalog from the result (Option.none → empty
patch / catalog unset, never crashes). Wire field names verified against
tui_gateway/server.py. Removed the now-dead readOptBool helper. Tests
extended for nested-usage vs top-level context fallback, malformed/partial
payloads, and a garbage catalog.
2026-06-09 08:17:47 +00:00
alt-glitch
6a73b09d15 opentui(ts): rotate the NDJSON log file (bounded disk use)
The ring buffer is bounded (2000) but the NDJSON file was append-only and grew
forever. Add size-based rotation mirroring opencode's keep-N model: track bytes
written in-process (seeded from statSync on open, so we avoid a statSync on every
write) and, when the next line would cross LOG_MAX_BYTES (5 MiB), shift
.log -> .log.1 -> ... -> .log.5 (LOG_KEEP=5, oldest dropped) and resume on a
fresh file. Rotation is best-effort and fully try/catch-wrapped: any fs failure
leaves us appending to the existing file rather than crashing logging. Adds a
temp-dir rotation test (seeds >5 MiB to force a rotation on next write).
2026-06-09 08:11:01 +00:00
alt-glitch
af82979d43 opentui(ts): safe-stringify log payloads (circular/BigInt-proof)
A caller-supplied `data` with a circular reference or BigInt makes plain
JSON.stringify throw inside the file-write catch, flipping `fileBroken` and
killing ALL file logging for the session. Add `safeStringify` (WeakSet circular
guard, BigInt -> `${n}n`, wrapped to never throw) and use it for entry
serialization, so a bad payload degrades to a placeholder instead of breaking
the sink. Also model LogLevel schema-first via Schema.Literals + inferred type
(matches boundary/schema/GatewayEvent.ts), and add focused safeStringify +
poison-payload tests.
2026-06-09 08:10:20 +00:00
alt-glitch
3d87abcf1c opentui(ts): enforce prettier in the gate
Add a [1/4] format step to scripts/check.sh running
`bunx prettier --check src` (matching how the script invokes the other
tools), renumbering the existing steps to 2-4. Future formatting drift
now fails the gate.

The unused-imports/no-unused-vars warn → error promotion shipped in the
no-non-null-assertion commit (where the eslint rule changes live).
2026-06-09 08:03:38 +00:00
alt-glitch
4cb9aa6664 opentui(ts): normalize formatting with prettier
Run `prettier --write src` over ui-tui-opentui-v2 to normalize formatting
to the repo .prettierrc (no semicolons, single quotes, width 120,
arrowParens avoid, trailingComma none). This worktree had pre-existing
prettier-version divergences across 19 files; normalizing is correct.
No behavior changes — formatting only. The gate (type-check → lint →
bun test) stays green.
2026-06-09 08:03:08 +00:00
alt-glitch
bc79644f16 opentui(ts): no-non-null-assertion + noUnusedLocals + noImplicitReturns
Promote strictness in ui-tui-opentui-v2:
- eslint: @typescript-eslint/no-non-null-assertion: error (with a test
  override block keeping `!` in *.test.ts/tsx fixtures), and promote
  unused-imports/no-unused-vars warn → error.
- tsconfig: add noUnusedLocals + noImplicitReturns.

Remove all 24 production `!` non-null assertions by replacing each with
a real guard / default / early-return, preserving rendered behavior:
- gateway/client.ts: read the pending entry once and guard (vs has()+get()!).
- logic/theme.ts: guard the parseHex regex match; `?? 0` on the always-
  in-bounds XTERM_6_LEVELS lookups; restructure backgroundLuminance to
  branch into a typed tuple and use charAt() for the 3-digit hex expand.
- view/homeHint.tsx: use Solid's <Show>{value => …} callback form to
  narrow info().model / info().cwd instead of `!`.
- view/reasoningPart.tsx: guard the regex match before slicing m[0].
- view/statusBar.tsx: read model/cwd/pct into locals + guard; `?? 0` on
  the showBar()-guarded context-bar percentage.
- view/toolPart.tsx: guard the single-arg entry; `?? 0` on the
  duration-guarded fmtDuration.
2026-06-09 08:02:25 +00:00
alt-glitch
e36b2d1519 opentui(ts): type-aware eslint (projectService + recommendedTypeChecked); defer cast-family to warn
Enable type-aware linting in ui-tui-opentui-v2: add projectService +
tsconfigRootDir to the TS files block and switch the preset to
recommendedTypeChecked. Turn ON as ERROR the high-value promise rules
(no-floating-promises, no-misused-promises, await-thenable) and fix the
3 real floating-promise sites in the gateway client (FileSink
write/flush/end are fire-and-forget on a piped child stdin — marked
with explicit `void`).

Defer the cast/unknown family + the noisy type-checked rules to 'warn'
(gate stays green; eslint exits 0 on warnings) for Phase 2, which will
replace the `as`/`unknown` boundary casts with Schema decoding:
no-unsafe-{assignment,member-access,argument,return,call},
no-unnecessary-condition, no-base-to-string, restrict-template-
expressions, no-unnecessary-type-assertion, require-await.
2026-06-09 07:58:50 +00:00
alt-glitch
fe15a9bb00 tui(opentui): preflight node_modules before spawning the Bun engine
Bun runs the TS entry directly (no build step), so a missing `bun install`
otherwise surfaces as a cryptic '@opentui' resolve crash + blank UI. Fail
loudly with the fix instead. Part of the gateway build/run hardening.
2026-06-09 07:54:18 +00:00
alt-glitch
af577a4c5a opentui(harden): startup-readiness timeout + stderr-tail diagnostic
Arm a startup watchdog after spawning the gateway child: if the unsolicited
gateway.ready handshake never arrives within HERMES_TUI_STARTUP_TIMEOUT_MS
(floor 2s, default 20s), emit a gateway.start_timeout event so the store can
surface a failure line + the captured stderr tail instead of a silent blank UI.
Cleared on ready (dispatch), on stop(); re-arms per recovery respawn.
2026-06-09 07:53:23 +00:00
alt-glitch
9e81be7228 opentui(harden): configurable RPC timeout
Read HERMES_TUI_RPC_TIMEOUT_MS for the JSON-RPC request timeout (floor 5s,
default 120s) — Ink parity, env-tunable for slow handlers.
2026-06-09 07:52:40 +00:00
alt-glitch
28a2f95631 opentui(harden): clear the recovering status once the gateway is ready again 2026-06-09 07:49:46 +00:00
alt-glitch
07fcb3282c opentui(harden): auto-heal — restart + resume on gateway crash 2026-06-09 07:45:58 +00:00
alt-glitch
41a5bbf3e8 opentui(harden): gateway recovery policy (count-cap + exp backoff) 2026-06-09 07:39:36 +00:00
alt-glitch
84b77f68e5 opentui(harden): surface gateway exit/recovery + transport errors to the UI 2026-06-09 07:39:31 +00:00
alt-glitch
c29402d731 opentui(harden): rolling message cap bounds the Yoga node high-water mark 2026-06-09 07:31:14 +00:00
alt-glitch
76e9271dce opentui(copy): theme the selection highlight
Apply the existing theme selectionBg token to the plain <text> content
renderables (TextBufferRenderable supports selectionBg/selectionFg) so a
selection draws a clean solid bar that PRESERVES the text fg (no selectionFg →
no SGR-inverse fragmenting). Applied to:
- messageLine: the flat settled/user/system message text.
- toolPart: the args value lines + the output body lines.
Limitation: assistant answers rendered via the native <markdown> renderable
(MarkdownRenderable extends Renderable, not TextBufferRenderable, and
MarkdownOptions has no selectionBg/selectionFg) cannot take the themed highlight
— they fall back to the renderer's default selection style.
2026-06-09 07:19:39 +00:00
alt-glitch
60c5a82c85 opentui(copy): mask chrome/gutters so free-form copy is clean
Audit selectable masking (free-code noSelect model) so a free-form drag over an
agent turn yields CLEAN pasteable content — no labels, summaries, carets, or
annotations. Newly masked (selectable={false}):
- toolPart: the whole collapsed header row (name + args-preview + duration +
  "(N lines)") summary; the "args"/"output" section labels; the args overflow
  "… +N more"; the "… omitted N" / "… +N more lines" truncation notes.
- messageLine: the streaming caret (▍) — a cursor glyph, not content.
- reasoningPart: the collapsible-section header label (Thinking/Thought + title).
- composer: the completion dropdown rows + the "Tab complete · Esc dismiss" hint.
Kept selectable (real content): assistant markdown, tool args values + output
body, user/system message text.
2026-06-09 07:18:48 +00:00
alt-glitch
eb4821127c opentui(copy): copy-on-select (auto-copy on selection finish)
Subscribe to the renderer's "selection" event (fires once when a free-form
mouse selection completes) and auto-copy the spanned selectable text via the
existing onCopySelection callback. Unlike the Ctrl+C path, this does NOT
clearSelection() — the highlight persists so the user sees what was copied and
Ctrl+C still works. writeClipboard is idempotent so both paths are harmless.
2026-06-09 07:14:09 +00:00
alt-glitch
028bd89959 opentui(copy): /copy [n] copies the agent response 2026-06-09 07:09:10 +00:00
alt-glitch
0f53d67ee4 opentui(copy): assistant-text extraction helpers 2026-06-09 07:09:07 +00:00
alt-glitch
aa5489e804 opentui(harden): fix 3 triaged findings (timer leak, tool-match scope, complete-only)
Subagent hardening pass over boundary/logic/view, findings triaged (most were
false positives or app-lifetime-moot). The 3 genuine fixes:
- liveGateway.stop() now clears the pending 16ms coalesce timer before
  client.stop() — a queued flush() could otherwise fire batch()/handlers into a
  torn-down store after the layer scope releases.
- store.findToolPart now scans only the LIVE (last) assistant turn, not every
  message — a tool.complete pairs with a tool.start in the current turn, so this
  avoids matching a same-id tool in an older/resumed turn (and is O(parts)).
- store message.complete with text but NO prior start/delta now creates the turn
  (complete-only gateways) instead of dropping the final text; still no empty
  bubble when there's no text. +2 regression tests.

Triaged as NOT-a-bug / accepted-risk (documented so they're not relitigated):
@opentui/solid useKeyboard DOES auto-cleanup (onCleanup keyHandler.off, index.js:59);
dimensions/scrollAnchor timers are app-lifetime / try-catch-safe; unbounded-growth,
duplicate-dedup, and split-frame are theoretical for a trusted local newline-framed
subprocess; clipboard spawn timeout + atomic active-session write are minor follow-ups.
93 pass.
2026-06-09 06:35:29 +00:00
alt-glitch
2a86f039ea opentui(test): track the test harness (test/lib/) swallowed by global lib/ ignore
The Solid render-test harness (src/test/lib/render.ts + effect.ts) was never
committed — a global ~/.gitignore_global `lib/` rule silently excluded it, so the
opentui-v2 test suite wasn't reproducible from a clean checkout (render.test.tsx
imports ./lib/render). Force-add both + add a repo .gitignore negation
(!src/test/lib/). render.ts also carries the withKeymap() wrapper the keymap
migration needs (view tests mount under a KeymapProvider). 91 pass.
2026-06-09 06:25:55 +00:00
alt-glitch
79c6896153 opentui(keymap): adopt native @opentui/keymap for overlay close + confirm
@opentui/keymap@0.3.2 was installed but unused (the spec said we'd use it). Wire
it natively: createDefaultOpenTuiKeymap(renderer) + <KeymapProvider> at the render
root, and a useCloseLayer(target,onClose) helper that registers a focus-within
Esc/Ctrl+C → close layer. Migrated the close handling of sessionSwitcher, picker,
approvalPrompt (close-only), confirmPrompt (y/n via confirm/cancel commands), and
pager + agentsDashboard (close via keymap; scroll/select stay raw — not cleanly
focus-gated). Overlays gain a root ref + focus-on-mount so the focus-within layer
activates. q-close re-added to pager/dashboard (footer advertises it).

Composer history/refocus + masked prompt + the Ctrl+C quit machine stay raw by
design (need the in-flight keystroke / careful state). Test harness gains a
withKeymap() wrapper so view tests mount under a provider. 91 pass; live-verified
/sessions + /tools Esc-close and composer focus recovery after.
2026-06-09 06:24:14 +00:00
alt-glitch
46293f618c opentui(input): "Pasted text" placeholder for large pastes
Large bracketed pastes no longer flood the composer. On paste, if the text is
≥4 lines or >400 chars, insert a compact `[Pasted text #N +M lines]` chip and
hold the real content in a PasteStore; on submit, expand the chip back to the
full text before sending (free-code model). Single-pass String.replace keeps a
pasted block that itself contains a `[Pasted text #k]` literal safe.

The store is created ONCE in main.tsx and passed App→Composer (NOT per-composer)
so it survives the composer remounting on overlay open/close — a per-composer
store would lose a pending paste mid-compose. +6 unit tests (91 pass). Verified
live: paste 10 lines → chip; submit → transcript shows the full expanded code;
composer cleared.
2026-06-09 06:09:49 +00:00
alt-glitch
080440bd9c opentui(input): auto-expanding composer textbox
The composer was a fixed height:3. Match free-code/opencode: native textarea
auto-grow via direct minHeight={1} maxHeight={max(6,⌊rows/3⌋)} props (opencode's
prompt sizing) — 1 row when empty, grows with wrapped/multiline content up to ~a
third of the screen, then scrolls internally. maxHeight is a DIRECT reactive prop
(not in style) so the cap tracks terminal resize via useDimensions. 85 pass;
verified live (a long wrapping line grew the box to 3 rows).
2026-06-09 06:05:05 +00:00
alt-glitch
bd3c253420 opentui(v5b): fix first-letter duplication on always-active refocus
Typing while the textarea was unfocused doubled the FIRST letter: the always-active
handler did ta.focus() AND ta.insertText(key.sequence), but the renderer runs the
global useKeyboard handler BEFORE routing the key to the focused renderable — so
after focus() the same keystroke was also delivered to the now-focused textarea,
inserting it twice. Subsequent keys were fine (textarea already focused → block
skipped). Fix: focus() only; let the textarea insert the char it now receives.
Verified live: typing 'x' then 'y' while blurred yields '❯ xy' (no dup). 85 pass.
2026-06-09 05:36:37 +00:00
alt-glitch
82e13ed949 opentui(v5b): frame the startup panel in a themed border box
Design-judge top nit: Ink's bordered-box-around-the-session-info is the single
biggest 'designed home screen vs log output' signal, and the flat left-aligned
version lacked it. Wrap the model/dir/session block + Tools/Skills/MCP sections +
summary in a full border box (theme border token); banner+tagline stay above,
tips below. 85 pass.
2026-06-09 05:22:26 +00:00
alt-glitch
fb04e85a14 opentui(v5b/item1): Ink-parity startup banner panel
Rebuilt the home screen to match hermes --tui: the HERMES-AGENT banner + tagline,
then a session info block (model · Nous Research / dir (branch) / Session: <id>),
then SEPARATE collapsible sections — Available Tools (enabled toolsets each as
'name: tool1, tool2', capped + '(and N more toolsets…)'), Available Skills (N) in
M categories, MCP Servers (N) connected — and a '… /help for commands' summary.
Previously it was one combined '▶ N tools · M skills · K MCP' dropdown that only
listed tools and showed no model/dir/session.

- gateway startup.catalog now returns per-toolset {enabled, tools} (resolved_tools,
  session-aware enabled set — mirrors tools.list); py_compile OK.
- store Catalog gains toolset.enabled/tools; new sessionId field + setSessionId,
  set on session create/resume (alongside the active-session-file write).
- homeHint takes the store, reads info (model/cwd/branch) + sessionId + catalog.
85 pass; verified live (model·Nous·dir·session + enabled toolsets w/ tools).
2026-06-09 05:19:21 +00:00
alt-glitch
06762a0f5e opentui(v5b): visual hierarchy — color-code roles + clean turn spacing
The transcript read as one undifferentiated gold blob (user/assistant/tool all
the same color). Adopt the Ink model where color IS the hierarchy, in 3 brightness
tiers:
- USER input  → label (gold)        — the human's turn stands out.
- ASSISTANT answer → text (bright)   — the primary content, brightest.
- TOOL / REASONING → muted (dim) with an ACCENT glyph (/▶/▼ amber) that marks
  the block — clearly the secondary 'working area' below the answer.
Spacing: one blank line above every turn (was cramped: user had a blank, the
reply didn't) + the existing gap:1 between parts. Dropped the transcript's extra
marginTop (turns own their spacing now). 85 pass; verified live — gold ask, dim
tool, white answer read as three distinct things.
2026-06-09 05:13:37 +00:00
alt-glitch
92f35fab19 opentui(v5b/item5): track active session for the post-quit resume epilogue
The launcher (hermes_cli/main.py _print_tui_exit_summary) reads
HERMES_TUI_ACTIVE_SESSION_FILE to print 'Resume this session with…' on exit. The
Ink TUI writes the current session id there on every session change
(useSessionLifecycle.writeActiveSessionFile); the native engine never did, so
after a /session switch the launcher fell back to the INITIAL launch session and
showed resume info for the wrong session (the reported leak).

Now writeActiveSession() writes {session_id} on session.create AND inside
resumeInto (every /session switch), mirroring Ink. Verified live: file shows the
created session, then updates to the switched-to session. 85 pass.
2026-06-09 05:08:57 +00:00
alt-glitch
cd11ed7a04 opentui(v5b/item4): hold viewport on tool/thinking expand (no scroll jump)
The transcript scrollbox (stickyScroll+stickyStart=bottom) re-pins to the bottom
on any content-height change when the user is at the bottom (@opentui/core
ScrollBox: `if (stickyStart && !_hasManualScroll) applyStickyStart`). So expanding
a tool/thinking block scrolled the clicked header up off-screen. A
ScrollAnchorProvider (transcript owns the scrollbox ref) lets toolPart/reasoningPart
wrap their toggle so scrollTop is held constant across the height change (re-asserted
over a few frames as layout settles) — the clicked header stays put and the
expansion reveals beneath it. 85 pass.
2026-06-09 05:05:01 +00:00
alt-glitch
4407fee49f opentui(v5b/item2+3): fix streaming flicker + native markdown tables
#2 (flicker regression): my item-7 AssistantText wrapped text in
<For each={segmentMarkdown(text)}> — segmentMarkdown returns NEW objects per
delta, so <For> (keyed by reference) DISPOSED and re-created the markdown
renderable on EVERY streamed delta. Each remount re-measured from zero → content
height oscillated → the scrollbar grew/shrank (exactly the reported symptom).

Fix (deep opencode parity): render assistant text as ONE stable native
<markdown> (MarkdownRenderable) fed the growing content in place, with
internalBlockMode="top-level" — opencode's anti-flicker mode where settled
top-level blocks aren't re-rendered per delta (_stableBlockCount, managed
internally). This is opencode's TextPart verbatim (routes/session/index.tsx:1687).

#3 (table inline formatting): the native <markdown tableOptions={{style:grid}}>
renders GFM tables as a grid WITH inline bold/italic/code in cells — so the
hand-rolled segmentMarkdown + MdTable grid are deleted (obsolete). Switched from
<code filetype=markdown> to <markdown> (the former re-measured the whole buffer
each delta and never aligned tables). 85 pass; verified live (smooth stream,
boxed table, concealed **/* markers styled).
2026-06-09 04:57:57 +00:00
alt-glitch
48d0c70f61 opentui(v5/item4): coalesce resize via a shared debounced dimensions signal
Raw useTerminalDimensions fires on every SIGWINCH tick; during a drag that's a
recompute/reflow storm across every width-sensitive component (tool bodies,
tables, status bar, banner). Add a DimensionsProvider that runs the raw hook
ONCE and feeds a single leading+trailing-debounced (40ms) signal — mirroring the
gateway's 16ms event coalescing / opencode's createLeadingTrailingSignal — that
every consumer shares via useDimensions(). They now reflow together (no tearing)
and at most once per window. Falls back to the raw hook outside a provider
(headless tests). Verified: single resizes converge clean (wide banner ⇄ compact
brand at the 102-col threshold); rapid bursts coalesce. 90 pass.
2026-06-09 04:08:56 +00:00
alt-glitch
4061e635d3 opentui(v5/item9): startup HERMES banner + collapsible tools/skills/MCP panel
Home screen now shows the canonical HERMES-AGENT block logo (hermes_cli/banner.py,
gold->amber->bronze via primary/accent/border tokens; width-guarded to a compact
brand line under 102 cols) plus a collapsible '▶ N tools · M skills · K MCP' panel
that expands to per-toolset / per-category / per-server detail.

Data comes from a new opt-in gateway RPC 'startup.catalog' (aggregates
get_all_toolsets + banner.get_available_skills + config mcp_servers); the native
engine fetches it best-effort on session start (Effect.catchCause swallows it on
old gateways). Opt-in => Ink path untouched. py_compile OK. Store gains a typed
Catalog + defensive setCatalog mapper. +2 tests (90 pass); verified live
(1185 tools / 196 skills / 2 MCP, expand shows the full lists).
2026-06-09 04:04:43 +00:00
alt-glitch
741a4c23ca opentui(v5/item8): design polish — header chrome, status segments, ANSI strip
Visual-hierarchy pass (design-reviewed against free-code/opencode):
- header: brand glyph in accent + name in primary/bold + a bottom rule, so it
  reads as chrome and bookends the transcript with the status bar's top rule
  (fixes 'nothing differentiates the header from the text stream').
- status bar: a dim │ divider segments model·effort from the context meter.
- user/assistant turn glyphs bold + the user ❯ in accent so turns are scannable.
- reasoning 'Thought' label uses label (not warn) so it matches tool headers —
  warn is reserved for warnings; reasoning/tool now read as one aside family.
- home screen: brand in primary/bold, command names in accent vs muted descs,
  wider column.
- FIX (load-bearing): strip ANSI/SGR escape sequences from slash/notice text
  (pushSystem + openPager) — the gateway colors them for Ink, which interprets
  them; the native <text> rendered them as literal  glyphs. +stripAnsi
  + 3 tests. All tokens themed (no hardcoded colors). 88 pass.
2026-06-09 03:56:44 +00:00
alt-glitch
c6e72a8454 opentui(v5/item7): render GFM markdown tables as aligned grids
The native <code filetype=markdown> colorizes pipes but never aligns tables.
Add a pure segmenter (segmentMarkdown) that splits assistant text into prose
runs (native renderable) and GFM table blocks, plus an MdTable grid renderer:
per-column widths (free-code's stringWidth+padAligned), :--- / :--: / ---:
alignment, bold header, dim │ separators, a ┼ header rule, width-aware column
shrink on resize. Incomplete tables (no separator yet, e.g. mid-stream) stay
prose until they close. +5 unit tests (85 pass); verified live with a 3-col table.
2026-06-09 03:48:02 +00:00
alt-glitch
180fe665cb opentui(v5/item5): stabilize inter-part spacing (kill streaming jitter)
Blank lines between reasoning/tool/text grew and shrank mid-stream because
spacing was ad-hoc: tools carried marginTop:1, text/reasoning none, and the
markdown text part rendered the model's leading/trailing newlines as transient
blank lines that filled in as deltas arrived.

Now the parts column owns ALL spacing via gap:1 (uniform 1 line between any two
parts regardless of type/order), per-part marginTop is dropped, and text parts
are stripped of leading/trailing blank lines so the gap is the sole source —
no double gaps, no popping. Verified live: Thought/tool/tool/answer all spaced
by exactly one line. (80 pass.)
2026-06-09 03:43:43 +00:00
alt-glitch
6a24249e7c opentui(v5/item6): collapsible thinking traces
Reasoning rendered as an always-expanded plain muted blob. Now it's a proper
collapsible part (opencode ReasoningPart): auto-EXPANDED while the turn streams
(watch it think), then collapses to a one-line `▶ Thought: <title>` when settled;
click toggles. Title is the model's leading `**bold**` line (reasoningSummary).
Body renders as DIM markdown in a left-`│`-border block (Markdown gained an
optional `fg` so reasoning is muted vs the answer). +1 render test (80 pass).
2026-06-09 03:37:42 +00:00
alt-glitch
ee211d087b opentui(v5/item1): resumed tools render like live (collapsible + output)
Resumed tool calls were flat `name arg` rows — no output, not collapsible —
because the resume snapshot (_history_to_messages) dropped each tool's result.
Now the native engine passes `with_tool_output: true` on session.resume and the
gateway folds the tool's redacted+capped result + args into its row, so resumed
turns show `▶ name arg (N lines)` collapsible blocks identical to a live turn.

The flag is OPT-IN: Ink doesn't pass it, so _history_to_messages stays byte-for-
byte unchanged for the Ink path (its expanded verbose-trail render OOM'd on big
output, #34095; the native engine renders tools collapsed, so the capped tail is
safe there). resume.ts maps context→argsPreview, result_text→resultText (label
peeled + envelope stripped), args→argsText — same shape as the live tool part.

py_compile OK. resume tests updated + 1 added (79 pass).
2026-06-09 03:32:52 +00:00
alt-glitch
aec752faa3 opentui(v5/item2): surface tool-call args + de-pad output
The gateway already ships per-tool arg metadata the client was discarding:
`context` (build_tool_preview's primary-arg line, always sent), `args` (full
dict on complete), `args_text` (redacted JSON, verbose), `duration_s`. Capture
them on the tool part and render free-code style:

- collapsed header: `▶ name <arg-preview> · <duration> (N lines)` — args are
  finally visible without expanding (the core item-2 complaint).
- expanded: a single left-bordered (`│`) column with a key:value args block
  (suppressed when the lone arg is already the header preview — judge nit) then
  the output block.
- strip the gateway's `[showing verbose tail; omitted N chars]` banner into a
  tidy `… omitted N chars` note; unwrap tail-capped `{"output":…}` envelope
  fragments so the last line isn't a dangling JSON tail.

Left bar is a border glyph (opencode BlockTool style), not a bg fill — cleaner
and renders faithfully. +4 unit tests, +1 render test (78 pass).
2026-06-09 03:24:15 +00:00
alt-glitch
c9540570ae opentui(v5/item3): composer flush to bottom — drop root paddingBottom
The root box used padding:1 (all edges), reserving a blank row BELOW the
status-bar+composer block. Switch to paddingTop/Left/Right only so the input
hugs the last terminal row. Transcript stays flexGrow:1 minHeight:0; the
bottom block is the flexShrink:0 last child. StatusLine already renders
zero-height when idle, so no other change is needed.
2026-06-09 03:00:16 +00:00
alt-glitch
6e3915fbc1 opentui(v2): home hint (item 12) + verify /goal + expand feature matrix (item 8)
Item 12 — the missing helper/home screen: view/homeHint.tsx renders on an empty
transcript (Ink helpHint.tsx parity) — brand line, common commands (/help /model
/sessions /skills /agents /clear), and input tips (type · ↑↓ history · @file ·
Ctrl+C). Decorative → selectable={false}. Replaced by the transcript on the first
turn.

Item 8 — /goal verified live: slash.exec rejects it (pending-input) → dispatch
falls to command.dispatch {name:'goal'} → {type:'send', notice:'⊙ Goal set…',
message} → notice shown + the goal turn submitted (handleDispatchResult). Wired.

Docs: opentui-feature-map.md gains the full 15-item live-feedback parity matrix
(Ink/opencode primitive · v2 file · status); opentui-smoke.md gains the 15-item
run log. All 15 items  (image-paste wired but unverified in the clipboard-less
CI env).

Tests: home-hint render. 72 pass. Live-smoked: empty launch shows the home hint.
2026-06-08 18:26:13 +00:00
alt-glitch
c3d2d87a74 opentui(v2): clipboard copy/paste, image paste, glyph-free selection (items 1, 4)
Item 1 — copy/paste:
- boundary/clipboard.ts (ported/trimmed from opencode): writeClipboard = OSC 52
  (SSH/tmux-safe) + a native command (pbcopy/wl-copy/xclip/xsel/clip);
  readClipboardImage = clipboard PNG via wl-paste/xclip/pngpaste/powershell.
- Ctrl+C copies a live MOUSE SELECTION (renderer.getSelection) before the
  interrupt/quit machine runs (opencode's selection-key precedence), with a
  "Copied to clipboard" hint; falls through to interrupt/quit when there's no
  selection.
- text paste inserts natively (textarea handlePaste); the composer's onPaste only
  intercepts an EMPTY bracketed paste (image-only clipboard) → readClipboardImage
  → image.attach_bytes (the next prompt.submit picks it up).

Item 4 — mouse selection now ignores decorative glyphs: selectable={false} on the
message/tool gutter glyphs and all chrome (header, status bar, status line,
composer prompt glyph), so a drag copies the message text, not ❯/⚕/▶/.

Live-smoked (this env has no clipboard tools/DISPLAY, so native copy + image read
can't be confirmed here, but): drag-select + Ctrl+C → "Copied to clipboard" (not
quit); no-selection Ctrl+C still arms quit; bracketed text paste lands in the
composer. 71 pass.
2026-06-08 18:20:59 +00:00
alt-glitch
f423aebb80 opentui(v2): fix streaming caret alignment during model response (item 10)
A just-started assistant turn (message.start, no deltas yet) rendered an EMPTY
fallback <text> on the glyph's line plus the `▍` caret on a SEPARATE line below —
so `⚕` sat alone with the caret dangling beneath it, indented. Folded the caret
into the no-parts fallback so it renders inline with the glyph (` ⚕ ▍`); a settled
row still shows its flat text, a turn with parts renders the parts. 71 pass.

Live-smoked: streaming start now shows `⚕ ▍` on one line; the reply text then
aligns with the glyph.
2026-06-08 18:13:04 +00:00
alt-glitch
37b74f4df3 opentui(v2): live agent trace + /tools navigable overlay (items 9, 15)
Item 15 — "/agents doesn't let me look into an agent trace live":
- store accumulates a concise per-subagent trace from the subagent.* stream
  (▶ start /  tool — preview / progress text / ✓ summary), capped at 200 lines;
  thinking deltas update a transient `thought` (not appended — they'd flood).
- AgentsDashboard is now master-detail: ↑/↓ select a subagent (▸ + accent), and
  the bottom pane shows the selected agent's goal · status · model, its latest
  thought, and a sticky-bottom (live) trace scrollbox. PgUp/PgDn scroll the trace.

Item 9 — /tools wired to a deliberate navigable overlay (fetch the roster via
slash.exec → pager) instead of incidental fallthrough; /skills already opens the
native picker.

Tests: store trace accumulation + dashboard render (trace line + footer). 71 pass.

Live-smoked: /tools → tool roster pager; /skills → picker; a real delegation
(spawn a subagent → reply PURPLE) → /agents showed the subagent with its goal ·
completed · model, 🧠 PURPLE thought, and ▶/✓ trace lines.
2026-06-08 18:09:32 +00:00
alt-glitch
247604cdde opentui(v2): collapsible tools + composer glyph, drop the blue tint (items 3, 7)
Item 7 — tools were non-collapsible and "ugly-interlaced":
- ToolPart now renders COLLAPSED by default as one line: `▶ name  summary  (N
  lines)` (summary = explicit summary / first output line / error). A ▶/▼ glyph
  marks expandable tools; clicking the header toggles a left-bar block of the
  full (capped) output. Running tools show `name …`; single-line/erroring tools
  render inline. Compact by default → far less interlacing clutter.
- toolOutput.normalizeOutput: un-double-escapes literal \n/\t when they dominate
  over real newlines (some gateway tool tails are repr'd, so newlines arrived as
  backslash-n and rendered as one ugly line). Conservative — genuine multi-line
  output and legit `\n`-in-code are left alone. Applied in stripToolEnvelope.

Item 3 — the input "blue tint": dropped the textarea's blue focusedBackgroundColor
and added a `❯` prompt glyph. The composer is now distinguished by structure (the
glyph + the status-bar rule above it), not a background tint.

Tests: normalizeOutput (dominant-literal vs genuine-multiline). 70 pass.

Live-smoked: `ls -la` tool → collapsed `▶ terminal  total 3460  (N lines)`;
SGR-click → `▼` + clean per-line output; composer shows `❯` with no blue tint.
2026-06-08 18:04:08 +00:00
alt-glitch
2bb61a7d09 opentui(v2): slash-arg autocomplete + file/@-mention completion (items 5, 13)
onType used to fire complete.slash only for an argless `/command`, and Tab
replaced the whole line. Now:

- planCompletion(text) (pure, in slash.ts) routes: a `/command [args]` line →
  complete.slash (the gateway completes names AND args, e.g. /details section
  names); a trailing path-like word (@…, ~/…, ./…, /…, or anything with /) →
  complete.path for file/dir tagging; else nothing.
- the accepted item splices ONLY its token: store tracks completionFrom (gateway
  replace_from via readReplaceFrom, or the path-token start), and the composer's
  Tab handler keeps the text before `from` and appends the candidate.

Tests: planCompletion (slash/path/prose/multiline) + readReplaceFrom. 69 pass.

Live-smoked: `/details ` → section dropdown (hidden/collapsed/.../activity), Tab
→ `/details hidden` (arg-only splice); `tui_gateway/` → its .py files;
`@hermes_cli/m` → m-prefixed files.
2026-06-08 17:57:07 +00:00
alt-glitch
1ecec7a9bc opentui(v2): prompt history — Up/Down cycling, per-directory scope (item 6)
New logic/history.ts: createPromptHistory (pure cursor cycling — Up walks older,
Down walks newer back to the stashed draft, push dedupes a consecutive duplicate
+ resets) plus best-effort per-dir JSONL persistence under
$HERMES_HOME/tui-history/<sha1(cwd)>.jsonl (one JSON-encoded prompt per line,
multiline-safe).

Scoping matches the ask: prior prompts from the SAME launch dir are loaded on
start (recallable across relaunches), but a different dir keeps its own list — no
cross-dir/cross-session bleed.

Composer: Up at the first line → older prompt; Down at the last line → newer/draft
(at the boundary the textarea's own up/down is a no-op, so no conflict; mid-buffer
it still moves the cursor). setText + cursor-to-end on recall; any edit resets the
recall cursor. submit() pushes the prompt. Threaded entry → App → Composer; cwd =
process.cwd() (the launch dir under the real launcher).

Tests: 5 pure cursor-cycling cases. Live-smoked: seeded a dir file → Up/Up/Down
cycled two→one→two; a freshly submitted prompt was recalled via Up. 65 pass.
2026-06-08 17:52:38 +00:00
alt-glitch
325350d192 opentui(v2): always-active input — typing reclaims the composer (item 2)
The textarea focuses on mount and when an overlay closes (remount), but focus
could drift to the transcript scrollbox on a mouse-scroll, dropping keystrokes.
Now (opencode's keep-the-prompt-focused idea, adapted):
- onMouseDown → focus the textarea (click-to-focus).
- a global keystroke net: a PRINTABLE, unmodified key while the textarea is
  unfocused reclaims focus AND recovers the char (the in-flight event went to
  the global handler, not the unfocused textarea, so insert it). Nav/scroll keys
  (arrows/page/home/end/…) are deliberately left alone so keyboard transcript
  scroll still works; kitty `release` events are skipped to avoid double-insert.
Completion accept/dismiss handler folded into the same useKeyboard with early
returns.

Live-smoked: type → text lands; `/` → completions; Esc → dismiss; type again →
lands; clean quit. 60 pass.
2026-06-08 17:46:37 +00:00
alt-glitch
1be5bd92fa opentui(v2): Ctrl-C stops the agent; second press (debounced) quits
Item 11 — "stopping the agent doesn't work". Ctrl+C used to immediately destroy
the renderer. Now a turn-aware state machine (opencode's double-press model, the
user's preferred behaviour):

- While a turn runs (store.info.running): first Ctrl+C → session.interrupt
  {session_id} (STOP the agent), and arms a 3s quit window with a warn hint
  "⏹ stopped — Ctrl+C again to quit".
- Idle: first Ctrl+C arms the window ("Ctrl+C again to quit"); a stray single
  press never nukes the session.
- A second Ctrl+C within the window KILLS the TUI (renderer.destroy → clean
  scope teardown → gateway child EOF).
- A blocking prompt still owns Ctrl+C (deny/cancel) — unchanged.

Wiring: renderer.ts gains an `onCtrlC` hook (owns Ctrl+C when not blocked);
entry builds the machine (gateway yielded before the renderer so it can read
`running` + send interrupt). store gains a transient `hint` slice; StatusLine
shows hint (warn, priority) or the busy face (dim).

Live-smoked: long turn → Ctrl+C shows "stopped" + idle dot; second press exits
cleanly with no orphaned gateway child (the user's installed-venv sessions
untouched). 60 pass.
2026-06-08 17:42:21 +00:00
alt-glitch
93793b6af5 opentui(v2): status bar (status·model·effort·context·dir) above composer
Item 14: a persistent bottom-chrome status bar, ported from Ink's appChrome
StatusRule. Sourced from the session.info event (model / reasoning_effort /
fast / cwd / branch / running / usage.context_*) which was decoded but dropped
until now; also folded session.create/resume result.info and message.complete
usage into a new store `info` slice.

- store: SessionInfo slice + applyInfo(); session.info handler; message.start/
  complete flip `running` (the flag the Ctrl-C interrupt will read); refresh
  usage on complete.
- schema: MessageComplete.payload gains loose `usage` so it survives decode.
- view/statusBar.tsx: width-aware (Ink progressive disclosure) — context bar
  drops on narrow terminals, cwd compacts to last two segments + left-truncates
  so the row never wraps. Turn/connection dot ◐/●/○.
- App: status bar sits ABOVE the composer; a top-edge rule (border:['top'])
  visually separates the status bar + textbox input region from the transcript.
- tests: store info slice (3) + headless status-bar render (1); bumped the
  approval-prompt capture height for the taller input region. 59 pass.

Live-smoked: bar shows model·effort·context%·dir; context updates 0→4% across a
turn; running dot flips; separator divides input region from transcript.
2026-06-08 17:38:10 +00:00
alt-glitch
26f6929eb4 fix(opentui-v2): route thinking-faces to a transient status line (not the transcript)
Live-usage issue 3/5: the kaomoji faces ("(¬_¬) processing…") lingered in the
transcript. Traced (instrumented capture): they arrive via `thinking.delta` —
Hermes's transient kaomoji busy *indicator* (_INDICATOR_DEFAULT=kaomoji), which I
was rendering as a persistent reasoning part.

- store: new transient `status` field. thinking.delta / status.update → `status`
  (not a part); message.start + message.complete clear it. Only the real
  `reasoning.delta` still becomes a (dim) transcript part.
- view/statusLine.tsx: a dim busy line above the composer shown while `status` is
  set (Ink's FaceTicker analog), rendering nothing when idle; wired into App
  between the transcript and the input zone.

Verified: bun run check green (55 tests / 7 files) — store tests assert
thinking.delta → status (no transcript part) + cleared on complete; status.update
→ status. Live tmux: a turn showed "٩(๑❛ᴗ❛๑)۶ cogitating…" on the transient status
line (cleared on completion) with NO face left in the transcript.
2026-06-08 16:54:07 +00:00
alt-glitch
808ef152e5 fix(opentui-v2): live UX — enable mouse + smooth streaming markdown (opencode parity)
From live-usage feedback (driving the real TUI):

- Mouse ON by default (opencode parity; HERMES_TUI_MOUSE=0 opts out). Was hardcoded
  off, which is why transcript wheel-scroll, scrollbar drag, and click-to-expand
  tools didn't work and the terminal's native region-select polluted copy. With
  useMouse the scrollbox handles the wheel + scrollbar and tools are click-expandable;
  selection becomes OpenTUI's text-aware select. (Mouse can't be driven via tmux
  send-keys — verify wheel/drag/click interactively.)
- Streaming markdown: match opencode's v2 text path —
  <code filetype="markdown" streaming drawUnstyledText={false}>. The previous
  drawUnstyledText:true drew raw text then overlaid styling each delta (a flash);
  false avoids that and re-tokenizes incrementally for smoother streaming. (The
  native renderable's tree-sitter doesn't settle in the headless test renderer with
  drawUnstyledText:false, so the two markdown frame tests now assert the assistant
  text via the store — paint is verified in the live smoke; render.ts also settles
  to waitForVisualIdle.)

Verified: bun run check green (53 tests / 7 files). Live mouse + streaming
smoothness for glitch to confirm. Part of the live-feedback polish goal.
2026-06-08 16:48:23 +00:00
alt-glitch
e7d7e0157f feat(opentui-v2): Phase 8 — launcher cutover to the v4 Solid engine
Repoint hermes_cli/main.py `_make_opentui_argv` from the superseded React entry
to the v4 Solid + Effect-at-boundary entry: it now prefers
`ui-tui-opentui-v2/src/entry/main.tsx` (cwd ui-tui-opentui-v2) and falls back to
`ui-tui-opentui/src/entry.real.tsx` only if the v2 package is absent (graceful
during coexistence). The engine gate (_resolve_tui_engine: HERMES_TUI_ENGINE /
display.tui_engine → opentui; Windows/Termux → Ink fallback) and the dual-engine
dispatch in _make_tui_argv are unchanged; Ink (ui-tui/) is untouched. The spawned
tui_gateway's source-root default lands on PROJECT_ROOT (package at
<root>/ui-tui-opentui-v2), so it loads Python from the same checkout, no extra env.

So `HERMES_TUI_ENGINE=opentui hermes --tui` now launches the v4 engine — the exact
`bun …/v2/src/entry/main.tsx` invocation live-smoked across P1–P5e, making every
first-class surface reachable from the real CLI.

Also: a consolidated 3-way acceptance summary (Ink ↔ opencode ↔ build) at the top
of opentui-feature-map.md covering all 7 first-class surfaces + the foundation +
the launcher, each  + tested + smoked.

Verified: py_compile main.py OK (dev-skill rule for the 4k-line file); imported
the worktree CLI with HERMES_TUI_ENGINE=opentui → _resolve_tui_engine()='opentui',
_make_opentui_argv() → [bun, …/ui-tui-opentui-v2/src/entry/main.tsx] (cwd
ui-tui-opentui-v2, --watch in dev). v2 `bun run check` green (53 tests / 7 files).
Smoke P8 + matrix updated. Remaining: header chrome detail (5b), agent-feature
trail (5d), distribution (§10) — polish, not first-class blockers.
2026-06-08 16:27:21 +00:00
alt-glitch
edc4164704 feat(opentui-v2): Phase 5e — agents dashboard (7th first-class surface; ALL done)
The agents dashboard (spec §2b; Ink agentsOverlay) — the last first-class
interactive surface. Subagent delegations are tracked from the `subagent.*`
event stream and shown in a full-height overlay.

- store: subagents[] built from subagent.{spawn_requested,start,thinking,tool,
  progress,complete} by subagent_id (status·goal·model·depth·lastTool·summary);
  clearTranscript clears them. dashboard flag + openDashboard/closeDashboard.
- view/overlays/agentsDashboard.tsx: full-height overlay (replaces transcript+
  composer), depth-indented subagent rows colored by status, scroll via
  scrollBy/scrollTo, Esc/q close. Empty state prompts to delegate.
- view/App.tsx: content zone is now a <Switch> — pager / agents dashboard /
  (transcript + input zone).
- logic/slash.ts: /agents, /tasks → openDashboard (SlashContext.openDashboard).

Verified: bun run check green (53 tests / 7 files) — subagent reducer + a
dashboard frame test (seeded tree renders, transcript replaced) + /agents
dispatch. LIVE tmux: /agents opened empty; then a REAL delegation spawned a
subagent → /agents showed "⛓ Agents · 1 subagent · ● completed <goal>
(model) terminal". ALL 7 first-class surfaces are now +tested+smoked
(blocking prompts, pager, session switcher, model picker, skills hub,
completions, agents dashboard). Smoke P5e + matrix updated. Remaining: chrome
(5b), agent-feature polish (5d), launcher (8).
2026-06-08 16:23:17 +00:00
alt-glitch
7412cd5c78 feat(opentui-v2): Phase 5a — slash completions dropdown (last first-class overlay)
A live slash-completion dropdown renders above the composer as you type `/…`
(spec §1 autocomplete) — the 6th and final first-class overlay surface.

- view/composer.tsx: onContentChange → onType (reads ta.plainText); a dropdown
  of candidates (display + meta) renders above the textarea when completions are
  set. The textarea owns key input (live refine-by-typing), so Tab accepts the
  top match (ta.clear()+insertText) and Esc dismisses; arrow-nav would fight the
  cursor (noted polish).
- store: completions state + setCompletions/clearCompletions; CompletionItem.
- logic/slash.ts: mapCompletions(complete.slash result) → candidates.
- entry: onType queries complete.slash for `/word` (no space) and sets/clears the
  store completions; cleared on submit / non-slash / space.

Verified: bun run check green (49 tests / 7 files) — mapCompletions + a
composer-dropdown frame test. LIVE tmux: typing `/comp` showed /compress,
/composio, /compact (with descriptions); Tab accepted the top + cleared the
dropdown. ALL 6 first-class overlays are now +tested+smoked (blocking prompts,
pager, session switcher, model picker, skills hub, completions). Smoke P5a +
matrix updated. Remaining: chrome (5b), agent features (5d), agents dashboard (5e).
2026-06-08 16:15:38 +00:00
alt-glitch
3f54152191 feat(opentui-v2): Phase 5c — model picker + skills hub (generic Picker overlay)
A reusable generic picker (titled <select> + onPick) powers two more first-class
overlays (spec §2b):

- view/overlays/picker.tsx + store picker/openPicker/closePicker + PickerItem.
- /model: bare → model.options → a picker of authenticated providers' models
  (current marked ✓), pick switches via `slash.exec model <name>`; `/model <name>`
  switches directly without the picker.
- /skills: skills.manage {action:list} → a picker flattened from
  {category: names[]}; picking inspects (skills.manage inspect) → the pager.
- view/App.tsx: the input zone is now a <Switch> — prompt → switcher → picker →
  composer (overlays replace, never stack, so the composer remounts/refocuses).

Verified: bun run check green (47 tests / 7 files) — /model bare→picker (auth
filtered, current marked, pick→slash.exec), /model <name> direct, /skills flatten.
LIVE tmux: /model → picker listing 8 models (anthropic/claude-opus-4.8 ▶, nous,
…), Esc closed clean; /skills → hub listing skills w/ category descriptions.
5 of 6 first-class overlays done (prompts, pager, session switcher, model picker,
skills hub) — completions dropdown remains. Smoke P5c + matrix updated.
(Note: model.options is ~5s server-side; a loading indicator is a polish TODO.)
2026-06-08 16:08:25 +00:00
alt-glitch
3fe7709b86 feat(opentui-v2): Phase 5c — session switcher overlay (list → pick → resume)
A first-class picker (spec §2b, Ink activeSessionSwitcher): /sessions (aliases
/resume, /switch, /session) → session.list → a native <select> overlay; Enter
resumes the chosen session via the SAME resumeInto hydrate path as launch, so
tool rows + transcript hydrate correctly. Esc closes. Reuses Phase 4b resume.

- view/overlays/sessionSwitcher.tsx: <select> of sessions (title / preview /
  message count), onSelect → onPick(id); Esc cancels.
- store: switcher state + openSwitcher/closeSwitcher; SessionItem type.
- logic/resume.ts: mapSessionList(session.list result) → SessionItem[].
- logic/slash.ts: /sessions|/resume|/switch|/session client commands +
  listSessions/openSwitcher on SlashContext.
- entry: resumeInto extracted (shared by bootstrap + switcher); slashCtx wires
  listSessions (session.list → mapSessionList) + openSwitcher; onResume runs
  resumeInto via runFork. App input zone is now prompt → switcher → composer
  (overlays replace, not stack, so the composer remounts/refocuses on close).

Verified: bun run check green (43 tests / 7 files) — slash /sessions → switcher,
+ a switcher frame test (rows render, composer replaced). LIVE tmux: /sessions
listed real titled sessions w/ counts/previews; ↓+Enter resumed the picked one
(hydrate_ms=8) → transcript hydrated incl. the terminal tool row; switcher
closed, composer returned; /quit clean. 3 of 6 first-class overlays done
(prompts, pager, switcher). Smoke P5c + matrix updated.
2026-06-08 16:00:39 +00:00
alt-glitch
abce50e34d feat(opentui-v2): Phase 5a — pager overlay for long slash output
A full-height scrollable pager (the FloatBox analog) — porting it unlocks the
long-output slash commands (/status /logs /history /tools) at once (spec §2b).

- view/overlays/pager.tsx: bordered full-height overlay (title + scrollbox +
  footer), scrolling driven explicitly via useKeyboard → scrollBy/scrollTo (no
  reliance on scrollbox auto-focus), Esc/q/Ctrl+C close. §8 #2 scrollbox gotchas.
- store: pager state + openPager/closePager.
- view/App.tsx: content zone swaps to the Pager (replacing transcript+composer)
  when store.state.pager is set; the close is deferred a tick so the closing key
  can't leak into the remounting composer.
- logic/slash.ts: present() routes output to the pager when long (>180 chars or
  >2 non-empty lines, Ink parity) else a system line; titled by command; /logs
  always pages. New openPager on SlashContext.

Verified: bun run check green (41 tests / 7 files) — present() routing
(short→system, long→pager) + a pager frame test (renders title/content, replaces
the transcript/composer). LIVE tmux: /logs → pager (title "Logs", scroll via
PageDown, Esc closed → composer refocused, no key-leak); /version (5-line output)
→ pager titled "Version". Smoke P5a + parity matrix updated. Completions dropdown
+ pickers + chrome are the next slices.
2026-06-08 15:52:36 +00:00
alt-glitch
c704d384f4 feat(opentui-v2): Phase 4b — session resume with tool/transcript hydration
HERMES_TUI_RESUME=<id|recent> resumes a session instead of creating one:
session.most_recent (for "recent") → session.resume {cols, session_id} →
commitSnapshot(mapResumeHistory(messages)), buffering live events across the RPC.

- logic/resume.ts: maps the session.resume history into Message[]. Resumed tool
  rows arrive as {role:'tool', name, context} (NO text — gotcha §8 #5); they're
  FOLDED into the preceding assistant turn's ordered parts (state:'complete',
  summary=context) so a resumed transcript renders the tools INLINE like a live
  one. Assistant text gets a text part (renders via native markdown). User/system
  stay flat. Unknown roles / non-arrays are ignored.
- logic/store.ts: hydrate split into beginBuffer() + commitSnapshot() so the live
  event buffer spans the async resume RPC (events that arrive during resume are
  replayed after the snapshot, in order).
- entry/main.tsx: bootstrap branches create vs resume; the resume path is timed
  (rpc_ms / hydrate_ms) for profiling.

Verified: bun run check green (40 tests / 7 files) — resume mapper (fold tool
rows, standalone holder, ignore junk) + beginBuffer/commitSnapshot replay. LIVE
tmux: Launch A created a session with a terminal tool call; Launch B
(HERMES_TUI_RESUME=recent) hydrated user + assistant + the tool row inline.
STRESS+PROFILE on a real 103-message session (~/.hermes/sessions): client hydrate
= 76ms, bun RSS = 214MB STABLE (no leak), tool rows hydrated, PageUp scroll works;
the 1.6s cost is the server-side session.resume RPC, not the TUI. Smoke P4 +
matrix updated. Note: rows instantiate for the full history (scrollbox culls
render only) → RSS ~linear in turns; list virtualization is the lever if
multi-thousand-turn sessions become a target.
2026-06-08 15:31:46 +00:00
alt-glitch
4f2bb7e52f feat(opentui-v2): Phase 4a — slash command system + local confirm dialog
The composer now routes `/command` through the Ink-parity dispatch ladder
instead of submitting it as a prompt (spec §1):

- logic/slash.ts: parseSlash + dispatchSlash — client-local command →
  slash.exec {command, session_id} (output → system line) → on reject
  command.dispatch {arg, name, session_id} with typed handling
  (exec/plugin→system · alias→re-dispatch · skill/send→submit a turn ·
  prefill→notice). 6 client commands: help/quit/exit/clear/new/logs.
- /help renders the live `commands.catalog` (reads the `pairs` shape).
- view/prompts/confirmPrompt.tsx + store.setConfirm: a LOCAL (non-gateway) Y/N
  dialog for /clear and /new; store gains pushSystem + clearTranscript.
- entry: a Promise-returning `request` adapter + the SlashContext wiring (quit →
  renderer.destroy, confirm, clearTranscript, logTail, submit).

Also fixes a keystroke-leak: the key that ANSWERED a prompt was bleeding into the
freshly-refocused composer (`/clear`→y left "y" in the input, breaking the next
`/quit`). PromptOverlay now defers the prompt-clear (composer remount) past the
current keystroke — this hardens every Phase 3 prompt too.

Verified: bun run check green (36 tests / 6 files) — slash.test covers parse + the
full ladder against a fake context. LIVE tmux: /help → full gateway catalog;
/version → slash.exec output; /clear → confirm → cleared, no key-leak (typed "hi"
not "yhi"); /quit → clean quit, child reaped. Remaining TUI-only commands,
completions, pager routing, and session resume are 4b/4c. Smoke P4 + matrix updated.
2026-06-08 15:20:06 +00:00
alt-glitch
1bc376921f feat(opentui-v2): Phase 3 — blocking prompts (clarify/approval/sudo/secret), no deadlock
The 4 gateway *.request events now drive a blocking-prompt overlay instead of
deadlocking the agent (spec §8 #6). Native OpenTUI paradigm (per glitch's steer):

- view/prompts/approvalPrompt.tsx: native <select> (once/session/always/deny)
  → approval.respond {choice, session_id}.
- view/prompts/clarifyPrompt.tsx: native <select> over choices + an "✎ Other…"
  option that swaps to a native <input> for free-text → clarify.respond
  {answer, request_id}.
- view/prompts/maskedPrompt.tsx: sudo (🔐) / secret (🔑) — native <input> has no
  mask, so we own a buffer via useKeyboard and render '*' per char →
  sudo/secret.respond {password|value, request_id}.
- view/prompts/promptOverlay.tsx: dispatches by prompt kind, binds each
  answer/cancel to the matching *.respond; Esc/Ctrl+C → deny/empty so the agent
  always unblocks.

Wiring: store gains ActivePrompt state + the 4 reducer cases + clearPrompt;
App swaps Composer↔PromptOverlay on store.state.prompt (so the composer textarea
stops capturing keys while blocked); renderer.ts gates the global Ctrl+C-quit on
isBlocked() so a prompt owns Ctrl+C (→ cancel); entry adds a generic `respond`
runFork callback + passes sessionId.

Verified: bun run check green (28 tests / 5 files) — reducer set/clear for all 4,
+ a frame test (approval overlay renders the command + all options as a bordered
modal, composer hidden while blocked). LIVE tmux: a real `rm -rf` approval fired;
Approve-once → command ran → unblocked; Esc → deny → "BLOCKED by user" →
unblocked; Ctrl+C-while-blocked cancelled WITHOUT quitting; Ctrl+C-unblocked quit
clean, no orphan. Smoke P3 + parity matrix updated. confirm (local) → Phase 4.
2026-06-08 15:06:58 +00:00
alt-glitch
6cefb7c5b5 feat(opentui-v2): Phase 2b-ii — native markdown for assistant text (Phase 2 done)
Assistant text parts now render through the NATIVE markdown renderable instead of
plain spans — bold/headings/lists/fences render, raw `**`/backtick markup is
concealed (spec §7; never hand-roll a parser).

- view/markdown.tsx: `<code filetype="markdown" streaming conceal drawUnstyledText>`
  (CodeRenderable — opencode's v2 AssistantText path; `<markdown>` +
  internalBlockMode="top-level" deferred paint headlessly). SyntaxStyle.fromStyles
  is derived from the theme (markup.* → theme.color.*, non-hex colors guarded) and
  cached by theme-object identity so all text parts share one instance, rebuilt
  only on skin change. drawUnstyledText paints raw text immediately while
  Tree-sitter highlighting settles (and makes it headless-capturable).
- view/messageLine.tsx: text-part Match renders <Markdown> instead of <text>.
- test/lib/render.ts: settle async markdown via flush(); captureFrame gains an
  `until` option (waitForFrame) for content that paints after the first pass.

Verified: bun run check green (23 tests / 5 files). Live tmux: a markdown reply
(heading + bold word + 2-item list) rendered with `**` concealed (grep -c '**' = 0);
Ctrl+C clean, no orphan. Phase 2 complete (2a shell + 2b-i parts/tools + 2b-ii
markdown) — smoke steps 1–4 run live. Next: Phase 3 blocking prompts.
2026-06-08 14:49:52 +00:00
alt-glitch
59fbc05031 feat(opentui-v2): Phase 2b-i — ordered parts + inline tool render
An assistant turn is now ONE ordered parts[] (text/reasoning/tool) instead of a
flat string, so tool calls render INLINE between text blocks rather than dumped
as separate rows below (spec §7 — the "dump-below" bug opencode's sync-v2 avoids).

- logic/store.ts: Part discriminated union + reducer rework. message.delta
  appends to the open text part (or opens one); tool.start pushes a running tool
  part; tool.complete matches by tool_id and updates that part IN PLACE (state,
  envelope-stripped resultText, summary, error, lineCount); reasoning.delta
  accumulates a reasoning part. User/system rows stay flat text; settled/resumed
  assistant rows fall back to text.
- logic/toolOutput.ts: ported pure helpers — stripToolEnvelope (unwrap
  {output,exit_code}, append [exit N]/[error] suffix) + collapseToolOutput +
  truncate.
- view/messageLine.tsx: <For>+<Switch> dispatch by part.type with stable id keys.
- view/toolPart.tsx: two-tier render — inline one-liner (≤1 output line) or a
  capped left-bar block (TOOL_MAX_LINES, "… +N more", click-to-expand) keyed off
  the theme; reactive width via useTerminalDimensions.

Verified: bun run check green (23 tests / 5 files / 64 expects) — store
interleave/in-place/reasoning, a frame test asserting the tool renders inline +
envelope stripped, and toolOutput unit tests. Live tmux: a terminal-tool prompt
rendered " terminal" with its alpha/beta output inline between the assistant's
text parts; Ctrl+C clean, no orphan. Smoke P2b + parity matrix updated. Native
<markdown> for text parts is the next slice (2b-ii).
2026-06-08 14:42:24 +00:00
alt-glitch
4b81ded58b feat(opentui-v2): Phase 2a — scrollbox transcript + textarea composer + header
Turns the read-only Phase-1 view into an interactive shell, split into focused
view components (spec v4 §2 layout):

- view/transcript.tsx: ONE full-height <scrollbox> with a reactive <For>
  (opencode's no-scrollback model). Applies the §8 #2 gotchas exactly:
  minHeight:0 on the wrapper AND the scrollbox, NO flexDirection on the
  scrollbox root, stickyScroll + stickyStart="bottom".
- view/composer.tsx: a native <textarea> captured by ref — flexShrink:0,
  focus-on-mount, Enter->submit via keyBindings, imperative .clear() on submit,
  and a `submitting` re-entrancy guard. Wired by the entry to fire prompt.submit
  (Effect.runFork on the in-hand service value); it's now the PRIMARY input, with
  the HERMES_TUI_PROMPT stand-in kept only for launch-with-prompt.
- view/header.tsx + view/messageLine.tsx: extracted, themed (no hardcoded
  styles). MessageLine stays flat-text this slice; ordered parts (§7) land in 2b.

test/lib/render.ts now flushes 3 renderOnce passes before capture — a <scrollbox>
needs more than one pass to measure content + apply sticky, else the transcript
row paints blank.

Verified: bun run check green (12 tests / 4 files / 31 expects). Live tmux drive:
typed into the composer -> cleared -> user row -> streamed reply ("Here are three
words"); Ctrl+C quits cleanly even with the textarea focused, no orphan child.
Composer placeholder rendered the live skin's welcome string (skin->theme live).
Smoke P2a + parity matrix updated. Phase 2b (ordered parts/tool render/markdown)
is the next slice.
2026-06-08 14:28:59 +00:00
alt-glitch
927c902785 feat(opentui-v2): Phase 1 — live tui_gateway transport + Solid store + theming
GatewayService/liveGateway over the real Python tui_gateway: JSON-RPC stdio
framing (Bun.spawn), 16ms event coalescing flushed inside Solid batch(), typed
GatewayError, and a decode-once GatewayEvent Schema (~35-member tagged union;
unknown/malformed events skip via Option.none, never crash the stream).

The Solid sync-v2-style store grows to: streaming text concat (prefer
payload.text), gateway.ready{skin}/skin.changed -> fromSkin reactive re-theme,
LRU id-dedup, and hydrate-while-buffering (resume scaffold). Theming is a 1:1
port of Ink's theme.ts (DARK/LIGHT, detectLightMode, ANSI-256 normalization,
fromSkin) behind a Solid ThemeProvider so existing skins work unchanged and the
view carries NO hardcoded styles. A console-safe diagnostics log (in-memory
ring + NDJSON file) is the single logging path.

Entry gains a live launch path (default; HERMES_TUI_FAKE=1 -> scripted hello)
with an initial-prompt bootstrap (session.create -> prompt.submit) as the
Phase-2-composer stand-in, plus a minimal Ctrl+C graceful quit
(renderer.destroy -> shutdown Deferred -> scope finalizers -> client.stop) so
the engine reaps its own gateway child instead of orphaning it.

Verified: bun run check green (tsc + eslint + 12 tests / 4 files); live tmux
drive connect -> gateway.ready -> prompt -> streamed reply ("pong") -> clean
teardown with no orphan bun/python. Parity matrix + smoke P1 run log updated.
2026-06-08 14:19:21 +00:00
alt-glitch
d3943fe37d feat(opentui-v2): Phase 0 scaffold — Solid + Effect-at-boundary native TUI
New from-scratch package ui-tui-opentui-v2/ (NOT a port of the superseded React
ui-tui-opentui/; Ink ui-tui/ untouched). Mirrors opencode's method: @opentui/solid
view, Effect 4.0-beta only at the boundary (renderer lifecycle, GatewayService
transport, runtime), plain Solid for the logic/view.

Phase 0 (per docs/plans/opentui-rewrite-v4-spec.md §11):
- deps pinned: effect@4.0.0-beta.78, @opentui/{core,solid,keymap}@0.3.2, solid-js@1.9.10
- strict rails: tsconfig (verbatimModuleSyntax, exactOptionalPropertyTypes,
  noUncheckedIndexedAccess, jsxImportSource @opentui/solid), eslint, prettier
- boundary: acquireRelease(createCliRenderer) + finalizers + Deferred-on-destroy;
  GatewayService (Context.Service) shape; typed errors (Data.TaggedError); AppLayer
- logic (Solid): createSessionStore + apply(event) reducer (sync-v2 model, minimal)
- view (Solid): App shell (header + transcript); inline color via <span style={{fg}}>
- entry: the one-line render(() => <App/>, renderer) bridge + Effect.provide(layer)
- FakeGateway layer (test/dev seam) streaming a scripted hello
- test rails: test/lib/effect.ts (testEffect/testLayer over ManagedRuntime + TestClock,
  no @effect/vitest), test/lib/render.ts (testRender + renderOnce + captureCharFrame)
- 4-layer tests (boundary/store/render) 5/5 green; scripts/check.sh gate green
- docs: v4 spec + living smoke doc (Phase 0 PASS logged); v3 spec + parts/markdown
  plan marked superseded

Verified: bun run check green (tsc 0, eslint 0, bun test 5/5); live tmux drive paints
'hermes · opentui · ready' + '✦ Hi there, glitch!' in a real TTY.
2026-06-08 13:36:54 +00:00
alt-glitch
2bd9c9b881 opentui(phase3): launcher integration — HERMES_TUI_ENGINE dual-engine
hermes --tui launches the native OpenTUI engine (Bun) when
HERMES_TUI_ENGINE=opentui (env) or display.tui_engine=opentui (config);
Ink stays the default and the shipping path is untouched.

- _resolve_tui_engine() (env > config > ink); refuses opentui on
  Windows/Termux (no Bun) -> falls back to ink with a notice.
- _make_opentui_argv() -> [bun, src/entry.real.tsx] (no build step).
- _bun_bin() with HERMES_BUN override.
- Branch at top of _make_tui_argv BEFORE _ensure_tui_node (Bun-only host
  must not bootstrap Node).
- Gate _launch_tui NODE_OPTIONS/--max-old-space-size on engine==ink (Bun
  is JSC; the V8 flag errors/ignores).

Verified end-to-end via tmux: real hermes --tui -> Bun -> OpenTUI ->
real Python gateway streamed a real reply. No-flag default still ink.
2026-06-08 11:11:54 +00:00
emozilla
984e6cb5b8 feat(whatsapp): add WhatsApp Business Cloud API adapter
Add an official, production-grade WhatsApp integration via Meta's
Business Cloud API as a complement to the existing Baileys bridge.
No bridge subprocess, no QR codes, no account-ban risk — at the cost
of a Meta Business account and a public HTTPS webhook URL.

Setup is fully wizard-driven: 'hermes whatsapp-cloud' walks through
every credential with paste-time validation (catches the #1 trap of
pasting a phone number into the Phone Number ID field), generates a
verify token, and ends with copy-paste instructions for the
cloudflared / Meta-dashboard / Business Manager pieces that can't be
automated. The wizard also points users at Meta's Business Manager
for setting the bot's display name and profile picture.

Feature set:

- Inbound: text, images (with native-vision routing), voice notes
  (STT), documents (small text inlined, larger cached), reply context.
- Outbound: text with WhatsApp-flavored markdown conversion, images,
  videos, documents, opus voice notes via ffmpeg with MP3 fallback.
- Native interactive buttons for clarify, dangerous-command approval,
  and slash-command confirmation flows — matches the Telegram /
  Discord UX, graceful degrades to plain text.
- Read receipts (blue double-checkmarks) and typing indicator,
  using Meta's combined endpoint so they fire in a single API call.
- Webhook security: X-Hub-Signature-256 HMAC verification (raw body,
  constant-time), wamid deduplication, group-shaped-message refusal
  (groups deferred to v2 — Baileys still covers them).
- Full integration with the gateway's session, cron, display-tier,
  prompt-hint, and auth-allowlist systems. Cloud and Baileys can run
  side-by-side against different phone numbers.

Also wires STT (speech-to-text) through Nous's managed audio gateway
for Nous subscribers — previously the default stt.provider=local
required a separate faster-whisper install. New subscribers now get
voice-note transcription out of the box.

Docs: 418-line user guide at website/docs/user-guide/messaging/
whatsapp-cloud.md, sidebar entry, environment-variables reference,
ADDING_A_PLATFORM.md updated with the optional interactive-UX
contract for future adapter authors.

Tests: 100 dedicated tests for the adapter, 32 for the setup wizard,
20 for the Nous subscription STT wiring, plus regression coverage
across display_config, prompt_builder, and the cron scheduler.

Known limitations (deferred until clear demand signal):
- Group chats — use the Baileys bridge if you need them.
- Message templates for 24-hour-window outside-conversation sends —
  reactive chat is unaffected; cron / delegate_task with gaps > 24h
  will fail with a clear error. The agent's system prompt warns the
  model about this so it knows to mention it when scheduling delayed
  messages.
2026-05-23 01:07:01 -04:00
989 changed files with 109961 additions and 8581 deletions

View File

@@ -63,3 +63,45 @@ data/
# Compose/profile runtime state (bind-mounted; avoid ownership/secret issues)
hermes-config/
runtime/
# ---------- Not needed inside the Docker image ----------
# Desktop app source (Tauri/Electron); never installed in the container
apps/
# Test suite — not shipped in production images
tests/
# Documentation site (Docusaurus) and supplementary docs
website/
docs/
# Assets only used by the GitHub README
assets/
infographic/
# Plugin-level docs (hermes-achievements ships docs/ but the runtime doesn't read them)
plugins/hermes-achievements/docs/
# Nix / Homebrew / AUR packaging metadata — irrelevant to Docker
nix/
flake.nix
flake.lock
packaging/
# Design and planning documents
plans/
.plans/
# ACP registry manifest (icon + agent.json) — not consumed at runtime
acp_registry/
# Repo-level dotfiles that are git-only or dev-tooling config
.env.example
.envrc
.gitattributes
.hadolint.yaml
.mailmap
# Top-level LICENSE (not matched by *.md); not needed inside the container
LICENSE

Binary file not shown.

After

Width:  |  Height:  |  Size: 428 KiB

View File

@@ -11,8 +11,20 @@ on:
- 'optional-skills/**'
- '.github/workflows/deploy-site.yml'
workflow_dispatch:
inputs:
skills_index_run_id:
description: 'Optional Build Skills Index run ID whose skills-index artifact should be deployed'
required: false
type: string
rebuild_skills_index:
description: 'Force a fresh multi-source crawl instead of reusing the latest healthy index'
required: false
default: false
type: boolean
permissions:
contents: read
actions: read
pages: write
id-token: write
@@ -44,7 +56,7 @@ jobs:
- uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4
with:
node-version: 20
node-version: 22
cache: npm
cache-dependency-path: website/package-lock.json
@@ -55,26 +67,81 @@ jobs:
- name: Install PyYAML for skill extraction
run: pip install pyyaml==6.0.2 httpx==0.28.1
- name: Build skills index (unified multi-source catalog)
- name: Prepare skills index (unified multi-source catalog)
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GH_TOKEN: ${{ github.token }}
GITHUB_TOKEN: ${{ github.token }}
SKILLS_INDEX_RUN_ID: ${{ github.event.inputs.skills_index_run_id || '' }}
REBUILD_SKILLS_INDEX: ${{ github.event.inputs.rebuild_skills_index || 'false' }}
run: |
# Rebuild the unified catalog. The file is gitignored, so a fresh
# checkout starts without it and we want the freshest crawl in
# every deploy.
# The unified external catalog is expensive to crawl and can burn
# through the repository installation's GitHub API quota when several
# docs deploys land close together. Normal docs deploys therefore
# reuse the latest healthy catalog: first the artifact from a
# scheduled skills-index run, then the currently live index. Only a
# manual force rebuild does a fresh crawl here.
#
# This MUST be fatal. build_skills_index.py runs a health check and
# exits non-zero WITHOUT writing the output file when a source
# collapses (e.g. a GitHub API rate limit zeroes the github /
# claude-marketplace / well-known taps all at once). Letting the
# deploy continue would either (a) ship a degenerate index missing
# whole hubs — the June 2026 regression where OpenAI/Anthropic/
# HuggingFace/NVIDIA tabs vanished — or (b) fall through to a
# local-only catalog. Failing here keeps the last good deployment
# live (GitHub Pages serves the previous build) instead of
# publishing a broken catalog. Re-run the workflow once the
# transient rate limit clears.
# If we do crawl, the build remains fatal. build_skills_index.py runs
# the health check BEFORE writing and exits non-zero on source
# collapse, keeping the last good Pages deployment live instead of
# publishing a degenerate catalog.
set -euo pipefail
INDEX_PATH="website/static/api/skills-index.json"
mkdir -p "$(dirname "$INDEX_PATH")"
validate_index() {
python3 - "$INDEX_PATH" <<'PY'
import json
import sys
from pathlib import Path
path = Path(sys.argv[1])
try:
data = json.loads(path.read_text(encoding="utf-8"))
except Exception as exc:
print(f"invalid skills index JSON: {exc}", file=sys.stderr)
sys.exit(1)
skills = data.get("skills")
if not isinstance(skills, list) or len(skills) < 1500:
count = len(skills) if isinstance(skills, list) else "missing"
print(f"skills index too small: {count}", file=sys.stderr)
sys.exit(1)
print(f"skills index ready: {len(skills)} skills")
PY
}
if [ "$REBUILD_SKILLS_INDEX" = "true" ]; then
python3 scripts/build_skills_index.py
validate_index
exit 0
fi
if [ -n "$SKILLS_INDEX_RUN_ID" ]; then
tmpdir="$(mktemp -d)"
echo "Downloading skills-index artifact from run $SKILLS_INDEX_RUN_ID"
if gh run download "$SKILLS_INDEX_RUN_ID" --name skills-index --dir "$tmpdir"; then
candidate="$(find "$tmpdir" -name skills-index.json -type f | head -n 1 || true)"
if [ -n "$candidate" ]; then
cp "$candidate" "$INDEX_PATH"
if validate_index; then
exit 0
fi
fi
fi
echo "::warning::Could not use skills-index artifact from run $SKILLS_INDEX_RUN_ID; trying live index"
fi
echo "Downloading currently live skills index"
if curl -fsSL --retry 3 --retry-delay 5 \
"https://hermes-agent.nousresearch.com/docs/api/skills-index.json" \
-o "$INDEX_PATH" && validate_index; then
exit 0
fi
echo "::warning::Live skills index unavailable or unhealthy; falling back to a fresh crawl"
rm -f "$INDEX_PATH"
python3 scripts/build_skills_index.py
validate_index
- name: Extract skill metadata for dashboard
run: python3 website/scripts/extract-skills.py

View File

@@ -90,7 +90,7 @@ jobs:
# (see `_SKIP_PARTS` in scripts/run_tests_parallel.py) because each
# shard would otherwise reach the session-scoped ``built_image``
# fixture in ``tests/docker/conftest.py`` and start a 3-7min
# ``docker build`` under a 180s pytest-timeout cap — guaranteed to
# ``docker build`` — guaranteed to
# die in fixture setup.
#
# Piggybacking here avoids a second image build: the smoke test
@@ -114,7 +114,7 @@ jobs:
run: |
uv venv .venv --python 3.11
source .venv/bin/activate
# ``dev`` extra pulls in pytest, pytest-asyncio, pytest-timeout
# ``dev`` extra pulls in pytest, pytest-asyncio —
# everything tests/docker/ needs. We deliberately avoid ``all``
# here because the docker tests only drive the container via
# subprocess and don't import hermes_agent's optional deps.

View File

@@ -18,7 +18,7 @@ jobs:
- uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4
with:
node-version: 20
node-version: 22
cache: npm
cache-dependency-path: website/package-lock.json

View File

@@ -53,4 +53,4 @@ jobs:
- name: Trigger Deploy Site workflow
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: gh workflow run deploy-site.yml --repo ${{ github.repository }}
run: gh workflow run deploy-site.yml --repo ${{ github.repository }} -f skills_index_run_id=${{ github.run_id }}

View File

@@ -29,6 +29,8 @@ jobs:
scan: ${{ steps.filter.outputs.scan }}
# True when pyproject.toml changed in this PR
deps: ${{ steps.filter.outputs.deps }}
# True when the curated MCP catalog / bundled MCP manifests changed.
mcp_catalog: ${{ steps.filter.outputs.mcp_catalog }}
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
@@ -54,6 +56,14 @@ jobs:
else
echo "deps=false" >> "$GITHUB_OUTPUT"
fi
MCP_CATALOG_FILES=$(git diff --name-only "$BASE"..."$HEAD" -- \
'optional-mcps/**' \
'hermes_cli/mcp_catalog.py' || true)
if [ -n "$MCP_CATALOG_FILES" ]; then
echo "mcp_catalog=true" >> "$GITHUB_OUTPUT"
else
echo "mcp_catalog=false" >> "$GITHUB_OUTPUT"
fi
scan:
name: Scan PR for critical supply chain risks
@@ -268,3 +278,50 @@ jobs:
runs-on: ubuntu-latest
steps:
- run: echo "No pyproject.toml changes, skipping dependency bounds check."
mcp-catalog-review:
name: MCP catalog security review
needs: changes
if: needs.changes.outputs.mcp_catalog == 'true'
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 0
- name: Require explicit MCP catalog review label
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
set -euo pipefail
PR="${{ github.event.pull_request.number }}"
LABELS=$(gh pr view "$PR" --json labels --jq '.labels[].name' || true)
if echo "$LABELS" | grep -Fxq 'mcp-catalog-reviewed'; then
echo "MCP catalog review label present."
exit 0
fi
BODY="## ⚠️ MCP catalog security review required
This PR changes the bundled MCP catalog or MCP catalog installer code. MCP entries can define local commands that users later install into \`mcp_servers\`, so this needs explicit maintainer review before merge.
A maintainer should verify:
- any new/changed \`optional-mcps/**/manifest.yaml\` command and args are expected,
- stdio transports do not use shell+egress/exfiltration payloads,
- git install refs are pinned and bootstrap commands are minimal,
- requested env vars/secrets match the upstream MCP's documented needs.
After review, add the \`mcp-catalog-reviewed\` label and re-run this check."
gh pr comment "$PR" --body "$BODY" || echo "::warning::Could not post PR comment (expected for fork PRs)"
echo "::error::MCP catalog changes require the mcp-catalog-reviewed label."
exit 1
mcp-catalog-review-gate:
name: MCP catalog security review
needs: changes
if: always() && needs.changes.outputs.mcp_catalog != 'true'
runs-on: ubuntu-latest
steps:
- run: echo "No MCP catalog changes, skipping MCP catalog security review."

View File

@@ -4,13 +4,13 @@ on:
push:
branches: [main]
paths-ignore:
- '**/*.md'
- 'docs/**'
- "**/*.md"
- "docs/**"
pull_request:
branches: [main]
paths-ignore:
- '**/*.md'
- 'docs/**'
- "**/*.md"
- "docs/**"
permissions:
contents: read
@@ -30,13 +30,17 @@ jobs:
slice: [1, 2, 3, 4, 5, 6]
steps:
- name: Checkout code
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Restore duration cache
uses: actions/cache/restore@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
uses: actions/cache/restore@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
with:
path: test_durations.json
# Single stable key. main always overwrites, PRs always find it.
# main always writes a new suffix, but jobs pick the latest one with the same prefix
# quote from https://docs.github.com/en/actions/reference/workflows-and-actions/dependency-caching#cache-hits-and-misses
# If you provide restore-keys, the cache action sequentially searches for any caches that match the list of restore-keys.
# If there are no exact matches, the action searches for partial matches of the restore keys.
# When the action finds a partial match, the most recent cache is restored to the path directory.
key: test-durations
- name: Install ripgrep (prebuilt binary)
@@ -54,7 +58,7 @@ jobs:
rg --version
- name: Install uv
uses: astral-sh/setup-uv@d4b2f3b6ecc6e67c4457f6d3e41ec42d3d0fcb86 # v5
uses: astral-sh/setup-uv@d4b2f3b6ecc6e67c4457f6d3e41ec42d3d0fcb86 # v5
with:
# Persist uv's download/wheel cache (~/.cache/uv) across runs.
# Keyed on the dependency manifests, so the cache is reused until
@@ -115,7 +119,7 @@ jobs:
NOUS_API_KEY: ""
- name: Upload per-slice durations
uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7.0.1
uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7.0.1
with:
name: test-durations-slice-${{ matrix.slice }}
path: test_durations.json
@@ -125,11 +129,11 @@ jobs:
# (including PRs) get balanced slicing.
save-durations:
needs: test
if: always() && github.ref == 'refs/heads/main'
if: needs.test.result == 'success' && github.ref == 'refs/heads/main'
runs-on: ubuntu-latest
steps:
- name: Download all slice durations
uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
with:
pattern: test-durations-slice-*
path: durations
@@ -149,17 +153,17 @@ jobs:
"
- name: Save merged duration cache
uses: actions/cache/save@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
uses: actions/cache/save@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
with:
path: test_durations.json
key: test-durations
key: test-durations-${{ github.run_id }}
e2e:
runs-on: ubuntu-latest
timeout-minutes: 15
steps:
- name: Checkout code
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Install ripgrep (prebuilt binary)
run: |
@@ -176,7 +180,7 @@ jobs:
rg --version
- name: Install uv
uses: astral-sh/setup-uv@d4b2f3b6ecc6e67c4457f6d3e41ec42d3d0fcb86 # v5
uses: astral-sh/setup-uv@d4b2f3b6ecc6e67c4457f6d3e41ec42d3d0fcb86 # v5
with:
# Persist uv's download/wheel cache (~/.cache/uv) across runs.
# Keyed on the dependency manifests, so the cache is reused until

25
.github/workflows/typecheck.yml vendored Normal file
View File

@@ -0,0 +1,25 @@
# .github/workflows/typecheck.yml
name: Typecheck
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
typecheck:
runs-on: ubuntu-latest
strategy:
matrix:
package:
[ui-tui, web, apps/bootstrap-installer, apps/desktop, apps/shared]
fail-fast: false # report all failures, not just the first one
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4
with:
node-version: 22
cache: npm
- run: npm ci
- run: npm run --prefix ${{ matrix.package }} typecheck

13
.gitignore vendored
View File

@@ -89,6 +89,9 @@ website/static/api/skills-index.json
# every build).
website/static/api/skills.json
website/static/api/skills-meta.json
# automation-blueprints-index.json is a build artifact emitted by
# website/scripts/extract-automation-blueprints.py during prebuild.
website/static/api/automation-blueprints-index.json
models-dev-upstream/
# Local editor / agent tooling (machine-specific; keep in global config, not the repo)
@@ -114,6 +117,12 @@ docs/superpowers/*
# treat it as a local edit and autostash it on every run (#38529).
.hermes-bootstrap-complete
# Interrupted-update breadcrumb + recovery lock written next to the shared venv
# by `hermes update` / launch-time self-heal. Runtime state, never a code change
# — ignore so `git status` stays clean and update's autostash skips them.
.update-incomplete
.update-incomplete.lock
# Tool Search live-test harness output — non-deterministic model transcripts,
# regenerated by scripts/tool_search_livetest.py. Never an artifact of the repo.
scripts/out/
@@ -123,3 +132,7 @@ scripts/out/
# stores the published notes. They are not a build artifact and must never be
# committed to the repo root. See the hermes-release skill.
RELEASE_v*.md
# Desktop demo-run scratch output (hermes writes demo/*.txt during recorded
# walkthroughs). Throwaway artifacts, never part of the app.
apps/desktop/demo/

View File

@@ -459,7 +459,7 @@ npm install # first time
npm run dev # watch mode (rebuilds hermes-ink + tsx --watch)
npm start # production
npm run build # full build (hermes-ink + tsc)
npm run type-check # typecheck only (tsc --noEmit)
npm run typecheck # typecheck only (tsc --noEmit)
npm run lint # eslint
npm run fmt # prettier
npm test # vitest

View File

@@ -1,12 +1,14 @@
FROM ghcr.io/astral-sh/uv:0.11.6-python3.13-trixie@sha256:b3c543b6c4f23a5f2df22866bd7857e5d304b67a564f4feab6ac22044dde719b AS uv_source
# Node 22 LTS source stage. Debian trixie's bundled nodejs is pinned to 20.x
# which reached EOL in April 2026 we copy node + npm + corepack from the
# upstream node:22 image instead so we can stay on a supported LTS without
# waiting for Debian 14 (forky, ~mid-2027). Bookworm-based slim image used
# so the produced binary links against glibc 2.36, which runs cleanly on
# our Debian 13 (trixie, glibc 2.41) runtime. Bumping to a new Node major
# is a one-line ARG change; see #4977.
FROM node:22-bookworm-slim@sha256:7af03b14a13c8cdd38e45058fd957bf00a72bbe17feac43b1c15a689c029c732 AS node_source
# Node 26 source stage. Debian trixie's bundled nodejs is pinned to 20.x
# (EOL April 2026), so we copy node + npm + corepack from the upstream node:26
# image instead. Node 26 (Current; LTS promotion ~Oct 2026) is REQUIRED by the
# native OpenTUI TUI engine, which loads its renderer via the experimental
# `node:ffi` API that only exists on Node 26.3+ (the Ink engine + web build run
# on it too). Bookworm-based slim image used so the produced binary links
# against glibc 2.36, which runs cleanly on our Debian 13 (trixie, glibc 2.41)
# runtime. The pinned tag ships v26.3.0. Bumping Node is a one-line change here.
# NOTE: verify the full image build + Ink/web/Playwright on Node 26 in CI.
FROM node:26-bookworm-slim@sha256:79723b41edbedf595f62e943a9f8b0ba9af5b1e61045c5f8f59c2c02c1212a16 AS node_source
FROM debian:13.4
# Disable Python stdout buffering to ensure logs are printed immediately
@@ -25,7 +27,7 @@ ENV PLAYWRIGHT_BROWSERS_PATH=/opt/hermes/.playwright
# hermes process, the dashboard, and per-profile gateways.
RUN apt-get update && \
apt-get install -y --no-install-recommends \
ca-certificates curl iputils-ping python3 python-is-python3 ripgrep ffmpeg gcc python3-dev python3-venv libffi-dev libolm-dev procps git openssh-client docker-cli xz-utils && \
ca-certificates curl iputils-ping python3 python-is-python3 ripgrep ffmpeg gcc g++ make cmake python3-dev python3-venv libffi-dev libolm-dev procps git openssh-client docker-cli xz-utils && \
rm -rf /var/lib/apt/lists/*
# ---------- s6-overlay install ----------
@@ -90,7 +92,7 @@ RUN useradd -u 10000 -m -d /opt/data hermes
COPY --chmod=0755 --from=uv_source /usr/local/bin/uv /usr/local/bin/uvx /usr/local/bin/
# Node 22 LTS: copy the node binary plus the bundled npm + corepack JS
# Node 26: copy the node binary plus the bundled npm + corepack JS
# installs from the upstream image. npm and npx are recreated as symlinks
# because they're symlinks in the source image (and need to live on PATH).
# See node_source stage at the top of the file for the version-bump
@@ -119,7 +121,7 @@ COPY ui-tui/packages/hermes-ink/ ui-tui/packages/hermes-ink/
# `npm_config_install_links=false` forces npm to install `file:` deps as
# symlinks instead of copies. This is the default since npm 10+, which is
# what the image ships now (via the node:22 source stage). We set it
# what the image ships now (via the node:26 source stage). We set it
# explicitly anyway as defense-in-depth: the previous Debian-bundled npm
# 9.x defaulted to install-as-copy, which produced a hidden
# node_modules/.package-lock.json that permanently disagreed with the root
@@ -146,9 +148,9 @@ RUN npm install --prefer-offline --no-audit && \
#
# `uv sync --frozen --no-install-project --extra all --extra messaging`
# installs the deps reachable through the composite `[all]` extra
# (handpicked set intended for the production image), plus gateway
# messaging adapters that should work in the published image without a
# first-boot lazy install. We do NOT use `--all-extras`:
# (handpicked set intended for the production image — excludes `[dev]`),
# plus gateway messaging adapters that should work in the published image
# without a first-boot lazy install. We do NOT use `--all-extras`:
# that would pull in `[rl]` (atroposlib + tinker + torch + wandb from
# git), `[yc-bench]` (another git dep), and `[termux-all]` (Android
# redundancy), none of which belong in the published container.
@@ -164,19 +166,38 @@ RUN npm install --prefer-offline --no-audit && \
# image update and recall/retain then fails with
# `ModuleNotFoundError: No module named 'hindsight_client'` (#38128).
#
# The Matrix gateway's deps ([matrix] extra) are baked in because
# python-olm (transitive via mautrix[encryption]) builds from source on
# Python/image combinations without usable wheels. The Docker image is
# Linux-only, so keeping the native libolm/build-toolchain packages here
# avoids the cross-platform failures that kept [matrix] out of [all]
# while still making Matrix work in the published container. Fixes #30399.
#
# The editable link is created after the source copy below.
COPY pyproject.toml uv.lock ./
RUN touch ./README.md
RUN uv sync --frozen --no-install-project --extra all --extra messaging --extra anthropic --extra bedrock --extra azure-identity --extra hindsight
RUN uv sync --frozen --no-install-project --extra all --extra messaging --extra anthropic --extra bedrock --extra azure-identity --extra hindsight --extra matrix
# ---------- Frontend build (cached independently from Python source) ----------
# Copy only the frontend source trees first so that Python-only changes don't
# invalidate the (relatively slow) web + ui-tui build layer.
COPY web/ web/
COPY ui-tui/ ui-tui/
COPY ui-opentui/ ui-opentui/
# ui-opentui is the opt-in native OpenTUI engine (HERMES_TUI_ENGINE=opentui;
# default stays Ink). .dockerignore strips its node_modules/dist, so install +
# esbuild-build it here -> dist/main.js, then prune devDeps (esbuild/babel/
# vitest); the runtime only needs the prod deps (the external @opentui/core +
# its native blob -- the bundle inlines solid/effect). Build needs Node 26.3
# (node:ffi floor), which this image ships.
RUN cd web && npm run build && \
cd ../ui-tui && npm run build && \
cd ../ui-opentui && npm install --no-audit --no-fund && npm run build && npm prune --omit=dev
# ---------- Source code ----------
# .dockerignore excludes node_modules, so the installs above survive.
COPY --chown=hermes:hermes . .
# Build browser dashboard and terminal UI assets.
RUN cd web && npm run build && \
cd ../ui-tui && npm run build
# ---------- Permissions ----------
# Make install dir world-readable so any HERMES_UID can read it at runtime.
# The venv needs to be traversable too.

View File

@@ -107,6 +107,8 @@ You can still bring your own keys per-tool whenever you want — the gateway is
Hermes has two entry points: start the terminal UI with `hermes`, or run the gateway and talk to it from Telegram, Discord, Slack, WhatsApp, Signal, or Email. Once you're in a conversation, many slash commands are shared across both interfaces.
> **TUI engine:** On supported hosts (Linux/macOS with Node 26.3+), the terminal UI defaults to the native **OpenTUI** engine, which the installer provisions for you. The legacy **Ink** engine remains the fallback — it's used automatically on Windows, Termux, or when the native engine can't run, and you can select it explicitly with `HERMES_TUI_ENGINE=ink hermes`. Ink is not going away; it's the kept fallback.
| Action | CLI | Messaging platforms |
| ------------------------------ | --------------------------------------------- | -------------------------------------------------------------------------------- |
| Start chatting | `hermes` | Run `hermes gateway setup` + `hermes gateway start`, then send the bot a message |

View File

@@ -824,6 +824,7 @@ class HermesACPAgent(acp.Agent):
try:
from model_tools import get_tool_definitions
from agent.memory_manager import inject_memory_provider_tools
enabled_toolsets = _expand_acp_enabled_toolsets(
getattr(state.agent, "enabled_toolsets", None) or ["hermes-acp"],
@@ -839,6 +840,7 @@ class HermesACPAgent(acp.Agent):
state.agent.valid_tool_names = {
tool["function"]["name"] for tool in state.agent.tools or []
}
inject_memory_provider_tools(state.agent)
invalidate = getattr(state.agent, "_invalidate_system_prompt", None)
if callable(invalidate):
invalidate()
@@ -1779,10 +1781,25 @@ class HermesACPAgent(acp.Agent):
def _cmd_tools(self, args: str, state: SessionState) -> str:
try:
from model_tools import get_tool_definitions
from types import SimpleNamespace
from agent.memory_manager import inject_memory_provider_tools
toolsets = _expand_acp_enabled_toolsets(
getattr(state.agent, "enabled_toolsets", None) or ["hermes-acp"]
)
tools = get_tool_definitions(enabled_toolsets=toolsets, quiet_mode=True)
tool_view = SimpleNamespace(
tools=list(tools or []),
valid_tool_names={
tool.get("function", {}).get("name")
for tool in tools or []
if isinstance(tool, dict)
},
enabled_toolsets=toolsets,
_memory_manager=getattr(state.agent, "_memory_manager", None),
)
inject_memory_provider_tools(tool_view)
tools = tool_view.tools
if not tools:
return "No tools available."
lines = [f"Available tools ({len(tools)}):"]

View File

@@ -145,7 +145,7 @@ def build_nous_credits_snapshot(account_info) -> Optional[AccountUsageSnapshot]:
account info to show (fail-open: caller just shows nothing).
"""
try:
from hermes_cli.nous_account import nous_portal_billing_url
from hermes_cli.nous_account import nous_portal_topup_url
if account_info is None or not getattr(account_info, "logged_in", False):
return None
@@ -213,7 +213,8 @@ def build_nous_credits_snapshot(account_info) -> Optional[AccountUsageSnapshot]:
if not windows and not details:
return None
details.append(f"Manage / top up: {nous_portal_billing_url(account_info)}")
details.append(f"Top up: {nous_portal_topup_url(account_info)}")
details.append("(or run /credits)")
plan = getattr(sub, "plan", None) if sub is not None else None
return AccountUsageSnapshot(
@@ -242,6 +243,17 @@ def nous_credits_lines(*, markdown: bool = False, timeout: float = 10.0) -> list
renders from that fixture instead of the real portal (so the block + gauge are
testable without a live account). Throwaway scaffolding.
"""
snapshot = _fetch_nous_credits_snapshot(timeout=timeout)
return render_account_usage_lines(snapshot, markdown=markdown)
def _fetch_nous_credits_snapshot(timeout: float = 10.0) -> Optional[AccountUsageSnapshot]:
"""Auth-gate + portal fetch + snapshot build for the Nous credits block.
Shared by ``nous_credits_lines`` (full block) and
``nous_credits_compact_line`` (one-liner). Honors the
HERMES_DEV_CREDITS_FIXTURE dev override. Fail-open → None.
"""
# Dev fixture short-circuit — render /usage from the injected state, no portal.
try:
from agent.credits_tracker import dev_fixture_credits_state
@@ -250,17 +262,16 @@ def nous_credits_lines(*, markdown: bool = False, timeout: float = 10.0) -> list
except Exception:
fixture = None
if fixture is not None:
snapshot = _snapshot_from_credits_state(fixture)
return render_account_usage_lines(snapshot, markdown=markdown)
return _snapshot_from_credits_state(fixture)
try:
from hermes_cli.auth import get_provider_auth_state
tok = (get_provider_auth_state("nous") or {}).get("access_token")
if not (isinstance(tok, str) and tok.strip()):
return []
return None
except Exception:
return []
return None
try:
import concurrent.futures
@@ -270,13 +281,36 @@ def nous_credits_lines(*, markdown: bool = False, timeout: float = 10.0) -> list
account = pool.submit(
get_nous_portal_account_info, force_fresh=True
).result(timeout=timeout)
snapshot = build_nous_credits_snapshot(account)
return render_account_usage_lines(snapshot, markdown=markdown)
return build_nous_credits_snapshot(account)
except Exception:
# Fail-open (caller shows nothing), but leave a breadcrumb so a dead
# /usage credits block is diagnosable in agent.log without a dev flag.
logger.debug("credits ▸ /usage portal fetch/render failed (fail-open)", exc_info=True)
return []
return None
def nous_credits_compact_line(*, timeout: float = 10.0) -> Optional[str]:
"""One-line Nous credits summary for the compact /usage view, or None.
Condenses the snapshot's own detail strings (stable, locally-built
formats) into ``Nous credits (Plan): Total usable: $X · Renews: …``.
Same gating/fail-open semantics as ``nous_credits_lines``.
"""
snap = _fetch_nous_credits_snapshot(timeout=timeout)
if snap is None or not snap.available:
return None
picked = [
d for d in snap.details
if d.startswith(("Total usable:", "Renews:", "Status:"))
]
if not picked:
picked = [d for d in snap.details if not d.startswith("Manage / top up:")][:2]
if not picked:
return None
title = snap.title
if snap.plan:
title += f" ({snap.plan})"
return f"{title}: " + " · ".join(picked)
def _snapshot_from_credits_state(state) -> Optional[AccountUsageSnapshot]:
@@ -337,6 +371,93 @@ def _snapshot_from_credits_state(state) -> Optional[AccountUsageSnapshot]:
return None
@dataclass(frozen=True)
class CreditsView:
"""Surface-agnostic data for the ``/credits`` command.
One portal fetch, one parse — consumed identically by the CLI panel, the
gateway button, and any other money surface. Fail-open: when not logged in
or the portal is unreachable, ``logged_in`` is False / ``topup_url`` is None
and callers degrade gracefully.
"""
logged_in: bool
balance_lines: tuple[str, ...] = ()
identity_line: Optional[str] = None
topup_url: Optional[str] = None
depleted: bool = False
def build_credits_view(*, markdown: bool = False, timeout: float = 10.0) -> CreditsView:
"""Build the /credits view: balance block + identity line + top-up URL.
Reuses the same account fetch + snapshot + URL builder as the /usage credits
block, so the numbers always match. The balance block is the rendered
snapshot MINUS its trailing top-up/command-hint lines (the /credits surface
supplies its own affordance). Fail-open → ``CreditsView(logged_in=False)``.
"""
not_logged_in = CreditsView(logged_in=False)
try:
from hermes_cli.auth import get_provider_auth_state
tok = (get_provider_auth_state("nous") or {}).get("access_token")
if not (isinstance(tok, str) and tok.strip()):
return not_logged_in
except Exception:
return not_logged_in
try:
import concurrent.futures
from hermes_cli.nous_account import (
get_nous_portal_account_info,
nous_portal_topup_url,
)
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
account = pool.submit(get_nous_portal_account_info, force_fresh=True).result(
timeout=timeout
)
except Exception:
logger.debug("credits ▸ /credits portal fetch failed (fail-open)", exc_info=True)
return not_logged_in
if account is None or not getattr(account, "logged_in", False):
return not_logged_in
snapshot = build_nous_credits_snapshot(account)
# Balance lines = the snapshot block minus the two trailing affordance lines
# ("Top up: <url>" + "(or run /credits)") that build_nous_credits_snapshot
# appends for the /usage surface. /credits renders its own button/panel.
balance_lines: list[str] = []
if snapshot is not None:
rendered = render_account_usage_lines(snapshot, markdown=markdown)
balance_lines = [
line
for line in rendered
if not line.lstrip().startswith("Top up:")
and not line.lstrip().startswith("(or run")
]
# Identity line — shown before any open (roadmap §4.4).
email = getattr(account, "email", None)
org_name = getattr(account, "org_name", None)
who: list[str] = []
if email:
who.append(str(email))
if org_name:
who.append(f"org {org_name}")
identity_line = ("Topping up as " + " / ".join(who)) if who else None
return CreditsView(
logged_in=True,
balance_lines=tuple(balance_lines),
identity_line=identity_line,
topup_url=nous_portal_topup_url(account),
depleted=getattr(account, "paid_service_access", None) is False,
)
def _resolve_codex_usage_url(base_url: str) -> str:
normalized = (base_url or "").strip().rstrip("/")
if not normalized:

View File

@@ -900,6 +900,9 @@ def init_agent(
agent.api_key = client_kwargs.get("api_key", "")
agent.base_url = client_kwargs.get("base_url", agent.base_url)
try:
from agent.ssl_guard import verify_ca_bundle_with_fallback
verify_ca_bundle_with_fallback()
agent.client = agent._create_openai_client(client_kwargs, reason="agent_init", shared=True)
if not agent.quiet_mode:
print(f"🤖 AI Agent initialized with model: {agent.model}")
@@ -1193,38 +1196,8 @@ def init_agent(
_ra().logger.warning("Memory provider plugin init failed: %s", _mpe)
agent._memory_manager = None
# Inject memory provider tool schemas into the tool surface.
# Skip tools whose names already exist (plugins may register the
# same tools via ctx.register_tool(), which lands in agent.tools
# through _ra().get_tool_definitions()). Duplicate function names cause
# 400 errors on providers that enforce unique names (e.g. Xiaomi
# MiMo via Nous Portal).
#
# Respect the platform's enabled_toolsets configuration (#5544):
# enabled_toolsets is None → no filter, inject (backward compat)
# "memory" in enabled_toolsets → user opted in, inject
# otherwise (incl. []) → user excluded memory, skip injection
#
# Without this gate, `platform_toolsets: telegram: []` still leaks memory
# provider tools (fact_store, etc.) into the tool surface — a 10x latency
# penalty on local models and a frequent trigger of tool-call loops.
if agent._memory_manager and agent.tools is not None and (
agent.enabled_toolsets is None or "memory" in agent.enabled_toolsets
):
_existing_tool_names = {
t.get("function", {}).get("name")
for t in agent.tools
if isinstance(t, dict)
}
for _schema in agent._memory_manager.get_all_tool_schemas():
_tname = _schema.get("name", "")
if _tname and _tname in _existing_tool_names:
continue # already registered via plugin path
_wrapped = {"type": "function", "function": _schema}
agent.tools.append(_wrapped)
if _tname:
agent.valid_tool_names.add(_tname)
_existing_tool_names.add(_tname)
from agent.memory_manager import inject_memory_provider_tools as _inject_memory_provider_tools
_inject_memory_provider_tools(agent)
# Skills config: nudge interval for skill creation reminders
agent._skill_nudge_interval = 10
@@ -1624,6 +1597,12 @@ def init_agent(
agent.session_cache_write_tokens = 0
agent.session_reasoning_tokens = 0
agent.session_estimated_cost_usd = 0.0
# Provider-REPORTED cost only (e.g. OpenRouter usage.cost). None means
# "nothing reported" — distinct from a real $0.00.
agent.session_actual_cost_usd = None
# Per-model session usage rows for /usage: {model: {calls, input, output,
# cache_read, cache_write, cost_usd|None}}.
agent.session_model_usage = {}
agent.session_cost_status = "unknown"
agent.session_cost_source = "none"

View File

@@ -445,6 +445,45 @@ def repair_message_sequence(agent, messages: List[Dict]) -> int:
return repairs
def repair_message_sequence_with_cursor(agent, messages: List[Dict]) -> int:
"""Run :func:`repair_message_sequence` and keep the SessionDB flush
cursor consistent with the compacted list (#44837).
``repair_message_sequence`` merges/drops messages in place, shrinking
the list. ``_last_flushed_db_idx`` (the DB-write cursor) indexes into
that list, so after compaction it can point past the new end — the
turn-end flush would then skip the assistant/tool chain entirely — or
past unflushed messages shifted to lower indexes.
Repair preserves object identity for surviving messages, so counting
the survivors from the previously-flushed prefix gives the exact new
cursor even when messages are dropped/merged at indexes *before* the
cursor — a plain ``min()`` clamp would silently skip that many
unflushed rows. Falls back to the clamp when no prefix snapshot is
available.
Returns the number of repairs made (same as ``repair_message_sequence``).
"""
pre_repair_flushed_ids = None
flush_cursor = getattr(agent, "_last_flushed_db_idx", None)
if isinstance(flush_cursor, int) and flush_cursor > 0:
pre_repair_flushed_ids = {id(m) for m in messages[:flush_cursor]}
repairs = repair_message_sequence(agent, messages)
if repairs > 0 and hasattr(agent, "_last_flushed_db_idx"):
if pre_repair_flushed_ids is not None:
agent._last_flushed_db_idx = sum(
1 for m in messages if id(m) in pre_repair_flushed_ids
)
else:
agent._last_flushed_db_idx = min(
agent._last_flushed_db_idx, len(messages)
)
return repairs
def strip_think_blocks(agent, content: str) -> str:
"""Remove reasoning/thinking blocks from content, returning only visible text.
@@ -579,12 +618,33 @@ def recover_with_credential_pool(
current_provider = (getattr(agent, "provider", "") or "").strip().lower()
pool_provider = (getattr(pool, "provider", "") or "").strip().lower()
if current_provider and pool_provider and current_provider != pool_provider:
_ra().logger.warning(
"Credential pool provider mismatch: pool=%s, agent=%s"
"skipping pool mutation to avoid cross-provider contamination",
pool_provider, current_provider,
)
return False, has_retried_429
# Custom endpoints use two naming conventions for the SAME provider:
# the agent carries the generic ``custom`` label while the pool is
# keyed ``custom:<name>`` (see CUSTOM_POOL_PREFIX). A literal string
# compare treats them as a mismatch and skips recovery for every
# custom-provider user — 401s/429s then burn the full retry cycle
# with no rotation or refresh. Accept the pair as matching only when
# the agent's CURRENT base_url actually resolves to this pool key,
# so a fallback provider (or a different custom endpoint) still
# triggers the guard.
_custom_match = False
if current_provider == "custom" and pool_provider.startswith("custom:"):
try:
from agent.credential_pool import get_custom_provider_pool_key
_agent_base = (getattr(agent, "base_url", "") or "").strip()
_custom_match = bool(_agent_base) and (
(get_custom_provider_pool_key(_agent_base) or "").strip().lower()
== pool_provider
)
except Exception:
_custom_match = False
if not _custom_match:
_ra().logger.warning(
"Credential pool provider mismatch: pool=%s, agent=%s"
"skipping pool mutation to avoid cross-provider contamination",
pool_provider, current_provider,
)
return False, has_retried_429
effective_reason = classified_reason
if effective_reason is None:
@@ -679,15 +739,28 @@ def recover_with_credential_pool(
# long-running TUI sessions stuck on stale tokens until the user
# exited and reopened.
is_entitlement = agent._is_entitlement_failure(error_context, status_code)
_auth_haystack = " ".join(
str(error_context.get(k) or "").lower()
for k in ("message", "reason", "code", "error")
if isinstance(error_context, dict)
)
if (
not is_entitlement
and status_code == 403
and "oauth authentication is currently not allowed for this organization" in _auth_haystack
):
is_entitlement = True
if (
not is_entitlement
and status_code == 403
and (agent.provider or "") == "anthropic"
and getattr(agent, "api_mode", "") == "anthropic_messages"
):
is_entitlement = True
if not is_entitlement and status_code == 403 and (agent.provider or "") == "xai-oauth":
_disambiguator_haystack = " ".join(
str(error_context.get(k) or "").lower()
for k in ("message", "reason", "code", "error")
if isinstance(error_context, dict)
)
_is_xai_auth_failure = (
"[wke=unauthenticated:" in _disambiguator_haystack
or "oauth2 access token could not be validated" in _disambiguator_haystack
"[wke=unauthenticated:" in _auth_haystack
or "oauth2 access token could not be validated" in _auth_haystack
)
if not _is_xai_auth_failure:
is_entitlement = True
@@ -808,6 +881,8 @@ def try_recover_primary_transport(
def drop_thinking_only_and_merge_users(
messages: List[Dict[str, Any]],
*,
drop_codex_reasoning_items: bool = True,
) -> List[Dict[str, Any]]:
"""Drop thinking-only assistant turns; merge any adjacent user messages left behind.
@@ -829,7 +904,13 @@ def drop_thinking_only_and_merge_users(
return messages
# Pass 1: drop thinking-only assistant turns.
kept = [m for m in messages if not _ra().AIAgent._is_thinking_only_assistant(m)]
kept = [
m for m in messages
if not _ra().AIAgent._is_thinking_only_assistant(
m,
drop_codex_reasoning_items=drop_codex_reasoning_items,
)
]
dropped = len(messages) - len(kept)
if dropped == 0:
return messages

View File

@@ -751,6 +751,9 @@ def build_anthropic_client(
from httpx import Timeout
normalized_base_url = _normalize_base_url_text(base_url)
if normalized_base_url:
import re as _re
normalized_base_url = _re.sub(r"/v1/?$", "", normalized_base_url.rstrip("/"))
_read_timeout = timeout if (isinstance(timeout, (int, float)) and timeout > 0) else 900.0
kwargs = {
"timeout": Timeout(timeout=float(_read_timeout), connect=10.0),
@@ -1571,6 +1574,15 @@ def _convert_content_part_to_anthropic(part: Any) -> Optional[Dict[str, Any]]:
if ptype == "input_text":
block: Dict[str, Any] = {"type": "text", "text": part.get("text", "")}
elif ptype == "text":
# A stored Anthropic text block. Rebuild from whitelisted fields only —
# SDK response text blocks carry output-only siblings (parsed_output,
# citations=None) that the Messages INPUT schema rejects with HTTP 400
# "Extra inputs are not permitted". Do NOT dict(part) it verbatim.
block = {"type": "text", "text": part.get("text", "")}
cits = part.get("citations")
if isinstance(cits, list) and cits:
block["citations"] = cits
elif ptype in {"image_url", "input_image"}:
image_value = part.get("image_url", {})
url = image_value.get("url", "") if isinstance(image_value, dict) else str(image_value or "")
@@ -1685,6 +1697,58 @@ def _content_parts_to_anthropic_blocks(parts: Any) -> List[Dict[str, Any]]:
return out
def _sanitize_replay_block(b: Dict[str, Any]) -> Optional[Dict[str, Any]]:
"""Strip output-only fields from a stored Anthropic content block so it is
valid as REQUEST input on replay.
The SDK response objects carry output-only attributes that the Messages
*input* schema forbids ("Extra inputs are not permitted"): text blocks get
``parsed_output``/``citations`` (when null), tool_use blocks get ``caller``,
etc. ``normalize_response`` captured blocks verbatim via ``_to_plain_data``,
so these leak back as input on the next turn → HTTP 400.
Whitelist per type (NOT a blacklist) so future SDK output-only fields can't
reintroduce the bug. Returns a clean block, or None to drop it.
"""
if not isinstance(b, dict):
return None
btype = b.get("type")
if btype == "text":
out: Dict[str, Any] = {"type": "text", "text": b.get("text", "")}
# citations is input-valid ONLY when it's a non-empty list; the SDK
# emits citations=None on responses, which the input schema rejects.
cits = b.get("citations")
if isinstance(cits, list) and cits:
out["citations"] = cits
if isinstance(b.get("cache_control"), dict):
out["cache_control"] = b["cache_control"]
return out
if btype == "thinking":
out = {"type": "thinking", "thinking": b.get("thinking", "")}
if b.get("signature"):
out["signature"] = b["signature"]
return out
if btype == "redacted_thinking":
# Only valid with its data payload; drop if missing.
return {"type": "redacted_thinking", "data": b["data"]} if b.get("data") else None
if btype == "tool_use":
out = {
"type": "tool_use",
"id": _sanitize_tool_id(b.get("id", "")),
"name": b.get("name", ""),
"input": b.get("input", {}),
}
if isinstance(b.get("cache_control"), dict):
out["cache_control"] = b["cache_control"]
return out
if btype == "image":
src = b.get("source")
return {"type": "image", "source": src} if isinstance(src, dict) else None
# Unknown/unsupported block type on the input path — drop rather than risk
# another "Extra inputs are not permitted".
return None
def _convert_assistant_message(m: Dict[str, Any]) -> Dict[str, Any]:
"""Convert an assistant message to Anthropic content blocks.
@@ -1692,6 +1756,55 @@ def _convert_assistant_message(m: Dict[str, Any]) -> Dict[str, Any]:
reasoning_content injection for Kimi/DeepSeek endpoints.
"""
content = m.get("content", "")
# Anthropic interleaved-thinking fast path: when this turn carries a
# verbatim, order-preserving block list (set by normalize_response only
# for turns that interleave SIGNED thinking with tool_use), replay it.
# Each block is run through _sanitize_replay_block to strip output-only
# SDK fields (parsed_output, caller, citations=None, …) that the Messages
# INPUT schema forbids — replaying them verbatim caused HTTP 400 "Extra
# inputs are not permitted" (text.parsed_output). Block ORDER is preserved
# (the reason this channel exists); only forbidden sibling fields are
# dropped, leaving thinking signatures and tool_use id/name/input intact.
ordered_blocks = m.get("anthropic_content_blocks")
if isinstance(ordered_blocks, list) and ordered_blocks:
# Re-source each tool_use input from the stored tool_calls map rather
# than the captured block. The ordered-blocks list captures tool_use
# input from the RAW API response (normalize_response), which is NOT
# credential-redacted; tool_calls[].function.arguments IS redacted at
# storage time (build_assistant_message, #19798). Replaying the raw
# block input would resurrect a secret the model inlined into a tool
# call (e.g. terminal(command="curl -H 'Authorization: Bearer sk-...'")
# onto the wire, even though the same value is redacted everywhere else
# in history. Keying by sanitized tool id preserves interleave order
# (the reason this channel exists) while swapping in the redacted
# input. Adapted from #36071 (replay-time tool-input re-sourcing).
redacted_input_by_id: Dict[str, Any] = {}
for tc in m.get("tool_calls", []) or []:
if not isinstance(tc, dict):
continue
fn = tc.get("function", {}) or {}
raw_args = fn.get("arguments", "{}")
try:
parsed_args = json.loads(raw_args) if isinstance(raw_args, str) else raw_args
except (json.JSONDecodeError, ValueError):
parsed_args = {}
redacted_input_by_id[_sanitize_tool_id(tc.get("id", ""))] = parsed_args
replayed: List[Dict[str, Any]] = []
for b in ordered_blocks:
clean = _sanitize_replay_block(b)
if clean is None:
continue
if clean.get("type") == "tool_use":
# Override raw (un-redacted) input with the redacted copy when
# we have one for this id; fall back to the sanitized block
# input only if the tool_call is missing (shape mismatch).
redacted = redacted_input_by_id.get(clean.get("id", ""))
if redacted is not None:
clean["input"] = redacted
replayed.append(clean)
if replayed:
return {"role": "assistant", "content": replayed}
blocks = _extract_preserved_thinking_blocks(m)
if content:
if isinstance(content, list):

View File

@@ -1144,7 +1144,8 @@ def _endpoint_speaks_anthropic_messages(base_url: str) -> bool:
normalized = (base_url or "").strip().lower().rstrip("/")
if not normalized:
return False
if normalized.endswith("/anthropic"):
path = urlparse(normalized).path.rstrip("/")
if path.endswith("/anthropic") or path.endswith("/anthropic/v1"):
return True
hostname = base_url_hostname(normalized)
if hostname == "api.anthropic.com":
@@ -3190,7 +3191,7 @@ def _resolve_auto(main_runtime: Optional[Dict[str, Any]] = None) -> Tuple[Option
if (main_provider and main_model
and main_provider not in {"auto", ""}):
resolved_provider = main_provider
explicit_base_url = None
explicit_base_url = runtime_base_url or None
explicit_api_key = None
if runtime_base_url and (main_provider == "custom" or main_provider.startswith("custom:")):
resolved_provider = "custom"
@@ -5004,7 +5005,7 @@ def _build_call_kwargs(
# Provider-specific extra_body
merged_extra = dict(extra_body or {})
if provider == "nous" or auxiliary_is_nous:
if provider == "nous":
merged_extra.setdefault("tags", []).extend(_nous_portal_tags())
if merged_extra:
kwargs["extra_body"] = merged_extra

View File

@@ -208,6 +208,41 @@ def is_stale_connection_error(exc: BaseException) -> bool:
return False
def is_streaming_access_denied_error(exc: BaseException) -> bool:
"""Return True when AWS denied the ``bedrock:InvokeModelWithResponseStream`` action.
IAM policies scoped to ``bedrock:InvokeModel`` only (a common least-privilege
setup) reject ``converse_stream()`` with an ``AccessDeniedException`` whose
message names the streaming action, e.g.::
User: arn:aws:iam::123456789012:user/x is not authorized to perform:
bedrock:InvokeModelWithResponseStream on resource: ...
This is permanent for the session — retrying the stream can never succeed —
so callers should flip to the non-streaming ``converse()`` path (which maps
to ``bedrock:InvokeModel``) instead of burning retries.
Detection is deliberately message-based: boto3 surfaces this as a
``ClientError`` with ``Error.Code == "AccessDeniedException"``, and the
AnthropicBedrock SDK wraps the same AWS response in its own exception
types, but both preserve the action name in the message.
"""
msg = str(exc).lower()
if "invokemodelwithresponsestream" not in msg:
return False
# ClientError with an explicit access-denied code is the canonical form.
try:
from botocore.exceptions import ClientError
except ImportError: # pragma: no cover — botocore always present with boto3
ClientError = None # type: ignore[assignment]
if ClientError is not None and isinstance(exc, ClientError):
code = (getattr(exc, "response", None) or {}).get("Error", {}).get("Code", "")
return code in ("AccessDeniedException", "UnauthorizedException")
# Wrapped forms (e.g. AnthropicBedrock SDK PermissionDeniedError) — match
# on the authorization-failure phrasing AWS uses.
return "not authorized" in msg or "accessdenied" in msg
# ---------------------------------------------------------------------------
# AWS credential detection
# ---------------------------------------------------------------------------
@@ -900,11 +935,14 @@ def build_converse_kwargs(
if system_prompt:
kwargs["system"] = system_prompt
if temperature is not None:
kwargs["inferenceConfig"]["temperature"] = temperature
from agent.anthropic_adapter import _forbids_sampling_params
if top_p is not None:
kwargs["inferenceConfig"]["topP"] = top_p
if not _forbids_sampling_params(model):
if temperature is not None:
kwargs["inferenceConfig"]["temperature"] = temperature
if top_p is not None:
kwargs["inferenceConfig"]["topP"] = top_p
if stop_sequences:
kwargs["inferenceConfig"]["stopSequences"] = stop_sequences
@@ -1003,6 +1041,16 @@ def call_converse_stream(
try:
response = client.converse_stream(**kwargs)
except Exception as exc:
if is_streaming_access_denied_error(exc):
# IAM allows bedrock:InvokeModel but not
# InvokeModelWithResponseStream — permanent for this session.
# Fall back to the non-streaming converse() path.
logger.info(
"bedrock: converse_stream denied by IAM on (region=%s, model=%s) — "
"falling back to non-streaming converse().",
region, model,
)
return normalize_converse_response(client.converse(**kwargs))
if is_stale_connection_error(exc):
logger.warning(
"bedrock: stale-connection error on converse_stream(region=%s, "

View File

@@ -952,6 +952,18 @@ def build_assistant_message(agent, assistant_message, finish_reason: str) -> dic
if preserved:
msg["reasoning_details"] = preserved
# Anthropic interleaved-thinking replay: when a turn interleaves signed
# thinking blocks with tool_use, the parallel reasoning_details +
# tool_calls fields lose the cross-type ordering, and reconstruction
# front-loads thinking — reordering signed blocks and triggering HTTP 400
# ("thinking ... blocks in the latest assistant message cannot be
# modified"). Carry the verbatim ordered block list so the adapter can
# replay the latest assistant message unchanged. See
# agent/transports/anthropic.py and agent/anthropic_adapter.py.
ordered_blocks = getattr(assistant_message, "anthropic_content_blocks", None)
if ordered_blocks:
msg["anthropic_content_blocks"] = ordered_blocks
# Codex Responses API: preserve encrypted reasoning items for
# multi-turn continuity. These get replayed as input on the next turn.
codex_items = getattr(assistant_message, "codex_reasoning_items", None)
@@ -1603,6 +1615,8 @@ def interruptible_streaming_api_call(agent, api_kwargs: dict, *, on_first_delta=
_get_bedrock_runtime_client,
invalidate_runtime_client,
is_stale_connection_error,
is_streaming_access_denied_error,
normalize_converse_response,
stream_converse_with_callbacks,
)
region = api_kwargs.pop("__bedrock_region__", "us-east-1")
@@ -1611,6 +1625,29 @@ def interruptible_streaming_api_call(agent, api_kwargs: dict, *, on_first_delta=
try:
raw_response = client.converse_stream(**api_kwargs)
except Exception as _bedrock_exc:
# IAM policies scoped to bedrock:InvokeModel only (no
# InvokeModelWithResponseStream) reject converse_stream()
# with AccessDeniedException. That denial is permanent for
# the session — fall back to the non-streaming converse()
# inline (it maps to bedrock:InvokeModel) and disable
# streaming for subsequent calls so we don't re-fail every
# turn.
if is_streaming_access_denied_error(_bedrock_exc):
agent._disable_streaming = True
agent._safe_print(
"\n⚠ AWS IAM denied bedrock:InvokeModelWithResponseStream — "
"falling back to non-streaming InvokeModel.\n"
" Grant that action to restore streaming output.\n"
)
logger.info(
"bedrock: converse_stream denied by IAM (%s) — "
"using non-streaming converse() for this session.",
type(_bedrock_exc).__name__,
)
result["response"] = normalize_converse_response(
client.converse(**api_kwargs)
)
return
# Evict the cached client on stale-connection failures
# so the outer retry loop builds a fresh client/pool.
if is_stale_connection_error(_bedrock_exc):
@@ -1698,6 +1735,14 @@ def interruptible_streaming_api_call(agent, api_kwargs: dict, *, on_first_delta=
# poll loop uses this to detect stale connections that keep receiving
# SSE keep-alive pings but no actual data.
last_chunk_time = {"t": time.time()}
# Stale-stream patience, shared between the httpx socket read timeout
# (built in ``_call_chat_completions`` below) and the stale-stream detector
# (computed further down, before the worker thread starts). Initialized
# here so the read-timeout builder can floor itself at the stale value and
# never fire before the detector. ``None`` until the detector value is
# resolved, so the builder degrades to its plain default if it ever runs
# first.
_stream_stale_timeout = None
def _fire_first_delta():
if not first_delta_fired["done"] and on_first_delta:
@@ -1734,6 +1779,26 @@ def interruptible_streaming_api_call(agent, api_kwargs: dict, *, on_first_delta=
"Local provider detected (%s) — stream read timeout raised to %.0fs",
agent.base_url, _stream_read_timeout,
)
elif (
_stream_read_timeout == 120.0
and _stream_stale_timeout is not None
and _stream_stale_timeout != float("inf")
and _stream_stale_timeout > _stream_read_timeout
):
# Cloud reasoning models (e.g. Opus) routinely pause mid-stream
# for minutes during extended thinking. The stale-stream
# detector is deliberately scaled up to tolerate this (180300s,
# see the stale-timeout block below), but the raw httpx socket
# read timeout defaulted to a flat 120s and fired *first* —
# tearing down a healthy reasoning stream before the stale
# detector (which owns retry + diagnostics) could act. Keep the
# socket read timeout in step with the detector so it no longer
# preempts it.
_stream_read_timeout = _stream_stale_timeout
logger.debug(
"Cloud reasoning stream — read timeout raised to %.0fs to "
"match stale-stream detector", _stream_read_timeout,
)
# Cap connect/pool at 60s even when provider timeout is higher.
# connect/pool cover TCP handshake, not model inference.
_conn_cap = min(_base_timeout, 60.0) if _provider_timeout_cfg is not None else 30.0
@@ -2384,9 +2449,34 @@ def interruptible_streaming_api_call(agent, api_kwargs: dict, *, on_first_delta=
"stream" in _err_lower
and "not supported" in _err_lower
)
if _is_stream_unsupported:
# AWS Bedrock (AnthropicBedrock SDK path): IAM policies
# with bedrock:InvokeModel but not
# InvokeModelWithResponseStream reject messages.stream()
# with a permission error naming the streaming action.
# Permanent for the session — flip to non-streaming
# (messages.create() maps to bedrock:InvokeModel).
_is_bedrock_stream_denied = False
if (
not _is_stream_unsupported
and "invokemodelwithresponsestream" in _err_lower
):
# Cheap message pre-check before importing the
# adapter — bedrock_adapter triggers a lazy boto3
# install at import time, which must not run for
# unrelated providers' stream errors.
from agent.bedrock_adapter import (
is_streaming_access_denied_error,
)
_is_bedrock_stream_denied = (
is_streaming_access_denied_error(e)
)
if _is_stream_unsupported or _is_bedrock_stream_denied:
agent._disable_streaming = True
agent._safe_print(
"\n⚠ AWS IAM denied bedrock:InvokeModelWithResponseStream. "
"Switching to non-streaming.\n"
" Grant that action to restore streaming output.\n"
if _is_bedrock_stream_denied else
"\n⚠ Streaming is not supported for this "
"model/provider. Switching to non-streaming.\n"
" To avoid this delay, set display.streaming: false "

View File

@@ -127,14 +127,21 @@ def _chat_content_to_responses_parts(content: Any, *, role: str = "user") -> Lis
return converted
def _summarize_user_message_for_log(content: Any) -> str:
"""Return a short text summary of a user message for logging/trajectory.
def _summarize_user_message_for_log(content: Any, *, sep: str = " ") -> str:
"""Flatten message content to a plain-text summary.
Multimodal messages arrive as a list of ``{type:"text"|"image_url", ...}``
parts from the API server. Logging, spinner previews, and trajectory
files all want a plain string — this helper extracts the first chunk of
text and notes any attached images. Returns an empty string for empty
lists and ``str(content)`` for unexpected scalar types.
parts from the API server. Several consumers want a plain string:
- Logging, spinner previews, and trajectory files (the default ``sep=" "``).
- External memory providers, which feed the text to regexes
(``sanitize_context``) and text APIs — a raw list crashes the sync with
``expected string or bytes-like object, got 'list'`` (use ``sep="\\n"``).
Text parts are joined with ``sep``; images become a ``[N image(s)]`` marker
so the turn isn't recorded as if the attachment never existed. Returns an
empty string for empty lists and ``str(content)`` for unexpected scalar
types.
"""
if content is None:
return ""
@@ -157,7 +164,7 @@ def _summarize_user_message_for_log(content: Any) -> str:
text_bits.append(text)
elif ptype in {"image_url", "input_image"}:
image_count += 1
summary = " ".join(text_bits).strip()
summary = sep.join(text_bits).strip()
if image_count:
note = f"[{image_count} image{'s' if image_count != 1 else ''}]"
summary = f"{note} {summary}" if summary else note
@@ -1074,6 +1081,7 @@ def _normalize_codex_response(
message_items_raw: List[Dict[str, Any]] = []
tool_calls: List[Any] = []
has_incomplete_items = response_status in {"queued", "in_progress", "incomplete"}
saw_streaming_or_item_incomplete = response_status in {"queued", "in_progress"}
saw_commentary_phase = False
saw_final_answer_phase = False
saw_reasoning_item = False
@@ -1088,6 +1096,7 @@ def _normalize_codex_response(
if item_status in {"queued", "in_progress", "incomplete"}:
has_incomplete_items = True
saw_streaming_or_item_incomplete = True
if item_type == "message":
item_phase = getattr(item, "phase", None)
@@ -1245,7 +1254,9 @@ def _normalize_codex_response(
finish_reason = "tool_calls"
elif leaked_tool_call_text:
finish_reason = "incomplete"
elif has_incomplete_items or (saw_commentary_phase and not saw_final_answer_phase):
elif saw_streaming_or_item_incomplete:
finish_reason = "incomplete"
elif (has_incomplete_items or saw_commentary_phase) and not saw_final_answer_phase:
finish_reason = "incomplete"
elif (reasoning_items_raw or reasoning_parts or saw_reasoning_item) and not final_text:
# Response contains only reasoning (encrypted thinking state and/or

738
agent/coding_context.py Normal file
View File

@@ -0,0 +1,738 @@
"""Coding-context awareness — base Hermes, every interactive surface.
When the user runs Hermes inside a code workspace (CLI, TUI, desktop app, or an
editor over ACP), Hermes shifts into a **coding posture**. This module is the
single place that decides whether we're in that posture and what it implies,
so the rest of the codebase never re-derives "are we coding?" on its own.
Architecture — one seam, many consumers
----------------------------------------
The posture is modelled as a frozen :class:`RuntimeMode` selected from a small
:class:`ContextProfile` registry (today: ``coding`` and ``general``). A profile
is *data* — it declares the toolset to collapse to, the operating brief to
inject, and hints for other domains (model routing, memory, subagents). Every
domain reads the same resolved object instead of probing git/config itself:
* **System prompt** — ``RuntimeMode.system_blocks()`` → the operating brief +
a live git/workspace snapshot (``agent/system_prompt.py``).
* **Toolset** — ``RuntimeMode.toolset_selection()`` → the ``coding`` toolset
plus the user's enabled MCP servers (``cli.py`` / ``tui_gateway``). Only
under the opt-in ``focus`` mode: the default posture is prompt-only and
never touches the user's configured toolsets (toolsets like messaging /
smart-home / music are off-by-default anyway, and someone who explicitly
enabled image-gen or Spotify shouldn't lose it for being in a git repo).
* **Delegation** — subagents inherit the parent's toolset and run through the
same prompt builder, so the coding posture propagates to children for free.
* **Model / memory / compression** — declared on the profile
(``model_hint``, ``memory_policy``) as the extension seam; consumers read
``mode.profile`` rather than re-deciding.
Cache safety
------------
The mode is resolved **once** and is immutable. The workspace snapshot is built
once at prompt-build time and baked into the *stable* system-prompt tier — never
re-probed per turn (that would shatter the prompt cache). Branch and dirty state
drift mid-session, so the brief tells the model to re-check with ``git`` before
acting on the snapshot. A ``/coding`` flip therefore only takes effect next
session (deferred), the same contract as ``/skills install`` vs ``--now``.
Activation (config ``agent.coding_context``):
* ``auto`` (default) — posture (brief + snapshot) on an interactive coding
surface sitting in a code workspace (git repo or recognised project root).
Prompt-only; toolsets and the skill index untouched.
* ``focus`` — like ``auto``, but additionally collapses the toolset to the
``coding`` set + enabled MCP servers and demotes non-coding skill
categories to names-only in the prompt's skill index (no skill is ever
hidden). Explicit opt-in for a lean schema.
* ``on`` — force the posture anywhere (incl. non-workspaces). Prompt-only.
* ``off`` — disable entirely.
"""
from __future__ import annotations
import json
import logging
import os
import re
import subprocess
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Optional
logger = logging.getLogger("hermes.coding_context")
CODING_TOOLSET = "coding"
# Surfaces where a coding posture makes sense under ``auto``. Messaging
# platforms (telegram, discord, slack, …) are intentionally absent — a chat bot
# in a group is not pair-programming.
INTERACTIVE_CODING_PLATFORMS = {"cli", "tui", "acp", "desktop", ""}
# Project-root signals that mark a directory as a code workspace even when it
# isn't (yet) a git repo. Cheap filename checks — no parsing.
_PROJECT_MARKERS = (
"pyproject.toml", "setup.py", "setup.cfg", "requirements.txt",
"package.json", "tsconfig.json", "deno.json",
"Cargo.toml", "go.mod", "pom.xml", "build.gradle", "build.gradle.kts",
"Gemfile", "composer.json", "mix.exs", "pubspec.yaml",
"CMakeLists.txt", "Makefile", "Dockerfile",
"AGENTS.md", "CLAUDE.md", ".cursorrules",
)
# Agent-instruction files surfaced separately from manifests in the snapshot.
_CONTEXT_FILES = ("AGENTS.md", "CLAUDE.md", ".cursorrules")
# Lockfile → package manager, checked in priority order.
_PY_LOCKFILES = (("uv.lock", "uv"), ("poetry.lock", "poetry"), ("Pipfile.lock", "pipenv"))
_JS_LOCKFILES = (
("pnpm-lock.yaml", "pnpm"), ("bun.lockb", "bun"), ("bun.lock", "bun"),
("yarn.lock", "yarn"), ("package-lock.json", "npm"),
)
# package.json scripts / Makefile targets worth surfacing as verify commands.
_VERIFY_TARGETS = ("test", "tests", "lint", "typecheck", "check", "build", "fmt", "format")
_MAX_VERIFY_COMMANDS = 8
_MAX_FACT_FILE_BYTES = 256 * 1024
_GIT_TIMEOUT = 2.5
# Per-model edit-format steering. Matching the edit tool format to how a model
# was trained reduces mistakes and wasted reasoning (OpenAI/Codex handle
# patch-style diffs best; Anthropic models — and most open-weight coding
# models, whose RL scaffolds use str_replace-style editors — do best with
# string-replacement). Our `patch` tool exposes both: mode="patch" (V4A
# multi-file) and mode="replace" (find-and-swap). We nudge each family toward
# its native format. Unknown families get nothing (the brief's neutral wording
# stands). Substrings match the model id; aligned with TOOL_USE_ENFORCEMENT_MODELS.
#
# GPT/Codex get V4A for ALL edits, single-file included: in codex-rs,
# apply_patch (V4A — apply_patch.lark) is the ONLY file editor, no
# str_replace-style tool exists, and the shipped model prompts say to use
# apply_patch even "for single file edits" — so a replace-mode nudge would
# steer those models toward a format their first-party harness never taught
# them.
_EDIT_FORMAT_GUIDANCE: dict[str, tuple[tuple[str, ...], str]] = {
"patch": (
("gpt", "codex"),
"- Edit format: author new files with `write_file`; for edits to "
"existing code use `patch` with `mode='patch'` (V4A diff) — including "
"single-file edits. It's the edit format you handle most reliably.",
),
"replace": (
("claude", "sonnet", "opus", "haiku",
"gemini", "gemma", "deepseek", "qwen", "kimi", "glm", "grok",
"hermes", "llama", "mistral", "devstral", "minimax"),
"- Edit format: author new files with `write_file`; for edits to "
"existing code prefer `patch` in `mode='replace'` — match a unique "
"snippet and swap it. Reach for `mode='patch'` (V4A) only when an edit "
"genuinely spans several files at once.",
),
}
def _model_family(model: Optional[str]) -> Optional[str]:
"""Classify a model id into an edit-format family key, or ``None``.
Used to steer the coding posture toward the edit tool format a model was
trained on. Family-agnostic by design: an unrecognised model gets ``None``
and the operating brief's neutral edit wording applies.
"""
if not model:
return None
lowered = model.lower()
for family, (needles, _line) in _EDIT_FORMAT_GUIDANCE.items():
if any(n in lowered for n in needles):
return family
return None
def _edit_format_line(model: Optional[str]) -> str:
"""The edit-format guidance line for this model's family (``""`` if none)."""
family = _model_family(model)
if family is None:
return ""
return _EDIT_FORMAT_GUIDANCE[family][1]
# Operating brief for the coding posture. Tool names referenced here (read_file,
# search_files, patch, write_file, terminal, todo) are in the coding toolset and
# in _HERMES_CORE_TOOLS, so they're present on every surface this fires on.
CODING_AGENT_GUIDANCE = (
"You are a coding agent pairing with the user inside their codebase. "
"Operate like a careful senior engineer.\n"
"\n"
"Gather context first:\n"
"- Read the relevant files with `read_file` and locate code with "
"`search_files` before changing anything. Trace a symbol to its definition "
"and usages rather than guessing its shape.\n"
"- Batch independent lookups: when several reads/searches don't depend on "
"each other, issue them together in one turn instead of one at a time.\n"
"- Never invent files, symbols, APIs, or imports. If you haven't seen it in "
"the repo, go look. Don't assume a library is available — check the project "
"manifest (pyproject.toml / package.json / Cargo.toml / go.mod) and how "
"neighbouring files import it.\n"
"\n"
"Make changes through the tools, not the chat:\n"
"- Edit with `patch`/`write_file`. Do NOT print code blocks to the user as "
"a substitute for editing — apply the change, then summarise it. Only show "
"code when the user explicitly asks to see it.\n"
"- Match the project's existing style and conventions; AGENTS.md / "
"CLAUDE.md / .cursorrules already in context win over your defaults. Touch "
"only what the task needs — no drive-by refactors, renames, or reformatting "
"— and add any imports/dependencies your code requires.\n"
"- If an edit fails to apply, re-read the file to get the current exact "
"contents before retrying — don't repeat a stale patch. If the same region "
"fails twice, rewrite the enclosing function or file with `write_file` "
"instead of attempting a third patch.\n"
"\n"
"Verify, and know when to stop:\n"
"- Use `terminal` for git, builds, tests, and inspection. Run the relevant "
"tests/linter/build and confirm they pass before claiming the work is done.\n"
"- Terminal state persists across calls: current directory and exported "
"environment variables carry forward. Activate a virtualenv or export setup "
"vars once, then reuse that state instead of re-sourcing it before every "
"test command.\n"
"- Fix root causes, not symptoms: when you find a bug, check sibling call "
"paths for the same flaw and fix the class, not just the reported site.\n"
"- When fixing linter/type errors on a file, stop after about three "
"attempts on the same file and ask the user rather than looping.\n"
"- Track multi-step work with `todo`. Reference code as `path:line` instead "
"of pasting whole files.\n"
"\n"
"Respect the user's repo: don't commit, push, or rewrite history unless "
"asked, and never read, print, or commit secrets — leave `.env` and "
"credential files alone unless the user explicitly asks. The Workspace "
"block below is a snapshot from session start — re-run `git status`/"
"`git branch` before relying on it. Be concise: lead with the change or "
"answer, not a preamble."
)
# ── Context profiles (declarative posture definitions) ──────────────────────
@dataclass(frozen=True)
class ContextProfile:
"""A named operating posture. Pure data — consumers read these fields.
``toolset`` — collapse to this toolset (+ enabled MCP) when no explicit
selection is pinned; ``None`` keeps the platform default.
``guidance`` — operating brief injected into the stable system prompt;
``""`` injects nothing.
``model_hint`` — routing preference key for smart model routing
(extension seam; not yet consumed by the router).
``memory_policy``— memory namespace/weighting hint (extension seam).
``compact_skill_categories`` — skill categories DEMOTED to names-only in
the system-prompt skill index under the opt-in ``focus``
mode. Never hidden: every skill name stays visible
(so memory-anchored recall keeps working) — only the
descriptions are dropped to cut index noise. Deny-list
semantics so unknown/custom categories keep full
entries.
"""
name: str
toolset: Optional[str] = None
guidance: str = ""
model_hint: Optional[str] = None
memory_policy: str = "default"
compact_skill_categories: tuple[str, ...] = ()
# Skill categories that are clearly not part of a coding workflow. Demoted to
# names-only in the prompt's skill index under the opt-in ``focus`` mode only
# (deny-list — anything not listed here, incl. custom user categories, keeps
# full entries). Coding-adjacent categories (devops, github, mcp,
# data-science, diagramming, research, security, …) are intentionally absent.
_NON_CODING_SKILL_CATEGORIES = (
"apple", "communication", "cooking", "creative", "email", "finance",
"gaming", "gifs", "health", "media", "music", "note-taking",
"productivity", "shopping", "smart-home", "social-media", "travel",
"yuanbao",
)
GENERAL_PROFILE = ContextProfile(name="general")
CODING_PROFILE = ContextProfile(
name="coding",
toolset=CODING_TOOLSET,
guidance=CODING_AGENT_GUIDANCE,
model_hint="coding",
memory_policy="project",
compact_skill_categories=_NON_CODING_SKILL_CATEGORIES,
)
_PROFILES: dict[str, ContextProfile] = {
GENERAL_PROFILE.name: GENERAL_PROFILE,
CODING_PROFILE.name: CODING_PROFILE,
}
def get_profile(name: str) -> ContextProfile:
"""Return a registered profile, falling back to ``general``."""
return _PROFILES.get(name, GENERAL_PROFILE)
# ── Helpers ─────────────────────────────────────────────────────────────────
def _coding_mode(config: Optional[dict[str, Any]]) -> str:
"""Return the normalized ``agent.coding_context`` mode (auto/focus/on/off)."""
if config is None:
try:
from hermes_cli.config import load_config
config = load_config()
except Exception:
config = {}
raw = ((config or {}).get("agent", {}) or {}).get("coding_context", "auto")
mode = str(raw).strip().lower()
if mode in {"focus", "strict", "lean"}:
return "focus"
if mode in {"on", "true", "yes", "1", "always"}:
return "on"
if mode in {"off", "false", "no", "0", "never"}:
return "off"
return "auto"
def _resolve_cwd(cwd: Optional[str | Path]) -> Path:
if cwd:
return Path(cwd).expanduser()
try:
from agent.runtime_cwd import resolve_agent_cwd
return resolve_agent_cwd()
except Exception:
return Path(os.getcwd())
def _git_root(cwd: Path) -> Optional[Path]:
current = cwd.resolve()
for parent in [current, *current.parents]:
if (parent / ".git").exists():
return parent
return None
def _home() -> Optional[Path]:
try:
return Path.home().resolve()
except (OSError, RuntimeError):
return None
def _marker_root(cwd: Path) -> Optional[Path]:
"""Nearest ancestor that looks like a project root, or ``None``.
Walks up at most a few levels so a manifest in the workspace root counts
even when the user is in a subdirectory. ``$HOME`` itself is skipped — a
Makefile or AGENTS.md sitting in the home directory is global user config,
not a project-root signal.
"""
current = cwd.resolve()
home = _home()
for depth, parent in enumerate([current, *current.parents]):
if depth > 6:
break
if parent == home:
continue
for marker in _PROJECT_MARKERS:
if (parent / marker).exists():
return parent
return None
def _detect_profile_name(mode: str, platform: str, cwd_str: str) -> str:
"""Resolve which profile applies.
``auto``/``focus``: coding when the surface is interactive AND the cwd is a
code workspace (a git repo or a recognised project root). ``on``: always
coding. ``off``: always general.
A git repo rooted at ``$HOME`` (the dotfiles pattern) is NOT a workspace
signal — without the guard, every session anywhere under a dotfiles-managed
home directory would silently flip to the coding posture.
Detection is intentionally not memoized: it's a handful of ``stat`` calls,
and callers resolve the mode once per session anyway. Caching here would
risk a stale posture if a long-lived process (gateway/TUI) serves sessions
from different working directories.
"""
if mode == "off":
return GENERAL_PROFILE.name
if mode == "on":
return CODING_PROFILE.name
if platform and platform.strip().lower() not in INTERACTIVE_CODING_PLATFORMS:
return GENERAL_PROFILE.name
cwd = Path(cwd_str)
git_root = _git_root(cwd)
if git_root is not None and git_root == _home():
git_root = None # dotfiles repo at $HOME — not a code workspace
if git_root is not None or _marker_root(cwd) is not None:
return CODING_PROFILE.name
return GENERAL_PROFILE.name
# ── RuntimeMode (the seam) ──────────────────────────────────────────────────
@dataclass(frozen=True)
class RuntimeMode:
"""The resolved operating posture for a session. Immutable by construction.
Built once via :func:`resolve_runtime_mode` and consumed by every domain
that cares about the coding/general distinction. Never mutate or re-resolve
mid-session — that would break the prompt cache.
"""
profile: ContextProfile
surface: str
cwd: Path
# The normalized ``agent.coding_context`` mode this posture was resolved
# under (auto/focus/on/off). Toolset collapse is gated on ``focus``.
config_mode: str = "auto"
# The model id this session runs (e.g. "anthropic/claude-opus-4.8"). Used
# only to steer edit-format guidance toward the model's family — see
# ``_edit_format_line``. Fixed for the session, so cache-safe.
model: Optional[str] = None
@property
def kind(self) -> str:
return self.profile.name
@property
def is_coding(self) -> bool:
return self.profile.name == CODING_PROFILE.name
def toolset_selection(self, config: Optional[dict[str, Any]] = None) -> Optional[list[str]]:
"""Toolset list for this posture, or ``None`` to keep the platform default.
Non-``None`` only under the opt-in ``focus`` mode. The default posture
is prompt-only: most strippable toolsets are off-by-default anyway, and
a user who explicitly enabled one (image-gen for frontend/game assets,
messaging for build notifications, …) keeps it while coding.
Callers apply this only when the user hasn't pinned an explicit
selection (``--toolsets``, ``HERMES_TUI_TOOLSETS``, …); they never
override a pin. Returns the profile's toolset plus enabled MCP servers.
"""
if self.config_mode != "focus":
return None
if self.profile.toolset is None:
return None
return [self.profile.toolset, *_enabled_mcp_servers(config)]
def system_blocks(self) -> list[str]:
"""Stable system-prompt blocks for this posture (brief + workspace).
The operating brief carries a model-family edit-format nudge appended
to it (one cached string, not a separate block) so the model is steered
toward the `patch` mode it handles best — see ``_edit_format_line``.
"""
if not self.is_coding:
return []
blocks: list[str] = []
if self.profile.guidance:
brief = self.profile.guidance
edit_line = _edit_format_line(self.model)
if edit_line:
brief = f"{brief}\n{edit_line}"
blocks.append(brief)
workspace = build_coding_workspace_block(self.cwd)
if workspace:
blocks.append(workspace)
return blocks
def compact_skill_categories(self) -> frozenset[str]:
"""Skill categories to demote to names-only in the prompt's skill index.
Gated on the opt-in ``focus`` mode, like the toolset collapse: the
default posture leaves the skill index untouched. Users who didn't ask
for a lean prompt keep full entries for every category — index changes
under ``auto`` proved too surprising in practice, even names-only ones
(a demoted description is information the model no longer weighs when
deciding what to load).
Demoted — never hidden — even under ``focus``. An earlier revision
fully pruned these categories from the index, which caused silent
capability loss in a real workflow: agent-created skills are the
model's accumulated project memory (server-ops runbooks, learned
pitfalls, …), and models do not reliably reach for ``skills_list`` to
rediscover what the index stopped showing them. Names-only keeps every
skill loadable on recall while still cutting the description noise.
"""
if not self.is_coding or self.config_mode != "focus":
return frozenset()
return frozenset(self.profile.compact_skill_categories)
def resolve_runtime_mode(
*,
platform: Optional[str] = None,
cwd: Optional[str | Path] = None,
config: Optional[dict[str, Any]] = None,
model: Optional[str] = None,
) -> RuntimeMode:
"""Resolve the operating posture once. Cheap — a handful of ``stat`` calls.
This is the single entry point every domain should call. The returned
object is immutable and safe to cache for the session. Detection itself is
intentionally *not* memoized (see ``_detect_profile_name``) so a long-lived
process can't pin a stale posture; callers resolve once per session and
hold the result. ``model`` is recorded only to steer edit-format guidance;
it never affects detection.
"""
resolved_cwd = _resolve_cwd(cwd)
mode = _coding_mode(config)
name = _detect_profile_name(
mode, (platform or "").strip().lower(), str(resolved_cwd)
)
return RuntimeMode(
profile=get_profile(name),
surface=platform or "",
cwd=resolved_cwd,
config_mode=mode,
model=model,
)
# ── Back-compat surface (thin wrappers over RuntimeMode) ────────────────────
def is_coding_context(
*,
platform: Optional[str] = None,
cwd: Optional[str | Path] = None,
config: Optional[dict[str, Any]] = None,
) -> bool:
"""Whether Hermes should operate in its coding posture right now."""
return resolve_runtime_mode(platform=platform, cwd=cwd, config=config).is_coding
def coding_selection(
*,
platform: Optional[str] = None,
cwd: Optional[str | Path] = None,
config: Optional[dict[str, Any]] = None,
) -> Optional[list[str]]:
"""Toolset selection for the coding posture.
``None`` unless the user opted into ``focus`` mode AND the posture is
active — the default coding posture never overrides configured toolsets.
"""
return resolve_runtime_mode(
platform=platform, cwd=cwd, config=config
).toolset_selection(config)
def coding_system_blocks(
*,
platform: Optional[str] = None,
cwd: Optional[str | Path] = None,
config: Optional[dict[str, Any]] = None,
model: Optional[str] = None,
) -> list[str]:
"""Stable system-prompt blocks for the current posture (empty when general).
``model`` steers the brief's edit-format nudge toward the model's family.
"""
return resolve_runtime_mode(
platform=platform, cwd=cwd, config=config, model=model
).system_blocks()
def coding_compact_skill_categories(
*,
platform: Optional[str] = None,
cwd: Optional[str | Path] = None,
config: Optional[dict[str, Any]] = None,
) -> frozenset[str]:
"""Skill categories the active posture demotes to names-only in the index.
Empty outside the coding posture and outside the opt-in ``focus`` mode —
the default posture never touches the skill index. Under ``focus``,
demoted — never hidden: every skill name stays in the index and remains
loadable via ``skill_view`` / ``skills_list``; only descriptions are
dropped.
"""
return resolve_runtime_mode(
platform=platform, cwd=cwd, config=config
).compact_skill_categories()
def _enabled_mcp_servers(config: Optional[dict[str, Any]]) -> list[str]:
"""Names of MCP servers the user has enabled — kept in the coding posture.
MCP servers (figma, browser, tophat, …) are explicitly configured and part
of the coding workflow, not noise to strip.
"""
try:
from hermes_cli.config import read_raw_config
from hermes_cli.tools_config import _parse_enabled_flag
servers = read_raw_config().get("mcp_servers") or {}
return [
str(name)
for name, cfg in servers.items()
if isinstance(cfg, dict)
and _parse_enabled_flag(cfg.get("enabled", True), default=True)
]
except Exception:
return []
# ── git/workspace probe ─────────────────────────────────────────────────────
def _git(cwd: Path, *args: str) -> str:
try:
out = subprocess.run(
["git", "-C", str(cwd), *args],
capture_output=True,
text=True,
timeout=_GIT_TIMEOUT,
)
except (OSError, subprocess.SubprocessError):
return ""
return out.stdout.strip() if out.returncode == 0 else ""
def _parse_status(porcelain: str) -> tuple[dict[str, str], dict[str, int]]:
"""Parse ``git status --porcelain=2 --branch`` into branch + counts."""
branch: dict[str, str] = {}
counts = {"staged": 0, "modified": 0, "untracked": 0, "conflicts": 0}
for line in porcelain.splitlines():
if line.startswith("# branch.head"):
branch["head"] = line.split(maxsplit=2)[-1]
elif line.startswith("# branch.upstream"):
branch["upstream"] = line.split(maxsplit=2)[-1]
elif line.startswith("# branch.ab"):
parts = line.split()
branch["ahead"], branch["behind"] = parts[2].lstrip("+"), parts[3].lstrip("-")
elif line.startswith(("1 ", "2 ")):
xy = line.split(maxsplit=2)[1]
if xy[0] != ".":
counts["staged"] += 1
if xy[1] != ".":
counts["modified"] += 1
elif line.startswith("u "):
counts["conflicts"] += 1
elif line.startswith("? "):
counts["untracked"] += 1
return branch, counts
def _read_small(path: Path) -> str:
"""Read a small text file, or ``""`` — never raises, never reads huge files."""
try:
if not path.is_file() or path.stat().st_size > _MAX_FACT_FILE_BYTES:
return ""
return path.read_text(encoding="utf-8", errors="replace")
except OSError:
return ""
def _project_facts(root: Path) -> list[str]:
"""Detected project facts for the workspace snapshot.
The point is to hand the model its *verify loop* up front — which manifest,
which package manager, and the exact test/lint/build commands — instead of
making it rediscover them every session. Cheap: stat calls plus reads of a
couple of small files; built once at prompt-build time (cache-safe).
"""
facts: list[str] = []
manifests = [m for m in _PROJECT_MARKERS if m not in _CONTEXT_FILES and (root / m).is_file()]
package_managers = [
pm for lock, pm in (*_PY_LOCKFILES, *_JS_LOCKFILES) if (root / lock).is_file()
]
if manifests:
line = f"- Project: {', '.join(manifests[:6])}"
if package_managers:
line += f" ({'/'.join(dict.fromkeys(package_managers))})"
facts.append(line)
verify: list[str] = []
if (root / "scripts" / "run_tests.sh").is_file():
verify.append("scripts/run_tests.sh")
if (root / "package.json").is_file():
try:
scripts = json.loads(_read_small(root / "package.json") or "{}").get("scripts") or {}
except (json.JSONDecodeError, AttributeError):
scripts = {}
js_pm = next((pm for lock, pm in _JS_LOCKFILES if (root / lock).is_file()), "npm")
verify.extend(f"{js_pm} run {name}" for name in _VERIFY_TARGETS if name in scripts)
if (root / "pytest.ini").is_file() or "[tool.pytest" in _read_small(root / "pyproject.toml"):
verify.append("pytest")
makefile = _read_small(root / "Makefile")
if makefile:
verify.extend(
f"make {name}" for name in _VERIFY_TARGETS
if re.search(rf"^{re.escape(name)}\s*:", makefile, re.MULTILINE)
)
if verify:
deduped = list(dict.fromkeys(verify))[:_MAX_VERIFY_COMMANDS]
facts.append(f"- Verify: {'; '.join(deduped)}")
context_files = [c for c in _CONTEXT_FILES if (root / c).is_file()]
if context_files:
facts.append(f"- Context files: {', '.join(context_files)}")
return facts
def build_coding_workspace_block(cwd: Optional[str | Path] = None) -> str:
"""Workspace snapshot for the system prompt (empty outside a workspace).
Git state (branch/status/commits) when the cwd is in a repo, plus detected
project facts (manifest, package manager, verify commands, context files)
— so marker-only (non-git) projects still get a snapshot.
"""
resolved = _resolve_cwd(cwd)
git_root = _git_root(resolved)
root = git_root or _marker_root(resolved)
if root is None:
return ""
lines = ["Workspace (snapshot at session start — re-check with `git` before acting on it):"]
lines.append(f"- Root: {root}")
if git_root is not None:
branch, counts = _parse_status(_git(root, "status", "--porcelain=2", "--branch"))
head = branch.get("head", "")
if head and head != "(detached)":
line = f"- Branch: {head}"
if branch.get("upstream"):
line += f" \u2192 {branch['upstream']}"
ahead, behind = branch.get("ahead", "0"), branch.get("behind", "0")
if ahead != "0" or behind != "0":
line += f" (ahead {ahead}, behind {behind})"
lines.append(line)
elif head == "(detached)":
lines.append("- Branch: (detached HEAD)")
# Linked worktree: the per-worktree git dir differs from the shared common dir.
# We surface the fact that it's a worktree (so the model knows branches/stashes
# are shared state) but deliberately do NOT expose the primary tree path —
# giving the model a second absolute path causes it to sometimes run commands
# in the wrong directory.
git_dir, common_dir = _git(root, "rev-parse", "--git-dir"), _git(root, "rev-parse", "--git-common-dir")
if git_dir and common_dir and Path(git_dir).resolve() != Path(common_dir).resolve():
lines.append("- Worktree: linked (git state shared with primary tree)")
dirty = [f"{n} {label}" for label, n in (
("staged", counts["staged"]), ("modified", counts["modified"]),
("untracked", counts["untracked"]), ("conflicts", counts["conflicts"]),
) if n]
lines.append(f"- Status: {', '.join(dirty) if dirty else 'clean'}")
recent = _git(root, "log", "-3", "--pretty=%h %s")
if recent:
lines.append("- Recent commits:")
lines.extend(f" {c}" for c in recent.splitlines())
lines.extend(_project_facts(root))
return "\n".join(lines)

View File

@@ -7,7 +7,7 @@ protecting head and tail context.
Improvements over v2:
- Structured summary template with Resolved/Pending question tracking
- Filter-safe summarizer preamble that treats prior turns as source material
- "Remaining Work" replaces "Next Steps" to avoid reading as active instructions
- Historical (reference-only) section headings replace "Next Steps"/"Remaining Work" to avoid reading as active instructions
- Clear separator when summary merges into tail message
- Iterative summary updates (preserves info across multiple compactions)
- Token-budget tail protection instead of fixed message count
@@ -34,7 +34,75 @@ from agent.redact import redact_sensitive_text
logger = logging.getLogger(__name__)
HISTORICAL_TASK_HEADING = "## Historical Task Snapshot"
HISTORICAL_IN_PROGRESS_HEADING = "## Historical In-Progress State"
HISTORICAL_PENDING_ASKS_HEADING = "## Historical Pending User Asks"
HISTORICAL_REMAINING_WORK_HEADING = "## Historical Remaining Work"
SUMMARY_PREFIX = (
"[CONTEXT COMPACTION — REFERENCE ONLY] Earlier turns were compacted "
"into the summary below. This is a handoff from a previous context "
"window — treat it as background reference, NOT as active instructions. "
"Do NOT answer questions or fulfill requests mentioned in this summary; "
"they were already addressed. "
"Respond ONLY to the latest user message that appears AFTER this "
"summary — that message is the single source of truth for what to do "
"right now. "
"Topic overlap with the summary does NOT mean you should resume its "
"task: even on similar topics, the latest user message WINS. Treat ONLY "
"the latest message as the active task and discard stale items from "
f"'{HISTORICAL_TASK_HEADING}' / '{HISTORICAL_IN_PROGRESS_HEADING}' / "
f"'{HISTORICAL_PENDING_ASKS_HEADING}' / "
f"'{HISTORICAL_REMAINING_WORK_HEADING}' entirely — do not 'wrap up' or "
"'finish' work described there unless the latest message explicitly "
"asks for it. "
"Reverse signals in the latest message (e.g. 'stop', 'undo', 'roll "
"back', 'just verify', 'don't do that anymore', 'never mind', a new "
"topic) must immediately end any in-flight work described in the "
"summary; do not re-surface it in later turns. "
"IMPORTANT: Your persistent memory (MEMORY.md, USER.md) in the system "
"prompt is ALWAYS authoritative and active — never ignore or deprioritize "
"memory content due to this compaction note. "
"The current session state (files, config, etc.) may reflect work "
"described here — avoid repeating it:"
)
LEGACY_SUMMARY_PREFIX = "[CONTEXT SUMMARY]:"
# Metadata key added to context compression summary messages so that frontends
# (CLI, Desktop, gateway, TUI) can distinguish them from real assistant/user
# messages and filter or render them appropriately without content-prefix
# heuristics. See https://github.com/NousResearch/hermes-agent/issues/38389
#
# Underscore-prefixed ON PURPOSE: the wire sanitizers
# (agent/transports/chat_completions.py convert_messages and the summary-path
# mirror in agent/chat_completion_helpers.py) strip every top-level message
# key starting with "_" before the request leaves the process. Strict
# OpenAI-compatible gateways (Fireworks, Mistral, Moonshot/Kimi, opencode-go)
# reject payloads carrying unknown keys with "Extra inputs are not permitted",
# poisoning every subsequent request in the session — a bare key like
# "is_compressed_summary" would reach the wire and trip exactly that.
COMPRESSED_SUMMARY_METADATA_KEY = "_compressed_summary"
# Appended to every standalone summary message (and to the merged-into-tail
# prefix) so the model has an unambiguous "summary ends here" boundary.
# Without it, weak models read the verbatim "## Active Task" quote as fresh
# user input (#11475, #14521) or regurgitate an assistant-role summary as
# their own output (#33256).
_SUMMARY_END_MARKER = (
"--- END OF CONTEXT SUMMARY — "
"respond to the message below, not the summary above ---"
)
# Handoff prefixes that shipped in earlier releases. A summary persisted under
# one of these can be inherited into a resumed lineage (#35344); when it is
# re-normalized on re-compaction we must strip the OLD prefix too, otherwise the
# stale directive it carried (e.g. "resume exactly from Active Task") survives
# embedded in the body and keeps hijacking replies. Keep newest-first; entries
# are matched literally. Add a frozen copy here whenever SUMMARY_PREFIX changes.
_HISTORICAL_SUMMARY_PREFIXES = (
# Carveout era (#41607/#38364/#42812): "consistent → use as background"
# licensed stale-task resumption on topic overlap.
"[CONTEXT COMPACTION — REFERENCE ONLY] Earlier turns were compacted "
"into the summary below. This is a handoff from a previous context "
"window — treat it as background reference, NOT as active instructions. "
@@ -57,17 +125,7 @@ SUMMARY_PREFIX = (
"prompt is ALWAYS authoritative and active — never ignore or deprioritize "
"memory content due to this compaction note. "
"The current session state (files, config, etc.) may reflect work "
"described here — avoid repeating it:"
)
LEGACY_SUMMARY_PREFIX = "[CONTEXT SUMMARY]:"
# Handoff prefixes that shipped in earlier releases. A summary persisted under
# one of these can be inherited into a resumed lineage (#35344); when it is
# re-normalized on re-compaction we must strip the OLD prefix too, otherwise the
# stale directive it carried (e.g. "resume exactly from Active Task") survives
# embedded in the body and keeps hijacking replies. Keep newest-first; entries
# are matched literally. Add a frozen copy here whenever SUMMARY_PREFIX changes.
_HISTORICAL_SUMMARY_PREFIXES = (
"described here — avoid repeating it:",
# Pre-#35344: contained the self-contradicting "resume exactly" directive.
"[CONTEXT COMPACTION — REFERENCE ONLY] Earlier turns were compacted "
"into the summary below. This is a handoff from a previous context "
@@ -110,10 +168,23 @@ _SUMMARY_FAILURE_COOLDOWN_SECONDS = 600
# become another unbounded transcript copy after the LLM summarizer failed.
_FALLBACK_SUMMARY_MAX_CHARS = 8_000
_FALLBACK_TURN_MAX_CHARS = 700
_AUTO_FOCUS_MAX_TURNS = 3
_AUTO_FOCUS_TURN_MAX_CHARS = 260
_AUTO_FOCUS_MAX_CHARS = 700
# Keep a short run of recent messages verbatim even when the token budget is
# already exhausted. The public ``protect_last_n`` default is intentionally
# high for small/light tails, but using all 20 as a hard floor here would bring
# back the old large-tool-output case where nothing can be compacted.
_MAX_TAIL_MESSAGE_FLOOR = 8
_PATH_MENTION_RE = re.compile(r"(?:/|~/?|[A-Za-z]:\\)[^\s`'\")\]}<>]+")
# MEDIA delivery directives must not reach the summarizer — if one leaks into
# the summary, the downstream model may re-emit it as an active directive on
# the next turn, triggering bogus attachment sends (#14665).
_MEDIA_DIRECTIVE_RE = re.compile(r"MEDIA:\S+")
def _dedupe_append(items: list[str], value: str, *, limit: int) -> None:
value = value.strip()
@@ -974,6 +1045,7 @@ class ContextCompressor(ContextEngine):
for msg in turns:
role = msg.get("role", "unknown")
content = redact_sensitive_text(msg.get("content") or "")
content = _MEDIA_DIRECTIVE_RE.sub("[media attachment]", content)
# Tool results: keep enough content for the summarizer
if role == "tool":
@@ -1155,7 +1227,7 @@ class ContextCompressor(ContextEngine):
)
reason_text = f" Summary failure reason: {reason}." if reason else ""
body = f"""## Active Task
body = f"""{HISTORICAL_TASK_HEADING}
{active_task}
## Goal
@@ -1172,7 +1244,7 @@ Recovered from a deterministic fallback because the LLM context summarizer was u
## Active State
Unknown from deterministic fallback. Inspect current repository/session state if needed.
## In Progress
{HISTORICAL_IN_PROGRESS_HEADING}
{active_task}
## Blocked
@@ -1184,13 +1256,13 @@ None recoverable from deterministic fallback.
## Resolved Questions
None recoverable from deterministic fallback.
## Pending User Asks
{HISTORICAL_PENDING_ASKS_HEADING}
{active_task}
## Relevant Files
{_bullets(relevant_files, limit=12)}
## Remaining Work
{HISTORICAL_REMAINING_WORK_HEADING}
Continue from the most recent unfulfilled user ask and protected tail messages. Verify state with tools before making claims.
## Last Dropped Turns
@@ -1312,7 +1384,7 @@ Summary generation was unavailable, so this is a best-effort deterministic fallb
_temporal_anchoring_rule = ""
# Shared structured template (used by both paths).
_template_sections = f"""## Active Task
_template_sections = f"""{HISTORICAL_TASK_HEADING}
[THE SINGLE MOST IMPORTANT FIELD. Capture the user's most recent unfulfilled
input verbatim — the exact words they used. This includes:
- Explicit task assignments ("refactor the auth module")
@@ -1359,7 +1431,7 @@ Be specific with file paths, commands, line numbers, and results.]
- Any running processes or servers
- Environment details that matter]
## In Progress
{HISTORICAL_IN_PROGRESS_HEADING}
[Work currently underway — what was being done when compaction fired]
## Blocked
@@ -1371,14 +1443,14 @@ Be specific with file paths, commands, line numbers, and results.]
## Resolved Questions
[Questions the user asked that were ALREADY answered — include the answer so it is not repeated]
## Pending User Asks
[Questions or requests from the user that have NOT yet been answered or fulfilled. If none, write "None."]
{HISTORICAL_PENDING_ASKS_HEADING}
[Questions or requests from the user that have NOT yet been answered or fulfilled. These are STALE — they were from the compacted turns. Write them here for reference only. The agent must NOT act on them unless the latest user message explicitly requests it. If none, write "None."]
## Relevant Files
[Files read, modified, or created — with brief note on each]
## Remaining Work
[What remains to be done — framed as context, not instructions]
{HISTORICAL_REMAINING_WORK_HEADING}
[What remains to be done — framed as STALE context for reference only. The agent must NOT resume this work unless the latest user message explicitly asks for it.]
## Critical Context
[Any specific values, error messages, configuration details, or data that would be lost without explicit preservation. NEVER include API keys, tokens, passwords, or credentials — write [REDACTED] instead.]
@@ -1421,7 +1493,7 @@ Use this exact structure:
prompt += f"""
FOCUS TOPIC: "{focus_topic}"
The user has requested that this compaction PRIORITISE preserving all information related to the focus topic above. For content related to "{focus_topic}", include full detail — exact values, file paths, command outputs, error messages, and decisions. For content NOT related to the focus topic, summarise more aggressively (brief one-liners or omit if truly irrelevant). The focus topic sections should receive roughly 60-70% of the summary token budget. Even for the focus topic, NEVER preserve API keys, tokens, passwords, or credentials — use [REDACTED]."""
This compaction should PRIORITISE preserving all information related to the focus topic above. For content related to "{focus_topic}", include full detail — exact values, file paths, command outputs, error messages, and decisions. For content NOT related to the focus topic, summarise more aggressively (brief one-liners or omit if truly irrelevant). The focus topic sections should receive roughly 60-70% of the summary token budget. Even for the focus topic, NEVER preserve API keys, tokens, passwords, or credentials — use [REDACTED]."""
try:
call_kwargs = {
@@ -1574,7 +1646,13 @@ The user has requested that this compaction PRIORITISE preserving all informatio
text = (summary or "").strip()
for prefix in (SUMMARY_PREFIX, LEGACY_SUMMARY_PREFIX, *_HISTORICAL_SUMMARY_PREFIXES):
if text.startswith(prefix):
return text[len(prefix):].lstrip()
text = text[len(prefix):].lstrip()
break
# Strip the trailing end marker too — a rehydrated handoff body that
# keeps it would leak the boundary directive into the iterative-update
# summarizer prompt (and the marker is re-appended on insertion anyway).
if text.endswith(_SUMMARY_END_MARKER):
text = text[: -len(_SUMMARY_END_MARKER)].rstrip()
return text
@classmethod
@@ -1590,6 +1668,52 @@ The user has requested that this compaction PRIORITISE preserving all informatio
return True
return any(text.startswith(p) for p in _HISTORICAL_SUMMARY_PREFIXES)
@staticmethod
def _has_compressed_summary_metadata(message: Any) -> bool:
"""Return True if *message* carries the compressed-summary flag.
Callers (frontends, CLI, gateway) can use this to distinguish context
compaction summaries from real assistant or user messages without
relying on content-prefix heuristics. The flag is in-process only —
the wire sanitizers strip underscore-prefixed keys before API calls.
"""
if not isinstance(message, dict):
return False
return bool(message.get(COMPRESSED_SUMMARY_METADATA_KEY))
@classmethod
def _derive_auto_focus_topic(
cls,
messages: List[Dict[str, Any]],
) -> Optional[str]:
"""Infer a compact focus hint from the most recent real user turns."""
candidates: list[str] = []
for idx in range(len(messages) - 1, -1, -1):
msg = messages[idx]
if msg.get("role") != "user":
continue
content = msg.get("content")
if cls._is_context_summary_content(content):
continue
text = redact_sensitive_text(_content_text_for_contains(content).strip())
if not text:
continue
text = " ".join(text.split())
if len(text) > _AUTO_FOCUS_TURN_MAX_CHARS:
text = text[: _AUTO_FOCUS_TURN_MAX_CHARS - 1].rstrip() + ""
candidates.append(text)
if len(candidates) >= _AUTO_FOCUS_MAX_TURNS:
break
if not candidates:
return None
candidates.reverse()
focus = "Recent user focus:\n" + "\n".join(f"- {item}" for item in candidates)
if len(focus) > _AUTO_FOCUS_MAX_CHARS:
focus = focus[: _AUTO_FOCUS_MAX_CHARS - 1].rstrip() + ""
return focus
@classmethod
def _find_latest_context_summary(
cls,
@@ -1742,6 +1866,105 @@ The user has requested that this compaction PRIORITISE preserving all informatio
return i
return -1
def _find_last_assistant_message_idx(
self, messages: List[Dict[str, Any]], head_end: int
) -> int:
"""Return the index of the last user-visible assistant reply at or
after *head_end*, or -1.
A "user-visible reply" is an assistant message with non-empty
textual content — i.e. one that the WebUI / TUI / SessionsPage
rendered as a bubble the operator could read. We deliberately
skip assistant messages that contain only ``tool_calls`` (and
no text), because those render as small "calling tool X"
indicators and aren't what the reporter means by "the output
of the last message you sent" (#29824).
Falling back to the most recent assistant message of ANY kind
only kicks in when no content-bearing assistant message exists
in the compressible region — typically a fresh session that
just started a multi-step tool sequence with no prior reply
to anchor. In that case the agent fix is a no-op and the
existing user-message anchor carries the load.
"""
last_any = -1
for i in range(len(messages) - 1, head_end - 1, -1):
msg = messages[i]
if msg.get("role") != "assistant":
continue
if last_any < 0:
last_any = i
content = msg.get("content")
if isinstance(content, str) and content.strip():
return i
if isinstance(content, list):
# Multimodal / Anthropic-style content: look for any
# text block with non-empty text.
for part in content:
if isinstance(part, dict):
text = part.get("text") or part.get("content")
if isinstance(text, str) and text.strip():
return i
return last_any
def _ensure_last_assistant_message_in_tail(
self,
messages: List[Dict[str, Any]],
cut_idx: int,
head_end: int,
) -> int:
"""Guarantee the most recent assistant message is in the protected tail.
WebUI / TUI / SessionsPage bug (#29824). Without this anchor,
``_find_tail_cut_by_tokens`` can leave the user's most recent
visible assistant response inside the compressed middle region —
especially when the conversation has a single oversized tool
result or a long stretch of tool-call/result pairs after the
last assistant reply. The summariser then rolls that reply up
into the single ``[CONTEXT COMPACTION — REFERENCE ONLY]`` block
persisted as ``role="user"`` or ``role="assistant"``. From the
operator's perspective the WebUI session viewer
(``web/src/pages/SessionsPage.tsx``) and the TUI chat panel
both suddenly show the opaque "Context compaction" block in the
slot where they were just reading the assistant's actual reply:
User: "i cant see the output of the last message you
sent, i did see it previously, however now see
'context compaction'"
Mirror of ``_ensure_last_user_message_in_tail`` but anchors on
the last assistant-role message. Re-runs the tool-group
alignment so we don't split a ``tool_call`` / ``tool_result``
group that immediately precedes the anchored message — orphaned
tool messages would otherwise be removed by
``_sanitize_tool_pairs`` and trigger the same data-loss symptom
we're trying to prevent.
"""
last_asst_idx = self._find_last_assistant_message_idx(messages, head_end)
if last_asst_idx < 0:
# No assistant message in the compressible region — nothing
# to anchor (single-turn pre-reply state, etc.).
return cut_idx
if last_asst_idx >= cut_idx:
# Already in the tail — the token-budget walk did the right
# thing on its own.
return cut_idx
# Pull cut_idx back to the assistant message, then re-align so
# we don't split a tool group that immediately precedes it
# (e.g. an ``assistant(tool_calls)`` → ``tool(result)`` →
# ``assistant(final reply)`` sequence would otherwise leave the
# ``tool`` orphan when cut lands at the final reply).
new_cut = self._align_boundary_backward(messages, last_asst_idx)
if not self.quiet_mode:
logger.debug(
"Anchoring tail cut to last assistant message at index %d "
"(was %d, aligned to %d) to keep the previously-visible "
"reply out of the compaction summary (#29824)",
last_asst_idx, cut_idx, new_cut,
)
# Safety: never go back into the head region.
return max(new_cut, head_end + 1)
def _ensure_last_user_message_in_tail(
self,
messages: List[Dict[str, Any]],
@@ -1753,7 +1976,7 @@ The user has requested that this compaction PRIORITISE preserving all informatio
Context compressor bug (#10896): ``_align_boundary_backward`` can pull
``cut_idx`` past a user message when it tries to keep tool_call/result
groups together. If the last user message ends up in the *compressed*
middle region the LLM summariser writes it into "Pending User Asks",
middle region the LLM summariser writes it into "Historical Pending User Asks",
but ``SUMMARY_PREFIX`` tells the next model to respond only to user
messages *after* the summary — so the task effectively disappears from
the active context, causing the agent to stall, repeat completed work,
@@ -1800,11 +2023,12 @@ The user has requested that this compaction PRIORITISE preserving all informatio
derived from ``summary_target_ratio * context_length``, so it
scales automatically with the model's context window.
Token budget is the primary criterion. A hard minimum of 3 messages
is always protected, but the budget is allowed to exceed by up to
1.5x to avoid cutting inside an oversized message (tool output, file
read, etc.). If even the minimum 3 messages exceed 1.5x the budget
the cut is placed right after the head so compression still runs.
Token budget is the primary criterion. A bounded message-count floor
keeps a short run of recent turns verbatim even when the budget is
exhausted, but the budget is allowed to exceed by up to 1.5x to avoid
cutting inside an oversized message (tool output, file read, etc.). If
even that floor exceeds 1.5x the budget, the cut is placed right after
the head so compression still runs.
Never cuts inside a tool_call/result group. Always ensures the most
recent user message is in the tail (see ``_ensure_last_user_message_in_tail``).
@@ -1812,8 +2036,19 @@ The user has requested that this compaction PRIORITISE preserving all informatio
if token_budget is None:
token_budget = self.tail_token_budget
n = len(messages)
# Hard minimum: always keep at least 3 messages in the tail
min_tail = min(3, n - head_end - 1) if n - head_end > 1 else 0
# Hard minimum: always keep a bounded recent-message floor in the tail.
# ``protect_last_n`` remains a minimum up to the cap; the cap avoids
# preserving a whole run of bulky tool outputs on every compaction.
available_tail = max(0, n - head_end - 1)
min_tail_floor = max(3, min(self.protect_last_n, _MAX_TAIL_MESSAGE_FLOOR))
# Leave at least two non-head messages available to summarize on short
# transcripts; otherwise compression can replace a tiny middle with a
# summary and save no messages at all.
compressible_tail_cap = max(3, available_tail - 2)
min_tail = (
min(min_tail_floor, compressible_tail_cap, available_tail)
if available_tail > 1 else 0
)
soft_ceiling = int(token_budget * 1.5)
accumulated = 0
cut_idx = n # start from beyond the end
@@ -1885,6 +2120,13 @@ The user has requested that this compaction PRIORITISE preserving all informatio
# active task is never lost to compression (fixes #10896).
cut_idx = self._ensure_last_user_message_in_tail(messages, cut_idx, head_end)
# Ensure the most recent assistant message is always in the tail
# so the previously-visible reply isn't silently rolled into the
# ``[CONTEXT COMPACTION — REFERENCE ONLY]`` block (fixes #29824).
# Each anchor only walks ``cut_idx`` backward, so chaining them is
# monotonic — the tail can only grow, never shrink.
cut_idx = self._ensure_last_assistant_message_in_tail(messages, cut_idx, head_end)
return max(cut_idx, head_end + 1)
# ------------------------------------------------------------------
@@ -2037,7 +2279,8 @@ The user has requested that this compaction PRIORITISE preserving all informatio
)
# Phase 3: Generate structured summary
summary = self._generate_summary(turns_to_summarize, focus_topic=focus_topic)
summary_focus_topic = focus_topic or self._derive_auto_focus_topic(messages)
summary = self._generate_summary(turns_to_summarize, focus_topic=summary_focus_topic)
# If summary generation failed, behavior splits on
# ``abort_on_summary_failure`` (config: compression.abort_on_summary_failure):
@@ -2117,32 +2360,33 @@ The user has requested that this compaction PRIORITISE preserving all informatio
# When the summary lands as a standalone role="user" message,
# weak models read the verbatim "## Active Task" quote of a past
# user request as fresh input (#11475, #14521). Append the explicit
# end marker — the same one used in the merge-into-tail path — so
# the model has a clear "summary above, not new input" signal.
if not _merge_summary_into_tail and summary_role == "user":
summary = (
summary
+ "\n\n--- END OF CONTEXT SUMMARY — "
"respond to the message below, not the summary above ---"
)
# user request as fresh input (#11475, #14521).
# When it lands as role="assistant", models may regurgitate the
# summary text as their own output (#33256). In both cases, append
# the explicit end marker so the model has a clear "summary ends
# here, respond to the message below" signal.
if not _merge_summary_into_tail:
summary = summary + "\n\n" + _SUMMARY_END_MARKER
if not _merge_summary_into_tail:
compressed.append({"role": summary_role, "content": summary})
compressed.append({
"role": summary_role,
"content": summary,
COMPRESSED_SUMMARY_METADATA_KEY: True,
})
for i in range(compress_end, n_messages):
msg = messages[i].copy()
if _merge_summary_into_tail and i == compress_end:
merged_prefix = (
summary
+ "\n\n--- END OF CONTEXT SUMMARY — "
"respond to the message below, not the summary above ---\n\n"
)
merged_prefix = summary + "\n\n" + _SUMMARY_END_MARKER + "\n\n"
msg["content"] = _append_text_to_content(
msg.get("content"),
merged_prefix,
prepend=True,
)
# Mark the merged message so frontends can identify it as
# containing a compression summary prefix.
msg[COMPRESSED_SUMMARY_METADATA_KEY] = True
_merge_summary_into_tail = False
compressed.append(msg)

View File

@@ -40,6 +40,16 @@ from agent.model_metadata import estimate_request_tokens_rough
logger = logging.getLogger(__name__)
# Stable marker the gateway matches on to re-tag the auto-compaction lifecycle
# status as ``kind="compacting"`` (tui_gateway/server.py::_status_update), so
# drivers like the desktop app can show an explicit "Summarizing…" indicator
# instead of the transcript appearing to silently reset. Keep the marker phrase
# intact if you reword COMPACTION_STATUS.
COMPACTION_STATUS_MARKER = "Compacting context"
COMPACTION_STATUS = (
f"🗜️ {COMPACTION_STATUS_MARKER} — summarizing earlier conversation so I can continue..."
)
def _compression_lock_holder(agent: Any) -> str:
"""Build a unique holder id for the lock: pid:tid:agent-instance:uuid.
@@ -324,9 +334,7 @@ def compress_context(
f"{approx_tokens:,}" if approx_tokens else "unknown", agent.model,
focus_topic,
)
agent._emit_status(
"🗜️ Compacting context — summarizing earlier conversation so I can continue..."
)
agent._emit_status(COMPACTION_STATUS)
# ── Compression lock ────────────────────────────────────────────────
# Atomic, state.db-backed lock per session_id. Without this, two
@@ -631,7 +639,11 @@ def compress_context(
return compressed, new_system_prompt
def try_shrink_image_parts_in_messages(api_messages: list) -> bool:
def try_shrink_image_parts_in_messages(
api_messages: list,
*,
max_dimension: int = 8000,
) -> bool:
"""Re-encode all native image parts at a smaller size to recover from
image-too-large errors (Anthropic 5 MB, unknown other providers).
@@ -642,7 +654,8 @@ def try_shrink_image_parts_in_messages(api_messages: list) -> bool:
Strategy: look for ``image_url`` / ``input_image`` parts carrying a
``data:image/...;base64,...`` payload. For each one whose encoded
size exceeds 4 MB (a safe target that slides under Anthropic's 5 MB
ceiling with header overhead), write the base64 to a tempfile, call
ceiling with header overhead) or whose longest side exceeds
``max_dimension``, write the base64 to a tempfile, call
``vision_tools._resize_image_for_vision`` to produce a smaller data
URL, and substitute it in place.
@@ -664,10 +677,9 @@ def try_shrink_image_parts_in_messages(api_messages: list) -> bool:
# after a confirmed provider rejection, so the alternative is failure.
target_bytes = 4 * 1024 * 1024
# Anthropic enforces an 8000px per-side dimension cap independently of
# the 5 MB byte cap. A tall screenshot can be well under 5 MB yet far
# over 8000px (e.g. 1200×12000 at 0.06 MB). We check pixel dimensions
# even when the byte budget is fine.
max_dimension = 8000
# the 5 MB byte cap. In many-image requests, the provider can report a
# lower cap (observed: 2000px). The caller passes that parsed ceiling
# when the rejection includes it.
changed_count = 0
# Track parts that are over the target but could NOT be shrunk under it.
# If any survive, retrying is pointless — the same oversized payload will
@@ -684,9 +696,9 @@ def try_shrink_image_parts_in_messages(api_messages: list) -> bool:
# Check both byte size AND pixel dimensions.
needs_shrink = len(url) > target_bytes # over byte budget
if not needs_shrink:
# Even if bytes are fine, check pixel dimensions against
# Anthropic's 8000px cap. A tall image can be tiny in bytes
# yet huge in pixels.
# Even if bytes are fine, check pixel dimensions against the
# provider's reported per-side cap. A screenshot can be tiny in
# bytes yet too large in pixels.
try:
import base64 as _b64_dim
header_d, _, data_d = url.partition(",")
@@ -795,6 +807,8 @@ def try_shrink_image_parts_in_messages(api_messages: list) -> bool:
__all__ = [
"COMPACTION_STATUS",
"COMPACTION_STATUS_MARKER",
"check_compression_model_feasibility",
"replay_compression_warning",
"compress_context",

View File

@@ -57,7 +57,11 @@ from agent.process_bootstrap import _install_safe_stdio
from agent.prompt_caching import apply_anthropic_cache_control
from agent.retry_utils import jittered_backoff
from agent.trajectory import has_incomplete_scratchpad
from agent.usage_pricing import estimate_usage_cost, normalize_usage
from agent.usage_pricing import (
estimate_usage_cost,
extract_provider_cost_usd,
normalize_usage,
)
from hermes_constants import PARTIAL_STREAM_STUB_ID
from hermes_logging import set_session_context
from tools.skill_provenance import set_current_write_origin
@@ -71,6 +75,35 @@ logger = logging.getLogger(__name__)
INTERRUPT_WAITING_FOR_MODEL_PREFIX = "Operation interrupted: waiting for model response ("
def _image_error_max_dimension(error: Exception) -> Optional[int]:
"""Extract a provider-reported image dimension ceiling, if present."""
parts = []
for value in (
error,
getattr(error, "message", None),
getattr(error, "body", None),
):
if value:
try:
parts.append(str(value))
except Exception:
pass
text = " ".join(parts).lower()
if "image" not in text or "dimension" not in text or "max allowed size" not in text:
return None
match = re.search(r"max allowed size(?:\s+for [^:]+)?:\s*(\d{3,5})\s*pixels?", text)
if not match:
return None
try:
max_dimension = int(match.group(1))
except ValueError:
return None
if 512 <= max_dimension <= 8000:
return max_dimension
return None
def _ollama_context_limit_error(agent: Any, request_tokens: int) -> Optional[str]:
"""Return a user-facing error when Ollama is loaded with too little context."""
if not getattr(agent, "tools", None):
@@ -368,6 +401,42 @@ def _get_continuation_prompt(is_partial_stub: bool, dropped_tools: Optional[List
)
# Shared recovery hint appended to every content-policy refusal message. Both
# the HTTP-200 refusal path (``finish_reason=content_filter``) and the
# exception path (a provider moderation error classified as
# ``content_policy_blocked``) end with the same actionable next steps, so they
# share one trailer to keep the guidance from drifting between the two sites.
_CONTENT_POLICY_RECOVERY_HINT = (
"Try rephrasing the request, narrowing the context, or "
"adding a fallback provider with `hermes fallback add`."
)
def _content_policy_blocked_result(
messages: List[Dict],
api_call_count: int,
*,
final_response: str,
error_detail: str,
) -> Dict[str, Any]:
"""Build the terminal turn result for a content-policy block.
A content-policy refusal is deterministic for the unchanged prompt, so the
turn ends here (no retry). Both the HTTP-200 refusal handler and the
exception-path handler return the identical shape — a failed, non-completed
turn carrying the user-facing message and a ``content_policy_blocked:``
prefixed error — so they funnel through this one builder.
"""
return {
"final_response": final_response,
"messages": messages,
"api_calls": api_call_count,
"completed": False,
"failed": True,
"error": f"content_policy_blocked: {error_detail}",
}
def run_conversation(
agent,
user_message: str,
@@ -595,7 +664,11 @@ def run_conversation(
# landed after an orphan tool result). Most providers return
# empty content on malformed sequences, which would otherwise
# retrigger the empty-retry loop indefinitely.
repaired_seq = agent._repair_message_sequence(messages)
# repair_message_sequence_with_cursor also recomputes the SessionDB
# flush cursor (_last_flushed_db_idx) when repair compacts the list,
# so the turn-end flush doesn't skip the assistant/tool chain (#44837).
from agent.agent_runtime_helpers import repair_message_sequence_with_cursor
repaired_seq = repair_message_sequence_with_cursor(agent, messages)
if repaired_seq > 0:
request_logger.info(
"Repaired %s message-alternation violations before request (session=%s)",
@@ -703,7 +776,10 @@ def run_conversation(
# a thinking-only turn. Runs on the per-call copy only — the
# stored conversation history keeps the reasoning block for the
# UI transcript and session persistence.
api_messages = agent._drop_thinking_only_and_merge_users(api_messages)
api_messages = agent._drop_thinking_only_and_merge_users(
api_messages,
drop_codex_reasoning_items=agent.api_mode != "codex_responses",
)
# Normalize message whitespace and tool-call JSON for consistent
# prefix matching. Ensures bit-perfect prefixes across turns,
@@ -1312,6 +1388,106 @@ def run_conversation(
)
finish_reason = "length"
# ── Content-policy refusal (HTTP 200) ──────────────────
# The model — or the provider's safety system — returned a
# *successful* response whose stop/finish reason is a refusal:
# Anthropic ``stop_reason="refusal"`` → ``content_filter``;
# OpenAI / portal ``finish_reason="content_filter"`` or a
# populated ``message.refusal`` (mapped in the chat_completions
# transport); Bedrock ``guardrail_intervened``. The content is
# typically empty, so without this branch the response falls
# through to the empty-response / invalid-response retry loops
# and is mis-surfaced as "rate limited" / "no content after
# retries" — burning paid attempts reproducing a deterministic
# refusal. Surface it clearly and stop. Mirrors the
# exception-based ``content_policy_blocked`` recovery: try a
# configured fallback once, otherwise return the refusal.
if finish_reason == "content_filter":
_refusal_transport = agent._get_transport()
if agent.api_mode == "anthropic_messages":
_refusal_result = _refusal_transport.normalize_response(
response, strip_tool_prefix=agent._is_anthropic_oauth
)
else:
_refusal_result = _refusal_transport.normalize_response(response)
_refusal_text = (getattr(_refusal_result, "content", None) or "").strip()
# Some refusals carry the explanation only in the reasoning
# channel; fall back to it so the user sees *something*.
if not _refusal_text:
_refusal_text = (agent._extract_reasoning(_refusal_result) or "").strip()
agent._invoke_api_request_error_hook(
task_id=effective_task_id,
turn_id=turn_id,
api_request_id=api_request_id,
api_call_count=api_call_count,
api_start_time=api_start_time,
api_kwargs=api_kwargs,
error_type="ContentPolicyBlocked",
error_message=_refusal_text or "model declined to respond (content_filter)",
status_code=None,
retry_count=retry_count,
max_retries=max_retries,
retryable=False,
reason=FailoverReason.content_policy_blocked.value,
)
if thinking_spinner:
thinking_spinner.stop("")
thinking_spinner = None
if agent.thinking_callback:
agent.thinking_callback("")
# Deterministic for the unchanged prompt — never retry.
# Try a configured fallback once (a different model may not
# refuse); otherwise surface the refusal terminally.
if agent._has_pending_fallback():
agent._buffer_status(
"⚠️ Model declined to respond (safety refusal) — trying fallback..."
)
if agent._try_activate_fallback():
retry_count = 0
compression_attempts = 0
_retry.primary_recovery_attempted = False
continue
agent._flush_status_buffer()
_refusal_log = (
_refusal_text[:500] + "..."
if len(_refusal_text) > 500
else _refusal_text
)
logger.warning(
"%sModel declined to respond (finish_reason=content_filter). "
"model=%s provider=%s refusal=%s",
agent.log_prefix, agent.model, agent.provider,
_refusal_log or "(no text)",
)
agent._emit_status(
"⚠️ The model declined to respond to this request (safety refusal)."
)
_refusal_detail = (
f"Model's explanation: {_refusal_text}"
if _refusal_text
else "The model returned no explanation."
)
_refusal_response = (
"⚠️ The model declined to respond to this request "
"(safety refusal — not a Hermes/gateway failure).\n\n"
f"{_refusal_detail}\n\n"
f"{_CONTENT_POLICY_RECOVERY_HINT}"
)
agent._cleanup_task_resources(effective_task_id)
agent._persist_session(messages, conversation_history)
return _content_policy_blocked_result(
messages,
api_call_count,
final_response=_refusal_response,
error_detail=_refusal_text or "model declined (content_filter)",
)
if finish_reason == "length":
if getattr(response, "id", "") == PARTIAL_STREAM_STUB_ID:
agent._vprint(
@@ -1633,6 +1809,37 @@ def run_conversation(
agent.session_cost_status = cost_result.status
agent.session_cost_source = cost_result.source
# ── Real provider-REPORTED cost (never estimated) ──
# OpenRouter usage accounting returns ``usage.cost`` on the
# response when the request carries usage:{include:true}
# (added on OpenRouter routes). When the provider reports
# nothing, this stays None — absent, NOT zero — so cost
# displays hide instead of showing a fabricated $0.00.
reported_cost_usd = extract_provider_cost_usd(response.usage)
if reported_cost_usd is not None:
_prev_actual = getattr(agent, "session_actual_cost_usd", None)
agent.session_actual_cost_usd = (_prev_actual or 0.0) + reported_cost_usd
agent.session_cost_status = "actual"
agent.session_cost_source = "provider_cost_api"
# Per-model session breakdown for /usage — counts are always
# real; cost_usd only accumulates provider-reported values
# and stays None when the provider reports nothing.
_model_usage = getattr(agent, "session_model_usage", None)
if _model_usage is None:
_model_usage = agent.session_model_usage = {}
_mrow = _model_usage.setdefault(agent.model, {
"calls": 0, "input": 0, "output": 0,
"cache_read": 0, "cache_write": 0, "cost_usd": None,
})
_mrow["calls"] += 1
_mrow["input"] += canonical_usage.input_tokens
_mrow["output"] += canonical_usage.output_tokens
_mrow["cache_read"] += canonical_usage.cache_read_tokens
_mrow["cache_write"] += canonical_usage.cache_write_tokens
if reported_cost_usd is not None:
_mrow["cost_usd"] = (_mrow["cost_usd"] or 0.0) + reported_cost_usd
# Persist token counts to session DB for /insights.
# Do this for every platform with a session_id so non-CLI
# sessions (gateway, cron, delegated runs) cannot lose
@@ -1659,8 +1866,14 @@ def run_conversation(
reasoning_tokens=canonical_usage.reasoning_tokens,
estimated_cost_usd=float(cost_result.amount_usd)
if cost_result.amount_usd is not None else None,
cost_status=cost_result.status,
cost_source=cost_result.source,
# Provider-reported per-call cost delta. NULL
# (not 0) when the provider reported nothing —
# the SQL CASE keeps actual_cost_usd untouched.
actual_cost_usd=reported_cost_usd,
cost_status="actual"
if reported_cost_usd is not None else cost_result.status,
cost_source="provider_cost_api"
if reported_cost_usd is not None else cost_result.source,
billing_provider=agent.provider,
billing_base_url=agent.base_url,
billing_mode="subscription_included"
@@ -2063,7 +2276,11 @@ def run_conversation(
and not _retry.image_shrink_retry_attempted
):
_retry.image_shrink_retry_attempted = True
if agent._try_shrink_image_parts_in_messages(api_messages):
image_max_dimension = _image_error_max_dimension(api_error) or 8000
if agent._try_shrink_image_parts_in_messages(
api_messages,
max_dimension=image_max_dimension,
):
agent._vprint(
f"{agent.log_prefix}📐 Image(s) exceeded provider size limit — "
f"shrank and retrying...",
@@ -2221,30 +2438,54 @@ def run_conversation(
print(f"{agent.log_prefix} • Legacy cleanup: hermes config set ANTHROPIC_TOKEN \"\"")
print(f"{agent.log_prefix} • Clear stale keys: hermes config set ANTHROPIC_API_KEY \"\"")
# ── Thinking block signature recovery ─────────────────
# Thinking block signature recovery.
#
# Anthropic signs thinking blocks against the full turn
# content. Any upstream mutation (context compression,
# content. Any upstream mutation (context compression,
# session truncation, message merging) invalidates the
# signature → HTTP 400. Recovery: strip reasoning_details
# from all messages so the next retry sends no thinking
# blocks at all. One-shot — don't retry infinitely.
# signature and the API replies HTTP 400 ("invalid
# signature" or "cannot be modified"). Recovery strips
# ``reasoning_details`` so the retry sends no thinking
# blocks at all. One-shot per outer loop.
#
# The strip targets ``api_messages``, which is the
# API-call-time list that ``_build_api_kwargs`` consumes
# on every retry. ``api_messages`` was populated once at
# the start of the turn from shallow copies of
# ``messages``, so mutating it does not touch the
# canonical store. The previous implementation popped
# ``reasoning_details`` from ``messages`` instead, which
# had two problems: ``api_messages`` carried its own
# reference to the field through the shallow copy, so the
# retry's wire payload still included thinking blocks and
# the recovery never reached the API; and the mutation
# persisted into ``state.db`` through any subsequent
# ``_persist_session`` call, permanently corrupting the
# conversation. Future turns would replay the stripped
# state, hit the same 400, and the agent would terminate
# with ``max_retries_exhausted``, often spawning
# cascading compaction-ended sessions chained off the
# corrupted parent.
if (
classified.reason == FailoverReason.thinking_signature
and not _retry.thinking_sig_retry_attempted
):
_retry.thinking_sig_retry_attempted = True
for _m in messages:
if isinstance(_m, dict):
_api_stripped = 0
for _m in api_messages:
if isinstance(_m, dict) and "reasoning_details" in _m:
_m.pop("reasoning_details", None)
_api_stripped += 1
agent._vprint(
f"{agent.log_prefix}⚠️ Thinking block signature invalid "
f"stripped all thinking blocks, retrying...",
f"{agent.log_prefix}⚠️ Thinking block signature invalid, "
f"stripped reasoning_details from api_messages for retry...",
force=True,
)
logger.warning(
"%sThinking block signature recovery: stripped "
"reasoning_details from %d messages",
agent.log_prefix, len(messages),
"reasoning_details from %d api_messages "
"(canonical messages unchanged)",
agent.log_prefix, _api_stripped,
)
continue
@@ -2607,10 +2848,13 @@ def run_conversation(
except Exception:
pass
if _genuine_nous_rate_limit:
# Skip straight to max_retries -- the
# top-of-loop guard will handle fallback or
# bail cleanly.
retry_count = max_retries
# Re-enter the loop exactly once so the
# top-of-loop Nous guard handles fallback or
# bails cleanly. (Setting retry_count to
# max_retries would make the while condition
# false immediately and the guard would never
# run -- no fallback, generic exhaustion error.)
retry_count = max(0, max_retries - 1)
continue
# Upstream capacity 429: fall through to normal
# retry logic. A different model (or the same
@@ -3052,20 +3296,17 @@ def run_conversation(
if classified.reason == FailoverReason.content_policy_blocked:
_summary = agent._summarize_api_error(api_error)
_policy_response = (
f"⚠️ The model provider's safety filter blocked this request "
f"(not a Hermes/gateway failure).\n\n"
"⚠️ The model provider's safety filter blocked this request "
"(not a Hermes/gateway failure).\n\n"
f"Provider message: {_summary}\n\n"
f"Try rephrasing the request, narrowing the context, or "
f"adding a fallback provider with `hermes fallback add`."
f"{_CONTENT_POLICY_RECOVERY_HINT}"
)
return _content_policy_blocked_result(
messages,
api_call_count,
final_response=_policy_response,
error_detail=_summary,
)
return {
"final_response": _policy_response,
"messages": messages,
"api_calls": api_call_count,
"completed": False,
"failed": True,
"error": f"content_policy_blocked: {_summary}",
}
return {
"final_response": None,
"messages": messages,

View File

@@ -70,16 +70,6 @@ def _resolve_args() -> list[str]:
def _resolve_home_dir() -> str:
"""Return a stable HOME for child ACP processes."""
try:
from hermes_constants import get_subprocess_home
profile_home = get_subprocess_home()
if profile_home:
return profile_home
except Exception:
pass
home = os.environ.get("HOME", "").strip()
if home:
return home
@@ -105,7 +95,10 @@ def _resolve_home_dir() -> str:
def _build_subprocess_env() -> dict[str, str]:
env = os.environ.copy()
env["HOME"] = _resolve_home_dir()
home = _resolve_home_dir()
env["HOME"] = home
from hermes_constants import apply_subprocess_home_env
apply_subprocess_home_env(env)
return env

View File

@@ -194,17 +194,71 @@ class AgentNotice:
id: Optional[str] = None
# ── is_free_tier_model (local-data-only free-model check) ────────────────────
def is_free_tier_model(model: str, base_url: str = "") -> bool:
"""Return True when *model* is a Nous free-tier model, using ONLY local data.
Two signals, both zero-network:
1. The ``:free`` suffix — the canonical Nous free SKU marker (e.g.
``nvidia/nemotron-3-ultra:free``). Free by construction on the API side
(spend is forced to 0 for ``:free`` ids).
2. A peek into the in-process pricing cache in ``hermes_cli.models``
(populated when the model picker fetched ``/v1/models`` pricing for
*base_url*). PEEK ONLY — a cache miss never triggers a fetch. This is
CLI/TUI-session best-effort: gateway sessions never run the picker's
pricing fetch, so suppression there rests entirely on the ``:free``
suffix (which all Nous free SKUs carry).
Fail-open to False (the depleted notice still shows) on any error: wrongly
showing the warning is recoverable noise; wrongly hiding it on a paid model
would mask a real billing block.
"""
if not model:
return False
if model.endswith(":free"):
return True
if not base_url:
return False
try:
from hermes_cli.models import _is_model_free, _pricing_cache
# Mirror get_pricing_for_provider's key normalization: the agent's
# Nous base_url is /v1-suffixed (https://inference-api.nousresearch.com/v1)
# but the picker keys _pricing_cache on the pre-/v1 root.
key = base_url.rstrip("/")
if key.endswith("/v1"):
key = key[:-3].rstrip("/")
pricing = _pricing_cache.get(key)
if not pricing:
return False
return _is_model_free(model, pricing)
except Exception:
return False
# ── evaluate_credits_notices (pure reconciliation function) ──────────────────
def evaluate_credits_notices(
state: CreditsState,
latch: dict,
*,
model_is_free: bool = False,
) -> tuple[list[AgentNotice], list[str]]:
"""Reconcile credits notices against the latch. Mutates ``latch`` IN PLACE.
latch = {"active": set[str], "seen_below_90": bool, "usage_band": Optional[int]}.
``model_is_free``: True when the session's active model is a Nous free-tier
model (see :func:`is_free_tier_model`). Suppresses the ``credits.depleted``
notice — a depleted account on a free model can keep inferencing, so the
error banner is noise (and confuses free-tier users who never had credits).
Suppression does NOT emit the "restored" success notice; that fires only on
a genuine ``paid_access`` flip back to True.
Returns ``(to_show: list[AgentNotice], to_clear: list[str])``.
Caller emits to_clear FIRST, then to_show.
@@ -232,6 +286,16 @@ def evaluate_credits_notices(
for band in CREDITS_USAGE_BANDS: # ascending → last match wins = highest
if uf >= band[0]:
current_band = band
# Top-up suppression: when the account holds purchased (top-up) credits,
# the subscription-cap gauge is the wrong denominator — warning "90% used"
# at a user sitting on $50 of top-up is noise (and it previously stuck
# PERMANENTLY alongside grant_spent at >=100%). Suppress the usage band
# entirely; the cap-reached case is covered by the grant_spent info notice
# below, which already names the remaining top-up balance. A top-up landing
# mid-session flips current_band → None and the clear path below removes
# any showing band line.
if state.purchased_micros > 0:
current_band = None
grant_cond = (
state.denominator_kind == "subscription_cap"
and uf is not None
@@ -284,10 +348,14 @@ def evaluate_credits_notices(
active.discard("credits.grant_spent")
# ── depleted ─────────────────────────────────────────────────────────────
if depleted_cond and "credits.depleted" not in active:
# Suppressed while the active model is free: inference still works there,
# so the error banner would just alarm users (free-tier users especially,
# who never had paid credits to "lose").
show_depleted = depleted_cond and not model_is_free
if show_depleted and "credits.depleted" not in active:
to_show.append(
AgentNotice(
text="✕ Credit access paused · run /usage for balance",
text="✕ Credit access paused · run /credits to top up",
level="error",
kind=CREDITS_NOTICE_KIND,
key="credits.depleted",
@@ -295,20 +363,23 @@ def evaluate_credits_notices(
)
)
active.add("credits.depleted")
elif "credits.depleted" in active and not depleted_cond:
elif "credits.depleted" in active and not show_depleted:
to_clear.append("credits.depleted")
active.discard("credits.depleted")
# Recovery: also emit the success notice
to_show.append(
AgentNotice(
text="✓ Credit access restored",
level="success",
kind="ttl",
ttl_ms=CREDITS_RESTORED_TTL_MS,
key="credits.restored",
id="credits.restored",
if not depleted_cond:
# Genuine recovery (paid_access flipped back True): also emit the
# success notice. A clear caused by switching to a free model while
# still depleted must NOT claim access was restored.
to_show.append(
AgentNotice(
text="✓ Credit access restored",
level="success",
kind="ttl",
ttl_ms=CREDITS_RESTORED_TTL_MS,
key="credits.restored",
id="credits.restored",
)
)
)
return (to_show, to_clear)

View File

@@ -25,7 +25,6 @@ import json
import logging
import os
import re
import tempfile
import threading
from datetime import datetime, timedelta, timezone
from pathlib import Path
@@ -33,6 +32,7 @@ from typing import Any, Callable, Dict, List, NamedTuple, Optional, Set
from hermes_constants import get_hermes_home
from tools import skill_usage
from utils import atomic_json_write
logger = logging.getLogger(__name__)
@@ -97,20 +97,7 @@ def load_state() -> Dict[str, Any]:
def save_state(data: Dict[str, Any]) -> None:
path = _state_file()
try:
path.parent.mkdir(parents=True, exist_ok=True)
fd, tmp = tempfile.mkstemp(dir=str(path.parent), prefix=".curator_state_", suffix=".tmp")
try:
with os.fdopen(fd, "w", encoding="utf-8") as f:
json.dump(data, f, indent=2, sort_keys=True, ensure_ascii=False)
f.flush()
os.fsync(f.fileno())
os.replace(tmp, path)
except BaseException:
try:
os.unlink(tmp)
except OSError:
pass
raise
atomic_json_write(path, data, indent=2, sort_keys=True)
except Exception as e:
logger.debug("Failed to save curator state: %s", e, exc_info=True)

View File

@@ -549,14 +549,32 @@ def classify_api_error(
should_fallback=True,
)
# Anthropic thinking block signature invalid (400).
# Anthropic thinking block recovery (400). Two distinct failure modes,
# same recovery (strip all reasoning_details and retry without thinking
# blocks — see the thinking_signature handler in conversation_loop.py):
# 1. Signature mismatch: a thinking block is signed against the full
# turn content; any upstream mutation (context compression, session
# truncation, message merging) invalidates the signature.
# Pattern: "signature" + "thinking".
# 2. Frozen-block mutation: Anthropic rejects any change to the
# thinking/redacted_thinking blocks in the *latest* assistant
# message — "`thinking` or `redacted_thinking` blocks in the latest
# assistant message cannot be modified. These blocks must remain as
# they were in the original response." This carries no "signature"
# token, so the original pattern missed it and the turn hard-aborted
# as a non-retryable client error instead of self-healing.
# Pattern: "thinking" + ("cannot be modified" | "must remain as they were").
# Don't gate on provider — OpenRouter proxies Anthropic errors, so the
# provider may be "openrouter" even though the error is Anthropic-specific.
# The message pattern ("signature" + "thinking") is unique enough.
# The combined patterns are unique enough.
if (
status_code == 400
and "signature" in error_msg
and "thinking" in error_msg
and (
"signature" in error_msg
or "cannot be modified" in error_msg
or "must remain as they were" in error_msg
)
):
return _result(
FailoverReason.thinking_signature,

3
agent/errors.py Normal file
View File

@@ -0,0 +1,3 @@
class SSLConfigurationError(Exception):
"""Raised when SSL/TLS certificate bundle configuration fails."""
pass

View File

@@ -46,11 +46,6 @@ def build_write_denied_paths(home: str) -> set[str]:
# Top-level Anthropic PKCE credential store remains sensitive even
# when a profile is active; default/non-profile sessions still read it.
str(hermes_root / ".anthropic_oauth.json"),
os.path.join(home, ".bashrc"),
os.path.join(home, ".zshrc"),
os.path.join(home, ".profile"),
os.path.join(home, ".bash_profile"),
os.path.join(home, ".zprofile"),
os.path.join(home, ".netrc"),
os.path.join(home, ".pgpass"),
os.path.join(home, ".npmrc"),
@@ -104,12 +99,6 @@ def is_write_denied(path: str) -> bool:
if resolved.startswith(prefix):
return True
# Hermes control-plane files: block both the ACTIVE profile's view
# (hermes_home) AND the global root view. Without the root pass, a
# profile-mode session leaves <root>/auth.json + <root>/config.yaml
# writable — letting a prompt-injected write_file overwrite the global
# files that every profile inherits from (same shape as #15981).
control_file_names = ("auth.json", "config.yaml", "webhook_subscriptions.json")
mcp_tokens_dir_name = "mcp-tokens"
hermes_dirs = []
@@ -122,12 +111,6 @@ def is_write_denied(path: str) -> bool:
continue
for base_real in hermes_dirs:
for name in control_file_names:
try:
if resolved == os.path.realpath(os.path.join(base_real, name)):
return True
except Exception:
continue
try:
mcp_real = os.path.realpath(os.path.join(base_real, mcp_tokens_dir_name))
if resolved == mcp_real or resolved.startswith(mcp_real + os.sep):

View File

@@ -41,6 +41,16 @@ DEFAULT_GEMINI_BASE_URL = "https://generativelanguage.googleapis.com/v1beta"
GEMINI_DEFAULT_MAX_OUTPUT_TOKENS = 65535
def bare_gemini_model_id(model: str) -> str:
"""Strip Gemini's own provider prefix from an aggregator-style model id."""
name = (model or "").strip()
lowered = name.lower()
for prefix in ("google/", "gemini/"):
if lowered.startswith(prefix):
return name[len(prefix):].strip() or name
return name
def is_native_gemini_base_url(base_url: str) -> bool:
"""Return True when the endpoint speaks Gemini's native REST API."""
normalized = str(base_url or "").strip().rstrip("/").lower()
@@ -330,7 +340,7 @@ def _build_gemini_contents(messages: List[Dict[str, Any]]) -> tuple[List[Dict[st
system_instruction = None
joined_system = "\n".join(part for part in system_text_parts if part).strip()
if joined_system:
system_instruction = {"parts": [{"text": joined_system}]}
system_instruction = {"role": "system", "parts": [{"text": joined_system}]}
return contents, system_instruction
@@ -914,6 +924,7 @@ class GeminiNativeClient:
thinking_config=thinking_config,
)
model = bare_gemini_model_id(model)
if stream:
return self._stream_completion(model=model, request=request, timeout=timeout)

View File

@@ -44,6 +44,66 @@ logger = logging.getLogger(__name__)
_SYNC_DRAIN_TIMEOUT_S = 5.0
def memory_provider_tools_enabled(enabled_toolsets: Optional[List[str]]) -> bool:
"""Return whether external memory-provider tools should be exposed."""
if enabled_toolsets is None:
return True
if not enabled_toolsets:
return False
if "memory" in enabled_toolsets:
return True
try:
from toolsets import resolve_toolset
return any("memory" in resolve_toolset(name) for name in enabled_toolsets)
except Exception:
logger.debug("Failed to resolve enabled toolsets for memory-provider tools", exc_info=True)
return False
def inject_memory_provider_tools(agent: Any) -> int:
"""Append external memory-provider tool schemas to an agent tool surface."""
memory_manager = getattr(agent, "_memory_manager", None)
tools = getattr(agent, "tools", None)
if not memory_manager or tools is None:
return 0
existing_tool_names = {
tool.get("function", {}).get("name")
for tool in tools
if isinstance(tool, dict)
}
if (
"memory" not in existing_tool_names
and not memory_provider_tools_enabled(getattr(agent, "enabled_toolsets", None))
):
return 0
get_schemas = getattr(memory_manager, "get_all_tool_schemas", None)
if not callable(get_schemas):
return 0
valid_tool_names = getattr(agent, "valid_tool_names", None)
if valid_tool_names is None:
valid_tool_names = set()
agent.valid_tool_names = valid_tool_names
added = 0
for schema in get_schemas():
if not isinstance(schema, dict):
continue
tool_name = schema.get("name", "")
if not tool_name or tool_name in existing_tool_names:
continue
tools.append({"type": "function", "function": schema})
valid_tool_names.add(tool_name)
existing_tool_names.add(tool_name)
added += 1
return added
# ---------------------------------------------------------------------------
# Context fencing helpers
# ---------------------------------------------------------------------------

View File

@@ -5,6 +5,7 @@ and run_agent.py for pre-flight context checks.
"""
import ipaddress
import json
import logging
import os
import re
@@ -16,7 +17,7 @@ from urllib.parse import urlparse
import requests
import yaml
from utils import base_url_host_matches, base_url_hostname
from utils import atomic_json_write, base_url_host_matches, base_url_hostname
from hermes_constants import OPENROUTER_MODELS_URL
@@ -111,6 +112,57 @@ _endpoint_model_metadata_cache: Dict[str, Dict[str, Dict[str, Any]]] = {}
_endpoint_model_metadata_cache_time: Dict[str, float] = {}
_ENDPOINT_MODEL_CACHE_TTL = 300
def _get_model_metadata_cache_path() -> Path:
"""Return path to the OpenRouter model metadata disk cache."""
from hermes_constants import get_hermes_home
return get_hermes_home() / "cache" / "openrouter_model_metadata.json"
def _model_metadata_disk_cache_age_seconds() -> Optional[float]:
"""Return disk-cache age in seconds, or None if freshness is unknown."""
try:
cache_path = _get_model_metadata_cache_path()
if not cache_path.exists():
return None
age = time.time() - cache_path.stat().st_mtime
if age < 0:
return None
return age
except Exception:
return None
def _load_model_metadata_disk_cache() -> Dict[str, Dict[str, Any]]:
"""Load processed OpenRouter metadata cache from disk."""
try:
cache_path = _get_model_metadata_cache_path()
with cache_path.open("r", encoding="utf-8") as f:
data = json.load(f)
if not isinstance(data, dict):
return {}
return {
str(key): value
for key, value in data.items()
if isinstance(value, dict)
}
except Exception as e:
logger.debug("Failed to load OpenRouter model metadata disk cache: %s", e)
return {}
def _save_model_metadata_disk_cache(data: Dict[str, Dict[str, Any]]) -> None:
"""Save processed OpenRouter metadata cache to disk atomically."""
try:
atomic_json_write(
_get_model_metadata_cache_path(),
data,
indent=0,
separators=(",", ":"),
)
except Exception as e:
logger.debug("Failed to save OpenRouter model metadata disk cache: %s", e)
# Descending tiers for context length probing when the model is unknown.
# We start at 256K (covers GPT-5.x, many current large-context models) and
# step down on context-length errors until one works. Tier[0] is also the
@@ -209,7 +261,13 @@ DEFAULT_CONTEXT_LENGTHS = {
# https://platform.minimax.io/docs/api-reference/text-chat-openai
"minimax-m3": 1000000,
"minimax": 204800,
# GLM
# GLM — GLM-5.2 ships with a 1M context window (verified empirically:
# needle-in-a-haystack retrieval at 789K prompt tokens succeeded with
# zero errors on api.z.ai/api/coding/paas/v4). Older GLM models
# (5, 5.1, 5-turbo) are ~202K. Longest-key-first substring matching
# ensures "glm-5.2" resolves to 1M while older variants still hit the
# generic 202K fallback.
"glm-5.2": 1_048_576,
"glm": 202752,
# xAI Grok — xAI /v1/models does not return context_length metadata,
# so these hardcoded fallbacks prevent Hermes from probing-down to
@@ -627,6 +685,15 @@ def fetch_model_metadata(force_refresh: bool = False) -> Dict[str, Dict[str, Any
if not force_refresh and _model_metadata_cache and (time.time() - _model_metadata_cache_time) < _MODEL_CACHE_TTL:
return _model_metadata_cache
if not force_refresh:
disk_age = _model_metadata_disk_cache_age_seconds()
if disk_age is not None and disk_age < _MODEL_CACHE_TTL:
disk_cache = _load_model_metadata_disk_cache()
if disk_cache:
_model_metadata_cache = disk_cache
_model_metadata_cache_time = time.time() - disk_age
return _model_metadata_cache
try:
response = requests.get(OPENROUTER_MODELS_URL, timeout=10, verify=_resolve_requests_verify())
response.raise_for_status()
@@ -648,12 +715,24 @@ def fetch_model_metadata(force_refresh: bool = False) -> Dict[str, Dict[str, Any
_model_metadata_cache = cache
_model_metadata_cache_time = time.time()
_save_model_metadata_disk_cache(cache)
logger.debug("Fetched metadata for %s models from OpenRouter", len(cache))
return cache
except Exception as e:
logger.warning(f"Failed to fetch model metadata from OpenRouter: {e}")
return _model_metadata_cache or {}
if _model_metadata_cache:
return _model_metadata_cache
disk_cache = _load_model_metadata_disk_cache()
if disk_cache:
_model_metadata_cache = disk_cache
disk_age = _model_metadata_disk_cache_age_seconds()
if disk_age is not None:
_model_metadata_cache_time = time.time() - min(disk_age, _MODEL_CACHE_TTL)
else:
_model_metadata_cache_time = time.time() - _MODEL_CACHE_TTL + 1
return _model_metadata_cache
return {}
def fetch_endpoint_model_metadata(

View File

@@ -135,7 +135,14 @@ def _repair_schema(node: Any, is_schema: bool = True) -> Any:
def _fill_missing_type(node: Dict[str, Any]) -> Dict[str, Any]:
"""Infer a reasonable ``type`` if this schema node has none."""
if "type" in node and node["type"] not in {None, ""}:
node_type = node.get("type")
if isinstance(node_type, list):
concrete = next(
(t for t in node_type if isinstance(t, str) and t not in {"", "null"}),
"string",
)
return {**node, "type": concrete}
if "type" in node and node_type not in {None, ""}:
return node
# Heuristic: presence of ``properties`` → object, ``items`` → array, ``enum``

View File

@@ -489,15 +489,41 @@ PLATFORM_HINTS = {
"files arrive as downloadable documents. You can also include image "
"URLs in markdown format ![alt](url) and they will be sent as photos."
),
"whatsapp_cloud": (
"You are on a text messaging communication platform, WhatsApp "
"(via Meta's official Business Cloud API). Standard markdown "
"(**bold**, ~~strike~~, # headers, [links](url)) is auto-converted "
"to WhatsApp's native syntax (*bold*, ~strike~, etc.) — feel free "
"to write in markdown. Tables are NOT supported — prefer bullet "
"lists or labeled key:value pairs. "
"You can send media files natively: include MEDIA:/absolute/path/to/file "
"in your response. Images (.jpg, .png) become photo attachments, "
"videos (.mp4) play inline, audio (.mp3, .ogg) sends as voice/audio "
"messages, other files arrive as documents. Image URLs in markdown "
"format ![alt](url) also work. "
"IMPORTANT: this platform has a 24-hour conversation window — if the "
"user hasn't messaged in 24h, free-form replies are refused by Meta "
"(error 131047). This rarely matters for live chat, but is worth "
"knowing if you're scheduling a delayed message."
),
"telegram": (
"You are on a text messaging communication platform, Telegram. "
"Standard markdown is automatically converted to Telegram format. "
"Standard Markdown is automatically converted to Telegram formatting. "
"Supported: **bold**, *italic*, ~~strikethrough~~, ||spoiler||, "
"`inline code`, ```code blocks```, [links](url), and ## headers. "
"Telegram has NO table syntax — prefer bullet lists or labeled "
"key: value pairs over pipe tables (any tables you do emit are "
"auto-rewritten into row-group bullets, which you can produce "
"directly for cleaner output). "
"Telegram now supports rich Markdown, so lean into it: whenever it "
"makes the answer clearer or easier to scan, actively reach for real "
"Markdown tables (pipe `| col | col |` syntax), bullet and numbered "
"lists, task lists (`- [ ]` / `- [x]`), headings, nested blockquotes, "
"collapsible details, footnotes/references, math/formulas (`$...$`, "
"`$$...$$`), underline, subscript/superscript, marked (highlighted) "
"text, and anchors. Default to structured formatting over dense "
"paragraphs for any comparison, set of steps, key/value summary, or "
"tabular data. Prefer real Markdown tables and task lists over "
"hand-built bullet substitutes when presenting structured data; these "
"degrade gracefully (tables become readable bullet groups) when rich "
"rendering is unavailable, but advanced constructs like math and "
"collapsible details may render as plain source text in that case. "
"You can send media files natively: to deliver a file to the user, "
"include MEDIA:/absolute/path/to/file in your response. Images "
"(.png, .jpg, .webp) appear as photos, audio (.ogg) sends as voice "
@@ -1101,11 +1127,12 @@ def _skill_should_show(
def build_skills_system_prompt(
available_tools: "set[str] | None" = None,
available_toolsets: "set[str] | None" = None,
compact_categories: "frozenset[str] | None" = None,
) -> str:
"""Build a compact skill index for the system prompt.
Two-layer cache:
1. In-process LRU dict keyed by (skills_dir, tools, toolsets)
1. In-process LRU dict keyed by (skills_dir, tools, toolsets, hidden)
2. Disk snapshot (``.skills_prompt_snapshot.json``) validated by
mtime/size manifest — survives process restarts
@@ -1115,6 +1142,12 @@ def build_skills_system_prompt(
scanned alongside the local ``~/.hermes/skills/`` directory. External dirs
are read-only — they appear in the index but new skills are always created
in the local dir. Local skills take precedence when names collide.
``compact_categories`` (e.g. from the coding posture — see
agent/coding_context.py) demotes whole categories to a names-only line in
the rendered index. Nothing is ever hidden: every skill name stays
visible and loadable via ``skill_view`` / ``skills_list``; only the
descriptions are dropped, and a footer note explains the demotion.
"""
skills_dir = get_skills_dir()
external_dirs = get_all_skills_dirs()[1:] # skip local (index 0)
@@ -1131,7 +1164,7 @@ def build_skills_system_prompt(
or get_session_env("HERMES_SESSION_PLATFORM")
or ""
)
disabled = get_disabled_skill_names()
disabled = get_disabled_skill_names(_platform_hint or None)
cache_key = (
str(skills_dir.resolve()),
tuple(str(d) for d in external_dirs),
@@ -1139,6 +1172,7 @@ def build_skills_system_prompt(
tuple(sorted(str(ts) for ts in (available_toolsets or set()))),
_platform_hint,
tuple(sorted(disabled)),
tuple(sorted(compact_categories or ())),
)
with _SKILLS_PROMPT_CACHE_LOCK:
cached = _SKILLS_PROMPT_CACHE.get(cache_key)
@@ -1272,18 +1306,44 @@ def build_skills_system_prompt(
except Exception as e:
logger.debug("Could not read external skill description %s: %s", desc_file, e)
# Posture-driven category demotion (e.g. non-coding skills while pairing
# on code). Demoted categories stay in the index as a single names-only
# line — descriptions are dropped to cut noise, but every skill name
# remains visible so memory-anchored recall ("load <name>") keeps working.
# NEVER remove entries entirely: agent-created skills are the model's
# project memory, and models don't reach for skills_list to rediscover
# what the index stops showing them. Match on the top-level category
# segment so nested categories ("social-media/twitter") are demoted with
# their parent.
demoted = frozenset(
cat for cat in skills_by_category
if cat.split("/", 1)[0] in (compact_categories or frozenset())
)
hidden_note = ""
if demoted:
hidden_note = (
"\n(Categories marked [names only] are outside the current coding "
"context, so their descriptions are omitted — the skills work "
"normally and load with skill_view(name) as usual.)"
)
if not skills_by_category:
result = ""
else:
index_lines = []
for category in sorted(skills_by_category.keys()):
# Deduplicate and sort skills within each category
seen = set()
if category in demoted:
names = sorted({name for name, _ in skills_by_category[category]})
index_lines.append(f" {category} [names only]: {', '.join(names)}")
continue
cat_desc = category_descriptions.get(category, "")
if cat_desc:
index_lines.append(f" {category}: {cat_desc}")
else:
index_lines.append(f" {category}:")
# Deduplicate and sort skills within each category
seen = set()
for name, desc in sorted(skills_by_category[category], key=lambda x: x[0]):
if name in seen:
continue
@@ -1320,6 +1380,7 @@ def build_skills_system_prompt(
"</available_skills>\n"
"\n"
"Only proceed without loading a skill if genuinely none are relevant to the task."
+ hidden_note
)
# ── Store in LRU cache ────────────────────────────────────────────
@@ -1383,13 +1444,13 @@ def build_nous_subscription_prompt(valid_tool_names: "set[str] | None" = None) -
lines = [
"# Nous Subscription",
"Nous subscription includes managed web tools (Firecrawl), image generation (FAL), OpenAI TTS, and browser automation (Browser Use) by default. Modal execution is optional.",
"Nous subscription includes managed web tools (Firecrawl), image generation (FAL), OpenAI TTS, OpenAI Whisper STT, and browser automation (Browser Use) by default. Modal execution is optional.",
"Current capability status:",
]
lines.extend(_status_line(feature) for feature in features.items())
lines.extend(
[
"When a Nous-managed feature is active, do not ask the user for Firecrawl, FAL, OpenAI TTS, or Browser-Use API keys.",
"When a Nous-managed feature is active, do not ask the user for Firecrawl, FAL, OpenAI TTS, OpenAI Whisper, or Browser-Use API keys.",
"If the user is not subscribed and asks for a capability that Nous subscription would unlock or simplify, suggest Nous subscription as one option alongside direct setup or local alternatives.",
"Do not mention subscription unless the user asks about it or it directly solves the current missing capability.",
"Useful commands: hermes setup, hermes setup tools, hermes setup terminal, hermes status.",

View File

@@ -272,27 +272,65 @@ def skill_matches_environment(frontmatter: Dict[str, Any]) -> bool:
# ── Disabled skills ───────────────────────────────────────────────────────
_RAW_CONFIG_CACHE: Dict[Tuple[str, int, int], Dict[str, Any]] = {}
def _raw_config_cache_clear() -> None:
"""Test hook — drop the shared raw config cache."""
_RAW_CONFIG_CACHE.clear()
def _load_raw_config() -> Dict[str, Any]:
"""Read config.yaml with a shared mtime+size keyed cache.
This module intentionally avoids importing ``hermes_cli.config`` on the
skill prompt/build path. A tiny local cache gives the same repeated-read
win without pulling the heavier CLI config stack into startup.
"""
config_path = get_config_path()
if not config_path.exists():
return {}
try:
stat = config_path.stat()
cache_key = (str(config_path), stat.st_mtime_ns, stat.st_size)
except OSError:
cache_key = None
if cache_key is not None:
cached = _RAW_CONFIG_CACHE.get(cache_key)
if cached is not None:
return cached
try:
parsed = yaml_load(config_path.read_text(encoding="utf-8"))
except Exception as e:
logger.debug("Could not read skill config %s: %s", config_path, e)
return {}
if not isinstance(parsed, dict):
return {}
if cache_key is not None:
_RAW_CONFIG_CACHE.clear()
_RAW_CONFIG_CACHE[cache_key] = parsed
return parsed
def get_disabled_skill_names(platform: str | None = None) -> Set[str]:
"""Read disabled skill names from config.yaml.
Args:
platform: Explicit platform name (e.g. ``"telegram"``). When
*None*, resolves from ``HERMES_PLATFORM`` or
``HERMES_SESSION_PLATFORM`` env vars. Falls back to the
global disabled list when no platform is determined.
``HERMES_SESSION_PLATFORM`` env vars. Returns the global
disabled list, unioned with the platform-specific list when a
platform is resolved (a globally-disabled skill stays disabled
on every platform).
Reads the config file directly (no CLI config imports) to stay
lightweight.
"""
config_path = get_config_path()
if not config_path.exists():
return set()
try:
parsed = yaml_load(config_path.read_text(encoding="utf-8"))
except Exception as e:
logger.debug("Could not read skill config %s: %s", config_path, e)
return set()
if not isinstance(parsed, dict):
parsed = _load_raw_config()
if not parsed:
return set()
skills_cfg = parsed.get("skills")
@@ -305,13 +343,14 @@ def get_disabled_skill_names(platform: str | None = None) -> Set[str]:
or os.getenv("HERMES_PLATFORM")
or get_session_env("HERMES_SESSION_PLATFORM")
)
global_disabled = _normalize_string_set(skills_cfg.get("disabled"))
if resolved_platform:
platform_disabled = (skills_cfg.get("platform_disabled") or {}).get(
resolved_platform
)
if platform_disabled is not None:
return _normalize_string_set(platform_disabled)
return _normalize_string_set(skills_cfg.get("disabled"))
return global_disabled | _normalize_string_set(platform_disabled)
return global_disabled
def _normalize_string_set(values) -> Set[str]:
@@ -336,6 +375,7 @@ _EXTERNAL_DIRS_CACHE: Dict[Tuple[str, int], List[Path]] = {}
def _external_dirs_cache_clear() -> None:
"""Test hook — drop the in-process cache."""
_EXTERNAL_DIRS_CACHE.clear()
_raw_config_cache_clear()
def get_external_skills_dirs() -> List[Path]:
@@ -368,11 +408,8 @@ def get_external_skills_dirs() -> List[Path]:
# Return a copy so callers can't mutate the cached list.
return list(cached)
try:
parsed = yaml_load(config_path.read_text(encoding="utf-8"))
except Exception:
return []
if not isinstance(parsed, dict):
parsed = _load_raw_config()
if not parsed:
return []
skills_cfg = parsed.get("skills")
@@ -584,15 +621,7 @@ def resolve_skill_config_values(
current values (or the declared default if the key isn't set).
Path values are expanded via ``os.path.expanduser``.
"""
config_path = get_config_path()
config: Dict[str, Any] = {}
if config_path.exists():
try:
parsed = yaml_load(config_path.read_text(encoding="utf-8"))
if isinstance(parsed, dict):
config = parsed
except Exception:
pass
config = _load_raw_config()
resolved: Dict[str, Any] = {}
for var in config_vars:

94
agent/ssl_guard.py Normal file
View File

@@ -0,0 +1,94 @@
"""Preventive SSL CA certificate checks for Hermes Agent.
This module catches broken CA bundle paths before OpenAI/httpx turns them into
opaque ``FileNotFoundError: [Errno 2] No such file or directory`` failures.
"""
from __future__ import annotations
import logging
import os
import ssl
from pathlib import Path
from agent.errors import SSLConfigurationError
logger = logging.getLogger(__name__)
_CA_BUNDLE_ENV_VARS = (
"HERMES_CA_BUNDLE",
"SSL_CERT_FILE",
"REQUESTS_CA_BUNDLE",
"CURL_CA_BUNDLE",
)
_SKIP_VALUES = {"1", "true", "yes", "on"}
def _skip_ssl_guard_enabled() -> bool:
return os.getenv("HERMES_SKIP_SSL_GUARD", "").strip().lower() in _SKIP_VALUES
def _repair_hint() -> str:
return (
"Repair: python -m pip install --force-reinstall certifi openai httpx\n"
"If you configured a custom corporate CA bundle, fix or unset the "
"broken CA bundle environment variable."
)
def _ssl_err(message: str) -> SSLConfigurationError:
"""Create a consistent, user-actionable SSL configuration error."""
return SSLConfigurationError(f"{message}\n{_repair_hint()}")
def _validate_bundle_path(label: str, value: str, *, require_substantial: bool = False) -> None:
path = Path(value).expanduser()
if not path.exists():
raise _ssl_err(f"{label} points to a missing CA bundle: {value}")
if not path.is_file():
raise _ssl_err(f"{label} does not point to a CA bundle file: {value}")
if require_substantial and path.stat().st_size < 1024:
raise _ssl_err(f"{label} at {value} appears corrupted (too small)")
try:
ctx = ssl.create_default_context(cafile=str(path))
except Exception as exc:
raise _ssl_err(f"{label} CA bundle at {value} cannot be loaded: {exc}") from exc
if not ctx.get_ca_certs():
raise _ssl_err(f"{label} CA bundle at {value} did not load any certificates")
def verify_ca_bundle() -> None:
"""Verify configured and bundled CA certificates are present and loadable.
Raises:
SSLConfigurationError: If an explicit CA-bundle environment variable
points at a bad path, or if certifi's bundled ``cacert.pem`` is
missing/corrupt.
"""
if _skip_ssl_guard_enabled():
logger.debug("SSL CA bundle guard skipped via HERMES_SKIP_SSL_GUARD")
return
for env_var in _CA_BUNDLE_ENV_VARS:
value = os.getenv(env_var)
if value:
_validate_bundle_path(env_var, value)
try:
import certifi
except Exception as exc:
raise _ssl_err(f"certifi is not importable: {exc}") from exc
ca_bundle = str(certifi.where())
_validate_bundle_path("certifi", ca_bundle, require_substantial=True)
def verify_ca_bundle_with_fallback() -> None:
"""Backward-compatible wrapper for older call sites.
The old PR name mentioned a platform fallback, but allowing startup with a
broken certifi bundle still leaves httpx/OpenAI and requests call sites
failing later. Keep the wrapper name but enforce the same check.
"""
verify_ca_bundle()

View File

@@ -191,9 +191,23 @@ def build_system_prompt_parts(agent: Any, system_message: Optional[str] = None)
)
if toolset
}
# Focus mode (opt-in) demotes non-coding skill categories to
# names-only in the index (never hidden — skill_view/skills_list
# reach everything, and every name stays visible for recall). The
# default coding posture leaves the index untouched.
_compact_cats = frozenset()
try:
from agent.coding_context import coding_compact_skill_categories
_compact_cats = coding_compact_skill_categories(
platform=agent.platform, cwd=resolve_context_cwd()
)
except Exception:
_compact_cats = frozenset()
skills_prompt = _r.build_skills_system_prompt(
available_tools=agent.valid_tool_names,
available_toolsets=avail_toolsets,
compact_categories=_compact_cats or None,
)
else:
skills_prompt = ""
@@ -221,6 +235,26 @@ def build_system_prompt_parts(agent: Any, system_message: Optional[str] = None)
if _env_hints:
stable_parts.append(_env_hints)
# Coding posture (base Hermes, any interactive coding surface in a code
# workspace — see agent/coding_context.py). The operating brief + the live
# git/workspace snapshot are built once here and cached for the session;
# the snapshot is never re-probed per turn (that would break the prompt
# cache), so the brief tells the model to re-check git before relying on it.
if agent.valid_tool_names:
try:
from agent.coding_context import coding_system_blocks
stable_parts.extend(
coding_system_blocks(
platform=agent.platform,
cwd=resolve_context_cwd(),
model=agent.model,
)
)
except Exception:
# Coding-context probing must never block prompt build.
pass
# Local Python toolchain probe — names python/pip/uv/PEP-668 state when
# something is non-default so the model can pick the right install
# strategy without discovering by failure. Emits a single line; emits

View File

@@ -417,7 +417,7 @@ def execute_tool_calls_concurrent(agent, assistant_message, messages: list, effe
# ── Logging / callbacks ──────────────────────────────────────────
tool_names_str = ", ".join(name for _, name, _, _, _, _ in parsed_calls)
if not agent.quiet_mode:
if not agent.quiet_mode and getattr(agent, "tool_progress_mode", "all") != "off":
print(f" ⚡ Concurrent: {num_tools} tool calls — {tool_names_str}")
for i, (tc, name, args, middleware_trace, block_result, blocked_by_guardrail) in enumerate(parsed_calls, 1):
args_str = json.dumps(args, ensure_ascii=False)
@@ -702,7 +702,7 @@ def execute_tool_calls_concurrent(agent, assistant_message, messages: list, effe
if agent._should_emit_quiet_tool_messages():
cute_msg = _get_cute_tool_message_impl(name, args, tool_duration, result=function_result)
agent._safe_print(f" {cute_msg}")
elif getattr(agent, "tool_progress_mode", "all") != "off":
elif not agent.quiet_mode and getattr(agent, "tool_progress_mode", "all") != "off":
_preview_str = _multimodal_text_summary(function_result)
if agent.verbose_logging:
print(f" ✅ Tool {i+1} completed in {tool_duration:.2f}s")
@@ -866,7 +866,7 @@ def execute_tool_calls_sequential(agent, assistant_message, messages: list, effe
elif function_name == "skill_manage":
agent._iters_since_skill = 0
if not agent.quiet_mode:
if not agent.quiet_mode and getattr(agent, "tool_progress_mode", "all") != "off":
args_str = json.dumps(function_args, ensure_ascii=False)
if agent.verbose_logging:
print(f" 📞 Tool {i}: {function_name}({list(function_args.keys())})")
@@ -1384,7 +1384,7 @@ def execute_tool_calls_sequential(agent, assistant_message, messages: list, effe
# entire batch. The model sees it on the next API iteration.
agent._apply_pending_steer_to_tool_results(messages, 1)
if not agent.quiet_mode:
if not agent.quiet_mode and getattr(agent, "tool_progress_mode", "all") != "off":
if agent.verbose_logging:
print(f" ✅ Tool {i} completed in {tool_duration:.2f}s")
print(agent._wrap_verbose("Result: ", function_result))

View File

@@ -84,7 +84,7 @@ class AnthropicTransport(ProviderTransport):
to OpenAI finish_reason, and collects reasoning_details in provider_data.
"""
import json
from agent.anthropic_adapter import _to_plain_data
from agent.anthropic_adapter import _to_plain_data, _sanitize_replay_block
from agent.transports.types import ToolCall
strip_tool_prefix = kwargs.get("strip_tool_prefix", False)
@@ -94,14 +94,40 @@ class AnthropicTransport(ProviderTransport):
reasoning_parts = []
reasoning_details = []
tool_calls = []
# Verbatim, order-preserving copy of every content block in the turn.
# Anthropic signs each thinking block against the turn content that
# PRECEDES it at its position; when a turn interleaves thinking and
# tool_use (adaptive/interleaved thinking, Claude 4.6+), the parallel
# reasoning_details + tool_calls lists below lose that cross-type
# ordering. Replaying the latest assistant message in the wrong order
# invalidates the signatures -> HTTP 400 "thinking ... blocks in the
# latest assistant message cannot be modified". Preserve the exact
# block sequence here so the adapter can replay it unchanged. See
# tests/agent/test_anthropic_thinking_block_order.py.
ordered_blocks = []
for block in response.content:
block_dict = _to_plain_data(block)
clean_block = None
if isinstance(block_dict, dict):
# Sanitize at capture so output-only SDK fields (parsed_output,
# caller, citations=None, …) never persist to state.db and leak
# back as request input on replay → HTTP 400 "Extra inputs are
# not permitted". Defence-in-depth with the replay-side sanitize.
clean_block = _sanitize_replay_block(block_dict)
if clean_block is not None:
ordered_blocks.append(clean_block)
if block.type == "text":
text_parts.append(block.text)
elif block.type == "thinking":
reasoning_parts.append(block.thinking)
block_dict = _to_plain_data(block)
if isinstance(block_dict, dict):
elif block.type in ("thinking", "redacted_thinking"):
if block.type == "thinking":
reasoning_parts.append(block.thinking)
# Use the sanitized block (clean_block) for reasoning_details too,
# since _extract_preserved_thinking_blocks replays these on the
# non-ordered path. Falls back to raw only if sanitize dropped it.
if isinstance(clean_block, dict):
reasoning_details.append(clean_block)
elif isinstance(block_dict, dict):
reasoning_details.append(block_dict)
elif block.type == "tool_use":
name = block.name
@@ -130,6 +156,23 @@ class AnthropicTransport(ProviderTransport):
provider_data = {}
if reasoning_details:
provider_data["reasoning_details"] = reasoning_details
# Only worth carrying the ordered-blocks channel when the turn
# actually interleaves signed thinking with tool_use — that's the
# only shape the parallel lists reconstruct incorrectly. A turn that
# is purely text, or thinking-then-tools with a single leading
# thinking block, replays correctly without it.
_has_signed_thinking = any(
isinstance(b, dict)
and b.get("type") in ("thinking", "redacted_thinking")
and (b.get("signature") or b.get("data"))
for b in ordered_blocks
)
_has_tool_use = any(
isinstance(b, dict) and b.get("type") == "tool_use"
for b in ordered_blocks
)
if _has_signed_thinking and _has_tool_use:
provider_data["anthropic_content_blocks"] = ordered_blocks
return NormalizedResponse(
content="\n".join(text_parts) if text_parts else None,
@@ -143,10 +186,21 @@ class AnthropicTransport(ProviderTransport):
def validate_response(self, response: Any) -> bool:
"""Check Anthropic response structure is valid.
An empty content list is legitimate when ``stop_reason == "end_turn"``
— the model's canonical way of signalling "nothing more to add" after
a tool turn that already delivered the user-facing text. Treating it
as invalid falsely retries a completed response.
An empty content list is legitimate for terminal stop reasons that
carry no text payload:
- ``end_turn`` — the model's canonical "nothing more to add" after a
tool turn that already delivered the user-facing text.
- ``refusal`` — the model declined to respond (Claude 4.5+). The
Messages API returns an empty ``content`` list with this stop
reason. Treating it as invalid sends a deterministic refusal into
the invalid-response retry loop, which reproduces the refusal on
every attempt and surfaces a misleading "rate limited / invalid
response" error instead of the refusal. ``normalize_response`` maps
``refusal`` → ``content_filter`` so the agent loop's refusal handler
can surface it.
Treating either as invalid falsely retries a completed response.
"""
if response is None:
return False
@@ -154,7 +208,7 @@ class AnthropicTransport(ProviderTransport):
if not isinstance(content_blocks, list):
return False
if not content_blocks:
return getattr(response, "stop_reason", None) == "end_turn"
return getattr(response, "stop_reason", None) in {"end_turn", "refusal"}
return True
def extract_cache_stats(self, response: Any) -> Optional[Dict[str, int]]:

View File

@@ -388,6 +388,13 @@ class ChatCompletionsTransport(ProviderTransport):
if provider_prefs and is_openrouter:
extra_body["provider"] = provider_prefs
# OpenRouter usage accounting — response `usage.cost` carries the REAL
# charged cost (credits are 1:1 USD). Parity with the profile path in
# plugins/model-providers/openrouter/__init__.py; this branch only runs
# when the OpenRouter profile isn't loaded.
if is_openrouter:
extra_body["usage"] = {"include": True}
# Pareto Code router plugin — model-gated. Same shape as the
# profile path in plugins/model-providers/openrouter/__init__.py;
# this branch only runs when the OpenRouter profile isn't loaded.
@@ -664,8 +671,42 @@ class ChatCompletionsTransport(ProviderTransport):
if rd:
provider_data["reasoning_details"] = rd
# OpenAI structured-refusal field. When a model declines, the SDK
# populates ``message.refusal`` with the explanation and leaves
# ``content`` empty. OpenAI-compatible proxies that front Anthropic /
# Bedrock (e.g. Nous Portal) surface a Claude refusal this way — or via
# ``finish_reason="content_filter"`` — instead of the native
# ``stop_reason="refusal"``. Without capturing it the refusal looks
# like an empty response, so the agent loop retries a deterministic
# refusal three times and gives up with "no content after retries".
# Promote it to content + a ``content_filter`` finish reason so the
# loop's refusal handler surfaces it clearly and stops. ``refusal`` is
# ``None`` for normal responses, so this is a no-op in the common case.
content = msg.content
refusal = getattr(msg, "refusal", None)
if refusal is None and hasattr(msg, "model_extra"):
_msg_extra = getattr(msg, "model_extra", None) or {}
if isinstance(_msg_extra, dict):
refusal = _msg_extra.get("refusal")
if isinstance(refusal, str) and refusal.strip():
# Record the refusal explanation regardless — it's useful provider
# metadata even when the model also returned a usable payload.
provider_data["refusal"] = refusal
_has_text = isinstance(content, str) and content.strip()
_has_tool_calls = bool(tool_calls)
# Only promote to a terminal ``content_filter`` when the refusal is
# the *sole* payload — no visible text and no tool calls. A response
# that carries real content (or tool calls) alongside a refusal note
# is a normal, usable turn: surfacing it as a failed safety refusal
# would discard the model's actual work. In the empty-payload case,
# adopt the refusal as content so the loop has something to show.
if not _has_text and not _has_tool_calls:
content = refusal
if finish_reason in (None, "stop"):
finish_reason = "content_filter"
return NormalizedResponse(
content=msg.content,
content=content,
tool_calls=tool_calls,
finish_reason=finish_reason,
reasoning=reasoning,

View File

@@ -218,22 +218,10 @@ class ResponsesApiTransport(ProviderTransport):
kwargs.pop("timeout", None)
if is_codex_backend:
prompt_cache_key = kwargs.get("prompt_cache_key")
cache_scope_id = str(prompt_cache_key or session_id or "").strip()
if cache_scope_id:
existing_extra_headers = kwargs.get("extra_headers")
merged_extra_headers: Dict[str, str] = {}
if isinstance(existing_extra_headers, dict):
merged_extra_headers.update(
{
str(key): str(value)
for key, value in existing_extra_headers.items()
if key and value is not None
}
)
merged_extra_headers["session_id"] = cache_scope_id
merged_extra_headers["x-client-request-id"] = cache_scope_id
kwargs["extra_headers"] = merged_extra_headers
# chatgpt.com/backend-api/codex rejects body-level
# ``extra_headers`` with HTTP 400. Correlation/cache routing for
# this backend must not be sent through the Responses payload.
kwargs.pop("extra_headers", None)
max_tokens = params.get("max_tokens")
if max_tokens is not None and not is_codex_backend:

View File

@@ -121,6 +121,18 @@ class NormalizedResponse:
pd = self.provider_data or {}
return pd.get("reasoning_details")
@property
def anthropic_content_blocks(self):
"""Verbatim, order-preserving Anthropic content blocks for a turn.
Present only when an Anthropic turn interleaves signed thinking with
tool_use — the one shape the parallel reasoning_details + tool_calls
lists reconstruct in the wrong order, invalidating thinking-block
signatures on replay. See agent/transports/anthropic.py.
"""
pd = self.provider_data or {}
return pd.get("anthropic_content_blocks")
@property
def codex_reasoning_items(self):
pd = self.provider_data or {}

View File

@@ -852,6 +852,100 @@ def estimate_usage_cost(
)
def _finite_nonneg_number(value: Any) -> Optional[float]:
"""Return ``value`` as a float when it is a real, finite, non-negative
number (int/float, not bool); otherwise None."""
if isinstance(value, bool) or not isinstance(value, (int, float)):
return None
try:
f = float(value)
except (TypeError, ValueError):
return None
if f != f or f in (float("inf"), float("-inf")) or f < 0:
return None
return f
def extract_provider_cost_usd(response_usage: Any) -> Optional[float]:
"""Provider-REPORTED cost (USD) from a response ``usage`` object, or None.
Reads the ``usage.cost`` field that OpenRouter's usage accounting returns
(``usage: {"include": true}`` request param; OpenRouter credits are 1:1
USD). OpenRouter-compatible aggregators use the same field. This NEVER
estimates: when the provider reports nothing, the result is None — callers
must treat None as "no cost data", not zero. A reported ``0`` is a real
zero (e.g. free-tier models) and is returned as ``0.0``.
"""
if response_usage is None:
return None
cost = getattr(response_usage, "cost", None)
if cost is None and isinstance(response_usage, dict):
cost = response_usage.get("cost")
return _finite_nonneg_number(cost)
def real_session_cost_usd(agent: Any) -> Optional[float]:
"""Session-cumulative provider-REPORTED cost in USD, or None.
Combines the two real sources Hermes has — no estimation, ever:
- ``agent.session_actual_cost_usd``: per-response ``usage.cost``
accumulator (OpenRouter usage accounting).
- Nous ``x-nous-credits-*`` header delta via
``agent.get_credits_spent_micros()`` (account-level spend since the
session first saw a header; clamped at 0 so a mid-session top-up
doesn't render a negative cost).
Returns None when neither source has reported anything — callers must
hide their cost display in that case rather than showing $0.00.
"""
total: Optional[float] = None
actual = _finite_nonneg_number(getattr(agent, "session_actual_cost_usd", None))
if actual is not None:
total = actual
try:
spent_micros = agent.get_credits_spent_micros()
except Exception:
spent_micros = None
if spent_micros is not None:
try:
spent_usd = max(0, int(spent_micros)) / 1_000_000
except (TypeError, ValueError):
spent_usd = None
if spent_usd is not None:
total = (total or 0.0) + spent_usd
return total
def nous_header_cost_usd(agent: Any) -> Optional[float]:
"""Session-cumulative cost in USD derived ONLY from the Nous portal
``x-nous-credits-*`` header delta, or None.
This is the STATUS-BAR cost source (glitch 2026-06-13, F3): the TUI chrome
must show cost ONLY when the session runs against the Nous portal, because
the header delta is the one figure we can trust without re-deriving per-model
cache/input/output pricing (which is unreliable across the model long tail).
Unlike :func:`real_session_cost_usd`, this DELIBERATELY ignores the
OpenRouter ``usage.cost`` accumulator — a non-Nous route reports no header,
so the chrome hides its cost segment entirely.
The ``/usage`` accounting page keeps using ``real_session_cost_usd`` (both
provider-reported sources); only the chrome bar narrows to header-only.
"""
try:
spent_micros = agent.get_credits_spent_micros()
except Exception:
return None
if spent_micros is None:
return None
try:
return max(0, int(spent_micros)) / 1_000_000
except (TypeError, ValueError):
return None
def has_known_pricing(
model_name: str,
provider: Optional[str] = None,

View File

@@ -11,7 +11,8 @@
"tauri": "tauri",
"tauri:dev": "tauri dev",
"tauri:build": "tauri build",
"tauri:build:debug": "tauri build --debug"
"tauri:build:debug": "tauri build --debug",
"typecheck": "tsc -p . --noEmit"
},
"dependencies": {
"@nous-research/ui": "0.16.0",
@@ -40,7 +41,7 @@
"@types/react": "^19.2.14",
"@types/react-dom": "^19.2.3",
"@vitejs/plugin-react": "^5.2.0",
"typescript": "~5.9.3",
"typescript": "^6.0.3",
"vite": "^7.3.1"
}
}

View File

@@ -3,8 +3,9 @@
//! Driven when the installer is launched as `Hermes-Setup.exe --update` (see
//! `AppMode` in lib.rs). The desktop app hands off to us — it exits, then we:
//!
//! 1. wait for the old Hermes desktop process to fully exit (so the venv
//! shim is free; otherwise `hermes update` aborts with exit code 2),
//! 1. wait for the old Hermes desktop process to fully exit (so both the
//! venv shim and packaged app.asar are free; otherwise `hermes update`
//! or repair bootstrap can race locked files),
//! 2. run `hermes update --yes --gateway` (Python/repo update; this does NOT
//! rebuild apps/desktop by design — see cmd_update in hermes_cli/main.py),
//! 3. run `hermes desktop --build-only` (the rebuild step update skips),
@@ -38,8 +39,8 @@ use crate::events::{BootstrapEvent, LogStream, StageInfo, StageState};
/// hermes_cli/main.py (sys.exit(2)). We surface a targeted message for this.
const UPDATE_EXIT_CONCURRENT: i32 = 2;
/// How long to wait for the old desktop process to release the venv shim
/// before giving up and letting `hermes update`'s own guard decide.
/// How long to wait for the old desktop process to release files under the
/// install tree before giving up and letting `hermes update`'s own guard decide.
const DESKTOP_EXIT_WAIT: Duration = Duration::from_secs(20);
const DESKTOP_EXIT_POLL: Duration = Duration::from_millis(500);
@@ -150,8 +151,10 @@ async fn run_update(app: AppHandle) -> Result<()> {
// ---- pre-step: wait for the old desktop to die -----------------------
// The desktop exec'd us then called app.exit(), but process teardown is
// async on Windows. If it still holds the venv shim, `hermes update`
// aborts with exit 2. Give it a bounded window to clear.
wait_for_venv_free(&install_root, &app).await;
// aborts with exit 2. If it still holds the packaged app.asar,
// install.ps1's repair/re-clone path cannot move/remove the install tree.
// Give both handles a bounded window to clear.
wait_for_install_locks_free(&install_root, &app, "update").await;
// ---- stage 1: hermes update -----------------------------------------
// Pass --branch so `hermes update` targets the branch this installer was
@@ -173,8 +176,8 @@ async fn run_update(app: AppHandle) -> Result<()> {
vec!["update".into(), "--yes".into(), "--gateway".into()];
// --force skips `hermes update`'s Windows running-exe guard (which would
// `sys.exit(2)` and dead-end the handoff). By contract the desktop has
// already exited and waited for the venv shim to unlock before launching
// us, and wait_for_venv_free below force-kills any straggler — so by the
// already exited and waited for the install locks to clear before launching
// us, and wait_for_install_locks_free below force-kills any straggler — so by the
// time `hermes update` runs there is no legitimate hermes.exe to protect,
// and the guard would only produce a false "Hermes is still running" stop.
update_args.push("--force".into());
@@ -391,48 +394,57 @@ async fn run_update(app: AppHandle) -> Result<()> {
Ok(())
}
/// Poll until the venv shim is no longer locked (Windows) or a bounded timeout
/// elapses. On non-Windows this is a short fixed grace since file locking
/// isn't the failure mode there.
async fn wait_for_venv_free(install_root: &Path, app: &AppHandle) {
let shim = venv_hermes(install_root);
/// Poll until the venv shim AND packaged desktop app bundle are no longer locked
/// (Windows) or a bounded timeout elapses. On non-Windows this is a short fixed
/// grace since file locking isn't the failure mode there.
pub(crate) async fn wait_for_install_locks_free(install_root: &Path, app: &AppHandle, stage: &str) {
let lock_targets = install_lock_probe_paths(install_root);
let deadline = Instant::now() + DESKTOP_EXIT_WAIT;
emit_log(app, Some("update"), LogStream::Stdout, "[update] waiting for Hermes to exit…");
emit_log(app, Some(stage), LogStream::Stdout, "[handoff] waiting for Hermes to exit…");
loop {
if !is_locked(&shim) {
let locked = locked_paths(&lock_targets);
if locked.is_empty() {
return;
}
if Instant::now() >= deadline {
// Last resort: a backend hermes.exe (or a grandchild it spawned)
// is still holding the shim. The desktop should have reaped its
// tree before handing off, but SIGTERM races / detached
// grandchildren / AV handles can leave a straggler. Rather than
// "proceed anyway" straight into uv's "Access is denied", force-kill
// every hermes.exe except ourselves, then give the OS a beat to
// unload the image.
// Last resort: a backend hermes.exe (or the desktop Hermes.exe
// itself) is still holding one of the update-sensitive files. The
// desktop should have reaped its tree before handing off, but
// SIGTERM races / detached grandchildren / AV handles can leave a
// straggler. Rather than "proceed anyway" straight into uv's
// "Access is denied" or install.ps1's locked app.asar failure,
// force-kill every Hermes.exe except ourselves, then give the OS a
// beat to unload the image.
emit_log(
app,
Some("update"),
Some(stage),
LogStream::Stdout,
"[update] Hermes still holding the venv shim; force-killing stragglers…",
&format!(
"[handoff] Hermes still holding install files ({}); force-killing stragglers…",
format_locked_paths(&locked)
),
);
force_kill_other_hermes();
tokio::time::sleep(Duration::from_millis(800)).await;
if !is_locked(&shim) {
let locked_after_kill = locked_paths(&lock_targets);
if locked_after_kill.is_empty() {
emit_log(
app,
Some("update"),
Some(stage),
LogStream::Stdout,
"[update] venv shim freed after force-kill",
"[handoff] install files freed after force-kill",
);
} else {
emit_log(
app,
Some("update"),
Some(stage),
LogStream::Stdout,
"[update] venv shim still locked; proceeding (--force + quarantine will handle it)",
&format!(
"[handoff] install files still locked ({}); proceeding (--force + quarantine will handle it)",
format_locked_paths(&locked_after_kill)
),
);
}
return;
@@ -441,13 +453,44 @@ async fn wait_for_venv_free(install_root: &Path, app: &AppHandle) {
}
}
fn install_lock_probe_paths(install_root: &Path) -> Vec<PathBuf> {
let mut paths = vec![venv_hermes(install_root)];
paths.extend(desktop_app_payload_paths(install_root));
paths
}
fn desktop_app_payload_paths(install_root: &Path) -> Vec<PathBuf> {
let release = install_root.join("apps").join("desktop").join("release");
if cfg!(target_os = "windows") {
vec![
release.join("win-unpacked").join("resources").join("app.asar"),
release.join("win-arm64-unpacked").join("resources").join("app.asar"),
]
} else if cfg!(target_os = "macos") {
vec![
release.join("mac").join("Hermes.app").join("Contents").join("Resources").join("app.asar"),
release.join("mac-arm64").join("Hermes.app").join("Contents").join("Resources").join("app.asar"),
]
} else {
vec![release.join("linux-unpacked").join("resources").join("app.asar")]
}
}
fn locked_paths(paths: &[PathBuf]) -> Vec<PathBuf> {
paths.iter().filter(|p| is_locked(p)).cloned().collect()
}
fn format_locked_paths(paths: &[PathBuf]) -> String {
paths.iter().map(|p| p.display().to_string()).collect::<Vec<_>>().join(", ")
}
/// Force-kill any `hermes.exe` other than this process. Windows-only; a no-op
/// elsewhere (POSIX has no mandatory-lock contention). We can't selectively
/// target "the backend" by PID here — the desktop already exited and we never
/// knew its children — so we kill the whole `hermes.exe` image tree via
/// taskkill, excluding our own PID.
///
/// Safe w.r.t. our own update child: this runs inside `wait_for_venv_free`,
/// Safe w.r.t. our own update child: this runs inside the install-lock wait,
/// which completes BEFORE we spawn `venv\Scripts\hermes.exe update`. At this
/// point no update-driven hermes.exe exists yet, so the only hermes.exe images
/// are stragglers from the old desktop — exactly what we want gone. (`/FI PID
@@ -891,6 +934,29 @@ mod tests {
assert!(!is_locked(Path::new("/nonexistent/does/not/exist/xyz")));
}
#[test]
fn lock_probe_paths_include_desktop_app_payload() {
let root = Path::new("/x/hermes-agent");
let probes = install_lock_probe_paths(root);
assert!(
probes.iter().any(|p| p == &venv_hermes(root)),
"venv shim remains part of the update lock probe"
);
assert!(
probes.iter().any(|p| p.ends_with(Path::new("resources/app.asar"))),
"packaged app.asar must be probed so repair/re-clone waits for the old desktop to exit"
);
}
#[test]
fn locked_paths_ignores_missing_payloads() {
let root = Path::new("/nonexistent/hermes-agent");
let probes = install_lock_probe_paths(root);
assert!(locked_paths(&probes).is_empty());
}
#[test]
fn parses_update_branch_from_space_or_equals_args() {
assert_eq!(

View File

@@ -16,9 +16,8 @@
"noUnusedParameters": true,
"esModuleInterop": true,
"noFallthroughCasesInSwitch": true,
"baseUrl": ".",
"paths": {
"@/*": ["src/*"]
"@/*": ["./src/*"]
}
},
"include": ["src"],

View File

@@ -93,7 +93,7 @@ Run before opening a PR (lint may surface pre-existing warnings but must exit cl
```bash
npm run fix
npm run type-check
npm run typecheck
npm run lint
npm run test:desktop:all
```

View File

@@ -0,0 +1,112 @@
const path = require('node:path')
// Match the POSIX fallback surface used by the Python terminal environment.
// macOS apps launched from Finder/Dock often inherit only /usr/bin:/bin:/usr/sbin:/sbin,
// which misses Apple Silicon Homebrew and user-installed CLI tools such as codex.
const POSIX_SANE_PATH_ENTRIES = Object.freeze([
'/opt/homebrew/bin',
'/opt/homebrew/sbin',
'/usr/local/sbin',
'/usr/local/bin',
'/usr/sbin',
'/usr/bin',
'/sbin',
'/bin'
])
function delimiterForPlatform(platform = process.platform) {
return platform === 'win32' ? ';' : ':'
}
function pathModuleForPlatform(platform = process.platform) {
return platform === 'win32' ? path.win32 : path.posix
}
function pathEnvKey(env = process.env, platform = process.platform) {
if (platform !== 'win32') return 'PATH'
return Object.keys(env || {}).find(key => key.toUpperCase() === 'PATH') || 'PATH'
}
function currentPathValue(env = process.env, platform = process.platform) {
const key = pathEnvKey(env, platform)
return env?.[key] || ''
}
function appendUniquePathEntries(entries, { delimiter = path.delimiter } = {}) {
const seen = new Set()
const ordered = []
for (const entry of entries) {
if (!entry) continue
const parts = Array.isArray(entry) ? entry : String(entry).split(delimiter)
for (const part of parts) {
if (!part || seen.has(part)) continue
seen.add(part)
ordered.push(part)
}
}
return ordered.join(delimiter)
}
function buildDesktopBackendPath({
hermesHome,
venvRoot,
currentPath = '',
platform = process.platform,
pathModule = pathModuleForPlatform(platform)
} = {}) {
const delimiter = delimiterForPlatform(platform)
const hermesNodeBin = hermesHome ? pathModule.join(hermesHome, 'node', 'bin') : null
const venvBin = venvRoot ? pathModule.join(venvRoot, platform === 'win32' ? 'Scripts' : 'bin') : null
const saneEntries = platform === 'win32' ? [] : POSIX_SANE_PATH_ENTRIES
return appendUniquePathEntries(
[hermesNodeBin, venvBin, currentPath, saneEntries],
{ delimiter }
)
}
function normalizeHermesHomeRoot(hermesHome, { pathModule = pathModuleForPlatform(process.platform) } = {}) {
if (!hermesHome) return hermesHome
const resolved = pathModule.resolve(String(hermesHome))
const parent = pathModule.dirname(resolved)
if (pathModule.basename(parent).toLowerCase() === 'profiles') {
return pathModule.dirname(parent)
}
return resolved
}
function buildDesktopBackendEnv({
hermesHome,
pythonPathEntries = [],
venvRoot,
currentEnv = process.env,
platform = process.platform,
pathModule = pathModuleForPlatform(platform)
} = {}) {
const delimiter = delimiterForPlatform(platform)
const currentPythonPath = currentEnv?.PYTHONPATH || ''
const key = pathEnvKey(currentEnv, platform)
return {
PYTHONPATH: appendUniquePathEntries([...pythonPathEntries, currentPythonPath], { delimiter }),
[key]: buildDesktopBackendPath({
hermesHome,
venvRoot,
currentPath: currentPathValue(currentEnv, platform),
platform,
pathModule
})
}
}
module.exports = {
POSIX_SANE_PATH_ENTRIES,
appendUniquePathEntries,
buildDesktopBackendEnv,
buildDesktopBackendPath,
delimiterForPlatform,
normalizeHermesHomeRoot,
pathEnvKey
}

View File

@@ -0,0 +1,111 @@
const test = require('node:test')
const assert = require('node:assert/strict')
const path = require('node:path')
const {
POSIX_SANE_PATH_ENTRIES,
appendUniquePathEntries,
buildDesktopBackendEnv,
buildDesktopBackendPath,
normalizeHermesHomeRoot,
pathEnvKey
} = require('./backend-env.cjs')
test('desktop backend PATH adds Hermes-managed bins and missing POSIX sane entries', () => {
const result = buildDesktopBackendPath({
hermesHome: '/Users/test/.hermes',
venvRoot: '/Users/test/.hermes/hermes-agent/venv',
currentPath: '/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin',
platform: 'darwin',
pathModule: path.posix
})
const entries = result.split(':')
assert.equal(entries[0], '/Users/test/.hermes/node/bin')
assert.equal(entries[1], '/Users/test/.hermes/hermes-agent/venv/bin')
assert.ok(entries.includes('/opt/homebrew/bin'), 'Apple Silicon Homebrew bin is added')
assert.ok(entries.includes('/opt/homebrew/sbin'), 'Apple Silicon Homebrew sbin is added')
assert.ok(entries.includes('/usr/local/sbin'), 'missing standard sbin is added')
for (const expected of POSIX_SANE_PATH_ENTRIES) {
assert.ok(entries.includes(expected), `${expected} should be present`)
}
})
test('desktop backend PATH preserves first occurrence and avoids duplicates', () => {
const result = buildDesktopBackendPath({
hermesHome: '/Users/test/.hermes',
venvRoot: '/Users/test/.hermes/hermes-agent/venv',
currentPath: '/opt/homebrew/bin:/usr/bin:/opt/homebrew/bin:/bin',
platform: 'darwin',
pathModule: path.posix
})
const entries = result.split(':')
assert.equal(entries.filter(entry => entry === '/opt/homebrew/bin').length, 1)
assert.ok(
entries.indexOf('/opt/homebrew/bin') < entries.indexOf('/opt/homebrew/sbin'),
'existing Homebrew bin keeps its precedence over appended missing sane entries'
)
})
test('buildDesktopBackendEnv extends PYTHONPATH and backend PATH together', () => {
const env = buildDesktopBackendEnv({
hermesHome: '/Users/test/.hermes',
pythonPathEntries: ['/repo/hermes-agent'],
venvRoot: '/Users/test/.hermes/hermes-agent/venv',
currentEnv: {
PATH: '/usr/bin:/bin',
PYTHONPATH: '/existing/pythonpath'
},
platform: 'darwin',
pathModule: path.posix
})
assert.equal(env.PYTHONPATH, '/repo/hermes-agent:/existing/pythonpath')
assert.ok(env.PATH.startsWith('/Users/test/.hermes/node/bin:/Users/test/.hermes/hermes-agent/venv/bin:'))
assert.ok(env.PATH.includes('/opt/homebrew/bin'))
})
test('normalizeHermesHomeRoot maps profile homes back to the global Hermes root', () => {
assert.equal(
normalizeHermesHomeRoot('/Users/test/.hermes/profiles/oracle', { pathModule: path.posix }),
'/Users/test/.hermes'
)
assert.equal(
normalizeHermesHomeRoot('C:\\Users\\test\\AppData\\Local\\hermes\\profiles\\oracle', { pathModule: path.win32 }),
'C:\\Users\\test\\AppData\\Local\\hermes'
)
assert.equal(
normalizeHermesHomeRoot('/Users/test/.hermes', { pathModule: path.posix }),
'/Users/test/.hermes'
)
})
test('Windows PATH casing and delimiter are preserved without POSIX sane entries', () => {
const env = buildDesktopBackendEnv({
hermesHome: 'C:\\Users\\test\\AppData\\Local\\hermes',
pythonPathEntries: ['C:\\repo\\hermes-agent'],
venvRoot: 'C:\\Users\\test\\AppData\\Local\\hermes\\hermes-agent\\venv',
currentEnv: {
Path: 'C:\\Windows\\System32;C:\\Windows',
PYTHONPATH: 'C:\\existing\\pythonpath'
},
platform: 'win32',
pathModule: path.win32
})
assert.equal(pathEnvKey({ Path: 'x' }, 'win32'), 'Path')
assert.equal(env.PATH, undefined)
assert.ok(env.Path.startsWith('C:\\Users\\test\\AppData\\Local\\hermes\\node\\bin;'))
assert.ok(env.Path.includes('\\venv\\Scripts;'))
assert.ok(env.Path.includes(';C:\\Windows\\System32;C:\\Windows'))
assert.equal(env.Path.includes('/opt/homebrew/bin'), false)
})
test('appendUniquePathEntries drops empty entries and keeps first occurrence', () => {
assert.equal(
appendUniquePathEntries([':/a::/b', ['/a', '/c']], { delimiter: ':' }),
'/a:/b:/c'
)
})

View File

@@ -0,0 +1,66 @@
const _READY_RE = /^HERMES_DASHBOARD_READY port=(\d+)/m
/**
* Watch a child process's stdout for the `HERMES_DASHBOARD_READY port=<N>`
* line that web_server.py prints after uvicorn binds its socket.
*
* Returns the parsed port. Rejects if:
* - the child exits before emitting the line
* - the child emits an `error` event
* - no line arrives within the timeout
*
* A single `cleanup()` tears down every listener (data/exit/error/timeout)
* on every terminal path — resolve, reject, or timeout — so repeated
* backend spawns don't leak listener slots on the child.
*/
function waitForDashboardPort(child, timeoutMs = 45_000) {
return new Promise((resolve, reject) => {
let buf = ''
let done = false
function cleanup() {
if (done) return
done = true
clearTimeout(timer)
child.stdout.off('data', onData)
child.off('exit', onExit)
child.off('error', onError)
}
function onData(chunk) {
buf += chunk.toString()
let nl
while ((nl = buf.indexOf('\n')) !== -1) {
const line = buf.slice(0, nl)
buf = buf.slice(nl + 1)
const m = line.match(_READY_RE)
if (m) {
cleanup()
resolve(parseInt(m[1], 10))
return
}
}
}
function onExit(code, signal) {
cleanup()
reject(new Error(`Hermes backend: exited before port announcement (${signal || code})`))
}
function onError(err) {
cleanup()
reject(err)
}
const timer = setTimeout(() => {
cleanup()
reject(new Error(`Timed out waiting for Hermes backend port announcement (${timeoutMs}ms)`))
}, timeoutMs)
child.stdout.on('data', onData)
child.on('exit', onExit)
child.on('error', onError)
})
}
module.exports = { waitForDashboardPort }

View File

@@ -40,6 +40,15 @@ const path = require('node:path')
const https = require('node:https')
const { spawn } = require('node:child_process')
const IS_WINDOWS = process.platform === 'win32'
function hiddenWindowsChildOptions(options = {}) {
if (!IS_WINDOWS || Object.prototype.hasOwnProperty.call(options, 'windowsHide')) {
return options
}
return { ...options, windowsHide: true }
}
const STAMP_COMMIT_RE = /^[0-9a-f]{7,40}$/i
// Stages flagged needs_user_input=true in the manifest are skipped by the
@@ -284,7 +293,7 @@ function spawnPowerShell(scriptPath, args, { emit, stageName, abortSignal, herme
const ps = process.platform === 'win32' ? resolveWindowsPowerShell() : 'pwsh'
const fullArgs = ['-NoProfile', '-ExecutionPolicy', 'Bypass', '-File', scriptPath, ...args]
const child = spawn(ps, fullArgs, {
const child = spawn(ps, fullArgs, hiddenWindowsChildOptions({
stdio: ['ignore', 'pipe', 'pipe'],
env: {
...process.env,
@@ -292,7 +301,7 @@ function spawnPowerShell(scriptPath, args, { emit, stageName, abortSignal, herme
// choice rather than re-computing the default.
HERMES_HOME: hermesHome || process.env.HERMES_HOME || ''
}
})
}))
let stdout = ''
let stderr = ''

View File

@@ -0,0 +1,99 @@
/**
* Helpers for local dashboard session-token discovery.
*
* The desktop main process can pass HERMES_DASHBOARD_SESSION_TOKEN when it
* spawns the local dashboard, but the dashboard is the source of truth for the
* token it actually serves to the renderer. If those drift, HTTP readiness
* probes still pass while /api/ws rejects the renderer's token.
*/
const DEFAULT_TOKEN_FETCH_TIMEOUT_MS = 3_000
async function fetchPublicText(url, options = {}) {
const { protocol } = new URL(url)
if (protocol !== 'http:' && protocol !== 'https:') {
throw new Error(`Unsupported Hermes backend URL protocol: ${protocol}`)
}
const timeoutMs = options.timeoutMs ?? DEFAULT_TOKEN_FETCH_TIMEOUT_MS
const res = await fetch(url, { signal: AbortSignal.timeout(timeoutMs) }).catch(error => {
if (error.name === 'TimeoutError') {
throw new Error(`Timed out connecting to Hermes backend after ${timeoutMs}ms`)
}
throw error
})
const text = await res.text()
if (!res.ok) throw new Error(`${res.status}: ${text || res.statusText}`)
return text
}
function extractInjectedDashboardToken(html) {
const match = /window\.__HERMES_SESSION_TOKEN__\s*=\s*("(?:\\.|[^"\\])*")/.exec(String(html || ''))
if (!match) return null
try {
return JSON.parse(match[1])
} catch {
return null
}
}
function dashboardIndexUrl(baseUrl) {
return `${String(baseUrl || '').replace(/\/+$/, '')}/`
}
async function resolveServedDashboardToken(baseUrl, fallbackToken, options = {}) {
const fetchText = options.fetchText || fetchPublicText
const html = await fetchText(dashboardIndexUrl(baseUrl), {
timeoutMs: options.timeoutMs ?? DEFAULT_TOKEN_FETCH_TIMEOUT_MS
})
const servedToken = extractInjectedDashboardToken(html)
if (servedToken && servedToken !== fallbackToken && typeof options.rememberLog === 'function') {
options.rememberLog('[boot] dashboard served a different session token; using served token for WebSocket auth')
}
return servedToken || fallbackToken
}
/**
* A served token that differs from our spawn token while our child is DEAD
* came from a process we did not spawn (orphan/port squatter that satisfied
* the public /api/status readiness probe). With a live child the mismatch is
* benign: our own backend regenerated the token because the env pin did not
* survive the spawn.
*/
function isForeignBackendToken({ servedToken, spawnToken, childAlive }) {
return Boolean(servedToken) && servedToken !== spawnToken && !childAlive
}
/**
* Resolve the token the backend actually serves, adopting benign drift and
* failing loudly on a foreign backend. `childAlive` is a thunk so liveness is
* sampled after the fetch, not before.
*/
async function adoptServedDashboardToken(baseUrl, spawnToken, { childAlive, label = 'Hermes backend', ...options }) {
const servedToken = await resolveServedDashboardToken(baseUrl, spawnToken, options).catch(error => {
options.rememberLog?.(`[boot] could not read served dashboard token (${label}): ${error.message}`)
return spawnToken
})
if (isForeignBackendToken({ servedToken, spawnToken, childAlive: childAlive() })) {
throw new Error(
`${label} exited and ${dashboardIndexUrl(baseUrl)} is served by a process we did not spawn; refusing its session token.`
)
}
return servedToken
}
module.exports = {
DEFAULT_TOKEN_FETCH_TIMEOUT_MS,
adoptServedDashboardToken,
dashboardIndexUrl,
extractInjectedDashboardToken,
fetchPublicText,
isForeignBackendToken,
resolveServedDashboardToken
}

View File

@@ -0,0 +1,142 @@
/**
* Tests for electron/dashboard-token.cjs.
*
* Run with: node --test electron/dashboard-token.test.cjs
* (Wired into npm test:desktop:platforms in package.json.)
*/
const test = require('node:test')
const assert = require('node:assert/strict')
const {
adoptServedDashboardToken,
dashboardIndexUrl,
extractInjectedDashboardToken,
fetchPublicText,
isForeignBackendToken,
resolveServedDashboardToken
} = require('./dashboard-token.cjs')
test('extractInjectedDashboardToken reads the JSON-encoded dashboard token', () => {
const html = '<script>window.__HERMES_SESSION_TOKEN__="served-token";window.__HERMES_BASE_PATH__=""</script>'
assert.equal(extractInjectedDashboardToken(html), 'served-token')
})
test('extractInjectedDashboardToken handles escaped token strings', () => {
const html = '<script>window.__HERMES_SESSION_TOKEN__="served\\\\token\\"quoted";</script>'
assert.equal(extractInjectedDashboardToken(html), 'served\\token"quoted')
})
test('extractInjectedDashboardToken returns null for missing or malformed values', () => {
assert.equal(extractInjectedDashboardToken('<html></html>'), null)
assert.equal(extractInjectedDashboardToken('<script>window.__HERMES_SESSION_TOKEN__={bad}</script>'), null)
})
test('dashboardIndexUrl preserves dashboard path prefixes', () => {
assert.equal(dashboardIndexUrl('http://127.0.0.1:9120'), 'http://127.0.0.1:9120/')
assert.equal(dashboardIndexUrl('https://host.example/hermes/'), 'https://host.example/hermes/')
})
test('resolveServedDashboardToken uses the served token and logs when it differs', async () => {
const logs = []
const token = await resolveServedDashboardToken('http://127.0.0.1:9120', 'spawn-token', {
fetchText: async url => {
assert.equal(url, 'http://127.0.0.1:9120/')
return '<script>window.__HERMES_SESSION_TOKEN__="served-token";</script>'
},
rememberLog: line => logs.push(line)
})
assert.equal(token, 'served-token')
assert.equal(logs.length, 1)
assert.match(logs[0], /served a different session token/)
})
test('resolveServedDashboardToken falls back when the served HTML has no token', async () => {
const token = await resolveServedDashboardToken('http://127.0.0.1:9120', 'spawn-token', {
fetchText: async () => '<html></html>',
rememberLog: () => {
throw new Error('should not log when no served token is present')
}
})
assert.equal(token, 'spawn-token')
})
test('resolveServedDashboardToken does not log when served token matches fallback', async () => {
const token = await resolveServedDashboardToken('http://127.0.0.1:9120', 'same-token', {
fetchText: async () => '<script>window.__HERMES_SESSION_TOKEN__="same-token";</script>',
rememberLog: () => {
throw new Error('should not log when token already matches')
}
})
assert.equal(token, 'same-token')
})
test('resolveServedDashboardToken propagates fetch errors so callers can fall back explicitly', async () => {
await assert.rejects(
() =>
resolveServedDashboardToken('http://127.0.0.1:9120', 'spawn-token', {
fetchText: async () => {
throw new Error('boom')
}
}),
/boom/
)
})
test('fetchPublicText rejects unsupported protocols', async () => {
await assert.rejects(() => fetchPublicText('file:///tmp/index.html'), /Unsupported Hermes backend URL protocol/)
})
test('isForeignBackendToken only flags a mismatched token from a dead child', () => {
const cases = [
[{ servedToken: 'other', spawnToken: 'mine', childAlive: false }, true],
// Live child + drift = our backend regenerated the token (env pin lost).
[{ servedToken: 'other', spawnToken: 'mine', childAlive: true }, false],
[{ servedToken: 'mine', spawnToken: 'mine', childAlive: false }, false],
[{ servedToken: 'mine', spawnToken: 'mine', childAlive: true }, false],
[{ servedToken: null, spawnToken: 'mine', childAlive: false }, false],
[{ servedToken: '', spawnToken: 'mine', childAlive: false }, false]
]
for (const [input, expected] of cases) {
assert.equal(isForeignBackendToken(input), expected, JSON.stringify(input))
}
})
test('adoptServedDashboardToken adopts drift from a live child', async () => {
const token = await adoptServedDashboardToken('http://127.0.0.1:9120', 'spawn-token', {
childAlive: () => true,
fetchText: async () => '<script>window.__HERMES_SESSION_TOKEN__="served-token";</script>'
})
assert.equal(token, 'served-token')
})
test('adoptServedDashboardToken refuses a foreign token when our child is dead', async () => {
await assert.rejects(
() =>
adoptServedDashboardToken('http://127.0.0.1:9120', 'spawn-token', {
childAlive: () => false,
fetchText: async () => '<script>window.__HERMES_SESSION_TOKEN__="squatter-token";</script>',
label: 'Hermes backend for profile "work"'
}),
/profile "work".*process we did not spawn/
)
})
test('adoptServedDashboardToken falls back to the spawn token when the fetch fails', async () => {
const logs = []
const token = await adoptServedDashboardToken('http://127.0.0.1:9120', 'spawn-token', {
childAlive: () => true,
fetchText: async () => {
throw new Error('boom')
},
rememberLog: line => logs.push(line)
})
assert.equal(token, 'spawn-token')
assert.equal(logs.length, 1)
assert.match(logs[0], /could not read served dashboard token \(Hermes backend\): boom/)
})

View File

@@ -0,0 +1,109 @@
'use strict'
const fs = require('node:fs')
const path = require('node:path')
const { resolveDirectoryForIpc } = require('./hardening.cjs')
const FS_READDIR_STAT_CONCURRENCY = 16
// Always-hidden noise (covers non-git projects too; gitignore catches many of
// these, but the project tree should keep the same hygiene without one).
const FS_READDIR_HIDDEN = new Set([
'.git',
'.hg',
'.svn',
'.cache',
'.next',
'.turbo',
'.venv',
'__pycache__',
'build',
'dist',
'node_modules',
'target',
'venv'
])
function direntIsDirectory(dirent) {
return typeof dirent.isDirectory === 'function' && dirent.isDirectory()
}
function direntIsFile(dirent) {
return typeof dirent.isFile === 'function' && dirent.isFile()
}
function direntIsSymbolicLink(dirent) {
return typeof dirent.isSymbolicLink === 'function' && dirent.isSymbolicLink()
}
function shouldStatDirent(dirent) {
if (direntIsDirectory(dirent)) return false
return direntIsSymbolicLink(dirent) || !direntIsFile(dirent)
}
async function entryForDirent(dirent, resolved, fsImpl) {
const fullPath = path.join(resolved, dirent.name)
let isDirectory = direntIsDirectory(dirent)
if (!isDirectory && shouldStatDirent(dirent)) {
try {
isDirectory = (await fsImpl.promises.stat(fullPath)).isDirectory()
} catch {
isDirectory = false
}
}
return { name: dirent.name, path: fullPath, isDirectory }
}
async function mapWithStatConcurrency(items, mapper) {
const results = new Array(items.length)
let nextIndex = 0
async function runWorker() {
while (nextIndex < items.length) {
const index = nextIndex
nextIndex += 1
results[index] = await mapper(items[index])
}
}
const workerCount = Math.min(FS_READDIR_STAT_CONCURRENCY, items.length)
const workers = Array.from({ length: workerCount }, () => runWorker())
await Promise.all(workers)
return results
}
async function readDirForIpc(dirPath, options = {}) {
const fsImpl = options.fs || fs
let resolved
try {
;({ resolvedPath: resolved } = await resolveDirectoryForIpc(dirPath, {
fs: fsImpl,
purpose: 'Directory read'
}))
} catch (error) {
return { entries: [], error: error?.code || 'read-error' }
}
try {
const dirents = await fsImpl.promises.readdir(resolved, { withFileTypes: true })
const visibleDirents = dirents.filter(dirent => !FS_READDIR_HIDDEN.has(dirent.name))
const entries = await mapWithStatConcurrency(visibleDirents, dirent =>
entryForDirent(dirent, resolved, fsImpl)
)
entries.sort((a, b) => Number(b.isDirectory) - Number(a.isDirectory) || a.name.localeCompare(b.name))
return { entries }
} catch (error) {
return { entries: [], error: error?.code || 'read-error' }
}
}
module.exports = {
readDirForIpc
}

View File

@@ -0,0 +1,364 @@
'use strict'
const assert = require('node:assert/strict')
const fs = require('node:fs')
const os = require('node:os')
const path = require('node:path')
const test = require('node:test')
const { pathToFileURL } = require('node:url')
const { readDirForIpc } = require('./fs-read-dir.cjs')
function mkTmpDir() {
return fs.mkdtempSync(path.join(os.tmpdir(), 'hermes-fs-read-dir-'))
}
function fakeDirent(name, flags = {}) {
return {
name,
isDirectory: () => Boolean(flags.directory),
isFile: () => Boolean(flags.file),
isSymbolicLink: () => Boolean(flags.symlink)
}
}
test('readDirForIpc hides noisy directories and files from the project tree', async () => {
const root = mkTmpDir()
try {
fs.mkdirSync(path.join(root, 'node_modules'))
fs.mkdirSync(path.join(root, 'src'))
fs.writeFileSync(path.join(root, 'target'), 'hidden file')
fs.writeFileSync(path.join(root, 'README.md'), 'visible file')
const result = await readDirForIpc(root)
assert.equal(result.error, undefined)
assert.deepEqual(
result.entries.map(entry => entry.name),
['src', 'README.md']
)
} finally {
fs.rmSync(root, { recursive: true, force: true })
}
})
test('readDirForIpc filters a hidden basename whether it is a file or directory', async () => {
const dirRoot = mkTmpDir()
const fileRoot = mkTmpDir()
try {
fs.mkdirSync(path.join(dirRoot, 'node_modules'))
fs.writeFileSync(path.join(dirRoot, 'visible.txt'), 'visible')
fs.writeFileSync(path.join(fileRoot, 'node_modules'), 'hidden file')
fs.writeFileSync(path.join(fileRoot, 'visible.txt'), 'visible')
assert.deepEqual(
(await readDirForIpc(dirRoot)).entries.map(entry => entry.name),
['visible.txt']
)
assert.deepEqual(
(await readDirForIpc(fileRoot)).entries.map(entry => entry.name),
['visible.txt']
)
} finally {
fs.rmSync(dirRoot, { recursive: true, force: true })
fs.rmSync(fileRoot, { recursive: true, force: true })
}
})
test('readDirForIpc returns directories before files and sorts by name within groups', async () => {
const root = mkTmpDir()
try {
fs.writeFileSync(path.join(root, 'z.txt'), 'z')
fs.mkdirSync(path.join(root, 'src'))
fs.writeFileSync(path.join(root, 'a.txt'), 'a')
fs.mkdirSync(path.join(root, 'lib'))
const result = await readDirForIpc(root)
assert.equal(result.error, undefined)
assert.deepEqual(
result.entries.map(entry => entry.name),
['lib', 'src', 'a.txt', 'z.txt']
)
} finally {
fs.rmSync(root, { recursive: true, force: true })
}
})
test('readDirForIpc accepts file URLs for directories', async () => {
const root = mkTmpDir()
try {
fs.mkdirSync(path.join(root, 'src'))
fs.writeFileSync(path.join(root, 'README.md'), 'visible file')
const result = await readDirForIpc(pathToFileURL(root).toString())
assert.equal(result.error, undefined)
assert.deepEqual(
result.entries.map(entry => entry.name),
['src', 'README.md']
)
} finally {
fs.rmSync(root, { recursive: true, force: true })
}
})
test('readDirForIpc returns invalid-path for blank or non-string input', async () => {
let readdirCalls = 0
const fsImpl = {
promises: {
readdir: async () => {
readdirCalls += 1
return []
}
}
}
assert.deepEqual(await readDirForIpc('', { fs: fsImpl }), { entries: [], error: 'invalid-path' })
assert.deepEqual(await readDirForIpc(' ', { fs: fsImpl }), { entries: [], error: 'invalid-path' })
assert.deepEqual(await readDirForIpc(null, { fs: fsImpl }), { entries: [], error: 'invalid-path' })
assert.equal(readdirCalls, 0)
})
test('readDirForIpc rejects Windows device paths before readdir', async () => {
let readdirCalls = 0
const fsImpl = {
promises: {
readdir: async () => {
readdirCalls += 1
return []
}
}
}
assert.deepEqual(await readDirForIpc('\\\\?\\C:\\secret', { fs: fsImpl }), {
entries: [],
error: 'device-path'
})
assert.equal(readdirCalls, 0)
})
test('readDirForIpc returns filesystem error codes instead of throwing', async () => {
const root = mkTmpDir()
try {
const result = await readDirForIpc(path.join(root, 'missing'))
assert.deepEqual(result, { entries: [], error: 'ENOENT' })
} finally {
fs.rmSync(root, { recursive: true, force: true })
}
})
test('readDirForIpc marks a symlink to a directory as a directory', async t => {
const root = mkTmpDir()
try {
fs.mkdirSync(path.join(root, 'actual-dir'))
try {
fs.symlinkSync(path.join(root, 'actual-dir'), path.join(root, 'linked-dir'), 'dir')
} catch (error) {
if (error?.code === 'EPERM' || error?.code === 'EACCES') {
t.skip(`symlink creation is not permitted on this platform (${error.code})`)
return
}
throw error
}
const result = await readDirForIpc(root)
const linked = result.entries.find(entry => entry.name === 'linked-dir')
assert.equal(result.error, undefined)
assert.equal(linked?.isDirectory, true)
} finally {
fs.rmSync(root, { recursive: true, force: true })
}
})
test('readDirForIpc marks a Windows junction to a directory as a directory', async t => {
if (process.platform !== 'win32') {
t.skip('junctions are a Windows-specific symlink type')
return
}
const root = mkTmpDir()
try {
fs.mkdirSync(path.join(root, 'actual-dir'))
try {
fs.symlinkSync(path.join(root, 'actual-dir'), path.join(root, 'junction-dir'), 'junction')
} catch (error) {
if (error?.code === 'EPERM' || error?.code === 'EACCES') {
t.skip(`junction creation is not permitted on this platform (${error.code})`)
return
}
throw error
}
const result = await readDirForIpc(root)
const junction = result.entries.find(entry => entry.name === 'junction-dir')
assert.equal(result.error, undefined)
assert.equal(junction?.isDirectory, true)
} finally {
fs.rmSync(root, { recursive: true, force: true })
}
})
test('readDirForIpc allows expanding symlink or junction directories outside the project root', async t => {
const root = mkTmpDir()
const outside = mkTmpDir()
try {
fs.writeFileSync(path.join(outside, 'outside.txt'), 'ok')
const linkPath = path.join(root, 'outside-link')
try {
fs.symlinkSync(outside, linkPath, process.platform === 'win32' ? 'junction' : 'dir')
} catch (error) {
if (error?.code === 'EPERM' || error?.code === 'EACCES') {
t.skip(`directory symlink creation is not permitted on this platform (${error.code})`)
return
}
throw error
}
const result = await readDirForIpc(linkPath)
assert.equal(result.error, undefined)
assert.deepEqual(result.entries, [
{ name: 'outside.txt', path: path.join(linkPath, 'outside.txt'), isDirectory: false }
])
} finally {
fs.rmSync(root, { recursive: true, force: true })
fs.rmSync(outside, { recursive: true, force: true })
}
})
test('readDirForIpc stats symbolic links and unknown entries without dropping the whole listing', async () => {
const input = path.join('virtual-root')
const resolved = path.resolve(input)
const statCalls = []
const fsImpl = {
promises: {
readdir: async () => [
fakeDirent('unknown-entry'),
fakeDirent('linked-dir', { symlink: true }),
fakeDirent('broken-link', { symlink: true }),
fakeDirent('plain.txt', { file: true })
],
stat: async fullPath => {
if (fullPath === resolved) {
return { isDirectory: () => true }
}
statCalls.push(fullPath)
if (fullPath.endsWith(`${path.sep}linked-dir`)) {
return { isDirectory: () => true }
}
throw Object.assign(new Error('gone'), { code: 'ENOENT' })
}
}
}
const result = await readDirForIpc(input, { fs: fsImpl })
assert.equal(result.error, undefined)
assert.deepEqual(
statCalls.sort(),
[path.join(resolved, 'broken-link'), path.join(resolved, 'linked-dir'), path.join(resolved, 'unknown-entry')].sort()
)
assert.deepEqual(result.entries, [
{ name: 'linked-dir', path: path.join(resolved, 'linked-dir'), isDirectory: true },
{ name: 'broken-link', path: path.join(resolved, 'broken-link'), isDirectory: false },
{ name: 'plain.txt', path: path.join(resolved, 'plain.txt'), isDirectory: false },
{ name: 'unknown-entry', path: path.join(resolved, 'unknown-entry'), isDirectory: false }
])
})
test('readDirForIpc bounds concurrent stats while preserving complete sorted output', async () => {
const input = path.join('virtual-root')
const resolved = path.resolve(input)
const names = Array.from({ length: 105 }, (_, index) => `entry-${String(104 - index).padStart(3, '0')}`)
const failedName = 'entry-100'
const directoryNames = new Set(names.filter((_, index) => index % 10 === 4))
const successfulDirectoryNames = new Set([...directoryNames].filter(name => name !== failedName))
const statCalls = []
let active = 0
let peak = 0
let releaseStats
let markFirstStatStarted
const statsReleased = new Promise(resolve => {
releaseStats = resolve
})
const firstStatStarted = new Promise(resolve => {
markFirstStatStarted = resolve
})
const fsImpl = {
promises: {
readdir: async () => [
fakeDirent('node_modules', { symlink: true }),
...names.map((name, index) => fakeDirent(name, { symlink: index % 2 === 0 }))
],
stat: async fullPath => {
if (fullPath === resolved) {
return { isDirectory: () => true }
}
statCalls.push(fullPath)
active += 1
peak = Math.max(peak, active)
markFirstStatStarted()
await statsReleased
active -= 1
const name = path.basename(fullPath)
if (name === failedName) {
throw Object.assign(new Error('gone'), { code: 'ENOENT' })
}
return { isDirectory: () => successfulDirectoryNames.has(name) }
}
}
}
const resultPromise = readDirForIpc(input, { fs: fsImpl })
await firstStatStarted
await new Promise(resolve => setImmediate(resolve))
releaseStats()
const result = await resultPromise
const expectedNames = [
...names.filter(name => successfulDirectoryNames.has(name)).sort(),
...names.filter(name => !successfulDirectoryNames.has(name)).sort()
]
assert.equal(result.error, undefined)
assert.equal(result.entries.length, names.length)
assert.equal(statCalls.length, names.length)
assert.equal(statCalls.some(fullPath => fullPath.endsWith(`${path.sep}node_modules`)), false)
assert.ok(peak > 1, `expected concurrent stats, observed peak ${peak}`)
assert.ok(peak <= 16, `expected at most 16 concurrent stats, observed peak ${peak}`)
assert.deepEqual(
result.entries.map(entry => entry.name),
expectedNames
)
assert.equal(result.entries.find(entry => entry.name === failedName)?.isDirectory, false)
assert.equal(
result.entries.filter(entry => entry.isDirectory).length,
successfulDirectoryNames.size
)
})

View File

@@ -0,0 +1,54 @@
'use strict'
const fs = require('node:fs')
const path = require('node:path')
const { resolveRequestedPathForIpc } = require('./hardening.cjs')
function findGitRoot(start, fsImpl = fs) {
let dir = start
for (let i = 0; i < 50; i += 1) {
try {
if (fsImpl.existsSync(path.join(dir, '.git'))) {
return dir
}
} catch {
return null
}
const parent = path.dirname(dir)
if (parent === dir) {
return null
}
dir = parent
}
return null
}
async function gitRootForIpc(startPath, options = {}) {
const fsImpl = options.fs || fs
let resolved
try {
resolved = resolveRequestedPathForIpc(startPath, { purpose: 'Git root' })
} catch {
return null
}
try {
const stat = await fsImpl.promises.stat(resolved)
const start = stat.isDirectory() ? resolved : path.dirname(resolved)
return findGitRoot(start, fsImpl)
} catch {
return findGitRoot(resolved, fsImpl)
}
}
module.exports = {
findGitRoot,
gitRootForIpc
}

View File

@@ -0,0 +1,40 @@
'use strict'
const assert = require('node:assert/strict')
const fs = require('node:fs')
const os = require('node:os')
const path = require('node:path')
const test = require('node:test')
const { pathToFileURL } = require('node:url')
const { gitRootForIpc } = require('./git-root.cjs')
function mkTmpDir() {
return fs.mkdtempSync(path.join(os.tmpdir(), 'hermes-git-root-'))
}
test('gitRootForIpc returns null for invalid and device paths', async () => {
assert.equal(await gitRootForIpc(''), null)
assert.equal(await gitRootForIpc(' '), null)
assert.equal(await gitRootForIpc(null), null)
assert.equal(await gitRootForIpc('\\\\?\\C:\\secret'), null)
assert.equal(await gitRootForIpc('file:///%E0%A4%A'), null)
})
test('gitRootForIpc resolves directories files missing descendants and file URLs', async t => {
const root = mkTmpDir()
t.after(() => fs.rmSync(root, { recursive: true, force: true }))
const gitDir = path.join(root, '.git')
const srcDir = path.join(root, 'src')
const filePath = path.join(srcDir, 'index.ts')
fs.mkdirSync(gitDir)
fs.mkdirSync(srcDir)
fs.writeFileSync(filePath, 'export {}\n', 'utf8')
assert.equal(await gitRootForIpc(root), root)
assert.equal(await gitRootForIpc(srcDir), root)
assert.equal(await gitRootForIpc(filePath), root)
assert.equal(await gitRootForIpc(pathToFileURL(filePath).toString()), root)
assert.equal(await gitRootForIpc(path.join(srcDir, 'missing.ts')), root)
})

View File

@@ -0,0 +1,174 @@
'use strict'
// Resolve git-worktree relationships for a set of session cwds, reading git's
// on-disk metadata directly (no `git` spawn per path):
//
// - A normal checkout has a `.git` DIRECTORY at its root → it's the main
// worktree; its repo root IS that directory's parent.
// - A linked worktree has a `.git` FILE: `gitdir: <repo>/.git/worktrees/<name>`.
// That admin dir's `commondir` points back at the shared `<repo>/.git`, whose
// parent is the main repo root.
//
// Grouping by repoRoot therefore clusters a repo's main checkout with all of its
// linked worktrees, regardless of how the worktree directories are named. The
// branch (read from the worktree's own HEAD) gives each worktree a meaningful
// label.
const fs = require('node:fs')
const path = require('node:path')
const { resolveRequestedPathForIpc } = require('./hardening.cjs')
// Walk up from `start` to the nearest ancestor that carries a `.git` entry
// (file for a linked worktree, dir for the main checkout). Capped so a stray
// path can't loop forever.
function findGitHost(start, fsImpl) {
let dir = start
for (let i = 0; i < 64; i += 1) {
const dotgit = path.join(dir, '.git')
try {
if (fsImpl.existsSync(dotgit)) {
return dir
}
} catch {
return null
}
const parent = path.dirname(dir)
if (parent === dir) {
return null
}
dir = parent
}
return null
}
function readBranch(gitDir, fsImpl) {
try {
const head = fsImpl.readFileSync(path.join(gitDir, 'HEAD'), 'utf8').trim()
const ref = head.match(/^ref:\s*refs\/heads\/(.+)$/)
if (ref) {
return ref[1]
}
// Detached HEAD: surface a short sha so the worktree still gets a label.
return /^[0-9a-f]{7,40}$/i.test(head) ? head.slice(0, 8) : null
} catch {
return null
}
}
// Given the directory that owns the `.git` entry, resolve its worktree identity.
function resolveFromHost(host, fsImpl) {
const dotgit = path.join(host, '.git')
let stat
try {
stat = fsImpl.statSync(dotgit)
} catch {
return null
}
if (stat.isDirectory()) {
return {
repoRoot: host,
worktreeRoot: host,
isMainWorktree: true,
branch: readBranch(dotgit, fsImpl)
}
}
// Linked worktree: `.git` is a file pointing at the admin dir.
let contents
try {
contents = fsImpl.readFileSync(dotgit, 'utf8').trim()
} catch {
return null
}
const match = contents.match(/^gitdir:\s*(.+)$/m)
if (!match) {
return null
}
const adminDir = path.resolve(host, match[1].trim())
// `commondir` resolves to the shared `<repo>/.git`; fall back to walking two
// levels up from `<repo>/.git/worktrees/<name>` if it's missing.
let commonDir
try {
const rel = fsImpl.readFileSync(path.join(adminDir, 'commondir'), 'utf8').trim()
commonDir = path.resolve(adminDir, rel)
} catch {
commonDir = path.dirname(path.dirname(adminDir))
}
return {
repoRoot: path.dirname(commonDir),
worktreeRoot: host,
isMainWorktree: false,
branch: readBranch(adminDir, fsImpl)
}
}
function resolveWorktree(startPath, fsImpl = fs) {
let resolved
try {
resolved = resolveRequestedPathForIpc(startPath, { purpose: 'Worktree lookup' })
} catch {
return null
}
let start = resolved
try {
const stat = fsImpl.statSync(resolved)
if (!stat.isDirectory()) {
start = path.dirname(resolved)
}
} catch {
return null
}
const host = findGitHost(start, fsImpl)
if (!host) {
return null
}
return resolveFromHost(host, fsImpl)
}
// Batch entry point for the renderer: maps each requested cwd to its worktree
// info (or null when it isn't inside a git checkout / can't be read). Dedupes so
// many sessions sharing a cwd cost one lookup.
async function worktreesForIpc(cwds, options = {}) {
const fsImpl = options.fs || fs
const list = Array.isArray(cwds) ? cwds : []
const out = {}
for (const cwd of list) {
if (typeof cwd !== 'string' || !cwd.trim() || cwd in out) {
continue
}
out[cwd] = resolveWorktree(cwd, fsImpl)
}
return out
}
module.exports = {
resolveWorktree,
worktreesForIpc
}

View File

@@ -1,4 +1,5 @@
const fs = require('node:fs')
const os = require('node:os')
const path = require('node:path')
const { fileURLToPath } = require('node:url')
@@ -106,71 +107,162 @@ function sensitiveFileBlockReason(filePath) {
return null
}
function resolveRequestedFilePath(filePath, baseDir = process.cwd(), purpose = 'File read') {
const raw = String(filePath || '').trim()
function ipcPathError(code, message) {
const error = new Error(message)
error.code = code
return error
}
function rejectUnsafePathSyntax(filePath, purpose = 'File read') {
if (typeof filePath !== 'string') {
throw ipcPathError('invalid-path', `${purpose} failed: file path is required.`)
}
const raw = filePath.trim()
if (!raw) {
throw new Error(`${purpose} failed: file path is required.`)
throw ipcPathError('invalid-path', `${purpose} failed: file path is required.`)
}
if (raw.includes('\0')) {
throw new Error(`${purpose} failed: file path is invalid.`)
throw ipcPathError('invalid-path', `${purpose} failed: file path is invalid.`)
}
const normalized = raw.replace(/\\/g, '/').toLowerCase()
if (
normalized.startsWith('//?/') ||
normalized.startsWith('//./') ||
normalized.startsWith('globalroot/device/') ||
normalized.includes('/globalroot/device/')
) {
throw ipcPathError('device-path', `${purpose} blocked: Windows device paths are not allowed.`)
}
return raw
}
function resolveRequestedPathForIpc(filePath, options = {}) {
const purpose = String(options.purpose || 'File read')
let raw = rejectUnsafePathSyntax(filePath, purpose)
// Gateway-reported cwds (config `terminal.cwd`, remote sessions) routinely
// arrive as `~/...`. Node's fs has no shell — without expansion the path
// resolves under process.cwd() and every read "ENOENT"s forever.
if (raw === '~' || raw.startsWith('~/') || raw.startsWith('~\\')) {
raw = path.join(os.homedir(), raw.slice(1))
}
if (/^file:/i.test(raw)) {
let resolvedPath
try {
return fileURLToPath(raw)
const parsed = new URL(raw)
if (parsed.protocol !== 'file:') {
throw new Error('not a file URL')
}
resolvedPath = fileURLToPath(parsed)
} catch {
throw new Error(`${purpose} failed: file URL is invalid.`)
throw ipcPathError('invalid-path', `${purpose} failed: file URL is invalid.`)
}
rejectUnsafePathSyntax(resolvedPath, purpose)
return path.resolve(resolvedPath)
}
const resolvedBase = path.resolve(String(baseDir || process.cwd()))
return path.resolve(resolvedBase, raw)
const baseInput = typeof options.baseDir === 'string' && options.baseDir.trim() ? options.baseDir : process.cwd()
const safeBaseInput = rejectUnsafePathSyntax(baseInput, purpose)
const resolvedBase = path.resolve(safeBaseInput)
rejectUnsafePathSyntax(resolvedBase, purpose)
const resolvedPath = path.resolve(resolvedBase, raw)
rejectUnsafePathSyntax(resolvedPath, purpose)
return resolvedPath
}
async function statForIpc(fsImpl, resolvedPath, purpose, typeLabel) {
try {
return await fsImpl.promises.stat(resolvedPath)
} catch (error) {
const code = error && typeof error === 'object' ? error.code : ''
if (code === 'ENOENT' || code === 'ENOTDIR') {
throw ipcPathError(code || 'ENOENT', `${purpose} failed: ${typeLabel} does not exist.`)
}
throw ipcPathError(code || 'read-error', `${purpose} failed: ${error instanceof Error ? error.message : String(error)}`)
}
}
async function realpathForIpc(fsImpl, resolvedPath, purpose) {
if (typeof fsImpl.promises.realpath !== 'function') {
return resolvedPath
}
try {
const realPath = await fsImpl.promises.realpath(resolvedPath)
rejectUnsafePathSyntax(realPath, purpose)
return realPath
} catch (error) {
const code = error && typeof error === 'object' ? error.code : ''
throw ipcPathError(code || 'read-error', `${purpose} failed: ${error instanceof Error ? error.message : String(error)}`)
}
}
function rejectSensitiveFilePath(filePath, purpose) {
const blockReason = sensitiveFileBlockReason(filePath)
if (blockReason) {
throw ipcPathError('sensitive-file', `${purpose} blocked for sensitive file: ${blockReason}`)
}
}
async function resolveDirectoryForIpc(dirPath, options = {}) {
const purpose = String(options.purpose || 'Directory read')
const fsImpl = options.fs || fs
const resolvedPath = resolveRequestedPathForIpc(dirPath, { baseDir: options.baseDir, purpose })
const stat = await statForIpc(fsImpl, resolvedPath, purpose, 'directory')
if (!stat.isDirectory()) {
throw ipcPathError('ENOTDIR', `${purpose} failed: path is not a directory.`)
}
const realPath = await realpathForIpc(fsImpl, resolvedPath, purpose)
return { realPath, resolvedPath, stat }
}
async function resolveReadableFileForIpc(filePath, options = {}) {
const purpose = String(options.purpose || 'File read')
const resolvedPath = resolveRequestedFilePath(filePath, options.baseDir, purpose)
const fsImpl = options.fs || fs
const resolvedPath = resolveRequestedPathForIpc(filePath, { baseDir: options.baseDir, purpose })
if (options.blockSensitive !== false) {
const blockReason = sensitiveFileBlockReason(resolvedPath)
if (blockReason) {
throw new Error(`${purpose} blocked for sensitive file: ${blockReason}`)
}
rejectSensitiveFilePath(resolvedPath, purpose)
}
let stat
try {
stat = await fs.promises.stat(resolvedPath)
} catch (error) {
const code = error && typeof error === 'object' ? error.code : ''
if (code === 'ENOENT' || code === 'ENOTDIR') {
throw new Error(`${purpose} failed: file does not exist.`)
}
throw new Error(`${purpose} failed: ${error instanceof Error ? error.message : String(error)}`)
}
const stat = await statForIpc(fsImpl, resolvedPath, purpose, 'file')
if (stat.isDirectory()) {
throw new Error(`${purpose} failed: path points to a directory.`)
throw ipcPathError('EISDIR', `${purpose} failed: path points to a directory.`)
}
if (!stat.isFile()) {
throw new Error(`${purpose} failed: only regular files can be read.`)
throw ipcPathError('EINVAL', `${purpose} failed: only regular files can be read.`)
}
const realPath = await realpathForIpc(fsImpl, resolvedPath, purpose)
if (options.blockSensitive !== false) {
rejectSensitiveFilePath(realPath, purpose)
}
const maxBytes = Number.isFinite(options.maxBytes) && Number(options.maxBytes) > 0 ? Number(options.maxBytes) : null
if (maxBytes && stat.size > maxBytes) {
throw new Error(`${purpose} failed: file is too large (${stat.size} bytes; limit ${maxBytes} bytes).`)
throw ipcPathError('EFBIG', `${purpose} failed: file is too large (${stat.size} bytes; limit ${maxBytes} bytes).`)
}
try {
await fs.promises.access(resolvedPath, fs.constants.R_OK)
await fsImpl.promises.access(resolvedPath, fs.constants.R_OK)
} catch {
throw new Error(`${purpose} failed: file is not readable.`)
throw ipcPathError('EACCES', `${purpose} failed: file is not readable.`)
}
return { resolvedPath, stat }
return { realPath, resolvedPath, stat }
}
module.exports = {
@@ -178,7 +270,10 @@ module.exports = {
DEFAULT_FETCH_TIMEOUT_MS,
TEXT_PREVIEW_SOURCE_MAX_BYTES,
encryptDesktopSecret,
rejectUnsafePathSyntax,
resolveDirectoryForIpc,
resolveReadableFileForIpc,
resolveRequestedPathForIpc,
resolveTimeoutMs,
sensitiveFileBlockReason
}

View File

@@ -8,11 +8,20 @@ const { pathToFileURL } = require('node:url')
const {
DEFAULT_FETCH_TIMEOUT_MS,
encryptDesktopSecret,
resolveDirectoryForIpc,
resolveReadableFileForIpc,
resolveRequestedPathForIpc,
resolveTimeoutMs,
sensitiveFileBlockReason
} = require('./hardening.cjs')
async function rejectsWithCode(promise, code) {
await assert.rejects(promise, error => {
assert.equal(error?.code, code)
return true
})
}
test('resolveTimeoutMs falls back to defaults and accepts overrides', () => {
assert.equal(resolveTimeoutMs(undefined), DEFAULT_FETCH_TIMEOUT_MS)
assert.equal(resolveTimeoutMs(0), DEFAULT_FETCH_TIMEOUT_MS)
@@ -51,6 +60,65 @@ test('sensitiveFileBlockReason blocks obvious secret file patterns', () => {
assert.match(String(sensitiveFileBlockReason('/tmp/server-cert.pem')), /\.pem/)
})
test('path helpers reject blank non-string NUL and Windows device syntax', async () => {
await rejectsWithCode(resolveReadableFileForIpc('', { purpose: 'File preview' }), 'invalid-path')
await rejectsWithCode(resolveReadableFileForIpc(' ', { purpose: 'File preview' }), 'invalid-path')
await rejectsWithCode(resolveReadableFileForIpc(null, { purpose: 'File preview' }), 'invalid-path')
await rejectsWithCode(resolveReadableFileForIpc(`safe${String.fromCharCode(0)}name.txt`), 'invalid-path')
const devicePaths = [
'\\\\?\\C:\\secret.txt',
'\\\\.\\C:\\secret.txt',
'\\\\?\\UNC\\server\\share\\secret.txt',
'GLOBALROOT/Device/HarddiskVolumeShadowCopy1/secret.txt'
]
for (const devicePath of devicePaths) {
assert.throws(
() => resolveRequestedPathForIpc(devicePath, { purpose: 'File preview' }),
error => {
assert.equal(error?.code, 'device-path')
return true
}
)
await rejectsWithCode(resolveReadableFileForIpc(devicePath, { purpose: 'File preview' }), 'device-path')
}
assert.throws(
() => resolveRequestedPathForIpc('file:///%E0%A4%A', { purpose: 'File preview' }),
error => {
assert.equal(error?.code, 'invalid-path')
return true
}
)
await rejectsWithCode(resolveReadableFileForIpc('file:///%E0%A4%A', { purpose: 'File preview' }), 'invalid-path')
})
test('resolveRequestedPathForIpc resolves relative paths from the trimmed base directory', () => {
const baseDir = path.join(os.tmpdir(), 'hermes-desktop-base')
assert.equal(
resolveRequestedPathForIpc('notes.txt', {
baseDir: ` ${baseDir} `,
purpose: 'File preview'
}),
path.resolve(baseDir, 'notes.txt')
)
})
test('resolveRequestedPathForIpc expands ~ to the home directory', () => {
assert.equal(resolveRequestedPathForIpc('~', { purpose: 'Directory read' }), path.resolve(os.homedir()))
assert.equal(
resolveRequestedPathForIpc('~/www/project', { purpose: 'Directory read' }),
path.resolve(os.homedir(), 'www/project')
)
// `~user` shorthand is NOT expanded — only the caller's own home.
assert.equal(
resolveRequestedPathForIpc('~other/secret', { baseDir: os.tmpdir(), purpose: 'Directory read' }),
path.resolve(os.tmpdir(), '~other/secret')
)
})
test('resolveReadableFileForIpc validates existence type size and sensitivity', async t => {
const tempDir = fs.mkdtempSync(path.join(os.tmpdir(), 'hermes-desktop-hardening-'))
t.after(() => fs.rmSync(tempDir, { recursive: true, force: true }))
@@ -71,6 +139,13 @@ test('resolveReadableFileForIpc validates existence type size and sensitivity',
})
assert.equal(fromFileUrl.resolvedPath, textPath)
const spacedPath = path.join(tempDir, 'notes with spaces.txt')
fs.writeFileSync(spacedPath, 'space ok', 'utf8')
const fromSpacedFileUrl = await resolveReadableFileForIpc(pathToFileURL(spacedPath).toString(), {
purpose: 'File preview'
})
assert.equal(fromSpacedFileUrl.resolvedPath, spacedPath)
await assert.rejects(
resolveReadableFileForIpc('missing.txt', {
baseDir: tempDir,
@@ -114,3 +189,91 @@ test('resolveReadableFileForIpc validates existence type size and sensitivity',
})
assert.equal(envTemplate.resolvedPath, envTemplatePath)
})
test('resolveReadableFileForIpc blocks common sensitive files', async t => {
const tempDir = fs.mkdtempSync(path.join(os.tmpdir(), 'hermes-desktop-sensitive-'))
t.after(() => fs.rmSync(tempDir, { recursive: true, force: true }))
const sshDir = path.join(tempDir, '.ssh')
fs.mkdirSync(sshDir)
const blockedFiles = [
path.join(tempDir, '.env'),
path.join(tempDir, '.npmrc'),
path.join(sshDir, 'id_ed25519'),
path.join(tempDir, 'cert.pem'),
path.join(tempDir, 'cert.p12'),
path.join(tempDir, 'cert.pfx')
]
for (const filePath of blockedFiles) {
fs.writeFileSync(filePath, 'secret', 'utf8')
await rejectsWithCode(resolveReadableFileForIpc(filePath, { purpose: 'File preview' }), 'sensitive-file')
}
const allowed = path.join(tempDir, '.env.example')
fs.writeFileSync(allowed, 'EXAMPLE_TOKEN=value', 'utf8')
assert.equal((await resolveReadableFileForIpc(allowed, { purpose: 'File preview' })).resolvedPath, allowed)
})
test('resolveReadableFileForIpc blocks symlinks whose realpath is sensitive', async t => {
const tempDir = fs.mkdtempSync(path.join(os.tmpdir(), 'hermes-desktop-realpath-'))
t.after(() => fs.rmSync(tempDir, { recursive: true, force: true }))
const envPath = path.join(tempDir, '.env')
const linkPath = path.join(tempDir, 'safe-name.txt')
fs.writeFileSync(envPath, 'SECRET_TOKEN=123', 'utf8')
try {
fs.symlinkSync(envPath, linkPath, 'file')
} catch (error) {
if (error?.code === 'EPERM' || error?.code === 'EACCES') {
t.skip(`symlink creation is not permitted on this platform (${error.code})`)
return
}
throw error
}
await rejectsWithCode(resolveReadableFileForIpc(linkPath, { purpose: 'File preview' }), 'sensitive-file')
})
test('resolveDirectoryForIpc accepts directories and rejects invalid directory targets', async t => {
const tempDir = fs.mkdtempSync(path.join(os.tmpdir(), 'hermes-desktop-dir-'))
t.after(() => fs.rmSync(tempDir, { recursive: true, force: true }))
const directory = path.join(tempDir, 'project')
const filePath = path.join(tempDir, 'file.txt')
fs.mkdirSync(directory)
fs.writeFileSync(filePath, 'not a directory', 'utf8')
const resolved = await resolveDirectoryForIpc(directory)
assert.equal(resolved.resolvedPath, directory)
assert.equal(resolved.stat.isDirectory(), true)
await rejectsWithCode(resolveDirectoryForIpc(filePath), 'ENOTDIR')
await rejectsWithCode(resolveDirectoryForIpc(path.join(tempDir, 'missing')), 'ENOENT')
await rejectsWithCode(resolveDirectoryForIpc('\\\\?\\C:\\secret'), 'device-path')
})
test('resolveDirectoryForIpc accepts directory symlinks or junctions', async t => {
const tempDir = fs.mkdtempSync(path.join(os.tmpdir(), 'hermes-desktop-dir-link-'))
t.after(() => fs.rmSync(tempDir, { recursive: true, force: true }))
const directory = path.join(tempDir, 'actual-project')
const linkPath = path.join(tempDir, 'linked-project')
fs.mkdirSync(directory)
try {
fs.symlinkSync(directory, linkPath, process.platform === 'win32' ? 'junction' : 'dir')
} catch (error) {
if (error?.code === 'EPERM' || error?.code === 'EACCES') {
t.skip(`directory symlink creation is not permitted on this platform (${error.code})`)
return
}
throw error
}
const resolved = await resolveDirectoryForIpc(linkPath)
assert.equal(resolved.resolvedPath, linkPath)
assert.equal(resolved.stat.isDirectory(), true)
})

File diff suppressed because it is too large Load Diff

View File

@@ -5,7 +5,7 @@ contextBridge.exposeInMainWorld('hermesDesktop', {
revalidateConnection: () => ipcRenderer.invoke('hermes:connection:revalidate'),
touchBackend: profile => ipcRenderer.invoke('hermes:backend:touch', profile),
getGatewayWsUrl: profile => ipcRenderer.invoke('hermes:gateway:ws-url', profile),
openSessionWindow: sessionId => ipcRenderer.invoke('hermes:window:openSession', sessionId),
openSessionWindow: (sessionId, opts) => ipcRenderer.invoke('hermes:window:openSession', sessionId, opts),
getBootProgress: () => ipcRenderer.invoke('hermes:boot-progress:get'),
getConnectionConfig: profile => ipcRenderer.invoke('hermes:connection-config:get', profile),
saveConnectionConfig: payload => ipcRenderer.invoke('hermes:connection-config:save', payload),
@@ -39,6 +39,8 @@ contextBridge.exposeInMainWorld('hermesDesktop', {
watchPreviewFile: url => ipcRenderer.invoke('hermes:watchPreviewFile', url),
stopPreviewFileWatch: id => ipcRenderer.invoke('hermes:stopPreviewFileWatch', id),
setTitleBarTheme: payload => ipcRenderer.send('hermes:titlebar-theme', payload),
setNativeTheme: mode => ipcRenderer.send('hermes:native-theme', mode),
setTranslucency: payload => ipcRenderer.send('hermes:translucency', payload),
setPreviewShortcutActive: active => ipcRenderer.send('hermes:previewShortcutActive', Boolean(active)),
openExternal: url => ipcRenderer.invoke('hermes:openExternal', url),
fetchLinkTitle: url => ipcRenderer.invoke('hermes:fetchLinkTitle', url),
@@ -52,6 +54,7 @@ contextBridge.exposeInMainWorld('hermesDesktop', {
getRecentLogs: () => ipcRenderer.invoke('hermes:logs:recent'),
readDir: dirPath => ipcRenderer.invoke('hermes:fs:readDir', dirPath),
gitRoot: startPath => ipcRenderer.invoke('hermes:fs:gitRoot', startPath),
worktrees: cwds => ipcRenderer.invoke('hermes:fs:worktrees', cwds),
terminal: {
dispose: id => ipcRenderer.invoke('hermes:terminal:dispose', id),
resize: (id, size) => ipcRenderer.invoke('hermes:terminal:resize', id, size),
@@ -80,11 +83,27 @@ contextBridge.exposeInMainWorld('hermesDesktop', {
ipcRenderer.on('hermes:open-updates', listener)
return () => ipcRenderer.removeListener('hermes:open-updates', listener)
},
onDeepLink: callback => {
const listener = (_event, payload) => callback(payload)
ipcRenderer.on('hermes:deep-link', listener)
return () => ipcRenderer.removeListener('hermes:deep-link', listener)
},
signalDeepLinkReady: () => ipcRenderer.invoke('hermes:deep-link-ready'),
onWindowStateChanged: callback => {
const listener = (_event, payload) => callback(payload)
ipcRenderer.on('hermes:window-state-changed', listener)
return () => ipcRenderer.removeListener('hermes:window-state-changed', listener)
},
onFocusSession: callback => {
const listener = (_event, sessionId) => callback(sessionId)
ipcRenderer.on('hermes:focus-session', listener)
return () => ipcRenderer.removeListener('hermes:focus-session', listener)
},
onNotificationAction: callback => {
const listener = (_event, payload) => callback(payload)
ipcRenderer.on('hermes:notification-action', listener)
return () => ipcRenderer.removeListener('hermes:notification-action', listener)
},
onPreviewFileChanged: callback => {
const listener = (_event, payload) => callback(payload)
ipcRenderer.on('hermes:preview-file-changed', listener)

View File

@@ -5,22 +5,30 @@
const { pathToFileURL } = require('node:url')
// Secondary windows open at the minimum usable size — a compact side panel for
// subagent watch / cmd-click session pop-out, not a second full desktop.
const SESSION_WINDOW_MIN_WIDTH = 420
const SESSION_WINDOW_MIN_HEIGHT = 620
// Build the renderer URL for a secondary window. The renderer uses a
// HashRouter, so the session route lives after the '#'. The `?win=secondary`
// flag MUST sit in the query string BEFORE the '#': anything after the '#' is
// treated as the route by HashRouter and would break routeSessionId(). The
// renderer reads the flag from window.location.search to suppress the install /
// onboarding overlays and the global session sidebar.
function buildSessionWindowUrl(sessionId, { devServer, rendererIndexPath } = {}) {
// onboarding overlays and the global session sidebar. `watch=1` marks a
// spectator window (e.g. a running subagent's session): the renderer resumes
// it lazily so the gateway never builds an agent just to stream into it.
function buildSessionWindowUrl(sessionId, { devServer, rendererIndexPath, watch } = {}) {
const query = `?win=secondary${watch ? '&watch=1' : ''}`
const route = `#/${encodeURIComponent(sessionId)}`
if (devServer) {
const base = devServer.endsWith('/') ? devServer.slice(0, -1) : devServer
return `${base}/?win=secondary${route}`
return `${base}/${query}${route}`
}
return `${pathToFileURL(rendererIndexPath).toString()}?win=secondary${route}`
return `${pathToFileURL(rendererIndexPath).toString()}${query}${route}`
}
// A small registry keyed by sessionId that guarantees one window per chat:
@@ -83,4 +91,9 @@ function createSessionWindowRegistry() {
}
}
module.exports = { buildSessionWindowUrl, createSessionWindowRegistry }
module.exports = {
buildSessionWindowUrl,
createSessionWindowRegistry,
SESSION_WINDOW_MIN_HEIGHT,
SESSION_WINDOW_MIN_WIDTH
}

View File

@@ -76,6 +76,12 @@ test('buildSessionWindowUrl builds a packaged file URL with the flag before the
assert.match(url, /^file:\/\/.*index\.html\?win=secondary#\/abc$/)
})
test('buildSessionWindowUrl adds the watch flag for spectator windows, before the hash', () => {
const url = buildSessionWindowUrl('abc', { devServer: 'http://localhost:5173', watch: true })
assert.equal(url, 'http://localhost:5173/?win=secondary&watch=1#/abc')
})
test('registry opens one window per session and focuses on re-open', () => {
const registry = createSessionWindowRegistry()
let built = 0

View File

@@ -0,0 +1,56 @@
/**
* Pure helpers for choosing a remote URL during passive update checks.
*
* A public install can end up with `origin=git@github.com:NousResearch/hermes-agent.git`.
* If the user's GitHub SSH key is FIDO2/passkey-backed, a background `git fetch
* origin` triggers an unexplained hardware-touch prompt. For passive checks
* against the official repo we substitute the public HTTPS `ls-remote` path,
* which needs no auth and cannot prompt. Active update/apply flows are left
* unchanged.
*
* Extracted from main.cjs so the security-critical remote detection is unit
* testable without booting Electron (main.cjs requires('electron') at load).
*/
const OFFICIAL_REPO_HTTPS_URL = 'https://github.com/NousResearch/hermes-agent.git'
const OFFICIAL_REPO_CANONICAL = 'github.com/nousresearch/hermes-agent'
// Normalize common GitHub remote URL forms to `host/owner/repo` (lowercased,
// no trailing slash, no .git suffix) so SSH and HTTPS forms of the same repo
// compare equal.
function canonicalGitHubRemote(url) {
if (!url) return ''
let value = String(url).trim()
if (value.startsWith('git@github.com:')) {
value = `github.com/${value.slice('git@github.com:'.length)}`
} else if (value.startsWith('ssh://git@github.com/')) {
value = `github.com/${value.slice('ssh://git@github.com/'.length)}`
} else {
try {
const parsed = new URL(value)
if (parsed.hostname && parsed.pathname) value = `${parsed.hostname}${parsed.pathname}`
} catch {
// Leave non-URL forms unchanged.
}
}
value = value.trim().replace(/\/+$/, '')
if (value.endsWith('.git')) value = value.slice(0, -4)
return value.toLowerCase()
}
function isSshRemote(url) {
const value = String(url || '').trim().toLowerCase()
return value.startsWith('git@') || value.startsWith('ssh://')
}
function isOfficialSshRemote(url) {
return isSshRemote(url) && canonicalGitHubRemote(url) === OFFICIAL_REPO_CANONICAL
}
module.exports = {
OFFICIAL_REPO_HTTPS_URL,
OFFICIAL_REPO_CANONICAL,
canonicalGitHubRemote,
isSshRemote,
isOfficialSshRemote
}

View File

@@ -0,0 +1,78 @@
/**
* Tests for electron/update-remote.cjs — the remote-detection helpers that
* keep passive update checks off the SSH origin for official installs.
*
* Run with: node --test electron/update-remote.test.cjs
* (Wired into npm test:desktop:platforms in package.json.)
*
* Why this matters: a public install can carry
* origin=git@github.com:NousResearch/hermes-agent.git. A background
* `git fetch origin` then authenticates over SSH and, with a FIDO2/passkey
* key, triggers an unexplained hardware-touch prompt. isOfficialSshRemote
* must reliably recognize the official SSH remote (in every URL form,
* case-insensitively) so the caller can swap in the anonymous HTTPS path —
* while NOT misclassifying forks, other hosts, or the HTTPS remote (which
* never prompts and should keep the normal fetch path).
*/
const test = require('node:test')
const assert = require('node:assert/strict')
const {
OFFICIAL_REPO_HTTPS_URL,
OFFICIAL_REPO_CANONICAL,
canonicalGitHubRemote,
isSshRemote,
isOfficialSshRemote
} = require('./update-remote.cjs')
test('canonicalGitHubRemote normalizes SSH and HTTPS forms to the same value', () => {
assert.equal(canonicalGitHubRemote('git@github.com:NousResearch/hermes-agent.git'), OFFICIAL_REPO_CANONICAL)
assert.equal(canonicalGitHubRemote('git@github.com:NousResearch/hermes-agent'), OFFICIAL_REPO_CANONICAL)
assert.equal(canonicalGitHubRemote('ssh://git@github.com/NousResearch/hermes-agent.git'), OFFICIAL_REPO_CANONICAL)
assert.equal(canonicalGitHubRemote('https://github.com/NousResearch/hermes-agent.git'), OFFICIAL_REPO_CANONICAL)
// Case-insensitive: an uppercased owner still canonicalizes to the same repo.
assert.equal(canonicalGitHubRemote('git@github.com:nousresearch/hermes-agent.git'), OFFICIAL_REPO_CANONICAL)
// Trailing slashes are stripped.
assert.equal(canonicalGitHubRemote('https://github.com/NousResearch/hermes-agent/'), OFFICIAL_REPO_CANONICAL)
})
test('canonicalGitHubRemote is empty for falsy input', () => {
assert.equal(canonicalGitHubRemote(''), '')
assert.equal(canonicalGitHubRemote(null), '')
assert.equal(canonicalGitHubRemote(undefined), '')
})
test('isSshRemote detects scp-like and ssh:// forms only', () => {
assert.equal(isSshRemote('git@github.com:NousResearch/hermes-agent.git'), true)
assert.equal(isSshRemote('ssh://git@github.com/NousResearch/hermes-agent.git'), true)
assert.equal(isSshRemote('https://github.com/NousResearch/hermes-agent.git'), false)
assert.equal(isSshRemote(''), false)
assert.equal(isSshRemote(null), false)
})
test('isOfficialSshRemote is true only for the official repo over SSH', () => {
assert.equal(isOfficialSshRemote('git@github.com:NousResearch/hermes-agent.git'), true)
assert.equal(isOfficialSshRemote('git@github.com:NousResearch/hermes-agent'), true)
assert.equal(isOfficialSshRemote('ssh://git@github.com/NousResearch/hermes-agent.git'), true)
// Case-insensitive owner/repo match.
assert.equal(isOfficialSshRemote('git@github.com:nousresearch/hermes-agent.git'), true)
})
test('isOfficialSshRemote does NOT match forks, other hosts, or HTTPS', () => {
// A fork over SSH belongs to the user — fetching it is their own remote,
// not the official upstream, so the SSH-avoidance swap must not apply.
assert.equal(isOfficialSshRemote('git@github.com:someuser/hermes-agent.git'), false)
// Same repo name on a different host is not the official repo.
assert.equal(isOfficialSshRemote('git@gitlab.com:NousResearch/hermes-agent.git'), false)
// HTTPS to the official repo never prompts for SSH/FIDO2, so it keeps the
// normal fetch path — must not be flagged as an official SSH remote.
assert.equal(isOfficialSshRemote('https://github.com/NousResearch/hermes-agent.git'), false)
assert.equal(isOfficialSshRemote(''), false)
assert.equal(isOfficialSshRemote(null), false)
})
test('OFFICIAL_REPO_HTTPS_URL canonicalizes to OFFICIAL_REPO_CANONICAL', () => {
// Invariant: the URL we substitute in must be the same repo we detect.
assert.equal(canonicalGitHubRemote(OFFICIAL_REPO_HTTPS_URL), OFFICIAL_REPO_CANONICAL)
})

View File

@@ -0,0 +1,57 @@
'use strict'
const test = require('node:test')
const assert = require('node:assert/strict')
const fs = require('node:fs')
const path = require('node:path')
const ELECTRON_DIR = __dirname
function readElectronFile(name) {
return fs.readFileSync(path.join(ELECTRON_DIR, name), 'utf8').replace(/\r\n/g, '\n')
}
function requireHiddenChildOptions(source, needle) {
const index = source.indexOf(needle)
assert.notEqual(index, -1, `missing call site: ${needle}`)
const snippet = source.slice(index, index + 700)
assert.match(
snippet,
/hiddenWindowsChildOptions\(/,
`expected ${needle} to wrap child-process options with hiddenWindowsChildOptions`
)
}
test('desktop background child processes opt into hidden Windows consoles', () => {
const source = readElectronFile('main.cjs')
assert.match(source, /function hiddenWindowsChildOptions\(options = \{\}\)/)
requireHiddenChildOptions(source, "execFileSync(\n 'reg'")
requireHiddenChildOptions(source, 'execFileSync(pyExe')
requireHiddenChildOptions(source, 'spawn(resolveGitBinary()')
requireHiddenChildOptions(source, "execFileSync('taskkill'")
requireHiddenChildOptions(source, 'spawn(command, args')
requireHiddenChildOptions(source, "spawn('curl'")
requireHiddenChildOptions(source, 'spawn(backend.command, backend.args')
requireHiddenChildOptions(source, 'hermesProcess = spawn(backend.command, backend.args')
requireHiddenChildOptions(source, "spawn(py, ['-m', 'hermes_cli.main', 'uninstall', '--gui-summary']")
})
test('intentional or interactive desktop child processes stay documented', () => {
const source = readElectronFile('main.cjs')
assert.match(source, /windowsHide: false/)
assert.match(source, /handOffWindowsBootstrapRecovery/)
assert.match(source, /'--repair', '--branch'/)
assert.match(source, /'--update', '--branch'/)
assert.match(source, /nodePty\.spawn\(command, args/)
assert.match(source, /spawn\('cmd\.exe', \['\/c', 'start'/)
})
test('bootstrap PowerShell runner hides Windows console children', () => {
const source = readElectronFile('bootstrap-runner.cjs')
assert.match(source, /function hiddenWindowsChildOptions\(options = \{\}\)/)
requireHiddenChildOptions(source, 'spawn(ps, fullArgs')
})

View File

@@ -3,7 +3,6 @@ import typescriptEslint from '@typescript-eslint/eslint-plugin'
import typescriptParser from '@typescript-eslint/parser'
import perfectionist from 'eslint-plugin-perfectionist'
import reactPlugin from 'eslint-plugin-react'
import reactCompiler from 'eslint-plugin-react-compiler'
import hooksPlugin from 'eslint-plugin-react-hooks'
import unusedImports from 'eslint-plugin-unused-imports'
import globals from 'globals'
@@ -47,7 +46,6 @@ export default [
'custom-rules': customRules,
perfectionist,
react: reactPlugin,
'react-compiler': reactCompiler,
'react-hooks': hooksPlugin,
'unused-imports': unusedImports
},
@@ -98,7 +96,6 @@ export default [
'perfectionist/sort-jsx-props': ['error', { order: 'asc', type: 'natural' }],
'perfectionist/sort-named-exports': ['error', { order: 'asc', type: 'natural' }],
'perfectionist/sort-named-imports': ['error', { order: 'asc', type: 'natural' }],
'react-compiler/react-compiler': 'warn',
'react-hooks/exhaustive-deps': 'warn',
'react-hooks/rules-of-hooks': 'error',
'unused-imports/no-unused-imports': 'error'

View File

@@ -9,6 +9,28 @@
<link rel="apple-touch-icon" sizes="180x180" href="/apple-touch-icon.png" />
<link rel="shortcut icon" href="/apple-touch-icon.png" />
<title>Hermes</title>
<script>
// Pre-paint the themed background before the app bundle loads. Without
// this, the first frame (which is what `ready-to-show` waits for) is the
// UA-default white page, and the real theme only lands once the whole
// module graph has executed — i.e. the "white flash" on every new
// window. applyTheme() in src/themes/context.tsx keeps these keys fresh.
try {
let bg = localStorage.getItem('hermes-boot-background')
let scheme = localStorage.getItem('hermes-boot-color-scheme')
if (!bg) {
const dark = window.matchMedia('(prefers-color-scheme: dark)').matches
bg = dark ? '#111111' : '#f7f7f7'
scheme = dark ? 'dark' : 'light'
}
document.documentElement.style.backgroundColor = bg
if (scheme === 'dark' || scheme === 'light') {
document.documentElement.style.colorScheme = scheme
}
} catch {
// localStorage unavailable — keep UA defaults.
}
</script>
</head>
<body>
<div id="root" class="scrollbar-dt"></div>

View File

@@ -18,7 +18,8 @@
"profile:main": "wait-on http://127.0.0.1:5174 && cross-env XCURSOR_SIZE=24 HERMES_DESKTOP_DEV_SERVER=http://127.0.0.1:5174 electron --inspect=9229 .",
"profile:main:cpu": "wait-on http://127.0.0.1:5174 && cross-env XCURSOR_SIZE=24 NODE_OPTIONS=--cpu-prof HERMES_DESKTOP_DEV_SERVER=http://127.0.0.1:5174 electron .",
"start": "npm run build && electron .",
"build": "node scripts/assert-root-install.cjs && node scripts/write-build-stamp.cjs && node scripts/stage-native-deps.cjs && tsc -b && vite build && node scripts/assert-dist-built.cjs",
"build": "node scripts/assert-root-install.cjs && node scripts/write-build-stamp.cjs && node scripts/stage-native-deps.cjs && tsc -b && vite build && npm run postbuild",
"postbuild": "node scripts/assert-dist-built.cjs",
"builder": "cross-env NODE_OPTIONS=--max-old-space-size=16384 electron-builder",
"pack": "npm run build && npm run builder -- --dir",
"dist": "npm run build && npm run builder",
@@ -35,8 +36,8 @@
"test:desktop:nsis": "node scripts/test-desktop.mjs nsis",
"test:desktop:existing": "node scripts/test-desktop.mjs existing",
"test:desktop:fresh": "node scripts/test-desktop.mjs fresh",
"test:desktop:platforms": "node --test electron/bootstrap-platform.test.cjs electron/hardening.test.cjs electron/backend-probes.test.cjs electron/bootstrap-runner.test.cjs electron/connection-config.test.cjs electron/gateway-ws-probe.test.cjs electron/oauth-net-request.test.cjs electron/desktop-uninstall.test.cjs electron/session-windows.test.cjs electron/workspace-cwd.test.cjs",
"type-check": "tsc -b",
"test:desktop:platforms": "node --test electron/bootstrap-platform.test.cjs electron/hardening.test.cjs electron/backend-env.test.cjs electron/backend-probes.test.cjs electron/bootstrap-runner.test.cjs electron/connection-config.test.cjs electron/dashboard-token.test.cjs electron/gateway-ws-probe.test.cjs electron/oauth-net-request.test.cjs electron/desktop-uninstall.test.cjs electron/session-windows.test.cjs electron/workspace-cwd.test.cjs electron/fs-read-dir.test.cjs electron/git-root.test.cjs electron/windows-child-process.test.cjs electron/update-remote.test.cjs",
"typecheck": "tsc -p . --noEmit",
"lint": "eslint src/ electron/",
"lint:fix": "eslint src/ electron/ --fix",
"fmt": "prettier --write 'src/**/*.{ts,tsx}' 'electron/**/*.{js,cjs}' 'vite.config.ts'",
@@ -72,6 +73,7 @@
"class-variance-authority": "^0.7.1",
"clsx": "^2.1.1",
"cmdk": "^1.1.1",
"dnd-core": "^14.0.1",
"hast-util-from-html-isomorphic": "^2.0.0",
"hast-util-to-text": "^4.0.2",
"ignore": "^7.0.5",
@@ -83,10 +85,12 @@
"radix-ui": "^1.4.3",
"react": "^19.2.5",
"react-arborist": "^3.5.0",
"react-dnd-html5-backend": "^14.0.3",
"react-dom": "^19.2.5",
"react-router-dom": "^7.17.0",
"react-shiki": "^0.9.3",
"remark-math": "^6.0.0",
"remend": "^1.3.0",
"shiki": "^4.0.2",
"streamdown": "^2.5.0",
"tailwind-merge": "^3.5.0",
@@ -95,6 +99,7 @@
"unicode-animations": "^1.0.3",
"unified": "^11.0.5",
"unist-util-visit-parents": "^6.0.2",
"use-stick-to-bottom": "^1.1.6",
"vfile": "^6.0.3",
"web-haptics": "^0.0.6"
},
@@ -103,20 +108,19 @@
"@testing-library/dom": "^10.4.0",
"@testing-library/react": "^16.3.2",
"@types/hast": "^3.0.4",
"@types/node": "^24.12.2",
"@types/node": "^24.13.2",
"@types/react": "^19.2.14",
"@types/react-dom": "^19.2.3",
"@typescript-eslint/eslint-plugin": "^8.59.1",
"@typescript-eslint/parser": "^8.59.1",
"@vitejs/plugin-react": "^6.0.1",
"concurrently": "^9.2.1",
"concurrently": "^10.0.3",
"cross-env": "^10.1.0",
"electron": "^40.9.3",
"electron-builder": "^26.8.1",
"eslint": "^9.39.4",
"eslint-plugin-perfectionist": "^5.9.0",
"eslint-plugin-react": "^7.37.5",
"eslint-plugin-react-compiler": "^19.1.0-rc.2",
"eslint-plugin-react-hooks": "^7.1.1",
"eslint-plugin-unused-imports": "^4.4.1",
"globals": "^16.5.0",
@@ -133,6 +137,14 @@
"appId": "com.nousresearch.hermes",
"productName": "Hermes",
"executableName": "Hermes",
"protocols": [
{
"name": "Hermes Protocol",
"schemes": [
"hermes"
]
}
],
"artifactName": "Hermes-${version}-${os}-${arch}.${ext}",
"icon": "assets/icon",
"directories": {

View File

@@ -3,8 +3,8 @@ import { type ReactNode, useEffect, useMemo, useState } from 'react'
import { useElapsedSeconds } from '@/components/chat/activity-timer'
import { ActivityTimerText } from '@/components/chat/activity-timer-text'
import { BrailleSpinner } from '@/components/ui/braille-spinner'
import { FadeText } from '@/components/ui/fade-text'
import { GlyphSpinner } from '@/components/ui/glyph-spinner'
import { type Translations, useI18n } from '@/i18n'
import { AlertCircle, CheckCircle2, Sparkles } from '@/lib/icons'
import { useEnterAnimation } from '@/lib/use-enter-animation'
@@ -25,7 +25,7 @@ import { OverlayView } from '../overlays/overlay-view'
function statusGlyph(status: SubagentStatus, a: Translations['agents']): ReactNode {
if (status === 'running' || status === 'queued') {
return (
<BrailleSpinner
<GlyphSpinner
ariaLabel={a.running}
className="size-3.5 shrink-0 text-[0.95rem] text-muted-foreground/80"
spinner="breathe"
@@ -290,7 +290,7 @@ function StreamLine({
<span className={cn('min-w-0 flex-1 wrap-anywhere', tone, isMono && 'font-mono text-[0.69rem]')}>
{entry.text}
{active ? (
<BrailleSpinner
<GlyphSpinner
ariaLabel={t.agents.streaming}
className="ml-1 inline-block size-2.5 align-middle text-muted-foreground/70"
spinner="breathe"
@@ -372,7 +372,9 @@ function SubagentRow({ node, depth = 0, nowMs }: { node: SubagentNode; depth?: n
{open && fileLines.length > 0 ? (
<div className="grid min-w-0 gap-0.5 pl-6">
<p className="text-[0.58rem] font-medium tracking-wider text-muted-foreground/60 uppercase">{t.agents.files}</p>
<p className="text-[0.58rem] font-medium tracking-wider text-muted-foreground/60 uppercase">
{t.agents.files}
</p>
{fileLines.slice(0, 8).map(line => (
<p className="wrap-break-word font-mono text-[0.67rem] leading-relaxed text-muted-foreground/80" key={line}>
{line}

View File

@@ -18,7 +18,7 @@ import {
} from '@/components/ui/pagination'
import { TextTab, TextTabMeta } from '@/components/ui/text-tab'
import { Tip } from '@/components/ui/tooltip'
import { getSessionMessages, listSessions } from '@/hermes'
import { getSessionMessages, listAllProfileSessions } from '@/hermes'
import { type Translations, useI18n } from '@/i18n'
import { sessionTitle } from '@/lib/chat-runtime'
import { ExternalLink, ExternalLinkIcon, hostPathLabel, urlSlugTitleLabel, useLinkTitle } from '@/lib/external-link'
@@ -388,8 +388,8 @@ export function ArtifactsView({ setStatusbarItemGroup: _setStatusbarItemGroup, .
setRefreshing(true)
try {
const sessions = (await listSessions(30, 1)).sessions
const results = await Promise.allSettled(sessions.map(session => getSessionMessages(session.id)))
const sessions = (await listAllProfileSessions(30, 1)).sessions
const results = await Promise.allSettled(sessions.map(session => getSessionMessages(session.id, session.profile)))
const nextArtifacts: ArtifactRecord[] = []
results.forEach((result, index) => {

View File

@@ -2,32 +2,21 @@ import type { Unstable_TriggerAdapter } from '@assistant-ui/core'
import { ComposerPrimitive } from '@assistant-ui/react'
import type { ReactNode } from 'react'
export const COMPLETION_DRAWER_CLASS = [
'absolute bottom-[calc(100%+0.25rem)] left-0 z-50',
'w-60 max-w-[calc(100vw-2rem)]',
'max-h-[min(23rem,calc(100vh-8rem))] overflow-y-auto overscroll-contain',
'rounded-lg border border-(--ui-stroke-secondary)',
'bg-[color-mix(in_srgb,var(--ui-bg-elevated)_96%,transparent)]',
'p-1 text-xs text-popover-foreground shadow-md',
'backdrop-blur-md'
].join(' ')
import { composerFusedDockCard } from '@/components/chat/composer-dock'
import { cn } from '@/lib/utils'
export const COMPLETION_DRAWER_BELOW_CLASS = [
'absolute left-0 top-[calc(100%+0.25rem)] z-50',
'w-60 max-w-[calc(100vw-2rem)]',
'max-h-[min(23rem,calc(100vh-8rem))] overflow-y-auto overscroll-contain',
'rounded-lg border border-(--ui-stroke-secondary)',
'bg-[color-mix(in_srgb,var(--ui-bg-elevated)_96%,transparent)]',
'p-1 text-xs text-popover-foreground shadow-md',
'backdrop-blur-md'
].join(' ')
// Same docked chrome as the queue/status stack, but its own thing: a narrow,
// left-aligned card (not full width) that fuses to the composer's edge instead
// of floating above it. `left-1` matches the stack's `mx-1` inset; the negative
// margin overlaps the seam so the composer's (now-transparent) edge border reads
// as shared. Fused (opaque) fill — the composer surface swaps to the same fill
// while a drawer is open, so the two paint as one panel.
const DRAWER_SHELL =
'absolute left-1 z-50 w-80 max-w-[calc(100%-0.5rem)] max-h-[min(22rem,calc(100vh-8rem))] overflow-y-auto overscroll-contain p-1 text-xs text-popover-foreground'
export const COMPLETION_DRAWER_ROW_CLASS = [
'relative flex cursor-default select-none items-center gap-2 rounded-md px-2 py-1',
'w-full min-w-0 text-left text-xs outline-hidden transition-colors',
'hover:bg-(--ui-bg-tertiary)',
'data-[highlighted]:bg-(--ui-bg-tertiary) data-[highlighted]:text-foreground'
].join(' ')
export const COMPLETION_DRAWER_CLASS = cn(DRAWER_SHELL, 'bottom-full -mb-[9px]', composerFusedDockCard('top'))
export const COMPLETION_DRAWER_BELOW_CLASS = cn(DRAWER_SHELL, 'top-full -mt-[9px]', composerFusedDockCard('bottom'))
export function ComposerCompletionDrawer({
adapter,

View File

@@ -11,6 +11,7 @@ import {
DropdownMenuSeparator,
DropdownMenuTrigger
} from '@/components/ui/dropdown-menu'
import { Kbd } from '@/components/ui/kbd'
import { useI18n } from '@/i18n'
import { Clipboard, FileText, FolderOpen, type IconComponent, ImageIcon, Link, MessageSquareText } from '@/lib/icons'
import { cn } from '@/lib/utils'
@@ -86,7 +87,7 @@ export function ContextMenu({
<div className="px-2 py-1 text-[0.7rem] text-muted-foreground/80">
{c.tipPre}
<kbd className="rounded bg-muted/70 px-1 py-px font-mono text-[0.65rem]">@</kbd>
<Kbd size="sm">@</Kbd>
{c.tipPost}
</div>
</DropdownMenuContent>

View File

@@ -1,5 +1,6 @@
import { Button } from '@/components/ui/button'
import { Codicon } from '@/components/ui/codicon'
import { KbdCombo } from '@/components/ui/kbd'
import { Tip } from '@/components/ui/tooltip'
import { useI18n } from '@/i18n'
import { triggerHaptic } from '@/lib/haptics'
@@ -63,7 +64,14 @@ export function ComposerControls({
}) {
const { t } = useI18n()
const c = t.composer
const steerLabel = `${c.steer} (${formatCombo('mod+enter')})`
const steerCombo = formatCombo('mod+enter')
const steerLabel = `${c.steer} (${steerCombo})`
const steerTip = (
<span className="inline-flex items-center gap-1.5">
{c.steer}
<KbdCombo combo="mod+enter" size="sm" variant="inverted" />
</span>
)
if (conversation.active) {
return <ConversationPill {...conversation} disabled={disabled} />
@@ -75,7 +83,7 @@ export function ComposerControls({
<div className="ml-auto flex shrink-0 items-center gap-(--composer-control-gap)">
<DictationButton disabled={disabled} onToggle={onDictate} state={state.voice} status={voiceStatus} />
{canSteer && (
<Tip label={steerLabel}>
<Tip label={steerTip}>
<Button
aria-label={steerLabel}
className={GHOST_ICON_BTN}

View File

@@ -24,6 +24,7 @@ afterEach(cleanup)
// state stays stale while the DOM already holds the text.
function Harness({
busy = false,
disabled = false,
queued = [],
onSubmit,
onQueue,
@@ -31,6 +32,7 @@ function Harness({
onDrain
}: {
busy?: boolean
disabled?: boolean
queued?: readonly string[]
onSubmit: (text: string) => void
onQueue: (text: string) => void
@@ -52,6 +54,10 @@ function Harness({
}
const submitDraft = () => {
if (disabled) {
return
}
const editor = editorRef.current
if (editor) {
const domText = composerPlainText(editor)
@@ -84,6 +90,10 @@ function Harness({
const editorText = editorRef.current ? composerPlainText(editorRef.current) : draftRef.current
const hasLivePayload = editorText.trim().length > 0 || attachments.length > 0
if (disabled) {
return
}
if (!busy && !hasLivePayload && queued.length > 0) {
onDrain()
@@ -186,4 +196,23 @@ describe('composer Enter submit — live DOM vs stale composer state (#39630)',
expect(onDrain).toHaveBeenCalledTimes(1)
expect(onSubmit).not.toHaveBeenCalled()
})
it('keeps reconnect drafts editable but blocks Enter submit until the gateway returns', async () => {
const onSubmit = vi.fn()
const onDrain = vi.fn()
const { getByTestId } = render(
<Harness disabled onCancel={vi.fn()} onDrain={onDrain} onQueue={vi.fn()} onSubmit={onSubmit} queued={['queued-1']} />
)
const editor = getByTestId('editor')
await act(async () => {
editor.textContent = 'draft while reconnecting'
fireEvent.input(editor)
fireEvent.keyDown(editor, { key: 'Enter' })
})
expect(editor.textContent).toBe('draft while reconnecting')
expect(onDrain).not.toHaveBeenCalled()
expect(onSubmit).not.toHaveBeenCalled()
})
})

View File

@@ -10,6 +10,7 @@
* steal focus from the composer effect.
*/
import { RICH_INPUT_SLOT } from './rich-editor'
import type { InlineRefInput } from './inline-refs'
export type ComposerTarget = 'edit' | 'main'
@@ -123,3 +124,12 @@ export const focusComposerInput = (el: HTMLElement | null) => {
window.requestAnimationFrame(focus)
window.setTimeout(focus, 0)
}
/** Drop focus from the main composer input (status-stack chrome, sidebar, etc.). */
export const blurComposerInput = () => {
const el = document.querySelector(`[data-slot="${RICH_INPUT_SLOT}"]`) as HTMLElement | null
if (el && document.activeElement === el) {
el.blur()
}
}

View File

@@ -1,11 +1,23 @@
import type { ReactNode } from 'react'
import { KbdCombo } from '@/components/ui/kbd'
import { useI18n } from '@/i18n'
import { COMPLETION_DRAWER_CLASS } from './completion-drawer'
const COMMON_COMMAND_KEYS = ['/help', '/clear', '/resume', '/details', '/copy', '/quit']
const HOTKEY_KEYS = ['@', '/', '?', 'Enter', 'Cmd/Ctrl+Shift+K', 'Cmd/Ctrl+/', 'Esc', '↑ / ↓']
/** Stable ids → i18n `hotkeyDescs` keys. Combos resolve mod labels per OS. */
const COMPOSER_HOTKEY_ROWS = [
{ id: 'composer.mention', combos: ['@'] },
{ id: 'composer.slash', combos: ['/'] },
{ id: 'composer.help', combos: ['?'] },
{ id: 'composer.sendNewline', combos: ['enter', 'shift+enter'] },
{ id: 'composer.sendQueued', combos: ['mod+shift+k'] },
{ id: 'keybinds.openPanel', combos: ['mod+/'] },
{ id: 'composer.cancel', combos: ['escape'] },
{ id: 'composer.history', combos: ['up', 'down'] }
] as const
export function HelpHint() {
const { t } = useI18n()
@@ -20,8 +32,8 @@ export function HelpHint() {
</Section>
<Section title={c.hotkeys}>
{HOTKEY_KEYS.map(key => (
<Row description={c.hotkeyDescs[key] ?? ''} key={key} keyLabel={key} />
{COMPOSER_HOTKEY_ROWS.map(row => (
<HotkeyRow description={c.hotkeyDescs[row.id] ?? ''} combos={[...row.combos]} key={row.id} />
))}
</Section>
@@ -57,3 +69,16 @@ function Row({ description, keyLabel, mono = false }: { description: string; key
</div>
)
}
function HotkeyRow({ combos, description }: { combos: string[]; description: string }) {
return (
<div className="flex min-w-0 items-center gap-2 rounded-md px-2.5 py-1 text-xs">
<span className="flex shrink-0 items-center gap-1">
{combos.map(combo => (
<KbdCombo combo={combo} key={combo} size="sm" />
))}
</span>
<span className="min-w-0 truncate text-muted-foreground/80">{description}</span>
</div>
)
}

View File

@@ -5,6 +5,13 @@ export interface CompletionEntry {
text: string
display?: unknown
meta?: unknown
/** Optional section label (e.g. "Commands", "Skills"). The popover renders a
* header whenever this changes between consecutive items, so the fetcher must
* emit entries already grouped contiguously. */
group?: string
/** Optional completion-action id. When set, picking the item runs that action
* (e.g. opening an overlay) instead of inserting a chip + waiting for submit. */
action?: string
}
export interface CompletionPayload {

View File

@@ -2,12 +2,17 @@ import type { Unstable_TriggerAdapter, Unstable_TriggerItem } from '@assistant-u
import { useCallback } from 'react'
import type { HermesGateway } from '@/hermes'
import { sessionTitle } from '@/lib/chat-runtime'
import {
type CommandsCatalogLike,
desktopSkinSlashCompletions,
desktopSlashDescription,
type DesktopThemeCommandOption,
filterDesktopCommandsCatalog,
isDesktopSlashExtensionCommand,
isDesktopSlashSuggestion
} from '@/lib/desktop-slash-commands'
import { $sessions } from '@/store/session'
import type { CompletionEntry, CompletionPayload } from './use-live-completion-adapter'
import { useLiveCompletionAdapter } from './use-live-completion-adapter'
@@ -16,7 +21,10 @@ interface SlashItemMetadata extends Record<string, string> {
command: string
display: string
meta: string
group: string
rawText: string
/** Completion-action id; empty for ordinary insert-a-chip completions. */
action: string
}
function textValue(value: unknown, fallback = ''): string {
@@ -38,12 +46,21 @@ function commandText(value: string): string {
return value.startsWith('/') ? value : `/${value}`
}
/** How many recent sessions to surface inline before the "Browse all…" entry. */
const SESSION_INLINE_LIMIT = 7
/** Live `/` completions backed by the gateway's `complete.slash` RPC. */
export function useSlashCompletions(options: { gateway: HermesGateway | null }): {
export function useSlashCompletions(options: {
gateway: HermesGateway | null
/** Desktop theme list — `/skin` is owned client-side, so its arg completions
* come from here, not the backend (whose skin list is CLI/TUI-only). */
skinThemes?: DesktopThemeCommandOption[]
activeSkin?: string
}): {
adapter: Unstable_TriggerAdapter
loading: boolean
} {
const { gateway } = options
const { gateway, skinThemes, activeSkin } = options
const enabled = Boolean(gateway)
const fetcher = useCallback(
@@ -54,34 +71,136 @@ export function useSlashCompletions(options: { gateway: HermesGateway | null }):
const text = `/${query}`
// The desktop owns /skin entirely (client-side theme context). Surface its
// theme list inside this single popover instead of a bespoke one, and skip
// the backend skin completions (which describe CLI/TUI skins that don't
// apply here). Matches once we're past `/skin ` into the arg stage.
const skinArg = /^\/skin\s+(.*)$/is.exec(text)
if (skinArg && skinThemes) {
const items = desktopSkinSlashCompletions(skinThemes, activeSkin ?? '', skinArg[1] ?? '').map(entry => ({
text: entry.text,
display: entry.display,
meta: entry.meta,
group: 'Themes'
}))
return { items, query }
}
// /resume (and its aliases) completes recent sessions inline — the same
// client-side list the picker overlay shows — instead of the backend
// (whose /resume opens an interactive TUI picker we can't render here).
const sessionArg = /^\/(?:resume|sessions|switch)\s+(.*)$/is.exec(text)
if (sessionArg) {
const needle = (sessionArg[1] ?? '').trim().toLowerCase()
const matches = (
needle
? $sessions.get().filter(
session =>
sessionTitle(session).toLowerCase().includes(needle) ||
(session.preview ?? '').toLowerCase().includes(needle) ||
session.id.toLowerCase().includes(needle)
)
: $sessions.get()
).slice(0, SESSION_INLINE_LIMIT)
const items: CompletionEntry[] = matches.map(session => ({
text: `/resume ${session.id}`,
display: sessionTitle(session),
meta: (session.preview ?? '').trim(),
group: 'Sessions'
}))
// Trailing "more" affordance (Cursor-style): picking it opens the full
// session picker overlay directly. `text` stays a bare `/resume` so that
// submitting it (Enter) still opens the overlay if the action is skipped.
items.push({
text: '/resume',
display: 'Browse all sessions…',
meta: '',
group: 'Sessions',
action: 'session-picker'
})
return { items, query }
}
try {
if (!query) {
const catalog = filterDesktopCommandsCatalog(await gateway.request<CommandsCatalogLike>('commands.catalog'))
const items = (catalog.pairs ?? []).map(([command, meta]) => ({
text: command,
display: command,
meta
}))
// Prefer the categorized layout so the popover renders section headers
// (Session, Tools & Skills, ...). Fall back to the flat list when the
// backend didn't categorize.
const sections = catalog.categories?.length
? catalog.categories
: [{ name: '', pairs: catalog.pairs ?? [] }]
const items = sections.flatMap(section =>
section.pairs.map(([command, meta]) => ({
text: command,
display: command,
group: section.name || undefined,
meta
}))
)
return { items, query }
}
const result = await gateway.request<{ items?: CompletionEntry[] }>('complete.slash', { text })
const result = await gateway.request<{ items?: CompletionEntry[]; replace_from?: number }>(
'complete.slash',
{ text }
)
const items = (result.items ?? [])
.filter(item => isDesktopSlashSuggestion(item.text))
// Arg-completion items (replace_from > 1) carry just the arg stub —
// e.g. complete.slash returns `{text: "alice"}` for `/personality alic`
// with replace_from = 14. Rewrite those entries so the popover inserts
// the full `/personality alice` token instead of stranding `/alice`.
const replaceFrom = typeof result.replace_from === 'number' ? result.replace_from : 1
const isArgCompletion = replaceFrom > 1
const prefix = isArgCompletion ? text.slice(0, replaceFrom) : ''
const decorated = (result.items ?? [])
.map(item => {
if (!isArgCompletion) {
return item
}
const argText = typeof item.text === 'string' ? item.text : ''
return { ...item, text: `${prefix}${argText}` }
})
.filter(item => isArgCompletion || isDesktopSlashSuggestion(item.text))
.map(item => ({
...item,
meta: desktopSlashDescription(item.text, textValue(item.meta))
// Arg suggestions (e.g. `/handoff <platform>`) live under one
// header; otherwise split skills out from built-in commands.
group: isArgCompletion ? 'Options' : isDesktopSlashExtensionCommand(item.text) ? 'Skills' : 'Commands',
// Arg items carry their own meta (the personality/toolset/platform
// blurb). Only command rows get the registry description — looking
// one up for `/personality none` would clobber it with the parent
// command's text.
meta: isArgCompletion ? textValue(item.meta) : desktopSlashDescription(item.text, textValue(item.meta))
}))
// Keep each group contiguous so headers render once: Commands before
// Skills (stable within a group, preserving backend relevance order).
const groupOrder = ['Commands', 'Skills', 'Options']
const items = isArgCompletion
? decorated
: [...decorated].sort((a, b) => groupOrder.indexOf(a.group) - groupOrder.indexOf(b.group))
return { items, query }
} catch {
return { items: [], query }
}
},
[gateway]
[gateway, skinThemes, activeSkin]
)
const toItem = useCallback((entry: CompletionEntry, index: number): Unstable_TriggerItem => {
@@ -93,6 +212,8 @@ export function useSlashCompletions(options: { gateway: HermesGateway | null }):
command,
display,
meta,
group: textValue(entry.group),
action: textValue(entry.action),
// Provide rawText so hermesDirectiveFormatter.serialize uses the
// direct-insertion path instead of the legacy @type:id fallback.
// Without this, the item.id (which includes a "|index" suffix for

File diff suppressed because it is too large Load Diff

View File

@@ -3,7 +3,12 @@ import { contextPath } from '@/lib/chat-runtime'
import type { DroppedFile } from '../hooks/use-composer-actions'
import { composerPlainText, escapeHtml, placeCaretEnd, refChipHtml } from './rich-editor'
import {
composerPlainText,
normalizeComposerEditorDom,
placeCaretEnd,
refChipElement
} from './rich-editor'
/** A chip to insert: a raw `@kind:value` string, or a typed value + display label. */
export type InlineRefInput = string | { kind: string; label?: string; value: string }
@@ -89,56 +94,102 @@ export function droppedFileInlineRefs(candidates: DroppedFile[], cwd: string | n
return candidates.map(candidate => droppedFileInlineRef(candidate, cwd)).filter((ref): ref is string => Boolean(ref))
}
export function insertInlineRefsIntoEditor(editor: HTMLDivElement, refs: readonly InlineRefInput[]) {
if (!refs.length) {
function parseInlineRef(ref: InlineRefInput): { kind: string; label?: string; rawValue: string } | null {
if (typeof ref !== 'string') {
return { kind: ref.kind, label: ref.label, rawValue: ref.value }
}
const match = ref.match(/^@([^:]+):(.+)$/)
if (!match) {
return null
}
const refsHtml = refs
.map(ref => {
if (typeof ref !== 'string') {
return refChipHtml(ref.kind, ref.value, ref.label)
}
return { kind: match[1] || 'file', rawValue: match[2] || '' }
}
const match = ref.match(/^@([^:]+):(.+)$/)
function plainTextInRange(editor: HTMLDivElement, range: Range, edge: 'after' | 'before') {
const slice = range.cloneRange()
slice.selectNodeContents(editor)
return match ? refChipHtml(match[1], match[2]) : escapeHtml(ref)
})
.join(' ')
if (edge === 'before') {
slice.setEnd(range.startContainer, range.startOffset)
} else {
slice.setStart(range.endContainer, range.endOffset)
}
const container = document.createElement('div')
container.appendChild(slice.cloneContents())
return composerPlainText(container)
}
function buildRefFragment(
refs: readonly { kind: string; label?: string; rawValue: string }[],
{ needsBeforeSpace, needsAfterSpace }: { needsAfterSpace: boolean; needsBeforeSpace: boolean }
) {
const fragment = document.createDocumentFragment()
if (needsBeforeSpace) {
fragment.append(document.createTextNode(' '))
}
refs.forEach((ref, index) => {
if (index > 0) {
fragment.append(document.createTextNode(' '))
}
fragment.append(refChipElement(ref.kind, ref.rawValue, ref.label))
})
if (needsAfterSpace) {
fragment.append(document.createTextNode(' '))
}
return fragment
}
export function insertInlineRefsIntoEditor(editor: HTMLDivElement, refs: readonly InlineRefInput[]) {
const parsed = refs.map(parseInlineRef).filter((ref): ref is NonNullable<typeof ref> => ref !== null)
if (!parsed.length) {
return null
}
editor.focus({ preventScroll: true })
const selection = window.getSelection()
const range =
selection?.rangeCount && editor.contains(selection.getRangeAt(0).commonAncestorContainer)
? selection.getRangeAt(0)
: null
editor.focus({ preventScroll: true })
if (range && selection) {
const beforeText = plainTextInRange(editor, range, 'before')
const afterText = plainTextInRange(editor, range, 'after')
if (range) {
const beforeRange = range.cloneRange()
beforeRange.selectNodeContents(editor)
beforeRange.setEnd(range.startContainer, range.startOffset)
const beforeContainer = document.createElement('div')
beforeContainer.appendChild(beforeRange.cloneContents())
const afterRange = range.cloneRange()
afterRange.selectNodeContents(editor)
afterRange.setStart(range.endContainer, range.endOffset)
const afterContainer = document.createElement('div')
afterContainer.appendChild(afterRange.cloneContents())
const beforeText = composerPlainText(beforeContainer)
const afterText = composerPlainText(afterContainer)
const needsBeforeSpace = beforeText.length > 0 && !/\s$/.test(beforeText)
const needsAfterSpace = afterText.length === 0 || !/^\s/.test(afterText)
document.execCommand('insertHTML', false, `${needsBeforeSpace ? ' ' : ''}${refsHtml}${needsAfterSpace ? ' ' : ''}`)
range.insertNode(
buildRefFragment(parsed, {
needsAfterSpace: afterText.length === 0 || !/^\s/.test(afterText),
needsBeforeSpace: beforeText.length > 0 && !/\s$/.test(beforeText)
})
)
range.collapse(false)
selection.removeAllRanges()
selection.addRange(range)
} else {
const current = composerPlainText(editor)
editor.append(
buildRefFragment(parsed, {
needsAfterSpace: true,
needsBeforeSpace: current.length > 0 && !/\s$/.test(current)
})
)
placeCaretEnd(editor)
document.execCommand('insertHTML', false, `${current && !/\s$/.test(current) ? ' ' : ''}${refsHtml} `)
}
normalizeComposerEditorDom(editor)
return composerPlainText(editor)
}

View File

@@ -1,7 +1,6 @@
import { useState } from 'react'
import { StatusRow } from '@/components/chat/status-row'
import { StatusSection } from '@/components/chat/status-section'
import { Button } from '@/components/ui/button'
import { DisclosureCaret } from '@/components/ui/disclosure-caret'
import { Tip } from '@/components/ui/tooltip'
import { type Translations, useI18n } from '@/i18n'
import { ArrowUp, Pencil, Trash2 } from '@/lib/icons'
@@ -23,108 +22,84 @@ const entryPreview = (entry: QueuedPromptEntry, c: Translations['composer']) =>
export function QueuePanel({ busy, editingId, entries, onDelete, onEdit, onSendNow }: QueuePanelProps) {
const { t } = useI18n()
const c = t.composer
const [collapsed, setCollapsed] = useState(true)
if (entries.length === 0) {
return null
}
return (
<div className="rounded-t-2xl border border-b-0 border-border/65 bg-[color-mix(in_srgb,var(--dt-card)_70%,transparent)] pt-0.5 pb-1 mx-1">
<button
className="flex w-full items-center gap-1.5 px-2 text-left text-[0.6rem] font-medium text-muted-foreground/92 transition-colors hover:text-foreground/90"
onClick={() => setCollapsed(open => !open)}
type="button"
>
<DisclosureCaret className="shrink-0" open={!collapsed} size="1em" />
<span className="truncate">{c.queued(entries.length)}</span>
</button>
<StatusSection label={c.queued(entries.length)}>
{entries.map(entry => {
const isEditing = editingId === entry.id
const attachmentsCount = entry.attachments.length
{!collapsed && (
<div className="space-y-0.5 px-1 pb-0.5">
{entries.map(entry => {
const isEditing = editingId === entry.id
const attachmentsCount = entry.attachments.length
const sendLabel = busy ? c.sendQueuedNext : c.sendQueuedNow
return (
<div
className={cn(
'group/queue-row flex items-center gap-1.5 rounded-lg border border-transparent px-1.5 py-0.5',
'transition-colors duration-300 ease-out hover:bg-(--chrome-action-hover) hover:transition-none',
isEditing && 'border-[color-mix(in_srgb,var(--dt-composer-ring)_40%,transparent)] bg-accent/25'
)}
key={entry.id}
>
<span
aria-hidden
className="h-3.5 w-3.5 shrink-0 rounded-full border border-foreground/35 bg-transparent"
/>
<div className="min-w-0 flex-1">
<p className="truncate text-[0.73rem] leading-4 text-foreground/92">{entryPreview(entry, c)}</p>
{(attachmentsCount > 0 || isEditing) && (
<div className="mt-0.5 flex items-center gap-1.5 text-[0.64rem] text-muted-foreground/75">
{attachmentsCount > 0 && <span>{c.attachments(attachmentsCount)}</span>}
{isEditing && (
<span className="text-[color-mix(in_srgb,var(--dt-composer-ring)_78%,var(--muted-foreground))]">
{c.editingInComposer}
</span>
)}
</div>
return (
<StatusRow
className={cn(
'border border-transparent',
isEditing && 'border-[color-mix(in_srgb,var(--dt-composer-ring)_40%,transparent)] bg-accent/25'
)}
key={entry.id}
trailing={
<>
<Tip label={c.queueEdit}>
<Button
aria-label={c.queueEdit}
className="size-5 rounded-md"
disabled={Boolean(editingId) && !isEditing}
onClick={() => onEdit(entry)}
size="icon-xs"
type="button"
variant="ghost"
>
<Pencil size={11} />
</Button>
</Tip>
<Tip label={busy ? c.queueSendNext : c.queueSend}>
<Button
aria-label={busy ? c.queueSendNext : c.queueSend}
className="size-5 rounded-md"
disabled={isEditing}
onClick={() => onSendNow(entry.id)}
size="icon-xs"
type="button"
variant="ghost"
>
<ArrowUp size={11} />
</Button>
</Tip>
<Tip label={c.queueDelete}>
<Button
aria-label={c.queueDelete}
className="size-5 rounded-md"
onClick={() => onDelete(entry.id)}
size="icon-xs"
type="button"
variant="ghost"
>
<Trash2 size={11} />
</Button>
</Tip>
</>
}
trailingVisible={isEditing}
>
<div className="min-w-0 flex-1">
<p className="truncate text-[0.73rem] leading-4 text-foreground/92">{entryPreview(entry, c)}</p>
{(attachmentsCount > 0 || isEditing) && (
<div className="mt-0.5 flex items-center gap-1.5 text-[0.64rem] text-muted-foreground/75">
{attachmentsCount > 0 && <span>{c.attachments(attachmentsCount)}</span>}
{isEditing && (
<span className="text-[color-mix(in_srgb,var(--dt-composer-ring)_78%,var(--muted-foreground))]">
{c.editingInComposer}
</span>
)}
</div>
<div
className={cn(
'flex shrink-0 items-center gap-0 transition-opacity',
isEditing
? 'opacity-100'
: 'opacity-0 group-hover/queue-row:opacity-100 group-focus-within/queue-row:opacity-100'
)}
>
<Tip label={c.editQueued}>
<Button
aria-label={c.editQueued}
className="h-5 w-5 rounded-md"
disabled={Boolean(editingId) && !isEditing}
onClick={() => onEdit(entry)}
size="icon-xs"
type="button"
variant="ghost"
>
<Pencil size={11} />
</Button>
</Tip>
<Tip label={sendLabel}>
<Button
aria-label={sendLabel}
className="h-5 w-5 rounded-md"
disabled={isEditing}
onClick={() => onSendNow(entry.id)}
size="icon-xs"
type="button"
variant="ghost"
>
<ArrowUp size={11} />
</Button>
</Tip>
<Tip label={c.deleteQueued}>
<Button
aria-label={c.deleteQueued}
className="h-5 w-5 rounded-md"
onClick={() => onDelete(entry.id)}
size="icon-xs"
type="button"
variant="ghost"
>
<Trash2 size={11} />
</Button>
</Tip>
</div>
</div>
)
})}
</div>
)}
</div>
)}
</div>
</StatusRow>
)
})}
</StatusSection>
)
}

View File

@@ -1,6 +1,25 @@
import { describe, expect, it } from 'vitest'
import { composerPlainText, renderComposerContents, RICH_INPUT_SLOT } from './rich-editor'
import { insertInlineRefsIntoEditor } from './inline-refs'
import {
composerPlainText,
deleteSelectionInEditor,
insertPlainTextAtCaret,
normalizeComposerEditorDom,
refChipElement,
renderComposerContents,
RICH_INPUT_SLOT
} from './rich-editor'
const caretIn = (editor: HTMLElement) => {
const range = document.createRange()
const selection = window.getSelection()!
range.selectNodeContents(editor)
range.collapse(false)
selection.removeAllRanges()
selection.addRange(range)
}
describe('renderComposerContents', () => {
it('renders refs and raw text without interpreting user text as HTML', () => {
@@ -16,3 +35,100 @@ describe('renderComposerContents', () => {
expect(composerPlainText(editor)).toBe('@file:`<img src=x onerror=alert(1)>` <b>raw</b>')
})
})
describe('normalizeComposerEditorDom', () => {
it('unwraps a single insertHTML wrapper div so plain text stays one line', () => {
const editor = document.createElement('div')
editor.dataset.slot = RICH_INPUT_SLOT
editor.innerHTML = '<div><span data-ref-text="@file:`src/foo.ts`" contenteditable="false">foo.ts</span> </div>'
normalizeComposerEditorDom(editor)
expect(composerPlainText(editor)).toBe('@file:`src/foo.ts` ')
expect(editor.querySelector(':scope > div')).toBeNull()
})
it('removes a trailing br after a ref chip', () => {
const editor = document.createElement('div')
editor.dataset.slot = RICH_INPUT_SLOT
editor.append(refChipElement('file', '`src/foo.ts`'), document.createElement('br'))
normalizeComposerEditorDom(editor)
expect(composerPlainText(editor)).toBe('@file:`src/foo.ts`')
expect(editor.querySelector('br')).toBeNull()
})
})
describe('insertInlineRefsIntoEditor', () => {
it('inserts chips without wrapper divs or spurious newlines', () => {
const editor = document.createElement('div')
editor.dataset.slot = RICH_INPUT_SLOT
insertInlineRefsIntoEditor(editor, ['@file:`src/foo.ts`'])
expect(editor.querySelector(':scope > div')).toBeNull()
expect(composerPlainText(editor)).toBe('@file:`src/foo.ts` ')
})
})
describe('insertPlainTextAtCaret', () => {
it('inserts multiline text as text nodes + br', () => {
const editor = document.createElement('div')
editor.dataset.slot = RICH_INPUT_SLOT
document.body.append(editor)
caretIn(editor)
insertPlainTextAtCaret(editor, 'one\ntwo\nthree')
expect(editor.querySelectorAll('br').length).toBe(2)
expect(composerPlainText(editor)).toBe('one\ntwo\nthree')
editor.remove()
})
it('replaces the selected span', () => {
const editor = document.createElement('div')
editor.dataset.slot = RICH_INPUT_SLOT
editor.textContent = 'abXYef'
document.body.append(editor)
const text = editor.firstChild!
const selection = window.getSelection()!
const range = document.createRange()
range.setStart(text, 2)
range.setEnd(text, 4)
selection.removeAllRanges()
selection.addRange(range)
insertPlainTextAtCaret(editor, 'cd')
expect(composerPlainText(editor)).toBe('abcdef')
editor.remove()
})
})
describe('deleteSelectionInEditor', () => {
it('clears a non-collapsed range and leaves a collapsed caret', () => {
const editor = document.createElement('div')
editor.dataset.slot = RICH_INPUT_SLOT
editor.textContent = 'hello world'
document.body.append(editor)
const selection = window.getSelection()!
const range = document.createRange()
range.selectNodeContents(editor)
selection.removeAllRanges()
selection.addRange(range)
expect(deleteSelectionInEditor(editor)).toBe(true)
expect(composerPlainText(editor)).toBe('')
expect(selection.getRangeAt(0).collapsed).toBe(true)
expect(deleteSelectionInEditor(editor)).toBe(false)
editor.remove()
})
})

View File

@@ -10,7 +10,10 @@ import {
DIRECTIVE_CHIP_CLASS,
directiveIconElement,
directiveIconSvg,
formatRefValue
formatRefValue,
slashChipClass,
type SlashChipKind,
slashIconElement
} from '@/components/assistant-ui/directive-text'
export const RICH_INPUT_SLOT = 'composer-rich-input'
@@ -77,6 +80,24 @@ export function refChipElement(kind: string, rawValue: string, displayLabel?: st
return chip
}
/** A non-editable pill for a picked slash command (`/skin nous`, `/tropes`).
* `data-ref-text` carries the literal command so `composerPlainText` round-trips
* it back to the exact text that gets submitted. */
export function slashChipElement(command: string, kind: SlashChipKind, label?: string) {
const chip = document.createElement('span')
const text = document.createElement('span')
chip.contentEditable = 'false'
chip.dataset.refText = command
chip.dataset.slashKind = kind
chip.className = slashChipClass(kind)
text.className = 'truncate'
text.textContent = label || command
chip.append(slashIconElement(kind), text)
return chip
}
function appendTextWithBreaks(target: DocumentFragment | HTMLElement, text: string) {
const lines = text.split('\n')
@@ -111,6 +132,63 @@ export function renderComposerContents(target: HTMLElement, text: string) {
appendComposerContents(target, text)
}
/** Caret range when the selection lives inside `editor`; else null. */
function composerSelectionRange(editor: HTMLElement) {
const selection = window.getSelection()
const range = selection?.rangeCount ? selection.getRangeAt(0) : null
if (!selection || !range || !editor.contains(range.commonAncestorContainer)) {
return null
}
return { range, selection }
}
/** Insert plain text at the caret (replacing any selection). Pastes use this
* instead of `execCommand('insertText')` — Chromium's editing pipeline is
* ~O(n²) on large multiline blobs. */
export function insertPlainTextAtCaret(editor: HTMLElement, text: string) {
const hit = composerSelectionRange(editor)
const fragment = document.createDocumentFragment()
appendTextWithBreaks(fragment, text)
const tail = fragment.lastChild
if (hit) {
hit.range.deleteContents()
hit.range.insertNode(fragment)
} else {
editor.append(fragment)
}
if (tail) {
const caret = document.createRange()
caret.setStartAfter(tail)
caret.collapse(true)
const selection = hit?.selection ?? window.getSelection()
selection?.removeAllRanges()
selection?.addRange(caret)
}
}
/** Remove a non-collapsed selection in-editor. Skips collapsed carets so word/
* line delete (Opt/Cmd+Backspace) stays native. Returns whether anything ran. */
export function deleteSelectionInEditor(editor: HTMLElement) {
const hit = composerSelectionRange(editor)
if (!hit || hit.range.collapsed) {
return false
}
hit.range.deleteContents()
hit.range.collapse(true)
hit.selection.removeAllRanges()
hit.selection.addRange(hit.range)
return true
}
/** Serialize a draft string into chip-HTML for the contenteditable surface. */
export function composerHtml(text: string) {
let cursor = 0
@@ -163,3 +241,36 @@ export function placeCaretEnd(element: HTMLElement) {
selection?.removeAllRanges()
selection?.addRange(range)
}
/** Drop contenteditable junk that serializes as `\n` and falsely expands the composer. */
export function normalizeComposerEditorDom(editor: HTMLElement) {
if (editor.childNodes.length === 1 && editor.firstChild?.nodeName === 'BR') {
editor.replaceChildren()
return
}
if (editor.childNodes.length === 1 && editor.firstChild?.nodeType === Node.ELEMENT_NODE) {
const wrapper = editor.firstChild as HTMLElement
if (wrapper.tagName === 'DIV' && wrapper.dataset.slot !== RICH_INPUT_SLOT) {
editor.replaceChildren(...Array.from(wrapper.childNodes))
}
}
const last = editor.lastChild
if (last?.nodeName !== 'BR') {
return
}
let prev: ChildNode | null = last.previousSibling
while (prev?.nodeType === Node.TEXT_NODE && !(prev.textContent || '').trim()) {
prev = prev.previousSibling
}
if ((prev as HTMLElement | null)?.dataset.refText) {
editor.removeChild(last)
}
}

View File

@@ -1,61 +0,0 @@
import { useI18n } from '@/i18n'
import { desktopSkinSlashCompletions } from '@/lib/desktop-slash-commands'
import { triggerHaptic } from '@/lib/haptics'
import { useTheme } from '@/themes/context'
import { COMPLETION_DRAWER_CLASS, COMPLETION_DRAWER_ROW_CLASS, CompletionDrawerEmpty } from './completion-drawer'
interface SkinSlashPopoverProps {
draft: string
onSelect: (command: string) => void
}
export function SkinSlashPopover({ draft, onSelect }: SkinSlashPopoverProps) {
const { t } = useI18n()
const c = t.composer
const { availableThemes, themeName } = useTheme()
const match = draft.match(/^\/skin\s+(\S*)$/i)
if (!match) {
return null
}
const items = desktopSkinSlashCompletions(availableThemes, themeName, match[1] ?? '')
return (
<div
aria-label={c.themeSuggestions}
className={COMPLETION_DRAWER_CLASS}
data-slot="composer-skin-completion-drawer"
data-state="open"
role="listbox"
>
<div className="grid gap-0.5 pt-0.5">
{items.length === 0 ? (
<CompletionDrawerEmpty title={c.noMatchingThemes}>
{c.themeTryPre}
<span className="font-mono text-foreground/80">/skin list</span>
{c.themeTryPost}
</CompletionDrawerEmpty>
) : (
items.map(item => (
<button
className={COMPLETION_DRAWER_ROW_CLASS}
key={item.text}
onClick={() => {
triggerHaptic('selection')
onSelect(item.text)
}}
onMouseDown={event => event.preventDefault()}
role="option"
type="button"
>
<span className="shrink-0 font-mono font-medium leading-5 text-foreground">{item.display}</span>
<span className="min-w-0 truncate leading-5 text-muted-foreground/80">{item.meta}</span>
</button>
))
)}
</div>
</div>
)
}

View File

@@ -0,0 +1,202 @@
import { useStore } from '@nanostores/react'
import { type ReactNode, useEffect, useLayoutEffect, useMemo, useRef } from 'react'
import { useNavigate } from 'react-router-dom'
import { blurComposerInput } from '@/app/chat/composer/focus'
import { AGENTS_ROUTE } from '@/app/routes'
import { composerDockCard } from '@/components/chat/composer-dock'
import { StatusSection } from '@/components/chat/status-section'
import { Button } from '@/components/ui/button'
import { Codicon } from '@/components/ui/codicon'
import { type Translations, useI18n } from '@/i18n'
import { cn } from '@/lib/utils'
import {
$statusItemsBySession,
type ComposerStatusItem,
dismissBackgroundProcess,
groupStatusItems,
refreshBackgroundProcesses,
type StatusGroup,
stopBackgroundProcess
} from '@/store/composer-status'
import { $threadScrolledUp } from '@/store/thread-scroll'
import { openSessionInNewWindow } from '@/store/windows'
import { StatusItemRow } from './status-row'
// Slow safety-net poll for silent exits (processes without notify_on_complete
// emit no event when they die). Only armed while a running row is on screen.
const BACKGROUND_POLL_MS = 5_000
const groupLabel = (group: StatusGroup, s: Translations['statusStack']) => {
if (group.type === 'todo') {
return s.todos(group.items.filter(i => i.todoStatus === 'completed').length, group.items.length)
}
return group.type === 'subagent' ? s.subagents(group.items.length) : s.background(group.items.length)
}
interface ComposerStatusStackProps {
/** The queue, built by the composer (it owns the queue's callbacks). Rendered
* as the last group so it stays fused to the composer like before. */
queue: ReactNode
sessionId: null | string
}
/**
* The status "sink" above the composer: one card (the queue's chrome) holding
* every session-scoped status — subagents, background tasks, queue — grouped by
* type and separated by light dividers. Collapses to nothing when empty.
*/
export function ComposerStatusStack({ queue, sessionId }: ComposerStatusStackProps) {
const { t } = useI18n()
const navigate = useNavigate()
const itemsBySession = useStore($statusItemsBySession)
const scrolledUp = useStore($threadScrolledUp)
const groups = useMemo(
() => groupStatusItems(sessionId ? (itemsBySession[sessionId] ?? []) : []),
[itemsBySession, sessionId]
)
// Seed from the registry on session open; event-driven refreshes (terminal /
// process tool completions) live in use-message-stream.
useEffect(() => {
if (sessionId) {
void refreshBackgroundProcesses(sessionId)
}
}, [sessionId])
const hasRunningBackground = groups.some(g => g.type === 'background' && g.items.some(i => i.state === 'running'))
useEffect(() => {
if (!sessionId || !hasRunningBackground) {
return
}
const timer = setInterval(() => void refreshBackgroundProcesses(sessionId), BACKGROUND_POLL_MS)
return () => clearInterval(timer)
}, [hasRunningBackground, sessionId])
const openAgents = () => navigate(AGENTS_ROUTE)
const openSubagent = (item: ComposerStatusItem) =>
item.sessionId ? void openSessionInNewWindow(item.sessionId, { watch: true }) : openAgents()
const sections: { key: string; node: ReactNode }[] = groups.map(group => ({
key: group.type,
node: (
<StatusSection
accessory={
group.type === 'subagent' ? (
<Button
className="text-muted-foreground/75 hover:text-foreground/90"
onClick={openAgents}
size="micro"
type="button"
variant="text"
>
{t.statusStack.agents}
</Button>
) : undefined
}
defaultCollapsed={group.type !== 'todo'}
icon={
group.type === 'todo' ? (
<Codicon className="text-muted-foreground/70" name="checklist" size="0.8rem" />
) : undefined
}
label={groupLabel(group, t.statusStack)}
>
{group.items.map(item => (
<StatusItemRow
item={item}
key={item.id}
onDismiss={sessionId ? id => dismissBackgroundProcess(sessionId, id) : undefined}
onOpen={() => openSubagent(item)}
onStop={sessionId ? id => stopBackgroundProcess(sessionId, id) : undefined}
/>
))}
</StatusSection>
)
}))
if (queue) {
sections.push({ key: 'queue', node: queue })
}
const visible = sections.length > 0
const stackRef = useRef<HTMLDivElement | null>(null)
// The stack is out of flow (overlays the thread), so the composer's measured
// height never sees it. Publish our own measured height — bucketed like the
// composer's, to avoid style invalidation churn — so the thread's
// last-message clearance can add it and the stack never hides messages.
useLayoutEffect(() => {
const root = document.documentElement
const el = stackRef.current
if (!visible || !el) {
root.style.removeProperty('--status-stack-measured-height')
return
}
let last = -1
const sync = () => {
const bucket = Math.round(el.getBoundingClientRect().height / 8) * 8
if (bucket !== last) {
last = bucket
root.style.setProperty('--status-stack-measured-height', `${bucket}px`)
}
}
const observer = new ResizeObserver(sync)
observer.observe(el)
sync()
return () => {
observer.disconnect()
root.style.removeProperty('--status-stack-measured-height')
}
}, [visible])
if (!visible) {
return null
}
return (
<div
// Sits above the composer (bottom-full), nudged down by the shell's 0.5rem
// top pad (pt-2 on composer-root) plus 1px so its bottom edge overlaps the
// composer surface's top border. z BELOW the surface (z-4) so the surface's
// top border paints over our transparent bottom border — one seam, no
// double line.
className="absolute inset-x-0 bottom-full z-3 max-h-[40vh] translate-y-[calc(0.5rem+1px)] overflow-y-auto"
onPointerDownCapture={() => blurComposerInput()}
ref={stackRef}
>
{/* The card paints the shared --composer-fill (rest / scrolled / focused
all match the composer surface by construction); on scroll we only
ghost the CONTENT — element opacity on the card would kill the blur.
Rounded top, square bottom; the bottom border is TRANSPARENT — the
composer surface's visible top border (which sits at a higher z) is the
single shared seam, so the two read as one fused capsule. */}
<div className={cn(composerDockCard('top'), 'mx-2 rounded-b-none border-b border-b-transparent pt-0.5 pb-1')}>
<div
className={cn(
'transition-opacity duration-200 ease-out',
scrolledUp ? 'opacity-30 group-hover/composer:opacity-100' : 'opacity-100'
)}
>
{sections.map(section => (
<div key={section.key}>{section.node}</div>
))}
</div>
</div>
</div>
)
}

View File

@@ -0,0 +1,155 @@
import { Fragment, memo, type ReactNode, useState } from 'react'
import { StatusRow } from '@/components/chat/status-row'
import { TerminalOutput } from '@/components/chat/terminal-output'
import { Button } from '@/components/ui/button'
import { Codicon } from '@/components/ui/codicon'
import { DisclosureCaret } from '@/components/ui/disclosure-caret'
import { GlyphSpinner } from '@/components/ui/glyph-spinner'
import { Tip } from '@/components/ui/tooltip'
import { type Translations, useI18n } from '@/i18n'
import { ArrowUpRight, X } from '@/lib/icons'
import type { TodoStatus } from '@/lib/todos'
import { cn } from '@/lib/utils'
import type { ComposerStatusItem } from '@/store/composer-status'
const toolLabel = (name: string) =>
name
.split('_')
.filter(Boolean)
.map(part => part[0]!.toUpperCase() + part.slice(1))
.join(' ') || name
// Todo rows speak checkbox, not spinner-and-dot: a dashed ring while the item
// is still open (pending), codicons once it resolves, a live spinner only on
// the in-progress item.
const TODO_GLYPHS: Record<Exclude<TodoStatus, 'in_progress' | 'pending'>, { icon: string; tone: string }> = {
cancelled: { icon: 'circle-slash', tone: 'text-muted-foreground/45' },
completed: { icon: 'pass-filled', tone: 'text-emerald-500/80' }
}
// Left slot: braille spinner while running, otherwise a small status dot
// (green = done, red = failed) so the slot is always filled and rows align.
function leadingGlyph(item: ComposerStatusItem, s: Translations['statusStack']): ReactNode {
if (item.todoStatus === 'pending') {
return (
<span
aria-hidden
className="box-border size-[0.7rem] rounded-full border border-dashed border-muted-foreground/60"
/>
)
}
if (item.todoStatus && item.todoStatus !== 'in_progress') {
const glyph = TODO_GLYPHS[item.todoStatus]
return <Codicon className={glyph.tone} name={glyph.icon} size="0.8rem" />
}
if (item.state === 'running') {
return (
<GlyphSpinner
ariaLabel={s.running}
className="text-[0.9rem] leading-none text-muted-foreground/80"
spinner="braille"
/>
)
}
return (
<span
aria-hidden
className={cn('size-1.5 rounded-full', item.state === 'failed' ? 'bg-destructive/80' : 'bg-emerald-500/70')}
/>
)
}
interface StatusItemRowProps {
item: ComposerStatusItem
/** Clear a finished background task from the stack. */
onDismiss?: (id: string) => void
/** Open the subagent's own session window, livestreamed by the gateway's
* child-session mirror (Agents view fallback for older gateways). */
onOpen?: () => void
/** Cancel a running background task. */
onStop?: (id: string) => void
}
/**
* Renders one {@link ComposerStatusItem} into the shared {@link StatusRow}.
* Memoised + keyed by id so parent re-renders never remount it (the spinner
* keeps ticking instead of resetting).
*/
export const StatusItemRow = memo(function StatusItemRow({ item, onDismiss, onOpen, onStop }: StatusItemRowProps) {
const { t } = useI18n()
const s = t.statusStack
const [outputOpen, setOutputOpen] = useState(false)
const failed = item.state === 'failed'
const running = item.state === 'running'
const action =
item.type === 'background'
? running
? onStop && { label: s.stop, onClick: () => onStop(item.id) }
: onDismiss && { label: s.dismiss, onClick: () => onDismiss(item.id) }
: null
const canOpen = item.type === 'subagent' && !!onOpen
const hasOutput = item.type === 'background' && !!item.output
const onActivate = canOpen ? onOpen : hasOutput ? () => setOutputOpen(open => !open) : undefined
return (
<Fragment>
<StatusRow
leading={leadingGlyph(item, s)}
onActivate={onActivate}
trailing={
action ? (
<Tip label={action.label}>
<Button
aria-label={action.label}
className="-my-1 size-4 rounded-md text-muted-foreground/60 hover:text-foreground/90"
onClick={event => {
event.stopPropagation()
action.onClick()
}}
size="icon-xs"
type="button"
variant="ghost"
>
<X size={12} />
</Button>
</Tip>
) : canOpen ? (
<ArrowUpRight aria-hidden className="size-3.5 text-muted-foreground/55" />
) : undefined
}
>
<span
className={cn(
'min-w-0 max-w-[18rem] truncate text-[0.73rem] leading-4',
failed
? 'text-destructive/90'
: item.todoStatus && item.todoStatus !== 'in_progress'
? 'text-muted-foreground/75'
: 'text-foreground/92'
)}
>
{item.title}
</span>
{item.type === 'subagent' && item.currentTool && (
<span className="shrink-0 truncate text-[0.62rem] leading-4 text-muted-foreground/70">
{toolLabel(item.currentTool)}
</span>
)}
{failed && typeof item.exitCode === 'number' && item.exitCode !== 0 && (
<span className="shrink-0 rounded bg-destructive/15 px-1 text-[0.58rem] font-semibold text-destructive tabular-nums">
{s.exit(item.exitCode)}
</span>
)}
{hasOutput && <DisclosureCaret className="shrink-0 text-muted-foreground/45" open={outputOpen} size="0.8em" />}
</StatusRow>
{hasOutput && outputOpen && <TerminalOutput className="mx-auto mb-1 max-w-[90%]" text={item.output!} />}
</Fragment>
)
})

View File

@@ -22,6 +22,33 @@ describe('detectTrigger', () => {
it('returns null for plain text', () => {
expect(detectTrigger('hello there')).toBeNull()
})
it('keeps the slash trigger live while typing args', () => {
expect(detectTrigger('/personality ')).toEqual({
kind: '/',
query: 'personality ',
tokenLength: 13
})
expect(detectTrigger('/personality alic')).toEqual({
kind: '/',
query: 'personality alic',
tokenLength: 17
})
expect(detectTrigger('/tools enable foo')).toEqual({
kind: '/',
query: 'tools enable foo',
tokenLength: 17
})
})
it('does not treat file-style paths as slash triggers', () => {
expect(detectTrigger('src/foo/bar')).toBeNull()
expect(detectTrigger('/path/to/file')).toBeNull()
})
it('still anchors at-mention triggers strictly at the token edge', () => {
expect(detectTrigger('@file:path with space')).toBeNull()
})
})
describe('extractClipboardImageBlobs', () => {

View File

@@ -6,7 +6,13 @@ export interface TriggerState {
tokenLength: number
}
const TRIGGER_RE = /(?:^|[\s])([@/])([^\s@/]*)$/
// `@` triggers stop at the first whitespace — `@file:path` and `@diff` are
// single tokens. `/` triggers keep going so the popover stays live while the
// user types args (`/personality alic` → arg completer suggests `alice`).
// Restricting the slash command name to `[a-zA-Z][\w-]*` avoids matching file
// paths like `src/foo/bar`.
const AT_TRIGGER_RE = /(?:^|[\s])(@)([^\s@/]*)$/
const SLASH_TRIGGER_RE = /(?:^|[\s])(\/)((?:[a-zA-Z][\w-]*(?:\s+\S*)*)?)$/
/** Stable key for paste dedupe — `items` and `files` often mirror the same image as different objects. */
export function blobDedupeKey(blob: Blob): string {
@@ -97,11 +103,17 @@ export function textBeforeCaret(editor: HTMLDivElement): string | null {
}
export function detectTrigger(textBefore: string): TriggerState | null {
const match = TRIGGER_RE.exec(textBefore)
const slash = SLASH_TRIGGER_RE.exec(textBefore)
if (!match) {
return null
if (slash) {
return { kind: '/', query: slash[2], tokenLength: 1 + slash[2].length }
}
return { kind: match[1] as '@' | '/', query: match[2], tokenLength: 1 + match[2].length }
const at = AT_TRIGGER_RE.exec(textBefore)
if (at) {
return { kind: '@', query: at[2], tokenLength: 1 + at[2].length }
}
return null
}

View File

@@ -34,9 +34,17 @@ describe('ComposerTriggerPopover i18n', () => {
})
it('renders localized loading copy for slash commands', () => {
const { container } = renderPopover('/', true)
renderPopover('/', true)
// While loading the popover shows only the spinner + loading copy — the
// `/help` empty-state hint is reserved for the resolved (not-loading) state.
expect(screen.getByText('查找中…')).toBeTruthy()
})
it('renders the slash empty-state hint when not loading', () => {
const { container } = renderPopover('/')
expect(screen.getByText('没有匹配项。')).toBeTruthy()
expect(container.textContent).toContain('/help')
})
})

View File

@@ -1,15 +1,12 @@
import type { Unstable_TriggerItem } from '@assistant-ui/core'
import { Fragment } from 'react'
import { Codicon } from '@/components/ui/codicon'
import { GlyphSpinner } from '@/components/ui/glyph-spinner'
import { useI18n } from '@/i18n'
import { cn } from '@/lib/utils'
import {
COMPLETION_DRAWER_BELOW_CLASS,
COMPLETION_DRAWER_CLASS,
COMPLETION_DRAWER_ROW_CLASS,
CompletionDrawerEmpty
} from './completion-drawer'
import { COMPLETION_DRAWER_BELOW_CLASS, COMPLETION_DRAWER_CLASS, CompletionDrawerEmpty } from './completion-drawer'
const AT_ICON_BY_TYPE: Record<string, string> = {
diff: 'diff',
@@ -23,11 +20,7 @@ const AT_ICON_BY_TYPE: Record<string, string> = {
url: 'globe'
}
function completionIcon(kind: '@' | '/', item: Unstable_TriggerItem) {
if (kind === '/') {
return 'terminal'
}
function atIcon(item: Unstable_TriggerItem) {
const meta = item.metadata as { rawText?: string } | undefined
const raw = meta?.rawText || item.label
@@ -42,6 +35,18 @@ function completionIcon(kind: '@' | '/', item: Unstable_TriggerItem) {
return AT_ICON_BY_TYPE[item.type] || AT_ICON_BY_TYPE.simple
}
interface RowMeta {
display?: string
group?: string
meta?: string
}
const ROW_BASE_CLASS = [
'relative flex w-full cursor-default select-none rounded-md px-2 py-1 text-left',
'outline-hidden transition-colors hover:bg-(--ui-bg-tertiary)',
'data-[highlighted]:bg-(--ui-bg-tertiary) data-[highlighted]:text-foreground'
].join(' ')
interface ComposerTriggerPopoverProps {
activeIndex: number
items: readonly Unstable_TriggerItem[]
@@ -63,6 +68,9 @@ export function ComposerTriggerPopover({
}: ComposerTriggerPopoverProps) {
const { t } = useI18n()
const copy = t.composer
const isSlash = kind === '/'
let lastGroup: string | undefined
return (
<div
@@ -73,41 +81,94 @@ export function ComposerTriggerPopover({
role="listbox"
>
{items.length === 0 ? (
<CompletionDrawerEmpty title={loading ? copy.lookupLoading : copy.lookupNoMatches}>
{kind === '@' ? (
<>
{copy.lookupTry} <span className="font-mono text-foreground/80">@file:</span> {copy.lookupOr}{' '}
<span className="font-mono text-foreground/80">@folder:</span>.
</>
) : (
<>
{copy.lookupTry} <span className="font-mono text-foreground/80">/help</span>.
</>
)}
</CompletionDrawerEmpty>
loading ? (
<div className="flex items-center gap-2 px-2 py-1.5 text-(--ui-text-tertiary)">
<GlyphSpinner ariaLabel={copy.lookupLoading} className="text-foreground/70" spinner="braille" />
<span>{copy.lookupLoading}</span>
</div>
) : (
<CompletionDrawerEmpty title={copy.lookupNoMatches}>
{kind === '@' ? (
<>
{copy.lookupTry} <span className="font-mono text-foreground/80">@file:</span> {copy.lookupOr}{' '}
<span className="font-mono text-foreground/80">@folder:</span>.
</>
) : (
<>
{copy.lookupTry} <span className="font-mono text-foreground/80">/help</span>.
</>
)}
</CompletionDrawerEmpty>
)
) : (
items.map((item, index) => {
const meta = item.metadata as { display?: string; meta?: string } | undefined
const display = meta?.display ?? (kind === '/' ? `/${item.label}` : item.label)
const meta = item.metadata as RowMeta | undefined
const display = meta?.display ?? (isSlash ? `/${item.label}` : item.label)
const description = meta?.meta || item.description
const group = meta?.group?.trim()
const showHeader = isSlash && Boolean(group) && group !== lastGroup
const isFirstHeader = lastGroup === undefined
lastGroup = group || lastGroup
const active = index === activeIndex
return (
<button
className={cn(COMPLETION_DRAWER_ROW_CLASS, index === activeIndex && 'bg-(--ui-bg-tertiary)')}
data-highlighted={index === activeIndex ? '' : undefined}
key={item.id}
onClick={() => onPick(item)}
onMouseEnter={() => onHover(index)}
type="button"
>
<span className="grid size-3.5 shrink-0 place-items-center text-(--ui-text-tertiary)">
<Codicon name={completionIcon(kind, item)} size="0.875rem" />
</span>
<span className="min-w-0 shrink truncate font-mono font-medium leading-5 text-foreground">{display}</span>
{description && (
<span className="min-w-0 flex-1 truncate leading-5 text-(--ui-text-tertiary)">{description}</span>
<Fragment key={item.id}>
{showHeader && (
<div
className={cn(
'select-none px-2 pb-0.5 text-[0.625rem] font-semibold uppercase tracking-wider text-(--ui-text-tertiary)',
isFirstHeader ? 'pt-0.5' : 'pt-2'
)}
>
{group}
</div>
)}
</button>
<button
className={cn(ROW_BASE_CLASS, isSlash ? 'flex-col gap-0' : 'items-center gap-2')}
data-highlighted={active ? '' : undefined}
onClick={() => onPick(item)}
onMouseEnter={() => onHover(index)}
type="button"
>
{isSlash ? (
<>
{/* Active row (keyboard nav or hover) un-truncates inline so
long command names / descriptions stay readable without a
floating tooltip. */}
<span
className={cn(
'text-[0.8125rem] font-medium leading-snug text-foreground',
active ? 'whitespace-normal break-words' : 'truncate'
)}
>
{display}
</span>
{description && (
<span
className={cn(
'text-[0.6875rem] leading-snug text-(--ui-text-tertiary)',
active ? 'whitespace-normal break-words' : 'truncate'
)}
>
{description}
</span>
)}
</>
) : (
<>
<span className="grid size-4 shrink-0 place-items-center text-(--ui-text-tertiary)">
<Codicon name={atIcon(item)} size="0.875rem" />
</span>
<span className="min-w-0 shrink truncate font-mono font-medium leading-5 text-foreground">
{display}
</span>
{description && (
<span className="min-w-0 flex-1 truncate leading-5 text-(--ui-text-tertiary)">{description}</span>
)}
</>
)}
</button>
</Fragment>
)
})
)}

View File

@@ -1,6 +1,7 @@
import { useCallback } from 'react'
import { requestComposerFocus, requestComposerInsert } from '@/app/chat/composer/focus'
import { requestComposerFocus, requestComposerInsert, requestComposerInsertRefs } from '@/app/chat/composer/focus'
import { droppedFileInlineRef } from '@/app/chat/composer/inline-refs'
import { formatRefValue } from '@/components/assistant-ui/directive-text'
import { useI18n } from '@/i18n'
import { attachmentId, contextPath, pathLabel } from '@/lib/chat-runtime'
@@ -286,6 +287,26 @@ export function useComposerActions({ activeSessionId, currentCwd, requestGateway
[currentCwd]
)
const insertContextPathInlineRef = useCallback(
(path: string, isDirectory = false) => {
if (!path) {
return false
}
const ref = droppedFileInlineRef({ isDirectory, path }, currentCwd)
if (!ref) {
return false
}
requestComposerInsertRefs([ref])
requestComposerFocus('main')
return true
},
[currentCwd]
)
const attachContextFilePath = useCallback(
(filePath: string) => {
if (!filePath) {
@@ -546,6 +567,7 @@ export function useComposerActions({ activeSessionId, currentCwd, requestGateway
attachDroppedItems,
attachImageBlob,
attachImagePath,
insertContextPathInlineRef,
pasteClipboardImage,
pickContextPaths,
pickImages,

View File

@@ -35,7 +35,9 @@ import {
$gatewayState,
$introPersonality,
$introSeed,
$lastVisibleMessageIsUser,
$messages,
$messagesEmpty,
$selectedStoredSessionId,
$sessions,
sessionPinId
@@ -43,7 +45,7 @@ import {
import type { ModelOptionsResponse } from '@/types/hermes'
import { routeSessionId } from '../routes'
import { titlebarHeaderBaseClass, titlebarHeaderShadowClass } from '../shell/titlebar'
import { titlebarHeaderBaseClass, titlebarHeaderShadowClass, titlebarHeaderTitleClass } from '../shell/titlebar'
import { ChatDropOverlay } from './chat-drop-overlay'
import { ChatSwapOverlay } from './chat-swap-overlay'
@@ -53,8 +55,9 @@ import { droppedFileInlineRefs, type SessionDragPayload, sessionInlineRef } from
import type { ChatBarState } from './composer/types'
import { type DroppedFile, partitionDroppedFiles } from './hooks/use-composer-actions'
import { useFileDropZone } from './hooks/use-file-drop-zone'
import { ScrollToBottomButton } from './scroll-to-bottom-button'
import { SessionActionsMenu } from './sidebar/session-actions-menu'
import { lastVisibleMessageIsUser, threadLoadingState } from './thread-loading'
import { threadLoadingState } from './thread-loading'
interface ChatViewProps extends Omit<React.ComponentProps<'div'>, 'onSubmit'> {
gateway: HermesGateway | null
@@ -80,6 +83,7 @@ interface ChatViewProps extends Omit<React.ComponentProps<'div'>, 'onSubmit'> {
onThreadMessagesChange: (messages: readonly ThreadMessage[]) => void
onEdit: (message: AppendMessage) => Promise<void>
onReload: (parentId: string | null) => Promise<void>
onRestoreToMessage?: (messageId: string) => Promise<void>
onTranscribeAudio?: (audio: Blob) => Promise<string>
}
@@ -125,7 +129,7 @@ function ChatHeader({
return (
<header className={cn(titlebarHeaderBaseClass, isRoutedSessionView && titlebarHeaderShadowClass)}>
<div
className="min-w-0 flex-1"
className={titlebarHeaderTitleClass}
style={{
maxWidth:
'calc(100vw - var(--titlebar-content-inset,0px) - var(--titlebar-tools-right) - var(--titlebar-tools-width) - 1.5rem)'
@@ -141,7 +145,7 @@ function ChatHeader({
title={title}
>
<Button
className="pointer-events-auto flex h-6 min-w-0 max-w-full gap-1 border border-transparent bg-transparent px-2 py-0 text-(--ui-text-secondary) hover:border-(--ui-stroke-tertiary) hover:bg-(--ui-control-hover-background) hover:text-foreground data-[state=open]:border-(--ui-stroke-tertiary) data-[state=open]:bg-(--ui-control-active-background) [-webkit-app-region:no-drag]"
className="pointer-events-auto flex h-6 min-w-0 max-w-full gap-1 overflow-hidden border border-transparent bg-transparent px-2 py-0 text-(--ui-text-secondary) hover:border-(--ui-stroke-tertiary) hover:bg-(--ui-control-hover-background) hover:text-foreground data-[state=open]:border-(--ui-stroke-tertiary) data-[state=open]:bg-(--ui-control-active-background) [-webkit-app-region:no-drag]"
type="button"
variant="ghost"
>
@@ -154,104 +158,42 @@ function ChatHeader({
)
}
export function ChatView({
className,
gateway,
onToggleSelectedPin,
onDeleteSelectedSession,
interface ChatRuntimeBoundaryProps {
busy: boolean
children: React.ReactNode
onCancel: () => Promise<void> | void
onEdit: (message: AppendMessage) => Promise<void>
onReload: (parentId: string | null) => Promise<void>
onThreadMessagesChange: (messages: readonly ThreadMessage[]) => void
/** Route points at an unloaded session — render empty until resume swaps in
* the new transcript, so the previous session's messages don't linger. */
suppressMessages: boolean
}
const NO_MESSAGES: ChatMessage[] = []
/**
* Owns the $messages subscription and the assistant-ui external-store runtime.
*
* Isolated from ChatView so the per-token delta flush (which replaces the
* $messages atom ~30×/s during streaming) only re-renders this component and
* the runtime provider. The children (Thread, ChatBar) are created by
* ChatView, whose render output is stable across flushes — so React bails out
* of re-rendering them by element identity and the stream's render cost stays
* confined to the streaming message's own subtree.
*/
function ChatRuntimeBoundary({
busy,
children,
onCancel,
onAddContextRef,
onAddUrl,
onAttachImageBlob,
onAttachDroppedItems,
onBranchInNewChat,
maxVoiceRecordingSeconds,
onPasteClipboardImage,
onPickFiles,
onPickFolders,
onPickImages,
onRemoveAttachment,
onSteer,
onSubmit,
onThreadMessagesChange,
onEdit,
onReload,
onTranscribeAudio
}: ChatViewProps) {
const location = useLocation()
const activeSessionId = useStore($activeSessionId)
const awaitingResponse = useStore($awaitingResponse)
const busy = useStore($busy)
const contextSuggestions = useStore($contextSuggestions)
const currentCwd = useStore($currentCwd)
const currentModel = useStore($currentModel)
const currentProvider = useStore($currentProvider)
const freshDraftReady = useStore($freshDraftReady)
const gatewayState = useStore($gatewayState)
const gatewaySwapTarget = useStore($gatewaySwapTarget)
const gatewayOpen = gatewayState === 'open'
const introPersonality = useStore($introPersonality)
const introSeed = useStore($introSeed)
const messages = useStore($messages)
const selectedSessionId = useStore($selectedStoredSessionId)
onThreadMessagesChange,
suppressMessages
}: ChatRuntimeBoundaryProps) {
const storeMessages = useStore($messages)
const messages = suppressMessages ? NO_MESSAGES : storeMessages
const runtimeMessageCacheRef = useRef(new WeakMap<ChatMessage, ThreadMessage>())
const isRoutedSessionView = Boolean(routeSessionId(location.pathname))
const showIntro =
freshDraftReady && !isRoutedSessionView && !selectedSessionId && !activeSessionId && messages.length === 0
// Session is still loading if the route references a session we haven't
// resumed yet. Once `activeSessionId` is set (runtime has resumed), the
// session exists — even if it has zero messages (a brand-new routed
// session). The flicker where `busy` flips true briefly during hydrate
// is handled by `threadLoadingState`'s last-visible-user gate.
const loadingSession = isRoutedSessionView && messages.length === 0 && !activeSessionId
const threadLoading = threadLoadingState(loadingSession, busy, awaitingResponse, lastVisibleMessageIsUser(messages))
const showChatBar = !loadingSession
const threadKey = selectedSessionId || activeSessionId || (isRoutedSessionView ? location.pathname : 'new')
const modelOptionsQuery = useQuery<ModelOptionsResponse>({
queryKey: ['model-options', activeSessionId || 'global'],
queryFn: () => {
if (!activeSessionId) {
return getGlobalModelOptions()
}
if (!gateway) {
throw new Error('Hermes gateway unavailable')
}
return gateway.request<ModelOptionsResponse>('model.options', { session_id: activeSessionId })
},
enabled: gatewayOpen
})
const quickModels = useMemo(
() => quickModelOptions(modelOptionsQuery.data, currentProvider, currentModel),
[currentModel, currentProvider, modelOptionsQuery.data]
)
const chatBarState = useMemo<ChatBarState>(
() => ({
model: {
model: currentModel,
provider: currentProvider,
canSwitch: gatewayOpen,
loading: !gatewayOpen || (!currentModel && !currentProvider),
quickModels
},
tools: {
enabled: true,
label: 'Add context',
suggestions: contextSuggestions
},
voice: {
enabled: true,
active: false
}
}),
[contextSuggestions, currentModel, currentProvider, gatewayOpen, quickModels]
)
const runtimeMessageRepository = useMemo(() => {
const items: { message: ThreadMessage; parentId: string | null }[] = []
@@ -301,6 +243,120 @@ export function ChatView({
onReload
})
return <AssistantRuntimeProvider runtime={runtime}>{children}</AssistantRuntimeProvider>
}
export function ChatView({
className,
gateway,
onToggleSelectedPin,
onDeleteSelectedSession,
onCancel,
onAddContextRef,
onAddUrl,
onAttachImageBlob,
onAttachDroppedItems,
onBranchInNewChat,
maxVoiceRecordingSeconds,
onPasteClipboardImage,
onPickFiles,
onPickFolders,
onPickImages,
onRemoveAttachment,
onSteer,
onSubmit,
onThreadMessagesChange,
onEdit,
onReload,
onRestoreToMessage,
onTranscribeAudio
}: ChatViewProps) {
const location = useLocation()
const activeSessionId = useStore($activeSessionId)
const awaitingResponse = useStore($awaitingResponse)
const busy = useStore($busy)
const contextSuggestions = useStore($contextSuggestions)
const currentCwd = useStore($currentCwd)
const currentModel = useStore($currentModel)
const currentProvider = useStore($currentProvider)
const freshDraftReady = useStore($freshDraftReady)
const gatewayState = useStore($gatewayState)
const gatewaySwapTarget = useStore($gatewaySwapTarget)
const gatewayOpen = gatewayState === 'open'
const introPersonality = useStore($introPersonality)
const introSeed = useStore($introSeed)
// PERF: ChatView must not subscribe to $messages — the atom is replaced on
// every streaming delta flush (~30×/s) and a subscription here re-renders
// the entire chat shell (header, chat bar, thread wrapper) per token. The
// runtime that DOES need the messages lives in ChatRuntimeBoundary below;
// this component only needs streaming-stable derivations.
const messagesEmpty = useStore($messagesEmpty)
const lastVisibleIsUser = useStore($lastVisibleMessageIsUser)
const selectedSessionId = useStore($selectedStoredSessionId)
const routedSessionId = routeSessionId(location.pathname)
const isRoutedSessionView = Boolean(routedSessionId)
// The URL points at a session the store hasn't loaded yet (sidebar / cmd-K /
// direct nav). Derived in render so the swap reads instantly: the same frame
// the id changes we drop the old transcript and show the loader, instead of
// waiting for the resume effect (which paints a frame later) to clear them.
const routeSessionMismatch = isRoutedSessionView && routedSessionId !== selectedSessionId
const showIntro = freshDraftReady && !isRoutedSessionView && !selectedSessionId && !activeSessionId && messagesEmpty
// Session is still loading if the route references a session we haven't
// resumed yet. Once `activeSessionId` is set (runtime has resumed), the
// session exists — even if it has zero messages (a brand-new routed
// session). The flicker where `busy` flips true briefly during hydrate
// is handled by `threadLoadingState`'s last-visible-user gate.
const loadingSession = isRoutedSessionView && (routeSessionMismatch || (messagesEmpty && !activeSessionId))
const threadLoading = threadLoadingState(loadingSession, busy, awaitingResponse, lastVisibleIsUser)
const showChatBar = !loadingSession
const threadKey = selectedSessionId || activeSessionId || (isRoutedSessionView ? location.pathname : 'new')
const modelOptionsQuery = useQuery<ModelOptionsResponse>({
queryKey: ['model-options', activeSessionId || 'global'],
queryFn: () => {
if (!activeSessionId) {
return getGlobalModelOptions()
}
if (!gateway) {
throw new Error('Hermes gateway unavailable')
}
return gateway.request<ModelOptionsResponse>('model.options', { session_id: activeSessionId })
},
enabled: gatewayOpen
})
const quickModels = useMemo(
() => quickModelOptions(modelOptionsQuery.data, currentProvider, currentModel),
[currentModel, currentProvider, modelOptionsQuery.data]
)
const chatBarState = useMemo<ChatBarState>(
() => ({
model: {
model: currentModel,
provider: currentProvider,
canSwitch: gatewayOpen,
loading: !gatewayOpen || (!currentModel && !currentProvider),
quickModels
},
tools: {
enabled: true,
label: 'Add context',
suggestions: contextSuggestions
},
voice: {
enabled: true,
active: false
}
}),
[contextSuggestions, currentModel, currentProvider, gatewayOpen, quickModels]
)
// Drop files anywhere in the conversation area, not just on the composer
// input. In-app drags (project tree / gutter) carry workspace-relative paths
// the gateway resolves directly, so they stay inline `@file:` refs. OS/Finder
@@ -353,7 +409,14 @@ export function ChatView({
className="relative min-h-0 max-w-full flex-1 overflow-hidden bg-(--ui-chat-surface-background) contain-[layout_paint]"
{...dropHandlers}
>
<AssistantRuntimeProvider runtime={runtime}>
<ChatRuntimeBoundary
busy={busy}
onCancel={onCancel}
onEdit={onEdit}
onReload={onReload}
onThreadMessagesChange={onThreadMessagesChange}
suppressMessages={routeSessionMismatch}
>
<Thread
clampToComposer={showChatBar}
cwd={currentCwd}
@@ -362,6 +425,7 @@ export function ChatView({
loading={threadLoading}
onBranchInNewChat={onBranchInNewChat}
onCancel={onCancel}
onRestoreToMessage={onRestoreToMessage}
sessionId={activeSessionId}
sessionKey={threadKey}
/>
@@ -387,13 +451,14 @@ export function ChatView({
onSteer={onSteer}
onSubmit={onSubmit}
onTranscribeAudio={onTranscribeAudio}
queueSessionKey={selectedSessionId || activeSessionId}
queueSessionKey={selectedSessionId}
sessionId={activeSessionId}
state={chatBarState}
/>
</Suspense>
)}
</AssistantRuntimeProvider>
</ChatRuntimeBoundary>
{showChatBar && <ScrollToBottomButton />}
<ChatDropOverlay kind={dragKind} />
<ChatSwapOverlay profile={gatewaySwapTarget} />
</div>

View File

@@ -10,11 +10,16 @@ import { useEffect, useMemo, useState } from 'react'
import ShikiHighlighter from 'react-shiki'
import { Streamdown } from 'streamdown'
import { requestComposerFocus, requestComposerInsertRefs } from '@/app/chat/composer/focus'
import { droppedFileInlineRef } from '@/app/chat/composer/inline-refs'
import { HERMES_PATHS_MIME } from '@/app/chat/hooks/use-composer-actions'
import { isAddSelectionShortcut } from '@/app/right-sidebar/terminal/selection'
import { PageLoader } from '@/components/page-loader'
import { translateNow, useI18n } from '@/i18n'
import { readDesktopFileDataUrl, readDesktopFileText } from '@/lib/desktop-fs'
import { cn } from '@/lib/utils'
import type { PreviewTarget } from '@/store/preview'
import { $currentCwd } from '@/store/session'
const SHIKI_THEME = { dark: 'github-dark-default', light: 'github-light-default' } as const
const TEXT_PREVIEW_MAX_BYTES = 512 * 1024
@@ -180,15 +185,13 @@ function looksBinaryBytes(bytes: Uint8Array) {
}
async function readTextPreview(filePath: string) {
if (window.hermesDesktop.readFileText) {
try {
return await window.hermesDesktop.readFileText(filePath)
} catch (error) {
const message = error instanceof Error ? error.message : String(error)
try {
return await readDesktopFileText(filePath)
} catch (error) {
const message = error instanceof Error ? error.message : String(error)
if (!message.includes("No handler registered for 'hermes:readFileText'")) {
throw error
}
if (!message.includes("No handler registered for 'hermes:readFileText'")) {
throw error
}
}
@@ -288,7 +291,7 @@ const MARKDOWN_COMPONENTS = {
function MarkdownPreview({ text }: { text: string }) {
return (
<div className="preview-markdown mx-auto max-w-3xl px-4 py-3 text-sm text-foreground">
<div className="preview-markdown mx-auto max-w-3xl px-4 py-3 text-sm text-foreground" data-selectable-text="true">
<Streamdown components={MARKDOWN_COMPONENTS} controls={false} mode="static" parseIncompleteMarkdown={false}>
{text}
</Streamdown>
@@ -358,6 +361,38 @@ function SourceView({ filePath, language, text }: { filePath: string; language:
startLineDrag(event, filePath, inSelection(line) && selection ? selection : { end: line, start: line })
}
// ⌘/Ctrl+L with a line selection drops the same `@line:path:start-end` ref the
// gutter drag produces — so the keyboard path mirrors dragging the lines into
// the composer. Capture-phase + stopPropagation so it beats the terminal's
// global ⌘L handler (which would otherwise grab the native text selection).
useEffect(() => {
if (!selection) {
return
}
const onKeyDown = (event: KeyboardEvent) => {
if (!isAddSelectionShortcut(event)) {
return
}
const lineEnd = selection.end > selection.start ? selection.end : undefined
const ref = droppedFileInlineRef({ line: selection.start, lineEnd, path: filePath }, $currentCwd.get())
if (!ref) {
return
}
event.preventDefault()
event.stopPropagation()
requestComposerInsertRefs([ref])
requestComposerFocus('main')
}
window.addEventListener('keydown', onKeyDown, { capture: true })
return () => window.removeEventListener('keydown', onKeyDown, { capture: true })
}, [filePath, selection])
return (
<div className="grid min-w-max grid-cols-[auto_minmax(0,1fr)] font-mono text-xs leading-relaxed">
<div className="select-none py-3 text-right text-muted-foreground/55">
@@ -384,7 +419,10 @@ function SourceView({ filePath, language, text }: { filePath: string; language:
)
})}
</div>
<div className="relative [&_pre]:m-0 [&_pre]:px-3 [&_pre]:py-3 [&_pre]:bg-transparent!">
<div
className="relative [&_pre]:m-0 [&_pre]:px-3 [&_pre]:py-3 [&_pre]:bg-transparent!"
data-selectable-text="true"
>
{selection && (
<div
aria-hidden
@@ -448,7 +486,7 @@ export function LocalFilePreview({ reloadKey, target }: { reloadKey: number; tar
if (isImage) {
// Prefer bytes the caller already handed us (a pasted/dropped
// screenshot) over re-reading a path that may be transient/unreadable.
const dataUrl = target.dataUrl || (await window.hermesDesktop.readFileDataUrl(filePath))
const dataUrl = target.dataUrl || (await readDesktopFileDataUrl(filePath))
if (active) {
setState({ dataUrl, loading: false })

View File

@@ -1,11 +1,50 @@
import { act, cleanup, render } from '@testing-library/react'
import { afterEach, describe, expect, it, vi } from 'vitest'
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest'
import { $connection } from '@/store/session'
import { PreviewPane } from './preview-pane'
describe('PreviewPane console state', () => {
beforeEach(() => {
vi.stubGlobal('requestAnimationFrame', (callback: FrameRequestCallback) => window.setTimeout(() => callback(Date.now()), 0))
vi.stubGlobal('cancelAnimationFrame', (id: number) => window.clearTimeout(id))
})
afterEach(() => {
cleanup()
$connection.set(null)
vi.unstubAllGlobals()
})
it('does not watch backend-only remote filesystem previews locally', () => {
const watchPreviewFile = vi.fn(async () => ({ id: 'watch-1', path: '/remote/file.txt' }))
const onPreviewFileChanged = vi.fn(() => vi.fn())
$connection.set({ mode: 'remote' } as never)
vi.stubGlobal('window', {
...window,
hermesDesktop: {
onPreviewFileChanged,
watchPreviewFile
}
})
render(
<PreviewPane
setTitlebarToolGroup={vi.fn()}
target={{
kind: 'file',
label: 'file.txt',
path: '/remote/file.txt',
previewKind: 'text',
source: '/remote/file.txt',
url: 'file:///remote/file.txt'
}}
/>
)
expect(watchPreviewFile).not.toHaveBeenCalled()
expect(onPreviewFileChanged).not.toHaveBeenCalled()
})
it('does not rebuild the pane titlebar group for streamed console logs', () => {

View File

@@ -5,6 +5,7 @@ import { useCallback, useEffect, useRef, useState } from 'react'
import type { SetTitlebarToolGroup, TitlebarTool } from '@/app/shell/titlebar-controls'
import { Tip } from '@/components/ui/tooltip'
import { type Translations, useI18n } from '@/i18n'
import { isDesktopFsRemoteMode } from '@/lib/desktop-fs'
import { Bug } from '@/lib/icons'
import { cn } from '@/lib/utils'
import { notify, notifyError } from '@/store/notifications'
@@ -406,6 +407,7 @@ export function PreviewPane({
useEffect(() => {
if (
target.kind !== 'file' ||
isDesktopFsRemoteMode() ||
!window.hermesDesktop?.watchPreviewFile ||
!window.hermesDesktop?.onPreviewFileChanged
) {

Some files were not shown because too many files have changed in this diff Show More