Compare commits

...

184 Commits

Author SHA1 Message Date
alt-glitch
3723bf5fe6 opentui(v6): defer per-block copy button (carved to its own PR)
The clickable per-block ⧉ copy chip under each message block is split out of
the engine PR (#42922) into its own issue + PR, to keep the engine PR focused.

Removes:
- logic/blockCopy.ts + its unit test (copyBlock + injectable writer)
- the CopyChip component and its two render sites in view/messageLine.tsx
  (flat message text + text parts); the wrapper boxes unwrap back to bare
  <text>/<Markdown>
- the `chips` height accounting in logic/window.ts (estimateMessageHeight +
  partLines added a phantom +1 line per block) and the arg passed from
  view/transcript.tsx; updated window.test.ts / displayModes.test.tsx /
  transcriptWindow.test.tsx expectations accordingly

Unaffected (intentionally kept — core, parity-critical): mouse-selection copy
/ Ctrl+C / copy-on-select (OSC52) in boundary/renderer.ts, and the /copy [n]
command in logic/copy.ts.

Verified: npm run check green (type-check + lint + 813 tests), acceptance greps
clean (no CopyChip/copyBlock/blockCopy left; selection-copy + /copy intact),
and a live tmux smoke confirms no ⧉ copy renders under messages.
2026-06-16 21:03:13 +05:30
alt-glitch
a348fc1ccc Merge remote-tracking branch 'origin/main' into feat/opentui-native-engine 2026-06-16 20:00:44 +05:30
alt-glitch
677680034a Revert "gateway: capture real provider-reported cost (openrouter usage accounting)"
This reverts commit 85546bb9e2.
2026-06-16 19:44:41 +05:30
alt-glitch
222126db1d Revert "gateway: compact /usage with current-session per-model costs"
This reverts commit 364b93a4b9.
2026-06-16 19:43:25 +05:30
alt-glitch
418ceaf8c1 Revert "fix(tui): chrome cost from Nous portal headers only (F3)"
This reverts commit e01b04de46.
2026-06-16 19:42:50 +05:30
alt-glitch
c7e23690e0 Revert "cli: worktree lock + dirty-tree preservation — stop pruning uncommitted work"
This reverts commit 94765e48ff.
2026-06-16 19:42:50 +05:30
alt-glitch
4d8bfa103b Revert "fix(clarify): docstring — put options in choices[] only, never enumerate in question text"
This reverts commit 16e408f3f0.
2026-06-16 19:42:50 +05:30
alt-glitch
9d05f3721d refactor(tui): fetch tree-sitter grammars at runtime instead of vendoring
The OpenTUI engine vendored 10 tree-sitter grammars (.wasm + .scm) under
ui-opentui/parsers/ — ~37k checked-in binary lines, the single biggest
addition in the engine diff. opencode (the production reference) vendors
none: it declares grammars as remote URLs and lets OpenTUI fetch + cache
them. OpenTUI supports this natively via TreeSitterClient's dataPath cache.

Migrate to that model:
- parsers.manifest.json (now under src/boundary/) becomes the URL source of
  truth: each grammar is { filetype, aliases, wasm: <release URL>,
  highlights: <.scm URL> }. Grammar versions stay pinned (same release tags);
  .scm sources follow opencode's per-language choices (parser-repo queries
  for python/html where nvim-treesitter's are parser-incompatible).
- parsers.ts: registerVendoredParsers -> registerRemoteParsers. It points the
  global tree-sitter client's cache at HERMES_TUI_PARSER_CACHE via setDataPath
  BEFORE the client initializes, then addDefaultParsers() with the URL configs.
  Registration does zero network; the fetch is lazy on first use of a language
  and degrades to plain text (never throws) when GitHub is unreachable.
- hermes_cli/main.py sets HERMES_TUI_PARSER_CACHE to
  ~/.hermes/cache/opentui-parsers/ (profile-aware via get_hermes_home).
- git rm -r ui-opentui/parsers/ and drop scripts/update-parsers.mjs.
- parsers.test.tsx asserts URL configs are well-formed + cache-dir behavior
  instead of vendored-file existence.

Verified end-to-end on Node 26.3: type-check + lint clean, full ui-opentui
suite (821 tests) green, and a built smoke proves first-use fetch -> cache ->
10 real highlights, cache-hit on rerun, and graceful plain-text degrade when
the grammar URLs are unreachable.
2026-06-16 19:12:21 +05:30
alt-glitch
23cc009879 Merge remote-tracking branch 'origin/main' into feat/opentui-native-engine 2026-06-16 18:52:34 +05:30
alt-glitch
946d3eaf95 Merge remote-tracking branch 'origin/main' into feat/opentui-native-engine 2026-06-15 15:59:30 +05:30
alt-glitch
bf45aa3a45 opentui(v6): WIP — todo panel + memoryMonitor + mouse/startup-prompt env
Preservation snapshot of three in-progress OpenTUI threads before merging main
(gate-green together: npm run check 821 tests, 0 lint/type errors):
- todo panel (todoPanel/todoTool + App/statusBar/transcript/store wiring,
  ☑ done/total status chip, latestTodos snapshot)
- OpenTUI memoryMonitor (boundary+logic+test; the inverse of the Ink memlog port)
- env: resolveMouseEnabled/envToggle/startupPrompt, HERMES_TUI_MOUSE +
  HERMES_TUI_PROMPT aliases, startup-image attach in main.tsx
- termChrome refactor

Not yet split per-feature; commit boundary is the pre-merge clean point.
2026-06-15 15:59:11 +05:30
alt-glitch
16e408f3f0 fix(clarify): docstring — put options in choices[] only, never enumerate in question text
The model was enumerating options inside the question string (dead prose the UI
can't render as pickable rows). Schema description now spells out: choices[] is
REQUIRED for selectable options; question holds ONLY the question.
2026-06-15 15:58:06 +05:30
alt-glitch
4108fe6014 tui(diag): Ink 1Hz memwatch collector — OpenTUI-compatible memory trace
Ink had no continuous memory trace (only point-in-time heapdumps + a threshold
monitor), so HERMES_TUI_DIAGNOSTICS=1 gave OpenTUI dogfood data with no Ink
equivalent. Port OpenTUI's memlog collector to Ink so both engines emit
byte-identical ~/.hermes/logs/memwatch/<boot>-<pid>.jsonl traces feeding one
memwatch-report.mjs.

- lib/memlog.ts: 1Hz unref'd sampler, {t,rss_kb,heap_used_kb,external_kb}
  (no mounted — Ink has no windowing), 14-day prune, silent-disable on error
- gated by HERMES_TUI_MEMLOG defaulting to the HERMES_TUI_DIAGNOSTICS master
  switch (same as OpenTUI — one export covers both engines)
- wired into entry.tsx alongside the existing monitor; stop on beforeExit
- lib/memlog.test.ts: gate/schema/retention/silent-disable (7 tests)
- docs/ink-env-flags.md (new) + docs/opentui-env-flags.md updated
2026-06-15 15:58:02 +05:30
alt-glitch
b0fb2b8b05 opentui(v6): skins/theming parity — live /skin switch, animated spinner, tool_emojis, status-bar colors
- server resolve_skin() now serializes spinner + tool_emojis (were dropped on
  the wire; neither native engine could use them)
- GatewaySkin schema gains optional spinner/tool_emojis (additive, back-compat)
- theme.ts: SpinnerConfig + parseSpinner (crash-proof) threaded through fromSkin;
  status_bar_* keys now drive statusBg/Fg/Bad/Critical (were hardcoded — all
  dark skins looked identical in the status bar)
- /skin <name> CLIENT slash handler -> config.set -> skin.changed -> live retheme
- composer: imperative ta.textColor/cursorColor on theme change (uncontrolled
  textarea recolor) + slash-token SyntaxStyle re-register
- statusLine: animated face via bounded setInterval armed on running/cleared on stop
- registry/toolPart: skin tool_emojis override tool glyphs
2026-06-15 15:57:57 +05:30
alt-glitch
1ddf7a1021 opentui(v6): bench fixture — HERMES_BENCH_TOOL_BODY_LINES knob for fat tool output
The default lumpy-turn fixture's tool bodies are tiny (2/7/18 short lines), so
W3 (HERMES_TUI_TOOL_OUTPUTS=off) shows no RSS delta at realistic sizes — the
saved bytes sit below the ~20MB run-to-run noise floor. This adds an env knob
to scale every tool result body to N lines, making the retention asymmetry
measurable in the bench (a `find /` / big-file-read class fixture).

UNSET = the original tiny bodies, byte-identical — existing benches and the
determinism digest are unaffected. Set HERMES_BENCH_TOOL_BODY_LINES=N (the
bench harness inherits it at fixture-generation time) for a fat run.

Measured with this (mem300, ~100KB/tool, ~27MB retained tool text): OpenTUI
OFF 244-259MB vs ON 260-261MB vs Ink 269-279MB — i.e. W3 OFF is a real
~5-15MB win at heavy output, and OpenTUI's windowing actually beats Ink at
scale (Ink mounts every row; OpenTUI windows).
2026-06-14 16:12:10 +05:30
alt-glitch
a70f7f3b7b opentui(v6): proactive idle GC, gated on the low-mem heap knob (W2)
OpenTUI-only by design (verified: Ink never calls global.gc proactively — it
only exposes it for heapdumps; spec D5 sanctions the divergence since this is
opt-in). Default / unconstrained sessions do NOTHING.

boundary/proactiveGc.ts: a low-frequency watcher that calls global.gc() only
when ALL hold: (a) the low-mem opt-in is active — HERMES_TUI_HEAP_MB set at or
below 4096 (the same W1 knob; HERMES_TUI_PROACTIVE_GC can force on/off), AND
global.gc is exposed (W1's --expose-gc — else a silent no-op); (b) a turn is
NOT streaming (reads the store's info.running) so it never collects mid-reply;
(c) a full idle window has elapsed since the last activity. Eagerness: once RSS
crosses 400MB the idle window tightens (8s→3s) but STILL waits for idle — never
a mid-stream pause (the jank the campaign fought). Reuses
process.memoryUsage().rss (same read as memlog); the timer is unref'd and every
failure path disables silently. Wired in entry/main.tsx next to startMemlog,
scoped acquire→release.

Tests: gating (low-cap-on / high-cap-off / no-gc-off / explicit on|off) +
timing (fires after the idle window, never while streaming, stop() halts).
proactiveGc.test.ts 10 passed. npm run check OK (785 tests). Verified live that
the gate enables under HERMES_TUI_HEAP_MB=256 and global.gc is callable.
2026-06-14 14:17:50 +05:30
alt-glitch
6e3c393ef9 opentui(v6): configurable V8 heap (HERMES_TUI_HEAP_MB) + --expose-gc (W1)
Low-mem enabler — default unchanged (8192 for both engines), low blast radius.

- Heap knob honored by BOTH engines via the shared NODE_OPTIONS injection:
  HERMES_TUI_HEAP_MB env (highest precedence, matches the HERMES_TUI_ENGINE
  env-first pattern) > display.tui_heap_mb config (minimal early YAML read,
  mirrors _config_tui_engine_early) > the existing cgroup-aware default. The
  override REPLACES the 8192 default inside _resolve_tui_heap_mb (D3): low =
  the low-mem opt-in, high = raise the ceiling. The cgroup-fit 75% clamp still
  applies on top, so an override never exceeds the container. A non-secret
  behavioral setting → config.yaml, NOT the denylisted NODE_OPTIONS bridge.

- --expose-gc added to the OpenTUI argv in _make_opentui_argv (D4, parity with
  Ink which already has it). Must be an argv flag — Node rejects --expose-gc in
  NODE_OPTIONS. Makes global.gc() a real call so the engine's GC hooks
  (/heapdump; W2's proactive idle GC) work instead of silent no-ops. Verified:
  `node --expose-gc -e 'typeof global.gc'` → "function" (vs "undefined").

Tests: TestHeapOverride (env>config precedence, cgroup clamp on a too-high
override, low override honored under a big container, garbage/non-positive
fall-through) + TestExposeGcOnOpenTuiArgv. test_tui_heap_sizing.py 29 passed.
2026-06-14 14:15:32 +05:30
alt-glitch
c7e5215b50 opentui(v6): HERMES_TUI_TOOL_OUTPUTS flag — drop tool-body retention (W3)
The biggest real memory lever: OpenTUI retained full resultText + the raw
result dict + the args dict per tool call, while Ink discards tool bodies
(keeps only a short context line). That retention asymmetry is the bulk of the
Ink-vs-OpenTUI memory gap.

New `HERMES_TUI_TOOL_OUTPUTS` flag (toolOutputsEnabled() in env.ts, default
ON — the rich tool cards are OpenTUI's differentiator). When OFF, the
tool.complete reducer (store.ts) neither BUILDS nor STORES the body: skip the
whole result_text/result stringify+envelope-strip work and suppress
part.resultText / part.result / part.args / part.argsText / part.lineCount /
part.omittedNote. KEPT either way: name, state, duration, error, summary,
argsPreview (the redaction-safe one-liner from tool.start context = Ink's
context line), and the file-edit diff (diffUnified/diffStats — a diff is a
high-value surface, not generic "output"). Render is automatic: with no
resultText/result, defaultRenderer.expandable() is false → header-only row =
Ink parity, no extra view gating needed.

This powers the bench's fair Ink-vs-OpenTUI comparison (D8 — launch OpenTUI
with outputs off so both engines are body-less = pure engine overhead) and the
low-mem mode.

Tests: store retains rich outputs by default; OFF drops the bodies but keeps
name/duration/error/argsPreview/diff. npm run check OK (770 tests).
2026-06-14 13:56:08 +05:30
alt-glitch
25686feebf opentui(v6): bump @opentui 0.4.0 -> 0.4.1 + openConsoleOnError:false (W5)
Bump the three @opentui pins (core/keymap/solid) 0.4.0 -> 0.4.1 + lockfile.
The headline upstream change is native-yoga (#1126); per the locked spec
decision D11 this is NOT a memory-floor lever at typical sizes, and a fresh
bench on this branch confirms it — OpenTUI capped VmHWM is within run-to-run
noise across mem50/100/300 (0.4.0: 193/200/220 vs 0.4.1: 192/218/217 MB).
The value is tail-session layout wins + upstream alignment, not the floor.

D14 (ffiSafe re-verify): the FFI signatures ffiSafe.ts clamps (OptimizedBuffer
fillRect/drawText/setCell/setCellWithAlphaBlending/drawChar + TextBufferView
setViewport) are byte-identical across 0.4.0->0.4.1 (still u32, still crash on
negatives under node:ffi), so the shim stays as-is and remains load-bearing.
Verified live: scrolled a 300-msg transcript with syntax-highlighted code +
tool cards past the viewport top (the negative-y fillRect trigger) on 0.4.1 —
no ERR_INVALID_ARG_VALUE loop, clean render.

Also pick up openConsoleOnError:false (public createCliRenderer option): stops
core's uncaught-error handler from calling the ALLOCATING console.show(), which
exit-7-masks the original error under native-handle exhaustion (the bench
mem3000 postmortem). guardRendererErrorHandlers stays as belt-and-suspenders.

Gate: npm run check OK (768 tests). Determinism gate green both engines.
2026-06-14 13:53:11 +05:30
alt-glitch
8e3b320eb8 opentui(v6): credits/usage notice chrome banner (Ink parity)
The OpenTUI engine received the gateway's credits/usage notices but
mis-rendered them as scrolling inline transcript cards with no lifecycle.
Render them instead as a persistent, level-tinted chrome banner pinned
directly above the status bar, matching the Ink engine — no gateway/agent
changes (the wire + credits policy stay the source of truth).

- backgroundActivity.ts: widen level to include `success` (was silently
  dropped to info) + add isChromeNotice() (kind sticky|ttl) discriminator.
- store.ts: port the Ink turnController notice lifecycle — showNotice/
  applyNotice/clearNotice/flushPendingNotice/clearNoticeState, a single TTL
  timer (latest-wins, id-guarded), mid-turn hold + turn-end reveal (the
  three end sites: message.complete, gateway.exited, error), flash-and-yield
  for credits.usage/grant_spent at message.start, and a notice reset on
  clearTranscript + commitSnapshot so it can't bleed across sessions. Route
  notification.show by kind: sticky|ttl -> chrome banner, everything else
  (process/background completion cards) -> existing inline path, unchanged.
  Distinct clones for notice vs lastNotification (createStore aliasing).
- noticeBanner.tsx + App.tsx: a single sticky row above the status bar,
  text rendered verbatim (already glyphed by the policy), tinted by level,
  width-truncated so it can never wrap and push the composer.

Tests: statusNotice.test.ts (lifecycle/routing/TTL/flash-and-yield),
noticeBanner.test.tsx (render/color/truncation), backgroundActivity +
render additions. npm run check OK (768 tests).
2026-06-14 12:56:21 +05:30
alt-glitch
f5823277dc opentui(v6): clarify markdown + cold slash-highlight + @-mention race fix
Three user-reported TUI fixes:

- clarify prompt rendered raw markdown (literal **bold** / `code`). The
  question + each choice now go through the native <markdown> renderable
  (same engine as the transcript) in a flex column so wrapping + the
  selection accent are preserved. Tests assert structural chrome since
  tree-sitter markdown doesn't paint in the headless renderer (same
  limitation as render.test.tsx); painted markdown verified in a live smoke.

- a leading `/path` first message broke @-mention completion afterward:
  onType fired completion RPCs per keystroke with no out-of-order guard,
  and the transport doesn't guarantee in-order delivery, so a slow orphaned
  complete.slash could land after a later @-mention complete.path and blank
  the dropdown. Add createCompletionGate (pure) — claim() per keystroke,
  isCurrent(token) drops any superseded response.

- slash-command highlighting was hit-or-miss (only highlighted a /command
  if its completion batch had been browsed earlier): LEARNED_NAMES started
  empty and grew lazily. Seed it once at boot from the full uncapped
  commands.catalog via seedLearnedNames, so a cold /command highlights on
  the first keystroke.

npm run check OK (768 tests).
2026-06-14 12:56:21 +05:30
alt-glitch
33924c074c docs(opentui): point bench references at the tui-bench repo
bench/ moved to github.com/NousResearch/tui-bench; repoint the lingering
references in dev-handoff, env-flags, memory-story, ui-opentui README, and
the memlog/memSampler/reconciler source comments.
2026-06-14 10:56:08 +05:30
alt-glitch
21c64f90aa feat(tui_gateway): emit notification.show on background-process completion (Option B)
A notify_on_complete background process (the agent's terminal tool, proc_*) only
reached the TUI as the AGENT'S NARRATION — the completion is fed to the model as a
synthetic prompt (_run_prompt_submit) and the gateway emitted only message.start +
the reply, never the completion itself. So the OpenTUI card mechanism had nothing
to render and the synthetic turn read as a context-less line.

Additive fix (glitch-approved relaxation of the no-core rule for this one emit):
new _emit_process_completion_card() fires a `notification.show` (text "<cmd> exited
<code>", level info/warn by exit code, kind process.complete, key proc:<id>) at the
two completion-delivery sites, just before the agent turn. The OpenTUI engine
renders it as a distinct inline card (P1) + an OSC ping; Ink treats it as a notice.
Completion events only (watch matches skipped); call-site dedup → one card per exit.
No existing behavior changes — the agent turn still happens.

Gateway suite green (321 passed incl. 5 new TestProcessCompletionCard cases).
2026-06-14 08:55:29 +05:30
alt-glitch
8e853e3ff8 opentui(v6): fix /bg — it launches a background PROMPT, not the process panel
I conflated two "background" concepts. In hermes, /bg (aliases /background, /btw)
launches a background PROMPT (prompt.background → background.complete), and the
`bg: N` badge counts in-flight prompt tasks — but P3 hijacked /bg + bg: for the
OS-process registry and never handled background.complete (so completions were
silent). Corrected:

- /bg <prompt> now launches a background prompt (Ink parity): prompt.background,
  echoes "bg <id> started", tracks the task in store.bgTasks.
- background.complete → drops the task + renders a distinct inline completion
  CARD with the result (the missing completion notification; a completion-ish
  kind also fires the OSC desktop ping).
- `bg: N` badge counts store.bgTasks (in-flight background prompts), not OS procs.
- the OS-process panel moved off /bg to /processes (+ /procs); its header count
  uses runningCount. Dropped the ambient agents.list poll — the badge is now
  event-driven and the panel fetches on open.

Gate green; new store test for background.complete (card + badge decrement).
2026-06-13 22:44:02 +05:30
alt-glitch
76eab10b14 opentui(v6): simplify the background-activity work (dedup + dead-code removal)
/simplify pass over this session's diff (4 cleanup agents → applied the high-value
findings):
- reuse: extract the duplicated truncRight/truncLeft (statusBar/agentsDashboard/
  backgroundPanel) to logic/truncate.ts; export DONE_STATUSES/procIsRunning from
  backgroundActivity and drop backgroundPanel's re-declared copy.
- simplification: remove dead/speculative exports (upsertNotification,
  clearNotificationsByKey, BackgroundRun) — the campaign models notifications as
  Message rows, not a notifications array — and drop runningCount's always-empty
  `runs` param; collapse notificationDispatcher's always-true `card` field into a
  plain notificationOsc(): TermNotification | undefined.
- efficiency: the bg-process poll now idles at 30s (was a flat 8s) and tightens to
  8s only when something is running — most sessions have zero background processes.
- altitude: the ` agents` chip joins the statusSegments width-ladder (was an inline
  width gate) so all bar segments share one drop policy; refreshed the stale
  statusBar header (bg: is wired now, not "reserved").

Net −95 LOC. Gate green. Skipped (noted): statusColor merge (divergent domains),
the pushNotification double-clone (necessary — guards the Solid aliasing footgun),
moving the notification-card dispatch out of messageLine, and the Message-row model
(deliberate "inline in transcript" design).
2026-06-13 22:30:27 +05:30
alt-glitch
5c5a1fec4b opentui(v6): background-activity P4 — fold the agents tray line into a status-bar chip
Input-zone density (rpiw): the agents tray kept a persistent collapsed line under
the composer (` N agents running — ↓ to inspect`), stacking with the status bar +
composer. The running count now lives in a status-bar ` N` chip (next to bg:/mcp:),
and the tray renders NOTHING when collapsed — one fewer persistent line under the
transcript. The tray stays mounted + focusable, so composer-Down still hands focus
over and expands it into the rows (focus-routing tests unchanged + green).

Gate green; agentsTray tests repointed from the removed tray line to the chip; the
Down/Esc/printable focus-routing coverage is intact.
2026-06-13 22:18:08 +05:30
alt-glitch
7016fa4902 opentui(v6): background-activity P3 — /bg process panel + ambient bg badge (no core)
The OS process registry (the qxpe "Claude Code background process" gap) had NO
surface in the TUI. Now:
- /bg (aliases /background, /jobs) opens a Background Processes panel listing the
  registry from agents.list — per-process command + uptime + status, running
  count, and a single STOP-ALL action (x → process.stop; the gateway exposes
  kill_all only, so there's no per-row kill — noted in the panel).
- the reserved status-bar `bg: N` badge (A) now shows the running-process count,
  fed by a slow (8s) scoped poll of agents.list so it stays live with the panel
  closed; hidden at zero.

Background *runs* are intentionally NOT duplicated here — they're already the
resume picker's active-sessions tab; this panel targets the process registry,
the actual gap. All TUI-only (agents.list + process.stop already exist). Gate
green; new backgroundPanel.test (parse + list + running-count + empty state).
2026-06-13 22:09:11 +05:30
alt-glitch
74cb03423e opentui(v6): background-activity P2 — de-crowd the agents dashboard + typed trace
The agents dashboard (rplj pain) dumped each subagent's full multi-line prompt
into the master list, wrapping it into a wall of text and squeezing the trace.
Now each master row is ONE line: status + a width-budgeted truncated goal + model.
The detail pane still shows the full goal (the inspect half) and renders the
activity as a TYPED transcript instead of flat lines: SubagentInfo.trace is now
TraceEntry[] ({kind:'start'|'tool'|'progress'|'summary'}), drawn with per-kind
glyph+color (▶ start /  tool accent / · progress muted / ✓ summary green).

No foregrounding (kept subagent UX unchanged — inspection only, per the brainstorm).
TUI-only. Gate green; new agentsDashboard.test.tsx asserts one-line truncation +
typed-trace render; store.test trace assertion updated to the typed shape.
2026-06-13 21:56:02 +05:30
alt-glitch
5988e21ed7 opentui(v6): background-activity P1 — inline notification cards + OSC (no core change)
The TUI now consumes notification.show/clear gateway events (it dropped them
before — they leaked into the transcript as plain model-output-looking lines,
the qxpe pain). They render as a distinct, level-tinted inline card (role
'notification'): gold ◆ for info, amber for warn, red for error — clearly chrome,
not the agent. Important ones (error/warn/'complete'|'done'|'finish' kinds) also
fire the EXISTING focus-gated OSC desktop notification via termChrome.

Shared substrate (pure, unit-tested) for the rest of the campaign:
- logic/backgroundActivity.ts — parse notification.show/agents.list payloads,
  dedupe-by-id upsert, clear-by-key, runningCount (badge, used in P3).
- logic/notificationDispatcher.ts — card-always + OSC-when-important decision.
- store: notification.show → pushNotification (distinct clones to avoid Solid
  createStore reference-aliasing), notification.clear → drop matching cards,
  lastNotification → OSC seam (terminalChrome).

All TUI-layer; builds only on events/RPCs the gateway already emits. Gate green
(737 tests incl. new unit + frame coverage); verified live in tmux.
2026-06-13 21:44:27 +05:30
alt-glitch
965226fd52 docs: spec for OpenTUI background-activity (agents inspection + background panel + notifications)
Brainstormed design (glitch 2026-06-13). TUI-only, no core gateway/agent changes —
builds on existing events (notification.show/clear, background.complete, subagent.*,
agents.list, process.stop, session.interrupt). Approach 1: shared substrate +
two surfaces + multi-channel notifications (inline card + ambient badge + OSC) +
input-density pass. Phased P1–P4 with per-phase gates.
2026-06-13 21:27:06 +05:30
alt-glitch
353a8c1c8f opentui(v6): composer/transcript UX polish from dogfood feedback (glitch 2026-06-13)
Three Tier-1 fixes from live use:

- bare `/` hydrates the full command menu again (reverses F1's "name char
  first" gate). The lead-token grammar still rejects `/abs/path` (F2) and a
  `/ ` trailing-space is still not arg-completion on an empty name.
- `!cmd` shell mode now reads unmistakably: the composer glyph flips ❯ → `$`
  in the alert (warn) color and an amber "shell mode — Enter runs this in your
  shell (no model turn)" note rides the slot the slash/path dropdown would use
  (they never coexist). New optional `brand.shellPrompt` ($), skin-overridable.
- the transcript scrollbox reserves a 1-cell right gutter (contentOptions
  paddingRight) so the vertical scrollbar no longer paints OVER hard-width
  content — markdown table / code-block right borders were clipped under it.

Gate green (714 tests); F1/F2/F7/F8 slash specs + the slashMenu frame test
updated to the new bare-/ behavior. Verified live in tmux.
2026-06-13 20:43:16 +05:30
alt-glitch
5747d9a2d8 docs: mark opentui composer-ux batch SHIPPED (F1–F10, decisions D1/D2) 2026-06-13 19:17:41 +05:30
alt-glitch
e01b04de46 fix(tui): chrome cost from Nous portal headers only (F3)
The status-bar cost segment must show cost ONLY when running against the Nous
portal — per-model cache/input/output pricing is unreliable across the model
long tail, so a guessed figure is worse than none.

- New nous_header_cost_usd(agent): the chrome cost source, derived ONLY from the
  x-nous-credits-* header delta (deliberately ignores the OpenRouter usage.cost
  accumulator). _get_usage now uses it for cost_usd, so a non-Nous session
  reports no cost and the TUI hides the segment.
- The /usage accounting page is unchanged in spirit: it now reads
  real_session_cost_usd(agent) directly (both provider-reported sources) instead
  of the chrome-narrowed _get_usage cost_usd, so OpenRouter cost still shows there.

Tests: new TestNousHeaderCost (header-only, OR-accumulator ignored, clamp,
no-method); updated gateway _get_usage tests for the chrome narrowing; /usage
page test still asserts the full provider-reported figure. 316 gateway + 25 cost
tests green.
2026-06-13 19:17:41 +05:30
alt-glitch
ef9232a2f7 opentui(v6): paste-while-unfocused + clarify prompt rewrite (F4/F5/F6)
- F4: a paste while the composer is unfocused (transcript scrollbox grabbed
  focus) now lands — a renderer-level paste listener focuses the textarea and
  applies the bytes; the focused path stays the textarea's own onPaste (no
  double insert). Paste logic shared via applyPaste(text, native).
- F5: clarify prompt rewritten off the native <select> onto a custom list —
  long options WRAP instead of clipping, options are numbered, the selected
  row gets a real background + accent (three signals), and the custom answer
  is an always-present inline <input> in the same screen.
- F6: Up/Down/Enter are preventDefault'd so arrows drive selection and never
  leak to the transcript scrollbox.

Verified live via tmux screenshot (wrapping + numbering + highlight + inline
input all correct). 714 tests green; new clarifyPrompt.test.tsx covers wrap,
numbering, selection, inline custom input, no-choices, Esc cancel.
2026-06-13 19:17:41 +05:30
alt-glitch
5268027e6b opentui(v6): composer UX batch — slash trigger, @-mentions, !bash, right-pinned cwd
- F1: slash menu opens only after a name char (bare / no longer fires)
- F2: /abs/path is no longer mistaken for a slash command (lead token must match NAME_RE)
- F7/F8: completion survives newlines — computed at the cursor token, not whole-buffer bail
- F8b: @ is the only file/dir mention trigger (~ / ./ / bare paths dropped)
- F9: !cmd runs a shell command via gateway shell.exec (Ink parity), output as a system line
- F10: cwd is right-pinned on the chrome bar so dirname+branch hug the right edge

planCompletion is now cursor-aware (onType threads ta.cursorOffset). classifySubmit
extracted as a pure, tested router. 708 tests green.
2026-06-13 19:17:41 +05:30
alt-glitch
b6598017c8 docs(handoff): concrete tmux-pane-screenshot usage + note skills are TUI-reachable 2026-06-13 19:11:06 +05:30
alt-glitch
5af3a81490 docs: OpenTUI dev handoff — base operating manual for continuing memory+UX on the canonical branch 2026-06-13 18:43:30 +05:30
alt-glitch
7b7ab279f2 fix(tui): opentui launches when fnm default is older than 26.3; chrome bar reads the real cwd
Two reasons the local TUI stopped running OpenTUI / showed the wrong directory:

1. Node resolution. OpenTUI needs Node >= 26.3 (node:ffi floor), but
   _node26_bin_or_none only checked HERMES_NODE + `which node`. When fnm's
   default flips to an older line (e.g. v25.9) the active node fails the gate
   and the engine silently falls back to Ink even though a usable v26.3 sits
   installed. _fnm_node26_candidates now discovers fnm's installed versions
   (FNM_DIR / XDG_DATA_HOME/fnm / ~/.local/share/fnm / macOS Library path),
   newest first, version-probed — so the engine launches without the user
   re-aliasing their global default.

2. Launch cwd. The launcher runs the engine with cwd=<engine package dir> so
   its build/resolution works; the gateway it spawns then auto-detected THAT
   dir as the workspace (chrome bar showed 'ui-opentui (feat/opentui-native-
   engine)' regardless of where you ran hermes). TERMINAL_CWD — the gateway's
   canonical launch-dir channel — was only exported in worktree mode; now it's
   set to the real cwd for every launch (worktree mode still overrides to the
   worktree path). The TUI's session.create no longer sends process.cwd() (the
   engine dir) — a new launchCwd() reads the launcher's HERMES_CWD/TERMINAL_CWD,
   falling back to process.cwd() only for standalone smokes.

Together: session cwd, chrome bar, terminal-tool cwd, and /sessions grouping
all anchor to where you actually ran hermes. Verified live — chrome bar shows
'/tmp/cwd-probe (my-feature)' launched from there with fnm default on v25.9.

8 new tests (fnm discovery order/precedence/empty-safety; launchCwd env
precedence).
2026-06-13 13:24:54 +05:30
alt-glitch
01669f2f12 opentui(v6): /sessions groups this directory's sessions first + TUI persists its cwd
The resume picker never had cwd grouping — deliberately deferred in the v1
spec because TUI session rows had no cwd to group by: the TUI's
session.create sent only {cols}, so explicit_cwd stayed false and
_ensure_session_db_row skipped cwd stamping by design (the desktop's launch
dir is meaningless — 'No workspace' grouping is its desired default).

In a terminal the launch directory IS the workspace choice, so the entry now
passes cwd: process.cwd() at session.create — the existing explicit-workspace
machinery persists it to the session row on first message (covered by
test_ensure_session_db_row_persists_explicit_cwd; zero gateway changes).

Picker: while browsing (no search), sessions whose cwd matches the TUI's
current directory order first under a '▾ this directory (N)' caption, the
rest under '▾ other directories' — one flat reordered list, so selection/
windowing/load-more math is untouched, and captions are pure render
decoration keyed off hereCount. During search the fuzzy score keeps owning
the order. Trailing-slash-normalized comparison, no fs calls.

Old sessions can't be backfilled (their cwd was never recorded); coverage
accumulates from here. 6 new tests (pure ordering edges + grouped frames,
search-drops-grouping, no-cwd passthrough).
2026-06-12 23:50:59 +05:30
alt-glitch
338b5275be opentui(v6): syntax highlighting for 10 more languages (vendored tree-sitter grammars)
@opentui/core@0.4.0 bundles only 5 grammars (ts/js/markdown/markdown_inline/
zig) and Hermes registered none of its own — Python/Rust/Go/bash/JSON/C/HTML/
CSS/YAML/TOML tool bodies and fences rendered plain text (never a regression:
no addDefaultParsers existed anywhere in branch history).

Now: parsers/manifest.json curates the 10 grammars (cpp deliberately dropped —
3.28MB alone); scripts/update-parsers.mjs vendors wasm+highlights.scm with
magic/content validation (plain Node fetch — core's update-assets generator is
Bun-flavored and its import-module won't bundle under esbuild, so registration
skips it and points at the vendored files by runtime-resolved path instead);
boundary/parsers.ts registers via the public addDefaultParsers() at entry
module load, before the first <code>/<markdown> mount initializes the global
tree-sitter client. ~4MB vendored, committed (build inputs, offline-safe).

Markdown fence injections need no infoStringMap: fence labels resolve as
filetype ids and core's ext maps already normalize py→python, zsh→bash, h→c.
Live-smoked in a real renderer: python tool body draws 6 distinct token
colors; ```python and ```yaml fences inside markdown highlight too. 6 new
tests pin the wiring (vendored assets valid, registration set, filetype
routing); visuals stay live-smoke territory per codeBlock.tsx.
2026-06-12 13:26:17 +05:30
alt-glitch
ef94562125 opentui(v6): terminal window title (OSC 0/2) + waiting-on-you notifications (OSC 9/99/777)
Window title: a render-nothing <TerminalChrome> tracks session.info — the
native renderer.setTerminalTitle (frame-safe, zig-side OSC emit) shows
'{session title} — Hermes' once the session is titled, 'Hermes Agent' until
then. The user's previous title is bracketed with the XTWINOPS title stack
(save on boot, best-effort restore on quit). Gateway: _session_info now
carries the live title (DB row, pending_title fallback) and a session.info
refresh follows every title change — pending-title application, the
auto-title worker landing (via maybe_auto_title's title_callback), and
session.title renames — so the window retitles without waiting for the
next turn.

Notifications: when the TUI starts waiting on the user — any blocking
prompt (clarify/approval/sudo/secret/confirm) or turn completion — three
dialect sequences go out through renderer.writeOut: OSC 9 (iTerm2/wezterm),
OSC 99 (kitty), OSC 777 (urxvt/foot); terminals ignore what they don't
speak. Suppressed while the terminal reports focused (core's mode-1004
focus/blur events; until a first blur proves reporting works, notify
unconditionally). HERMES_TUI_NOTIFY=0/false/off kills notifications; the
title is not gated. All text is OSC-sanitized (control chars stripped,
777's semicolon fields spliced-proof, length-capped).

13 new TUI tests (pure shaping/sequences/env gate + store-edge wiring via
an injected seam) and 2 gateway tests (title resolution order, thread-safe
refresh emitter). Live-smoked: tmux pane_title shows 'Hermes Agent' from
the native title path.
2026-06-12 13:09:23 +05:30
alt-glitch
3616b813ec opentui(v6): fleet memory self-sampling — HERMES_TUI_MEMLOG + memwatch-report aggregator
Instead of an external watcher chasing 5-10 concurrent session pids,
every TUI samples ITSELF at 1Hz (rss/heap/external + windowing
mounted/peak-mounted counters) into ~/.hermes/logs/memwatch/, gated by
HERMES_TUI_MEMLOG (defaults to the HERMES_TUI_DIAGNOSTICS master
switch) — one shell-rc export covers every session a dev ever starts.
Unref'd timer, every failure path silently disables, 14-day retention.
bench/memwatch-report.mjs aggregates the fleet: per-session
baseline/peak/last, steady-state MB/h slope, peak mounted rows, and
SLOPE/PEAK/MOUNTED anomaly flags. Verified live: two fake-gateway
smoke sessions logged and aggregated (102MB base, mounted ≤60).
2026-06-12 19:24:00 +05:30
alt-glitch
e1067dbbe5 tests: pin ink engine in _make_tui_argv npm-bootstrap tests (post-merge semantic fix)
Main's rewritten test_tui_npm_install.py tests call _make_tui_argv expecting
the Ink/npm flow unconditionally; with the dual-engine dispatch merged in,
_resolve_tui_engine() auto-selects opentui whenever ui-opentui/dist is built
in the repo, routing the call away from the path under test (first subprocess
became 'node --version' instead of 'npm run build'). Pin the engine to ink
via an autouse fixture, mirroring the existing pinning precedent in
test_tui_resume_flow.py.
2026-06-12 10:32:40 +05:30
alt-glitch
ab37440ce6 opentui(v6): diagnostics-gate review nits — completion-mechanism precision + client-only design note 2026-06-12 09:28:15 +05:30
alt-glitch
cf3002664b opentui(v6): HERMES_TUI_DIAGNOSTICS master switch — gate /mem, /heapdump + window-stats default
Regular users get zero diagnostic surface by default: /mem and /heapdump
disappear from /help and completion, and invoking them prints the
one-line enable hint (relaunch with HERMES_TUI_DIAGNOSTICS=1) instead of
executing — an enable switch, not a secret. With the switch on, the
commands work as before and HERMES_TUI_WINDOW_STATS defaults on (still
individually settable either way). Full env-flag ledger (master switch /
user config / dev tuning / internal plumbing) in docs/opentui-env-flags.md.
672 tests exit 0.
2026-06-12 09:17:25 +05:30
alt-glitch
fc8d5f203a opentui(v6): post-rebase fixups — dedup probe mouse, .demo lint ignore, cap tests to windowing-aware contract
Rebasing onto fcf49f313 (multi-click selection) collided in the test
probe (both sides added 'mouse' — deduped) and surfaced two test debts
the cap-restore commit (3cc56517a) had shipped masked by a piped exit
code: store cap tests still asserted the 1000 default (now: 3000
windowed, 1000 with HERMES_TUI_WINDOWING=0, both covered), and the
burst-interplay test relied on the old cap trimming a 1500-row burst
(now pins HERMES_TUI_MAX_MESSAGES=1000 explicitly for both stores).
Also: .demo/ build artifacts excluded from typed linting. 669 tests
exit 0 (verified unpiped). Multi-click selections flow through the
same renderer selection seam, so windowing's drag-freeze + row-pinning
covers them with no changes.
2026-06-12 08:40:53 +05:30
alt-glitch
c5806b9ad9 docs: upstream alignment playbook — forkless invariant, shim ledger, upgrade contract
Maintainer signals (native yoga next release, 2x layout, opencode's
100-cap was a legacy perf workaround): what changes for us (WASM ratchet
dies), what doesn't (the 65k handle table still makes windowing
load-bearing at 3000 rows), the boundary/ shim ledger with
delete-on-upstream-fix criteria, and the per-release upgrade playbook
that uses the bench suite as the acceptance contract.
2026-06-12 08:29:24 +05:30
alt-glitch
b145607029 docs: the OpenTUI memory story — ELI5 walkthrough of the 686MB→300MB campaign
Shareable explainer: every primitive at play (native handle table, Yoga
WASM grow-only memory, renderables, Solid surgical unmount, V8 GC
laziness, scrollbox draw-only culling) and every decision (windowing vs
store-cull, exact-heights-at-unmount vs Ink's estimate-correct, the
correctionIsLegal zero-jank law, append-time adjudication, never-window
rules, windowing-aware cap restore, heap right-sizing), with the
measured scoreboard and honest open items.
2026-06-12 08:29:24 +05:30
alt-glitch
4a3b755162 opentui(v6): restore scrollback cap 1000 → 3000 under windowing (#27 payoff)
With transcript windowing (S1+S2) the mounted set no longer scales with
the store (peak 31 rows over a 1500-row burst), so the handle-table
clamp that forced 1000 rows is unnecessary when windowing is on. The
ceiling is now windowing-aware: 3000 rows (the originally-shipped
default, regression documented in opentui-fixes-audit.md §2) with
windowing, 1000 with HERMES_TUI_WINDOWING=0 (every row mounts again).

Measured at the restored cap (full 3000-msg store): mem3000 360MB peak
styled end-to-end (pre-campaign: ~870MB + unstyled past ~1,400 rows;
before that: crash). scroll3000 p50=2 p90=3 p99=8 max=17ms (Ink same
workload: p90=35 p99=96). Gate digest unchanged.
2026-06-12 08:29:24 +05:30
alt-glitch
16dbcbe85d opentui: Node 26 onboarding — scoped .node-version, engines floor, README setup guide
Pins Node 26.3 to ui-opentui/ only (fnm/mise auto-switch on cd; leaving
the directory restores whatever the dev had — no global default change).
engines.node >= 26.3 makes a wrong-Node npm ci warn. README covers
install paths (fnm/mise/nvm/absolute-binary), the ABI-locked
node_modules gotcha, and build/run commands.
2026-06-12 08:29:24 +05:30
alt-glitch
c04aaecb51 opentui(v6): windowing S2 — pin selected rows instead of freezing on a lingering highlight
Adversarial-review follow-up to the S2 slice. The S1 rule froze ALL window
recomputes while renderer.getSelection()?.isActive — but a finished mouse
selection persists by design (boundary/renderer.ts keeps the highlight so
Ctrl+C can re-copy), so a long streaming turn behind a lingering highlight
ballooned the mounted set exactly like pre-windowing (and permanently
ratcheted the Yoga-WASM high-water).

Refinement:
- full freeze only while selection.isDragging (the native walk touches the
  live tree on every drag update — destroying a row mid-walk corrupts the
  highlight; unchanged from S1 where it matters),
- a finished highlight instead PINS the rows containing
  selection.selectedRenderables (parent-climb to the row wrapper via a
  WeakMap) as neverWindow — the highlight and a later Ctrl+C copy stay
  byte-exact while everything else keeps windowing,
- an active highlight counts as activity (no idle measure churn under it).

Test (headless mock-mouse drag): finished selection persists (isActive,
!isDragging) → 300-row burst keeps peakMounted < 120 AND
getSelectedText() returns the identical text afterward, the selected rows
having been pinned while long scrolled past the margin.

Verified on this build: gate digest otui-capped d5e9558583159eac… (2/2),
mem2000 otui-capped windowing-ON vmhwm 312MB (target ≤ 350), scroll2000
otui-capped p50 2.0ms / p99 6.0ms (gate ≤ 17ms). check exit 0 (648 tests).

Review verdicts on the remaining findings (verified against core source):
- "scrollTop compensation race": rejected — scrollTop is an imperative
  scrollbar property (no signal staleness); records fire in document order,
  each compensation immediately visible to the next.
- "heights map leak on /new": rejected — the countChanged cleanup prunes
  every per-key map against the live key set (test-verified).
- remount-in-viewport estimate shift: only reachable when one frame jumps
  past the margin (> 1 viewport); the design's accepted "remounted for
  view" path — documented in the header.
- expanded tool/reasoning re-collapse on far-remount: S1-accepted,
  deferred (component-local state; out of S2 file ownership).
2026-06-12 08:29:24 +05:30
alt-glitch
eaa069e322 opentui(v6): transcript windowing S2 — append-time adjudication + windowed resume + edge measure
S2 of docs/plans/opentui-transcript-windowing.md (#27), behind
HERMES_TUI_WINDOWING (OFF path renders the byte-identical legacy tree).

Append-time adjudication: the window now recomputes on transcript GROWTH,
not just scroll — a createComputed on messages.length re-windows
synchronously per append, and while pinned at the bottom computeWindow
anchors to the cumulative content BOTTOM (pinnedBottom) instead of the
stale pre-layout scrollTop, so burst-appended rows are spacer-swapped the
moment they pass the margin. The frame driver additionally treats a
≥ ¼-viewport scrollHeight change (streaming growth) like scroll movement.
Unseen-row default changed from "always mounted" to "mounted iff created
streaming or within the bottom-30" — live rows still paint instantly with
zero added latency; a bulk commitSnapshot (resume) mounts ONLY the bottom
window and everything above starts as line-count-estimate spacers (chip-
and-spacing-aware estimateMessageHeight).

Spacer corrections (zero-jank rule): when a measure lands a height
different from what the spacer occupied, the wrapper's onSizeChange fires
inside the layout traversal, pre-paint. Pinned at bottom the scrollbox's
own sticky re-pin (content onSizeChange runs before the row wrappers')
already compensated — verified by test; otherwise scrollTop is compensated
same-frame for rows fully above the viewport (correctionIsLegal). Frames
stay byte-stable across corrections in both pinned and mid-history tests.

Lazy exact-measure (design §4 — the simple choice, documented): no true
offscreen layout exists in @opentui/core, so an idle pulse (no appends,
no scroll, no turn, no selection for HERMES_TUI_WINDOW_IDLE_MS≈1s) mounts
MEASURE_BATCH_ROWS=10 never-measured rows nearest the bottom window edge
(edgeMeasureBatch), records exact heights (incl. a direct post-layout pull
for rows whose mount changed nothing — no onSizeChange fires), and the
next recompute swaps them back to now-exact spacers. Scrolling itself
still measures the margin band.

DEV counter: windowRowStats (current/peak simultaneously-mounted rows),
exposed on globalThis behind HERMES_TUI_WINDOW_STATS; tests assert it.

Measured (this build, 39f9f433e+S2):
- check: exit 0 (647 tests / 39 files; +11 pure window cases, +4 headless)
- peak mounted: 31 rows over a 1500-row burst; 30 rows on a 600-row
  resume snapshot (bound asserted < 120)
- gate digest: otui-capped d5e9558583159eac… — byte-identical, 2/2 reps
- mem2000 (otui-capped, windowing ON, 8GB heap): vmhwm 300MB
  (S1 same-heap 518MB; S1 right-sized-heap 427MB; Ink 229-239MB;
  target ≤ 350MB)
- scroll2000 otui-capped: p50 2.0ms / p99 5.0ms / max 18ms
  (gate ≤ 17ms p99; S1 baseline p99 15ms)

Known S2 limits (deferred to S3, design §5): /compact·/details toggles and
width resizes leave out-of-window spacer heights stale until remount or
the idle march; expanded-body state above the window may re-collapse on
remount (S1-accepted).
2026-06-12 08:29:24 +05:30
alt-glitch
2d7616121b opentui(v6): transcript windowing S1 — exact-height spacers behind HERMES_TUI_WINDOWING
Core machinery of docs/plans/opentui-transcript-windowing.md (#27): rows
outside [scrollTop − viewport, scrollTop + 2·viewport) swap to an exact-height
empty <box> (1 yoga node, no text buffers / native handles), so the mounted
set stays ~3 viewports regardless of transcript length.

Flag: HERMES_TUI_WINDOWING — unset → ON; 0/false/no/off → OFF (envFlag
semantics, the bench A/B + one-env escape hatch). OFF renders the exact
legacy tree (no wrapper boxes).

Pieces:
- logic/window.ts (pure, table-tested): computeWindow (viewport ± 1-viewport
  margin intersection over cumulative exact heights; null heights fall back
  to a per-row line-count estimate), hysteresisFor/shouldRecompute (≥ ¼
  viewport between recomputes), correctionIsLegal (the jank rule: corrections
  only fully above the viewport with same-frame scrollTop compensation, or
  fully below it), estimateMessageHeight (line-count estimate; wrong values
  are fixed by remount only — S1 never corrects a spacer in place).
- view/transcript.tsx: per-row measuring wrapper records exact heights via
  onSizeChange (only while the real row is mounted); window driver is a
  renderer frame callback (setFrameCallback — scroll always renders, so no
  extra timer) publishing the mounted set through one signal + createSelector
  so only flipped rows re-render. Stable row keys via WeakMap<Message, n>
  (messages have no id; store proxies are reference-stable). Solid <Show>
  unmount destroys the row's renderables (@opentui/solid _removeNode →
  destroyRecursively).

Never-window rules:
- streaming rows (remount would restart native markdown streaming),
- the last row while a turn is running (deltas land there),
- the bottom 30 rows (fixed K — sticky-bottom region; rows under
  viewport+margin are mounted by the window calc anyway),
- rows the window has never adjudicated default to MOUNTED (new live rows
  paint instantly),
- the whole window FREEZES while a mouse selection is active
  (renderer.getSelection()?.isActive — a swap would destroy highlighted
  renderables under the native selection walk).

Tests: 30 pure window.test.ts cases + 2 headless integration cases
(transcriptWindow.test.tsx) pinning the zero-jank invariant (scrollHeight
identical ON vs OFF), the renderable shedding, and remount-on-scroll-back.
2026-06-12 08:29:24 +05:30
alt-glitch
8afb7bc570 opentui(v6): double-click word / triple-click line selection with held drag-extend
Editor-grade mouse selection parity with the Ink TUI (hermes-ink selection.ts):
a second click in the 500ms/1-cell chain selects the same-class character run
under the cursor (iTerm2 word set, wide-glyph aware), a third selects the line,
and dragging with the button held extends word-by-word / line-by-line while the
clicked span stays selected — anchor flips across the span on direction change.

Core knows only press-drag char selection, so this is a boundary shim
(multiClickSelect.ts) wrapping the renderer's startSelection/updateSelection
seam; word bounds read the presented frame's char grid. Native quirks probed
and pinned: per-renderable selection anchors are fixed at set time (anchor
flips restart the selection) and forward selections exclude the focus cell
(inclusive spans seed focus at hi+1). Pure scanning logic in logic/multiClick.ts;
20 new tests (pure + real-mouse-path frames); demo.tsx installs the seam for
tmux smokes.
2026-06-11 15:58:33 +05:30
alt-glitch
94765e48ff cli: worktree lock + dirty-tree preservation — stop pruning uncommitted work
Three behavior changes to the hermes -w worktree lifecycle:

1. Git-native locks. _setup_worktree now locks its worktree
   (git worktree lock --reason "hermes session pid=<pid>"), and
   _prune_stale_worktrees skips locked worktrees at ANY age — a lock
   from a live or crashed session means "do not touch". New helpers
   _lock_worktree / _unlock_worktree / _worktree_is_locked (fail-safe:
   any error reads as locked) / _worktree_is_dirty (fail-safe: any
   error reads as dirty).

2. Dirty trees are preserved. _cleanup_worktree previously destroyed
   worktrees with uncommitted changes if there were no unpushed
   commits; it now keeps the worktree, branch, and lock when the tree
   is dirty OR has unpushed commits, and prints manual cleanup hints
   (git worktree unlock + remove --force). The >72h "force remove
   regardless" prune tier is removed: pruning may only ever delete
   clean, unlocked, fully-pushed worktrees.

3. Branch deletion is gated on removal success. Both cleanup and
   prune previously deleted the branch without checking the
   git worktree remove returncode, dropping easy reachability of the
   commits even when removal failed; the branch is now only deleted
   after a successful remove.
2026-06-11 08:10:55 +05:30
alt-glitch
31916539af opentui(v6): degrade SyntaxStyle exhaustion, unmask the exit-7 crash, clamp the cap to the 65k native handle table
Root cause of the bench-suite crash (every otui mem3000/slope cell died at
~3000 lumpy fixture msgs, exit 7, ~880MB RSS — not a cgroup kill):

- @opentui/core 0.4.0 routes EVERY native object through ONE global handle
  registry with 16-bit slot indices (core src/zig/handles.zig: INDEX_BITS=16,
  MAX_SLOTS=65535, slot 0 reserved). Measured on this install: exactly 65,534
  live handles; the next createSyntaxStyle() fails. destroy() DOES recycle
  slots — exhaustion means LIVE objects.
- Every TextBufferRenderable burns THREE slots in its constructor
  (TextBufferRenderable.ts:77-80: TextBuffer + TextBufferView + SyntaxStyle),
  so the mount-everything transcript hits the wall at ~1,400 store rows
  (~16 text renderables/row x 3 ~ 47 handles/row): "Failed to create
  SyntaxStyle" (zig.ts:4554) throws out of a Solid mount effect.
- The crash was MASKED: CliRenderer's own uncaughtException handler
  (handleError -> console.show()) allocates the console-overlay
  OptimizedBuffer — another handle — so the handler itself threw "Failed to
  create optimized buffer: WxH" and Node died with exit 7 (fatal error in
  the uncaughtException handler), hiding the real error.

Why not share one SyntaxStyle (the obvious 3->2): the per-buffer style is
load-bearing — native setStyledText (text-buffer.zig) registers each chunk's
color by NAME ("chunk{i}") into the buffer's OWN style, and registration is
name-keyed-overwrite (syntax-style.zig putStyle), so a shared style would
cross-corrupt chunk colors between every styled <text>. Pooling is unsound
at our layer in core 0.4.0.

The fix, at the seams that are ours:
- boundary/nativeHandles.ts (ffiSafe.ts sibling): SyntaxStyle.create() on a
  full table DEGRADES to a detached style (native handle 0) instead of
  throwing — JS-side styleDefs/mergeStyles (what markdown/code chunk colors
  actually use) keep working; all native calls on handle 0 are inert no-ops.
- boundary/renderer.ts: guard the process error listeners createCliRenderer
  installs so an exception INSIDE the handler can never exit-7-mask the
  original error again (logged honestly; original error stays the story).
- logic/store.ts: HERMES_TUI_MAX_MESSAGES clamped to a handle-safe ceiling
  (1000 rows ~ 47k handles ~ 72% of the table on the realistic fixture).
  The old default of 3000 was unreachable — the TUI crashed at ~1,400 rows,
  before the cap ever bound. Renderable-weight-aware capping is #27's
  (virtualization) to do properly; until then the degrade shim backstops
  pathological rows.

TODO(upstream) — issue-shaped, for the OpenTUI repo:
  (a) a global 64k handle table with a 3-slot cost per text renderable is
      too small for transcript-style TUIs (61k renderables ~ 3k messages);
  (b) native allocation failures throw out of the render loop with no
      degrade path;
  (c) handleError allocates (console overlay buffer) and so crashes on the
      very condition it is reporting, masking the root cause with exit 7.

Also: eslint now ignores ui-opentui/.bench/** (bench `nodes`-cell build
artifact broke the lint gate) and .gitignore covers it.

Gate: npm run check green, 599 tests (595 baseline + 3 degrade-path tests
+ 1 cap-clamp test).
2026-06-11 04:06:19 +05:30
alt-glitch
79d1b58afe ui-tui: env-gated yoga-node sampler for bench instrumentation (dark by default) 2026-06-11 02:29:48 +05:30
alt-glitch
b091b4eaeb opentui(v6): tier-A latex — unicode math with fence-aware preprocessing 2026-06-11 01:44:48 +05:30
alt-glitch
7e01a96e53 opentui(v6): status chrome v3 — one left-aligned labeled line; copy chip off the scrollbar edge 2026-06-11 01:42:59 +05:30
alt-glitch
31e0adc681 opentui(v6): code-token scopes in the shared syntax style (highlighting was parsing but painting monochrome) 2026-06-11 00:56:45 +05:30
alt-glitch
8445995321 opentui(v6): composer — shift+enter newline (kitty), height cap + internal scroll, line navigation 2026-06-11 00:34:54 +05:30
alt-glitch
abba43eb63 tests: pin envelope fragment-peel guards incl. the known tail-shape tradeoff 2026-06-11 00:24:51 +05:30
alt-glitch
4a0991c1d2 opentui(v6): ink-budget follow-up — transparent root canvas; muted stops borrowing banner_dim 2026-06-11 00:16:00 +05:30
alt-glitch
364b93a4b9 gateway: compact /usage with current-session per-model costs
The OpenTUI /usage went through the slash-worker subprocess, which
resumes the session WITHOUT a live agent — so it could never show
current-session tokens or costs, and what it did show landed as a
full-screen page.

- slash.exec now answers /usage in-process from the live agent:
  per-model rows (requests, tokens in/out, cache, provider-reported
  cost when present), session totals/context, a one-line 30-day
  summary (SessionDB.usage_totals, real costs only) and a one-line
  Nous credits gauge (nous_credits_compact_line, refactored out of
  nous_credits_lines). ~8 lines instead of a page.
- Unreported costs render as 'not reported by provider' — never
  $0.00 — and the 30d summary omits cost when no session in the
  window has a provider-reported figure.
- /usage full keeps the detailed legacy CLI page via the worker.
2026-06-11 00:15:00 +05:30
alt-glitch
85546bb9e2 gateway: capture real provider-reported cost (openrouter usage accounting)
Cost displays were estimates from a pricing table; on OpenRouter the
status bar never reflected what was actually charged. Now cost is
provider-REPORTED only, end to end:

- OpenRouter requests carry usage:{include:true} (profile + legacy
  transport paths); the response usage.cost field (credits, 1:1 USD)
  is captured per call into agent.session_actual_cost_usd and
  persisted to the sessions DB actual_cost_usd column (NULL-safe:
  unreported calls never touch the stored value).
- Nous keeps its x-nous-credits-* header capture; the header delta
  now surfaces as the session's real cost via real_session_cost_usd.
- Providers that report nothing accumulate NOTHING: cost fields stay
  absent/None (the TUI hides its cost segment), never a fabricated
  $0.00 and never an estimate. _get_usage, gateway /usage and the
  CLI usage page all switched off estimate_usage_cost for display.
- Per-model session accumulator (session_model_usage) records real
  per-call counts and provider-reported cost per model.
2026-06-11 00:14:21 +05:30
alt-glitch
ba3fe7027c opentui(v6): responsive two-line chrome at wide widths 2026-06-11 00:09:31 +05:30
alt-glitch
5999cd2848 opentui(v6): per-block copy affordance 2026-06-10 23:58:55 +05:30
alt-glitch
639a9cb9a7 opentui(v6): ink budget — earned gold, blue machinery, neutral muted (design pass) 2026-06-10 23:55:04 +05:30
alt-glitch
a09fa9df42 opentui(v6): resume picker — tabbed /sessions with peek preview (supersedes switcher) 2026-06-10 23:46:07 +05:30
alt-glitch
4e69fdb3be opentui(v6): per-tool content fixes — clarify/skill_view/read/search/exec + tree-sitter outputs 2026-06-10 23:34:03 +05:30
alt-glitch
b3efafcc73 opentui(v6): dedupe model.options prefetch with /model open 2026-06-10 23:10:11 +05:30
alt-glitch
036e863e4a opentui(v6): model picker provider tabs (nous-first chip strip) 2026-06-10 23:09:41 +05:30
alt-glitch
e3cdedbf0f opentui(v6): kill expand/collapse scroll jitter (suspend stickyScroll across the toggle)
User feedback: tool/thinking rows did a "v small quick lil jump up and
down" when toggled, worst on the bottom rows.

Root cause (verified live with 10ms tmux capture sampling): the
transcript scrollbox's sticky-bottom re-pin and the scroll anchor fought
AFTER paint. On a toggle near the bottom, the content-height change runs
ScrollBox.recalculateBarProps -> applyStickyStart("bottom") (the user is
at the sticky position, so _hasManualScroll is false), which paints a
fully bottom-pinned frame; the anchor's 4x16ms scrollTo re-asserts then
yanked the viewport back up. The capture burst shows the transient
pinned frame between two anchored ones on every expand — the visible
down-up flick.

Fix at the cause instead of correcting after the effect: suspend
stickyScroll (a runtime get/set property on ScrollBoxRenderable) BEFORE
running the toggle and restore it ~100ms later, once the content height
has settled. With sticky off, the toggle's layout pass leaves scrollTop
untouched — the clicked header's document position is unchanged (content
grows/shrinks below it), so nothing moves and there is nothing left to
flicker; a collapse past the new bottom clamps naturally via the
ScrollBar scrollSize setter. Restoring recomputes the manual-scroll
state from the actual position: still at the bottom -> keeps pinning for
new content; mid-content -> manual-scroll semantics until the user
returns (the same end state the old anchor produced). Rapid re-toggles
inside the window keep the ORIGINAL saved value.

The far-from-bottom anchor guarantee is unchanged (scrollTop is simply
never touched), pinned headlessly in scrollAnchor.test.tsx along with
the suspension sequencing, the clamp-then-re-pin collapse path, and the
double-toggle restore. ffiSafe's tall-diff scroll-cut regression now
drives the negative-y condition explicitly via wheel scrolls (the old
anchor exercised it through the very transient sticky-bottom frames this
fix removes).

Verified live (tmux, real gateway): before — toggling the bottom rows
painted a transient bottom-pinned frame (f141 of a 10ms burst); after —
three toggle bursts produce ONLY the clean before/after states (4
distinct frames in 458 samples), headers hold their row, including the
bottom-most rows.
2026-06-10 22:39:12 +05:30
alt-glitch
38eb9bb19a opentui(v6): tool output uncapped by default (env restores a cap)
User feedback: "for all tools, i'd want all their output viewing enabled
to be infinite by default."

Flip envOutputLines (HERMES_TUI_TOOL_OUTPUT_LINES): unset -> Infinity
(was 200); a positive integer RESTORES a cap (e.g. =200); 0 stays
Infinity for back-compat with the old opt-in-unlimited value; garbage ->
Infinity (unrecognized = no cap asked for). The semantic is now "cap
only when the user asked for one".

The store's raw-result preference follows the same rule: envOutputLinesSet
becomes envOutputUnlimited — whenever the cap is unlimited (the default
now) and a gateway tail-capped result_text (omittedNote) arrives with the
always-full raw result on the wire, the raw result wins, since an
uncapped view of a tail would silently miss the head. With an explicit
finite cap the gateway tail + honest omitted note are kept.

Memory safety is unchanged: tool bodies mount only while EXPANDED (rows
default collapsed and free their Yoga nodes on collapse/unmount), and the
rolling HERMES_TUI_MAX_MESSAGES cap bounds the transcript's high-water
mark.

Tests: env.test.ts expectations flipped (unset/garbage -> Infinity, 0
documented as back-compat); tools.test.tsx "flag unset caps at 200"
becomes "unset renders all 250 lines", plus an explicit =50 cap (+note)
test and =200 restored-cap test; the store preference matrix covers
unset/0 (raw wins), =50 (tail+note kept), and no-raw (tail+note, no
crash). Verified live: seq 1 220 expanded renders rows 201-220 with no
"+N more lines" note.
2026-06-10 22:38:49 +05:30
alt-glitch
0bb58b65ec opentui(v6): fix popup-boot latency regression (model.options prefetch blocked the gateway dispatcher)
The native TUI prefetches model.options right after session.create (91df32545,
picker instant-open). The handler is network-bound (~3.7s: pricing fetch + Nous
tier check in build_models_payload) and ran on the gateway's main dispatcher
thread, so every fast-path RPC issued in the first seconds after launch —
complete.slash for the '/' dropdown, session.list, config.get — sat unread
behind it. Measured: first '/' dropdown 1718ms at HEAD vs 53ms at 394f45a3d
(pre-prefetch baseline); 52ms after routing model.options onto the existing
RPC thread pool (_LONG_HANDLERS). The /model picker keeps its 29ms cached open.
2026-06-10 22:20:09 +05:30
alt-glitch
fb30ff218d tests: align dropdown-hint + wrap expectations with arrows-everywhere menus 2026-06-10 22:15:12 +05:30
alt-glitch
e1edbb0e89 opentui(v6): arrows + enter navigate every completion menu (paths, args) 2026-06-10 22:14:11 +05:30
alt-glitch
72118b049f tests(cli): align tui argv prebuild test with the node-probe launcher 2026-06-10 22:09:43 +05:30
alt-glitch
c007d08419 opentui(v6): port utility commands — compact, details, replay, heapdump, mem 2026-06-10 22:09:39 +05:30
alt-glitch
73b261b94f opentui(v6): monotonic double-press clock + consume the viewer's closing Esc 2026-06-10 22:08:39 +05:30
alt-glitch
f4bb617f62 opentui(v6): tray-exit Esc never arms the prompt-history double-press 2026-06-10 21:58:15 +05:30
alt-glitch
4c630d3e7b opentui(v6): Esc+Esc session prompt history — rollback/undo confirm 2026-06-10 21:49:17 +05:30
alt-glitch
bc71c57ba9 tui_gateway: session.list reports scan-cap truncation honestly 2026-06-10 21:44:17 +05:30
alt-glitch
ab5d422835 tui_gateway+cli: session.list filters + session.peek + bare --resume picker sentinel 2026-06-10 21:31:59 +05:30
alt-glitch
72ee55ed53 opentui(v6): picker v2.1 — provider search, availability toggle, native input, manual refresh 2026-06-10 21:30:55 +05:30
alt-glitch
df4bdc9d58 opentui(v6): skill highlighting + one-edit autocorrect (anti-jank) 2026-06-10 21:27:36 +05:30
alt-glitch
eaad47a6f6 opentui(v6): header chrome — dense status bar (Variant A) 2026-06-10 21:24:58 +05:30
alt-glitch
8c3060342f opentui(v6): standardize fuzzy search on fuzzysort (adapter keeps our API) 2026-06-10 21:04:22 +05:30
alt-glitch
c9c6cfc0ee opentui(v6): background-agents tray — down-arrow focus + enter to dashboard 2026-06-10 21:00:59 +05:30
alt-glitch
6438acec60 opentui(v6): model picker v2 — fuzzy search + provider groups + instant open 2026-06-10 20:50:17 +05:30
alt-glitch
76a8bba15f tui_gateway: blocking prompts wait for the human (drop _block timeouts) 2026-06-10 20:23:25 +05:30
alt-glitch
ad220b9d93 opentui(v6): slash menu — arrow navigation + enter accept 2026-06-10 20:05:39 +05:30
alt-glitch
ca791f4000 opentui(v6): trust gateway payload.error — drop client-side result sniffing 2026-06-10 19:34:27 +05:30
alt-glitch
0dafcdd9e3 opentui(v6): tool-name emphasis, thought styling, HERMES_TUI_TOOL_OUTPUT_LINES 2026-06-10 19:31:45 +05:30
alt-glitch
6e62489d9e tui_gateway: surface tool failure as payload.error (result convention) 2026-06-10 19:27:49 +05:30
alt-glitch
076aebc7e6 opentui(v6): tool lifecycle states — live elapsed tick + failed glyph 2026-06-10 19:12:45 +05:30
alt-glitch
b6dc49200d opentui(v6): suppress redundant JSON/diff-echo output under rendered diffs
A patch tool's result is a JSON record whose payload IS the diff. In a verbose
session the gateway redacts + TAIL-caps result_text (_cap_tui_verbose_text),
so the echo arrived under the native diff in two broken shapes: truncated
mid-JSON (unparseable, so the old JSON.parse check failed open), or — for tall
edits — capped PAST the JSON head, which the store's normalizeOutput then
un-escapes into plain lines that duplicate the diff. North star: no raw JSON
in the transcript, ever.

Three layers:
- gateway: when diff_unified ships, result_text drops the in-JSON diff echo
  (_result_sans_diff_echo) — small, parseable, carries only the non-diff
  signal (success/files_modified/warnings/lsp_diagnostics).
- fileTool diffOutputPlan: anything starting with '{' under a rendered diff is
  suppressed regardless of parseability; parseable JSON with real non-diff
  signal (error/warning/lsp_diagnostics) renders JUST those as labeled notes;
  a non-JSON fragment whose lines echo the rendered diff is suppressed too
  (guards older emitters). Plain-text results (lint tails) still render.
2026-06-10 18:49:03 +05:30
alt-glitch
0a5b0780f5 opentui(v6): clamp negative draw coords at the node:ffi seam (diff crash fix)
Expanding a tall <diff showLineNumbers> pinned to the scrollbox bottom froze
the TUI with ERR_INVALID_ARG_VALUE looping out of CliRenderer.loop every
frame. Root cause: @opentui/core 0.4.0 marshals OptimizedBuffer
fillRect/drawText/setCell* coordinates as u32 in the FFI table while
LineNumberRenderable.renderSelf passes raw screen coordinates — NEGATIVE when
the diff is partially scrolled above the viewport. Bun's FFI silently wraps
negatives (native side bounds-checks them into a no-op); Node's experimental
node:ffi rejects them. bufferDrawBox already uses i32, which is why ordinary
boxes/text scroll fine and only the diff line-background path crashed.

Fix at the seam we own: boundary/ffiSafe.ts patches OptimizedBuffer to clip
fillRect to the non-negative quadrant and skip negative-origin
drawText/setCell*/drawChar before the FFI call (Bun parity). Installed from
boundary/renderer.ts (live) and test/lib/render.ts (headless). TODO(upstream):
widen those FFI params to i32 so this shim can be deleted.
2026-06-10 18:48:51 +05:30
alt-glitch
c4348480f3 opentui(v6): file tool renderer — relative path + full native diff 2026-06-10 16:48:15 +05:30
alt-glitch
f76df0688c tui_gateway: send full unified diff (diff_unified) on file-edit tool.complete 2026-06-10 16:42:14 +05:30
alt-glitch
84cbf5c1f3 opentui(v6): prefer gateway-redacted args_text over raw args in tool renderers 2026-06-10 16:24:16 +05:30
alt-glitch
0f92a3cf63 opentui(v6): bash tool renderer — command + full output 2026-06-10 16:18:08 +05:30
alt-glitch
60cbc4c68b opentui(v6): tool renderer registry + labeled-args default (no raw JSON) 2026-06-10 16:15:37 +05:30
alt-glitch
ae11a636dc feat(tui): run on Node 26 (one runtime), finalize copy UX, rename to ui-opentui
Ports the engine off the second JS runtime onto Node 26.3 (node:ffi) so the
repo ships a single JavaScript runtime: child_process for the gateway, vitest
for tests, an esbuild + Solid build step. Mouse selection copies the rendered
text you highlight, and the clipboard path is crash-proofed (a broken copy
pipe no longer quits the UI). Renames the engine dir ui-tui-opentui-v2/ ->
ui-opentui/ and updates the launcher/installer/Docker references.
2026-06-09 16:16:48 +00:00
alt-glitch
25567919ea opentui(bench): scripts/demo.tsx — view the fixture in a real attachable TUI
Dev demo (not a test): seeds the bench fixture into the store via the resume path
and renders <App> under a real CliRenderer (no gateway) so you can attach over
tmux, scroll, and eyeball the transcript + the rolling-cap truncation notice.
Run: DEMO_TOTAL=240 HERMES_TUI_MAX_MESSAGES=80 bun scripts/demo.tsx
2026-06-09 10:41:13 +00:00
alt-glitch
dcd8ba2a0d opentui(memory): cap default 1500→3000 + honest truncation notice
Bench (realistic fat-turn fixture) put numbers on the cap tradeoff: ~0.65 MB/msg,
~20.4 renderables/msg → 3000 ≈ 2 GB steady RSS, the highest cap within a sane TUI
budget (that ceiling only hit by marathon 3000+-msg sessions; typical cost a
fraction). 1500 was too little scrollback. Tunable via HERMES_TUI_MAX_MESSAGES.

Adds a store `dropped` counter (live overflow in capMessages + the resume slice in
commitSnapshot; reset on clearTranscript) and a dim, selectable=false top-of-
transcript notice — '⤒ N earlier messages — scroll-back capped; full transcript on
the dashboard · session <id>' — so display truncation is visible + points to the
deep-history surface. Display-only: never touches the model's gateway-side context.
2026-06-09 10:25:16 +00:00
alt-glitch
f205dc2a3b opentui(bench): realistic heavy-session fixture (fat tool-turns) + multi-cap matrix
Replaces the synthetic ~5.5-node/msg pushes with a deterministic generator
(scripts/fixture.ts): lorem-ipsum user turns + fat assistant turns (markdown +
reasoning + 1-15 tool parts with multi-line results) driven through the real
apply()/commitSnapshot paths. mem-bench.tsx pumps it + checks the resume path.
Realistic cost is ~20.4 renderables/msg (3.7x synthetic); informed the cap tune.
2026-06-09 10:25:16 +00:00
alt-glitch
c40d3172ac opentui(harden): slice the resume snapshot before mounting (no transient over-cap)
commitSnapshot set the full fetched history then trimmed — briefly handing the
whole transcript to <For>. Since Yoga (WASM) layout memory is grow-only, even a
transient over-cap mount permanently ratchets the high-water mark, partly
defeating the cap when resuming a large session (a real one has ~1980 messages).
Slice to MESSAGE_CAP BEFORE the first setState so resume mounts at most the cap.
2026-06-09 09:41:37 +00:00
alt-glitch
52533bea09 opentui(bench): headless memory bench proving the cap bounds Yoga-node growth
Dev bench (not a test, not in the gate suite): mounts <App> under the Solid
test renderer, pushes N streamed turns, samples RSS + mounted-renderable count
with Bun.gc. Demonstrates HERMES_TUI_MAX_MESSAGES=400 pins mounted renderables
at ~2218 vs an unbounded climb to ~55k at 10k messages (RSS flat ~350MB vs
1.3GB). Run: bun scripts/mem-bench.tsx (MEM_BENCH_TOTAL/SAMPLE tunable).
2026-06-09 08:44:30 +00:00
alt-glitch
9eb36fd697 docs: OpenTUI is the default engine on supported hosts; Ink is the fallback
Note in the README CLI section that the terminal UI defaults to the native
OpenTUI engine on Linux/macOS with Bun (provisioned by the installer), and
that the legacy Ink engine remains the automatic fallback (Windows, Termux,
no Bun) and can be selected explicitly with HERMES_TUI_ENGINE=ink. Ink is
not removed — it's the kept fallback.

No in-repo config example documents display.tui_engine (the published config
reference lives on the docs site, not the repo), so there was nothing to
annotate there.
2026-06-09 08:35:43 +00:00
alt-glitch
f8f4b3044a install: provision Bun + OpenTUI engine (best-effort, Ink fallback on failure)
Add an opt-in-safe `install_opentui` stage that provisions the native
OpenTUI TUI engine: it resolves/installs Bun (~/.bun/bin/bun) and runs
`bun install` in ui-tui-opentui-v2 so the launcher's _opentui_available()
probe (Bun + node_modules/@opentui) passes and OpenTUI becomes the default.

Strictly best-effort: skipped on Windows/Termux/Android and when the v2
package is absent; any sub-step failure (no network, Bun install fails,
`bun install` fails) logs a warning via log_warn and returns 0. The stage
never `exit`s and never returns non-zero, so it can't abort the install — a
failed/skipped setup simply leaves the user on the kept Ink fallback.

Registered after node-deps in all three drivers: the monolithic main()
flow, the run_stage_body case dispatcher (opentui-engine), and the
emit_manifest staged-installer JSON.
2026-06-09 08:35:38 +00:00
alt-glitch
0dc257d610 tui: default to the OpenTUI engine when the host can run it (Ink fallback)
Flip the default engine: with no explicit HERMES_TUI_ENGINE env / display.tui_engine
config, resolve to 'opentui' when this host is genuinely set up for it (Bun resolves +
the v2 package's entry + node_modules present + not Windows/Termux), else 'ink'. An
explicit env/config choice still wins, and 'ink' remains the universal opt-out. Hosts
without the OpenTUI setup are unaffected (stay on Ink), so nothing strands a user.

- _config_tui_engine_early() now returns None (not 'ink') when unset, so the caller
  distinguishes 'explicitly ink' from 'unset' and applies the availability-gated default.
- _bun_bin() split: _bun_bin_or_none() is the non-fatal probe; _bun_bin() still exit(1)s
  on the explicit launch path. New _opentui_available() gates the default.
- Verified the full resolution matrix (7 cases) + that the platform/availability gates hold.
2026-06-09 08:31:30 +00:00
alt-glitch
20865a2653 opentui(ts): shared envFlag parser
Extract one boolean env-flag parser (src/logic/env.ts: envFlag + the shared
TRUE_RE/FALSE_RE) instead of per-file regexes. Rewire entry/main.tsx
(HERMES_TUI_FAKE → envFlag(…, false); HERMES_TUI_MOUSE → envFlag(…, true)) and
logic/theme.ts (detectLightMode's HERMES_TUI_LIGHT tri-state now uses the
shared regexes; the lowercased-input + /i regex is behaviorally identical to
the prior lowercased-input + non-/i regex). Semantics are byte-identical.
Adds src/test/env.test.ts (true/false/unset/garbage→fallback).
2026-06-09 08:26:43 +00:00
alt-glitch
da07e67efd opentui(ts): collapse prompt accessors into a generic narrow()
Replace the ~5 near-identical `as*()` accessors in
view/prompts/promptOverlay.tsx (one per ActivePrompt kind) with one generic
`narrow(kind)` helper that narrows the discriminated union via a typed type
guard (`p is Extract<ActivePrompt, { kind: K }>`) — no `as`. Each <Match>
branch keeps its precise typed payload. Behavior is identical.
2026-06-09 08:26:36 +00:00
alt-glitch
b36001940a opentui(ts): deferClose helper for overlay-close defers
Extract the repeated `setTimeout(() => …close…, 0)` overlay/prompt-close
pattern into a single `deferClose(fn)` helper (src/logic/defer.ts) so the
"why deferred" rationale (let the closing keystroke finish dispatching before
the composer remounts/refocuses) lives in one place.

Rewires the 5 close-defer sites: closePager/closeDashboard/closeSwitcher/
closePicker in view/App.tsx and clearSoon in view/prompts/promptOverlay.tsx.
Timing is unchanged (0ms). Other setTimeout uses (quit window, flashHint,
scroll re-anchor, resize debounce, transport) are NOT close-defers and are
left untouched.
2026-06-09 08:26:24 +00:00
alt-glitch
f4d944c49c opentui(ts): enforce no-unsafe-* + require-await as errors (prod .ts), exempt JSX views + tests
Production boundary/logic .ts is clean of the no-unsafe-* family (gateway
payloads are Schema-decoded), so promote it from warn to error. *.tsx is
exempted: @opentui/solid's JSX namespace types every component return as
error/unknown — a framework limitation, not unsafe app code. Test helpers/
mocks (loose fixtures + async signatures) are exempted too. Remaining warns
are no-unnecessary-condition: intentional defensive guards on untrusted
runtime/gateway data that TS's narrowing can't model.
2026-06-09 08:21:13 +00:00
alt-glitch
e4652b99e2 opentui(ts): decode SessionInfo + Catalog via Schema (drop the as-casts)
Replace the two ad-hoc as-cast loose readers in src/logic/store.ts with
effect Schema decode-at-boundary. New src/boundary/schema/SessionInfo.ts
defines SessionInfoPatchSchema + CatalogSchema (decodeUnknownOption),
mirroring GatewayEvent.ts. readInfoPatch + setCatalog now decode once and
build the typed patch/Catalog from the result (Option.none → empty
patch / catalog unset, never crashes). Wire field names verified against
tui_gateway/server.py. Removed the now-dead readOptBool helper. Tests
extended for nested-usage vs top-level context fallback, malformed/partial
payloads, and a garbage catalog.
2026-06-09 08:17:47 +00:00
alt-glitch
6a73b09d15 opentui(ts): rotate the NDJSON log file (bounded disk use)
The ring buffer is bounded (2000) but the NDJSON file was append-only and grew
forever. Add size-based rotation mirroring opencode's keep-N model: track bytes
written in-process (seeded from statSync on open, so we avoid a statSync on every
write) and, when the next line would cross LOG_MAX_BYTES (5 MiB), shift
.log -> .log.1 -> ... -> .log.5 (LOG_KEEP=5, oldest dropped) and resume on a
fresh file. Rotation is best-effort and fully try/catch-wrapped: any fs failure
leaves us appending to the existing file rather than crashing logging. Adds a
temp-dir rotation test (seeds >5 MiB to force a rotation on next write).
2026-06-09 08:11:01 +00:00
alt-glitch
af82979d43 opentui(ts): safe-stringify log payloads (circular/BigInt-proof)
A caller-supplied `data` with a circular reference or BigInt makes plain
JSON.stringify throw inside the file-write catch, flipping `fileBroken` and
killing ALL file logging for the session. Add `safeStringify` (WeakSet circular
guard, BigInt -> `${n}n`, wrapped to never throw) and use it for entry
serialization, so a bad payload degrades to a placeholder instead of breaking
the sink. Also model LogLevel schema-first via Schema.Literals + inferred type
(matches boundary/schema/GatewayEvent.ts), and add focused safeStringify +
poison-payload tests.
2026-06-09 08:10:20 +00:00
alt-glitch
3d87abcf1c opentui(ts): enforce prettier in the gate
Add a [1/4] format step to scripts/check.sh running
`bunx prettier --check src` (matching how the script invokes the other
tools), renumbering the existing steps to 2-4. Future formatting drift
now fails the gate.

The unused-imports/no-unused-vars warn → error promotion shipped in the
no-non-null-assertion commit (where the eslint rule changes live).
2026-06-09 08:03:38 +00:00
alt-glitch
4cb9aa6664 opentui(ts): normalize formatting with prettier
Run `prettier --write src` over ui-tui-opentui-v2 to normalize formatting
to the repo .prettierrc (no semicolons, single quotes, width 120,
arrowParens avoid, trailingComma none). This worktree had pre-existing
prettier-version divergences across 19 files; normalizing is correct.
No behavior changes — formatting only. The gate (type-check → lint →
bun test) stays green.
2026-06-09 08:03:08 +00:00
alt-glitch
bc79644f16 opentui(ts): no-non-null-assertion + noUnusedLocals + noImplicitReturns
Promote strictness in ui-tui-opentui-v2:
- eslint: @typescript-eslint/no-non-null-assertion: error (with a test
  override block keeping `!` in *.test.ts/tsx fixtures), and promote
  unused-imports/no-unused-vars warn → error.
- tsconfig: add noUnusedLocals + noImplicitReturns.

Remove all 24 production `!` non-null assertions by replacing each with
a real guard / default / early-return, preserving rendered behavior:
- gateway/client.ts: read the pending entry once and guard (vs has()+get()!).
- logic/theme.ts: guard the parseHex regex match; `?? 0` on the always-
  in-bounds XTERM_6_LEVELS lookups; restructure backgroundLuminance to
  branch into a typed tuple and use charAt() for the 3-digit hex expand.
- view/homeHint.tsx: use Solid's <Show>{value => …} callback form to
  narrow info().model / info().cwd instead of `!`.
- view/reasoningPart.tsx: guard the regex match before slicing m[0].
- view/statusBar.tsx: read model/cwd/pct into locals + guard; `?? 0` on
  the showBar()-guarded context-bar percentage.
- view/toolPart.tsx: guard the single-arg entry; `?? 0` on the
  duration-guarded fmtDuration.
2026-06-09 08:02:25 +00:00
alt-glitch
e36b2d1519 opentui(ts): type-aware eslint (projectService + recommendedTypeChecked); defer cast-family to warn
Enable type-aware linting in ui-tui-opentui-v2: add projectService +
tsconfigRootDir to the TS files block and switch the preset to
recommendedTypeChecked. Turn ON as ERROR the high-value promise rules
(no-floating-promises, no-misused-promises, await-thenable) and fix the
3 real floating-promise sites in the gateway client (FileSink
write/flush/end are fire-and-forget on a piped child stdin — marked
with explicit `void`).

Defer the cast/unknown family + the noisy type-checked rules to 'warn'
(gate stays green; eslint exits 0 on warnings) for Phase 2, which will
replace the `as`/`unknown` boundary casts with Schema decoding:
no-unsafe-{assignment,member-access,argument,return,call},
no-unnecessary-condition, no-base-to-string, restrict-template-
expressions, no-unnecessary-type-assertion, require-await.
2026-06-09 07:58:50 +00:00
alt-glitch
fe15a9bb00 tui(opentui): preflight node_modules before spawning the Bun engine
Bun runs the TS entry directly (no build step), so a missing `bun install`
otherwise surfaces as a cryptic '@opentui' resolve crash + blank UI. Fail
loudly with the fix instead. Part of the gateway build/run hardening.
2026-06-09 07:54:18 +00:00
alt-glitch
af577a4c5a opentui(harden): startup-readiness timeout + stderr-tail diagnostic
Arm a startup watchdog after spawning the gateway child: if the unsolicited
gateway.ready handshake never arrives within HERMES_TUI_STARTUP_TIMEOUT_MS
(floor 2s, default 20s), emit a gateway.start_timeout event so the store can
surface a failure line + the captured stderr tail instead of a silent blank UI.
Cleared on ready (dispatch), on stop(); re-arms per recovery respawn.
2026-06-09 07:53:23 +00:00
alt-glitch
9e81be7228 opentui(harden): configurable RPC timeout
Read HERMES_TUI_RPC_TIMEOUT_MS for the JSON-RPC request timeout (floor 5s,
default 120s) — Ink parity, env-tunable for slow handlers.
2026-06-09 07:52:40 +00:00
alt-glitch
28a2f95631 opentui(harden): clear the recovering status once the gateway is ready again 2026-06-09 07:49:46 +00:00
alt-glitch
07fcb3282c opentui(harden): auto-heal — restart + resume on gateway crash 2026-06-09 07:45:58 +00:00
alt-glitch
41a5bbf3e8 opentui(harden): gateway recovery policy (count-cap + exp backoff) 2026-06-09 07:39:36 +00:00
alt-glitch
84b77f68e5 opentui(harden): surface gateway exit/recovery + transport errors to the UI 2026-06-09 07:39:31 +00:00
alt-glitch
c29402d731 opentui(harden): rolling message cap bounds the Yoga node high-water mark 2026-06-09 07:31:14 +00:00
alt-glitch
76e9271dce opentui(copy): theme the selection highlight
Apply the existing theme selectionBg token to the plain <text> content
renderables (TextBufferRenderable supports selectionBg/selectionFg) so a
selection draws a clean solid bar that PRESERVES the text fg (no selectionFg →
no SGR-inverse fragmenting). Applied to:
- messageLine: the flat settled/user/system message text.
- toolPart: the args value lines + the output body lines.
Limitation: assistant answers rendered via the native <markdown> renderable
(MarkdownRenderable extends Renderable, not TextBufferRenderable, and
MarkdownOptions has no selectionBg/selectionFg) cannot take the themed highlight
— they fall back to the renderer's default selection style.
2026-06-09 07:19:39 +00:00
alt-glitch
60c5a82c85 opentui(copy): mask chrome/gutters so free-form copy is clean
Audit selectable masking (free-code noSelect model) so a free-form drag over an
agent turn yields CLEAN pasteable content — no labels, summaries, carets, or
annotations. Newly masked (selectable={false}):
- toolPart: the whole collapsed header row (name + args-preview + duration +
  "(N lines)") summary; the "args"/"output" section labels; the args overflow
  "… +N more"; the "… omitted N" / "… +N more lines" truncation notes.
- messageLine: the streaming caret (▍) — a cursor glyph, not content.
- reasoningPart: the collapsible-section header label (Thinking/Thought + title).
- composer: the completion dropdown rows + the "Tab complete · Esc dismiss" hint.
Kept selectable (real content): assistant markdown, tool args values + output
body, user/system message text.
2026-06-09 07:18:48 +00:00
alt-glitch
eb4821127c opentui(copy): copy-on-select (auto-copy on selection finish)
Subscribe to the renderer's "selection" event (fires once when a free-form
mouse selection completes) and auto-copy the spanned selectable text via the
existing onCopySelection callback. Unlike the Ctrl+C path, this does NOT
clearSelection() — the highlight persists so the user sees what was copied and
Ctrl+C still works. writeClipboard is idempotent so both paths are harmless.
2026-06-09 07:14:09 +00:00
alt-glitch
028bd89959 opentui(copy): /copy [n] copies the agent response 2026-06-09 07:09:10 +00:00
alt-glitch
0f53d67ee4 opentui(copy): assistant-text extraction helpers 2026-06-09 07:09:07 +00:00
alt-glitch
aa5489e804 opentui(harden): fix 3 triaged findings (timer leak, tool-match scope, complete-only)
Subagent hardening pass over boundary/logic/view, findings triaged (most were
false positives or app-lifetime-moot). The 3 genuine fixes:
- liveGateway.stop() now clears the pending 16ms coalesce timer before
  client.stop() — a queued flush() could otherwise fire batch()/handlers into a
  torn-down store after the layer scope releases.
- store.findToolPart now scans only the LIVE (last) assistant turn, not every
  message — a tool.complete pairs with a tool.start in the current turn, so this
  avoids matching a same-id tool in an older/resumed turn (and is O(parts)).
- store message.complete with text but NO prior start/delta now creates the turn
  (complete-only gateways) instead of dropping the final text; still no empty
  bubble when there's no text. +2 regression tests.

Triaged as NOT-a-bug / accepted-risk (documented so they're not relitigated):
@opentui/solid useKeyboard DOES auto-cleanup (onCleanup keyHandler.off, index.js:59);
dimensions/scrollAnchor timers are app-lifetime / try-catch-safe; unbounded-growth,
duplicate-dedup, and split-frame are theoretical for a trusted local newline-framed
subprocess; clipboard spawn timeout + atomic active-session write are minor follow-ups.
93 pass.
2026-06-09 06:35:29 +00:00
alt-glitch
2a86f039ea opentui(test): track the test harness (test/lib/) swallowed by global lib/ ignore
The Solid render-test harness (src/test/lib/render.ts + effect.ts) was never
committed — a global ~/.gitignore_global `lib/` rule silently excluded it, so the
opentui-v2 test suite wasn't reproducible from a clean checkout (render.test.tsx
imports ./lib/render). Force-add both + add a repo .gitignore negation
(!src/test/lib/). render.ts also carries the withKeymap() wrapper the keymap
migration needs (view tests mount under a KeymapProvider). 91 pass.
2026-06-09 06:25:55 +00:00
alt-glitch
79c6896153 opentui(keymap): adopt native @opentui/keymap for overlay close + confirm
@opentui/keymap@0.3.2 was installed but unused (the spec said we'd use it). Wire
it natively: createDefaultOpenTuiKeymap(renderer) + <KeymapProvider> at the render
root, and a useCloseLayer(target,onClose) helper that registers a focus-within
Esc/Ctrl+C → close layer. Migrated the close handling of sessionSwitcher, picker,
approvalPrompt (close-only), confirmPrompt (y/n via confirm/cancel commands), and
pager + agentsDashboard (close via keymap; scroll/select stay raw — not cleanly
focus-gated). Overlays gain a root ref + focus-on-mount so the focus-within layer
activates. q-close re-added to pager/dashboard (footer advertises it).

Composer history/refocus + masked prompt + the Ctrl+C quit machine stay raw by
design (need the in-flight keystroke / careful state). Test harness gains a
withKeymap() wrapper so view tests mount under a provider. 91 pass; live-verified
/sessions + /tools Esc-close and composer focus recovery after.
2026-06-09 06:24:14 +00:00
alt-glitch
46293f618c opentui(input): "Pasted text" placeholder for large pastes
Large bracketed pastes no longer flood the composer. On paste, if the text is
≥4 lines or >400 chars, insert a compact `[Pasted text #N +M lines]` chip and
hold the real content in a PasteStore; on submit, expand the chip back to the
full text before sending (free-code model). Single-pass String.replace keeps a
pasted block that itself contains a `[Pasted text #k]` literal safe.

The store is created ONCE in main.tsx and passed App→Composer (NOT per-composer)
so it survives the composer remounting on overlay open/close — a per-composer
store would lose a pending paste mid-compose. +6 unit tests (91 pass). Verified
live: paste 10 lines → chip; submit → transcript shows the full expanded code;
composer cleared.
2026-06-09 06:09:49 +00:00
alt-glitch
080440bd9c opentui(input): auto-expanding composer textbox
The composer was a fixed height:3. Match free-code/opencode: native textarea
auto-grow via direct minHeight={1} maxHeight={max(6,⌊rows/3⌋)} props (opencode's
prompt sizing) — 1 row when empty, grows with wrapped/multiline content up to ~a
third of the screen, then scrolls internally. maxHeight is a DIRECT reactive prop
(not in style) so the cap tracks terminal resize via useDimensions. 85 pass;
verified live (a long wrapping line grew the box to 3 rows).
2026-06-09 06:05:05 +00:00
alt-glitch
bd3c253420 opentui(v5b): fix first-letter duplication on always-active refocus
Typing while the textarea was unfocused doubled the FIRST letter: the always-active
handler did ta.focus() AND ta.insertText(key.sequence), but the renderer runs the
global useKeyboard handler BEFORE routing the key to the focused renderable — so
after focus() the same keystroke was also delivered to the now-focused textarea,
inserting it twice. Subsequent keys were fine (textarea already focused → block
skipped). Fix: focus() only; let the textarea insert the char it now receives.
Verified live: typing 'x' then 'y' while blurred yields '❯ xy' (no dup). 85 pass.
2026-06-09 05:36:37 +00:00
alt-glitch
82e13ed949 opentui(v5b): frame the startup panel in a themed border box
Design-judge top nit: Ink's bordered-box-around-the-session-info is the single
biggest 'designed home screen vs log output' signal, and the flat left-aligned
version lacked it. Wrap the model/dir/session block + Tools/Skills/MCP sections +
summary in a full border box (theme border token); banner+tagline stay above,
tips below. 85 pass.
2026-06-09 05:22:26 +00:00
alt-glitch
fb04e85a14 opentui(v5b/item1): Ink-parity startup banner panel
Rebuilt the home screen to match hermes --tui: the HERMES-AGENT banner + tagline,
then a session info block (model · Nous Research / dir (branch) / Session: <id>),
then SEPARATE collapsible sections — Available Tools (enabled toolsets each as
'name: tool1, tool2', capped + '(and N more toolsets…)'), Available Skills (N) in
M categories, MCP Servers (N) connected — and a '… /help for commands' summary.
Previously it was one combined '▶ N tools · M skills · K MCP' dropdown that only
listed tools and showed no model/dir/session.

- gateway startup.catalog now returns per-toolset {enabled, tools} (resolved_tools,
  session-aware enabled set — mirrors tools.list); py_compile OK.
- store Catalog gains toolset.enabled/tools; new sessionId field + setSessionId,
  set on session create/resume (alongside the active-session-file write).
- homeHint takes the store, reads info (model/cwd/branch) + sessionId + catalog.
85 pass; verified live (model·Nous·dir·session + enabled toolsets w/ tools).
2026-06-09 05:19:21 +00:00
alt-glitch
06762a0f5e opentui(v5b): visual hierarchy — color-code roles + clean turn spacing
The transcript read as one undifferentiated gold blob (user/assistant/tool all
the same color). Adopt the Ink model where color IS the hierarchy, in 3 brightness
tiers:
- USER input  → label (gold)        — the human's turn stands out.
- ASSISTANT answer → text (bright)   — the primary content, brightest.
- TOOL / REASONING → muted (dim) with an ACCENT glyph (/▶/▼ amber) that marks
  the block — clearly the secondary 'working area' below the answer.
Spacing: one blank line above every turn (was cramped: user had a blank, the
reply didn't) + the existing gap:1 between parts. Dropped the transcript's extra
marginTop (turns own their spacing now). 85 pass; verified live — gold ask, dim
tool, white answer read as three distinct things.
2026-06-09 05:13:37 +00:00
alt-glitch
92f35fab19 opentui(v5b/item5): track active session for the post-quit resume epilogue
The launcher (hermes_cli/main.py _print_tui_exit_summary) reads
HERMES_TUI_ACTIVE_SESSION_FILE to print 'Resume this session with…' on exit. The
Ink TUI writes the current session id there on every session change
(useSessionLifecycle.writeActiveSessionFile); the native engine never did, so
after a /session switch the launcher fell back to the INITIAL launch session and
showed resume info for the wrong session (the reported leak).

Now writeActiveSession() writes {session_id} on session.create AND inside
resumeInto (every /session switch), mirroring Ink. Verified live: file shows the
created session, then updates to the switched-to session. 85 pass.
2026-06-09 05:08:57 +00:00
alt-glitch
cd11ed7a04 opentui(v5b/item4): hold viewport on tool/thinking expand (no scroll jump)
The transcript scrollbox (stickyScroll+stickyStart=bottom) re-pins to the bottom
on any content-height change when the user is at the bottom (@opentui/core
ScrollBox: `if (stickyStart && !_hasManualScroll) applyStickyStart`). So expanding
a tool/thinking block scrolled the clicked header up off-screen. A
ScrollAnchorProvider (transcript owns the scrollbox ref) lets toolPart/reasoningPart
wrap their toggle so scrollTop is held constant across the height change (re-asserted
over a few frames as layout settles) — the clicked header stays put and the
expansion reveals beneath it. 85 pass.
2026-06-09 05:05:01 +00:00
alt-glitch
4407fee49f opentui(v5b/item2+3): fix streaming flicker + native markdown tables
#2 (flicker regression): my item-7 AssistantText wrapped text in
<For each={segmentMarkdown(text)}> — segmentMarkdown returns NEW objects per
delta, so <For> (keyed by reference) DISPOSED and re-created the markdown
renderable on EVERY streamed delta. Each remount re-measured from zero → content
height oscillated → the scrollbar grew/shrank (exactly the reported symptom).

Fix (deep opencode parity): render assistant text as ONE stable native
<markdown> (MarkdownRenderable) fed the growing content in place, with
internalBlockMode="top-level" — opencode's anti-flicker mode where settled
top-level blocks aren't re-rendered per delta (_stableBlockCount, managed
internally). This is opencode's TextPart verbatim (routes/session/index.tsx:1687).

#3 (table inline formatting): the native <markdown tableOptions={{style:grid}}>
renders GFM tables as a grid WITH inline bold/italic/code in cells — so the
hand-rolled segmentMarkdown + MdTable grid are deleted (obsolete). Switched from
<code filetype=markdown> to <markdown> (the former re-measured the whole buffer
each delta and never aligned tables). 85 pass; verified live (smooth stream,
boxed table, concealed **/* markers styled).
2026-06-09 04:57:57 +00:00
alt-glitch
48d0c70f61 opentui(v5/item4): coalesce resize via a shared debounced dimensions signal
Raw useTerminalDimensions fires on every SIGWINCH tick; during a drag that's a
recompute/reflow storm across every width-sensitive component (tool bodies,
tables, status bar, banner). Add a DimensionsProvider that runs the raw hook
ONCE and feeds a single leading+trailing-debounced (40ms) signal — mirroring the
gateway's 16ms event coalescing / opencode's createLeadingTrailingSignal — that
every consumer shares via useDimensions(). They now reflow together (no tearing)
and at most once per window. Falls back to the raw hook outside a provider
(headless tests). Verified: single resizes converge clean (wide banner ⇄ compact
brand at the 102-col threshold); rapid bursts coalesce. 90 pass.
2026-06-09 04:08:56 +00:00
alt-glitch
4061e635d3 opentui(v5/item9): startup HERMES banner + collapsible tools/skills/MCP panel
Home screen now shows the canonical HERMES-AGENT block logo (hermes_cli/banner.py,
gold->amber->bronze via primary/accent/border tokens; width-guarded to a compact
brand line under 102 cols) plus a collapsible '▶ N tools · M skills · K MCP' panel
that expands to per-toolset / per-category / per-server detail.

Data comes from a new opt-in gateway RPC 'startup.catalog' (aggregates
get_all_toolsets + banner.get_available_skills + config mcp_servers); the native
engine fetches it best-effort on session start (Effect.catchCause swallows it on
old gateways). Opt-in => Ink path untouched. py_compile OK. Store gains a typed
Catalog + defensive setCatalog mapper. +2 tests (90 pass); verified live
(1185 tools / 196 skills / 2 MCP, expand shows the full lists).
2026-06-09 04:04:43 +00:00
alt-glitch
741a4c23ca opentui(v5/item8): design polish — header chrome, status segments, ANSI strip
Visual-hierarchy pass (design-reviewed against free-code/opencode):
- header: brand glyph in accent + name in primary/bold + a bottom rule, so it
  reads as chrome and bookends the transcript with the status bar's top rule
  (fixes 'nothing differentiates the header from the text stream').
- status bar: a dim │ divider segments model·effort from the context meter.
- user/assistant turn glyphs bold + the user ❯ in accent so turns are scannable.
- reasoning 'Thought' label uses label (not warn) so it matches tool headers —
  warn is reserved for warnings; reasoning/tool now read as one aside family.
- home screen: brand in primary/bold, command names in accent vs muted descs,
  wider column.
- FIX (load-bearing): strip ANSI/SGR escape sequences from slash/notice text
  (pushSystem + openPager) — the gateway colors them for Ink, which interprets
  them; the native <text> rendered them as literal  glyphs. +stripAnsi
  + 3 tests. All tokens themed (no hardcoded colors). 88 pass.
2026-06-09 03:56:44 +00:00
alt-glitch
c6e72a8454 opentui(v5/item7): render GFM markdown tables as aligned grids
The native <code filetype=markdown> colorizes pipes but never aligns tables.
Add a pure segmenter (segmentMarkdown) that splits assistant text into prose
runs (native renderable) and GFM table blocks, plus an MdTable grid renderer:
per-column widths (free-code's stringWidth+padAligned), :--- / :--: / ---:
alignment, bold header, dim │ separators, a ┼ header rule, width-aware column
shrink on resize. Incomplete tables (no separator yet, e.g. mid-stream) stay
prose until they close. +5 unit tests (85 pass); verified live with a 3-col table.
2026-06-09 03:48:02 +00:00
alt-glitch
180fe665cb opentui(v5/item5): stabilize inter-part spacing (kill streaming jitter)
Blank lines between reasoning/tool/text grew and shrank mid-stream because
spacing was ad-hoc: tools carried marginTop:1, text/reasoning none, and the
markdown text part rendered the model's leading/trailing newlines as transient
blank lines that filled in as deltas arrived.

Now the parts column owns ALL spacing via gap:1 (uniform 1 line between any two
parts regardless of type/order), per-part marginTop is dropped, and text parts
are stripped of leading/trailing blank lines so the gap is the sole source —
no double gaps, no popping. Verified live: Thought/tool/tool/answer all spaced
by exactly one line. (80 pass.)
2026-06-09 03:43:43 +00:00
alt-glitch
6a24249e7c opentui(v5/item6): collapsible thinking traces
Reasoning rendered as an always-expanded plain muted blob. Now it's a proper
collapsible part (opencode ReasoningPart): auto-EXPANDED while the turn streams
(watch it think), then collapses to a one-line `▶ Thought: <title>` when settled;
click toggles. Title is the model's leading `**bold**` line (reasoningSummary).
Body renders as DIM markdown in a left-`│`-border block (Markdown gained an
optional `fg` so reasoning is muted vs the answer). +1 render test (80 pass).
2026-06-09 03:37:42 +00:00
alt-glitch
ee211d087b opentui(v5/item1): resumed tools render like live (collapsible + output)
Resumed tool calls were flat `name arg` rows — no output, not collapsible —
because the resume snapshot (_history_to_messages) dropped each tool's result.
Now the native engine passes `with_tool_output: true` on session.resume and the
gateway folds the tool's redacted+capped result + args into its row, so resumed
turns show `▶ name arg (N lines)` collapsible blocks identical to a live turn.

The flag is OPT-IN: Ink doesn't pass it, so _history_to_messages stays byte-for-
byte unchanged for the Ink path (its expanded verbose-trail render OOM'd on big
output, #34095; the native engine renders tools collapsed, so the capped tail is
safe there). resume.ts maps context→argsPreview, result_text→resultText (label
peeled + envelope stripped), args→argsText — same shape as the live tool part.

py_compile OK. resume tests updated + 1 added (79 pass).
2026-06-09 03:32:52 +00:00
alt-glitch
aec752faa3 opentui(v5/item2): surface tool-call args + de-pad output
The gateway already ships per-tool arg metadata the client was discarding:
`context` (build_tool_preview's primary-arg line, always sent), `args` (full
dict on complete), `args_text` (redacted JSON, verbose), `duration_s`. Capture
them on the tool part and render free-code style:

- collapsed header: `▶ name <arg-preview> · <duration> (N lines)` — args are
  finally visible without expanding (the core item-2 complaint).
- expanded: a single left-bordered (`│`) column with a key:value args block
  (suppressed when the lone arg is already the header preview — judge nit) then
  the output block.
- strip the gateway's `[showing verbose tail; omitted N chars]` banner into a
  tidy `… omitted N chars` note; unwrap tail-capped `{"output":…}` envelope
  fragments so the last line isn't a dangling JSON tail.

Left bar is a border glyph (opencode BlockTool style), not a bg fill — cleaner
and renders faithfully. +4 unit tests, +1 render test (78 pass).
2026-06-09 03:24:15 +00:00
alt-glitch
c9540570ae opentui(v5/item3): composer flush to bottom — drop root paddingBottom
The root box used padding:1 (all edges), reserving a blank row BELOW the
status-bar+composer block. Switch to paddingTop/Left/Right only so the input
hugs the last terminal row. Transcript stays flexGrow:1 minHeight:0; the
bottom block is the flexShrink:0 last child. StatusLine already renders
zero-height when idle, so no other change is needed.
2026-06-09 03:00:16 +00:00
alt-glitch
6e3915fbc1 opentui(v2): home hint (item 12) + verify /goal + expand feature matrix (item 8)
Item 12 — the missing helper/home screen: view/homeHint.tsx renders on an empty
transcript (Ink helpHint.tsx parity) — brand line, common commands (/help /model
/sessions /skills /agents /clear), and input tips (type · ↑↓ history · @file ·
Ctrl+C). Decorative → selectable={false}. Replaced by the transcript on the first
turn.

Item 8 — /goal verified live: slash.exec rejects it (pending-input) → dispatch
falls to command.dispatch {name:'goal'} → {type:'send', notice:'⊙ Goal set…',
message} → notice shown + the goal turn submitted (handleDispatchResult). Wired.

Docs: opentui-feature-map.md gains the full 15-item live-feedback parity matrix
(Ink/opencode primitive · v2 file · status); opentui-smoke.md gains the 15-item
run log. All 15 items  (image-paste wired but unverified in the clipboard-less
CI env).

Tests: home-hint render. 72 pass. Live-smoked: empty launch shows the home hint.
2026-06-08 18:26:13 +00:00
alt-glitch
c3d2d87a74 opentui(v2): clipboard copy/paste, image paste, glyph-free selection (items 1, 4)
Item 1 — copy/paste:
- boundary/clipboard.ts (ported/trimmed from opencode): writeClipboard = OSC 52
  (SSH/tmux-safe) + a native command (pbcopy/wl-copy/xclip/xsel/clip);
  readClipboardImage = clipboard PNG via wl-paste/xclip/pngpaste/powershell.
- Ctrl+C copies a live MOUSE SELECTION (renderer.getSelection) before the
  interrupt/quit machine runs (opencode's selection-key precedence), with a
  "Copied to clipboard" hint; falls through to interrupt/quit when there's no
  selection.
- text paste inserts natively (textarea handlePaste); the composer's onPaste only
  intercepts an EMPTY bracketed paste (image-only clipboard) → readClipboardImage
  → image.attach_bytes (the next prompt.submit picks it up).

Item 4 — mouse selection now ignores decorative glyphs: selectable={false} on the
message/tool gutter glyphs and all chrome (header, status bar, status line,
composer prompt glyph), so a drag copies the message text, not ❯/⚕/▶/.

Live-smoked (this env has no clipboard tools/DISPLAY, so native copy + image read
can't be confirmed here, but): drag-select + Ctrl+C → "Copied to clipboard" (not
quit); no-selection Ctrl+C still arms quit; bracketed text paste lands in the
composer. 71 pass.
2026-06-08 18:20:59 +00:00
alt-glitch
f423aebb80 opentui(v2): fix streaming caret alignment during model response (item 10)
A just-started assistant turn (message.start, no deltas yet) rendered an EMPTY
fallback <text> on the glyph's line plus the `▍` caret on a SEPARATE line below —
so `⚕` sat alone with the caret dangling beneath it, indented. Folded the caret
into the no-parts fallback so it renders inline with the glyph (` ⚕ ▍`); a settled
row still shows its flat text, a turn with parts renders the parts. 71 pass.

Live-smoked: streaming start now shows `⚕ ▍` on one line; the reply text then
aligns with the glyph.
2026-06-08 18:13:04 +00:00
alt-glitch
37b74f4df3 opentui(v2): live agent trace + /tools navigable overlay (items 9, 15)
Item 15 — "/agents doesn't let me look into an agent trace live":
- store accumulates a concise per-subagent trace from the subagent.* stream
  (▶ start /  tool — preview / progress text / ✓ summary), capped at 200 lines;
  thinking deltas update a transient `thought` (not appended — they'd flood).
- AgentsDashboard is now master-detail: ↑/↓ select a subagent (▸ + accent), and
  the bottom pane shows the selected agent's goal · status · model, its latest
  thought, and a sticky-bottom (live) trace scrollbox. PgUp/PgDn scroll the trace.

Item 9 — /tools wired to a deliberate navigable overlay (fetch the roster via
slash.exec → pager) instead of incidental fallthrough; /skills already opens the
native picker.

Tests: store trace accumulation + dashboard render (trace line + footer). 71 pass.

Live-smoked: /tools → tool roster pager; /skills → picker; a real delegation
(spawn a subagent → reply PURPLE) → /agents showed the subagent with its goal ·
completed · model, 🧠 PURPLE thought, and ▶/✓ trace lines.
2026-06-08 18:09:32 +00:00
alt-glitch
247604cdde opentui(v2): collapsible tools + composer glyph, drop the blue tint (items 3, 7)
Item 7 — tools were non-collapsible and "ugly-interlaced":
- ToolPart now renders COLLAPSED by default as one line: `▶ name  summary  (N
  lines)` (summary = explicit summary / first output line / error). A ▶/▼ glyph
  marks expandable tools; clicking the header toggles a left-bar block of the
  full (capped) output. Running tools show `name …`; single-line/erroring tools
  render inline. Compact by default → far less interlacing clutter.
- toolOutput.normalizeOutput: un-double-escapes literal \n/\t when they dominate
  over real newlines (some gateway tool tails are repr'd, so newlines arrived as
  backslash-n and rendered as one ugly line). Conservative — genuine multi-line
  output and legit `\n`-in-code are left alone. Applied in stripToolEnvelope.

Item 3 — the input "blue tint": dropped the textarea's blue focusedBackgroundColor
and added a `❯` prompt glyph. The composer is now distinguished by structure (the
glyph + the status-bar rule above it), not a background tint.

Tests: normalizeOutput (dominant-literal vs genuine-multiline). 70 pass.

Live-smoked: `ls -la` tool → collapsed `▶ terminal  total 3460  (N lines)`;
SGR-click → `▼` + clean per-line output; composer shows `❯` with no blue tint.
2026-06-08 18:04:08 +00:00
alt-glitch
2bb61a7d09 opentui(v2): slash-arg autocomplete + file/@-mention completion (items 5, 13)
onType used to fire complete.slash only for an argless `/command`, and Tab
replaced the whole line. Now:

- planCompletion(text) (pure, in slash.ts) routes: a `/command [args]` line →
  complete.slash (the gateway completes names AND args, e.g. /details section
  names); a trailing path-like word (@…, ~/…, ./…, /…, or anything with /) →
  complete.path for file/dir tagging; else nothing.
- the accepted item splices ONLY its token: store tracks completionFrom (gateway
  replace_from via readReplaceFrom, or the path-token start), and the composer's
  Tab handler keeps the text before `from` and appends the candidate.

Tests: planCompletion (slash/path/prose/multiline) + readReplaceFrom. 69 pass.

Live-smoked: `/details ` → section dropdown (hidden/collapsed/.../activity), Tab
→ `/details hidden` (arg-only splice); `tui_gateway/` → its .py files;
`@hermes_cli/m` → m-prefixed files.
2026-06-08 17:57:07 +00:00
alt-glitch
1ecec7a9bc opentui(v2): prompt history — Up/Down cycling, per-directory scope (item 6)
New logic/history.ts: createPromptHistory (pure cursor cycling — Up walks older,
Down walks newer back to the stashed draft, push dedupes a consecutive duplicate
+ resets) plus best-effort per-dir JSONL persistence under
$HERMES_HOME/tui-history/<sha1(cwd)>.jsonl (one JSON-encoded prompt per line,
multiline-safe).

Scoping matches the ask: prior prompts from the SAME launch dir are loaded on
start (recallable across relaunches), but a different dir keeps its own list — no
cross-dir/cross-session bleed.

Composer: Up at the first line → older prompt; Down at the last line → newer/draft
(at the boundary the textarea's own up/down is a no-op, so no conflict; mid-buffer
it still moves the cursor). setText + cursor-to-end on recall; any edit resets the
recall cursor. submit() pushes the prompt. Threaded entry → App → Composer; cwd =
process.cwd() (the launch dir under the real launcher).

Tests: 5 pure cursor-cycling cases. Live-smoked: seeded a dir file → Up/Up/Down
cycled two→one→two; a freshly submitted prompt was recalled via Up. 65 pass.
2026-06-08 17:52:38 +00:00
alt-glitch
325350d192 opentui(v2): always-active input — typing reclaims the composer (item 2)
The textarea focuses on mount and when an overlay closes (remount), but focus
could drift to the transcript scrollbox on a mouse-scroll, dropping keystrokes.
Now (opencode's keep-the-prompt-focused idea, adapted):
- onMouseDown → focus the textarea (click-to-focus).
- a global keystroke net: a PRINTABLE, unmodified key while the textarea is
  unfocused reclaims focus AND recovers the char (the in-flight event went to
  the global handler, not the unfocused textarea, so insert it). Nav/scroll keys
  (arrows/page/home/end/…) are deliberately left alone so keyboard transcript
  scroll still works; kitty `release` events are skipped to avoid double-insert.
Completion accept/dismiss handler folded into the same useKeyboard with early
returns.

Live-smoked: type → text lands; `/` → completions; Esc → dismiss; type again →
lands; clean quit. 60 pass.
2026-06-08 17:46:37 +00:00
alt-glitch
1be5bd92fa opentui(v2): Ctrl-C stops the agent; second press (debounced) quits
Item 11 — "stopping the agent doesn't work". Ctrl+C used to immediately destroy
the renderer. Now a turn-aware state machine (opencode's double-press model, the
user's preferred behaviour):

- While a turn runs (store.info.running): first Ctrl+C → session.interrupt
  {session_id} (STOP the agent), and arms a 3s quit window with a warn hint
  "⏹ stopped — Ctrl+C again to quit".
- Idle: first Ctrl+C arms the window ("Ctrl+C again to quit"); a stray single
  press never nukes the session.
- A second Ctrl+C within the window KILLS the TUI (renderer.destroy → clean
  scope teardown → gateway child EOF).
- A blocking prompt still owns Ctrl+C (deny/cancel) — unchanged.

Wiring: renderer.ts gains an `onCtrlC` hook (owns Ctrl+C when not blocked);
entry builds the machine (gateway yielded before the renderer so it can read
`running` + send interrupt). store gains a transient `hint` slice; StatusLine
shows hint (warn, priority) or the busy face (dim).

Live-smoked: long turn → Ctrl+C shows "stopped" + idle dot; second press exits
cleanly with no orphaned gateway child (the user's installed-venv sessions
untouched). 60 pass.
2026-06-08 17:42:21 +00:00
alt-glitch
93793b6af5 opentui(v2): status bar (status·model·effort·context·dir) above composer
Item 14: a persistent bottom-chrome status bar, ported from Ink's appChrome
StatusRule. Sourced from the session.info event (model / reasoning_effort /
fast / cwd / branch / running / usage.context_*) which was decoded but dropped
until now; also folded session.create/resume result.info and message.complete
usage into a new store `info` slice.

- store: SessionInfo slice + applyInfo(); session.info handler; message.start/
  complete flip `running` (the flag the Ctrl-C interrupt will read); refresh
  usage on complete.
- schema: MessageComplete.payload gains loose `usage` so it survives decode.
- view/statusBar.tsx: width-aware (Ink progressive disclosure) — context bar
  drops on narrow terminals, cwd compacts to last two segments + left-truncates
  so the row never wraps. Turn/connection dot ◐/●/○.
- App: status bar sits ABOVE the composer; a top-edge rule (border:['top'])
  visually separates the status bar + textbox input region from the transcript.
- tests: store info slice (3) + headless status-bar render (1); bumped the
  approval-prompt capture height for the taller input region. 59 pass.

Live-smoked: bar shows model·effort·context%·dir; context updates 0→4% across a
turn; running dot flips; separator divides input region from transcript.
2026-06-08 17:38:10 +00:00
alt-glitch
26f6929eb4 fix(opentui-v2): route thinking-faces to a transient status line (not the transcript)
Live-usage issue 3/5: the kaomoji faces ("(¬_¬) processing…") lingered in the
transcript. Traced (instrumented capture): they arrive via `thinking.delta` —
Hermes's transient kaomoji busy *indicator* (_INDICATOR_DEFAULT=kaomoji), which I
was rendering as a persistent reasoning part.

- store: new transient `status` field. thinking.delta / status.update → `status`
  (not a part); message.start + message.complete clear it. Only the real
  `reasoning.delta` still becomes a (dim) transcript part.
- view/statusLine.tsx: a dim busy line above the composer shown while `status` is
  set (Ink's FaceTicker analog), rendering nothing when idle; wired into App
  between the transcript and the input zone.

Verified: bun run check green (55 tests / 7 files) — store tests assert
thinking.delta → status (no transcript part) + cleared on complete; status.update
→ status. Live tmux: a turn showed "٩(๑❛ᴗ❛๑)۶ cogitating…" on the transient status
line (cleared on completion) with NO face left in the transcript.
2026-06-08 16:54:07 +00:00
alt-glitch
808ef152e5 fix(opentui-v2): live UX — enable mouse + smooth streaming markdown (opencode parity)
From live-usage feedback (driving the real TUI):

- Mouse ON by default (opencode parity; HERMES_TUI_MOUSE=0 opts out). Was hardcoded
  off, which is why transcript wheel-scroll, scrollbar drag, and click-to-expand
  tools didn't work and the terminal's native region-select polluted copy. With
  useMouse the scrollbox handles the wheel + scrollbar and tools are click-expandable;
  selection becomes OpenTUI's text-aware select. (Mouse can't be driven via tmux
  send-keys — verify wheel/drag/click interactively.)
- Streaming markdown: match opencode's v2 text path —
  <code filetype="markdown" streaming drawUnstyledText={false}>. The previous
  drawUnstyledText:true drew raw text then overlaid styling each delta (a flash);
  false avoids that and re-tokenizes incrementally for smoother streaming. (The
  native renderable's tree-sitter doesn't settle in the headless test renderer with
  drawUnstyledText:false, so the two markdown frame tests now assert the assistant
  text via the store — paint is verified in the live smoke; render.ts also settles
  to waitForVisualIdle.)

Verified: bun run check green (53 tests / 7 files). Live mouse + streaming
smoothness for glitch to confirm. Part of the live-feedback polish goal.
2026-06-08 16:48:23 +00:00
alt-glitch
e7d7e0157f feat(opentui-v2): Phase 8 — launcher cutover to the v4 Solid engine
Repoint hermes_cli/main.py `_make_opentui_argv` from the superseded React entry
to the v4 Solid + Effect-at-boundary entry: it now prefers
`ui-tui-opentui-v2/src/entry/main.tsx` (cwd ui-tui-opentui-v2) and falls back to
`ui-tui-opentui/src/entry.real.tsx` only if the v2 package is absent (graceful
during coexistence). The engine gate (_resolve_tui_engine: HERMES_TUI_ENGINE /
display.tui_engine → opentui; Windows/Termux → Ink fallback) and the dual-engine
dispatch in _make_tui_argv are unchanged; Ink (ui-tui/) is untouched. The spawned
tui_gateway's source-root default lands on PROJECT_ROOT (package at
<root>/ui-tui-opentui-v2), so it loads Python from the same checkout, no extra env.

So `HERMES_TUI_ENGINE=opentui hermes --tui` now launches the v4 engine — the exact
`bun …/v2/src/entry/main.tsx` invocation live-smoked across P1–P5e, making every
first-class surface reachable from the real CLI.

Also: a consolidated 3-way acceptance summary (Ink ↔ opencode ↔ build) at the top
of opentui-feature-map.md covering all 7 first-class surfaces + the foundation +
the launcher, each  + tested + smoked.

Verified: py_compile main.py OK (dev-skill rule for the 4k-line file); imported
the worktree CLI with HERMES_TUI_ENGINE=opentui → _resolve_tui_engine()='opentui',
_make_opentui_argv() → [bun, …/ui-tui-opentui-v2/src/entry/main.tsx] (cwd
ui-tui-opentui-v2, --watch in dev). v2 `bun run check` green (53 tests / 7 files).
Smoke P8 + matrix updated. Remaining: header chrome detail (5b), agent-feature
trail (5d), distribution (§10) — polish, not first-class blockers.
2026-06-08 16:27:21 +00:00
alt-glitch
edc4164704 feat(opentui-v2): Phase 5e — agents dashboard (7th first-class surface; ALL done)
The agents dashboard (spec §2b; Ink agentsOverlay) — the last first-class
interactive surface. Subagent delegations are tracked from the `subagent.*`
event stream and shown in a full-height overlay.

- store: subagents[] built from subagent.{spawn_requested,start,thinking,tool,
  progress,complete} by subagent_id (status·goal·model·depth·lastTool·summary);
  clearTranscript clears them. dashboard flag + openDashboard/closeDashboard.
- view/overlays/agentsDashboard.tsx: full-height overlay (replaces transcript+
  composer), depth-indented subagent rows colored by status, scroll via
  scrollBy/scrollTo, Esc/q close. Empty state prompts to delegate.
- view/App.tsx: content zone is now a <Switch> — pager / agents dashboard /
  (transcript + input zone).
- logic/slash.ts: /agents, /tasks → openDashboard (SlashContext.openDashboard).

Verified: bun run check green (53 tests / 7 files) — subagent reducer + a
dashboard frame test (seeded tree renders, transcript replaced) + /agents
dispatch. LIVE tmux: /agents opened empty; then a REAL delegation spawned a
subagent → /agents showed "⛓ Agents · 1 subagent · ● completed <goal>
(model) terminal". ALL 7 first-class surfaces are now +tested+smoked
(blocking prompts, pager, session switcher, model picker, skills hub,
completions, agents dashboard). Smoke P5e + matrix updated. Remaining: chrome
(5b), agent-feature polish (5d), launcher (8).
2026-06-08 16:23:17 +00:00
alt-glitch
7412cd5c78 feat(opentui-v2): Phase 5a — slash completions dropdown (last first-class overlay)
A live slash-completion dropdown renders above the composer as you type `/…`
(spec §1 autocomplete) — the 6th and final first-class overlay surface.

- view/composer.tsx: onContentChange → onType (reads ta.plainText); a dropdown
  of candidates (display + meta) renders above the textarea when completions are
  set. The textarea owns key input (live refine-by-typing), so Tab accepts the
  top match (ta.clear()+insertText) and Esc dismisses; arrow-nav would fight the
  cursor (noted polish).
- store: completions state + setCompletions/clearCompletions; CompletionItem.
- logic/slash.ts: mapCompletions(complete.slash result) → candidates.
- entry: onType queries complete.slash for `/word` (no space) and sets/clears the
  store completions; cleared on submit / non-slash / space.

Verified: bun run check green (49 tests / 7 files) — mapCompletions + a
composer-dropdown frame test. LIVE tmux: typing `/comp` showed /compress,
/composio, /compact (with descriptions); Tab accepted the top + cleared the
dropdown. ALL 6 first-class overlays are now +tested+smoked (blocking prompts,
pager, session switcher, model picker, skills hub, completions). Smoke P5a +
matrix updated. Remaining: chrome (5b), agent features (5d), agents dashboard (5e).
2026-06-08 16:15:38 +00:00
alt-glitch
3f54152191 feat(opentui-v2): Phase 5c — model picker + skills hub (generic Picker overlay)
A reusable generic picker (titled <select> + onPick) powers two more first-class
overlays (spec §2b):

- view/overlays/picker.tsx + store picker/openPicker/closePicker + PickerItem.
- /model: bare → model.options → a picker of authenticated providers' models
  (current marked ✓), pick switches via `slash.exec model <name>`; `/model <name>`
  switches directly without the picker.
- /skills: skills.manage {action:list} → a picker flattened from
  {category: names[]}; picking inspects (skills.manage inspect) → the pager.
- view/App.tsx: the input zone is now a <Switch> — prompt → switcher → picker →
  composer (overlays replace, never stack, so the composer remounts/refocuses).

Verified: bun run check green (47 tests / 7 files) — /model bare→picker (auth
filtered, current marked, pick→slash.exec), /model <name> direct, /skills flatten.
LIVE tmux: /model → picker listing 8 models (anthropic/claude-opus-4.8 ▶, nous,
…), Esc closed clean; /skills → hub listing skills w/ category descriptions.
5 of 6 first-class overlays done (prompts, pager, session switcher, model picker,
skills hub) — completions dropdown remains. Smoke P5c + matrix updated.
(Note: model.options is ~5s server-side; a loading indicator is a polish TODO.)
2026-06-08 16:08:25 +00:00
alt-glitch
3fe7709b86 feat(opentui-v2): Phase 5c — session switcher overlay (list → pick → resume)
A first-class picker (spec §2b, Ink activeSessionSwitcher): /sessions (aliases
/resume, /switch, /session) → session.list → a native <select> overlay; Enter
resumes the chosen session via the SAME resumeInto hydrate path as launch, so
tool rows + transcript hydrate correctly. Esc closes. Reuses Phase 4b resume.

- view/overlays/sessionSwitcher.tsx: <select> of sessions (title / preview /
  message count), onSelect → onPick(id); Esc cancels.
- store: switcher state + openSwitcher/closeSwitcher; SessionItem type.
- logic/resume.ts: mapSessionList(session.list result) → SessionItem[].
- logic/slash.ts: /sessions|/resume|/switch|/session client commands +
  listSessions/openSwitcher on SlashContext.
- entry: resumeInto extracted (shared by bootstrap + switcher); slashCtx wires
  listSessions (session.list → mapSessionList) + openSwitcher; onResume runs
  resumeInto via runFork. App input zone is now prompt → switcher → composer
  (overlays replace, not stack, so the composer remounts/refocuses on close).

Verified: bun run check green (43 tests / 7 files) — slash /sessions → switcher,
+ a switcher frame test (rows render, composer replaced). LIVE tmux: /sessions
listed real titled sessions w/ counts/previews; ↓+Enter resumed the picked one
(hydrate_ms=8) → transcript hydrated incl. the terminal tool row; switcher
closed, composer returned; /quit clean. 3 of 6 first-class overlays done
(prompts, pager, switcher). Smoke P5c + matrix updated.
2026-06-08 16:00:39 +00:00
alt-glitch
abce50e34d feat(opentui-v2): Phase 5a — pager overlay for long slash output
A full-height scrollable pager (the FloatBox analog) — porting it unlocks the
long-output slash commands (/status /logs /history /tools) at once (spec §2b).

- view/overlays/pager.tsx: bordered full-height overlay (title + scrollbox +
  footer), scrolling driven explicitly via useKeyboard → scrollBy/scrollTo (no
  reliance on scrollbox auto-focus), Esc/q/Ctrl+C close. §8 #2 scrollbox gotchas.
- store: pager state + openPager/closePager.
- view/App.tsx: content zone swaps to the Pager (replacing transcript+composer)
  when store.state.pager is set; the close is deferred a tick so the closing key
  can't leak into the remounting composer.
- logic/slash.ts: present() routes output to the pager when long (>180 chars or
  >2 non-empty lines, Ink parity) else a system line; titled by command; /logs
  always pages. New openPager on SlashContext.

Verified: bun run check green (41 tests / 7 files) — present() routing
(short→system, long→pager) + a pager frame test (renders title/content, replaces
the transcript/composer). LIVE tmux: /logs → pager (title "Logs", scroll via
PageDown, Esc closed → composer refocused, no key-leak); /version (5-line output)
→ pager titled "Version". Smoke P5a + parity matrix updated. Completions dropdown
+ pickers + chrome are the next slices.
2026-06-08 15:52:36 +00:00
alt-glitch
c704d384f4 feat(opentui-v2): Phase 4b — session resume with tool/transcript hydration
HERMES_TUI_RESUME=<id|recent> resumes a session instead of creating one:
session.most_recent (for "recent") → session.resume {cols, session_id} →
commitSnapshot(mapResumeHistory(messages)), buffering live events across the RPC.

- logic/resume.ts: maps the session.resume history into Message[]. Resumed tool
  rows arrive as {role:'tool', name, context} (NO text — gotcha §8 #5); they're
  FOLDED into the preceding assistant turn's ordered parts (state:'complete',
  summary=context) so a resumed transcript renders the tools INLINE like a live
  one. Assistant text gets a text part (renders via native markdown). User/system
  stay flat. Unknown roles / non-arrays are ignored.
- logic/store.ts: hydrate split into beginBuffer() + commitSnapshot() so the live
  event buffer spans the async resume RPC (events that arrive during resume are
  replayed after the snapshot, in order).
- entry/main.tsx: bootstrap branches create vs resume; the resume path is timed
  (rpc_ms / hydrate_ms) for profiling.

Verified: bun run check green (40 tests / 7 files) — resume mapper (fold tool
rows, standalone holder, ignore junk) + beginBuffer/commitSnapshot replay. LIVE
tmux: Launch A created a session with a terminal tool call; Launch B
(HERMES_TUI_RESUME=recent) hydrated user + assistant + the tool row inline.
STRESS+PROFILE on a real 103-message session (~/.hermes/sessions): client hydrate
= 76ms, bun RSS = 214MB STABLE (no leak), tool rows hydrated, PageUp scroll works;
the 1.6s cost is the server-side session.resume RPC, not the TUI. Smoke P4 +
matrix updated. Note: rows instantiate for the full history (scrollbox culls
render only) → RSS ~linear in turns; list virtualization is the lever if
multi-thousand-turn sessions become a target.
2026-06-08 15:31:46 +00:00
alt-glitch
4f2bb7e52f feat(opentui-v2): Phase 4a — slash command system + local confirm dialog
The composer now routes `/command` through the Ink-parity dispatch ladder
instead of submitting it as a prompt (spec §1):

- logic/slash.ts: parseSlash + dispatchSlash — client-local command →
  slash.exec {command, session_id} (output → system line) → on reject
  command.dispatch {arg, name, session_id} with typed handling
  (exec/plugin→system · alias→re-dispatch · skill/send→submit a turn ·
  prefill→notice). 6 client commands: help/quit/exit/clear/new/logs.
- /help renders the live `commands.catalog` (reads the `pairs` shape).
- view/prompts/confirmPrompt.tsx + store.setConfirm: a LOCAL (non-gateway) Y/N
  dialog for /clear and /new; store gains pushSystem + clearTranscript.
- entry: a Promise-returning `request` adapter + the SlashContext wiring (quit →
  renderer.destroy, confirm, clearTranscript, logTail, submit).

Also fixes a keystroke-leak: the key that ANSWERED a prompt was bleeding into the
freshly-refocused composer (`/clear`→y left "y" in the input, breaking the next
`/quit`). PromptOverlay now defers the prompt-clear (composer remount) past the
current keystroke — this hardens every Phase 3 prompt too.

Verified: bun run check green (36 tests / 6 files) — slash.test covers parse + the
full ladder against a fake context. LIVE tmux: /help → full gateway catalog;
/version → slash.exec output; /clear → confirm → cleared, no key-leak (typed "hi"
not "yhi"); /quit → clean quit, child reaped. Remaining TUI-only commands,
completions, pager routing, and session resume are 4b/4c. Smoke P4 + matrix updated.
2026-06-08 15:20:06 +00:00
alt-glitch
1bc376921f feat(opentui-v2): Phase 3 — blocking prompts (clarify/approval/sudo/secret), no deadlock
The 4 gateway *.request events now drive a blocking-prompt overlay instead of
deadlocking the agent (spec §8 #6). Native OpenTUI paradigm (per glitch's steer):

- view/prompts/approvalPrompt.tsx: native <select> (once/session/always/deny)
  → approval.respond {choice, session_id}.
- view/prompts/clarifyPrompt.tsx: native <select> over choices + an "✎ Other…"
  option that swaps to a native <input> for free-text → clarify.respond
  {answer, request_id}.
- view/prompts/maskedPrompt.tsx: sudo (🔐) / secret (🔑) — native <input> has no
  mask, so we own a buffer via useKeyboard and render '*' per char →
  sudo/secret.respond {password|value, request_id}.
- view/prompts/promptOverlay.tsx: dispatches by prompt kind, binds each
  answer/cancel to the matching *.respond; Esc/Ctrl+C → deny/empty so the agent
  always unblocks.

Wiring: store gains ActivePrompt state + the 4 reducer cases + clearPrompt;
App swaps Composer↔PromptOverlay on store.state.prompt (so the composer textarea
stops capturing keys while blocked); renderer.ts gates the global Ctrl+C-quit on
isBlocked() so a prompt owns Ctrl+C (→ cancel); entry adds a generic `respond`
runFork callback + passes sessionId.

Verified: bun run check green (28 tests / 5 files) — reducer set/clear for all 4,
+ a frame test (approval overlay renders the command + all options as a bordered
modal, composer hidden while blocked). LIVE tmux: a real `rm -rf` approval fired;
Approve-once → command ran → unblocked; Esc → deny → "BLOCKED by user" →
unblocked; Ctrl+C-while-blocked cancelled WITHOUT quitting; Ctrl+C-unblocked quit
clean, no orphan. Smoke P3 + parity matrix updated. confirm (local) → Phase 4.
2026-06-08 15:06:58 +00:00
alt-glitch
6cefb7c5b5 feat(opentui-v2): Phase 2b-ii — native markdown for assistant text (Phase 2 done)
Assistant text parts now render through the NATIVE markdown renderable instead of
plain spans — bold/headings/lists/fences render, raw `**`/backtick markup is
concealed (spec §7; never hand-roll a parser).

- view/markdown.tsx: `<code filetype="markdown" streaming conceal drawUnstyledText>`
  (CodeRenderable — opencode's v2 AssistantText path; `<markdown>` +
  internalBlockMode="top-level" deferred paint headlessly). SyntaxStyle.fromStyles
  is derived from the theme (markup.* → theme.color.*, non-hex colors guarded) and
  cached by theme-object identity so all text parts share one instance, rebuilt
  only on skin change. drawUnstyledText paints raw text immediately while
  Tree-sitter highlighting settles (and makes it headless-capturable).
- view/messageLine.tsx: text-part Match renders <Markdown> instead of <text>.
- test/lib/render.ts: settle async markdown via flush(); captureFrame gains an
  `until` option (waitForFrame) for content that paints after the first pass.

Verified: bun run check green (23 tests / 5 files). Live tmux: a markdown reply
(heading + bold word + 2-item list) rendered with `**` concealed (grep -c '**' = 0);
Ctrl+C clean, no orphan. Phase 2 complete (2a shell + 2b-i parts/tools + 2b-ii
markdown) — smoke steps 1–4 run live. Next: Phase 3 blocking prompts.
2026-06-08 14:49:52 +00:00
alt-glitch
59fbc05031 feat(opentui-v2): Phase 2b-i — ordered parts + inline tool render
An assistant turn is now ONE ordered parts[] (text/reasoning/tool) instead of a
flat string, so tool calls render INLINE between text blocks rather than dumped
as separate rows below (spec §7 — the "dump-below" bug opencode's sync-v2 avoids).

- logic/store.ts: Part discriminated union + reducer rework. message.delta
  appends to the open text part (or opens one); tool.start pushes a running tool
  part; tool.complete matches by tool_id and updates that part IN PLACE (state,
  envelope-stripped resultText, summary, error, lineCount); reasoning.delta
  accumulates a reasoning part. User/system rows stay flat text; settled/resumed
  assistant rows fall back to text.
- logic/toolOutput.ts: ported pure helpers — stripToolEnvelope (unwrap
  {output,exit_code}, append [exit N]/[error] suffix) + collapseToolOutput +
  truncate.
- view/messageLine.tsx: <For>+<Switch> dispatch by part.type with stable id keys.
- view/toolPart.tsx: two-tier render — inline one-liner (≤1 output line) or a
  capped left-bar block (TOOL_MAX_LINES, "… +N more", click-to-expand) keyed off
  the theme; reactive width via useTerminalDimensions.

Verified: bun run check green (23 tests / 5 files / 64 expects) — store
interleave/in-place/reasoning, a frame test asserting the tool renders inline +
envelope stripped, and toolOutput unit tests. Live tmux: a terminal-tool prompt
rendered " terminal" with its alpha/beta output inline between the assistant's
text parts; Ctrl+C clean, no orphan. Smoke P2b + parity matrix updated. Native
<markdown> for text parts is the next slice (2b-ii).
2026-06-08 14:42:24 +00:00
alt-glitch
4b81ded58b feat(opentui-v2): Phase 2a — scrollbox transcript + textarea composer + header
Turns the read-only Phase-1 view into an interactive shell, split into focused
view components (spec v4 §2 layout):

- view/transcript.tsx: ONE full-height <scrollbox> with a reactive <For>
  (opencode's no-scrollback model). Applies the §8 #2 gotchas exactly:
  minHeight:0 on the wrapper AND the scrollbox, NO flexDirection on the
  scrollbox root, stickyScroll + stickyStart="bottom".
- view/composer.tsx: a native <textarea> captured by ref — flexShrink:0,
  focus-on-mount, Enter->submit via keyBindings, imperative .clear() on submit,
  and a `submitting` re-entrancy guard. Wired by the entry to fire prompt.submit
  (Effect.runFork on the in-hand service value); it's now the PRIMARY input, with
  the HERMES_TUI_PROMPT stand-in kept only for launch-with-prompt.
- view/header.tsx + view/messageLine.tsx: extracted, themed (no hardcoded
  styles). MessageLine stays flat-text this slice; ordered parts (§7) land in 2b.

test/lib/render.ts now flushes 3 renderOnce passes before capture — a <scrollbox>
needs more than one pass to measure content + apply sticky, else the transcript
row paints blank.

Verified: bun run check green (12 tests / 4 files / 31 expects). Live tmux drive:
typed into the composer -> cleared -> user row -> streamed reply ("Here are three
words"); Ctrl+C quits cleanly even with the textarea focused, no orphan child.
Composer placeholder rendered the live skin's welcome string (skin->theme live).
Smoke P2a + parity matrix updated. Phase 2b (ordered parts/tool render/markdown)
is the next slice.
2026-06-08 14:28:59 +00:00
alt-glitch
927c902785 feat(opentui-v2): Phase 1 — live tui_gateway transport + Solid store + theming
GatewayService/liveGateway over the real Python tui_gateway: JSON-RPC stdio
framing (Bun.spawn), 16ms event coalescing flushed inside Solid batch(), typed
GatewayError, and a decode-once GatewayEvent Schema (~35-member tagged union;
unknown/malformed events skip via Option.none, never crash the stream).

The Solid sync-v2-style store grows to: streaming text concat (prefer
payload.text), gateway.ready{skin}/skin.changed -> fromSkin reactive re-theme,
LRU id-dedup, and hydrate-while-buffering (resume scaffold). Theming is a 1:1
port of Ink's theme.ts (DARK/LIGHT, detectLightMode, ANSI-256 normalization,
fromSkin) behind a Solid ThemeProvider so existing skins work unchanged and the
view carries NO hardcoded styles. A console-safe diagnostics log (in-memory
ring + NDJSON file) is the single logging path.

Entry gains a live launch path (default; HERMES_TUI_FAKE=1 -> scripted hello)
with an initial-prompt bootstrap (session.create -> prompt.submit) as the
Phase-2-composer stand-in, plus a minimal Ctrl+C graceful quit
(renderer.destroy -> shutdown Deferred -> scope finalizers -> client.stop) so
the engine reaps its own gateway child instead of orphaning it.

Verified: bun run check green (tsc + eslint + 12 tests / 4 files); live tmux
drive connect -> gateway.ready -> prompt -> streamed reply ("pong") -> clean
teardown with no orphan bun/python. Parity matrix + smoke P1 run log updated.
2026-06-08 14:19:21 +00:00
alt-glitch
d3943fe37d feat(opentui-v2): Phase 0 scaffold — Solid + Effect-at-boundary native TUI
New from-scratch package ui-tui-opentui-v2/ (NOT a port of the superseded React
ui-tui-opentui/; Ink ui-tui/ untouched). Mirrors opencode's method: @opentui/solid
view, Effect 4.0-beta only at the boundary (renderer lifecycle, GatewayService
transport, runtime), plain Solid for the logic/view.

Phase 0 (per docs/plans/opentui-rewrite-v4-spec.md §11):
- deps pinned: effect@4.0.0-beta.78, @opentui/{core,solid,keymap}@0.3.2, solid-js@1.9.10
- strict rails: tsconfig (verbatimModuleSyntax, exactOptionalPropertyTypes,
  noUncheckedIndexedAccess, jsxImportSource @opentui/solid), eslint, prettier
- boundary: acquireRelease(createCliRenderer) + finalizers + Deferred-on-destroy;
  GatewayService (Context.Service) shape; typed errors (Data.TaggedError); AppLayer
- logic (Solid): createSessionStore + apply(event) reducer (sync-v2 model, minimal)
- view (Solid): App shell (header + transcript); inline color via <span style={{fg}}>
- entry: the one-line render(() => <App/>, renderer) bridge + Effect.provide(layer)
- FakeGateway layer (test/dev seam) streaming a scripted hello
- test rails: test/lib/effect.ts (testEffect/testLayer over ManagedRuntime + TestClock,
  no @effect/vitest), test/lib/render.ts (testRender + renderOnce + captureCharFrame)
- 4-layer tests (boundary/store/render) 5/5 green; scripts/check.sh gate green
- docs: v4 spec + living smoke doc (Phase 0 PASS logged); v3 spec + parts/markdown
  plan marked superseded

Verified: bun run check green (tsc 0, eslint 0, bun test 5/5); live tmux drive paints
'hermes · opentui · ready' + '✦ Hi there, glitch!' in a real TTY.
2026-06-08 13:36:54 +00:00
alt-glitch
2bd9c9b881 opentui(phase3): launcher integration — HERMES_TUI_ENGINE dual-engine
hermes --tui launches the native OpenTUI engine (Bun) when
HERMES_TUI_ENGINE=opentui (env) or display.tui_engine=opentui (config);
Ink stays the default and the shipping path is untouched.

- _resolve_tui_engine() (env > config > ink); refuses opentui on
  Windows/Termux (no Bun) -> falls back to ink with a notice.
- _make_opentui_argv() -> [bun, src/entry.real.tsx] (no build step).
- _bun_bin() with HERMES_BUN override.
- Branch at top of _make_tui_argv BEFORE _ensure_tui_node (Bun-only host
  must not bootstrap Node).
- Gate _launch_tui NODE_OPTIONS/--max-old-space-size on engine==ink (Bun
  is JSC; the V8 flag errors/ignores).

Verified end-to-end via tmux: real hermes --tui -> Bun -> OpenTUI ->
real Python gateway streamed a real reply. No-flag default still ink.
2026-06-08 11:11:54 +00:00
194 changed files with 38436 additions and 43 deletions

View File

@@ -1,12 +1,14 @@
FROM ghcr.io/astral-sh/uv:0.11.6-python3.13-trixie@sha256:b3c543b6c4f23a5f2df22866bd7857e5d304b67a564f4feab6ac22044dde719b AS uv_source
# Node 22 LTS source stage. Debian trixie's bundled nodejs is pinned to 20.x
# which reached EOL in April 2026 we copy node + npm + corepack from the
# upstream node:22 image instead so we can stay on a supported LTS without
# waiting for Debian 14 (forky, ~mid-2027). Bookworm-based slim image used
# so the produced binary links against glibc 2.36, which runs cleanly on
# our Debian 13 (trixie, glibc 2.41) runtime. Bumping to a new Node major
# is a one-line ARG change; see #4977.
FROM node:22-bookworm-slim@sha256:7af03b14a13c8cdd38e45058fd957bf00a72bbe17feac43b1c15a689c029c732 AS node_source
# Node 26 source stage. Debian trixie's bundled nodejs is pinned to 20.x
# (EOL April 2026), so we copy node + npm + corepack from the upstream node:26
# image instead. Node 26 (Current; LTS promotion ~Oct 2026) is REQUIRED by the
# native OpenTUI TUI engine, which loads its renderer via the experimental
# `node:ffi` API that only exists on Node 26.3+ (the Ink engine + web build run
# on it too). Bookworm-based slim image used so the produced binary links
# against glibc 2.36, which runs cleanly on our Debian 13 (trixie, glibc 2.41)
# runtime. The pinned tag ships v26.3.0. Bumping Node is a one-line change here.
# NOTE: verify the full image build + Ink/web/Playwright on Node 26 in CI.
FROM node:26-bookworm-slim@sha256:79723b41edbedf595f62e943a9f8b0ba9af5b1e61045c5f8f59c2c02c1212a16 AS node_source
FROM debian:13.4
# Disable Python stdout buffering to ensure logs are printed immediately
@@ -90,7 +92,7 @@ RUN useradd -u 10000 -m -d /opt/data hermes
COPY --chmod=0755 --from=uv_source /usr/local/bin/uv /usr/local/bin/uvx /usr/local/bin/
# Node 22 LTS: copy the node binary plus the bundled npm + corepack JS
# Node 26: copy the node binary plus the bundled npm + corepack JS
# installs from the upstream image. npm and npx are recreated as symlinks
# because they're symlinks in the source image (and need to live on PATH).
# See node_source stage at the top of the file for the version-bump
@@ -119,7 +121,7 @@ COPY ui-tui/packages/hermes-ink/ ui-tui/packages/hermes-ink/
# `npm_config_install_links=false` forces npm to install `file:` deps as
# symlinks instead of copies. This is the default since npm 10+, which is
# what the image ships now (via the node:22 source stage). We set it
# what the image ships now (via the node:26 source stage). We set it
# explicitly anyway as defense-in-depth: the previous Debian-bundled npm
# 9.x defaulted to install-as-copy, which produced a hidden
# node_modules/.package-lock.json that permanently disagreed with the root
@@ -181,8 +183,16 @@ RUN uv sync --frozen --no-install-project --extra all --extra messaging --extra
# invalidate the (relatively slow) web + ui-tui build layer.
COPY web/ web/
COPY ui-tui/ ui-tui/
COPY ui-opentui/ ui-opentui/
# ui-opentui is the opt-in native OpenTUI engine (HERMES_TUI_ENGINE=opentui;
# default stays Ink). .dockerignore strips its node_modules/dist, so install +
# esbuild-build it here -> dist/main.js, then prune devDeps (esbuild/babel/
# vitest); the runtime only needs the prod deps (the external @opentui/core +
# its native blob -- the bundle inlines solid/effect). Build needs Node 26.3
# (node:ffi floor), which this image ships.
RUN cd web && npm run build && \
cd ../ui-tui && npm run build
cd ../ui-tui && npm run build && \
cd ../ui-opentui && npm install --no-audit --no-fund && npm run build && npm prune --omit=dev
# ---------- Source code ----------
# .dockerignore excludes node_modules, so the installs above survive.

View File

@@ -107,6 +107,8 @@ You can still bring your own keys per-tool whenever you want — the gateway is
Hermes has two entry points: start the terminal UI with `hermes`, or run the gateway and talk to it from Telegram, Discord, Slack, WhatsApp, Signal, or Email. Once you're in a conversation, many slash commands are shared across both interfaces.
> **TUI engine:** On supported hosts (Linux/macOS with Node 26.3+), the terminal UI defaults to the native **OpenTUI** engine, which the installer provisions for you. The legacy **Ink** engine remains the fallback — it's used automatically on Windows, Termux, or when the native engine can't run, and you can select it explicitly with `HERMES_TUI_ENGINE=ink hermes`. Ink is not going away; it's the kept fallback.
| Action | CLI | Messaging platforms |
| ------------------------------ | --------------------------------------------- | -------------------------------------------------------------------------------- |
| Start chatting | `hermes` | Run `hermes gateway setup` + `hermes gateway start`, then send the bot a message |

68
docs/ink-env-flags.md Normal file
View File

@@ -0,0 +1,68 @@
# Ink TUI — diagnostic environment flags
Non-secret behavioral knobs for the Ink engine (`ui-tui/`). These are
**environment overrides**, not `.env` secrets — set them in your shell for a
session, or `export` them in your shell rc to make them sticky. They mirror the
OpenTUI engine's flags (`docs/opentui-env-flags.md`) so a single switch covers
both engines.
| Flag | Default | What it does |
|---|---|---|
| `HERMES_TUI_DIAGNOSTICS` | off | Master diagnostics switch. Turning it on enables the developer/profiling surface across the TUI — including the memory self-sampler below. One `export HERMES_TUI_DIAGNOSTICS=1` in your shell rc covers **every** session you start, on **either** engine. |
| `HERMES_TUI_MEMLOG` | = `HERMES_TUI_DIAGNOSTICS` | In-process 1Hz memory self-sampling (`ui-tui/src/lib/memlog.ts`) → `~/.hermes/logs/memwatch/<boot>-<pid>.jsonl`. Defaults to the master switch; set `=1` / `=0` to force it on/off independently. |
## What the memory trace captures
Each Ink session, when sampling is enabled, appends one JSON line per second to
its own file under `~/.hermes/logs/memwatch/`, keyed by boot time + pid:
```json
{"t":1781514892,"rss_kb":92148,"heap_used_kb":7234,"external_kb":2378}
```
- `t` — unix seconds.
- `rss_kb` — resident set size (the number that matters for the native-RSS-gap
story: rss climbing while heap stays flat is the #15141-class signal).
- `heap_used_kb` — V8 heap in use.
- `external_kb` — off-heap (buffers, native allocations).
**Ink emits no `mounted` / `peak_mounted` field.** Those are OpenTUI's
windowing dev counters; Ink has no windowing, so it logs the rss/heap/external
core only. `memwatch-report.mjs` treats `mounted` as optional, so Ink lines
aggregate cleanly alongside OpenTUI's.
## Why this exists — cross-engine memory comparison
The filename scheme, directory, and line schema are **byte-compatible with
OpenTUI's collector** (`ui-opentui/src/boundary/memlog.ts`). Both engines write
to the same `~/.hermes/logs/memwatch/` directory, so one aggregator reads both:
```sh
# enable on either/both engines (master switch covers both)
export HERMES_TUI_DIAGNOSTICS=1
HERMES_TUI_ENGINE=ink hermes --tui # Ink session → its own .jsonl
HERMES_TUI_ENGINE=opentui hermes --tui # OpenTUI session → its own .jsonl
# fleet table across BOTH engines' sessions:
cd ~/github/tui-bench && node memwatch-report.mjs
```
This is what makes a true side-by-side **real-world** memory arc possible —
cold floor → load → plateau/leak — instead of comparing OpenTUI dogfood traces
against an Ink harness with no equivalent data.
## Cost & safety
- ~50 bytes/s when on; one `process.memoryUsage()` + one short append per
second. The interval is **unref'd** — it never keeps the process alive.
- 14-day retention: older traces are pruned (best-effort) at start.
- **Every failure path disables the logger silently.** Diagnostics must never
break the TUI — this is the one place the "errors propagate" rule is
intentionally inverted, matching the OpenTUI collector.
- Off by default: regular users write nothing.
## Getting a meaningful trace
A short scroll-through won't show growth. For a comparison against OpenTUI's
45h sessions, drive a tool-heavy 23h Ink session as the floor (see
`docs/plans/opentui-ink-asymmetry-note.md` for why the harness ≠ dogfood data).

120
docs/opentui-dev-handoff.md Normal file
View File

@@ -0,0 +1,120 @@
# Handoff — OpenTUI memory + UX, continuing on the canonical branch
**You are continuing the Hermes OpenTUI engine work.** This is the base operating manual; the
user (glitch) appends specific tasks on top. Read it, then read the repo docs it points to. It
assumes NO prior transcript/memory.
## Where things are
- **Canonical branch: `feat/opentui-native-engine`** (the draft PR to main, #42922).
`feat/opentui-memory-window` is a synonym at the *same tip* — they were consolidated. Treat
native-engine as canonical; if you work from memory-window, periodically
`git push origin HEAD:feat/opentui-native-engine` to keep them in sync, or just use native-engine.
- The native engine source is **`ui-opentui/`**; the legacy Ink engine is `ui-tui/` (shipping
default, untouched by this campaign). The Python gateway is `tui_gateway/`, launcher
`hermes_cli/main.py`.
- **The worktree is often the user's LIVE global `hermes`** (`~/.local/bin/hermes` symlinks into a
worktree's `.venv`). Consequences: (1) NEVER leave the worktree in a half-merged/conflicted state
— a new `hermes` session would fail to build; (2) after you land source changes, rebuild
`dist/main.js` so the next session picks them up; (3) `hermes-stable` is the flip-back to the
stock `~/.hermes/hermes-agent` install if you need to bypass the worktree.
- Backups of pre-merge branch states exist as `backup/*` refs (recoverable via `git reset`).
## Runtime, build, gate (Node 26 — NOT Bun; the port is done)
```sh
export PATH="$HOME/.local/share/fnm/node-versions/v26.3.0/installation/bin:$PATH"
cd ui-opentui && node scripts/build.mjs # → dist/main.js (esbuild + Solid/JSX)
HERMES_TUI_MOUSE=1 node --experimental-ffi --no-warnings dist/main.js # launch; quit = double Ctrl+C
cd ui-opentui && npm run check # THE GATE: prettier+eslint(typed)+vitest (~700). Judge by `echo $?`, never a piped tail.
```
Never run bun here. Never run `hermes update` in the worktree (it flips the branch — recovery is
painful). Never broad-pkill tui_gateway (other live sessions). Host RAM ~15GB, often <5GB free —
run benches SEQUENTIALLY (the harness already wraps SUTs in `systemd-run … MemoryMax=2G`).
## The docs that are the source of truth (read, and KEEP UPDATED as you change things)
- `docs/opentui-memory-story.md` — ELI5 of the whole memory architecture (primitives + every decision).
- `docs/plans/opentui-transcript-windowing.md` — windowing design (S1 spacers, S2 append-time), the
`correctionIsLegal` zero-jank law, pre-registered gates, SHIPPED status + S3 backlog.
- `docs/opentui-env-flags.md` — the consolidated env-flag ledger (master switch / user / dev / plumbing).
- `docs/opentui-upstream-alignment.md` — forkless invariant, `boundary/` shim ledger, the per-release
OpenTUI upgrade playbook (native-yoga is coming upstream — re-tune windowing margins when it lands).
- the bench suite (cells, harness, live-attach, memwatch) now lives in its own
repo: **tui-bench** (`github.com/NousResearch/tui-bench`); see its `README.md`.
- `ui-opentui/README.md` — Node 26 onboarding (fnm setup that doesn't disturb other projects).
- `docs/plans/ink-memory-adversarial-review.md` — Ink's memory weaknesses (F1F10, the turnabout).
- `docs/plans/gateway-death-forensics.md`, `docs/plans/workorder-2026-06-11-results.md`,
`docs/plans/rebase-from-main-spec.md` — forensics, the merge-bar verdict, the rebase plan.
## Workflow (this is how the last 60+ commits were produced with ~zero rework)
1. **Subagent-driven** (skill: `subagent-driven-development`): one implementer per task with a TIGHT
file fence ("you own exactly these files; `git diff --cached --stat` before commit, abort on
out-of-fence"), a mandatory `opentui` skill read FIRST for any renderable work, and a gate judged
by exit code. Verify the self-report YOURSELF (re-run the gate, read the riskiest hunks, check the
commit file-list) — a subagent "✅ done" is a claim, not a fact.
2. **Adversarial review** after a task: a fresh read-only reviewer (Explore-type) with NAMED attack
surfaces. Then ADJUDICATE in code — reviewers over-flag; ~half of "blockers" don't survive a read.
3. **Parallel implementers are safe ONLY with disjoint file fences.** Read-only recon agents
parallelize freely.
4. **Live smoke catches what headless can't** — tmux + the `tmux-pane-screenshot` skill for real
colored frames. The demo: `node scripts/build.mjs scripts/demo.tsx .demo` then
`DEMO_TOTAL=2000 … node --experimental-ffi --no-warnings .demo/demo.js`.
5. Commit format `opentui(v6): …`, **NO attribution lines**. The user's standing instruction is
"commit + push as you land things" — honor it; otherwise don't push without asking. Edit large
load-bearing files (the Python launcher, `store.ts`) DIRECTLY, never via subagent.
## Dogfooding (the user works on this FROM the hermes TUI)
`export HERMES_TUI_DIAGNOSTICS=1` in the shell rc turns on, for every session: the `/mem` +
`/heapdump` slash commands, window-stats, and **fleet memory self-logging** to
`~/.hermes/logs/memwatch/<boot>-<pid>.jsonl`. Aggregate all sessions with
`node memwatch-report.mjs` from the **tui-bench** repo
(`github.com/NousResearch/tui-bench`) (per-session baseline/peak/slope + SLOPE/PEAK/MOUNTED anomaly
flags). Chase a flagged session with tui-bench's `live-attach.sh <pid> --heap`. The discipline: live
anomaly → encode as a bench cell → fix → validate against live sessions again.
## Current state (2026-06) + the ranked backlog
Windowing SHIPPED: 2k-msg peak ~300MB (was 686; Ink 234), scroll p99 6ms, cap restored 1000→3000,
determinism digest unchanged, peak mounted ~31 rows. Live sessions peak <200MB. The transcript is no
longer the biggest lever — the ~160MB floor is ≈104MB Node+OpenTUI runtime + **≈55MB tool/skill
catalogs hydrated at boot**. Ranked next levers:
1. **W3 — 1GB V8 heap default** (small, ~free): set the unconstrained default in
`_resolve_tui_heap_mb`; both engines are Node now so both inherit it. Ink half = separate gated
commit (shipping engine). Measured 90MB at bench scale.
2. **cg_peak harness fix** (small): the cgroup `memory.peak` field is polluted (shared across runs) —
reset/scope it before quoting tui-bench's `report.html` again. Trust `vmhwm_kb` + `samples[].rss_kb`.
3. **New bench cells** (before W1, as its baselines): `resume-1900` (real p99 shape: time-to-first-
paint + post-hydration RSS) and `10MB-tool-output` (the F1 byte-unbounded class). Run BOTH engines.
4. **Catalog lazy-load** (new, promoted by live data): don't hydrate 1,185 tools at boot — fetch on
picker-open. Attacks the ≈55MB floor; pays on EVERY session (median is 20 msgs). Likely cheaper
than W1.
5. **W1 thin renderer** (structural, biggest): bodies live in the gateway (SQLite); TUI keeps ~300B
stubs + fetches bodies for the window only. Design the gateway windowed-read RPC FIRST. WATCH: `/copy`
and the ⧉ block-copy read store parts — they need a fetch-on-demand fallback or W1 ships a copy regression.
6. **Standing**: when native-yoga OpenTUI ships, run the upgrade playbook (re-bench, re-tune margins,
audit the shim ledger). Three questions to relay to the OpenTUI maintainer are in the alignment doc.
## What NOT to do
- Don't copy opencode's 100-msg store cap (user's p90 session is 182 msgs — it would truncate normal use).
- Don't reintroduce estimate-correction scroll jank (the user explicitly vetoed it; `correctionIsLegal` forbids it).
- Don't cite the obsolete "~210MB bun renderer / +120MB" memory figures — pre-port, pre-windowing, wrong.
- Don't push/PR without the standing OK; don't commit `.plans/` scratch unless asked.
## Suggested skills
(All available from the Hermes TUI agent too — this is the dogfooding surface. Curated to the load-bearing set, not the full ~40-skill catalog.)
- `opentui-tui-engineering` — the workflow/architecture/pitfalls layer for `ui-opentui/` (just updated).
- `hermes-tui-architecture` — the Hermes-specific TUI facts (launch pipeline, both engines; just updated).
- `opentui` — the offline renderable-API doc set; mandatory `skill_view` before any view/renderable code.
- `subagent-driven-development` — the process spine for parallel/heavy work.
- `tmux-pane-screenshot` — real colored PNG of a tmux pane for visual verification (ported
into hermes skills 2026-06-13). Use: `bash ~/.hermes/skills/software-development/
tmux-pane-screenshot/scripts/tshot.sh <session:win.pane> out.png 2`, then Read the PNG.
`freeze` (~/go/bin) + the resvg rasterizer are shared/system-wide — works as-is.
- `effect-ts` — for the Effect-at-boundary entry/lifecycle code.
- `superpowers:brainstorming` — before committing to a memory-architecture design (e.g. W1's store split).
- `systematic-debugging` — if a gate fails; root-cause before patching.

81
docs/opentui-env-flags.md Normal file
View File

@@ -0,0 +1,81 @@
# OpenTUI env flags — the consolidated ledger
Every environment variable the OpenTUI TUI reads (grep-verified 2026-06-12),
classified by who should ever touch it. The design rule shipped with this doc:
**regular users see zero diagnostic surface by default; one master switch
(`HERMES_TUI_DIAGNOSTICS=1`) turns all of it on when needed.**
## 1. The master switch
| var | default | effect |
|---|---|---|
| `HERMES_TUI_DIAGNOSTICS` | **off** | Enables the diagnostic slash commands (`/mem`, `/heapdump`). While off they're hidden from `/help` (client-side filter) and invoking them prints the enable hint rather than executing. They never appear in slash *completion* in either state — completion is gateway-driven and these are client-only commands the gateway doesn't know (an adversarial review confirmed there's no bypass path; if a SERVER command named `mem`/`heapdump` is ever added it must be gated gateway-side too — the client gate would shadow but not hide it). Also flips the *default* of `HERMES_TUI_WINDOW_STATS` to on. Not a secret — support flows are "relaunch with `HERMES_TUI_DIAGNOSTICS=1`". |
## 2. User-facing configuration (fine to document publicly)
| var | default | effect |
|---|---|---|
| `HERMES_TUI_ENGINE` | auto (`opentui` if Node≥26.3 + built, else `ink`) | Engine pick; also `display.tui_engine` in config.yaml. |
| `HERMES_TUI_MOUSE` / `HERMES_TUI_MOUSE_TRACKING` / `HERMES_TUI_DISABLE_MOUSE` | on | Mouse support (wheel scroll, selection, click-to-expand). **Defers to Ink's env surface (`logic/env.ts` `resolveMouseEnabled`):** precedence is `HERMES_TUI_MOUSE_TRACKING` (toggle, force knob) > `HERMES_TUI_DISABLE_MOUSE=1` (legacy kill switch) > `HERMES_TUI_MOUSE` (OpenTUI-native alias, kept — also what the launcher sets) > default on. OpenTUI's renderer mouse is a single boolean, so Ink's granular off\|wheel\|buttons\|all collapses to on/off (the granular mode lives in `display.mouse_tracking` config). |
| `HERMES_TUI_SCROLL_SPEED` (alias `CLAUDE_CODE_SCROLL_SPEED`) | native | Wheel-scroll speed multiplier (Ink parity). UNSET → OpenTUI's native scroll acceleration (untouched). A positive value (clamped to (0,20]) installs a constant-multiplier `ScrollAcceleration` on the transcript scrollbox (`view/transcript.tsx`). |
| `HERMES_TUI_NO_CONFIRM` | off | Skip the destructive-action confirm step (`/clear`, `/new`) and run immediately (Ink parity, `NO_CONFIRM_DESTRUCTIVE`). Wired at the `confirm` seam (`entry/main.tsx`). |
| `HERMES_TUI_MAX_MESSAGES` | ceiling | Scrollback rows kept in the TUI. Can LOWER the ceiling, never raise: 3000 with windowing, 1000 with windowing off (handle-table safety). |
| `HERMES_TUI_TOOL_OUTPUT_LINES` | unlimited | Cap expanded tool-output lines (set a number to restore a cap). |
| `HERMES_TUI_TOOL_OUTPUTS` | **on** | Keep rich tool-call OUTPUTS (full result body + raw result/args dicts). `=off` drops both the RENDER and the STORE of those bodies (Ink parity: only a one-line context preview + name/duration/error/diff survive) — the memory lever for the OpenTUI-vs-Ink retention asymmetry, and what the bench launches OpenTUI with for the fair engine-overhead comparison (W3). Diffs (file-edit) are KEPT either way. |
| `HERMES_TUI_HEAP_MB` | cgroup-aware (default 8192) | V8 `--max-old-space-size` (MB) for BOTH engines. Highest precedence (then `display.tui_heap_mb` config, then the cgroup-75% fallback). Set it LOW for a low-mem session (still cgroup-clamped on top so it never exceeds the container); raise it to lift the ceiling. The low-mem opt-in signal that also arms `HERMES_TUI_PROACTIVE_GC` (W1). |
| `HERMES_TUI_PROACTIVE_GC` | = low-`HERMES_TUI_HEAP_MB` (≤4096) | Idle-gated `global.gc()` for the low-mem path. Defaults ON only when a low heap cap is set (so the knobs compose); `=on`/`=off` forces it. Needs `--expose-gc` (the OpenTUI argv now carries it). Never runs mid-stream; tightens cadence above 400MB RSS but stays idle-gated. OpenTUI-only — Ink never GCs proactively (W2). |
| `HERMES_TUI_COMPOSER_ROWS` | default rows | Composer height. |
## 3. Escape hatches & tuning (dev-facing, individually settable)
| var | default | effect |
|---|---|---|
| `HERMES_TUI_WINDOWING` | **on** | `0` = bit-exact pre-windowing renderer (every row mounts; cap clamps back to 1000). The A/B + regression escape hatch. |
| `HERMES_TUI_WINDOW_IDLE_MS` | ~1000 | Idle-measure pulse cadence (the spacer-exactness march). Test knob. |
| `HERMES_TUI_WINDOW_STATS` | = `HERMES_TUI_DIAGNOSTICS` | Exposes live/peak mounted-row counters (`globalThis.__hermesTuiWindowStats`) for tui-bench's live-attach reads. |
| `HERMES_TUI_MEMLOG` | = `HERMES_TUI_DIAGNOSTICS` | In-process 1Hz memory self-sampling (`boundary/memlog.ts`) → `~/.hermes/logs/memwatch/<boot>-<pid>.jsonl` (rss/heap/external + mounted rows; 14-day retention). Fleet view: `node memwatch-report.mjs` from the tui-bench repo (`github.com/NousResearch/tui-bench`). The "monitor all my sessions" answer: one `export HERMES_TUI_DIAGNOSTICS=1` in your shell rc covers every session. |
| `HERMES_TUI_LOG_LEVEL` / `HERMES_TUI_LOG_FILE` | engine defaults | Logging verbosity/destination (`/logs` reads the ring buffer regardless). Deliberately independent of the master switch — support often wants logs without the full diag surface. |
| `HERMES_HEAPDUMP_ON_START` | off | Write one V8 heap snapshot at boot (Ink parity). A deliberate baseline-capture escape hatch that BYPASSES the diagnostics master switch; lands at `$HERMES_HOME/logs/opentui-heap-<ts>.heapsnapshot` and echoes the path as a system line (`entry/main.tsx`). |
| `HERMES_TUI_NOTIFY` | on | Desktop-notification kill switch (`=0`/`false`/`off` silences the "waiting on you" pings). The ping itself goes through the renderer's native `triggerNotification` (protocol detection + tmux/Zellij wrapping); the window title is not gated by this. |
## 4. Internal plumbing (set by the launcher/tui-bench/tests — humans never set these)
| var | set by | effect |
|---|---|---|
| `HERMES_PYTHON`, `HERMES_PYTHON_SRC_ROOT`, `HERMES_CWD` | launcher / bench | Which gateway python + repo root + cwd the TUI spawns against (the bench's fake-gateway seam). |
| `HERMES_TUI_ACTIVE_SESSION_FILE` | launcher/bench | Session handoff file. |
| `HERMES_TUI_RESUME`, `HERMES_TUI_QUERY`, `HERMES_TUI_PROMPT`, `HERMES_TUI_IMAGE`, `HERMES_TUI_FAKE` | launcher/tests | Resume-at-boot; seeded prompt (`--tui "prompt"`: launcher sets `HERMES_TUI_QUERY`, the engine reads QUERY > the `HERMES_TUI_PROMPT` alias > a bare argv tail — `logic/env.ts` `startupPrompt`); seeded image PATH (`--image`: `HERMES_TUI_IMAGE`, `image.attach`ed before the prompt — `startupImage`, attach in `postSessionSetup`); fake-mode. |
| `HERMES_AUTO_HEAPDUMP*` (`_COOLDOWN_MS`/`_MAX_BYTES`), `HERMES_HEAPDUMP_DIR`, `HERMES_HEAPDUMP_MAX_BYTES` | — | **NOT read by the OpenTUI engine (deliberate).** The engine ports Ink's #34095 silent-death early-WARNING (a transcript system line, `boundary/memoryMonitor.ts`) but NOT the auto heap-SNAPSHOT capture — the always-on memlog NDJSON trace is the diagnosis path, and its rss-vs-heap divergence is the better diagnostic for the native-RSS leak class (#15141) a V8 snapshot captures poorly. So the #41948 disk-fill safety set (gate/cooldown/byte-cap/dir) has no consumer here. `HERMES_HEAPDUMP_ON_START` (manual one-shot, §3) is the only heapdump knob the engine honors. |
| `HERMES_TUI_RPC_TIMEOUT_MS`, `HERMES_TUI_STARTUP_TIMEOUT_MS` | tests/CI | Protocol timeouts. |
| (`ui-tui` only) `HERMES_TUI_MEMSAMPLE_FD/MS` | bench | Ink fd-3 node sampler. |
## 5. Ink flags NOT ported — handled natively or out of scope
These exist on the legacy Ink TUI (`ui-tui/`) and are deliberately **not** read
by the OpenTUI engine. Documented so a missing flag reads as a decision, not a gap.
| Ink flag | why not ported |
|---|---|
| `HERMES_TUI_TRUECOLOR` | OpenTUI core does COLORTERM/truecolor detection natively — the Ink force-truecolor hack is a fork workaround we shed. |
| `HERMES_TUI_FORCE_OSC52` | OpenTUI core owns OSC52 clipboard as a primitive; no fallback hint needed. |
| `HERMES_TUI_INLINE` / `HERMES_TUI_TERMUX_MODE` / `HERMES_TUI_TERMUX_FAST_ECHO` | Termux/primary-buffer accommodations. OpenTUI's native FFI floor (Node ≥26.3 + `--experimental-ffi`) is absent on Termux, so those sessions stay on **Ink** — these are correctly N/A for the OpenTUI engine. |
| `HERMES_TUI_FPS` | Ink FPS overlay; the OpenTUI equivalent is the diag/window-stats surface (`HERMES_TUI_WINDOW_STATS`). Not parity-critical. |
| `HERMES_DEV_CREDITS` / `HERMES_DEV_PERF*` | Dev-only throwaway scaffolding (live-spend readout, perf logging) — not user parity. |
| `HERMES_BIN` / `HERMES_TUI_GATEWAY_URL` / `HERMES_TUI_SIDECAR_URL` | External-CLI / remote-gateway-URL overrides. OpenTUI spawns its gateway via the Effect boundary (`liveGateway.ts`) and does not shell out to `hermes` or take an external gateway URL. |
| `HERMES_VOICE` | Voice mode is tracked on the OpenTUI parity backlog separately, not here. |
## How the pieces compose (the support script)
- Regular user, normal day: zero flags, zero diagnostic commands visible.
- "My TUI feels heavy" support flow: `HERMES_TUI_DIAGNOSTICS=1 hermes``/mem`
for the live numbers, `/heapdump` for a snapshot to attach, window stats
exposed for tui-bench's `live-attach.sh <pid>` to read.
- Developer profiling: same master switch + the individual knobs
(`HERMES_TUI_WINDOWING=0` A/B, `WINDOW_IDLE_MS` tuning) as needed.
- Anything in section 4 appearing in a user-facing doc is a bug.
Gating implementation: `logic/env.ts` (`diagnosticsEnabled()`),
`logic/slash.ts` (`DIAGNOSTIC_COMMANDS` — dispatch hint, help + completion
filtering), `view/transcript.tsx` (stats default). Tests:
`slash.test.ts` (gating both states), `utilityCommands.test.ts` (commands
themselves, gate enabled suite-wide).

View File

@@ -0,0 +1,207 @@
# How the OpenTUI transcript got from 686MB to ~300MB — the full story
*For: glitch. Branch: `feat/opentui-memory-window`. Everything here is measured,
not vibes; every number has a result JSON in the **tui-bench** repo's `results/` (`github.com/NousResearch/tui-bench`).*
---
## 1. The cast of characters (the primitives, bottom-up)
To understand where the memory went, you need to know who's holding it. Six
layers, from the screen up:
**The terminal grid.** Your terminal is a spreadsheet of character cells.
Nobody pays per-message here — tmux holds ~5MB flat no matter how long the
session is (we measured). The terminal is never the problem.
**The OpenTUI native renderer (Zig).** A compiled library that owns the
"frame buffer" — the grid of cells about to be painted. Every piece of text the
TUI shows lives in a native **TextBuffer** (the characters + their colors),
viewed through a **TextBufferView**, styled by a **SyntaxStyle**. Each of those
is a **native handle** — a ticket into one global table that has only **65,535
slots, total, ever** (16-bit indices — like a coat check with 65k hooks).
Destroying a renderable returns its tickets, so the constraint is not "how much
have you ever created" but **"how much is alive right now."**
**Renderables.** OpenTUI's UI objects — `<text>`, `<box>`, `<markdown>`,
`<code>`, `<scrollbox>`. One transcript row (a message with its tool calls,
markdown, code blocks, copy chips) is a *tree* of these: **~16 text renderables
≈ 47 native handles ≈ ~250340KB of RSS, per row.** This is the number that
drives everything. 1,400 mounted rows × 47 handles = table full = the crash we
root-caused last week.
**Yoga (the layout engine, WASM).** Every renderable also has a Yoga node —
Yoga is the flexbox calculator that decides where boxes go. OpenTUI ships it
compiled to **WebAssembly**, and WASM has a brutal property: its memory can
**grow but never shrink** back to the OS. So the peak number of
*simultaneously-mounted* renderables sets a high-water mark you pay **forever**,
even after everything is destroyed. (Fun fact from this week's forensics: we
spent two days believing Ink had this disease. It doesn't — our Ink fork swapped
Yoga-WASM for a plain TypeScript port at fork creation. **We** are the ones
running layout in WASM. The accusation was true; we just had the defendant
wrong.)
**Solid (the view framework).** Renders each store message into a row via
`<For>`. The property we exploit: Solid mounts/unmounts *surgically* — remove a
row from what the component returns and Solid destroys exactly that row's
renderables (returning its handles and freeing its Yoga nodes), touching
nothing else. No virtual-DOM diffing, no collateral re-renders.
**V8 (the JavaScript engine) + the store.** The store keeps every message as JS
strings/objects. V8's garbage collector is *lazy by design*: with the default
8GB ceiling we launch with, it sees no reason to clean up aggressively, so RSS
includes a lot of "collectible but not yet collected" garbage. Cheap to fix,
worth real MB (measured below).
**The scrollbox.** One detail that fooled everyone at some point:
`viewportCulling` (on by default) skips *drawing* offscreen rows — but they stay
fully **mounted**: handles held, Yoga nodes alive, memory paid. Culling saves
paint time, not memory. That misunderstanding is half the reason the "rolling
store cap" was expected to be enough, and wasn't.
## 2. Why it was 686MB
Simple arithmetic. The old TUI mounted **every message in the store** as a full
renderable tree. 2,000 messages × ~16 renderables × (handles + Yoga nodes +
text buffers + V8 objects) ≈ 670690MB, growing ~300MB per 1,000 messages. And
at ~1,400 rows the handle table filled: first a hard crash (exit 7), then —
after our containment fix — survival with **unstyled text** past that point,
plus a cap clamped from 3,000 rows down to 1,000 as the price of not crashing.
Ink, meanwhile, sat at ~234MB at the same workload, because Ink only ever
mounts the rows near your viewport (~84400 live nodes). Its memory is the
*data* plus some caches — not the *view*.
## 3. The decisions, in order
### Decision 1: virtualize the view, don't starve the store
Two ways to cut view memory: keep fewer messages (opencode's answer — they keep
100 and delete the rest from memory; transcript truth lives on their server), or
keep all messages but only *materialize* the ones near the viewport. You vetoed
the first (your p90 session is 182 messages — a 100-row store truncates normal
sessions), so: **windowing**. Notably the OpenTUI devs confirmed this week that
framework-level virtualization is the intended path — the engine doesn't ship
it out of the box, and opencode never built it. We did.
### Decision 2: exact heights, recorded at unmount — never estimates in your face
This is the load-bearing idea, and it's where we beat Ink at its own game.
The hard problem of any virtualized list: an unmounted row still needs to
occupy its correct *height*, or the scrollbar lies and content jumps. Ink
solves it by **guessing** heights and correcting after measurement — those
corrections are precisely the 83101ms scroll stutters you hate. You explicitly
vetoed "estimate-correction jank" as a model.
Our advantage: OpenTUI lays out with real, queryable heights. So when a row
scrolls out of the window, we record its **exact laid-out height** (an
`onSizeChange` hook fires inside layout, pre-paint) and replace the row with an
empty `<box height={exactly-that}/>` — a **spacer**: one Yoga node, zero text
buffers, zero native handles. Think of a bookshelf where books you're not
reading are swapped for cardboard sleeves cut to *exactly* the book's
thickness: the shelf never shifts, and you can't tell from across the room.
The window is your viewport ± one viewport of margin (plus hysteresis so it
doesn't thrash at the edges). Scroll near a spacer and the real row remounts —
at the recorded height, so nothing moves.
And one **law**, written into the code as `correctionIsLegal`: a spacer's
height may only ever be corrected where you *cannot see it* — fully above the
viewport (with the scroll position compensated in the same frame, so the world
doesn't move) or fully below it. A correction that would shift visible content
is forbidden, structurally. Jank isn't tuned down; it's outlawed.
### Decision 3 (the S2 insight): adjudicate on *append*, not just on scroll
S1 alone got 686 → 518MB. Why not more? Because of *when* windowing decided.
S1 re-decided the window when you **scrolled**. But during a streaming burst —
an agent turn dumping hundreds of rows — you don't scroll; rows arrive, each
mounting fully, and only get demoted later. That transient pile-up is mostly
invisible in steady-state numbers… except for Yoga-WASM, where **the transient
peak is permanent** (memory never shrinks). The burst was quietly ratcheting
the floor.
S2 makes the window recompute on **transcript growth**: while you're pinned at
the bottom, the window anchors to the content *bottom*, so a row that falls
more than a margin behind the live edge becomes a spacer the moment it's
measured — not whenever you next scroll. Measured result: across a 1,500-row
burst, the peak number of simultaneously-mounted rows is **31**.
Same trick for **resume**: opening a 2,000-message session used to mount all of
it (transient peak again — paid forever). Now resume mounts only the bottom
window; everything above starts as spacers using a line-count estimate, and an
idle-time "measure march" quietly mounts ten rows at a time near the window
edge, records their true heights, and swaps them back — all outside the
viewport, all invisible by the law above.
### Decision 4: rows that must never be windowed
Windowing has to know what it's not allowed to touch:
- **Streaming rows** — the native markdown renderer streams incrementally;
unmounting mid-stream would restart it visibly.
- **The bottom 30 rows** — the region you actually live in.
- **Rows under a mouse selection** — the review caught that a lingering
highlight originally froze windowing *forever* (memory regrowing silently).
Fixed: only an active drag pauses swaps, and selected rows get pinned, so
copy is byte-exact while everything else keeps windowing.
### Decision 5: give back the scrollback (cap 1,000 → 3,000)
The 1,000-row clamp existed only because mounted-rows == stored-rows and the
handle table dies at ~1,400. With windowing, mounted ≈ 31 regardless of store
size — so the cap went back to the originally-shipped 3,000. It's
windowing-aware: the `HERMES_TUI_WINDOWING=0` escape hatch (which mounts
everything again) keeps the safe 1,000.
### Decision 6 (measured, not yet shipped as default): right-size the V8 heap
Running the windowed TUI with a 512MB heap ceiling instead of 8GB forced V8 to
actually collect: another 90MB with zero latency cost. That's queued as a
launcher default change (~1GB), for both engines.
## 4. The scoreboard
At 2,000 messages (your real p99 session size — yes, we checked your DB:
median session is 20 messages, p99 is 1,941):
| | peak memory | scroll p99 (slowest 1-in-100) |
|---|---|---|
| OpenTUI before | 686MB | 16ms |
| + S1 windowing | 518MB | 16ms |
| + S2 append/resume windowing | **300375MB** | **6ms** |
| Ink (reference) | 229246MB | ~100ms |
At the **3,000-message stress** with the restored triple-size scrollback:
**360MB, fully styled, scroll p99 8ms** — a workload that six days ago crashed
the process, and three days ago survived only by dropping syntax colors.
Scroll got *faster* because there are simply fewer live renderables to walk.
The determinism gate stayed **byte-identical** — the windowed TUI's settled
frame is provably the same pixels as before. And the live smoke (2,000-message
session: full sweep to the top, resize storm, back to bottom) returned a frame
pixel-identical to boot, with deep history fully syntax-highlighted — something
the pre-windowing TUI literally could not do.
## 5. What's honestly still open
- The remaining ~60120MB over Ink is mostly the **store's JS strings** and
process baseline — the view is no longer the problem. The structural fix is
the **thin renderer** (W1): bodies live in the Python gateway (which already
has them in SQLite); the TUI keeps ~300-byte stubs and fetches bodies only
for the window. That also fixes the class of problem neither engine handles
today: a single 10MB tool output.
- Two accepted, documented limits: scrollbar-*jumping* deep into a freshly
resumed session can land on estimate-height rows that snap to true height as
they enter view (normal scrolling doesn't — the margin pre-measures; the idle
march erodes the exposure over time), and a tool you expanded, scrolled far
away from, then returned to will have re-collapsed (state is component-local;
hoisting it to the store is queued).
- Everything is behind `HERMES_TUI_WINDOWING` (default on, `0` = bit-exact old
behavior) — a one-env escape hatch if anything feels off in real use.
*Where to verify: the **tui-bench** repo's `results/` (`github.com/NousResearch/tui-bench`; every number above), the design+gates doc
`docs/plans/opentui-transcript-windowing.md`, tests in
`ui-opentui/src/test/window.test.ts` and `transcriptWindow.test.tsx` (the
zero-jank invariants are literal assertions: identical scrollHeight windowed
vs not, byte-stable frames across corrections).*

View File

@@ -0,0 +1,432 @@
# OpenTUI native engine — PR documentation
**Branch:** `feat/opentui-native-engine` · **Base:** `origin/main` (merged in; HEAD is at `~main`)
**New engine root:** `ui-opentui/` (Node 26 + `@opentui/core` 0.4.1 + `@opentui/solid`, Effect at the boundary)
**Legacy engine root:** `ui-tui/` (React + the `@hermes/ink` fork at `ui-tui/packages/hermes-ink/`)
> This is the canonical in-repo doc for the PR. The companion interactive HTML
> write-up (`~/projects/opentui-perf-writeup/index.html`) is the case/benchmark
> deep-dive; this doc is the reviewable text version + the four things review
> actually needs: **(1) the LoC reduction math, (2) the measured perf deltas,
> (3) the real UI divergence (with screenshots), (4) the non-core / kitchen-sink
> change audit.**
This PR adds a from-scratch native terminal UI built on OpenTUI, intended to
replace the React/Ink TUI **and the Ink fork we maintain alone**. It currently
ships as a parallel engine (Ink untouched, auto-fallback), selected by
`HERMES_TUI_ENGINE` env > `display.tui_engine` config > auto (OpenTUI when the
host is Node ≥ 26.3 with the built bundle, else Ink). **100% parity with the Ink
TUI is the bar.**
---
## 1. Line-of-code reduction (the headline maintenance win)
All counts are **git-tracked files only** (respects `.gitignore`; `dist/` and
`node_modules/` are untracked and excluded). Measured live on this branch at
`~HEAD`. "Code" = `.ts/.tsx/.js/.jsx` only; "total" includes config/json/md.
### What gets *removed* when Ink is retired
| Area | Files | Total lines | Code lines (ts/tsx/js) | Non-blank code |
|---|---:|---:|---:|---:|
| `ui-tui/src/` — Ink **consumer app** (our React/Ink view code) | 204 | 40,422 | 40,422 | 33,550 |
| `ui-tui/packages/hermes-ink/`**the fork** (`@hermes/ink`) | 148 | 28,167 | 28,113 | 23,718 |
| **`ui-tui/` whole tree (tracked)** | **362** | **69,320** | **68,831** | **57,545** |
The `ui-tui/` whole-tree number (69,320) also folds in a handful of build
scripts, `.prettierrc`, `package.json`, etc. The two rows above it are the
load-bearing split:
- **The fork alone is 28,167 LOC across 148 files** — code we own and can never
sync from upstream. Upstream Ink v6.8.0 `src/` is ~7,259 LOC, so the fork's
renderer core is **~3.2× the size of stock Ink**. (Cross-checked against the
HTML write-up's `ink-fork-analysis.json`: 28,111 LOC / 148 files — the 56-line
delta is a single tracked JSON the file-level count includes.)
- **The consumer app is another 40,422 LOC** — React components/hooks that only
exist to drive Ink.
### What gets *added*
| Area | Files | Total lines | Code lines | Non-blank code |
|---|---:|---:|---:|---:|
| `ui-opentui/src/` — new engine (app code **+ its own tests**) | 153 | 28,763 | 28,763 | 26,495 |
| &nbsp;&nbsp;↳ non-test (app code only) | 97 | 16,628 | 16,628 | 15,450 |
| &nbsp;&nbsp;↳ tests (`src/test/`) | 56 | 12,135 | 12,135 | 11,045 |
| Tree-sitter grammars (`python``toml`) | 0 | 0 | 0 | 0 |
| **`ui-opentui/` whole tree (tracked)** | **~170** | **~34,800** | **29,614** | **27,283** |
> Tree-sitter grammars carry **zero repo lines**: the engine declares the 10
> extra grammars as remote URLs (`src/boundary/parsers.manifest.json`) and
> OpenTUI fetches+caches each `.wasm`/`.scm` on first use into
> `~/.hermes/cache/opentui-parsers/` (à la opencode, which vendors none). An
> earlier revision vendored them as 37,302 checked-in binary lines (10 `.wasm` +
> 10 `.scm`); that's gone — code lines and total lines now move together.
### The net reduction (code lines, the honest comparison)
| Comparison | Removed (ts/tsx/js) | Added (ts/tsx/js) | Net change |
|---|---:|---:|---:|
| **Incl. fork** — retire all of `ui-tui/` vs add `ui-opentui/src` | 68,831 | +28,763 | **40,068 LOC (58%)** |
| **Incl. fork, app-vs-app** (exclude both test suites) | 56,463¹ | +16,628 | **39,835 LOC (71%)** |
| **Excl. fork** — only the Ink *consumer app* vs new engine | 40,422 | +28,763 | **11,659 LOC (29%)** |
| **The fork in isolation** (the unsyncable liability we shed) | 28,113 | — | **28,113 code lines deleted outright (28,167 incl. its 1 config file)** |
¹ `ui-tui/src` non-test = 28,350 LOC + fork (≈ all 28,113 code lines are non-test;
it carries only ~54 config lines) = 56,463. (`ui-tui/src` carries 80 test files /
12,072 LOC; the new engine carries 56 test files / 12,135 LOC.)
**Read it this way:**
- **The cleanest single number: ~40k code lines net** (retire all of `ui-tui/`,
add `ui-opentui/src`). That is a **~58% reduction in the TUI's
hand-maintained surface**, and it *includes* the new engine's full 56-file test
suite.
- **The most important number is the fork: 28,167 LOC of unsyncable engine
code** disappears. That is the load-bearing maintenance win — it's not just
fewer lines, it's lines we are the *sole* maintainer of (own reconciler, ANSI
parser, scrollbox, selection/OSC52, hand-rolled memory eviction, Yoga binding).
- **Even excluding the fork** — i.e. if you imagine upstream Ink were free — the
app rewrite is still a net reduction (11,659 LOC) because the new engine
mounts OpenTUI built-ins instead of hand-building components.
### Caveat on the comparison (keep it honest for review)
- These are **whole-tree retirements vs a single source dir add.** If/when Ink is
deleted, the `ui-tui/` `package.json`, lockfile, and build scripts go too; the
table counts `ui-tui/src` + the fork as the apples-to-apples "hand-maintained
TS" figure.
- **Tree-sitter grammars are NOT vendored.** The 10 extra grammars are declared
as remote URLs (`src/boundary/parsers.manifest.json`); OpenTUI fetches each
`.wasm`/`.scm` on first use of a language and caches it under
`~/.hermes/cache/opentui-parsers/` (profile-aware, set via
`HERMES_TUI_PARSER_CACHE` by the launcher). Registration does **zero** network;
the fetch is lazy and off the boot critical path, and an unreachable
GitHub/air-gapped env degrades that language to plain text — never a throw. This
replaces an earlier revision that vendored 37k binary lines, so the repo no
longer grows on disk for syntax highlighting. (Trade-off: first-use-per-language
needs network to `github.com`/`raw.githubusercontent.com`; pre-seed the cache in
a Docker build if you need offline highlighting.)
- Python/backend LoC is **not** part of this reduction: `tui_gateway/` (~12k LOC)
is **shared by both engines** and stays. See §4.
---
## 2. Performance (CPU / latency / memory)
Measured with the `tui-bench` harness driving **both engines on a real PTY
120×40**, fake gateway feeding deterministic events, `/proc`-sampled identically,
each SUT under `systemd-run --scope -p MemoryMax=2G -p MemorySwapMax=0`,
sequential with a load-gate + 10s cooldown. Determinism gate **GREEN**, 71 result
files, 0 cell errors, 3 reps/cell, `@opentui/core` 0.4.1 native-yoga
(`libopentui.so`, no `yoga.wasm`). Every number traces to a `summary.<field>` in
a result dir. Source: `~/projects/opentui-html/bench-numbers.json` (frozen
2026-06-14, build under test `1ddf7a102` + WIP).
### Scorecard
| Dimension | Winner | Margin | Source cell |
|---|---|---|---|
| Streaming frame rate | **OpenTUI** | **~3×** (43 vs 14 fps) | `cpu800.frame_pacing` |
| Streaming smoothness (interframe p95) | **OpenTUI** | **40ms vs ~220ms** (no ¼-second stalls) | `cpu800.frame_pacing` |
| Scroll CPU | **OpenTUI** | **~2.7× cheaper** (134155 vs 403416 ticks) | `scroll3000.scroll.cpu_ticks` |
| Cold-start floor | **OpenTUI** | ~97103 vs ~107109 MB | `startup.vmhwm_kb` |
| Session-create latency | **OpenTUI** | ~151177 vs ~204229 ms | `startup.session_create_ms` |
| First-byte paint | Ink | ~93 vs ~122 ms | `startup.first_byte_ms` |
| Memory @ small/typical | Ink | OpenTUI +3050 MB | `mem50/100/300.vmhwm` |
| Memory @ heavy tool output | **OpenTUI** | **crossover** (258265 vs 280290 MB) | `results-fat-mem-*` |
| Layout reflow latency | **Ink** | **~0ms vs ~13ms** (OpenTUI's one honest loss) | `resize3000.resize.reflow_ms` |
### The honest reading
- **OpenTUI wins everything you feel continuously** — frame rate (~3×), scroll
CPU (~2.7×), and smoothness (no 200ms hitches; p95 40ms vs ~220ms). This is the
lead. The single most user-perceptible difference is the stall-free stream.
- **Memory: lead with smoothness, not raw RSS.** Ink is lighter at small/typical
sizes (OpenTUI carries a ~102 MB irreducible Node+V8+`libopentui.so` floor, so
it sits +3050 MB above Ink there). But it **crosses over** under heavy tool
output (mem300: 258265 MB OpenTUI vs 280290 MB Ink) because windowing beats
Ink's mount-every-row. Real-world: 20 memwatch sessions show a flat ~108 MB
floor and ~0 MB/h on long sessions (one 15h session, 0 MB/h; one 4.4h session
plateaus flat at ~237 MB with mounted rows pinned at 33).
- **The one outright loss is layout reflow** (~13ms p50 vs Ink's ~0ms; under a
resize storm OpenTUI degrades to ~14fps/~197ms vs Ink ~26fps/~100ms). Heavier
native renderables vs Ink's string nodes. This is a real, quantified
optimization target — **not** a regression vs current behavior, and **not** the
"halved 0.4.0→0.4.1" delta (we measured the absolute 1215ms only; do not quote
"halved" from this run).
- **The memory fix is engine-agnostic** — a rolling display cap
(`HERMES_TUI_MAX_MESSAGES=3000` default) that is display-only and never touches
the model's context. Uncapped is a stress config, not real usage (10k msgs
uncapped: 793 MB; capped sessions are flat MB/h).
- **Gut-check vs upstream/opencode: no bugs.** Exactly one frame callback
(early-exits cheaply), zero `writeToScrollback` for the transcript (one sticky
`<scrollbox>` + reactive `<For>`), native `<markdown streaming>` byte-for-byte
parity with live opencode, no reactive-read-outside-tracking-scope (the #1 Solid
trap). Source: `docs/plans/opentui-gutcheck-verification.md`.
Full methodology + every cell: see the HTML write-up's benchmark sections and
`docs/plans/opentui-endgame-benchmark-report.md`.
---
## 3. UI parity — and where the two engines genuinely diverge visually
100% *feature* parity is the bar (matrix in §6), but the two engines are **not**
visually identical. The Ink TUI renders the transcript as a **box-drawing tree**;
OpenTUI renders it **flat and marker-based**. This is a deliberate design
divergence, captured in `ui-opentui/src/view/messageLine.tsx`:
> *"the view is a dark room and gold is the single lamp — it sits on the NEWEST
> answer's `⚕` and the user's ``, nowhere else (older assistant glyphs demote to
> grey: they merely happened)."*
Real screenshots (saved under `docs/research/opentui-screenshots/`), captured live
on a real PTY 120×40 via the `tmux-pane-screenshot` workflow — **same session
resumed in both engines** where possible.
### Legacy Ink — `docs/research/opentui-screenshots/ink-transcript.png`
![Ink transcript](research/opentui-screenshots/ink-transcript.png)
- **Box-drawing tree layout.** Each turn is a nested structure: `└─ Response`,
`└─ ▾ Tool calls (1)`, ` └─ ● Terminal("…")` — explicit corner rails and
disclosure triangles.
- **`┊` dotted quote-bar** prefixes assistant prose.
- **Tool calls collapse by default** behind a `▾ Tool calls (N)` disclosure,
nested one rail deeper.
- **Whole assistant message tinted gold/amber** (body text is colored, not just
the marker).
- Right-edge scrollbar: thin `│` track + `┃`/orange thumb.
- Status bar: `─ ready │ opus 4.8 fast high │ 0/1m │ [░░░░░░] 0% │ 25s │ voice off │ 1 session ─ ~`
— leading dash, pipe-delimited fields, trailing `~`.
- **No top header bar.**
### New OpenTUI — `docs/research/opentui-screenshots/opentui-transcript.png` (+ `opentui-toolcall.png`)
![OpenTUI transcript](research/opentui-screenshots/opentui-transcript.png)
![OpenTUI tool call](research/opentui-screenshots/opentui-toolcall.png)
- **Flat, marker-based layout.** No tree rails. Assistant = `⚕` (caduceus, gold
only on the newest answer), user = `` (gold chevron + gold text). Older
assistant glyphs demote to grey.
- **Neutral body text.** Gold is reserved for markers and inline-code accents;
prose is grey/white (the "single lamp" rule), so the screen reads calmer than
Ink's all-amber blocks.
- **Tool calls render inline, expanded, on one header line:**
`⚕ ▶ delegate_task Run the shell command `` (/agents to monitor) · 41s (11 lines)`
— marker, `▶` collapse triangle, bold tool name, grey arg preview, hint,
`· duration`, `(N lines)` — and the result flows flat directly below (no nesting
rail). Per-tool renderers exist (`view/tools/registry.tsx`) — bash/file+diff/
read/search/skill/clarify/todo each render differently, not a uniform dump.
- **Per-block `⧉ copy` affordance** on a quiet footer line under every settled
assistant block and user prompt (click → copies that block's source).
- **Top header bar:** `⚕ Hermes Agent · opentui · ready` + a gold horizontal rule
(Ink has none).
- Status bar (real backend): `● claude-fable-5 │ [▒▒▒] 4% │ …/lively-thrush/hermes-agent (feat/opentui-native-engine)`
— green status dot, model, context/token bar, **right-pinned cwd + branch**.
### Divergence summary table
| Aspect | Ink (legacy) | OpenTUI (new) |
|---|---|---|
| Transcript structure | Box-drawing **tree** (`└─`, rails) | **Flat**, indented, marker-based |
| Assistant marker | `└─ Response` rail + `┊` quote-bar | `⚕` caduceus glyph |
| User marker | (rail) | `` gold chevron |
| Assistant body color | Tinted gold/amber | Neutral grey/white (gold = accents only) |
| Tool calls | Collapsed `▾ Tool calls (N)`, nested | Inline expanded header + flat result |
| Per-tool rendering | Largely uniform | Dedicated renderers per tool |
| Copy affordance | `/copy` command | `/copy` **+ per-block `⧉ copy`** |
| Header bar | None | `⚕ Hermes Agent · opentui · ready` + rule |
| Status bar | `─`/`│`-delimited, trailing `~` | dot + bars + right-pinned cwd/branch |
**For review:** the divergence is intentional (a design pass, not an accident),
but it means "drop-in replacement" is true at the *feature* level, not the
*pixel* level. A user switching engines will immediately notice the flatter,
calmer transcript. Worth calling out explicitly so the swap isn't sold as
visually invisible.
---
## 4. Non-core / kitchen-sink change audit (what review should scrutinize)
Full report: **`docs/research/opentui-noncore-change-audit.md`** (file-by-file,
commit-by-commit, with `file:line` evidence). Summary below.
This PR's net footprint vs `origin/main` (two-dot diff = exactly this PR's adds,
no main work re-included):
| Bucket | Files | Net diff |
|---|---:|---:|
| UI (`ui-opentui/`, the engine + tests) | 197 | +36,001 / 1 |
| Docs | 8 | +1,164 / 0 |
| **Other (the review-flag surface)** | **28** | **+3,218 / 204** |
The 28 "other" files are the only place this PR touches shared Hermes core. They
classify as:
### ✅ CORE-OPENTUI-NECESSARY (the engine can't work without these; Ink path provably untouched)
- **`hermes_cli/main.py`** (+382/5) — dual-engine launcher (engine resolution,
Node 26 / fnm detection, `_make_opentui_argv`, heap override). Default falls
back to Ink unless the host is OpenTUI-ready (`main.py:1685`); OpenTUI is
dispatched *around* the Ink bootstrap, never through it (`main.py:1914-1922`).
- **`scripts/install.sh`** (+78/1) — `install_opentui` stage, **strictly
best-effort** (every failure returns 0; falls back to Ink; Windows/Termux
skipped). Ink install path unchanged.
- **`Dockerfile`** (+21/11) — Node 22→**26** bump (required by the `node:ffi`
renderer) + `ui-opentui` build step. Opt-in; Ink build line preserved. **Caveat:
the Node major bump affects the whole image (Ink + web + Playwright)** — the
diff self-flags "verify the full image build on Node 26 in CI."
- **`hermes_cli/_parser.py`** (+16/2) — bare `--resume` → OpenTUI session picker;
`--resume <id>` unchanged.
- **`tui_gateway/server.py`** (+612/40) — predominantly opt-in RPCs/fields the
new engine calls (`session.peek`, `session.list` filters, `startup.catalog`,
`diff_unified`, window-title, skin keys). Each is gated so **the Ink path is
byte-for-byte unchanged** (`server.py:3930`, `:4254`, `:10447`). *Note:* this
file also carries some of the cost-accounting code (below) — separable.
> `tui_gateway/` (~12k LOC Python) is **shared by both engines** and is **not**
> removed when Ink is retired. Only the `ui-tui/` frontend tree goes.
### 🚩 FLAG FOR REVIEW — Category C, separable from an OpenTUI PR
These do **not** need to ship with the engine and a reviewer should ask to split
them out:
1. **Provider-reported-cost accounting** (commits `85546bb9e` + `364b93a4b` +
`e01b04de4`) — a coherent feature spanning **11 files**: `agent/usage_pricing.py`,
`plugins/model-providers/openrouter/__init__.py`,
`agent/transports/chat_completions.py`, `agent/agent_init.py`, `run_agent.py`,
`agent/conversation_loop.py`, `agent/account_usage.py`, `hermes_state.py`,
`gateway/slash_commands.py`, the cost half of `cli.py`, and the
`_get_usage`/`_compact_usage_text` blocks of `tui_gateway/server.py` (+ 5 test
files). Strongest evidence: commit `85546bb9e` *"gateway: capture real
provider-reported cost (openrouter usage accounting)"* — a provider-accounting
rework, not a renderer.
2. **`plugins/model-providers/openrouter/__init__.py`** — sends
`usage:{include:true}`, a provider request-shape change affecting *all*
interfaces, not just the TUI (`openrouter/__init__.py:85-90` cites the
OpenRouter usage-accounting docs).
3. **Worktree lock / dirty-tree preservation** (commit `94765e48f`,
`cli.py` + `tests/cli/test_worktree.py`, ~145 lines) — git-worktree lifecycle
safety plumbing with **zero TUI references** (`cli.py:1391-1545`, `:1635-1713`).
4. **`tools/clarify_tool.py`** (+16/4) — docstring/schema-description-only fix
(commit `16e408f3f`); applies to every interface, trivially separable.
### ✅ Conversation-loop / role-alternation / prompt-cache correctness verdict: **NO RISK**
Verified: none of `run_agent.py`, `agent/conversation_loop.py`,
`agent/agent_init.py`, `agent/transports/chat_completions.py` touch
message-role alternation or the prompt-cache prefix. The
`conversation_loop.py` added lines grep clean for
`cache_control|alternation|prompt_cach|api_messages`; the cache/alternation
machinery (`:57`, `:660-674`, `:759`) is untouched; the PR's insertion at
`:1809-1879` is purely additive cost bookkeeping after `cost_result`. **Prompt
caching and strict role alternation are preserved.**
---
## 5. What this does and does NOT fix
**Fixes (structurally, by replacing the rendering substrate):** the renderer bug
class — layout/scroll/input/copy/mouse/markdown/resize — plus the
hand-maintained memory-eviction problem (windowing + Solid keyed `<For>`
unmount→`destroy()``free()`), and several long-open feature requests (mouse,
collapsible tool calls, session title/status bar, double-ESC, chronological
thinking/tool ordering).
**Does NOT fix:** the gateway is unchanged — the biggest single hotspot file in
triage is `tui_gateway/server.py`, and whole bug clusters are gateway/Python-side
(WS write-timeout/RPC pool, MCP-failure startup freezes, shell.exec denylist).
The engine swap addresses rendering/input/scroll/memory; **gateway bugs ride
along.** The Effect-boundary hardening does make those failures *visible* (typed
events → system lines instead of a frozen spinner) and the TUI auto-heals
(crash → backoff → respawn → resume, capped 3/60s).
---
## 6. Feature parity matrix (vs the Ink TUI)
Verbatim, detailed, surface-by-surface with `file:line` evidence:
**`docs/plans/opentui-ink-parity-matrix.md`** (interactive/filterable version in
the HTML write-up). Headline state:
| Surface | State |
|---|---|
| Transcript rendering (scrollbox, markdown, code, diffs, collapsible tools, reasoning, chronological order, windowing) | **full parity (9/9)** |
| Blocking prompts (approval/clarify/sudo/secret/confirm) | **full parity (5/5)** |
| Theming (skins, light/dark, ANSI-256 norm) | **full parity** |
| Mouse / copy (tracking, selection, multi-click, OSC52, click-to-expand, wheel accel) | **full parity** |
| Resilience (crash auto-heal + resume) | **parity++ (exponential backoff)** |
| Composer / input | near parity — **missing: external editor (Ctrl+G → `$EDITOR`)**; ghost-text autosuggest partial |
| Slash commands | core parity — **missing: `/setup`, `/redraw`, `/plugins`, `/voice`**; `/undo` prefill + `/image` partial |
| Status bar / header chrome | almost all closed — **missing: MCP-servers panel, profile-in-prompt** |
| Agent surfaces | most shipped — **missing: voice indicators, browser/CDP indicator** |
| Utility commands | **missing: `/redraw`, `/setup`**; rest present |
> The original PR-draft gap list was **substantially stale** — the WIP since
> shipped context %/token bar, cost, compressions, duration, update banner, todos
> panel, activity feed, notifications, background-task indicator, **and per-tool
> renderers** (the "every tool renders the same" claim is false:
> `view/tools/registry.tsx` has dedicated renderers).
### Genuinely-remaining parity gaps
- [ ] **External editor (Ctrl+G → `$EDITOR`)** — highest-impact missing composer affordance
- [ ] MCP-servers detail panel; profile-in-prompt marker
- [ ] Voice indicators (listening/transcribing/REC/STT) + `/voice`
- [ ] Browser/CDP connection indicator + `/browser`
- [ ] `/setup` wizard handoff, `/redraw`, `/plugins` hub
- [ ] Draggable scrollbar; sticky-prompt line
- [ ] `/undo` prefill into composer; model-picker persist-global toggle; skills-hub install/manage
---
## 7. Rollout, runtime & risks
- **Runtime:** plain Node 26 (FFI floor 26.3+) — one runtime, no Bun. (Note: the
upstream OpenTUI docs say "requires Bun"; this engine deliberately runs on Node
26's experimental `node:ffi` instead — that's the load-bearing runtime decision.)
- **Rollback:** Ink is untouched and remains the fallback; reverting is a launcher
decision, not a code revert.
- **Default-engine selection:** auto-picks OpenTUI only when the host is genuinely
set up (Node ≥ 26.3 + built bundle), else Ink; explicit env/config bypasses the
probe.
- **Known sharp edges:** `libopentui.so` native-lib distribution (P1 upstream:
copies can fill `/tmp`); the Dockerfile Node major bump needs full-image CI
verification; tree-sitter grammars are fetched from GitHub on first use and
cached in `~/.hermes/cache/opentui-parsers/` — air-gapped hosts get plain-text
highlighting until the cache is pre-seeded (the fetch never blocks boot and
never throws).
## 8. Try it
```bash
hermes # auto-selects OpenTUI when the host supports it
HERMES_TUI_ENGINE=opentui hermes # force the native engine
HERMES_TUI_ENGINE=ink hermes # force the legacy Ink engine
# preview standalone (no backend), Node 26:
cd ui-opentui && npm install
node scripts/build.mjs scripts/demo.tsx .demo
DEMO_TOTAL=120 HERMES_TUI_MAX_MESSAGES=80 \
node --experimental-ffi --no-warnings .demo/demo.js # inside a TTY
```
Requires Node 26.3+. On older Node / Windows / Termux it auto-falls-back to Ink.
---
## Appendix — source-of-truth files in this repo
| Topic | File |
|---|---|
| Non-core change audit (full) | `docs/research/opentui-noncore-change-audit.md` |
| Feature parity matrix (verbatim) | `docs/plans/opentui-ink-parity-matrix.md` |
| Benchmark report | `docs/plans/opentui-endgame-benchmark-report.md` |
| Gut-check verification | `docs/plans/opentui-gutcheck-verification.md` |
| Ink↔OpenTUI capture asymmetry | `docs/plans/opentui-ink-asymmetry-note.md` |
| UI screenshots | `docs/research/opentui-screenshots/{ink,opentui}-*.png` |
| PR description (prose) | `docs/pr-description-main-doc.md` |
| Interactive write-up | `~/projects/opentui-perf-writeup/index.html` (out-of-repo) |

View File

@@ -0,0 +1,73 @@
# Upstream alignment — how we inherit OpenTUI's performance work for free
Context (maintainer, 2026-06-11): opencode's 100-message cap was a November-era
performance workaround, since obsoleted; the **next OpenTUI version ships
native yoga** (≥2× layout performance, more improvements building on it);
opencode does not use virtualization.
## The invariant that makes alignment free
**We are forkless and public-API-only.** The windowing layer (S1+S2) drives the
STOCK `<scrollbox>` through documented surface only — `onSizeChange`,
`setFrameCallback`, `scrollTop`/`viewport`/`scrollHeight`, Solid `<Show>`
mount/unmount. Zero patches to `@opentui/core`. Every upstream release
therefore drops in by bumping three pinned versions in `ui-opentui/package.json`
(`@opentui/{core,keymap,solid}`, currently 0.4.0). Keep it that way: any new
code that needs core behavior goes through a `boundary/` wrapper, never a
patched dependency.
## What native yoga changes for us (and what it doesn't)
- **Kills the WASM ratchet** (grow-only linear memory → freeable native
allocations). This retro-justifies S2 less, but S2's append-time windowing
remains correct: transient mounted peaks still cost handles and RSS.
- **Does NOT obsolete windowing.** The binding constraint is the 65,535-slot
native handle table: ~47 handles/row × 3,000 stored rows ≈ 141k handles —
over the table at ANY layout speed. Windowing is what makes the 3,000-row
scrollback possible; yoga's backend is irrelevant to that math.
- **Makes windowing feel even better**: 2× layout = cheaper margin remounts =
smaller window margins viable and less exposure for the one accepted limit
(estimate-height snap under scrollbar jumps). After the bump, re-tune margin/
hysteresis against the scroll cell.
## The shim ledger (delete-on-upstream-fix; all in `ui-opentui/src/boundary/`)
| shim | what it papers over | delete when |
|---|---|---|
| `ffiSafe.ts` | u32 draw coords go negative under Node FFI (Bun silently wraps) — ERR_INVALID_ARG_VALUE loop | upstream clamps, or Node FFI path is officially supported |
| `nativeHandles.ts` | SyntaxStyle exhaustion crashes mid-mount; degrade-to-unstyled | handle table widened (INDEX_BITS>16) or per-kind tables |
| `renderer.ts` exit-signal guard | core 0.4.0 treats SIGPIPE (clipboard spawn) as an exit signal; its own uncaughtException handler allocates a handle and dies (exit-7 masking) | both fixed upstream |
| `clipboard.ts` hardening | same SIGPIPE incident class | with the above |
Each is (a) isolated, (b) inert if upstream fixes the behavior, (c) worth
reporting upstream — four concrete, reproduced, root-caused issues. Filing them
is the cheapest alignment lever we have: it converts our workarounds into
upstream regression tests. (Needs glitch's go-ahead — public repo activity.)
## The upgrade playbook (per upstream release)
1. Branch `chore/opentui-X.Y.Z`, bump the three pins, `npm ci`.
2. `npm run check` (648 tests; the windowing invariants — identical
scrollHeight ON/OFF, byte-stable frames across corrections — are literal
assertions and will catch behavioral drift).
3. Bench acceptance, sequential: `--cell gate` (determinism digest; EXPECT a
new digest if upstream changed rendering — eyeball the frame, re-bless),
`--cell mem3000 --msgs 2000` + `--cell scroll --msgs 3000` vs current
numbers (300375MB / p99 68ms), `--cell pipeline` (frame pacing ≥22fps).
4. Shim audit: try each boundary shim OFF; delete the ones upstream fixed.
5. Live tmux smoke (scroll sweep / resize / selection-copy), screenshots.
6. Windowing re-tune if layout got faster: margins up or hysteresis down,
re-run scroll cell, keep p99 ≤ 17ms gate.
The bench suite IS the upgrade contract — it's exactly the harness that lets
us take every upstream improvement within a day of release, with proof.
## Questions worth relaying to the maintainer
1. Any plan to widen the 16-bit native handle table (or split per-kind)?
That's our hard ceiling, independent of yoga.
2. Is the Node `--experimental-ffi` path on their support radar, or Bun-only?
(Native yoga adds new FFI surface; we run Node.)
3. Would they take the windowing layer's core-agnostic pieces (exact-height
spacer pattern, correction-legality rule) as a documented recipe or
framework-level utility? We have it production-shaped with tests.

View File

@@ -0,0 +1,150 @@
# OpenTUI — Background Activity: agents inspection, background panel, notifications + density
**Status:** SPEC (brainstormed with glitch 2026-06-13) · target branch `feat/opentui-native-engine`
**Hard constraint:** TUI-LAYER ONLY (`ui-opentui/`). **Zero changes to `tui_gateway/server.py` or
`run_agent.py` core.** Build only on gateway events/RPCs that already exist. Everything below was
feasibility-checked against the live gateway surface (see "Gateway surface" §).
## Why
Dogfeedback (screenshots `iznq/qxpe/rpiw/rplj`):
1. **Agents dashboard is too crowded** (`rplj`) — master rows dump each subagent's full multi-line
prompt; the trace pane is squished. Inspection + transcript reading is "not great."
2. **Background processes are basically invisible** (`qxpe`) — completions leak into the transcript
as plain lines that read like model output; no panel, no badge, notifications are non-existent.
3. **Input zone is too crowded** (`rpiw`) — status bar + composer + agents tray + completion menu +
shell note stack under the transcript.
## Design decisions (from the brainstorm)
- **Two SEPARATE surfaces, ONE shared substrate.** Background *agents* (delegated subagents) and
background *work* (detached runs + OS processes) are visually/feature-wise distinct, but share the
underlying tracking + notification + badge plumbing.
- **Notifications are multi-channel** on every relevant state change:
- **(C) inline card** in the transcript — a distinct, colored, collapsed *system card*, clearly
NOT model output (replaces today's plain-line leak).
- **(A) ambient badge** — a live count in chrome (status-bar `bg:`/the `⚡ N agents` tray) that
flashes on change; you pull-to-inspect. Stays visible while things run.
- **OSC desktop** — reuse the EXISTING `boundary/termChrome.ts` (`notify`, OSC 9/99/777, already
focus-gated so it only fires when the terminal is blurred).
- **Agents surface = inspection only.** No foregrounding / "become the subagent" (that would change
core subagent UX — explicitly out of scope). Scannable list + a faithful render of the *already-
tracked* live activity (goal/model/reasoning/tool calls/progress/summary). No new fetch.
- **Background surface = view + stop.** List runs + OS processes with status/uptime; cancel a run
(`session.interrupt`/`subagent.interrupt`); **stop-all** OS processes (`process.stop`). Per-process
kill and per-process logs are NOT exposed as RPCs → out of scope under the no-core rule (noted).
- **Input density is in scope** (own phase).
## Gateway surface we build on (verified — all already exist)
| Need | Mechanism (existing) |
|---|---|
| Background-run lifecycle | `prompt.background` (start), `background.complete` (event) |
| Notifications | `notification.show` / `notification.clear` events — payload `{text, level, kind, ttl_ms, key, id}` |
| Subagent stream | `subagent.spawn_requested/start/thinking/tool/progress/complete` events (store already consumes) |
| List OS processes | `agents.list` RPC → `{processes:[{session_id, command, status, uptime_seconds}]}` |
| Stop OS processes | `process.stop` RPC → `kill_all()` (**all**, not per-process) |
| Cancel a run / subagent | `session.interrupt`, `subagent.interrupt` |
| List active sessions/runs | `session.active_list`, `session.status` |
| Subagent trace (archived) | `spawn_tree.list/load` (already used by `/replay`) |
| OSC desktop notify | `boundary/termChrome.ts` `notify(TermNotification)` |
**Honest limits (no-core constraint):** OS processes get list + stop-all only — no per-process kill
(`process_registry.kill_process` exists but isn't an RPC) and no per-process log tail
(`read_log` isn't an RPC). If the no-core rule is ever relaxed, each is a ~5-line additive `@method`.
## Architecture (Approach 1 — substrate-first)
```
gateway events ──► store: backgroundActivity slice ──► derived counts/state
│ │
├─► notificationDispatcher ─────────┼─► (C) inline card (transcript)
│ (card + badge + OSC) ├─► (A) ambient badge (statusBar/tray)
│ └─► OSC via termChrome.notify
├─► Surface 1: AgentsDashboard (revamp) — list + rich activity pane
└─► Surface 2: BackgroundPanel (new) — runs + processes, stop
```
### Shared substrate (the "underneath" both surfaces use)
- **`logic/backgroundActivity.ts`** (new) — pure model + reducers. Types:
- `BackgroundRun` (from `prompt.background`/`background.complete`/`session.active_list`):
`{ id, label, status: 'running'|'complete'|'failed'|'cancelled', startedAt, summary? }`
- `BackgroundProcess` (from `agents.list`): `{ sessionId, command, status, uptimeSeconds }`
- `Notification` (from `notification.show`): `{ id, key?, text, level, kind, ttlMs?, at }`
- Pure helpers: `applyNotification`, `clearNotification(key)`, counts (`runningCount`),
`mergeProcessList`, dedupe by `key`/`id`. Fully unit-testable (no renderer).
- **`store.ts`** — a `backgroundActivity` slice + event handlers for `notification.show/clear`,
`background.complete`, and a polled `agents.list` snapshot (poll only while a panel/badge is live,
or piggyback existing cadence). Existing `subagent.*` handling is untouched.
- **`logic/notificationDispatcher.ts`** (new, pure) — given a state-change, decide the channels:
returns `{ card?: SystemCard, badge: delta, osc?: TermNotification }`. The boundary calls
`termChrome.notify` for the OSC part; the store appends the card + bumps the badge.
### Surface 1 — Agents inspection overlay (revamp `view/overlays/agentsDashboard.tsx`)
- **Master list rows = ONE line each:** `<statusGlyph> <truncated goal (truncRight to width)> · <model>`.
No multi-line prompt dump. Selected row highlighted (existing `▸` + accent).
- **Detail pane = faithful activity transcript** of the selected agent, styled like the main
transcript (not flat dumped lines): goal+model header, then the trace rendered by *type*
(reasoning / tool-call+result / progress / final summary), newest last, sticky-bottom, PgUp/PgDn.
- Requires giving `SubagentInfo.trace` light typing (`{ kind:'tool'|'reasoning'|'progress'|'summary', text }`)
instead of `string[]`, populated where `subagent.*` events are reduced. Internal data-shape
change only; no gateway change.
- Keep Esc/q close, ↑↓ select. Reuse theme + `truncRight` from statusBar.
### Surface 2 — Background panel (new `view/overlays/backgroundPanel.tsx`)
- **Two sections:** *Runs* (background agent runs) and *Processes* (OS processes from `agents.list`).
- Each row: status glyph + label/command (truncated) + uptime/elapsed + status.
- **Actions:** `↑↓` select; on a *run*`c` cancel (`session.interrupt`/`subagent.interrupt`);
global **stop-all processes** (`x``process.stop`, confirm). Esc/q close.
- **Access:** new client slash `/bg` (alias `/background`, `/jobs`) in `logic/slash.ts` CLIENT set →
`store.openBackgroundPanel()`. Also reachable from the ambient badge.
- Poll `agents.list` on open + on a light interval while open; stop polling on close.
### Notifications (the (C)+(A)+OSC wiring)
- **(C) inline card** — a new transcript element `view/notificationCard.tsx`: a bordered/colored,
`selectable:false` system card keyed by `notification.id`, level-tinted (`info/warn/error`),
collapsed to one line by default with the `kind` + `text`; clearable by `notification.clear` key.
Appended into the message stream as a distinct row type (NOT a plain `system` text line). Replaces
the current plain-line leak. (`/details` interplay: cards are chrome, always shown, never windowed.)
- **(A) ambient badge** — `statusBar.tsx` `bg: N` segment (already reserved) bound to
`runningCount()`; the `agentsTray.tsx` count already exists — extend it to "agents + background."
Flash/recolor on a fresh notification (brief).
- **OSC** — on `notification.show` with a terminal level (complete/failed), call
`termChrome.notify({title, body})` (already focus-gated). No new escape-sequence code.
### Input-zone density pass (`view/composer.tsx` / `view/App.tsx`)
- Audit what stacks under the transcript and collapse/gate: the `⚡ N agents` tray line folds into
the ambient badge (shrinks one line); ensure the shell-mode note, completion menu, and status bar
don't co-stack more than necessary. Concrete rules decided with a tmux density pass (ASCII-mocked,
approved) — kept minimal; no behavior change, just fewer competing chrome lines.
## Phases (implementation order — each gated + tmux-smoked + committed)
- **P1 — Notification substrate** (`backgroundActivity.ts` + `notificationDispatcher.ts` + store
slice + `notificationCard.tsx` + badge wiring + OSC call). Highest visible win; the shared core.
- **P2 — Agents inspection revamp** (`agentsDashboard.tsx` + typed `trace`). De-crowds `rplj`.
- **P3 — Background panel** (`backgroundPanel.tsx` + `/bg` + actions). New surface.
- **P4 — Input density pass.** Folds the tray into the badge; trims co-stacked chrome.
## Testing / gates (per phase)
- **Pure logic** (`backgroundActivity`, `notificationDispatcher`, slash `/bg` routing,
trace-typing) → vitest unit tests, TDD where natural.
- **Views** → headless frame tests (`renderProbe`) for the card, the de-crowded dashboard row
format, the background panel sections; + **live tmux smoke** (`tmux-pane-screenshot`) for each
surface using a seeded-store harness (the `uxSmoke` pattern: `store.apply`/`applyInfo`/
`commitSnapshot` + canned events).
- **Gate** `cd ui-opentui && npm run check` green (judge by real exit, not a piped tail) after each
phase; rebuild `dist/main.js`; commit `opentui(v6): …` (no attribution) and push per standing instr.
## Out of scope (explicit)
- Foregrounding / "becoming" a subagent (B/C from the brainstorm) — would change core subagent UX.
- Per-process kill + per-process log tail for OS processes — needs additive gateway RPCs (no-core veto).
- "Collect result into transcript" for finished runs — deferred (Q6=B, view+stop only).
- Any change to `tui_gateway/server.py` / `run_agent.py`.

View File

@@ -0,0 +1,248 @@
# Plan — OpenTUI composer/UX batch (10 features)
> **STATUS: SHIPPED (2026-06-13).** All 10 features implemented, gate green
> (ui-opentui 714 tests + 316 gateway + 25 cost tests), F5/F6 verified live via
> tmux screenshot. Commits: `f4dacc68e` (F1/F2/F7/F8/F8b/F9/F10), `20d516ae9`
> (F4/F5/F6), `9aa5e54be` (F3). Decisions taken: **D1 = cursor-aware onType**
> (threaded `ta.cursorOffset`); **D2 = chrome cost is Nous-header-only via a new
> `nous_header_cost_usd`, `/usage` page kept full via `real_session_cost_usd`**.
> F10 (right-pinned cwd) was added mid-session by the user.
**Branch:** `feat/opentui-native-engine` · **Engine:** `ui-opentui/` (Node 26)
**Gate:** `cd ui-opentui && PATH="$HOME/.local/share/fnm/node-versions/v26.3.0/installation/bin:$PATH" npm run check` → exit 0.
## TL;DR
Nine UX fixes for the native composer + clarify prompt. **8 of 9 are front-end-only**
in `ui-opentui/`; only F3 (cost) touches the Python gateway. Every backend the new
behaviour needs (`shell.exec`, `complete.path` with `@file:`/`@folder:`/fuzzy) **already
exists** — most of this is client wiring, not new RPC surface. No new core tools, no new
`HERMES_*` env vars, no prompt-cache impact (composer/prompt are client-render only).
| # | Symptom | Fix site | Backend |
|---|---|---|---|
| F1 | bare `/` opens the modal | `logic/slash.ts:115` `planCompletion` | none |
| F2 | `/abs/path` text triggers slash | `logic/slash.ts:115` + `logic/skillMatch.ts` | none |
| F3 | cost wrong / shows for non-Nous | `tui_gateway/server.py` + `agent/usage_pricing.py` | gateway |
| F4 | can't paste until composer focused | `view/composer.tsx` onPaste/focus | none |
| F5 | clarify ugly (no wrap, weak diff, "Other" is a row) | `view/prompts/clarifyPrompt.tsx` rewrite | none |
| F6 | clarify arrows scroll the transcript | same rewrite (preventDefault) | none |
| F7 | slash highlight/menu dies after line 1 | `logic/slash.ts:114` | none |
| F8 | file mention dies after line 1 | `logic/slash.ts:114` | none |
| F8b | `@` should be the ONLY file-mention trigger | `logic/slash.ts:93` `isPathLike` | none |
| F9 | `!cmd` → run bash, show result | `entry/main.tsx` submit + new system render | uses existing `shell.exec` |
---
## F1 + F2 + F7 + F8 + F8b — the completion trigger (`logic/slash.ts`)
All five live in one ~10-line function, `planCompletion` (slash.ts:113-121). Current:
```ts
export function planCompletion(text: string): CompletionPlan | null {
if (text.includes('\n')) return null // ← F7/F8 die here
if (text.startsWith('/')) return { from: 0, method: 'complete.slash', params: { text } } // ← F1/F2
const word = /(\S+)$/.exec(text)?.[1]
if (word && isPathLike(word)) { ... complete.path ... } // ← F8b: too many triggers
return null
}
```
### F1/F2 — slash only for a real command token
- A bare `/` (no char yet) must **not** query. Require `/` + at least one name char.
- A `/abs/path` (slash followed by a path with more `/`) is **not** a command — it's
text. The slash menu should only fire when the FIRST token matches the command
grammar (`/[A-Za-z0-9][\w.-]*` — the `NAME_RE` already in skillMatch.ts:51, which
excludes `/`). `/usr/bin` fails NAME_RE → no slash menu.
- Concretely: replace `text.startsWith('/')` with: the text starts with `/`, and the
first whitespace-delimited token after the `/` is non-empty AND matches `NAME_RE`
(i.e. `/m`, `/model foo` → yes; `/`, `/usr/bin`, `/./x` → no). Reuse `slashTokens`
/`NAME_RE` from skillMatch.ts so the trigger and the highlighter share one grammar.
### F7/F8 — completion must survive newlines (shift+enter)
- `if (text.includes('\n')) return null` is the bug. It was a blunt guard so a multi-line
paste wouldn't spam path-completion. The right rule operates on the **current line /
current token at the cursor**, not the whole buffer.
- The composer passes the full `plainText` to `onType`. We don't currently pass the
cursor offset. **Decision D1 (below):** either (a) thread the cursor offset into
`onType` and complete the token under the cursor, or (b) cheap interim — slice to the
**last line** (`text.slice(text.lastIndexOf('\n')+1)`) and run the existing logic on
that. (a) is correct (mid-buffer edits), (b) is 1 line and covers the reported case
(typing at the end on line N). Recommend (a) for correctness; it also future-proofs
@-mention mid-line.
- Slash *highlighting* (skillMatch.ts `slashTokens`) **already scans multi-line text
correctly** (it iterates the whole string, newline-aware via `nativeCharOffset`). So
F7's "highlighting stopped" is really the same `planCompletion` newline bail starving
the menu; the highlight token itself still styles. Verify in the live smoke.
### F8b — `@` is the only mention trigger
- `isPathLike` (slash.ts:93) currently returns true for `@`, `~`, `./`, `../`, `/`, or
any word containing `/`. The user wants **`@`-only** (drop `~`/`./`/bare paths as
mention triggers). Narrow it to `word.startsWith('@')`.
- The gateway `complete.path` (server.py:8543) already special-cases `@` richly
(`@file:`, `@folder:`, `@diff`, `@staged`, `@url:`, `@git:`, fuzzy basename search).
Its `~`/`./` branches become dead trigger paths from this TUI — leave the gateway code
(Ink still uses the path forms; it's shared) but stop emitting those queries from
ui-opentui. **No gateway change.**
- Net: typing `@` (even bare) opens the mention menu via the `@`-bare branch at
server.py:8555. Picking splices `@file:rel/path` etc. (existing accept path,
`completionFrom` honoured).
**Tests:** extend `test/slash.test.ts``planCompletion('/')` → null; `planCompletion('/usr/bin')`
→ null; `planCompletion('/model')` → complete.slash; multi-line `"a\n/mod"` → complete.slash
on the trailing token; `"~/foo"` / `"./x"` → null (no longer path-like); `"@foo"` → complete.path.
Keep them as behaviour assertions, not snapshots.
---
## F3 — cost: Nous-portal headers only (`tui_gateway` + `agent/usage_pricing.py`)
**Current:** `_get_usage` (server.py:2157-2167) sets `cost_usd` from
`real_session_cost_usd(agent)` (usage_pricing.py:887), which sums **two** provider-reported
sources:
1. `agent.session_actual_cost_usd` — OpenRouter `usage.cost` accumulator.
2. `agent.get_credits_spent_micros()` — Nous `x-nous-credits-*` header delta.
The TUI already **hides** the cost segment when `cost_usd` is absent (statusBar.tsx:241-243,
`costText` returns '' when `costUsd === undefined`) — so this is purely "which sources count."
**User's intent (F3):** cost should come **only from the Nous portal headers**; suppress it
for every other route (cache-token pricing is unreliable across the model long tail).
**Change:** make the OpenRouter accumulator source conditional on the route being Nous, OR
drop source #1 entirely so only the header delta (source #2) feeds `cost_usd`. Source #2 is
intrinsically Nous-only (the header only exists on Nous-portal responses), so dropping #1
achieves "Nous-header-only" with one edit.
> **DECISION D2 (needs glitch's confirm):** Drop OpenRouter's `session_actual_cost_usd`
> source from `real_session_cost_usd`? Trade-off: OpenRouter's `usage.cost` is itself
> *provider-reported* (the real charged number, not a Hermes estimate), so OR users lose an
> accurate readout. But it removes the cache-token guesswork the user is worried about and
> matches "only via the headers when using nous portal" literally.
> **Recommended default (implementing unless told otherwise):** gate source #1 so it only
> contributes when the active route is the Nous portal (base_url == nous inference api),
> else it's dropped. This keeps the segment Nous-only AND avoids touching shared OR/CLI
> behaviour for the `/usage` page. If even Nous-route OR-accumulator is unwanted, collapse
> to header-only.
**Scope guard:** `real_session_cost_usd` is also consumed by `/usage` page rendering
(server.py:2237) and DB usage totals. Prefer a NEW, status-bar-specific helper
(e.g. `nous_header_cost_usd(agent)`) wired only into `_get_usage`'s `cost_usd`, leaving the
`/usage` accounting page untouched — so we don't regress the full cost report. Confirm with
the gate + a gateway unit test (`tui_gateway` tests) that a non-Nous session yields no
`cost_usd`.
---
## F4 — paste while composer unfocused (`view/composer.tsx`)
**Current:** the global keyboard handler reclaims focus on a *printable keystroke*
(`isPrintableKey`, composer.tsx:415-417). A **bracketed-paste event is not a keystroke**
it arrives at `onPaste` only if the textarea is focused, so an unfocused composer drops it;
the user has to click/type first.
**Fix:** the renderer delivers paste through the focused renderable. Two options:
- (a) Keep focus on the composer more aggressively (opencode keeps the prompt focused via a
reactive effect). Risky — fights transcript scroll focus.
- (b) **Recommended:** handle paste at the renderer/global level. Check whether OpenTUI
exposes a global paste hook (`renderer.on('paste')` or a keyboard event with
`key.name === 'paste'` / a paste event type). If a global paste signal exists, on paste:
`ta.focus()` then route the bytes into the existing `onPaste` logic (image / placeholder /
insert). **Must verify the API in the `opentui` skill before coding** (skill_view
references/docs). If only the focused-renderable paste exists, fall back to (a) scoped:
refocus the composer whenever no overlay/prompt is open and focus drifted (a
`createEffect` watching focus + `store.state.prompt`/overlay state).
**Verify in live smoke** (tmux + tmux-pane-screenshot): scroll the transcript to drop focus,
then paste — text must land without a prior click.
---
## F5 + F6 — clarify prompt rewrite (`view/prompts/clarifyPrompt.tsx`)
Screenshot `/tmp/screenshots/SCR-20260613-iznq.png` confirms: long options run off the right
edge (no wrap), options differ only by `▶`/`—` glyphs (no numbers, weak), and "✎ Other…" is
a `<select>` row that *switches* to an input on Enter rather than being an inline input.
**Current:** one native `<select>` over `[...choices, {Other}]` (clarifyPrompt.tsx:61-75).
Native `<select>` doesn't wrap long rows and (F6) doesn't `preventDefault` arrows, so they
leak to the transcript scrollbox.
**Rewrite plan (verify renderable API in `opentui` skill first):**
- Replace native `<select>` with a **custom keyboard-driven list** (a `For` over options +
a `selected` signal + `useKeyboard` with `key.preventDefault()` on up/down/enter — same
pattern the composer's `routeMenuKey` uses; F6 fixed by preventDefault so arrows never
reach the scrollbox).
- **Wrapping (F5):** render each option as a `<text>` that wraps to the box width (no fixed
single-line). Indent continuation lines under the option label. Confirm `<text>` soft-wrap
behaviour in the opentui skill (it wraps by default within a flex box of bounded width).
- **Differentiation (F5):** number every option `1.` `2.` … (digit hotkeys optional, nice-to-
have), and give the selected row the themed `selectionBg` + accent fg (the composer's
`completionCurrentBg` model), not just a glyph. Number + background + accent = three signals.
- **Inline custom answer (F5):** render the `<input>` **inside the same screen, always
present** as the last "row" (an `Other:` labeled input), instead of an item that toggles.
Selecting/focusing it lets the user type; Enter in it submits the free text. Keep the
existing `clarify.respond {answer}` wiring. Arrow-down past the last choice lands on the
input; arrow-up from the input returns to the list (focus handoff like the composer↔tray).
- Keep Esc/Ctrl+C → cancel (clarifyPrompt.tsx:31-33).
**Reference:** opencode's selection/list components in `~/github/opencode/packages/tui` for
the wrap + highlight + hotkey idiom; the composer dropdown (composer.tsx:441-458) for the
in-repo highlight/selectable pattern.
**Tests:** `test/render.test.tsx`-style headless frame — long option wraps (frame contains the
tail of a long choice on a 2nd line), selected row shows numbered + highlighted, custom input
present in the same frame, arrow keys don't change scrollTop (assert transcript scroll
unchanged), Enter on a choice → onAnswer(choice), Enter in input → onAnswer(typed).
---
## F9 — `!cmd` runs bash (`entry/main.tsx` + a system render)
**Backend exists:** `shell.exec` (server.py:10301) runs the command (30s timeout, dangerous/
hardline-command guards, returns `{stdout, stderr, code}`).
**Ink parity reference:** `ui-tui/src/app/useSubmission.ts:291``full.startsWith('!')`
`shellExec(full.slice(1).trim())` → appends a user line `!cmd` + a system line with output;
the prompt glyph flips while the buffer starts with `!` (appLayout.tsx:178).
**Plan (ui-opentui):**
- In the entry `submit` (main.tsx:517-520), add a branch BEFORE the slash check:
`if (text.startsWith('!')) { runShell(text.slice(1).trim()); return }`.
- `runShell(cmd)`: `store.pushUser('!' + cmd)` (echo the invocation in the transcript), then
`gateway.request('shell.exec', { command: cmd })`; on resolve, `store.pushSystem` the
combined `stdout`/`stderr` (or the error message / non-zero `code`); on reject,
pushSystem the error. Detached `runFork` like `submitPrompt`. No session turn, no model call.
- Empty `!` (just the bang) → no-op (or a hint), matching Ink.
- **Optional polish (parity, not required):** flip the composer prompt glyph (or tint) while
the buffer starts with `!`, like Ink's appLayout. Low-risk; do only if cheap.
**Tests:** entry-level/logic test that a `!`-prefixed submit routes to `shell.exec` (not
`prompt.submit`), and the system line renders stdout. Mirror the slashMenu.test harness
(fake gateway capturing the method).
---
## Sequencing & fences (subagent-driven; disjoint files)
Parallel-safe groups (disjoint file fences):
1. **slash trigger**`logic/slash.ts` (+ `logic/skillMatch.ts` reuse) + `test/slash.test.ts`. (F1/F2/F7/F8/F8b)
2. **clarify**`view/prompts/clarifyPrompt.tsx` + a clarify test. (F5/F6)
3. **shell-exec**`entry/main.tsx` (edit DIRECTLY — load-bearing) + system render + test. (F9)
4. **paste focus**`view/composer.tsx` (edit directly; verify opentui paste API first). (F4)
5. **cost**`tui_gateway/server.py` + `agent/usage_pricing.py` + gateway test. (F3) — Python, isolated.
`entry/main.tsx` and `store.ts` are edited directly, never via subagent (handoff rule).
Each renderable change: `skill_view(opentui, references/docs/...)` FIRST. Verify every
subagent self-report (re-run `npm run check` exit code, read the diff).
## Open decisions (need glitch)
- **D1 (F7/F8):** thread cursor offset into `onType` (correct) vs. last-line slice (cheap)?
Recommend cursor offset.
- **D2 (F3):** drop OpenRouter cost source entirely, or gate it to the Nous route? Recommend
Nous-route gate via a status-bar-only helper, leaving `/usage` accounting intact.
## Invariants to preserve
- Per-conversation prompt caching untouched (all client-render or post-hoc gateway usage).
- No new `HERMES_*` env var (these are behaviour, not secrets).
- Strict no change-detector tests — assert behaviour/invariants.
- Don't regress the `/usage` accounting page when narrowing the chrome cost source.

View File

@@ -0,0 +1,217 @@
# OpenTUI — usage/credits notice in the composer chrome
**Status:** spec (not started) · **Engine:** `ui-opentui/` · **Author:** glitch · 2026-06-14
## Goal
Render the gateway's **usage / credits notices** as a persistent, level-tinted
**chrome banner pinned at the top of the input zone** (directly above the status
bar), with the same lifecycle the Ink engine already has — sticky vs TTL,
mid-turn hold + turn-end reveal, and "flash-and-yield" for the usage bands.
Today the OpenTUI engine **receives** these notices but mis-renders them as
scrolling inline transcript cards with no lifecycle. This spec fixes that without
touching the gateway or the agent (the data already flows correctly).
## What already exists (verified)
### The wire (source of truth — do NOT change)
The gateway emits one event for every notice, snake_case payload:
```
notification.show payload { text, level, kind, ttl_ms, key, id } # tui_gateway/server.py:2878
notification.clear payload { key } # tui_gateway/server.py:2890
```
These come from `AgentNotice` (`agent/credits_tracker.py:177`). The credits
policy (`evaluate_credits_notices`, `agent/credits_tracker.py:245`) emits exactly
four notices — the full catalog this feature renders:
| `key` | `text` (already glyphed by policy) | `level` | `kind` | `ttl_ms` | lifecycle |
|-----------------------|-------------------------------------------------|-----------|----------|----------|----------------|
| `credits.usage` | `⚠/• Credits N% used · $X cap` (bands 50/75/90) | info/warn | `sticky` | — | flash-and-yield |
| `credits.grant_spent` | `• Grant spent · $X top-up left` | info | `sticky` | — | flash-and-yield |
| `credits.depleted` | `✕ Credit access paused · run /usage for balance` | error | `sticky` | — | sticky |
| `credits.restored` | `✓ Credit access restored` | success | `ttl` | `8000` | TTL self-expire |
**Load-bearing facts:**
- `text` is **already glyphed** (⚠ • ✕ ✓) by the Python policy — the renderer
**must not** prepend another glyph. It only tints by `level`.
- `level` includes **`success`** (green) — a level the current OpenTUI parser
silently drops to `info`.
- `kind` is the **lifecycle marker** (`sticky` | `ttl`), NOT a display label.
`id` == `key` (stable per kind, not unique per emission).
- Notices are **reconciled**: the policy emits `to_clear` (a `notification.clear`)
then `to_show`. A band change clears `credits.usage` then re-shows it.
### The Ink reference behavior (what we're matching)
`ui-tui/src/app/turnController.ts` + `appChrome.tsx`:
- `showNotice` (`:181`): if **busy**, hold in `pendingNotice` (latest-wins);
if idle, apply now.
- `applyNotice` (`:213`): set the visible notice; for `kind: 'ttl'` with
`ttl_ms > 0`, arm a self-expiry timer (clearing any prior timer first).
- `clearNotice(key)` (`:198`): drop the visible **and** pending notice only when
the key matches (a stale clear must not wipe a newer notice).
- `flushPendingNotice` (`:245`): at **turn end** (only the real end sites) apply
the held notice — its TTL clock starts here, when it first becomes visible.
- **Flash-and-yield** (`startMessage`, `:917`): at **turn start**, if the visible
notice's key is `credits.usage` or `credits.grant_spent`, clear it — "show
once, then get out of the way." `credits.depleted` and others stay sticky. The
Python `active` latch keeps the key so it won't re-fire next turn.
- Session reset clears all notice state so session A's notice can't bleed into B.
- Color by level: `error→error`, `warn→warn`, `success→statusGood`,
`info→accent` (`noticeColor`, `appChrome.tsx:192`).
### The OpenTUI side (what we change)
- `notification.show``parseNotification``pushNotification`**inline card**
in the transcript (`store.ts:832`, `notificationCard.tsx`). All kinds, no
lifecycle. The Option B process-completion card (`kind: 'process.complete'`)
and `background.complete` (`kind: 'background task complete'`) also use this
path — **they must keep working unchanged.**
- `parseNotification` coerces `level` to `info|warn|error` only
(`backgroundActivity.ts:48`) — drops `success`.
- Store carries `lastNotification` (OSC seam), `bgTasks`; **no** `notice` slot.
- Theme has `accent`, `warn`, `error`, `ok`/`statusGood`, `muted`
(`logic/theme.ts`) — `success` maps to `statusGood`.
- Input zone layout (`view/App.tsx:140-211`): a top-bordered column —
`<StatusBar>` → composer `<Switch>``<AgentsTray>`. The new banner mounts at
`App.tsx:144`, **directly above `<StatusBar>`** (the topmost line of the chrome).
- Turn lifecycle hooks: `case 'message.start'` (`store.ts:779`, sets
`info.running = true`) and `case 'message.complete'` (`store.ts:811`, sets
`info.running = false`). `clearTranscript` (`store.ts:631`) is the reset site.
- `Date.now()` is used freely in the store (`:877`) — `setTimeout` for TTL is fine.
## The one design decision: routing
`kind` is the discriminator. **`notification.show` with `kind === 'sticky'` or
`kind === 'ttl'` → the new chrome-notice path; every other kind → the existing
inline-card path, untouched.** This mirrors Ink's `Notice.kind: 'sticky' | 'ttl'`
exactly, and the credits policy sets `kind` to one of those for all four notices,
while the process/background cards use label-strings (`process.complete`,
`background task complete`) that are neither — so they stay inline cards. No
gateway change, no key-prefix sniffing.
**Divergence from Ink (intentional):** Ink hides the notice while busy because the
FaceTicker shares its one status slot. OpenTUI's busy face (`StatusLine`) lives in
the transcript area, so the banner has a **dedicated row** and stays visible
through a turn (a depletion warning shouldn't vanish mid-turn). We still **hold
new notices** that arrive mid-turn (`pendingNotice`) and reveal them at turn end —
matching Ink's "don't pop a fresh banner mid-stream" intent.
## Implementation
### Phase 1 — parser + type (`logic/backgroundActivity.ts`)
1. Widen `ActivityNotification.level` to `'info' | 'warn' | 'error' | 'success'`.
2. `coerceLevel`: also accept `'success'` (still fall back to `'info'`).
3. Add `export function isChromeNotice(n: ActivityNotification): boolean`
`n.kind === 'sticky' || n.kind === 'ttl'`.
4. `parseNotification` already maps `ttl_ms → ttlMs` and preserves `key`/`id`
no shape change beyond the widened level.
**Tests** (`backgroundActivity.test.ts` or `notificationCard.test.tsx`):
`success` survives parse; `kind: 'ttl'` + `ttl_ms``ttlMs`; `isChromeNotice`
true for sticky/ttl, false for `process.complete`/`''`.
### Phase 2 — store lifecycle (`logic/store.ts`)
Add state + a private (non-reactive) timer handle in `createSessionStore`:
- `notice: ActivityNotification | null` (visible chrome notice) — new state field,
init `null`.
- `pendingNotice: ActivityNotification | null` — held mid-turn, init `null`.
- `let noticeTimer: ReturnType<typeof setTimeout> | undefined` (closure var).
Functions (port of `turnController`):
- `showNotice(n)`: `state.info.running ? setState('pendingNotice', n) : applyNotice(n)`
(latest-wins — assigning replaces any prior pending).
- `applyNotice(n)`: clear `noticeTimer`; `setState('notice', n)`; if
`n.kind === 'ttl' && n.ttlMs && n.ttlMs > 0`, arm `setTimeout(n.ttlMs)` that
clears `notice` only if `state.notice?.id === n.id` (defensive guard).
- `clearNotice(key)`: if `state.pendingNotice?.key === key` → null it; if
`state.notice?.key === key` → clear timer + null `notice`.
- `flushPendingNotice()`: if `state.pendingNotice``applyNotice` it, null pending.
- `clearNoticeState()`: null `notice` + `pendingNotice`, clear timer.
Wire into the event reducer:
- `notification.show` (`store.ts:832`): route —
`const n = parseNotification(...); if (!n) break; if (isChromeNotice(n)) showNotice(n); else pushNotification(n)`.
(Still record `lastNotification` for the OSC seam in **both** paths — extract
the `setState('lastNotification', {...n})` so a chrome notice also pings a
blurred terminal, matching the inline-card behavior.)
- `notification.clear` (`store.ts:837`): call **both** `clearNotificationCards(key)`
(cards) **and** `clearNotice(key)` (chrome) — a key only ever lives in one, so
calling both is safe and avoids guessing.
- `message.start` (`store.ts:779`): flash-and-yield — if
`state.notice?.key === 'credits.usage' || === 'credits.grant_spent'`
`clearNotice(state.notice.key)`. (Do this **before** flipping `running` true so
the read is clean.)
- `message.complete` (`store.ts:811`): call `flushPendingNotice()` (after the
`running = false` set, so a held notice reveals on the now-idle bar).
- `clearTranscript` (`store.ts:631`) and any session-switch reset:
`clearNoticeState()`.
Export `notice` via the store's state and `showNotice`/`clearNotice` if a test or
future slash command needs them.
**Tests** (`statusNotice.test.ts`, new):
- idle `showNotice``state.notice` set, no card pushed.
- routing: `notification.show` `kind:'sticky'``notice` set, **no** transcript
card; `kind:'process.complete'` → card pushed, `notice` still null.
- mid-turn hold: `message.start``showNotice``notice` stays null,
`pendingNotice` set → `message.complete``notice` revealed.
- `clearNotice` by key drops visible + pending; non-matching key is a no-op.
- TTL: `kind:'ttl', ttlMs:50` auto-clears (vitest fake timers).
- flash-and-yield: visible `credits.usage` cleared on `message.start`;
`credits.depleted` persists across a start/complete cycle.
- `clearTranscript` resets `notice` + `pendingNotice`.
- `success` notice keeps its level.
### Phase 3 — view (`view/noticeBanner.tsx` + `App.tsx`)
New `NoticeBanner` (sibling style to `notificationCard.tsx`):
- Props: `notice: ActivityNotification | null`, plus terminal width for truncation.
- `<Show when={notice}>` — renders nothing when null.
- One row, `flexShrink: 0`, `paddingLeft: 1`, `selectable={false}`.
- Text rendered **verbatim** (glyph already present), tinted by level:
`error→error`, `warn→warn`, `success→statusGood`, `info→accent`.
- Truncate to width with `truncRight` (`logic/truncate.ts`) so a long notice can
never push the composer or wrap.
Mount in `App.tsx:144`, the first child of the top-bordered input zone, directly
above `<StatusBar store={...} />`:
```tsx
<box border={['top']} ...>
<NoticeBanner notice={props.store.state.notice} /> {/* new */}
<StatusBar store={props.store} />
...
```
**Tests** (`noticeBanner.test.tsx`, frame): renders the text without adding a
glyph; warn→warn color, success→statusGood color; truncates at narrow width;
renders an empty frame when `notice` is null.
### Phase 4 — parity verification + docs
- `npm run check` green (prettier + eslint + vitest).
- Headless frame dump: a `credits.usage` warn banner above the status bar; a
`credits.depleted` error banner surviving a turn; a `credits.restored` success
banner that disappears after its TTL.
- tmux smoke per `docs/opentui-dev-handoff.md` (inject the three notices via the
test harness / a scripted gateway event; screenshot the chrome).
- Cross-check the four-notice catalog renders identically in tone to Ink's
`appChromeStatusRule` (color-by-level, no double glyph, truncation).
## Non-goals
- No gateway/agent changes — the wire and the policy are the source of truth.
- No new notice kinds — render exactly the four the policy emits.
- The inline-card path (process/background completions) is **unchanged**.
- No status-bar segment changes — the banner is its own row above the bar.
## Risk / footguns
- **Schema decode-at-boundary**: `notification.show` payload is a loose Record
read by `parseNotification`, not strict-decoded — a wrong-typed field won't blank
the bar (unlike `applyInfo`). Keep the loose reads.
- **createStore reference-aliasing**: store `notice` and `pendingNotice` distinct
objects; when applying pending, it's already its own object — don't alias it to
`lastNotification`. (See `[[solid-createstore-reference-aliasing]]`.)
- **Timer leak**: `clearNoticeState` must clear `noticeTimer`; ensure session
reset and store dispose clear it so a TTL callback can't fire into a dead store.
- **Routing regression**: assert in tests that `process.complete` /
`background task complete` still produce **cards**, not banners — the whole
feature hinges on the `kind` discriminator.

View File

@@ -145,8 +145,16 @@ def build_top_level_parser():
"--resume",
"-r",
metavar="SESSION",
# nargs="?" + const=True: bare `--resume` parses to the sentinel True,
# which `hermes --tui` turns into the session picker
# (HERMES_TUI_RESUME=picker). `--resume <id|title>` is unchanged.
nargs="?",
const=True,
default=None,
help="Resume a previous session by ID or title",
help=(
"Resume a previous session by ID or title. With --tui, bare "
"--resume (no argument) opens the session picker."
),
)
parser.add_argument(
"--continue",
@@ -301,8 +309,14 @@ def build_top_level_parser():
"--resume",
"-r",
metavar="SESSION_ID",
# Same bare-flag picker sentinel as the top-level --resume.
nargs="?",
const=True,
default=argparse.SUPPRESS,
help="Resume a previous session by ID (shown on exit)",
help=(
"Resume a previous session by ID (shown on exit). With --tui, "
"bare --resume opens the session picker."
),
)
chat_parser.add_argument(
"--continue",

View File

@@ -1640,8 +1640,286 @@ def _find_bundled_tui(hermes_cli_dir: Path | None = None) -> Path | None:
return bundled if bundled.is_file() else None
def _config_tui_engine_early() -> str | None:
"""Read ``display.tui_engine`` from config via a minimal YAML read.
Returns the configured engine string, or ``None`` when unset/unreadable so the
caller can apply the availability-gated default. Mirrors
:func:`_config_default_interface_early`.
"""
try:
home = os.environ.get("HERMES_HOME")
cfg_path = (
os.path.join(home, "config.yaml")
if home
else os.path.join(os.path.expanduser("~"), ".hermes", "config.yaml")
)
if os.path.exists(cfg_path):
import yaml as _yaml_eng
with open(cfg_path, encoding="utf-8") as _f:
raw = _yaml_eng.safe_load(_f) or {}
disp = raw.get("display", {})
if isinstance(disp, dict):
eng = disp.get("tui_engine")
if isinstance(eng, str) and eng.strip():
return eng.strip().lower()
except Exception:
pass
return None
def _resolve_tui_engine() -> str:
"""Which TUI engine to launch: "ink" (default) or "opentui".
Precedence: ``HERMES_TUI_ENGINE`` env > ``display.tui_engine`` config >
(OpenTUI when this host can run it — Node >= 26.3 + the built package — else Ink).
The OpenTUI engine runs on Node 26.3+ via the experimental ``node:ffi`` renderer,
which is not validated on Windows or Termux — a request for "opentui" there falls
back to "ink" with a notice so a stale flag never strands the user on an engine
that can't start.
"""
env = (os.environ.get("HERMES_TUI_ENGINE") or "").strip().lower()
# Explicit choice (env > config) wins; otherwise default to OpenTUI when this
# host is genuinely set up for it (Node >= 26.3 + the built bundle), else Ink.
engine = env or _config_tui_engine_early() or ("opentui" if _opentui_available() else "ink")
if engine != "opentui":
return "ink"
# opentui requested — gate on platform support.
unsupported = sys.platform.startswith("win") or _is_termux_startup_environment()
if unsupported:
if not os.environ.get("HERMES_QUIET"):
where = "Windows" if sys.platform.startswith("win") else "Termux"
print(
f"HERMES_TUI_ENGINE=opentui is not supported on {where} "
f"(needs Node 26.3+ with experimental FFI) — falling back to the Ink engine.",
file=sys.stderr,
)
return "ink"
return "opentui"
NODE26_MIN_VERSION = (26, 3, 0)
def _node_version_tuple(node_bin: str) -> tuple[int, int, int] | None:
"""Return (major, minor, patch) for a node binary, or ``None`` if unreadable."""
try:
out = subprocess.run([node_bin, "--version"], capture_output=True, text=True, timeout=5)
except Exception:
return None
if out.returncode != 0:
return None
raw = (out.stdout or "").strip().lstrip("v").split("-", 1)[0]
parts = raw.split(".")
try:
return (int(parts[0]), int(parts[1]), int(parts[2]))
except (IndexError, ValueError):
return None
def _fnm_node26_candidates() -> list[str]:
"""Node binaries from fnm's installed versions, newest first.
fnm keeps each version at ``<FNM_DIR>/node-versions/v<X.Y.Z>/installation/
bin/node`` (default ``FNM_DIR``: ``$XDG_DATA_HOME/fnm`` or ``~/.local/share/
fnm``; macOS Homebrew also uses ``~/Library/Application Support/fnm``). When
the *active* node is older than 26.3 — e.g. the user's fnm default is on
v25 — the right 26.x is still installed and usable; surface it so OpenTUI
works without the user re-aliasing their global default. Version-sorted so
the newest qualifying node wins.
"""
roots: list[Path] = []
fnm_dir = os.environ.get("FNM_DIR")
if fnm_dir:
roots.append(Path(fnm_dir))
xdg = os.environ.get("XDG_DATA_HOME")
if xdg:
roots.append(Path(xdg) / "fnm")
roots.append(Path.home() / ".local" / "share" / "fnm")
roots.append(Path.home() / "Library" / "Application Support" / "fnm")
seen: set[Path] = set()
found: list[tuple[tuple[int, int, int], str]] = []
for root in roots:
versions_dir = root / "node-versions"
if versions_dir in seen or not versions_dir.is_dir():
continue
seen.add(versions_dir)
try:
entries = list(versions_dir.iterdir())
except OSError:
continue
for entry in entries:
node_bin = entry / "installation" / "bin" / "node"
if not (node_bin.is_file() and os.access(node_bin, os.X_OK)):
continue
# Trust the directory name for sorting; the real probe happens in
# the caller (a renamed/symlinked dir still gets version-checked).
name = entry.name.lstrip("v").split("-", 1)[0]
parts = name.split(".")
try:
ver = (int(parts[0]), int(parts[1]), int(parts[2]))
except (IndexError, ValueError):
ver = (0, 0, 0)
found.append((ver, str(node_bin)))
found.sort(key=lambda pair: pair[0], reverse=True)
return [path for _, path in found]
def _node26_bin_or_none() -> str | None:
"""Resolve a Node >= 26.3.0 binary (no exit — a probe), or ``None``.
Order: ``HERMES_NODE`` override > ``node`` on PATH > newest fnm-installed
version. Each is gated on the real ``--version`` being >= 26.3.0. OpenTUI's
native renderer loads via the experimental ``node:ffi`` API that only exists
on Node 26.3+, so an older Node is treated as "not available" — but an
installed-yet-inactive 26.x (common when fnm's default is on an older line)
is discovered and used so the engine still launches.
"""
candidates: list[str] = []
env_node = os.environ.get("HERMES_NODE")
if env_node and os.path.isfile(env_node) and os.access(env_node, os.X_OK):
candidates.append(env_node)
path = shutil.which("node")
if path:
candidates.append(path)
candidates.extend(_fnm_node26_candidates())
for cand in candidates:
ver = _node_version_tuple(cand)
if ver is not None and ver >= NODE26_MIN_VERSION:
return cand
return None
def _node26_bin() -> str:
"""Resolve Node >= 26.3.0 for the OpenTUI engine, or exit with a clear message.
Use :func:`_node26_bin_or_none` for a non-fatal availability probe.
"""
node = _node26_bin_or_none()
if node is not None:
return node
print(
"Node.js >= 26.3.0 not found — the OpenTUI TUI engine needs it for the "
"experimental node:ffi renderer.\n"
"Install Node 26.3+ (e.g. via fnm/nvm) or set HERMES_NODE=/path/to/node, "
"or unset HERMES_TUI_ENGINE to use the default Ink engine.",
file=sys.stderr,
)
sys.exit(1)
def _opentui_npm() -> str:
"""Resolve npm (ships with Node) to build the OpenTUI bundle, or exit."""
npm = shutil.which("npm")
if npm:
return npm
print(
"npm not found — needed to build the OpenTUI engine bundle.\n"
"Install Node 26.3+ (it ships npm), or unset HERMES_TUI_ENGINE for Ink.",
file=sys.stderr,
)
sys.exit(1)
def _opentui_available() -> bool:
"""Whether the OpenTUI engine can actually launch on this host.
True only when the platform is supported (not Windows/Termux), a Node >= 26.3
binary resolves (the node:ffi floor), AND the v2 package is BUILT
(``dist/main.js``) with its ``node_modules`` installed. This gates the DEFAULT
engine: a host genuinely set up for OpenTUI defaults to it; everyone else stays
on Ink. An explicit ``HERMES_TUI_ENGINE`` env or ``display.tui_engine`` config
choice bypasses this probe (and triggers an on-demand build).
"""
if sys.platform.startswith("win") or _is_termux_startup_environment():
return False
if _node26_bin_or_none() is None:
return False
pkg = PROJECT_ROOT / "ui-opentui"
built = pkg / "dist" / "main.js"
return built.is_file() and (pkg / "node_modules" / "@opentui").is_dir()
def _make_opentui_argv(tui_dev: bool) -> tuple[list[str], Path]:
"""Argv for the native OpenTUI engine under Node 26 (no Bun).
Builds the Solid + Effect-at-boundary engine (``ui-opentui``) with esbuild
(``npm run build`` → ``dist/main.js``) when the bundle is missing (or always, in
``--dev``), then launches it on Node with the experimental FFI flag:
node --experimental-ffi --no-warnings dist/main.js
``--no-warnings`` keeps the ExperimentalWarning off the TUI's stderr. Returns the
argv and the package cwd.
The spawned ``tui_gateway`` resolves its Python from ``HERMES_PYTHON_SRC_ROOT``
(the caller sets it to ``PROJECT_ROOT``); the built bundle's own fallback also
walks up to the checkout root, so the gateway resolves correctly either way.
"""
app_dir = PROJECT_ROOT / "ui-opentui"
entry_src = app_dir / "src" / "entry" / "main.tsx"
if not entry_src.is_file():
print(
f"OpenTUI v2 engine entry not found at {entry_src}.\n"
f"Unset HERMES_TUI_ENGINE to use the default Ink engine.",
file=sys.stderr,
)
sys.exit(1)
node = _node26_bin()
# The esbuild build needs the package's node_modules (esbuild + the @opentui
# packages + the native blob). Without them the build/launch dies cryptically.
if not (app_dir / "node_modules" / "@opentui").is_dir():
print(
f"OpenTUI engine dependencies are not installed in {app_dir}.\n"
f"Run: (cd {app_dir} && npm install)\n"
f"Or unset HERMES_TUI_ENGINE to use the default Ink engine.",
file=sys.stderr,
)
sys.exit(1)
built = app_dir / "dist" / "main.js"
if tui_dev or not built.is_file():
npm = _opentui_npm()
if not os.environ.get("HERMES_QUIET"):
print("Building the OpenTUI engine…", file=sys.stderr)
result = subprocess.run(
[npm, "run", "build"],
cwd=str(app_dir),
capture_output=True,
text=True,
)
if result.returncode != 0:
combined = f"{result.stdout or ''}{result.stderr or ''}".strip()
preview = "\n".join(combined.splitlines()[-30:])
print("OpenTUI engine build failed.", file=sys.stderr)
if preview:
print(preview, file=sys.stderr)
sys.exit(1)
# --expose-gc (parity with Ink, main.py ~1909): makes `global.gc()` a real
# callable so the OpenTUI engine's GC hooks (W2 proactive idle GC; /heapdump)
# work instead of being silent no-ops. MUST be an argv flag — Node rejects
# --expose-gc in NODE_OPTIONS (see the heap-cap injection below).
return [node, "--experimental-ffi", "--no-warnings", "--expose-gc", str(built)], app_dir
def _make_tui_argv(tui_dir: Path, tui_dev: bool) -> tuple[list[str], Path]:
"""TUI: --dev → tsx src; else node dist (HERMES_TUI_DIR prebuilt or esbuild)."""
"""TUI: --dev → tsx src; else node dist (HERMES_TUI_DIR prebuilt or esbuild).
Dual-engine: when ``HERMES_TUI_ENGINE``/``display.tui_engine`` selects the
native OpenTUI engine, dispatch to ``_make_opentui_argv`` (Node 26 + its own
esbuild build) BEFORE the Ink Node bootstrap — the OpenTUI engine resolves its
own Node >= 26.3 and builds its own bundle, so it must not be routed through
``_ensure_tui_node`` / the Ink prebuilt-dir logic.
"""
if _resolve_tui_engine() == "opentui":
return _make_opentui_argv(tui_dev)
_ensure_tui_node()
def _node_bin(bin: str) -> str:
@@ -1877,6 +2155,57 @@ def _read_cgroup_memory_limit() -> Optional[int]:
return None
def _config_tui_heap_mb_early() -> int | None:
"""Read ``display.tui_heap_mb`` from config via a minimal YAML read.
Returns the configured V8 heap cap in MB, or ``None`` when unset/unreadable.
Mirrors :func:`_config_tui_engine_early`. A non-secret behavioral setting, so
it lives in ``config.yaml`` (NOT a ``HERMES_*`` env / the NODE_OPTIONS bridge,
which is denylisted) — the ``HERMES_TUI_HEAP_MB`` env is only the per-launch
override on top of this.
"""
try:
home = os.environ.get("HERMES_HOME")
cfg_path = (
os.path.join(home, "config.yaml")
if home
else os.path.join(os.path.expanduser("~"), ".hermes", "config.yaml")
)
if os.path.exists(cfg_path):
import yaml as _yaml_heap
with open(cfg_path, encoding="utf-8") as _f:
raw = _yaml_heap.safe_load(_f) or {}
disp = raw.get("display", {})
if isinstance(disp, dict):
val = disp.get("tui_heap_mb")
if isinstance(val, bool): # guard: YAML true/false is an int subclass
return None
if isinstance(val, int) and val > 0:
return val
if isinstance(val, str) and val.strip().isdigit():
n = int(val.strip())
if n > 0:
return n
except Exception:
pass
return None
def _resolve_tui_heap_override() -> int | None:
"""The user's explicit V8 heap cap (MB), or ``None`` for the default path.
Precedence: ``HERMES_TUI_HEAP_MB`` env > ``display.tui_heap_mb`` config
(matches the ``HERMES_TUI_ENGINE`` env-first pattern). Honored by BOTH engines
via the shared ``NODE_OPTIONS`` injection. A positive integer wins; anything
else (unset/garbage/non-positive) falls through to the cgroup-aware default.
"""
env_val = os.environ.get("HERMES_TUI_HEAP_MB", "").strip()
if env_val.isdigit() and int(env_val) > 0:
return int(env_val)
return _config_tui_heap_mb_early()
def _resolve_tui_heap_mb(default_mb: int = 8192) -> int:
"""Pick a V8 ``--max-old-space-size`` (MB) that fits the container.
@@ -1885,7 +2214,16 @@ def _resolve_tui_heap_mb(default_mb: int = 8192) -> int:
cgroup limit so the heap + non-heap RSS stays under the cgroup ceiling,
clamped to a sane floor (1536MB — below this V8 GC-thrashes and the TUI
is barely usable). Never exceeds ``default_mb``.
An explicit ``HERMES_TUI_HEAP_MB`` env / ``display.tui_heap_mb`` config
override REPLACES the 8192 default (D3): setting it low is the low-mem opt-in,
setting it high raises the ceiling. The cgroup-fit clamp still applies on top
so an override never exceeds what the container can hold — a low override is
honored as-is, a too-high one is still trimmed to ~75% of the cgroup limit.
"""
override = _resolve_tui_heap_override()
if override is not None:
default_mb = override
limit = _read_cgroup_memory_limit()
if not limit:
return default_mb
@@ -1902,7 +2240,8 @@ def _resolve_tui_heap_mb(default_mb: int = 8192) -> int:
def _launch_tui(
resume_session_id: Optional[str] = None,
# str session id, the bare-`--resume` picker sentinel True, or None.
resume_session_id: "Optional[str | bool]" = None,
tui_dev: bool = False,
model: Optional[str] = None,
provider: Optional[str] = None,
@@ -1921,6 +2260,14 @@ def _launch_tui(
"""Replace current process with the TUI."""
tui_dir = PROJECT_ROOT / "ui-tui"
# Bare `--resume` arrives as the argparse sentinel True: open the TUI
# resume picker instead of resuming a specific session id. Normalize it
# here so everything downstream (exit summary, env forwarding) keeps
# seeing either a real session id string or None.
resume_picker = resume_session_id is True
if resume_picker:
resume_session_id = None
import tempfile
env = os.environ.copy()
@@ -1934,11 +2281,31 @@ def _launch_tui(
)
os.close(active_session_fd)
env["HERMES_TUI_ACTIVE_SESSION_FILE"] = active_session_file
# Tree-sitter grammar cache for the OpenTUI engine: grammars are fetched
# from GitHub on first use and cached here (profile-aware). Unset → OpenTUI
# falls back to its XDG default ($XDG_DATA_HOME/opentui). See
# ui-opentui/src/boundary/parsers.ts.
try:
from hermes_cli.config import get_hermes_home
env["HERMES_TUI_PARSER_CACHE"] = str(
get_hermes_home() / "cache" / "opentui-parsers"
)
except Exception:
logger.debug("Failed to resolve OpenTUI parser cache dir", exc_info=True)
env["HERMES_PYTHON_SRC_ROOT"] = os.environ.get(
"HERMES_PYTHON_SRC_ROOT", str(PROJECT_ROOT)
)
env.setdefault("HERMES_PYTHON", sys.executable)
env.setdefault("HERMES_CWD", os.getcwd())
# The TUI subprocess is launched with cwd=<engine package dir> (so its
# build/resolution works), which means the gateway it spawns would otherwise
# auto-detect THAT dir as the workspace (chrome bar showed "ui-opentui" no
# matter where you ran hermes). TERMINAL_CWD is the gateway's canonical
# launch-dir channel (_completion_cwd) — set it to the real cwd here so the
# session, chrome bar, and terminal tool all anchor to where you actually
# are. Worktree mode overrides it to the worktree path below.
env.setdefault("TERMINAL_CWD", os.getcwd())
env.setdefault("NODE_ENV", "development" if tui_dev else "production")
wt_info = None
@@ -2015,6 +2382,11 @@ def _launch_tui(
# --expose-gc is *not* added here: Node rejects it in NODE_OPTIONS
# ("--expose-gc is not allowed in NODE_OPTIONS") and refuses to start.
# It is passed as a direct argv flag in _make_tui_argv() instead.
#
# Both TUI engines run on Node/V8 now — Ink, and the native OpenTUI engine
# (Node 26 + node:ffi). So --max-old-space-size (a V8/Node flag) applies to
# both. (Pre-Node-26 the OpenTUI engine ran on Bun/JavaScriptCore, which has
# no such flag; that gate is gone now that the engine is Node.)
_tokens = env.get("NODE_OPTIONS", "").split()
if not any(t.startswith("--max-old-space-size=") for t in _tokens):
_tokens.append(f"--max-old-space-size={_resolve_tui_heap_mb()}")
@@ -2027,7 +2399,11 @@ def _launch_tui(
# resolved for this invocation; direct `node ui-tui/dist/entry.js` users can
# still set HERMES_TUI_RESUME themselves.
env.pop("HERMES_TUI_RESUME", None)
if resume_session_id:
if resume_picker:
# Bare --resume: tell the TUI to open the resume picker before any
# session.create (create is lazy, so nothing is wasted).
env["HERMES_TUI_RESUME"] = "picker"
elif resume_session_id:
env["HERMES_TUI_RESUME"] = resume_session_id
argv, cwd = _make_tui_argv(tui_dir, tui_dev)
@@ -2136,6 +2512,18 @@ def cmd_chat(args):
"""Run interactive chat CLI."""
use_tui = _resolve_use_tui(args)
# Bare `--resume` (argparse sentinel True) opens the TUI resume picker —
# `_launch_tui` translates it to HERMES_TUI_RESUME=picker. The classic
# REPL has no picker overlay, so point at the equivalents instead of
# silently resuming something the user didn't choose.
if getattr(args, "resume", None) is True and not use_tui:
print("Bare --resume opens the session picker, which requires the TUI.")
print(
"Use 'hermes --tui --resume', 'hermes --resume <id|title>', "
"'hermes -c', or 'hermes sessions browse'."
)
sys.exit(2)
# Resolve --continue into --resume with the latest session or by name
continue_val = getattr(args, "continue_last", None)
if continue_val and not getattr(args, "resume", None):
@@ -2161,9 +2549,10 @@ def cmd_chat(args):
print(f"No previous {kind} session found to continue.")
sys.exit(1)
# Resolve --resume by title if it's not a direct session ID
# Resolve --resume by title if it's not a direct session ID. The bare
# picker sentinel (True) is not a name — leave it for _launch_tui.
resume_val = getattr(args, "resume", None)
if resume_val:
if resume_val and resume_val is not True:
resolved = _resolve_session_by_name_or_id(resume_val)
if resolved:
args.resume = resolved

View File

@@ -268,7 +268,7 @@ emit_manifest() {
if [ "$INCLUDE_DESKTOP" = true ]; then
desktop_stage='{"name":"desktop","title":"Build desktop app","category":"runtime","needs_user_input":false},'
fi
printf '%s' '{"protocol_version":1,"stages":[{"name":"prerequisites","title":"System prerequisites","category":"runtime","needs_user_input":false},{"name":"repository","title":"Download Hermes Agent","category":"runtime","needs_user_input":false},{"name":"venv","title":"Create Python virtual environment","category":"runtime","needs_user_input":false},{"name":"python-deps","title":"Install Python dependencies","category":"runtime","needs_user_input":false},{"name":"node-deps","title":"Install browser-tool dependencies","category":"runtime","needs_user_input":false},{"name":"path","title":"Install hermes command","category":"runtime","needs_user_input":false},{"name":"config","title":"Prepare config and skills","category":"configuration","needs_user_input":false},{"name":"setup","title":"Configure API keys and settings","category":"configuration","needs_user_input":true},{"name":"gateway","title":"Configure gateway service","category":"configuration","needs_user_input":true},'"$desktop_stage"'{"name":"complete","title":"Finish install","category":"runtime","needs_user_input":false}]}'
printf '%s' '{"protocol_version":1,"stages":[{"name":"prerequisites","title":"System prerequisites","category":"runtime","needs_user_input":false},{"name":"repository","title":"Download Hermes Agent","category":"runtime","needs_user_input":false},{"name":"venv","title":"Create Python virtual environment","category":"runtime","needs_user_input":false},{"name":"python-deps","title":"Install Python dependencies","category":"runtime","needs_user_input":false},{"name":"node-deps","title":"Install browser-tool dependencies","category":"runtime","needs_user_input":false},{"name":"opentui-engine","title":"Set up OpenTUI engine","category":"runtime","needs_user_input":false},{"name":"path","title":"Install hermes command","category":"runtime","needs_user_input":false},{"name":"config","title":"Prepare config and skills","category":"configuration","needs_user_input":false},{"name":"setup","title":"Configure API keys and settings","category":"configuration","needs_user_input":true},{"name":"gateway","title":"Configure gateway service","category":"configuration","needs_user_input":true},'"$desktop_stage"'{"name":"complete","title":"Finish install","category":"runtime","needs_user_input":false}]}'
printf '\n'
}
@@ -1980,6 +1980,76 @@ install_node_deps() {
restore_dirty_lockfiles "$INSTALL_DIR"
}
# Provision the native OpenTUI engine on NODE 26.3+ (no Bun): `npm install` +
# `npm run build` (esbuild → dist/main.js) in ui-opentui. The engine's
# renderer loads via the experimental `node:ffi` API that only exists on Node
# 26.3+. The launcher (hermes_cli/main.py:_opentui_available) only uses OpenTUI
# when a Node >= 26.3 resolves AND the v2 package is built; otherwise it falls
# back to the Ink engine. So this stage is STRICTLY best-effort: any failure
# (unsupported platform, Node < 26.3, no network, install/build fails) logs a
# warning and returns 0. A skipped OpenTUI setup just means the user gets Ink —
# breaking the install would be far worse than skipping OpenTUI. Every sub-step
# is guarded; this function never `exit`s and never returns non-zero.
install_opentui() {
# node:ffi isn't validated on Windows/Termux — keep those hosts on Ink.
if [ "$OS" = "windows" ] || [ "$DISTRO" = "termux" ] || [ "$OS" = "android" ]; then
log_info "Skipping OpenTUI engine (unsupported platform) — using Ink."
return 0
fi
# Only meaningful if the v2 package is present in this checkout.
if [ ! -f "$INSTALL_DIR/ui-opentui/package.json" ]; then
log_info "Skipping OpenTUI engine (ui-opentui not present) — using Ink."
return 0
fi
log_info "Setting up OpenTUI engine (native TUI, Node 26.3+ / node:ffi)..."
# Resolve a Node >= 26.3.0 (the node:ffi floor): HERMES_NODE > node on PATH,
# version-checked. We do NOT install Node here — if one new enough isn't
# available the launcher cleanly falls back to Ink.
local node_bin=""
for cand in "${HERMES_NODE:-}" "$(command -v node 2>/dev/null || true)"; do
[ -n "$cand" ] && [ -x "$cand" ] || continue
if "$cand" -e 'const p=process.versions.node.split(".").map(Number); process.exit(p[0]>26||(p[0]===26&&p[1]>=3)?0:1)' 2>/dev/null; then
node_bin="$cand"
break
fi
done
if [ -z "$node_bin" ]; then
log_warn "OpenTUI engine setup skipped (needs Node >= 26.3.0; none found) — using the Ink engine. Install Node 26.3+ or set HERMES_NODE."
return 0
fi
log_success "Node found ($("$node_bin" --version 2>/dev/null || echo "unknown"))"
# npm ships with Node; the build (`node scripts/build.mjs`) runs fine on any
# recent Node — only the runtime needs 26.3, which the launcher re-checks.
local npm_bin
npm_bin="$(command -v npm 2>/dev/null || true)"
if [ -z "$npm_bin" ]; then
log_warn "OpenTUI engine setup skipped (npm not found) — using the Ink engine."
return 0
fi
cd "$INSTALL_DIR/ui-opentui" || { log_warn "OpenTUI engine setup skipped (cd failed) — using Ink."; return 0; }
# Pull deps (fetches the per-arch @opentui/core-<arch> native lib) then build
# the Node bundle (dist/main.js). Both idempotent.
log_info "Installing OpenTUI dependencies (npm install)..."
if ! "$npm_bin" install --no-audit --no-fund >/dev/null 2>&1; then
log_warn "OpenTUI engine setup skipped (npm install failed) — the Ink engine will be used."
return 0
fi
log_info "Building OpenTUI engine (npm run build)..."
if ! "$npm_bin" run build >/dev/null 2>&1; then
log_warn "OpenTUI engine setup skipped (build failed) — the Ink engine will be used."
return 0
fi
log_success "OpenTUI engine ready (opt-in: HERMES_TUI_ENGINE=opentui; default is Ink)."
return 0
}
run_setup_wizard() {
if [ "$RUN_SETUP" = false ]; then
log_info "Skipping setup wizard (--skip-setup)"
@@ -2636,6 +2706,12 @@ run_stage_body() {
check_node
install_node_deps
;;
opentui-engine)
detect_os
resolve_install_layout
require_install_dir
install_opentui
;;
path)
detect_os
resolve_install_layout
@@ -2743,6 +2819,7 @@ main() {
setup_venv
install_deps
install_node_deps
install_opentui
setup_path
copy_config_templates
run_setup_wizard

View File

@@ -0,0 +1,69 @@
"""Node-26 resolution for the OpenTUI engine + the launch-cwd channel.
Regression coverage for two ways the local TUI silently fell back to Ink /
showed the wrong directory:
1. fnm's active/default node was on an older line (v25) while a usable v26.3
sat installed-but-inactive — ``_node26_bin_or_none`` only checked
``HERMES_NODE`` + ``which node`` and so reported "no node 26" → OpenTUI
unavailable → Ink fallback.
2. ``TERMINAL_CWD`` (the gateway's launch-dir channel) was only exported in
worktree mode, so a normal launch let the gateway auto-detect the engine's
own package dir as the workspace.
"""
import os
import stat
import pytest
import hermes_cli.main as main_mod
def _fake_node(path, version: str) -> None:
"""Write a stub `node` that prints `version` for `--version`."""
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(f'#!/bin/sh\necho "{version}"\n')
path.chmod(path.stat().st_mode | stat.S_IEXEC | stat.S_IXGRP | stat.S_IXOTH)
class TestFnmNode26Discovery:
def test_discovers_inactive_v26_when_default_is_older(self, tmp_path, monkeypatch):
"""A v26.3 installed under fnm is found even when PATH `node` is v25."""
fnm_dir = tmp_path / "fnm"
for ver in ("24.11.0", "25.9.0", "26.3.0"):
_fake_node(fnm_dir / "node-versions" / f"v{ver}" / "installation" / "bin" / "node", f"v{ver}")
monkeypatch.setenv("FNM_DIR", str(fnm_dir))
monkeypatch.delenv("HERMES_NODE", raising=False)
# PATH node is the too-old default (v25).
monkeypatch.setattr(main_mod.shutil, "which", lambda _b: str(
fnm_dir / "node-versions" / "v25.9.0" / "installation" / "bin" / "node"
))
resolved = main_mod._node26_bin_or_none()
assert resolved is not None
assert "v26.3.0" in resolved # newest qualifying, not the v25 default
def test_candidates_sorted_newest_first(self, tmp_path, monkeypatch):
fnm_dir = tmp_path / "fnm"
for ver in ("26.1.0", "26.4.0", "25.0.0"):
_fake_node(fnm_dir / "node-versions" / f"v{ver}" / "installation" / "bin" / "node", f"v{ver}")
monkeypatch.setenv("FNM_DIR", str(fnm_dir))
cands = main_mod._fnm_node26_candidates()
# Directory-name order: 26.4 before 26.1 before 25.0.
idx = [next(i for i, c in enumerate(cands) if f"v{v}" in c) for v in ("26.4.0", "26.1.0", "25.0.0")]
assert idx == sorted(idx)
def test_no_fnm_dir_is_empty_not_error(self, tmp_path, monkeypatch):
monkeypatch.setenv("FNM_DIR", str(tmp_path / "does-not-exist"))
monkeypatch.setenv("XDG_DATA_HOME", str(tmp_path / "xdg-none"))
monkeypatch.setattr(main_mod.Path, "home", classmethod(lambda cls: tmp_path / "home-none"))
assert main_mod._fnm_node26_candidates() == []
def test_hermes_node_still_wins(self, tmp_path, monkeypatch):
"""An explicit HERMES_NODE >= 26.3 takes precedence over fnm discovery."""
explicit = tmp_path / "explicit" / "node"
_fake_node(explicit, "v26.5.0")
monkeypatch.setenv("HERMES_NODE", str(explicit))
monkeypatch.setattr(main_mod.shutil, "which", lambda _b: None)
assert main_mod._node26_bin_or_none() == str(explicit)

View File

@@ -99,6 +99,70 @@ class TestResolveTuiHeapMb:
assert self._resolve(64 * GB) == 8192
class TestHeapOverride:
"""HERMES_TUI_HEAP_MB env / display.tui_heap_mb config override (W1/D3).
The override REPLACES the 8192 default; the cgroup-fit clamp still applies on
top so a too-high override can't exceed the container. Precedence: env > config.
"""
def _resolve(self, limit_bytes, env=None, config_mb=None):
with mock.patch.object(m, "_read_cgroup_memory_limit", return_value=limit_bytes), \
mock.patch.object(m, "_config_tui_heap_mb_early", return_value=config_mb), \
mock.patch.dict(m.os.environ, env or {}, clear=False):
if env is None:
m.os.environ.pop("HERMES_TUI_HEAP_MB", None)
return m._resolve_tui_heap_mb()
def test_env_override_unconstrained(self):
# explicit low cap, no cgroup limit -> used as-is (the low-mem opt-in).
assert self._resolve(None, env={"HERMES_TUI_HEAP_MB": "256"}) == 256
def test_env_override_raises_ceiling(self):
# a higher-than-default cap is honored when unconstrained.
assert self._resolve(None, env={"HERMES_TUI_HEAP_MB": "16384"}) == 16384
def test_env_wins_over_config(self):
assert self._resolve(None, env={"HERMES_TUI_HEAP_MB": "512"}, config_mb=4096) == 512
def test_config_used_when_no_env(self):
assert self._resolve(None, config_mb=2048) == 2048
def test_override_still_cgroup_clamped(self):
# user asks for 16GB but the container is 4GB -> trimmed to 75% = 3072.
assert self._resolve(4 * GB, env={"HERMES_TUI_HEAP_MB": "16384"}) == 3072
def test_low_override_honored_under_big_container(self):
# a deliberately low cap is NOT raised by a roomy container.
assert self._resolve(16 * GB, env={"HERMES_TUI_HEAP_MB": "256"}) == 256
def test_garbage_env_falls_through_to_default(self):
assert self._resolve(None, env={"HERMES_TUI_HEAP_MB": "nope"}) == 8192
def test_nonpositive_env_falls_through(self):
assert self._resolve(None, env={"HERMES_TUI_HEAP_MB": "0"}) == 8192
class TestExposeGcOnOpenTuiArgv:
"""W1/D4: the OpenTUI engine argv must carry --expose-gc (parity with Ink) so
global.gc() is a real call, not a no-op."""
def test_opentui_argv_has_expose_gc(self, tmp_path):
app_dir = tmp_path / "ui-opentui"
(app_dir / "src" / "entry").mkdir(parents=True)
(app_dir / "src" / "entry" / "main.tsx").write_text("// entry")
(app_dir / "node_modules" / "@opentui").mkdir(parents=True)
(app_dir / "dist").mkdir()
(app_dir / "dist" / "main.js").write_text("// built")
with mock.patch.object(m, "PROJECT_ROOT", tmp_path), \
mock.patch.object(m, "_node26_bin", return_value="/usr/bin/node"):
argv, cwd = m._make_opentui_argv(tui_dev=False)
assert "--expose-gc" in argv
assert argv[0] == "/usr/bin/node"
assert argv[-1].endswith("dist/main.js")
assert cwd == app_dir
class TestNodeOptionsTokenMerge:
"""The _launch_tui token-merge block must add the sized cap unless the user
already supplied one, and must preserve unrelated NODE_OPTIONS flags."""

View File

@@ -14,6 +14,20 @@ def main_mod():
return m
@pytest.fixture(autouse=True)
def _pin_ink_engine(monkeypatch):
"""These tests exercise the Ink/npm bootstrap inside ``_make_tui_argv``.
The dual-engine dispatch (``_resolve_tui_engine``) auto-selects the native
OpenTUI engine whenever ``ui-opentui/dist`` is built in the repo, which
would route ``_make_tui_argv`` away from the npm path under test. Pin the
engine to ink, mirroring test_tui_resume_flow.py.
"""
import hermes_cli.main as m
monkeypatch.setattr(m, "_resolve_tui_engine", lambda: "ink")
def _touch_ink(root: Path) -> None:
ink = root / "node_modules" / "@hermes" / "ink" / "package.json"
ink.parent.mkdir(parents=True, exist_ok=True)

View File

@@ -118,6 +118,82 @@ def test_cmd_chat_tui_resume_resolves_title_before_launch(monkeypatch, main_mod)
assert captured["resume"] == "20260409_000000_aa11bb"
def test_bare_resume_parses_to_picker_sentinel():
from hermes_cli._parser import build_top_level_parser
parser, _subparsers, _chat_parser = build_top_level_parser()
args = parser.parse_args(["--tui", "--resume"])
assert args.resume is True
args = parser.parse_args(["--resume", "abc123"])
assert args.resume == "abc123"
args = parser.parse_args(["chat", "--tui", "--resume"])
assert args.resume is True
def test_cmd_chat_tui_bare_resume_skips_resolution_and_launches_picker(
monkeypatch, main_mod
):
captured = {}
def fake_launch(resume_session_id=None, **kwargs):
captured["resume"] = resume_session_id
raise SystemExit(0)
def boom(_val):
raise AssertionError("bare --resume must not hit name/id resolution")
monkeypatch.setattr(main_mod, "_resolve_session_by_name_or_id", boom)
monkeypatch.setattr(main_mod, "_launch_tui", fake_launch)
with pytest.raises(SystemExit):
main_mod.cmd_chat(_args(resume=True))
assert captured["resume"] is True
def test_cmd_chat_bare_resume_without_tui_exits_with_guidance(
monkeypatch, capsys, main_mod
):
monkeypatch.setattr(main_mod, "_resolve_use_tui", lambda args: False)
monkeypatch.setattr(
main_mod,
"_launch_tui",
lambda *a, **kw: pytest.fail("must not launch the TUI"),
)
with pytest.raises(SystemExit) as exc:
main_mod.cmd_chat(_args(tui=False, resume=True))
assert exc.value.code == 2
out = capsys.readouterr().out
assert "requires the TUI" in out
assert "hermes --tui --resume" in out
def test_launch_tui_sets_picker_env_for_bare_resume(monkeypatch, main_mod):
captured = {}
monkeypatch.setenv("HERMES_TUI_RESUME", "stale-missing-session")
monkeypatch.setattr(
main_mod,
"_make_tui_argv",
lambda tui_dir, tui_dev: (["node", "dist/entry.js"], Path(".")),
)
monkeypatch.setattr(
main_mod.subprocess,
"call",
lambda argv, cwd=None, env=None: captured.update({"env": env}) or 1,
)
with pytest.raises(SystemExit):
main_mod._launch_tui(resume_session_id=True)
assert captured["env"]["HERMES_TUI_RESUME"] == "picker"
def test_cmd_chat_tui_passes_model_and_provider(monkeypatch, main_mod):
captured = {}
@@ -1008,6 +1084,10 @@ def test_make_tui_argv_dev_prebuilds_hermes_ink(monkeypatch, main_mod, tmp_path)
monkeypatch.setattr(main_mod, "_tui_need_npm_install", lambda _tui_dir: False)
monkeypatch.delenv("HERMES_TUI_DIR", raising=False)
monkeypatch.setattr(main_mod.shutil, "which", lambda bin_name: f"/usr/bin/{bin_name}")
# _make_tui_argv now dispatches on the TUI engine first; resolving "opentui"
# availability probes `node --version` (a subprocess.run this test would
# otherwise record). Pin the Ink engine — this test covers the Ink dev path.
monkeypatch.setattr(main_mod, "_resolve_tui_engine", lambda: "ink")
calls = []

View File

@@ -7,6 +7,7 @@ import time
import types
from datetime import datetime
from pathlib import Path
import pytest
from unittest.mock import patch
from hermes_constants import reset_hermes_home_override, set_hermes_home_override
@@ -380,6 +381,78 @@ def test_tui_verbose_tool_events_omit_details_when_redaction_fails(monkeypatch):
assert "result_text" not in events[1][2]
def test_tool_complete_emits_full_unified_diff(monkeypatch):
events: list[tuple[str, str, dict]] = []
monkeypatch.setattr(
server, "_emit", lambda event_type, sid, payload: events.append((event_type, sid, payload))
)
monkeypatch.setitem(
server._sessions,
"diff-test",
{"tool_progress_mode": "concise", "tool_started_at": {}, "edit_snapshots": {}},
)
diff = "--- a/x.py\n+++ b/x.py\n@@ -1 +1 @@\n-a = 1\n+a = 2\n"
result = json.dumps({"success": True, "diff": diff})
server._on_tool_complete("diff-test", "tool-1", "patch", {"mode": "replace", "path": "x.py"}, result)
assert events and events[0][0] == "tool.complete"
payload = events[0][2]
# the raw unified diff rides alongside the pretty/capped inline_diff
assert payload["diff_unified"] == diff
assert "inline_diff" in payload
def test_verbose_result_text_drops_diff_echo_when_diff_unified_ships(monkeypatch):
# A tall edit's result JSON embeds the WHOLE diff; tail-capping that echo
# yields an unparseable JSON-looking fragment the TUI can't suppress
# reliably. When diff_unified ships, result_text must carry only the
# non-diff signal — small, parseable, never the diff echo.
events: list[tuple[str, str, dict]] = []
monkeypatch.setattr(
server, "_emit", lambda event_type, sid, payload: events.append((event_type, sid, payload))
)
monkeypatch.setitem(
server._sessions,
"diff-echo-test",
{"tool_progress_mode": "verbose", "tool_started_at": {}, "edit_snapshots": {}},
)
lines = "\n".join(f"+def fn_{i}() -> int: return {i}" for i in range(60))
diff = f"--- a/x.py\n+++ b/x.py\n@@ -1,0 +1,60 @@\n{lines}\n"
result = json.dumps(
{"success": True, "diff": diff, "files_modified": ["x.py"], "_warning": "stale read"}
)
server._on_tool_complete("diff-echo-test", "tool-1", "patch", {"mode": "patch"}, result)
payload = events[0][2]
assert payload["diff_unified"] == diff
text = payload["result_text"]
assert "[showing verbose tail" not in text # small enough to dodge the cap
parsed = json.loads(text) # parseable …
assert "diff" not in parsed # … with the echo gone
assert parsed["_warning"] == "stale read" # non-diff signal survives
# without diff_unified (non-edit tools) the result_text is untouched
assert server._result_sans_diff_echo("plain text result") == "plain text result"
assert server._result_sans_diff_echo('{"output": "x"}') == '{"output": "x"}'
def test_cap_diff_unified_truncates_at_line_boundary():
line = "+" + "x" * 63 # 64 bytes per line incl. newline
diff = "\n".join([line] * 100)
capped = server._cap_diff_unified(diff, max_bytes=1000)
body, _, marker = capped.rpartition("\n")
assert marker.startswith("# … diff truncated (")
assert marker.endswith(" more bytes)")
assert len(body.encode("utf-8")) <= 1000
# cut on a line boundary: every surviving line is intact
assert all(l == line for l in body.split("\n"))
# under the cap → untouched
assert server._cap_diff_unified("small", max_bytes=1000) == "small"
def test_dispatch_rejects_non_object_request():
resp = server.dispatch([])
@@ -4162,6 +4235,51 @@ def test_session_info_includes_mcp_servers(monkeypatch):
assert info["mcp_servers"] == fake_status
def test_session_info_includes_session_title(monkeypatch):
"""session.info carries the live session title (window-title chrome).
Resolution order mirrors _session_live_title: DB row wins, a queued
pending_title fills in before the row exists, "" until either lands.
"""
agent = types.SimpleNamespace(tools=[], model="m", provider="p")
# No session at all -> "" (and never a crash).
assert server._session_info(agent, None)["title"] == ""
# pending_title before the DB row exists.
session = {"session_key": "k1", "pending_title": "rename the moon"}
monkeypatch.setattr(server, "_get_db", lambda: None)
assert server._session_info(agent, session)["title"] == "rename the moon"
# DB row wins over pending.
fake_db = types.SimpleNamespace(get_session_title=lambda key: "db title")
monkeypatch.setattr(server, "_get_db", lambda: fake_db)
assert server._session_info(agent, session)["title"] == "db title"
def test_emit_title_refresh_pushes_session_info(monkeypatch):
"""_emit_title_refresh emits a session.info for a live session and is a
silent no-op for unknown/agent-less sessions (it runs on the auto-title
worker thread -- it must never raise)."""
events = []
monkeypatch.setattr(
server, "_emit", lambda event_type, sid, payload: events.append((event_type, sid, payload))
)
monkeypatch.setattr(server, "_session_info", lambda agent, session: {"title": "t"})
agent = types.SimpleNamespace(tools=[], model="m", provider="p")
monkeypatch.setitem(server._sessions, "title-test", {"agent": agent, "session_key": "k"})
server._emit_title_refresh("title-test")
assert events == [("session.info", "title-test", {"title": "t"})]
# Unknown sid / no agent -> no emission, no exception.
server._emit_title_refresh("missing-sid")
monkeypatch.setitem(server._sessions, "agentless", {"session_key": "k2"})
server._emit_title_refresh("agentless")
assert len(events) == 1
# ---------------------------------------------------------------------------
# History-mutating commands must reject while session.running is True.
# Without these guards, prompt.submit's post-run history write either
@@ -4562,6 +4680,90 @@ def test_respond_unpacks_sid_tuple_correctly():
server._answers.pop("rid-x", None)
# ---------------------------------------------------------------------------
# Blocking prompts wait for the human (v6 north-star #5): _block with
# timeout=None must never expire — interrupt/shutdown (_clear_pending)
# are the only releases.
# ---------------------------------------------------------------------------
def _run_block_in_thread(monkeypatch, sid):
"""Start _block(timeout=None) on a background thread; return
(thread, results, get_rid) where get_rid polls for the pending rid."""
monkeypatch.setattr(server, "_emit", lambda *a, **kw: None)
results: list[str] = []
def runner():
results.append(server._block("clarify.request", sid, {"question": "q"}))
t = threading.Thread(target=runner, daemon=True)
t.start()
def get_rid():
deadline = time.time() + 5
while time.time() < deadline:
with server._prompt_lock:
for rid, (owner, _ev) in server._pending.items():
if owner == sid:
return rid
time.sleep(0.01)
raise AssertionError("pending rid never appeared for sid=%s" % sid)
return t, results, get_rid
def test_block_no_timeout_waits_for_delayed_answer(monkeypatch):
"""_block(timeout=None) must keep blocking until the answer arrives —
no premature empty return."""
t, results, get_rid = _run_block_in_thread(monkeypatch, "sid_block_wait")
rid = get_rid()
# Answer after a short delay; _block must still be waiting.
time.sleep(0.3)
assert t.is_alive(), "_block returned before any answer was provided"
with server._prompt_lock:
server._answers[rid] = "green"
server._pending[rid][1].set()
t.join(timeout=5)
assert not t.is_alive()
assert results == ["green"]
def test_clear_pending_releases_no_timeout_block(monkeypatch):
"""_clear_pending(sid) must release a timeout=None _block with ''."""
t, results, get_rid = _run_block_in_thread(monkeypatch, "sid_block_clear")
get_rid()
server._clear_pending("sid_block_clear")
t.join(timeout=5)
assert not t.is_alive()
assert results == [""]
def test_clear_pending_other_sid_does_not_release_block(monkeypatch):
"""_clear_pending on an unrelated session must NOT release a pending
timeout=None _block (session scoping)."""
t, results, get_rid = _run_block_in_thread(monkeypatch, "sid_block_scoped")
rid = get_rid()
server._clear_pending("sid_some_other_session")
time.sleep(0.2)
assert t.is_alive(), (
"_clear_pending on another sid released a prompt owned by "
"sid_block_scoped — session scoping is broken"
)
assert not results
# Clean up: release properly so the thread joins.
with server._prompt_lock:
server._answers[rid] = "done"
server._pending[rid][1].set()
t.join(timeout=5)
assert not t.is_alive()
assert results == ["done"]
# ---------------------------------------------------------------------------
# /model switch and other agent-mutating commands must reject while the
# session is running. agent.switch_model() mutates self.model, self.provider,
@@ -5776,6 +5978,351 @@ def test_session_most_recent_handles_db_unavailable(monkeypatch):
assert resp["result"]["session_id"] is None
# ── session.list (resume-picker filters + widened projection) ───────
def _picker_row(sid, source, started, **extra):
row = {
"id": sid,
"source": source,
"title": f"title-{sid}",
"preview": f"preview-{sid}",
"started_at": started,
"last_active": started,
"message_count": 3,
"ended_at": None,
"cwd": None,
"model": None,
}
row.update(extra)
return row
class _PickerDB:
"""list_sessions_rich stand-in honouring the kwargs session.list maps."""
def __init__(self, rows):
self.rows = rows
self.calls = []
def list_sessions_rich(
self,
*,
source=None,
limit=20,
offset=0,
order_by_last_active=False,
id_query=None,
):
self.calls.append(
{
"source": source,
"limit": limit,
"offset": offset,
"order_by_last_active": order_by_last_active,
"id_query": id_query,
}
)
rows = [dict(r) for r in self.rows]
if source:
rows = [r for r in rows if r.get("source") == source]
if id_query:
needle = id_query.strip().lower()
rows = [r for r in rows if needle in r["id"].lower()]
if order_by_last_active:
rows.sort(key=lambda r: r.get("last_active") or 0, reverse=True)
else:
rows.sort(key=lambda r: r.get("started_at") or 0, reverse=True)
return rows[offset : offset + limit]
def _picker_db():
return _PickerDB(
[
_picker_row(
"tui-1",
"tui",
100,
cwd="/home/u/proj",
model="nous/hermes-4",
ended_at=150.0,
),
_picker_row("cli-1", "cli", 90),
_picker_row("cron-1", "cron", 80),
_picker_row("cron-2", "cron", 70),
_picker_row("tg-1", "telegram", 60),
_picker_row("tool-1", "tool", 50),
]
)
def test_session_list_no_params_keeps_legacy_behavior_with_widened_rows(monkeypatch):
db = _picker_db()
monkeypatch.setattr(server, "_get_db", lambda: db)
resp = server.handle_request({"id": "1", "method": "session.list", "params": {}})
rows = resp["result"]["sessions"]
# Legacy semantics: started_at DESC, `tool` denied, one DB fetch.
assert [r["id"] for r in rows] == ["tui-1", "cli-1", "cron-1", "cron-2", "tg-1"]
assert db.calls == [
{
"source": None,
"limit": 400,
"offset": 0,
"order_by_last_active": False,
"id_query": None,
}
]
# Widened projection, None-safe for rows missing the new columns.
first, second = rows[0], rows[1]
assert first["cwd"] == "/home/u/proj"
assert first["model"] == "nous/hermes-4"
assert first["ended_at"] == 150.0
assert first["last_active"] == 100
assert second["cwd"] is None
assert second["model"] is None
assert second["ended_at"] is None
assert second["last_active"] == 90
# Legacy keys are still present and unchanged.
assert second["title"] == "title-cli-1"
assert second["preview"] == "preview-cli-1"
assert second["message_count"] == 3
assert second["source"] == "cli"
def test_session_list_single_source_passes_through_to_db(monkeypatch):
db = _picker_db()
monkeypatch.setattr(server, "_get_db", lambda: db)
resp = server.handle_request(
{"id": "1", "method": "session.list", "params": {"sources": ["tui"]}}
)
assert [r["id"] for r in resp["result"]["sessions"]] == ["tui-1"]
assert db.calls[0]["source"] == "tui" # pushed into SQL, not Python-filtered
def test_session_list_multi_source_filters_gateway_side(monkeypatch):
db = _picker_db()
monkeypatch.setattr(server, "_get_db", lambda: db)
resp = server.handle_request(
{
"id": "1",
"method": "session.list",
"params": {"sources": ["cron", "telegram"]},
}
)
assert [r["id"] for r in resp["result"]["sessions"]] == [
"cron-1",
"cron-2",
"tg-1",
]
assert db.calls[0]["source"] is None # multi-source: DB scan + gateway filter
def test_session_list_sources_tool_stays_denied(monkeypatch):
db = _picker_db()
monkeypatch.setattr(server, "_get_db", lambda: db)
resp = server.handle_request(
{"id": "1", "method": "session.list", "params": {"sources": ["tool"]}}
)
assert resp["result"]["sessions"] == []
def test_session_list_query_maps_to_id_query_on_last_active_path(monkeypatch):
db = _picker_db()
monkeypatch.setattr(server, "_get_db", lambda: db)
resp = server.handle_request(
{"id": "1", "method": "session.list", "params": {"query": "cron"}}
)
assert [r["id"] for r in resp["result"]["sessions"]] == ["cron-1", "cron-2"]
assert db.calls[0]["id_query"] == "cron"
assert db.calls[0]["order_by_last_active"] is True
def test_session_list_offset_limit_paginate(monkeypatch):
db = _picker_db()
monkeypatch.setattr(server, "_get_db", lambda: db)
def page(offset, limit):
resp = server.handle_request(
{
"id": "1",
"method": "session.list",
"params": {
"sources": ["tui", "cli", "cron", "telegram"],
"offset": offset,
"limit": limit,
},
}
)
return [r["id"] for r in resp["result"]["sessions"]]
assert page(0, 2) == ["tui-1", "cli-1"]
assert page(2, 2) == ["cron-1", "cron-2"]
assert page(4, 2) == ["tg-1"]
assert page(6, 2) == []
# ── session.peek (resume-picker Space preview) ───────────────────────
class _PeekDB:
def __init__(self, session, messages):
self.session = session
self.messages = messages
def get_session(self, session_id):
if self.session and session_id == self.session["id"]:
return dict(self.session)
return None
def get_messages(self, session_id):
return [dict(m) for m in self.messages]
def _peek_db():
session = {
"id": "sess-1",
"title": "picker demo",
"source": "tui",
"model": "nous/hermes-4",
"cwd": "/home/u/proj",
"started_at": 100.0,
"ended_at": 400.0,
"end_reason": "tui_shutdown",
"message_count": 6,
"actual_cost_usd": None,
"estimated_cost_usd": 0.42,
}
messages = [
{"id": 1, "role": "system", "content": "sys prompt", "timestamp": 100.0},
{"id": 2, "role": "user", "content": "first prompt", "timestamp": 110.0},
{"id": 3, "role": "assistant", "content": "first answer", "timestamp": 120.0},
{"id": 4, "role": "tool", "content": "tool output", "timestamp": 130.0},
{"id": 5, "role": "user", "content": "second prompt", "timestamp": 140.0},
{"id": 6, "role": "assistant", "content": "final answer", "timestamp": 150.0},
]
return _PeekDB(session, messages)
def test_session_peek_returns_metadata_and_head_tail(monkeypatch):
monkeypatch.setattr(server, "_get_db", _peek_db)
resp = server.handle_request(
{
"id": "1",
"method": "session.peek",
"params": {"session_id": "sess-1", "head": 1, "tail": 2},
}
)
result = resp["result"]
meta = result["session"]
assert meta == {
"id": "sess-1",
"title": "picker demo",
"source": "tui",
"model": "nous/hermes-4",
"cwd": "/home/u/proj",
"started_at": 100.0,
"ended_at": 400.0,
"end_reason": "tui_shutdown",
"message_count": 6,
"last_active": 150.0, # last message timestamp, not ended_at
"cost_usd": 0.42, # estimated fallback when actual is None
}
# Only displayable (user/assistant) messages, system/tool rows skipped.
assert [(m["role"], m["content"]) for m in result["head"]] == [
("user", "first prompt")
]
assert [(m["role"], m["content"]) for m in result["tail"]] == [
("user", "second prompt"),
("assistant", "final answer"),
]
assert result["total_messages"] == 4
assert all(m["truncated"] is False for m in result["head"] + result["tail"])
def test_session_peek_head_tail_never_overlap(monkeypatch):
db = _peek_db()
db.messages = db.messages[:3] # system + user + assistant
monkeypatch.setattr(server, "_get_db", lambda: db)
resp = server.handle_request(
{
"id": "1",
"method": "session.peek",
"params": {"session_id": "sess-1", "head": 2, "tail": 2},
}
)
result = resp["result"]
head_ids = [m["id"] for m in result["head"]]
tail_ids = [m["id"] for m in result["tail"]]
assert head_ids == [2, 3]
assert tail_ids == [] # both displayable rows already consumed by head
assert result["total_messages"] == 2
def test_session_peek_does_not_build_an_agent(monkeypatch):
spawned = []
monkeypatch.setattr(server, "_get_db", _peek_db)
monkeypatch.setattr(
server, "_make_agent", lambda *a, **kw: spawned.append("make") or None
)
monkeypatch.setattr(
server, "_start_agent_build", lambda *a, **kw: spawned.append("build")
)
before_sessions = dict(server._sessions)
resp = server.handle_request(
{"id": "1", "method": "session.peek", "params": {"session_id": "sess-1"}}
)
assert "result" in resp
assert spawned == []
assert server._sessions == before_sessions # no live session registered
def test_session_peek_unknown_id_is_clean_error(monkeypatch):
monkeypatch.setattr(server, "_get_db", _peek_db)
resp = server.handle_request(
{"id": "1", "method": "session.peek", "params": {"session_id": "nope"}}
)
assert resp["error"]["code"] == 4007
assert resp["error"]["message"] == "session not found"
def test_session_peek_requires_session_id(monkeypatch):
monkeypatch.setattr(server, "_get_db", _peek_db)
resp = server.handle_request({"id": "1", "method": "session.peek", "params": {}})
assert resp["error"]["code"] == 4006
def test_session_peek_db_unavailable(monkeypatch):
monkeypatch.setattr(server, "_get_db", lambda: None)
monkeypatch.setattr(server, "_db_error", "locked")
resp = server.handle_request(
{"id": "1", "method": "session.peek", "params": {"session_id": "sess-1"}}
)
assert resp["error"]["code"] == 5046
assert "state.db unavailable" in resp["error"]["message"]
# ── browser.manage ───────────────────────────────────────────────────
@@ -7223,6 +7770,84 @@ def test_sniff_image_ext_magic_and_filename():
assert server._sniff_image_ext(b"\x89PNG", "photo.jpeg") == ".jpeg"
def test_tool_complete_derives_error_from_result_convention(monkeypatch):
# The repo convention (agent.display._result_succeeded): a JSON-object
# result with a truthy string "error" — or success:false — IS a failure.
# The gateway surfaces it as payload["error"] so clients (OpenTUI ✗ state,
# Ink trail ✗) don't sniff the convention themselves.
events: list[tuple[str, str, dict]] = []
monkeypatch.setattr(
server, "_emit", lambda event_type, sid, payload: events.append((event_type, sid, payload))
)
monkeypatch.setitem(
server._sessions,
"err-test",
{"tool_progress_mode": "verbose", "tool_started_at": {}, "edit_snapshots": {}},
)
# 1) error-string result → flattened, capped error on the payload
server._on_tool_complete(
"err-test", "t1", "read_file", {"path": "/nope"},
json.dumps({"error": "File not found:\n /nope"}),
)
assert events[-1][2]["error"] == "File not found: /nope"
# 2) success:false without error string → generic failure marker
server._on_tool_complete(
"err-test", "t2", "patch", {"path": "x"}, json.dumps({"success": False})
)
assert events[-1][2]["error"] == "tool reported failure"
# 3) plain-text result → NEVER a failure
server._on_tool_complete("err-test", "t3", "terminal", {"command": "ls"}, "file-a\nfile-b")
assert "error" not in events[-1][2]
# 4) JSON result with error: null / no error key → success
server._on_tool_complete(
"err-test", "t4", "web_search", {"q": "x"},
json.dumps({"results": [1, 2], "error": None}),
)
assert "error" not in events[-1][2]
def test_session_list_reports_scan_cap_truncation(monkeypatch):
# The bounded multi-source scan must say so when the 10k safety cap stops
# it with the requested window unfilled — an empty page past the cap is
# "truncated", not "no more sessions".
class _DenyAllDB:
def __init__(self):
self.offsets = []
def list_sessions_rich(self, source=None, limit=20, offset=0, **kw):
self.offsets.append(offset)
return [
{"id": f"s{offset}-{i}", "source": "tool", "title": "", "preview": "",
"started_at": 1, "message_count": 1}
for i in range(limit)
]
db = _DenyAllDB()
monkeypatch.setattr(server, "_get_db", lambda: db)
resp = server.handle_request(
{"id": "1", "method": "session.list", "params": {"sources": ["tui", "cli"], "limit": 5}}
)
result = resp["result"]
assert result["sessions"] == []
assert result["truncated"] is True
assert max(db.offsets) <= 10_000
def test_session_list_truncated_false_on_normal_paths(monkeypatch):
db = _picker_db()
monkeypatch.setattr(server, "_get_db", lambda: db)
legacy = server.handle_request({"id": "1", "method": "session.list", "params": {}})
assert legacy["result"]["truncated"] is False
filtered = server.handle_request(
{"id": "2", "method": "session.list", "params": {"sources": ["tui"], "limit": 5}}
)
assert filtered["result"]["truncated"] is False
def test_slash_worker_close_reaps_zombie_and_closes_fds():
"""A hung worker is SIGKILLed, the zombie reaped, all pipes closed — once."""
calls = {k: 0 for k in ("terminate", "kill", "wait", "stdin", "stdout", "stderr")}
@@ -7497,3 +8122,51 @@ def test_reap_idle_sessions_closes_only_evictable(monkeypatch):
assert closed == [("stale", "idle_timeout")]
finally:
server._sessions.clear()
class TestProcessCompletionCard:
"""_emit_process_completion_card surfaces a background-process completion to
the TUI as a notification.show card (Option B, glitch 2026-06-14) — in
addition to the agent turn the completion triggers."""
@staticmethod
def _capture(monkeypatch):
emitted: list = []
monkeypatch.setattr(server, "_emit", lambda event, sid, payload=None: emitted.append((event, sid, payload)))
return emitted
def test_completion_exit_zero_is_an_info_card(self, monkeypatch):
emitted = self._capture(monkeypatch)
server._emit_process_completion_card(
"s1", {"type": "completion", "session_id": "proc_1", "command": "sleep 20 && echo hi", "exit_code": 0}
)
assert len(emitted) == 1
event, sid, payload = emitted[0]
assert event == "notification.show"
assert sid == "s1"
assert payload["text"] == "sleep 20 && echo hi exited 0"
assert payload["level"] == "info"
assert payload["kind"] == "process.complete"
assert payload["key"] == "proc:proc_1"
def test_nonzero_exit_is_a_warn_card(self, monkeypatch):
emitted = self._capture(monkeypatch)
server._emit_process_completion_card("s1", {"type": "completion", "command": "build", "exit_code": 1, "session_id": "p2"})
assert emitted[0][2]["level"] == "warn"
assert emitted[0][2]["text"] == "build exited 1"
def test_watch_match_is_not_carded(self, monkeypatch):
emitted = self._capture(monkeypatch)
server._emit_process_completion_card("s1", {"type": "watch_match", "command": "tail -f log"})
assert emitted == []
def test_long_command_is_truncated(self, monkeypatch):
emitted = self._capture(monkeypatch)
server._emit_process_completion_card("s1", {"type": "completion", "command": "x" * 100, "exit_code": 0, "session_id": "p3"})
assert "" in emitted[0][2]["text"]
assert len(emitted[0][2]["text"]) < 80
def test_missing_exit_code_says_finished(self, monkeypatch):
emitted = self._capture(monkeypatch)
server._emit_process_completion_card("s1", {"type": "completion", "command": "daemon", "session_id": "p4"})
assert emitted[0][2]["text"] == "daemon finished"

View File

@@ -176,6 +176,12 @@ _LONG_HANDLERS = frozenset(
{
"browser.manage",
"cli.exec",
# model.options is network-bound (pricing fetch + Nous tier check via
# build_models_payload, ~seconds). The native TUI prefetches it right
# after session.create; on the main thread that stalls every fast-path
# RPC — notably complete.slash, so the first `/` dropdown after launch
# took seconds to paint.
"model.options",
"plugins.manage",
"session.branch",
"session.compress",
@@ -1351,7 +1357,11 @@ def _enable_gateway_prompts() -> None:
# ── Blocking prompt factory ──────────────────────────────────────────
def _block(event: str, sid: str, payload: dict, timeout: int = 300) -> str:
def _block(event: str, sid: str, payload: dict, timeout: int | None = None) -> str:
# Blocking prompts wait for the human; interrupt/shutdown (via
# _clear_pending) are the only releases — v6 north-star #5. A
# timeout would orphan the TUI prompt and silently feed the agent
# an empty answer, so callers default to waiting forever.
rid = uuid.uuid4().hex[:8]
ev = threading.Event()
with _prompt_lock:
@@ -1402,6 +1412,11 @@ def resolve_skin() -> dict:
"banner_hero": skin.banner_hero,
"tool_prefix": skin.tool_prefix,
"help_header": (skin.branding or {}).get("help_header", ""),
# Native engines (Ink + OpenTUI) can now consume these too: spinner
# animation data (faces/verbs/wings) and per-tool emoji overrides.
# Additive + optional — old engines ignore unknown keys.
"spinner": skin.spinner or {},
"tool_emojis": skin.tool_emojis or {},
}
except Exception:
return {}
@@ -2410,6 +2425,17 @@ def _session_info(agent, session: dict | None = None) -> dict:
yolo = bool(_YOLO_MODE_FROZEN) or session_yolo or _get_approval_mode() == "off"
except Exception:
yolo = False
# Session title (DB row, falling back to a not-yet-applied pending_title).
# Drives client window-title chrome (OSC 0/2 in the native TUI); "" until
# the first exchange titles the session.
title = ""
if session is not None:
try:
title = _session_live_title(
session, str(session.get("session_key") or "")
)
except Exception:
title = ""
info: dict = {
"model": getattr(agent, "model", ""),
"provider": getattr(agent, "provider", ""),
@@ -2422,6 +2448,7 @@ def _session_info(agent, session: dict | None = None) -> dict:
"cwd": cwd,
"branch": _git_branch_for_cwd(cwd),
"personality": str(personality or ""),
"title": title,
"running": bool((session or {}).get("running")),
"desktop_contract": DESKTOP_BACKEND_CONTRACT,
"version": "",
@@ -2534,6 +2561,53 @@ def _cap_tui_verbose_text(text: str) -> str:
return f"{label}{tail}"
# The FULL raw unified diff shipped on file-edit tool.complete (`diff_unified`,
# for clients with a native diff renderer — ui-opentui). Unlike the verbose
# trail above this is rendered COLLAPSED by default and only on edit tools, so
# it gets a far larger budget; 512KB is still a hard ceiling so a runaway
# multi-megabyte edit can't flood the pipe.
_DIFF_UNIFIED_MAX_BYTES = 512 * 1024
def _cap_diff_unified(diff: str, max_bytes: int = _DIFF_UNIFIED_MAX_BYTES) -> str:
raw = diff.encode("utf-8")
if len(raw) <= max_bytes:
return diff
head = raw[:max_bytes].decode("utf-8", errors="ignore")
# Truncate at a line boundary so the surviving diff stays parseable, then
# append an honest marker line (never send more than the cap + marker).
cut = head.rfind("\n")
if cut > 0:
head = head[:cut]
omitted = len(raw) - len(head.encode("utf-8"))
return f"{head}\n# … diff truncated ({omitted} more bytes)"
def _result_sans_diff_echo(result: str) -> str:
"""The file-edit result JSON minus its `diff` echo.
Used for verbose `result_text` when the FULL diff already ships as
`diff_unified`: a multi-KB diff echo inside the result JSON gets
tail-capped by `_cap_tui_verbose_text` into an unparseable JSON-looking
fragment that a client can neither render nor reliably suppress. The
native renderer shows the real diff, so result_text should carry only the
non-diff signal (success/files_modified/warnings/lsp_diagnostics).
Returns `result` unchanged when it isn't a JSON object with a `diff` key.
"""
try:
data = json.loads(result)
except Exception:
return result
if not isinstance(data, dict) or "diff" not in data:
return result
try:
return json.dumps(
{k: v for k, v in data.items() if k != "diff"}, ensure_ascii=False
)
except Exception:
return result
def _redact_tui_verbose_text(text: str) -> str:
try:
from agent.redact import redact_sensitive_text
@@ -2637,8 +2711,37 @@ def _on_tool_start(sid: str, tool_call_id: str, name: str, args: dict):
_emit("tool.start", sid, payload)
def _tool_error_from_result(result: str) -> str | None:
"""Derive a tool-failure message from the repo's result convention.
Canon (same as ``agent.display._result_succeeded``): a JSON-object result
with a truthy string ``error`` key — or an explicit ``success: false`` —
means the tool failed. Returned flattened + capped for a one-line header.
Conservative on purpose: non-JSON / non-dict results are NEVER failures
(plain-text output is normal success output).
"""
try:
data = json.loads(result)
except Exception:
return None
if not isinstance(data, dict):
return None
err = data.get("error")
if isinstance(err, str) and err.strip():
return " ".join(err.split())[:300]
if data.get("success") is False:
return "tool reported failure"
return None
def _on_tool_complete(sid: str, tool_call_id: str, name: str, args: dict, result: str):
payload = {"tool_id": tool_call_id, "name": name, "args": args}
# Failure surfaced explicitly so clients don't have to sniff the result
# convention themselves (the TUI's ✗ state and Ink's trail ✗ both key off
# payload["error"], which was previously never set on this path).
_tool_err = _tool_error_from_result(result)
if _tool_err:
payload["error"] = _tool_err
session = _sessions.get(sid)
snapshot = None
started_at = None
@@ -2655,10 +2758,6 @@ def _on_tool_complete(sid: str, tool_call_id: str, name: str, args: dict, result
summary = _tool_summary(name, result, duration_s)
if summary:
payload["summary"] = summary
if _session_verbose(sid):
result_text = _tool_result_text(result)
if result_text:
payload["result_text"] = result_text
if name == "todo":
try:
data = json.loads(result)
@@ -2680,7 +2779,32 @@ def _on_tool_complete(sid: str, tool_call_id: str, name: str, args: dict, result
payload["inline_diff"] = "\n".join(rendered)
except Exception:
pass
if _tool_progress_enabled(sid) or payload.get("inline_diff"):
# Alongside the pretty-rendered/capped `inline_diff` (Ink consumes that), ship
# the RAW unified diff for clients with a native diff renderer (ui-opentui's
# file-tool view). Capped at _DIFF_UNIFIED_MAX_BYTES with an honest marker.
try:
from agent.display import extract_edit_diff
diff_unified = extract_edit_diff(
name,
result,
function_args=args,
snapshot=snapshot,
)
if diff_unified:
payload["diff_unified"] = _cap_diff_unified(diff_unified)
except Exception:
pass
if _session_verbose(sid):
# Computed AFTER diff_unified: when the full diff ships natively, the
# result_text drops the in-JSON diff echo (it would tail-cap into
# unparseable JSON-looking noise under the client's rendered diff).
result_text = _tool_result_text(
_result_sans_diff_echo(result) if payload.get("diff_unified") else result
)
if result_text:
payload["result_text"] = result_text
if _tool_progress_enabled(sid) or payload.get("inline_diff") or payload.get("diff_unified"):
_emit("tool.complete", sid, payload)
@@ -2910,7 +3034,7 @@ def _wire_callbacks(sid: str):
from tools.terminal_tool import set_sudo_password_callback
from tools.skills_tool import set_secret_capture_callback
set_sudo_password_callback(lambda: _block("sudo.request", sid, {}, timeout=120))
set_sudo_password_callback(lambda: _block("sudo.request", sid, {}))
def secret_cb(env_var, prompt, metadata=None):
pl = {"prompt": prompt, "env_var": env_var}
@@ -3696,7 +3820,13 @@ def _coerce_message_text(content: Any) -> str:
return str(content)
def _history_to_messages(history: list[dict]) -> list[dict]:
def _history_to_messages(history: list[dict], include_tool_output: bool = False) -> list[dict]:
# ``include_tool_output`` (opt-in; only the native/opentui engine passes it via
# session.resume) folds each tool's redacted+capped result + args into its row so
# a resumed transcript renders collapsible tool blocks identical to a live turn.
# OFF by default so the Ink path is byte-for-byte unchanged (its render tree showed
# the verbose trail expanded and OOM'd on big output — #34095; the native engine
# renders tools collapsed, so shipping the same capped tail is safe there).
messages = []
tool_call_args = {}
@@ -3724,9 +3854,13 @@ def _history_to_messages(history: list[dict]) -> list[dict]:
tc_info = tool_call_args.get(tc_id) if tc_id else None
name = (tc_info[0] if tc_info else None) or m.get("tool_name") or "tool"
args = (tc_info[1] if tc_info else None) or {}
messages.append(
{"role": "tool", "name": name, "context": _tool_ctx(name, args)}
)
tool_msg = {"role": "tool", "name": name, "context": _tool_ctx(name, args)}
if include_tool_output:
if args:
tool_msg["args"] = args
if content_text.strip():
tool_msg["result_text"] = _redact_tui_verbose_text(content_text)
messages.append(tool_msg)
continue
# An assistant turn may carry only reasoning/thinking content with no
# visible text (extended-thinking turns, thinking-only recovery
@@ -3966,6 +4100,23 @@ def _(rid, params: dict) -> dict:
@method("session.list")
def _(rid, params: dict) -> dict:
"""List stored sessions for the resume picker / sidebar.
Optional params — omitting all of them keeps the legacy behaviour
(most-recently-started first, ``tool`` rows denied, default limit 200):
- ``sources``: list of source tags to include (e.g. ``["cli", "tui"]``).
Powers the picker's tab strip. The ``tool`` deny-list still applies
on top. A single-element list is pushed into SQL
(``list_sessions_rich(source=...)``); multi-element lists are
filtered gateway-side over a bounded scan (the DB layer only takes a
single ``source`` string).
- ``query``: case-insensitive *session-id* substring filter, mapped to
``list_sessions_rich(id_query=...)``. The DB applies ``id_query``
only on its order-by-last-active path, so query results are ordered
by most recent activity. Title/preview search stays client-side.
- ``offset`` / ``limit``: pagination over the filtered list.
"""
db = _get_db()
if db is None:
return _db_unavailable_error(rid, code=5006)
@@ -3980,16 +4131,76 @@ def _(rid, params: dict) -> dict:
# platform is added or a user names their own source.
deny = frozenset({"tool"})
limit = int(params.get("limit", 200) or 200)
# Over-fetch modestly so per-source filtering doesn't leave us
# short; the compression-tip projection in ``list_sessions_rich``
# can also merge rows.
fetch_limit = max(limit * 2, 200)
rows = [
s
for s in db.list_sessions_rich(source=None, limit=fetch_limit)
if (s.get("source") or "").strip().lower() not in deny
][:limit]
try:
limit = max(1, int(params.get("limit", 200) or 200))
except (TypeError, ValueError):
limit = 200
try:
offset = max(0, int(params.get("offset", 0) or 0))
except (TypeError, ValueError):
offset = 0
query = str(params.get("query") or "").strip()
raw_sources = params.get("sources")
sources: list = []
if isinstance(raw_sources, (list, tuple)):
sources = [
str(s).strip().lower() for s in raw_sources if str(s).strip()
]
if not sources and not query and offset == 0:
# Legacy path (no filter params) — byte-for-byte today's
# behaviour. Over-fetch modestly so per-source filtering doesn't
# leave us short; the compression-tip projection in
# ``list_sessions_rich`` can also merge rows.
fetch_limit = max(limit * 2, 200)
rows = [
s
for s in db.list_sessions_rich(source=None, limit=fetch_limit)
if (s.get("source") or "").strip().lower() not in deny
][:limit]
list_truncated = False
else:
# Filtered/paginated path. Single source pushes into SQL; the
# deny-list and multi-source filter run gateway-side, so keep
# scanning DB pages until the requested window is full (bounded
# by a generous safety cap so a pathological DB can't pin us).
source_arg = sources[0] if len(sources) == 1 else None
allowed = frozenset(sources) if sources else None
wanted = offset + limit
def _eligible(row: dict) -> bool:
src = (row.get("source") or "").strip().lower()
if src in deny:
return False
return allowed is None or src in allowed
collected: list = []
db_offset = 0
page = max(wanted * 2, 200)
scan_capped = False
while len(collected) < wanted:
if db_offset >= 10_000:
# Safety cap hit with the window still unfilled — report it
# honestly (``truncated``) so the client can say "results
# truncated" instead of silently serving an empty page.
scan_capped = True
break
batch = db.list_sessions_rich(
source=source_arg,
limit=page,
offset=db_offset,
# ``id_query`` only applies on the order-by-last-active
# path; keep legacy started_at ordering when unfiltered.
order_by_last_active=bool(query),
id_query=query or None,
)
collected.extend(r for r in batch if _eligible(r))
if len(batch) < page:
break
db_offset += page
rows = collected[offset : offset + limit]
list_truncated = scan_capped and len(collected) < wanted
return _ok(
rid,
{
@@ -4001,15 +4212,133 @@ def _(rid, params: dict) -> dict:
"started_at": s.get("started_at") or 0,
"message_count": s.get("message_count") or 0,
"source": s.get("source") or "",
# Picker row metadata (None-safe; older rows may
# predate these columns).
"cwd": s.get("cwd") or None,
"last_active": s.get("last_active")
or s.get("started_at")
or 0,
"ended_at": s.get("ended_at"),
"model": s.get("model") or None,
}
for s in rows
]
],
# True when the bounded multi-source scan hit its safety cap
# before filling the requested window — the client should show
# "results truncated" instead of treating the page as final.
"truncated": list_truncated,
},
)
except Exception as e:
return _err(rid, 5006, str(e))
@method("session.peek")
def _(rid, params: dict) -> dict:
"""DB-only preview of a stored session for the resume picker.
``{session_id, head?, tail?}`` → session metadata plus the first ``head``
and last ``tail`` displayable messages (``user``/``assistant`` rows with
non-empty text content; tool spam and empty tool-call carriers are
skipped). Purely a read: no agent is constructed, no live session state
is created or switched — this powers the picker's Space preview, which
must stay cheap while the user scrolls.
Response shape::
{
"session": {id, title, source, model, cwd, started_at, ended_at,
end_reason, message_count, last_active, cost_usd},
"head": [{id, role, content, truncated, timestamp}, ...],
"tail": [{...}], # never overlaps head
"total_messages": <int> # displayable (user/assistant) count
}
"""
target = str(params.get("session_id") or "").strip()
if not target:
return _err(rid, 4006, "session_id required")
try:
head = max(0, int(params.get("head", 2) or 0))
except (TypeError, ValueError):
head = 2
try:
tail = max(0, int(params.get("tail", 2) or 0))
except (TypeError, ValueError):
tail = 2
db = _get_db()
if db is None:
return _db_unavailable_error(rid, code=5046)
try:
row = db.get_session(target)
if not row:
return _err(rid, 4007, "session not found")
msgs = db.get_messages(target)
last_active = (
(msgs[-1].get("timestamp") if msgs else None)
or row.get("started_at")
or 0
)
def _displayable(m: dict) -> bool:
if (m.get("role") or "") not in ("user", "assistant"):
return False
content = m.get("content")
if isinstance(content, str):
return bool(content.strip())
return bool(content) # multimodal parts list
def _peek_msg(m: dict) -> dict:
content = m.get("content")
if not isinstance(content, str):
try:
content = json.dumps(content, ensure_ascii=False)
except (TypeError, ValueError):
content = str(content)
content = content or ""
return {
"id": m.get("id"),
"role": m.get("role") or "",
"content": content[:2000],
"truncated": len(content) > 2000,
"timestamp": m.get("timestamp"),
}
display = [m for m in msgs if _displayable(m)]
head_msgs = display[:head] if head else []
# Slice the remainder so head and tail never overlap, even when
# head + tail >= len(display).
tail_msgs = display[head:][-tail:] if tail else []
cost = row.get("actual_cost_usd")
if cost is None:
cost = row.get("estimated_cost_usd")
return _ok(
rid,
{
"session": {
"id": row["id"],
"title": row.get("title") or "",
"source": row.get("source") or "",
"model": row.get("model") or None,
"cwd": row.get("cwd") or None,
"started_at": row.get("started_at") or 0,
"ended_at": row.get("ended_at"),
"end_reason": row.get("end_reason"),
"message_count": row.get("message_count") or 0,
"last_active": last_active,
"cost_usd": cost,
},
"head": [_peek_msg(m) for m in head_msgs],
"tail": [_peek_msg(m) for m in tail_msgs],
"total_messages": len(display),
},
)
except Exception as e:
return _err(rid, 5046, str(e))
@method("session.most_recent")
def _(rid, params: dict) -> dict:
"""Return the most recent human-facing session id, or ``None``.
@@ -4230,7 +4559,9 @@ def _(rid, params: dict) -> dict:
display_history_prefix = display_history[
: max(0, len(display_history) - len(history))
]
messages = _history_to_messages(display_history)
messages = _history_to_messages(
display_history, include_tool_output=bool(params.get("with_tool_output"))
)
tokens = _set_session_context(target)
try:
# Pass the profile's db so the agent persists turns to the right
@@ -4397,6 +4728,22 @@ def _session_live_title(session: dict, key: str) -> str:
return title
def _emit_title_refresh(sid: str) -> None:
"""Push a session.info refresh after a title change (pending-title
application, auto-title landing, or a session.title rename) so clients'
window-title chrome updates immediately. Thread-safe (auto-title calls
this from its worker thread; _emit serializes on the stdout lock).
Never raises."""
try:
session = _sessions.get(sid)
agent = (session or {}).get("agent")
if session is None or agent is None:
return
_emit("session.info", sid, _session_info(agent, session))
except Exception:
pass
def _session_live_item(sid: str, session: dict, current_sid: str = "") -> dict:
key = str(session.get("session_key") or sid)
agent = session.get("agent")
@@ -4622,15 +4969,18 @@ def _(rid, params: dict) -> dict:
title = (params.get("title", "") or "").strip()
if not title:
return _err(rid, 4021, "title required")
sid = str(params.get("session_id") or "")
try:
if db.set_session_title(key, title):
session["pending_title"] = None
_emit_title_refresh(sid)
return _ok(rid, {"pending": False, "title": title})
# rowcount == 0 can mean "same value" as well as "missing row".
# Queue only when the session row truly does not exist yet.
existing_row = db.get_session(key)
if existing_row:
session["pending_title"] = None
_emit_title_refresh(sid)
return _ok(
rid,
{
@@ -4639,6 +4989,7 @@ def _(rid, params: dict) -> dict:
},
)
session["pending_title"] = title
_emit_title_refresh(sid)
return _ok(rid, {"pending": True, "title": title})
except ValueError as e:
return _err(rid, 4022, str(e))
@@ -5603,6 +5954,39 @@ def _notification_event_dedup_key(evt: dict) -> tuple:
return (evt_sid, evt_type)
def _emit_process_completion_card(sid: str, evt: dict) -> None:
"""Surface a background-process COMPLETION to the TUI as a notification card,
in ADDITION to the agent turn it triggers. A bare `notify_on_complete` exit
otherwise reaches the TUI only as the agent's narration (the completion is fed
to the model as a synthetic prompt, never sent to the UI). Emitting a
``notification.show`` lets the OpenTUI engine render a distinct inline card so
the user actually sees the process finish; the Ink engine treats it as a
notice. Additive — no existing behaviour changes. Completion events only
(watch matches aren't terminal); the dedup at the call sites ensures one card
per completion. (glitch 2026-06-14)"""
if evt.get("type", "completion") != "completion":
return
cmd = str(evt.get("command") or "process").strip().replace("\n", " ")
if len(cmd) > 60:
cmd = cmd[:59] + ""
code = evt.get("exit_code")
if code is None:
text, level = f"{cmd} finished", "info"
else:
text = f"{cmd} exited {code}"
level = "info" if code == 0 else "warn"
_emit(
"notification.show",
sid,
{
"text": text,
"kind": "process.complete",
"level": level,
"key": f"proc:{evt.get('session_id', '')}",
},
)
def _notification_poller_loop(
stop_event: threading.Event, sid: str, session: dict
) -> None:
@@ -5661,6 +6045,7 @@ def _notification_poller_loop(
rid = f"__notif__{int(time.time() * 1000)}"
try:
_emit_process_completion_card(sid, evt)
_emit("message.start", sid)
_run_prompt_submit(rid, sid, session, text)
except Exception as exc:
@@ -5704,6 +6089,7 @@ def _notification_poller_loop(
rid = f"__notif__{int(time.time() * 1000)}"
try:
_emit_process_completion_card(sid, evt)
_emit("message.start", sid)
_run_prompt_submit(rid, sid, session, text)
except Exception as exc:
@@ -6039,10 +6425,20 @@ def _run_prompt_submit(rid, sid: str, session: dict, text: Any) -> None:
text,
raw,
session.get("history", []),
# Auto-title lands on a background thread — refresh
# session.info when it does so clients' window-title
# chrome (OSC 0/2) updates without waiting for the
# next turn. _emit is stdout-lock-guarded (thread-safe).
title_callback=lambda _title: _emit_title_refresh(sid),
)
except Exception:
pass
# The pending title (applied synchronously above) is visible NOW —
# refresh session.info so window-title chrome picks it up.
if status == "complete" and _pending:
_emit_title_refresh(sid)
# CLI parity: when voice-mode TTS is on, speak the agent reply
# (cli.py:_voice_speak_response). Only the final text — tool
# calls / reasoning already stream separately and would be
@@ -6135,6 +6531,7 @@ def _run_prompt_submit(rid, sid: str, session: dict, text: Any) -> None:
break
session["running"] = True
try:
_emit_process_completion_card(sid, _evt)
_emit("message.start", sid)
_run_prompt_submit(rid, sid, session, synth)
except Exception as _n_exc:
@@ -9930,6 +10327,67 @@ def _(rid, params: dict) -> dict:
return _err(rid, 5031, str(e))
@method("startup.catalog")
def _(rid, params: dict) -> dict:
# Aggregate tools / skills / MCP servers for the native engine's startup panel
# (item 9). Opt-in RPC — only the opentui home screen calls it, so the Ink path
# is untouched. Each section is best-effort: a failing source yields an empty
# section rather than erroring the whole call.
tools: dict = {"total": 0, "toolsets": []}
try:
from toolsets import get_all_toolsets, get_toolset_info
# enabled toolsets for THIS session (or the config default), mirroring tools.list
session = _sessions.get(params.get("session_id", ""))
enabled = (
set(getattr(session["agent"], "enabled_toolsets", []) or [])
if session
else set(_load_enabled_toolsets() or [])
)
for name in sorted(get_all_toolsets().keys()):
info = get_toolset_info(name)
if not info:
continue
is_on = name in enabled if enabled else True
# the startup panel lists ENABLED toolsets with their tools (Ink parity)
tool_names = [str(t) for t in (info.get("resolved_tools") or [])]
tools["toolsets"].append(
{"name": name, "count": int(info["tool_count"]), "enabled": is_on, "tools": tool_names}
)
if is_on:
tools["total"] += int(info["tool_count"])
except Exception:
pass
skills: dict = {"total": 0, "categories": []}
try:
from hermes_cli.banner import get_available_skills
by_cat = get_available_skills() or {}
for cat in sorted(by_cat.keys()):
names = by_cat[cat] or []
skills["categories"].append({"name": cat, "count": len(names)})
skills["total"] += len(names)
except Exception:
pass
mcp_servers: list = []
try:
from hermes_cli.config import read_raw_config
from hermes_cli.tools_config import _parse_enabled_flag
raw_cfg = read_raw_config() or {}
servers = raw_cfg.get("mcp_servers")
if isinstance(servers, dict):
for name, cfg in servers.items():
if isinstance(cfg, dict) and _parse_enabled_flag(cfg.get("enabled", True), default=True):
mcp_servers.append(str(name))
except Exception:
pass
return _ok(rid, {"tools": tools, "skills": skills, "mcp": {"servers": sorted(mcp_servers)}})
@method("tools.show")
def _(rid, params: dict) -> dict:
try:

10
ui-opentui/.gitignore vendored Normal file
View File

@@ -0,0 +1,10 @@
node_modules/
dist/
.repos/
*.frame.txt
*.ansi
bun.lockb
# the global ~/.gitignore_global `lib/` rule swallows our test harness — re-include it
!src/test/lib/
.bench/

1
ui-opentui/.node-version Normal file
View File

@@ -0,0 +1 @@
26.3

11
ui-opentui/.prettierrc Normal file
View File

@@ -0,0 +1,11 @@
{
"arrowParens": "avoid",
"bracketSpacing": true,
"endOfLine": "auto",
"printWidth": 120,
"semi": false,
"singleQuote": true,
"tabWidth": 2,
"trailingComma": "none",
"useTabs": false
}

53
ui-opentui/README.md Normal file
View File

@@ -0,0 +1,53 @@
# ui-opentui — native OpenTUI engine for Hermes
Solid + `@opentui/core` over Node FFI. Ink (`ui-tui/`) is the shipping default;
this is the experimental engine (draft PR #42922).
## Node 26 setup (required; will not touch your other projects)
This package needs **Node ≥ 26.3** (`--experimental-ffi` floor). Everything
else on this machine/repo can keep whatever Node it already uses — pin 26 to
this directory only:
```sh
# 1. install fnm (skip if you have it; nvm/mise work too — see below)
curl -fsSL https://fnm.vercel.app/install | bash
# add to ~/.zshrc (or bashrc): eval "$(fnm env --use-on-cd --shell zsh)"
# 2. install Node 26 SIDE BY SIDE (does NOT change your default)
fnm install 26
# 3. done — this directory has a .node-version (26.3), so `cd ui-opentui`
# auto-switches to 26 and leaving switches back. Do NOT run `fnm default 26`.
node -v # v26.x here; your old version everywhere else
```
No shell integration wanted (CI, scripts, one-off): `fnm exec --using 26 -- node ...`
or invoke the absolute binary (`~/.local/share/fnm/node-versions/v26.*/installation/bin/node`).
mise users: `mise use node@26` in this directory. nvm users: `nvm install 26`,
plus an `.nvmrc` shim (`echo 26 > .nvmrc`) if you rely on auto-switching.
### Gotchas
- **Native modules are ABI-locked.** A `node_modules` installed under Node
20/22 will not load under 26 (and vice versa) — run `npm ci` (or
`npm rebuild`) after switching versions. Same applies to the **tui-bench** repo's node-pty (`github.com/NousResearch/tui-bench`).
- **Global npm packages don't follow** between versions (per-version prefix);
reinstall the few you need, or don't use globals.
- **Editor terminals** (Zed/VS Code) need the `fnm env` line in your shell rc;
the `.node-version` auto-switch then covers any shell that cd's here.
- **Never run this package with bun** — the FFI seam and the Solid/JSX build
are Node-path only here.
- `package.json` declares `engines.node >= 26.3`, so a wrong-Node `npm ci`
warns immediately.
## Build & run
```sh
node scripts/build.mjs
HERMES_TUI_MOUSE=1 node --experimental-ffi --no-warnings dist/main.js
```
Gates: `npm run check` (typecheck + lint + tests). Memory/perf benchmarks live
in the **tui-bench** repo (`github.com/NousResearch/tui-bench`; see its README). Transcript windowing (memory architecture) is
documented in `../docs/plans/opentui-transcript-windowing.md`.

View File

@@ -0,0 +1,103 @@
import js from "@eslint/js"
import tseslint from "typescript-eslint"
import unusedImports from "eslint-plugin-unused-imports"
export default tseslint.config(
{
// .bench/ and .demo/ are build artifacts (bench `nodes` cell and the
// smoke demo: `node scripts/build.mjs scripts/demo.tsx .demo`) — never lint.
ignores: ["node_modules/**", "dist/**", ".bench/**", ".demo/**", ".repos/**", "*.frame.txt", "*.ansi"],
},
js.configs.recommended,
...tseslint.configs.recommendedTypeChecked,
{
files: ["**/*.ts", "**/*.tsx"],
languageOptions: {
parserOptions: {
projectService: true,
tsconfigRootDir: import.meta.dirname,
},
},
plugins: {
"unused-imports": unusedImports,
},
rules: {
// Boundary code bans these; the Solid view follows TS-strict but is not Effect.
"@typescript-eslint/no-explicit-any": "error",
"@typescript-eslint/consistent-type-imports": ["error", { prefer: "type-imports" }],
"@typescript-eslint/no-unused-vars": "off",
"@typescript-eslint/no-non-null-assertion": "error",
"unused-imports/no-unused-imports": "error",
"unused-imports/no-unused-vars": [
"error",
{ vars: "all", varsIgnorePattern: "^_", args: "after-used", argsIgnorePattern: "^_" },
],
// --- Type-aware, high-value: ON as ERROR ---
"@typescript-eslint/no-floating-promises": "error",
"@typescript-eslint/no-misused-promises": "error",
"@typescript-eslint/await-thenable": "error",
// --- Type-safety: ENFORCED as errors in our boundary/logic .ts code ---
// Production .ts is clean of the no-unsafe-* family (the loose-typed gateway
// payloads are Schema-decoded). The only sources are (a) *.tsx — @opentui/solid's
// JSX namespace types every component `return (<…>)` as `error`/unknown, a
// framework limitation disabled for views below — and (b) the test harness
// (loose render/effect fixtures + async mocks), exempt below. So we enforce ERROR.
"@typescript-eslint/no-unsafe-assignment": "error",
"@typescript-eslint/no-unsafe-member-access": "error",
"@typescript-eslint/no-unsafe-argument": "error",
"@typescript-eslint/no-unsafe-return": "error",
"@typescript-eslint/no-unsafe-call": "error",
"@typescript-eslint/no-base-to-string": "error",
"@typescript-eslint/restrict-template-expressions": "error",
"@typescript-eslint/no-unnecessary-type-assertion": "error",
"@typescript-eslint/require-await": "error",
// Defensive guards on untrusted runtime/gateway data: TS's narrowing doesn't
// model the wire, so "condition is always truthy" here is intentional armor,
// not dead code. Kept as a hint (warn), not a gate failure.
"@typescript-eslint/no-unnecessary-condition": "warn",
},
},
{
// @opentui/solid's custom JSX namespace types component returns as `error`/
// unknown, so EVERY `return (<…>)` in a view trips the no-unsafe-* family.
// That's a framework typing limitation, not unsafe app code — off for views.
files: ["**/*.tsx"],
rules: {
"@typescript-eslint/no-unsafe-return": "off",
"@typescript-eslint/no-unsafe-assignment": "off",
"@typescript-eslint/no-unsafe-member-access": "off",
"@typescript-eslint/no-unsafe-argument": "off",
"@typescript-eslint/no-unsafe-call": "off",
},
},
{
// Test helpers/fixtures: keep `!` on known-present data, and allow the loose
// render/effect harness casts + async mock signatures (they satisfy real
// Promise-returning interfaces with no body to await).
files: ["**/*.test.ts", "**/*.test.tsx", "src/test/lib/**"],
rules: {
"@typescript-eslint/no-non-null-assertion": "off",
"@typescript-eslint/no-unsafe-assignment": "off",
"@typescript-eslint/no-unsafe-member-access": "off",
"@typescript-eslint/no-unsafe-argument": "off",
"@typescript-eslint/no-unsafe-return": "off",
"@typescript-eslint/no-unsafe-call": "off",
"@typescript-eslint/no-unnecessary-type-assertion": "off",
"@typescript-eslint/require-await": "off",
},
},
{
// Build/config scripts (the eslint flat config, the esbuild build.mjs, the
// vitest config) are not part of the typed TS program, so the project service
// can't type them — disable type-aware linting there to avoid parser errors,
// and declare the Node globals they use (process, console, URL).
files: ["**/*.mjs", "*.config.ts"],
...tseslint.configs.disableTypeChecked,
languageOptions: {
...tseslint.configs.disableTypeChecked.languageOptions,
globals: { process: "readonly", console: "readonly", URL: "readonly", URLSearchParams: "readonly" },
},
},
)

4919
ui-opentui/package-lock.json generated Normal file

File diff suppressed because it is too large Load Diff

45
ui-opentui/package.json Normal file
View File

@@ -0,0 +1,45 @@
{
"name": "@hermes/ui-opentui",
"version": "0.0.0",
"private": true,
"type": "module",
"engines": {
"node": ">=26.3"
},
"description": "Native OpenTUI engine for Hermes (Solid + Effect-at-boundary, from scratch). Ink (ui-tui/) stays the shipping default.",
"scripts": {
"type-check": "tsc --noEmit",
"lint": "eslint .",
"lint:fix": "eslint . --fix",
"fmt": "prettier --write src",
"fix": "prettier --write src && eslint . --fix",
"build": "node scripts/build.mjs",
"start": "node --experimental-ffi --no-warnings dist/main.js",
"test": "vitest run",
"check": "bash scripts/check.sh",
"dev": "node scripts/build.mjs && node --experimental-ffi --no-warnings dist/main.js"
},
"dependencies": {
"@opentui/core": "0.4.1",
"@opentui/keymap": "0.4.1",
"@opentui/solid": "0.4.1",
"effect": "4.0.0-beta.78",
"fuzzysort": "^3.1.0",
"solid-js": "1.9.12"
},
"devDependencies": {
"@babel/core": "^7.29.7",
"@babel/preset-typescript": "^7.29.7",
"@effect/vitest": "^4.0.0-beta.78",
"@eslint/js": "^9",
"@types/node": "^24",
"babel-preset-solid": "^1.9.12",
"esbuild": "^0.28.0",
"eslint": "^9",
"eslint-plugin-unused-imports": "^4",
"prettier": "^3",
"typescript": "^5",
"typescript-eslint": "^8",
"vitest": "^4.1.8"
}
}

View File

@@ -0,0 +1,75 @@
#!/usr/bin/env bash
# Single acceptance command for the Bun→Node-26 switchover (see
# docs/plans/opentui-node26-build-spec.md). Proves, on a Node 26.3 host, that the
# OpenTUI v2 engine runs WITHOUT Bun and at parity:
#
# 1. Node >= 26.3 present (the node:ffi floor); reports whether bun is on PATH
# (the engine must NOT need it).
# 2. `npm run check` — prettier + tsc + eslint + vitest (151+), all on Node.
# 3. live-gateway transport smoke — spawns the real Python tui_gateway via the
# node:child_process client, asserts gateway.ready + session.create.
# (Skipped if no Hermes venv resolves — CI parity.)
# 4. selection/markdown smoke in a real tmux TTY — asserts the native <markdown>
# (Tree-sitter) PAINTS under node --experimental-ffi and that a selection
# copies the RAW markdown source. (Skipped if tmux is unavailable.)
#
# Run: cd ui-opentui && HERMES_PYTHON_SRC_ROOT=<checkout-root> bash scripts/acceptance.sh
set -uo pipefail
cd "$(dirname "$0")/.."
# Absolute node, so a fresh tmux pane (which won't inherit our PATH / fnm shim)
# runs the SAME Node 26.3, not the shell's default.
NODE_BIN="$(command -v node || echo node)"
pass=0; fail=0; skip=0
ok() { echo "$1"; pass=$((pass+1)); }
bad() { echo "$1"; fail=$((fail+1)); }
note() { echo "$1"; skip=$((skip+1)); }
echo "== [1/4] runtime: Node >= 26.3, Bun-free =="
NODE_V="$(node -p 'process.versions.node' 2>/dev/null || echo 0.0.0)"
node -e 'const [a,b]=process.versions.node.split(".").map(Number); process.exit(a>26||(a===26&&b>=3)?0:1)' \
&& ok "node $NODE_V (>= 26.3)" || bad "node $NODE_V is below the 26.3 node:ffi floor"
if command -v bun >/dev/null 2>&1; then
note "bun is on PATH ($(command -v bun)) — fine; the engine does not use it (proven below)"
else
ok "no bun on PATH — single-runtime host"
fi
echo "== [2/4] check: prettier + tsc + eslint + vitest =="
if bash scripts/check.sh >/tmp/accept-check.log 2>&1; then ok "check green ($(grep -c 'passed' /tmp/accept-check.log >/dev/null 2>&1; grep -oE '[0-9]+ passed' /tmp/accept-check.log | tail -1))"
else bad "check failed — see /tmp/accept-check.log"; tail -20 /tmp/accept-check.log; fi
echo "== [3/4] live-gateway transport smoke (real Python gateway, no Bun) =="
if [ -n "${HERMES_PYTHON_SRC_ROOT:-}" ] || [ -x "../.venv/bin/python" ]; then
rm -rf .accept && node scripts/build.mjs src/test/liveGateway.smoke.ts .accept >/dev/null 2>&1
OUT="$(node --experimental-ffi --no-warnings .accept/liveGateway.smoke.js 2>&1)"
echo "$OUT" | grep -q "^PASS" && ok "$(echo "$OUT" | grep '^PASS')" || { echo "$OUT" | grep -qE "TRANSPORT ERROR|SKIP" && note "gateway smoke skipped (no python/model)" || bad "gateway smoke: $(echo "$OUT" | head -1)"; }
rm -rf .accept
else
note "no HERMES_PYTHON_SRC_ROOT / venv — gateway smoke skipped"
fi
echo "== [4/4] selection/markdown smoke in a real tmux TTY (tree-sitter under FFI) =="
if command -v tmux >/dev/null 2>&1; then
rm -rf .accept && node scripts/build.mjs src/test/selectionCopy.smoke.tsx .accept >/dev/null 2>&1
rm -f /tmp/accept-sel.json
S="accept-$$"
tmux kill-session -t "$S" 2>/dev/null
tmux new-session -d -s "$S" -x 120 -y 40
tmux send-keys -t "$S" "SEL_SMOKE_OUT=/tmp/accept-sel.json $NODE_BIN --experimental-ffi --no-warnings $PWD/.accept/selectionCopy.smoke.js; tmux wait-for -S $S" Enter
tmux wait-for "$S" 2>/dev/null || sleep 6
tmux kill-session -t "$S" 2>/dev/null
if node -e 'process.exit(require("/tmp/accept-sel.json").pass===true?0:1)' 2>/dev/null; then
ok "markdown painted + selection copied source (tree-sitter under node FFI)"
else
bad "selection/markdown smoke failed — see /tmp/accept-sel.json"; cat /tmp/accept-sel.json 2>/dev/null
fi
rm -rf .accept
else
note "tmux not available — markdown smoke skipped (run it on a TTY host)"
fi
echo
echo "== acceptance: $pass passed, $fail failed, $skip skipped =="
[ "$fail" -eq 0 ] && { echo "ACCEPTANCE: PASS"; exit 0; } || { echo "ACCEPTANCE: FAIL"; exit 1; }

View File

@@ -0,0 +1,75 @@
/**
* Build the OpenTUI v2 Solid app for Node 26 (no Bun).
*
* Mirrors OpenTUI's own Node recipe (`~/github/opentui/.../run-node26.mjs` +
* `packages/solid/scripts/solid-transform.ts`): apply babel-preset-solid in
* `generate:"universal"` mode with `moduleName:"@opentui/solid"` to every app
* .tsx/.jsx, and force solid-js to its CLIENT/universal build (the package's
* `node` export condition points at the SSR `server.js`, which lacks the
* reactive primitives the universal renderer needs).
*
* `@opentui/core` stays EXTERNAL: it resolves its per-arch native `libopentui.so`
* (and the tree-sitter worker) from its own package dir via `import.meta.url`;
* bundling it would break those paths.
*
* Run with the Node that will launch the app:
* node scripts/build.mjs # → dist/main.js (app entry)
* node scripts/build.mjs <entry.tsx> <outdir> # build an arbitrary entry (smokes/spikes)
* Launch:
* node --experimental-ffi --no-warnings dist/main.js
*/
import { readFile } from 'node:fs/promises'
import { createRequire } from 'node:module'
import { dirname, resolve } from 'node:path'
import { fileURLToPath } from 'node:url'
import { transformAsync } from '@babel/core'
import tsPreset from '@babel/preset-typescript'
import solidPreset from 'babel-preset-solid'
import * as esbuild from 'esbuild'
const require = createRequire(import.meta.url)
const root = resolve(dirname(fileURLToPath(import.meta.url)), '..')
/** esbuild plugin that reproduces @opentui/solid's transform + solid-js resolution. */
const opentuiSolid = {
name: 'opentui-solid',
setup(build) {
// App JSX (.tsx/.jsx, never node_modules) → babel-preset-solid (universal).
build.onLoad({ filter: /\.[cm]?[jt]sx$/ }, async args => {
if (args.path.includes('/node_modules/')) return null
const code = await readFile(args.path, 'utf8')
const out = await transformAsync(code, {
filename: args.path,
configFile: false,
babelrc: false,
presets: [[solidPreset, { moduleName: '@opentui/solid', generate: 'universal' }], [tsPreset]]
})
return { contents: out?.code ?? '', loader: 'js' }
})
// Force the universal/client solid-js build (node condition → server.js otherwise).
build.onResolve({ filter: /^solid-js$/ }, () => ({ path: require.resolve('solid-js/dist/solid.js') }))
build.onResolve({ filter: /^solid-js\/store$/ }, () => ({ path: require.resolve('solid-js/store/dist/store.js') }))
}
}
const [, , entryArg, outdirArg] = process.argv
const entry = entryArg ? resolve(process.cwd(), entryArg) : resolve(root, 'src/entry/main.tsx')
const outdir = outdirArg ? resolve(process.cwd(), outdirArg) : resolve(root, 'dist')
await esbuild.build({
entryPoints: [entry],
outdir,
bundle: true,
format: 'esm',
platform: 'node',
target: 'node26',
splitting: true,
sourcemap: true,
logLevel: 'info',
// Native blob + tree-sitter worker resolve from @opentui/core's own dir at runtime.
external: ['@opentui/core', '@opentui/core/*'],
plugins: [opentuiSolid],
define: { 'process.env.OPENTUI_BUN_ONLY_EXAMPLES': '"false"' }
})

26
ui-opentui/scripts/check.sh Executable file
View File

@@ -0,0 +1,26 @@
#!/usr/bin/env bash
# Phase gate for the native OpenTUI engine (spec v4 §5). Runs the full headless
# suite: format + type-check + lint + vitest (which includes the headless frame
# gate via captureCharFrame). The agentic smoke (docs/plans/opentui-smoke.md) is
# the live complement — run BOTH every phase.
#
# Runs entirely on Node 26.3 (no Bun). The OpenTUI native core loads via node:ffi
# under --experimental-ffi; vitest passes that flag to its test forks (see
# vitest.config.ts). Requires `node -v` == v26.3.x on PATH.
set -euo pipefail
cd "$(dirname "$0")/.."
echo "== [1/4] format (prettier --check) =="
npx prettier --check src
echo "== [2/4] type-check =="
npm run --silent type-check
echo "== [3/4] lint =="
npm run --silent lint
echo "== [4/4] vitest (incl. headless frame gate) =="
npm test
echo "== check OK =="

View File

@@ -0,0 +1,57 @@
/**
* DEV DEMO — NOT a test, NOT production. Renders the bench fixture (lorem-ipsum +
* fat tool-turns from ./fixture.ts) in a REAL CliRenderer so you can attach over
* tmux, scroll, and eyeball the transcript + the rolling-cap truncation notice.
* No gateway is spawned (purely the fixture seeded into the store via the resume
* path), so typing won't reach a backend — it's for viewing/scrolling.
*
* Run (Node 26 — needs the esbuild/Solid transform, then --experimental-ffi):
* node scripts/build.mjs scripts/demo.tsx .demo
* node --experimental-ffi --no-warnings .demo/demo.js # inside tmux (needs a TTY)
* DEMO_TOTAL=200 fixture messages to seed (default 200)
* HERMES_TUI_MAX_MESSAGES=80 cap → the "⤒ N earlier messages" notice fires
* Quit: Ctrl+C.
*/
import { createCliRenderer } from '@opentui/core'
import { render } from '@opentui/solid'
import { installMultiClickSelection } from '../src/boundary/multiClickSelect.ts'
import { registerRemoteParsers } from '../src/boundary/parsers.ts'
import { createSessionStore } from '../src/logic/store.ts'
import { App } from '../src/view/App.tsx'
import { ThemeProvider } from '../src/view/theme.tsx'
import { materialize } from './fixture.ts'
// Same grammar registration as the live entry so fixture code blocks highlight
// (fetched+cached on first use; no HERMES_TUI_PARSER_CACHE here → OpenTUI default).
registerRemoteParsers()
const TOTAL = Number.parseInt(process.env.DEMO_TOTAL ?? '', 10) || 200
const store = createSessionStore()
store.apply({ type: 'gateway.ready' })
store.setSessionId('demo-fixture-20260609')
// Seed via the resume path so the cap slices + the `dropped` counter is set
// (drives the truncation notice) exactly as a real `session.resume` would.
store.beginBuffer()
store.commitSnapshot(materialize(TOTAL))
const renderer = await createCliRenderer({
externalOutputMode: 'passthrough',
targetFps: 60,
exitOnCtrlC: true,
useKittyKeyboard: {},
useMouse: true
})
// Same seam the live entry installs (boundary/renderer.ts) so the demo smokes
// double-click word / triple-click line / drag-extend too.
installMultiClickSelection(renderer)
void render(
() => (
<ThemeProvider theme={() => store.state.theme}>
<App store={store} />
</ThemeProvider>
),
renderer
)

View File

@@ -0,0 +1,292 @@
/**
* DEV BENCH FIXTURE — NOT a test, NOT production code. A deterministic generator
* for a REALISTIC heavy session, consumed by `scripts/mem-bench.tsx`. Excluded
* from the vitest run (not a *.test.ts) and lint-clean.
*
* The old synthetic bench pushed tiny 3-delta turns (~5.5 mounted nodes each) —
* an unrealistic per-message cost. Real transcripts are LUMPY: an assistant turn
* is ONE `message` but a fat node subtree (markdown blocks + a reasoning block +
* several tool headers, each a multi-line result). That makes message-count a
* LOOSE proxy for memory, which is exactly what we're trying to quantify before
* picking a `HERMES_TUI_MAX_MESSAGES` default.
*
* Design: a turn is modeled as a small typed `TurnAction` union (user / system /
* gateway-event). The driver maps user→`pushUser`, system→`pushSystem`, and every
* gateway event through the SAME `apply()` reducer real usage takes — so the
* mounted result is identical to a live session. The same action stream also
* materializes a settled `Message[]` (via `materialize`) for the resume-path check
* (`commitSnapshot`). Everything is seeded by index (no `Math.random` —
* unavailable here), so a given `total` reproduces byte-for-byte.
*/
import type { GatewayEvent } from '../src/boundary/schema/GatewayEvent.ts'
import { createSessionStore, type Message } from '../src/logic/store.ts'
/** One scripted action in a turn: a composer push or a decoded gateway event. */
type TurnAction =
| { kind: 'user'; text: string }
| { kind: 'system'; text: string }
| { kind: 'event'; event: GatewayEvent }
/** A pool of lorem-ipsum words — varied content is selected by index from here. */
const WORDS = [
'lorem',
'ipsum',
'dolor',
'sit',
'amet',
'consectetur',
'adipiscing',
'elit',
'sed',
'eiusmod',
'tempor',
'incididunt',
'labore',
'magna',
'aliqua',
'enim',
'minim',
'veniam',
'quis',
'nostrud',
'exercitation',
'ullamco',
'laboris',
'aliquip',
'commodo',
'consequat',
'duis',
'aute',
'irure',
'reprehenderit',
'voluptate',
'velit',
'esse',
'cillum',
'fugiat',
'nulla',
'pariatur',
'excepteur',
'occaecat',
'cupidatat',
'proident',
'sunt',
'culpa',
'officia',
'deserunt',
'mollit',
'anim'
] as const
/** Deterministic pseudo-word stream: pick from WORDS by a seeded index. */
function word(seed: number, k: number): string {
return WORDS[(seed * 31 + k * 7) % WORDS.length] ?? 'lorem'
}
/** A lorem sentence of `n` words, capitalized + terminated. */
function sentence(seed: number, n: number): string {
const parts: string[] = []
for (let k = 0; k < n; k++) parts.push(word(seed + k, k))
const text = parts.join(' ')
return text.charAt(0).toUpperCase() + text.slice(1) + '.'
}
/** A paragraph of `s` sentences (varying length by index). */
function paragraph(seed: number, s: number): string {
const out: string[] = []
for (let i = 0; i < s; i++) out.push(sentence(seed + i * 13, 6 + ((seed + i) % 9)))
return out.join(' ')
}
/** N lorem-ipsum lines (for tool result bodies), each varying in length. */
function lines(seed: number, n: number): string {
const out: string[] = []
for (let i = 0; i < n; i++) out.push(sentence(seed + i * 5, 4 + ((seed + i) % 11)))
return out.join('\n')
}
/** A markdown assistant body: paragraphs + a list + a fenced code block. */
function assistantMarkdown(seed: number): string {
const lead = paragraph(seed, 1 + (seed % 3))
const bullets = [`- ${sentence(seed + 1, 5)}`, `- ${sentence(seed + 2, 7)}`, `- ${sentence(seed + 3, 4)}`].join('\n')
const code = [
'```ts',
`const x${seed % 7} = ${seed % 100}`,
`function f${seed % 5}() {`,
' return x',
'}',
'```'
].join('\n')
const tail = paragraph(seed + 17, 1 + ((seed + 1) % 2))
return `${lead}\n\n${bullets}\n\n${code}\n\n${tail}`
}
/** Tool names cycled by index (mirrors a real tool mix). */
const TOOL_NAMES = ['terminal', 'read_file', 'edit_file', 'grep', 'web_search', 'write_file'] as const
/** A tool.start + tool.complete pair for tool `t` in turn `seed`. */
function toolEvents(seed: number, t: number): GatewayEvent[] {
const id = `tool-${seed}-${t}`
const name = TOOL_NAMES[(seed + t) % TOOL_NAMES.length] ?? 'terminal'
const variant = (seed + t) % 3
// short / capped-16-line / medium result bodies, mixing the render-cost cases.
// BENCH KNOB: HERMES_BENCH_TOOL_BODY_LINES overrides the body size to N lines
// for ALL tools — the "fat tool output" fixture (e.g. a `find /` dump) used to
// make W3's retention win measurable. UNSET = the original tiny bodies (so the
// default fixture stays byte-identical; existing benches are unaffected).
const fatLines = Number.parseInt(process.env.HERMES_BENCH_TOOL_BODY_LINES ?? '', 10)
const bodyLines = Number.isFinite(fatLines) && fatLines > 0 ? fatLines : variant === 0 ? 2 : variant === 1 ? 18 : 7
const resultText = lines(seed + t * 3, bodyLines)
const context = sentence(seed + t, 4)
// ~half the tools carry a multi-line args block (the expanded-view cost).
const withArgs = (seed + t) % 2 === 0
const start: GatewayEvent = {
type: 'tool.start',
payload: withArgs ? { tool_id: id, name, context, args_text: lines(seed + t, 5) } : { tool_id: id, name, context }
}
const complete: GatewayEvent = {
type: 'tool.complete',
payload: {
tool_id: id,
name,
result_text: resultText,
duration_s: 0.1 + ((seed + t) % 40) / 10,
args: { command: context, index: seed + t }
}
}
return [start, complete]
}
/** One USER message (14 lorem paragraphs; some very short, some RFC-sized). */
function userText(seed: number): string {
const shape = seed % 7
if (shape === 0) return 'yes do that'
if (shape === 1) return 'ok'
if (shape === 6) {
// an RFC-sized pasted block: many paragraphs.
const out: string[] = []
for (let p = 0; p < 8; p++) out.push(paragraph(seed + p * 23, 4 + (p % 3)))
return out.join('\n\n')
}
const n = 1 + (seed % 4)
const out: string[] = []
for (let p = 0; p < n; p++) out.push(paragraph(seed + p * 11, 1 + ((seed + p) % 3)))
return out.join('\n\n')
}
/**
* Build the scripted actions for ONE turn. Most turns are a plain user+assistant
* exchange; a deterministic subset are tool-heavy (115 tool calls) or a system
* slash-output line. Returns the actions for the whole turn in order.
*/
function turnActions(turn: number): TurnAction[] {
const actions: TurnAction[] = []
// Occasional system slash-output line (≈ every 9th turn) instead of a user line.
if (turn % 9 === 4) {
actions.push({ kind: 'system', text: sentence(turn, 8) })
return actions
}
actions.push({ kind: 'user', text: userText(turn) })
actions.push({ kind: 'event', event: { type: 'message.start' } })
// Reasoning on ≈ every 3rd assistant turn.
if (turn % 3 === 0) {
actions.push({
kind: 'event',
event: {
type: 'reasoning.delta',
payload: { text: `**${sentence(turn, 3).replace(/\.$/, '')}**\n\n${paragraph(turn + 5, 2)}` }
}
})
}
// Leading text part.
actions.push({ kind: 'event', event: { type: 'message.delta', payload: { text: assistantMarkdown(turn) } } })
// Tool-heavy turns: ≈ every 4th assistant turn carries several tool calls,
// interleaved with a follow-up text part (the fat-turn stress case).
if (turn % 4 === 0) {
const toolCount = 1 + (turn % 15) // 1..15 tools
for (let t = 0; t < toolCount; t++) {
for (const ev of toolEvents(turn, t)) actions.push({ kind: 'event', event: ev })
}
actions.push({ kind: 'event', event: { type: 'message.delta', payload: { text: paragraph(turn + 31, 2) } } })
}
actions.push({ kind: 'event', event: { type: 'message.complete' } })
return actions
}
/** How many transcript ROWS a turn produces (user/system + at most one assistant). */
export function rowsPerTurn(turn: number): number {
return turn % 9 === 4 ? 1 : 2
}
/** Apply ONE turn's actions to a store via the same paths real usage takes. */
export function applyTurn(store: ReturnType<typeof createSessionStore>, turn: number): void {
for (const action of turnActions(turn)) {
if (action.kind === 'user') store.pushUser(action.text)
else if (action.kind === 'system') store.pushSystem(action.text)
else store.apply(action.event)
}
}
/**
* Drive at least `total` MESSAGES into the live store, calling `onSample(pushes)`
* each time the cumulative produced-row count crosses a `sampleEvery` boundary.
* `pushes` counts MESSAGES (rows produced, pre-cap), so the matrix samples on a
* raw message cadence regardless of the rolling cap.
*/
export function drive(
store: ReturnType<typeof createSessionStore>,
total: number,
sampleEvery: number,
onSample: (pushes: number) => void
): number {
let pushed = 0
let nextSample = sampleEvery
let turn = 0
while (pushed < total) {
applyTurn(store, turn)
pushed += rowsPerTurn(turn)
turn++
while (pushed >= nextSample && nextSample <= total) {
onSample(Math.min(pushed, total))
nextSample += sampleEvery
}
}
return turn
}
/**
* Materialize the FULL settled `Message[]` for the resume path: replay the same
* action stream into a FRESH, EFFECTIVELY-UNCAPPED store and snapshot its rows.
* This guarantees the resume fixture is byte-identical to what the live push
* path produces (minus the rolling cap), so `commitSnapshot` mounts the real shape.
*/
export function materialize(total: number): Message[] {
// `uncappedFixture` bypasses the store's handle-safe cap CLAMP (an env value
// can no longer raise the cap past logic/store.ts HANDLE_SAFE_MAX_ROWS — the
// old env=MAX_SAFE_INTEGER trick would now silently truncate to 1000 rows).
// This store is never mounted into a renderer, so no native handles are at stake.
const store = createSessionStore({ uncappedFixture: true })
store.apply({ type: 'gateway.ready' })
let pushed = 0
let turn = 0
while (pushed < total) {
applyTurn(store, turn)
pushed += rowsPerTurn(turn)
turn++
}
// Deep-copy out of the solid store proxy into plain objects (the resume path
// takes a plain Message[]).
return store.state.messages.slice(0, total).map(cloneMessage)
}
/** Plain deep copy of a store Message (drop the solid proxy + streaming flag). */
function cloneMessage(m: Message): Message {
const copy: Message = { role: m.role, text: m.text }
if (m.parts) copy.parts = m.parts.map(p => ({ ...p }))
return copy
}

View File

@@ -0,0 +1,177 @@
/**
* DEV BENCH — NOT a test, NOT production code. Throwaway memory-measurement
* harness for tuning the rolling `HERMES_TUI_MAX_MESSAGES` cap. Mounts the
* production `<App store={createSessionStore()}>` under the `@opentui/solid` test
* renderer and samples `process.memoryUsage()` + the mounted-renderable count +
* `getAllocatorStats().activeAllocations`, forcing `global.gc()` before each
* sample. Excluded from the test run (not a *.test.ts) and lint-clean.
*
* It pushes a REALISTIC heavy-session fixture (scripts/fixture.ts) — varied user
* turns + fat multi-part assistant turns (markdown + reasoning + several tool
* headers) — because per-message size varies hugely, so message-count is only a
* LOOSE memory proxy and we're choosing a cap default.
*
* node scripts/build.mjs scripts/mem-bench.tsx .bench # build once (Solid+TS → JS)
* Uncapped: MEM_BENCH_TOTAL=8000 HERMES_TUI_MAX_MESSAGES=100000 \
* node --experimental-ffi --expose-gc --no-warnings .bench/mem-bench.js
* Capped: MEM_BENCH_TOTAL=8000 HERMES_TUI_MAX_MESSAGES=1500 \
* node --experimental-ffi --expose-gc --no-warnings .bench/mem-bench.js
*
* Run each cap as a SEPARATE node invocation so the WASM/native heap starts fresh.
* The matrix loop:
* for cap in 400 1500 3000 6000 100000; do \
* MEM_BENCH_TOTAL=8000 HERMES_TUI_MAX_MESSAGES=$cap \
* node --experimental-ffi --expose-gc --no-warnings .bench/mem-bench.js; done
*
* Signal: native `getAllocatorStats().activeAllocations` (the Zig-side allocator
* count — every live renderable/Yoga subtree contributes) and the recursive
* renderable descendant count under `renderer.root`. RSS is reported too but is
* noisy and grow-only (WASM linear memory never returns to the OS), so the
* meaningful comparison is the STEADY-STATE plateau: capped should flatten after
* ~CAP messages; uncapped should keep climbing.
*
* GC: forces `global.gc()` (synchronous) before each sample to measure RETAINED
* memory, not garbage — run Node with `--expose-gc` or the GC call is a no-op.
*
* RESUME PATH: after the live push matrix, builds the full fixture as a settled
* Message[] and `commitSnapshot`s it (the resume path), reporting mounted nodes +
* RSS — verifying the slice-before-set fix bounds resume mounting to ≤ cap.
*/
import { resolveRenderLib } from '@opentui/core'
import type { Renderable } from '@opentui/core'
import { testRender } from '@opentui/solid'
import { createSessionStore } from '../src/logic/store.ts'
import { App } from '../src/view/App.tsx'
import { ThemeProvider } from '../src/view/theme.tsx'
import { applyTurn, materialize, rowsPerTurn } from './fixture.ts'
const lib = resolveRenderLib()
const TOTAL = Number.parseInt(process.env.MEM_BENCH_TOTAL ?? '8000', 10)
const SAMPLE_EVERY = Number.parseInt(process.env.MEM_BENCH_SAMPLE ?? '500', 10)
const cap = process.env.HERMES_TUI_MAX_MESSAGES ?? '(default 400)'
const MB = (bytes: number) => (bytes / 1024 / 1024).toFixed(1)
/** Force a synchronous full GC to measure RETAINED memory. No-op without `node --expose-gc`. */
const forceGc = (): void => {
const gc = (globalThis as { gc?: () => void }).gc
if (gc) gc()
}
/** Recursively count every Renderable under root (a proxy for live Yoga nodes). */
function descendantCount(node: Renderable): number {
let n = 0
for (const child of node.getChildren()) n += 1 + descendantCount(child)
return n
}
async function main(): Promise<void> {
const store = createSessionStore()
store.apply({ type: 'gateway.ready' })
const setup = await testRender(
() => (
<ThemeProvider theme={() => store.state.theme}>
<App store={store} />
</ThemeProvider>
),
{ width: 100, height: 40, exitOnCtrlC: false }
)
await setup.renderOnce()
await setup.flush()
process.stdout.write(
`\n=== mem-bench (REALISTIC fixture) cap=${cap} total=${TOTAL} sampleEvery=${SAMPLE_EVERY} ===\n`
)
process.stdout.write(
'pushes | msgs | rss(MB) | heapUsed(MB) | external(MB) | arrayBuf(MB) | activeAllocs | renderables\n'
)
process.stdout.write(
'-------+------+---------+--------------+--------------+--------------+--------------+------------\n'
)
async function sample(pushes: number): Promise<void> {
await setup.renderOnce()
await setup.flush()
forceGc() // synchronous, full GC — measure retained, not garbage
const m = process.memoryUsage()
const alloc = lib.getAllocatorStats()
const renderables = descendantCount(setup.renderer.root)
const cols = [
String(pushes).padStart(6),
String(store.state.messages.length).padStart(4),
MB(m.rss).padStart(7),
MB(m.heapUsed).padStart(12),
MB(m.external).padStart(12),
MB(m.arrayBuffers).padStart(12),
String(alloc.activeAllocations).padStart(12),
String(renderables).padStart(11)
]
process.stdout.write(cols.join(' | ') + '\n')
}
await sample(0)
// Pump turns inline, sampling each time the cumulative produced-row count crosses
// a SAMPLE_EVERY boundary. Sampling is async (renderOnce/flush/gc), so it lives
// in the loop rather than a sync callback. Mounting is synchronous in Solid, so a
// render pass at the boundary reflects the just-pushed turns.
let pushed = 0
let nextSample = SAMPLE_EVERY
let turn = 0
while (pushed < TOTAL) {
applyTurn(store, turn)
pushed += rowsPerTurn(turn)
turn++
if (pushed >= nextSample) {
await sample(Math.min(pushed, TOTAL))
while (nextSample <= pushed) nextSample += SAMPLE_EVERY
}
}
// Tear down the live push tree BEFORE the resume path so its mounted nodes don't
// pollute the process-wide RSS the resume sample reads. (The renderable COUNT is
// already isolated per-renderer-root, but RSS is process-global.)
store.clearTranscript()
setup.renderer.destroy()
forceGc()
// ── RESUME PATH: build the full settled fixture and commitSnapshot it (the
// resume hydrate path). Verifies the slice-before-set fix bounds resume mounting
// to ≤ cap — mounting 8000 settled msgs at cap=1500 should mount ~1500-worth of
// rows, NOT 8000-worth. Done on a FRESH store + renderer so the live-push history
// above doesn't skew the count.
const resumeStore = createSessionStore()
resumeStore.apply({ type: 'gateway.ready' })
const resumeSetup = await testRender(
() => (
<ThemeProvider theme={() => resumeStore.state.theme}>
<App store={resumeStore} />
</ThemeProvider>
),
{ width: 100, height: 40, exitOnCtrlC: false }
)
await resumeSetup.renderOnce()
await resumeSetup.flush()
const fullFixture = materialize(TOTAL)
resumeStore.beginBuffer()
resumeStore.commitSnapshot(fullFixture)
await resumeSetup.renderOnce()
await resumeSetup.flush()
forceGc()
const rm = process.memoryUsage()
const ralloc = lib.getAllocatorStats()
const rrenderables = descendantCount(resumeSetup.renderer.root)
process.stdout.write('\n--- resume path (commitSnapshot of the full fixture) ---\n')
process.stdout.write(`fixture msgs built : ${fullFixture.length}\n`)
process.stdout.write(`mounted msgs (cap) : ${resumeStore.state.messages.length}\n`)
process.stdout.write(`mounted renderables: ${rrenderables}\n`)
process.stdout.write(`activeAllocations : ${ralloc.activeAllocations}\n`)
process.stdout.write(`rss(MB) : ${MB(rm.rss)}\n`)
resumeSetup.renderer.destroy()
}
await main()

View File

@@ -0,0 +1,126 @@
/**
* Clipboard (item 1) — copy via OSC 52 (works over SSH/tmux) + a native platform
* command, and read a clipboard IMAGE for paste-to-attach. Ported/trimmed from
* opencode `clipboard.ts`. A boundary concern (spawns processes / writes stdout);
* everything is best-effort and never throws into the view.
*/
import { spawn } from 'node:child_process'
import { existsSync } from 'node:fs'
import { platform } from 'node:os'
import { join } from 'node:path'
/** Whether `cmd` resolves on PATH (cached). We DON'T spawn missing tools: a failed
* spawn + writing to its dead stdin pipe raises EPIPE/SIGPIPE, and OpenTUI used to
* treat SIGPIPE as a shutdown signal — i.e. a clipboard miss would quit the TUI.
* Skipped on Windows (the built-in `clip` is always present; PATHEXT complicates
* a filename probe). */
const commandCache = new Map<string, boolean>()
function commandExists(cmd: string): boolean {
if (platform() === 'win32') return true
const cached = commandCache.get(cmd)
if (cached !== undefined) return cached
const dirs = (process.env.PATH ?? '').split(':').filter(Boolean)
const found = dirs.some(dir => existsSync(join(dir, cmd)))
commandCache.set(cmd, found)
return found
}
/** Run a command, optionally piping `input` to stdin; resolve its stdout bytes.
* Best-effort and crash-proof: every stream error (incl. EPIPE → SIGPIPE on a
* clipboard tool that exits early) is swallowed so a failed copy never throws out
* of the boundary or signals the process. */
function run(cmd: string, args: string[] = [], input?: string): Promise<Buffer> {
return new Promise((resolve, reject) => {
let child
try {
child = spawn(cmd, args, { stdio: [input === undefined ? 'ignore' : 'pipe', 'pipe', 'ignore'] })
} catch (cause) {
reject(cause instanceof Error ? cause : new Error(String(cause)))
return
}
const out: Buffer[] = []
child.on('error', reject)
child.stdout?.on('error', () => {}) // a closed stdout pipe must not throw
child.stdout?.on('data', (c: Buffer) => out.push(c))
child.on('close', code => (code === 0 ? resolve(Buffer.concat(out)) : reject(new Error(`${cmd} exit ${code}`))))
if (input !== undefined && child.stdin) {
// Writing to a tool that died/closed early raises EPIPE (→ SIGPIPE). Swallow it.
child.stdin.on('error', () => {})
try {
child.stdin.end(input)
} catch {
// pipe already gone — nothing to flush
}
}
})
}
/** OSC 52 copy — the terminal puts `text` on the system clipboard (SSH/tmux-safe). */
function writeOsc52(text: string): void {
if (!process.stdout.isTTY) return
const seq = `\x1b]52;c;${Buffer.from(text).toString('base64')}\x07`
// tmux/screen need the sequence wrapped in their passthrough escape.
process.stdout.write(process.env.TMUX || process.env.STY ? `\x1bPtmux;\x1b${seq}\x1b\\` : seq)
}
/** Native copy commands to try, in order, for the current platform. */
function copyCandidates(): Array<[string, string[]]> {
const os = platform()
if (os === 'darwin') return [['pbcopy', []]]
if (os === 'win32') return [['clip', []]]
// linux: prefer Wayland, then X11 tools
const list: Array<[string, string[]]> = []
if (process.env.WAYLAND_DISPLAY) list.push(['wl-copy', []])
list.push(['xclip', ['-selection', 'clipboard']], ['xsel', ['--clipboard', '--input']])
return list
}
/** Copy `text` to the clipboard: OSC 52 (always) + the first native command that works. */
export async function writeClipboard(text: string): Promise<void> {
writeOsc52(text) // primary path — SSH/tmux-safe, no subprocess
for (const [cmd, args] of copyCandidates()) {
if (!commandExists(cmd)) continue // never spawn a missing tool (avoids EPIPE/SIGPIPE)
try {
await run(cmd, args, text)
return
} catch {
// try the next candidate
}
}
}
/** Read a clipboard IMAGE as base64 PNG (for paste-to-attach); undefined if none. */
export async function readClipboardImage(): Promise<{ data: string; mime: string } | undefined> {
const os = platform()
const tries: Array<[string, string[]]> = []
if (os === 'linux') {
if (process.env.WAYLAND_DISPLAY) tries.push(['wl-paste', ['-t', 'image/png']])
tries.push(['xclip', ['-selection', 'clipboard', '-t', 'image/png', '-o']])
} else if (os === 'darwin') {
tries.push(['pngpaste', ['-']]) // brew install pngpaste
} else if (os === 'win32') {
tries.push([
'powershell.exe',
[
'-NonInteractive',
'-NoProfile',
'-Command',
'Add-Type -AssemblyName System.Windows.Forms; $img=[System.Windows.Forms.Clipboard]::GetImage(); if($img){$ms=New-Object System.IO.MemoryStream; $img.Save($ms,[System.Drawing.Imaging.ImageFormat]::Png); [Console]::Out.Write([System.Convert]::ToBase64String($ms.ToArray()))}'
]
])
}
for (const [cmd, args] of tries) {
if (!commandExists(cmd)) continue // skip missing tools (no pointless failing spawns)
try {
const buf = await run(cmd, args)
if (buf.length) {
// powershell already returns base64 text; the others return raw PNG bytes.
const data = os === 'win32' ? buf.toString('utf8').trim() : buf.toString('base64')
if (data) return { data, mime: 'image/png' }
}
} catch {
// try the next candidate
}
}
return undefined
}

View File

@@ -0,0 +1,29 @@
/**
* Typed errors at the gateway boundary.
*
* Per spec v4 §3.4: internal errors use `Data.TaggedError`; wire/serializable
* errors use Schema-based tagged errors (added in Phase 1 alongside the
* GatewayEvent schema). Phase 0 ships the internal set the renderer/transport
* boundary needs.
*
* Boundary code yields these directly (`return yield* new FooError(...)`) — no
* throw / try-catch / Promise.catch / orDie.
*/
import { Data } from 'effect'
/** The renderer (createCliRenderer) failed to acquire. */
export class RendererError extends Data.TaggedError('RendererError')<{
readonly cause: unknown
}> {}
/** Could not resolve a usable Python interpreter for the gateway. */
export class PythonResolutionError extends Data.TaggedError('PythonResolutionError')<{
readonly tried: ReadonlyArray<string>
}> {}
/** A JSON-RPC request to the gateway failed (timeout, transport down, rpc error). */
export class GatewayError extends Data.TaggedError('GatewayError')<{
readonly method: string
readonly reason: 'timeout' | 'transport-down' | 'rpc-error'
readonly message: string
}> {}

View File

@@ -0,0 +1,100 @@
/**
* Node-FFI coordinate safety shim for @opentui/core 0.4.0.
*
* Root cause (live crash, ERR_INVALID_ARG_VALUE looping every frame): several
* OptimizedBuffer methods marshal x/y/width/height as **u32** in the FFI table
* (zig.ts: `bufferFillRect: ["u32","u32","u32","u32","u32","ptr"]`, same for
* `bufferDrawText` / `bufferSetCell*` / `bufferDrawChar`), while renderables
* pass RAW SCREEN COORDINATES — which go NEGATIVE inside a <scrollbox> when an
* element is partially scrolled above the viewport. Concretely:
* `LineNumberRenderable.renderSelf` does `buffer.fillRect(this.x + gutterWidth,
* this.y + i, …)` for diff added/removed line backgrounds, so expanding a tall
* `<diff showLineNumbers>` pinned to the scrollbox bottom rendered with
* `this.y < 0` and threw out of `CliRenderer.loop` on EVERY frame (frozen UI,
* console error spam) until a resize forced a fresh layout.
*
* Upstream-on-Bun this never throws: Bun's FFI silently WRAPS negatives to
* huge u32s and the native side bounds-checks them into a no-op. Node's
* experimental FFI (node:ffi) instead REJECTS the argument. Other draw entry
* points (`bufferDrawBox`, `bufferDrawTextBufferView`) already use i32 — which
* is why ordinary text/boxes scroll fine and only the diff gutter path crashed.
*
* Fix at the seam we own: clamp/skip BEFORE the FFI call.
* - fillRect: clip the rect to the non-negative quadrant (the native side
* already clips right/bottom against the buffer + scissor) and skip empties.
* - drawText/setCell/setCellWithAlphaBlending/drawChar: skip when the origin
* is negative (Bun-parity: those cells/rows are off-screen anyway).
*
* TODO(upstream): file/track an OpenTUI issue to widen these FFI params to i32
* (or clamp in core) — then this shim can be deleted.
*/
import { OptimizedBuffer, TextBufferView } from '@opentui/core'
let installed = false
/** Patch OptimizedBuffer's u32-coordinate methods to tolerate negative coords. Idempotent. */
export function installFfiCoordSafety(): void {
if (installed) return
installed = true
const proto = OptimizedBuffer.prototype
// Prototype monkey-patching: extracting the original methods unbound is the
// point — they're re-invoked with `.call(this, …)` on the correct instance.
/* eslint-disable @typescript-eslint/unbound-method */
const origFillRect = proto.fillRect
proto.fillRect = function (this: OptimizedBuffer, x, y, width, height, bg) {
let x2 = Math.trunc(x)
let y2 = Math.trunc(y)
let w = Math.trunc(width)
let h = Math.trunc(height)
if (x2 < 0) {
w += x2
x2 = 0
}
if (y2 < 0) {
h += y2
y2 = 0
}
if (w <= 0 || h <= 0) return
origFillRect.call(this, x2, y2, w, h, bg)
}
const origDrawText = proto.drawText
proto.drawText = function (this: OptimizedBuffer, text, x, y, ...rest) {
if (x < 0 || y < 0) return
origDrawText.call(this, text, x, y, ...rest)
}
const origSetCell = proto.setCell
proto.setCell = function (this: OptimizedBuffer, x, y, ...rest) {
if (x < 0 || y < 0) return
origSetCell.call(this, x, y, ...rest)
}
const origSetCellAlpha = proto.setCellWithAlphaBlending
proto.setCellWithAlphaBlending = function (this: OptimizedBuffer, x, y, ...rest) {
if (x < 0 || y < 0) return
origSetCellAlpha.call(this, x, y, ...rest)
}
const origDrawChar = proto.drawChar
proto.drawChar = function (this: OptimizedBuffer, char, x, y, ...rest) {
if (x < 0 || y < 0) return
origDrawChar.call(this, char, x, y, ...rest)
}
// Same u32 marshaling on a different entry point: `textBufferViewSetViewport`
// takes x/y/width/height as u32, but `TextRenderable.onResize` feeds it the
// RAW transient layout size — observed NON-u32 (negative/NaN) mid-relayout
// while a shrinking list (the fuzzy picker filtering rows away) reflows. Bun
// wraps/coerces, node:ffi throws (`Argument 3 must be a uint32`). Coerce into
// the valid quadrant; a zero-sized viewport is the native side's own no-op.
const u32 = (v: number) => (Number.isFinite(v) ? Math.max(0, Math.trunc(v)) : 0)
const viewProto = TextBufferView.prototype
const origSetViewport = viewProto.setViewport
viewProto.setViewport = function (this: TextBufferView, x, y, width, height) {
origSetViewport.call(this, u32(x), u32(y), u32(width), u32(height))
}
}

View File

@@ -0,0 +1,29 @@
/**
* GatewayService — the Effect-side transport boundary.
*
* Phase 0: the SHAPE only. The live layer (spawning the Python `tui_gateway`,
* JSON-RPC framing, Schema-decoding the wire union) lands in Phase 1
* (`boundary/gateway/liveGateway.ts`). For now the only implementation is
* `FakeGateway.layer` (entry/fakeGateway.ts), which the render/test harness uses.
*
* This is one of exactly two Effect<->Solid contact points: the Solid store
* subscribes via `subscribe(handler)` and the boundary pushes DECODED events in.
* Per spec v4 §1, the store/reducer themselves are plain Solid, never Effect.
*/
import { Context, type Effect } from 'effect'
import type { GatewayError } from '../errors.ts'
import type { GatewayEvent } from '../schema/GatewayEvent.ts'
export interface GatewayServiceShape {
/** Push decoded gateway events into the Solid store. Returns an unsubscribe fn. */
readonly subscribe: (handler: (event: GatewayEvent) => void) => Effect.Effect<() => void>
/** Typed JSON-RPC request to the Python gateway. Fails with a typed GatewayError, never throws. */
readonly request: <A>(method: string, params: unknown) => Effect.Effect<A, GatewayError>
/** The active session id (for `approval.respond {session_id}`); undefined before a session exists. */
readonly sessionId: () => string | undefined
}
export class GatewayService extends Context.Service<GatewayService, GatewayServiceShape>()(
'@hermes-tui/GatewayService'
) {}

View File

@@ -0,0 +1,255 @@
/**
* Low-level JSON-RPC-over-stdio client for the Python `tui_gateway` (spec v4 §4).
* Re-authored minimal (NOT the Ink client's 740-LOC attach-mode/buffering) but
* the WIRE CONTRACT is identical (verified against ui-tui/src/gatewayClient.ts +
* tui_gateway/server.py + entry.py + transport.py):
*
* - spawn: `python -m tui_gateway.entry`, cwd=srcRoot, env={...process.env,
* PYTHONPATH=srcRoot:…, HERMES_PYTHON_SRC_ROOT=srcRoot}, stdio piped.
* - framing: newline-delimited compact JSON, BOTH directions, on ONE stdout.
* - request: {id:"r<n>", jsonrpc:"2.0", method, params} + "\n".
* - response: {jsonrpc, id, result} | {jsonrpc, id, error:{code,message}} — match by id.
* - event: {jsonrpc, method:"event", params:{type, session_id?, payload?}} (NO id).
* - handshake: child emits {event, params:{type:"gateway.ready", payload:{skin}}}
* UNSOLICITED first; no subscribe RPC. Then client drives session.create /
* session.resume / prompt.submit / *.respond.
* - GOTCHA: session.resume/prompt.submit/slash.exec are LONG handlers — their
* {id,result} arrives async, interleaved with events. Keep the pending map
* authoritative; never assume in-order response delivery.
*
* Raw events are surfaced as `unknown` (the params object). The liveGateway
* layer Schema-decodes them once at the boundary (spec v4 §3.3); this client
* stays decode-agnostic so the transport and the schema evolve independently.
*/
import { spawn, type ChildProcessWithoutNullStreams } from 'node:child_process'
import type { Log } from '../log.ts'
import { resolvePython, resolveSrcRoot } from './python.ts'
interface Pending {
resolve: (result: unknown) => void
reject: (error: Error) => void
method: string
}
export interface RawClientOptions {
readonly log: Log
/** Called with each server-pushed event's `params` object (still unknown — decoded upstream). */
readonly onEvent: (params: unknown) => void
/** Called when the child exits / errors (so the layer can reject pending + reconnect). */
readonly onExit?: (reason: string) => void
}
const REQUEST_TIMEOUT_MS = (() => {
const raw = Number.parseInt(process.env.HERMES_TUI_RPC_TIMEOUT_MS ?? '', 10)
return Number.isFinite(raw) && raw > 0 ? Math.max(5000, raw) : 120_000
})()
const STARTUP_TIMEOUT_MS = (() => {
const raw = Number.parseInt(process.env.HERMES_TUI_STARTUP_TIMEOUT_MS ?? '', 10)
return Number.isFinite(raw) && raw > 0 ? Math.max(2000, raw) : 20_000
})()
export class RawGatewayClient {
private proc: ChildProcessWithoutNullStreams | null = null
private pending = new Map<string, Pending>()
private reqId = 0
private stdinBuffer = ''
private startupTimer: ReturnType<typeof setTimeout> | undefined
private readonly log: Log
private readonly onEvent: (params: unknown) => void
private readonly onExit?: (reason: string) => void
constructor(options: RawClientOptions) {
this.log = options.log
this.onEvent = options.onEvent
if (options.onExit) this.onExit = options.onExit
}
/** Spawn the gateway child and begin reading frames. Idempotent. */
start(): void {
if (this.proc) return
const srcRoot = resolveSrcRoot()
const python = resolvePython(srcRoot)
const cwd = process.env.HERMES_CWD?.trim() || srcRoot
const env: Record<string, string> = { ...(process.env as Record<string, string>) }
env.PYTHONPATH = env.PYTHONPATH ? `${srcRoot}:${env.PYTHONPATH}` : srcRoot
env.HERMES_PYTHON_SRC_ROOT = srcRoot
this.log.info('gateway', 'spawning tui_gateway', { python, cwd, srcRoot })
const proc = spawn(python, ['-m', 'tui_gateway.entry'], {
cwd,
env,
stdio: ['pipe', 'pipe', 'pipe']
})
// Identity guard: a stale child's late exit/error must not act after a restart
// has already installed a new `this.proc` (else it'd null the live child).
// Nulling `this.proc` here makes a subsequent finish() a no-op (idempotent),
// covering the ENOENT case where 'error' fires and 'exit' does not.
const finish = (reason: string) => {
if (this.proc !== proc) return
this.log.warn('gateway', reason)
this.rejectAll(reason)
this.proc = null
this.onExit?.(reason)
}
proc.on('exit', (code, signal) => finish(`gateway exited (code=${code ?? 'null'} signal=${signal ?? 'null'})`))
proc.on('error', err => finish(`gateway spawn error: ${err instanceof Error ? err.message : String(err)}`))
this.proc = proc
this.readStdout(proc)
this.readStderr(proc)
// Startup-readiness watchdog: a child that hangs on import (wrong python /
// missing dep) never emits the unsolicited `gateway.ready` handshake, leaving
// a silent blank UI. Emit `gateway.start_timeout` so the store can surface a
// failure line + the captured stderr tail. Cleared on ready (dispatch) / stop.
// A recovery-respawn re-enters start(), so this re-arms per respawn — desired.
this.startupTimer = setTimeout(() => {
this.startupTimer = undefined
this.onEvent({
type: 'gateway.start_timeout',
payload: { message: `no gateway.ready within ${STARTUP_TIMEOUT_MS}ms` }
})
}, STARTUP_TIMEOUT_MS)
}
private readStdout(proc: ChildProcessWithoutNullStreams): void {
proc.stdout.setEncoding('utf8')
proc.stdout.on('data', (chunk: string) => {
this.stdinBuffer += chunk
let nl: number
while ((nl = this.stdinBuffer.indexOf('\n')) >= 0) {
const line = this.stdinBuffer.slice(0, nl)
this.stdinBuffer = this.stdinBuffer.slice(nl + 1)
if (line.trim()) this.dispatch(line)
}
})
proc.stdout.on('error', cause => this.log.error('gateway', 'stdout read loop failed', { cause: String(cause) }))
}
private readStderr(proc: ChildProcessWithoutNullStreams): void {
let buf = ''
proc.stderr.setEncoding('utf8')
proc.stderr.on('data', (chunk: string) => {
buf += chunk
let nl: number
while ((nl = buf.indexOf('\n')) >= 0) {
const line = buf.slice(0, nl)
buf = buf.slice(nl + 1)
if (line.trim()) {
this.log.debug('gateway.stderr', line)
// Surface as a synthetic gateway.stderr event (matches Ink).
this.onEvent({ type: 'gateway.stderr', payload: { line } })
}
}
})
// stderr pipe closing on exit is expected; ignore errors.
proc.stderr.on('error', () => {})
}
private dispatch(line: string): void {
let msg: unknown
try {
msg = JSON.parse(line)
} catch {
this.log.warn('gateway', 'unparseable frame', { preview: line.slice(0, 120) })
this.onEvent({ type: 'gateway.protocol_error', payload: { preview: line.slice(0, 120) } })
return
}
if (!msg || typeof msg !== 'object') return
const frame = msg as { id?: unknown; method?: unknown; params?: unknown; result?: unknown; error?: unknown }
// Response: has an id matching a pending request.
const pending = typeof frame.id === 'string' ? this.pending.get(frame.id) : undefined
if (typeof frame.id === 'string' && pending) {
const p = pending
this.pending.delete(frame.id)
if (frame.error) {
const err = frame.error as { code?: number; message?: string }
p.reject(new Error(err.message ?? `rpc error (${err.code ?? '?'})`))
} else {
p.resolve(frame.result)
}
return
}
// Event push: method === "event", no id. Surface params (decoded upstream).
if (frame.method === 'event' && frame.params && typeof frame.params === 'object') {
// Handshake arrived: cancel the startup-readiness watchdog. Narrow without
// `as` via `'type' in obj` + property access (the params record is loose).
if ('type' in frame.params && frame.params.type === 'gateway.ready') {
if (this.startupTimer) clearTimeout(this.startupTimer)
this.startupTimer = undefined
}
this.onEvent(frame.params)
return
}
this.log.warn('gateway', 'unroutable frame', { preview: line.slice(0, 120) })
}
/** Send a JSON-RPC request; resolves with `result` (long handlers reply async). */
request<A = unknown>(method: string, params: unknown): Promise<A> {
// Do NOT auto-start here: during the recovery backoff window `this.proc` is
// null, and a respawn here would BYPASS the backoff (the first spawn always
// comes from subscribe() → client.start()). A null proc rejects below.
const proc = this.proc
const stdin = proc?.stdin
if (!stdin) return Promise.reject(new Error('gateway not running'))
const id = `r${++this.reqId}`
const frame = JSON.stringify({ id, jsonrpc: '2.0', method, params: params ?? {} }) + '\n'
return new Promise<A>((resolve, reject) => {
const timer = setTimeout(() => {
if (this.pending.delete(id)) reject(new Error(`timeout: ${method}`))
}, REQUEST_TIMEOUT_MS)
this.pending.set(id, {
method,
resolve: result => {
clearTimeout(timer)
resolve(result as A)
},
reject: error => {
clearTimeout(timer)
reject(error)
}
})
try {
// Newline-delimited JSON to the child's stdin. Fire-and-forget: the write
// returns a backpressure boolean we intentionally ignore (frames are tiny
// and ordered; Node flushes the pipe itself).
stdin.write(frame)
} catch (cause) {
this.pending.delete(id)
clearTimeout(timer)
reject(cause instanceof Error ? cause : new Error(String(cause)))
}
})
}
private rejectAll(reason: string): void {
for (const p of this.pending.values()) p.reject(new Error(reason))
this.pending.clear()
}
/** Close stdin (EOF → child exits) and stop. */
stop(): void {
if (this.startupTimer) clearTimeout(this.startupTimer)
this.startupTimer = undefined
this.rejectAll('gateway stopping')
const stdin = this.proc?.stdin
if (stdin) {
try {
// Close stdin → child sees EOF and exits.
stdin.end()
} catch {
// already gone
}
}
this.proc = null
}
}

View File

@@ -0,0 +1,175 @@
/**
* liveGateway — the GatewayService layer backed by the real Python `tui_gateway`
* (spec v4 §2/§3.2). Adapts RawGatewayClient to GatewayServiceShape:
* - decodes each raw event ONCE with the GatewayEvent Schema
* (decodeUnknownOption → unrecognized/malformed events skipped, never crash),
* - coalesces decoded events on a 16ms debounce flushed inside Solid `batch()`
* so a burst of deltas is ONE repaint (opencode sdk.tsx:54-80),
* - tracks the session id (set from session.create/resume result) for
* approval.respond {session_id},
* - maps request failures to a typed GatewayError (never throws).
*
* The 16ms batch + `batch()` call is the boundary handing decoded events to
* Solid — one of the two approved Effect<->Solid contact points (spec v4 §1).
*/
import { Effect, Layer, Option, Schema } from 'effect'
import { batch } from 'solid-js'
import { backoffMs, planGatewayRecovery } from '../../logic/gatewayRecovery.ts'
import { GatewayError } from '../errors.ts'
import { getLog } from '../log.ts'
import { GatewayEventSchema, type GatewayEvent } from '../schema/GatewayEvent.ts'
import { GatewayService, type GatewayServiceShape } from './GatewayService.ts'
import { RawGatewayClient } from './client.ts'
const COALESCE_MS = 16
const decodeEvent = Schema.decodeUnknownOption(GatewayEventSchema)
function makeLiveGateway(): { service: GatewayServiceShape; stop: () => void } {
const log = getLog()
const handlers = new Set<(event: GatewayEvent) => void>()
let sessionId: string | undefined
// Auto-heal recovery state (driver below). `recoverSid` is the resume target
// carried across a respawn that died before gateway.ready; `recoveryAttempts`
// is the sliding crash-loop budget window; `restartTimer` is the pending
// backoff respawn (cleared on teardown so it can't fire post-stop).
let recoverSid: string | undefined
let recoveryAttempts: number[] = []
let restartTimer: ReturnType<typeof setTimeout> | undefined
// 16ms event coalescing → one batched repaint (opencode sdk.tsx model).
let queue: GatewayEvent[] = []
let timer: ReturnType<typeof setTimeout> | undefined
let last = 0
const flush = () => {
timer = undefined
if (queue.length === 0) return
const events = queue
queue = []
last = Date.now()
batch(() => {
for (const event of events) {
for (const handler of handlers) handler(event)
}
})
}
const enqueue = (event: GatewayEvent) => {
queue.push(event)
if (timer) return
// If we flushed recently (<16ms ago) batch with near-future events; else flush now.
if (Date.now() - last < COALESCE_MS) {
timer = setTimeout(flush, COALESCE_MS)
} else {
flush()
}
}
const onRawEvent = (params: unknown) => {
const decoded = decodeEvent(params)
if (Option.isNone(decoded)) {
const t = (params as { type?: unknown } | null)?.type
log.debug('gateway', 'skipped undecodable event', { type: typeof t === 'string' ? t : '(none)' })
return
}
enqueue(decoded.value)
}
// Recovery driver: on a child exit, clear the frozen spinner (via the store's
// gateway.exited case), then — under the crash-loop budget — respawn the child
// on exponential backoff. The post-respawn gateway.ready triggers the re-resume
// (driven from entry's subscribe callback). Hoisted so it can be passed to
// `new RawGatewayClient` below while itself referencing the `client` const —
// `client` is assigned by the time onExit ever fires at runtime.
function onExit(reason: string): void {
log.warn('gateway', 'transport exited', { reason })
// Clears the frozen spinner + shows status (store handles gateway.exited).
enqueue({ type: 'gateway.exited', payload: { reason } })
const plan = planGatewayRecovery(sessionId ?? null, recoverSid ?? null, recoveryAttempts, Date.now())
recoveryAttempts = plan.attempts
if (!plan.recover || plan.sid === null) {
enqueue({ type: 'error', payload: { message: 'gateway exited repeatedly — type /resume to retry' } })
return
}
recoverSid = plan.sid
const attempt = recoveryAttempts.length
const delay = backoffMs(attempt)
enqueue({ type: 'gateway.recovering', payload: { attempt, delay_ms: delay } })
if (restartTimer) clearTimeout(restartTimer)
restartTimer = setTimeout(() => {
restartTimer = undefined
client.start()
}, delay)
}
const client = new RawGatewayClient({
log,
onEvent: onRawEvent,
onExit
})
const service: GatewayServiceShape = {
subscribe: handler =>
Effect.sync(() => {
handlers.add(handler)
// Lazily spawn on first subscription so the child + its gateway.ready land.
client.start()
return () => {
handlers.delete(handler)
}
}),
request: <A>(method: string, params: unknown) =>
Effect.tryPromise({
try: () => client.request<A>(method, params),
catch: cause => {
const message = cause instanceof Error ? cause.message : String(cause)
const reason = message.startsWith('timeout:')
? ('timeout' as const)
: message.includes('not running') || message.includes('stopping')
? ('transport-down' as const)
: ('rpc-error' as const)
return new GatewayError({ method, reason, message })
}
}).pipe(
// Capture session id from create/resume results so approval.respond works.
Effect.tap(result =>
Effect.sync(() => {
if ((method === 'session.create' || method === 'session.resume') && result && typeof result === 'object') {
const sid = (result as { session_id?: unknown }).session_id
if (typeof sid === 'string') sessionId = sid
}
})
)
),
sessionId: () => sessionId
}
// Clear a pending coalesce timer on teardown so a queued flush() can't fire
// batch()/handlers into a torn-down store after the layer scope releases.
const stop = () => {
if (timer) clearTimeout(timer)
timer = undefined
// Also kill any pending backoff respawn so it can't fire after teardown.
if (restartTimer) clearTimeout(restartTimer)
restartTimer = undefined
client.stop()
}
return { service, stop }
}
/**
* The live GatewayService layer (spawns + talks to the real Python tui_gateway).
* Scoped so the child process is stopped (stdin EOF → exit) on scope teardown —
* no orphaned gateway children when the renderer is destroyed.
*/
export const liveGatewayLayer: Layer.Layer<GatewayService> = Layer.effect(
GatewayService,
Effect.acquireRelease(Effect.sync(makeLiveGateway), ({ stop }) => Effect.sync(stop)).pipe(
Effect.map(({ service }) => service)
)
)

View File

@@ -0,0 +1,49 @@
/**
* Python resolution for spawning the `tui_gateway` — mirrors Ink's
* `resolvePython` (ui-tui/src/gatewayClient.ts:45-64) EXACTLY so behavior is
* identical across engines (spec v4 §4). NEVER "probe any python".
*
* Order: HERMES_PYTHON / PYTHON env → $VIRTUAL_ENV (bin/python or
* Scripts/python.exe) → <root>/.venv → <root>/venv → bare `python3` (`python`
* on win32) on PATH. The source root is HERMES_PYTHON_SRC_ROOT (the launcher
* sets it) so the child resolves modules against the right checkout.
*/
import { existsSync } from 'node:fs'
import { dirname, resolve } from 'node:path'
export function resolvePython(root: string): string {
const configured = process.env.HERMES_PYTHON?.trim() || process.env.PYTHON?.trim()
if (configured) return configured
const venv = process.env.VIRTUAL_ENV?.trim()
const hit = [
venv && resolve(venv, 'bin/python'),
venv && resolve(venv, 'Scripts/python.exe'),
resolve(root, '.venv/bin/python'),
resolve(root, '.venv/bin/python3'),
resolve(root, 'venv/bin/python'),
resolve(root, 'venv/bin/python3')
].find(p => p && existsSync(p))
return hit || (process.platform === 'win32' ? 'python' : 'python3')
}
/** The Hermes checkout root used as PYTHONPATH / HERMES_PYTHON_SRC_ROOT for the child. */
export function resolveSrcRoot(): string {
const configured = process.env.HERMES_PYTHON_SRC_ROOT?.trim()
if (configured) return configured
// Fallback (no launcher env): walk up from this module to the Hermes checkout
// root — the dir holding the `hermes_cli` package / `pyproject.toml`. Bundle-
// agnostic, so it works whether running the source tree (.../src/boundary/gateway)
// or the built `dist/main.js`. (Under the real launcher this never runs — the
// launcher always sets HERMES_PYTHON_SRC_ROOT.)
let dir = import.meta.dirname
for (let i = 0; i < 8; i++) {
if (existsSync(resolve(dir, 'hermes_cli')) || existsSync(resolve(dir, 'pyproject.toml'))) return dir
const parent = dirname(dir)
if (parent === dir) break
dir = parent
}
return resolve(import.meta.dirname, '../../../../')
}

View File

@@ -0,0 +1,248 @@
/**
* Log — TUI diagnostics sink (glitch: "v. important … hook into logs to figure
* out TUI state"). Design mirrors opencode's `util/log.ts` (levels + priority
* filter, scoped/child loggers, a `.time()` span helper) but adds a dual sink:
*
* 1. an in-memory RING BUFFER (queryable at runtime — a `/logs` overlay or a
* test asserting TUI state transitions can read it live), AND
* 2. an append-only NDJSON FILE (default `~/.hermes/logs/opentui-v2.log`,
* override via HERMES_TUI_LOG_FILE) so a live session is `tail -f`-able.
*
* The ring buffer is the key advantage over opencode's file-only logger: it lets
* us inspect engine state from inside the running TUI without leaving it.
*
* CRITICAL: OpenTUI HIJACKS `console.*` and stdout (opentui skill / gotcha) —
* logging to the terminal corrupts the rendered frame. So this NEVER touches
* console/stdout/stderr; file + ring only. It's the single approved logging path
* for the whole engine. Level filter via HERMES_TUI_LOG_LEVEL (default INFO).
*/
import { appendFileSync, mkdirSync, renameSync, statSync, unlinkSync } from 'node:fs'
import { homedir } from 'node:os'
import { dirname, join } from 'node:path'
import { Schema } from 'effect'
// LogLevel is modeled schema-first (the schema-inferred-types idiom, mirroring
// `boundary/schema/GatewayEvent.ts`): declare the literal union once and INFER
// the TS type from it, so the two can never drift.
export const LogLevelSchema = Schema.Literals(['debug', 'info', 'warn', 'error'])
export type LogLevel = typeof LogLevelSchema.Type
const PRIORITY: Record<LogLevel, number> = { debug: 0, info: 1, warn: 2, error: 3 }
/**
* Serialize a value to JSON that NEVER throws. A caller-supplied `data` can hold
* a circular reference or a BigInt — plain `JSON.stringify` throws on both, which
* (in the file-write `catch` below) would flip `fileBroken` and kill ALL file
* logging for the session. Instead we degrade a bad payload to a placeholder:
* - circular refs (tracked via a per-call `WeakSet` of seen objects) → '[Circular]'
* - BigInt → `\`${n}n\`` (JSON has no bigint; keep it readable + reversible-ish)
* and wrap the whole thing so any other throw (e.g. a hostile `toJSON`) falls back
* to `String(value)`, then to '[unserializable]' if even that throws.
*/
export function safeStringify(value: unknown): string {
try {
const seen = new WeakSet<object>()
return JSON.stringify(value, (_key, val: unknown) => {
if (typeof val === 'bigint') return `${val}n`
if (typeof val === 'object' && val !== null) {
if (seen.has(val)) return '[Circular]'
seen.add(val)
}
return val
})
} catch {
try {
return String(value)
} catch {
return '[unserializable]'
}
}
}
export interface LogEntry {
readonly t: number // epoch ms
readonly level: LogLevel
readonly scope: string
readonly msg: string
readonly data?: unknown
}
const RING_LIMIT = 2000
// Size-based rotation for the append-only NDJSON file (mirrors opencode's
// keep-N model, but size- rather than time-keyed since we write one growing
// file). When the live file crosses LOG_MAX_BYTES we shift
// `.log` → `.log.1` → … → `.log.${LOG_KEEP}` (dropping the oldest) and resume on
// a fresh empty `.log`. Rotation is best-effort: any failure leaves us writing
// to the existing file (logging must never crash the engine).
const LOG_MAX_BYTES = 5 * 1024 * 1024
const LOG_KEEP = 5
function defaultLogFile(): string {
const explicit = process.env.HERMES_TUI_LOG_FILE?.trim()
if (explicit) return explicit
return join(homedir(), '.hermes', 'logs', 'opentui-v2.log')
}
function defaultLevel(): LogLevel {
const raw = process.env.HERMES_TUI_LOG_LEVEL?.trim().toLowerCase()
return raw === 'debug' || raw === 'info' || raw === 'warn' || raw === 'error' ? raw : 'info'
}
/** A timing span — call `.stop()` (or `using` it) to log completion + duration. */
export interface TimeSpan {
stop: () => void
[Symbol.dispose]: () => void
}
export class Log {
private ring: LogEntry[] = []
private file: string | null
private fileBroken = false
private minPriority: number
// Bytes in the live log file. Seeded from statSync on open (counter approach —
// we avoid a statSync on EVERY write); incremented by each line's byte length
// and reset to 0 after a rotation. Rotation triggers when this would cross
// LOG_MAX_BYTES, so the live file stays bounded without per-write fs stats.
private fileBytes = 0
constructor(file: string | null = defaultLogFile(), level: LogLevel = defaultLevel()) {
this.file = file
this.minPriority = PRIORITY[level]
if (this.file) {
try {
mkdirSync(dirname(this.file), { recursive: true })
} catch {
this.fileBroken = true
}
try {
this.fileBytes = statSync(this.file).size
} catch {
this.fileBytes = 0 // no existing file (or unreadable) → start the counter at 0
}
}
}
setLevel(level: LogLevel): void {
this.minPriority = PRIORITY[level]
}
/**
* Best-effort size-based rotation: `.log.${LOG_KEEP}` is dropped, every other
* `.log.N` shifts up, the live `.log` becomes `.log.1`, and the counter resets
* so writing continues on a fresh file. Any fs failure is swallowed and we keep
* writing to the existing file — rotation must never crash logging.
*/
private rotate(file: string): void {
try {
try {
unlinkSync(`${file}.${LOG_KEEP}`)
} catch {
// oldest slot may not exist yet — fine
}
for (let i = LOG_KEEP - 1; i >= 1; i--) {
try {
renameSync(`${file}.${i}`, `${file}.${i + 1}`)
} catch {
// that slot may not exist yet — fine
}
}
renameSync(file, `${file}.1`)
this.fileBytes = 0
} catch {
// rotation failed (e.g. live file vanished) — leave the counter alone and
// keep appending to the existing path; better an oversized log than none.
}
}
private write(level: LogLevel, scope: string, msg: string, data?: unknown): void {
if (PRIORITY[level] < this.minPriority) return
const entry: LogEntry =
data === undefined ? { t: Date.now(), level, scope, msg } : { t: Date.now(), level, scope, msg, data }
this.ring.push(entry)
if (this.ring.length > RING_LIMIT) this.ring.shift()
if (this.file && !this.fileBroken) {
try {
const line = safeStringify(entry) + '\n'
if (this.fileBytes > 0 && this.fileBytes + Buffer.byteLength(line) > LOG_MAX_BYTES) this.rotate(this.file)
appendFileSync(this.file, line)
this.fileBytes += Buffer.byteLength(line)
} catch {
this.fileBroken = true // stop hammering a broken path; the ring keeps working
}
}
}
debug(scope: string, msg: string, data?: unknown): void {
this.write('debug', scope, msg, data)
}
info(scope: string, msg: string, data?: unknown): void {
this.write('info', scope, msg, data)
}
warn(scope: string, msg: string, data?: unknown): void {
this.write('warn', scope, msg, data)
}
error(scope: string, msg: string, data?: unknown): void {
this.write('error', scope, msg, data)
}
/** A logger bound to a fixed scope (opencode's tagged-logger ergonomics). */
child(scope: string): ScopedLog {
return new ScopedLog(this, scope)
}
/** Time an operation: logs `<msg> started` now and `<msg> completed` + duration on stop. */
time(scope: string, msg: string, data?: Record<string, unknown>): TimeSpan {
const started = Date.now()
this.info(scope, `${msg} started`, data)
const stop = () => this.info(scope, `${msg} completed`, { ...data, duration_ms: Date.now() - started })
return { stop, [Symbol.dispose]: stop }
}
/** Snapshot of the in-memory ring (newest last). For a `/logs` overlay or tests. */
tail(n = RING_LIMIT): LogEntry[] {
return n >= this.ring.length ? [...this.ring] : this.ring.slice(this.ring.length - n)
}
/** Where the file log is written (for surfacing in the UI / `/logs`). */
get filePath(): string | null {
return this.fileBroken ? null : this.file
}
clear(): void {
this.ring = []
}
}
/** A logger with a fixed scope — forwards to the parent Log. */
export class ScopedLog {
constructor(
private readonly parent: Log,
private readonly scope: string
) {}
debug(msg: string, data?: unknown): void {
this.parent.debug(this.scope, msg, data)
}
info(msg: string, data?: unknown): void {
this.parent.info(this.scope, msg, data)
}
warn(msg: string, data?: unknown): void {
this.parent.warn(this.scope, msg, data)
}
error(msg: string, data?: unknown): void {
this.parent.error(this.scope, msg, data)
}
time(msg: string, data?: Record<string, unknown>): TimeSpan {
return this.parent.time(this.scope, msg, data)
}
}
let _singleton: Log | null = null
/** Module-singleton logger for the live engine. Tests construct their own `new Log(null)`. */
export function getLog(): Log {
_singleton ??= new Log()
return _singleton
}

View File

@@ -0,0 +1,91 @@
/**
* memlog — in-process 1Hz memory self-sampling to NDJSON.
*
* The fleet-monitoring answer to "attach live-attach.sh to all 510 of my
* sessions": instead of an external watcher chasing pids, every TUI session
* logs its OWN samples when enabled, keyed by pid + boot time, into
* `~/.hermes/logs/memwatch/`. Aggregate across sessions with
* the tui-bench repo's `memwatch-report.mjs` (github.com/NousResearch/tui-bench).
*
* Gating (docs/opentui-env-flags.md): `HERMES_TUI_MEMLOG` — defaults to the
* `HERMES_TUI_DIAGNOSTICS` master switch, individually overridable either way.
* One `export HERMES_TUI_DIAGNOSTICS=1` in a dev's shell rc therefore covers
* every session they ever start; regular users write nothing.
*
* Cost when on: one `process.memoryUsage()` + one short append per second
* (~60 bytes/s, ~5MB/day across ten busy sessions). The interval is unref'd —
* it never keeps the process alive. Every failure path disables the logger
* silently (diagnostics must never break the TUI). Retention: files older
* than 14 days are pruned at start, best-effort.
*
* Sample shape (one JSON object per line):
* { t, rss_kb, heap_used_kb, external_kb, mounted, peak_mounted }
* `mounted`/`peak_mounted` come from the windowing DEV counters
* (logic/window.ts) — they update whenever windowing is active, independent
* of the WINDOW_STATS exposure flag.
*/
import { appendFileSync, mkdirSync, readdirSync, statSync, unlinkSync } from 'node:fs'
import { homedir } from 'node:os'
import { join } from 'node:path'
import { diagnosticsEnabled, envFlag } from '../logic/env.ts'
import { windowRowStats } from '../logic/window.ts'
const RETENTION_DAYS = 14
const SAMPLE_MS = 1000
function memwatchDir(): string {
const home = process.env.HERMES_HOME?.trim()
const base = home && home.length > 0 ? home : join(homedir(), '.hermes')
return join(base, 'logs', 'memwatch')
}
function pruneOld(dir: string): void {
const cutoff = Date.now() - RETENTION_DAYS * 24 * 3600 * 1000
try {
for (const name of readdirSync(dir)) {
if (!name.endsWith('.jsonl')) continue
const p = join(dir, name)
try {
if (statSync(p).mtimeMs < cutoff) unlinkSync(p)
} catch {
/* best-effort */
}
}
} catch {
/* best-effort */
}
}
/** Start the self-sampler (no-op unless enabled). Returns a stop function. */
export function startMemlog(): () => void {
if (!envFlag(process.env.HERMES_TUI_MEMLOG, diagnosticsEnabled())) return () => {}
try {
const dir = memwatchDir()
mkdirSync(dir, { recursive: true })
pruneOld(dir)
const boot = new Date().toISOString().replace(/[:.]/g, '').slice(0, 15)
const file = join(dir, `${boot}-${process.pid}.jsonl`)
const timer = setInterval(() => {
try {
const m = process.memoryUsage()
const w = windowRowStats()
const line = JSON.stringify({
t: Math.floor(Date.now() / 1000),
rss_kb: Math.floor(m.rss / 1024),
heap_used_kb: Math.floor(m.heapUsed / 1024),
external_kb: Math.floor(m.external / 1024),
mounted: w.mounted,
peak_mounted: w.peakMounted
})
appendFileSync(file, line + '\n')
} catch {
clearInterval(timer) // a failing diagnostic must not retry forever
}
}, SAMPLE_MS)
timer.unref?.()
return () => clearInterval(timer)
} catch {
return () => {}
}
}

View File

@@ -0,0 +1,46 @@
/**
* memoryMonitor — the early-warning BOUNDARY (touches node:process; the pure
* threshold/growth logic lives in logic/memoryMonitor.ts).
*
* Ports the high-value #34095 silent-death early-warning from Ink
* (`ui-tui/src/lib/memoryMonitor.ts`) to the OpenTUI engine, and ONLY that:
* - NO auto heap-snapshot capture (the #41948 disk-fill bug class is not
* re-imported — the always-on memlog NDJSON trace is the diagnosis path,
* and its rss-vs-heap divergence is the better diagnostic for the native
* RSS-leak class a V8 snapshot captures poorly).
* - NO Ink cache eviction (Solid disposes out-of-window rows; windowing +
* proactiveGc already cover memory pressure).
*
* It polls `process.memoryUsage()` on a 10s unref'd interval and, when the
* pure detector fires, surfaces a single transcript system line so the user
* SEES "memory climbing fast" before Node OOMs under the exit threshold. This
* is ON by default (unlike memlog/heapdump): it's a user-facing safety
* heads-up, not a diagnostic dump, and costs one memoryUsage() read per 10s
* with zero disk. Every failure path disables silently (a diagnostic must
* never break the TUI — the one place the "errors propagate" rule is
* intentionally inverted, matching memlog/proactiveGc).
*/
import { createWarnState, evaluateWarn, warnLine } from '../logic/memoryMonitor.ts'
/** Sample cadence — matches Ink's monitor (10s, unref'd). */
const SAMPLE_MS = 10_000
/**
* Start the early-warning watcher. `emitWarn` receives the ready-to-show system
* line on the (one-shot) tick growth looks abnormal. Returns a stop function.
* The interval is unref'd so it never keeps the process alive.
*/
export function startMemoryMonitor(emitWarn: (line: string) => void): () => void {
const state = createWarnState()
const timer = setInterval(() => {
try {
const { heapUsed, rss } = process.memoryUsage()
const { fire, growthBytes } = evaluateWarn(state, heapUsed)
if (fire) emitWarn(warnLine(heapUsed, rss, growthBytes))
} catch {
clearInterval(timer) // a failing diagnostic must not retry forever
}
}, SAMPLE_MS)
timer.unref?.()
return () => clearInterval(timer)
}

View File

@@ -0,0 +1,130 @@
/**
* Multi-click selection — double-click selects the word, triple-click the
* line, drag after either extends by word/line with the clicked span held
* (boundary shim in the ffiSafe.ts / nativeHandles.ts mold).
*
* Why a shim: @opentui/core's renderer knows only press-drag character
* selection — `processSingleMouseEvent` calls `startSelection(renderable,x,y)`
* on a fresh left press and `updateSelection(renderable,x,y)` per drag step,
* with no click-count concept. Wrapping those two INSTANCE methods is the
* narrowest seam that adds multi-click without forking core: the press wrapper
* counts clicks (Ink's 500ms / 1-cell chain) and, on a multi-click, seeds the
* selection with the word/line span instead of a point; the drag wrapper snaps
* the focus to word/line bounds and flips the selection anchor to whichever
* end of the held span faces away from the pointer.
*
* Word/line bounds come from the presented frame (`currentRenderBuffer`'s
* char grid — the same buffer `captureCharFrame` reads in tests), so what
* highlights is exactly the run of characters the user sees. All wrapped paths
* degrade to core's plain character selection when anything is off (no
* buffer, destroyed renderer, out-of-bounds click) — selection must never
* throw out of the mouse pipeline.
*/
import type { CliRenderer } from '@opentui/core'
import type { AnchorSpan, Point, ScreenText } from '../logic/multiClick.ts'
import { comparePoints, createClickCounter, extendedSelection, lineSpanAt, wordSpanAt } from '../logic/multiClick.ts'
/** The renderable surface the shim needs (anchor tracking reads live x/y). */
interface AnchorRenderable {
readonly x: number
readonly y: number
}
/** The private renderer surface the shim wraps (runtime-verified shapes). */
interface RendererSeam {
startSelection(renderable: AnchorRenderable, x: number, y: number): void
updateSelection(
renderable: AnchorRenderable | undefined,
x: number,
y: number,
options?: { finishDragging?: boolean }
): void
currentRenderBuffer: {
width: number
height: number
buffers: { char: Uint32Array }
}
}
/** Adapt the presented frame to the pure logic's ScreenText; null when the
* buffer is unreadable (mid-teardown/resize) → degrade to char selection. */
function presentedFrame(seam: RendererSeam): ScreenText | null {
try {
const buffer = seam.currentRenderBuffer
const chars = buffer.buffers.char
const width = buffer.width
if (width <= 0 || buffer.height <= 0) return null
return {
width,
height: buffer.height,
codepointAt: (x, y) => chars[y * width + x] ?? 0
}
} catch {
return null
}
}
/**
* Native selection semantics (probed empirically, scratch test 2026-06-11):
* per-renderable native selection keeps the anchor from the initial
* `setLocalSelection` — the anchor args of later `updateLocalSelection` calls
* are IGNORED, so moving the anchor requires restarting the selection. And the
* selection is caret-style at the focus end: a forward selection covers cells
* `[anchor, focus)` (focus cell excluded) while a backward one covers
* `[focus, anchor]` (both included). Inclusive cell spans therefore translate
* to: forward focus = `hi + 1`, backward focus = `lo` exactly.
*/
function forwardFocusX(anchor: Point, focus: Point): number {
return comparePoints(focus, anchor) >= 0 ? focus.x + 1 : focus.x
}
/** Install the multi-click wrappers on a live renderer instance. */
export function installMultiClickSelection(renderer: CliRenderer): void {
const seam = renderer as unknown as RendererSeam
const nextClickCount = createClickCounter()
// The held span while a multi-click selection is live: cleared by the next
// single click (which starts a plain char selection). `anchor` mirrors the
// selection's current anchor end so drag steps only rebind it on a flip.
let held: { span: AnchorSpan; renderable: AnchorRenderable; anchor: Point } | null = null
const coreStart = seam.startSelection.bind(renderer)
const coreUpdate = seam.updateSelection.bind(renderer)
seam.startSelection = (renderable, x, y) => {
held = null
const clicks = nextClickCount(x, y, Date.now())
const screen = clicks >= 2 ? presentedFrame(seam) : null
const span = screen ? (clicks === 2 ? wordSpanAt(screen, x, y) : lineSpanAt(screen, y)) : null
if (!span) {
coreStart(renderable, x, y)
return
}
// Seed anchor at the span start, focus past its end (forward caret) — one
// start+update pair, exactly the calls a real press-then-drag would make.
coreStart(renderable, span.lo.x, span.lo.y)
coreUpdate(renderable, span.hi.x + 1, span.hi.y)
held = {
span: { ...span, kind: clicks === 2 ? 'word' : 'line' },
renderable,
anchor: span.lo
}
}
seam.updateSelection = (renderable, x, y, options) => {
const screen = held ? presentedFrame(seam) : null
if (!held || !screen) {
coreUpdate(renderable, x, y, options)
return
}
const { anchor, focus } = extendedSelection(held.span, screen, x, y)
if (anchor.x !== held.anchor.x || anchor.y !== held.anchor.y) {
// The anchor end flipped across the held span — native selection anchors
// are fixed at set time (see forwardFocusX note), so restart it there.
coreStart(held.renderable, anchor.x, anchor.y)
held = { ...held, anchor }
}
coreUpdate(renderable, forwardFocusX(anchor, focus), focus.y, options)
}
}

View File

@@ -0,0 +1,110 @@
/**
* Native handle-table exhaustion safety for @opentui/core 0.4.0 — sibling of
* the ffiSafe.ts coordinate shim (same class of fix: harden OUR side of the
* Node-FFI seam, TODO(upstream) to delete).
*
* Root cause (bench crash: every otui mem3000 cell died at ≈3000 lumpy fixture
* messages, exit 7, ~880MB RSS — far below the 2GB cgroup cap): the native
* core indexes EVERY object — TextBuffer, TextBufferView, SyntaxStyle,
* OptimizedBuffer, … — through ONE global handle registry with 16-bit slot
* indices (core `src/zig/handles.zig`: `INDEX_BITS = 16` → `MAX_SLOTS = 65535`,
* slot 0 reserved). Measured on this install: exactly 65,534 live handles, the
* 65,535th `createSyntaxStyle()` fails; `destroy()` does recycle slots, so
* exhaustion means LIVE objects.
*
* Every `TextBufferRenderable` burns THREE slots at construction
* (`TextBufferRenderable.ts:77-80`: `TextBuffer.create()` +
* `TextBufferView.create()` + `SyntaxStyle.create()`). The mount-everything
* transcript hits the wall at ≈1,400 store rows (≈21.8k text renderables ×3 ≈
* 65.5k handles): the next mount throws `Failed to create SyntaxStyle`
* (zig.ts:4554) out of a Solid mount effect → uncaught → the renderer's OWN
* `uncaughtException` handler (renderer.ts `handleError`) calls
* `console.show()`, which allocates the console-overlay `OptimizedBuffer` —
* needing ANOTHER slot — so the handler itself throws `Failed to create
* optimized buffer: WxH` and Node dies with exit 7 (fatal error in the
* uncaughtException handler), MASKING the real error. (The exception-handler
* guard lives in renderer.ts `guardRendererErrorHandlers`.)
*
* Why we can't just SHARE one SyntaxStyle across renderables (the obvious
* 3→2 fix): the per-buffer style is load-bearing. The native styled-text path
* (text-buffer.zig `setStyledText`) registers each chunk's color by NAME —
* "chunk0", "chunk1", … — into the buffer's OWN syntax style, and
* registration is name-keyed-overwrite (syntax-style.zig `putStyle`: existing
* name → overwrite that id's definition). A shared style would have every
* styled `<text>` overwrite every other one's chunk colors (live highlights
* reference style IDS, re-resolved at render). So pooling is unsound at our
* layer; the table pressure itself is bounded by the store row cap
* (logic/store.ts, clamped to a handle-safe ceiling) until #27 lands
* renderable-weight-aware capping/virtualization.
*
* What THIS shim does: makes style allocation failure DEGRADE instead of
* throwing out of mount/render. `SyntaxStyle.create()` on a full table
* returns a DETACHED style (handle 0 = the native INVALID_HANDLE):
* - JS-side styling still works — markdown/code chunk colors come from
* `getStyle`/`mergeStyles`, which read the instance's JS `styleDefs` map
* (see core lib/tree-sitter-styled-text.ts), never the native handle;
* - every native call on handle 0 is already a safe no-op in zig (acquire
* fails → early return), and `textBuffer.setSyntaxStyle(detached)` passes
* ptr 0 which the native side treats as "no style" — buffer-level styled
* -text highlights are skipped, i.e. that text renders unstyled;
* - `destroy()` on a detached style is a native no-op (beginDestroy(0)).
*
* TODO(upstream): file an OpenTUI issue — (a) a global 64k handle table with a
* 3-slot cost per text renderable is too small for transcript-style TUIs;
* (b) allocation failure throws out of the render loop with no degrade path;
* (c) `handleError` allocates (console overlay) and so crashes on the very
* condition it is reporting, masking the root cause with exit 7.
*/
import { SyntaxStyle, resolveRenderLib, type SyntaxStyleHandle } from '@opentui/core'
import { getLog } from './log.ts'
/** The native side's INVALID_HANDLE — every FFI entry point no-ops on it. */
const DETACHED: SyntaxStyleHandle = 0 as never
let installed = false
let warnedExhausted = false
/** Build a SyntaxStyle backed by NO native handle: JS-side styleDefs/merge
* caches fully functional, all native calls safe no-ops (handle 0). */
function detachedSyntaxStyle(): SyntaxStyle {
return new SyntaxStyle(resolveRenderLib(), DETACHED)
}
/**
* Patch `SyntaxStyle.create` (the static the core's own TextBufferRenderable
* constructor calls — @opentui/core is external, one shared class object) so
* native handle-table exhaustion degrades to a detached, unstyled-but-inert
* style instead of throwing out of a Solid mount effect. Idempotent.
*
* @param factory test seam — inject a failing allocator to exercise the
* degrade path (defaults to the real `SyntaxStyle.create`).
*/
export function installSyntaxStyleDegrade(factory?: () => SyntaxStyle): void {
if (installed) return
installed = true
const origCreate = factory ?? SyntaxStyle.create.bind(SyntaxStyle)
SyntaxStyle.create = function create(): SyntaxStyle {
try {
return origCreate()
} catch (cause) {
if (!warnedExhausted) {
warnedExhausted = true
try {
getLog().error(
'native',
'SyntaxStyle allocation failed — native handle table exhausted; degrading to unstyled',
{
cause: String(cause)
}
)
} catch {
// logging is best-effort inside a degrade path
}
}
return detachedSyntaxStyle()
}
}
}

View File

@@ -0,0 +1,65 @@
{
"comment": "Tree-sitter grammars beyond @opentui/core's bundled 5 (ts/js/markdown/markdown_inline/zig). NOT vendored — declared as remote URLs and fetched+cached at runtime by OpenTUI's TreeSitterClient into HERMES_TUI_PARSER_CACHE (see src/boundary/parsers.ts). This file is the URL source of truth. Grammar versions are pinned via the release tag in each wasm URL; .scm highlight queries follow opencode's per-language source choices (parser-repo queries for python/html where nvim-treesitter's are parser-incompatible, nvim-treesitter master otherwise). cpp deliberately dropped (3.28 MB, ~half the old vendored bundle). Aliases are belt-and-braces: core's extToFiletype/infoStringToFiletype already normalize py->python etc., but a literal alias filetype reaching the client still resolves. To add/refresh a grammar, edit the wasm tag + the highlights URL here — no binaries land in the repo.",
"parsers": [
{
"filetype": "python",
"aliases": ["py"],
"wasm": "https://github.com/tree-sitter/tree-sitter-python/releases/download/v0.23.6/tree-sitter-python.wasm",
"highlights": "https://github.com/tree-sitter/tree-sitter-python/raw/refs/heads/master/queries/highlights.scm"
},
{
"filetype": "rust",
"aliases": ["rs"],
"wasm": "https://github.com/tree-sitter/tree-sitter-rust/releases/download/v0.23.2/tree-sitter-rust.wasm",
"highlights": "https://raw.githubusercontent.com/nvim-treesitter/nvim-treesitter/refs/heads/master/queries/rust/highlights.scm"
},
{
"filetype": "go",
"aliases": [],
"wasm": "https://github.com/tree-sitter/tree-sitter-go/releases/download/v0.23.4/tree-sitter-go.wasm",
"highlights": "https://raw.githubusercontent.com/nvim-treesitter/nvim-treesitter/refs/heads/master/queries/go/highlights.scm"
},
{
"filetype": "bash",
"aliases": ["sh", "shell", "zsh"],
"wasm": "https://github.com/tree-sitter/tree-sitter-bash/releases/download/v0.23.3/tree-sitter-bash.wasm",
"highlights": "https://raw.githubusercontent.com/nvim-treesitter/nvim-treesitter/refs/heads/master/queries/bash/highlights.scm"
},
{
"filetype": "json",
"aliases": [],
"wasm": "https://github.com/tree-sitter/tree-sitter-json/releases/download/v0.24.8/tree-sitter-json.wasm",
"highlights": "https://raw.githubusercontent.com/nvim-treesitter/nvim-treesitter/refs/heads/master/queries/json/highlights.scm"
},
{
"filetype": "c",
"aliases": ["h"],
"wasm": "https://github.com/tree-sitter/tree-sitter-c/releases/download/v0.23.5/tree-sitter-c.wasm",
"highlights": "https://raw.githubusercontent.com/nvim-treesitter/nvim-treesitter/refs/heads/master/queries/c/highlights.scm"
},
{
"filetype": "html",
"aliases": [],
"wasm": "https://github.com/tree-sitter/tree-sitter-html/releases/download/v0.23.2/tree-sitter-html.wasm",
"highlights": "https://github.com/tree-sitter/tree-sitter-html/raw/refs/heads/master/queries/highlights.scm"
},
{
"filetype": "css",
"aliases": [],
"wasm": "https://github.com/tree-sitter/tree-sitter-css/releases/download/v0.23.2/tree-sitter-css.wasm",
"highlights": "https://raw.githubusercontent.com/nvim-treesitter/nvim-treesitter/refs/heads/master/queries/css/highlights.scm"
},
{
"filetype": "yaml",
"aliases": ["yml"],
"wasm": "https://github.com/tree-sitter-grammars/tree-sitter-yaml/releases/download/v0.7.1/tree-sitter-yaml.wasm",
"highlights": "https://raw.githubusercontent.com/nvim-treesitter/nvim-treesitter/refs/heads/master/queries/yaml/highlights.scm"
},
{
"filetype": "toml",
"aliases": [],
"wasm": "https://github.com/tree-sitter-grammars/tree-sitter-toml/releases/download/v0.7.0/tree-sitter-toml.wasm",
"highlights": "https://raw.githubusercontent.com/nvim-treesitter/nvim-treesitter/refs/heads/master/queries/toml/highlights.scm"
}
]
}

View File

@@ -0,0 +1,100 @@
/**
* Extra Tree-sitter grammar registration — the syntax-highlighting language
* expansion (docs/plans/opentui-syntax-highlighting-languages.md).
*
* @opentui/core@0.4.x bundles only a handful of grammars (ts/js/markdown/
* markdown_inline/zig); everything else renders plain text. The cure is the
* public `addDefaultParsers()` API fed with REMOTE grammar URLs — OpenTUI's
* TreeSitterClient fetches each `.wasm`/`.scm` lazily on first use of a
* filetype and caches it under the client's `dataPath`. We do NOT vendor any
* binaries (cf. opencode, which checks in zero `.wasm`/`.scm` and lets OpenTUI
* fetch+cache). The grammar set + its URLs live in `parsers.manifest.json`.
*
* Cache location: `HERMES_TUI_PARSER_CACHE` (set by the Python launcher to
* `~/.hermes/cache/opentui-parsers/`, profile-aware via get_hermes_home). When
* unset (dev/demo/CI), we leave OpenTUI's default data path
* (`$XDG_DATA_HOME/opentui` → `~/.local/share/opentui`) untouched.
*
* `setDataPath()` on the GLOBAL client must run BEFORE the client initializes
* (it only mutates `options.dataPath` until init, then the worker boots with
* it). `addDefaultParsers()` must run BEFORE the first `<code>`/`<markdown>`
* mount (they grab the global client lazily and trigger init). The entry
* imports + calls `registerRemoteParsers()` at module load, ahead of renderer
* acquisition, so both orderings hold.
*
* Offline behavior: registration itself does NO network (it only declares the
* URL configs). The fetch happens on first highlight of a given language; if it
* fails (air-gapped / GitHub unreachable), OpenTUI degrades that filetype to
* plain text — never a throw. A registration error likewise degrades the whole
* extra set to plain text.
*/
import { getTreeSitterClient } from '@opentui/core'
import { addDefaultParsers } from '@opentui/core'
import manifest from './parsers.manifest.json'
import { getLog } from './log.ts'
interface ManifestParser {
readonly filetype: string
readonly aliases: readonly string[]
readonly wasm: string
readonly highlights: string
}
/** The registered parser configs (exported shape for tests/diagnostics). */
export interface RegisteredParser {
filetype: string
aliases?: string[]
wasm: string
queries: { highlights: string[] }
}
/** The cache dir for fetched grammar assets, or undefined to use OpenTUI's
* default ($XDG_DATA_HOME/opentui). The launcher sets this per-profile. */
export function parserCacheDir(): string | undefined {
const dir = (process.env.HERMES_TUI_PARSER_CACHE ?? '').trim()
return dir.length ? dir : undefined
}
/** Build the remote parser configs from the manifest. Pure — no network, no
* filesystem; just declares the URL configs OpenTUI fetches lazily. */
export function remoteParsers(): RegisteredParser[] {
const configs: RegisteredParser[] = []
for (const parser of (manifest as { parsers: ManifestParser[] }).parsers) {
if (!parser.wasm || !parser.highlights) continue
configs.push({
filetype: parser.filetype,
...(parser.aliases.length ? { aliases: [...parser.aliases] } : {}),
wasm: parser.wasm,
queries: { highlights: [parser.highlights] }
})
}
return configs
}
/** Point the global tree-sitter client's cache at our profile dir, then
* register the remote grammars with core's global default-parser list.
* Returns what was registered (empty on any failure — plain-text fallback). */
export function registerRemoteParsers(): RegisteredParser[] {
try {
const cache = parserCacheDir()
if (cache) {
// Must precede the client's lazy initialize() (first <code>/<markdown>
// mount). Pre-init this only mutates options.dataPath; the returned
// promise resolves immediately (no worker yet) so we don't await it.
void getTreeSitterClient().setDataPath(cache)
}
const parsers = remoteParsers()
if (!parsers.length) {
getLog().warn('parsers', 'no remote tree-sitter grammars declared — extras render plain', {})
return []
}
addDefaultParsers(parsers)
return parsers
} catch (cause) {
getLog().warn('parsers', 'tree-sitter registration failed — extras render plain', {
cause: String(cause)
})
return []
}
}

View File

@@ -0,0 +1,95 @@
/**
* proactiveGc — opt-in, idle-gated `global.gc()` for the low-mem path (W2).
*
* GATED on the low-mem opt-in: only runs when the user set a LOW
* `HERMES_TUI_HEAP_MB` (the same knob W1 threads into `--max-old-space-size`).
* Default / unconstrained sessions do NOTHING — no proactive GC, no divergence
* from Ink on the default path (Ink never calls gc proactively; it only exposes
* it for heapdumps, so W2 is OpenTUI-only by design — spec D5).
*
* TRIGGER MODEL — idle-gated, never mid-stream:
* - A low-frequency timer ticks every IDLE_MS. On each tick it calls
* `global.gc()` ONLY when (a) a turn is NOT streaming (`isStreaming()` false
* — so we never pause mid-render/mid-reply) and (b) at least one full idle
* window has passed since the last activity (stream end / explicit touch).
* - If RSS crosses RSS_EAGER_KB (>400MB) the cadence tightens (EAGER_MS) but
* it STILL waits for idle — eagerness shortens the window, never bypasses it.
* - `--expose-gc` (W1) makes `global.gc` real; without it this is a silent
* no-op (we detect and disable). The timer is unref'd — it never keeps the
* process alive — and every failure path disables silently (a GC helper must
* never break the TUI).
*
* Reuses `process.memoryUsage().rss` (same read as memlog) for the >400MB check.
*/
import { envFlag } from '../logic/env.ts'
/** Below this heap cap (MB) we treat the session as low-mem opt-in. 8192 is the
* default; anyone who set a cap materially under it wants tight memory. */
const LOW_MEM_HEAP_MB = 4096
/** Idle window (ms): time since last activity before a GC is allowed. */
const IDLE_MS = 8000
/** Tightened idle window once RSS is high. */
const EAGER_MS = 3000
/** RSS (KB) above which GC becomes eager (still idle-gated). 400MB. */
const RSS_EAGER_KB = 400 * 1024
/** The configured heap cap in MB from the W1 knob, or null when unset/garbage.
* The Python launcher reads the same env; the child inherits it, so the Node
* side can read it directly to know whether low-mem mode is active. */
function configuredHeapMb(): number | null {
const v = (process.env.HERMES_TUI_HEAP_MB ?? '').trim()
if (!/^\d+$/.test(v)) return null
const n = Number.parseInt(v, 10)
return n > 0 ? n : null
}
/** Whether proactive GC should run: a low heap cap is set AND gc is exposed.
* `HERMES_TUI_PROACTIVE_GC` can force it on/off, but defaults to the low-mem
* signal so the knob composes (spec D9: independent knobs). */
export function proactiveGcEnabled(heapMb: number | null = configuredHeapMb()): boolean {
const lowMem = heapMb !== null && heapMb <= LOW_MEM_HEAP_MB
return envFlag(process.env.HERMES_TUI_PROACTIVE_GC, lowMem) && typeof global.gc === 'function'
}
/**
* Start the idle-gated proactive GC watcher. `isStreaming` reports whether a
* turn is mid-flight (read from the store's `info.running`). Returns a stop
* function and a `touch()` to mark fresh activity (e.g. on keypress / stream
* start) so the idle clock resets. No-op (returns inert handles) when disabled.
*/
export function startProactiveGc(isStreaming: () => boolean): { stop: () => void; touch: () => void } {
if (!proactiveGcEnabled()) return { stop: () => {}, touch: () => {} }
const gc = global.gc
if (typeof gc !== 'function') return { stop: () => {}, touch: () => {} }
let lastActivity = Date.now()
const touch = () => {
lastActivity = Date.now()
}
// Tick at the eager cadence; the idle-window check (below) does the real
// gating, so a high-RSS session reacts within EAGER_MS while a calm one still
// waits the full IDLE_MS. One cheap rss read + compare per tick.
const timer = setInterval(() => {
try {
if (isStreaming()) {
// mid-stream: defer entirely and keep the clock fresh so a GC can't fire
// the instant the stream ends — it waits a full idle window after.
lastActivity = Date.now()
return
}
const rssKb = Math.floor(process.memoryUsage().rss / 1024)
const window = rssKb > RSS_EAGER_KB ? EAGER_MS : IDLE_MS
if (Date.now() - lastActivity < window) return
gc()
// After a collection, reset the clock so we don't GC every tick — the next
// one waits another full idle window.
lastActivity = Date.now()
} catch {
clearInterval(timer) // a failing GC helper must not retry forever
}
}, EAGER_MS)
timer.unref?.()
return { stop: () => clearInterval(timer), touch }
}

View File

@@ -0,0 +1,234 @@
/**
* Renderer lifecycle — the Effect-side resource boundary (spec v4 §3.1).
*
* `acquireRelease(createCliRenderer)` so the renderer is always destroyed on
* scope exit; a `Deferred` resolved on the renderer's "destroy" event lets the
* entry block until the user quits. Mirrors opencode `app.tsx:177` /
* `:185-225`.
*
* No throw / try-catch here: acquisition failure surfaces as a typed
* `RendererError` via `Effect.tryPromise`'s `catch`.
*/
import { createCliRenderer, type CliRenderer, type KeyEvent, type Selection } from '@opentui/core'
import { Deferred, Effect } from 'effect'
import { RendererError } from './errors.ts'
import { installFfiCoordSafety } from './ffiSafe.ts'
import { getLog } from './log.ts'
import { installMultiClickSelection } from './multiClickSelect.ts'
import { installSyntaxStyleDegrade } from './nativeHandles.ts'
// Node-FFI seam: clamp negative draw coordinates BEFORE the u32 FFI marshaling
// (see ffiSafe.ts — scrolled-out <diff> line backgrounds crashed the render loop).
installFfiCoordSafety()
// Native handle-table seam: SyntaxStyle allocation failure (global 65,534-slot
// registry exhausted) degrades to an unstyled detached style instead of throwing
// out of a Solid mount effect (see nativeHandles.ts for the full root cause).
installSyntaxStyleDegrade()
/**
* The text a finished selection copies: the RENDERED text the user highlighted,
* verbatim (`getSelectedText()` does correct same-line merging). Markdown markers
* are concealed in the pretty render, so a partial selection cannot recover source —
* this copies exactly what was highlighted (the `/copy` command gives full source).
* Total by construction — a copy must NEVER throw out of an input/event handler
* (that would tear down the render loop).
*/
function selectionCopyText(selection: Selection): string {
try {
return selection.getSelectedText()
} catch (cause) {
getLog().warn('copy', 'getSelectedText failed', { cause: String(cause) })
return ''
}
}
export interface RendererOptions {
/** Mouse tracking on/off (from decoded display config). */
readonly mouse: boolean
/** When true, a blocking prompt owns Ctrl+C (cancel) — the global quit is suppressed (gotcha §8 #6). */
readonly isBlocked?: () => boolean
/**
* Ctrl+C handler (item 11). When set, it OWNS Ctrl+C while not blocked — the
* entry's state machine decides interrupt-the-turn vs quit. When omitted, the
* default is an immediate `renderer.destroy()` (quit).
*/
readonly onCtrlC?: () => void
/**
* Copy a mouse selection (item 1). When there's a live selection, Ctrl+C copies
* it (this callback) instead of interrupting/quitting — opencode's selection
* key precedence (`app.tsx:388`). Receives the rendered text the user highlighted.
*/
readonly onCopySelection?: (text: string) => void
}
/**
* Acquire a CliRenderer inside the current scope and register its release.
* Returns the renderer plus a Deferred that resolves when the renderer is
* destroyed (user quit) — `await` it to keep the entry alive.
*/
export const acquireRenderer = Effect.fn('Renderer.acquire')(function* (options: RendererOptions) {
const renderer = yield* Effect.acquireRelease(
Effect.tryPromise({
try: async () => {
// Snapshot process error listeners so we can guard exactly the ones the
// renderer installs (its `handleError` — see guardRendererErrorHandlers).
const preexisting = snapshotErrorListeners()
const created = await createCliRenderer({
// Root canvas: TRANSPARENT by default — the terminal's own background
// shows through (do not paint a "default dark" canvas; glitch hated
// it). A skin's explicit ui_bg lands reactively via the header's
// theme effect (view/header.tsx setBackgroundColor).
// scrollbox clips growing output → no terminal-scrollback corruption (gotcha §8 #2).
externalOutputMode: 'passthrough',
targetFps: 60,
// Don't let core's uncaught-error handler call the ALLOCATING
// `console.show()` (0.4.1 public option; defaults true-in-dev). That
// call needs a native handle, so under handle-table exhaustion — the
// very condition being reported — it throws and exit-7-masks the
// original error (the bench mem3000 postmortem). Disabling it removes
// that failure mode at the source; `guardRendererErrorHandlers` below
// stays as belt-and-suspenders (honest logging if any core handler
// still throws), but is no longer load-bearing for the exit-7 mask.
openConsoleOnError: false,
// prompts own Ctrl+C → deny/cancel (gotcha §8 #6); the global quit is gated on !blocked.
exitOnCtrlC: false,
// OpenTUI's default exitSignals include SIGPIPE + SIGBUS, and its handler
// calls renderer.destroy() — so a broken clipboard pipe (writeClipboard
// spawning xclip/wl-copy that dies) raises SIGPIPE and QUITS THE TUI on
// copy. SIGPIPE/SIGBUS are not shutdown intents; restrict to the genuine
// termination signals so a stray pipe error can never tear down the UI.
exitSignals: ['SIGINT', 'SIGTERM', 'SIGQUIT', 'SIGHUP'],
useKittyKeyboard: {},
useMouse: options.mouse
})
guardRendererErrorHandlers(created, preexisting)
// Editor-grade mouse selection: double-click word, triple-click line,
// drag extends with the clicked span held (see multiClickSelect.ts).
installMultiClickSelection(created)
return created
},
catch: cause => new RendererError({ cause })
}),
renderer => Effect.sync(() => destroyRenderer(renderer))
)
const shutdown = yield* Deferred.make<void>()
renderer.once('destroy', () => {
Deferred.doneUnsafe(shutdown, Effect.void)
})
// Global quit on Ctrl+C. `exitOnCtrlC:false` hands Ctrl+C to us as a key event
// (not SIGINT), so destroying here fires 'destroy' → resolves `shutdown` → the
// entry scope closes → finalizers run: renderer teardown + the gateway layer's
// `client.stop()` EOFs the Python child's stdin so it exits (no orphan). When a
// blocking prompt is up, it owns Ctrl+C (→ deny/cancel) so we suppress the quit
// (gotcha §8 #6) — the prompt's own handler sends the cancel reply.
const isBlocked = options.isBlocked ?? (() => false)
renderer.keyInput.on('keypress', (key: KeyEvent) => {
if (!(key.ctrl && key.name === 'c') || renderer.isDestroyed) return
// Copy a live mouse selection first (item 1) — takes precedence over the
// interrupt/quit machine and over a blocking prompt's cancel.
if (options.onCopySelection) {
const selection = renderer.getSelection()
const text = selection ? selectionCopyText(selection) : ''
if (text) {
options.onCopySelection(text)
renderer.clearSelection()
return
}
}
if (isBlocked()) return // a blocking prompt owns Ctrl+C (→ deny/cancel)
if (options.onCtrlC) options.onCtrlC()
else renderer.destroy()
})
// Copy-on-select (item 1 parity with free-code/Ink): the renderer's "selection"
// event fires ONCE when a free-form mouse selection COMPLETES (drag finish);
// auto-copy the spanned selectable text. Unlike the Ctrl+C path above we do NOT
// clearSelection() — the highlight persists so the user sees what was copied and
// Ctrl+C still works on it. `writeClipboard` is idempotent, so both paths writing
// the same text is harmless (no double-write bug). `CliRenderer extends
// EventEmitter`, so `on('selection', …)` is untyped → annotate `selection`.
const onCopy = options.onCopySelection
if (onCopy) {
renderer.on('selection', (selection: Selection) => {
const text = selectionCopyText(selection)
if (text) onCopy(text)
})
}
return { renderer, shutdown } as const
})
/** Best-effort renderer teardown; never throws out of the finalizer. */
function destroyRenderer(renderer: CliRenderer): void {
try {
if (!renderer.isDestroyed) renderer.destroy()
} catch {
// teardown is best-effort; a failed destroy must not mask the real exit cause.
}
}
// ── honest-crash guard for the renderer's process error handlers ─────────────
//
// CliRenderer installs its own `uncaughtException`/`unhandledRejection` handler
// (`handleError`: console.error + console.show()). `console.show()` ALLOCATES —
// the console-overlay OptimizedBuffer needs a native handle — so under native
// handle-table exhaustion (the very condition being reported, see
// nativeHandles.ts) the handler itself throws `Failed to create optimized
// buffer: WxH`, and Node kills the process with exit 7, MASKING the original
// error (this is exactly the bench mem3000 postmortem). Wrap the listeners the
// renderer added so a handler failure is logged honestly and the original
// error stays the story; while the renderer is alive the process keeps running
// (core's own contract: handled uncaught exceptions don't exit).
type ProcessErrorEvent = 'uncaughtException' | 'unhandledRejection'
const PROCESS_ERROR_EVENTS: readonly ProcessErrorEvent[] = ['uncaughtException', 'unhandledRejection']
type ErrorListener = (...args: unknown[]) => void
// Node's typings don't accept the union event name in listeners/on/removeListener
// overloads — view the process emitter through a minimal untyped seam.
const proc = process as unknown as {
listeners(event: ProcessErrorEvent): ErrorListener[]
on(event: ProcessErrorEvent, listener: ErrorListener): void
removeListener(event: ProcessErrorEvent, listener: ErrorListener): void
}
function snapshotErrorListeners(): ReadonlyMap<ProcessErrorEvent, ReadonlySet<unknown>> {
return new Map(PROCESS_ERROR_EVENTS.map(event => [event, new Set(proc.listeners(event))]))
}
/** Re-wrap the error listeners `createCliRenderer` added (delta vs the snapshot)
* so an exception INSIDE them can never exit-7-mask the original error. */
function guardRendererErrorHandlers(
renderer: CliRenderer,
preexisting: ReadonlyMap<ProcessErrorEvent, ReadonlySet<unknown>>
): void {
for (const event of PROCESS_ERROR_EVENTS) {
const before = preexisting.get(event)
for (const listener of proc.listeners(event)) {
if (before?.has(listener)) continue
proc.removeListener(event, listener)
const guarded: ErrorListener = (...args) => {
// After teardown the renderer can no longer report anything — rethrow so
// the ORIGINAL error reaches Node's default fatal path unmasked.
if (renderer.isDestroyed) throw args[0]
try {
listener(...args)
} catch (handlerFailure) {
try {
const original = args[0]
getLog().error('renderer', 'core error handler crashed while reporting an uncaught error', {
original: original instanceof Error ? (original.stack ?? original.message) : String(original),
handlerFailure: String(handlerFailure)
})
} catch {
// logging is best-effort — never throw out of an exception handler.
}
}
}
proc.on(event, guarded)
}
}
}

View File

@@ -0,0 +1,17 @@
/**
* Runtime composition — the single edge where layers are provided and the
* program is run (spec v4 §3.1). Layers are provided HERE by the caller
* (the launcher entry), never inside components. Mirrors opencode
* `cli/tui/layer.ts:6` + `cli/cmd/tui.ts` runMain.
*/
import { Layer } from 'effect'
import type { GatewayService } from './gateway/GatewayService.ts'
/**
* The application layer. Phase 0 takes the GatewayService layer as a parameter
* so the entry can choose Fake (dev/test) or — from Phase 1 — the live
* `tui_gateway`-spawning layer. Compose additional boundary services
* (Config, Theme-with-IO) here as they land.
*/
export const makeAppLayer = (gateway: Layer.Layer<GatewayService>) => Layer.mergeAll(gateway)

View File

@@ -0,0 +1,259 @@
/**
* GatewayEvent — the wire event union, modeled as an Effect Schema and decoded
* ONCE at the transport boundary (spec v4 §3.3). Mirrors Ink's
* `ui-tui/src/gatewayTypes.ts:509-587` (discriminant = `type`).
*
* beta.78 API (verified vs .d.ts): variants are `Schema.Struct` with a
* `Schema.Literal` `type`, combined with `Schema.Union([...]).pipe(
* Schema.toTaggedUnion("type"))`. Optional fields use `Schema.optionalKey`
* (exact-optional under exactOptionalPropertyTypes). Decode unknown wire JSON
* with `Schema.decodeUnknownOption` so an UNRECOGNIZED `type` yields `Option.none`
* and is skipped — a stray event never tears down the stream.
*
* Types are INFERRED from the schema (`typeof X["Type"]`), never hand-declared.
*/
import { Schema } from 'effect'
const Str = Schema.String
const opt = Schema.optionalKey
// ── Skin (mirror GatewaySkin in ui-tui/src/gatewayTypes.ts) ───────────
export const GatewaySkinSchema = Schema.Struct({
banner_hero: opt(Str),
banner_logo: opt(Str),
branding: opt(Schema.Record(Str, Str)),
colors: opt(Schema.Record(Str, Str)),
help_header: opt(Str),
tool_prefix: opt(Str),
// Spinner animation data (faces/verbs/wings) — mixed array/tuple shapes, kept
// loose at the boundary; the spinner component narrows what it reads. tool_emojis
// is a per-tool glyph override map. Both additive + optional (back-compat).
spinner: opt(Schema.Record(Str, Schema.Unknown)),
tool_emojis: opt(Schema.Record(Str, Str))
})
export type GatewaySkinDecoded = typeof GatewaySkinSchema.Type
// ── Variant schemas (one per wire `type`) ─────────────────────────────
// lifecycle
const GatewayReady = Schema.Struct({
type: Schema.Literal('gateway.ready'),
session_id: opt(Str),
payload: opt(Schema.Struct({ skin: opt(GatewaySkinSchema) }))
})
const SkinChanged = Schema.Struct({
type: Schema.Literal('skin.changed'),
session_id: opt(Str),
payload: opt(GatewaySkinSchema)
})
const SessionInfoEvent = Schema.Struct({
type: Schema.Literal('session.info'),
session_id: opt(Str),
// SessionInfo is large + evolving; keep it loose at the boundary (Record),
// the chrome phase narrows the fields it actually reads.
payload: Schema.Record(Str, Schema.Unknown)
})
// streaming text
const MessageStart = Schema.Struct({ type: Schema.Literal('message.start'), session_id: opt(Str) })
const MessageDelta = Schema.Struct({
type: Schema.Literal('message.delta'),
session_id: opt(Str),
payload: opt(Schema.Struct({ text: opt(Str), rendered: opt(Str) }))
})
const MessageComplete = Schema.Struct({
type: Schema.Literal('message.complete'),
session_id: opt(Str),
// `usage` carries the post-turn token/context totals → refreshes the status bar
// (item 14). Kept loose (Record) — the chrome reader narrows what it needs.
payload: opt(Schema.Struct({ text: opt(Str), rendered: opt(Str), usage: opt(Schema.Record(Str, Schema.Unknown)) }))
})
// reasoning / thinking — toTaggedUnion needs ONE literal per member, so the
// reasoning.delta/reasoning.available pair is two structs sharing a shape.
const ReasoningShape = {
session_id: opt(Str),
payload: opt(Schema.Struct({ text: opt(Str), verbose: opt(Schema.Boolean) }))
}
const ReasoningDelta = Schema.Struct({ type: Schema.Literal('reasoning.delta'), ...ReasoningShape })
const ReasoningAvailable = Schema.Struct({ type: Schema.Literal('reasoning.available'), ...ReasoningShape })
const ThinkingDelta = Schema.Struct({
type: Schema.Literal('thinking.delta'),
session_id: opt(Str),
payload: opt(Schema.Struct({ text: opt(Str) }))
})
// tools
const ToolStart = Schema.Struct({
type: Schema.Literal('tool.start'),
session_id: opt(Str),
payload: Schema.Record(Str, Schema.Unknown)
})
const ToolComplete = Schema.Struct({
type: Schema.Literal('tool.complete'),
session_id: opt(Str),
payload: Schema.Record(Str, Schema.Unknown)
})
const ToolProgress = Schema.Struct({
type: Schema.Literal('tool.progress'),
session_id: opt(Str),
payload: Schema.Struct({ name: opt(Str), preview: opt(Str) })
})
const ToolGenerating = Schema.Struct({
type: Schema.Literal('tool.generating'),
session_id: opt(Str),
payload: Schema.Struct({ name: opt(Str) })
})
// blocking prompts (deadlock-critical — Phase 3 renders these)
const ClarifyRequest = Schema.Struct({
type: Schema.Literal('clarify.request'),
session_id: opt(Str),
payload: Schema.Struct({
choices: opt(Schema.NullOr(Schema.Array(Str))),
question: opt(Str),
request_id: Str
})
})
const ApprovalRequest = Schema.Struct({
type: Schema.Literal('approval.request'),
session_id: opt(Str),
payload: Schema.Struct({ command: Str, description: Str })
})
const SudoRequest = Schema.Struct({
type: Schema.Literal('sudo.request'),
session_id: opt(Str),
payload: Schema.Struct({ request_id: Str })
})
const SecretRequest = Schema.Struct({
type: Schema.Literal('secret.request'),
session_id: opt(Str),
payload: Schema.Struct({ env_var: Str, prompt: Str, request_id: Str })
})
// chrome / agent
const StatusUpdate = Schema.Struct({
type: Schema.Literal('status.update'),
session_id: opt(Str),
payload: opt(Schema.Struct({ kind: opt(Str), text: opt(Str) }))
})
const NotificationShow = Schema.Struct({
type: Schema.Literal('notification.show'),
session_id: opt(Str),
payload: Schema.Record(Str, Schema.Unknown)
})
const NotificationClear = Schema.Struct({
type: Schema.Literal('notification.clear'),
session_id: opt(Str),
payload: opt(Schema.Struct({ key: opt(Str) }))
})
const VoiceStatus = Schema.Struct({
type: Schema.Literal('voice.status'),
session_id: opt(Str),
payload: opt(Schema.Struct({ state: opt(Schema.Literals(['idle', 'listening', 'transcribing'])) }))
})
const VoiceTranscript = Schema.Struct({
type: Schema.Literal('voice.transcript'),
session_id: opt(Str),
payload: opt(Schema.Struct({ no_speech_limit: opt(Schema.Boolean), text: opt(Str) }))
})
const BrowserProgress = Schema.Struct({
type: Schema.Literal('browser.progress'),
session_id: opt(Str),
payload: Schema.Record(Str, Schema.Unknown)
})
const BackgroundComplete = Schema.Struct({
type: Schema.Literal('background.complete'),
session_id: opt(Str),
payload: Schema.Struct({ task_id: Str, text: Str })
})
const ReviewSummary = Schema.Struct({
type: Schema.Literal('review.summary'),
session_id: opt(Str),
payload: opt(Schema.Struct({ text: opt(Str) }))
})
const SubagentShape = { session_id: opt(Str), payload: Schema.Record(Str, Schema.Unknown) }
const SubagentSpawnRequested = Schema.Struct({ type: Schema.Literal('subagent.spawn_requested'), ...SubagentShape })
const SubagentStart = Schema.Struct({ type: Schema.Literal('subagent.start'), ...SubagentShape })
const SubagentThinking = Schema.Struct({ type: Schema.Literal('subagent.thinking'), ...SubagentShape })
const SubagentTool = Schema.Struct({ type: Schema.Literal('subagent.tool'), ...SubagentShape })
const SubagentProgress = Schema.Struct({ type: Schema.Literal('subagent.progress'), ...SubagentShape })
const SubagentComplete = Schema.Struct({ type: Schema.Literal('subagent.complete'), ...SubagentShape })
// transport errors
const ErrorEvent = Schema.Struct({
type: Schema.Literal('error'),
session_id: opt(Str),
payload: opt(Schema.Struct({ message: opt(Str) }))
})
const GatewayStderr = Schema.Struct({
type: Schema.Literal('gateway.stderr'),
session_id: opt(Str),
payload: Schema.Struct({ line: Str })
})
const GatewayStartTimeout = Schema.Struct({
type: Schema.Literal('gateway.start_timeout'),
session_id: opt(Str),
payload: Schema.Record(Str, Schema.Unknown)
})
const GatewayProtocolError = Schema.Struct({
type: Schema.Literal('gateway.protocol_error'),
session_id: opt(Str),
payload: opt(Schema.Struct({ preview: opt(Str) }))
})
// gateway lifecycle recovery (auto-heal): the child exited (crash/kill) and the
// transport is respawning+resuming the session. Surfaced so the frozen spinner
// clears and the user sees the in-flight reply was lost (see store cases).
const GatewayExited = Schema.Struct({
type: Schema.Literal('gateway.exited'),
session_id: opt(Str),
payload: opt(Schema.Struct({ reason: opt(Str), code: opt(Schema.Number), signal: opt(Str) }))
})
const GatewayRecovering = Schema.Struct({
type: Schema.Literal('gateway.recovering'),
session_id: opt(Str),
payload: opt(Schema.Struct({ attempt: opt(Schema.Number), delay_ms: opt(Schema.Number) }))
})
// ── The union ─────────────────────────────────────────────────────────
export const GatewayEventSchema = Schema.Union([
GatewayReady,
SkinChanged,
SessionInfoEvent,
MessageStart,
MessageDelta,
MessageComplete,
ReasoningDelta,
ReasoningAvailable,
ThinkingDelta,
ToolStart,
ToolComplete,
ToolProgress,
ToolGenerating,
ClarifyRequest,
ApprovalRequest,
SudoRequest,
SecretRequest,
StatusUpdate,
NotificationShow,
NotificationClear,
VoiceStatus,
VoiceTranscript,
BrowserProgress,
BackgroundComplete,
ReviewSummary,
SubagentSpawnRequested,
SubagentStart,
SubagentThinking,
SubagentTool,
SubagentProgress,
SubagentComplete,
ErrorEvent,
GatewayStderr,
GatewayStartTimeout,
GatewayProtocolError,
GatewayExited,
GatewayRecovering
]).pipe(Schema.toTaggedUnion('type'))
/** The decoded, typed event. Inferred from the schema — never hand-declared. */
export type GatewayEvent = typeof GatewayEventSchema.Type

View File

@@ -0,0 +1,113 @@
/**
* SessionInfo + Catalog decoders — the decode-at-boundary idiom (spec v4 §3.3),
* mirroring GatewayEvent.ts. These two payloads are UNTRUSTED loose JSON from the
* Python `tui_gateway` (`session.info` event / `session.create`/`resume` result
* `info`, and the `startup.catalog` RPC result), so they are decoded ONCE with an
* Effect Schema instead of hand-rolled `as`-cast readers.
*
* Decode with `Schema.decodeUnknownOption`: a malformed/partial payload yields
* `Option.none` and the caller falls back to an empty patch / leaves the catalog
* unset — a stray shape never crashes the reducer.
*
* Wire field names are verified against `tui_gateway/server.py`:
* - session.info → `_session_info()` (server.py:~1798): top-level `model`,
* `reasoning_effort`, `fast`, `cwd`, `branch`, `running`, `profile_name`,
* `update_behind` (Optional[int] — null until the prefetched check lands),
* `update_command`, `mcp_servers` (list of {name,transport,connected,tools}
* dicts from `get_mcp_status()`), plus a nested `usage` (`_get_usage()`,
* server.py:~1683) carrying `context_used`, `context_max`,
* `context_percent`, `compressions` (context_* only present when the
* compressor knows a context length) and `cost_usd` (only when the pricing
* estimate succeeds).
* - startup.catalog → `@method("startup.catalog")` (server.py:~8521):
* `{ tools:{total, toolsets:[{name,count,enabled,tools}]},
* skills:{total, categories:[{name,count}]}, mcp:{servers:[]} }`.
*
* These schemas are used PURELY as decoders; they do NOT Effect-ify the store's
* reactivity or control flow (Solid stays the runtime — spec v4 §1).
*/
import { Schema } from 'effect'
const Str = Schema.String
const Num = Schema.Number
const Bool = Schema.Boolean
const opt = Schema.optionalKey
// ── session.info / session.create.info ────────────────────────────────
// Context/usage numbers arrive nested under `usage`; the same names may also
// appear at the top level depending on the RPC vs event path (the reader prefers
// `usage.context_*`, then the top-level fallback). All keys are optional — a
// `session.info` patch only carries the fields that actually changed.
const UsageSchema = Schema.Struct({
context_used: opt(Num),
context_max: opt(Num),
context_percent: opt(Num),
compressions: opt(Num),
cost_usd: opt(Num)
})
export const SessionInfoPatchSchema = Schema.Struct({
model: opt(Str),
reasoning_effort: opt(Str),
fast: opt(Bool),
cwd: opt(Str),
branch: opt(Str),
// session title ("" until the first exchange titles it) — drives the
// terminal window-title chrome (OSC 0/2 via renderer.setTerminalTitle).
title: opt(Str),
running: opt(Bool),
// status-bar chrome extras (Epic 1.3): update banner, profile badge, MCP count.
// `update_behind` is null on the wire until the async update check resolves.
update_behind: opt(Schema.NullOr(Num)),
update_command: opt(Str),
profile_name: opt(Str),
mcp_servers: opt(Schema.Array(Schema.Unknown)),
// top-level context fallback (used when there's no nested `usage`)
context_used: opt(Num),
context_max: opt(Num),
context_percent: opt(Num),
compressions: opt(Num),
usage: opt(UsageSchema)
})
export type SessionInfoPatchDecoded = typeof SessionInfoPatchSchema.Type
/** Decode a loose session.info payload → `Option<SessionInfoPatchDecoded>`. */
export const decodeSessionInfoPatch = Schema.decodeUnknownOption(SessionInfoPatchSchema)
// ── startup.catalog ───────────────────────────────────────────────────
// Mirrors the `Catalog` interface in store.ts. `enabled` defaults to true at the
// reader (an absent flag means on), so it stays optional here.
const ToolsetSchema = Schema.Struct({
name: opt(Str),
count: opt(Num),
enabled: opt(Bool),
tools: opt(Schema.Array(Schema.Unknown))
})
const CategorySchema = Schema.Struct({
name: opt(Str),
count: opt(Num)
})
export const CatalogSchema = Schema.Struct({
tools: opt(
Schema.Struct({
total: opt(Num),
toolsets: opt(Schema.Array(ToolsetSchema))
})
),
skills: opt(
Schema.Struct({
total: opt(Num),
categories: opt(Schema.Array(CategorySchema))
})
),
mcp: opt(
Schema.Struct({
servers: opt(Schema.Array(Schema.Unknown))
})
)
})
export type CatalogDecoded = typeof CatalogSchema.Type
/** Decode a loose startup.catalog result → `Option<CatalogDecoded>`. */
export const decodeCatalog = Schema.decodeUnknownOption(CatalogSchema)

View File

@@ -0,0 +1,55 @@
/**
* SessionPeek decoder — decode-at-boundary (house rule) for the `session.peek`
* RPC result (tui_gateway/server.py `@method("session.peek")`, shipped with the
* resume-picker gateway half, commit 529d8084b). The response powers the
* picker's Space preview:
*
* { session: {id, title, source, model, cwd, started_at, ended_at,
* end_reason, message_count, last_active, cost_usd},
* head: [{id, role, content(≤2000), truncated, timestamp}, …],
* tail: [same — never overlaps head],
* total_messages: int }
*
* Wire nullability per the server: `model`/`cwd`/`ended_at`/`end_reason`/
* `cost_usd` are `None` when unknown; message `id`/`timestamp` come straight
* off DB rows (left loose). Decoded with `Schema.decodeUnknownOption` — a
* malformed payload yields `Option.none` and the preview pane shows its
* honest "preview unavailable" line instead of crashing the overlay.
*/
import { Schema } from 'effect'
const Str = Schema.String
const Num = Schema.Number
const opt = Schema.optionalKey
const PeekMessageSchema = Schema.Struct({
role: opt(Str),
content: opt(Str),
truncated: opt(Schema.Boolean),
timestamp: opt(Schema.NullOr(Schema.Unknown))
})
export const SessionPeekSchema = Schema.Struct({
session: opt(
Schema.Struct({
id: opt(Str),
title: opt(Schema.NullOr(Str)),
source: opt(Schema.NullOr(Str)),
model: opt(Schema.NullOr(Str)),
cwd: opt(Schema.NullOr(Str)),
started_at: opt(Schema.NullOr(Num)),
ended_at: opt(Schema.NullOr(Num)),
end_reason: opt(Schema.NullOr(Str)),
message_count: opt(Schema.NullOr(Num)),
last_active: opt(Schema.NullOr(Num)),
cost_usd: opt(Schema.NullOr(Num))
})
),
head: opt(Schema.Array(PeekMessageSchema)),
tail: opt(Schema.Array(PeekMessageSchema)),
total_messages: opt(Num)
})
export type SessionPeekDecoded = typeof SessionPeekSchema.Type
/** Decode a loose session.peek result → `Option<SessionPeekDecoded>`. */
export const decodeSessionPeek = Schema.decodeUnknownOption(SessionPeekSchema)

View File

@@ -0,0 +1,127 @@
/**
* Terminal chrome seam — window title (OSC 0/2) + desktop notifications
* through the renderer's native primitives.
*
* Why the renderer and not process.stdout: the zig side owns the terminal —
* `setTerminalTitle` and `triggerNotification` are native FFI calls and
* `writeOut` serializes raw control bytes with frame presentation, so chrome
* writes can never tear a frame.
*
* Notifications go through the native `renderer.triggerNotification(message,
* title)` (zig `lib.triggerNotification`), NOT a hand-rolled OSC 9/99/777 spray.
* The zig side does what raw OSC can't: authoritative protocol detection
* (query > heuristic) so it picks the ONE protocol the terminal speaks, **tmux
* DCS passthrough wrapping** (raw OSC is silently eaten by tmux), and Zellij
* OSC-99 enforcement. It returns `false` when no protocol was detected.
*
* Focus suppression: core parses mode-1004 focus reports (`ESC[I`/`ESC[O`)
* and re-emits them as renderer `focus`/`blur` events — notifications are
* skipped while the terminal reports focused (you're already looking at it).
* Native `triggerNotification` does NOT do focus suppression, so it stays our
* policy here. Terminals that never report focus leave the state at the
* assumed-focused initial value… which would swallow every notification, so
* the FIRST blur is what arms suppression: until a blur arrives we treat focus
* as unknown and notify unconditionally (worst case: a redundant ping while
* focused).
*
* Everything here is total — chrome must never throw into the render loop
* or a teardown path.
*/
import type { CliRenderer } from '@opentui/core'
import type { TermNotification } from '../logic/termChrome.ts'
import {
notifyEnabled,
sanitizeOscText,
TITLE_STACK_RESTORE,
TITLE_STACK_SAVE,
windowTitleFor
} from '../logic/termChrome.ts'
import { getLog } from './log.ts'
/** What the view layer needs from the chrome seam (DI-friendly for tests). */
export interface TerminalChromeSeam {
/** Set the window title from the session title (undefined → generic). */
readonly setTitle: (sessionTitle: string | undefined) => void
/** Announce "waiting on you" to the hosting terminal (no-op while focused). */
readonly notify: (notification: TermNotification) => void
}
/** The renderer surface the seam writes through (runtime-verified shapes). */
interface RendererSeam {
setTerminalTitle(title: string): void
/** Native desktop notification (protocol detection + tmux/Zellij wrapping). */
triggerNotification(message: string, title?: string): boolean
writeOut(chunk: string): void
on(event: 'focus' | 'blur', listener: () => void): unknown
once(event: 'destroy', listener: () => void): unknown
readonly isDestroyed: boolean
}
/** Install the chrome seam on a live renderer. Idempotent per renderer use —
* the entry calls it once, right next to the render bridge. */
export function installTerminalChrome(renderer: CliRenderer): TerminalChromeSeam {
const seam = renderer as unknown as RendererSeam
const notificationsOn = notifyEnabled()
// unknown (null) until the terminal proves it reports focus; then boolean.
let focused: boolean | null = null
try {
seam.on('focus', () => {
focused = true
})
seam.on('blur', () => {
focused = false
})
} catch (cause) {
getLog().warn('chrome', 'focus tracking unavailable', { cause: String(cause) })
}
// Bracket our title ownership: save the user's title now, restore on quit.
// Best-effort — terminals without the XTWINOPS title stack ignore both.
writeRaw(seam, TITLE_STACK_SAVE)
seam.once('destroy', () => writeRaw(seam, TITLE_STACK_RESTORE, { evenIfDestroyed: true }))
let lastTitle = ''
return {
setTitle: sessionTitle => {
const title = windowTitleFor(sessionTitle)
if (title === lastTitle) return
lastTitle = title
try {
if (!seam.isDestroyed) seam.setTerminalTitle(title)
} catch (cause) {
getLog().warn('chrome', 'setTerminalTitle failed', { cause: String(cause) })
}
},
notify: notification => {
if (!notificationsOn || focused === true) return
// Map our {title:'Hermes', body:'finished — …'} → native (message, title):
// native API takes the BODY as the message and the heading as the title.
const title = sanitizeOscText(notification.title)
const body = sanitizeOscText(notification.body ?? '')
if (!title) return
const message = body || title
try {
if (!seam.isDestroyed) seam.triggerNotification(message, title)
} catch (cause) {
getLog().warn('chrome', 'triggerNotification failed', { cause: String(cause) })
}
}
}
}
/** Raw control write through the renderer; falls back to process.stdout when
* the renderer is already gone (the title-stack restore on destroy — at that
* point there is no frame left to tear). */
function writeRaw(seam: RendererSeam, chunk: string, options?: { evenIfDestroyed?: boolean }): void {
try {
if (!seam.isDestroyed) {
seam.writeOut(chunk)
return
}
if (options?.evenIfDestroyed) process.stdout.write(chunk)
} catch (cause) {
getLog().warn('chrome', 'control write failed', { cause: String(cause) })
}
}

View File

@@ -0,0 +1,64 @@
/**
* FakeGateway — the test/dev implementation of GatewayService (spec v4 §2/§5
* Layer-3 seam). Provides an emittable event source and a spy `request`, so
* store/component tests can drive synthetic streams and assert RPC calls
* without spawning Python. Mirrors opencode's injectable fake transport.
*
* Phase 0 uses it to stream a scripted "hello" so the entry/test renders a
* non-empty frame. Phase 1 swaps in `liveGateway.layer` (real `tui_gateway`).
*/
import { Effect, Layer } from 'effect'
import { GatewayService, type GatewayServiceShape } from '../boundary/gateway/GatewayService.ts'
import type { GatewayEvent } from '../boundary/schema/GatewayEvent.ts'
export interface FakeGatewayController {
readonly service: GatewayServiceShape
/** Emit a decoded event to all subscribers (drives the store in tests). */
readonly emit: (event: GatewayEvent) => void
/** Recorded (method, params) pairs from `request` calls. */
readonly calls: Array<{ method: string; params: unknown }>
}
/** Build a fresh fake controller (used directly in tests, or wrapped as a Layer). */
export function makeFakeGateway(initialSessionId = 'fake-session'): FakeGatewayController {
const handlers = new Set<(event: GatewayEvent) => void>()
const calls: Array<{ method: string; params: unknown }> = []
const service: GatewayServiceShape = {
subscribe: handler =>
Effect.sync(() => {
handlers.add(handler)
return () => {
handlers.delete(handler)
}
}),
request: <A>(method: string, params: unknown) =>
Effect.sync(() => {
calls.push({ method, params })
return undefined as A
}),
sessionId: () => initialSessionId
}
return {
service,
emit: event => {
for (const handler of handlers) handler(event)
},
calls
}
}
/** A GatewayService layer backed by a fresh FakeGateway. The controller is
* reachable for assertions via the returned tuple in tests; for the dev entry
* use {@link fakeGatewayLayer} and drive it from a scripted effect. */
export function fakeGatewayLayerWith(controller: FakeGatewayController): Layer.Layer<GatewayService> {
return Layer.succeed(GatewayService, controller.service)
}
/** Convenience: a layer + its controller, for the dev entry's scripted stream. */
export function makeFakeGatewayLayer(): { layer: Layer.Layer<GatewayService>; controller: FakeGatewayController } {
const controller = makeFakeGateway()
return { layer: Layer.succeed(GatewayService, controller.service), controller }
}

View File

@@ -0,0 +1,770 @@
/**
* Entry — the single boundary edge (spec v4 §3.1). This is the ONE place that:
* - acquires the renderer (acquireRelease + Deferred-on-destroy),
* - creates the Solid store,
* - wires GatewayService.subscribe -> store.apply (Effect->Solid contact #2),
* - does the one-line `render(() => <App/>, renderer)` bridge (contact #1),
* - (live) bootstraps a session and optionally submits an initial prompt,
* - blocks until the renderer is destroyed (user quit),
* and at the bottom PROVIDES the layers and runs (`Effect.provide(AppLayer)`).
*
* Backend selection (import.meta.main):
* - default → the LIVE `liveGatewayLayer` (spawns the real Python
* `tui_gateway`); after `gateway.ready` it `session.create`s and, if an
* initial prompt is given (HERMES_TUI_PROMPT or argv), `prompt.submit`s it.
* The composer lands in Phase 2 — until then the initial prompt is how a
* streamed reply is driven into the transcript (spec Phase-1 smoke).
* - HERMES_TUI_FAKE=1 → the scripted FakeGateway "hello" (offline dev/CI).
*
* The body of `run` does not change when the backend swaps — that's the point of
* the layer; only `makeAppLayer(...)` differs at the edge.
*/
import { createDefaultOpenTuiKeymap } from '@opentui/keymap/opentui'
import { KeymapProvider } from '@opentui/keymap/solid'
import { render } from '@opentui/solid'
import { Deferred, Duration, Effect } from 'effect'
import { writeFileSync } from 'node:fs'
import { readClipboardImage, writeClipboard } from '../boundary/clipboard.ts'
import { GatewayService, type GatewayServiceShape } from '../boundary/gateway/GatewayService.ts'
import { liveGatewayLayer } from '../boundary/gateway/liveGateway.ts'
import { getLog } from '../boundary/log.ts'
import { startMemlog } from '../boundary/memlog.ts'
import { startMemoryMonitor } from '../boundary/memoryMonitor.ts'
import { startProactiveGc } from '../boundary/proactiveGc.ts'
import { registerRemoteParsers } from '../boundary/parsers.ts'
import { acquireRenderer } from '../boundary/renderer.ts'
import { makeAppLayer } from '../boundary/runtime.ts'
import { nthAssistantResponse } from '../logic/copy.ts'
import { performHeapdump } from '../logic/diagnostics.ts'
import {
envFlag,
heapdumpOnStart,
launchCwd,
noConfirmDestructive,
resolveMouseEnabled,
startupImage,
startupPrompt,
STARTUP_IMAGE_DEFAULT_PROMPT
} from '../logic/env.ts'
import { createPromptHistory, dirHistoryPersister, loadDirHistory } from '../logic/history.ts'
import { parseProcessList } from '../logic/backgroundActivity.ts'
import { createPasteStore } from '../logic/pastes.ts'
import { mapResumeHistory } from '../logic/resume.ts'
import {
classifySubmit,
catalogCommandItems,
createCompletionGate,
dispatchSlash,
mapCompletions,
mapModelOptions,
planCompletion,
readReplaceFrom,
registerModelPrefetch,
type SlashContext
} from '../logic/slash.ts'
import { createSessionStore, type SessionStore } from '../logic/store.ts'
import { App } from '../view/App.tsx'
import { seedLearnedNames } from '../view/composer.tsx'
import { TerminalChrome } from '../view/terminalChrome.tsx'
// Syntax-highlighting language expansion: register the remote tree-sitter
// grammars (python/rust/go/bash/json/c/html/css/yaml/toml) before the first
// <code>/<markdown> mount initializes the global tree-sitter client. Grammars
// are fetched from GitHub on first use and cached under HERMES_TUI_PARSER_CACHE.
registerRemoteParsers()
import type { SessionPickerOps } from '../view/overlays/sessionPicker.tsx'
import { ThemeProvider } from '../view/theme.tsx'
import { makeFakeGatewayLayer, type FakeGatewayController } from './fakeGateway.ts'
export interface TuiInput {
/** Mouse tracking on/off. */
readonly mouse: boolean
/** Skip the live session bootstrap (the fake backend drives the stream itself). */
readonly fake: boolean
/** Terminal width passed to `session.create` (Ink uses the live cols; 80 is a fine default). */
readonly cols: number
/** Optional initial prompt submitted once the session is ready — the Phase-1 stand-in for the composer. */
readonly initialPrompt?: string
/** Optional image PATH attached (image.attach) before the initial prompt — `hermes --tui --image <path>`. */
readonly initialImage?: string
/** Resume a session instead of creating one: a session id, 'recent'/'last'
* (→ session.most_recent), or 'picker' (bare `--resume` — open the resume
* picker BEFORE any session.create; create stays lazy). */
readonly resumeId?: string
}
const READY_POLL = Duration.millis(100)
const READY_TIMEOUT_MS = 20_000
/** Window after a Ctrl+C in which a second Ctrl+C quits the TUI (item 11). */
const QUIT_WINDOW_MS = 3_000
/** Recursive renderable count under a node (the /mem store-cap diagnostic —
* same walk as scripts/mem-bench.tsx; cheap: one tree pass on demand). */
function descendantCount(node: { getChildren(): unknown[] }): number {
let n = 0
for (const child of node.getChildren()) {
n += 1
if (child && typeof child === 'object' && 'getChildren' in child) {
n += descendantCount(child as { getChildren(): unknown[] })
}
}
return n
}
/**
* Resume a session INTO the store: buffer live events across the `session.resume`
* RPC, then replace history + replay (gotcha §8 #5 tool rows handled by
* mapResumeHistory). Shared by the launch bootstrap and the session switcher.
* Timed (rpc_ms / hydrate_ms) for the resume profile.
*/
/**
* Record the CURRENT session id in `HERMES_TUI_ACTIVE_SESSION_FILE` (item #5).
* The launcher reads this on exit to print the right "Resume this session with…"
* epilogue (hermes_cli/main.py `_print_tui_exit_summary`). The Ink TUI writes it on
* every session change (useSessionLifecycle.writeActiveSessionFile); the native
* engine must too, or the launcher falls back to the INITIAL launch session and
* shows resume info for the wrong session after a `/session` switch.
*/
const writeActiveSession = (sid: string | undefined) => {
const file = process.env.HERMES_TUI_ACTIVE_SESSION_FILE
if (!file || !sid) return
try {
writeFileSync(file, JSON.stringify({ session_id: sid }), { mode: 0o600 })
} catch (cause) {
getLog().warn('bootstrap', 'active-session-file write failed', { cause: String(cause) })
}
}
const resumeInto = (gateway: GatewayServiceShape, store: SessionStore, sid: string, cols: number) =>
Effect.gen(function* () {
writeActiveSession(sid) // the session we're switching to is now the active one (#5)
store.setSessionId(sid)
store.beginBuffer()
const t0 = Date.now()
const resumed = yield* gateway.request<{ messages?: unknown; info?: Record<string, unknown> }>('session.resume', {
cols,
session_id: sid,
// native engine renders tools collapsed → safe to fold each tool's capped
// result into the resume snapshot so resumed turns render like live (item 1).
with_tool_output: true
})
const t1 = Date.now()
const snapshot = mapResumeHistory(resumed?.messages)
store.commitSnapshot(snapshot)
if (resumed?.info) store.applyInfo(resumed.info)
getLog().info('bootstrap', 'session resumed', {
count: snapshot.length,
hydrate_ms: Date.now() - t1,
rpc_ms: t1 - t0,
sid
})
})
/**
* Post-session setup, shared by every way a session comes to exist (create,
* boot resume, boot-picker pick): the tools/skills/MCP catalog for the home
* panel (item 9 — best-effort), the optional initial prompt, and the `/model`
* catalog prefetch (Epic 7 instant open: `model.options` is the slow RPC —
* network pricing fetch + Nous tier check — so pay it ONCE in an already-
* forked fiber; the promise is STASHED in the slash seam so an early `/model`
* awaits THIS request instead of doubling it).
*/
const postSessionSetup = (
gateway: GatewayServiceShape,
store: SessionStore,
sid: string,
initialPrompt?: string,
initialImage?: string
) =>
Effect.gen(function* () {
const catalog = yield* gateway
.request<unknown>('startup.catalog', { session_id: sid })
.pipe(Effect.catchCause(() => Effect.succeed(undefined)))
if (catalog) store.setCatalog(catalog)
// Seed the composer's slash-highlight catalog ONCE at boot (glitch
// 2026-06-14): `commands.catalog` returns the full uncapped command + skill
// name list ({pairs:[["/name","desc"],…]}); feeding the names through
// seedLearnedNames means a cold `/command` highlights on the first keystroke
// instead of only after its completion batch was browsed earlier. Best-effort
// — a failure just leaves the old lazy-learn behavior.
const cmdCatalog = yield* gateway
.request<unknown>('commands.catalog', {})
.pipe(Effect.catchCause(() => Effect.succeed(undefined)))
seedLearnedNames(catalogCommandItems(cmdCatalog))
// Seeded image (`hermes --tui --image <path>`): attach BEFORE submitting, so
// the next prompt.submit picks it up — exact Ink parity (createGatewayEventHandler
// scheduleStartupPrompt: image.attach then submit; default prompt when image-only).
const image = initialImage?.trim()
if (image) {
yield* gateway.request('image.attach', { path: image, session_id: sid }).pipe(
Effect.catchCause(cause =>
Effect.sync(() => {
getLog().warn('bootstrap', 'startup image attach failed', { cause: String(cause) })
store.pushSystem(`startup image attach failed: ${String(cause)}`)
})
)
)
}
const prompt = initialPrompt?.trim() || (image ? STARTUP_IMAGE_DEFAULT_PROMPT : undefined)
if (prompt) {
store.pushUser(prompt)
yield* gateway.request('prompt.submit', { session_id: sid, text: prompt })
}
const prefetch = Effect.runPromise(
gateway
.request<unknown>('model.options', { session_id: sid })
.pipe(Effect.catchCause(() => Effect.succeed(undefined)))
).then(modelOpts => {
const modelItems = mapModelOptions(modelOpts)
if (modelItems.length) store.setModelItems(modelItems)
})
registerModelPrefetch(prefetch)
yield* Effect.promise(() => prefetch)
})
/** Create a FRESH session + run the post-session setup (the default boot path;
* also the boot-picker's Esc fallback — closing the picker without a pick
* must still leave a usable session behind). */
const createFreshSession = (gateway: GatewayServiceShape, store: SessionStore, input: TuiInput) =>
Effect.gen(function* () {
const created = yield* gateway.request<{ session_id?: string; info?: Record<string, unknown> }>('session.create', {
cols: input.cols,
// The launch directory IS the workspace choice in a terminal (you cd'd
// here) — passing it makes the gateway treat it as explicit, so the
// session row gets a persisted cwd on first message, the chrome bar shows
// the right dir, and /sessions groups this directory's sessions first.
// NOT process.cwd(): the hermes launcher runs this engine with cwd set to
// its own package dir (ui-opentui), so process.cwd() would be the engine
// dir. The launcher exports the REAL launch dir as HERMES_CWD / the
// gateway's TERMINAL_CWD; prefer those, falling back to process.cwd()
// only when launched standalone (smokes/dev). (Desktop omits cwd — its
// launch dir is meaningless; see _ensure_session_db_row.)
cwd: launchCwd()
})
const sid = created?.session_id ?? gateway.sessionId()
if (!sid) {
getLog().warn('bootstrap', 'session.create returned no session_id')
return
}
if (created?.info) store.applyInfo(created.info)
writeActiveSession(sid) // record the new session for the launcher's exit epilogue (#5)
store.setSessionId(sid)
getLog().info('bootstrap', 'session created', { sid })
yield* postSessionSetup(gateway, store, sid, input.initialPrompt, input.initialImage)
})
/**
* Live session bootstrap: wait for the unsolicited `gateway.ready` handshake,
* then either RESUME a session (hydrate its transcript — incl. tool rows — via
* the snapshot, buffering live events across the RPC), open the resume PICKER
* (`resumeId === 'picker'` — bare `--resume`: no session is created until the
* user picks or closes; create is lazy), or CREATE a fresh one, and (if given)
* submit the initial prompt. Forked into the entry scope so it runs
* concurrently with the render + the quit-await. Any failure is logged and
* swallowed — a bootstrap hiccup must never tear down the rendered UI.
*/
const bootstrapSession = (gateway: GatewayServiceShape, store: SessionStore, input: TuiInput) =>
Effect.gen(function* () {
const log = getLog()
let waited = 0
while (!store.state.ready && waited < READY_TIMEOUT_MS) {
yield* Effect.sleep(READY_POLL)
waited += 100
}
if (!store.state.ready) {
log.warn('bootstrap', 'no gateway.ready within timeout', { waited })
return
}
if (input.resumeId === 'picker') {
// Boot picker (design doc §A): opens BEFORE any session.create. The pick
// resumes via onResume (which then runs postSessionSetup); a close
// without a pick falls back to createFreshSession (onSessionPickerClosed).
store.openSessionPicker('recent')
return
}
if (input.resumeId) {
let sid: string | undefined = input.resumeId
if (sid === 'recent' || sid === 'last') {
const recent = yield* gateway.request<{ session_id?: string }>('session.most_recent', {})
sid = recent?.session_id
}
if (!sid) {
log.warn('bootstrap', 'no session to resume', { resumeId: input.resumeId })
return
}
yield* resumeInto(gateway, store, sid, input.cols)
yield* postSessionSetup(gateway, store, sid, input.initialPrompt, input.initialImage)
return
}
yield* createFreshSession(gateway, store, input)
}).pipe(Effect.catchCause(cause => Effect.sync(() => getLog().warn('bootstrap', 'failed', { cause: String(cause) }))))
/** The entry Effect. Mirrors opencode `app.tsx:177` `run = Effect.fn("Tui.run")`. */
export const run = Effect.fn('Tui.run')(function* (input: TuiInput) {
yield* Effect.scoped(
Effect.gen(function* () {
// Solid side: the store + reducer. Created here, lives in Solid-land.
const store = createSessionStore()
// Prompt history (item 6): scoped to the launch directory so prior prompts
// from the same project dir are recallable (Up/Down), without bleeding
// across different dirs. process.cwd() is the user's launch dir under the
// real launcher.
const historyCwd = process.cwd()
const history = createPromptHistory({
initial: loadDirHistory(historyCwd),
persist: dirHistoryPersister(historyCwd)
})
// Pasted-text store — created ONCE here so it survives the composer
// remounting (overlay open/close); a per-composer store would lose a
// pending `[Pasted text #N]` mid-compose and submit would send it literally.
const pasteStore = createPasteStore()
// Contact point #2: boundary pushes decoded events into the Solid store.
// The callback ALSO drives auto-heal re-resume: a post-crash gateway.ready
// (i.e. one that follows a gateway.exited, so `recoverSid` is set) re-resumes
// the session so the transcript continues. The INITIAL gateway.ready has
// `recoverSid === undefined`, so the normal bootstrap path is untouched.
const gateway = yield* GatewayService
let recoverSid: string | undefined
yield* gateway.subscribe(event => {
store.apply(event)
if (event.type === 'gateway.exited') {
recoverSid = gateway.sessionId() ?? recoverSid
} else if (event.type === 'gateway.ready' && recoverSid !== undefined) {
const sid = recoverSid
recoverSid = undefined
Effect.runFork(
resumeInto(gateway, store, sid, input.cols).pipe(
Effect.catchCause(cause =>
Effect.sync(() => getLog().warn('recover', 'resume failed', { cause: String(cause) }))
)
)
)
}
})
// ── Ctrl+C state machine (item 11) ──────────────────────────────────
// While a turn runs, the first Ctrl+C STOPS the agent (session.interrupt);
// a second Ctrl+C within QUIT_WINDOW_MS (or when idle) KILLS the TUI. The
// debounce stops a stray Ctrl+C from nuking the session (opencode's
// double-press model; the user's preferred behaviour).
let quitArmed = false
let quitTimer: ReturnType<typeof setTimeout> | undefined
let doQuit = () => {} // assigned once the renderer exists
const disarmQuit = () => {
quitArmed = false
if (quitTimer) clearTimeout(quitTimer)
quitTimer = undefined
store.setHint(undefined)
}
const armQuit = (message: string) => {
quitArmed = true
store.setHint(message)
if (quitTimer) clearTimeout(quitTimer)
quitTimer = setTimeout(disarmQuit, QUIT_WINDOW_MS)
}
const interruptTurn = () => {
const sid = gateway.sessionId()
if (!sid) return
Effect.runFork(
gateway
.request('session.interrupt', { session_id: sid })
.pipe(
Effect.catchCause(cause =>
Effect.sync(() => getLog().warn('interrupt', 'failed', { cause: String(cause) }))
)
)
)
}
const onCtrlC = () => {
if (quitArmed) {
disarmQuit()
doQuit()
return
}
if (store.state.info.running) {
interruptTurn()
armQuit('⏹ stopped — Ctrl+C again to quit')
} else {
armQuit('Ctrl+C again to quit')
}
}
// Transient hint that auto-clears (used by copy/image-paste feedback).
const flashHint = (message: string, ms = 1500) => {
store.setHint(message)
setTimeout(() => {
if (store.state.hint === message) store.setHint(undefined)
}, ms)
}
// Copy a mouse selection to the clipboard (item 1) — OSC 52 + native command.
// Copies exactly the rendered text the user highlighted (markers are concealed
// in the pretty render; the `/copy` command copies a full response's source).
const onCopySelection = (text: string) => {
void writeClipboard(text)
flashHint('Copied selection')
}
// Paste an IMAGE (item 1): read the clipboard image and attach it to the
// session (image.attach_bytes); the next prompt.submit picks it up.
const onImagePaste = () => {
void (async () => {
const img = await readClipboardImage()
if (!img) {
flashHint('No image in clipboard', 2000)
return
}
const sid = gateway.sessionId()
if (!sid) {
flashHint('No session for image', 2000)
return
}
try {
await Effect.runPromise(
gateway.request('image.attach_bytes', {
content_base64: img.data,
filename: 'pasted.png',
session_id: sid
})
)
flashHint('🖼 image attached — type a message and send', 3000)
} catch {
flashHint('Image attach failed', 2000)
}
})()
}
// A blocking prompt owns Ctrl+C (→ cancel); otherwise the state machine above runs.
const { renderer, shutdown } = yield* acquireRenderer({
mouse: input.mouse,
isBlocked: () => store.state.prompt !== undefined,
onCtrlC,
onCopySelection
})
// Fleet memory self-sampling (HERMES_TUI_MEMLOG / diagnostics master
// switch — boundary/memlog.ts). Scoped acquire→release like the renderer.
const stopMemlog = startMemlog()
yield* Effect.addFinalizer(() => Effect.sync(stopMemlog))
// Proactive idle GC (W2) — opt-in via a low HERMES_TUI_HEAP_MB (no-op on
// the default path). Idle-gated on the store's streaming flag so it never
// collects mid-reply. Scoped release like memlog.
const proactiveGc = startProactiveGc(() => store.state.info.running === true)
yield* Effect.addFinalizer(() => Effect.sync(proactiveGc.stop))
// Memory early-warning (#34095 parity) — surfaces a transcript system line
// when heap climbs abnormally fast below the OOM ceiling (the silent-death
// regime). ON by default: a KB user-facing safety heads-up, not a
// diagnostic dump. No auto heap-snapshot (memlog is the diagnosis path).
const stopMemoryMonitor = startMemoryMonitor(line => store.pushSystem(line))
yield* Effect.addFinalizer(() => Effect.sync(stopMemoryMonitor))
// HERMES_HEAPDUMP_ON_START (Ink parity): a deliberate baseline snapshot at
// boot. Bypasses the diagnostics master switch (you set it on purpose).
// Best-effort + synchronous (writeHeapSnapshot blocks V8) — a failure must
// never block launch.
if (heapdumpOnStart()) {
try {
const dump = performHeapdump()
store.pushSystem(`heap snapshot written: ${dump.path}`)
} catch (cause) {
getLog().warn('bootstrap', 'heapdump-on-start failed', { cause: String(cause) })
}
}
doQuit = () => {
if (!renderer.isDestroyed) renderer.destroy()
}
// Native keymap host (Phase 3): one keymap bound to this renderer, provided
// to the whole Solid tree via <KeymapProvider>. Overlays/prompts register
// close (and confirm) layers against it through useCloseLayer/useBindings.
const keymap = createDefaultOpenTuiKeymap(renderer)
// Submit a user turn: the service value is in hand, so `gateway.request(...)`
// is Effect<…, never> — fire it detached with runFork; failures are logged.
const submitPrompt = (text: string) => {
store.pushUser(text)
const sid = gateway.sessionId()
if (!sid) {
getLog().warn('submit', 'no session yet — dropping prompt', { text })
return
}
Effect.runFork(
gateway
.request('prompt.submit', { session_id: sid, text })
.pipe(
Effect.catchCause(cause => Effect.sync(() => getLog().warn('submit', 'failed', { cause: String(cause) })))
)
)
}
// `!cmd` — run a shell command directly (Ink/free-code parity: F9). The
// gateway's `shell.exec` runs it (30s timeout, dangerous/hardline guards)
// and returns {stdout, stderr, code}; we echo the invocation as a user line
// and the combined output (or the error / non-zero exit) as a system line.
// No model turn — this never hits prompt.submit. Detached like submitPrompt.
const runShell = (cmd: string) => {
if (!cmd) return
store.pushUser(`!${cmd}`)
Effect.runFork(
gateway.request<{ stdout?: string; stderr?: string; code?: number }>('shell.exec', { command: cmd }).pipe(
Effect.tap(r =>
Effect.sync(() => {
const out = [r.stdout, r.stderr].filter(Boolean).join('\n').trimEnd()
if (out) store.pushSystem(out)
if ((r.code ?? 0) !== 0 || !out) store.pushSystem(`exit ${r.code ?? 0}`)
})
),
Effect.catchCause(cause =>
Effect.sync(() => {
getLog().warn('shell', 'failed', { cause: String(cause) })
store.pushSystem(`error: ${String(cause)}`)
})
)
)
)
}
// Resume a chosen session (resume picker pick or `/resume <id>` direct
// path) — the same hydrate path as launch. When the picker was the BOOT
// surface (bare `--resume`), no create ever ran, so the post-session
// setup (catalog, /model prefetch) runs here exactly once.
const onResume = (resumeSid: string) => {
Effect.runFork(
Effect.gen(function* () {
yield* resumeInto(gateway, store, resumeSid, input.cols)
if (!store.state.catalog) yield* postSessionSetup(gateway, store, resumeSid)
}).pipe(
Effect.catchCause(cause => Effect.sync(() => getLog().warn('resume', 'failed', { cause: String(cause) })))
)
)
}
// The resume picker's gateway calls (view/overlays/sessionPicker.tsx).
// `rename` goes through `session.title` — the existing title RPC (it
// reaches only LIVE gateway sessions; the picker surfaces rejections).
const sessionOps: SessionPickerOps = {
list: params => Effect.runPromise(gateway.request('session.list', params)),
peek: sessionId => Effect.runPromise(gateway.request('session.peek', { session_id: sessionId })),
rename: (sessionId, title) =>
Effect.runPromise(gateway.request('session.title', { session_id: sessionId, title })).then(() => undefined)
}
// The background-process panel's gateway calls (view/overlays/backgroundPanel.tsx):
// `agents.list` lists the OS process registry; `process.stop` kills ALL of them
// (the gateway exposes kill-all only — no per-process RPC, hence no per-row kill).
const backgroundOps = {
list: () => Effect.runPromise(gateway.request('agents.list', {})).then(parseProcessList),
stopAll: () => Effect.runPromise(gateway.request('process.stop', {})).then(() => undefined)
}
// Boot-picker Esc fallback: the picker closed without a pick and no
// session exists yet (bare `--resume` launch) — create a fresh one so
// the composer has somewhere to send prompts.
const onSessionPickerClosed = () => {
if (gateway.sessionId()) return
Effect.runFork(
createFreshSession(gateway, store, input).pipe(
Effect.catchCause(cause =>
Effect.sync(() => getLog().warn('bootstrap', 'post-picker create failed', { cause: String(cause) }))
)
)
)
}
// Slash dispatch context (Solid logic; the boundary just hands it a
// Promise-returning `request` + the host capabilities it needs).
const slashCtx: SlashContext = {
clearTranscript: () => store.clearTranscript(),
compact: () => store.state.compact,
setCompact: on => store.setCompact(on),
details: () => store.state.details,
setDetails: mode => store.setDetails(mode),
renderableCount: () => {
try {
return descendantCount(renderer.root)
} catch {
return undefined
}
},
// HERMES_TUI_NO_CONFIRM (Ink parity): skip the destructive-action confirm
// step and run the action immediately. Read per call so a wrapper that
// mutates env before launch sees the live value.
confirm: (message, onConfirm) => (noConfirmDestructive() ? onConfirm() : store.setConfirm(message, onConfirm)),
copyResponse: n => {
const text = nthAssistantResponse(store.state.messages, n)
if (!text) return false
void writeClipboard(text)
flashHint(n > 1 ? `Copied response #${n} to clipboard` : 'Copied response to clipboard')
return true
},
modelItems: () => store.state.modelItems,
setModelItems: items => store.setModelItems(items),
logTail: () =>
getLog()
.tail(200)
.map(e => `${e.scope}: ${e.msg}`),
openDashboard: () => store.openDashboard(),
openBackgroundPanel: () => store.openBackgroundPanel(),
addBgTask: id => store.addBgTask(id),
openPager: (title, text) => store.openPager(title, text),
openPicker: picker => store.openPicker(picker),
openSessionPicker: tab => store.openSessionPicker(tab),
resumeSession: onResume,
pushSystem: text => store.pushSystem(text),
quit: () => {
if (!renderer.isDestroyed) renderer.destroy()
},
request: (method, params) => Effect.runPromise(gateway.request(method, params)),
sessionId: () => gateway.sessionId(),
submit: submitPrompt
}
// The composer's submit: `!cmd` runs a shell command (F9), `/command`
// routes through the slash ladder, else a prompt turn.
const submit = (text: string) => {
const route = classifySubmit(text)
if (route.kind === 'shell') runShell(route.payload)
else if (route.kind === 'slash') void dispatchSlash(route.payload, slashCtx)
else submitPrompt(route.payload)
}
// Live completions (items 5 + 13): a `/command [args]` line queries
// `complete.slash` (the gateway completes names AND args); a trailing
// path-like word queries `complete.path` (file/@-mention tagging). The
// accepted item replaces from the gateway's `replace_from` (or the token
// start), so only the relevant token is spliced — not the whole line.
// Fired per keystroke (a debounce is a polish item).
//
// Out-of-order guard (glitch 2026-06-14): the gateway transport does NOT
// guarantee in-order response delivery, and these RPCs fire per keystroke
// with no debounce — a slow earlier `complete.slash` could resolve AFTER a
// later `@`-mention `complete.path` and clobber the store, blanking the
// `@` dropdown ("a leading /path message breaks @-mentions afterward").
// The completion gate (claimed on EVERY call, before the clear branch, so
// an intermediate keystroke that fires no RPC still invalidates the older
// in-flight one) drops any response a newer keystroke has superseded.
const completionGate = createCompletionGate()
const onType = (text: string, cursor: number = text.length) => {
const token = completionGate.claim()
const plan = planCompletion(text, cursor)
if (!plan) {
store.clearCompletions()
return
}
Effect.runPromise(gateway.request(plan.method, plan.params))
.then(result => {
if (!completionGate.isCurrent(token)) return // a newer keystroke superseded this query
store.setCompletions(mapCompletions(result), readReplaceFrom(result, plan.from))
})
.catch(() => {
if (!completionGate.isCurrent(token)) return
store.clearCompletions()
})
}
// Blocking-prompt replies (clarify/approval/sudo/secret `*.respond`). Same
// detached-runFork pattern; failures logged, never thrown into the view.
const respond = (method: string, params: Record<string, unknown>) => {
Effect.runFork(
gateway
.request(method, params)
.pipe(
Effect.catchCause(cause =>
Effect.sync(() => getLog().warn('respond', 'failed', { cause: String(cause), method }))
)
)
)
}
// Live backend: drive a session (create + optional initial prompt) concurrently.
if (!input.fake) yield* Effect.forkScoped(bootstrapSession(gateway, store, input))
// (No ambient OS-process poll: the `bg:` badge now counts in-flight
// background-PROMPT tasks from the event stream, and the /processes panel
// fetches `agents.list` on open. Nothing to poll for.)
// Contact point #1: the single render bridge. After this, the screen is Solid's.
// The theme is sourced reactively from the store (skin events update it).
yield* Effect.promise(() =>
render(
() => (
<KeymapProvider keymap={keymap}>
<ThemeProvider theme={() => store.state.theme}>
<TerminalChrome store={store} />
<App
store={store}
onSubmit={submit}
onType={onType}
onRespond={respond}
onResume={onResume}
sessionOps={sessionOps}
onSessionPickerClosed={onSessionPickerClosed}
sessionId={() => gateway.sessionId()}
history={history}
onImagePaste={onImagePaste}
pasteStore={pasteStore}
backgroundOps={backgroundOps}
/>
</ThemeProvider>
</KeymapProvider>
),
renderer
)
)
// Block until the renderer is destroyed (Ctrl+C / quit); finalizers then run.
yield* Deferred.await(shutdown)
})
)
})
/** Scripted "hello" stream so the fake backend paints a non-empty frame offline. */
function streamHello(controller: FakeGatewayController): void {
controller.emit({ type: 'gateway.ready' })
controller.emit({ type: 'message.start' })
for (const chunk of ['Hi ', 'there, ', 'glitch!']) {
controller.emit({ type: 'message.delta', payload: { text: chunk } })
}
controller.emit({ type: 'message.complete' })
}
if (import.meta.main) {
const fake = envFlag(process.env.HERMES_TUI_FAKE, false)
const cols = process.stdout.columns || 80
// `hermes --tui "prompt"` / `--image` seed: the launcher sets HERMES_TUI_QUERY
// (+ HERMES_TUI_IMAGE); we also honor HERMES_TUI_PROMPT (OpenTUI alias) and a
// bare argv tail (standalone dev). See logic/env.ts startupPrompt/startupImage.
const initialPrompt = startupPrompt()
const initialImage = startupImage()
const resumeId = process.env.HERMES_TUI_RESUME?.trim()
// Mouse on by default. Defers to Ink's env surface (HERMES_TUI_MOUSE_TRACKING >
// HERMES_TUI_DISABLE_MOUSE > HERMES_TUI_MOUSE alias > default on). See env.ts.
const mouse = resolveMouseEnabled()
const base = { mouse, fake, cols }
const withPrompt = initialPrompt ? { ...base, initialPrompt } : base
const withImage = initialImage ? { ...withPrompt, initialImage } : withPrompt
const input: TuiInput = resumeId ? { ...withImage, resumeId } : withImage
const onFatal = (error: unknown) => {
getLog().error('entry', 'fatal', { error: String(error) })
process.exitCode = 1
}
if (fake) {
const { layer, controller } = makeFakeGatewayLayer()
// Drive the fake stream shortly after mount so the subscription is live.
setTimeout(() => streamHello(controller), 50)
Effect.runPromise(run(input).pipe(Effect.provide(makeAppLayer(layer)))).catch(onFatal)
} else {
Effect.runPromise(run(input).pipe(Effect.provide(makeAppLayer(liveGatewayLayer)))).catch(onFatal)
}
}

View File

@@ -0,0 +1,129 @@
/**
* Background-activity logic — pure parsers + derive helpers for the "ambient
* activity" feature (notifications, long-running processes, background runs).
* No state container here: the store owns the arrays; these functions parse
* loose wire payloads (everything off the gateway is `unknown`) and compute
* derived values over immutable arrays. Mirrors the defensive loose-read style
* of `logic/slash.ts` (`readStr`) and the snake_case→camel mapping the wire
* needs.
*
* Wire shapes (see boundary/schema/GatewayEvent.ts ~134):
* notification.show payload {text, level, kind, ttl_ms, key, id} (loose Record)
* notification.clear payload {key}
* agents.list result {processes:[{session_id, command, status, uptime_seconds}]}
*/
export interface ActivityNotification {
id: string
key?: string
text: string
level: 'info' | 'warn' | 'error' | 'success'
kind: string
ttlMs?: number
}
export interface BackgroundProcess {
sessionId: string
command: string
status: string
uptimeSeconds: number
}
/** Loose-read a string field off an `unknown` object (slash.ts `readStr` style). */
function readStr(value: unknown, key: string): string | undefined {
if (!value || typeof value !== 'object') return undefined
const v = (value as { [k: string]: unknown })[key]
return typeof v === 'string' ? v : undefined
}
/** Loose-read a finite number off an `unknown` object. */
function readNum(value: unknown, key: string): number | undefined {
if (!value || typeof value !== 'object') return undefined
const v = (value as { [k: string]: unknown })[key]
return typeof v === 'number' && Number.isFinite(v) ? v : undefined
}
/** Coerce any wire `level` to the closed union; anything that isn't a known
* level (absent, garbage, wrong-typed) falls back to 'info'. */
function coerceLevel(value: unknown): ActivityNotification['level'] {
return value === 'warn' || value === 'error' || value === 'success' ? value : 'info'
}
/** A chrome notice (status-bar banner with lifecycle), distinguished from an
* inline card by its lifecycle kind. Credits/usage notices set kind sticky|ttl;
* process/background cards use label kinds (process.complete, etc.). */
export function isChromeNotice(n: ActivityNotification): boolean {
return n.kind === 'sticky' || n.kind === 'ttl'
}
/**
* Parse a `notification.show` payload (unknown) → ActivityNotification, or null
* when there's no usable text (text is the load-bearing field — without it the
* card has nothing to show). Maps snake_case `ttl_ms` → `ttlMs`, coerces a
* garbage/missing `level` to 'info', and defaults `kind` to ''.
*
* id resolution: prefer the wire `id`, then fall back to `key`, else synthesize
* `id = `n:${text}`` (a stable, text-derived id rather than a random one). The
* original `key` (if any) is preserved separately so notification.clear by key
* still targets the right cards.
*/
export function parseNotification(payload: unknown): ActivityNotification | null {
const text = readStr(payload, 'text')
if (!text) return null
const key = readStr(payload, 'key')
const id = readStr(payload, 'id') ?? key ?? `n:${text}`
const out: ActivityNotification = {
id,
kind: readStr(payload, 'kind') ?? '',
level: coerceLevel((payload as { level?: unknown } | null | undefined)?.level),
text
}
if (key !== undefined) out.key = key
const ttlMs = readNum(payload, 'ttl_ms')
if (ttlMs !== undefined) out.ttlMs = ttlMs
return out
}
/** Parse an `agents.list` result ({processes:[...]}) → BackgroundProcess[],
* skipping malformed rows (a row missing session_id/command is dropped, not
* defaulted). snake_case `session_id`/`uptime_seconds` → camelCase; a missing
* uptime defaults to 0, a missing status to ''. */
export function parseProcessList(result: unknown): BackgroundProcess[] {
if (!result || typeof result !== 'object') return []
const processes = (result as { processes?: unknown }).processes
if (!Array.isArray(processes)) return []
const out: BackgroundProcess[] = []
for (const row of processes) {
const sessionId = readStr(row, 'session_id')
const command = readStr(row, 'command')
if (!sessionId || !command) continue
out.push({
command,
sessionId,
status: readStr(row, 'status') ?? '',
uptimeSeconds: readNum(row, 'uptime_seconds') ?? 0
})
}
return out
}
/** Terminal (no-longer-running) process statuses. A process whose status is
* NOT one of these is treated as running — leniently, because the gateway's
* status vocabulary is open-ended and we'd rather over-count the ambient badge
* than silently hide a still-live process under an unfamiliar status string.
* Matched case-insensitively after trimming. */
/** Terminal (no-longer-running) process statuses — exported as the single
* source of truth (the panel imports `procIsRunning` rather than re-declaring). */
export const DONE_STATUSES = new Set(['exited', 'failed', 'complete', 'done', 'killed'])
/** Whether a process status is "running-ish": NOT one of DONE_STATUSES. Lenient
* by design — the gateway's status vocabulary is open-ended, so we over-count
* rather than hide a live process under an unfamiliar status. Case-insensitive. */
export function procIsRunning(status: string): boolean {
return !DONE_STATUSES.has(status.trim().toLowerCase())
}
/** Count of running background processes (the ambient `bg:` badge). */
export function runningCount(procs: readonly BackgroundProcess[]): number {
return procs.filter(p => procIsRunning(p.status)).length
}

View File

@@ -0,0 +1,66 @@
/**
* Completion-menu key routing (Epic 8) — the pure decision table for the
* composer's completions dropdown, kept out of the view so the precedence
* rules are unit-testable.
*
* Precedence (the hard part):
* - Tab accepts the highlighted item and Esc dismisses whenever ANY menu is
* open (slash-command OR path/@-mention) — the pre-Epic-8 semantics.
* - Up/Down move the highlight (wrapping) and Enter accepts it ONLY for the
* SLASH menu (the composer's first token starts with `/`). On a path menu
* — or with a Ctrl/Alt-modified key — they `pass`, keeping their existing
* meanings (prompt history, cursor moves, textarea submit).
* - A closed menu (`count === 0`) always passes.
*
* The caller owns the side effects: `move` updates the selection signal,
* `accept` splices the item into the composer (then arg-completion continues
* as before), `dismiss` clears the candidates, `pass` falls through to the
* history/cursor handling.
*/
/** Max dropdown rows shown (the view slices candidates to this). */
export const MENU_MAX = 8
export interface MenuKeyContext {
/** Number of VISIBLE candidates (already capped at MENU_MAX). */
count: number
/** The currently highlighted row. */
selected: number
/** Whether this is the slash-command menu (composer text starts with `/`). */
slashMenu: boolean
}
export type MenuKeyAction =
| { kind: 'move'; selected: number }
| { kind: 'accept'; index: number }
| { kind: 'dismiss' }
| { kind: 'pass' }
const PASS: MenuKeyAction = { kind: 'pass' }
/** Clamp the selection into the visible range (a shrunken list can strand it). */
function clampSelected(ctx: MenuKeyContext): number {
return Math.min(Math.max(0, ctx.selected), ctx.count - 1)
}
/**
* Route one key press against the open menu. `modified` is Ctrl/Alt/Option —
* modified arrows/Enter never belong to the menu (Tab/Esc keep their
* pre-existing modifier-blind accept/dismiss semantics).
*
* ANY open menu owns plain arrows/Enter (glitch, 2026-06-10): @-path and
* arg menus navigate exactly like the slash menu — standard editor-
* autocomplete behavior; Esc dismisses to hand the cursor keys back.
* (`ctx.slashMenu` still feeds the hint text + suggestion rows.)
*/
export function routeMenuKey(name: string, modified: boolean, ctx: MenuKeyContext): MenuKeyAction {
if (ctx.count <= 0) return PASS
if (name === 'tab') return { index: clampSelected(ctx), kind: 'accept' }
if (name === 'escape') return { kind: 'dismiss' }
if (modified) return PASS
const sel = clampSelected(ctx)
if (name === 'up') return { kind: 'move', selected: (sel - 1 + ctx.count) % ctx.count }
if (name === 'down') return { kind: 'move', selected: (sel + 1) % ctx.count }
if (name === 'return') return { index: sel, kind: 'accept' }
return PASS
}

View File

@@ -0,0 +1,41 @@
/**
* Assistant-text extraction (the `/copy [n]` command's pure logic). An assistant
* turn's answer lives in `parts` (the `type:'text'` fragments, concatenated) while
* live, OR in `.text` once settled/resumed. We copy the ANSWER only — reasoning and
* tool parts are excluded. `nthAssistantResponse` indexes newest-first (1-based).
*
* NB: mouse-selection copies the RENDERED text verbatim (native OpenTUI selection,
* `selection.getSelectedText()`), not markdown source — markers are concealed in the
* pretty render and can't be recovered from a partial selection (user's choice). The
* source-bearing path is this `/copy` command, which copies a whole response's source.
*/
import type { Message } from './store.ts'
/** The answer text of one message: concat the `text` parts (trimmed) when live, else `.text`. */
export function messageText(m: Message): string {
if (m.parts && m.parts.length) {
return m.parts
.filter(p => p.type === 'text')
.map(p => p.text)
.join('')
.trim()
}
return m.text
}
/** Newest-first list of the non-empty answer text for every assistant message. */
export function assistantResponses(messages: Message[]): string[] {
const out: string[] = []
for (let i = messages.length - 1; i >= 0; i--) {
const m = messages[i]
if (!m || m.role !== 'assistant') continue
const text = messageText(m)
if (text) out.push(text)
}
return out
}
/** The n-th newest assistant response (1-based; n=1 → last). `undefined` if out of range. */
export function nthAssistantResponse(messages: Message[], n: number): string | undefined {
return assistantResponses(messages)[n - 1]
}

View File

@@ -0,0 +1,12 @@
/**
* deferClose — defer an overlay/prompt close by one tick.
*
* Overlays REPLACE the composer (a `<Switch>`), so when one closes the composer
* remounts + refocuses. Running the close on the NEXT tick lets the current
* key/close event (Esc/q/Enter/y/select) finish dispatching first, so the
* keystroke that triggered the close can't leak into the freshly-focused
* composer (e.g. `/clear`→y once left a stray "y" in the input).
*/
export function deferClose(fn: () => void): void {
setTimeout(fn, 0)
}

View File

@@ -0,0 +1,84 @@
/**
* Global detail-mode logic (/details — Epic 3 utility-command port; mirrors Ink
* `domain/details.ts`, GLOBAL mode only — per-section overrides are explicitly
* deferred). The mode drives how the transcript treats tool + reasoning rows:
*
* - `collapsed` (default): today's behaviour — headers with click-to-expand.
* - `expanded`: tool bodies + settled reasoning previews default-OPEN.
* - `hidden`: tool/reasoning runs reduce to ONE muted `⚡ N tools hidden`-style
* line per run (never silently dropped — flipping the mode back restores,
* since the parts stay in the store untouched).
*
* Pure data + functions; the store carries the flag, messageLine/toolPart/
* reasoningPart read it via the display context.
*/
import type { Part } from './store.ts'
export type DetailsMode = 'hidden' | 'collapsed' | 'expanded'
/** Cycle order (Ink parity: hidden → collapsed → expanded → hidden). */
export const DETAILS_MODES = ['hidden', 'collapsed', 'expanded'] as const
/** Gateway `complete.slash` suggests these per-section names after `/details ` —
* recognized so picking one yields an honest "not supported yet" notice instead
* of the generic usage line (per-section overrides are deferred). */
export const DETAILS_SECTIONS = ['thinking', 'tools', 'subagents', 'activity'] as const
export const DETAILS_USAGE = 'usage: /details [hidden|collapsed|expanded|cycle]'
/** Parse a mode word; null for anything unrecognized (non-strings included). */
export function parseDetailsMode(v: unknown): DetailsMode | null {
if (typeof v !== 'string') return null
const norm = v.trim().toLowerCase()
return DETAILS_MODES.find(m => m === norm) ?? null
}
/** The next mode in the cycle (`/details cycle`). */
export function nextDetailsMode(m: DetailsMode): DetailsMode {
return DETAILS_MODES[(DETAILS_MODES.indexOf(m) + 1) % DETAILS_MODES.length] ?? 'collapsed'
}
/** One collapsed RUN of consecutive tool/reasoning parts (hidden mode). */
export interface HiddenRun {
type: 'hiddenRun'
/** Stable-ish key: the first hidden part's id. */
id: string
tools: number
thoughts: number
}
/** What the transcript renders per part slot: a real part, or a hidden-run marker. */
export type DisplayPart = Part | HiddenRun
/**
* Hidden mode: keep text parts, fold each consecutive run of tool/reasoning
* parts into ONE HiddenRun marker (so a 5-tool fan-out reads as a single muted
* line, not 5 of them). Pure — the source parts are untouched, so switching
* the mode back restores everything.
*/
export function collapseHiddenParts(parts: readonly Part[]): DisplayPart[] {
const out: DisplayPart[] = []
let run: HiddenRun | undefined
for (const part of parts) {
if (part.type === 'text') {
run = undefined
out.push(part)
continue
}
if (!run) {
run = { id: `hidden-${part.id}`, thoughts: 0, tools: 0, type: 'hiddenRun' }
out.push(run)
}
if (part.type === 'tool') run.tools += 1
else run.thoughts += 1
}
return out
}
/** Muted one-liner for a hidden run: `2 tools · 1 thought hidden`. */
export function hiddenRunLabel(run: HiddenRun): string {
const segs: string[] = []
if (run.tools) segs.push(`${run.tools} tool${run.tools === 1 ? '' : 's'}`)
if (run.thoughts) segs.push(`${run.thoughts} thought${run.thoughts === 1 ? '' : 's'}`)
return `${segs.join(' · ')} hidden — /details collapsed to show`
}

View File

@@ -0,0 +1,82 @@
/**
* Process diagnostics for the /mem + /heapdump utility commands (Epic 3 port;
* Ink ref `app/slash/commands/debug.ts` + `lib/memory.ts`). Pure formatters plus
* the one impure seam (`performHeapdump` → `v8.writeHeapSnapshot`), kept out of
* slash.ts so the dispatcher stays plain and tests can mock `node:v8`.
*/
import { mkdirSync } from 'node:fs'
import { homedir } from 'node:os'
import { dirname, join } from 'node:path'
import { writeHeapSnapshot } from 'node:v8'
/** `123456789` → `117.7 MB` (binary units, one decimal above bytes). */
export function formatBytes(n: number): string {
if (!Number.isFinite(n) || n < 0) return '0 B'
if (n < 1024) return `${Math.round(n)} B`
const units = ['KB', 'MB', 'GB', 'TB'] as const
let v = n
let i = -1
do {
v /= 1024
i += 1
} while (v >= 1024 && i < units.length - 1)
return `${v.toFixed(1)} ${units[i]}`
}
/** Where heap snapshots land: `$HERMES_HOME`/`~/.hermes` + `logs/opentui-heap-<ts>.heapsnapshot`. */
export function heapSnapshotPath(now = new Date()): string {
const home = process.env.HERMES_HOME?.trim() || join(homedir(), '.hermes')
const ts = now.toISOString().replace(/[:.]/g, '-')
return join(home, 'logs', `opentui-heap-${ts}.heapsnapshot`)
}
export interface HeapdumpResult {
path: string
before: { heapUsed: number; rss: number }
after: { heapUsed: number; rss: number }
}
/**
* Write a V8 heap snapshot (SYNCHRONOUS — blocks the event loop while V8 walks
* the heap; that's inherent to writeHeapSnapshot) and report heap/rss before
* vs after. Throws on I/O failure — the caller renders the error.
*/
export function performHeapdump(): HeapdumpResult {
const before = process.memoryUsage()
const path = heapSnapshotPath()
mkdirSync(dirname(path), { recursive: true })
const written = writeHeapSnapshot(path)
const after = process.memoryUsage()
return {
after: { heapUsed: after.heapUsed, rss: after.rss },
before: { heapUsed: before.heapUsed, rss: before.rss },
path: written
}
}
export interface MemSnapshot {
heapUsed: number
heapTotal: number
external: number
arrayBuffers: number
rss: number
}
/**
* The /mem system-line body (Ink's Memory panel as aligned rows). `renderables`
* is the mounted-renderable count under the live renderer root (the store-cap
* diagnostic) — omitted when unavailable (e.g. no renderer in tests).
*/
export function memReport(usage: MemSnapshot, uptimeSeconds: number, renderables?: number): string {
const rows: Array<[string, string]> = [
['heap used', formatBytes(usage.heapUsed)],
['heap total', formatBytes(usage.heapTotal)],
['external', formatBytes(usage.external)],
['array buffers', formatBytes(usage.arrayBuffers)],
['rss', formatBytes(usage.rss)],
['uptime', `${Math.round(uptimeSeconds)}s`]
]
if (renderables !== undefined) rows.push(['renderables', String(renderables)])
const pad = Math.max(...rows.map(([k]) => k.length))
return ['memory', ...rows.map(([k, v]) => ` ${k.padEnd(pad)} ${v}`)].join('\n')
}

View File

@@ -0,0 +1,87 @@
/**
* Pure unified-diff helpers for the file-tool renderer (Epic 2.3). No
* OpenTUI/Solid imports — just string work, trivially unit-testable (like
* `toolOutput.ts`). The gateway ships the FULL raw unified diff on file-edit
* `tool.complete` (`diff_unified`); these helpers turn it into the collapsed
* `+N M` summary and per-file sections for the native `<diff>` renderable
* (which parses only the FIRST file of a multi-file diff — so we split).
*/
/** Added/removed line counts for the collapsed header summary (`+N M`). */
export interface DiffStats {
added: number
removed: number
}
/** Count changed lines in a unified diff, excluding the `+++`/`---` file headers. */
export function diffStats(diff: string): DiffStats {
let added = 0
let removed = 0
for (const line of diff.split('\n')) {
if (line.startsWith('+++') || line.startsWith('---')) continue
if (line.startsWith('+')) added++
else if (line.startsWith('-')) removed++
}
return { added, removed }
}
/**
* Path relative to the session cwd: exact prefix strip only (no `~` for home —
* deliberately simple). Paths outside cwd come back unchanged; the cwd itself
* becomes `.`. A trailing slash on cwd is tolerated.
*/
export function relativizePath(path: string, cwd?: string): string {
if (!path || !cwd) return path
const base = cwd.endsWith('/') && cwd !== '/' ? cwd.slice(0, -1) : cwd
if (path === base) return '.'
const prefix = base === '/' ? '/' : base + '/'
if (path.startsWith(prefix)) return path.slice(prefix.length) || '.'
return path
}
/** One file's section of a (possibly multi-file) unified diff. */
export interface DiffFileSection {
/** Target path from the `+++ b/…` header (or `--- a/…` for deletions); '' if unknown. */
path: string
/** The section's unified diff text, parseable on its own. */
diff: string
}
/** Extract the path from a `--- a/x` / `+++ b/x` header line ('' for /dev/null). */
function headerPath(line: string): string {
let p = line.slice(4).trim()
const tab = p.indexOf('\t') // difflib may append a date after a tab
if (tab !== -1) p = p.slice(0, tab)
if (!p || p === '/dev/null') return ''
if (p.startsWith('a/') || p.startsWith('b/')) p = p.slice(2)
return p
}
function sectionPath(lines: string[]): string {
const to = lines.find(l => l.startsWith('+++ '))
const from = lines.find(l => l.startsWith('--- '))
return (to ? headerPath(to) : '') || (from ? headerPath(from) : '')
}
/**
* Split a unified diff into per-file sections (the gateway concatenates one
* difflib diff per edited file; `patch`-mode diffs can also be multi-file). A
* new section starts at a `--- ` header — required to be FOLLOWED by `+++ `
* and to come after the current section's hunks, so removed lines that merely
* start with `--` can't split a file in half.
*/
export function splitUnifiedDiff(diff: string): DiffFileSection[] {
const lines = diff.replace(/\n$/, '').split('\n')
const sections: string[][] = []
let current: string[] = []
for (let i = 0; i < lines.length; i++) {
const line = lines[i] ?? ''
if (current.some(l => l.startsWith('@@')) && line.startsWith('--- ') && (lines[i + 1] ?? '').startsWith('+++ ')) {
sections.push(current)
current = []
}
current.push(line)
}
if (current.length > 0) sections.push(current)
return sections.filter(s => s.some(l => l.startsWith('@@'))).map(s => ({ diff: s.join('\n'), path: sectionPath(s) }))
}

222
ui-opentui/src/logic/env.ts Normal file
View File

@@ -0,0 +1,222 @@
/**
* env — shared boolean env-flag parsing (one source for the TRUE/FALSE regexes).
*
* Recognized truthy values: 1/true/yes/on; falsy: 0/false/no/off (case-insensitive,
* surrounding whitespace trimmed). Anything else (incl. unset) is "unrecognized".
*/
export const TRUE_RE = /^(?:1|true|yes|on)$/i
export const FALSE_RE = /^(?:0|false|no|off)$/i
/** Parse a boolean env var; returns `fallback` when unset/unrecognized. */
export function envFlag(value: string | undefined, fallback: boolean): boolean {
const v = value?.trim() ?? ''
if (TRUE_RE.test(v)) return true
if (FALSE_RE.test(v)) return false
return fallback
}
/**
* Tri-state toggle parse: `true`/`false` for a recognized value, `null` when
* unset/unrecognized (so a caller can fall through to the next precedence rung).
* Mirrors Ink's `parseToggle` (`ui-tui/src/config/env.ts`).
*/
export function envToggle(value: string | undefined): boolean | null {
const v = value?.trim() ?? ''
if (TRUE_RE.test(v)) return true
if (FALSE_RE.test(v)) return false
return null
}
/**
* Resolve whether mouse tracking is ON at boot, deferring to Ink's env surface
* (`ui-tui/src/config/env.ts`) so muscle memory + docs + support scripts carry
* over. Precedence (highest first):
* 1. `HERMES_TUI_MOUSE_TRACKING` (toggle) — the explicit force knob; beats all.
* 2. `HERMES_TUI_DISABLE_MOUSE=1` — the legacy Ink kill switch (off).
* 3. `HERMES_TUI_MOUSE` (toggle) — the OpenTUI-native alias (kept, rule 2);
* it's also what the launcher sets, so it stays a first-class boot knob.
* 4. default ON (opencode parity: wheel-scroll, drag-scrollbar, click-to-expand,
* text-aware selection).
* OpenTUI's renderer mouse is a single boolean, so Ink's granular off|wheel|
* buttons|all collapses to on/off here (any non-off tracking mode → on).
*/
export function resolveMouseEnabled(env: { readonly [k: string]: string | undefined } = process.env): boolean {
const trackingOverride = envToggle(env.HERMES_TUI_MOUSE_TRACKING)
if (trackingOverride !== null) return trackingOverride
if (envFlag(env.HERMES_TUI_DISABLE_MOUSE, false)) return false
const mouseAlias = envToggle(env.HERMES_TUI_MOUSE)
if (mouseAlias !== null) return mouseAlias
return true
}
/**
* The seeded initial prompt for `hermes --tui "prompt"` / `--image`.
*
* The launcher (`hermes_cli/main.py`) sets `HERMES_TUI_QUERY` (the established
* cross-engine contract Ink reads via `STARTUP_QUERY`); the OpenTUI engine also
* accepts `HERMES_TUI_PROMPT` as its own alias and a bare argv tail for
* standalone dev launches. QUERY wins (it's the launcher contract); PROMPT and
* argv are fallbacks. Empty → undefined.
*/
export function startupPrompt(
env: { readonly [k: string]: string | undefined } = process.env,
argv: readonly string[] = process.argv.slice(2)
): string | undefined {
const query = env.HERMES_TUI_QUERY?.trim()
if (query) return query
const prompt = env.HERMES_TUI_PROMPT?.trim()
if (prompt) return prompt
const tail = argv.join(' ').trim()
return tail || undefined
}
/**
* The seeded image PATH for `hermes --tui --image <path>`. The launcher sets
* `HERMES_TUI_IMAGE` (Ink reads it as `STARTUP_IMAGE` and `image.attach`es the
* path before submitting the query). Empty → undefined.
*/
export function startupImage(env: { readonly [k: string]: string | undefined } = process.env): string | undefined {
const image = env.HERMES_TUI_IMAGE?.trim()
return image || undefined
}
/** Ink's default prompt when an image is seeded with no query (`STARTUP_QUERY`). */
export const STARTUP_IMAGE_DEFAULT_PROMPT = 'What do you see in this image?'
/**
* `HERMES_TUI_NO_CONFIRM` — skip destructive-action confirm prompts (Ink parity,
* `ui-tui/src/config/env.ts` `NO_CONFIRM_DESTRUCTIVE`). When truthy, the `/clear`
* and `/new` confirm step is bypassed and the action runs immediately. Default
* off (confirm). Same name, same truthy parsing as Ink.
*/
export function noConfirmDestructive(env: { readonly [k: string]: string | undefined } = process.env): boolean {
return envFlag(env.HERMES_TUI_NO_CONFIRM, false)
}
/**
* `HERMES_HEAPDUMP_ON_START` — write a manual heap snapshot at boot (Ink parity).
* A diagnostic escape hatch that BYPASSES the diagnostics master switch (you set
* it deliberately to capture a baseline). Default off.
*/
export function heapdumpOnStart(env: { readonly [k: string]: string | undefined } = process.env): boolean {
return envFlag(env.HERMES_HEAPDUMP_ON_START, false)
}
/**
* `HERMES_TUI_SCROLL_SPEED` (or `CLAUDE_CODE_SCROLL_SPEED` for portability) —
* the wheel-scroll speed multiplier (Ink parity, `lib/wheelAccel.ts`
* `readScrollSpeedBase`). Default 1 (the engine's native scroll behavior is
* untouched), clamped to (0, 20]. Returns `null` when UNSET/garbage so the
* caller leaves OpenTUI's native scroll acceleration alone — only an explicit,
* in-range value installs a constant-multiplier override.
*/
export function scrollSpeedMultiplier(env: { readonly [k: string]: string | undefined } = process.env): number | null {
const raw = (env.HERMES_TUI_SCROLL_SPEED ?? env.CLAUDE_CODE_SCROLL_SPEED ?? '').trim()
if (!raw) return null
const n = Number.parseFloat(raw)
if (!Number.isFinite(n) || n <= 0) return null
return Math.min(n, 20)
}
/**
* The diagnostics master switch — `HERMES_TUI_DIAGNOSTICS` (default OFF).
*
* Gates the developer/profiling surface a regular user should never trip
* over: the diagnostic slash commands (`/mem`, `/heapdump`) and the default
* for `HERMES_TUI_WINDOW_STATS` (which can still be set individually). It is
* an enable switch, not a secret: anyone CAN set it (support flows say
* "relaunch with HERMES_TUI_DIAGNOSTICS=1"), it just keeps the day-to-day
* surface clean. Read per call so tests (and long-lived processes whose
* wrapper mutates env before launch) see the live value.
*/
export function diagnosticsEnabled(): boolean {
return envFlag(process.env.HERMES_TUI_DIAGNOSTICS, false)
}
/**
* Whether rich tool-call OUTPUTS are kept — `HERMES_TUI_TOOL_OUTPUTS` (default
* ON). OpenTUI's rich tool cards (full result body + raw result/args dicts) are
* its differentiator vs Ink, so they stay on for real users. Setting `=off`
* drops both the RENDER and the STORE of those bodies (exact Ink parity: Ink
* keeps only a short context line and discards the result/args dicts), which is
* the biggest memory lever — used by the bench (D8: a fair Ink-vs-OpenTUI
* engine-overhead comparison) and the low-mem mode. The redaction-safe
* `argsPreview` one-liner, name/duration/error, and file-edit diffs are KEPT
* either way (a diff is a high-value surface, not generic "output"). Read per
* call so a wrapper that mutates env before launch sees the live value.
*/
export function toolOutputsEnabled(): boolean {
return envFlag(process.env.HERMES_TUI_TOOL_OUTPUTS, true)
}
/**
* Parse `HERMES_TUI_TOOL_OUTPUT_LINES` (a TUI-only env var — deliberately NOT
* a config.yaml knob): how many output lines an expanded tool body shows.
* UNSET → Infinity (UNLIMITED — expanded tool output is uncapped by default;
* setting the var is how you RESTORE a cap, e.g. `=200`). A positive integer
* → that cap. `0` → Infinity too (back-compat: it was the old opt-in
* "unlimited" value). Garbage → Infinity (unrecognized ≙ no cap asked for —
* the semantic is "cap only when the user asked for one").
*/
export function envOutputLines(value: string | undefined): number {
const v = value?.trim() ?? ''
if (!/^\d+$/.test(v)) return Number.POSITIVE_INFINITY
const n = Number.parseInt(v, 10)
return n === 0 ? Number.POSITIVE_INFINITY : n
}
/**
* Default visible-height cap for the composer textarea, in rows (Ink composer
* parity — 8 lines, ref feature request #10418). Beyond this the textarea
* scrolls INTERNALLY (the native edit buffer keeps the cursor in view).
*/
export const COMPOSER_MAX_ROWS = 8
/**
* Parse `HERMES_TUI_COMPOSER_ROWS` (a TUI-only env var — deliberately NOT a
* config.yaml knob): the composer's visible-height cap before internal scroll
* kicks in. A positive integer → that cap; unset / `0` / garbage → the
* COMPOSER_MAX_ROWS default.
*/
export function envComposerRows(value: string | undefined): number {
const v = value?.trim() ?? ''
if (!/^\d+$/.test(v)) return COMPOSER_MAX_ROWS
const n = Number.parseInt(v, 10)
return n > 0 ? n : COMPOSER_MAX_ROWS
}
/**
* Whether NO line cap applies (unset / `0` / unparseable). When unlimited,
* the store prefers the always-full raw `result` over a gateway tail-capped
* `result_text` — an "unlimited" view of a tail would still be missing its
* head — see store.ts tool.complete. With an explicit finite cap the gateway
* tail (+ honest omitted note) is kept: the user asked for a bounded view.
*/
export function envOutputUnlimited(value: string | undefined): boolean {
return envOutputLines(value) === Number.POSITIVE_INFINITY
}
/**
* The session's launch directory for `session.create`'s `cwd` param.
*
* The hermes launcher runs the OpenTUI engine with its process cwd set to the
* engine's own package dir, so `process.cwd()` is NOT where the user ran
* hermes. The launcher exports the real launch dir as `HERMES_CWD` (and the
* gateway's `TERMINAL_CWD`); prefer those. Falls back to `process.cwd()` only
* for standalone launches (smokes/dev) where no launcher set them, and returns
* `undefined` when even that is empty so the gateway resolves its own default.
*/
export function launchCwd(env: { readonly [k: string]: string | undefined } = process.env): string | undefined {
// First NON-BLANK of the launcher's vars (?? would keep a blank HERMES_CWD
// and never reach TERMINAL_CWD).
for (const value of [env.HERMES_CWD, env.TERMINAL_CWD]) {
const trimmed = (value ?? '').trim()
if (trimmed) return trimmed
}
try {
const cwd = process.cwd().trim()
return cwd || undefined
} catch {
return undefined
}
}

View File

@@ -0,0 +1,140 @@
/**
* fuzzy.ts — pure fuzzy filtering + grouped presentation for picker overlays
* (Epic 7 model picker v2; resume-session picker; skills hub). Matching/ranking
* is delegated to `fuzzysort` (the library opencode uses in production, see its
* dialog-select.tsx) through a thin adapter that preserves this module's API:
* call sites pass weighted `FuzzyField[]` haystacks and get back a ranked list.
*
* Adapter semantics on top of fuzzysort:
* - Multi-key scoring à la opencode: each field is a fuzzysort key; the final
* score is the weight-multiplied SUM of per-key scores (label conventionally
* ×2, opencode's `r[0].score * 2 + r[1].score`), so label hits outrank
* equal-quality group/slug hits.
* - Multi-term AND (a feature of the old hand-rolled scorer that fuzzysort
* lacks natively): the query is whitespace-split and fuzzysort runs once per
* term over the progressively-filtered pool — every term must match at least
* one field; per-term scores accumulate. Chosen over a joined single needle
* because it keeps `anthropic son` / `copilot son` matching ACROSS fields.
* - Empty/blank query → all items in catalog order (fuzzysort returns nothing
* for an empty needle; the old all-rows behavior is preserved here).
* - Equal final scores keep catalog order (fuzzysort's sort is not stable; the
* adapter re-sorts with the original index as tie-break).
*/
import fuzzysort from 'fuzzysort'
/** One searchable field of an item (e.g. model id ×2, provider slug, lab name). */
export interface FuzzyField {
text: string
/** Score multiplier (default 1). The primary label is conventionally 2. */
weight?: number
}
/** Pool entry: the item plus its precomputed fields, catalog position and the
* per-term accumulated score. */
interface Entry<T> {
item: T
at: number
fields: FuzzyField[]
total: number
}
/**
* Filter + rank items by query. Empty query → the items in catalog order;
* otherwise matches sorted by score (descending), ties keeping catalog order.
* Every whitespace-split term must fuzzy-match at least one field.
*/
export function fuzzyFilter<T>(query: string, items: readonly T[], fieldsOf: (item: T) => FuzzyField[]): T[] {
const terms = query.trim().split(/\s+/).filter(Boolean)
if (!terms.length) return [...items]
let pool: Entry<T>[] = items.map((item, at) => ({ at, fields: fieldsOf(item), item, total: 0 }))
// Items may carry different field counts (description/haystacks optional):
// one key per field slot, missing slots read as '' (never match).
const keyCount = pool.reduce((max, e) => Math.max(max, e.fields.length), 0)
const keys = Array.from({ length: keyCount }, (_, i) => (e: Entry<T>) => e.fields[i]?.text ?? '')
for (const term of terms) {
const results = fuzzysort.go(term, pool, {
keys,
// Weighted sum of per-key scores (unmatched keys score 0). Inclusion is
// decided by fuzzysort (≥1 key must match); this only ranks.
scoreFn: r => {
let sum = 0
for (let i = 0; i < r.length; i++) sum += (r[i]?.score ?? 0) * (r.obj.fields[i]?.weight ?? 1)
return sum
}
})
if (!results.length) return []
pool = results.map(r => {
r.obj.total += r.score
return r.obj
})
}
pool.sort((a, b) => b.total - a.total || a.at - b.at)
return pool.map(e => e.item)
}
/** A render row of a grouped picker: a non-selectable group header or an item.
* `index` is the item's position in the flat ARROW-TRAVERSAL order; `-1` marks
* a non-selectable item row (rendered dimmed, skipped by traversal). */
export type PickerRow<T> = { kind: 'header'; label: string } | { kind: 'item'; item: T; index: number }
/**
* Group items for display (group order = first appearance, so a score-sorted
* input puts the best group first). Returns the header+item render rows and
* the flat selectable list in traversal order — arrows walk `flat` and thus
* cross group boundaries seamlessly; headers are never selectable. Items
* without a group render headerless (e.g. the skills picker). Items failing
* `selectableOf` (picker v2.1: unconfigured-provider hint rows) still RENDER
* (index `-1`) but never enter `flat`, so ↑↓ traversal skips them.
*/
export function buildPickerRows<T>(
items: readonly T[],
groupOf: (item: T) => string | undefined,
selectableOf: (item: T) => boolean = () => true
): { rows: PickerRow<T>[]; flat: T[] } {
const order: string[] = []
const buckets = new Map<string, T[]>()
for (const item of items) {
const group = groupOf(item) ?? ''
let bucket = buckets.get(group)
if (!bucket) {
bucket = []
buckets.set(group, bucket)
order.push(group)
}
bucket.push(item)
}
const rows: PickerRow<T>[] = []
const flat: T[] = []
for (const group of order) {
if (group) rows.push({ kind: 'header', label: group })
for (const item of buckets.get(group) ?? []) {
if (selectableOf(item)) {
rows.push({ index: flat.length, item, kind: 'item' })
flat.push(item)
} else {
rows.push({ index: -1, item, kind: 'item' })
}
}
}
return { flat, rows }
}
/**
* Slice rows to a visible window of at most `cap` rows that keeps the selected
* item in view (centered when possible). `above`/`below` are the hidden row
* counts for the ↑/↓ "more" indicators.
*/
export function visibleRows<T>(
rows: readonly PickerRow<T>[],
selected: number,
cap: number
): { rows: PickerRow<T>[]; above: number; below: number } {
if (rows.length <= cap) return { above: 0, below: 0, rows: [...rows] }
const selRow = rows.findIndex(r => r.kind === 'item' && r.index === selected)
const anchor = selRow === -1 ? 0 : selRow
const start = Math.max(0, Math.min(anchor - Math.floor(cap / 2), rows.length - cap))
return { above: start, below: rows.length - (start + cap), rows: rows.slice(start, start + cap) }
}

View File

@@ -0,0 +1,52 @@
/**
* Pure recovery-budget policy for the gateway exit handler (LOGIC side — no
* Effect, no refs, no UI). Ported from Ink's `ui-tui/src/app/gatewayRecovery.ts`
* and EXTENDED with opencode-style exponential backoff.
*
* A gateway that crash-loops on startup must not let the TUI spawn-storm, so
* respawn+resume attempts are capped to GATEWAY_RECOVERY_LIMIT within a sliding
* GATEWAY_RECOVERY_WINDOW_MS; past the budget the app falls back to the inert
* "gateway exited" state. Kept pure (no refs/UI) so the bound — including the
* crash-loop case — is unit-testable.
*/
export const GATEWAY_RECOVERY_LIMIT = 3
export const GATEWAY_RECOVERY_WINDOW_MS = 60_000
export interface RecoveryPlan {
/** Attempt timestamps to persist (the pruned window, plus `now` iff recovering). */
attempts: number[]
recover: boolean
/**
* Session to resume — the live sid, or the not-yet-consumed recovery target
* when the live sid was already cleared by a prior exit.
*/
sid: null | string
}
/**
* Decide whether to respawn+resume after a gateway death. `liveSid` is the
* current session (nulled on the first exit); `recoverSid` is a pending
* recovery target carried across a respawn that died before gateway.ready —
* so a startup crash-loop keeps retrying the same session up to the budget
* instead of stranding it after one attempt.
*/
export function planGatewayRecovery(
liveSid: null | string,
recoverSid: null | string,
attempts: number[],
now: number
): RecoveryPlan {
const sid = liveSid ?? recoverSid
const recent = attempts.filter(t => now - t < GATEWAY_RECOVERY_WINDOW_MS)
const recover = Boolean(sid) && recent.length < GATEWAY_RECOVERY_LIMIT
return { attempts: recover ? [...recent, now] : recent, recover, sid }
}
/**
* Exponential backoff between respawn attempts (opencode-style): 1s, 2s, 4s, …
* capped at 30s. `attempt` is 1-based (the first respawn waits 1s).
*/
export function backoffMs(attempt: number): number {
return Math.min(1000 * 2 ** Math.max(0, attempt - 1), 30_000)
}

View File

@@ -0,0 +1,122 @@
/**
* Prompt history (item 6) — the SOLID side, plain TS. Up/Down cycle through the
* prompts you've sent, scoped PER DIRECTORY: launching Hermes again in the same
* project dir reuses that dir's prior prompts (the "bleed for the same dir" the
* user asked for), while a session in a different dir keeps its own list.
*
* `createPromptHistory` is pure + injectable (initial entries + a `persist`
* sink) so the cursor logic is unit-tested with no filesystem. The real wiring
* uses `loadDirHistory(cwd)` / `dirHistoryPersister(cwd)` to read/append a
* per-dir JSONL file under `$HERMES_HOME/tui-history/<hash>.jsonl` (one
* JSON-encoded prompt per line, multiline-safe; opencode's prompt-history.jsonl
* model, Ink's ~/.hermes/.hermes_history idea, scoped by dir).
*/
import { appendFileSync, mkdirSync, readFileSync } from 'node:fs'
import { homedir } from 'node:os'
import { createHash } from 'node:crypto'
import { dirname, join } from 'node:path'
const DEFAULT_MAX = 200
export interface PromptHistoryOptions {
/** Entries already on disk for this dir (oldest → newest). */
initial?: string[]
/** Persist a newly pushed prompt (real use: append to the per-dir file). */
persist?: (text: string) => void
/** Cap on retained entries (oldest dropped). */
max?: number
}
export interface PromptHistory {
/** All cycleable entries (oldest → newest) — loaded prev-session + this session. */
entries: () => string[]
/** Record a submitted prompt (skips a consecutive duplicate) and reset the cursor. */
push: (text: string) => void
/** Cycle to the OLDER entry (Up). Stashes `currentInput` as the draft on the first step. */
prev: (currentInput: string) => string | null
/** Cycle to the NEWER entry (Down); returns the stashed draft at the bottom. */
next: () => string | null
/** Reset the cursor to the live draft (call on any edit). */
reset: () => void
}
export function createPromptHistory(opts: PromptHistoryOptions = {}): PromptHistory {
const entries = [...(opts.initial ?? [])]
const max = opts.max ?? DEFAULT_MAX
// `idx === entries.length` means "at the live draft" (past the newest entry).
let idx = entries.length
let draft = ''
return {
entries: () => entries.slice(),
push(text) {
if (!text.trim()) return
if (entries[entries.length - 1] !== text) {
entries.push(text)
if (entries.length > max) entries.shift()
opts.persist?.(text)
}
idx = entries.length
draft = ''
},
prev(currentInput) {
if (entries.length === 0) return null
if (idx === entries.length) draft = currentInput // leaving the bottom — stash the draft
if (idx > 0) idx--
return entries[idx] ?? null
},
next() {
if (idx >= entries.length) return null
idx++
return idx === entries.length ? draft : (entries[idx] ?? null)
},
reset() {
idx = entries.length
}
}
}
// ── per-directory file persistence (best-effort; never throws) ──────────
function hermesHome(): string {
return process.env.HERMES_HOME?.trim() || join(homedir(), '.hermes')
}
/** The history file for a given working directory (keyed by a hash of the abs path). */
function dirHistoryPath(cwd: string): string {
const key = createHash('sha1').update(cwd).digest('hex').slice(0, 16)
return join(hermesHome(), 'tui-history', `${key}.jsonl`)
}
/** Load a directory's prior prompts (oldest → newest); [] if none / unreadable. */
export function loadDirHistory(cwd: string, max = DEFAULT_MAX): string[] {
try {
const raw = readFileSync(dirHistoryPath(cwd), 'utf8')
const out: string[] = []
for (const line of raw.split('\n')) {
if (!line.trim()) continue
try {
const v: unknown = JSON.parse(line)
if (typeof v === 'string') out.push(v)
} catch {
// skip a corrupt line — never let it break loading
}
}
return out.length > max ? out.slice(out.length - max) : out
} catch {
return []
}
}
/** A persister that appends each pushed prompt to the dir's JSONL file (best-effort). */
export function dirHistoryPersister(cwd: string): (text: string) => void {
const path = dirHistoryPath(cwd)
return text => {
try {
mkdirSync(dirname(path), { recursive: true })
appendFileSync(path, JSON.stringify(text) + '\n', 'utf8')
} catch {
// history persistence is non-essential — a write failure must not disrupt the turn
}
}
}

View File

@@ -0,0 +1,279 @@
/**
* Fence-aware LaTeX→Unicode span converter. Runs on the raw markdown string
* BEFORE it reaches the native `<markdown>` renderable (the one seam in
* view/markdown.tsx), so the native parser only ever sees already-converted
* unicode text. Tier-A: text-only — no styled spans, no accent color on math
* (that needs renderNode hooks into MarkdownRenderable; deferred).
*
* Span detection ports the Ink tokenizer's EXACT rules (ui-tui/src/components/
* markdown.tsx — keep in sync):
*
* • inline `$…$` — INLINE_RE group 17:
* (?<!\$)\$([^\s$](?:[^$\n]*?[^\s$])?)\$(?!\$)
* content starts AND ends with a non-space-non-`$`, contains no `$` or
* newline. This is the currency guard: in `I paid $5 and $10` the closing
* `$` is preceded by a space, so nothing matches and the prose survives.
* • inline `\(…\)` — INLINE_RE group 18: `\\\(([^\n]+?)\\\)` (single line).
* • display `$$…$$` / `\[…\]` — MATH_BLOCK_OPEN_RE: opener only at the start
* of a (whitespace-trimmed) line; closes on the same line (`$$x$$`) or on a
* later line ENDING with the closer. No closer anywhere → the line passes
* through verbatim (Ink renders it as a plain paragraph). That rule is what
* makes streaming safe for free: an unclosed `$$`/`$` mid-stream stays
* verbatim and converts exactly once, when the closing delimiter arrives
* (the whole text re-feeds per delta).
*
* Because the markdown parser hasn't run yet, fence / inline-code state is
* tracked here:
* • fenced blocks: ``` or ~~~ runs (3+, any info string) open; a line that is
* only a run of the SAME character, at least as long, closes (CommonMark).
* Everything inside, including the fence lines, passes through untouched.
* • inline code: per-line backtick scan — a run of N backticks opens a span
* closed by the next run of EXACTLY N backticks on the same line
* (CommonMark rule); unmatched runs are literal text. Multi-line inline
* code spans are NOT supported (the Ink tokenizer was per-line too).
*
* Known, documented deviations from full markdown awareness (both rare, both
* shared with or narrower than the Ink renderer's behavior):
* • a paired `$…$` inside a link destination (`[x](http://a$b$c)`) converts;
* Ink's tokenizer matched the link first.
* • 4-space-indented code blocks are not tracked (fences only).
*
* `\boxed{…}` sentinels (U+0001/U+0002 from texToUnicode) are STRIPPED to the
* inner text — injecting a styled span into the native renderable needs
* renderer hooks; deferred with the rest of tier-B.
*
* Perf: this runs over the FULL text on every streaming delta. Early-exit
* fast path returns the same string reference when no `$` / `\(` / `\[`
* appears at all, and when a scan converts nothing the original reference is
* returned too (so the renderable's content prop stays identity-stable).
*/
import { BOX_RE, texToUnicode } from './mathUnicode.ts'
const FENCE_OPEN_RE = /^\s*(`{3,}|~{3,})/
const FENCE_CLOSE_RE = /^\s*(`{3,}|~{3,})\s*$/
// Display math openers/closers — ported verbatim from Ink's markdown.tsx.
const MATH_BLOCK_OPEN_RE = /^\s*(\$\$|\\\[)(.*)$/
const MATH_BLOCK_CLOSE_DOLLAR_RE = /^(.*?)\$\$\s*$/
const MATH_BLOCK_CLOSE_BRACKET_RE = /^(.*?)\\\]\s*$/
// Ink INLINE_RE group 17 / 18, anchored (sticky). The `(?<!\$)` lookbehind is
// checked by the caller (prev char), everything else is byte-for-byte Ink's.
const INLINE_DOLLAR_RE = /\$([^\s$](?:[^$\n]*?[^\s$])?)\$(?!\$)/y
const INLINE_PAREN_RE = /\\\(([^\n]+?)\\\)/y
/** texToUnicode + strip the \boxed highlight sentinels down to plain text. */
const toUnicode = (tex: string): string => texToUnicode(tex).replace(BOX_RE, '$1')
/** Index of the next run of EXACTLY `len` backticks at/after `from`, or -1. */
const findBacktickClose = (line: string, from: number, len: number): number => {
let i = from
while (i < line.length) {
if (line[i] !== '`') {
i++
continue
}
let j = i + 1
while (j < line.length && line[j] === '`') j++
if (j - i === len) {
return i
}
i = j
}
return -1
}
// Convert inline `$…$` / `\(…\)` spans in one prose line, skipping inline
// code spans. Returns the SAME string reference when nothing converted.
const convertInline = (line: string): string => {
let out = ''
let i = 0
let changed = false
while (i < line.length) {
const ch = line[i]
if (ch === '`') {
let j = i + 1
while (j < line.length && line[j] === '`') j++
const close = findBacktickClose(line, j, j - i)
if (close >= 0) {
out += line.slice(i, close + (j - i))
i = close + (j - i)
} else {
out += line.slice(i, j)
i = j
}
continue
}
if (ch === '$' && line[i - 1] !== '$') {
INLINE_DOLLAR_RE.lastIndex = i
const m = INLINE_DOLLAR_RE.exec(line)
if (m) {
out += toUnicode(m[1] ?? '')
i = INLINE_DOLLAR_RE.lastIndex
changed = true
continue
}
}
if (ch === '\\' && line[i + 1] === '(') {
INLINE_PAREN_RE.lastIndex = i
const m = INLINE_PAREN_RE.exec(line)
if (m) {
out += toUnicode(m[1] ?? '')
i = INLINE_PAREN_RE.lastIndex
changed = true
continue
}
}
out += ch
i++
}
return changed ? out : line
}
export function preprocessMath(markdown: string, _opts?: { streaming?: boolean | undefined }): string {
// Fast path — REQUIRED, this runs on every streaming delta. No math trigger
// characters anywhere → hand back the exact same string (identity).
if (!markdown.includes('$') && !markdown.includes('\\(') && !markdown.includes('\\[')) {
return markdown
}
const lines = markdown.split('\n')
const out: string[] = []
let changed = false
let fence: { char: string; len: number } | null = null
let i = 0
// Emit a converted display block as its own paragraph: blank-line separated
// from surrounding prose (only where a separator is actually missing).
const pushDisplay = (block: string[], nextIdx: number) => {
if (out.length > 0 && out[out.length - 1]?.trim()) {
out.push('')
}
out.push(...block)
if (nextIdx < lines.length && lines[nextIdx]?.trim()) {
out.push('')
}
changed = true
}
while (i < lines.length) {
const line = lines[i] ?? ''
if (fence) {
out.push(line)
const close = line.match(FENCE_CLOSE_RE)?.[1]
if (close && close.charAt(0) === fence.char && close.length >= fence.len) {
fence = null
}
i++
continue
}
const open = line.match(FENCE_OPEN_RE)?.[1]
if (open) {
fence = { char: open.charAt(0), len: open.length }
out.push(line)
i++
continue
}
const mathOpen = line.match(MATH_BLOCK_OPEN_RE)
if (mathOpen) {
const closeRe = mathOpen[1] === '$$' ? MATH_BLOCK_CLOSE_DOLLAR_RE : MATH_BLOCK_CLOSE_BRACKET_RE
const headRest = mathOpen[2] ?? ''
// Single-line block: `$$x + y = z$$` or `\[x\]`.
const sameLineClose = headRest.match(closeRe)
if (sameLineClose) {
const inner = (sameLineClose[1] ?? '').trim()
pushDisplay(inner ? [toUnicode(inner)] : [], i + 1)
i++
continue
}
// Multi-line block: scan ahead for a real closer before committing. If
// none exists in the rest of the (possibly still-streaming) document,
// the line stays verbatim — Ink's paragraph fallback.
let closeIdx = -1
let closeTail = ''
for (let j = i + 1; j < lines.length; j++) {
const m = (lines[j] ?? '').match(closeRe)
if (m) {
closeIdx = j
closeTail = m[1] ?? ''
break
}
}
if (closeIdx >= 0) {
const block: string[] = []
if (headRest.trim()) {
block.push(headRest)
}
for (let j = i + 1; j < closeIdx; j++) {
block.push(lines[j] ?? '')
}
const tail = closeTail.trimEnd()
if (tail.trim()) {
block.push(tail)
}
pushDisplay(
block.map(l => toUnicode(l)),
closeIdx + 1
)
i = closeIdx + 1
continue
}
}
const converted = convertInline(line)
if (converted !== line) {
changed = true
}
out.push(converted)
i++
}
return changed ? out.join('\n') : markdown
}

View File

@@ -0,0 +1,773 @@
// Best-effort LaTeX → Unicode for inline / display math (ported verbatim from
// ui-tui/src/lib/mathUnicode.ts — keep the two in sync). The terminal can't
// typeset LaTeX, but Unicode covers
// most of what models actually emit: Greek letters, blackboard / fraktur /
// calligraphic capitals, set theory + logic operators, common arrows,
// sub/superscripts, and `\frac{a}{b}` collapsed to `a/b`.
//
// Design rules:
// • Pure regex pipeline. Anything we don't recognise is preserved
// verbatim (so a `\foo{bar}` we've never heard of still survives).
// A real LaTeX parser would be more correct but throws on partial
// input — terminal users would rather see the raw command than a
// parse-error placeholder.
// • Longest-match-first ordering on commands so `\le` doesn't shadow
// `\leq`, `\sub` doesn't shadow `\subseteq`, etc.
// • Word-boundary lookahead `(?![A-Za-z])` after each command so
// `\pix` (made-up command) doesn't get partially substituted as `π`.
// • `\mathbb{X}`, `\mathcal{X}`, `\mathfrak{X}` only handle a single
// letter argument — multi-letter `\mathbb{NN}` is rare and would
// need a real parser to do correctly.
// • Sub/super scripts only convert if EVERY character has a Unicode
// equivalent. Mixed content like `^{n+1}` falls back to the raw
// LaTeX so we don't emit `ⁿ+¹` (which has no `+` superscript glyph
// in some fonts and reads worse than the source).
const SYMBOLS: Record<string, string> = {
// Greek lowercase
'\\alpha': 'α',
'\\beta': 'β',
'\\gamma': 'γ',
'\\delta': 'δ',
'\\epsilon': 'ε',
'\\varepsilon': 'ε',
'\\zeta': 'ζ',
'\\eta': 'η',
'\\theta': 'θ',
'\\vartheta': 'ϑ',
'\\iota': 'ι',
'\\kappa': 'κ',
'\\lambda': 'λ',
'\\mu': 'μ',
'\\nu': 'ν',
'\\xi': 'ξ',
'\\pi': 'π',
'\\varpi': 'ϖ',
'\\rho': 'ρ',
'\\varrho': 'ϱ',
'\\sigma': 'σ',
'\\varsigma': 'ς',
'\\tau': 'τ',
'\\upsilon': 'υ',
'\\phi': 'φ',
'\\varphi': 'φ',
'\\chi': 'χ',
'\\psi': 'ψ',
'\\omega': 'ω',
// Greek uppercase
'\\Gamma': 'Γ',
'\\Delta': 'Δ',
'\\Theta': 'Θ',
'\\Lambda': 'Λ',
'\\Xi': 'Ξ',
'\\Pi': 'Π',
'\\Sigma': 'Σ',
'\\Upsilon': 'Υ',
'\\Phi': 'Φ',
'\\Psi': 'Ψ',
'\\Omega': 'Ω',
// Big operators
'\\sum': '∑',
'\\prod': '∏',
'\\coprod': '∐',
'\\int': '∫',
'\\iint': '∬',
'\\iiint': '∭',
'\\oint': '∮',
'\\bigcup': '',
'\\bigcap': '⋂',
'\\bigvee': '',
'\\bigwedge': '⋀',
'\\bigoplus': '⨁',
'\\bigotimes': '⨂',
// Calculus
'\\partial': '∂',
'\\nabla': '∇',
'\\sqrt': '√',
// Sets
'\\emptyset': '∅',
'\\varnothing': '∅',
'\\infty': '∞',
'\\in': '∈',
'\\notin': '∉',
'\\ni': '∋',
'\\subset': '⊂',
'\\supset': '⊃',
'\\subseteq': '⊆',
'\\supseteq': '⊇',
'\\subsetneq': '⊊',
'\\supsetneq': '⊋',
'\\cup': '',
'\\cap': '∩',
'\\setminus': '',
'\\complement': '∁',
// Logic
'\\forall': '∀',
'\\exists': '∃',
'\\nexists': '∄',
'\\land': '∧',
'\\lor': '',
'\\lnot': '¬',
'\\neg': '¬',
'\\therefore': '∴',
'\\because': '∵',
// Relations
'\\le': '≤',
'\\leq': '≤',
'\\ge': '≥',
'\\geq': '≥',
'\\ne': '≠',
'\\neq': '≠',
'\\ll': '≪',
'\\gg': '≫',
'\\approx': '≈',
'\\equiv': '≡',
'\\cong': '≅',
'\\sim': '',
'\\simeq': '≃',
'\\propto': '∝',
'\\perp': '⊥',
'\\parallel': '∥',
'\\models': '⊨',
'\\vdash': '⊢',
'\\mid': '',
'\\nmid': '∤',
'\\divides': '',
// Common standalone glyphs
'\\blacksquare': '■',
'\\square': '□',
'\\Box': '□',
'\\qed': '∎',
'\\bigstar': '★',
// Modular arithmetic — the `\pmod{p}` form (with arg) is handled below;
// the bare `\bmod` / `\mod` commands are simple text substitutions.
'\\bmod': 'mod',
'\\mod': 'mod',
// Brackets / fences (named delimiter commands; the `\left\X` / `\right\X`
// unwrapping below leaves these behind for the symbol pass to resolve).
'\\langle': '⟨',
'\\rangle': '⟩',
'\\lceil': '⌈',
'\\rceil': '⌉',
'\\lfloor': '⌊',
'\\rfloor': '⌋',
'\\|': '‖',
// Arrows
'\\to': '→',
'\\rightarrow': '→',
'\\leftarrow': '←',
'\\leftrightarrow': '↔',
'\\Rightarrow': '⇒',
'\\Leftarrow': '⇐',
'\\Leftrightarrow': '⇔',
'\\implies': '⟹',
'\\impliedby': '⟸',
'\\iff': '⟺',
'\\mapsto': '↦',
'\\hookrightarrow': '↪',
'\\hookleftarrow': '↩',
'\\uparrow': '↑',
'\\downarrow': '↓',
'\\updownarrow': '↕',
// Binary operators
'\\cdot': '⋅',
'\\cdots': '⋯',
'\\ldots': '…',
'\\dots': '…',
'\\dotsb': '…',
'\\dotsc': '…',
'\\vdots': '⋮',
'\\ddots': '⋱',
'\\times': '×',
'\\div': '÷',
'\\pm': '±',
'\\mp': '∓',
'\\circ': '∘',
'\\bullet': '•',
'\\star': '⋆',
'\\ast': '',
'\\oplus': '⊕',
'\\ominus': '⊖',
'\\otimes': '⊗',
'\\odot': '⊙',
'\\diamond': '⋄',
'\\angle': '∠',
'\\triangle': '△',
// Spacing — collapse to varying widths of regular space
'\\,': ' ',
'\\;': ' ',
'\\:': ' ',
'\\!': '',
'\\ ': ' ',
'\\quad': ' ',
'\\qquad': ' ',
// Functions (LaTeX renders these in roman; we just keep the name)
'\\sin': 'sin',
'\\cos': 'cos',
'\\tan': 'tan',
'\\cot': 'cot',
'\\sec': 'sec',
'\\csc': 'csc',
'\\arcsin': 'arcsin',
'\\arccos': 'arccos',
'\\arctan': 'arctan',
'\\sinh': 'sinh',
'\\cosh': 'cosh',
'\\tanh': 'tanh',
'\\log': 'log',
'\\ln': 'ln',
'\\exp': 'exp',
'\\det': 'det',
'\\dim': 'dim',
'\\ker': 'ker',
'\\lim': 'lim',
'\\liminf': 'liminf',
'\\limsup': 'limsup',
'\\sup': 'sup',
'\\inf': 'inf',
'\\max': 'max',
'\\min': 'min',
'\\arg': 'arg',
'\\gcd': 'gcd',
// Escaped literals — model occasionally emits these for display
'\\&': '&',
'\\%': '%',
'\\$': '$',
'\\#': '#',
'\\_': '_',
'\\{': '{',
'\\}': '}'
}
const BB: Record<string, string> = {
A: '𝔸',
B: '𝔹',
C: '',
D: '𝔻',
E: '𝔼',
F: '𝔽',
G: '𝔾',
H: '',
I: '𝕀',
J: '𝕁',
K: '𝕂',
L: '𝕃',
M: '𝕄',
N: '',
O: '𝕆',
P: '',
Q: '',
R: '',
S: '𝕊',
T: '𝕋',
U: '𝕌',
V: '𝕍',
W: '𝕎',
X: '𝕏',
Y: '𝕐',
Z: ''
}
const CAL: Record<string, string> = {
A: '𝒜',
B: '',
C: '𝒞',
D: '𝒟',
E: '',
F: '',
G: '𝒢',
H: '',
I: '',
J: '𝒥',
K: '𝒦',
L: '',
M: '',
N: '𝒩',
O: '𝒪',
P: '𝒫',
Q: '𝒬',
R: '',
S: '𝒮',
T: '𝒯',
U: '𝒰',
V: '𝒱',
W: '𝒲',
X: '𝒳',
Y: '𝒴',
Z: '𝒵'
}
const FRAK: Record<string, string> = {
A: '𝔄',
B: '𝔅',
C: '',
D: '𝔇',
E: '𝔈',
F: '𝔉',
G: '𝔊',
H: '',
I: '',
J: '𝔍',
K: '𝔎',
L: '𝔏',
M: '𝔐',
N: '𝔑',
O: '𝔒',
P: '𝔓',
Q: '𝔔',
R: '',
S: '𝔖',
T: '𝔗',
U: '𝔘',
V: '𝔙',
W: '𝔚',
X: '𝔛',
Y: '𝔜',
Z: ''
}
const SUPERSCRIPT: Record<string, string> = {
'0': '⁰',
'1': '¹',
'2': '²',
'3': '³',
'4': '⁴',
'5': '⁵',
'6': '⁶',
'7': '⁷',
'8': '⁸',
'9': '⁹',
'+': '⁺',
'-': '⁻',
'=': '⁼',
'(': '⁽',
')': '⁾',
a: 'ᵃ',
b: 'ᵇ',
c: 'ᶜ',
d: 'ᵈ',
e: 'ᵉ',
f: 'ᶠ',
g: 'ᵍ',
h: 'ʰ',
i: 'ⁱ',
j: 'ʲ',
k: 'ᵏ',
l: 'ˡ',
m: 'ᵐ',
n: 'ⁿ',
o: 'ᵒ',
p: 'ᵖ',
r: 'ʳ',
s: 'ˢ',
t: 'ᵗ',
u: 'ᵘ',
v: 'ᵛ',
w: 'ʷ',
x: 'ˣ',
y: 'ʸ',
z: 'ᶻ'
}
const SUBSCRIPT: Record<string, string> = {
'0': '₀',
'1': '₁',
'2': '₂',
'3': '₃',
'4': '₄',
'5': '₅',
'6': '₆',
'7': '₇',
'8': '₈',
'9': '₉',
'+': '₊',
'-': '₋',
'=': '₌',
'(': '₍',
')': '₎',
a: 'ₐ',
e: 'ₑ',
h: 'ₕ',
i: 'ᵢ',
j: 'ⱼ',
k: 'ₖ',
l: 'ₗ',
m: 'ₘ',
n: 'ₙ',
o: 'ₒ',
p: 'ₚ',
r: 'ᵣ',
s: 'ₛ',
t: 'ₜ',
u: 'ᵤ',
v: 'ᵥ',
x: 'ₓ'
}
// Sentinel control characters used to mark `\boxed` / `\fbox` regions in
// the converted output. The renderer splits on these to apply a highlight
// style; consumers that don't want highlighting can strip them with the
// exported `BOX_RE` below.
export const BOX_OPEN = '\u0001'
export const BOX_CLOSE = '\u0002'
// eslint-disable-next-line no-control-regex
export const BOX_RE = /\u0001([^\u0001\u0002]*)\u0002/g
const escapeRe = (s: string) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
// Pre-compile two symbol regexes: one for letter-ending commands (`\pi`,
// `\sum`) which need a `(?![A-Za-z])` lookahead so they don't partially
// match `\pix` or `\summa`, and one for punctuation-ending commands
// (`\{`, `\,`, `\|`) which must NOT have the lookahead — otherwise
// `\{p` would refuse to substitute because `p` is a letter.
//
// Longest commands first inside each group so `\leq` beats `\le`.
const splitByEnding = (keys: string[]) => {
const letter: string[] = []
const punct: string[] = []
for (const k of keys) {
if (/[A-Za-z]$/.test(k)) {
letter.push(k)
} else {
punct.push(k)
}
}
return { letter, punct }
}
const buildAlt = (cmds: string[]) =>
cmds
.sort((a, b) => b.length - a.length)
.map(escapeRe)
.join('|')
const { letter: LETTER_CMDS, punct: PUNCT_CMDS } = splitByEnding(Object.keys(SYMBOLS))
const SYMBOL_LETTER_RE = new RegExp('(?:' + buildAlt(LETTER_CMDS) + ')(?![A-Za-z])', 'g')
const SYMBOL_PUNCT_RE = new RegExp('(?:' + buildAlt(PUNCT_CMDS) + ')', 'g')
const convertScript = (input: string, table: Record<string, string>, sigil: '^' | '_'): string => {
let out = ''
let allMapped = true
for (const ch of input) {
const mapped = table[ch]
if (!mapped) {
allMapped = false
break
}
out += mapped
}
if (allMapped) {
return out
}
// Fallback: if the body is a single visible character (e.g. `∞` after
// earlier symbol substitution), render it without braces — `^∞` reads
// far better than `^{∞}` in a terminal. Multi-char bodies that don't
// fully convert use parens (`e^(iπ)`) instead of braces (`e^{iπ}`)
// because parens are normal punctuation while braces look like
// unrendered LaTeX.
const trimmed = input.trim()
if ([...trimmed].length === 1) {
return `${sigil}${trimmed}`
}
return `${sigil}(${trimmed})`
}
// Walk the string and parse `{...}` honouring nested braces. Unlike a
// `\{[^{}]*\}` regex this survives `\frac{|t|^{p-1}|P(t)|^p}{...}` where
// the numerator contains its own braces from a superscript. Returns the
// inner content (without the outer braces) and the offset just past the
// closing `}`. Returns null if there is no balanced brace at `start`.
const readBraced = (s: string, start: number): { content: string; end: number } | null => {
if (s[start] !== '{') {
return null
}
let depth = 1
let i = start + 1
while (i < s.length && depth > 0) {
const c = s[i]
// Skip escapes — `\{` and `\}` inside a body are literal braces and
// should not change the brace counter.
if (c === '\\' && i + 1 < s.length) {
i += 2
continue
}
if (c === '{') {
depth++
} else if (c === '}') {
depth--
}
if (depth > 0) {
i++
}
}
if (depth !== 0) {
return null
}
return { content: s.slice(start + 1, i), end: i + 1 }
}
// Replace every occurrence of `\command{arg}` using balanced-brace parsing
// (so `\boxed{x^{n+1}}` works where a `[^{}]*` regex would fail). The
// `render` callback receives the inner content already recursed-into, so
// `\boxed{\boxed{x}}` resolves outside-in cleanly. Unmatched `\command`
// (no following `{...}`) is preserved verbatim.
const replaceBracedCommand = (input: string, command: string, render: (content: string) => string): string => {
const cmdLen = command.length
let out = ''
let i = 0
while (i < input.length) {
const idx = input.indexOf(command, i)
if (idx < 0) {
out += input.slice(i)
return out
}
const after = input[idx + cmdLen]
if (after && /[A-Za-z]/.test(after)) {
out += input.slice(i, idx + cmdLen)
i = idx + cmdLen
continue
}
out += input.slice(i, idx)
let p = idx + cmdLen
while (input[p] === ' ' || input[p] === '\t') p++
const arg = readBraced(input, p)
if (!arg) {
out += input.slice(idx, p + 1)
i = p + 1
continue
}
out += render(replaceBracedCommand(arg.content, command, render))
i = arg.end
}
return out
}
// Replace every `\frac{num}{den}` with `num/den` (parens around either
// side when its precedence demands it). The recursion handles nested
// fractions naturally: `\frac{1}{\frac{1}{x}}` collapses to `1/(1/x)`
// because we recurse into `den` before deciding whether to parenthesise.
const replaceFracs = (input: string): string => {
let out = ''
let i = 0
while (i < input.length) {
const idx = input.indexOf('\\frac', i)
if (idx < 0) {
out += input.slice(i)
return out
}
const after = input[idx + 5]
// `(?![A-Za-z])` — protect hypothetical commands like `\fraction`.
if (after && /[A-Za-z]/.test(after)) {
out += input.slice(i, idx + 5)
i = idx + 5
continue
}
out += input.slice(i, idx)
let p = idx + 5
while (input[p] === ' ' || input[p] === '\t') p++
const num = readBraced(input, p)
if (!num) {
out += input.slice(idx, p + 1)
i = p + 1
continue
}
p = num.end
while (input[p] === ' ' || input[p] === '\t') p++
const den = readBraced(input, p)
if (!den) {
out += input.slice(idx, p + 1)
i = p + 1
continue
}
out += `${wrapForFrac(replaceFracs(num.content))}/${wrapForFrac(replaceFracs(den.content))}`
i = den.end
}
return out
}
// Wrap multi-token expressions in parens so `\frac{a+b}{c}` becomes
// `(a+b)/c` rather than `a+b/c`. We wrap whenever inline `/` would
// change the meaning — that's any binary operator (`+`, `-`, `*`, `/`)
// or whitespace separating tokens. `*` and `/` matter because nested
// fractions and products like `\frac{a*b}{c}` and `\frac{1/x}{y}` would
// otherwise read as `a*b/c` (right-associative ambiguity) and `1/x/y`.
// Atomic factors like `n!`, `x^2`, `\sin x` don't trigger any of these
// and stay un-parenthesised — wrapping them just clutters the output.
const wrapForFrac = (expr: string) => {
const trimmed = expr.trim()
if (!trimmed) {
return trimmed
}
if (/^\(.*\)$/.test(trimmed)) {
return trimmed
}
if (/[+\-/*]|\s/.test(trimmed)) {
return `(${trimmed})`
}
return trimmed
}
export function texToUnicode(input: string): string {
let s = input
s = s.replace(/\\mathbb\s*\{([A-Za-z])\}/g, (raw, c: string) => BB[c] ?? raw)
s = s.replace(/\\mathcal\s*\{([A-Za-z])\}/g, (raw, c: string) => CAL[c] ?? raw)
s = s.replace(/\\mathfrak\s*\{([A-Za-z])\}/g, (raw, c: string) => FRAK[c] ?? raw)
s = s.replace(/\\mathbf\s*\{([^{}]+)\}/g, (_, c: string) => c)
s = s.replace(/\\mathit\s*\{([^{}]+)\}/g, (_, c: string) => c)
s = s.replace(/\\mathrm\s*\{([^{}]+)\}/g, (_, c: string) => c)
s = s.replace(/\\text\s*\{([^{}]+)\}/g, (_, c: string) => c)
s = s.replace(/\\operatorname\s*\{([^{}]+)\}/g, (_, c: string) => c)
s = s.replace(/\\overline\s*\{([^{}]+)\}/g, (_, c: string) => `${c}\u0305`)
s = s.replace(/\\hat\s*\{([^{}]+)\}/g, (_, c: string) => `${c}\u0302`)
s = s.replace(/\\bar\s*\{([^{}]+)\}/g, (_, c: string) => `${c}\u0304`)
s = s.replace(/\\tilde\s*\{([^{}]+)\}/g, (_, c: string) => `${c}\u0303`)
s = s.replace(/\\vec\s*\{([^{}]+)\}/g, (_, c: string) => `${c}\u20D7`)
s = s.replace(/\\dot\s*\{([^{}]+)\}/g, (_, c: string) => `${c}\u0307`)
s = s.replace(/\\ddot\s*\{([^{}]+)\}/g, (_, c: string) => `${c}\u0308`)
s = replaceFracs(s)
// `\boxed{X}` / `\fbox{X}` highlight a final answer. Terminals can't
// draw a real box, so we wrap the content in U+0001 / U+0002 control
// characters — non-printable, never present in real text — so a renderer
// with styled-span hooks can apply a highlight style (inverse video) to
// the bracketed region. The OpenTUI engine's preprocessor currently
// strips them via BOX_RE (styled-span injection into the native
// `<markdown>` renderable is deferred); `texToUnicode` stays pure-string.
// Argument is parsed with balanced braces so nested `{...}` from
// superscripts / fractions inside the box survive.
s = replaceBracedCommand(s, '\\boxed', body => `${BOX_OPEN}${body.trim()}${BOX_CLOSE}`)
s = replaceBracedCommand(s, '\\fbox', body => `${BOX_OPEN}${body.trim()}${BOX_CLOSE}`)
// `\xrightarrow{label}` / `\xleftarrow{label}` collapse to an arrow with
// the label inline. LaTeX renders the label above the arrow; in monospace
// we put it adjacent — `─label→` is the closest readable approximation.
// Run before the symbol pass so the label can still pick up Greek and
// operator substitutions afterwards.
s = s.replace(/\\xrightarrow\s*\{([^{}]*)\}/g, (_, label: string) => `${label.trim()}`)
s = s.replace(/\\xleftarrow\s*\{([^{}]*)\}/g, (_, label: string) => `${label.trim()}`)
s = s.replace(/\\Longrightarrow/g, '⟹')
s = s.replace(/\\Longleftarrow/g, '⟸')
s = s.replace(/\\Longleftrightarrow/g, '⟺')
// `\pmod{p}` → ` (mod p)` (LaTeX adds parens automatically); `\pod{p}`
// is a paren-less variant; `\tag{n}` is the equation-number annotation
// shown to the right of an equation. Collapse to a single-space-prefixed
// bracketed form. The leading `\s*` in the pattern absorbs any whitespace
// already in the source so we don't end up with `b (mod p)` (double
// space) when the user wrote `b \pmod{p}`.
s = s.replace(/\s*\\pmod\s*\{([^{}]*)\}/g, (_, p: string) => ` (mod ${p.trim()})`)
s = s.replace(/\s*\\pod\s*\{([^{}]*)\}/g, (_, p: string) => ` (${p.trim()})`)
s = s.replace(/\s*\\tag\s*\{([^{}]*)\}/g, (_, n: string) => ` (${n.trim()})`)
// `\big`, `\Big`, `\bigg`, `\Bigg` (with optional `l`/`r`/`m` suffix)
// are sizing wrappers analogous to `\left`/`\right` but without the
// automatic-pairing semantics. Strip them and leave whatever delimiter
// follows. The trailing `(?![A-Za-z])` protects `\bigtriangleup` and
// any other letter-continuation command from being shaved.
s = s.replace(/\\(?:Bigg|bigg|Big|big)[lrm]?(?![A-Za-z])/g, '')
// Style / size hints that don't typeset any glyph and only affect how
// things would be sized in a real LaTeX engine. In a terminal every
// glyph is one monospace cell, so there's nothing to do — drop them
// (with any trailing whitespace) so they don't leak through as raw
// `\displaystyle` in the output.
s = s.replace(/\\(?:scriptscriptstyle|displaystyle|scriptstyle|textstyle|nolimits|limits)(?![A-Za-z])\s*/g, '')
// `\left` and `\right` are sizing wrappers around any delimiter — bare
// (`\left(`), escaped (`\left\{`), or named (`\left\langle`). Strip the
// wrapper unconditionally and let the rest of the pipeline (or the
// upcoming symbol pass) handle whatever delimiter follows. The optional
// `.?` consumes `\left.` / `\right.` which mean "no delimiter".
// Lookahead `(?![A-Za-z])` keeps `\leftarrow` / `\leftrightarrow` safe.
s = s.replace(/\\left(?![A-Za-z])\.?/g, '')
s = s.replace(/\\right(?![A-Za-z])\.?/g, '')
// Run symbol substitution BEFORE scripts so a body like `^{\infty}`
// becomes `^{∞}` first; convertScript can then either map ∞ to a
// superscript (it can't — Unicode lacks one) or fall back to `^∞`
// by stripping braces around the now-single-character body.
//
// Punctuation pass first — these can be followed by letters (`\{p`
// is "open-brace then p"), so the letter pass's `(?![A-Za-z])` rule
// would wrongly block them.
s = s.replace(SYMBOL_PUNCT_RE, m => SYMBOLS[m] ?? m)
s = s.replace(SYMBOL_LETTER_RE, m => SYMBOLS[m] ?? m)
// Bare `^c` / `_c` handles ONLY alphanumerics and `+`/`-`/`=`. Parens
// are intentionally excluded because the braced-fallback above can
// emit `(...)` and we don't want a second pass to greedily convert
// its opening paren into `⁽` and orphan the closing one.
s = s.replace(/\^\s*\{([^{}]+)\}/g, (_, body: string) => convertScript(body, SUPERSCRIPT, '^'))
s = s.replace(/\^([A-Za-z0-9+\-=])/g, (raw, ch: string) => SUPERSCRIPT[ch] ?? raw)
s = s.replace(/_\s*\{([^{}]+)\}/g, (_, body: string) => convertScript(body, SUBSCRIPT, '_'))
s = s.replace(/_([A-Za-z0-9+\-=])/g, (raw, ch: string) => SUBSCRIPT[ch] ?? raw)
return s
}

View File

@@ -0,0 +1,88 @@
/**
* Memory-monitor LOGIC (pure, no node:v8/process/file imports — testable).
*
* Ports the high-value SMART part of Ink's memory monitor
* (`ui-tui/src/lib/memoryMonitor.ts`): the #34095 silent-death EARLY-WARNING.
* It deliberately does NOT port Ink's auto heap-snapshot capture — the OpenTUI
* engine's always-on `memlog` NDJSON trace (boundary/memlog.ts) is the
* diagnosis path, and the rss-vs-heap divergence it records is the better
* diagnostic for the native-RSS leak class (#15141) that a V8 heap snapshot
* captures poorly anyway. So we skip the #41948 disk-fill bug class entirely.
*
* The early-warning regime is BELOW the OOM ceiling: Node can OOM from a render-
* tree / store blowup at a few hundred MB, well under any "critical" exit
* watermark, so a plain level machine never sees it and the death looks silent
* (#34095 showed up only as a bare gateway `stdin EOF`). We fire ONCE when heap
* both crosses a modest absolute floor AND is climbing steeply (≥150MB between
* ticks) — the render-tree-blowup signature — and re-arm only after heap falls
* back below the floor. The boundary turns the fire into a visible transcript
* system line so the user gets a heads-up before the process dies.
*/
const MB = 1024 ** 2
/** Heap floor below which we never warn (a small heap climbing is normal). */
export const WARN_FLOOR_BYTES = 600 * MB
/** Per-tick growth that, combined with crossing the floor, signals a blowup. */
export const WARN_GROWTH_STEP_BYTES = 150 * MB
/** Mutable arm/disarm state for the early-warning detector. */
export interface WarnState {
/** Previous heapUsed sample; `-1` until the first sample is seen. */
lastHeap: number
/** Whether we've already fired since the last re-arm (one-shot until reset). */
warned: boolean
}
/** A fresh, un-seeded warn state (lastHeap < 0 ⇒ first sample can't "grow"). */
export function createWarnState(): WarnState {
return { lastHeap: -1, warned: false }
}
export interface WarnEvaluation {
/** True exactly on the tick the warning should fire (one-shot). */
readonly fire: boolean
/** The growth since the previous sample (bytes; 0 on the first sample). */
readonly growthBytes: number
}
/**
* Advance the early-warning state machine by one sample. MUTATES `state`
* (lastHeap + warned) and returns whether to fire this tick.
*
* Fires once when, while below any OOM ceiling: heap ≥ floor AND grew
* ≥ step since the previous sample AND we haven't already fired. Re-arms
* (warned ← false) once heap drops back below the floor. The first
* (un-seeded) sample only seeds lastHeap and never fires.
*/
export function evaluateWarn(
state: WarnState,
heapUsed: number,
floorBytes: number = WARN_FLOOR_BYTES,
stepBytes: number = WARN_GROWTH_STEP_BYTES
): WarnEvaluation {
const seeded = state.lastHeap >= 0
const growthBytes = seeded ? heapUsed - state.lastHeap : 0
let fire = false
if (seeded) {
if (!state.warned && heapUsed >= floorBytes && growthBytes >= stepBytes) {
state.warned = true
fire = true
} else if (heapUsed < floorBytes) {
state.warned = false
}
}
state.lastHeap = heapUsed
return { fire, growthBytes }
}
/** Render the user-facing early-warning line (KB system line, no disk cost). */
export function warnLine(heapUsed: number, rss: number, growthBytes: number): string {
const mb = (n: number) => Math.round(n / MB)
return (
`⚠ memory climbing fast — heap ${mb(heapUsed)}MB (+${mb(growthBytes)}MB), rss ${mb(rss)}MB. ` +
`If the TUI dies, this is why; relaunch with HERMES_TUI_DIAGNOSTICS=1 for a trace.`
)
}

View File

@@ -0,0 +1,168 @@
/**
* Multi-click selection logic — double-click selects the word, triple-click the
* line, and a drag after either extends word-by-word / line-by-line while the
* originally clicked span stays selected (native macOS / VS Code behavior).
* Ported from the Ink fork's `hermes-ink/src/ink/selection.ts` (wordBoundsAt /
* selectLineAt / extendSelection) onto OpenTUI's screen model: the rendered
* frame is a flat grid of codepoints (`OptimizedBuffer.buffers.char`), so word
* scanning reads the frame the user actually sees — concealed markdown, tool
* chrome and all.
*
* Pure string/number work, no OpenTUI imports — the boundary shim
* (`boundary/multiClickSelect.ts`) adapts the live buffer to `ScreenText`.
*/
/** Screen-buffer cell coordinates (0-indexed col/row). */
export interface Point {
readonly x: number
readonly y: number
}
/** Inclusive span from `lo` to `hi` in reading order (row-major). */
export interface Span {
readonly lo: Point
readonly hi: Point
}
/** The multi-clicked span a drag extends from. */
export interface AnchorSpan extends Span {
readonly kind: 'word' | 'line'
}
/** Read-only view of the rendered frame's character grid. */
export interface ScreenText {
readonly width: number
readonly height: number
/** Unicode codepoint at cell (x,y); 0 marks a wide-char continuation cell. */
readonly codepointAt: (x: number, y: number) => number
}
/** -1 if a < b, 1 if a > b, 0 if equal (reading order: row then col). */
export function comparePoints(a: Point, b: Point): number {
if (a.y !== b.y) return a.y < b.y ? -1 : 1
if (a.x !== b.x) return a.x < b.x ? -1 : 1
return 0
}
// Unicode-aware word character matcher: letters (any script), digits, and the
// punctuation set iTerm2 treats as word-part by default (`/-+\~_.`). Matching
// iTerm2's default means double-clicking a path like `src/logic/multiClick.ts`
// selects the whole path — the muscle memory terminal users have.
const WORD_CHAR = /[\p{L}\p{N}_/.\-+~\\]/u
/**
* Character class for double-click word-expansion: 0 = whitespace/empty,
* 1 = word char, 2 = other punctuation. Cells with the same class as the
* clicked cell are one run; a class change is a boundary — so double-click on
* `foo` selects `foo`, on `->` selects `->`, on spaces the whitespace run.
*/
function charClass(cp: number): 0 | 1 | 2 {
if (cp === 0 || cp === 32) return 0
if (WORD_CHAR.test(String.fromCodePoint(cp))) return 1
return 2
}
/**
* Bounds of the same-class character run at (x, y), or null when the click is
* out of bounds. Wide-char continuation cells (codepoint 0) belong to the head
* glyph at their left: a click on one resolves to the head, the left scan
* steps over them to the head's class, and the right scan includes them in the
* span so the highlight covers the full glyph.
*/
export function wordSpanAt(screen: ScreenText, x: number, y: number): Span | null {
if (y < 0 || y >= screen.height || x < 0 || x >= screen.width) return null
// Land on a continuation cell → step back to the wide-char head.
let c = x
while (c > 0 && screen.codepointAt(c, y) === 0) c -= 1
const cls = charClass(screen.codepointAt(c, y))
let lo = c
while (lo > 0) {
let prev = lo - 1
while (prev > 0 && screen.codepointAt(prev, y) === 0) prev -= 1
if (charClass(screen.codepointAt(prev, y)) !== cls) break
lo = prev
}
let hi = c
while (hi < screen.width - 1) {
const cp = screen.codepointAt(hi + 1, y)
// A continuation cell after a run member is the tail of the run's last
// wide glyph — include it and keep scanning.
if (cp !== 0 && charClass(cp) !== cls) break
hi += 1
}
return { lo: { x: lo, y }, hi: { x: hi, y } }
}
/** The full row as a span (triple-click). Null when the row is out of bounds —
* per-renderable `getSelectedText` trims what shouldn't copy, matching the
* Ink fork where line-select spans the visual row. */
export function lineSpanAt(screen: ScreenText, y: number): Span | null {
if (y < 0 || y >= screen.height || screen.width <= 0) return null
return { lo: { x: 0, y }, hi: { x: screen.width - 1, y } }
}
/**
* Where a drag at (x, y) puts the selection while an anchor span is held:
* the span under the mouse (word at the pointer, or its row in line mode;
* raw cell fallback when the pointer is out of bounds) is merged with the
* anchor span so the original word/line always stays selected.
*/
export function extendedSelection(
span: AnchorSpan,
screen: ScreenText,
x: number,
y: number
): { anchor: Point; focus: Point } {
let mouseLo: Point
let mouseHi: Point
if (span.kind === 'word') {
const b = wordSpanAt(screen, x, y)
mouseLo = b ? b.lo : { x, y }
mouseHi = b ? b.hi : { x, y }
} else {
const row = Math.max(0, Math.min(y, screen.height - 1))
mouseLo = { x: 0, y: row }
mouseHi = { x: screen.width - 1, y: row }
}
// Mouse target entirely before the anchor span → grow backward from its end;
// entirely after → grow forward from its start; overlapping → just the span.
if (comparePoints(mouseHi, span.lo) < 0) return { anchor: span.hi, focus: mouseLo }
if (comparePoints(mouseLo, span.hi) > 0) return { anchor: span.lo, focus: mouseHi }
return { anchor: span.lo, focus: span.hi }
}
/** Same chain window the Ink fork uses (`App.tsx` MULTI_CLICK_*). */
export const MULTI_CLICK_TIMEOUT_MS = 500
export const MULTI_CLICK_DISTANCE = 1
/**
* Click-chain counter: a press within MULTI_CLICK_TIMEOUT_MS and
* MULTI_CLICK_DISTANCE of the previous press continues the chain, otherwise
* the count resets to 1. The returned count is capped at 3 — quadruple+
* clicks stay line-select, like every terminal/editor.
*/
export function createClickCounter(): (x: number, y: number, now: number) => 1 | 2 | 3 {
let lastTime = 0
let lastX = -1
let lastY = -1
let count = 0
return (x, y, now) => {
const chained =
now - lastTime <= MULTI_CLICK_TIMEOUT_MS &&
Math.abs(x - lastX) <= MULTI_CLICK_DISTANCE &&
Math.abs(y - lastY) <= MULTI_CLICK_DISTANCE
count = chained ? count + 1 : 1
lastTime = now
lastX = x
lastY = y
return count >= 3 ? 3 : (count as 1 | 2)
}
}

View File

@@ -0,0 +1,29 @@
/**
* Notification → desktop-OSC decision. EVERY notification renders an inline
* transcript card (so there's nothing to decide there); this only decides whether
* a notification is important enough to ALSO fire a desktop/terminal OSC ping
* (to pull the user back). The OSC payload is termChrome's `TermNotification`;
* the boundary (terminalChrome) owns the actual escape-sequence write.
*/
import type { ActivityNotification } from './backgroundActivity.ts'
import type { TermNotification } from './termChrome.ts'
/** Kind substrings that mark a "the work finished, look here" notification —
* matched case-insensitively anywhere in the kind. */
const COMPLETION_KIND_HINTS = ['complete', 'done', 'finish']
function isImportant(n: ActivityNotification): boolean {
if (n.level === 'error' || n.level === 'warn') return true
const kind = n.kind.toLowerCase()
return COMPLETION_KIND_HINTS.some(hint => kind.includes(hint))
}
/**
* The desktop OSC notification for `n`, or `undefined` when it's not important
* enough to interrupt — level 'error'/'warn', or a kind containing
* 'complete'/'done'/'finish' (case-insensitive). Title is always 'Hermes' with
* the notification text as the body.
*/
export function notificationOsc(n: ActivityNotification): TermNotification | undefined {
return isImportant(n) ? { body: n.text, title: 'Hermes' } : undefined
}

View File

@@ -0,0 +1,35 @@
/**
* Transient-notice seam (per-block copy feedback, Epic: design pass piece 2).
* Deep view nodes (e.g. the per-block `⧉` copy affordance in messageLine) need
* to flash a short notice ("Copied") on the EXISTING hint line (StatusLine —
* the same surface the entry's flashHint uses for /copy and selection-copy),
* but they don't hold the store. The store registers its `setHint` here at
* creation (one live store per app; the latest registration wins, which is
* also what headless tests want), and `flashNotice` mirrors the entry's
* flashHint contract: set, then auto-clear after `ms` unless something newer
* replaced it. No-op when nothing is registered (bare component tests).
*/
type NotifySink = (text: string | undefined) => void
let sink: NotifySink | undefined
let timer: ReturnType<typeof setTimeout> | undefined
let current: string | undefined
/** Register (or clear) the app-wide notice sink — the store's `setHint`. */
export function registerNotifier(fn: NotifySink | undefined): void {
sink = fn
}
/** Flash a transient notice on the hint line; auto-clears after `ms`. */
export function flashNotice(text: string, ms = 1500): void {
sink?.(text)
current = text
if (timer) clearTimeout(timer)
timer = setTimeout(() => {
if (current === text) {
sink?.(undefined)
current = undefined
}
}, ms)
}

View File

@@ -0,0 +1,50 @@
/**
* Pasted-text placeholders (free-code's model). A large paste isn't dumped raw
* into the composer — instead a compact `[Pasted text #N +M lines]` chip is shown
* and the real content is held in a Map, then expanded back on submit. Pure + no
* OpenTUI imports → trivially unit-testable.
*
* The store is created ONCE per session (entry) and passed to the Composer, so it
* survives the composer remounting when overlays open/close (a per-composer store
* would lose a pending paste mid-compose).
*/
export interface PasteStore {
/** Register a pasted block; returns the placeholder to insert into the input. */
add(text: string): string
/** Replace every `[Pasted text #N …]` placeholder with its stored content. */
expand(input: string): string
/** Drop all stored pastes (call after a successful submit). */
clear(): void
}
// Matches `[Pasted text #12]` and `[Pasted text #12 +34 lines]`. The id is the key.
const REF = /\[Pasted text #(\d+)(?: \+\d+ lines)?\]/g
export function createPasteStore(): PasteStore {
const map = new Map<number, string>()
let seq = 0
return {
add(text) {
const id = ++seq
map.set(id, text)
const lines = text.split('\n').length
return lines > 1 ? `[Pasted text #${id} +${lines} lines]` : `[Pasted text #${id}]`
},
// String.replace(/g) is a SINGLE left-to-right pass over the ORIGINAL string,
// so content inserted for one ref is never re-scanned for another ref —
// a pasted block that itself contains `[Pasted text #k]` is safe.
expand(input) {
return (input ?? '').replace(REF, (m, id: string) => map.get(Number(id)) ?? m)
},
clear() {
map.clear()
seq = 0
}
}
}
/** A paste big enough to placeholder rather than inline (conservative thresholds). */
export function shouldPlaceholder(text: string): boolean {
return text.split('\n').length >= 4 || text.length > 400
}

View File

@@ -0,0 +1,118 @@
/**
* promptHistory — pure logic for the Esc+Esc session prompt viewer (Epic 5).
*
* Model: free-code's rewind dialog (`useDoublePress.ts`, `MessageSelector.tsx`)
* — 800ms double-press window, only-when-input-empty trigger, 7 visible rows
* newest-first with a centered window, Enter → confirm step.
*
* Semantics (spec Epic 5, RESOLVED block):
* - Entries are THIS session's user prompts from the store transcript
* (NOT the per-dir JSONL composer history), newest first. Empty → no modal.
* - Undo = conversation layer (`/undo` removes the LAST user/assistant
* exchange; files kept) → offered ONLY for the most recent prompt. We never
* fake arbitrary-depth conversation rewind.
* - Rollback = filesystem layer (`/rollback` checkpoints; conversation kept).
* Prompt→checkpoint mapping isn't feasible client-side (neither store
* messages nor `session.history` carry timestamps to correlate against
* `rollback.list`'s checkpoint timestamps), so the honest action is plain
* `/rollback`: the gateway's own checkpoint list lands in the transcript
* and the user picks `/rollback <n>` from real data.
*/
/** Double-press window (free-code `DOUBLE_PRESS_TIMEOUT_MS`). */
export const DOUBLE_PRESS_WINDOW_MS = 800
/** Max visible prompt rows before the list windows (free-code `MAX_VISIBLE_MESSAGES`). */
export const MAX_VISIBLE = 7
/**
* Double-press detector (pure state machine; the free-code hook without React).
* `press(now)` returns true on the SECOND press within the window — and then
* disarms, so a third press starts a fresh cycle. `reset()` disarms (call it on
* any intervening key, and never call `press` for an Esc something else
* consumed — that's what keeps a dropdown-dismiss Esc from arming).
*/
export interface DoublePress {
press(now?: number): boolean
reset(): void
}
export function createDoublePress(windowMs: number = DOUBLE_PRESS_WINDOW_MS): DoublePress {
let armedAt: number | undefined
return {
// performance.now() is monotonic (Node) — an NTP/wall-clock jump between
// two presses can't break or spuriously satisfy the window (review finding).
press(now: number = performance.now()): boolean {
if (armedAt !== undefined && now - armedAt <= windowMs) {
armedAt = undefined
return true
}
armedAt = now
return false
},
reset(): void {
armedAt = undefined
}
}
}
/** One viewer row: a user prompt of THIS session. `index` is its position in
* the source transcript (stable identity across renders). */
export interface PromptEntry {
readonly index: number
readonly text: string
}
/**
* Source the viewer entries from the store transcript: USER prompts only,
* non-empty, NEWEST FIRST. Session-only by construction (the store holds only
* this session's messages). Empty session → [] (the trigger shows nothing).
*/
export function promptHistoryEntries(messages: ReadonlyArray<{ readonly role: string; text: string }>): PromptEntry[] {
const entries: PromptEntry[] = []
for (let i = messages.length - 1; i >= 0; i--) {
const m = messages[i]
if (m && m.role === 'user' && m.text.trim() !== '') entries.push({ index: i, text: m.text })
}
return entries
}
/** A confirm-step action. */
export type HistoryAction = 'undo' | 'rollback'
export interface ConfirmOption {
readonly action: HistoryAction
readonly label: string
}
/** The exact signed-off confirm labels (spec Epic 5). */
export const UNDO_LABEL = 'Undo — rewind the conversation (files kept)'
export const ROLLBACK_LABEL = 'Rollback — restore files from checkpoint (conversation kept)'
/**
* The confirm-step options for a selected entry. `/undo` only removes the LAST
* exchange, so Undo is offered ONLY for the most recent prompt (`isLatest`) —
* an option the gateway can't honor is hidden, never a dead button. Rollback
* (filesystem checkpoints) applies regardless of the selected depth.
*/
export function confirmOptions(isLatest: boolean): ConfirmOption[] {
const options: ConfirmOption[] = []
if (isLatest) options.push({ action: 'undo', label: UNDO_LABEL })
options.push({ action: 'rollback', label: ROLLBACK_LABEL })
return options
}
/** The slash command an action dispatches — through the SAME command path the
* composer uses (`dispatchSlash` → `slash.exec`/`command.dispatch`). */
export function actionCommand(action: HistoryAction): string {
return action === 'undo' ? '/undo' : '/rollback'
}
/**
* First visible row index for a list window: keep the selection centered until
* the window hits either end (free-code `firstVisibleIndex`). Total ≤ visible
* → 0 (everything shows).
*/
export function windowStart(selected: number, total: number, visible: number = MAX_VISIBLE): number {
return Math.max(0, Math.min(selected - Math.floor(visible / 2), total - visible))
}

View File

@@ -0,0 +1,126 @@
/**
* /replay — spawn-tree inspector logic (Epic 3 port; Ink ref
* `app/slash/commands/ops.ts` /replay + `spawnHistoryStore.ts`). The gateway
* archives each completed delegation fan-out as a JSON snapshot
* (`spawn_tree.save`); these helpers read `spawn_tree.list` / `spawn_tree.load`
* payloads and format them as PAGER TEXT — the native engine renders replays
* through the existing pager overlay instead of Ink's agents overlay.
*
* All readers are defensive (wire JSON is loose, snapshots cross versions).
*/
export interface SpawnTreeEntry {
path: string
label: string
count: number
/** Epoch SECONDS (gateway convention). */
finishedAt?: number
sessionId?: string
}
function str(v: unknown): string | undefined {
return typeof v === 'string' && v ? v : undefined
}
function num(v: unknown): number | undefined {
return typeof v === 'number' && Number.isFinite(v) ? v : undefined
}
/** Map a `spawn_tree.list` result ({entries:[…]}) into typed rows (pathless rows dropped). */
export function readSpawnTreeEntries(result: unknown): SpawnTreeEntry[] {
if (!result || typeof result !== 'object') return []
const entries = (result as { entries?: unknown }).entries
if (!Array.isArray(entries)) return []
const out: SpawnTreeEntry[] = []
for (const e of entries) {
if (!e || typeof e !== 'object') continue
const o = e as { [k: string]: unknown }
const path = str(o['path'])
if (!path) continue
const entry: SpawnTreeEntry = {
count: num(o['count']) ?? 0,
label: str(o['label']) ?? '',
path
}
const finishedAt = num(o['finished_at'])
if (finishedAt !== undefined) entry.finishedAt = finishedAt
const sessionId = str(o['session_id'])
if (sessionId !== undefined) entry.sessionId = sessionId
out.push(entry)
}
return out
}
function fmtWhen(epochSeconds: number | undefined): string {
if (epochSeconds === undefined) return '?'
try {
return new Date(epochSeconds * 1000).toLocaleString()
} catch {
return '?'
}
}
/** The bare `/replay` listing: indexed rows the user replays by number. */
export function formatSpawnTreeList(entries: readonly SpawnTreeEntry[]): string {
const lines: string[] = ['Archived spawn trees — /replay <n> to view, /replay <path> for any snapshot', '']
entries.forEach((e, i) => {
const label = e.label || `${e.count} subagent${e.count === 1 ? '' : 's'}`
lines.push(`${String(i + 1).padStart(3)}. ${fmtWhen(e.finishedAt)} · ${e.count}×${label}`)
lines.push(` ${e.path}`)
})
return lines.join('\n')
}
/** Status glyph for an archived subagent row. */
function statusGlyph(status: string): string {
if (status === 'completed') return '✓'
if (status === 'error' || status === 'failed' || status === 'timeout') return '✗'
if (status === 'interrupted') return '⏹'
return '●'
}
/** One archived subagent → its pager lines (indented by spawn depth). */
function subagentLines(raw: unknown, index: number): string[] {
const o = (raw && typeof raw === 'object' ? raw : {}) as { [k: string]: unknown }
const depth = num(o['depth']) ?? 0
const pad = ' '.repeat(Math.max(0, depth))
const status = str(o['status']) ?? 'completed'
const goal = str(o['goal']) ?? 'subagent'
const lines = [`${pad}${statusGlyph(status)} [${index + 1}] ${goal}`]
const meta: string[] = [status]
const model = str(o['model'])
if (model) meta.push(model)
const duration = num(o['durationSeconds'])
if (duration !== undefined) meta.push(`${Math.round(duration)}s`)
const tools = num(o['toolCount'])
if (tools) meta.push(`${tools} tool${tools === 1 ? '' : 's'}`)
const tokIn = num(o['inputTokens'])
const tokOut = num(o['outputTokens'])
if (tokIn !== undefined || tokOut !== undefined) meta.push(`${tokIn ?? 0} in / ${tokOut ?? 0} out tok`)
lines.push(`${pad} ${meta.join(' · ')}`)
const summary = str(o['summary'])
if (summary) for (const s of summary.split('\n')) lines.push(`${pad} ${s}`)
const notes = o['notes']
if (Array.isArray(notes)) {
for (const note of notes) if (typeof note === 'string' && note) lines.push(`${pad} · ${note}`)
}
return lines
}
/** A loaded snapshot (`spawn_tree.load` payload) → the full pager text. */
export function formatSpawnTree(payload: unknown): string {
const o = (payload && typeof payload === 'object' ? payload : {}) as { [k: string]: unknown }
const subagents = Array.isArray(o['subagents']) ? (o['subagents'] as unknown[]) : []
const header: string[] = []
const label = str(o['label'])
header.push(label ?? 'spawn tree')
const meta: string[] = []
const sessionId = str(o['session_id'])
if (sessionId) meta.push(`session ${sessionId}`)
meta.push(`finished ${fmtWhen(num(o['finished_at']))}`)
meta.push(`${subagents.length} subagent${subagents.length === 1 ? '' : 's'}`)
header.push(meta.join(' · '))
if (!subagents.length) return [...header, '', '(snapshot empty or unreadable)'].join('\n')
const body = subagents.flatMap((s, i) => ['', ...subagentLines(s, i)])
return [...header, ...body].join('\n')
}

View File

@@ -0,0 +1,101 @@
/**
* Resume snapshot mapper (spec §1 lifecycle; gotcha §8 #5). Maps the
* `session.resume` response `messages` (tui_gateway `_history_to_messages`) into
* the store's `Message[]`. Each history entry is either `{role, text}` (user/
* assistant/system) or `{role:'tool', name, context}` (NO text — render it).
*
* Tool rows are folded into the PRECEDING assistant turn's ordered `parts[]`
* (state:'complete', summary=context) so a resumed transcript renders inline like
* a live one. Resumed assistant text is given a single text part so it renders
* through the native markdown path. IDs are `r*` (distinct from live `p*`).
*/
import type { Message, Part, SessionItem, ToolPartState } from './store.ts'
import { stripOmittedNote, stripToolEnvelope } from './toolOutput.ts'
function readStr(value: unknown, key: string): string | undefined {
if (!value || typeof value !== 'object') return undefined
const v = (value as { [k: string]: unknown })[key]
return typeof v === 'string' ? v : undefined
}
function readNum(value: unknown, key: string): number {
if (!value || typeof value !== 'object') return 0
const v = (value as { [k: string]: unknown })[key]
return typeof v === 'number' ? v : 0
}
/** Map a `session.list` result into switcher rows (loose-typed read). */
export function mapSessionList(result: unknown): SessionItem[] {
if (!result || typeof result !== 'object') return []
const sessions = (result as { sessions?: unknown }).sessions
if (!Array.isArray(sessions)) return []
const out: SessionItem[] = []
for (const s of sessions) {
const id = readStr(s, 'id')
if (!id) continue
out.push({
id,
messageCount: readNum(s, 'message_count'),
preview: readStr(s, 'preview') ?? '',
title: readStr(s, 'title') ?? ''
})
}
return out
}
export function mapResumeHistory(history: unknown): Message[] {
if (!Array.isArray(history)) return []
const out: Message[] = []
let seq = 0
const id = () => `r${++seq}`
let currentAssistant: Message | undefined
for (const raw of history) {
const role = readStr(raw, 'role')
if (role === 'tool') {
const name = readStr(raw, 'name') ?? 'tool'
const context = readStr(raw, 'context')
const tool: ToolPartState = { type: 'tool', id: id(), name, state: 'complete' }
// Match the live tool part exactly (item 1): primary-arg preview in the
// header, plus the (capped) output so resumed tools are collapsible too.
if (context) tool.argsPreview = context
const rawResult = readStr(raw, 'result_text')
if (rawResult) {
const { body, omittedNote } = stripOmittedNote(rawResult)
const resultText = stripToolEnvelope(body)
if (resultText) {
tool.resultText = resultText
tool.lineCount = resultText.replace(/\s+$/, '').split('\n').length
}
if (omittedNote) tool.omittedNote = omittedNote
}
const args = (raw as { args?: unknown }).args
if (args && typeof args === 'object') {
try {
tool.argsText = JSON.stringify(args, null, 2)
} catch {
/* unstringifiable — leave unset */
}
}
if (!currentAssistant) {
currentAssistant = { role: 'assistant', text: '', parts: [] }
out.push(currentAssistant)
}
;(currentAssistant.parts ??= []).push(tool)
continue
}
const text = readStr(raw, 'text') ?? ''
if (role === 'assistant') {
const parts: Part[] = text ? [{ type: 'text', id: id(), text }] : []
currentAssistant = { role: 'assistant', text, parts }
out.push(currentAssistant)
} else if (role === 'user' || role === 'system') {
out.push({ role, text })
currentAssistant = undefined
}
}
return out
}

View File

@@ -0,0 +1,332 @@
/**
* sessionPicker.ts — pure logic for the tabbed resume picker (design doc
* docs/plans/opentui-resume-picker.md §A/§B item 5; supersedes the flat
* SessionSwitcher). Everything here is view-free and vitest-covered:
*
* - tab definitions + source→tab classification (Recent = interactive
* cli/tui/acp + unknown/custom; Cron = cron; Gateways = the known platform
* sources; All = everything minus the deny-listed `tool`),
* - the `session.list` params each tab queries with (`sources` allow-list —
* note the one honest gap: unknown/custom sources CLASSIFY as Recent but
* can't be expressed in an allow-list, so they surface under All),
* - the client-side search filter chain over title/preview/cwd/id (reuses
* fuzzy.ts — same scorer as the model picker),
* - the key-routing decision table (pattern: completionMenu's routeMenuKey),
* - the relative-time formatter + row-meta composer (time · source · N msgs
* · tail-truncated cwd),
* - `/sessions <tab>` arg parsing and the `/resume <id|name>` resolver.
*/
import { fuzzyFilter, type FuzzyField } from './fuzzy.ts'
// ── tabs + classification ─────────────────────────────────────────────────
export type SessionTabId = 'recent' | 'cron' | 'gateways' | 'all'
/** Tab strip order + labels (design doc §A — Recent is the default). */
export const SESSION_TABS: ReadonlyArray<{ id: SessionTabId; label: string }> = [
{ id: 'recent', label: 'Recent' },
{ id: 'cron', label: 'Cron' },
{ id: 'gateways', label: 'Gateways' },
{ id: 'all', label: 'All' }
]
/** Interactive sources — the Recent tab's allow-list. */
export const INTERACTIVE_SOURCES: readonly string[] = ['cli', 'tui', 'acp']
/** Known platform/gateway sources (the Gateways tab's allow-list). The gateway
* itself deny-lists only `tool`, so this list is the picker's working set of
* "messaging platform" tags; new platforms join here (or show under All). */
export const PLATFORM_SOURCES: readonly string[] = [
'telegram',
'discord',
'slack',
'whatsapp',
'signal',
'imessage',
'matrix',
'teams',
'email',
'webhook',
'x',
'twitter',
'mastodon',
'irc',
'mattermost'
]
/** Classify a session `source` tag into its home tab (`tool` = deny-listed —
* never shown). Unknown/custom sources (incl. empty) default to Recent per
* the design table: they're assumed interactive `HERMES_SESSION_SOURCE`s. */
export function classifySource(source: string | undefined): 'recent' | 'cron' | 'gateways' | 'tool' {
const s = (source ?? '').trim().toLowerCase()
if (s === 'tool') return 'tool'
if (s === 'cron') return 'cron'
if (PLATFORM_SOURCES.includes(s)) return 'gateways'
return 'recent'
}
/** Whether a row with this source belongs on the given tab. */
export function tabAccepts(tab: SessionTabId, source: string | undefined): boolean {
const cls = classifySource(source)
if (cls === 'tool') return false
return tab === 'all' || cls === tab
}
/**
* The `session.list` params a tab queries with. Cron/Gateways push an exact
* `sources` allow-list to the gateway; All omits it (the gateway deny-lists
* `tool` itself). Recent sends the interactive allow-list — the one honest gap
* vs `classifySource` (unknown/custom sources can't be allow-listed, so they
* appear under All only); fetching everything and filtering client-side would
* make Recent unusable in cron-heavy DBs (1500+ cron rows drown the page).
*/
export function listParamsFor(tab: SessionTabId, offset: number, limit: number): Record<string, unknown> {
const base: Record<string, unknown> = { limit, offset }
if (tab === 'recent') return { ...base, sources: [...INTERACTIVE_SOURCES] }
if (tab === 'cron') return { ...base, sources: ['cron'] }
if (tab === 'gateways') return { ...base, sources: [...PLATFORM_SOURCES] }
return base
}
// ── session.list row mapping ──────────────────────────────────────────────
/** One picker row — the widened `session.list` projection (gateway 529d8084b). */
export interface SessionRow {
id: string
title: string
preview: string
source: string
messageCount: number
startedAt: number
lastActive: number
endedAt?: number
model?: string
cwd?: string
}
function readStr(value: unknown, key: string): string | undefined {
if (!value || typeof value !== 'object') return undefined
const v = (value as { [k: string]: unknown })[key]
return typeof v === 'string' ? v : undefined
}
function readNum(value: unknown, key: string): number {
if (!value || typeof value !== 'object') return 0
const v = (value as { [k: string]: unknown })[key]
return typeof v === 'number' ? v : 0
}
/** Map a widened `session.list` result into rows + the honesty flag. */
export function mapSessionRows(result: unknown): { rows: SessionRow[]; truncated: boolean } {
if (!result || typeof result !== 'object') return { rows: [], truncated: false }
const sessions = (result as { sessions?: unknown }).sessions
const truncated = (result as { truncated?: unknown }).truncated === true
if (!Array.isArray(sessions)) return { rows: [], truncated }
const rows: SessionRow[] = []
for (const s of sessions) {
const id = readStr(s, 'id')
if (!id) continue
const row: SessionRow = {
id,
lastActive: readNum(s, 'last_active') || readNum(s, 'started_at'),
messageCount: readNum(s, 'message_count'),
preview: readStr(s, 'preview') ?? '',
source: readStr(s, 'source') ?? '',
startedAt: readNum(s, 'started_at'),
title: readStr(s, 'title') ?? ''
}
const endedAt = readNum(s, 'ended_at')
if (endedAt) row.endedAt = endedAt
const model = readStr(s, 'model')
if (model) row.model = model
const cwd = readStr(s, 'cwd')
if (cwd) row.cwd = cwd
rows.push(row)
}
return { rows, truncated }
}
// ── search filter chain (client-side, within the active tab) ──────────────
/** Fuzzy haystacks of a row: title ×2 (primary), preview, cwd, id. */
export function sessionFields(row: SessionRow): FuzzyField[] {
const fields: FuzzyField[] = []
if (row.title) fields.push({ text: row.title, weight: 2 })
if (row.preview) fields.push({ text: row.preview })
if (row.cwd) fields.push({ text: row.cwd })
fields.push({ text: row.id })
return fields
}
/** Filter + rank rows by the search query (empty → all rows, fetch order). */
export function filterSessions(query: string, rows: readonly SessionRow[]): SessionRow[] {
return fuzzyFilter(query, rows, sessionFields)
}
// ── this-directory grouping ───────────────────────────────────────────────
/** Path equality for cwd grouping: trim + drop trailing slashes. Pure string
* work (no fs) — rows carry the gateway's already-absolute paths. */
export function normalizeCwd(path: string | undefined): string {
return (path ?? '').trim().replace(/\/+$/, '')
}
/** Display order with sessions started in the CURRENT directory first.
*
* Browse mode only: while a search query is active the fuzzy score owns the
* order (relevance beats locality), so `hereCount` is 0 and rows pass through.
* Stable within both groups (each keeps the gateway's recency order). The
* view renders section captions off `hereCount`; selection math is untouched
* because this just reorders the one flat list.
*/
export function orderRowsForCwd(
rows: SessionRow[],
currentCwd: string | undefined,
query: string
): { rows: SessionRow[]; hereCount: number } {
const here = normalizeCwd(currentCwd)
if (!here || query.trim()) return { hereCount: 0, rows }
const local: SessionRow[] = []
const elsewhere: SessionRow[] = []
for (const row of rows) (normalizeCwd(row.cwd) === here ? local : elsewhere).push(row)
if (!local.length) return { hereCount: 0, rows }
return { hereCount: local.length, rows: [...local, ...elsewhere] }
}
// ── key routing (pattern: completionMenu.ts routeMenuKey) ────────────────
export interface SessionPickerKeyContext {
/** Whether the inline Ctrl+R rename is active (it owns Enter/Esc). */
renaming: boolean
/** Whether the search query is empty (←/→ only cycle tabs when it is). */
queryEmpty: boolean
}
export type SessionPickerAction =
| { kind: 'close' }
| { kind: 'resume' }
| { kind: 'move'; dir: 1 | -1 }
| { kind: 'cycle-tab'; dir: 1 | -1 }
| { kind: 'preview' }
| { kind: 'rename' }
| { kind: 'commit-rename' }
| { kind: 'cancel-rename' }
| { kind: 'pass' }
const PASS: SessionPickerAction = { kind: 'pass' }
/**
* Route one key press. While RENAMING, the rename input owns every key except
* Enter (commit) and Esc/Ctrl+C (cancel rename — NOT close). Otherwise:
* Esc/Ctrl+C close, Enter resumes, ↑↓ (or Ctrl+P/N) move, Tab/Shift+Tab cycle
* tabs, ←/→ cycle only on an empty query (with text they stay cursor moves),
* Space toggles the preview (it never types — fuzzy terms don't need literal
* spaces), Ctrl+R starts the inline rename. Everything else belongs to the
* focused search input.
*/
export function routeSessionPickerKey(
name: string,
mods: { ctrl?: boolean; shift?: boolean },
ctx: SessionPickerKeyContext
): SessionPickerAction {
if (ctx.renaming) {
if (name === 'return') return { kind: 'commit-rename' }
if (name === 'escape' || (mods.ctrl && name === 'c')) return { kind: 'cancel-rename' }
return PASS
}
if (name === 'escape' || (mods.ctrl && name === 'c')) return { kind: 'close' }
if (name === 'return') return { kind: 'resume' }
if (name === 'up' || (mods.ctrl && name === 'p')) return { kind: 'move', dir: -1 }
if (name === 'down' || (mods.ctrl && name === 'n')) return { kind: 'move', dir: 1 }
if (name === 'tab') return { kind: 'cycle-tab', dir: mods.shift ? -1 : 1 }
if ((name === 'left' || name === 'right') && ctx.queryEmpty) {
return { kind: 'cycle-tab', dir: name === 'left' ? -1 : 1 }
}
if (name === 'space') return { kind: 'preview' }
if (mods.ctrl && name === 'r') return { kind: 'rename' }
return PASS
}
// ── relative time + row meta ──────────────────────────────────────────────
/** Epoch seconds OR milliseconds → ms (DB rows are seconds; be lenient). */
function toMs(epoch: number): number {
return epoch >= 1e12 ? epoch : epoch * 1000
}
const TIME_STEPS: ReadonlyArray<{ ms: number; unit: string }> = [
{ ms: 60_000, unit: 'minute' },
{ ms: 3_600_000, unit: 'hour' },
{ ms: 86_400_000, unit: 'day' },
{ ms: 604_800_000, unit: 'week' },
{ ms: 2_629_800_000, unit: 'month' },
{ ms: 31_557_600_000, unit: 'year' }
]
/** "just now" / "1 minute ago" / "5 hours ago" / "2 weeks ago" … */
export function relativeTime(epoch: number | undefined, nowMs: number): string {
if (!epoch) return 'unknown'
const delta = nowMs - toMs(epoch)
if (delta < 60_000) return 'just now'
for (let i = TIME_STEPS.length - 1; i >= 0; i--) {
const step = TIME_STEPS[i]
if (step && delta >= step.ms) {
const n = Math.floor(delta / step.ms)
return `${n} ${step.unit}${n === 1 ? '' : 's'} ago`
}
}
return 'just now'
}
/** Tail-truncate a path-ish string to `max` chars (`…tail/of/path`). */
export function tailTruncate(text: string, max: number): string {
if (text.length <= max) return text
return `${text.slice(text.length - (max - 1))}`
}
/** Max cwd tail shown in a row's meta line. */
const META_CWD_MAX = 40
/** Row meta line: relative time · source · N msgs · cwd (when present). */
export function rowMeta(row: SessionRow, nowMs: number): string {
const parts = [
relativeTime(row.lastActive || row.startedAt, nowMs),
row.source || 'unknown',
`${row.messageCount} msgs`
]
if (row.cwd) parts.push(tailTruncate(row.cwd, META_CWD_MAX))
return parts.join(' · ')
}
// ── slash entry points ────────────────────────────────────────────────────
/** Parse a `/sessions <tab>` argument (case-insensitive, singular tolerated;
* bare/empty → the default Recent tab). Garbage → undefined (usage notice). */
export function parseSessionTabArg(arg: string): SessionTabId | undefined {
const a = arg.trim().toLowerCase()
if (!a || a === 'recent') return 'recent'
if (a === 'cron') return 'cron'
if (a === 'gateway' || a === 'gateways') return 'gateways'
if (a === 'all') return 'all'
return undefined
}
/**
* Resolve a `/resume <id|name>` argument against listed rows (the direct
* path): exact id → unique id prefix → exact title (case-insensitive) →
* unique case-insensitive title substring. Ambiguous/missing → undefined.
*/
export function resolveSessionArg(rows: readonly SessionRow[], arg: string): SessionRow | undefined {
const needle = arg.trim()
if (!needle) return undefined
const exactId = rows.find(r => r.id === needle)
if (exactId) return exactId
const idPrefix = rows.filter(r => r.id.startsWith(needle))
if (idPrefix.length === 1) return idPrefix[0]
const lower = needle.toLowerCase()
const exactTitle = rows.filter(r => r.title.toLowerCase() === lower)
if (exactTitle.length === 1) return exactTitle[0]
const sub = rows.filter(r => r.title.toLowerCase().includes(lower))
if (sub.length === 1) return sub[0]
return undefined
}

View File

@@ -0,0 +1,161 @@
/**
* Slash-token matching for the composer (Epic 6) — pure tokenizer + matcher,
* no deps, fully table-testable. The composer uses this to:
*
* 1. HIGHLIGHT a `/name` token whose name exactly matches a valid
* command/skill name (native textarea highlight ranges),
* 2. SUGGEST an autocorrect when the message IS a bare `/name` token at the
* very start and the name is exactly one edit away (Damerau-Levenshtein /
* OSA distance 1) from exactly ONE valid name — surfaced through the
* existing completion dropdown, never auto-applied.
*
* Anti-jank rule (the whole point): a `/` in the middle of prose must NOT
* trigger completion or autocorrect. Mid-prose tokens get highlight-only when
* they exactly match a valid name; otherwise nothing happens. Path-looking
* tokens (`a/b`, `/usr/bin`, `./x`) are never tokens at all.
*
* The catalog of valid names is supplied by the caller (the composer LEARNS it
* from the slash-completion batches the gateway already sends — the completion
* flow is the source of truth; nothing is hardcoded here).
*/
/** A standalone `/name` token found in the composer text. */
export interface SlashToken {
/** The name WITHOUT the leading `/`. */
name: string
/** Char offset of the leading `/` in the text. */
start: number
/** Char offset one past the last name char. */
end: number
/** Whether the token sits at the very start of the message (offset 0). */
lead: boolean
}
/** An autocorrect suggestion for the lead token (`/comit` → `commit`). */
export interface SlashSuggestion {
/** The corrected name (no slash). */
name: string
/** Char offset the accepted suggestion replaces from (just past the `/`). */
from: number
}
export interface SlashAnalysis {
/** Tokens whose name EXACTLY matches a valid name — highlight these. */
highlights: SlashToken[]
/** The one-edit autocorrect for a bare lead token; null when none applies. */
suggestion: SlashSuggestion | null
}
/** Command/skill name charset: starts alphanumeric, then word chars / `.` / `-`.
* Notably EXCLUDES `/` — `/usr/bin` is a path, never a command token. */
const NAME_RE = /^[A-Za-z0-9][\w.-]*$/
const isSpace = (ch: string | undefined): boolean => ch === ' ' || ch === '\t' || ch === '\n' || ch === '\r'
/**
* Extract every standalone `/name` token. Boundary rules:
* - the `/` must be at the start of the text or preceded by whitespace
* (`a/b` and `path/to` are not tokens),
* - the name runs to the next whitespace (or end) and must match NAME_RE
* (`/usr/bin` has a `/` in the body → not a token; bare `/` is nothing).
*/
export function slashTokens(text: string): SlashToken[] {
const tokens: SlashToken[] = []
for (let i = 0; i < text.length; i++) {
if (text[i] !== '/') continue
if (i > 0 && !isSpace(text[i - 1])) continue
let j = i + 1
while (j < text.length && !isSpace(text[j])) j++
const name = text.slice(i + 1, j)
if (NAME_RE.test(name)) tokens.push({ end: j, lead: i === 0, name, start: i })
i = j
}
return tokens
}
/**
* Whether `a` and `b` are EXACTLY one edit apart under Damerau-Levenshtein
* (OSA): one substitution, one insertion/deletion, or one adjacent
* transposition. Equal strings are zero edits → false.
*/
export function isOneEdit(a: string, b: string): boolean {
if (a === b) return false
const la = a.length
const lb = b.length
if (Math.abs(la - lb) > 1) return false
if (la === lb) {
// one substitution, or one adjacent transposition
let i = 0
while (i < la && a[i] === b[i]) i++
if (i === la) return false // identical (handled above, defensive)
// try substitution: rest after i must match
if (a.slice(i + 1) === b.slice(i + 1)) return true
// try transposition of i,i+1
return i + 1 < la && a[i] === b[i + 1] && a[i + 1] === b[i] && a.slice(i + 2) === b.slice(i + 2)
}
// one insertion/deletion: align the longer against the shorter
const [short, long] = la < lb ? [a, b] : [b, a]
let i = 0
while (i < short.length && short[i] === long[i]) i++
return short.slice(i) === long.slice(i + 1)
}
/**
* Analyze the composer text against the valid-name catalog.
* Matching is CASE-SENSITIVE per the catalog (commands are stored lowercase;
* `/Help` is not exact — though it IS one edit from `help`, so it suggests).
*
* Suggestion rules (anti-jank):
* - only for the LEAD token, and only while the message is EXACTLY the bare
* token (`/comit` — not `/comit args`, never mid-prose),
* - the token must not already be exact,
* - exactly ONE catalog name within one edit; ambiguity → nothing.
*/
export function analyzeSlash(text: string, names: ReadonlySet<string>): SlashAnalysis {
const tokens = slashTokens(text)
const highlights = tokens.filter(t => names.has(t.name))
let suggestion: SlashSuggestion | null = null
const lead = tokens[0]
if (lead && lead.lead && text === `/${lead.name}` && !names.has(lead.name)) {
let candidate: string | undefined
let count = 0
for (const n of names) {
if (isOneEdit(lead.name, n)) {
candidate = n
if (++count > 1) break
}
}
if (count === 1 && candidate !== undefined) suggestion = { from: 1, name: candidate }
}
return { highlights, suggestion }
}
/**
* Names learnable from a slash-completion batch. Only when the composer text is
* a bare lead token (`/…` with no space) are the candidates command/skill NAMES
* — after a space the gateway completes ARGS (`/details thinking`), which must
* not pollute the catalog. Item text arrives as `name `, `name`, or `/name`
* (the gateway's extras carry the slash); all normalize to the bare name.
*/
export function learnableNames(text: string, items: ReadonlyArray<{ text: string }>): string[] {
if (!/^\/\S*$/.test(text)) return []
const out: string[] = []
for (const item of items) {
let name = item.text.trim()
if (name.startsWith('/')) name = name.slice(1)
if (NAME_RE.test(name)) out.push(name)
}
return out
}
/**
* Convert a JS string offset into the native highlight char offset: the native
* char-range counter skips newlines (mirror of ExtmarksController.
* offsetExcludingNewlines in @opentui/core for plain-width text).
*/
export function nativeCharOffset(text: string, offset: number): number {
let newlines = 0
const max = Math.min(offset, text.length)
for (let i = 0; i < max; i++) if (text[i] === '\n') newlines++
return offset - newlines
}

View File

@@ -0,0 +1,894 @@
/**
* Slash command system — the SOLID side (spec §1; mirrors Ink
* `app/createSlashHandler.ts` + `domain/slash.ts`). Plain functions/data, NOT
* Effect; the boundary injects a Promise-returning `request` so dispatch can call
* `slash.exec` / `command.dispatch` / `commands.catalog`.
*
* Dispatch ladder (Ink parity):
* 1. client-local command (the TUI-only set — handled in-process)
* 2. `slash.exec {command, session_id}` → `{output, warning?}` → system line
* 3. on reject → `command.dispatch {arg, name, session_id}` → typed action
* (exec/plugin → system · alias → re-dispatch · skill/send → submit a turn ·
* prefill → notice). Long output routes to the pager (Phase 5a).
*/
import { diagnosticsEnabled } from './env.ts'
import { DETAILS_SECTIONS, DETAILS_USAGE, type DetailsMode, nextDetailsMode, parseDetailsMode } from './details.ts'
import { formatBytes, memReport, performHeapdump } from './diagnostics.ts'
import { formatSpawnTree, formatSpawnTreeList, readSpawnTreeEntries } from './replay.ts'
import { mapSessionRows, parseSessionTabArg, resolveSessionArg, type SessionTabId } from './sessionPicker.ts'
import type { CompletionItem, PickerItem, PickerState } from './store.ts'
export interface ParsedSlash {
name: string
arg: string
}
/** Parse `/name rest…` → {name, arg}; null if not a slash command. */
export function parseSlash(input: string): ParsedSlash | null {
if (!input.startsWith('/')) return null
const body = input.slice(1).trimStart()
if (!body) return null
const sp = body.indexOf(' ')
return sp === -1 ? { arg: '', name: body } : { arg: body.slice(sp + 1).trim(), name: body.slice(0, sp) }
}
/** How a submitted composer line is routed (F9 + slash ladder): a `!cmd` runs a
* shell command, a `/command` goes through the slash dispatcher, everything else
* is a prompt turn. `payload` is the command (shell) with the lead `!` stripped
* and trimmed, or the original text (slash/prompt). */
export type SubmitRoute =
| { kind: 'shell'; payload: string }
| { kind: 'slash'; payload: string }
| { kind: 'prompt'; payload: string }
export function classifySubmit(text: string): SubmitRoute {
if (text.startsWith('!')) return { kind: 'shell', payload: text.slice(1).trim() }
if (text.startsWith('/')) return { kind: 'slash', payload: text }
return { kind: 'prompt', payload: text }
}
/** The host capabilities the dispatcher needs (wired by the entry boundary). */
export interface SlashContext {
/** Server RPC (resolves with the result, rejects on GatewayError). */
readonly request: (method: string, params: Record<string, unknown>) => Promise<unknown>
readonly sessionId: () => string | undefined
readonly pushSystem: (text: string) => void
/** Open the full-screen pager (long output: /status, /logs, …). */
readonly openPager: (title: string, text: string) => void
/** Submit a user turn (skill/send dispatch results). */
readonly submit: (text: string) => void
/** Open a local Y/N confirm; `onConfirm` runs on Yes. */
readonly confirm: (message: string, onConfirm: () => void) => void
readonly clearTranscript: () => void
/** Copy the n-th newest assistant response to the clipboard; returns whether something was copied. */
readonly copyResponse: (n: number) => boolean
readonly quit: () => void
/** Recent log lines for `/logs` (the ring buffer). */
readonly logTail: () => string[]
/** Open the tabbed resume picker on the given tab (/sessions, bare /resume). */
readonly openSessionPicker: (tab: SessionTabId) => void
/** Resume a session directly by id (`/resume <id|name>` — no picker). */
readonly resumeSession: (sessionId: string) => void
/** Open a generic picker (model picker, skills hub). */
readonly openPicker: (picker: PickerState) => void
/** Open the agents dashboard (/agents, /tasks). */
readonly openDashboard: () => void
/** Open the OS background-process panel (/processes). */
readonly openBackgroundPanel: () => void
/** Track an in-flight background-prompt task id (`/bg` → prompt.background). */
readonly addBgTask: (id: string) => void
/** Cached `/model` picker rows (Epic 7 instant open); undefined until prefetched. */
readonly modelItems: () => PickerItem[] | undefined
/** Update the cached `/model` picker rows. */
readonly setModelItems: (items: PickerItem[]) => void
/** Read / set the compact-transcript display flag (/compact — Epic 3). */
readonly compact: () => boolean
readonly setCompact: (on: boolean) => void
/** Read / set the global tool/reasoning detail mode (/details — Epic 3). */
readonly details: () => DetailsMode
readonly setDetails: (mode: DetailsMode) => void
/** Mounted-renderable count under the live renderer root (a /mem diagnostic);
* undefined when no renderer is reachable (tests). */
readonly renderableCount: () => number | undefined
}
function readStr(value: unknown, key: string): string | undefined {
if (!value || typeof value !== 'object') return undefined
const v = (value as { [k: string]: unknown })[key]
return typeof v === 'string' ? v : undefined
}
const titleCase = (name: string) => name.charAt(0).toUpperCase() + name.slice(1)
/** A planned completion query (item 5/13): which RPC + params, and where an
* accepted item replaces from if the RPC omits its own `replace_from`. */
export interface CompletionPlan {
method: 'complete.slash' | 'complete.path'
params: Record<string, unknown>
from: number
}
/** The command-name grammar for the lead `/token` (mirrors skillMatch NAME_RE):
* starts alphanumeric, then word chars / `.` / `-`. Notably EXCLUDES `/`, so a
* path like `/usr/bin` is NEVER a slash command (F2). */
const SLASH_NAME_RE = /^[A-Za-z0-9][\w.-]*$/
/** `@`-mention is the ONLY file/dir completion trigger now (F8b — glitch
* 2026-06-13: drop `~`/`./`/`/`/bare-path as triggers; the gateway's
* complete.path still understands `@file:`/`@folder:`/fuzzy basename). */
function isPathLike(word: string): boolean {
return word.startsWith('@')
}
/**
* Decide what to complete for the composer text + cursor offset:
* - the text is a slash command — `/` at the very start → `complete.slash
* {text}`. A bare `/` opens the full command list immediately (glitch
* 2026-06-13); `/m`, `/model foo` narrow it. A `/abs/path` whose first token
* isn't a valid name (F2) → no slash menu.
* - the WORD under the cursor is an `@`-mention → `complete.path {word}` for
* file/dir tagging (F8b).
* - otherwise nothing.
*
* Cursor-aware (F7/F8): completion is computed from the line/token at the cursor,
* so it keeps working on later lines after Shift+Enter (the old whole-buffer
* `includes('\n')` bail killed it on every multi-line buffer). `cursor` defaults
* to the end of `text`. Slash commands stay first-line-only (a `/` mid-buffer is
* prose, never a command).
* Returns null when there's no completion to run (so the dropdown clears).
*/
export function planCompletion(text: string, cursor: number = text.length): CompletionPlan | null {
// Slash command: only when the WHOLE buffer's lead token is a command. A `/`
// after a newline is prose, so a slash command never spans lines.
if (text.startsWith('/') && !text.includes('\n')) {
const body = text.slice(1)
const space = body.search(/\s/)
const name = space === -1 ? body : body.slice(0, space)
// Hydrate on a BARE `/` (body === '', glitch 2026-06-13 — open the full
// command list on the first slash) or a valid command name. A `/abs/path`
// (the lead token contains a `/`) is never a command (F2), and a `/ ` with a
// trailing space past an empty name is not arg-completion on nothing.
if (body === '' || SLASH_NAME_RE.test(name)) {
return { from: 0, method: 'complete.slash', params: { text } }
}
return null
}
// @-mention: the whitespace-delimited token the cursor sits in/just after.
const pos = Math.max(0, Math.min(cursor, text.length))
const head = text.slice(0, pos)
const tokenStart = head.search(/\S+$/)
if (tokenStart === -1) return null
const word = head.slice(tokenStart)
if (isPathLike(word)) {
return { from: tokenStart, method: 'complete.path', params: { word } }
}
return null
}
/** Read a `replace_from` offset off a completion result, falling back to `fallback`. */
export function readReplaceFrom(result: unknown, fallback: number): number {
if (result && typeof result === 'object') {
const rf = (result as { replace_from?: unknown }).replace_from
if (typeof rf === 'number') return rf
}
return fallback
}
/** Map a `complete.slash`/`complete.path` result ({items:[{text,display,meta}]}) into candidates. */
export function mapCompletions(result: unknown): CompletionItem[] {
if (!result || typeof result !== 'object') return []
const items = (result as { items?: unknown }).items
if (!Array.isArray(items)) return []
const out: CompletionItem[] = []
for (const it of items) {
const text = readStr(it, 'text')
if (!text) continue
out.push({ display: readStr(it, 'display') ?? text, meta: readStr(it, 'meta') ?? '', text })
}
return out
}
/** Extract `{text}` items from a `commands.catalog` result ({pairs:[["/name",
* "desc"],…]}) for seeding the composer's slash-highlight catalog at boot
* (glitch 2026-06-14). Each pair's first element is the `/name`; non-string or
* empty entries are skipped. Shape-defensive — any junk → []. */
export function catalogCommandItems(result: unknown): { text: string }[] {
if (!result || typeof result !== 'object') return []
const pairs = (result as { pairs?: unknown }).pairs
if (!Array.isArray(pairs)) return []
const out: { text: string }[] = []
for (const pair of pairs as unknown[]) {
const name = Array.isArray(pair) ? (pair as unknown[])[0] : undefined
if (typeof name === 'string' && name) out.push({ text: name })
}
return out
}
/**
* A monotonic gate for the per-keystroke completion RPCs (glitch 2026-06-14).
* The gateway transport does NOT guarantee in-order response delivery and
* `onType` fires an RPC per keystroke with no debounce, so a slow earlier query
* (the first bare-`/` `complete.slash`) can resolve AFTER a newer one (an
* `@`-mention `complete.path`) and clobber the store with stale results — which
* is what made "a leading /path message breaks @-mentions afterward."
*
* `claim()` is called once per keystroke (BEFORE the early-return clear branch,
* so an intermediate keystroke that fires no RPC still invalidates the older
* in-flight one) and returns a token; `isCurrent(token)` is true only for the
* most recently claimed token, so a resolving response applies ONLY when no
* newer keystroke has superseded it.
*/
export interface CompletionGate {
claim: () => number
isCurrent: (token: number) => boolean
}
export function createCompletionGate(): CompletionGate {
let seq = 0
return {
claim: () => ++seq,
isCurrent: (token: number) => token === seq
}
}
/** Long output → the pager; short → a system line (Ink: >180 chars or >2 lines). */
function present(ctx: SlashContext, title: string, text: string): void {
const long = text.length > 180 || text.split('\n').filter(Boolean).length > 2
if (long) ctx.openPager(title, text)
else ctx.pushSystem(text)
}
/** Process-diagnostic commands — hidden behind `HERMES_TUI_DIAGNOSTICS`
* (logic/env.ts). Regular users never see them; support flows enable them
* with one env var. Keep this set in sync with the `(diag)` lines below.
* DESIGN ASSUMPTION (review 2026-06-12): these stay CLIENT-ONLY. Completion
* is gateway-driven and hides them only because the gateway doesn't know
* them — adding a server command with one of these names requires gating it
* gateway-side too (the early return below would shadow, not hide, it). */
const DIAGNOSTIC_COMMANDS = new Set(['mem', 'heapdump'])
const CLIENT_HELP_LINES = [
'/help — list commands',
'/model [name] — switch model (picker if bare)',
'/copy [n] — copy the last (or n-th) response',
'/skills — browse skills',
'/skin [name] — switch theme skin (live)',
'/sessions [cron|gateways|all] — browse/resume sessions (tabbed picker)',
'/resume [id|name] — resume directly, or open the picker',
'/clear, /new — clear the transcript (confirm)',
'/compact [on|off|toggle] — compact transcript spacing',
'/details [hidden|collapsed|expanded|cycle] — tool/reasoning detail',
'/bg <prompt> — launch a background prompt',
'/processes — OS background processes (list + stop all)',
'/replay [n|path] — inspect an archived spawn tree',
'/mem — live memory stats (diag)',
'/heapdump — write a V8 heap snapshot (diag)',
'/logs — recent engine log lines',
'/quit, /exit — quit',
'(other /commands run on the gateway)'
]
function clientHelp(): string {
const lines = diagnosticsEnabled() ? CLIENT_HELP_LINES : CLIENT_HELP_LINES.filter(l => !l.includes('(diag)'))
return lines.join('\n')
}
type ClientHandler = (arg: string, ctx: SlashContext) => void | Promise<void>
/** `/sessions [recent|cron|gateways|all]` — open the tabbed resume picker,
* pre-selecting the named tab (shared by /sessions, /switch, /session). */
const sessionsCmd: ClientHandler = (arg, ctx) => {
const tab = parseSessionTabArg(arg)
if (!tab) {
ctx.pushSystem('usage: /sessions [recent|cron|gateways|all]')
return
}
ctx.openSessionPicker(tab)
}
/** `/resume` — bare opens the picker; `/resume <id|name>` keeps the DIRECT
* path: resolve the arg against `session.list` (exact id → unique id prefix
* → exact/unique title) and hydrate without the overlay. */
const resumeCmd: ClientHandler = async (arg, ctx) => {
const needle = arg.trim()
if (!needle) {
ctx.openSessionPicker('recent')
return
}
try {
// One bounded page over ALL sources (the gateway deny-lists `tool`) — the
// direct path targets a known session, not a browse.
const { rows } = mapSessionRows(await ctx.request('session.list', { limit: 200 }))
const hit = resolveSessionArg(rows, needle)
if (!hit) {
ctx.pushSystem(`/resume: no session matching “${needle}” — try /sessions`)
return
}
ctx.resumeSession(hit.id)
} catch (error) {
ctx.pushSystem(`/resume: ${error instanceof Error ? error.message : 'session.list failed'}`)
}
}
/**
* Flatten `model.options` into grouped picker rows (Epic 7; v2.1 availability):
* group = the provider's display ("lab") name, haystacks = slug + lab name (so
* `oai`/`copilot`/`anthropic` fuzzy-match the whole group), value = the FULL
* switch arg `<model> --provider <slug>` so picking a model under a different
* provider actually switches provider+model (the gateway's
* `_apply_model_switch` parses `--provider` via parse_model_flags). The current
* model is flagged, not baked into the label, so the fuzzy scorer never matches
* the ✓.
*
* UNCONFIGURED providers (`authenticated: false` skeleton rows — the gateway
* sends them via `build_models_payload(include_unconfigured=True,
* picker_hints=True)`, with `key_env`/`warning` setup hints) become one
* `unavailable` hint row each (`no API key — set <ENV_VAR>`): hidden by
* default, revealed dimmed + non-selectable by the picker's Ctrl+U toggle.
*/
export function mapModelOptions(opts: unknown): PickerItem[] {
if (!opts || typeof opts !== 'object') return []
const providers = (opts as { providers?: unknown }).providers
if (!Array.isArray(providers)) return []
const current = readStr(opts, 'model')
const currentProvider = readStr(opts, 'provider')
const items: PickerItem[] = []
for (const p of providers) {
if (!p || typeof p !== 'object') continue
const slug = readStr(p, 'slug') ?? readStr(p, 'name') ?? ''
const lab = readStr(p, 'name') ?? slug
if ((p as { authenticated?: unknown }).authenticated === false) {
// Unconfigured provider → one dimmed hint row under its own group header.
// Identity (slug + display name) is the haystack so a provider-name query
// still narrows to the group; the hint text itself is not searched.
const keyEnv = readStr(p, 'key_env')
const item: PickerItem = {
group: lab || slug,
label: keyEnv ? `no API key — set ${keyEnv}` : (readStr(p, 'warning') ?? 'not configured'),
unavailable: true,
value: slug || lab
}
const hay = [slug, lab].filter(Boolean)
if (hay.length) item.haystacks = hay
items.push(item)
continue
}
if ((p as { authenticated?: unknown }).authenticated !== true) continue
// The gateway's own normalized "this row is the active provider" flag —
// more reliable than comparing `provider` to `slug` (the agent's provider
// string can be the API dialect, e.g. an openai-compatible base_url).
const rowCurrent = (p as { is_current?: unknown }).is_current === true
const models = (p as { models?: unknown }).models
if (!Array.isArray(models)) continue
for (const m of models) {
if (typeof m !== 'string') continue
const item: PickerItem = { label: m, value: slug ? `${m} --provider ${slug}` : m }
// current = same model id under the active provider (row flag first,
// then the slug comparison, then "no provider known at all").
if (m === current && (rowCurrent || currentProvider === slug || !currentProvider)) item.current = true
if (lab) item.group = lab
const haystacks = [slug, lab].filter(Boolean)
if (haystacks.length) item.haystacks = haystacks
items.push(item)
}
}
// Provider matching failed entirely (string-normalization drift) but the
// model id is known → flag the first id match so the ✓ never just vanishes.
if (current && !items.some(i => i.current)) {
const fallback = items.find(i => i.label === current)
if (fallback) fallback.current = true
}
return items
}
/**
* Provider tab order for the model picker's chip strip (picker v2.2): each
* CONFIGURED provider's group (= lab display name) in catalog order, with
* Nous-identified groups (slug or lab name containing `nous`) hoisted to the
* front. Unconfigured providers (`unavailable` hint rows) get NO tab — they
* stay reachable via Ctrl+U under the picker's trailing `All` tab (which the
* picker appends itself; it is not part of this list).
*/
export function buildModelTabs(items: readonly PickerItem[]): string[] {
const seen = new Set<string>()
const nous: string[] = []
const rest: string[] = []
for (const it of items) {
if (it.unavailable || !it.group || seen.has(it.group)) continue
seen.add(it.group)
const identity = [it.group, ...(it.haystacks ?? [])].join(' ').toLowerCase()
;(identity.includes('nous') ? nous : rest).push(it.group)
}
return [...nous, ...rest]
}
/** Flatten `skills.manage {action:'list'}` ({skills: Record<category, names[]>}) into
* grouped picker rows (category = group header; also a fuzzy haystack). */
function mapSkills(result: unknown): PickerItem[] {
if (!result || typeof result !== 'object') return []
const skills = (result as { skills?: unknown }).skills
if (!skills || typeof skills !== 'object') return []
const items: PickerItem[] = []
for (const [category, names] of Object.entries(skills as { [k: string]: unknown })) {
if (!Array.isArray(names)) continue
for (const n of names) if (typeof n === 'string') items.push({ group: category, label: n, value: n })
}
return items
}
/** Re-fetch `model.options` and update the cached picker rows. Resolves with
* the fresh rows (the open picker swaps them in live — Ctrl+R, picker v2.1);
* rejections are the CALLER's to handle (background callers fire-and-forget). */
function refreshModelItems(ctx: SlashContext): Promise<PickerItem[]> {
return ctx.request('model.options', { session_id: ctx.sessionId() }).then(opts => {
const items = mapModelOptions(opts)
if (items.length) ctx.setModelItems(items)
return items
})
}
/**
* The open picker's manual-refresh seam (picker v2.1 Ctrl+R). Whoever opens a
* picker registers (or clears) the catalog re-fetch here; the mounted Picker
* triggers it via `runPickerRefresh` and swaps in the resolved rows live. A
* module slot rather than a Picker prop because the App→Picker prop plumbing
* carries only the PickerState basics; the seam keeps the overlay generic for
* the upcoming resume-session picker (register a `session.list` re-fetch).
*/
let activePickerRefresh: (() => Promise<PickerItem[]>) | undefined
/** Register (or clear, with `undefined`) the open picker's catalog re-fetch. */
export function registerPickerRefresh(fn: (() => Promise<PickerItem[]>) | undefined): void {
activePickerRefresh = fn
}
/** Whether a refresh is registered (the picker's footer hint is gated on it). */
export function canRefreshPicker(): boolean {
return activePickerRefresh !== undefined
}
/** Run the registered catalog re-fetch; undefined when none is registered. */
export function runPickerRefresh(): Promise<PickerItem[]> | undefined {
return activePickerRefresh?.()
}
/**
* The open picker's tab-strip seam (picker v2.2 provider tabs) — same pattern
* as the refresh seam above: whoever opens a picker registers (or clears) a
* tab DERIVATION over the picker's live rows; the mounted Picker re-derives
* through it whenever the rows swap (Ctrl+R), so fresh providers grow chips
* without re-opening. `/model` registers `buildModelTabs`; pickers without
* tabs (skills) clear it and render the classic stripless view.
*/
let activePickerTabs: ((items: readonly PickerItem[]) => string[]) | undefined
/** Register (or clear, with `undefined`) the open picker's tab derivation. */
export function registerPickerTabs(fn: ((items: readonly PickerItem[]) => string[]) | undefined): void {
activePickerTabs = fn
}
/** Derive the open picker's tabs from its rows; [] when no tabs are registered. */
export function pickerTabs(items: readonly PickerItem[]): string[] {
return activePickerTabs?.(items) ?? []
}
/**
* The bootstrap `model.options` prefetch seam (perf: prefetch dedupe). The
* entry stashes its in-flight prefetch promise here; a bare `/model` that
* finds the cache empty AWAITS it (bounded by `waitMs`) and re-checks the
* cache instead of issuing a second concurrent `model.options` RPC. A hung
* prefetch only delays the picker by the bound — `/model` then opens via its
* own fetch as before.
*/
let modelPrefetch: { promise: Promise<unknown>; waitMs: number } | undefined
/** Register (or clear, with `undefined`) the in-flight bootstrap prefetch. */
export function registerModelPrefetch(promise: Promise<unknown> | undefined, waitMs = 2000): void {
modelPrefetch = promise ? { promise, waitMs } : undefined
}
/** Await the registered prefetch (bounded); resolves immediately when none. */
function awaitModelPrefetch(): Promise<void> {
const pending = modelPrefetch
if (!pending) return Promise.resolve()
return Promise.race([pending.promise, new Promise(resolve => setTimeout(resolve, pending.waitMs))]).then(
() => undefined
)
}
/** Switch the model via the server (shared by `/model <name>` and the picker pick).
* A successful switch refreshes the cached rows in the background (fresh ✓). */
async function switchModel(ctx: SlashContext, name: string): Promise<void> {
try {
const r = await ctx.request('slash.exec', { command: `model ${name}`, session_id: ctx.sessionId() })
ctx.pushSystem(readStr(r, 'output') || `${name}`)
void refreshModelItems(ctx).catch(() => {})
} catch (error) {
ctx.pushSystem(`/model ${name}: ${error instanceof Error ? error.message : 'switch failed'}`)
}
}
/** `/model` — bare opens the model picker; `/model <name>` switches directly.
* Opens from the CACHED catalog when present — zero RPCs, same-frame paint
* (Epic 7; the catalog is prefetched at bootstrap and refreshed on switch).
* An empty cache first awaits the in-flight bootstrap prefetch (bounded) so
* an early `/model` never doubles the slow `model.options` RPC. */
const modelCmd: ClientHandler = async (arg, ctx) => {
if (arg.trim()) {
await switchModel(ctx, arg.trim())
return
}
const open = (items: PickerItem[]) => {
// Ctrl+R in the open picker re-fetches the catalog (and re-syncs the cache).
registerPickerRefresh(() => refreshModelItems(ctx))
// Provider chip strip (picker v2.2): Nous-first configured-provider tabs.
registerPickerTabs(buildModelTabs)
ctx.openPicker({ items, onPick: name => void switchModel(ctx, name), title: 'Switch model' })
}
const cached = ctx.modelItems()
if (cached?.length) {
open(cached)
return
}
// Cache empty but the bootstrap prefetch may be in flight — await it
// (bounded) and re-check instead of racing a SECOND model.options RPC.
await awaitModelPrefetch()
const prefetched = ctx.modelItems()
if (prefetched?.length) {
open(prefetched)
return
}
const items = mapModelOptions(await ctx.request('model.options', { session_id: ctx.sessionId() }))
// Unavailable hint rows alone are not a usable catalog — keep the notice.
if (!items.some(i => !i.unavailable)) {
ctx.pushSystem('No models available (no authenticated providers).')
return
}
ctx.setModelItems(items)
open(items)
}
/** `/skills` — open the skills hub; picking a skill shows its info in the pager. */
const skillsCmd: ClientHandler = async (_arg, ctx) => {
const items = mapSkills(await ctx.request('skills.manage', { action: 'list' }))
if (!items.length) {
ctx.pushSystem('No skills found.')
return
}
registerPickerRefresh(undefined) // no Ctrl+R catalog re-fetch for skills (yet)
registerPickerTabs(undefined) // no tab strip for skills — classic grouped view
ctx.openPicker({
items,
onPick: name =>
void ctx
.request('skills.manage', { action: 'inspect', query: name })
.then(info => ctx.openPager(`Skill: ${name}`, readStr(info, 'info') || JSON.stringify(info, null, 2)))
.catch(() => ctx.pushSystem(`/skills: could not inspect ${name}`)),
title: 'Skills'
})
}
/** `on`/`off`/`toggle`/bare → the next flag value; null on garbage (Ink flagFromArg). */
function flagFromArg(arg: string, current: boolean): boolean | null {
const mode = arg.trim().toLowerCase()
if (!mode || mode === 'toggle') return !current
if (mode === 'on') return true
if (mode === 'off') return false
return null
}
/** `/compact [on|off|toggle]` — compact transcript spacing. The flag flips locally
* (the store drives the render); persistence mirrors Ink: a fire-and-forget
* `config.set {key:'compact'}` so the Ink TUI + future launches share the pref
* (the gateway does NOT send the persisted value to this TUI, so each launch
* starts off — see store.ts `compact`). */
const compactCmd: ClientHandler = (arg, ctx) => {
const next = flagFromArg(arg, ctx.compact())
if (next === null) {
ctx.pushSystem('usage: /compact [on|off|toggle]')
return
}
ctx.setCompact(next)
void ctx.request('config.set', { key: 'compact', value: next ? 'on' : 'off' }).catch(() => {})
ctx.pushSystem(`compact ${next ? 'on' : 'off'}`)
}
/**
* `/details [hidden|collapsed|expanded|cycle]` — GLOBAL detail mode (per-section
* overrides deferred; the gateway's arg completion also suggests section names,
* so those get an honest "not supported yet" notice). Bare `/details` reports the
* persisted mode (`config.get details_mode`) and syncs the local flag to it; a
* mode set persists via `config.set` (fire-and-forget, Ink parity).
*/
const detailsCmd: ClientHandler = async (arg, ctx) => {
const first = arg.trim().toLowerCase().split(/\s+/)[0] ?? ''
if (!first) {
try {
const r = await ctx.request('config.get', { key: 'details_mode' })
const mode = parseDetailsMode(readStr(r, 'value')) ?? ctx.details()
ctx.setDetails(mode)
ctx.pushSystem(`details: ${mode}`)
} catch {
ctx.pushSystem(`details: ${ctx.details()}`)
}
return
}
if ((DETAILS_SECTIONS as readonly string[]).includes(first)) {
ctx.pushSystem(`per-section detail overrides are not supported in the native engine yet — ${DETAILS_USAGE}`)
return
}
const next = first === 'cycle' || first === 'toggle' ? nextDetailsMode(ctx.details()) : parseDetailsMode(first)
if (!next) {
ctx.pushSystem(DETAILS_USAGE)
return
}
ctx.setDetails(next)
void ctx.request('config.set', { key: 'details_mode', value: next }).catch(() => {})
ctx.pushSystem(`details: ${next}`)
}
/** `/skin [name]` — switch the active theme skin (Ink parity:
* ui-tui/src/app/slash/commands/session.ts). Bare `/skin` reports the persisted
* skin (`config.get skin`); `/skin <name>` persists via `config.set` which makes
* the gateway emit `skin.changed` → the store re-themes the running UI LIVE (no
* relaunch). Skin-name arg completion comes from the gateway's `complete.slash`
* for free. Fire-and-forget with a guarded notice, matching compact/details. */
const skinCmd: ClientHandler = async (arg, ctx) => {
const name = arg.trim()
if (!name) {
try {
const r = await ctx.request('config.get', { key: 'skin' })
ctx.pushSystem(`skin: ${readStr(r, 'value') || 'default'}`)
} catch {
ctx.pushSystem('skin: default')
}
return
}
try {
const r = await ctx.request('config.set', { key: 'skin', value: name })
ctx.pushSystem(`skin → ${readStr(r, 'value') || name}`)
} catch (error) {
ctx.pushSystem(`/skin: ${error instanceof Error ? error.message : 'config.set failed'}`)
}
}
/** Fetch + map the session's archived spawn trees (`spawn_tree.list`). */
async function listSpawnTrees(ctx: SlashContext) {
const r = await ctx.request('spawn_tree.list', { limit: 30, session_id: ctx.sessionId() ?? 'default' })
return readSpawnTreeEntries(r)
}
/**
* `/replay [n|path]` — spawn-tree inspector through the pager (Ink renders these
* in its agents overlay; the flow + RPCs are the same): bare lists the archived
* trees with indices, `<n>` loads the n-th listed tree, anything else is treated
* as a snapshot path on disk (`load <path>` accepted for Ink muscle memory).
*/
const replayCmd: ClientHandler = async (arg, ctx) => {
const raw = arg.trim()
const lower = raw.toLowerCase()
try {
if (!raw || lower === 'list' || lower === 'ls') {
const entries = await listSpawnTrees(ctx)
if (!entries.length) {
ctx.pushSystem('no archived spawn trees for this session — completed delegations are archived automatically')
return
}
ctx.openPager('Spawn trees', formatSpawnTreeList(entries))
return
}
if (/^\d+$/.test(raw)) {
const n = Number.parseInt(raw, 10)
const entries = await listSpawnTrees(ctx)
const entry = entries[n - 1]
if (!entry) {
ctx.pushSystem(
entries.length
? `replay: index out of range 1..${entries.length} — /replay to list`
: 'no archived spawn trees for this session'
)
return
}
const tree = await ctx.request('spawn_tree.load', { path: entry.path })
ctx.openPager(`Replay ${n}`, formatSpawnTree(tree))
return
}
const path = lower.startsWith('load ') ? raw.slice(5).trim() : raw
const tree = await ctx.request('spawn_tree.load', { path })
ctx.openPager('Replay', formatSpawnTree(tree))
} catch (error) {
ctx.pushSystem(`/replay: ${error instanceof Error ? error.message : 'failed'}`)
}
}
/** `/heapdump` — write a V8 heap snapshot to `$HERMES_HOME|~/.hermes/logs/` and
* report the path + heap/rss before vs after (Ink ref debug.ts /heapdump). */
const heapdumpCmd: ClientHandler = (_arg, ctx) => {
const pre = process.memoryUsage()
ctx.pushSystem(`writing heap dump (heap ${formatBytes(pre.heapUsed)} · rss ${formatBytes(pre.rss)})…`)
try {
const { after, before, path } = performHeapdump()
ctx.pushSystem(
`heapdump: ${path}\n` +
`heap ${formatBytes(before.heapUsed)}${formatBytes(after.heapUsed)} · ` +
`rss ${formatBytes(before.rss)}${formatBytes(after.rss)}`
)
} catch (error) {
ctx.pushSystem(`heapdump failed: ${error instanceof Error ? error.message : String(error)}`)
}
}
/** `/mem` — live V8 heap/rss numbers + uptime + the mounted-renderable count
* (the store-cap diagnostic) as one system block (Ink ref debug.ts /mem). */
const memCmd: ClientHandler = (_arg, ctx) => {
ctx.pushSystem(memReport(process.memoryUsage(), process.uptime(), ctx.renderableCount()))
}
/** `/tools` — fetch the tool roster from the gateway and show it in the pager (navigable). */
const toolsCmd: ClientHandler = async (arg, ctx) => {
const command = arg.trim() ? `tools ${arg.trim()}` : 'tools'
try {
const r = await ctx.request('slash.exec', { command, session_id: ctx.sessionId() })
ctx.openPager('Tools', readStr(r, 'output') || '(no tool info)')
} catch (error) {
ctx.pushSystem(`/tools: ${error instanceof Error ? error.message : 'failed'}`)
}
}
/** `/bg <prompt>` (aliases /background, /btw) — launch a background PROMPT via
* `prompt.background` (Ink parity): echo "bg <id> started" and track the task so
* the `bg: N` badge counts it until `background.complete` clears it. NOT the OS
* process panel (that's /processes). */
const backgroundCmd: ClientHandler = async (arg, ctx) => {
const text = arg.trim()
if (!text) {
ctx.pushSystem('/bg <prompt> — launch a background prompt')
return
}
try {
const r = await ctx.request('prompt.background', { session_id: ctx.sessionId(), text })
const taskId = readStr(r, 'task_id')
if (taskId) {
ctx.addBgTask(taskId)
ctx.pushSystem(`bg ${taskId} started`)
} else {
ctx.pushSystem('/bg: no task id returned')
}
} catch (error) {
ctx.pushSystem(`/bg: ${error instanceof Error ? error.message : 'failed'}`)
}
}
/** The TUI-only client commands (run in-process, never hit the gateway). */
const CLIENT: Record<string, ClientHandler> = {
agents: (_arg, ctx) => ctx.openDashboard(),
background: backgroundCmd,
bg: backgroundCmd,
btw: backgroundCmd,
clear: (_arg, ctx) => ctx.confirm('Clear the transcript?', ctx.clearTranscript),
compact: compactCmd,
copy: (arg, ctx) => {
const n = Math.max(1, Number.parseInt(arg, 10) || 1)
if (!ctx.copyResponse(n)) ctx.pushSystem('Nothing to copy yet.')
},
detail: detailsCmd,
details: detailsCmd,
exit: (_arg, ctx) => ctx.quit(),
heapdump: heapdumpCmd,
mem: memCmd,
processes: (_arg, ctx) => ctx.openBackgroundPanel(),
procs: (_arg, ctx) => ctx.openBackgroundPanel(),
model: modelCmd,
replay: replayCmd,
resume: resumeCmd,
session: sessionsCmd,
sessions: sessionsCmd,
skills: skillsCmd,
skin: skinCmd,
switch: sessionsCmd,
tasks: (_arg, ctx) => ctx.openDashboard(),
tools: toolsCmd,
help: async (_arg, ctx) => {
// Prefer the live catalog; fall back to the client list if it's unavailable.
try {
const cat = await ctx.request('commands.catalog', {})
ctx.pushSystem(renderCatalog(cat) || clientHelp())
} catch {
ctx.pushSystem(clientHelp())
}
},
logs: (_arg, ctx) => ctx.openPager('Logs', ctx.logTail().join('\n') || '(log empty)'),
new: (_arg, ctx) => ctx.confirm('Start fresh? (clears the transcript)', ctx.clearTranscript),
quit: (_arg, ctx) => ctx.quit()
}
/** The registered client-command names (catalog introspection — tests/menus). */
export function clientCommandNames(): string[] {
const names = Object.keys(CLIENT)
return (diagnosticsEnabled() ? names : names.filter(n => !DIAGNOSTIC_COMMANDS.has(n))).sort()
}
/** Render the gateway `commands.catalog` into a help block (loose-typed read).
* The TUI catalog shape is `{ pairs: [["/name","desc"], …], canon, categories }`
* (tui_gateway/server.py `commands.catalog`). */
function renderCatalog(cat: unknown): string {
if (!cat || typeof cat !== 'object') return ''
const pairs = (cat as { pairs?: unknown }).pairs
if (!Array.isArray(pairs)) return ''
const lines = pairs
.map(pair => {
if (!Array.isArray(pair) || typeof pair[0] !== 'string') return null
const desc = typeof pair[1] === 'string' ? pair[1] : ''
return desc ? `${pair[0]}${desc}` : pair[0]
})
.filter((l): l is string => l !== null)
return lines.length ? lines.join('\n') : ''
}
function handleDispatchResult(parsed: ParsedSlash, raw: unknown, ctx: SlashContext): void {
const type = readStr(raw, 'type')
const argTail = parsed.arg ? ` ${parsed.arg}` : ''
switch (type) {
case 'exec':
case 'plugin':
ctx.pushSystem(readStr(raw, 'output') || '(no output)')
return
case 'alias': {
const target = readStr(raw, 'target')
if (target) void dispatchSlash(`/${target}${argTail}`, ctx)
return
}
case 'skill':
case 'send': {
const notice = readStr(raw, 'notice')
if (notice) ctx.pushSystem(notice)
const message = readStr(raw, 'message')
if (message?.trim()) ctx.submit(message)
else ctx.pushSystem(`/${parsed.name}: empty message`)
return
}
case 'prefill': {
// /undo etc. — composer prefill lands with the composer-ref plumbing; show it for now.
const message = readStr(raw, 'message')
ctx.pushSystem(message ? `(edit & resubmit) ${message}` : `/${parsed.name}: nothing to prefill`)
return
}
default:
ctx.pushSystem(`error: invalid response: command.dispatch`)
}
}
/** Dispatch a `/command` through the ladder. Returns once the (async) work settles. */
export async function dispatchSlash(input: string, ctx: SlashContext): Promise<void> {
const parsed = parseSlash(input)
if (!parsed) return
if (DIAGNOSTIC_COMMANDS.has(parsed.name) && !diagnosticsEnabled()) {
// Not a secret — an enable switch. Tell the user exactly how to get it.
ctx.pushSystem(`/${parsed.name} is a diagnostic command — relaunch with HERMES_TUI_DIAGNOSTICS=1 to enable it.`)
return
}
const client = CLIENT[parsed.name]
if (client) {
await client(parsed.arg, ctx)
return
}
const sid = ctx.sessionId()
try {
const result = await ctx.request('slash.exec', { command: input.slice(1), session_id: sid })
const output = readStr(result, 'output') || `/${parsed.name}: no output`
const warning = readStr(result, 'warning')
const text = warning ? `warning: ${warning}\n${output}` : output
// Long output → pager (Ink: >180 chars or >2 non-empty lines), else a system line.
present(ctx, titleCase(parsed.name), text)
} catch {
try {
const raw = await ctx.request('command.dispatch', { arg: parsed.arg, name: parsed.name, session_id: sid })
handleDispatchResult(parsed, raw, ctx)
} catch (error) {
ctx.pushSystem(`error: ${error instanceof Error ? error.message : String(error)}`)
}
}
}

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,85 @@
/**
* Terminal chrome logic — window-title text and notification message shaping.
* Pure string work (no OpenTUI imports); the boundary shim
* (`boundary/termChrome.ts`) owns the renderer writes and focus tracking.
*
* Title: OSC 0/2 content is set natively via `renderer.setTerminalTitle`
* (the zig side emits the escape) — this module only SHAPES the text:
* `"{session title} — Hermes"` once the gateway titles the session,
* `"Hermes Agent"` until then.
*
* Notifications: the desktop ping itself is the renderer's native
* `triggerNotification(message, title)` (boundary/termChrome.ts) — protocol
* detection + tmux/Zellij wrapping live in the zig side. This module only
* supplies the message TEXT (promptNotification / TURN_COMPLETE_NOTIFICATION)
* and the sanitizer; it no longer hand-rolls OSC 9/99/777 escape strings.
*/
const ESC = '\u001b'
/** Strip control chars (C0/C1, incl. ESC/BEL) so user text can never
* terminate or splice an escape sequence; collapse runs of whitespace;
* cap the length. */
export function sanitizeOscText(text: string, max = 120): string {
const clean = (text ?? '')
// eslint-disable-next-line no-control-regex
.replace(/[\u0000-\u001f\u007f-\u009f]/g, ' ')
.replace(/\s+/g, ' ')
.trim()
return clean.length > max ? clean.slice(0, Math.max(1, max - 1)) + '…' : clean
}
/** The window-title string: session title when known, Ink-era generic otherwise. */
export function windowTitleFor(sessionTitle: string | undefined): string {
const title = sanitizeOscText(sessionTitle ?? '', 80)
return title ? `${title} — Hermes` : 'Hermes Agent'
}
/** A notification's two text parts (body optional). */
export interface TermNotification {
readonly title: string
readonly body?: string
}
/** The XTWINOPS title-stack pushes/pops bracketing our title ownership: save
* the user's title on install, restore it on teardown (terminals without the
* stack ignore these — they just keep our last title, same as today). */
export const TITLE_STACK_SAVE = `${ESC}[22;0t`
export const TITLE_STACK_RESTORE = `${ESC}[23;0t`
/** What to announce for a blocking prompt, by kind. Kinds arrive from the
* store's ActivePrompt union; unknown kinds get the generic line so a new
* prompt type can never silently drop notifications. */
export function promptNotification(kind: string): TermNotification {
switch (kind) {
case 'clarify':
return { title: 'Hermes', body: 'needs an answer to continue' }
case 'approval':
return { title: 'Hermes', body: 'wants approval to run a command' }
case 'sudo':
return { title: 'Hermes', body: 'needs your sudo password' }
case 'secret':
return { title: 'Hermes', body: 'needs a secret/API key' }
case 'confirm':
return { title: 'Hermes', body: 'is asking you to confirm' }
default:
return { title: 'Hermes', body: 'is waiting for your input' }
}
}
/** Turn-complete announcement. */
export const TURN_COMPLETE_NOTIFICATION: TermNotification = {
title: 'Hermes',
body: 'finished — awaiting your input'
}
/**
* `HERMES_TUI_NOTIFY` kill-switch (TUI-only env var, same family as
* HERMES_TUI_TOOL_OUTPUT_LINES): unset/anything-else = on, `0`/`false`/`off`
* = no notification sequences are ever written. The window title is NOT
* gated by this — it's chrome, not interruption.
*/
export function notifyEnabled(env: { readonly [k: string]: string | undefined } = process.env): boolean {
const raw = (env.HERMES_TUI_NOTIFY ?? '').trim().toLowerCase()
return raw !== '0' && raw !== 'false' && raw !== 'off'
}

View File

@@ -0,0 +1,599 @@
/**
* Theme / skin engine (SOLID side, pure TS — spec v4 §7.5). A faithful 1:1 port
* of Ink's `ui-tui/src/theme.ts` so EXISTING Hermes skins work UNCHANGED: same
* `Theme`/`ThemeColors`/`ThemeBrand` shapes, same `DARK_THEME`/`LIGHT_THEME`
* defaults, same `detectLightMode`, same Apple-Terminal ANSI-256 normalization,
* and the same `fromSkin(colors, branding, …)` mapping + fallback chains.
*
* The view never hardcodes colors — it reads `theme.color.*` / `theme.brand.*`
* via the ThemeProvider context (view/theme.tsx). The boundary feeds skins in
* through `gateway.ready{payload.skin}` / `skin.changed` → fromSkin → the theme
* signal.
*
* Source of truth for the contract: ui-tui/src/theme.ts (+ GatewaySkin in
* ui-tui/src/gatewayTypes.ts). Keep this port in sync if that contract changes.
*
* INTENTIONAL divergences from the Ink port (visual-hierarchy design pass):
* - `muted` is a true NEUTRAL grey (not a darker gold) and no longer borrows
* the skin's `banner_dim` (a banner-gold shade in the stock skin); skins
* override it via the dedicated `ui_muted` key instead.
* - a `bg` token paints the root canvas (true black/white; `ui_bg` override).
*/
import { FALSE_RE, TRUE_RE } from './env.ts'
export interface ThemeColors {
primary: string
accent: string
border: string
text: string
muted: string
/** Root canvas background. DEFAULT IS `transparent` — the terminal's own
* background shows through (glitch rejected a painted "default dark" canvas;
* the dark room is the user's terminal, not ours). Skins may opt into a
* painted canvas via `ui_bg`. */
bg: string
completionBg: string
completionCurrentBg: string
completionMetaBg: string
completionMetaCurrentBg: string
label: string
ok: string
error: string
warn: string
prompt: string
sessionLabel: string
sessionBorder: string
statusBg: string
statusFg: string
statusGood: string
statusWarn: string
statusBad: string
statusCritical: string
selectionBg: string
diffAdded: string
diffRemoved: string
diffAddedWord: string
diffRemovedWord: string
// Line backgrounds for the NATIVE `<diff>` renderable (file-tool renderer).
// Separate from the Ink-parity diffAdded/diffRemoved pair above: those are
// `rgb(…)` strings (Ink parses them; OpenTUI's parseColor only takes hex /
// CSS names / "transparent"), and they're pastel full-line fills tuned for
// Ink's fg-on-bg rendering — the native diff wants darker hex backgrounds.
diffAddedBg: string
diffRemovedBg: string
shellDollar: string
}
export interface ThemeBrand {
name: string
icon: string
prompt: string
/** The composer glyph while in `!`-shell mode (defaults to `$`); skin-overridable. */
shellPrompt?: string
welcome: string
goodbye: string
tool: string
helpHeader: string
}
export interface Theme {
color: ThemeColors
brand: ThemeBrand
bannerLogo: string
bannerHero: string
/** Spinner animation config from the skin (empty = engine defaults). */
spinner: SpinnerConfig
/** Per-tool glyph overrides from the skin (tool name → glyph); {} = registry defaults. */
toolEmojis: Record<string, string>
}
/** The skin payload as emitted by the gateway (mirror ui-tui/src/gatewayTypes.ts GatewaySkin). */
export interface GatewaySkin {
banner_hero?: string
banner_logo?: string
branding?: Record<string, string>
colors?: Record<string, string>
help_header?: string
tool_prefix?: string
/** Spinner animation data (faces/verbs/wings) — loose; SpinnerConfig narrows it. */
spinner?: Record<string, unknown>
/** Per-tool glyph overrides (tool name → emoji/char). */
tool_emojis?: Record<string, string>
}
/** Normalized spinner animation config the busy indicator consumes. Empty arrays
* mean "use the engine defaults" (a skin that ships no spinner block). */
export interface SpinnerConfig {
waitingFaces: string[]
thinkingFaces: string[]
thinkingVerbs: string[]
/** [left, right] decoration pairs. */
wings: [string, string][]
}
const EMPTY_SPINNER: SpinnerConfig = { waitingFaces: [], thinkingFaces: [], thinkingVerbs: [], wings: [] }
/** Parse the loose gateway `spinner` record into a typed SpinnerConfig (defensive:
* any malformed field → empty, so a bad skin never crashes the spinner). */
export function parseSpinner(raw: Record<string, unknown> | undefined): SpinnerConfig {
if (!raw || typeof raw !== 'object') return EMPTY_SPINNER
const strArr = (v: unknown): string[] => (Array.isArray(v) ? v.filter((x): x is string => typeof x === 'string') : [])
const wings: [string, string][] = []
const rawWings = (raw as { wings?: unknown }).wings
if (Array.isArray(rawWings)) {
for (const pair of rawWings) {
if (Array.isArray(pair) && pair.length === 2 && typeof pair[0] === 'string' && typeof pair[1] === 'string') {
wings.push([pair[0], pair[1]])
}
}
}
return {
waitingFaces: strArr((raw as { waiting_faces?: unknown }).waiting_faces),
thinkingFaces: strArr((raw as { thinking_faces?: unknown }).thinking_faces),
thinkingVerbs: strArr((raw as { thinking_verbs?: unknown }).thinking_verbs),
wings
}
}
// ── Color math ───────────────────────────────────────────────────────
function parseHex(h: string): [number, number, number] | null {
const m = /^#?([0-9a-f]{6})$/i.exec(h)
const hex = m?.[1]
if (!hex) return null
const n = parseInt(hex, 16)
return [(n >> 16) & 0xff, (n >> 8) & 0xff, n & 0xff]
}
function mix(a: string, b: string, t: number) {
const pa = parseHex(a)
const pb = parseHex(b)
if (!pa || !pb) return a
const lerp = (i: 0 | 1 | 2) => Math.round(pa[i] + (pb[i] - pa[i]) * t)
return '#' + ((1 << 24) | (lerp(0) << 16) | (lerp(1) << 8) | lerp(2)).toString(16).slice(1)
}
const XTERM_6_LEVELS = [0, 95, 135, 175, 215, 255] as const
const ANSI_LIGHT_MAX_LUMINANCE = 0.72
const ANSI_LIGHT_TARGET_LUMINANCE = 0.34
const ANSI_LIGHT_MIN_SATURATION = 0.22
const ANSI_MUTED_BUCKET = 245
const ANSI_NORMALIZED_FOREGROUNDS: readonly (keyof ThemeColors)[] = [
'text',
'label',
'ok',
'error',
'warn',
'prompt',
'statusFg',
'statusGood',
'statusWarn',
'statusBad',
'statusCritical',
'shellDollar'
]
const ANSI_MUTED_FOREGROUNDS: readonly (keyof ThemeColors)[] = ['muted', 'sessionLabel', 'sessionBorder']
function xtermEightBitRgb(colorNumber: number): [number, number, number] {
if (colorNumber >= 232) {
const value = 8 + (colorNumber - 232) * 10
return [value, value, value]
}
if (colorNumber >= 16) {
const offset = colorNumber - 16
// Indices are `% 6`, always within XTERM_6_LEVELS' bounds; `?? 0` only
// satisfies noUncheckedIndexedAccess and is never actually reached.
return [
XTERM_6_LEVELS[Math.floor(offset / 36) % 6] ?? 0,
XTERM_6_LEVELS[Math.floor(offset / 6) % 6] ?? 0,
XTERM_6_LEVELS[offset % 6] ?? 0
]
}
return [0, 0, 0]
}
function channelLuminance(value: number): number {
const normalized = value / 255
return normalized <= 0.03928 ? normalized / 12.92 : ((normalized + 0.055) / 1.055) ** 2.4
}
function relativeLuminance(red: number, green: number, blue: number): number {
return 0.2126 * channelLuminance(red) + 0.7152 * channelLuminance(green) + 0.0722 * channelLuminance(blue)
}
function rgbToHsl(red: number, green: number, blue: number): [number, number, number] {
const rn = red / 255
const gn = green / 255
const bn = blue / 255
const max = Math.max(rn, gn, bn)
const min = Math.min(rn, gn, bn)
const lightness = (max + min) / 2
if (max === min) return [0, 0, lightness]
const delta = max - min
const saturation = lightness > 0.5 ? delta / (2 - max - min) : delta / (max + min)
const hue =
max === rn ? (gn - bn) / delta + (gn < bn ? 6 : 0) : max === gn ? (bn - rn) / delta + 2 : (rn - gn) / delta + 4
return [hue / 6, saturation, lightness]
}
function circularDistance(a: number, b: number): number {
const distance = Math.abs(a - b)
return Math.min(distance, 1 - distance)
}
// Mirrors @hermes/ink's colorize.ts (kept local, like the Ink app copy).
function richEightBitColorNumber(red: number, green: number, blue: number): number {
const [, saturation, lightness] = rgbToHsl(red, green, blue)
if (saturation < 0.15) {
const gray = Math.round(lightness * 25)
return gray === 0 ? 16 : gray === 25 ? 231 : 231 + gray
}
const sixRed = red < 95 ? red / 95 : 1 + (red - 95) / 40
const sixGreen = green < 95 ? green / 95 : 1 + (green - 95) / 40
const sixBlue = blue < 95 ? blue / 95 : 1 + (blue - 95) / 40
return 16 + 36 * Math.round(sixRed) + 6 * Math.round(sixGreen) + Math.round(sixBlue)
}
function bestReadableAnsiColor(red: number, green: number, blue: number): number {
const [hue, saturation, lightness] = rgbToHsl(red, green, blue)
let bestColor = richEightBitColorNumber(red, green, blue)
let bestScore = Number.POSITIVE_INFINITY
for (let colorNumber = 16; colorNumber <= 255; colorNumber += 1) {
const [candidateRed, candidateGreen, candidateBlue] = xtermEightBitRgb(colorNumber)
const candidateLuminance = relativeLuminance(candidateRed, candidateGreen, candidateBlue)
if (candidateLuminance > ANSI_LIGHT_MAX_LUMINANCE) continue
const [candidateHue, candidateSaturation, candidateLightness] = rgbToHsl(
candidateRed,
candidateGreen,
candidateBlue
)
const saturationFloorPenalty =
candidateSaturation < ANSI_LIGHT_MIN_SATURATION ? (ANSI_LIGHT_MIN_SATURATION - candidateSaturation) * 3 : 0
const score =
circularDistance(candidateHue, hue) * 4 +
Math.abs(candidateSaturation - Math.max(ANSI_LIGHT_MIN_SATURATION, saturation)) * 0.8 +
Math.abs(candidateLightness - Math.min(lightness, ANSI_LIGHT_TARGET_LUMINANCE)) * 2 +
saturationFloorPenalty
if (score < bestScore) {
bestColor = colorNumber
bestScore = score
}
}
return bestColor
}
function normalizeAnsiForeground(color: string): string {
const rgb = parseHex(color)
if (!rgb) return color
const richAnsi = richEightBitColorNumber(rgb[0], rgb[1], rgb[2])
const richRgb = xtermEightBitRgb(richAnsi)
const ansi =
relativeLuminance(richRgb[0], richRgb[1], richRgb[2]) > ANSI_LIGHT_MAX_LUMINANCE
? bestReadableAnsiColor(rgb[0], rgb[1], rgb[2])
: richAnsi
return `ansi256(${ansi})`
}
// ── Defaults ─────────────────────────────────────────────────────────
const BRAND: ThemeBrand = {
name: 'Hermes Agent',
icon: '⚕',
prompt: '',
shellPrompt: '$',
welcome: 'Type your message or /help for commands.',
goodbye: 'Goodbye! ⚕',
tool: '┊',
helpHeader: '(^_^)? Commands'
}
const cleanPromptSymbol = (s: string | undefined, fallback: string) => {
const cleaned = String(s ?? '')
.replace(/\s+/g, ' ')
.trim()
return cleaned || fallback
}
export const DARK_THEME: Theme = {
color: {
primary: '#FFD700',
accent: '#FFBF00',
border: '#CD7F32',
text: '#FFF8DC',
// TRUE NEUTRAL (design pass precondition): muted was `#CC9B1F` — itself
// gold — so "dim" read as "darker gold" and the hero color was the
// wallpaper. Re-pointed to the statusFg `#C0C0C0` (silver) family's darker
// step, CSS `gray` — no invented hexes. Grey = everything that merely
// happened; gold stays earned.
muted: '#808080',
bg: 'transparent',
completionBg: '#1a1a2e',
completionCurrentBg: '#333355',
completionMetaBg: '#1a1a2e',
completionMetaCurrentBg: '#333355',
label: '#DAA520',
ok: '#4caf50',
error: '#ef5350',
warn: '#ffa726',
prompt: '#FFF8DC',
// session chrome rides the same neutral muted family (was gold #CC9B1F).
sessionLabel: '#808080',
sessionBorder: '#808080',
statusBg: '#1a1a2e',
statusFg: '#C0C0C0',
statusGood: '#8FBC8F',
statusWarn: '#FFD700',
statusBad: '#FF8C00',
statusCritical: '#FF6B6B',
selectionBg: '#3a3a55',
diffAdded: 'rgb(220,255,220)',
diffRemoved: 'rgb(255,220,220)',
diffAddedWord: 'rgb(36,138,61)',
diffRemovedWord: 'rgb(207,34,46)',
diffAddedBg: '#1a4d1a',
diffRemovedBg: '#4d1a1a',
shellDollar: '#4dabf7'
},
brand: BRAND,
bannerLogo: '',
bannerHero: '',
spinner: EMPTY_SPINNER,
toolEmojis: {}
}
export const LIGHT_THEME: Theme = {
color: {
primary: '#8B6914',
accent: '#A0651C',
border: '#7A4F1F',
text: '#3D2F13',
// same disease as dark: muted was `#7A5A0F` (gold-brown). True neutral —
// the statusFg `#333333` family's lighter step, CSS `dimgray`.
muted: '#696969',
bg: 'transparent',
completionBg: '#F5F5F5',
completionCurrentBg: mix('#F5F5F5', '#A0651C', 0.25),
completionMetaBg: '#F5F5F5',
completionMetaCurrentBg: mix('#F5F5F5', '#A0651C', 0.25),
label: '#7A5A0F',
ok: '#2E7D32',
error: '#C62828',
warn: '#E65100',
prompt: '#2B2014',
sessionLabel: '#696969',
sessionBorder: '#696969',
statusBg: '#F5F5F5',
statusFg: '#333333',
statusGood: '#2E7D32',
statusWarn: '#8B6914',
statusBad: '#D84315',
statusCritical: '#B71C1C',
selectionBg: '#D4E4F7',
diffAdded: 'rgb(200,240,200)',
diffRemoved: 'rgb(240,200,200)',
diffAddedWord: 'rgb(27,94,32)',
diffRemovedWord: 'rgb(183,28,28)',
diffAddedBg: '#c8f0c8',
diffRemovedBg: '#f0c8c8',
shellDollar: '#1565C0'
},
brand: BRAND,
bannerLogo: '',
bannerHero: '',
spinner: EMPTY_SPINNER,
toolEmojis: {}
}
const LIGHT_DEFAULT_TERM_PROGRAMS = new Set<string>(['Apple_Terminal'])
const LUMA_LIGHT_THRESHOLD = 0.6
const HEX_3_RE = /^[0-9a-f]{3}$/
const HEX_6_RE = /^[0-9a-f]{6}$/
function backgroundLuminance(raw: string): null | number {
const v = raw.trim().toLowerCase()
if (!v) return null
const hex = v.startsWith('#') ? v.slice(1) : v
let rgb: [number, number, number] | null = null
if (HEX_6_RE.test(hex)) {
rgb = [parseInt(hex.slice(0, 2), 16), parseInt(hex.slice(2, 4), 16), parseInt(hex.slice(4, 6), 16)]
} else if (HEX_3_RE.test(hex)) {
// `charAt` always returns a string (vs index access, which is `string |
// undefined` under noUncheckedIndexedAccess); the regex guarantees 3 chars.
const r = hex.charAt(0)
const g = hex.charAt(1)
const b = hex.charAt(2)
rgb = [parseInt(r + r, 16), parseInt(g + g, 16), parseInt(b + b, 16)]
}
if (!rgb) return null
const [r, g, b] = rgb
return (0.2126 * r + 0.7152 * g + 0.0722 * b) / 255
}
/** Pick light vs dark with ordered, explainable env signals (mirror Ink). */
export function detectLightMode(
env: Record<string, string | undefined> = process.env,
lightDefaultTermPrograms: ReadonlySet<string> = LIGHT_DEFAULT_TERM_PROGRAMS
): boolean {
const lightFlag = (env.HERMES_TUI_LIGHT ?? '').trim().toLowerCase()
if (TRUE_RE.test(lightFlag)) return true
if (FALSE_RE.test(lightFlag)) return false
const themeFlag = (env.HERMES_TUI_THEME ?? '').trim().toLowerCase()
if (themeFlag === 'light') return true
if (themeFlag === 'dark') return false
const bgHint = backgroundLuminance(env.HERMES_TUI_BACKGROUND ?? '')
if (bgHint !== null) return bgHint >= LUMA_LIGHT_THRESHOLD
const colorfgbg = (env.COLORFGBG ?? '').trim()
if (colorfgbg) {
const lastField = colorfgbg.split(';').at(-1) ?? ''
if (/^\d+$/.test(lastField)) {
const bg = Number(lastField)
if (bg === 7 || bg === 15) return true
if (bg >= 0 && bg < 16) return false
}
}
const termProgram = (env.TERM_PROGRAM ?? '').trim()
return lightDefaultTermPrograms.has(termProgram)
}
function shouldNormalizeAnsiLightTheme(
env: Record<string, string | undefined> = process.env,
isLight = detectLightMode(env)
): boolean {
const colorTerm = (env.COLORTERM ?? '').trim().toLowerCase()
const termProgram = (env.TERM_PROGRAM ?? '').trim()
return termProgram === 'Apple_Terminal' && colorTerm !== 'truecolor' && colorTerm !== '24bit' && isLight
}
export function normalizeThemeForAnsiLightTerminal(
theme: Theme,
env: Record<string, string | undefined> = process.env,
isLight = detectLightMode(env)
): Theme {
if (!shouldNormalizeAnsiLightTheme(env, isLight)) return theme
const color = { ...theme.color }
for (const key of ANSI_NORMALIZED_FOREGROUNDS) color[key] = normalizeAnsiForeground(color[key])
for (const key of ANSI_MUTED_FOREGROUNDS) color[key] = `ansi256(${ANSI_MUTED_BUCKET})`
return { ...theme, color }
}
const DEFAULT_LIGHT_MODE = detectLightMode()
export const DEFAULT_THEME: Theme = normalizeThemeForAnsiLightTerminal(
DEFAULT_LIGHT_MODE ? LIGHT_THEME : DARK_THEME,
process.env,
DEFAULT_LIGHT_MODE
)
// ── Skin → Theme ─────────────────────────────────────────────────────
export function fromSkin(
colors: Record<string, string>,
branding: Record<string, string>,
bannerLogo = '',
bannerHero = '',
toolPrefix = '',
helpHeader = '',
spinner: Record<string, unknown> | undefined = undefined,
toolEmojis: Record<string, string> | undefined = undefined
): Theme {
const d = DEFAULT_THEME
const c = (k: string) => colors[k]
const hasSkinColors = Object.keys(colors).length > 0
const accent = c('ui_accent') ?? c('banner_accent') ?? d.color.accent
const bannerAccent = c('banner_accent') ?? c('banner_title') ?? d.color.accent
// Design pass (Appendix C precondition): `muted` is the transcript's "merely
// happened" NEUTRAL — it must NOT borrow `banner_dim` (the stock skin's dim
// GOLD banner shade; the Ink engine still uses it for its banner-tinted dim).
// Borrowing it re-golded every dim surface in the live app and made the hero
// color the wallpaper. Skins that want a custom transcript dim ship the
// dedicated `ui_muted` key; everything else gets the theme's true neutral.
const muted = c('ui_muted') ?? d.color.muted
const completionBg = c('completion_menu_bg') ?? d.color.completionBg
const completionCurrentBg =
c('completion_menu_current_bg') ??
(hasSkinColors ? mix(completionBg, bannerAccent, 0.25) : d.color.completionCurrentBg)
const completionMetaBg = c('completion_menu_meta_bg') ?? completionBg
const completionMetaCurrentBg = c('completion_menu_meta_current_bg') ?? completionCurrentBg
return normalizeThemeForAnsiLightTerminal(
{
color: {
primary: c('ui_primary') ?? c('banner_title') ?? d.color.primary,
accent,
border: c('ui_border') ?? c('banner_border') ?? d.color.border,
text: c('ui_text') ?? c('banner_text') ?? d.color.text,
muted,
// root canvas — skins may override (`ui_bg`); default true black/white.
bg: c('ui_bg') ?? d.color.bg,
completionBg,
completionCurrentBg,
completionMetaBg,
completionMetaCurrentBg,
label: c('ui_label') ?? d.color.label,
ok: c('ui_ok') ?? d.color.ok,
error: c('ui_error') ?? d.color.error,
warn: c('ui_warn') ?? d.color.warn,
prompt: c('prompt') ?? c('banner_text') ?? d.color.prompt,
sessionLabel: c('session_label') ?? muted,
sessionBorder: c('session_border') ?? muted,
statusBg: c('status_bar_bg') ?? d.color.statusBg,
statusFg: c('status_bar_text') ?? d.color.statusFg,
statusGood: c('status_bar_good') ?? c('ui_ok') ?? d.color.statusGood,
statusWarn: c('status_bar_warn') ?? c('ui_warn') ?? d.color.statusWarn,
statusBad: c('status_bar_bad') ?? d.color.statusBad,
statusCritical: c('status_bar_critical') ?? d.color.statusCritical,
selectionBg:
c('selection_bg') ??
c('completion_menu_current_bg') ??
(hasSkinColors ? completionCurrentBg : d.color.selectionBg),
diffAdded: d.color.diffAdded,
diffRemoved: d.color.diffRemoved,
diffAddedWord: d.color.diffAddedWord,
diffRemovedWord: d.color.diffRemovedWord,
diffAddedBg: c('diff_added_bg') ?? d.color.diffAddedBg,
diffRemovedBg: c('diff_removed_bg') ?? d.color.diffRemovedBg,
shellDollar: c('shell_dollar') ?? d.color.shellDollar
},
brand: {
name: branding.agent_name ?? d.brand.name,
icon: d.brand.icon,
prompt: cleanPromptSymbol(branding.prompt_symbol, d.brand.prompt),
welcome: branding.welcome ?? d.brand.welcome,
goodbye: branding.goodbye ?? d.brand.goodbye,
tool: toolPrefix || d.brand.tool,
helpHeader: branding.help_header ?? (helpHeader || d.brand.helpHeader)
},
bannerLogo,
bannerHero,
spinner: parseSpinner(spinner),
toolEmojis: toolEmojis ?? {}
},
process.env,
DEFAULT_LIGHT_MODE
)
}
/** Convenience: map a GatewaySkin payload straight to a Theme (defaults if empty). */
export function themeFromSkin(skin: GatewaySkin | undefined): Theme {
if (!skin) return DEFAULT_THEME
return fromSkin(
skin.colors ?? {},
skin.branding ?? {},
skin.banner_logo ?? '',
skin.banner_hero ?? '',
skin.tool_prefix ?? '',
skin.help_header ?? '',
skin.spinner,
skin.tool_emojis
)
}

View File

@@ -0,0 +1,139 @@
/**
* Pure text-shaping helpers for compact tool-result rendering (spec v4 §7 / §8).
* No OpenTUI/Solid imports — just string work, trivially unit-testable. Ported
* 1:1 from the React build's `engine/toolOutput.ts` (itself mirroring opencode's
* `util/collapse-tool-output.ts` + the gateway tool-result JSON-envelope unwrap).
*/
/** Result of collapsing tool output for the block render. */
export interface Collapsed {
lines: string[]
/** How many trailing lines were dropped (0 when nothing was hidden). */
hiddenLines: number
truncated: boolean
}
// CSI escape sequences (SGR colors, cursor, mouse). The gateway colors some
// slash/notice text with raw ANSI for the Ink TUI, which interprets it; the
// native `<text>` renders byte-for-byte, so those codes would leak as literal
// glyphs. Strip them on display (item 8).
// eslint-disable-next-line no-control-regex
const ANSI_CSI = /[\u001b\u009b]\[[0-9;:?<>=]*[ -/]*[@-~]/g
/** Remove ANSI/SGR/mouse escape sequences so they don't render as literal text. */
export function stripAnsi(s: string): string {
return (s ?? '').replace(ANSI_CSI, '')
}
/** Truncate a single line to `width` columns, adding an ellipsis when cut. */
export function truncate(s: string, width: number): string {
const w = Math.max(1, width)
return s.length > w ? s.slice(0, Math.max(1, w - 1)) + '…' : s
}
/**
* Un-double-escape gateway output that arrived with LITERAL `\n`/`\t` escapes
* (some tool tails are repr'd, so newlines show as backslash-n — item 7 "ugly").
* Conservative: only un-escapes when literal `\n` sequences OUTNUMBER real
* newlines, so genuinely multi-line output (and code that legitimately contains
* the two chars `\` + `n`) is left untouched.
*/
export function normalizeOutput(text: string): string {
const real = (text.match(/\n/g) ?? []).length
const literal = (text.match(/\\n/g) ?? []).length
if (literal > real)
return text
.replace(/\\r\\n/g, '\n')
.replace(/\\n/g, '\n')
.replace(/\\t/g, ' ')
return text
}
/**
* Unwrap the gateway's tool-result JSON envelope so the view shows the actual
* output, not the wrapper. Many tools return
* `{"output": "...", "exit_code": 0, "error": null}`. If `raw` parses to such an
* object, return its `output` (plus a compact error/exit suffix when the command
* failed); otherwise return `raw` unchanged. (Gotcha §8 — strip the envelope.)
*/
/**
* When the gateway tail-caps a LARGE result it serialises the whole envelope
* first, so the surviving tail ends mid-string with the envelope close — and,
* if the head survived, opens with the envelope's prefix up to `"output": "`.
* The fragment can't be JSON.parsed, so peel those affixes off conservatively
* (only the exact gateway shapes; real output won't end this way). Two shapes
* observed live (v6fix wire capture):
* terminal/process: `{"output": "…", "exit_code": 0, "error": null}`
* execute_code: `{"status": "success", "output": "…",
* "tool_calls_made": 0, "duration_seconds": 0.21[, "error": "…"]}`
* The tail anchors on the first trailing key (`exit_code`/`tool_calls_made`)
* then allows the remaining envelope keys in any order. Items 2 + 6.
*/
const ENVELOPE_HEAD = /^\s*\{\s*(?:"status"\s*:\s*"[^"]*"\s*,\s*)?"output"\s*:\s*"/
const ENVELOPE_TAIL =
/"\s*,\s*"(?:exit_code|tool_calls_made)"\s*:\s*-?\d+(?:\s*,\s*"(?:error|status|duration_seconds|exit_code|tool_calls_made)"\s*:\s*(?:null|-?\d+(?:\.\d+)?|"(?:[^"\\]|\\.)*"))*\s*\}\s*$/
function unwrapEnvelopeFragment(s: string): string {
const tail = ENVELOPE_TAIL.test(s)
const head = ENVELOPE_HEAD.test(s)
if (!tail && !head) return s
return s.replace(ENVELOPE_HEAD, '').replace(ENVELOPE_TAIL, '')
}
export function stripToolEnvelope(raw: string): string {
const s = (raw ?? '').trim()
if (!s.startsWith('{')) return normalizeOutput(unwrapEnvelopeFragment(raw ?? ''))
try {
const parsed: unknown = JSON.parse(s)
if (parsed && typeof parsed === 'object' && !Array.isArray(parsed) && 'output' in parsed) {
const obj = parsed as Record<string, unknown>
let out = typeof obj.output === 'string' ? obj.output : JSON.stringify(obj.output, null, 2)
const err = obj.error
const code = obj.exit_code
if (typeof err === 'string' && err) out += `\n[error] ${err}`
else if (typeof code === 'number' && code !== 0) out += `\n[exit ${code}]`
return normalizeOutput(out)
}
} catch {
// not parseable as a whole — maybe a tail-capped envelope fragment
}
return normalizeOutput(unwrapEnvelopeFragment(raw ?? ''))
}
/**
* The gateway caps verbose tool output to a tail and PREFIXES a literal label
* (`tui_gateway/server.py:_cap_tui_verbose_text`):
* `[showing verbose tail; omitted 5 lines / 234 chars]\n<tail>`
* `[showing verbose tail; omitted 512 chars]\n<tail>`
* The raw label is neither useful nor pretty (item 2). Strip it off and hand the
* view a tidy `omittedNote` ("5 lines / 234 chars") to render as a dim affordance.
*/
export function stripOmittedNote(text: string): { body: string; omittedNote?: string } {
const s = (text ?? '').replace(/^\s+/, '')
const match = s.match(/^\[showing verbose tail; omitted (.+?)\]\n/)
if (!match) return { body: text ?? '' }
return { body: s.slice(match[0].length), omittedNote: match[1] ?? '' }
}
/**
* Width cap for a collapsed tool header's ARGS preview (design pass): args are
* context, not content — they get at most ~half the pane, so a long command or
* path can never become the loudest mass on screen. Shared by the header
* truncation (toolPart) and the bash body's "did the header already show the
* whole command" echo check (bashTool) so the two stay mirrored.
*/
export function argsCapColumns(totalWidth: number): number {
return Math.max(8, Math.floor(totalWidth / 2))
}
/**
* Collapse text to at most `maxLines` lines, each capped to `width` columns. The
* view renders an overflow marker from `hiddenLines`; this stays pure (no marker).
*/
export function collapseToolOutput(text: string, maxLines: number, width: number): Collapsed {
const all = (text ?? '').replace(/\s+$/, '').split('\n')
const limit = Math.max(1, maxLines)
const lines = all.slice(0, limit).map(l => truncate(l, width))
const hiddenLines = Math.max(0, all.length - lines.length)
return { hiddenLines, lines, truncated: hiddenLines > 0 }
}

View File

@@ -0,0 +1,17 @@
/**
* One-line string truncation helpers shared by the chrome views (status bar,
* agents dashboard, background panel) — keep them here so the ellipsis rule
* doesn't drift between copies.
*/
/** Keep the HEAD of a string, suffixing `…` when it must clip (e.g. a goal/command row). */
export function truncRight(s: string, max: number): string {
if (max <= 1) return s.length > max ? '…' : s
return s.length <= max ? s : s.slice(0, max - 1) + '…'
}
/** Keep the TAIL of a string, prefixing `…` when it must clip (e.g. a deep cwd path). */
export function truncLeft(s: string, max: number): string {
if (max <= 1) return s.length > max ? '…' : s
return s.length <= max ? s : '…' + s.slice(s.length - max + 1)
}

View File

@@ -0,0 +1,259 @@
/**
* window — pure transcript-windowing math (slices S1+S2 of docs/plans/
* opentui-transcript-windowing.md, issue #27). The view (view/transcript.tsx)
* replaces out-of-window rows with EXACT-HEIGHT empty boxes (1 yoga node, no
* text buffers / native handles), so the mounted set stays ~3 viewports of
* rows regardless of transcript length. This module is the testable core:
*
* - `computeWindow` — which row keys must be mounted for a given scrollTop:
* rows intersecting [scrollTop margin, scrollTop + viewport + margin)
* over CUMULATIVE row heights (exact recorded heights; a line-count
* estimate stands in for never-measured rows), plus the never-window rows
* (streaming/live) and the bottom K rows (sticky-bottom region). With
* `pinnedBottom` (S2) the window anchors to the BOTTOM of the content
* instead of `scrollTop`: during burst appends / a resume snapshot the
* sticky pin will land at the new bottom, but layout (and therefore
* scrollTop) lags the store — anchoring to the cumulative content bottom
* adjudicates appended rows immediately instead of one frame late.
* - `shouldRecompute` — the hysteresis gate (≥ ¼ viewport via
* `hysteresisFor`): a computed window only changes once scrollTop has
* moved ≥ hysteresis from the anchor it was computed at, so swaps don't
* thrash at window edges.
* - `correctionIsLegal` — the jank rule for spacer-height corrections:
* a correction may only touch rows fully ABOVE the viewport (the caller
* compensates scrollTop in the same frame — automatic when bottom-anchored
* via the sticky pin) or fully BELOW it (invisible by definition). Anything
* intersecting the viewport would visibly move content: forbidden.
* - `estimateMessageHeight` — the cheap line-count estimate for rows that
* have never been measured (resume history above the viewport). A wrong
* estimate is fixed by remount (scrolling near) or by the S2 idle measure
* pass, both governed by the jank rule.
* - `edgeMeasureBatch` (S2 — design §4, the SIMPLE choice): @opentui/core
* cannot lay a renderable out without parenting it into the live tree
* (layout is the tree's Yoga pass), so true offscreen measurement isn't
* available. Instead the idle pass mounts a small batch of never-measured
* rows nearest the bottom window edge — they are the next to be seen when
* the user scrolls back — records their exact heights, and lets the next
* window recompute swap them back to (now exact) spacers. Estimates far
* from the window stay estimates until the march reaches them.
* - `windowRowStats` — a DEV counter (current / peak simultaneously-mounted
* real rows) the integration tests assert against and the bench can read
* (transcript.tsx exposes it on globalThis behind HERMES_TUI_WINDOW_STATS).
*/
import type { Message, Part } from './store.ts'
/** One transcript row as the window calc sees it. */
export interface WindowRow<K> {
readonly key: K
/** Exact recorded height (the row wrapper's last onSizeChange measurement,
* margins included) — or null when the row has never been measured. */
readonly height: number | null
/** Line-count estimate used while `height` is null (see estimateMessageHeight). */
readonly estimate?: number | undefined
/** Always mounted regardless of the window (streaming/live rows — a remount
* would restart native markdown streaming). */
readonly neverWindow: boolean
}
export interface WindowParams<K> {
readonly rows: readonly WindowRow<K>[]
readonly scrollTop: number
readonly viewportHeight: number
/** Mounted band kept above/below the viewport (design: 1 viewport each side). */
readonly margin: number
/** Stand-in height for null-height rows without their own estimate. */
readonly fallbackHeight?: number
/** The bottom K rows are always mounted (sticky-bottom region). */
readonly bottomK?: number
/** Anchor the window to the BOTTOM of the cumulative content instead of
* `scrollTop` (S2 append-time adjudication): while the view is pinned to
* the bottom, appended rows extend the content BELOW the last laid-out
* scrollTop — the sticky pin only catches up at the next layout pass.
* Anchoring to the content bottom adjudicates those rows immediately
* (new in-window rows mount, rows pushed past the margin become spacers)
* without waiting a frame. */
readonly pinnedBottom?: boolean
}
export interface WindowResult<K> {
/** Row keys that must be mounted; everything else renders as a spacer. */
readonly mounted: ReadonlySet<K>
/** The scrollTop this window was computed at — the next hysteresis anchor. */
readonly anchor: number
}
/** Default stand-in for a null-height row with no estimate (≈ a short row). */
export const DEFAULT_FALLBACK_HEIGHT = 2
/** Ceiling on a single row's line-count estimate — a pathological wall of text
* must not make the never-mounted region look kilometers tall. */
const ESTIMATE_MAX_LINES = 500
/** Hysteresis for the window recompute: ≥ ¼ viewport (design rule), never 0. */
export function hysteresisFor(viewportHeight: number): number {
return Math.max(1, Math.ceil(viewportHeight / 4))
}
/** Whether scrollTop has moved far enough from the last computation anchor to
* justify a new window (no anchor yet → always). */
export function shouldRecompute(scrollTop: number, anchor: number | null, hysteresis: number): boolean {
if (anchor === null) return true
return Math.abs(scrollTop - anchor) >= hysteresis
}
/** Compute the set of row keys that must be mounted for this scroll position. */
export function computeWindow<K>(params: WindowParams<K>): WindowResult<K> {
const fallback = params.fallbackHeight ?? DEFAULT_FALLBACK_HEIGHT
const bottomK = params.bottomK ?? 0
const heightOf = (r: WindowRow<K>): number => r.height ?? r.estimate ?? fallback
// pinnedBottom: the effective scrollTop is where the sticky pin will land —
// the cumulative content bottom minus one viewport (clamped at 0).
let effectiveTop = params.scrollTop
if (params.pinnedBottom) {
let contentHeight = 0
for (const r of params.rows) contentHeight += heightOf(r)
effectiveTop = Math.max(0, contentHeight - params.viewportHeight)
}
const windowStart = effectiveTop - params.margin
const windowEnd = effectiveTop + params.viewportHeight + params.margin
const total = params.rows.length
const mounted = new Set<K>()
let top = 0
let index = 0
for (const r of params.rows) {
const bottom = top + heightOf(r)
// half-open intersection: a row merely touching a window edge stays out.
const intersects = bottom > windowStart && top < windowEnd
if (intersects || r.neverWindow || index >= total - bottomK) mounted.add(r.key)
top = bottom
index++
}
return { mounted, anchor: effectiveTop }
}
/** Rows the S2 idle measure pass should mount next: up to `batch` never-
* measured, not-currently-mounted, windowable rows, NEAREST THE BOTTOM first
* (the bottom window edge is where a scroll-back enters history, so these are
* the next rows to be seen; the march then proceeds upward over idle pulses).
* Never-window rows are excluded — they are always mounted anyway. */
export function edgeMeasureBatch<K>(rows: readonly WindowRow<K>[], mounted: ReadonlySet<K>, batch: number): K[] {
const out: K[] = []
for (let i = rows.length - 1; i >= 0 && out.length < batch; i--) {
const r = rows[i]
if (!r || r.height !== null || r.neverWindow || mounted.has(r.key)) continue
out.push(r.key)
}
return out
}
/** Default idle delay before a lazy measure pulse (design §4): no appends, no
* scroll movement, no running turn for this long → mount one small batch. */
export const DEFAULT_MEASURE_IDLE_MS = 1000
/** Parse `HERMES_TUI_WINDOW_IDLE_MS` (TUI-only DEV/test knob): the idle delay
* before a lazy measure pulse. A non-negative integer → that delay (0 = pulse
* on every idle frame — the headless tests use this to make pulses
* deterministic); unset/garbage → DEFAULT_MEASURE_IDLE_MS. */
export function measureIdleDelayMs(value: string | undefined): number {
const v = value?.trim() ?? ''
if (!/^\d+$/.test(v)) return DEFAULT_MEASURE_IDLE_MS
return Number.parseInt(v, 10)
}
// ── DEV counter: simultaneously-mounted real rows (current + peak) ────────
// Two ints, always maintained (the cost is negligible); the integration tests
// assert `peakMounted` stays bounded during bursts/resume, and transcript.tsx
// exposes the live object on globalThis when HERMES_TUI_WINDOW_STATS is set so
// the bench can sample it. One transcript per process in practice; tests that
// mount several reset between phases.
export interface WindowRowStats {
mounted: number
peakMounted: number
}
const rowStats: WindowRowStats = { mounted: 0, peakMounted: 0 }
/** The live stats object (mutated in place — safe to hold a reference). */
export function windowRowStats(): Readonly<WindowRowStats> {
return rowStats
}
export function noteRowMounted(): void {
rowStats.mounted++
if (rowStats.mounted > rowStats.peakMounted) rowStats.peakMounted = rowStats.mounted
}
export function noteRowUnmounted(): void {
rowStats.mounted--
}
/** Reset the peak to the CURRENT mounted count (rows still live stay counted). */
export function resetWindowRowStats(): void {
rowStats.peakMounted = rowStats.mounted
}
/**
* The jank rule: may a spacer-height correction for the row spanning
* [rowTop, rowBottom) be applied at this scroll position without visibly
* moving content?
*
* - Fully BELOW the viewport → legal (invisible by definition).
* - Fully ABOVE the viewport → legal, PROVIDED the caller compensates
* scrollTop by the height delta in the same frame. When `atBottom`
* (sticky-bottom pinned) the pin performs that compensation automatically
* (bottom-anchored ⇒ zero visual movement); legality is the same either
* way — the flag documents which side owes the compensation.
* - Intersecting the viewport → forbidden; defer until the row scrolls out
* or is remounted for view.
*/
export function correctionIsLegal(
rowTop: number,
rowBottom: number,
scrollTop: number,
viewportHeight: number,
_atBottom: boolean
): boolean {
if (rowTop >= scrollTop + viewportHeight) return true // fully below the viewport
if (rowBottom <= scrollTop) return true // fully above — compensate scrollTop in the same frame
return false
}
/** Rendered line count of a text block (1-based; empty text still occupies a row). */
function lineCount(text: string): number {
if (!text) return 1
let lines = 1
for (let i = 0; i < text.length; i++) if (text.charCodeAt(i) === 10) lines++
return lines
}
/** Estimated rendered lines of one part: text → its line count (view strips
* leading/trailing blanks — mirror that); tool/reasoning → 1 collapsed header
* line (the default render for settled, never-mounted history). */
function partLines(part: Part): number {
if (part.type === 'text') return lineCount(part.text.replace(/^\n+|\n+$/g, ''))
return 1 // collapsed tool/reasoning header line
}
/**
* Cheap line-count height estimate for a row that has never been measured
* (resume history above the viewport). Deliberately ignores soft wrapping
* — it is a placeholder until the row is actually mounted/measured, and a
* wrong value may only be corrected per `correctionIsLegal` (or left until
* remount). `spacing` is the row's turnSpacing margins; `gap` the inter-part
* blank line (0 in /compact).
*/
export function estimateMessageHeight(
message: Pick<Message, 'text' | 'parts'> & { readonly role?: Message['role'] },
spacing: { readonly top: number; readonly bottom: number },
gap: number
): number {
const parts = message.parts
let content: number
if (parts && parts.length > 0) {
content = gap * (parts.length - 1)
for (const part of parts) content += partLines(part)
} else {
content = lineCount(message.text)
}
return Math.min(ESTIMATE_MAX_LINES, Math.max(1, content)) + spacing.top + spacing.bottom
}

View File

@@ -0,0 +1,51 @@
/**
* Agents dashboard (P2 de-crowd) — the master list is ONE line per subagent
* (long goals truncate, no multi-line prompt dump) and the detail pane renders
* the TYPED trace by kind (⚡ tool / · progress / ✓ summary).
*/
import { describe, expect, test } from 'vitest'
import { createSessionStore } from '../logic/store.ts'
import { App } from '../view/App.tsx'
import { ThemeProvider } from '../view/theme.tsx'
import { captureFrame } from './lib/render.ts'
const LONG_GOAL =
'Poll the current UTC time 10 times with a 3-second sleep between each poll, run date -u and record each result, then report all ten timestamps as a timing exercise'
function dash() {
const store = createSessionStore()
store.apply({ type: 'gateway.ready' })
store.apply({
type: 'subagent.start',
payload: { subagent_id: 'a1', goal: LONG_GOAL, model: 'anthropic/claude-opus-4-8', depth: 0 }
})
store.apply({ type: 'subagent.tool', payload: { subagent_id: 'a1', tool_name: 'terminal', text: 'date -u' } })
store.apply({ type: 'subagent.progress', payload: { subagent_id: 'a1', text: 'poll 4 of 10 recorded' } })
store.apply({ type: 'subagent.complete', payload: { subagent_id: 'a1', summary: 'all ten timestamps collected' } })
store.openDashboard()
return () => (
<ThemeProvider theme={() => store.state.theme}>
<App store={store} />
</ThemeProvider>
)
}
describe('agents dashboard de-crowd (P2)', () => {
test('a long goal is truncated to one line in the master list (no full-prompt wall)', async () => {
const frame = await captureFrame(dash(), { until: 'Agents', width: 116, height: 30 })
// The master row truncates to one line — the head shows with an ellipsis.
// (The detail pane below still shows the full goal; that's the inspect half.)
expect(frame).toContain('Poll the current UTC time')
expect(frame).toContain('…') // ellipsis proves the master row is one-line, not a wrapped wall
})
test('the detail pane renders the typed trace by kind (tool ⚡, summary ✓)', async () => {
const frame = await captureFrame(dash(), { until: 'Agents', width: 116, height: 30 })
expect(frame).toContain('⚡') // tool entry glyph
expect(frame).toContain('terminal — date -u') // tool entry text
expect(frame).toContain('✓') // summary entry glyph
expect(frame).toContain('all ten timestamps collected') // summary text (detail, not master)
expect(frame).toContain('poll 4 of 10 recorded') // progress entry
})
})

View File

@@ -0,0 +1,264 @@
/**
* Background-agents tray tests (Epic 2.7). Headless frames through the real
* App + Composer + AgentsTray with a simulated keyboard:
*
* - visibility: nothing rendered with 0 running agents; a one-line muted
* indicator with the count otherwise; completed/failed agents drop out.
* - focus-routing table: Down on an EMPTY composer with running agents
* focuses/expands the tray; Down with text keeps its meaning; Down with
* the slash menu open stays menu navigation (routeMenuKey integration
* pin); Down with 0 agents keeps prompt history; Esc from the tray
* returns focus to the composer; a printable key from the tray bounces
* focus back AND inserts the char (the composer's reclaim rule).
* - Enter on a tray row opens the agents dashboard preselected on that row.
*
* The onType wiring mirrors slashMenu.test.tsx (planCompletion → fake catalog)
* so the menu-precedence pin runs against entry-parity completions.
*/
import { describe, expect, test } from 'vitest'
import { createPromptHistory } from '../logic/history.ts'
import { planCompletion } from '../logic/slash.ts'
import { createSessionStore, type CompletionItem, type SessionStore } from '../logic/store.ts'
import { App } from '../view/App.tsx'
import { isTrayAgent } from '../view/agentsTray.tsx'
import { ThemeProvider } from '../view/theme.tsx'
import { renderProbe, type RenderProbe } from './lib/render.ts'
const EXPANDED_HINT = 'Enter inspect'
/** Fake gateway catalog (what `complete.slash` would return for a `/` prefix). */
const CATALOG: CompletionItem[] = [
{ display: '/clear', meta: 'clear the transcript', text: '/clear' },
{ display: '/copy', meta: 'copy the last response', text: '/copy' }
]
interface Harness {
probe: RenderProbe
store: SessionStore
submitted: string[]
typed: string[]
}
/** Mount the real App (entry-parity onType, like slashMenu.test.tsx). */
async function mountApp(historyEntries: string[] = []): Promise<Harness> {
const store = createSessionStore()
store.apply({ type: 'gateway.ready' })
const submitted: string[] = []
const typed: string[] = []
const history = createPromptHistory({ initial: historyEntries })
const onType = (text: string) => {
typed.push(text)
const plan = planCompletion(text)
if (!plan || plan.method !== 'complete.slash') {
store.clearCompletions()
return
}
const q = String(plan.params.text).toLowerCase()
const items = CATALOG.filter(c => c.text.startsWith(q) && c.text !== q)
if (items.length) store.setCompletions(items, plan.from)
else store.clearCompletions()
}
const probe = await renderProbe(
() => (
<ThemeProvider theme={() => store.state.theme}>
<App store={store} onSubmit={t => submitted.push(t)} onType={onType} history={history} />
</ThemeProvider>
),
// kitty keyboard: a SIMULATED lone ESC never parses under legacy input, and
// the Esc-from-tray test needs it.
{ height: 26, kittyKeyboard: true, width: 70 }
)
return { probe, store, submitted, typed }
}
const spawn = (store: SessionStore, id: string, goal: string) =>
store.apply({ type: 'subagent.start', payload: { depth: 0, goal, subagent_id: id } })
const complete = (store: SessionStore, id: string) =>
store.apply({ type: 'subagent.complete', payload: { subagent_id: id, summary: 'done' } })
describe('agents tray — visibility', () => {
test('isTrayAgent: running-ish statuses are in; ALL terminal statuses are out', () => {
for (const status of ['running', 'thinking', 'tool', 'working']) {
expect(isTrayAgent({ depth: 0, goal: 'g', id: 'x', status })).toBe(true)
}
// `complete` is the store fallback; the LIVE gateway sends delegate_tool's
// payload status verbatim — `completed`/`failed`/`error`/`timeout`/`interrupted`
// (verified live: the success path emits status="completed").
for (const status of ['complete', 'completed', 'failed', 'error', 'timeout', 'interrupted']) {
expect(isTrayAgent({ depth: 0, goal: 'g', id: 'x', status })).toBe(false)
}
})
test('0 running agents → the tray renders nothing', async () => {
const h = await mountApp()
try {
expect(h.probe.frame()).not.toContain('⚡')
} finally {
h.probe.destroy()
}
})
test('2 running agents → a ⚡ chip in the status bar (no persistent tray line)', async () => {
const h = await mountApp()
try {
spawn(h.store, 'a1', 'research X')
spawn(h.store, 'a2', 'compile Y')
const frame = await h.probe.waitForFrame(f => f.includes('⚡'))
expect(frame).toContain(`⚡ 2`)
expect(frame).not.toContain(EXPANDED_HINT) // collapsed until focused
} finally {
h.probe.destroy()
}
})
test('completed agents drop out; the tray empties when all finish', async () => {
const h = await mountApp()
try {
spawn(h.store, 'a1', 'research X')
spawn(h.store, 'a2', 'compile Y')
await h.probe.waitForFrame(f => f.includes('⚡ 2'))
complete(h.store, 'a1')
const one = await h.probe.waitForFrame(f => f.includes('⚡ 1'))
expect(one).toContain(`⚡ 1`)
complete(h.store, 'a2')
const none = await h.probe.waitForFrame(f => !f.includes('⚡'))
expect(none).not.toContain('⚡')
} finally {
h.probe.destroy()
}
})
})
describe('agents tray — Down-arrow focus routing', () => {
test('Down on an EMPTY composer with running agents focuses + expands the tray', async () => {
const h = await mountApp()
try {
spawn(h.store, 'a1', 'research X')
spawn(h.store, 'a2', 'compile Y')
await h.probe.waitForFrame(f => f.includes('⚡'))
h.probe.keys.pressArrow('down')
const frame = await h.probe.waitForFrame(f => f.includes(EXPANDED_HINT))
// rows show goal + status, with the first row selected
expect(frame).toContain('research X')
expect(frame).toContain('compile Y')
expect(frame).toContain('● running')
expect(frame).toMatch(/▸ ● running\s+research X/)
expect(frame).not.toContain('↓ to inspect') // the old persistent tray hint is gone (folded to the status bar)
} finally {
h.probe.destroy()
}
})
test('Down with TEXT in the composer keeps its meaning (no tray focus)', async () => {
const h = await mountApp()
try {
spawn(h.store, 'a1', 'research X')
await h.probe.waitForFrame(f => f.includes('⚡'))
await h.probe.keys.typeText('hello')
await h.probe.settle()
h.probe.keys.pressArrow('down')
await h.probe.settle()
const frame = h.probe.frame()
expect(frame).toContain('hello') // text untouched
expect(frame).not.toContain(EXPANDED_HINT)
expect(frame).toContain('⚡') // still just the indicator
} finally {
h.probe.destroy()
}
})
test('Down with the slash menu open stays MENU navigation (routeMenuKey pin)', async () => {
const h = await mountApp()
try {
spawn(h.store, 'a1', 'research X')
await h.probe.waitForFrame(f => f.includes('⚡'))
await h.probe.keys.typeText('/c')
await h.probe.settle()
await h.probe.waitForFrame(f => f.includes('/copy'))
h.probe.keys.pressArrow('down') // menu: /clear → /copy (NOT the tray)
await h.probe.settle()
expect(h.probe.frame()).not.toContain(EXPANDED_HINT)
h.probe.keys.pressEnter() // accepts the highlighted command
await h.probe.settle()
expect(h.typed.at(-1)).toBe('/copy ')
expect(h.submitted).toEqual([])
} finally {
h.probe.destroy()
}
})
test('Down with 0 running agents keeps prompt history as today', async () => {
const h = await mountApp(['older prompt'])
try {
h.probe.keys.pressArrow('up') // recall
await h.probe.settle()
expect(h.probe.frame()).toContain('older prompt')
h.probe.keys.pressArrow('down') // back to the (empty) draft — not a tray focus
await h.probe.settle()
const frame = h.probe.frame()
expect(frame).not.toContain('older prompt')
expect(frame).not.toContain(EXPANDED_HINT)
} finally {
h.probe.destroy()
}
})
test('Esc from the focused tray collapses it and refocuses the composer', async () => {
const h = await mountApp()
try {
spawn(h.store, 'a1', 'research X')
await h.probe.waitForFrame(f => f.includes('⚡'))
h.probe.keys.pressArrow('down')
await h.probe.waitForFrame(f => f.includes(EXPANDED_HINT))
h.probe.keys.pressEscape()
const frame = await h.probe.waitForFrame(f => !f.includes(EXPANDED_HINT))
expect(frame).toContain('⚡') // back to the collapsed line
await h.probe.keys.typeText('hi') // composer has focus again
await h.probe.settle()
expect(h.probe.frame()).toContain('hi')
} finally {
h.probe.destroy()
}
})
test('a printable key from the focused tray bounces to the composer AND inserts', async () => {
const h = await mountApp()
try {
spawn(h.store, 'a1', 'research X')
await h.probe.waitForFrame(f => f.includes('⚡'))
h.probe.keys.pressArrow('down')
await h.probe.waitForFrame(f => f.includes(EXPANDED_HINT))
await h.probe.keys.typeText('x')
const frame = await h.probe.waitForFrame(f => !f.includes(EXPANDED_HINT))
expect(frame).toContain('⚡') // tray collapsed (textarea reclaimed focus)
expect(frame).toContain('x') // …and the char landed in the composer
} finally {
h.probe.destroy()
}
})
})
describe('agents tray — Enter opens the dashboard preselected', () => {
test('Down to the second row + Enter → dashboard open on THAT agent', async () => {
const h = await mountApp()
try {
spawn(h.store, 'a1', 'research X')
spawn(h.store, 'a2', 'compile Y')
await h.probe.waitForFrame(f => f.includes('⚡'))
h.probe.keys.pressArrow('down') // focus the tray (row 0)
await h.probe.waitForFrame(f => f.includes(EXPANDED_HINT))
h.probe.keys.pressArrow('down') // select row 1 (compile Y)
await h.probe.settle()
h.probe.keys.pressEnter()
const frame = await h.probe.waitForFrame(f => f.includes('⛓ Agents'))
expect(h.store.state.dashboard).toBe(true)
expect(h.store.state.dashboardAgent).toBe('a2')
expect(frame).toMatch(/▸ ● running\s+compile Y/) // master list preselected
expect(h.submitted).toEqual([]) // Enter opened the dashboard, no submit
} finally {
h.probe.destroy()
}
})
})

View File

@@ -0,0 +1,140 @@
/**
* Background-activity logic tests — pure parsers + derive helpers. Everything
* off the wire is `unknown`, so the parsers must defend against garbage/missing
* fields and map snake_case → camelCase.
*/
import { describe, expect, test } from 'vitest'
import {
type BackgroundProcess,
isChromeNotice,
parseNotification,
parseProcessList,
runningCount
} from '../logic/backgroundActivity.ts'
describe('parseNotification', () => {
test('happy path: full payload, snake_case ttl_ms → ttlMs', () => {
expect(
parseNotification({ id: 'job-1', key: 'k1', kind: 'task.complete', level: 'warn', text: 'done', ttl_ms: 5000 })
).toEqual({ id: 'job-1', key: 'k1', kind: 'task.complete', level: 'warn', text: 'done', ttlMs: 5000 })
})
test('garbage / missing level coerces to info; missing kind → ""', () => {
expect(parseNotification({ id: 'a', level: 'screaming', text: 'hi' })).toEqual({
id: 'a',
kind: '',
level: 'info',
text: 'hi'
})
expect(parseNotification({ id: 'b', text: 'no level' })?.level).toBe('info')
})
test('missing/empty text → null (text is load-bearing for the card)', () => {
expect(parseNotification({ id: 'a', level: 'info' })).toBeNull()
expect(parseNotification({ id: 'a', text: '' })).toBeNull()
expect(parseNotification(null)).toBeNull()
expect(parseNotification('nope')).toBeNull()
})
test('id falls back to key when id is absent', () => {
const n = parseNotification({ key: 'k-only', text: 'hello' })
expect(n?.id).toBe('k-only')
expect(n?.key).toBe('k-only')
})
test('no id and no key → synthesized stable id `n:${text}`', () => {
const n = parseNotification({ text: 'build finished' })
expect(n?.id).toBe('n:build finished')
expect(n?.key).toBeUndefined()
})
test('id is preferred over key when both present', () => {
expect(parseNotification({ id: 'real', key: 'k', text: 'x' })?.id).toBe('real')
})
test('non-number ttl_ms is dropped (no ttlMs)', () => {
const n = parseNotification({ id: 'a', text: 'x', ttl_ms: 'soon' })
expect(n?.ttlMs).toBeUndefined()
})
test('preserves level "success" (previously dropped to info)', () => {
expect(parseNotification({ id: 's', level: 'success', text: 'credits topped up' })?.level).toBe('success')
})
test('ttl credits notice: kind "ttl" + ttl_ms → kind/ttlMs preserved', () => {
const n = parseNotification({ kind: 'ttl', text: 'low on credits', ttl_ms: 8000 })
expect(n?.kind).toBe('ttl')
expect(n?.ttlMs).toBe(8000)
})
})
describe('isChromeNotice', () => {
const mk = (kind: string): Parameters<typeof isChromeNotice>[0] => ({ id: 'i', kind, level: 'info', text: 't' })
test('true for lifecycle kinds sticky | ttl (credits/usage notices)', () => {
expect(isChromeNotice(mk('sticky'))).toBe(true)
expect(isChromeNotice(mk('ttl'))).toBe(true)
})
test('false for label kinds / empty (inline process+background cards)', () => {
expect(isChromeNotice(mk('process.complete'))).toBe(false)
expect(isChromeNotice(mk(''))).toBe(false)
expect(isChromeNotice(mk('background task complete'))).toBe(false)
})
})
describe('parseProcessList', () => {
test('maps good rows, snake_case → camelCase', () => {
expect(
parseProcessList({
processes: [
{ command: 'npm test', session_id: 's1', status: 'running', uptime_seconds: 12 },
{ command: 'build', session_id: 's2', status: 'exited', uptime_seconds: 99 }
]
})
).toEqual([
{ command: 'npm test', sessionId: 's1', status: 'running', uptimeSeconds: 12 },
{ command: 'build', sessionId: 's2', status: 'exited', uptimeSeconds: 99 }
])
})
test('skips malformed rows (missing session_id or command); defaults status/uptime', () => {
expect(
parseProcessList({
processes: [
{ command: 'ok', session_id: 's1' }, // no status/uptime → defaults
{ command: 'no-session' }, // dropped
{ session_id: 's3' }, // dropped
null, // dropped
'garbage' // dropped
]
})
).toEqual([{ command: 'ok', sessionId: 's1', status: '', uptimeSeconds: 0 }])
})
test('non-object / missing processes → []', () => {
expect(parseProcessList(null)).toEqual([])
expect(parseProcessList({})).toEqual([])
expect(parseProcessList({ processes: 'nope' })).toEqual([])
})
})
describe('runningCount', () => {
const procs: BackgroundProcess[] = [
{ command: 'a', sessionId: 's1', status: 'running', uptimeSeconds: 1 },
{ command: 'b', sessionId: 's2', status: 'exited', uptimeSeconds: 1 },
{ command: 'c', sessionId: 's3', status: 'Sleeping', uptimeSeconds: 1 }, // unknown → running (lenient)
{ command: 'd', sessionId: 's4', status: 'DONE', uptimeSeconds: 1 }, // case-insensitive terminal
{ command: 'e', sessionId: 's5', status: 'killed', uptimeSeconds: 1 }
]
test('counts running-ish processes (lenient on unknown statuses)', () => {
// running + Sleeping = 2 (exited/DONE/killed excluded)
expect(runningCount(procs)).toBe(2)
})
test('empty input → 0', () => {
expect(runningCount([])).toBe(0)
})
})

View File

@@ -0,0 +1,59 @@
/**
* Background-process panel (P3) — /bg opens it; it lists the polled OS processes
* with a running count + stop-all affordance, and shows an empty state.
*/
import { describe, expect, test } from 'vitest'
import { parseProcessList } from '../logic/backgroundActivity.ts'
import { createSessionStore } from '../logic/store.ts'
import { App } from '../view/App.tsx'
import { ThemeProvider } from '../view/theme.tsx'
import { captureFrame } from './lib/render.ts'
function appWith(store: ReturnType<typeof createSessionStore>) {
return () => (
<ThemeProvider theme={() => store.state.theme}>
<App store={store} />
</ThemeProvider>
)
}
describe('background-process panel (P3)', () => {
test('parseProcessList maps an agents.list result (snake_case → camel, skips junk)', () => {
const procs = parseProcessList({
processes: [
{ session_id: 's1', command: 'vite dev', status: 'running', uptime_seconds: 42 },
{ command: 'no session id — dropped' },
{ session_id: 's2', command: 'claude --bg', status: 'exited', uptime_seconds: 5 }
]
})
expect(procs).toEqual([
{ sessionId: 's1', command: 'vite dev', status: 'running', uptimeSeconds: 42 },
{ sessionId: 's2', command: 'claude --bg', status: 'exited', uptimeSeconds: 5 }
])
})
test('the panel lists processes with a running count + stop-all hint', async () => {
const store = createSessionStore()
store.apply({ type: 'gateway.ready' })
store.setBackgroundProcesses([
{ sessionId: 's1', command: 'vite dev --host 0.0.0.0 --port 3000', status: 'running', uptimeSeconds: 125 },
{ sessionId: 's2', command: 'pytest -x --watch', status: 'running', uptimeSeconds: 8 },
{ sessionId: 's3', command: 'claude-code background job', status: 'exited', uptimeSeconds: 4 }
])
store.openBackgroundPanel()
const frame = await captureFrame(appWith(store), { until: 'Background processes', width: 110, height: 24 })
expect(frame).toContain('Background processes · 2 running') // exited one excluded
expect(frame).toContain('vite dev')
expect(frame).toContain('pytest')
expect(frame).toContain('x stop all') // footer affordance
})
test('empty state when nothing is running', async () => {
const store = createSessionStore()
store.apply({ type: 'gateway.ready' })
store.openBackgroundPanel()
const frame = await captureFrame(appWith(store), { until: 'Background processes', width: 110, height: 24 })
expect(frame).toContain('No background processes running.')
})
})

View File

@@ -0,0 +1,142 @@
/**
* ClarifyPrompt rewrite (F5/F6) — headless frames + simulated keyboard.
*
* Asserts the four user-reported fixes:
* - long option text WRAPS (appears on a second line) instead of clipping (F5),
* - options are NUMBERED and the selected row is highlighted (F5),
* - the custom answer is an inline input in the SAME screen (F5),
* - Up/Down drive the selection and Enter answers the highlighted choice; the
* arrows don't escape to a scrollbox (F6 — we assert selection moved).
*/
import { ThemeProvider } from '../view/theme.tsx'
import { describe, expect, test } from 'vitest'
import { ClarifyPrompt } from '../view/prompts/clarifyPrompt.tsx'
import { createSessionStore } from '../logic/store.ts'
import { renderProbe, type RenderProbe } from './lib/render.ts'
const LONG =
'Just analyze for now — give me the implementation plan doc (code-path refs + line numbers, screen-by-screen), no code yet.'
const theme = createSessionStore().state.theme
async function mount(
choices: string[] | null,
onAnswer: (a: string) => void = () => {},
onCancel: () => void = () => {}
): Promise<RenderProbe> {
return renderProbe(
() => (
<ThemeProvider theme={() => theme}>
<ClarifyPrompt
question="How do you want me to proceed?"
choices={choices}
onAnswer={onAnswer}
onCancel={onCancel}
/>
</ThemeProvider>
),
{ height: 24, kittyKeyboard: true, width: 60 }
)
}
describe('ClarifyPrompt (F5/F6)', () => {
test('numbers every option and shows the inline custom-answer input (F5)', async () => {
const h = await mount(['Alpha option', 'Beta option'])
try {
const frame = h.frame()
expect(frame).toContain('1. ')
expect(frame).toContain('2. ')
// the inline custom input is present in the SAME screen (not a separate view)
expect(frame).toContain('or type a custom answer')
// NOTE: the option BODIES render through the native <markdown> renderable
// (so `**bold**`/`code` in a choice isn't shown raw — glitch 2026-06-14).
// Tree-sitter markdown doesn't settle in the headless test renderer, so the
// body text isn't in the frame here (same limitation as render.test.tsx:38-40
// and the transcript text parts) — the painted markdown is verified in the
// live smoke. We assert the structural chrome (numbers + input) instead.
} finally {
h.destroy()
}
})
test('a long option does not crash the bordered layout (F5)', async () => {
const h = await mount([LONG, 'Short'])
try {
const frame = h.frame()
// The long option flows into a flex column that wraps within the box width
// (no clipping at the right edge). The body renders via native <markdown>
// which doesn't paint headlessly (see the note above), so assert the layout
// chrome survived a very long choice: both numbered rows + the box border +
// the input are present (a clipping/overflow regression would break these).
expect(frame).toContain('1. ')
expect(frame).toContain('2. ')
expect(frame).toContain('or type a custom answer')
expect(frame).toContain('┌')
expect(frame).toContain('└')
} finally {
h.destroy()
}
})
test('Down moves the selection; Enter answers the highlighted choice (F6)', async () => {
let answered: string | undefined
const h = await mount(['Alpha option', 'Beta option'], a => (answered = a))
try {
h.keys.pressArrow('down') // 0 → 1 (Beta)
await h.settle()
h.keys.pressEnter()
await h.settle()
expect(answered).toBe('Beta option')
} finally {
h.destroy()
}
})
test('Down past the last choice lands on the custom input; Enter sends typed text', async () => {
let answered: string | undefined
const h = await mount(['Only choice'], a => (answered = a))
try {
h.keys.pressArrow('down') // choice 0 → custom input (index 1)
await h.settle()
await h.keys.typeText('my custom reply')
await h.settle()
h.keys.pressEnter()
await h.settle()
expect(answered).toBe('my custom reply')
} finally {
h.destroy()
}
})
test('no choices → the input is the only control and is focused', async () => {
let answered: string | undefined
const h = await mount(null, a => (answered = a))
try {
expect(h.frame()).toContain('Type your answer')
await h.keys.typeText('freeform')
await h.settle()
h.keys.pressEnter()
await h.settle()
expect(answered).toBe('freeform')
} finally {
h.destroy()
}
})
test('Esc cancels', async () => {
let cancelled = false
const h = await mount(
['A', 'B'],
() => {},
() => (cancelled = true)
)
try {
h.keys.pressEscape()
await h.settle()
expect(cancelled).toBe(true)
} finally {
h.destroy()
}
})
})

View File

@@ -0,0 +1,210 @@
/**
* Composer input tests — shift+enter newline (kitty), the Alt+Enter universal
* fallback, the visible-height cap with internal scroll, and big-buffer line
* navigation (item: composer input improvements).
*
* Protocol reality, pinned here:
* - kitty keyboard protocol (ghostty/kitty/wezterm): Shift+Enter arrives as a
* distinct `return + shift` event → newline; plain Enter still submits.
* - LEGACY input: Shift+Enter is byte-identical to Enter (both CR), so it
* submits — the mock keyboard reproduces this faithfully (the shift
* modifier can't be encoded on a bare CR). Alt+Enter (ESC-prefixed CR)
* works everywhere and inserts the newline instead.
*
* Height cap: the textarea auto-grows to COMPOSER_MAX_ROWS (8) then scrolls
* INTERNALLY — the viewport follows the cursor, and Up/Down in a multi-line
* buffer are line navigation, never history recall.
*/
import { describe, expect, test } from 'vitest'
import { COMPOSER_MAX_ROWS, envComposerRows } from '../logic/env.ts'
import { createPromptHistory } from '../logic/history.ts'
import { createSessionStore } from '../logic/store.ts'
import { App } from '../view/App.tsx'
import { ThemeProvider } from '../view/theme.tsx'
import { renderProbe, type RenderProbe } from './lib/render.ts'
interface Harness {
probe: RenderProbe
submitted: string[]
}
async function mountComposer(opts?: { kitty?: boolean; history?: string[] }): Promise<Harness> {
const store = createSessionStore()
store.apply({ type: 'gateway.ready' })
const submitted: string[] = []
const history = createPromptHistory({ initial: opts?.history ?? [] })
const probe = await renderProbe(
() => (
<ThemeProvider theme={() => store.state.theme}>
<App store={store} onSubmit={t => submitted.push(t)} history={history} />
</ThemeProvider>
),
{ height: 30, kittyKeyboard: opts?.kitty ?? false, width: 70 }
)
return { probe, submitted }
}
/** Row index of the first frame line containing `text` (-1 when absent). */
function rowOf(frame: string, text: string): number {
return frame.split('\n').findIndex(l => l.includes(text))
}
describe('shift+enter — kitty protocol inserts a newline', () => {
test('kitty: Shift+Enter → newline (no submit); Enter then submits the multi-line text', async () => {
const h = await mountComposer({ kitty: true })
try {
await h.probe.keys.typeText('alpha')
h.probe.keys.pressEnter({ shift: true })
await h.probe.settle()
await h.probe.keys.typeText('beta')
await h.probe.settle()
expect(h.submitted).toEqual([]) // newline, NOT a submit
const frame = h.probe.frame()
expect(rowOf(frame, 'alpha')).toBeGreaterThanOrEqual(0)
expect(rowOf(frame, 'beta')).toBe(rowOf(frame, 'alpha') + 1) // separate composer rows
h.probe.keys.pressEnter() // plain Enter still submits
await h.probe.settle()
expect(h.submitted).toEqual(['alpha\nbeta'])
} finally {
h.probe.destroy()
}
})
test('kitty: plain Enter submits (pin — shift handling must not eat Enter)', async () => {
const h = await mountComposer({ kitty: true })
try {
await h.probe.keys.typeText('hello kitty')
h.probe.keys.pressEnter()
await h.probe.settle()
expect(h.submitted).toEqual(['hello kitty'])
} finally {
h.probe.destroy()
}
})
test('legacy: Shift+Enter is indistinguishable from Enter → submits (honest pin)', async () => {
const h = await mountComposer({ kitty: false })
try {
await h.probe.keys.typeText('hello legacy')
// legacy CR carries no shift bit — the mock emits the same bare \r
h.probe.keys.pressEnter({ shift: true })
await h.probe.settle()
expect(h.submitted).toEqual(['hello legacy'])
} finally {
h.probe.destroy()
}
})
test('legacy: Alt+Enter (ESC-prefixed CR) inserts the newline — the universal fallback', async () => {
const h = await mountComposer({ kitty: false })
try {
await h.probe.keys.typeText('one')
h.probe.keys.pressEnter({ meta: true })
await h.probe.settle()
await h.probe.keys.typeText('two')
await h.probe.settle()
expect(h.submitted).toEqual([]) // Alt+Enter = newline, not the stock submit
h.probe.keys.pressEnter()
await h.probe.settle()
expect(h.submitted).toEqual(['one\ntwo'])
} finally {
h.probe.destroy()
}
})
})
describe('height cap + internal scroll (Ink parity: 8 rows)', () => {
const lines = Array.from({ length: 20 }, (_, i) => `q${String(i + 1).padStart(2, '0')}`)
async function typeTallBuffer(h: Harness): Promise<void> {
for (let i = 0; i < lines.length; i++) {
await h.probe.keys.typeText(lines[i]!)
if (i < lines.length - 1) h.probe.keys.pressEnter({ shift: true })
}
await h.probe.settle()
}
test('a 20-line buffer renders at most COMPOSER_MAX_ROWS rows, scrolled to the cursor', async () => {
const h = await mountComposer({ kitty: true })
try {
await typeTallBuffer(h)
const frame = h.probe.frame()
const visible = lines.filter(l => frame.includes(l))
expect(visible.length).toBeLessThanOrEqual(COMPOSER_MAX_ROWS)
expect(frame).toContain('q20') // the cursor line (bottom) is in view …
expect(frame).not.toContain('q01') // … the top scrolled out internally
expect(frame).toContain('line 20/20') // the quiet position indicator
expect(h.submitted).toEqual([]) // nothing submitted while composing
} finally {
h.probe.destroy()
}
})
test('Up walks the cursor through the lines and the viewport follows', async () => {
const h = await mountComposer({ history: ['previous prompt'], kitty: true })
try {
await typeTallBuffer(h)
for (let i = 0; i < lines.length - 1; i++) h.probe.keys.pressArrow('up')
await h.probe.settle()
const frame = h.probe.frame()
expect(frame).toContain('q01') // viewport followed the cursor to the top
expect(frame).not.toContain('q20') // the bottom scrolled out
expect(frame).toContain('line 1/20')
// multi-line buffer: Up at the top is NOT a history recall
h.probe.keys.pressArrow('up')
await h.probe.settle()
expect(h.probe.frame()).not.toContain('previous prompt')
// … and Down walks back down instead of recalling newer history
for (let i = 0; i < lines.length - 1; i++) h.probe.keys.pressArrow('down')
await h.probe.settle()
const back = h.probe.frame()
expect(back).toContain('q20')
expect(back).not.toContain('previous prompt')
} finally {
h.probe.destroy()
}
})
test('single-line buffers keep the existing history recall on Up (regression pin)', async () => {
const h = await mountComposer({ history: ['previous prompt'], kitty: true })
try {
await h.probe.keys.typeText('draft')
h.probe.keys.pressArrow('up')
await h.probe.settle()
expect(h.probe.frame()).toContain('previous prompt')
} finally {
h.probe.destroy()
}
})
test('no indicator while the buffer fits the visible cap', async () => {
const h = await mountComposer({ kitty: true })
try {
await h.probe.keys.typeText('short')
h.probe.keys.pressEnter({ shift: true })
await h.probe.keys.typeText('buffer')
await h.probe.settle()
expect(h.probe.frame()).not.toContain('line 2/2')
} finally {
h.probe.destroy()
}
})
})
describe('envComposerRows — the TUI-only override (not config.yaml)', () => {
test.each([
[undefined, COMPOSER_MAX_ROWS],
['', COMPOSER_MAX_ROWS],
['12', 12],
['4', 4],
['0', COMPOSER_MAX_ROWS], // zero rows is nonsense — fall back
['tall', COMPOSER_MAX_ROWS] // garbage — fall back
])('%j → %d', (value, expected) => {
expect(envComposerRows(value as string | undefined)).toBe(expected)
})
test('the default cap is the Ink-parity 8', () => {
expect(COMPOSER_MAX_ROWS).toBe(8)
})
})

View File

@@ -0,0 +1,94 @@
/**
* Assistant-text extraction helpers (the /copy command's logic). Pure functions:
* pull the answer text out of a live (parts) or settled (.text) assistant turn,
* excluding reasoning/tool parts; pick the n-th newest assistant response.
*/
import { describe, expect, test } from 'vitest'
import { assistantResponses, messageText, nthAssistantResponse } from '../logic/copy.ts'
import type { Message } from '../logic/store.ts'
describe('messageText', () => {
test('a live parts turn concatenates text parts; excludes reasoning/tool', () => {
const m: Message = {
role: 'assistant',
text: '',
parts: [
{ type: 'reasoning', id: 'p1', text: 'thinking…' },
{ type: 'text', id: 'p2', text: 'Hello' },
{ type: 'tool', id: 't1', name: 'bash', state: 'complete', resultText: 'ran' },
{ type: 'text', id: 'p3', text: ' world' }
]
}
expect(messageText(m)).toBe('Hello world')
})
test('trims surrounding whitespace from concatenated text parts', () => {
const m: Message = {
role: 'assistant',
text: '',
parts: [{ type: 'text', id: 'p1', text: ' spaced ' }]
}
expect(messageText(m)).toBe('spaced')
})
test('a settled/resumed turn (no parts) returns .text', () => {
const m: Message = { role: 'assistant', text: 'resumed answer' }
expect(messageText(m)).toBe('resumed answer')
})
test('empty parts array falls back to .text', () => {
const m: Message = { role: 'assistant', text: 'flat body', parts: [] }
expect(messageText(m)).toBe('flat body')
})
})
describe('assistantResponses', () => {
test('picks only assistant rows, newest-first, non-empty', () => {
const messages: Message[] = [
{ role: 'system', text: 'welcome' },
{ role: 'user', text: 'hi' },
{ role: 'assistant', text: 'first reply' },
{ role: 'user', text: 'and?' },
{ role: 'assistant', text: '', parts: [{ type: 'text', id: 'p1', text: 'second reply' }] }
]
expect(assistantResponses(messages)).toEqual(['second reply', 'first reply'])
})
test('skips assistant rows that resolve to empty text', () => {
const messages: Message[] = [
{ role: 'assistant', text: 'kept' },
{ role: 'assistant', text: '', parts: [{ type: 'reasoning', id: 'r1', text: 'only thinking' }] }
]
expect(assistantResponses(messages)).toEqual(['kept'])
})
test('empty messages → []', () => {
expect(assistantResponses([])).toEqual([])
})
})
describe('nthAssistantResponse', () => {
const messages: Message[] = [
{ role: 'assistant', text: 'oldest' },
{ role: 'user', text: 'q' },
{ role: 'assistant', text: 'newest' }
]
test('n=1 is the last assistant response', () => {
expect(nthAssistantResponse(messages, 1)).toBe('newest')
})
test('n=2 is the previous assistant response', () => {
expect(nthAssistantResponse(messages, 2)).toBe('oldest')
})
test('n past the end → undefined', () => {
expect(nthAssistantResponse(messages, 3)).toBeUndefined()
})
test('no assistant responses → undefined', () => {
expect(nthAssistantResponse([{ role: 'user', text: 'hi' }], 1)).toBeUndefined()
expect(nthAssistantResponse([], 1)).toBeUndefined()
})
})

View File

@@ -0,0 +1,79 @@
/**
* Unit tests for the pure diff helpers (Epic 2.3 — logic/diff.ts): `+N M`
* counting (file headers excluded, trailing newline optional), cwd-relative
* paths (exact prefix strip only — no `~`), and per-file splitting of
* multi-file unified diffs (the native DiffRenderable parses only the first
* file, so the renderer feeds it one section at a time).
*/
import { describe, expect, test } from 'vitest'
import { diffStats, relativizePath, splitUnifiedDiff } from '../logic/diff.ts'
const ONE_FILE = ['--- a/src/main.ts', '+++ b/src/main.ts', '@@ -1,3 +1,4 @@', ' ctx', '-old', '+new', '+more'].join(
'\n'
)
describe('diffStats', () => {
test('counts added/removed lines, excluding the +++/--- file headers', () => {
expect(diffStats(ONE_FILE + '\n')).toEqual({ added: 2, removed: 1 })
})
test('handles a diff without a trailing newline', () => {
expect(diffStats(ONE_FILE)).toEqual({ added: 2, removed: 1 })
})
test('a multi-file diff counts headers of every file out', () => {
const diff = `${ONE_FILE}\n--- a/b.py\n+++ b/b.py\n@@ -1 +1 @@\n-x\n+y\n`
expect(diffStats(diff)).toEqual({ added: 3, removed: 2 })
})
test('empty diff → zero stats', () => {
expect(diffStats('')).toEqual({ added: 0, removed: 0 })
})
})
describe('relativizePath', () => {
test.each([
// inside cwd → relative
['/home/u/proj/src/main.ts', '/home/u/proj', 'src/main.ts'],
// outside cwd → unchanged
['/etc/hosts', '/home/u/proj', '/etc/hosts'],
// exactly the cwd → '.'
['/home/u/proj', '/home/u/proj', '.'],
// trailing slash on cwd tolerated
['/home/u/proj/a.txt', '/home/u/proj/', 'a.txt'],
// sibling dir sharing the prefix string is NOT inside cwd
['/home/u/proj2/a.txt', '/home/u/proj', '/home/u/proj2/a.txt'],
// no cwd → unchanged (and already-relative paths pass through)
['src/main.ts', undefined, 'src/main.ts']
])('%s relative to %s → %s', (path, cwd, expected) => {
expect(relativizePath(path, cwd)).toBe(expected)
})
})
describe('splitUnifiedDiff', () => {
test('single-file diff → one section with the b/ path stripped', () => {
const sections = splitUnifiedDiff(ONE_FILE + '\n')
expect(sections).toHaveLength(1)
expect(sections[0]?.path).toBe('src/main.ts')
expect(sections[0]?.diff).toBe(ONE_FILE)
})
test('multi-file diff splits at the next ---/+++ header pair', () => {
const second = ['--- a/b.py', '+++ b/b.py', '@@ -1 +1 @@', '-x', '+y'].join('\n')
const sections = splitUnifiedDiff(`${ONE_FILE}\n${second}\n`)
expect(sections.map(s => s.path)).toEqual(['src/main.ts', 'b.py'])
expect(sections[1]?.diff).toBe(second)
})
test('a removed line starting with --- does not split the file', () => {
const tricky = ['--- a/x.md', '+++ b/x.md', '@@ -1,2 +1,1 @@', '--- a heading rule', ' kept'].join('\n')
const sections = splitUnifiedDiff(tricky)
expect(sections).toHaveLength(1)
})
test('new-file diff (--- /dev/null) takes the +++ path', () => {
const created = ['--- /dev/null', '+++ b/new.txt', '@@ -0,0 +1 @@', '+hello'].join('\n')
expect(splitUnifiedDiff(created)[0]?.path).toBe('new.txt')
})
})

View File

@@ -0,0 +1,140 @@
/**
* Display-mode frame tests (Epic 3: /compact + /details store flags → render).
* Headless frames through the real App tree (store → Transcript →
* DisplayProvider → messageLine/toolPart/reasoningPart):
* - details collapsed (default) vs expanded vs hidden on tool + reasoning rows,
* including that flipping hidden back RESTORES the rows (nothing dropped),
* - compact collapses the blank line between messages (frame line-distance).
*/
import { describe, expect, test } from 'vitest'
import { createSessionStore } from '../logic/store.ts'
import { App } from '../view/App.tsx'
import { ThemeProvider } from '../view/theme.tsx'
import { renderProbe, type RenderProbe } from './lib/render.ts'
type Store = ReturnType<typeof createSessionStore>
async function mountApp(store: Store, width = 80, height = 30): Promise<RenderProbe> {
return renderProbe(
() => (
<ThemeProvider theme={() => store.state.theme}>
<App store={store} />
</ThemeProvider>
),
{ height, width }
)
}
/** Seed one settled assistant turn: reasoning + a multi-line tool + answer text. */
function seedDetailedTurn(store: Store) {
store.apply({ type: 'gateway.ready' })
store.apply({ type: 'message.start' })
store.apply({ payload: { text: '**Plan**\n\nthink about the steps' }, type: 'reasoning.delta' })
store.apply({ payload: { context: 'ls -la', name: 'terminal', tool_id: 't1' }, type: 'tool.start' })
store.apply({
payload: {
args: { command: 'ls -la' },
duration_s: 0.3,
name: 'terminal',
result_text: 'alpha.txt\nbeta.txt\ngamma.txt',
tool_id: 't1'
},
type: 'tool.complete'
})
store.apply({ payload: { text: 'done listing' }, type: 'message.delta' })
store.apply({ type: 'message.complete' })
}
describe('/details — global detail mode drives default expansion (frame)', () => {
test('collapsed (default) → headers only; expanded → tool body + reasoning preview open', async () => {
const store = createSessionStore()
seedDetailedTurn(store)
const probe = await mountApp(store)
try {
// default: collapsed — tool body lines stay hidden, Thought folded.
// (Markdown BODY text never paints in the headless char frame — a known
// harness limitation, see render.test.tsx — so assertions stick to the
// plain-text renderables: tool output lines + the ◐/▼ headers.)
const collapsed = await probe.waitForFrame(f => f.includes('terminal'))
expect(collapsed).toContain('◐ Thought: Plan')
expect(collapsed).not.toContain('beta.txt')
// /details expanded → tool body + reasoning preview default-open (no clicks)
store.setDetails('expanded')
const expanded = await probe.waitForFrame(f => f.includes('beta.txt'))
expect(expanded).toContain('alpha.txt')
expect(expanded).toContain('▼ Thought: Plan')
// back to collapsed → bodies fold again
store.setDetails('collapsed')
const back = await probe.waitForFrame(f => !f.includes('beta.txt'))
expect(back).toContain('terminal')
expect(back).toContain('◐ Thought: Plan')
} finally {
probe.destroy()
}
})
test('hidden → one muted run line replaces the tool+reasoning rows; flipping back restores', async () => {
const store = createSessionStore()
seedDetailedTurn(store)
const probe = await mountApp(store)
try {
await probe.waitForFrame(f => f.includes('terminal'))
store.setDetails('hidden')
// reasoning + tool fold into ONE honest run line
const hidden = await probe.waitForFrame(f => f.includes('hidden'))
expect(hidden).toContain('⚡ 1 tool · 1 thought hidden — /details collapsed to show')
expect(hidden).not.toContain('terminal')
expect(hidden).not.toContain('Thought: Plan')
// the parts are still in the store (folding is render-only — recoverable)
expect((store.state.messages.at(-1)!.parts ?? []).map(p => p.type)).toEqual(['reasoning', 'tool', 'text'])
// restore — flipping the mode back brings the rows straight back
store.setDetails('collapsed')
const restored = await probe.waitForFrame(f => f.includes('terminal'))
expect(restored).toContain('◐ Thought: Plan')
expect(restored).not.toContain('hidden — /details')
} finally {
probe.destroy()
}
})
})
describe('/compact — transcript spacing (frame line-count)', () => {
test('compact on collapses the blank line between messages; off restores it', async () => {
const store = createSessionStore()
store.apply({ type: 'gateway.ready' })
store.pushUser('alpha-line')
store.pushUser('beta-line')
const probe = await mountApp(store)
try {
const spaced = await probe.waitForFrame(f => f.includes('beta-line'))
const rows = spaced.split('\n')
const a = rows.findIndex(r => r.includes('alpha-line'))
const b = rows.findIndex(r => r.includes('beta-line'))
expect(a).toBeGreaterThanOrEqual(0)
// user turns are set off by MORE space than the part gap (design pass:
// turn boundary > part gap): top 2 + bottom 1 around each prompt.
expect(b - a).toBe(4)
store.setCompact(true)
await probe.settle()
const dense = probe.frame().split('\n')
const a2 = dense.findIndex(r => r.includes('alpha-line'))
const b2 = dense.findIndex(r => r.includes('beta-line'))
expect(b2 - a2).toBe(1) // adjacent rows — densified
store.setCompact(false)
await probe.settle()
const again = probe.frame().split('\n')
const a3 = again.findIndex(r => r.includes('alpha-line'))
const b3 = again.findIndex(r => r.includes('beta-line'))
expect(b3 - a3).toBe(4)
} finally {
probe.destroy()
}
})
})

View File

@@ -0,0 +1,212 @@
import { describe, expect, test } from 'vitest'
import {
envFlag,
envOutputLines,
envOutputUnlimited,
envToggle,
heapdumpOnStart,
launchCwd,
noConfirmDestructive,
resolveMouseEnabled,
scrollSpeedMultiplier,
startupImage,
startupPrompt
} from '../logic/env.ts'
describe('envFlag', () => {
test('recognizes truthy values regardless of case/whitespace', () => {
for (const v of ['1', 'true', 'yes', 'on', 'TRUE', 'Yes', ' on ']) {
expect(envFlag(v, false)).toBe(true)
}
})
test('recognizes falsy values regardless of case/whitespace', () => {
for (const v of ['0', 'false', 'no', 'off', 'FALSE', 'No', ' off ']) {
expect(envFlag(v, true)).toBe(false)
}
})
test('returns fallback when unset', () => {
expect(envFlag(undefined, true)).toBe(true)
expect(envFlag(undefined, false)).toBe(false)
expect(envFlag('', true)).toBe(true)
expect(envFlag(' ', false)).toBe(false)
})
test('returns fallback for unrecognized garbage', () => {
expect(envFlag('maybe', true)).toBe(true)
expect(envFlag('maybe', false)).toBe(false)
expect(envFlag('2', true)).toBe(true)
expect(envFlag('enabled', false)).toBe(false)
})
})
describe('envOutputLines (HERMES_TUI_TOOL_OUTPUT_LINES)', () => {
test('unset → Infinity (UNLIMITED by default — the env var RESTORES a cap)', () => {
expect(envOutputLines(undefined)).toBe(Number.POSITIVE_INFINITY)
expect(envOutputLines('')).toBe(Number.POSITIVE_INFINITY)
expect(envOutputLines(' ')).toBe(Number.POSITIVE_INFINITY)
})
test('a positive integer → that cap (whitespace-tolerant)', () => {
expect(envOutputLines('50')).toBe(50)
expect(envOutputLines(' 50 ')).toBe(50)
expect(envOutputLines('1')).toBe(1)
expect(envOutputLines('200')).toBe(200)
expect(envOutputLines('1000')).toBe(1000)
})
test('"0" → Infinity too (back-compat with the old opt-in "unlimited" value)', () => {
expect(envOutputLines('0')).toBe(Number.POSITIVE_INFINITY)
})
test('garbage → Infinity (unrecognized ≙ no cap asked for)', () => {
expect(envOutputLines('unlimited')).toBe(Number.POSITIVE_INFINITY)
expect(envOutputLines('-5')).toBe(Number.POSITIVE_INFINITY)
expect(envOutputLines('1.5')).toBe(Number.POSITIVE_INFINITY)
expect(envOutputLines('50 lines')).toBe(Number.POSITIVE_INFINITY)
})
test('envOutputUnlimited: true unless an explicit finite cap was asked for', () => {
expect(envOutputUnlimited(undefined)).toBe(true)
expect(envOutputUnlimited('')).toBe(true)
expect(envOutputUnlimited(' ')).toBe(true)
expect(envOutputUnlimited('0')).toBe(true)
expect(envOutputUnlimited('garbage')).toBe(true)
expect(envOutputUnlimited('50')).toBe(false)
expect(envOutputUnlimited('200')).toBe(false)
})
})
describe('launchCwd (session.create cwd)', () => {
test('prefers HERMES_CWD (real launch dir the hermes launcher exports)', () => {
expect(launchCwd({ HERMES_CWD: '/home/u/proj', TERMINAL_CWD: '/other' })).toBe('/home/u/proj')
})
test('falls back to TERMINAL_CWD when HERMES_CWD is unset/blank', () => {
expect(launchCwd({ TERMINAL_CWD: '/home/u/wt' })).toBe('/home/u/wt')
expect(launchCwd({ HERMES_CWD: ' ', TERMINAL_CWD: '/home/u/wt' })).toBe('/home/u/wt')
})
test('falls back to process.cwd() (non-empty) when no launcher env set', () => {
expect(launchCwd({})).toBe(process.cwd())
})
})
describe('envToggle (tri-state)', () => {
test('true/false for recognized values, null otherwise', () => {
expect(envToggle('on')).toBe(true)
expect(envToggle('0')).toBe(false)
expect(envToggle(undefined)).toBe(null)
expect(envToggle('')).toBe(null)
expect(envToggle('maybe')).toBe(null)
})
})
describe('resolveMouseEnabled (defers to Ink env surface)', () => {
test('default ON when nothing is set', () => {
expect(resolveMouseEnabled({})).toBe(true)
})
test('HERMES_TUI_MOUSE_TRACKING is the highest-precedence force knob', () => {
// beats DISABLE_MOUSE and the MOUSE alias either way (toggle values, matching
// Ink's parseToggle — the granular off|wheel|buttons|all lives in config.yaml,
// the env var is on/off only).
expect(
resolveMouseEnabled({ HERMES_TUI_MOUSE_TRACKING: 'off', HERMES_TUI_DISABLE_MOUSE: '0', HERMES_TUI_MOUSE: '1' })
).toBe(false)
expect(
resolveMouseEnabled({ HERMES_TUI_MOUSE_TRACKING: 'on', HERMES_TUI_DISABLE_MOUSE: '1', HERMES_TUI_MOUSE: '0' })
).toBe(true)
})
test('an UNRECOGNIZED tracking value falls through to the next rung (Ink parity)', () => {
// Ink's parseToggle returns null for non-toggle strings like "all", so the
// legacy kill switch / alias / default decide.
expect(resolveMouseEnabled({ HERMES_TUI_MOUSE_TRACKING: 'all' })).toBe(true)
expect(resolveMouseEnabled({ HERMES_TUI_MOUSE_TRACKING: 'all', HERMES_TUI_DISABLE_MOUSE: '1' })).toBe(false)
})
test('legacy HERMES_TUI_DISABLE_MOUSE=1 kill switch (below TRACKING)', () => {
expect(resolveMouseEnabled({ HERMES_TUI_DISABLE_MOUSE: '1' })).toBe(false)
// ...but an explicit TRACKING toggle still wins over the legacy kill switch
expect(resolveMouseEnabled({ HERMES_TUI_DISABLE_MOUSE: '1', HERMES_TUI_MOUSE_TRACKING: 'on' })).toBe(true)
})
test('HERMES_TUI_MOUSE alias is honored (kept — OpenTUI-native + launcher sets it)', () => {
expect(resolveMouseEnabled({ HERMES_TUI_MOUSE: '0' })).toBe(false)
expect(resolveMouseEnabled({ HERMES_TUI_MOUSE: '1' })).toBe(true)
// alias sits below DISABLE_MOUSE: kill switch wins
expect(resolveMouseEnabled({ HERMES_TUI_DISABLE_MOUSE: '1', HERMES_TUI_MOUSE: '1' })).toBe(false)
})
})
describe('startupPrompt (--tui "prompt" seed)', () => {
test('HERMES_TUI_QUERY wins (the launcher contract Ink also reads)', () => {
expect(startupPrompt({ HERMES_TUI_QUERY: 'hi', HERMES_TUI_PROMPT: 'other' }, ['argv'])).toBe('hi')
})
test('HERMES_TUI_PROMPT is the OpenTUI alias fallback', () => {
expect(startupPrompt({ HERMES_TUI_PROMPT: 'from prompt' }, [])).toBe('from prompt')
})
test('bare argv tail is the last fallback (standalone dev)', () => {
expect(startupPrompt({}, ['hello', 'world'])).toBe('hello world')
})
test('blank/unset → undefined', () => {
expect(startupPrompt({}, [])).toBeUndefined()
expect(startupPrompt({ HERMES_TUI_QUERY: ' ' }, [])).toBeUndefined()
})
})
describe('startupImage (--image seed)', () => {
test('reads HERMES_TUI_IMAGE path (the launcher sets it; was silently dropped)', () => {
expect(startupImage({ HERMES_TUI_IMAGE: '/tmp/a.png' })).toBe('/tmp/a.png')
expect(startupImage({ HERMES_TUI_IMAGE: ' /tmp/b.png ' })).toBe('/tmp/b.png')
})
test('blank/unset → undefined', () => {
expect(startupImage({})).toBeUndefined()
expect(startupImage({ HERMES_TUI_IMAGE: ' ' })).toBeUndefined()
})
})
describe('noConfirmDestructive (HERMES_TUI_NO_CONFIRM)', () => {
test('truthy skips the confirm; default off; Ink parity', () => {
expect(noConfirmDestructive({})).toBe(false)
expect(noConfirmDestructive({ HERMES_TUI_NO_CONFIRM: '1' })).toBe(true)
expect(noConfirmDestructive({ HERMES_TUI_NO_CONFIRM: 'true' })).toBe(true)
expect(noConfirmDestructive({ HERMES_TUI_NO_CONFIRM: '0' })).toBe(false)
})
})
describe('heapdumpOnStart (HERMES_HEAPDUMP_ON_START)', () => {
test('truthy enables; default off', () => {
expect(heapdumpOnStart({})).toBe(false)
expect(heapdumpOnStart({ HERMES_HEAPDUMP_ON_START: 'on' })).toBe(true)
expect(heapdumpOnStart({ HERMES_HEAPDUMP_ON_START: 'no' })).toBe(false)
})
})
describe('scrollSpeedMultiplier (HERMES_TUI_SCROLL_SPEED)', () => {
test('null when unset/garbage (keep native scroll behavior)', () => {
expect(scrollSpeedMultiplier({})).toBeNull()
expect(scrollSpeedMultiplier({ HERMES_TUI_SCROLL_SPEED: '' })).toBeNull()
expect(scrollSpeedMultiplier({ HERMES_TUI_SCROLL_SPEED: 'fast' })).toBeNull()
expect(scrollSpeedMultiplier({ HERMES_TUI_SCROLL_SPEED: '0' })).toBeNull()
expect(scrollSpeedMultiplier({ HERMES_TUI_SCROLL_SPEED: '-2' })).toBeNull()
})
test('a positive value is honored and clamped to 20', () => {
expect(scrollSpeedMultiplier({ HERMES_TUI_SCROLL_SPEED: '3' })).toBe(3)
expect(scrollSpeedMultiplier({ HERMES_TUI_SCROLL_SPEED: '1.5' })).toBe(1.5)
expect(scrollSpeedMultiplier({ HERMES_TUI_SCROLL_SPEED: '999' })).toBe(20)
})
test('CLAUDE_CODE_SCROLL_SPEED is the portability fallback (HERMES wins)', () => {
expect(scrollSpeedMultiplier({ CLAUDE_CODE_SCROLL_SPEED: '4' })).toBe(4)
expect(scrollSpeedMultiplier({ HERMES_TUI_SCROLL_SPEED: '2', CLAUDE_CODE_SCROLL_SPEED: '9' })).toBe(2)
})
})

View File

@@ -0,0 +1,141 @@
/**
* Regression: tall <diff showLineNumbers> scrolled partially above the
* transcript viewport crashed the render loop under node:ffi.
*
* @opentui/core 0.4.0 marshals OptimizedBuffer.fillRect/drawText coordinates
* as u32 (zig.ts FFI table) while LineNumberRenderable passes raw screen
* coordinates — NEGATIVE when the diff is partially scrolled out of a
* <scrollbox>. Bun's FFI silently wraps negatives (native bounds-check →
* no-op); Node's experimental FFI throws ERR_INVALID_ARG_VALUE out of
* CliRenderer.loop on EVERY frame (frozen UI + console error spam). Fixed by
* boundary/ffiSafe.ts clamping/skipping before the FFI call.
*/
import { OptimizedBuffer, RGBA } from '@opentui/core'
import { describe, expect, test } from 'vitest'
import { installFfiCoordSafety } from '../boundary/ffiSafe.ts'
import { createSessionStore } from '../logic/store.ts'
import { App } from '../view/App.tsx'
import { ThemeProvider } from '../view/theme.tsx'
import { renderProbe, type RenderProbe } from './lib/render.ts'
type Store = ReturnType<typeof createSessionStore>
// TALL diff: when expanded inside the sticky-bottom scrollbox the diff's TOP
// rows render above the viewport (negative screen y) — the live-crash trigger.
const ADDED = Array.from({ length: 40 }, (_, i) => `+def fn_${i}(): pass`)
const DIFF = [
'--- a//tmp/v6smoke/greet.py',
'+++ b//tmp/v6smoke/greet.py',
'@@ -1,5 +1,45 @@',
' def greet(name):',
'- print("hello " + name)',
'+ print(f"hello {name}")',
...ADDED,
' ',
' if __name__ == "__main__":',
' greet("world")',
''
].join('\n')
function seed(store: Store) {
store.apply({ type: 'gateway.ready' })
store.apply({ type: 'message.start' })
store.apply({ type: 'tool.start', payload: { tool_id: 'p1', name: 'patch', context: '/tmp/v6smoke/greet.py' } })
store.apply({
type: 'tool.complete',
payload: {
tool_id: 'p1',
name: 'patch',
args: { path: '/tmp/v6smoke/greet.py', mode: 'replace' },
diff_unified: DIFF,
duration_s: 0.2,
result: JSON.stringify({ success: true, diff: DIFF })
}
})
store.apply({ type: 'message.complete' })
}
async function clickHeader(probe: RenderProbe, name: string): Promise<void> {
const frame = await probe.waitForFrame(f => f.includes(name))
const rows = frame.split('\n')
const y = rows.findIndex(line => line.includes(name))
expect(y).toBeGreaterThanOrEqual(0)
const x = (rows[y] ?? '').indexOf(name)
await probe.click(x, y)
}
describe('node-ffi coordinate safety (boundary/ffiSafe.ts)', () => {
test('negative coordinates no longer throw ERR_INVALID_ARG_VALUE', () => {
installFfiCoordSafety() // idempotent (test/lib/render.ts installs it too)
const buf = OptimizedBuffer.create(20, 10, 'unicode', { id: 'ffi-safety-probe' })
const red = RGBA.fromInts(255, 0, 0, 255)
try {
// each of these threw TypeError ERR_INVALID_ARG_VALUE ("must be a uint32")
expect(() => buf.fillRect(2, -3, 5, 2, red)).not.toThrow()
expect(() => buf.fillRect(-1, 2, 5, 2, red)).not.toThrow()
expect(() => buf.fillRect(2, 2, -5, 2, red)).not.toThrow()
expect(() => buf.drawText('hi', -1, 2, red)).not.toThrow()
expect(() => buf.drawText('hi', 2, -1, red)).not.toThrow()
expect(() => buf.setCell(-1, 0, 'x', red, red)).not.toThrow()
expect(() => buf.setCellWithAlphaBlending(0, -1, 'x', red, red)).not.toThrow()
// a clipped fillRect still draws its visible part
buf.fillRect(-2, -2, 6, 6, red)
expect(() => buf.fillRect(0, 0, 4, 4, red)).not.toThrow()
} finally {
buf.destroy()
}
})
test('tall diff expand/collapse + resize churn survives without render-loop errors', async () => {
const store = createSessionStore()
seed(store)
const probe = await renderProbe(
() => (
<ThemeProvider theme={() => store.state.theme}>
<App store={store} />
</ThemeProvider>
),
{ width: 120, height: 35 }
)
const errors: unknown[] = []
const onErr = (e: unknown) => errors.push(e)
process.on('uncaughtException', onErr)
try {
await clickHeader(probe, 'patch')
// let tree-sitter + the scrollAnchor's sticky-suspension window land
await new Promise(r => setTimeout(r, 200))
// added rows only paint when the diff body is actually expanded (the
// scrollAnchor holds the viewport at the diff TOP, so assert early rows)
const expanded = await probe.waitForFrame(f => f.includes('fn_0'))
expect(expanded).toContain('+ def fn_0(): pass')
// Scroll INTO the tall diff so its top rows sit ABOVE the viewport
// (negative screen y) — the exact live-crash condition. (The old anchor
// produced this via transient sticky-bottom frames; the sticky
// suspension removed those, so drive the scroll-cut explicitly.)
let downTicks = 0
while (probe.frame().includes('fn_0(') && downTicks < 30) {
await probe.scroll(40, 15, 'down')
downTicks++
}
const cut = probe.frame()
expect(cut).not.toContain('fn_0(') // the diff top is cut above the viewport…
expect(cut).toContain('fn_') // …while mid-diff rows still paint (negative-y path)
// bring the header back on screen for the toggle churn
for (let i = 0; i < downTicks + 5; i++) await probe.scroll(40, 15, 'up')
// toggle a few times + resize churn
await clickHeader(probe, 'patch')
await new Promise(r => setTimeout(r, 100))
await clickHeader(probe, 'patch')
await new Promise(r => setTimeout(r, 200))
probe.resize(100, 30)
await new Promise(r => setTimeout(r, 100))
probe.resize(120, 35)
await new Promise(r => setTimeout(r, 200))
expect(errors).toEqual([])
} finally {
process.off('uncaughtException', onErr)
probe.destroy()
}
}, 30000)
})

View File

@@ -0,0 +1,254 @@
/**
* fuzzy.ts tests (Epic 7) — the fuzzysort-backed filter + grouped-rows helpers
* behind the picker overlays: subsequence matching, ranking (prefix >
* word-boundary > scattered), multi-field (provider/model/lab), multi-term AND,
* empty query = catalog order, no-match = empty, header rows non-selectable,
* the flat arrow-traversal order across groups, and long/messy haystacks shaped
* like the resume-session picker (titles + cwd paths + sources).
*
* Matching/ranking comes from `fuzzysort` via the adapter in logic/fuzzy.ts —
* all matching assertions go through the public `fuzzyFilter` (the old
* hand-rolled scorer internals `scoreTerm`/`scoreFields` are gone).
*/
import { describe, expect, test } from 'vitest'
import { buildPickerRows, fuzzyFilter, visibleRows, type FuzzyField } from '../logic/fuzzy.ts'
/** Filter plain labels (the single-field degenerate case). */
const byLabel = (query: string, labels: string[]): string[] => fuzzyFilter(query, labels, l => [{ text: l, weight: 2 }])
describe('fuzzyFilter — subsequence matching', () => {
test('matches subsequences (case-insensitive), drops non-subsequences', () => {
expect(byLabel('son', ['claude-sonnet-4'])).toEqual(['claude-sonnet-4'])
expect(byLabel('son4', ['claude-sonnet-4'])).toEqual(['claude-sonnet-4']) // the complaint's example
expect(byLabel('SON', ['claude-sonnet-4'])).toEqual(['claude-sonnet-4'])
expect(byLabel('xyz', ['claude-sonnet-4'])).toEqual([])
expect(byLabel('sonn5', ['claude-sonnet-4'])).toEqual([]) // 5 not present after sonn
expect(byLabel('', ['anything'])).toEqual(['anything']) // empty query matches everything
})
test('ranking: prefix > word-boundary > scattered', () => {
// catalog order is deliberately worst-first; ranking must invert it.
expect(byLabel('son', ['meson', 'claude-sonnet', 'sonnet'])).toEqual(['sonnet', 'claude-sonnet', 'meson'])
})
test('anchors at the BEST occurrence, not greedily at the first', () => {
// greedy-from-first-char would match saturn's s@0 then o/n far away; the
// boundary anchor at the second `s` (start of "sonnet") must win over a
// genuinely scattered match.
expect(byLabel('son', ['meson', 'saturn-sonnet'])).toEqual(['saturn-sonnet', 'meson'])
})
})
describe('fuzzyFilter — multi-field, multi-term', () => {
const row = { lab: 'Anthropic', label: 'claude-sonnet-4', provider: 'anthropic' }
const fieldsOf = (r: typeof row): FuzzyField[] => [
{ text: r.label, weight: 2 },
{ text: r.provider },
{ text: r.lab }
]
test('a term may match ANY field (provider/model/lab)', () => {
expect(fuzzyFilter('son4', [row], fieldsOf)).toHaveLength(1) // via the model id
expect(fuzzyFilter('anthro', [row], fieldsOf)).toHaveLength(1) // via the provider
expect(fuzzyFilter('nope', [row], fieldsOf)).toHaveLength(0)
})
test('every whitespace term must match some field (anthropic son works)', () => {
expect(fuzzyFilter('anthropic son', [row], fieldsOf)).toHaveLength(1)
expect(fuzzyFilter('anthropic zzz', [row], fieldsOf)).toHaveLength(0)
})
test('label matches outrank same-quality secondary-field matches (weight 2×)', () => {
const labelHit = { label: 'claude-sonnet-4', provider: 'anthropic' }
const providerHit = { label: 'other-model', provider: 'claude' }
const fields = (r: typeof labelHit): FuzzyField[] => [{ text: r.label, weight: 2 }, { text: r.provider }]
// providerHit comes FIRST in catalog order; the ×2 label hit must beat it.
expect(fuzzyFilter('claude', [providerHit, labelHit], fields)[0]).toBe(labelHit)
})
})
interface Row {
label: string
provider: string
lab: string
}
const CATALOG: Row[] = [
{ lab: 'Anthropic', label: 'claude-sonnet-4', provider: 'anthropic' },
{ lab: 'Anthropic', label: 'claude-opus-4', provider: 'anthropic' },
{ lab: 'OpenAI', label: 'gpt-5', provider: 'openai' },
{ lab: 'Nous Research', label: 'hermes-4-405b', provider: 'nous' }
]
const rowFields = (r: Row): FuzzyField[] => [{ text: r.label, weight: 2 }, { text: r.provider }, { text: r.lab }]
describe('fuzzyFilter', () => {
test('empty/blank query → catalog order, untouched', () => {
expect(fuzzyFilter('', CATALOG, rowFields)).toEqual(CATALOG)
expect(fuzzyFilter(' ', CATALOG, rowFields)).toEqual(CATALOG)
})
test('no match → empty', () => {
expect(fuzzyFilter('qqqq', CATALOG, rowFields)).toEqual([])
})
test('son4 finds claude-sonnet-4 (under anthropic) first', () => {
expect(fuzzyFilter('son4', CATALOG, rowFields)[0]?.label).toBe('claude-sonnet-4')
})
test('oai matches the openai-provider model via the provider field', () => {
const hits = fuzzyFilter('oai', CATALOG, rowFields)
expect(hits.map(h => h.label)).toContain('gpt-5')
})
test('equal-quality prefix matches rank the shorter label first; true ties keep catalog order', () => {
// DELIBERATE expectation change with the fuzzysort adapter: the old scorer
// scored both `claude-*` labels identically and fell back to catalog order
// (sonnet first). fuzzysort additionally rewards how much of the target the
// match covers, so the SHORTER claude-opus-4 now outranks claude-sonnet-4 —
// better for a user: the closer-to-exact label surfaces first.
const hits = fuzzyFilter('claude', CATALOG, rowFields)
expect(hits.map(h => h.label)).toEqual(['claude-opus-4', 'claude-sonnet-4'])
// genuinely equal scores (same-length labels, same match shape) stay stable
// in catalog order — fuzzysort's own sort is unstable; the adapter re-ties.
expect(byLabel('son', ['claude-sonnet', 'saturn-sonnet'])).toEqual(['claude-sonnet', 'saturn-sonnet'])
expect(byLabel('son', ['saturn-sonnet', 'claude-sonnet'])).toEqual(['saturn-sonnet', 'claude-sonnet'])
})
})
/** Rows shaped like the upcoming resume-session picker: long human titles,
* deep cwd paths and a source tag as secondary haystacks. */
interface Session {
title: string
cwd: string
source: string
}
const SESSIONS: Session[] = [
{
cwd: '/home/daimon/github/worktrees/hermes-agent/lively-thrush',
source: 'tui',
title: 'Adopt OpenTUI paradigm for UI implementation'
},
{ cwd: '/home/daimon/github/opentui', source: 'tui', title: 'Fix memory leak in Ink renderer' },
{ cwd: '/home/daimon/github/daimon-nous', source: 'discord', title: 'Triage daimon-nous webhook reviewer pipeline' },
{ cwd: '/home/daimon/github/worktrees/hermes-agent/quiet-finch', source: 'tui', title: 'Parser cleanup pass' },
{ cwd: '/home/daimon/notes', source: 'telegram', title: 'Resume-session picker design notes' }
]
const sessionFields = (s: Session): FuzzyField[] => [{ text: s.title, weight: 2 }, { text: s.cwd }, { text: s.source }]
describe('fuzzyFilter — long/messy haystacks (resume-session shape)', () => {
test('`opentui par` ANDs across one long title (word-boundary terms)', () => {
const hits = fuzzyFilter('opentui par', SESSIONS, sessionFields)
expect(hits.map(h => h.title)).toEqual(['Adopt OpenTUI paradigm for UI implementation'])
})
test('`lively` matches via the cwd-path haystack alone', () => {
const hits = fuzzyFilter('lively', SESSIONS, sessionFields)
expect(hits.map(h => h.title)).toEqual(['Adopt OpenTUI paradigm for UI implementation'])
})
test('`worktr herm` ANDs across deep path segments, keeps ONLY worktree sessions', () => {
const hits = fuzzyFilter('worktr herm', SESSIONS, sessionFields)
expect(hits.map(h => h.title).sort()).toEqual([
'Adopt OpenTUI paradigm for UI implementation',
'Parser cleanup pass'
])
})
test('a title hit outranks a path-only hit for the same query', () => {
// 'Fix memory leak…' matches `opentui` ONLY via its cwd; the title hit
// (label ×2) must come first even though the path row is earlier in catalog.
const hits = fuzzyFilter('opentui', SESSIONS, sessionFields)
expect(hits.map(h => h.title)).toEqual([
'Adopt OpenTUI paradigm for UI implementation',
'Fix memory leak in Ink renderer'
])
})
test('a noisy shared path prefix does not drown a title match', () => {
// every github row shares /home/daimon/…; the title containing `daimon`
// (the daimon-nous session) must outrank the rows matching only via cwd.
const hits = fuzzyFilter('daimon', SESSIONS, sessionFields)
expect(hits[0]?.title).toBe('Triage daimon-nous webhook reviewer pipeline')
expect(hits.length).toBe(SESSIONS.length) // all rows match somewhere (path/source)
})
test('multi-term over title words: `resume pick` pins the picker-design session; junk → empty', () => {
expect(fuzzyFilter('resume pick', SESSIONS, sessionFields).map(h => h.title)).toEqual([
'Resume-session picker design notes'
])
expect(fuzzyFilter('github.zzz', SESSIONS, sessionFields)).toEqual([])
})
})
describe('buildPickerRows — grouping + traversal order', () => {
test('items group by provider with headers; flat traversal crosses groups', () => {
const { flat, rows } = buildPickerRows(CATALOG, r => r.lab)
expect(rows.map(r => (r.kind === 'header' ? `# ${r.label}` : r.item.label))).toEqual([
'# Anthropic',
'claude-sonnet-4',
'claude-opus-4',
'# OpenAI',
'gpt-5',
'# Nous Research',
'hermes-4-405b'
])
// the flat ARROW order is exactly the item rows in render order — so ↓ from
// claude-opus-4 lands on gpt-5 (next group) and headers are never selectable.
expect(flat.map(f => f.label)).toEqual(['claude-sonnet-4', 'claude-opus-4', 'gpt-5', 'hermes-4-405b'])
expect(rows.flatMap(r => (r.kind === 'item' ? [r.index] : []))).toEqual([0, 1, 2, 3])
})
test('ungrouped items render headerless (flat list)', () => {
const { rows } = buildPickerRows(CATALOG, () => undefined)
expect(rows.every(r => r.kind === 'item')).toBe(true)
})
test('group order = first appearance (score-sorted input → best group first)', () => {
const sorted = [CATALOG[2]!, CATALOG[0]!, CATALOG[1]!] // gpt-5 scored best
const { rows } = buildPickerRows(sorted, r => r.lab)
expect(rows[0]).toEqual({ kind: 'header', label: 'OpenAI' })
})
test('non-selectable items (picker v2.1 unconfigured rows) render with index -1, stay out of flat', () => {
// an unconfigured "provider hint" row sits BETWEEN two configured groups
const mixed = [
{ lab: 'Anthropic', label: 'claude-sonnet-4', provider: 'anthropic' },
{ lab: 'Mistral', label: 'no API key — set MISTRAL_API_KEY', provider: 'mistral' },
{ lab: 'OpenAI', label: 'gpt-5', provider: 'openai' }
]
const { flat, rows } = buildPickerRows(
mixed,
r => r.lab,
r => !r.label.startsWith('no API key')
)
// hint row RENDERS (with its header) but is index -1 and absent from flat —
// so ↑↓ traversal (which walks flat) skips it entirely.
expect(rows.map(r => (r.kind === 'header' ? `# ${r.label}` : `${r.index}:${r.item.label}`))).toEqual([
'# Anthropic',
'0:claude-sonnet-4',
'# Mistral',
'-1:no API key — set MISTRAL_API_KEY',
'# OpenAI',
'1:gpt-5'
])
expect(flat.map(f => f.label)).toEqual(['claude-sonnet-4', 'gpt-5'])
})
})
describe('visibleRows — selection-following window', () => {
const { rows } = buildPickerRows(CATALOG, r => r.lab) // 7 rows
test('no slicing when everything fits', () => {
const w = visibleRows(rows, 0, 12)
expect(w.rows).toHaveLength(7)
expect(w.above).toBe(0)
expect(w.below).toBe(0)
})
test('keeps the selected item in view and reports hidden counts', () => {
const w = visibleRows(rows, 3, 4) // last item selected, window of 4
expect(w.rows.some(r => r.kind === 'item' && r.index === 3)).toBe(true)
expect(w.above + w.below + w.rows.length).toBe(7)
expect(w.above).toBeGreaterThan(0)
})
})

View File

@@ -0,0 +1,43 @@
/**
* Phase 0 boundary test (spec v4 §5 Layer 1). Exercises the GatewayService shape
* through the FakeGateway layer using @effect/vitest's `it.effect`: subscribe
* receives emitted events; request records the call. Proves the Effect<->Solid
* seam (subscribe) and the typed request path compile + run.
*
* `it.effect` runs the program in a scoped test runtime (TestClock + TestConsole
* provided automatically), replacing the old hand-rolled ManagedRuntime shim.
* The fake layer carries per-test controller state (we assert `controller.calls`),
* so it's provided locally — the testing guide's allowed one-off, not a shared
* `layer(...)` group.
*/
import { assert, describe, it } from '@effect/vitest'
import { Effect } from 'effect'
import { GatewayService } from '../boundary/gateway/GatewayService.ts'
import type { GatewayEvent } from '../boundary/schema/GatewayEvent.ts'
import { fakeGatewayLayerWith, makeFakeGateway } from '../entry/fakeGateway.ts'
describe('GatewayService via FakeGateway (Phase 0)', () => {
it.effect('subscribe receives emitted events; request records the call', () => {
const controller = makeFakeGateway('sess-123')
const received: GatewayEvent[] = []
return Effect.gen(function* () {
const gateway = yield* GatewayService
const unsubscribe = yield* gateway.subscribe(event => received.push(event))
// Emit after subscribing (synchronous fan-out in the fake).
controller.emit({ type: 'gateway.ready' })
controller.emit({ type: 'message.start' })
yield* gateway.request('prompt.submit', { text: 'hi' })
unsubscribe()
controller.emit({ type: 'message.complete' }) // dropped: unsubscribed
assert.strictEqual(gateway.sessionId(), 'sess-123')
assert.deepStrictEqual(
received.map(e => e.type),
['gateway.ready', 'message.start']
)
assert.deepStrictEqual(controller.calls, [{ method: 'prompt.submit', params: { text: 'hi' } }])
}).pipe(Effect.provide(fakeGatewayLayerWith(controller)))
})
})

View File

@@ -0,0 +1,76 @@
/**
* Recovery-budget policy test (LOGIC side, pure). The crash-loop bound: attempts
* are capped within a sliding window, stale attempts are pruned, and recovery is
* refused with no session. Plus opencode-style exponential backoff (1s→30s cap).
*/
import { describe, expect, test } from 'vitest'
import {
backoffMs,
GATEWAY_RECOVERY_LIMIT,
GATEWAY_RECOVERY_WINDOW_MS,
planGatewayRecovery
} from '../logic/gatewayRecovery.ts'
describe('planGatewayRecovery — crash-loop budget', () => {
test('allows GATEWAY_RECOVERY_LIMIT attempts within the window, refuses the next', () => {
const sid = 'sess-1'
let attempts: number[] = []
const now = 1_000_000
// The first LIMIT exits all recover, each recording its timestamp.
for (let i = 0; i < GATEWAY_RECOVERY_LIMIT; i++) {
const plan = planGatewayRecovery(sid, null, attempts, now + i)
expect(plan.recover).toBe(true)
expect(plan.sid).toBe(sid)
attempts = plan.attempts
}
expect(attempts).toHaveLength(GATEWAY_RECOVERY_LIMIT)
// The (LIMIT+1)th within the window is refused; attempts are NOT extended.
const refused = planGatewayRecovery(sid, null, attempts, now + GATEWAY_RECOVERY_LIMIT)
expect(refused.recover).toBe(false)
expect(refused.attempts).toHaveLength(GATEWAY_RECOVERY_LIMIT)
})
test('prunes attempts older than GATEWAY_RECOVERY_WINDOW_MS, freeing the budget', () => {
const sid = 'sess-1'
const now = 1_000_000
// Three stale attempts (all outside the window) + one fresh.
const stale = [now - GATEWAY_RECOVERY_WINDOW_MS - 5, now - GATEWAY_RECOVERY_WINDOW_MS - 4, now - 30_000]
const plan = planGatewayRecovery(sid, null, stale, now)
// The two truly-stale ones are pruned; the in-window one survives + `now` added.
expect(plan.recover).toBe(true)
expect(plan.attempts).toEqual([now - 30_000, now])
})
test('refuses recovery when there is no session id (live nor recover)', () => {
const plan = planGatewayRecovery(null, null, [], 1_000_000)
expect(plan.recover).toBe(false)
expect(plan.sid).toBeNull()
expect(plan.attempts).toEqual([])
})
test('falls back to the recoverSid when the live sid was already cleared', () => {
const plan = planGatewayRecovery(null, 'pending-sess', [], 1_000_000)
expect(plan.recover).toBe(true)
expect(plan.sid).toBe('pending-sess')
})
})
describe('backoffMs — exponential delay (1s→30s cap)', () => {
test('doubles per attempt (1-based) and caps at 30000ms', () => {
expect(backoffMs(1)).toBe(1000)
expect(backoffMs(2)).toBe(2000)
expect(backoffMs(3)).toBe(4000)
expect(backoffMs(4)).toBe(8000)
expect(backoffMs(5)).toBe(16000)
expect(backoffMs(6)).toBe(30000) // 32000 clamped to the cap
expect(backoffMs(10)).toBe(30000) // stays at the cap
})
test('clamps a non-positive attempt to the first delay', () => {
expect(backoffMs(0)).toBe(1000)
expect(backoffMs(-3)).toBe(1000)
})
})

Some files were not shown because too many files have changed in this diff Show More