hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-28 06:51:16 +08:00

Author	SHA1	Message	Date
Brooklyn Nicholson	ffa33e53f6	chore(tui): remove dead branch cleanup code - drop unused TUI helpers, test-only layout scaffolding, and stale public debug exports - remove an unused profiler import and trim test-only coverage for deleted helpers	2026-04-26 21:54:24 -05:00
Brooklyn Nicholson	82f842277e	perf(tui): profile harness gains --loop, --save, --compare Before: change code → build → run profile → manually compare to mental model of last run. After: `--loop` watches ui-tui/src and packages/hermes-ink/src for .ts(x) changes, rebuilds on change, re-runs the same scenario, prints a side-by-side A/B diff against the previous iteration — so each edit's impact is quantified instantly. Ctrl+C to stop. Also added: --save LABEL saves metrics snapshot to /tmp/perf-<LABEL>.json --compare LABEL diffs the current run vs that snapshot --extra-flag X pass-through to node dist/entry.js (prepping for --no-fullscreen below) key_metrics() flattens a full run into scalar numbers across frames, React commits, and per-phase timings. format_diff() prints a table with ↑/↓ markers denoting regressions vs improvements based on whether the metric is lower-is-better (p99, max, patches, drain) or higher-is-better (fps, gaps_under_16ms). Run-to-run noise on static code is ~5-15% on most metrics — big signal (>30% change on renderer_p99 / fps) cuts through cleanly. Useful both for validating a single fix and for detecting subtle regressions during the wheel-accel port. Usage during the next perf session: # one-shot with a baseline for later comparison scripts/profile-tui.py --seconds 6 --hold wheel_up --save pre-accel # after porting the wheel handler scripts/profile-tui.py --seconds 6 --hold wheel_up --compare pre-accel # continuous iteration scripts/profile-tui.py --seconds 6 --hold wheel_up --loop	2026-04-26 17:08:07 -05:00
Brooklyn Nicholson	f823535db2	perf(tui): instrument stdout drain — rule out terminal parse bottleneck Adds four fields to FrameEvent.phases and the matching profile summary: optimizedPatches post-optimize patch count (what's actually written to stdout; the .patches field is pre-optimize) writeBytes UTF-8 byte count of the write this frame backpressure true when Node's stdout.write returned false (Writable buffer full — outer terminal can't keep up) prevFrameDrainMs end-to-end drain time of the PREVIOUS frame's write, captured from stdout.write's 2-arg callback. Reported on the next frame so the measurement reflects "time until OS flushed the bytes to the terminal fd", not "time until queued in Node". writeDiffToTerminal() now returns { bytes, backpressure } and accepts an optional onDrain callback. Only attached on TTY with diff; piped/non-TTY stdout bypasses flow control so the callback would fire synchronously anyway. Initial measurements under hold-wheel_up against 1106-msg session (30Hz for 6s): patches total 28,888 optimized total 16,700 (ratio 0.58 — optimizer cuts ~42%) writeBytes 42 KB / 10s = 4.2 KB/s throughput drainMs p50 0.14 ms terminal accepts bytes instantly drainMs p99 0.85 ms backpressure 0% of frames This rules out the terminal-parse hypothesis — Cursor's xterm.js drains our output in sub-millisecond time at only 4 KB/s. The remaining lag has to be in the render pipeline, not the wire. Profile output now includes the bytes+drain+backpressure lines to keep this visible on every subsequent iteration.	2026-04-26 17:06:22 -05:00
Brooklyn Nicholson	71eee26640	perf(tui): full-pipeline instrumentation + profiling harness Extends HERMES_DEV_PERF to capture the complete render pipeline, not just React commits. Adds scripts/profile-tui.py to drive repeatable hold-PageUp stress tests against a real long session. perfPane.tsx: Wires ink's onFrame callback (already plumbed through the fork) into the same perf.log as the React.Profiler samples. Captures per-phase timing (yoga calculateLayout, renderNodeToOutput, screen diff, patch optimize, stdout write) plus yoga counters (visited/measured/cache- Hits/live) and patch counts per frame. Events are tagged {src: 'react'\|'frame'} so jq can split them. logFrameEvent is undefined when HERMES_DEV_PERF is unset, so ink doesn't even attach the callback. entry.tsx: Passes logFrameEvent into render(). types/hermes-ink.d.ts: Declares FrameEvent + onFrame on RenderOptions so the ui-tui side type-checks against the plumbed-through ink option. scripts/profile-tui.py: New harness. Launches the built TUI under a PTY with the longest session in state.db resumed, holds PageUp/PageDown/etc at a configurable Hz for N seconds, then parses perf.log and prints per-phase p50/p95/p99/max plus yoga-counter summaries. Zero deps beyond stdlib. Exit 2 if nothing was captured (wiring broken). Initial findings (1106-msg session, 6s PageUp hold at 30Hz): - Steady state: 10 fps; renderer phase p99=63ms, write p99=0.2ms - 4/107 heavy frames (>=16ms), all dominated by renderNodeToOutput - One pathological 97ms frame with yoga measuring 70,415 text cells and Yoga visiting 225k nodes — the cold-unmeasured-region hit - Ink's scroll fast-path (DECSTBM blit from prevScreen) is disqualified because our spacer-based virtual history doesn't keep heightDelta in sync with scroll.delta, so every PageUp step falls through to a full 2000-4800 patch re-render instead of ~40	2026-04-26 16:36:25 -05:00

4 Commits