Merge pull request #15926 from NousResearch/bb/tui-long-session-perf

perf(tui): stabilize long-session scrolling
brooklyn!
2026-04-26 23:10:08 -05:00
committed by GitHub
100 changed files with 6824 additions and 662 deletions

---
name: debugging-hermes-tui-commands
description: Use when debugging or adding Hermes TUI slash commands across the Python backend (hermes_cli/commands.py), the tui_gateway bridge, and the TypeScript/Ink frontend. Covers autocomplete gaps, gateway dispatch issues, and live UI-state wiring.
version: 1.0.0
author: Hermes Agent
license: MIT
metadata:
hermes:
tags: [debugging, hermes-agent, tui, slash-commands, typescript, python]
related_skills: [python-debugpy, node-inspect-debugger, systematic-debugging]
---
# Debugging Hermes TUI Slash Commands
## Overview
Hermes slash commands span three layers: the Python command registry, the tui_gateway JSON-RPC bridge, and the Ink/TypeScript frontend. When a command misbehaves (missing from autocomplete, works in the CLI but not the TUI, config persists but the UI doesn't update), the bug is almost always one layer being out of sync with another.
## When to Use
- A slash command exists in one part of the codebase but doesn't work fully
- A command needs to be added to both backend and frontend
- Command autocomplete isn't working for specific commands
- Command behavior is inconsistent between CLI and TUI
- A command persists config but doesn't apply live in the TUI
## Architecture Overview
```
Python backend (hermes_cli/commands.py) <- canonical COMMAND_REGISTRY
TUI gateway (tui_gateway/server.py) <- slash.exec / command.dispatch
TUI frontend (ui-tui/src/app/slash/) <- local handlers + fallthrough
```
Command definitions must be registered consistently across Python and TypeScript to work properly. The Python `COMMAND_REGISTRY` is the source of truth for: CLI dispatch, gateway help, Telegram BotCommand menu, Slack subcommand map, and autocomplete data shipped to Ink.
## Investigation Steps
1. **Check if the command exists in the TUI frontend:**
```bash
search_files --pattern "/commandname" --file_glob "*.ts" --path ui-tui/
search_files --pattern "/commandname" --file_glob "*.tsx" --path ui-tui/
```
2. **Examine the TUI command definition:**
```bash
read_file ui-tui/src/app/slash/commands/core.ts
# If not there:
search_files --pattern "commandname" --path ui-tui/src/app/slash/commands --target files
```
3. **Check if the command exists in the Python backend:**
```bash
search_files --pattern "CommandDef" --file_glob "*.py" --path hermes_cli/
search_files --pattern "commandname" --path hermes_cli/commands.py --context 3
```
4. **Examine the gateway implementation:**
```bash
search_files --pattern "complete.slash|slash.exec" --path tui_gateway/
```
## Fix: Missing Command Autocomplete
If a command exists in the TUI but doesn't show in autocomplete:
1. Add a `CommandDef` entry to `COMMAND_REGISTRY` in `hermes_cli/commands.py`:
```python
CommandDef("commandname", "Description of the command", "Session",
cli_only=True, aliases=("alias",),
args_hint="[arg1|arg2|arg3]",
subcommands=("arg1", "arg2", "arg3")),
```
2. Pick `cli_only` vs gateway availability carefully:
- `cli_only=True` — only in the interactive CLI/TUI
- `gateway_only=True` — only in messaging platforms
- neither — available everywhere
- `gateway_config_gate="display.foo"` — config-gated availability in the gateway
3. Ensure `subcommands` matches the expected tab-completion options shown by the TUI.
4. If the command runs server-side, add a handler in `HermesCLI.process_command()` in `cli.py`:
```python
elif canonical == "commandname":
self._handle_commandname(cmd_original)
```
5. For gateway-available commands, add a handler in `gateway/run.py`:
```python
if canonical == "commandname":
return await self._handle_commandname(event)
```
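Steps 4 and 5 above follow a common dispatch shape; a sketch (the alias map and handler names are illustrative, not the real `process_command`):

```python
# Illustrative dispatch: resolve aliases to the canonical name, then
# look up a registered handler. Names here are assumptions.
ALIASES = {"d": "details"}
HANDLERS = {}

def register(name):
    """Decorator that records a handler under its canonical command name."""
    def deco(fn):
        HANDLERS[name] = fn
        return fn
    return deco

@register("details")
def _handle_details(arg):
    return f"details -> {arg or 'toggle'}"

def process_command(raw: str) -> str:
    parts = raw.lstrip("/").split(maxsplit=1)
    canonical = ALIASES.get(parts[0], parts[0])
    arg = parts[1] if len(parts) > 1 else ""
    handler = HANDLERS.get(canonical)
    if handler is None:
        return f"unknown command: {canonical}"
    return handler(arg)
```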
## Common Issues
1. **Command shows in TUI but not in autocomplete.** The command is defined in the TUI codebase but missing from `COMMAND_REGISTRY` in `hermes_cli/commands.py`. Autocomplete data ships from Python.
2. **Command shows in autocomplete but doesn't work.** Check the command handler in `tui_gateway/server.py` and the frontend handler in `ui-tui/src/app/createSlashHandler.ts`. If the command is local-only in Ink, it must be handled in the `app.tsx` built-in branch; otherwise it falls through to `slash.exec` and must have a Python handler.
3. **Command behavior differs between CLI and TUI.** The command might have different implementations. Check both `cli.py::process_command` and the TUI's local handler. Local TUI handlers take precedence over gateway dispatch.
4. **Command persists config but doesn't apply live.** For TUI-local commands, updating `config.set` is not enough. Also patch the relevant nanostore state immediately (usually `patchUiState(...)`) and thread the new state through the rendering components. Example: `/details collapsed` must update live detail visibility, not just save `details_mode`. An in-session global `/details <mode>` may additionally need a separate command-override flag, so that live commands can override built-in section defaults while startup/config sync preserves the default-expanded behavior for thinking/tools sections.
5. **Gateway dispatch silently ignores the command.** The gateway only dispatches commands it knows about. Check `GATEWAY_KNOWN_COMMANDS` (derived from `COMMAND_REGISTRY` automatically) includes the canonical name. If the command is `cli_only` with a `gateway_config_gate`, verify the gated config value is truthy.
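The derivation described in issue 5 can be pictured as follows (constant and field names mirror the text but are assumptions about the implementation):

```python
# Sketch of deriving gateway-known commands from the registry, including
# the gateway_config_gate behavior described above. Not the actual
# tui_gateway code.
def gateway_known_commands(registry, config: dict) -> set:
    known = set()
    for cmd in registry:
        if cmd.get("cli_only") and not cmd.get("gateway_config_gate"):
            continue  # CLI/TUI-only: the gateway never dispatches it
        gate = cmd.get("gateway_config_gate")
        if gate and not config.get(gate):
            continue  # gate config is falsy: silently ignored
        known.add(cmd["name"])
    return known

REGISTRY = [
    {"name": "help"},
    {"name": "details", "cli_only": True},
    {"name": "preview", "cli_only": True,
     "gateway_config_gate": "display.preview"},
]
```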
## Debugging Tactics
When surface-level inspection doesn't reveal the bug:
- **Python side hangs or misbehaves:** use the `python-debugpy` skill to break inside `_SlashWorker.exec` or the command handler. `remote-pdb` set at the handler entry is the fastest path.
- **Ink side not reacting:** use the `node-inspect-debugger` skill to break in `app.tsx`'s slash dispatch or the local command branch. `sb('dist/app.js', <line>)` after `npm run build`.
- **Registry mismatch / unclear which side is wrong:** compare the canonical `COMMAND_REGISTRY` entry against the TUI's local command list side-by-side.
## Pitfalls
- Don't forget to set the appropriate category for the command in `CommandDef` (e.g., "Session", "Configuration", "Tools & Skills", "Info", "Exit")
- Make sure any aliases are registered in the `aliases` tuple; no other file changes are needed, since everything downstream (Telegram menu, Slack mapping, autocomplete, help) derives from it
- For commands with subcommands, ensure the `subcommands` tuple in `CommandDef` matches what's in the TUI code
- `cli_only=True` commands won't work in gateway/messaging platforms — unless you add a `gateway_config_gate` and the gate is truthy
- After adding live UI state, search every consumer of the old prop/helper and thread the new state through all render paths, not just the active streaming path. TUI detail rendering has at least two important paths: live `StreamingAssistant`/`ToolTrail` and transcript/pending `MessageLine` rows. A `/clean` pass should explicitly check both.
- Rebuild the TUI (`npm --prefix ui-tui run build`) before testing — tsx watch mode may lag on first launch
## Verification
After fixing:
1. Rebuild the TUI:
```bash
cd /home/bb/hermes-agent && npm --prefix ui-tui run build
```
2. Run the TUI and test the command:
```bash
hermes --tui
```
3. Type `/` and verify the command appears in autocomplete suggestions with the expected description and args hint.
4. Execute the command and confirm:
- Expected behavior fires
- Any persisted config updates correctly (`read_file ~/.hermes/config.yaml`)
- Live UI state reflects the change immediately (not just after restart)
5. If the command is also gateway-available, test it from at least one messaging platform (or run the gateway tests: `scripts/run_tests.sh tests/gateway/`).

---
name: hermes-agent-skill-authoring
description: Use when authoring or updating a SKILL.md inside the hermes-agent repo itself (skills/ tree, committed to a branch). Covers required frontmatter, validator limits, peer-matching structure, and the write_file-vs-skill_manage distinction for in-repo skills.
version: 1.0.0
author: Hermes Agent
license: MIT
metadata:
hermes:
tags: [skills, authoring, hermes-agent, conventions, skill-md]
related_skills: [writing-plans, requesting-code-review]
---
# Authoring Hermes-Agent Skills (in-repo)
## Overview
There are two places a SKILL.md can live:
1. **User-local:** `~/.hermes/skills/<maybe-category>/<name>/SKILL.md` — personal, not shared. Created via `skill_manage(action='create')`.
2. **In-repo (this skill is about this case):** `/home/bb/hermes-agent/skills/<category>/<name>/SKILL.md` — committed, shipped with the package. Use `write_file` + `git add`. `skill_manage(action='create')` does NOT target this tree.
## When to Use
- User asks you to add a skill "in this branch / repo / commit"
- You're committing a reusable workflow that should ship with hermes-agent
- You're editing an existing skill under `/home/bb/hermes-agent/skills/` (use `patch` for small edits, `write_file` for rewrites; `skill_manage` still works for patch on in-repo skills, but not for `create`)
## Required Frontmatter
Source of truth: `tools/skill_manager_tool.py::_validate_frontmatter`. Hard requirements:
- Starts with `---` as the first bytes (no leading blank line).
- Closes with `\n---\n` before the body.
- Parses as a YAML mapping.
- `name` field present.
- `description` field present, ≤ **1024 chars** (`MAX_DESCRIPTION_LENGTH`).
- Non-empty body after the closing `---`.
Peer-matched shape used by every skill under `skills/software-development/`:
```yaml
---
name: my-skill-name # lowercase, hyphens, ≤64 chars (MAX_NAME_LENGTH)
description: Use when <trigger>. <one-line behavior>.
version: 1.0.0
author: Hermes Agent
license: MIT
metadata:
hermes:
tags: [short, descriptive, tags]
related_skills: [other-skill, another-skill]
---
```
`version` / `author` / `license` / `metadata` are NOT enforced by the validator, but every peer has them — omit and your skill sticks out.
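The hard requirements above can be mirrored in a quick standalone checker (a sketch of the listed rules, scanning frontmatter line-wise to stay dependency-free; this is not the actual `_validate_frontmatter`):

```python
import re

MAX_DESCRIPTION_LENGTH = 1024

def check_skill_md(content: str) -> list:
    """Return a list of violations of the hard requirements listed above."""
    errors = []
    if not content.startswith("---"):
        errors.append("must start with --- as the first bytes")
        return errors
    m = re.search(r"\n---\s*\n", content[3:])
    if not m:
        errors.append("missing closing --- fence")
        return errors
    fm, body = content[3:m.start() + 3], content[3 + m.end():]
    # Naive key scan instead of a YAML parse, to keep this self-contained.
    fields = dict(
        line.split(":", 1) for line in fm.splitlines() if ":" in line
    )
    if "name" not in fields:
        errors.append("name missing")
    desc = fields.get("description", "").strip()
    if not desc:
        errors.append("description missing")
    elif len(desc) > MAX_DESCRIPTION_LENGTH:
        errors.append("description too long")
    if not body.strip():
        errors.append("empty body")
    return errors
```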
## Size Limits
- Description: ≤ 1024 chars (enforced).
- Full SKILL.md: ≤ 100,000 chars (enforced as `MAX_SKILL_CONTENT_CHARS`, ~36k tokens).
- Peer skills in `software-development/` sit at **8-14k chars**. Aim for that range. If you're pushing past 20k, split into `references/*.md` and reference them from SKILL.md.
## Peer-Matched Structure
Every in-repo skill follows roughly:
```
# <Title>
## Overview
One or two paragraphs: what and why.
## When to Use
- Bulleted triggers
- "Don't use for:" counter-triggers
## <Topic sections specific to the skill>
- Quick-reference tables are common
- Code blocks with exact commands
- Hermes-specific recipes (tests via scripts/run_tests.sh, ui-tui paths, etc.)
## Common Pitfalls
Numbered list of mistakes and their fixes.
## Verification Checklist
- [ ] Checkbox list of post-action verifications
## One-Shot Recipes (optional)
Named scenarios → concrete command sequences.
```
Not every section is mandatory, but `Overview` + `When to Use` + actionable body + pitfalls are the minimum for the skill to feel like a peer.
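The peer structure above can be stamped out mechanically; a hypothetical scaffold helper (not an existing hermes tool, just a sketch of the shape):

```python
# Hypothetical scaffold for a new in-repo skill; the section list mirrors
# the peer-matched structure described above.
FRONTMATTER = """\
---
name: {name}
description: {description}
version: 1.0.0
author: Hermes Agent
license: MIT
metadata:
  hermes:
    tags: [{tags}]
    related_skills: []
---
"""

SECTIONS = ("Overview", "When to Use", "Common Pitfalls",
            "Verification Checklist")

def scaffold(name: str, description: str, tags: list) -> str:
    body = f"# {name.replace('-', ' ').title()}\n"
    body += "".join(f"## {s}\nTODO\n" for s in SECTIONS)
    return FRONTMATTER.format(name=name, description=description,
                              tags=", ".join(tags)) + body
```

Write the result with `write_file` to `skills/<category>/<name>/SKILL.md`, then replace the TODOs.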
## Directory Placement
```
skills/<category>/<skill-name>/SKILL.md
```
Categories currently in repo (confirm with `ls skills/`): `autonomous-ai-agents`, `creative`, `data-science`, `devops`, `dogfood`, `email`, `gaming`, `github`, `leisure`, `mcp`, `media`, `mlops/*`, `note-taking`, `productivity`, `red-teaming`, `research`, `smart-home`, `social-media`, `software-development`.
Pick the closest existing category. Don't invent new top-level categories casually.
## Workflow
1. **Survey peers** in the target category:
```
ls skills/<category>/
```
Read 2-3 peer SKILL.md files to match tone and structure.
2. **Check validator constraints** in `tools/skill_manager_tool.py` if unsure.
3. **Draft** with `write_file` to `skills/<category>/<name>/SKILL.md`.
4. **Validate locally**:
```python
import yaml, re, pathlib
content = pathlib.Path("skills/<category>/<name>/SKILL.md").read_text()
assert content.startswith("---")
m = re.search(r'\n---\s*\n', content[3:])
assert m, "missing closing --- fence"
fm = yaml.safe_load(content[3:m.start() + 3])  # YAML between the two fences
assert "name" in fm and "description" in fm
assert len(fm["description"]) <= 1024
assert len(content) <= 100_000
```
5. **Git add + commit** on the active branch.
6. **Note:** the CURRENT session's skill loader is cached — `skill_view` / `skills_list` will not see the new skill until a new session. This is expected, not a bug.
## Cross-Referencing Other Skills
`metadata.hermes.related_skills` unions both trees (`skills/` in-repo and `~/.hermes/skills/`) at load time. You CAN reference a user-local skill from an in-repo skill, but it won't resolve for other users who clone the repo fresh. Prefer referencing only in-repo skills from in-repo skills. If a frequently-referenced skill lives only in `~/.hermes/skills/`, consider promoting it to the repo.
## Editing Existing In-Repo Skills
- **Small fix (typo, added pitfall, tightened trigger):** `skill_manage(action='patch', name=..., old_string=..., new_string=...)` works fine on in-repo skills.
- **Major rewrite:** `write_file` the whole SKILL.md. `skill_manage(action='edit')` also works but requires supplying the full new content.
- **Adding supporting files:** `write_file` to `skills/<category>/<name>/references/<file>.md`, `templates/<file>`, or `scripts/<file>`. `skill_manage(action='write_file')` also works and enforces the references/templates/scripts/assets subdir allowlist.
- **Always commit** the edit — in-repo skills are source, not runtime state.
## Common Pitfalls
1. **Using `skill_manage(action='create')` for an in-repo skill.** It writes to `~/.hermes/skills/`, not the repo tree. Use `write_file` for in-repo creation.
2. **Leading whitespace before `---`.** The validator checks `content.startswith("---")`; any leading blank line or BOM fails validation.
3. **Description too generic.** Peer descriptions start with "Use when ..." and describe the *trigger class*, not the one task. "Use when debugging X" > "Debug X".
4. **Forgetting the author/license/metadata block.** Not validator-enforced, but every peer has it; omitting makes the skill look half-finished.
5. **Writing a skill that duplicates a peer.** Before creating, `ls skills/<category>/` and open 2-3 peers. Prefer extending an existing skill to creating a narrow sibling.
6. **Expecting the current session to see the new skill.** It won't. The skill loader is initialized at session start. Verify in a fresh session or via `skill_view` using the exact path.
7. **Linking to skills that don't exist in-repo.** `related_skills: [some-user-local-skill]` works for you but breaks for other clones. Prefer only in-repo links.
## Verification Checklist
- [ ] File is at `skills/<category>/<name>/SKILL.md` (not in `~/.hermes/skills/`)
- [ ] Frontmatter starts at byte 0 with `---`, closes with `\n---\n`
- [ ] `name`, `description`, `version`, `author`, `license`, `metadata.hermes.{tags, related_skills}` all present
- [ ] Name ≤ 64 chars, lowercase + hyphens
- [ ] Description ≤ 1024 chars and starts with "Use when ..."
- [ ] Total file ≤ 100,000 chars (aim for 8-15k)
- [ ] Structure: `# Title` → `## Overview` → `## When to Use` → body → `## Common Pitfalls` → `## Verification Checklist`
- [ ] `related_skills` references resolve in-repo (or are explicitly OK to be user-local)
- [ ] `git add skills/<category>/<name>/ && git commit` completed on the intended branch

---
name: node-inspect-debugger
description: Use when debugging Node.js code (ui-tui, tui_gateway child processes, any Node script/test) with real breakpoints, stepping, scope inspection, and expression evaluation. Drives `node --inspect` via the Chrome DevTools Protocol from the terminal — no browser required.
version: 1.0.0
author: Hermes Agent
license: MIT
metadata:
hermes:
tags: [debugging, nodejs, node-inspect, cdp, breakpoints, ui-tui]
related_skills: [systematic-debugging, python-debugpy, debugging-hermes-tui-commands]
---
# Node.js Inspect Debugger
## Overview
When `console.log` isn't enough, drive Node's built-in V8 inspector programmatically from the terminal. You get real breakpoints, step in/over/out, call-stack walking, local/closure scope dumps, and arbitrary expression evaluation in the paused frame.
Two tools, pick one:
- **`node inspect`** — built-in, zero install, CLI REPL. Best for quick poking.
- **`ndb` / CDP via `chrome-remote-interface`** — scriptable from Node/Python; best when you want to automate many breakpoints, collect state across runs, or debug non-interactively from an agent loop.
**Prefer `node inspect` first.** It's always available and the REPL is fast.
## When to Use
- A Node test fails and you need to see intermediate state
- ui-tui crashes or behaves wrong and you want to inspect React/Ink state pre-render
- tui_gateway child processes (`_SlashWorker`, PTY bridge workers) misbehave
- You need to inspect a value in a closure that `console.log` can't reach without patching
- Perf: attach to a running process to capture a CPU profile or heap snapshot
**Don't use for:** things `console.log` solves in under a minute. Breakpoint-driven debugging is heavier; use it when the payoff is real.
## Quick Reference: `node inspect` REPL
Launch paused on first line:
```bash
node inspect path/to/script.js
# or with tsx
node --inspect-brk $(which tsx) path/to/script.ts
```
The `debug>` prompt accepts:
| Command | Action |
|---|---|
| `c` or `cont` | continue |
| `n` or `next` | step over |
| `s` or `step` | step into |
| `o` or `out` | step out |
| `pause` | pause running code |
| `sb('file.js', 42)` | set breakpoint at file.js line 42 |
| `sb(42)` | set breakpoint at line 42 of current file |
| `sb('functionName')` | break when function is called |
| `cb('file.js', 42)` | clear breakpoint |
| `breakpoints` | list all breakpoints |
| `bt` | backtrace (call stack) |
| `list(5)` | show 5 lines of source around current position |
| `watch('expr')` | evaluate expr on every pause |
| `watchers` | show watched expressions |
| `repl` | drop into REPL in current scope (Ctrl+C to exit REPL) |
| `exec expr` | evaluate expression once |
| `restart` | restart script |
| `kill` | kill the script |
| `.exit` | quit debugger |
**In the `repl` sub-mode:** type any JS expression, including access to locals/closure variables. `Ctrl+C` exits back to `debug>`.
## Attaching to a Running Process
When the process is already running (e.g. a long-lived dev server or the TUI gateway):
```bash
# 1. Send SIGUSR1 to enable the inspector on an existing process
kill -SIGUSR1 <pid>
# Node prints: Debugger listening on ws://127.0.0.1:9229/<uuid>
# 2. Attach the debugger CLI
node inspect -p <pid>
# or by URL
node inspect ws://127.0.0.1:9229/<uuid>
```
To start a process with the inspector from the beginning:
```bash
node --inspect script.js # listen on 127.0.0.1:9229, keep running
node --inspect-brk script.js # listen AND pause on first line
node --inspect=0.0.0.0:9230 script.js # custom host:port
```
For TypeScript via tsx:
```bash
node --inspect-brk --import tsx script.ts
# or older tsx
node --inspect-brk -r tsx/cjs script.ts
```
## Programmatic CDP (scripting from terminal)
When you want to automate — set many breakpoints, capture scope state, script a repro — use `chrome-remote-interface`:
```bash
npm i -g chrome-remote-interface # or project-local
# Start your target:
node --inspect-brk=9229 target.js &
```
Driver script (save as `/tmp/cdp-debug.js`):
```javascript
const CDP = require('chrome-remote-interface');
(async () => {
const client = await CDP({ port: 9229 });
const { Debugger, Runtime } = client;
Debugger.paused(async ({ callFrames, reason }) => {
const top = callFrames[0];
console.log(`PAUSED: ${reason} @ ${top.url}:${top.location.lineNumber + 1}`);
// Walk scopes for locals
for (const scope of top.scopeChain) {
if (scope.type === 'local' || scope.type === 'closure') {
const { result } = await Runtime.getProperties({
objectId: scope.object.objectId,
ownProperties: true,
});
for (const p of result) {
console.log(` ${scope.type}.${p.name} =`, p.value?.value ?? p.value?.description);
}
}
}
// Evaluate an expression in the paused frame
const { result } = await Debugger.evaluateOnCallFrame({
callFrameId: top.callFrameId,
expression: 'typeof state !== "undefined" ? JSON.stringify(state) : "n/a"',
});
console.log('state =', result.value ?? result.description);
await Debugger.resume();
});
await Runtime.enable();
await Debugger.enable();
// Set a breakpoint by URL regex + line
await Debugger.setBreakpointByUrl({
urlRegex: '.*app\\.tsx$',
lineNumber: 119, // 0-indexed
columnNumber: 0,
});
await Runtime.runIfWaitingForDebugger();
})();
```
Run it:
```bash
node /tmp/cdp-debug.js
```
Hermes-specific note: `chrome-remote-interface` is NOT in `ui-tui/package.json`. Install it to a throwaway location if you don't want to dirty the project:
```bash
mkdir -p /tmp/cdp-tools && cd /tmp/cdp-tools && npm i chrome-remote-interface
NODE_PATH=/tmp/cdp-tools/node_modules node /tmp/cdp-debug.js
```
## Debugging Hermes ui-tui
The TUI is built with Ink and run via tsx. Two common scenarios:
### Debugging a single Ink component under dev
`ui-tui/package.json` has `npm run dev` (tsx --watch). For breakpoint debugging, bypass the watcher and run node with `--inspect-brk` on the built output:
```bash
cd /home/bb/hermes-agent/ui-tui
npm run build # produce dist/ once so transpile isn't needed on first load
node --inspect-brk dist/entry.js
# In another terminal:
node inspect -p <node pid>
```
Then inside `debug>`:
```
sb('dist/app.js', 220) # or wherever the suspect render is
cont
```
When it pauses, `repl` → inspect `props`, state refs, `useInput` handler values, etc.
### Debugging a running `hermes --tui`
The TUI spawns Node from the Python CLI. Easiest path:
```bash
# 1. Launch TUI
hermes --tui &
TUI_PID=$(pgrep -f 'ui-tui/dist/entry' | head -1)
# 2. Enable inspector on that Node PID
kill -SIGUSR1 "$TUI_PID"
# 3. Find the WS URL
curl -s http://127.0.0.1:9229/json/list | jq -r '.[0].webSocketDebuggerUrl'
# 4. Attach
node inspect ws://127.0.0.1:9229/<uuid>
```
Interacting with the TUI (typing in its window) continues to advance execution; your debugger can pause it on a breakpoint at any `sb(...)`.
### Debugging `_SlashWorker` / PTY child processes
Those are Python, not Node — use the `python-debugpy` skill for them. Only Node portions (Ink UI, tui_gateway client, tsx-run tests under `ui-tui/`) use this skill.
## Running Vitest Tests Under the Debugger
```bash
cd /home/bb/hermes-agent/ui-tui
# Run a single test file paused on entry
node --inspect-brk ./node_modules/vitest/vitest.mjs run --no-file-parallelism src/app/foo.test.tsx
```
In another terminal: `node inspect -p <pid>`, then `sb('src/app/foo.tsx', 42)`, `cont`.
Use `--no-file-parallelism` (vitest) or `--runInBand` (jest) so only one worker exists — debugging a pool is painful.
## Heap Snapshots & CPU Profiles (Non-interactive)
From the CDP driver above, swap Debugger for `HeapProfiler` / `Profiler`:
```javascript
// CPU profile for 5 seconds
await client.Profiler.enable();
await client.Profiler.start();
await new Promise(r => setTimeout(r, 5000));
const { profile } = await client.Profiler.stop();
require('fs').writeFileSync('/tmp/cpu.cpuprofile', JSON.stringify(profile));
// Open /tmp/cpu.cpuprofile in Chrome DevTools → Performance tab
```
```javascript
// Heap snapshot
await client.HeapProfiler.enable();
const chunks = [];
client.HeapProfiler.addHeapSnapshotChunk(({ chunk }) => chunks.push(chunk));
await client.HeapProfiler.takeHeapSnapshot({ reportProgress: false });
require('fs').writeFileSync('/tmp/heap.heapsnapshot', chunks.join(''));
```
## Common Pitfalls
1. **Wrong line numbers in TS source.** Breakpoints hit the emitted JS, not the `.ts`. Either (a) break in the built `dist/*.js`, or (b) enable sourcemaps (`node --enable-source-maps`) and use `sb('src/app.tsx', N)` — but only with CDP clients that follow sourcemaps. `node inspect` CLI does not.
2. **`--inspect` vs `--inspect-brk`.** `--inspect` starts the inspector but doesn't pause; your script races past your first breakpoint if you attach too late. Use `--inspect-brk` when you need to set breakpoints before any code runs.
3. **Port collisions.** Default is `9229`. If multiple Node processes are inspecting, pass `--inspect=0` (random port) and read the actual URL from `/json/list`:
```bash
curl -s http://127.0.0.1:9229/json/list # lists all inspectable targets on the host
```
4. **Child processes.** `--inspect` on a parent does NOT inspect its children. Use `NODE_OPTIONS='--inspect-brk' node parent.js` to propagate to every child; be aware they all need unique ports (Node auto-increments when `NODE_OPTIONS='--inspect'` is inherited).
5. **Background kills.** If you `Ctrl+C` out of `node inspect` while the target is paused, the target stays paused. Either `cont` first, or `kill` the target explicitly.
6. **Running `node inspect` through an agent terminal.** It's a PTY-friendly REPL. In Hermes, launch it with `terminal(pty=true)` or `background=true` + `process(action='submit', data='...')`. Non-PTY foreground mode will work for one-shot commands but not for interactive stepping.
7. **Security.** `--inspect=0.0.0.0:9229` exposes arbitrary code execution. Always bind to `127.0.0.1` (the default) unless you have an isolated network.
## Verification Checklist
After setting up a debug session, verify:
- [ ] `curl -s http://127.0.0.1:9229/json/list` returns exactly the target you expect
- [ ] First breakpoint actually hits (if it doesn't, you likely missed `--inspect-brk` or attached after execution completed)
- [ ] Source listing at pause shows the right file (mismatch = sourcemap issue, see pitfall 1)
- [ ] `exec process.pid` in `repl` returns the PID you meant to attach to
## One-Shot Recipes
**"Why is this variable undefined at line X?"**
```bash
node --inspect-brk script.js &
node inspect -p $!
# debug>
sb('script.js', X)
cont
# paused. Now:
repl
> myVariable
> Object.keys(this)
```
**"What's the call path into this function?"**
```
debug> sb('suspectFn')
debug> cont
# paused on entry
debug> bt
```
**"This async chain hangs — where?"**
```
# Start with --inspect (no -brk), let it run to the hang, then:
debug> pause
debug> bt
# Now you see the stuck frame
```

---
name: python-debugpy
description: Use when debugging Python code (run_agent.py, cli.py, tui_gateway, tests, scripts) with real breakpoints, stepping, scope inspection, and post-mortem analysis. Covers `pdb` for interactive REPL debugging and `debugpy` for remote/headless DAP-driven sessions.
version: 1.0.0
author: Hermes Agent
license: MIT
metadata:
hermes:
tags: [debugging, python, pdb, debugpy, breakpoints, dap, post-mortem]
related_skills: [systematic-debugging, node-inspect-debugger, debugging-hermes-tui-commands]
---
# Python Debugger (pdb + debugpy)
## Overview
Three tools, picked by situation:
| Tool | When |
|---|---|
| **`breakpoint()` + pdb** | Local, interactive, simplest. Add `breakpoint()` in the source, run normally, get a REPL at that line. |
| **`python -m pdb`** | Launch an existing script under pdb with no source edits. Useful for quick poking. |
| **`debugpy`** | Remote / headless / "attach to already-running process." Talks DAP, scriptable from terminal, works for long-lived processes (gateway, daemon, PTY children). |
**Start with `breakpoint()`.** It's the cheapest thing that works.
## When to Use
- A test fails and the traceback doesn't reveal why a value is wrong
- You need to step through a function and watch a collection mutate
- A long-running process (hermes gateway, tui_gateway) misbehaves and you can't restart it
- Post-mortem: an exception fired in prod-ish code and you want to inspect locals at the crash site
- A subprocess / child (Python `_SlashWorker`, PTY bridge worker) is the actual bug site
**Don't use for:** things `print()` / `logging.debug` solve in under a minute, or things `pytest -vv --tb=long --showlocals` already reveals.
## pdb Quick Reference
Inside any pdb prompt (`(Pdb)`):
| Command | Action |
|---|---|
| `h` / `h cmd` | help |
| `n` | next line (step over) |
| `s` | step into |
| `r` | return from current function |
| `c` | continue |
| `unt N` | continue until line N |
| `j N` | jump to line N (same function only) |
| `l` / `ll` | list source around current line / full function |
| `w` | where (stack trace) |
| `u` / `d` | move up / down in the stack |
| `a` | print args of the current function |
| `p expr` / `pp expr` | print / pretty-print expression |
| `display expr` | auto-print expr on every stop |
| `b file:line` | set breakpoint |
| `b func` | break on function entry |
| `b file:line, cond` | conditional breakpoint |
| `cl N` | clear breakpoint N |
| `tbreak file:line` | one-shot breakpoint |
| `!stmt` | execute arbitrary Python (assignments included) |
| `interact` | drop into full Python REPL in current scope (Ctrl+D to exit) |
| `q` | quit |
The `interact` command is the most powerful: you can import anything, inspect complex objects, even call methods that mutate state. To assign to locals from the `(Pdb)` prompt itself, prefix the statement with `!` (e.g. `!x = 42`) so it isn't mistaken for a debugger command.
## Recipe 1: Local breakpoint
Easiest. Edit the file:
```python
def compute(x, y):
result = some_helper(x)
breakpoint() # <-- drops into pdb here
return result + y
```
Run the code normally. You land at the `breakpoint()` line with full access to locals.
**Don't forget to remove `breakpoint()` before committing.** Use `git diff` or a pre-commit grep:
```bash
rg -n 'breakpoint\(\)' --type py
```
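`breakpoint()` is a hook (PEP 553): it calls `sys.breakpointhook`, and setting `PYTHONBREAKPOINT=0` in the environment disables every call site at once, which is useful insurance in CI. A quick demonstration of the hook mechanism:

```python
import sys

# breakpoint() dispatches to sys.breakpointhook (PEP 553), so it can be
# redirected or neutered without touching the call sites.
calls = []
sys.breakpointhook = lambda *a, **k: calls.append("hit")

def compute(x):
    breakpoint()  # would normally drop into pdb here
    return x + 1

result = compute(41)
sys.breakpointhook = sys.__breakpointhook__  # restore the default hook
```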
## Recipe 2: Launch a script under pdb (no source edits)
```bash
python -m pdb path/to/script.py arg1 arg2
# Lands at first line of script
(Pdb) b path/to/script.py:42
(Pdb) c
```
## Recipe 3: Debug a pytest test
The hermes test runner and pytest both support this:
```bash
# Drop to pdb on failure (or on any raised exception):
scripts/run_tests.sh tests/path/to/test_file.py::test_name --pdb
# Drop to pdb at the START of the test:
scripts/run_tests.sh tests/path/to/test_file.py::test_name --trace
# Show locals in tracebacks without pdb:
scripts/run_tests.sh tests/path/to/test_file.py --showlocals --tb=long
```
Note: `scripts/run_tests.sh` uses xdist (`-n 4`) by default, and pdb does NOT work under xdist. Add `-p no:xdist` or run a single test with `-n 0`:
```bash
scripts/run_tests.sh tests/foo_test.py::test_bar --pdb -p no:xdist
# or
source .venv/bin/activate
python -m pytest tests/foo_test.py::test_bar --pdb
```
This bypasses the hermetic-env guarantees — fine for debugging, but re-run under the wrapper to confirm before pushing.
## Recipe 4: Post-mortem on any exception
```python
import pdb, sys
try:
run_the_thing()
except Exception:
pdb.post_mortem(sys.exc_info()[2])
```
Or wrap a whole script:
```bash
python -m pdb -c continue script.py
# When it crashes, pdb catches it and you're in the frame of the exception
```
Or set a global hook in a repl/jupyter:
```python
import sys
def excepthook(etype, value, tb):
import pdb; pdb.post_mortem(tb)
sys.excepthook = excepthook
```
## Recipe 5: Remote debug with debugpy (attach to running process)
For long-lived processes: Hermes gateway, tui_gateway, a daemon, a process that's already misbehaving and can't be restarted clean.
### Setup
```bash
source /home/bb/hermes-agent/.venv/bin/activate
pip install debugpy
```
### Pattern A: Source-edit — process waits for debugger at launch
Add near the top of the entry point (or inside the function you want to debug):
```python
import debugpy
debugpy.listen(("127.0.0.1", 5678))
print("debugpy listening on 5678, waiting for client...", flush=True)
debugpy.wait_for_client()
debugpy.breakpoint() # optional: pause immediately once attached
```
Start the process; it blocks on `wait_for_client()`.
### Pattern B: No source edit — launch with `-m debugpy`
```bash
python -m debugpy --listen 127.0.0.1:5678 --wait-for-client your_script.py arg1
```
Equivalent for module entry:
```bash
python -m debugpy --listen 127.0.0.1:5678 --wait-for-client -m your.module
```
### Pattern C: Attach to an already-running process
Needs the PID and debugpy preinstalled in the target's environment:
```bash
python -m debugpy --listen 127.0.0.1:5678 --pid <pid>
# debugpy injects itself into the process. Then attach a client as below.
```
Some kernels/security configs block the ptrace-based injection (`/proc/sys/kernel/yama/ptrace_scope`). Fix with:
```bash
echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
```
### Connecting a client from the terminal
The DAP protocol has no ready-made terminal client, so from inside Hermes you have three practical options:
**Option 1: a hand-rolled DAP client** — debugpy ships no official CLI REPL, but a tiny script can drive the protocol directly:
```python
# /tmp/dap_client.py
import socket, json, itertools, sys
HOST, PORT = "127.0.0.1", 5678
s = socket.create_connection((HOST, PORT))
seq = itertools.count(1)
def send(msg):
msg["seq"] = next(seq)
body = json.dumps(msg).encode()
s.sendall(f"Content-Length: {len(body)}\r\n\r\n".encode() + body)
def recv():
header = b""
while b"\r\n\r\n" not in header:
header += s.recv(1)
length = int(header.decode().split("Content-Length:")[1].split("\r\n")[0].strip())
body = b""
while len(body) < length:
body += s.recv(length - len(body))
return json.loads(body)
send({"type": "request", "command": "initialize", "arguments": {"adapterID": "python"}})
print(recv())
send({"type": "request", "command": "attach", "arguments": {}})
print(recv())
send({"type": "request", "command": "setBreakpoints",
"arguments": {"source": {"path": sys.argv[1]},
"breakpoints": [{"line": int(sys.argv[2])}]}})
print(recv())
send({"type": "request", "command": "configurationDone"})
# ... loop reading events and sending continue/stepIn/etc.
```
This is fine for one-off automation but painful as an interactive UX.
**Option 2: Attach from VS Code / Cursor / Zed** — if the user has one open, they can add a `launch.json`:
```json
{
"name": "Attach to Hermes",
"type": "debugpy",
"request": "attach",
"connect": { "host": "127.0.0.1", "port": 5678 },
"justMyCode": false,
"pathMappings": [
{ "localRoot": "${workspaceFolder}", "remoteRoot": "/home/bb/hermes-agent" }
]
}
```
**Option 3: Ditch DAP, use `remote-pdb`** — usually what you actually want from a terminal agent:
```bash
pip install remote-pdb
```
In your code:
```python
from remote_pdb import set_trace
set_trace(host="127.0.0.1", port=4444) # blocks until connection
```
Then from the terminal:
```bash
nc 127.0.0.1 4444
# You get a (Pdb) prompt exactly as if debugging locally.
```
`remote-pdb` is the cleanest agent-friendly choice when `debugpy`'s DAP protocol is overkill. Use `debugpy` only when you actually need IDE integration.
## Debugging Hermes-specific Processes
### Tests
See Recipe 3. Always add `-p no:xdist` or run single tests without xdist.
### `run_agent.py` / CLI — one-shot
Easiest: add `breakpoint()` near the suspect line, then run `hermes` normally. Control returns to your terminal at the pause point.
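If you want the pause to be opt-in, gate it behind an environment variable (the `HERMES_DEBUG` name here is made up for illustration); note that `breakpoint()` consults `PYTHONBREAKPOINT` on every call:

```python
import os

def maybe_break():
    # Pauses only when explicitly requested. Even then, PYTHONBREAKPOINT=0
    # turns breakpoint() into a no-op, and PYTHONBREAKPOINT=module.callable
    # swaps in a different debugger (e.g. remote_pdb.set_trace).
    if os.environ.get("HERMES_DEBUG") == "1":
        breakpoint()

maybe_break()  # no-op unless HERMES_DEBUG=1
```

This keeps the suspect line instrumented across runs without hanging unattended invocations.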
### `tui_gateway` subprocess (spawned by `hermes --tui`)
The gateway runs as a child of the Node TUI. Options:
**A. Source-edit the gateway:**
```python
# tui_gateway/server.py near the top of serve()
import debugpy
debugpy.listen(("127.0.0.1", 5678))
debugpy.wait_for_client()
```
Start `hermes --tui`. The TUI will appear frozen (its backend is waiting). Attach a client; execution resumes when you `continue`.
**B. Use `remote-pdb` at a specific handler:**
```python
from remote_pdb import set_trace
set_trace(host="127.0.0.1", port=4444) # in the RPC handler you want to trap
```
Trigger the matching slash command from the TUI, then `nc 127.0.0.1 4444` in another terminal.
### `_SlashWorker` subprocess
Same pattern — `remote-pdb` with `set_trace()` inside the worker's `exec` path. The worker is persistent across slash commands, so the first trigger blocks until you connect; subsequent slash commands pass through normally unless you re-arm the trap by hitting `set_trace()` again.
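A sketch of an explicitly re-armable trap (the `HERMES_TRAP` variable and the port are assumptions, not existing Hermes knobs):

```python
import os

_disarmed = False

def trap_once(port=4444):
    """Trap the first call after arming (HERMES_TRAP=1); pass through afterwards."""
    global _disarmed
    if _disarmed or os.environ.get("HERMES_TRAP") != "1":
        return False
    _disarmed = True  # subsequent slash commands run normally; flip back to re-arm
    from remote_pdb import set_trace
    set_trace(host="127.0.0.1", port=port)  # blocks until `nc 127.0.0.1 4444` connects
    return True
```

Call `trap_once()` at the top of the worker's `exec` path; with the env var unset it costs one dict lookup and never imports `remote_pdb`.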
### Gateway (`gateway/run.py`)
Long-lived. Use `remote-pdb` at a handler, or `debugpy` with `--wait-for-client` if you're restarting the gateway anyway.
## Common Pitfalls
1. **pdb under pytest-xdist silently does nothing.** You won't see the prompt, the test just hangs. Always use `-p no:xdist` or `-n 0`.
2. **`breakpoint()` in CI / non-TTY contexts hangs the process.** Safe locally; never commit it. Add a pre-commit grep as a safety net.
3. **`PYTHONBREAKPOINT=0`** disables all `breakpoint()` calls. Check the env if your breakpoint isn't hitting:
```bash
echo $PYTHONBREAKPOINT
```
4. **`debugpy.listen` blocks only if you also call `wait_for_client()`.** Without it, execution continues and your first breakpoint may fire before the client is attached.
5. **Attach to PID fails on hardened kernels.** `ptrace_scope=1` (Ubuntu default) allows only same-user ptrace of child processes. Workaround: `echo 0 > /proc/sys/kernel/yama/ptrace_scope` (needs root) or launch under `debugpy` from the start.
6. **Threads.** `pdb` only debugs the thread it was entered from. For multithreaded code, use `debugpy` (its DAP adapter is thread-aware) or install a trace function with `threading.settrace()`, which applies to threads started after the call.
7. **asyncio.** `pdb` works inside coroutines, but you cannot `await` at the `(Pdb)` prompt on 3.11/3.12 (direct `await` support is only arriving in newer Python versions). Instead, schedule work without awaiting it — e.g. `!import asyncio; asyncio.ensure_future(coro())` — and inspect results once the loop resumes.
8. **`scripts/run_tests.sh` strips credentials and sets `HOME=<tmpdir>`.** If your bug depends on user config or real API keys, it won't reproduce under the wrapper. Debug with raw `pytest` first to repro, then re-confirm under the wrapper.
9. **Forking / multiprocessing.** pdb does not follow forks. Each child needs its own `breakpoint()` or `set_trace()`. For Hermes subagents, debug one process at a time.
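For pitfall 6, a minimal sketch of `threading.settrace`: the trace function is installed in every thread started after the call, and fires on each `'call'` event in those threads:

```python
import threading

events = []

def tracer(frame, event, arg):
    # Invoked for 'call' events in every newly started thread; returning None
    # skips per-line tracing in that frame, keeping overhead low.
    if event == "call":
        events.append((threading.current_thread().name, frame.f_code.co_name))
    return None

threading.settrace(tracer)  # affects threads started AFTER this point

def work():
    pass

t = threading.Thread(target=work, name="worker")
t.start()
t.join()
threading.settrace(None)  # uninstall when done
```

In a real session you would call `breakpoint()` or `remote_pdb.set_trace()` from inside the tracer when the thread/function of interest appears.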
## Verification Checklist
- [ ] After `pip install debugpy`, confirm: `python -c "import debugpy; print(debugpy.__version__)"`
- [ ] For remote debug, confirm the port is actually listening: `ss -tlnp | grep 5678`
- [ ] First breakpoint actually hits (if it doesn't, you likely have `PYTHONBREAKPOINT=0`, you're under xdist, or execution finished before attach)
- [ ] `where` / `w` shows the expected call stack
- [ ] Post-debug cleanup: no stray `breakpoint()` / `set_trace()` in committed code
```bash
rg -n 'breakpoint\(\)|set_trace\(|debugpy\.listen' --type py
```
## One-Shot Recipes
**"Why is this dict missing a key?"**
```python
# add above the KeyError site
breakpoint()
# then in pdb:
(Pdb) pp d
(Pdb) pp list(d.keys())
(Pdb) w # how did we get here
```
**"This test passes in isolation but fails in the suite."**
```bash
scripts/run_tests.sh tests/the_test.py --pdb -p no:xdist
# But if it only fails WITH other tests:
source .venv/bin/activate
python -m pytest tests/ -x --pdb -p no:xdist
# Now pdb traps at the exact failing test, after suite state has accumulated.
```
**"My async handler deadlocks."**
```python
# Add at handler entry
import remote_pdb; remote_pdb.set_trace(host="127.0.0.1", port=4444)
```
Trigger the handler. `nc 127.0.0.1 4444`, then `w` to see the suspended frame, and `!import asyncio` followed by `p asyncio.all_tasks()` to see what else is pending.
**"Post-mortem on a crash in an Ink child process / subprocess."**
```bash
PYTHONFAULTHANDLER=1 python -m pdb -c continue path/to/entrypoint.py
# On an uncaught Python exception, pdb lands at the failing frame with full locals;
# faulthandler additionally dumps C-level tracebacks on hard crashes (segfaults) pdb can't catch
```
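`PYTHONFAULTHANDLER=1` is the env-var form of the stdlib `faulthandler` module, which covers the failures pdb can't reach (segfaults, hard hangs); it can also be enabled programmatically:

```python
import faulthandler
import sys

faulthandler.enable()  # dump all thread tracebacks on SIGSEGV/SIGFPE/SIGABRT/SIGBUS
faulthandler.dump_traceback(file=sys.stderr)  # or dump on demand at any point
# faulthandler.dump_traceback_later(30, exit=False)  # watchdog: dump if still running in 30s
```

The watchdog variant is handy for diagnosing a TUI that freezes without raising.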