fix(kanban): deep-scan pass 2 — synthetic runs, event.run_id plumbing, invariant recovery, live drawer refresh

Second integration audit covering surfaces the first pass didn't hit.
Found eight issues spanning kernel, dashboard frontend, notifier, and CLI.
All behavioral / UX fixes; no schema change.

Kernel
  - complete_task on a never-claimed task (ready/blocked → done with no
    run in flight) silently discarded the summary/metadata/result
    because there was no run to attach them to. It now synthesizes a
    zero-duration run (started_at == ended_at) so attempt history is
    complete. Synthesis only fires when there is handoff data to
    persist; a bare complete_task(tid) still creates no run.
  - block_task on a never-claimed task had the same bug for --reason.
    Same fix: synthesize a zero-duration run when a reason is passed.
  - Event dataclass gained a `run_id: Optional[int] = None` field.
    list_events, unseen_events_for_sub, and the dashboard _event_dict
    were all SELECTing the column but dropping it on the way out,
    so downstream consumers couldn't group events by attempt. Every
    read path now surfaces run_id.
  - claim_task got a defensive invariant-recovery step: if somehow
    `current_run_id` is non-NULL on a task in 'ready' status (invariant
    violation from an unknown code path), close the leaked run as
    'reclaimed' inside the same txn as the new claim. No-op in the
    common case; belt-and-suspenders in case a future code path forgets
    to clear the pointer.
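
A minimal sketch of the two run-lifecycle fixes above (synthetic
zero-duration run on completion, leaked-run recovery on claim), using
sqlite3. The table and column names `tasks.current_run_id`, `task_runs`,
`started_at`, `ended_at`, and `outcome='reclaimed'` come from this commit;
the reduced schema, the `summary` column, the `'completed'` outcome value,
and the function signatures are assumptions for illustration, not the
kernel's actual code.

```python
import sqlite3
import time
from typing import Optional

# Assumed, heavily reduced schema; the real tables carry more columns.
SCHEMA = """
CREATE TABLE tasks (
    id INTEGER PRIMARY KEY,
    status TEXT NOT NULL,
    current_run_id INTEGER
);
CREATE TABLE task_runs (
    id INTEGER PRIMARY KEY,
    task_id INTEGER NOT NULL,
    started_at REAL NOT NULL,
    ended_at REAL,
    outcome TEXT,
    summary TEXT
);
"""


def complete_task(db: sqlite3.Connection, task_id: int,
                  summary: Optional[str] = None) -> Optional[int]:
    """Mark a task done; return the id of the run holding the handoff, if any."""
    with db:  # commits on success, rolls back on exception
        (run_id,) = db.execute(
            "SELECT current_run_id FROM tasks WHERE id = ?", (task_id,)
        ).fetchone()
        now = time.time()
        if run_id is not None:
            # Normal path: close the run that is in flight.
            db.execute(
                "UPDATE task_runs SET ended_at = ?, outcome = 'completed', "
                "summary = ? WHERE id = ?",
                (now, summary, run_id),
            )
        elif summary is not None:
            # Never-claimed task with handoff data: synthesize a
            # zero-duration run (started_at == ended_at).
            cur = db.execute(
                "INSERT INTO task_runs "
                "(task_id, started_at, ended_at, outcome, summary) "
                "VALUES (?, ?, ?, 'completed', ?)",
                (task_id, now, now, summary),
            )
            run_id = cur.lastrowid
        # A bare call with no run in flight creates no run at all.
        db.execute(
            "UPDATE tasks SET status = 'done', current_run_id = NULL "
            "WHERE id = ?",
            (task_id,),
        )
    return run_id


def claim_task(db: sqlite3.Connection, task_id: int) -> Optional[int]:
    """Claim a 'ready' task; close any leaked run in the same transaction."""
    with db:
        row = db.execute(
            "SELECT status, current_run_id FROM tasks WHERE id = ?", (task_id,)
        ).fetchone()
        if row is None or row[0] != "ready":
            return None
        now = time.time()
        if row[1] is not None:
            # Invariant violation: a 'ready' task still points at a run.
            db.execute(
                "UPDATE task_runs SET ended_at = ?, outcome = 'reclaimed' "
                "WHERE id = ? AND ended_at IS NULL",
                (now, row[1]),
            )
        cur = db.execute(
            "INSERT INTO task_runs (task_id, started_at) VALUES (?, ?)",
            (task_id, now),
        )
        db.execute(
            "UPDATE tasks SET status = 'running', current_run_id = ? "
            "WHERE id = ?",
            (cur.lastrowid, task_id),
        )
        return cur.lastrowid
```

Doing the recovery inside the same `with db:` block as the new claim means
a failure rolls back both the leaked run's closure and the new claim
together.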

Dashboard
  - GET /tasks/:id events array now carries run_id per event (via
    _event_dict).
  - WebSocket /events SELECT now includes run_id in the pushed event
    payload.
  - TaskDrawer reloads itself on live events for its own task id. The
    Board keeps a new `taskEventTick[taskId]` counter, incremented on
    each WS event for that task and passed down as the `eventTick`
    prop; the drawer's useEffect depends on it. Previously, when a
    background worker completed a task the user was viewing, the
    drawer showed stale data until a manual close/reopen.
  - CSS: added a `.hermes-kanban-run--ended` rule for the fallback class
    the JS emits when outcome is unset. The missing rule was harmless,
    just inconsistent.
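
With `run_id` now present on each event in the dashboard payloads, a
client can group a task's events by attempt. A small sketch, assuming each
event arrives as a dict with `kind` and `run_id` keys (the exact payload
shape beyond `run_id` is an assumption, and `group_events_by_run` is a
hypothetical helper):

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional


def group_events_by_run(
    events: List[Dict[str, Any]],
) -> Dict[Optional[int], List[str]]:
    """Bucket event kinds by run_id; events without a run land under None."""
    groups: Dict[Optional[int], List[str]] = defaultdict(list)
    for event in events:
        # .get() tolerates older payloads that lack the key entirely.
        groups[event.get("run_id")].append(event["kind"])
    return dict(groups)


# Example: one task-level event followed by a single attempt.
sample = [
    {"kind": "created", "run_id": None},
    {"kind": "claimed", "run_id": 7},
    {"kind": "completed", "run_id": 7},
]
grouped = group_events_by_run(sample)
```

Task-level events such as `created` carry no run, so they collect under
the `None` key rather than being attributed to an attempt.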

CLI
  - `hermes kanban watch --kinds` help text listed the legacy event
    name `spawn_auto_blocked`. The kernel migration renamed it to
    `gave_up`, so users typing the documented name got zero matches.
    The help text now shows the current lexicon
    (`completed,blocked,gave_up,crashed,timed_out`).

Tests (+6 in core functionality, +1 in dashboard plugin)
  - complete_never_claimed_task_synthesizes_run
  - block_never_claimed_task_synthesizes_run
  - complete_never_claimed_without_handoff_skips_synthesis
  - event_dataclass_carries_run_id (created.run_id None, completed.run_id matches)
  - unseen_events_for_sub_includes_run_id (notifier path)
  - claim_task_recovers_from_invariant_leak (engineer the leak, verify recovery)
  - event_dict_includes_run_id (dashboard API shape)

171/171 kanban suite tests pass under scripts/run_tests.sh. Live-smoke (isolated
HERMES_HOME via execute_code) exercised all six fixed paths plus the
claim-after-leak recovery sequence.

Docs
  - Runs section: new 'Synthetic runs for never-claimed completions'
    and 'Live drawer refresh' paragraphs explaining the invariants.
  - Event reference: `created` / `promoted` / `unblocked` entries now
    explicitly note `run_id` is `NULL`; `completed` / `blocked`
    describe synthetic-run fallback.
Teknium
2026-04-27 19:23:49 -07:00
parent 1c78f6627a
commit e27c819de3
8 changed files with 303 additions and 8 deletions

@@ -386,6 +386,10 @@ Runs are exposed on the dashboard (Run History section in the drawer, one colour
 **Reclaimed runs from status changes.** If you drag a running task off `running` in the dashboard (back to `ready`, or straight to `todo`), or archive a task that was still running, the in-flight run closes with `outcome='reclaimed'` rather than being orphaned. The `task_runs` row is always in a terminal state when `tasks.current_run_id` is `NULL`, and vice versa — that invariant holds across CLI, dashboard, dispatcher, and notifier.
+**Synthetic runs for never-claimed completions.** Completing or blocking a task that was never claimed (e.g. a human closes a `ready` task from the dashboard with a summary, or a CLI user runs `hermes kanban complete <ready-task> --summary X`) would otherwise drop the handoff. Instead the kernel inserts a zero-duration run row (`started_at == ended_at`) carrying the summary / metadata / reason so attempt history stays complete. The `completed` / `blocked` event's `run_id` points at that row.
+**Live drawer refresh.** When the dashboard's WebSocket event stream reports new events for the task the user is currently viewing, the drawer reloads itself (via a per-task event counter threaded into its `useEffect` dependency list). Closing and reopening is no longer required to see a run's new row or updated outcome.
 ### Forward compatibility
 Two nullable columns on `tasks` are reserved for v2 workflow routing: `workflow_template_id` (which template this task belongs to) and `current_step_key` (which step in that template is active). The v1 kernel ignores them for routing but lets clients write them, so a v2 release can add the routing machinery without another schema migration.
@@ -398,12 +402,12 @@ Every transition appends a row to `task_events`. Each row carries an optional `r
 | Kind | Payload | When |
 |---|---|---|
-| `created` | `{assignee, status, parents, tenant}` | Task inserted. |
-| `promoted` | — | `todo → ready` because all parents hit `done`. |
+| `created` | `{assignee, status, parents, tenant}` | Task inserted. `run_id` is `NULL`. |
+| `promoted` | — | `todo → ready` because all parents hit `done`. `run_id` is `NULL`. |
 | `claimed` | `{lock, expires, run_id}` | Dispatcher atomically claimed a `ready` task for spawn. |
-| `completed` | `{result_len, summary?}` | Worker wrote `--result` / `--summary` and task hit `done`. `summary` is the first-line handoff (400-char cap); full version lives on the run row. |
-| `blocked` | `{reason}` | Worker or human flipped the task to `blocked`. |
-| `unblocked` | — | `blocked → ready`, either manually or via `/unblock`. |
+| `completed` | `{result_len, summary?}` | Worker wrote `--result` / `--summary` and task hit `done`. `summary` is the first-line handoff (400-char cap); full version lives on the run row. If `complete_task` is called on a never-claimed task with handoff fields, a zero-duration run is synthesized so `run_id` still points at something. |
+| `blocked` | `{reason}` | Worker or human flipped the task to `blocked`. Synthesizes a zero-duration run when called on a never-claimed task with `--reason`. |
+| `unblocked` | — | `blocked → ready`, either manually or via `/unblock`. `run_id` is `NULL`. |
 | `archived` | — | Hidden from the default board. If the task was still running, carries the `run_id` of the run that was reclaimed as a side effect. |
 **Edits** (human-driven changes that aren't transitions):