hermes-agent/plugins
Teknium e27c819de3 fix(kanban): deep-scan pass 2 — synthetic runs, event.run_id plumbing, invariant recovery, live drawer refresh
Second integration audit covering surfaces the first pass didn't hit.
Found eight issues spanning kernel, dashboard frontend, notifier, and CLI.
All behavioral / UX fixes; no schema change.

Kernel
  - complete_task on a never-claimed task (ready/blocked → done with no
    run in flight) silently discarded the summary/metadata/result
    because there was no run to attach them to. It now synthesizes a
    zero-duration run (started_at == ended_at) so attempt history is
    complete. Synthesis only happens when there is handoff data to
    persist; a bare complete_task(tid) still creates no run.
  - block_task on a never-claimed task had the same bug for --reason.
    Same fix: synthesize a zero-duration run when a reason is passed.
  - Event dataclass gained a `run_id: Optional[int] = None` field.
    list_events, unseen_events_for_sub, and the dashboard _event_dict
    were all SELECTing the column but dropping it on the way out,
    so downstream consumers couldn't group events by attempt. Every
    read path now surfaces run_id.
  - claim_task got a defensive invariant-recovery step: if somehow
    `current_run_id` is non-NULL on a task in 'ready' status (invariant
    violation from an unknown code path), close the leaked run as
    'reclaimed' inside the same txn as the new claim. No-op in the
    common case; belt-and-suspenders in case a future code path forgets
    to clear the pointer.
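
A minimal sketch of the two run-lifecycle fixes above, for illustration only. The schema, the column names other than `current_run_id`, the 'running' status, and the 'completed' outcome are assumptions; the real kernel tables and function signatures are not shown in this commit.

```python
import sqlite3
import time
from typing import Optional

# Hypothetical minimal schema; the real kernel schema may differ.
SCHEMA = """
CREATE TABLE tasks (
    id INTEGER PRIMARY KEY, status TEXT NOT NULL, current_run_id INTEGER
);
CREATE TABLE runs (
    id INTEGER PRIMARY KEY, task_id INTEGER NOT NULL,
    started_at REAL NOT NULL, ended_at REAL, outcome TEXT, summary TEXT
);
"""


def complete_task(db: sqlite3.Connection, task_id: int,
                  summary: Optional[str] = None) -> Optional[int]:
    """Mark a task done; return the id of the run holding the summary, if any."""
    with db:  # single transaction: commit on success, roll back on error
        (run_id,) = db.execute(
            "SELECT current_run_id FROM tasks WHERE id = ?", (task_id,)
        ).fetchone()
        now = time.time()
        if run_id is not None:
            # Normal path: close the run that is in flight.
            db.execute(
                "UPDATE runs SET ended_at = ?, outcome = 'completed', summary = ? "
                "WHERE id = ?", (now, summary, run_id))
        elif summary is not None:
            # Never-claimed task with handoff data: synthesize a
            # zero-duration run (started_at == ended_at).
            run_id = db.execute(
                "INSERT INTO runs (task_id, started_at, ended_at, outcome, summary) "
                "VALUES (?, ?, ?, 'completed', ?)", (task_id, now, now, summary)
            ).lastrowid
        # Bare complete_task(tid) falls through: no run is created.
        db.execute(
            "UPDATE tasks SET status = 'done', current_run_id = NULL WHERE id = ?",
            (task_id,))
    return run_id


def claim_task(db: sqlite3.Connection, task_id: int) -> int:
    """Claim a ready task, closing any leaked run in the same transaction."""
    with db:
        status, leaked = db.execute(
            "SELECT status, current_run_id FROM tasks WHERE id = ?", (task_id,)
        ).fetchone()
        if status != "ready":
            raise ValueError(f"task {task_id} is not ready (status={status!r})")
        now = time.time()
        if leaked is not None:
            # Invariant violation: a ready task should have no current run.
            db.execute(
                "UPDATE runs SET ended_at = ?, outcome = 'reclaimed' WHERE id = ?",
                (now, leaked))
        new_run = db.execute(
            "INSERT INTO runs (task_id, started_at) VALUES (?, ?)", (task_id, now)
        ).lastrowid
        db.execute(
            "UPDATE tasks SET status = 'running', current_run_id = ? WHERE id = ?",
            (new_run, task_id))
    return new_run
```

Closing the leaked run and opening the new one inside one `with db:` block means a failure in either step rolls back both, which is the property the bullet describes as "inside the same txn as the new claim".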

Dashboard
  - GET /tasks/:id events array now carries run_id per event (via
    _event_dict).
  - WebSocket /events SELECT now includes run_id in the pushed event
    payload.
  - TaskDrawer reloads itself on live events for its own task id. New
    `taskEventTick[taskId]` state in the Board, incremented on every
    WS event, passed down as `eventTick` prop; drawer's useEffect
    depends on it. Previously, background workers completing a task
    the user was viewing left the drawer showing stale data until
    manual close/reopen.
  - CSS: added `.hermes-kanban-run--ended` rule for the fallback class
    the JS emits when outcome is unset. The missing rule was harmless,
    just inconsistent.
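
As a rough illustration of what carrying run_id through the read paths enables, here is a sketch of an event payload and a consumer that groups events by attempt. Only the `run_id: Optional[int] = None` field comes from this commit; the other `Event` fields, `event_dict`, and `group_by_run` are hypothetical stand-ins, not the actual dataclass or `_event_dict` helper.

```python
from collections import defaultdict
from dataclasses import asdict, dataclass
from typing import Dict, List, Optional


@dataclass
class Event:
    # id, task_id, and kind are assumed fields for this sketch.
    id: int
    task_id: int
    kind: str
    run_id: Optional[int] = None  # None for events not tied to an attempt


def event_dict(event: Event) -> dict:
    """Serialize an event for an API or WebSocket payload, keeping run_id."""
    return asdict(event)


def group_by_run(events: List[dict]) -> Dict[Optional[int], List[str]]:
    """Group event kinds by the attempt (run) they belong to."""
    grouped: Dict[Optional[int], List[str]] = defaultdict(list)
    for e in events:
        grouped[e["run_id"]].append(e["kind"])
    return dict(grouped)
```

With `run_id` present in every payload, a consumer can separate task-level events (`run_id` of `None`, such as `created`) from per-attempt events without a second query.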

CLI
  - `hermes kanban watch --kinds` help text listed the legacy event
    name `spawn_auto_blocked`. The kernel migration renames it to
    `gave_up`, so users typing the documented name got zero matches.
    Now shows the current lexicon (`completed,blocked,gave_up,
    crashed,timed_out`).

Tests (+6 in core functionality, +1 in dashboard plugin)
  - complete_never_claimed_task_synthesizes_run
  - block_never_claimed_task_synthesizes_run
  - complete_never_claimed_without_handoff_skips_synthesis
  - event_dataclass_carries_run_id (created.run_id None, completed.run_id matches)
  - unseen_events_for_sub_includes_run_id (notifier path)
  - claim_task_recovers_from_invariant_leak (engineer the leak, verify recovery)
  - event_dict_includes_run_id (dashboard API shape)

171/171 kanban suite tests pass under scripts/run_tests.sh. Live-smoke (isolated
HERMES_HOME via execute_code) exercised all six fixed paths plus the
claim-after-leak recovery sequence.

Docs
  - Runs section: new 'Synthetic runs for never-claimed completions'
    and 'Live drawer refresh' paragraphs explaining the invariants.
  - Event reference: `created` / `promoted` / `unblocked` entries now
    explicitly note `run_id` is `NULL`; `completed` / `blocked`
    describe synthetic-run fallback.
2026-04-27 19:23:49 -07:00