feat(kanban): runs as first-class (v1); structured handoffs; forward-compat for v2 workflows

Addresses vulcan-artivus's RFC review on issue #16102. Picks up the structural changes that are expensive to retrofit later and zero-cost to land now; defers workflow-template routing + per-stage lanes to v2 (kept forward-compat hooks in the schema).

Kernel
- New `task_runs` table. Each claim opens a run (pid, claim_lock, heartbeat, max_runtime, started_at), each terminal transition closes it with an outcome (completed / blocked / crashed / timed_out / spawn_failed / gave_up / reclaimed). Multiple rows per task when retries happen, preserving full attempt history.
- `tasks.current_run_id` points at the active run (NULL when idle); denormalised for cheap reads.
- `task_events.run_id` carries the run a given event belongs to so UIs group events by attempt. claim/spawned/complete/block/crash/timeout/spawn_fail/gave_up/heartbeat events are all run-scoped; created/promoted/assigned/edited stay task-scoped (run_id=NULL).
- Legacy DBs: migration adds the columns + indexes + synthesizes a run row for any task that's 'running' before the runs table existed, so subsequent complete/heartbeat/reclaim calls have a target. Idempotent.

Structured handoff
- `complete_task(summary=, metadata=)` persists both on the closing run. `summary` falls back to `result` when omitted so single-run callers don't duplicate. `metadata` is a free-form dict ({changed_files, tests_run, findings, ...}).
- `build_worker_context` rewrites: now reads "Prior attempts on this task" (closed runs: outcome, summary, error, metadata) and "Parent task results" pulls run.summary + run.metadata of the most-recent completed run per parent, falling back to task.result for legacy rows without runs. Retrying workers see why earlier attempts failed; downstream workers see parent handoffs structurally, not as loose `result` strings.

CLI
- `hermes kanban complete <id> --summary "..." --metadata '{"files":1}'`. JSON is parsed and rejected with exit-2 if malformed.
- New `hermes kanban runs <id> [--json]` verb. Shows per-run rows: outcome, profile, elapsed, summary, error. JSON mode serializes the full run dataclass for scripting.

Dashboard plugin
- GET /tasks/:id now carries a runs[] array alongside task / events / comments / links. Each run serialised with outcome, summary, metadata, worker_pid, elapsed fields.
- New Run History section in the drawer. Outcome-coloured left border (green=active, blue=completed, amber=reclaimed, red=crashed/timed_out/gave_up/blocked). Collapsed when >3 runs with a '+N earlier' toggle. Shows summary + error + metadata inline.

Forward-compat for v2 (vulcan's workflow templates + stages)
- `tasks.workflow_template_id` and `tasks.current_step_key` added as nullable columns. v1 kernel ignores them for routing; v2 will add workflow_templates + workflow_steps tables and wire the dispatcher to consult them. task_runs has a matching `step_key` column. Lets a v2 release land additively without another schema migration.

Tests (+22 in test_kanban_core_functionality.py, +2 in dashboard)
- run_created_on_claim / run_closed_on_complete_with_summary
- run_summary_falls_back_to_result
- multiple_attempts_preserved_as_runs (3 attempts: reclaimed → crashed → completed, all visible in list_runs)
- run_on_block_with_reason / run_on_spawn_failure_records_failed_runs (5 spawn_failed runs + 1 gave_up run)
- event_rows_carry_run_id (task-scoped vs run-scoped split)
- build_worker_context_includes_prior_attempts
- build_worker_context_uses_parent_run_summary (metadata JSON in context)
- migration_backfills_inflight_run_for_legacy_db (simulates a pre-migration running task, re-runs init_db, asserts backfill)
- forward_compat_columns_writable
- cli_runs_verb + cli_runs_json
- cli_complete_with_summary_and_metadata (JSON round-trip through shlex + argparse)
- cli_complete_bad_metadata_exits_nonzero
- task_detail_includes_runs / task_detail_runs_empty_before_claim

269/269 kanban suite pass under scripts/run_tests.sh.

Live-smoke covered: single-attempt complete → run closed + summary persisted; retry scenario → two runs visible (blocked + completed); parent run summary + metadata surfaced to child via build_worker_context; forward-compat columns writable via UPDATE; GET /tasks/:id returns runs[].

Docs
- New 'Runs — one row per attempt' section in kanban.md: the why (full attempt history, structured metadata), the two-table model (task is logical, run is execution), the structured handoff shape (--summary / --metadata), example CLI + dashboard output, forward-compat note for v2.
- Event reference updated to mention task_events.run_id.
- CLI reference gains 'hermes kanban runs <id>'.

Not in v1 (deferred to v2):
- Workflow templates (workflow_templates + workflow_steps tables, stage-based routing, success/failure step links).
- 'stage' as a distinct axis from status in the UI.
- Shared-by-default workspace binding across stages of the same workflow run.
- Pipeline replacement for the kanban-orchestrator skill (the orchestrator's 'decompose, don't execute' guidance is still correct; it becomes partly redundant once workflows land).
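
Illustration (not part of the change): a minimal sketch of the run lifecycle described under Kernel and Structured handoff, written against an in-memory SQLite database. The schema is deliberately cut down, and the helper names `claim` and `close_run` are hypothetical; the real kernel functions, columns, and signatures may differ. It only shows the shape: a claim opens a run and sets `tasks.current_run_id`, a terminal transition closes the run with an outcome and clears the pointer, and `summary` falls back to `result` when omitted.

```python
import json
import sqlite3

# Hypothetical, simplified schema: only the columns needed for the lifecycle.
SCHEMA = """
CREATE TABLE tasks (
    id TEXT PRIMARY KEY,
    status TEXT NOT NULL,
    result TEXT,
    current_run_id INTEGER          -- NULL when no run is active
);
CREATE TABLE task_runs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    task_id TEXT NOT NULL,
    outcome TEXT,                   -- NULL while the run is still open
    summary TEXT,
    metadata TEXT                   -- JSON-encoded dict, or NULL
);
"""


def claim(db, task_id):
    """Open a run for the task and point tasks.current_run_id at it."""
    cur = db.execute("INSERT INTO task_runs (task_id) VALUES (?)", (task_id,))
    db.execute(
        "UPDATE tasks SET status = 'running', current_run_id = ? WHERE id = ?",
        (cur.lastrowid, task_id),
    )
    return cur.lastrowid


def close_run(db, task_id, outcome, status, result=None, summary=None, metadata=None):
    """Close the active run with an outcome and clear the task's pointer."""
    (run_id,) = db.execute(
        "SELECT current_run_id FROM tasks WHERE id = ?", (task_id,)
    ).fetchone()
    # Assumed fallback rule from the commit message: summary defaults to result.
    effective_summary = summary if summary is not None else result
    encoded = json.dumps(metadata) if metadata is not None else None
    db.execute(
        "UPDATE task_runs SET outcome = ?, summary = ?, metadata = ? WHERE id = ?",
        (outcome, effective_summary, encoded, run_id),
    )
    db.execute(
        "UPDATE tasks SET status = ?, result = ?, current_run_id = NULL WHERE id = ?",
        (status, result, task_id),
    )


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
db.execute("INSERT INTO tasks (id, status) VALUES ('t_demo', 'ready')")

# Attempt 1 blocks; attempt 2 completes with a result and metadata but no summary.
claim(db, "t_demo")
close_run(db, "t_demo", outcome="blocked", status="blocked",
          summary="need decision on rate-limit key")
claim(db, "t_demo")
close_run(db, "t_demo", outcome="completed", status="done",
          result="rate limiter shipped", metadata={"tests_run": 14})

runs = db.execute(
    "SELECT outcome, summary, metadata FROM task_runs WHERE task_id = 't_demo' ORDER BY id"
).fetchall()
(pointer,) = db.execute(
    "SELECT current_run_id FROM tasks WHERE id = 't_demo'"
).fetchone()
```

After the two attempts, `runs` holds one row per attempt, the second row's summary equals the task's `result`, and `pointer` is NULL again because no run is active.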
2026-05-04 09:47:54 +08:00 · 2026-04-27 06:54:19 -07:00
parent da7d09c3b6
commit 0146cb2bd2
8 changed files with 1172 additions and 47 deletions
--- a/website/docs/user-guide/features/kanban.md
+++ b/website/docs/user-guide/features/kanban.md
@@ -283,6 +283,7 @@ hermes kanban tail <id>                                # follow a single task's
 hermes kanban watch [--assignee P] [--tenant T]        # live stream ALL events to the terminal
        [--kinds completed,blocked,…] [--interval SECS]
 hermes kanban heartbeat <id> [--note "..."]            # worker liveness signal for long ops
+hermes kanban runs <id> [--json]                       # attempt history (one row per run)
 hermes kanban assignees [--json]                       # profiles on disk + per-assignee task counts
 hermes kanban dispatch [--dry-run] [--max N]           # one-shot pass
        [--failure-limit N] [--json]
@@ -349,9 +350,45 @@ hermes kanban notify-unsubscribe t_abcd \

 A subscription removes itself automatically once the task reaches `done` or `archived`; no cleanup needed.

+## Runs — one row per attempt
+
+A task is a logical unit of work; a **run** is one attempt to execute it. When the dispatcher claims a ready task it creates a row in `task_runs` and points `tasks.current_run_id` at it. When that attempt ends (completed, blocked, crashed, timed out, spawn-failed, gave up, or reclaimed), the run row closes with an `outcome` and the task's pointer clears. A task that's been attempted three times has three `task_runs` rows.
+
+Why two tables instead of just mutating the task: you need **full attempt history** for real-world postmortems ("the second reviewer attempt got to approve, the third merged"), and you need a clean place to hang per-attempt metadata — which files changed, which tests ran, which findings a reviewer noted. Those are run facts, not task facts.
+
+Runs are also where **structured handoff** lives. When a worker completes a task it can pass:
+
+- `--result "<short log line>"` — goes on the task row as before (for back-compat).
+- `--summary "<human handoff>"` — goes on the run; downstream children see it in their `build_worker_context`.
+- `--metadata '{"changed_files": [...], "tests_run": 12}'` — JSON dict on the run; children see it serialized alongside the summary.
+
+Downstream children read the most recent completed run's summary + metadata for each parent. Retrying workers read the prior attempts on their own task (outcome, summary, error) so they don't repeat a path that already failed.
+
+```bash
+# Worker completes with a structured handoff:
+hermes kanban complete t_abcd \
+    --result "rate limiter shipped" \
+    --summary "implemented token bucket, keys on user_id with IP fallback, all tests pass" \
+    --metadata '{"changed_files": ["limiter.py", "tests/test_limiter.py"], "tests_run": 14}'
+
+# Review the attempt history on a retried task:
+hermes kanban runs t_abcd
+#   #  OUTCOME       PROFILE           ELAPSED  STARTED
+#   1  blocked       worker               12s  2026-04-27 14:02
+#        → BLOCKED: need decision on rate-limit key
+#   2  completed     worker                8m   2026-04-27 15:18
+#        → implemented token bucket, keys on user_id with IP fallback
+```
+
+Runs are exposed on the dashboard (Run History section in the drawer, one coloured row per attempt) and on the REST API (`GET /api/plugins/kanban/tasks/:id` returns a `runs[]` array). `task_events` rows carry the `run_id` they belong to so the UI can group them by attempt.
+
+### Forward compatibility
+
+Two nullable columns on `tasks` are reserved for v2 workflow routing: `workflow_template_id` (which template this task belongs to) and `current_step_key` (which step in that template is active). The v1 kernel ignores them for routing but lets clients write them, so a v2 release can add the routing machinery without another schema migration.
+
 ## Event reference

-Every transition appends a row to `task_events`. The kinds group into three clusters so filtering is easy (`hermes kanban watch --kinds completed,gave_up,timed_out`):
+Every transition appends a row to `task_events`. Each row carries an optional `run_id` so UIs can group events by attempt. Kinds group into three clusters so filtering is easy (`hermes kanban watch --kinds completed,gave_up,timed_out`):

 **Lifecycle** (what changed about the task as a logical unit):