hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-04 09:47:54 +08:00

Author	SHA1	Message	Date
teknium1	832ecde4b0	feat(kanban): structured tool surface for worker + orchestrator agents Seven new tools in `tools/kanban_tools.py` that give kanban workers a backend-portable, schema-filtered way to interact with the board from inside their own Python process — no shelling out to `hermes kanban`. Motivation The CLI path (`hermes kanban complete \$TASK --summary ...`) breaks on any remote terminal backend (Docker, Modal, Singularity, SSH). The terminal tool runs `hermes kanban` inside the container, where `hermes` isn't installed and `~/.hermes/kanban.db` isn't mounted. Tools run in the agent's own Python process, so they always reach the board regardless of backend. Also skips shell-quoting fragility on --metadata JSON and gives structured error returns the model can reason about. The seven tools kanban_show read current task (defaults to HERMES_KANBAN_TASK) kanban_complete structured handoff: summary + metadata kanban_block ask for human input kanban_heartbeat signal liveness during long operations kanban_comment append to task thread kanban_create fan out into child tasks (orchestrator path) kanban_link add parent→child dependency after the fact Gating Each tool's check_fn returns True iff HERMES_KANBAN_TASK is set in the process env. The dispatcher sets it when spawning a worker; normal `hermes chat` sessions never have it. Empirically verified: a baseline hermes-cli schema is 27 tools; with HERMES_KANBAN_TASK set it grows to exactly 34 (+7). Zero leak into normal sessions. Also set HERMES_PROFILE in the spawn env so the kanban_comment tool's author default works cleanly (it's what the tool reads to attribute comments). Skill updates - `skills/devops/kanban-worker/SKILL.md`: lifecycle rewritten to use kanban_show / kanban_heartbeat / kanban_block / kanban_complete / kanban_comment / kanban_create directly. CLI fallback section added for human operators / scripts. - `skills/devops/kanban-orchestrator/SKILL.md`: all examples ported from CLI to tool form; top-banner note explaining tools are the primary surface. kanban_create / kanban_link throughout. Docs `website/docs/user-guide/features/kanban.md`: new "How workers interact with the board" section explaining the tool surface, gating mechanism, and why tools vs CLI. The worker skill / orchestrator skill subsections are now nested under it. Tests (+25 in tests/tools/test_kanban_tools.py) - Schema gating: kanban_tools_hidden_without_env_var, kanban_tools_visible_with_env_var. - Happy paths: show (default + explicit task_id), complete (with summary+metadata, with result only), block, heartbeat (with and without note), comment (default + custom author), create (with list parents, with string parent), link. - Error paths: complete rejects no-handoff and non-dict metadata, block rejects empty reason, comment rejects empty body, create rejects no title / no assignee / non-list parents, link rejects self-reference / missing args / cycles. - End-to-end: full worker lifecycle driven entirely through the tools, verified against DB state. 214/214 kanban suite pass under scripts/run_tests.sh.	2026-04-28 04:30:22 -07:00
Teknium	be184aa5fa	fix(kanban): close the two v2-flagged issues in v1 Both items the atypical-scenarios pass flagged as "v2 follow-up" actually belong in v1. Fixed now. Fix 1: workspace path traversal resolve_workspace now rejects non-absolute paths for all three workspace_kinds (scratch-with-explicit-path, dir:, worktree). A relative path like '../../../tmp/attacker' was being silently resolved against the dispatcher's CWD — a confused-deputy escape. Error message points users at the absolute-path requirement. Storage remains verbatim (kernel doesn't rewrite user input); the refusal happens at resolution time, so the dispatcher's existing spawn-failure circuit breaker correctly categorizes it. Threat model documented in website/docs/user-guide/features/kanban.md: single-host, trusted-local-user. The absolute-path rule prevents ambiguity-driven escape, not malicious access — kanban runs as you, with your uid, on your filesystem. Fix 2: build_worker_context unbounded Added per-section caps so worker prompts stay bounded on pathological boards: _CTX_MAX_PRIOR_ATTEMPTS = 10 most-recent N runs shown; older collapsed into "N earlier attempts omitted" marker. Attempt numbering preserved (shows "Attempt 16" not renumbered). _CTX_MAX_COMMENTS = 30 same pattern for comments. _CTX_MAX_FIELD_BYTES = 4 KB per summary / error / metadata / result. _CTX_MAX_BODY_BYTES = 8 KB per task.body (opening post). _CTX_MAX_COMMENT_BYTES = 2 KB per comment. Truncation uses a visible ellipsis + char-count so the worker knows it's been truncated. Effect on atypical-scenario runs: huge_run_count_on_one_task (1000 runs): 63 KB → 820 chars comment_storm (1000 comments): 50 KB → 1,671 chars Tests (+6 in main suite) test_resolve_workspace_rejects_relative_dir_path — relative dir: path stored verbatim but refused at resolve. test_resolve_workspace_accepts_absolute_dir_path — legitimate absolute paths are created and returned. test_resolve_workspace_rejects_relative_worktree_path — same guard for worktree kind. test_build_worker_context_caps_prior_attempts — 25 runs → exactly _CTX_MAX_PRIOR_ATTEMPTS shown, omitted marker present, attempt numbering preserves original index. test_build_worker_context_caps_comments — 100 comments → 30 shown, 70 in the omitted marker. test_build_worker_context_caps_huge_summary — 1 MB summary on a prior run → context under 10 KB total, truncation marker visible. 189/189 kanban suite pass. Atypical-scenarios stress script still passes all 28 scenarios with the new caps in effect.	2026-04-28 01:20:08 -07:00
Teknium	7206eed319	docs(kanban): add step-by-step tutorial with 10 dashboard screenshots New website/docs/user-guide/features/kanban-tutorial.md walks four user stories end-to-end, each backed by a real screenshot of the dashboard running against seeded data. Stories 1. Solo dev shipping a feature (parent->child dependencies, structured handoff, run history rendering). 2. Fleet farming (parallel independent tasks across 3 assignees, lanes-by-profile grouping, dispatcher daemon). 3. Role pipeline with retry (PM spec -> eng implements -> review blocks -> eng retries -> review approves; two-run history visible in the drawer; downstream workers pull parent summary+metadata). 4. Circuit breaker + crash recovery (2 spawn_failed + 1 gave_up for a deploy with missing creds; 1 crashed + 1 completed for an OOM-killed migration that recovered on retry). Each story shows both CLI commands and the dashboard drawer equivalent. Screenshots captured via playwright + chromium at 2x device scale, then repalettized with PIL (22MB -> 6.1MB for the 10-image set, no visible quality loss verified against vision). Side updates - website/sidebars.ts: added kanban-tutorial under features. - website/docs/user-guide/features/kanban.md: prefix banner linking new readers to the tutorial before the reference. All image references validate: `/img/kanban-tutorial/*` maps to website/static/img/kanban-tutorial/ (10 files). Docusaurus build not run locally (no node_modules in worktree); CI build on merge will confirm.	2026-04-27 20:28:53 -07:00
Teknium	e27c819de3	fix(kanban): deep-scan pass 2 — synthetic runs, event.run_id plumbing, invariant recovery, live drawer refresh Second integration audit covering surfaces the first pass didn't hit. Found eight issues spanning kernel, dashboard frontend, notifier, and CLI. All behavioral / UX fixes; no schema change. Kernel - complete_task on a never-claimed task (ready/blocked → done with no run in flight) was silently dropping the summary/metadata/result onto a non-existent run. Now synthesizes a zero-duration run (started_at == ended_at) so attempt history is complete. Only fires when there's actually handoff data to persist — bare complete_task(tid) remains a no-op for run creation. - block_task on a never-claimed task had the same bug for --reason. Same fix: synthesize a zero-duration run when a reason is passed. - Event dataclass gained a `run_id: Optional[int] = None` field. list_events, unseen_events_for_sub, and the dashboard _event_dict were all SELECTing the column but dropping it on the way out, so downstream consumers couldn't group events by attempt. Every read path now surfaces run_id. - claim_task got a defensive invariant-recovery step: if somehow `current_run_id` is non-NULL on a task in 'ready' status (invariant violation from an unknown code path), close the leaked run as 'reclaimed' inside the same txn as the new claim. No-op in the common case; belt-and-suspenders in case a future code path forgets to clear the pointer. Dashboard - GET /tasks/:id events array now carries run_id per event (via _event_dict). - WebSocket /events SELECT now includes run_id in the pushed event payload. - TaskDrawer reloads itself on live events for its own task id. New `taskEventTick[taskId]` state in the Board, incremented on every WS event, passed down as `eventTick` prop; drawer's useEffect depends on it. Previously, background workers completing a task the user was viewing left the drawer showing stale data until manual close/reopen. - CSS: added `.hermes-kanban-run--ended` rule for the fallback class the JS emits when outcome is unset. Harmless before; just inconsistent. CLI - `hermes kanban watch --kinds` help text listed the legacy event name `spawn_auto_blocked`. The kernel migration renames it to `gave_up`, so users typing the documented name got zero matches. Now shows the current lexicon (`completed,blocked,gave_up, crashed,timed_out`). Tests (+6 in core functionality, +1 in dashboard plugin) - complete_never_claimed_task_synthesizes_run - block_never_claimed_task_synthesizes_run - complete_never_claimed_without_handoff_skips_synthesis - event_dataclass_carries_run_id (created.run_id None, completed.run_id matches) - unseen_events_for_sub_includes_run_id (notifier path) - claim_task_recovers_from_invariant_leak (engineer the leak, verify recovery) - event_dict_includes_run_id (dashboard API shape) 171/171 kanban suite pass under scripts/run_tests.sh. Live-smoke (isolated HERMES_HOME via execute_code) exercised all six fixed paths plus the claim-after-leak recovery sequence. Docs - Runs section: new 'Synthetic runs for never-claimed completions' and 'Live drawer refresh' paragraphs explaining the invariants. - Event reference: `created` / `promoted` / `unblocked` entries now explicitly note `run_id` is `NULL`; `completed` / `blocked` describe synthetic-run fallback.	2026-04-27 19:23:49 -07:00
Teknium	1c78f6627a	docs(kanban): document audit-pass invariants — bulk-close guard, reclaimed-on-status-change, completed event carries summary - Runs section: dashboard PATCH parity (summary/metadata forward), `completed` event embeds first-line summary for notifiers, bulk --summary/--metadata refused, archive/drag-drop reclaim semantics. - Event reference: added Payload column to Lifecycle and Edits tables; called out the invariant that `status` carries run_id when closing a reclaimed run.	2026-04-27 08:46:08 -07:00
Teknium	0146cb2bd2	feat(kanban): runs as first-class (v1); structured handoffs; forward-compat for v2 workflows Addresses vulcan-artivus's RFC review on issue #16102. Picks up the structural changes that are expensive to retrofit later and zero-cost to land now; defers workflow-template routing + per-stage lanes to v2 (kept forward-compat hooks in the schema). Kernel - New `task_runs` table. Each claim opens a run (pid, claim_lock, heartbeat, max_runtime, started_at), each terminal transition closes it with an outcome (completed / blocked / crashed / timed_out / spawn_failed / gave_up / reclaimed). Multiple rows per task when retries happen, preserving full attempt history. - `tasks.current_run_id` points at the active run (NULL when idle); denormalised for cheap reads. - `task_events.run_id` carries the run a given event belongs to so UIs group events by attempt. claim/spawned/complete/block/crash/ timeout/spawn_fail/gave_up/heartbeat events are all run-scoped; created/promoted/assigned/edited stay task-scoped (run_id=NULL). - Legacy DBs: migration adds the columns + indexes + synthesizes a run row for any task that's 'running' before the runs table existed, so subsequent complete/heartbeat/reclaim calls have a target. Idempotent. Structured handoff - `complete_task(summary=, metadata=)` persists both on the closing run. `summary` falls back to `result` when omitted so single-run callers don't duplicate. `metadata` is a free-form dict ({changed_files, tests_run, findings, ...}). - `build_worker_context` rewrites: now reads "Prior attempts on this task" (closed runs: outcome, summary, error, metadata) and "Parent task results" pulls run.summary + run.metadata of the most-recent completed run per parent, falling back to task.result for legacy rows without runs. Retrying workers see why earlier attempts failed; downstream workers see parent handoffs structurally, not as loose `result` strings. CLI - `hermes kanban complete <id> --summary "..." --metadata '{"files":1}'`. JSON is parsed and rejected with exit-2 if malformed. - New `hermes kanban runs <id> [--json]` verb. Shows per-run rows: outcome, profile, elapsed, summary, error. JSON mode serializes the full run dataclass for scripting. Dashboard plugin - GET /tasks/:id now carries a runs[] array alongside task / events / comments / links. Each run serialised with outcome, summary, metadata, worker_pid, elapsed fields. - New Run History section in the drawer. Outcome-coloured left border (green=active, blue=completed, amber=reclaimed, red=crashed/timed_out/gave_up/blocked). Collapsed when >3 runs with a '+N earlier' toggle. Shows summary + error + metadata inline. Forward-compat for v2 (vulcan's workflow templates + stages) - `tasks.workflow_template_id` and `tasks.current_step_key` added as nullable columns. v1 kernel ignores them for routing; v2 will add workflow_templates + workflow_steps tables and wire the dispatcher to consult them. task_runs has a matching `step_key` column. Lets a v2 release land additively without another schema migration. Tests (+22 in test_kanban_core_functionality.py, +2 in dashboard) - run_created_on_claim / run_closed_on_complete_with_summary - run_summary_falls_back_to_result - multiple_attempts_preserved_as_runs (3 attempts: reclaimed → crashed → completed, all visible in list_runs) - run_on_block_with_reason / run_on_spawn_failure_records_failed_runs (5 spawn_failed runs + 1 gave_up run) - event_rows_carry_run_id (task-scoped vs run-scoped split) - build_worker_context_includes_prior_attempts - build_worker_context_uses_parent_run_summary (metadata JSON in context) - migration_backfills_inflight_run_for_legacy_db (simulates a pre-migration running task, re-runs init_db, asserts backfill) - forward_compat_columns_writable - cli_runs_verb + cli_runs_json - cli_complete_with_summary_and_metadata (JSON round-trip through shlex + argparse) - cli_complete_bad_metadata_exits_nonzero - task_detail_includes_runs / task_detail_runs_empty_before_claim 269/269 kanban suite pass under scripts/run_tests.sh. Live-smoke covered: single-attempt complete → run closed + summary persisted; retry scenario → two runs visible (blocked + completed); parent run summary + metadata surfaced to child via build_worker_context; forward-compat columns writable via UPDATE; GET /tasks/:id returns runs[]. Docs - New 'Runs — one row per attempt' section in kanban.md: the why (full attempt history, structured metadata), the two-table model (task is logical, run is execution), the structured handoff shape (--summary / --metadata), example CLI + dashboard output, forward-compat note for v2. - Event reference updated to mention task_events.run_id. - CLI reference gains 'hermes kanban runs <id>'. Not in v1 (deferred to v2): - Workflow templates (workflow_templates + workflow_steps tables, stage-based routing, success/failure step links). - 'stage' as a distinct axis from status in the UI. - Shared-by-default workspace binding across stages of the same workflow run. - Pipeline replacement for the kanban-orchestrator skill (the orchestrator's 'decompose, don't execute' guidance is still correct; it becomes partly redundant once workflows land).	2026-04-27 06:54:19 -07:00
Teknium	da7d09c3b6	feat(kanban): max-runtime timeouts, worker heartbeats, assignees picker, event vocab cleanup Ports four items from the Multica audit (https://github.com/multica-ai/multica). Dropped their cross-host server/daemon architecture and their Postgres+pgvector skill search — both the wrong shape for our single-host SQLite kernel. 1. Per-task max-runtime (`max_runtime_seconds` column) - New kernel function `enforce_max_runtime(conn)` runs in every dispatch tick. When a running task's elapsed time exceeds the cap, we SIGTERM the worker, wait a 5 s grace (polling _pid_alive), then SIGKILL. The task goes back to 'ready' with a `timed_out` event and re-queues on the next tick (unless the spawn-failure circuit breaker has already parked it). - Host-local only: lock prefix must match this host's claimer_id so we never signal a PID on another machine. - CLI: `hermes kanban create --max-runtime 30m \| 2h \| 1d \| <seconds>`. New `_parse_duration` helper accepts s/m/h/d suffixes or bare integers. - Dashboard POST body + the card's `max_runtime_seconds` field. 2. Worker heartbeat (`last_heartbeat_at` column, `heartbeat` event) - `heartbeat_worker(conn, task_id, note=None)` emits the event and touches last_heartbeat_at. Refused when the task isn't running. - CLI: `hermes kanban heartbeat <id> [--note "..."]`. - kanban-worker skill instructs workers to heartbeat during long loops (training runs, encodes, crawls, batch uploads). - Separate signal from PID crash detection: a worker's Python can still be alive while the actual work process is stuck. Heartbeat absence is diagnostic; future work can auto-block on stale heartbeats but v1 just surfaces the signal. 3. Assignee enumeration (`known_assignees`, `list_profiles_on_disk`) - Scans ~/.hermes/profiles/ for dirs containing config.yaml + unions with current assignees on the board. Each entry returns {name, on_disk, counts: {status: n}}. - CLI: `hermes kanban assignees [--json]`. Also hooked into `hermes kanban init` which now prints discovered profiles so new installs see 'these are the assignees you can target' immediately. - Dashboard: GET /api/plugins/kanban/assignees for the picker. 4. Event vocab cleanup (three renames + three new kinds) - `ready` → `promoted` (fires when deps clear; clearer semantic). - `priority` → `reprioritized` (past-tense verb, matches others). - `spawn_auto_blocked` → `gave_up` (short, memorable; the circuit breaker gave up on this task). - New: `spawned` (emitted with {pid} on successful spawn), `heartbeat` ({note?}), `timed_out` ({pid, elapsed_seconds, limit_seconds, sigkill}). - One-shot migration in `_migrate_add_optional_columns` renames legacy rows in-place on init_db(), so existing DBs upgrade cleanly. - Gateway notifier's TERMINAL_KINDS set updated; timed_out gets its own ⏱ message template, gave_up renamed from 'auto-blocked'. - Plugin_api.py's two 'priority' emit sites renamed to 'reprioritized'. - Documented in a new 'Event reference' section in kanban.md, grouped into three clusters (lifecycle / edits / worker telemetry) with payload shapes. Tests (+18 in tests/hermes_cli/test_kanban_core_functionality.py, 136/136 pass): - max_runtime_terminates_overrun_worker: real SIGTERM flow with _pid_alive stub, verifies event payload + state reset. - max_runtime_none_means_no_cap: unbounded tasks aren't timed out. - create_task_persists_max_runtime. - enforce_max_runtime_integrates_with_dispatch: kernel-level + dispatch_once chaining. - heartbeat_on_running_task + heartbeat_refused_when_not_running. - cli_heartbeat_verb with --note round-trip. - recompute_ready_emits_promoted_not_ready. - spawn_failure_circuit_breaker_emits_gave_up. - spawned_event_emitted_with_pid. - migration_renames_legacy_event_kinds (injects old rows, re-runs init_db, asserts rename). - list_profiles_on_disk (tmp_path + config.yaml filter). - known_assignees_merges_disk_and_board (profiles on disk + board assignees + per-status counts). - cli_assignees_json. - parse_duration_accepts_formats (s/m/h/d/float). - parse_duration_rejects_garbage. - cli_create_max_runtime_via_duration (2h → 7200). - cli_create_max_runtime_bad_format_exits_nonzero. Live smoke: POST /tasks with max_runtime_seconds round-trips; /assignees returns the union of on-disk + board-assigned names; PATCH priority produces 'reprioritized' events (not 'priority'); board cards expose max_runtime_seconds + last_heartbeat_at. Docs (website/docs/user-guide/features/kanban.md): - New 'Event reference' section with three-cluster table (lifecycle / edits / worker telemetry) + payload shapes. - CLI reference updated for --max-runtime, heartbeat, assignees. - Gateway notifications section updated for the new TERMINAL_KINDS. Not ported from Multica (deliberate, documented in the out-of-scope section already): Postgres+pgvector skill search (heavy deps conflict with SQLite kernel), server+daemon cross-host model (we're single-host on purpose), first-class agent identity with threaded comments (we keep the board profile-agnostic).	2026-04-27 06:32:17 -07:00
Teknium	af8d43dbbb	feat(kanban): core hardening — daemon, circuit breaker, crash detect, logs, notify, bulk, stats Eliminates every 'known broken on day one' item in the core functionality audit. The board is now self-driving (daemon, not cron), self-healing (crash detection, spawn-failure circuit breaker), and self-reporting (logs, stats, gateway notifications). Dispatcher - New `hermes kanban daemon` long-lived loop with --interval, --max, --failure-limit, --pidfile, --verbose, signal-clean shutdown (SIGINT/SIGTERM via threading.Event). A kb.run_daemon() entry point lets tests drive it inline without subprocess. - `hermes kanban init` now prints the dispatcher setup hint so users don't leave the board off-by-default. Ships a systemd user unit at plugins/kanban/systemd/hermes-kanban-dispatcher.service. - Removed the old 'add this to cron' doc path. Cron runs agent prompts (LLM cost per tick) — unacceptable for a per-minute coordination loop. Worker aliveness / safety - Spawn returns the child's PID; dispatcher stores it on the task row and calls detect_crashed_workers() every tick. If the PID is gone but the claim TTL hasn't expired, the task drops back to ready with a 'crashed' event. Host-local only — cross-host PIDs are ignored per the single-host design. - Spawn-failure circuit breaker: after N consecutive spawn_failed events on the same task (default 5), the dispatcher auto-blocks with the last error as the reason. Success resets the counter. Workspace-resolution failures count against the same budget. - Log rotation: _rotate_worker_log trims at 2 MiB, keeps one generation (.log.1), bounds per-task disk usage at ~4 MiB. Idempotency / dedup - create_task(idempotency_key=...) returns the existing non-archived task id for retried webhooks. --idempotency-key on the CLI, json body field on the dashboard plugin. Archived tasks don't block a fresh create with the same key. CLI surface - Bulk verbs: complete, unblock, archive accept multiple ids; block accepts --ids for sibling blocks with the same reason. - New verbs: daemon, watch (live event tail filtered by assignee/tenant/kinds), stats, log, notify-subscribe, notify-list, notify-unsubscribe. - dispatch gains --failure-limit + crashed/auto_blocked columns in JSON output and human-readable output. - gc accepts --event-retention-days / --log-retention-days; prunes task_events for terminal tasks and old log files. Gateway integration - New GatewayRunner._kanban_notifier_watcher: polls kanban_notify_subs every 5s, pushes ✔/⏸/✖ messages to subscribed chats for completed/blocked/spawn_auto_blocked/crashed events. Cursor-advanced per-sub; auto-removed when the task reaches done/archived. Runs alongside the session expiry and platform reconnect watchers — SQLite work in asyncio.to_thread so the event loop never blocks. - /kanban create in the gateway auto-subscribes the originating chat (platform + chat_id + thread_id). Users see '(subscribed — you'll be notified when t_abcd completes or blocks)' appended to the response. Dashboard plugin - GET /stats returns board_stats (by_status, by_assignee, oldest_ready_age_seconds). - GET /tasks/:id/log returns the worker log with optional ?tail=N cap. 404 on unknown task, exists=false when the task has never spawned. - POST /tasks accepts idempotency_key; both Pydantic body and the create_task kwarg now round-trip. - /board attaches task.age (created/started/time_to_complete in seconds) so the UI can colour stale cards without recomputing. - Card CSS: amber border after N minutes, red border when clearly stuck (tier per status: running 10m/60m, ready 1h/24h, todo 7d/30d, blocked 1h/24h). - Drawer: new Worker log section, auto-loads on mount, last 100 KB cap with on-disk path surfaced when truncated. Kernel - Schema additions: tasks.idempotency_key, tasks.spawn_failures, tasks.worker_pid, tasks.last_spawn_error; new kanban_notify_subs table. All gated by _migrate_add_optional_columns so legacy DBs upgrade cleanly. - release_stale_claims / complete_task / block_task now all clear worker_pid so crash detection doesn't false-positive on reclaimed tasks. - read_worker_log fixed: tail-skip no longer eats one-giant-line logs (common with child processes that don't flush newlines before dying). Tests (tests/hermes_cli/test_kanban_core_functionality.py, 28 new) - Idempotency: same key returns existing, archived doesn't block, no key never collides - Circuit breaker: auto-blocks after limit, success resets counter, workspace-resolution failure counts against budget - Aliveness: _pid_alive helper, detect_crashed_workers reclaims exited child - Daemon: runs and stops cleanly via stop_event, survives a tick exception - Stats + task_age helpers - Notify subs: CRUD, cursor advances, distinct-thread is a separate row - GC: events-only-for-terminal-tasks, old worker logs deleted - Log: rotation keeps one generation, read_worker_log tail - CLI: bulk complete/archive/unblock/block, create with --idempotency-key, stats --json, notify-subscribe+list, log missing task, gc reports counts - run_slash parity: smoke-tests every registered verb (23 invocations); none may raise or return empty string Full kanban test suite: 234/234 pass under scripts/run_tests.sh (60 original + 30 dashboard plugin + 28 new core + 116 command registry). Live smoke covers /stats, idempotency, age, log endpoint with and without content, log?tail= truncation signal, 404 on unknown task. Docs (website/docs/user-guide/features/kanban.md) - 'Core concepts' rewritten: new statuses (triage), idempotency key, dispatcher-as-daemon-not-cron with circuit breaker behaviour documented. - Quick start swapped to daemon. New systemd section covers user service install. - New sections: idempotent create, bulk verbs, gateway notifications, out-of-scope single-host note (kanban.db is local; don't expect multi-host). - CLI reference updated for every new verb, every new flag.	2026-04-26 13:01:09 -07:00
Teknium	27fc6c1086	feat(kanban): bulk ops, drawer edit, dep editor, markdown, touch, config The dashboard plugin gets the last layer of features that turn it from a 'usable read surface with drag-drop' into a 'full kanban UI' — no more 'drop to CLI to do X' moments from inside the tab. Plugin backend - POST /tasks/bulk — apply the same patch (status / archive / assignee / priority) to every id in the request body. Each id runs independently: one bad id reports {ok: false, error: ...} without aborting siblings. Status transitions that aren't legal for the current state are surfaced per-id ('transition to done refused'). Used by the multi-select bulk action bar. - GET /config — returns the dashboard.kanban section of config.yaml (default_tenant, lane_by_profile, include_archived_by_default, render_markdown) with sensible defaults when the section is absent. Loaded once by the SPA to preselect filters and toggle markdown rendering. - _conn() helper — every handler now goes through it, calling kanban_db.init_db() (idempotent) before every connection. Fresh installs work whether the first hit is GET /board, POST /tasks, or any other endpoint — no more 'no such table: tasks' when the CLI or a script hits the plugin before the dashboard has ever loaded. Plugin UI (plugin bundle, +~12 KB) - Multi-select: per-card checkbox; shift/ctrl-click also toggles without opening the drawer. A BulkActionBar appears above the columns with batch → ready / complete / archive / reassign (profile dropdown + unassign option). Destructive batches confirm first. Partial failures from the backend are surfaced inline. - Drawer inline editing: - Click the title → TitleEditor swaps in an input, Enter saves, Escape cancels. - Click the Assignee meta row → AssigneeEditor input (empty string unassigns). - Click the Priority meta row → PriorityEditor numeric input. - New 'edit' button on Description → full-width textarea; Save / Cancel switch back to rendered view. - Dependency editor: chip list of parents + children with per-chip × button (calls DELETE /links). Add-parent / add-child dropdowns filter out self + already-linked tasks so you cannot re-add a duplicate edge or a self-loop. Cycle rejections from the server surface cleanly via the existing error banner. - Parent selection in InlineCreate: new dropdown listing every task on the board ('{id} — {title}') — picking one sends parents=[id] with the create payload, so the task lands in todo (or triage if created from the Triage column) with the dependency wired up. - Safe markdown rendering for description, comment bodies, and result. A small in-bundle renderer handles headings, bold, italic, inline code, fenced code, bullet lists, and http(s)/mailto links. Every substitution runs on HTML-escaped input (no raw HTML), links get target=_blank + rel=noopener,noreferrer. Disabled by config key dashboard.kanban.render_markdown=false (falls back to <pre>). - Touch drag-drop: attachTouchDrag() installs a pointerdown handler that spawns a drag proxy, tracks elementFromPoint under the finger, and dispatches a hermes-kanban:drop CustomEvent on the column when released. Desktop continues to use native HTML5 DnD. Columns listen for both. - ErrorBoundary already present from the prior commit catches any renderer throw; markdown escape + touch-proxy cleanup both have their own try/finally. Tests (tests/plugins/test_kanban_dashboard_plugin.py — 90/90 pass) - bulk_status_ready: 3 tasks blocked, batch → ready, all move - bulk_archive hides all ids from default board - bulk_reassign changes every assignee - bulk_unassign_via_empty_string sets assignee back to None - bulk_partial_failure_doesnt_abort_siblings: bogus id in middle, good siblings still get priority=7 - bulk_empty_ids_400 - config_returns_defaults_when_section_missing - config_reads_dashboard_kanban_section (writes config.yaml, verifies every key round-trips) Live smoke (real FastAPI app + isolated HERMES_HOME): - /config without section returns defaults - /config with dashboard.kanban section returns the configured values - POST /tasks as the first-ever request (no prior /board) succeeds — auto-init handles it - Link add + remove via POST /links + DELETE /links round-trip - Bulk priority bump on 2 ids, both get priority=5 - Bulk archive hides ids from default board - PATCH {title, body} updates the task, markdown source survives the round trip - POST /tasks {triage: true, parents: [id]} lands in triage, not todo - Bulk partial: 2 good + 1 bogus returns per-id outcome Docs (website/docs/user-guide/features/kanban.md) - 'What the plugin gives you' rewritten to reflect bulk, drawer edit, dep editor, parent-on-create, markdown, touch drag-drop. - New 'Dashboard config' subsection with a YAML example for dashboard.kanban.*. - REST table gains /tasks/bulk and /config rows.	2026-04-26 12:36:23 -07:00
Teknium	45806629c5	feat(kanban): Triage column, progress rollup, WS auth, lanes, polish Follows up on the initial dashboard plugin with the items called out during self-review — ships the GUI-reality claims the PR body made, closes the WebSocket auth gap, and lands the 'Triage' status the design spec's Fusion-style screenshot leads with. Kernel changes - kanban_db.VALID_STATUSES gains 'triage'. status is TEXT without a CHECK constraint so no schema migration is needed. - create_task(triage=True) forces the initial status to 'triage' regardless of parents, and parent ids are still validated so the eventual link rows don't dangle. recompute_ready() only promotes 'todo' -> 'ready', so triage tasks are naturally isolated from the dispatcher pipeline. - hermes kanban create gains --triage. Patterns table (docs) gains P9 'Triage specifier'. Plugin backend (plugins/kanban/dashboard/plugin_api.py) - GET /board now auto-init's kanban.db on first read (idempotent). A fresh install shows an empty board instead of 'failed to load'. - GET /board returns a new 'progress' field per task — {done, total} of child-task completion, or None if the task has no children. - BOARD_COLUMNS prepends 'triage'. - POST /tasks accepts {triage: bool}; PATCH /tasks/:id accepts {status: 'triage'}. - WebSocket /events now requires ?token=<session_token> as a query param — browsers can't set Authorization on a WS upgrade, so this matches the pattern the in-browser PTY bridge uses. Constant-time compare against hermes_cli.web_server._SESSION_TOKEN. In bare-test contexts (no dashboard module) the check no-ops so the tail loop stays testable. Security boundary documented in the module header and in website/docs/user-guide/features/kanban.md. Plugin UI (plugins/kanban/dashboard/dist/index.js + style.css) - Adds the Triage column (lilac dot) with helper text 'Raw ideas — a specifier will flesh out the spec'. Inline-create from the Triage column parks new tasks in triage. - Status action row in the drawer gains '→ triage'. - Progress pill (N/M) on cards that have children. Full-complete state tints the pill green. - 'Lanes by profile' toolbar toggle — sub-groups the Running column by assignee so you see at a glance which specialist is busy on what. - Destructive status moves (done / archived / blocked) via drag-drop OR via the drawer action row now prompt for confirmation. - Escape closes the drawer. - Live-update reloads are debounced (250ms) so a burst of task_events triggers one refetch, not N. - WebSocket includes ?token= built from window.__HERMES_SESSION_TOKEN__. - WebSocket reconnect uses exponential backoff capped at 30s, not a fixed 1.5s spin loop, and surfaces a user-visible error on code-1008 (auth rejected) instead of reconnecting forever. - ErrorBoundary wraps the page — a bad card render shows a 'rendering error, reload view' card instead of crashing the tab. Tests (tests/plugins/test_kanban_dashboard_plugin.py, +5 tests = 21) - empty-board shape now asserts all 6 columns including 'triage' - create_triage_lands_in_triage_column - triage_task_not_promoted_to_ready (dispatcher bypasses triage) - patch_status_triage_works (both into triage and out of it) - board_progress_rollup (0/2 -> 1/2 -> childless cards = None) - board_auto_initializes_missing_db - ws_events_rejects_when_token_required (three sub-assertions: missing → 1008, wrong → 1008, correct → handshake accepted) All 82 kanban tests pass under scripts/run_tests.sh. Docs - kanban.md 'What the plugin gives you' fully rewritten to match shipped reality (triage, progress pill, assignee lanes, destructive-confirm, Escape-close, debounce). - New 'Security model' subsection documents the explicit-plugin- route-bypass, the WS token requirement, and the --host 0.0.0.0 warning; also notes that kanban.db is profile-agnostic on purpose (the coordination primitive) so cross-profile visibility is expected. - CLI command reference shows --triage. - Collaboration patterns table adds P9 'Triage specifier'.	2026-04-26 12:26:43 -07:00
Teknium	4093201c47	feat(kanban): dashboard plugin — Linear/Fusion-style board UI Ships plugins/kanban/dashboard/ as a bundled dashboard plugin. No core changes — uses the standard dashboard plugin contract (manifest.json + dist/index.js + plugin_api.py) documented in 'Extending the Dashboard'. What the tab gives you: - One column per kanban status (todo / ready / running / blocked / done; archived behind a toggle), column counts, coloured status dots. - Cards with id, title, priority badge, tenant tag, assignee, comment/link counts, 'created N ago'. - HTML5 drag-drop between columns — status change routes through the same kanban_db code the CLI /kanban verbs use, so the three surfaces (CLI, gateway, dashboard) can never drift. - Inline create per-column (title, assignee, priority). - Side drawer on card click: description, status action row (→ ready / → running / block / unblock / complete / archive), dependency links, comment thread with Enter-to-submit, last 20 events. - Toolbar: search, tenant filter, assignee filter, show-archived, nudge-dispatcher (skip the 60s wait), refresh. - Live updates via WebSocket tailing task_events — the board reflects CLI or gateway actions in real time. REST surface under /api/plugins/kanban/: GET /board, GET /tasks/:id, POST /tasks, PATCH /tasks/:id, POST /tasks/:id/comments, POST /links, DELETE /links, POST /dispatch, WS /events. Every handler is a thin wrapper around kanban_db — no new business logic. Visually theme-aware: the plugin CSS reads only --color-*, --radius, --font-mono etc. so it reskins with whichever dashboard theme is active. Tests (tests/plugins/test_kanban_dashboard_plugin.py, 16 tests): - empty board shape - create + appears in ready column with tenant/assignee rollups - tenant filter - detail includes parents/children/events - 404 on unknown task - PATCH status: complete / block / unblock / ready drag-drop / running - PATCH reassign, priority, edit, invalid-status rejection - POST comment (plus empty-body rejection) - POST link + DELETE link + cycle rejection - POST dispatch (dry run) All 76 kanban tests pass under scripts/run_tests.sh. Docs: website/docs/user-guide/features/kanban.md gains a full 'Dashboard (GUI)' section covering install, architecture, REST surface, live-updates mechanism, extending, and scope boundary.	2026-04-26 12:08:47 -07:00
Teknium	9f610aa8f3	docs(kanban): add GUI/Dashboard plugin section The /kanban CLI + slash command are enough to run the board headlessly, but triage and cross-profile supervision want a visual board. Document the design as a dashboard plugin that: - reads live state from kanban.db over a WebSocket on task_events (no polling) - writes through run_slash() so CLI/gateway/GUI cannot drift - mounts under /api/plugins/kanban/ following the existing 'Extending the Dashboard' plugin shape The plugin is strictly a thin layer over kanban_db — no new business logic, nothing to merge into the kernel.	2026-04-26 11:57:04 -07:00
Teknium	e1c5e741ad	feat(kanban): durable multi-profile collaboration board (#16081 ) New `hermes kanban` CLI subcommand + `/kanban` slash command + skills for worker and orchestrator profiles. SQLite-backed task board (~/.hermes/kanban.db) shared across all profiles on the host. Zero changes to run_agent.py, no new core tools, no tool-schema bloat. Motivation: delegate_task is a function call — sync fork/join, anonymous subagent, no resumability, no human-in-the-loop. Kanban is the durable shape needed for research triage, scheduled ops, digital twins, engineering pipelines, and fleet work. They coexist (workers may call delegate_task internally). What this adds - hermes_cli/kanban_db.py — schema, CAS claim, dependency resolution, dispatcher, workspace resolution, worker-context builder. - hermes_cli/kanban.py — 15-verb CLI surface and shared run_slash() entry point used by both CLI and gateway. - skills/devops/kanban-worker — how a profile should work a claimed task. - skills/devops/kanban-orchestrator — "you are a dispatcher, not a worker" template with anti-temptation rules. - /kanban slash command wired into cli.py and gateway/run.py. Bypasses the running-agent guard (board writes don't touch agent state), so /kanban unblock can free a stuck worker mid-conversation. - Design spec at docs/hermes-kanban-v1-spec.pdf — comparative analysis vs Cline Kanban, Paperclip, NanoClaw, Gemini Enterprise; 8 patterns; 4 user stories; implementation plan; concurrency correctness. - Docs: website/docs/user-guide/features/kanban.md, CLI reference updated, sidebar entry added. Architecture highlights - Three planes: control (user + gateway), state (board + dispatcher), execution (pool of profile processes). - Every worker is a full OS process, spawned as `hermes -p <profile>`. No in-process subagent swarms — solves NanoClaw's SDK-lifecycle failure class. - Atomic claim via SQLite CAS in a BEGIN IMMEDIATE transaction; stale claims reclaimed 15 min after their TTL expires. - Tenant namespacing via one nullable column — one specialist fleet can serve many businesses with data isolation by workspace path. Tests: 60 targeted tests (schema, CAS atomicity, dependency resolution, dispatcher, workspace kinds, tenancy, CLI + slash surface). All pass hermetic via scripts/run_tests.sh.	2026-04-26 08:29:46 -07:00
Teknium	06f81752ed	Revert "feat(kanban): durable multi-profile collaboration board (#16081 )" (#16098 ) This reverts commit `15937a6b46`.	2026-04-26 08:29:37 -07:00
Teknium	15937a6b46	feat(kanban): durable multi-profile collaboration board (#16081 ) New `hermes kanban` CLI subcommand + `/kanban` slash command + skills for worker and orchestrator profiles. SQLite-backed task board (~/.hermes/kanban.db) shared across all profiles on the host. Zero changes to run_agent.py, no new core tools, no tool-schema bloat. Motivation: delegate_task is a function call — sync fork/join, anonymous subagent, no resumability, no human-in-the-loop. Kanban is the durable shape needed for research triage, scheduled ops, digital twins, engineering pipelines, and fleet work. They coexist (workers may call delegate_task internally). What this adds - hermes_cli/kanban_db.py — schema, CAS claim, dependency resolution, dispatcher, workspace resolution, worker-context builder. - hermes_cli/kanban.py — 15-verb CLI surface and shared run_slash() entry point used by both CLI and gateway. - skills/devops/kanban-worker — how a profile should work a claimed task. - skills/devops/kanban-orchestrator — "you are a dispatcher, not a worker" template with anti-temptation rules. - /kanban slash command wired into cli.py and gateway/run.py. Bypasses the running-agent guard (board writes don't touch agent state), so /kanban unblock can free a stuck worker mid-conversation. - Design spec at docs/hermes-kanban-v1-spec.pdf — comparative analysis vs Cline Kanban, Paperclip, NanoClaw, Gemini Enterprise; 8 patterns; 4 user stories; implementation plan; concurrency correctness. - Docs: website/docs/user-guide/features/kanban.md, CLI reference updated, sidebar entry added. Architecture highlights - Three planes: control (user + gateway), state (board + dispatcher), execution (pool of profile processes). - Every worker is a full OS process, spawned as `hermes -p <profile>`. No in-process subagent swarms — solves NanoClaw's SDK-lifecycle failure class. - Atomic claim via SQLite CAS in a BEGIN IMMEDIATE transaction; stale claims reclaimed 15 min after their TTL expires. - Tenant namespacing via one nullable column — one specialist fleet can serve many businesses with data isolation by workspace path. Tests: 60 targeted tests (schema, CAS atomicity, dependency resolution, dispatcher, workspace kinds, tenancy, CLI + slash surface). All pass hermetic via scripts/run_tests.sh.	2026-04-26 08:24:26 -07:00

15 Commits