Ports four items from the Multica audit (https://github.com/multica-ai/multica).
Dropped their cross-host server/daemon architecture and their Postgres+pgvector
skill search — both the wrong shape for our single-host SQLite kernel.
1. Per-task max-runtime (`max_runtime_seconds` column)
- New kernel function `enforce_max_runtime(conn)` runs in every dispatch
tick. When a running task's elapsed time exceeds the cap, we SIGTERM
the worker, wait a 5 s grace (polling _pid_alive), then SIGKILL. The
task goes back to 'ready' with a `timed_out` event and re-queues
on the next tick (unless the spawn-failure circuit breaker has
already parked it).
- Host-local only: lock prefix must match this host's claimer_id so we
never signal a PID on another machine.
- CLI: `hermes kanban create --max-runtime 30m | 2h | 1d | <seconds>`.
New `_parse_duration` helper accepts s/m/h/d suffixes or bare
integers.
- Dashboard POST body + the card's `max_runtime_seconds` field.
2. Worker heartbeat (`last_heartbeat_at` column, `heartbeat` event)
- `heartbeat_worker(conn, task_id, note=None)` emits the event and
touches last_heartbeat_at. Refused when the task isn't running.
- CLI: `hermes kanban heartbeat <id> [--note "..."]`.
- kanban-worker skill instructs workers to heartbeat during long
loops (training runs, encodes, crawls, batch uploads).
- Separate signal from PID crash detection: a worker's Python process
can still be alive while the actual work process is stuck. Heartbeat
absence is diagnostic; future work can auto-block on stale
heartbeats, but v1 just surfaces the signal.
3. Assignee enumeration (`known_assignees`, `list_profiles_on_disk`)
- Scans ~/.hermes/profiles/ for dirs containing config.yaml + unions
with current assignees on the board. Each entry returns
{name, on_disk, counts: {status: n}}.
- CLI: `hermes kanban assignees [--json]`. Also hooked into
`hermes kanban init` which now prints discovered profiles so new
installs see 'these are the assignees you can target' immediately.
- Dashboard: GET /api/plugins/kanban/assignees for the picker.
4. Event vocab cleanup (three renames + three new kinds)
- `ready` → `promoted` (fires when deps clear; clearer semantic).
- `priority` → `reprioritized` (past-tense verb, matches others).
- `spawn_auto_blocked` → `gave_up` (short, memorable; the circuit
breaker gave up on this task).
- New: `spawned` (emitted with {pid} on successful spawn),
`heartbeat` ({note?}), `timed_out`
({pid, elapsed_seconds, limit_seconds, sigkill}).
- One-shot migration in `_migrate_add_optional_columns` renames
legacy rows in-place on init_db(), so existing DBs upgrade cleanly.
- Gateway notifier's TERMINAL_KINDS set updated; timed_out gets its
own ⏱ message template, gave_up renamed from 'auto-blocked'.
- plugin_api.py's two 'priority' emit sites renamed to
'reprioritized'.
- Documented in a new 'Event reference' section in kanban.md,
grouped into three clusters (lifecycle / edits / worker
telemetry) with payload shapes.
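
The in-place rename of legacy event kinds (item 4) could look roughly like the sketch below. The table name `task_events`, the column name `kind`, and the helper name `rename_legacy_event_kinds` are assumptions made for illustration; only the three old-to-new mappings come from this change. The update is idempotent, so re-running it on an already migrated DB renames nothing.

```python
import sqlite3

# Old kind -> new kind, as listed in the event vocab cleanup.
LEGACY_EVENT_KINDS = {
    "ready": "promoted",
    "priority": "reprioritized",
    "spawn_auto_blocked": "gave_up",
}


def rename_legacy_event_kinds(conn: sqlite3.Connection) -> int:
    """Rename legacy rows in place and return how many rows changed.

    Hypothetical schema: a `task_events` table with a text `kind` column.
    """
    renamed = 0
    with conn:  # commits on success, rolls back if any statement fails
        for old, new in LEGACY_EVENT_KINDS.items():
            cur = conn.execute(
                "UPDATE task_events SET kind = ? WHERE kind = ?", (new, old)
            )
            renamed += cur.rowcount
    return renamed
```

Because none of the new names collides with an old name, the order of the three updates does not matter.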
Tests (+18 in tests/hermes_cli/test_kanban_core_functionality.py,
136/136 pass):
- max_runtime_terminates_overrun_worker: real SIGTERM flow with
_pid_alive stub, verifies event payload + state reset.
- max_runtime_none_means_no_cap: unbounded tasks aren't timed out.
- create_task_persists_max_runtime.
- enforce_max_runtime_integrates_with_dispatch: kernel-level +
dispatch_once chaining.
- heartbeat_on_running_task + heartbeat_refused_when_not_running.
- cli_heartbeat_verb with --note round-trip.
- recompute_ready_emits_promoted_not_ready.
- spawn_failure_circuit_breaker_emits_gave_up.
- spawned_event_emitted_with_pid.
- migration_renames_legacy_event_kinds (injects old rows, re-runs
init_db, asserts rename).
- list_profiles_on_disk (tmp_path + config.yaml filter).
- known_assignees_merges_disk_and_board (profiles on disk + board
assignees + per-status counts).
- cli_assignees_json.
- parse_duration_accepts_formats (s/m/h/d/float).
- parse_duration_rejects_garbage.
- cli_create_max_runtime_via_duration (2h → 7200).
- cli_create_max_runtime_bad_format_exits_nonzero.
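
For readers who want the shape of the SIGTERM flow these tests exercise, here is a minimal sketch of a terminate-then-escalate helper, assuming POSIX signals. The function names, the injectable `kill` / `alive` / `sleep` parameters, and the poll interval are illustrative assumptions, not the kernel's actual signatures; only the 5 s grace and the SIGTERM, poll, SIGKILL order come from the description above.

```python
import os
import signal
import time


def pid_alive(pid: int) -> bool:
    """Best-effort liveness probe: signal 0 checks existence only."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def terminate_with_grace(pid, grace_seconds=5.0, poll_interval=0.1,
                         kill=os.kill, alive=pid_alive, sleep=time.sleep):
    """Send SIGTERM, poll during the grace window, then SIGKILL.

    Returns True when SIGKILL had to be sent, False when the process
    exited on its own (or was already gone).
    """
    try:
        kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        return False  # already gone before we signalled
    waited = 0.0
    while waited < grace_seconds:
        if not alive(pid):
            return False  # exited within the grace window
        sleep(poll_interval)
        waited += poll_interval
    if not alive(pid):
        return False
    try:
        kill(pid, signal.SIGKILL)
    except ProcessLookupError:
        return False  # exited between the last poll and the kill
    return True
```

Injecting `kill`, `alive`, and `sleep` keeps the helper testable with stubs, in the same spirit as the `_pid_alive` stub mentioned in the test list.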
Live smoke: POST /tasks with max_runtime_seconds round-trips;
/assignees returns the union of on-disk + board-assigned names;
PATCH priority produces 'reprioritized' events (not 'priority');
board cards expose max_runtime_seconds + last_heartbeat_at.
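
The union that /assignees returns can be pictured with the sketch below. It reuses the names `list_profiles_on_disk` and `known_assignees` from item 3, but the signatures are assumptions: here the board is passed in as plain `(assignee, status)` pairs rather than read from the DB, and the profiles directory is an explicit argument.

```python
from collections import Counter, defaultdict
from pathlib import Path


def list_profiles_on_disk(profiles_dir):
    """Names of subdirectories that contain a config.yaml file."""
    root = Path(profiles_dir)
    if not root.is_dir():
        return []
    return sorted(
        p.name for p in root.iterdir()
        if p.is_dir() and (p / "config.yaml").is_file()
    )


def known_assignees(profiles_dir, board_rows):
    """Union of on-disk profiles and board assignees, with per-status counts.

    `board_rows` is an iterable of (assignee, status) pairs; this input
    shape is an assumption made for the sketch.
    """
    on_disk = set(list_profiles_on_disk(profiles_dir))
    counts = defaultdict(Counter)
    for assignee, status in board_rows:
        if assignee:  # skip unassigned tasks
            counts[assignee][status] += 1
    names = sorted(on_disk | set(counts))
    return [
        {"name": n, "on_disk": n in on_disk, "counts": dict(counts[n])}
        for n in names
    ]
```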
Docs (website/docs/user-guide/features/kanban.md):
- New 'Event reference' section with three-cluster table
(lifecycle / edits / worker telemetry) + payload shapes.
- CLI reference updated for --max-runtime, heartbeat, assignees.
- Gateway notifications section updated for the new TERMINAL_KINDS.
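
As a rough illustration of the `--max-runtime` formats, a duration parser along these lines would cover the documented cases. This is a sketch, not the shipped `_parse_duration`: the regex, the error type, and the acceptance of fractional values such as `1.5h` are assumptions.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}
_DURATION_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([smhd]?)\s*$", re.IGNORECASE)


def parse_duration(text: str) -> int:
    """Convert '30m', '2h', '1d', '45s' or a bare number to whole seconds."""
    match = _DURATION_RE.match(text)
    if not match:
        raise ValueError(f"invalid duration: {text!r}")
    value, unit = match.groups()
    # A missing suffix means the number is already in seconds.
    seconds = float(value) * _UNIT_SECONDS[unit.lower() or "s"]
    return int(seconds)
```

Negative values and unknown suffixes fail the regex and raise `ValueError`, which a CLI wrapper could turn into a nonzero exit.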
Not ported from Multica (deliberate, documented in the out-of-scope
section already): Postgres+pgvector skill search (heavy deps conflict
with SQLite kernel), server+daemon cross-host model (we're
single-host on purpose), first-class agent identity with threaded
comments (we keep the board profile-agnostic).
New `hermes kanban` CLI subcommand + `/kanban` slash command + skills for
worker and orchestrator profiles. SQLite-backed task board
(~/.hermes/kanban.db) shared across all profiles on the host. Zero
changes to run_agent.py, no new core tools, no tool-schema bloat.
Motivation: delegate_task is a function call — sync fork/join, anonymous
subagent, no resumability, no human-in-the-loop. Kanban is the durable
shape needed for research triage, scheduled ops, digital twins,
engineering pipelines, and fleet work. They coexist (workers may call
delegate_task internally).
What this adds
- hermes_cli/kanban_db.py — schema, CAS claim, dependency resolution,
dispatcher, workspace resolution, worker-context builder.
- hermes_cli/kanban.py — 15-verb CLI surface and shared run_slash()
entry point used by both CLI and gateway.
- skills/devops/kanban-worker — how a profile should work a claimed task.
- skills/devops/kanban-orchestrator — "you are a dispatcher, not a
worker" template with anti-temptation rules.
- /kanban slash command wired into cli.py and gateway/run.py. Bypasses
the running-agent guard (board writes don't touch agent state), so
/kanban unblock can free a stuck worker mid-conversation.
- Design spec at docs/hermes-kanban-v1-spec.pdf — comparative analysis
vs Cline Kanban, Paperclip, NanoClaw, Gemini Enterprise; 8 patterns;
4 user stories; implementation plan; concurrency correctness.
- Docs: website/docs/user-guide/features/kanban.md, CLI reference
updated, sidebar entry added.
Architecture highlights
- Three planes: control (user + gateway), state (board + dispatcher),
execution (pool of profile processes).
- Every worker is a full OS process, spawned as `hermes -p <profile>`.
No in-process subagent swarms — solves NanoClaw's SDK-lifecycle
failure class.
- Atomic claim via SQLite CAS in a BEGIN IMMEDIATE transaction; stale
claims reclaimed 15 min after their TTL expires.
- Tenant namespacing via one nullable column — one specialist fleet
can serve many businesses with data isolation by workspace path.
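
The atomic claim can be sketched as a single conditional UPDATE inside BEGIN IMMEDIATE. The `tasks` table, its `status` and `claimed_by` columns, and the `claim_task` name are assumptions for illustration; the connection is expected to be opened with `isolation_level=None` so the explicit BEGIN is the only transaction control in play.

```python
import sqlite3


def claim_task(conn: sqlite3.Connection, task_id: str, claimer_id: str) -> bool:
    """Compare-and-swap claim: succeeds only if the task is still 'ready'.

    Hypothetical schema: tasks(id, status, claimed_by).
    """
    # BEGIN IMMEDIATE takes the write lock up front, so two claimers
    # cannot both pass the status check.
    conn.execute("BEGIN IMMEDIATE")
    try:
        cur = conn.execute(
            "UPDATE tasks SET status = 'running', claimed_by = ? "
            "WHERE id = ? AND status = 'ready'",
            (claimer_id, task_id),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    # Exactly one row changed means this caller won the claim.
    return cur.rowcount == 1
```

The caller that loses the race sees zero affected rows and simply moves on to the next candidate task.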
Tests: 60 targeted tests (schema, CAS atomicity, dependency resolution,
dispatcher, workspace kinds, tenancy, CLI + slash surface). All pass
hermetically via scripts/run_tests.sh.
External services can now push plain-text notifications to a user's chat
via the webhook adapter without invoking the agent. Set deliver_only=true
on a route and the rendered prompt template becomes the literal message
body — dispatched directly to the configured target (Telegram, Discord,
Slack, GitHub PR comment, etc.).
Reuses all existing webhook infrastructure: HMAC-SHA256 signature
validation, per-route rate limiting, idempotency cache, body-size limits,
template rendering with dot-notation, home-channel fallback. No new HTTP
server, no new auth scheme, no new port.
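
For context, HMAC-SHA256 validation of a webhook body generally follows the pattern below. This is a generic sketch, not the adapter's code: the function name and the `sha256=<hex>` header format (borrowed from GitHub's convention) are assumptions.

```python
import hashlib
import hmac


def signature_is_valid(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Check an HMAC-SHA256 signature over the raw request body.

    Assumes the sender transmits 'sha256=<hex digest>' (or a bare hex digest).
    """
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    provided = signature_header.removeprefix("sha256=")
    # compare_digest runs in constant time to avoid leaking match position.
    return hmac.compare_digest(expected.encode(), provided.encode())
```

The signature must be computed over the raw bytes as received, before any JSON parsing or re-serialization.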
Use cases: Supabase/Firebase webhooks → user notifications, monitoring
alert forwarding, inter-agent pings, background job completion alerts.
Changes:
- gateway/platforms/webhook.py: new _direct_deliver() helper + early
dispatch branch in _handle_webhook when deliver_only=true. Startup
validation rejects deliver_only with deliver=log.
- hermes_cli/main.py + hermes_cli/webhook.py: --deliver-only flag on
subscribe; list/show output marks direct-delivery routes.
- website/docs/user-guide/messaging/webhooks.md: new Direct Delivery
Mode section with config example, CLI example, response codes.
- skills/devops/webhook-subscriptions/SKILL.md: document --deliver-only
with use cases (bumped to v1.1.0).
- tests/gateway/test_webhook_deliver_only.py: 14 new tests covering
agent bypass, template rendering, status codes, HMAC still enforced,
idempotency still applies, rate limit still applies, startup
validation, and direct-deliver dispatch.
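
The early dispatch branch can be pictured as follows. Everything here is a simplified sketch under stated assumptions: the `{a.b}` placeholder syntax, the route keys `prompt` / `deliver` / `deliver_only`, and the `send` and `run_agent` callables are hypothetical stand-ins, not the adapter's real interfaces.

```python
import re

# Hypothetical placeholder syntax: {key} or {nested.key}
_PLACEHOLDER = re.compile(r"\{([A-Za-z0-9_.]+)\}")


def render_template(template: str, payload: dict) -> str:
    """Substitute dot-notation placeholders with values from the payload."""
    def lookup(match):
        value = payload
        for part in match.group(1).split("."):
            if isinstance(value, dict) and part in value:
                value = value[part]
            else:
                return match.group(0)  # leave unresolved placeholders as-is
        return str(value)
    return _PLACEHOLDER.sub(lookup, template)


def handle_event(route: dict, payload: dict, send, run_agent) -> str:
    """Render the route's template, then deliver directly or hand off."""
    message = render_template(route["prompt"], payload)
    if route.get("deliver_only"):
        # Direct delivery: the rendered text is the literal message body
        # and the agent is never invoked.
        send(route["deliver"], message)
        return "delivered"
    run_agent(message)
    return "agent"
```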
Validation: 78 webhook tests pass (64 existing + 14 new). E2E verified
with real aiohttp server + real urllib POST — agent not invoked, target
adapter.send() called with rendered template, duplicate delivery_id
suppressed.
Closes the gap identified in PR #12117 (thanks to @H1an1 / Antenna team)
without adding a second HTTP ingress server.
Adds 'hermes webhook' CLI subcommand and a skill — zero new model tools.
CLI commands (require webhook platform to be enabled):
  hermes webhook subscribe <name> [--events, --prompt, --deliver, ...]
  hermes webhook list
  hermes webhook remove <name>
  hermes webhook test <name>
All commands gate on webhook platform being enabled in config. If not
configured, prints setup instructions (gateway setup wizard, manual
config.yaml, or env vars).
The agent uses these via terminal tool, guided by the webhook-subscriptions
skill which documents setup, common patterns (GitHub, Stripe, CI/CD,
monitoring), prompt template syntax, security, and troubleshooting.
Adapter enhancement: webhook.py hot-reloads dynamic subscriptions from
~/.hermes/webhook_subscriptions.json on each incoming request (mtime-gated).
Static config.yaml routes always take precedence.
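
An mtime-gated reload of this kind might look like the sketch below. The class and function names, and the assumption that the JSON file is a mapping of route name to route config, are illustrative only.

```python
import json
import os


class SubscriptionCache:
    """Re-reads a JSON file only when its modification time changes."""

    def __init__(self, path):
        self.path = path
        self._mtime = None
        self._routes = {}
        self.loads = 0  # number of actual file reads, handy for tests

    def routes(self):
        try:
            mtime = os.stat(self.path).st_mtime_ns
        except FileNotFoundError:
            self._mtime, self._routes = None, {}
            return self._routes
        if mtime != self._mtime:
            with open(self.path, encoding="utf-8") as fh:
                self._routes = json.load(fh)
            self._mtime = mtime
            self.loads += 1
        return self._routes


def resolve_route(name, static_routes, dynamic_routes):
    """Static config.yaml routes win over dynamic subscriptions."""
    if name in static_routes:
        return static_routes[name]
    return dynamic_routes.get(name)
```

A `stat` call per request is cheap compared with parsing the file, which is why the check can run on every incoming webhook.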
Docs: updated webhooks.md with Dynamic Subscriptions section, added
hermes webhook to cli-commands.md reference.
No new model tools. No toolset changes.
24 new tests for CLI CRUD, persistence, enabled-gate, and adapter
dynamic route loading.