- Runs section: dashboard PATCH parity (summary/metadata forward),
`completed` event embeds first-line summary for notifiers, bulk
--summary/--metadata refused, archive/drag-drop reclaim semantics.
- Event reference: added Payload column to Lifecycle and Edits
tables; called out the invariant that `status` carries run_id
when closing a reclaimed run.
Integration audit of the runs-as-first-class work (0146cb2bd) found five
bugs where structured runs got orphaned or dashboard parity was missing.
All behavioral fixes; no schema change needed.
Kernel
- archive_task: when called on a running task, now closes the
in-flight run with outcome='reclaimed' and clears current_run_id.
Previously, dashboard bulk-archive or CLI `kanban archive <running>`
would leave the task_runs row open with ended_at=NULL forever and
strand the pointer. Adds the claim_lock / claim_expires / worker_pid
clearing to the UPDATE so the task row is clean too.
- complete_task: embeds the first-line handoff summary in the
`completed` event payload (capped at 400 chars). Notifier can now
render `✔ task done — <title>\n<summary>` without a second SQL hit,
and the full summary still lives on the run row.
Dashboard plugin
- _set_status_direct: drag-drop OFF 'running' (to 'ready', 'todo',
'triage', 'done' — anywhere except back to 'running') now closes
the active run with outcome='reclaimed'. Clears worker_pid too.
Snapshots previous status + current_run_id before the UPDATE so
the decision has the right before-state. status event rows now
carry run_id when closing a run, NULL otherwise.
- UpdateTaskBody: adds `summary` and `metadata` fields. PATCH
/tasks/:id with status='done' now forwards them to complete_task,
giving the dashboard parity with `hermes kanban complete --summary
... --metadata ...`. Previously these fields only existed on the
CLI.
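Both the archive path and the drag-drop path share the same close-and-clear step; an illustrative sketch (table and column names assumed from the descriptions above, not the real kernel code):

```python
import sqlite3
import time

def close_run_as_reclaimed(conn: sqlite3.Connection, task_id: str) -> None:
    """Sketch: end the in-flight run with outcome='reclaimed' and scrub
    the claim state off the task row so crash detection stays clean."""
    now = time.time()
    row = conn.execute(
        "SELECT current_run_id FROM tasks WHERE id = ?", (task_id,)
    ).fetchone()
    run_id = row[0] if row else None
    if run_id is not None:
        conn.execute(
            "UPDATE task_runs SET outcome = 'reclaimed', ended_at = ? "
            "WHERE id = ? AND ended_at IS NULL",
            (now, run_id),
        )
    conn.execute(
        "UPDATE tasks SET current_run_id = NULL, claim_lock = NULL, "
        "claim_expires = NULL, worker_pid = NULL WHERE id = ?",
        (task_id,),
    )
    conn.commit()
```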
CLI
- `hermes kanban complete a b c --summary X` or `--metadata Y`:
refused with a clear stderr message instead of silently applying
the same handoff to every task. Bulk-close without handoff flags
still works. (Note: hermes_cli.main discards subcommand exit
codes via `args.func(args)` without propagating them; tracked
separately. The side-effect check is the real guard.)
Gateway notifier
- Completion message prefers run.summary (carried in event payload)
over task.result. task.result remains the fallback for legacy rows
written before runs shipped.
- Docstring: renamed stale `spawn_auto_blocked` reference to
`gave_up` / `timed_out` — matches the actual TERMINAL_KINDS
tuple, which was already correct in code.
Tests (+8 in core functionality, +3 in dashboard plugin)
- archive_of_running_task_closes_run
- archive_of_ready_task_does_not_create_spurious_run
- dashboard_direct_status_change_off_running_closes_run
- dashboard_direct_status_change_within_same_state_is_noop_for_runs
- cli_bulk_complete_with_summary_rejects (side-effect assertion)
- cli_bulk_complete_without_summary_still_works
- completed_event_payload_carries_summary
- completed_event_payload_summary_none_when_missing
- patch_status_done_with_summary_and_metadata
- patch_status_done_without_summary_still_works (legacy path)
- patch_status_archive_closes_running_run (E2E through FastAPI TestClient)
164/164 kanban suite pass under scripts/run_tests.sh. Live smoke
(execute_code with isolated HERMES_HOME) covered all five fixed paths
plus a re-claim-after-drag-drop to confirm the fresh run is tracked
correctly after the orphan close.
Addresses vulcan-artivus's RFC review on issue #16102. Picks up the
structural changes that are expensive to retrofit later and zero-cost
to land now; defers workflow-template routing + per-stage lanes to v2
(kept forward-compat hooks in the schema).
Kernel
- New `task_runs` table. Each claim opens a run (pid, claim_lock,
heartbeat, max_runtime, started_at), each terminal transition
closes it with an outcome (completed / blocked / crashed /
timed_out / spawn_failed / gave_up / reclaimed). Multiple rows per
task when retries happen, preserving full attempt history.
- `tasks.current_run_id` points at the active run (NULL when idle);
denormalised for cheap reads.
- `task_events.run_id` carries the run a given event belongs to so
UIs group events by attempt. claim/spawned/complete/block/crash/
timeout/spawn_fail/gave_up/heartbeat events are all run-scoped;
created/promoted/assigned/edited stay task-scoped (run_id=NULL).
- Legacy DBs: migration adds the columns + indexes + synthesizes a
run row for any task that's 'running' before the runs table
existed, so subsequent complete/heartbeat/reclaim calls have a
target. Idempotent.
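A minimal sketch of the run-table shape implied above (illustrative DDL; the real schema may name or type columns differently):

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS task_runs (
    id          INTEGER PRIMARY KEY,
    task_id     TEXT NOT NULL,
    worker_pid  INTEGER,
    claim_lock  TEXT,
    started_at  REAL NOT NULL,
    ended_at    REAL,   -- NULL while the run is in flight
    outcome     TEXT,   -- completed/blocked/crashed/timed_out/spawn_failed/gave_up/reclaimed
    summary     TEXT,
    error       TEXT,
    metadata    TEXT    -- JSON blob
);
CREATE INDEX IF NOT EXISTS idx_task_runs_task ON task_runs(task_id);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# tasks.current_run_id and task_events.run_id are plain nullable columns
# added alongside; both stay NULL when idle / when an event is task-scoped.
```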
Structured handoff
- `complete_task(summary=, metadata=)` persists both on the closing
run. `summary` falls back to `result` when omitted so single-run
callers don't duplicate. `metadata` is a free-form dict
({changed_files, tests_run, findings, ...}).
- `build_worker_context` rewrites: now reads "Prior attempts on this
task" (closed runs: outcome, summary, error, metadata) and
"Parent task results" pulls run.summary + run.metadata of the
most-recent completed run per parent, falling back to task.result
for legacy rows without runs. Retrying workers see why earlier
attempts failed; downstream workers see parent handoffs
structurally, not as loose `result` strings.
CLI
- `hermes kanban complete <id> --summary "..." --metadata '{"files":1}'`.
  The metadata JSON is parsed up front; malformed input is rejected
  with exit code 2.
- New `hermes kanban runs <id> [--json]` verb. Shows per-run rows:
outcome, profile, elapsed, summary, error. JSON mode serializes
the full run dataclass for scripting.
Dashboard plugin
- GET /tasks/:id now carries a runs[] array alongside task / events /
comments / links. Each run serialised with outcome, summary,
metadata, worker_pid, elapsed fields.
- New Run History section in the drawer. Outcome-coloured left
border (green=active, blue=completed, amber=reclaimed,
red=crashed/timed_out/gave_up/blocked). Collapsed when >3 runs
with a '+N earlier' toggle. Shows summary + error + metadata
inline.
Forward-compat for v2 (vulcan's workflow templates + stages)
- `tasks.workflow_template_id` and `tasks.current_step_key` added as
nullable columns. v1 kernel ignores them for routing; v2 will add
workflow_templates + workflow_steps tables and wire the dispatcher
to consult them. task_runs has a matching `step_key` column. Lets
a v2 release land additively without another schema migration.
Tests (+22 in test_kanban_core_functionality.py, +2 in dashboard)
- run_created_on_claim / run_closed_on_complete_with_summary
- run_summary_falls_back_to_result
- multiple_attempts_preserved_as_runs (3 attempts: reclaimed →
crashed → completed, all visible in list_runs)
- run_on_block_with_reason / run_on_spawn_failure_records_failed_runs
(5 spawn_failed runs + 1 gave_up run)
- event_rows_carry_run_id (task-scoped vs run-scoped split)
- build_worker_context_includes_prior_attempts
- build_worker_context_uses_parent_run_summary (metadata JSON in context)
- migration_backfills_inflight_run_for_legacy_db (simulates a
pre-migration running task, re-runs init_db, asserts backfill)
- forward_compat_columns_writable
- cli_runs_verb + cli_runs_json
- cli_complete_with_summary_and_metadata (JSON round-trip through
shlex + argparse)
- cli_complete_bad_metadata_exits_nonzero
- task_detail_includes_runs / task_detail_runs_empty_before_claim
269/269 kanban suite pass under scripts/run_tests.sh. Live-smoke
covered: single-attempt complete → run closed + summary persisted;
retry scenario → two runs visible (blocked + completed); parent run
summary + metadata surfaced to child via build_worker_context;
forward-compat columns writable via UPDATE; GET /tasks/:id returns
runs[].
Docs
- New 'Runs — one row per attempt' section in kanban.md: the
why (full attempt history, structured metadata), the two-table
model (task is logical, run is execution), the structured handoff
shape (--summary / --metadata), example CLI + dashboard output,
forward-compat note for v2.
- Event reference updated to mention task_events.run_id.
- CLI reference gains 'hermes kanban runs <id>'.
Not in v1 (deferred to v2):
- Workflow templates (workflow_templates + workflow_steps tables,
stage-based routing, success/failure step links).
- 'stage' as a distinct axis from status in the UI.
- Shared-by-default workspace binding across stages of the same
workflow run.
- Pipeline replacement for the kanban-orchestrator skill (the
orchestrator's 'decompose, don't execute' guidance is still
correct; it becomes partly redundant once workflows land).
Ports four items from the Multica audit (https://github.com/multica-ai/multica).
Dropped their cross-host server/daemon architecture and their Postgres+pgvector
skill search — both the wrong shape for our single-host SQLite kernel.
1. Per-task max-runtime (`max_runtime_seconds` column)
- New kernel function `enforce_max_runtime(conn)` runs in every dispatch
tick. When a running task's elapsed time exceeds the cap, we SIGTERM
the worker, wait out a 5 s grace period (polling _pid_alive), then SIGKILL. The
task goes back to 'ready' with a `timed_out` event and re-queues
on the next tick (unless the spawn-failure circuit breaker has
already parked it).
- Host-local only: lock prefix must match this host's claimer_id so we
never signal a PID on another machine.
- CLI: `hermes kanban create --max-runtime 30m | 2h | 1d | <seconds>`.
New `_parse_duration` helper accepts s/m/h/d suffixes or bare
integers.
- Dashboard POST body + the card's `max_runtime_seconds` field.
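The duration parsing can be sketched as follows (a sketch of the described s/m/h/d behaviour; the real _parse_duration may differ in detail):

```python
def parse_duration(text: str) -> int:
    """Parse '30m', '2h', '1d', '45s' or a bare number into seconds.

    Garbage raises ValueError, which the CLI turns into a nonzero exit.
    """
    multipliers = {"s": 1, "m": 60, "h": 3600, "d": 86400}
    text = text.strip().lower()
    if text and text[-1] in multipliers:
        return int(float(text[:-1]) * multipliers[text[-1]])
    return int(float(text))  # bare integer/float means seconds
```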
2. Worker heartbeat (`last_heartbeat_at` column, `heartbeat` event)
- `heartbeat_worker(conn, task_id, note=None)` emits the event and
touches last_heartbeat_at. Refused when the task isn't running.
- CLI: `hermes kanban heartbeat <id> [--note "..."]`.
- kanban-worker skill instructs workers to heartbeat during long
loops (training runs, encodes, crawls, batch uploads).
- Separate signal from PID crash detection: a worker's Python can
still be alive while the actual work process is stuck. Heartbeat
absence is diagnostic; future work can auto-block on stale
heartbeats but v1 just surfaces the signal.
3. Assignee enumeration (`known_assignees`, `list_profiles_on_disk`)
- Scans ~/.hermes/profiles/ for dirs containing config.yaml and unions
  them with the current assignees on the board. Each entry returns
  {name, on_disk, counts: {status: n}}.
- CLI: `hermes kanban assignees [--json]`. Also hooked into
`hermes kanban init` which now prints discovered profiles so new
installs see 'these are the assignees you can target' immediately.
- Dashboard: GET /api/plugins/kanban/assignees for the picker.
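The on-disk half of the scan can be sketched as (assumes the profiles layout described above; helper signature is illustrative):

```python
import os

def list_profiles_on_disk(profiles_dir: str) -> list[str]:
    """Sketch: a profile is a directory under profiles_dir that
    contains a config.yaml. Missing dir means no profiles."""
    if not os.path.isdir(profiles_dir):
        return []
    return sorted(
        entry.name
        for entry in os.scandir(profiles_dir)
        if entry.is_dir()
        and os.path.isfile(os.path.join(entry.path, "config.yaml"))
    )
```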
4. Event vocab cleanup (three renames + three new kinds)
- `ready` → `promoted` (fires when deps clear; clearer semantics).
- `priority` → `reprioritized` (past-tense verb, matches others).
- `spawn_auto_blocked` → `gave_up` (short, memorable; the circuit
breaker gave up on this task).
- New: `spawned` (emitted with {pid} on successful spawn),
`heartbeat` ({note?}), `timed_out`
({pid, elapsed_seconds, limit_seconds, sigkill}).
- One-shot migration in `_migrate_add_optional_columns` renames
legacy rows in-place on init_db(), so existing DBs upgrade cleanly.
- Gateway notifier's TERMINAL_KINDS set updated; timed_out gets its
own ⏱ message template, gave_up renamed from 'auto-blocked'.
- plugin_api.py's two 'priority' emit sites renamed to
  'reprioritized'.
- Documented in a new 'Event reference' section in kanban.md,
grouped into three clusters (lifecycle / edits / worker
telemetry) with payload shapes.
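The one-shot rename can be sketched as an idempotent per-kind UPDATE (illustrative; the real migration lives in _migrate_add_optional_columns):

```python
import sqlite3

LEGACY_KIND_RENAMES = {
    "ready": "promoted",
    "priority": "reprioritized",
    "spawn_auto_blocked": "gave_up",
}

def migrate_event_kinds(conn: sqlite3.Connection) -> None:
    """Rename legacy event kinds in place. Idempotent: a second run
    finds no legacy rows and changes nothing."""
    for old, new in LEGACY_KIND_RENAMES.items():
        conn.execute(
            "UPDATE task_events SET kind = ? WHERE kind = ?", (new, old)
        )
    conn.commit()
```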
Tests (+18 in tests/hermes_cli/test_kanban_core_functionality.py,
136/136 pass):
- max_runtime_terminates_overrun_worker: real SIGTERM flow with
_pid_alive stub, verifies event payload + state reset.
- max_runtime_none_means_no_cap: unbounded tasks aren't timed out.
- create_task_persists_max_runtime.
- enforce_max_runtime_integrates_with_dispatch: kernel-level +
dispatch_once chaining.
- heartbeat_on_running_task + heartbeat_refused_when_not_running.
- cli_heartbeat_verb with --note round-trip.
- recompute_ready_emits_promoted_not_ready.
- spawn_failure_circuit_breaker_emits_gave_up.
- spawned_event_emitted_with_pid.
- migration_renames_legacy_event_kinds (injects old rows, re-runs
init_db, asserts rename).
- list_profiles_on_disk (tmp_path + config.yaml filter).
- known_assignees_merges_disk_and_board (profiles on disk + board
assignees + per-status counts).
- cli_assignees_json.
- parse_duration_accepts_formats (s/m/h/d/float).
- parse_duration_rejects_garbage.
- cli_create_max_runtime_via_duration (2h → 7200).
- cli_create_max_runtime_bad_format_exits_nonzero.
Live smoke: POST /tasks with max_runtime_seconds round-trips;
/assignees returns the union of on-disk + board-assigned names;
PATCH priority produces 'reprioritized' events (not 'priority');
board cards expose max_runtime_seconds + last_heartbeat_at.
Docs (website/docs/user-guide/features/kanban.md):
- New 'Event reference' section with three-cluster table
(lifecycle / edits / worker telemetry) + payload shapes.
- CLI reference updated for --max-runtime, heartbeat, assignees.
- Gateway notifications section updated for the new TERMINAL_KINDS.
Not ported from Multica (deliberate, documented in the out-of-scope
section already): Postgres+pgvector skill search (heavy deps conflict
with SQLite kernel), server+daemon cross-host model (we're
single-host on purpose), first-class agent identity with threaded
comments (we keep the board profile-agnostic).
Eliminates every 'known broken on day one' item in the core functionality
audit. The board is now self-driving (daemon, not cron), self-healing
(crash detection, spawn-failure circuit breaker), and self-reporting
(logs, stats, gateway notifications).
Dispatcher
- New `hermes kanban daemon` long-lived loop with --interval, --max,
--failure-limit, --pidfile, --verbose, signal-clean shutdown
(SIGINT/SIGTERM via threading.Event). A kb.run_daemon() entry point
lets tests drive it inline without subprocess.
- `hermes kanban init` now prints the dispatcher setup hint so users
don't leave the board off-by-default. Ships a systemd user unit at
plugins/kanban/systemd/hermes-kanban-dispatcher.service.
- Removed the old 'add this to cron' doc path. Cron runs agent
prompts (LLM cost per tick) — unacceptable for a per-minute
coordination loop.
Worker aliveness / safety
- Spawn returns the child's PID; dispatcher stores it on the task row
and calls detect_crashed_workers() every tick. If the PID is gone
but the claim TTL hasn't expired, the task drops back to ready with
a 'crashed' event. Host-local only — cross-host PIDs are ignored
per the single-host design.
- Spawn-failure circuit breaker: after N consecutive spawn_failed
events on the same task (default 5), the dispatcher auto-blocks
with the last error as the reason. Success resets the counter.
Workspace-resolution failures count against the same budget.
- Log rotation: _rotate_worker_log trims at 2 MiB, keeps one
generation (.log.1), bounds per-task disk usage at ~4 MiB.
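The rotation rule can be sketched as follows (hypothetical helper mirroring _rotate_worker_log's described behaviour; the real code also reopens the fresh log):

```python
import os

def rotate_worker_log(path: str, max_bytes: int = 2 * 1024 * 1024) -> bool:
    """Sketch: when the log exceeds max_bytes, move it to <path>.1 and
    let the next write start a fresh file. Keeping exactly one rotated
    generation bounds per-task disk usage at roughly 2 * max_bytes."""
    try:
        if os.path.getsize(path) <= max_bytes:
            return False
    except OSError:
        return False  # no log yet
    os.replace(path, path + ".1")  # clobbers the previous .1 generation
    return True
```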
Idempotency / dedup
- create_task(idempotency_key=...) returns the existing non-archived
task id for retried webhooks. --idempotency-key on the CLI, json
body field on the dashboard plugin. Archived tasks don't block a
fresh create with the same key.
CLI surface
- Bulk verbs: complete, unblock, archive accept multiple ids;
block accepts --ids for sibling blocks with the same reason.
- New verbs: daemon, watch (live event tail filtered by
assignee/tenant/kinds), stats, log, notify-subscribe,
notify-list, notify-unsubscribe.
- dispatch gains --failure-limit + crashed/auto_blocked columns in
JSON output and human-readable output.
- gc accepts --event-retention-days / --log-retention-days; prunes
task_events for terminal tasks and old log files.
Gateway integration
- New GatewayRunner._kanban_notifier_watcher: polls
kanban_notify_subs every 5s, pushes ✔/⏸/✖ messages to subscribed
chats for completed/blocked/spawn_auto_blocked/crashed events.
Cursor-advanced per-sub; auto-removed when the task reaches
done/archived. Runs alongside the session expiry and platform
reconnect watchers — SQLite work in asyncio.to_thread so the
event loop never blocks.
- /kanban create in the gateway auto-subscribes the originating
chat (platform + chat_id + thread_id). Users see
'(subscribed — you'll be notified when t_abcd completes or
blocks)' appended to the response.
Dashboard plugin
- GET /stats returns board_stats (by_status, by_assignee,
oldest_ready_age_seconds).
- GET /tasks/:id/log returns the worker log with optional ?tail=N
cap. 404 on unknown task, exists=false when the task has never
spawned.
- POST /tasks accepts idempotency_key; both Pydantic body and the
create_task kwarg now round-trip.
- /board attaches task.age (created/started/time_to_complete in
seconds) so the UI can colour stale cards without recomputing.
- Card CSS: amber border after N minutes, red border when clearly
stuck (tier per status: running 10m/60m, ready 1h/24h, todo
7d/30d, blocked 1h/24h).
- Drawer: new Worker log section, auto-loads on mount, last 100 KB
cap with on-disk path surfaced when truncated.
Kernel
- Schema additions: tasks.idempotency_key, tasks.spawn_failures,
tasks.worker_pid, tasks.last_spawn_error; new
kanban_notify_subs table. All gated by _migrate_add_optional_columns
so legacy DBs upgrade cleanly.
- release_stale_claims / complete_task / block_task now all clear
worker_pid so crash detection doesn't false-positive on reclaimed
tasks.
- read_worker_log fixed: tail-skip no longer eats one-giant-line
logs (common with child processes that don't flush newlines
before dying).
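The shape of the fix: cap on bytes from the end instead of skipping whole lines, so a newline-free log is still returned, just truncated (a sketch, not the shipped reader):

```python
import os

def read_log_tail(path: str, max_bytes: int = 64 * 1024) -> str:
    """Read at most the last max_bytes of a log, byte-wise.

    Seeking from the end means a log that is one giant unterminated
    line comes back truncated instead of empty."""
    with open(path, "rb") as fh:
        fh.seek(0, os.SEEK_END)
        size = fh.tell()
        fh.seek(max(0, size - max_bytes))
        return fh.read().decode("utf-8", errors="replace")
```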
Tests (tests/hermes_cli/test_kanban_core_functionality.py, 28 new)
- Idempotency: same key returns existing, archived doesn't block,
no key never collides
- Circuit breaker: auto-blocks after limit, success resets counter,
workspace-resolution failure counts against budget
- Aliveness: _pid_alive helper, detect_crashed_workers reclaims
exited child
- Daemon: runs and stops cleanly via stop_event, survives a tick
exception
- Stats + task_age helpers
- Notify subs: CRUD, cursor advances, distinct-thread is a separate row
- GC: events-only-for-terminal-tasks, old worker logs deleted
- Log: rotation keeps one generation, read_worker_log tail
- CLI: bulk complete/archive/unblock/block, create with
--idempotency-key, stats --json, notify-subscribe+list, log
missing task, gc reports counts
- run_slash parity: smoke-tests every registered verb (23
  invocations); none may raise or return an empty string
Full kanban test suite: 234/234 pass under scripts/run_tests.sh
(60 original + 30 dashboard plugin + 28 new core + 116 command
registry). Live smoke covers /stats, idempotency, age, log endpoint
with and without content, log?tail= truncation signal, 404 on unknown
task.
Docs (website/docs/user-guide/features/kanban.md)
- 'Core concepts' rewritten: new statuses (triage), idempotency key,
dispatcher-as-daemon-not-cron with circuit breaker behaviour
documented.
- Quick start swapped to daemon. New systemd section covers user
service install.
- New sections: idempotent create, bulk verbs, gateway
notifications, out-of-scope single-host note (kanban.db is local;
don't expect multi-host).
- CLI reference updated for every new verb, every new flag.
The dashboard plugin gets the last layer of features that turn it from a
'usable read surface with drag-drop' into a 'full kanban UI' — no more
'drop to CLI to do X' moments from inside the tab.
Plugin backend
- POST /tasks/bulk — apply the same patch (status / archive / assignee
/ priority) to every id in the request body. Each id runs
independently: one bad id reports {ok: false, error: ...} without
aborting siblings. Status transitions that aren't legal for the
current state are surfaced per-id ('transition to done refused').
Used by the multi-select bulk action bar.
- GET /config — returns the dashboard.kanban section of config.yaml
(default_tenant, lane_by_profile, include_archived_by_default,
render_markdown) with sensible defaults when the section is absent.
Loaded once by the SPA to preselect filters and toggle markdown
rendering.
- _conn() helper — every handler now goes through it, calling
kanban_db.init_db() (idempotent) before every connection. Fresh
installs work whether the first hit is GET /board, POST /tasks, or
any other endpoint — no more 'no such table: tasks' when the CLI
or a script hits the plugin before the dashboard has ever loaded.
Plugin UI (plugin bundle, +~12 KB)
- Multi-select: per-card checkbox; shift/ctrl-click also toggles
without opening the drawer. A BulkActionBar appears above the
columns with batch → ready / complete / archive / reassign
(profile dropdown + unassign option). Destructive batches confirm
first. Partial failures from the backend are surfaced inline.
- Drawer inline editing:
- Click the title → TitleEditor swaps in an input, Enter saves,
Escape cancels.
- Click the Assignee meta row → AssigneeEditor input (empty string
unassigns).
- Click the Priority meta row → PriorityEditor numeric input.
- New 'edit' button on Description → full-width textarea; Save /
Cancel switch back to rendered view.
- Dependency editor: chip list of parents + children with per-chip
× button (calls DELETE /links). Add-parent / add-child dropdowns
filter out self + already-linked tasks so you cannot re-add a
duplicate edge or a self-loop. Cycle rejections from the server
surface cleanly via the existing error banner.
- Parent selection in InlineCreate: new dropdown listing every task
on the board ('{id} — {title}') — picking one sends parents=[id]
with the create payload, so the task lands in todo (or triage if
created from the Triage column) with the dependency wired up.
- Safe markdown rendering for description, comment bodies, and
result. A small in-bundle renderer handles headings, bold, italic,
inline code, fenced code, bullet lists, and http(s)/mailto links.
Every substitution runs on HTML-escaped input (no raw HTML); links
get target=_blank + rel="noopener noreferrer". Disabled by config
key dashboard.kanban.render_markdown=false (falls back to <pre>).
- Touch drag-drop: attachTouchDrag() installs a pointerdown handler
that spawns a drag proxy, tracks elementFromPoint under the finger,
and dispatches a hermes-kanban:drop CustomEvent on the column when
released. Desktop continues to use native HTML5 DnD. Columns
listen for both.
- ErrorBoundary already present from the prior commit catches any
renderer throw; markdown escape + touch-proxy cleanup both have
their own try/finally.
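The escape-before-substitute ordering in the markdown bullet above is the whole safety argument. The shipped renderer is JS in the bundle; this Python sketch (hypothetical function name) just shows the same ordering for a few inline rules:

```python
import html
import re

def render_inline(text: str) -> str:
    """Sketch: escape first, then substitute into the escaped string,
    so raw HTML from the input can never survive into the output."""
    out = html.escape(text, quote=True)
    out = re.sub(r"`([^`]+)`", r"<code>\1</code>", out)
    out = re.sub(r"\*\*([^*]+)\*\*", r"<strong>\1</strong>", out)
    out = re.sub(
        r"(https?://[^\s<]+)",
        r'<a href="\1" target="_blank" rel="noopener noreferrer">\1</a>',
        out,
    )
    return out
```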
Tests (tests/plugins/test_kanban_dashboard_plugin.py — 90/90 pass)
- bulk_status_ready: 3 tasks blocked, batch → ready, all move
- bulk_archive hides all ids from default board
- bulk_reassign changes every assignee
- bulk_unassign_via_empty_string sets assignee back to None
- bulk_partial_failure_doesnt_abort_siblings: bogus id in middle,
good siblings still get priority=7
- bulk_empty_ids_400
- config_returns_defaults_when_section_missing
- config_reads_dashboard_kanban_section (writes config.yaml, verifies
every key round-trips)
Live smoke (real FastAPI app + isolated HERMES_HOME):
- /config without section returns defaults
- /config with dashboard.kanban section returns the configured values
- POST /tasks as the first-ever request (no prior /board) succeeds —
auto-init handles it
- Link add + remove via POST /links + DELETE /links round-trip
- Bulk priority bump on 2 ids, both get priority=5
- Bulk archive hides ids from default board
- PATCH {title, body} updates the task, markdown source survives
the round trip
- POST /tasks {triage: true, parents: [id]} lands in triage, not todo
- Bulk partial: 2 good + 1 bogus returns per-id outcome
Docs (website/docs/user-guide/features/kanban.md)
- 'What the plugin gives you' rewritten to reflect bulk, drawer
edit, dep editor, parent-on-create, markdown, touch drag-drop.
- New 'Dashboard config' subsection with a YAML example for
dashboard.kanban.*.
- REST table gains /tasks/bulk and /config rows.
Follows up on the initial dashboard plugin with the items called out
during self-review — ships the GUI-reality claims the PR body made,
closes the WebSocket auth gap, and lands the 'Triage' status the design
spec's Fusion-style screenshot leads with.
Kernel changes
- kanban_db.VALID_STATUSES gains 'triage'. status is TEXT without a
CHECK constraint so no schema migration is needed.
- create_task(triage=True) forces the initial status to 'triage'
regardless of parents, and parent ids are still validated so the
eventual link rows don't dangle. recompute_ready() only promotes
'todo' -> 'ready', so triage tasks are naturally isolated from the
dispatcher pipeline.
- hermes kanban create gains --triage.
Patterns table (docs) gains P9 'Triage specifier'.
Plugin backend (plugins/kanban/dashboard/plugin_api.py)
- GET /board now auto-inits kanban.db on first read (idempotent).
A fresh install shows an empty board instead of 'failed to load'.
- GET /board returns a new 'progress' field per task — {done, total}
of child-task completion, or None if the task has no children.
- BOARD_COLUMNS prepends 'triage'.
- POST /tasks accepts {triage: bool}; PATCH /tasks/:id accepts
{status: 'triage'}.
- WebSocket /events now requires ?token=<session_token> as a query
param — browsers can't set Authorization on a WS upgrade, so this
matches the pattern the in-browser PTY bridge uses. Constant-time
compare against hermes_cli.web_server._SESSION_TOKEN. In bare-test
contexts (no dashboard module) the check no-ops so the tail loop
stays testable. Security boundary documented in the module header
and in website/docs/user-guide/features/kanban.md.
Plugin UI (plugins/kanban/dashboard/dist/index.js + style.css)
- Adds the Triage column (lilac dot) with helper text
'Raw ideas — a specifier will flesh out the spec'. Inline-create
from the Triage column parks new tasks in triage.
- Status action row in the drawer gains '→ triage'.
- Progress pill (N/M) on cards that have children. Full-complete
state tints the pill green.
- 'Lanes by profile' toolbar toggle — sub-groups the Running column
by assignee so you see at a glance which specialist is busy on
what.
- Destructive status moves (done / archived / blocked) via drag-drop
OR via the drawer action row now prompt for confirmation.
- Escape closes the drawer.
- Live-update reloads are debounced (250ms) so a burst of
task_events triggers one refetch, not N.
- WebSocket includes ?token= built from window.__HERMES_SESSION_TOKEN__.
- WebSocket reconnect uses exponential backoff capped at 30s, not
a fixed 1.5s spin loop, and surfaces a user-visible error on
code-1008 (auth rejected) instead of reconnecting forever.
- ErrorBoundary wraps the page — a bad card render shows a
'rendering error, reload view' card instead of crashing the tab.
Tests (tests/plugins/test_kanban_dashboard_plugin.py, +5 tests = 21)
- empty-board shape now asserts all 6 columns including 'triage'
- create_triage_lands_in_triage_column
- triage_task_not_promoted_to_ready (dispatcher bypasses triage)
- patch_status_triage_works (both into triage and out of it)
- board_progress_rollup (0/2 -> 1/2 -> childless cards = None)
- board_auto_initializes_missing_db
- ws_events_rejects_when_token_required (three sub-assertions:
missing → 1008, wrong → 1008, correct → handshake accepted)
All 82 kanban tests pass under scripts/run_tests.sh.
Docs
- kanban.md 'What the plugin gives you' fully rewritten to match
shipped reality (triage, progress pill, assignee lanes,
destructive-confirm, Escape-close, debounce).
- New 'Security model' subsection documents the explicit-plugin-
route-bypass, the WS token requirement, and the --host 0.0.0.0
warning; also notes that kanban.db is profile-agnostic on purpose
(the coordination primitive) so cross-profile visibility is
expected.
- CLI command reference shows --triage.
- Collaboration patterns table adds P9 'Triage specifier'.
Ships plugins/kanban/dashboard/ as a bundled dashboard plugin. No core
changes — uses the standard dashboard plugin contract (manifest.json +
dist/index.js + plugin_api.py) documented in 'Extending the Dashboard'.
What the tab gives you:
- One column per kanban status (todo / ready / running / blocked / done;
archived behind a toggle), column counts, coloured status dots.
- Cards with id, title, priority badge, tenant tag, assignee,
comment/link counts, 'created N ago'.
- HTML5 drag-drop between columns — status change routes through the
same kanban_db code the CLI /kanban verbs use, so the three surfaces
(CLI, gateway, dashboard) can never drift.
- Inline create per-column (title, assignee, priority).
- Side drawer on card click: description, status action row
(→ ready / → running / block / unblock / complete / archive),
dependency links, comment thread with Enter-to-submit,
last 20 events.
- Toolbar: search, tenant filter, assignee filter, show-archived,
nudge-dispatcher (skip the 60s wait), refresh.
- Live updates via WebSocket tailing task_events — the board reflects
CLI or gateway actions in real time.
REST surface under /api/plugins/kanban/: GET /board, GET /tasks/:id,
POST /tasks, PATCH /tasks/:id, POST /tasks/:id/comments, POST /links,
DELETE /links, POST /dispatch, WS /events. Every handler is a thin
wrapper around kanban_db — no new business logic.
Visually theme-aware: the plugin CSS reads only --color-*, --radius,
--font-mono etc. so it reskins with whichever dashboard theme is active.
Tests (tests/plugins/test_kanban_dashboard_plugin.py, 16 tests):
- empty board shape
- create + appears in ready column with tenant/assignee rollups
- tenant filter
- detail includes parents/children/events
- 404 on unknown task
- PATCH status: complete / block / unblock / ready drag-drop / running
- PATCH reassign, priority, edit, invalid-status rejection
- POST comment (plus empty-body rejection)
- POST link + DELETE link + cycle rejection
- POST dispatch (dry run)
All 76 kanban tests pass under scripts/run_tests.sh.
Docs: website/docs/user-guide/features/kanban.md gains a full
'Dashboard (GUI)' section covering install, architecture, REST surface,
live-updates mechanism, extending, and scope boundary.
The /kanban CLI + slash command are enough to run the board
headlessly, but triage and cross-profile supervision want a
visual board. Document the design as a dashboard plugin that:
- reads live state from kanban.db over a WebSocket on
task_events (no polling)
- writes through run_slash() so CLI/gateway/GUI cannot drift
- mounts under /api/plugins/kanban/ following the existing
'Extending the Dashboard' plugin shape
The plugin is strictly a thin layer over kanban_db — no new
business logic, nothing to merge into the kernel.
New `hermes kanban` CLI subcommand + `/kanban` slash command + skills for
worker and orchestrator profiles. SQLite-backed task board
(~/.hermes/kanban.db) shared across all profiles on the host. Zero
changes to run_agent.py, no new core tools, no tool-schema bloat.
Motivation: delegate_task is a function call — sync fork/join, anonymous
subagent, no resumability, no human-in-the-loop. Kanban is the durable
shape needed for research triage, scheduled ops, digital twins,
engineering pipelines, and fleet work. They coexist (workers may call
delegate_task internally).
What this adds
- hermes_cli/kanban_db.py — schema, CAS claim, dependency resolution,
dispatcher, workspace resolution, worker-context builder.
- hermes_cli/kanban.py — 15-verb CLI surface and shared run_slash()
entry point used by both CLI and gateway.
- skills/devops/kanban-worker — how a profile should work a claimed task.
- skills/devops/kanban-orchestrator — "you are a dispatcher, not a
worker" template with anti-temptation rules.
- /kanban slash command wired into cli.py and gateway/run.py. Bypasses
the running-agent guard (board writes don't touch agent state), so
/kanban unblock can free a stuck worker mid-conversation.
- Design spec at docs/hermes-kanban-v1-spec.pdf — comparative analysis
vs Cline Kanban, Paperclip, NanoClaw, Gemini Enterprise; 8 patterns;
4 user stories; implementation plan; concurrency correctness.
- Docs: website/docs/user-guide/features/kanban.md, CLI reference
updated, sidebar entry added.
Architecture highlights
- Three planes: control (user + gateway), state (board + dispatcher),
execution (pool of profile processes).
- Every worker is a full OS process, spawned as `hermes -p <profile>`.
No in-process subagent swarms — solves NanoClaw's SDK-lifecycle
failure class.
- Atomic claim via SQLite CAS in a BEGIN IMMEDIATE transaction; stale
claims reclaimed 15 min after their TTL expires.
- Tenant namespacing via one nullable column — one specialist fleet
can serve many businesses with data isolation by workspace path.
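The claim step can be sketched as follows (table and column names are assumptions for illustration; the real schema lives in hermes_cli/kanban_db.py):

```python
import sqlite3
import time

def claim_task(conn, task_id, worker_pid, ttl_s=900):
    # BEGIN IMMEDIATE takes the write lock up front; the UPDATE's WHERE
    # clause is the compare-and-swap — it only fires while the task is
    # unclaimed or its previous claim has expired.
    now = time.time()
    try:
        conn.execute("BEGIN IMMEDIATE")
        cur = conn.execute(
            "UPDATE tasks SET worker_pid = ?, claim_expires = ? "
            "WHERE id = ? AND (worker_pid IS NULL OR claim_expires < ?)",
            (worker_pid, now + ttl_s, task_id, now),
        )
        conn.commit()
        return cur.rowcount == 1  # True iff this process won the claim
    except Exception:
        conn.rollback()
        raise
```

Two workers racing on the same task serialize on the IMMEDIATE lock, and exactly one of them sees rowcount == 1.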
Tests: 60 targeted tests (schema, CAS atomicity, dependency resolution,
dispatcher, workspace kinds, tenancy, CLI + slash surface). All pass
hermetically via scripts/run_tests.sh.
The background memory/skill review (_spawn_background_review) has always
forked a new AIAgent passing only model and provider, then relied on
AIAgent.__init__ to re-resolve credentials from env vars. That env-var
auto-resolution happens silently in all cases: it works for users with
keys in ~/.hermes/.env, but fails for OAuth-only providers,
session-scoped creds, and credential-pool setups where auth can't be
reconstructed from env.
This used to be invisible -- failures were swallowed via logger.debug().
PR 8a2506af4 (Apr 24) surfaced auxiliary failures to the user, which
made the stale bug visible as:
"Auxiliary background review failed: No LLM provider configured"
Fix: pass api_key, base_url, api_mode, and credential_pool from the
parent's live runtime into the fork -- matching how every other
auxiliary path (compression, memory flush, vision, session search)
already inherits the parent's credentials via _current_main_runtime().
The chown/chmod block on config.yaml was added in b24d239ce to keep the
file readable by the hermes runtime user, but it sat in the post-gosu
'running as hermes' section of the entrypoint. That meant:
1. Default `docker run <image>` — container starts as root, entrypoint
drops to hermes via gosu, then non-root hermes tries to chown the
file to hermes. Works by coincidence because the file was just
created by root during volume setup and gosu target == target owner.
2. `docker run -u $(id -u):$(id -g) <image>` (#15865) — container
starts as the caller's UID. The root block is skipped entirely, we
land in the hermes section as some arbitrary non-root user, and
chown to 'hermes' fails with 'Operation not permitted'. Script
aborts under `set -e`.
Move the chown/chmod into the root block (before the gosu exec) where
it actually has privilege, and guard with `2>/dev/null || true` so
rootless Podman (where even in-container root lacks host-side chown
rights) doesn't abort either.
Closes #15865
Salvage PR #15883 cherry-picked FocusFlow Dev's commit; release-notes
CI needs the AUTHOR_MAP entry to attribute to the PR author's GitHub
login rather than a placeholder.
Follow-up to PR #16053 (/btw as /background alias). Cleans up the
plumbing added exclusively for the old ephemeral /btw handler and
repairs a broken btw bypass that landed between my refactor and this
follow-up.
run_agent.py:
- Remove persist_session kwarg, instance attr, and _persist_session
short-circuit. Only /btw ever passed persist_session=False; with
/btw gone the default (always persist) is the only behavior anyone
ever wanted.
gateway/run.py:
- Remove the unreachable 'if _cmd_def_inner.name == "btw"' block
(PR #16059). The canonical name for a /btw message is 'background' after
alias resolution — the comparison could never be true, and it called
_handle_btw_command, which no longer exists. The /background branch
above it already dispatches /btw correctly.
tests/gateway/test_running_agent_session_toggles.py:
- Fix test_btw_dispatches_mid_run to mock _handle_background_command
(the real dispatch target for /btw) instead of the deleted
_handle_btw_command.
/btw spawns a parallel ephemeral side-question task (self-guarded against
concurrent /btw on the same chat) — exactly like /background. But it was
missing from the running-agent bypass list in _handle_message(), so it
fell through to the catch-all and returned:
⏳ Agent is running — /btw can't run mid-turn. Wait for the current
response or /stop first.
That's the opposite of what /btw is for — asking a side question while
the main turn is still working. Add the bypass next to /background and a
regression test covering the mid-turn dispatch path.
Reported by @IuriiTiunov on Telegram.
The ephemeral no-tools side-question variant of /btw confused users who
expected 'by-the-way' to mean 'run this off to the side with tools' —
they'd type /btw and get a toolless agent that couldn't do the work.
/bg worked because it was /background with full tools.
Collapse the two: /btw and /bg both alias to /background. One command,
one behavior, no more gotchas about which variant has tools.
Removed:
- _handle_btw_command in cli.py and gateway/run.py
- _run_btw_task + _active_btw_tasks state in gateway/run.py
- prompt.btw JSON-RPC method + btw.complete event in tui_gateway
- BtwStartResponse type + btw.complete case in ui-tui
- Standalone /btw slash tree registration in Discord
- Standalone btw CommandDef in hermes_cli/commands.py
Updated:
- background CommandDef aliases: (bg,) -> (bg, btw)
- TUI session.ts: local btw handler merged into background
- Docs and tips updated to describe /btw as a /background alias
PR #16046 added /busy and /verbose hints to the classic CLI and the
gateway runner but skipped the Ink TUI (and therefore the dashboard
/chat page, which embeds the TUI via PTY). This extends the same
latch to the TUI with TUI-native wording.
The TUI's busy-input model is not the /busy knob from the CLI —
single Enter while busy auto-queues, double Enter on an empty line
interrupts. The new busy-input hint teaches THAT gesture instead of
telling the user to flip a config that does not apply.
Changes:
- agent/onboarding.py — add busy_input_hint_tui() + tool_progress_hint_tui()
- tui_gateway/server.py — onboarding.claim JSON-RPC (Ink triggers busy
hint on enqueue) + _maybe_emit_onboarding_hint helper hooked into
_on_tool_complete for the 30s/tool_progress=all path. Same
config.yaml latch so each hint fires at most once per install across
CLI, gateway, and TUI combined.
- ui-tui/src/gatewayTypes.ts — OnboardingClaimResponse + onboarding.hint event
- ui-tui/src/app/createGatewayEventHandler.ts — render the hint event as sys()
- ui-tui/src/app/useSubmission.ts — claim busy_input_prompt on first
busy enqueue
- tests/agent/test_onboarding.py — +3 cases for TUI hint shape
- tests/tui_gateway/test_protocol.py — +4 cases for onboarding.claim
- website/docs/user-guide/tui.md — new 'Interrupting and queueing'
section explaining the TUI's double-Enter model and the hints
Validation:
scripts/run_tests.sh tests/agent/test_onboarding.py \
tests/tui_gateway/test_protocol.py \
tests/gateway/test_busy_session_ack.py
-> 66 passed
npm --prefix ui-tui run type-check -> clean
npm --prefix ui-tui run lint -> clean
npm --prefix ui-tui run build -> clean
Manage the fallback_providers chain from the CLI instead of hand-editing
config.yaml. The picker reuses select_provider_and_model() from 'hermes
model' — same provider list, same credential prompts, same model picker.
hermes fallback [list] Show the current chain (primary + fallbacks)
hermes fallback add Run the model picker, append selection to chain
hermes fallback remove Pick an entry to delete (arrow-key menu)
hermes fallback clear Remove all entries (with confirmation)
'add' snapshots config['model'] before calling the picker, extracts the
user's selection from the post-picker state, then restores the primary
and appends {provider, model, base_url?, api_mode?} to fallback_providers.
Auth store's active_provider is snapshot/restored too so OAuth-provider
fallbacks don't silently deactivate the user's primary. Duplicates and
self-as-fallback are rejected. Legacy single-dict 'fallback_model' entries
are auto-migrated to the list format on first write.
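The snapshot/restore dance in 'add' looks roughly like this (config keys taken from the text above; run_picker stands in for select_provider_and_model(), and the auth-store snapshot is omitted):

```python
import copy

def add_fallback(config, run_picker):
    # Snapshot the primary, let the picker mutate config["model"],
    # harvest the selection, then restore the primary and append the
    # selection to the fallback chain.
    primary = copy.deepcopy(config["model"])
    run_picker(config)
    picked = config["model"]
    entry = {k: picked[k]
             for k in ("provider", "model", "base_url", "api_mode")
             if picked.get(k) is not None}
    config["model"] = primary
    chain = config.setdefault("fallback_providers", [])
    is_self = (entry.get("provider"), entry.get("model")) == (
        primary.get("provider"), primary.get("model"))
    if entry not in chain and not is_self:  # reject dupes / self-as-fallback
        chain.append(entry)
    return entry
```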
Instead of a blocking first-run questionnaire, show a one-time hint the first
time the user hits each behavior fork:
1. First message while the agent is working — appends a hint to the busy-ack
explaining the /busy queue vs /busy interrupt knob, phrased to match the
mode that was just applied (don't tell a queue-mode user to switch to
queue).
2. First tool that runs for >= 30s in the noisiest progress mode
(tool_progress: all) — prints a hint about /verbose to cycle display
modes (all -> new -> off -> verbose). Gated on /verbose actually being
usable on the surface: always shown on CLI; on gateway only shown when
display.tool_progress_command is enabled.
Each hint is latched in config.yaml under onboarding.seen.<flag>, so it
fires exactly once per install across CLI, gateway, and cron, then never
again. Users can wipe the section to re-see hints.
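The latch reduces to a few lines over the loaded config dict (function names from the text; persistence via atomic_yaml_write omitted):

```python
def is_seen(config, flag):
    return bool(config.get("onboarding", {}).get("seen", {}).get(flag))

def mark_seen(config, flag):
    # Latch under onboarding.seen.<flag>; the real code persists the
    # updated config with atomic_yaml_write afterwards.
    config.setdefault("onboarding", {}).setdefault("seen", {})[flag] = True

def maybe_hint(config, flag, hint):
    # Fire-at-most-once: return the hint text the first time, None after.
    if is_seen(config, flag):
        return None
    mark_seen(config, flag)
    return hint
```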
New:
- agent/onboarding.py — is_seen / mark_seen / hint strings, shared by
both CLI and gateway.
- onboarding.seen in DEFAULT_CONFIG (hermes_cli/config.py) and in
load_cli_config defaults (cli.py). No _config_version bump — deep
merge handles new keys.
Wired:
- gateway/run.py: _handle_active_session_busy_message appends the hint
after building the ack. progress_callback tracks tool.completed
duration and queues the tool-progress hint into the progress bubble.
- cli.py: CLI input loop appends the busy-input hint on the first busy
Enter; _on_tool_progress appends the tool-progress hint on the first
>=30s tool completion. In-memory CLI_CONFIG is also updated so
subsequent fires in the same process are suppressed immediately.
All writes go through atomic_yaml_write and are wrapped in try/except
so onboarding can never break the input/busy-ack paths.
The base adapter's auto-TTS path fired on any voice message unless the
chat had explicitly run /voice off — it never read voice.auto_tts from
config.yaml, so users who set auto_tts: false still got audio replies.
Gate the base adapter on a three-layer decision instead:
1. chat in _auto_tts_enabled_chats (explicit /voice on|tts) → fire
2. chat in _auto_tts_disabled_chats (explicit /voice off) → suppress
3. else → voice.auto_tts global default
Runner now pushes voice.auto_tts onto the adapter as _auto_tts_default
and mirrors /voice on|tts chats into _auto_tts_enabled_chats via the
existing _sync_voice_mode_state_to_adapter path. /voice off still wins.
Closes #16007.
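The three-layer gate is a straightforward precedence check (a sketch; the real sets live on the adapter as _auto_tts_enabled_chats / _auto_tts_disabled_chats):

```python
def should_auto_tts(chat_id, enabled_chats, disabled_chats, default):
    # 1. explicit /voice on|tts wins
    if chat_id in enabled_chats:
        return True
    # 2. explicit /voice off wins over the global default
    if chat_id in disabled_chats:
        return False
    # 3. fall back to voice.auto_tts from config.yaml
    return default
```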
Users who run `hermes setup` get `cli-config.yaml.example` copied verbatim
(including comments) to ~/.hermes/config.yaml. But several display settings
had thin comments that didn't enumerate the valid options, so users couldn't
tell from reading their config what values each key accepts.
- busy_input_mode: widen from 'CLI' to 'CLI and gateway platforms';
note /stop as gateway equivalent of Ctrl+C; add /busy_input_mode runtime hint
- compact, interim_assistant_messages, bell_on_complete, show_reasoning,
streaming: add true/false option lines showing effect of each value
- skin: refresh the built-in skin list (was missing daylight, warm-lightmode,
poseidon, sisyphus, charizard — 5 of 9 built-ins undocumented)
When the LLM response carries N parallel tool calls, the agent fires
N tool.started events back-to-back before its interrupt check runs.
A user sending /stop mid-batch would see the '⚡ Interrupting current
task' ack followed by a trail of 🔍 web_search bubbles for the remaining
events in the batch — making the interrupt feel ignored.
progress_callback and the drain loop in send_progress_messages now
check agent.is_interrupted (via agent_holder[0], the existing
cross-scope handle). Events that arrive after interrupt are dropped
at both the queueing and rendering stages. The '⚡ Interrupting'
message is sent through a separate adapter path and is unaffected.
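The queueing-stage check can be sketched like this (agent_holder mirrors the cross-scope handle described above; attribute names assumed):

```python
def make_progress_callback(agent_holder, queue):
    # Drop tool progress events that arrive after the user interrupted;
    # the same check guards the drain loop at render time.
    def progress_callback(event):
        agent = agent_holder[0]
        if agent is not None and agent.is_interrupted:
            return  # late event from the parallel tool batch — swallow it
        queue.append(event)
    return progress_callback
```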
Follow-up on #16020 salvage. Three corrections:
1. Truth signal for /copy
Before: success was 'OSC 52 sequence was emitted to stdout'. That's
false on local Linux inside tmux (emitSequence=false), so /copy kept
printing 'clipboard copy failed' to users whose xclip/wl-copy had
already succeeded fire-and-forget.
Fix: setClipboard() now returns { sequence, success } where success =
native-fired OR tmux-buffer-loaded OR osc52-emitted. copyNative()
returns a boolean telling setClipboard whether a native attempt was
made. /copy only shows 'failed' when literally no path was taken.
2. Dashboard keybinding
Before: Ctrl+C for copy on non-Mac (Ctrl+Shift+C for paste).
That swallows SIGINT when a stale selection is present and breaks
the xterm/gnome-terminal/konsole/Windows-Terminal convention where
Ctrl+C in a terminal emulator is always SIGINT. The real bug was
that clipboard writes lost user-gesture through OSC-52 round-trips,
which the direct writeText already fixes.
Fix: revert copyModifier to Ctrl+Shift+C on non-Mac. Direct
writeText in the keydown handler preserves user gesture. term.write
Escape replaced with term.clearSelection() (works without relying
on TUI input mode).
3. Error toast text
Before: 'see HERMES_TUI_DEBUG_CLIPBOARD' — tells users how to
debug but not how to fix.
Fix: point users at HERMES_TUI_FORCE_OSC52=1 first (the actual
escape hatch), mention the debug var second.
- Dashboard copy: direct Clipboard API on Ctrl+C/Cmd+C (user gesture);
send Escape to TUI to clear selection; Ctrl+Shift+C kept as fallback.
- TUI /copy: copySelection() async; only reports success if OSC52 emitted.
- Add HERMES_TUI_FORCE_OSC52 env var to override native-tool detection.
- Fixes "copied N chars" false-positive when clipboard backend absent.
Changes:
web/src/pages/ChatPage.tsx — direct navigator.clipboard.writeText
ui-tui/packages/hermes-ink/src/ink/ink.tsx — async copySelection
ui-tui/packages/hermes-ink/src/ink/termio/osc.ts — HERMES_TUI_FORCE_OSC52
ui-tui/src/app/slash/commands/core.ts — async /copy with honest feedback
Problem: Ctrl+C in Hermes TUI shows 'copied' but clipboard often empty.
Root causes:
- Native Linux tools (xclip, wl-copy) require DISPLAY/WAYLAND_DISPLAY; in
headless Docker/SSH they fail or hang.
- OSC 52 fallback requires terminal emulator support; when absent, sequence
is dropped silently.
- Dashboard OSC 52 → Clipboard API path fails due to missing user gesture;
errors were silently caught.
- User feedback 'copied selection' was shown unconditionally, regardless of
success.
Solution implemented:
- Short-circuit Linux native clipboard probing when no display server is
present (no DISPLAY and no WAYLAND_DISPLAY). Avoids futile attempts and
timeouts.
- Add HERMES_TUI_DEBUG_CLIPBOARD env var (1/true). When set, TUI logs to
stderr which clipboard path is used, probe results on Linux, and whether
OSC 52 was emitted. Greatly improves diagnosability.
- Improve dashboard clipboard error handling: replace empty catch blocks
with console.warn messages for OSC 52 decode/write failures and direct
copy/paste errors. Makes browser permission/user-gesture failures visible
in DevTools.
- Add comprehensive clipboard troubleshooting documentation to README and
AGENTS, covering OSC 52 verification, tmux config, Docker/headless
constraints, env vars, dashboard caveats, and fallback strategies.
Technical details:
- in ui-tui/packages/hermes-ink/src/ink/termio/osc.ts:
- Early return on Linux if both DISPLAY and WAYLAND_DISPLAY unset.
- Refactor probe sequence to async with 500ms timeout,
caching result; subsequent copies use cached tool immediately.
- Emit debug logs when HERMES_TUI_DEBUG_CLIPBOARD=1.
- in ink.tsx: log when OSC 52 not emitted (native
or tmux path in use) in debug mode.
- in web/src/pages/ChatPage.tsx: OSC 52 handler and Ctrl+Shift+C handler
now log warnings to console on Clipboard API rejection with the error
message.
- Documentation: new 'Clipboard Troubleshooting' section in README; new
'Clipboard environment variables and pitfalls' subsection in AGENTS.md
(Known Pitfalls).
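The display-server short-circuit reduces to one environment check (a Python sketch of the TypeScript logic in osc.ts; the return values are illustrative):

```python
def clipboard_strategy(platform, env):
    # On Linux with no display server, xclip/wl-copy can only fail or
    # hang, so skip native probing entirely and rely on OSC 52.
    if platform.startswith("linux") and not (
        env.get("DISPLAY") or env.get("WAYLAND_DISPLAY")
    ):
        return "osc52"
    return "probe-native"
```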
Tests: full ui-tui test suite (292 tests) passes; clipboard and OSC tests
unaffected. No breaking changes.
Files changed:
- ui-tui/packages/hermes-ink/src/ink/termio/osc.ts
- ui-tui/packages/hermes-ink/src/ink/ink.tsx
- web/src/pages/ChatPage.tsx
- README.md
- AGENTS.md
- CHANGELOG.md (new)
OpenRouter and Nous Portal curated picker lists now resolve via a JSON
manifest served by the docs site, falling back to the in-repo snapshot
when unreachable. Lets us update model lists without shipping a release.
Live URL: https://hermes-agent.nousresearch.com/docs/api/model-catalog.json
(source at website/static/api/model-catalog.json; auto-deploys via the
existing deploy-site.yml GitHub Pages pipeline on every merge to main).
Schema (v1) carries id + optional description + free-form metadata at
manifest, provider, and model levels. Pricing and context length stay
live-fetched via existing machinery (/v1/models endpoints, models.dev).
Config (new model_catalog section, default enabled):
model_catalog.url master manifest URL
model_catalog.ttl_hours disk cache TTL (default 24h)
model_catalog.providers.<name>.url optional per-provider override
Fetch pipeline: in-process cache -> disk cache (fresh < TTL) -> HTTP
fetch -> disk-cache-on-failure fallback -> in-repo snapshot as last
resort. Never raises to callers; at worst returns the bundled list.
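The chain behaves roughly like this (a sketch; the in-process cache layer is omitted and names are illustrative, not the real hermes_cli/model_catalog.py API):

```python
import json
import time
import urllib.request
from pathlib import Path

def load_catalog(url, cache_path, bundled, ttl_hours=24):
    # disk cache (fresh) -> HTTP fetch (refreshes cache) ->
    # stale disk cache -> bundled snapshot. Never raises.
    cache = Path(cache_path)
    try:
        if cache.exists() and time.time() - cache.stat().st_mtime < ttl_hours * 3600:
            return json.loads(cache.read_text())
    except Exception:
        pass
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            data = json.loads(resp.read().decode("utf-8"))
        cache.write_text(json.dumps(data))
        return data
    except Exception:
        pass
    try:
        if cache.exists():
            return json.loads(cache.read_text())  # stale beats nothing
    except Exception:
        pass
    return bundled
```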
Changes:
- website/static/api/model-catalog.json initial manifest (35 OR + 31 Nous)
- scripts/build_model_catalog.py regenerator from in-repo lists
- hermes_cli/model_catalog.py fetch + validate + cache module
- hermes_cli/models.py fetch_openrouter_models() +
new get_curated_nous_model_ids()
- hermes_cli/main.py, hermes_cli/auth.py Nous flows use the helper
- hermes_cli/config.py model_catalog defaults
- website/docs/reference/model-catalog.md + sidebars.ts
- tests/hermes_cli/test_model_catalog.py 21 tests (validation, fetch
success/failure, accessors,
disabled, overrides, integration)
Stop pre-stripping the path from the configured MCP server URL before
constructing OAuthClientProvider. The MCP SDK strips the path itself via
OAuthContext.get_authorization_base_url() for authorization-server
discovery, but uses the full server_url through
resource_url_from_server_url() + check_resource_allowed() to validate
against the server's RFC 9728 Protected Resource Metadata.
For servers whose PRM advertises a path-scoped resource (e.g. Notion's
https://mcp.notion.com/mcp), our _parse_base_url() collapsed the URL to
the origin, so check_resource_allowed() saw requested='/' vs
configured='/mcp/' and refused the token. Fixes OAuth against Notion MCP
(and any other path-scoped resource).
Closes #16015.
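An illustrative version of the path-scoped comparison (not the SDK's actual implementation) shows why stripping the path breaks Notion:

```python
from urllib.parse import urlparse

def resource_allowed(configured, requested):
    # RFC 9728-style check: the requested resource must sit at or under
    # the path the server's metadata advertises.
    c, r = urlparse(configured), urlparse(requested)
    if (c.scheme, c.netloc) != (r.scheme, r.netloc):
        return False
    c_path = c.path if c.path.endswith("/") else c.path + "/"
    return r.path == c.path or (r.path + "/").startswith(c_path)
```

With the configured resource path-scoped at /mcp and the requested URL collapsed to the origin, the check fails exactly as described; passing the full server_url through makes the paths line up.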
`_apply_model_switch_result` (the interactive `/model` picker's
confirmation path) printed `ModelInfo.context_window` straight from
models.dev, which reports the vendor-wide value (1.05M for gpt-5.5 on
openai). ChatGPT Codex OAuth caps the same slug at 272K, so the picker
showed 1M while the runtime (compressor, gateway `/model`, typed
`/model <name>`) correctly used 272K — the classic 'sometimes 1M,
sometimes 272K' mismatch on a single model.
Both display paths now go through `resolve_display_context_length()`,
matching the fix that `_handle_model_switch` received earlier.
Also bump the stale last-resort fallback in DEFAULT_CONTEXT_LENGTHS
(`gpt-5.5: 400000 -> 1050000`) to match the real OpenAI API value; the
272K Codex cap is already enforced via the Codex-OAuth branch, so the
fallback now reflects what every non-Codex probe-miss should see.
Tests: adds `test_apply_model_switch_result_context.py` with three
scenarios (Codex cap wins, OpenRouter shows 1.05M, resolver-empty falls
back to ModelInfo). Updates the existing non-Codex fallback test to
assert 1.05M (the correct value).
## Validation
| path | before | after |
|-------------------------------|-----------|-----------|
| picker -> gpt-5.5 on Codex | 1,050,000 | 272,000 |
| picker -> gpt-5.5 on OpenAI | 1,050,000 | 1,050,000 |
| picker -> gpt-5.5 on OpenRouter | 1,050,000 | 1,050,000 |
| typed /model gpt-5.5 on Codex | 272,000 | 272,000 |
#14934 added deepseek-v4-pro / deepseek-v4-flash to the DeepSeek native
provider but the context-window lookup still falls back to the existing
"deepseek" substring entry (128K). DeepSeek V4 ships with a 1M context
window, so any caller relying on get_model_context_length() for
pre-flight token budgeting (compression, context warnings) under-counts
by ~8x.
Add explicit lowercase entries for the four DeepSeek model ids that
ship 1M context:
- deepseek-v4-pro
- deepseek-v4-flash
- deepseek-chat (legacy alias, server-side maps to v4-flash non-thinking)
- deepseek-reasoner (legacy alias, server-side maps to v4-flash thinking)
Longest-key-first substring matching means these explicit entries also
cover the vendor-prefixed forms (deepseek/deepseek-v4-pro on OpenRouter
and Nous Portal) without regressing the existing 128K fallback for
older / unknown DeepSeek model ids on custom endpoints.
Source: https://api-docs.deepseek.com/zh-cn/quick_start/pricing
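The lookup shape can be sketched as follows (an illustrative slice of the table; the real DEFAULT_CONTEXT_LENGTHS has many more entries):

```python
CONTEXT_LENGTHS = {
    "deepseek-v4-pro": 1_000_000,
    "deepseek-v4-flash": 1_000_000,
    "deepseek-chat": 1_000_000,
    "deepseek-reasoner": 1_000_000,
    "deepseek": 128_000,  # fallback for older / unknown DeepSeek ids
}

def get_model_context_length(model_id, default=128_000):
    mid = model_id.lower()
    # Longest key first: "deepseek-v4-pro" wins over the bare
    # "deepseek" entry, and substring matching covers vendor-prefixed
    # ids like "deepseek/deepseek-v4-pro" for free.
    for key in sorted(CONTEXT_LENGTHS, key=len, reverse=True):
        if key in mid:
            return CONTEXT_LENGTHS[key]
    return default
```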
The background skill-review prompt (spawned after N user turns) now instructs
the reviewer to SURVEY existing skills first, identify the CLASS of task, and
PREFER updating/generalizing an existing skill over creating a new narrow one.
This reduces near-duplicate skill accumulation at the source. Catches the
common failure mode where repeated tasks of the same class each spawn their
own specific skill ("fix-my-tauri-error", "fix-my-electron-error") instead
of a single class-level skill ("desktop-app-build-troubleshooting").
Applied to both _SKILL_REVIEW_PROMPT and the **Skills** half of
_COMBINED_REVIEW_PROMPT. Memory-only review prompt unchanged.
Groundwork for the Curator feature (issue #7816) — the creation-side fix.
Curator handles the retirement/consolidation side in a follow-up PR.
Tests assert the behavioral instructions are present (survey, class, update-
over-create, overlap-flagging, opt-out clause) rather than snapshotting the
full prompt text.
Nous Portal multiplexes multiple upstream providers (DeepSeek, Kimi,
MiMo, Hermes) behind one endpoint. Before this fix, any 429 on any of
those models recorded a cross-session file breaker that blocked EVERY
model on Nous for the cooldown window -- even though the caller's
own RPM/RPH/TPM/TPH buckets were healthy. Users hit a DeepSeek V4 Pro
capacity error, restarted, switched to Kimi 2.6, and still got
'Nous Portal rate limit active -- resets in 46m 53s'.
Nous already emits the full x-ratelimit-* header suite on every
response (captured by rate_limit_tracker into agent._rate_limit_state).
We now gate the breaker on that data: trip it only when either the
429's own headers or the last-known-good state show a bucket with
remaining == 0 AND a reset window >= 60s. Upstream-capacity 429s
(healthy buckets everywhere, but upstream out of capacity) fall
through to normal retry/fallback and the breaker is never written.
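The gate can be sketched as follows (header suffixes follow the x-ratelimit-* convention but are assumptions; the real code also consults last-known-good state):

```python
def should_trip_breaker(headers, min_reset_s=60):
    # Trip the cross-session file breaker only when some bucket is
    # actually exhausted (remaining == 0) with a meaningful reset
    # window; upstream-capacity 429s with healthy buckets fall through
    # to normal retry/fallback instead.
    for bucket in ("requests", "tokens"):
        remaining = headers.get(f"x-ratelimit-remaining-{bucket}")
        reset_s = headers.get(f"x-ratelimit-reset-{bucket}")
        if remaining is None or reset_s is None:
            continue
        if int(remaining) == 0 and float(reset_s) >= min_reset_s:
            return True
    return False
```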
Note: the in-memory 'restart TUI/gateway to clear' workaround
circulated in Discord does NOT work -- the breaker is file-backed at
~/.hermes/rate_limits/nous.json. The workaround for users still
affected by a bad state file is to delete it.
Reported in Discord by CrazyDok1 and KYSIV (Apr 2026).
Plugin hooks fired after a tool dispatch now receive an integer
duration_ms kwarg measuring how long the tool's registry.dispatch()
call took (time.monotonic() before/after). Inspired by Claude Code
2.1.119 which added the same field to PostToolUse hook inputs.
Wire points:
- model_tools.py: measure dispatch latency, pass duration_ms to
invoke_hook("post_tool_call", ...) and invoke_hook("transform_tool_result", ...)
- hermes_cli/hooks.py: include duration_ms in the synthetic payload
used by 'hermes hooks test' and 'hermes hooks doctor' so shell-hook
authors see the same shape at development time as runtime
- shell hooks (agent/shell_hooks.py): no code change needed;
_serialize_payload already surfaces non-top-level kwargs under
payload['extra'], so duration_ms lands at extra.duration_ms for
shell-hook scripts
Plugin authors can now build latency dashboards, per-tool SLO alerts,
and regression canaries without having to wrap every tool manually.
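The wire point is a monotonic-clock bracket around dispatch (a sketch with plain callables standing in for the registry and hook plumbing):

```python
import time

def timed_dispatch(dispatch, post_tool_call, tool_name, args):
    # Measure only the registry.dispatch() call, then hand hooks an
    # integer duration_ms alongside the result.
    start = time.monotonic()
    result = dispatch(tool_name, args)
    duration_ms = int((time.monotonic() - start) * 1000)
    post_tool_call(tool_name=tool_name, result=result,
                   duration_ms=duration_ms)
    return result
```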
Test: tests/test_model_tools.py::test_post_tool_call_receives_non_negative_integer_duration_ms
E2E: real PluginManager + dispatch monkey-patched with a 50ms sleep,
hook callback observes duration_ms=50 (int).
Refs: https://code.claude.com/docs/en/changelog (2.1.119, Apr 23 2026)
Adds a floor below --yolo: a tiny set of commands so catastrophic they
should never run via the agent, regardless of --yolo, gateway /yolo,
approvals.mode=off, or cron approve mode. Opting into yolo is trusting
the agent with your files and services — not trusting it to wipe the
disk or power the box off.
The list is deliberately small (12 patterns), covering only
unrecoverable ops:
- rm -rf targeting /, /home, /etc, /usr, /var, /boot, /bin, /sbin,
/lib, ~, $HOME
- mkfs (any variant)
- dd + redirection to raw block devices (/dev/sd*, /dev/nvme*, etc.)
- fork bomb
- kill -1 / kill -9 -1
- shutdown, reboot, halt, poweroff, init 0/6, telinit 0/6,
systemctl poweroff/reboot/halt/kexec
Recoverable-but-costly commands (git reset --hard, rm -rf /tmp/x,
chmod -R 777, curl | sh) stay in DANGEROUS_PATTERNS where yolo can
still pass them through — that's what yolo is for.
Container backends (docker/singularity/modal/daytona) continue to
bypass both hardline and dangerous checks, since nothing they do can
touch the host.
Inspired by Mercury Agent's permission-hardened blocklist.
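An illustrative subset of the check (patterns simplified for the sketch; the real list covers all 12 cases and runs before any yolo/approval logic):

```python
import re

# Simplified stand-ins for a few of the hardline patterns; the check
# cannot be opted out of, regardless of --yolo or approvals.mode.
HARDLINE_PATTERNS = [
    re.compile(r"\brm\s+-[rf]{2}\s+(?:/|/home|/etc|~|\$HOME)(?:\s|$)"),
    re.compile(r"\bmkfs(?:\.\w+)?\b"),
    re.compile(r"\bdd\b.*\bof=/dev/(?:sd|nvme)"),
    re.compile(r":\(\)\s*\{\s*:\|:&\s*\};\s*:"),  # classic fork bomb
]

def is_hardline_blocked(command):
    return any(p.search(command) for p in HARDLINE_PATTERNS)
```

Recoverable-but-costly commands intentionally stay out of this list, matching the split described above.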
Bare `hermes setup` on a returning user now drops straight into the
full reconfigure wizard — every prompt shows the current value as its
default, press Enter to keep or type a new value to change it. The
returning-user menu is gone.
Behavior:
- First-time user: first-time wizard (unchanged)
- Returning user, bare command: full reconfigure wizard (new default)
- Returning user, `--quick`: only prompt for missing/unset items
- Returning user, one section: `hermes setup model|terminal|gateway|tools|agent`
- `--reconfigure`: preserved as backwards-compat alias (no-op since it's now default)
The section functions already used current values as prompt defaults —
this change just removes the extra click to get to them.
The 'Quick Setup - configure missing items only' menu option is now
exposed as the explicit `--quick` flag; it's the narrow case of
filling in missing config (e.g. after a partial OpenClaw migration or
when a required API key got cleared).
Inspired by Mercury Agent's `mercury doctor` UX.
Also removes:
- RETURNING_USER_MENU_SECTION_KEYS (orphaned constant)
- Two returning-user menu tests in test_setup_noninteractive.py
(guarding behavior that no longer exists — covered by
test_setup_reconfigure.py instead)
- New website/docs/guides/azure-foundry.md covering both OpenAI-style
and Anthropic-style endpoints, auto-detection behaviour, gpt-5.x
routing, /v1 stripping, api-version query forwarding, and the
provider: anthropic + Azure URL alternative setup.
- environment-variables.md picks up AZURE_FOUNDRY_API_KEY,
AZURE_FOUNDRY_BASE_URL, AZURE_ANTHROPIC_KEY.
- cli-commands.md includes azure-foundry in the provider choices list.
- configuration.md lists azure-foundry among auxiliary-task providers.
- sidebars.ts wires the new guide into the Guides section.
- scripts/release.py AUTHOR_MAP entries for TechPrototyper,
HangGlidersRule (noreply), and pein892 so the contributor-attribution
CI check does not reject the salvage.
The azure-foundry wizard now probes the endpoint before asking the user
to pick anything by hand:
1. URL path sniff — endpoints ending in /anthropic are Azure Foundry
Claude routes and skip to anthropic_messages.
2. GET <base>/models probe — if the endpoint returns an OpenAI-shaped
model list, we switch to chat_completions and prefill the picker
with the returned deployment/model IDs.
3. Anthropic Messages probe — fallback for endpoints that don't expose
/models but do speak the Anthropic Messages shape.
4. Manual fallback — private endpoints / custom routes still work;
the user picks API mode + types a deployment name.
Context length for the selected model is resolved through the existing
agent.model_metadata.get_model_context_length chain (models.dev,
provider metadata, hardcoded family fallbacks) and stored in
model.context_length when a non-default value is found.
Also refactors runtime_provider so Azure Foundry resolution is reused
between the explicit-credentials path and the default top-level path —
previously the /v1 strip for Anthropic-style Azure only ran when the
caller passed explicit_* args, which meant config-driven sessions
hit a double-/v1 URL.
New module hermes_cli/azure_detect.py with 19 unit tests covering:
- path sniff, model ID extraction, probe fallbacks
- HTTP error handling (URLError, HTTPError)
- context-length lookup passthrough
- DEFAULT_FALLBACK_CONTEXT rejection
New runtime tests cover:
- OpenAI-style Azure Foundry
- Anthropic-style Azure Foundry with /v1 stripping
- Missing base_url / API key raising AuthError
Rationale: Microsoft confirms there's no pure-API-key endpoint to list
Azure deployments (that requires ARM management auth). The v1 Azure
OpenAI endpoint does expose /models with the resource's available
model catalog, which is good enough for picker prefill in the common
case. Users on private/gated endpoints fall through to manual entry.
Azure OpenAI exposes an OpenAI-compatible endpoint at
`{resource}.openai.azure.com/openai/v1` that accepts the standard
`openai` Python client. Two issues prevented gpt-5.x models from working:
1. `_max_tokens_param()` only sent `max_completion_tokens` for
`api.openai.com` URLs. Azure also requires `max_completion_tokens`
for gpt-5.x models.
2. The `codex_responses` upgrade gate unconditionally upgraded gpt-5.x
to Responses API. Azure does NOT support the Responses API — it serves
gpt-5.x on the regular `/chat/completions` path, causing a 404.
Fix: add `_is_azure_openai_url()` that matches `openai.azure.com` URLs.
- `_max_tokens_param()` now returns `max_completion_tokens` for Azure.
- The `codex_responses` upgrade gate skips Azure so gpt-5.x stays on
`chat_completions` where Azure actually serves it.
- The fallback-provider api_mode picker also recognises Azure and stays
on chat_completions.
- Tests cover max_tokens routing, api_mode behaviour, and URL detection.
gpt-4.x models on Azure are unaffected (already used chat_completions +
max_tokens, which Azure accepts for those models).
Salvage of PR #10086 — rewritten against current main where the
codex_responses upgrade gate gained copilot-acp / explicit-api_mode
exclusions.
Azure OpenAI requires an `api-version` query parameter on every request.
When users include it in the base_url (e.g. `?api-version=2025-04-01-preview`),
the OpenAI SDK silently drops it during URL construction, causing 404 errors.
Extract query params from base_url and pass them via `default_query` so the
SDK appends them to every request. This is a generic solution that works for
any custom endpoint requiring query parameters, not just Azure.
No-op for URLs without query params — fully backward compatible.
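The split can be sketched with stdlib URL parsing; the helper name is illustrative, but `default_query` is a real constructor parameter on the OpenAI Python client:

```python
from urllib.parse import parse_qsl, urlsplit, urlunsplit

# Illustrative sketch: separate the query string from the base URL so
# the SDK re-appends it to every request via default_query.

def split_base_url(base_url: str) -> tuple[str, dict[str, str]]:
    parts = urlsplit(base_url)
    clean = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return clean, dict(parse_qsl(parts.query))

# Usage (assuming the openai package is installed):
#   clean, query = split_base_url(
#       "https://r.openai.azure.com/openai/v1?api-version=2025-04-01-preview")
#   client = OpenAI(base_url=clean, default_query=query)
```

For a URL with no query string the second element is an empty dict, so passing it through changes nothing.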
Add support for Azure Foundry as a new inference provider. Azure Foundry
endpoints can use either OpenAI-style (/v1/chat/completions) or
Anthropic-style (/v1/messages) API formats.
Changes:
- Add azure-foundry to PROVIDER_REGISTRY (auth.py)
- Add azure-foundry overlay in HERMES_OVERLAYS (providers.py)
- Add empty model list for azure-foundry (models.py)
- Add _model_flow_azure_foundry() interactive setup (main.py)
- Add azure-foundry runtime resolution with api_mode support (runtime_provider.py)
- Add AZURE_FOUNDRY_API_KEY and AZURE_FOUNDRY_BASE_URL env vars (config.py)
Usage:
hermes model -> More providers -> Azure Foundry
The setup wizard prompts for:
- Endpoint URL
- API format (OpenAI or Anthropic-style)
- API key
- Model name
Configuration is saved to config.yaml (model.provider, model.base_url,
model.api_mode, model.default) and ~/.hermes/.env (AZURE_FOUNDRY_API_KEY).
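A wizard run might leave behind something like the following (values are illustrative; the Anthropic-style api_mode value is an assumption):

```yaml
# config.yaml (illustrative, not output from a real run)
model:
  provider: azure-foundry
  base_url: https://myresource.services.ai.azure.com
  api_mode: chat_completions   # or the Anthropic-style mode for /v1/messages endpoints
  default: my-deployment
```

with the key stored separately as `AZURE_FOUNDRY_API_KEY=...` in `~/.hermes/.env`.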