hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-06 10:47:12 +08:00

Author	SHA1	Message	Date
Teknium	2a285d5ec2	fix(agent): stateful streaming scrubber for reasoning-block leaks (#17924 ) (#20184 ) * revert(gateway): remove stale-code self-check and auto-restart Removes the _detect_stale_code / _trigger_stale_code_restart mechanism introduced in #17648 and iterated in #19740. On every incoming message the gateway compared the boot-time git HEAD SHA to the current SHA on disk, and if they differed it would reply with Gateway code was updated in the background -- restarting this gateway so your next message runs on the new code. Please retry in a moment. and then kick off a graceful restart. This is unwanted behaviour: users who run a long-lived gateway and do their own ad-hoc git operations on the checkout end up with their chat interrupted and the current message dropped every time HEAD moves, with no way to opt out. If an operator really needs the old protection against stale sys.modules after "hermes update", the SIGKILL-survivor sweep in hermes update (hermes_cli/main.py, also tagged #17648) already handles the supervisor-respawn case on its own. Removed: gateway/run.py: - _STALE_CODE_SENTINELS, _GIT_SHA_CACHE_TTL_SECS - _read_git_head_sha(), _compute_repo_mtime() module helpers - class-level _boot_wall_time / _boot_repo_mtime / _boot_git_sha / _stale_code_restart_triggered defaults - __init__ boot-snapshot block (_boot_, _cached_current_sha, _repo_root_for_staleness, _stale_code_notified) - _current_git_sha_cached(), _detect_stale_code(), _trigger_stale_code_restart() methods - stale-code check + user-facing restart notice at the top of _handle_message() tests/gateway/test_stale_code_self_check.py (deleted, 412 lines) No new logic added. Zero remaining references to any removed symbol. Gateway test suite passes the same 4589 tests it passed before; the 3 pre-existing unrelated failures (discord free-channel, feishu bot admission, teams typing) are unchanged by this commit. * fix(agent): stateful streaming scrubber for reasoning-block leaks (#17924) Per-delta _strip_think_blocks ran at _fire_stream_delta and destroyed downstream state. When MiniMax-M2.7 / DeepSeek / Qwen3 streamed a tag split across deltas (delta1='<think>', delta2='Let me check'), the regex case-2 match erased delta1 entirely, so CLI/gateway state machines never learned a block was open and leaked delta2 as content. Raw consumers (ACP, api_server, TTS) had no downstream defense at all. Replace the per-delta regex with a stateful StreamingThinkScrubber that survives delta boundaries: - Closed <tag>X</tag> pairs always stripped (matches _strip_think_blocks case 1). - Unterminated open at block boundary enters a block; content discarded until close tag arrives. At end-of-stream, held content is dropped. - Orphan close tags stripped without boundary gating. - Partial tags at delta boundaries held back until resolved. - Block-boundary rule (start-of-stream, after \n, or whitespace-only since last \n) preserves prose that mentions tag names. Reset at turn start alongside the existing context scrubber; flush at turn end so a benign '<' held back at end-of-stream reaches the UI. E2E-verified on live OpenRouter->MiniMax-m2 streams: closed pairs strip cleanly, first word of post-block content is preserved, pure content passes through unchanged. Stefan's screenshot case (#17924) — 'Let me check' getting chopped to ' me check' — no longer happens. Final _strip_think_blocks calls on completed strings (final_response, replay, compression) are preserved; only the streaming per-delta call site switched to the scrubber.	2026-05-05 04:33:38 -07:00
Brecht-H	f25d3ec917	fix(kanban): suppress dispatcher stuck-warn when ready queue holds only non-spawnable assignees After PR #20105 (dispatcher skips ready tasks whose assignee fails ``profile_exists()`` to prevent the orion-cc/orion-research crash loop), the gateway and CLI emit a spurious "kanban dispatcher stuck: ready queue non-empty for N consecutive ticks but 0 workers spawned" warning every 5 minutes on multi-lane setups where the queue is steadily full of human-pulled work assigned to terminal lanes. The warn is intended to catch real failure modes (broken PATH, missing venv, credential loss for a real Hermes profile). On a multi-lane host it fires forever even though everything is healthy: the dispatcher correctly chose not to spawn, and there is nothing for the operator to fix. Changes: * ``DispatchResult`` gains a ``skipped_nonspawnable`` field (separate from ``skipped_unassigned``) so callers can distinguish "task missing an owner — operator should route it" from "task owned by a control-plane lane — terminal will pull it". * ``dispatch_once`` routes the ``not profile_exists(assignee)`` skip into the new bucket (was lumped into ``skipped_unassigned``). * New helper ``has_spawnable_ready(conn)`` returns True iff at least one ready+assigned+unclaimed task in the DB has an assignee that maps to a real Hermes profile. Falls back to legacy "any ready+assigned" when ``profile_exists`` is unimportable so degraded installs still surface the original warn. * The gateway dispatcher (``gateway/run.py``) and the CLI standalone daemon (``hermes_cli/kanban.py``) both swap their cheap ``ready_nonempty`` probe to use ``has_spawnable_ready``. Stuck-warn now fires only when there is genuine spawnable work the dispatcher failed to start. * CLI dispatch output prints ``Skipped (non-spawnable assignee — terminal lane, OK)`` for visibility without alarm. Tests: * New ``has_spawnable_ready`` cases (empty queue, terminal-lane only, mixed real+terminal). * New ``test_dispatch_skips_nonspawnable_into_separate_bucket`` verifies the bucketing change. * Updated ``test_dispatch_skips_unassigned`` to assert no cross-leak. * Added ``all_assignees_spawnable`` fixture in ``tests/hermes_cli/conftest.py`` and threaded it through dispatcher tests that use synthetic assignees ("alice", "bob"). PR #20105 (the parent commit) silently broke 8 such tests by routing those assignees into ``skipped_nonspawnable`` instead of spawning; this PR repairs them as part of the same code area. Verified locally: 246/246 kanban-suite tests pass. Stacks on top of fix/kanban-dispatcher-skip-missing-profile-2026-05-05 (PR #20105). Reviewer: this PR is meant to merge AFTER #20105. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 04:13:12 -07:00
simbam99	8ad5e98f8d	fix(gateway): preserve pending update prompts across restarts	2026-05-05 03:59:39 -07:00
helix4u	b632290166	fix(gateway): handle planned service stops	2026-05-04 16:00:49 -07:00
Teknium	d90f73bcec	fix(gateway): use git HEAD SHA, not file mtimes, for stale-code check (#19740 ) The stale-code self-check (Issue #17648) used sentinel-file mtimes to decide whether the gateway survived a `hermes update` with stale `sys.modules`. That signal false-positives on any write to the sentinel files — including agent-driven edits during Hermes-on-Hermes dev sessions. Telling the agent to patch `run_agent.py` would flip the check to True on the next user message and force a gateway restart even though no update happened. Switch the signal to `git rev-parse HEAD`. Agent file edits don't move HEAD; `hermes update` (git pull) always does. Reading .git/HEAD directly (no subprocess) with a 5s cache keeps the overhead negligible on bursty chats. Non-git installs short-circuit to False — the stale-modules class can't occur without a git-backed update path, so there's nothing to detect. The legacy `_compute_repo_mtime` helper is kept but unused by detection, reserved as a fallback hook for future pip-install update paths. - _read_git_head_sha(): resolves HEAD across main checkout, worktree (follows `gitdir:` + `commondir` pointers), and packed-refs layouts. - _current_git_sha_cached(): per-runner 5s SHA cache. - _detect_stale_code(): boot SHA vs current SHA, returns False when either is unavailable. - Tests cover all four layouts, the agent-edits-don't-trigger regression, and cache behavior. Refs #17648.	2026-05-04 12:33:21 -07:00
teknium1	d35efb9898	feat(telegram): /topic off + help + auth gate + screenshot debounce Four production-readiness additions to topic mode: 1. /topic off — clean disable path. Flips telegram_dm_topic_mode.enabled to 0 and clears telegram_dm_topic_bindings for this chat. Previously users had to edit state.db with sqlite3 to turn the feature off. Idempotent: calling /topic off when the chat was never enabled returns a friendly no-op message. 2. /topic help — inline usage printed in the DM so users don't have to visit docs to discover /topic off, /topic <session-id>, etc. 3. Authorization gate. /topic mutates SQLite side tables and flips the root DM into a lobby, so the action must be authorized. Now calls self._is_user_authorized(source); unauthorized DMs get a refusal instead of activation. Defense in depth on top of the gateway's existing pre-route auth. 4. BotFather screenshot debounce. A user repeatedly running /topic while Threads Settings is still disabled would previously re-upload the same screenshot every time. Now rate-limited to one send per 5 minutes per chat. /topic off resets the counter so re-enabling starts fresh. Command-def args hint updated: /topic [off\|help\|session-id]. Docs: - New /topic subcommands table at the top of the multi-session section - Disable instructions updated to recommend /topic off first, with the raw SQL fallback kept for bulk cleanup - Under-the-hood list extended with the capability-hint debounce and the authorization gate Tests (6 new): - /topic help returns usage and doesn't create topic tables - /topic off disables mode AND clears bindings - /topic off is idempotent when never enabled - Unauthorized users get refusal, no tables created - Capability-hint debounce is per-chat - /topic off resets both lobby and capability debounce counters All 402 targeted tests pass. Full gateway sweep: 4809/4810 (pre-existing test_teams::test_send_typing unrelated).	2026-05-04 12:07:17 -07:00
teknium1	1381c89e56	fix(telegram): polish topic mode — CASCADE, General-topic handling, rename guard, debounce Five follow-ups to topic mode based on integration audit: 1. ON DELETE CASCADE on telegram_dm_topic_bindings.session_id. Session pruning (manual /delete, auto-cleanup, any future prune job) would have thrown 'FOREIGN KEY constraint failed' for sessions bound to a topic. Migration bumped to v2, rebuilds the bindings table in place if FK lacks CASCADE. Idempotent; only runs once per DB. 2. Never auto-rename operator-declared topics. If an operator has extra.dm_topics configured AND a user runs /topic, messages in those pre-declared topics would previously trigger auto-rename and silently mutate operator config. _rename_telegram_topic_for_session_title now early-returns when _get_dm_topic_info returns a dict for this (chat_id, thread_id). Uses class-based lookup (not hasattr) so MagicMock test fixtures don't accidentally trip the guard. 3. General topic handling. Telegram's General (pinned top) topic in a forum-enabled private chat may send messages with message_thread_id=1 or omit thread_id entirely depending on client. Both are now treated as the root lobby, not a topic lane. Prevents users from accidentally burning a session on the General topic. 4. Debounce the root-lobby reminder. 30-second cooldown per chat so a user who forgets topic mode is enabled and types ten messages in the root gets one reminder, not ten. Explicit command replies (/new-in-lobby, /topic <session-id>) still land every time. 5. Docs: added under-the-hood invariants for the above, plus a Downgrade section explaining that rolling back to a pre-/topic Hermes build leaves the DB tables orphaned but harmless — DMs just revert to native per-thread isolation. Tests: - test_operator_declared_topic_is_not_auto_renamed - test_general_topic_is_treated_as_root_lobby - test_lobby_reminder_is_debounced_per_chat - test_binding_survives_session_deletion_via_cascade - test_migration_rebuilds_v1_binding_table_with_cascade_fk Validated: 4803/4804 tests pass (tests/gateway/ + tests/test_hermes_state.py). Sole failure is a pre-existing test_teams::test_send_typing flake unrelated to this PR.	2026-05-04 12:07:17 -07:00
teknium1	a7683d04a9	fix(telegram): harden DM topic binding — persist through switch_session, rebind on /new Follow-up on @EmelyanenkoK's feat: add Telegram DM topic-mode sessions. Three issues: 1. Split-brain session state. After get_or_create_session() returned a SessionEntry for a topic lane, the handler was mutating .session_id in place to the binding's target, but never persisting the switch through SessionStore. The sessions.json session_key → session_id map kept pointing at the lane's natural id; any reader that reloaded from disk saw the wrong id. Fixed by routing through SessionStore.switch_session(), which _save()s the mapping and ends the old session in SQLite like /resume does. 2. /new inside a topic was a one-message no-op. Reset created a new session but left the telegram_dm_topic_bindings row pointing at the old session_id, so the next message's binding lookup switched right back. Now _handle_reset_command rebinds the topic to the new session_id after reset. 3. is_telegram_session_linked_to_topic and list_unlinked_telegram_sessions_for_user both called apply_telegram_topic_migration() on read, contradicting the PR's own invariant that migration only runs on explicit /topic opt-in. They now tolerate missing topic tables and return empty/False. Also: _telegram_topic_mode_enabled() now only treats True as enabled (not any truthy return), so test fixtures with MagicMock session_db don't accidentally flip every DM into lobby mode — this was breaking 4 pre-existing test_status_command tests. Tests: - New regression: /new inside a topic must update the binding row (test_new_inside_telegram_topic_rewrites_binding_to_new_session). - _make_runner now stubs switch_session so existing restore tests still exercise the new code path. Validated end-to-end with real SessionDB + SessionStore: readers on fresh DB don't create topic tables; enable creates them; binding override persists across SessionStore restart; /new rebinds and the new id survives a restart. Co-authored-by: EmelyanenkoK <emelyanenko.kirill@gmail.com>	2026-05-04 12:07:17 -07:00
EmelyanenkoK	25065283b3	fix: improve telegram topic mode setup	2026-05-04 12:07:17 -07:00
EmelyanenkoK	d6615d8ec7	feat: add Telegram DM topic-mode sessions	2026-05-04 12:07:17 -07:00
cong	3ccf723bf9	fix(gateway): read context_length from custom_providers in session info header	2026-05-04 04:51:13 -07:00
Teknium	5ec6baa400	feat(kanban): multi-project boards — one install, many kanbans (#19653 ) Adds first-class board support to kanban so users can separate unrelated streams of work (projects, repos, domains) into isolated queues. Single- project users stay on the 'default' board and see no UI change. Isolation model --------------- - Each board is a directory at `~/.hermes/kanban/boards/<slug>/` with its own `kanban.db`, `workspaces/`, and `logs/`. The 'default' board keeps its legacy path (`~/.hermes/kanban.db`) for back-compat — fresh installs and pre-boards users get zero migration. - Workers spawned by the dispatcher have `HERMES_KANBAN_BOARD` pinned in their env alongside the existing `HERMES_KANBAN_DB` / `HERMES_KANBAN_WORKSPACES_ROOT` pins, so workers physically cannot see other boards' tasks. - The gateway's single dispatcher loop now sweeps every board per tick; per-tick cost is a few extra filesystem stats. - CAS concurrency guarantees are preserved per-board (each board is its own SQLite DB, same WAL+IMMEDIATE machinery as before). CLI --- hermes kanban boards list\|create\|switch\|show\|rename\|rm hermes kanban --board <slug> <any-subcommand> Board resolution order: `--board` flag → `HERMES_KANBAN_BOARD` env → `~/.hermes/kanban/current` file → `default`. Slug validation is strict: lowercase alphanumerics + hyphens + underscores, 1-64 chars, starts with alphanumeric. Uppercase is auto-downcased; slashes / dots / `..` / control chars are rejected so boards can't name their way out of the boards/ directory. Passive discoverability: when more than one board exists, `hermes kanban list` prints a one-line header ("Board: foo (2 other boards …)") so users who stumble across multi-project never have to hunt for the feature. Invisible for single-board installs. Dashboard --------- - New `BoardSwitcher` component at the top of the Kanban tab: dropdown with all boards + task counts, `+ New board` button, `Archive` button (non-default only). Hidden entirely when only `default` exists and is empty — single-project users never see it. - New `NewBoardDialog` modal: slug / display name / description / icon + "switch to this board after creating" checkbox. - Selected board persists to `localStorage` so browser users don't shift the CLI's active board out from under a terminal they left open. - New `?board=<slug>` query param on every existing endpoint plus a new `/boards` CRUD surface (`GET /boards`, `POST /boards`, `PATCH /boards/<slug>`, `DELETE /boards/<slug>`, `POST /boards/<slug>/switch`). - Events WebSocket is pinned to a board at connection time; switching opens a fresh WS against the new board. Also fixes a pre-existing bug in the plugin's tenant / assignee filters: the SDK's `Select` uses `onValueChange(value)`, not native `onChange(event)`, so those filters silently didn't work. New `selectChangeHandler` helper wires both signatures. Tests ----- 49 new tests in `tests/hermes_cli/test_kanban_boards.py` covering: slug validation (valid / invalid / auto-downcase), path resolution (default = legacy path, named = `boards/<slug>/`, env var override), current-board resolution chain (env > file > default), board CRUD + archive / hard-delete, per-board connection isolation (tasks don't leak), worker spawn env injection (`HERMES_KANBAN_BOARD`, `HERMES_KANBAN_DB`, `HERMES_KANBAN_WORKSPACES_ROOT` all point at the right board), and end-to-end CLI surface. Regression surface: all 264 pre-existing kanban tests continue to pass. Live-tested via the dashboard: created 3 boards (default, hermes-agent, atm10-server), created tasks on each via both CLI (`--board <slug> create`) and dashboard (inline create on the Ready column), confirmed zero cross-board leakage, confirmed `BoardSwitcher` + `NewBoardDialog` work end-to-end in the browser.	2026-05-04 04:42:38 -07:00
Teknium	5b6d413476	fix(cli,gateway): surface title errors from /new <name> The contributor's PR silently swallowed ValueError from SessionDB.set_session_title() with bare except Exception: pass. Users typing /new <title> with an already-in-use title got an untitled session and no feedback. Changes: - cli.py: catch ValueError from both sanitize_title() and set_session_title(); print the error and mark the session untitled in the banner (never echo the rejected title back). - gateway/run.py: append a warning note to the reset reply on title rejection; reflect the accepted title in the header. - Add regression tests for the duplicate-title path in CLI and gateway. Also map exx@example.com -> @exxmen in scripts/release.py.	2026-05-04 03:14:50 -07:00
Exx	f720751d79	feat(cli,gateway): /new accepts optional session name argument Allow users to start a fresh session and immediately set its title by passing a name to /new (or /reset): /new Refactor auth module Changes: - hermes_cli/commands.py: add args_hint='[name]' to /new command - cli.py: parse title argument in process_command(), pass to new_session() - cli.py: new_session() accepts title=None, sets title via SessionDB - gateway/run.py: _handle_reset_command() parses title, sets on new entry - gateway/session.py: reset_session() accepts optional display_name - tests: add test_new_session_with_title, test_reset_command_with_title, test_new_command_in_help_output All 36 affected tests pass.	2026-05-04 03:14:50 -07:00
Hermes Agent	74c997d985	fix(gateway): move quick-command dispatch before built-in handlers Quick commands of type "alias" that target built-in slash commands (e.g. /h -> /model) were processed too late in _handle_message — after the if-canonical=="model" checks. This meant alias expansion never reached the target handler and fell through to the LLM as raw text. Two fixes: 1. Move the quick_commands block before built-in dispatch so alias targets (like /model) hit the correct handler after expansion. 2. Extract bare command name from target_command via .split()[0] to feed _resolve_cmd() correctly (was using the full arg-string).	2026-05-04 01:39:23 -07:00
Siddharth Balyan	a11aed1acc	fix(cli): local backend CLI always uses launch directory, stops .env sync of TERMINAL_CWD (#19334 ) The old CWD heuristic was fooled by: 1. TERMINAL_CWD persisted to .env by `hermes config set terminal.cwd` 2. Inherited TERMINAL_CWD from parent hermes processes 3. Only resolved when config had a placeholder value (not explicit paths) Fix: - load_cli_config() unconditionally uses os.getcwd() for local backend - TERMINAL_CWD always force-exported in CLI mode (overrides stale values) - Gateway sets _HERMES_GATEWAY=1 marker so lazy cli.py imports don't clobber - Remove terminal.cwd from config-set .env sync map (prevents re-poisoning) - Clarify setup wizard label as 'Gateway working directory' Closes #19214	2026-05-04 11:36:19 +05:30
molvikar	74636f9c4a	fix(gateway): clear queued reload-skills notes on new/resume/branch	2026-05-03 17:00:31 -07:00
Kenny Wang	222767e5e8	fix: sanitize Telegram help command mentions	2026-05-03 17:00:09 -07:00
clawbot	1bd975c0ba	fix(gateway): suppress duplicate voice transcripts Deduplicate exact and near-exact Discord voice STT transcripts per guild/user over a short window to avoid duplicate delayed agent replies. Adds regression tests for exact and near-duplicate voice transcript suppression.	2026-05-03 16:59:21 -07:00
leprincep35700	b59bb4e351	fix(gateway): preserve home-channel thread targets across restart notifications	2026-05-03 08:47:49 -07:00
Teknium	d87fd9f039	fix(goals): make /goal work in TUI and fix gateway verdict delivery (#19209 ) /goal was silently broken outside the classic CLI. TUI: /goal was routed through the HermesCLI slash-worker subprocess, which set the goal row in SessionDB but then called _pending_input.put(state.goal) — the subprocess has no reader for that queue, so the kickoff message was discarded. No post-turn judge was wired into prompt.submit either, so even a manual kickoff would not continue the goal loop. Intercept /goal in command.dispatch instead, drive GoalManager directly, and return {type: send, notice, message} so the TUI client renders the Goal-set notice and fires the kickoff. Run the judge in _run_prompt_submit after message.complete, surface the verdict via status.update {kind: goal}, and chain the continuation turn after the running guard is released. Gateway: _post_turn_goal_continuation was gated on hasattr(adapter, 'send_message'), but adapters only expose send(). That branch was dead on every platform — users never saw '✓ Goal achieved', 'Continuing toward goal', or budget-exhausted messages. Replace the dead call with adapter.send(chat_id, content, metadata) and drop a broken reference to self._loop. Tests: - tests/tui_gateway/test_goal_command.py — full /goal dispatch matrix (set / status / pause / resume / clear / stop / done / whitespace) plus regressions for slash.exec → 4018 and 'goal' staying in _PENDING_INPUT_COMMANDS. - tests/gateway/test_goal_verdict_send.py — locks in the adapter.send path for done / continue / budget-exhausted and verifies the hook no-ops when no goal is set or the adapter lacks send().	2026-05-03 05:49:12 -07:00
millerc79	f1e0292517	fix(gateway): resume sessions after crash/restart instead of blanket suspend suspend_recently_active() was unconditionally setting suspended=True on startup, causing get_or_create_session() to wipe conversation history on every restart. Change to set resume_pending=True instead, so sessions auto-resume while still allowing stuck-loop escalation after 3 failures.	2026-05-03 03:54:03 -07:00
0xyg3n	19ba9e43b6	fix(gateway/discord): require allowlist auth on slash commands Slash commands (_run_simple_slash, _handle_thread_create_slash) bypassed every DISCORD_ALLOWED_* gate enforced by on_message. Any guild member could invoke /background (RCE via terminal), /restart, /model, /skill, etc. CVSS 9.8 Critical. - _evaluate_slash_authorization mirrors on_message gates (user, role, channel, ignored channel) with fail-closed semantics - _check_slash_authorization sends ephemeral reject + logs + admin alert - Auth gate runs before defer() so rejections are ephemeral - /skill autocomplete returns [] for unauthorized users (no catalog leak) - Component views (ExecApproval, SlashConfirm, UpdatePrompt, ModelPicker) now honor role allowlists via shared _component_check_auth helper - Optional DISCORD_HIDE_SLASH_COMMANDS defense-in-depth - Cross-platform admin alert (Telegram/Slack fallback) on unauthorized attempts Based on PR #18125 by @0xyg3n.	2026-05-03 03:44:55 -07:00
Teknium	1dce908930	fix(gateway): shutdown + restart hygiene (drain timeout, false-fatal, success log) (#18761 ) * fix(gateway): config.yaml wins over .env for agent/display/timezone settings Regression from the silent config→env bridge. The bridge at module import time is correct for max_turns (unconditional overwrite), but every other agent., display., timezone, and security bridge key was guarded by 'if X not in os.environ' — so a stale .env entry from an old 'hermes setup' run would shadow the user's current config.yaml indefinitely. Symptom: agent.max_turns: 500 in config.yaml, HERMES_MAX_ITERATIONS=60 in .env from an old setup, and the gateway silently capped at 60 iterations per turn. Gateway logs confirmed api_calls never exceeded 60. Three changes: 1. gateway/run.py: drop the 'not in os.environ' guards for all agent., display., timezone, and security.* bridge keys. config.yaml is now authoritative for these settings — same semantics already in place for max_turns, terminal., and auxiliary.. Also surface the bridge failure (previously 'except Exception: pass') to stderr so operators see bridge errors instead of silently falling back to .env. 2. gateway/run.py: INFO-log the resolved max_iterations at gateway start so operators can verify the config→env bridge did the right thing instead of chasing a phantom budget ceiling. 3. hermes_cli/setup.py: stop writing HERMES_MAX_ITERATIONS to .env in the setup wizard. config.yaml is the single source of truth. Also clean up any stale .env entry left behind by pre-fix setups. Regression tests in tests/gateway/test_config_env_bridge_authority.py guard each config→env key against the 'stale .env shadows config' bug. * fix(gateway): shutdown + restart hygiene (drain timeout, false-fatal, success log) Three issues observed in production gateway.log during a rapid restart chain on 2026-05-02, all fixed here. 1. _send_restart_notification logged unconditional success adapter.send() catches provider errors (e.g. Telegram 'Chat not found') and returns SendResult(success=False); it never raises. The caller ignored the return value and always logged 'Sent restart notification to <chat>' at INFO, producing a misleading success line directly below the 'Failed to send Telegram message' traceback on every boot. Now inspects result.success and logs WARNING with the error otherwise. 2. WhatsApp bridge SIGTERM on shutdown classified as fatal error _check_managed_bridge_exit() saw the bridge's returncode -15 (our own SIGTERM from disconnect()) and fired the full fatal-error path, producing 'ERROR ... WhatsApp bridge process exited unexpectedly' plus 'Fatal whatsapp adapter error (whatsapp_bridge_exited)' on every planned shutdown, immediately before the normal '✓ whatsapp disconnected'. Adds a _shutting_down flag that disconnect() sets before the terminate, and _check_managed_bridge_exit() returns None for returncode in {0, -2, -15} while shutting down. OOM-kill (137) and other non-signal exits still hit the fatal path. 3. restart_drain_timeout default 60s → 180s On 2026-05-02 01:43:27 a user /restart fired while three agents were mid-API-call (82s, 112s, 154s into their turns). The 60s drain budget expired and all three were force-interrupted. 180s covers realistic in-flight agent turns; users on very-long-reasoning models can still raise it further via agent.restart_drain_timeout in config.yaml. Existing explicit user values are preserved by deep-merge. Tests - tests/gateway/test_restart_notification.py: two new tests assert INFO is only logged on SendResult(success=True) and WARNING with the error string is logged on SendResult(success=False). - tests/gateway/test_whatsapp_connect.py: parametrized test for returncode in {0, -2, -15} proves shutdown-time exits are suppressed; separate test proves returncode 137 (SIGKILL/OOM) still surfaces as fatal even when _shutting_down is set. - _check_managed_bridge_exit() reads _shutting_down via getattr-with- default so existing _make_adapter() test helpers that bypass __init__ (pitfall #17 in AGENTS.md) keep working unmodified.	2026-05-02 02:08:06 -07:00
Teknium	10297fa23c	fix(discord): `/reload-skills` now refreshes the `/skill` autocomplete live (#18754 ) `_register_skill_group` captured the skill catalog in closure variables (`entries` and `skill_lookup`) so the single `tree.add_command` call at startup owned the only live copy. The closure is never re-entered after startup, so `/reload-skills` — which rescans the on-disk skills dir and refreshes the in-process `_skill_commands` registry — had no way to propagate results into the `/skill` autocomplete on Discord. New skills stayed invisible in the dropdown, and deleted skills returned "Unknown skill" when the stale autocomplete entry was clicked. The fix is purely a dataflow change: promote `entries` and `skill_lookup` to instance attributes (`_skill_entries`, `_skill_lookup`), split the collector-driven rebuild into a helper (`_refresh_skill_catalog_state`), and add a public `refresh_skill_group()` method that re-runs the helper and is safe to call at any point after the initial registration. The gateway's `_handle_reload_skills_command` then iterates `self.adapters` and calls `refresh_skill_group()` on any adapter that exposes it (currently only Discord). Both sync and async implementations are supported; adapters that don't override the method (Telegram's BotCommand menu, Slack subcommand map, etc.) are silently skipped — the in-process `reload_skills()` call covers them. No `tree.sync()` is required because Discord fetches autocomplete options dynamically on every keystroke — mutating the instance state the callbacks already read from is sufficient. That sidesteps the per-app command-bucket rate limit (~5 writes / 20 s) that made the previous bulk-sync-on-reload approach unusable (#16713 context). Tests: tests/gateway/test_reload_skills_discord_resync.py — five cases covering (1) refresh replaces entries, (2) entries stay sorted after refresh, (3) collector exception leaves cached state intact, (4) `_refresh_skill_catalog_state` populates the instance attrs, (5) orchestrator calls `refresh_skill_group()` on sync + async adapters and skips adapters that don't expose it.	2026-05-02 02:00:11 -07:00
Teknium	6ec74aec07	fix(gateway): match disabled/optional skills by frontmatter slug, not dir name (#18753 ) _check_unavailable_skill is meant to turn a typed "/foo" command that doesn't resolve into a specific hint — "disabled, enable with hermes skills config" or "available but not installed, install with hermes skills install …" — instead of the generic "unknown command" reply. It was doing the match with `skill_md.parent.name.lower().replace("_", "-")`, comparing that to the typed command. For every skill whose directory name drifted from its declared frontmatter `name:`, that comparison failed and the user got the unhelpful generic path. On a standard install today 19 skills have this drift, e.g.: dir: mlops/stable-diffusion frontmatter: name: Stable Diffusion Image Generation registered slug (what the user types): /stable-diffusion-image-generation dir: mlops/qdrant frontmatter: name: Qdrant Vector Search registered slug: /qdrant-vector-search dir: mlops/flash-attention frontmatter: name: Optimizing Attention Flash registered slug: /optimizing-attention-flash In every case, _check_unavailable_skill would fall through because "stable-diffusion" != "stable-diffusion-image-generation", even with the skill sitting right there on disk. Fix: extract a small `_skill_slug_from_frontmatter` helper that reads the SKILL.md frontmatter and normalizes exactly like scan_skill_commands (lower, spaces/underscores → hyphens, strip non-[a-z0-9-], collapse runs of hyphens, strip edges). Use it in both the disabled-skills branch and the optional-skills branch. The disabled-set membership check now uses the declared frontmatter name (which is what `hermes skills config` writes into skills.disabled / platform_disabled), not the slug. Tests: five cases in tests/gateway/test_unavailable_skill_hint.py — the drift case for the disabled branch, unknown-command negative, matched-but-not-disabled negative, non-alnum stripping, and the drift case for the optional-skills branch. All five fail against main and pass with the fix.	2026-05-02 02:00:09 -07:00
kshitijk4poor	8fcc160f6b	fix(gateway/slack): review fixes — scope ephemeral to commands, user isolation Self-review fixes for the slash ephemeral ack: - Only stash response_url when text starts with '/' (gateway command). Free-form questions via '/hermes <question>' must produce public agent replies visible to the whole channel, not ephemeral. - Use a ContextVar (_slash_user_id) to thread the invoking user's ID from _handle_slash_command through to send(). _pop_slash_context now matches the exact (channel_id, user_id) key when the ContextVar is set, preventing concurrent users on the same channel from stealing each other's ephemeral context. ContextVars propagate to child asyncio.Tasks, so the value survives through handle_message → _process_message_background → _send_with_retry → send(). - Add truncate_message() in _send_slash_ephemeral to prevent silent failures on long responses (response_url has the same ~40k limit). - Log send_private_notice failures at debug level instead of bare except/pass — aids diagnostics without spamming. - Document app_mention dedup dependency on shared event ts. - Add tests: free-form question must NOT stash context, concurrent users on the same channel get isolated contexts, non-slash send() path fallback behavior.	2026-05-01 13:33:06 -07:00
probepark	0ab2d752ff	feat(gateway): private notice delivery and Slack format_message fixes Adds platform-level private notice delivery abstraction so operational messages (e.g. sethome prompt) can be sent ephemerally on Slack when configured with `slack.notice_delivery: private`. Changes: - gateway/config.py: _normalize_notice_delivery() + GatewayConfig.get_notice_delivery() with per-platform config bridging - gateway/platforms/base.py: send_private_notice() default implementation (falls through to send()) - gateway/platforms/slack.py: send_private_notice() via chat_postEphemeral - gateway/run.py: _deliver_platform_notice() helper replaces direct adapter.send() for the sethome notice, with private→public fallback - gateway/platforms/slack.py: app_mention handler now forwards to _handle_slack_message (safe due to ts-based dedup) instead of no-op pass, fixing edge-case Slack configs where mentions arrive only as app_mention - gateway/platforms/slack.py format_message: negative lookbehind prevents markdown images (![]()) from becoming broken Slack links; italic regex now requires non-whitespace boundaries so 'a * b * c' stays literal Based on PR #9340 by @probepark.	2026-05-01 13:33:06 -07:00
Teknium	f99676e315	fix(gateway): auto-restart when source files change out from under us (#17648 ) (#18409 ) Long-running gateway processes that survive 'hermes update' keep pre-update modules cached in sys.modules. When new tool files on disk then try to 'from hermes_cli.config import cfg_get' (added in PR #17304), the import resolves against the stale module object and raises ImportError — hitting users on Matrix, Telegram, Feishu, and other platforms. Two defenses: 1. Gateway self-check (gateway/run.py). On __init__, snapshot the newest mtime across sentinel source files (hermes_cli/config.py, run_agent.py, gateway/run.py, etc.). On every inbound message, re-read those mtimes; if any is newer than boot time + 2s slack, request a graceful restart via the normal drain path and return a one-line ack to the user. Idempotent, works regardless of how the update happened (hermes update, manual git pull, installer). 2. Post-restart survivor sweep ('hermes update'). After the existing restart loop, sleep 3s, rescan for gateway PIDs we already tried to kill, and SIGKILL any survivors. The detached profile watchers and systemd then relaunch with fresh code instead of waiting out the 120s watcher timeout. Closes #17648.	2026-05-01 09:50:08 -07:00
Mikey O'Brien	1be3b74cfb	fix(gateway): honor MATRIX_HOME_ROOM in onboarding	2026-04-30 23:13:34 -07:00
Teknium	265bd59c1d	feat: /goal — persistent cross-turn goals (Ralph loop) (#18262 ) Add a standing-goal slash command that keeps Hermes working toward a user-stated objective across turns until it is achieved, paused, or the turn budget runs out. Our take on the Ralph loop — cf. Codex CLI 0.128.0's /goal. After each turn, a lightweight auxiliary-model judge call asks 'is this goal satisfied by the assistant's last response?'. If not, and we're under the turn budget (default 20), Hermes feeds a continuation prompt back into the same session as a normal user message. Any real user message preempts the continuation loop automatically. Judge failures fail OPEN (continue) so a flaky judge never wedges progress — the turn budget is the real backstop. ### Commands - `/goal <text>` — set a standing goal (kicks off the first turn) - `/goal` or `/goal status` — show current state - `/goal pause` — pause the continuation loop - `/goal resume` — resume (resets turn counter) - `/goal clear` — drop the goal Works on both CLI and gateway platforms via the central CommandDef registry. ### Design invariants preserved - Prompt cache: continuation prompts are regular user-role messages appended to history. No system-prompt mutation, no toolset swap. - Role alternation: continuation is a user turn, never injected mid-tool-loop. - Session persistence: goal state lives in SessionDB.state_meta keyed by `goal:<session_id>`, so `/resume` picks it up. - Mid-run safety: on the gateway, `/goal status\|pause\|clear` are allowed mid-run (control-plane only); setting a new goal requires `/stop` first so we don't race a second continuation prompt against the current turn. ### Files - `hermes_cli/goals.py` (new, 380 lines) — GoalManager + judge + state - `hermes_cli/commands.py` — CommandDef entry - `hermes_cli/config.py` — `goals.max_turns` default - `hermes_cli/web_server.py` — dashboard category merge - `cli.py` — /goal handler + post-turn continuation hook in process_loop - `gateway/run.py` — /goal handler + post-turn continuation hook wrapping _handle_message_with_agent - `tests/hermes_cli/test_goals.py` (new, 26 tests) — judge parsing, fail-open semantics, lifecycle, persistence, budget exhaustion - `website/docs/reference/slash-commands.md` — docs entry	2026-04-30 23:10:20 -07:00
Teknium	4caad285a6	feat(gateway): auto-delete slash-command system notices after TTL (#18266 ) Adds opt-in auto-deletion for slash-command reply messages like "New session started!", "Restarting gateway…", "Stopped.", and YOLO toggles. After the TTL elapses the gateway calls the adapter's delete_message; on platforms without a delete API (everything except Telegram today) the TTL is silently ignored and the message stays. Requested on Twitter by @charlesmcdowell — tool-call bubbles are useful real-time, but system notices clutter the thread once the agent finishes. Implementation: - EphemeralReply(str) sentinel in gateway/platforms/base.py. Subclasses str so existing 'X' in response / response.startswith(...) checks in tests and call sites keep working unchanged; isinstance() still distinguishes it for the send path. - _process_message_background and both busy-session bypass paths (in base.py) call _unwrap_ephemeral() on the handler return, send the unwrapped text, and schedule a detached delete task when the TTL > 0 AND the adapter class overrides delete_message. - display.ephemeral_system_ttl (default 0 = disabled) in DEFAULT_CONFIG. Handler can pass ttl_seconds explicitly to override. - Wrapped the highest-noise return sites: /new, /reset, /stop, /yolo on/off, /restart success + "already in progress". Draining notices and /help output left as plain strings — those are informational and users want to read them. Backward-compat: default TTL 0 → no scheduling, no behavior change for existing users. Platforms without delete_message silently no-op.	2026-04-30 23:05:48 -07:00
Teknium	f0dc919f92	fix(compression): include system prompt + tool schemas in token estimates (#18265 ) The user-visible /compress banner and the post-compression last_prompt_tokens writeback both counted only the raw message transcript (chars/4). With a 15KB system prompt and 30 tool schemas (~26KB), a 4-message transcript that looks like ~45 tokens to the transcript-only estimator is really ~10.5K tokens of request pressure — a 234x gap. Two user-facing consequences: - Banner shows 'Compressing … (~45 tokens)…' while compression is actually firing on 10K+ tokens of real pressure, confusing users about why compression triggered (reported by @codecovenant on X; #6217). - Post-compression last_prompt_tokens writeback omits tool schemas, so the next should_compress() check compares real usage against a stale underestimate — compression triggers late, potentially past the model's context limit on small-context models (#14695). Swap estimate_messages_tokens_rough() for estimate_request_tokens_rough() at every user-visible banner and at the post-compression writeback. estimate_request_tokens_rough() already existed for exactly this purpose and includes system prompt + tool schemas. Touched call sites: - run_agent.py: post-compression last_prompt_tokens writeback, post-tool call should_compress() fallback when provider usage is missing - cli.py: /compress banner + summary - gateway/run.py: gateway /compress banner + summary - tui_gateway/server.py: TUI /compress status + summary - acp_adapter/server.py: ACP /compact before/after Left intentionally alone: - Session-hygiene fallback and the 'no agent' /status path in gateway/run.py — no agent instance is in scope to query for system prompt/tools, and the existing 30-50% overestimate wobble on hygiene is safety-accepted. - Verbose-mode 'Request size' logging — informational only, already counts system prompt via api_messages[0]. Also relabels the feedback line from 'Rough transcript estimate' to 'Approx request size' so the metric label matches what it actually measures. Credits: diagnoses from @devilardis (#14695) and @Jackten (#6217); user report @codecovenant on X (2026-04-30). Closes #14695 Closes #6217	2026-04-30 23:03:54 -07:00
Teknium	27ec74c68a	fix: coerce show_reasoning and guard_agent_created config bools Widens #16528 to two sibling sites that had the same quoted-boolean bug: a YAML string "false" (or "0", "no", "off") silently evaluated truthy under bool() / if-check. - gateway/run.py _load_show_reasoning: is_truthy_value wrap - tools/skill_manager_tool.py _guard_agent_created_enabled: is_truthy_value wrap - regression tests for both	2026-04-30 20:40:46 -07:00
johnncenae	bb706c3f38	fix(gateway): coerce tool_progress_command as a real boolean	2026-04-30 20:40:46 -07:00
simbam99	7ba1a2b3df	fix(gateway): preserve assistant metadata when branching sessions	2026-04-30 20:40:28 -07:00
Roy-oss1	b94cb8e2c4	feat(feishu): operator-configurable bot admission and mention policy Add two operator-facing toggles for inbound Feishu admission, enabling bot-to-bot scenarios such as A2A orchestration and inter-bot notifications: FEISHU_ALLOW_BOTS=none\|mentions\|all (default: none) Accept messages from other bots. `mentions` requires the peer bot to @-mention Hermes; `all` admits every peer-bot message. FEISHU_REQUIRE_MENTION=true\|false (default: true) Whether group messages must @-mention the bot. Override per-chat via `group_rules.<chat_id>.require_mention` in config.yaml. Defaults preserve prior behavior. Self-echo protection is always on: when the bot's identity is unresolved (auto-detection failed and FEISHU_BOT_OPEN_ID unset), peer-bot messages are rejected fail-closed to avoid feedback loops. Admitted peer bots bypass the human-user allowlist (FEISHU_ALLOWED_USERS) to match existing Discord behavior; humans still need an explicit allowlist entry. yaml feishu.allow_bots is bridged to the env var so the adapter and gateway auth layer share one source of truth. Resolving peer-bot display names requires the application:bot.basic_info:read scope; without it, peers still route but appear as their open_id. Test: tests/gateway/test_feishu_bot_admission.py covers the admission pipeline, group-policy bot-bypass, hydration, and event-dispatch plumbing as a parametrized matrix. Change-Id: I363cccb578c2a5c8b8bf0f0a890c01c89909e256	2026-04-30 20:30:31 -07:00
buray	fa9fd26acb	fix(gateway): re-inject topic-bound skill after /new or /reset reset_session() creates a fresh SessionEntry with created_at == updated_at, but get_or_create_session() bumps updated_at on the next inbound message, causing _is_new_session in _handle_message_with_agent to evaluate False. The topic/channel skill auto-load gate (group_topics, channel_skill_bindings) silently skips the first message after a manual reset. Add an is_fresh_reset flag on SessionEntry, set by reset_session() and consumed once by the message handler. Kept distinct from was_auto_reset because that flag also drives a 'session expired due to inactivity' user-facing notice and a context-note prepend — both wrong for an explicit /new or /reset. Persisted through to_dict/from_dict so the flag survives gateway restart between /reset and the next message. Fixes #6508 Co-authored-by: warabe1122 <45554392+warabe1122@users.noreply.github.com> Co-authored-by: willy-scr <187001140+willy-scr@users.noreply.github.com>	2026-04-30 20:29:19 -07:00
Jezza Hehn	7abc9ce4df	fix(gateway): read /status token totals from SessionDB (#17158 ) /status was reading session_entry.total_tokens from the in-memory SessionStore (gateway/session.py), which the agent never writes to — so the token count was always 0. The agent already persists token deltas to the SQLite SessionDB (run_agent.py:11497) for every platform with a session_id. Route /status through that single source of truth instead of duplicating token writes into a second store. Fix: - gateway/run.py: _handle_status_command now calls self._session_db.get_session(session_id) and sums the five token component columns (input/output/cache_read/cache_write/reasoning). Falls back to 0 when no SessionDB is configured or no row exists. - Two new regression tests covering the populated-row and missing-row paths. Co-authored-by: Hermes <127238744+teknium1@users.noreply.github.com>	2026-04-30 20:28:50 -07:00
Teknium	a178081468	fix(gateway): use _session_key_for_source for native image buffer write Minor follow-up to the native-image-buffer isolation fix. The write site in _prepare_inbound_message_text was calling build_session_key directly, while every other call site in gateway/run.py uses the _session_key_for_source helper — which consults session_store._generate_session_key first and falls back to build_session_key. Keeping the write key and consume key on the same helper prevents key drift if the session store ever overrides the default keying behavior.	2026-04-30 20:26:35 -07:00
Yukipukii1	bdb7edd89e	fix(gateway): isolate pending native image paths by session	2026-04-30 20:26:35 -07:00
Teknium	9a75743496	fix(gateway): apply agent.disabled_toolsets in gateway message loop Widens the cherry-picked fix from @jatingodnani (#17343) to the gateway path. On main, user_config.agent.disabled_toolsets was only honored by _get_platform_tools' name-level subtraction — it did not catch tools pulled in implicitly by a composite toolset (browser includes web_search, hermes-* platforms include most tools). Changes: - gateway/run.py: resolve disabled_toolsets alongside enabled_toolsets and pass to AIAgent at both user-facing construction sites (normal message loop + single-turn cron-like path). Hygiene/compression agents (fixed enabled_toolsets=[memory]) are intentionally untouched. - gateway/run.py: add (agent, disabled_toolsets) to _CACHE_BUSTING_CONFIG_KEYS so editing the list in config.yaml invalidates the cached AIAgent on the next message. - cli.py: drop unused 'import platform' left over from PR #17343's import churn; restore 'import sys' used throughout the file. - model_tools.py: drop unused 'import os, sys' added by PR #17343; fix comment reference from #15291 (unrelated OAuth issue) to #17309. Co-authored-by: jatin godnani <godnanijatin@gmail.com>	2026-04-30 20:24:39 -07:00
Teknium	01cc701e54	docs + nit: busy_ack_enabled follow-ups - Move the disabled-ack guard above the debounce so we don't stamp _busy_ack_ts[session_key] when no ack was actually sent. Harmless (never read when disabled) but cosmetically off. - Document display.busy_ack_enabled in user-guide/messaging/index.md and HERMES_GATEWAY_BUSY_ACK_ENABLED in reference/environment-variables.md. - Add JezzaHehn to scripts/release.py AUTHOR_MAP for contributor credit. Follow-up to #17491 (Jezza Hehn).	2026-04-30 20:22:30 -07:00
Jezza Hehn	2b512cbca4	feat(gateway): add busy_ack_enabled config option to suppress ack messages When a user sends a message while the gateway is busy processing, an acknowledgment message is sent. This can be spammy for users who send rapid messages. Add display.busy_ack_enabled config option (default: true) to allow users to suppress these busy-input acknowledgment messages. Fixes #17457	2026-04-30 20:22:30 -07:00
Yukipukii1	25cbe3e1d6	fix(gateway): preserve thread routing for /update progress and prompts	2026-04-30 20:19:23 -07:00
johnncenae	1ef9e88549	fix(gateway): write restart markers atomically and fix Windows lock collisions	2026-04-30 19:58:16 -07:00
Teknium	c868425467	feat(kanban): durable multi-profile collaboration board (#17805 ) Salvage of PR #16100 onto current main (after emozilla's #17514 fix that unblocks plugin Pydantic body validation). History preserved on the standing `feat/kanban-standing` branch; this squashes the 22 iterative commits into one clean landing. What this lands: - SQLite kernel (hermes_cli/kanban_db.py) — durable task board with tasks, task_links, task_runs, task_comments, task_events, kanban_notify_subs tables. WAL mode, atomic claim via CAS, tenant-namespaced, skills JSON array per task, max-runtime timeouts, worker heartbeats, idempotency keys, circuit breaker on repeated spawn failures, crash detection via /proc/<pid>/status, run history preserved across attempts. - Dispatcher — runs inside the gateway by default (`kanban.dispatch_in_gateway: true`). Ticks every 60s, reclaims stale claims, promotes ready tasks, spawns `hermes -p <assignee> chat -q "work kanban task <id>"` with HERMES_KANBAN_TASK + HERMES_KANBAN_WORKSPACE env. Auto-loads `--skills kanban-worker` plus any per-task skills. Health telemetry warns on stuck ready queue. - Structured tool surface (tools/kanban_tools.py) — 7 tools (kanban_show, kanban_complete, kanban_block, kanban_heartbeat, kanban_comment, kanban_create, kanban_link). Gated on HERMES_KANBAN_TASK via check_fn so zero schema footprint in normal sessions. - System-prompt guidance (agent/prompt_builder.py KANBAN_GUIDANCE) injected only when kanban tools are active. - Dashboard plugin (plugins/kanban/dashboard/) — Linear-style board UI: triage/todo/ready/running/blocked/done columns, drag-drop, inline create, task drawer with markdown, comments, run history, dependency editor, bulk ops, lanes-by-profile grouping, WS-driven live refresh. Matches active dashboard theme via CSS variables. - CLI — `hermes kanban init\|create\|list\|show\|assign\|link\|unlink\| claim\|comment\|complete\|block\|unblock\|archive\|tail\|dispatch\|context\| init\|gc\|watch\|stats\|notify\|log\|heartbeat\|runs\|assignees` + `/kanban` slash in-session. - Worker + orchestrator skills (skills/devops/kanban-worker + kanban-orchestrator) — pattern library for good summary/metadata shapes, retry diagnostics, block-reason examples, fan-out patterns. - Per-task force-loaded skills — `--skill <name>` (repeatable), stored as JSON, threaded through to dispatcher argv as one `--skills X` pair per skill alongside the built-in kanban-worker. Dashboard + CLI + tool parity. - Deprecation of standalone `hermes kanban daemon` — stub exits 2 with migration guidance; `--force` escape hatch for headless hosts. - Docs (website/docs/user-guide/features/kanban.md + kanban-tutorial.md) with 11 dashboard screenshots walking through four user stories (Solo Dev, Fleet Farming, Role Pipeline, Circuit Breaker). - Tests (251 passing): kernel schema + migration + CAS atomicity, dispatcher logic, circuit breaker, crash detection, max-runtime timeouts, claim lifecycle, tenant isolation, idempotency keys, per- task skills round-trip + validation + dispatcher argv, tool surface (7 tools × round-trip + error paths), dashboard REST (CRUD + bulk + links + warnings), gateway-embedded dispatcher (config gate, env override, graceful shutdown), CLI deprecation stub, migration from legacy schemas. Gateway integration: - GatewayRunner._kanban_dispatcher_watcher — new asyncio background task, symmetric with _kanban_notifier_watcher. Runs dispatch_once via asyncio.to_thread so SQLite WAL never blocks the loop. Sleeps in 1s slices for snappy shutdown. Respects HERMES_KANBAN_DISPATCH_IN_GATEWAY=0 env override for debugging. - Config: new `kanban` section in DEFAULT_CONFIG with `dispatch_in_gateway: true` (default) + `dispatch_interval_seconds: 60`. Additive — no \_config_version bump needed. Forward-compat: - workflow_template_id / current_step_key columns on tasks (v1 writes NULL; v2 will use them for routing). - task_runs holds claim machinery (claim_lock, claim_expires, worker_pid, last_heartbeat_at) so multi-attempt history is first- class from day one. Closes #16102. Co-authored-by: emozilla <emozilla@nousresearch.com>	2026-04-30 13:36:47 -07:00
Leone Parise	eda1d516dc	fix(skills): exclude .archive from skill index walk Archived skills (moved to ~/.hermes/skills/.archive/ by the curator) were still surfaced in the <available_skills> system prompt under a fake '.archive' category, causing the agent to load and try to use deprecated skills. The os.walk in iter_skill_index_files() only excluded .git/.github/.hub. Add '.archive' to EXCLUDED_SKILL_DIRS, and to the two other places that hardcode the same exclusion tuple (gateway/run.py and agent/skill_commands.py).	2026-04-30 04:59:22 -07:00
konsisumer	d1d0ef6dbd	fix(gateway): persist user message on transient agent failures (#7100 ) The #1630 fix introduced a blanket ``agent_failed_early`` transcript skip to prevent context-overflow sessions from looping. That guard also triggers for unrelated transient failures (429 rate limits, read timeouts, connection resets, provider 5xx) which have nothing to do with session size — and it silently drops the user's message, so the agent has no memory of the last turn on retry. Split the failure classification in ``GatewayRunner._run_agent``: * Context-overflow (``compression_exhausted`` flag, explicit context-length phrases, or generic 400 with a long history) → keep the existing skip, preserving the #1630/#9893 fix. * Anything else that failed → persist just the user message so the conversation survives a retry. Use specific multi-word phrases (``context length``, ``token limit``, ``prompt is too long``, etc.) to match ``run_agent.py``'s own classifier; bare ``exceed`` false-positively flagged "rate limit exceeded" as context overflow. Covered by new tests in ``tests/gateway/test_7100_transient_failure_transcript.py`` and the existing #1630 suite still passes.	2026-04-30 04:32:33 -07:00
Rob Moen	0dd373ec43	fix(context): honor model.context_length for Ollama num_ctx and all display paths When a user sets model.context_length in config.yaml, the value was only used for Hermes' internal compression decisions (context_compressor) but NOT for Ollama's num_ctx parameter. Ollama auto-detects context from GGUF metadata (often 256K+) and allocates that much VRAM regardless of the user's config — causing OOM on smaller GPUs like the P100 (16GB). Root cause: two separate context values existed independently: - context_compressor.context_length = config value (e.g. 65536) ✓ - _ollama_num_ctx = GGUF metadata value (e.g. 256000) ✗ ignored config Changes: 1. Cap Ollama num_ctx to config context_length (run_agent.py) When model.context_length is explicitly set and no explicit ollama_num_ctx override exists, cap the auto-detected GGUF value to the user's context_length. This is the core fix — it prevents Ollama from allocating more VRAM than the user budgeted. 2. Pass config_context_length through all secondary call sites Several paths called get_model_context_length() without the config override, falling through to the 256K default fallback: - cli.py: @-reference expansion and /model switch display - gateway/run.py: @-reference expansion and /model switch display - tui_gateway/server.py: @-reference expansion - hermes_cli/model_switch.py: resolve_display_context_length() 3. Normalize root-level context_length in config (hermes_cli/config.py) _normalize_root_model_keys() now migrates root-level context_length into the model section, matching existing behavior for provider and base_url. Users who wrote `context_length: 65536` at the YAML root instead of under `model:` had it silently ignored. 4. Fix misleading comments (agent/model_metadata.py) DEFAULT_FALLBACK_CONTEXT is 256K (CONTEXT_PROBE_TIERS[0]), not 128K as two comments stated. Tests: 3 new tests for root-level context_length normalization. All existing context_length tests pass (96 tests).	2026-04-30 04:31:23 -07:00

1 2 3 4 5 ...

751 Commits