hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-28 06:51:16 +08:00

Author	SHA1	Message	Date
Ivan Tonov	930494d687	fix(cron): reap orphaned MCP stdio subprocesses after each tick MCP stdio servers are spawned via the SDK's stdio_client, which on Linux uses start_new_session=True (setsid). When a cron job is cancelled mid-way (timeout, agent finish, exception), the subprocess often escapes the SDK's teardown and survives as a session leader. Because setsid() detaches the child from the gateway's process group / cgroup tree, systemd does not reap it on service restart either — so every cron tick that touches an MCP tool leaks a dangling server process. Fix: * tools/mcp_tool.py — _run_stdio now wraps the whole stdio+session context in try/finally. On any exit path (clean, exception, cancellation), PIDs still alive are moved from the active _stdio_pids set into a new _orphan_stdio_pids set. Orphan detection is done via os.kill(pid, 0) — a cheap liveness probe that never signals the target. * tools/mcp_tool.py — _kill_orphaned_mcp_children gains an include_active=False flag. Default behaviour now only reaps the orphan set so concurrent sessions (other parallel cron jobs or live user chats) are never disrupted. The existing shutdown path passes include_active=True to keep the previous "kill everything" semantics after the MCP loop is stopped. * cron/scheduler.py — the cleanup hook is moved from run_job()'s finally (which would race with parallel siblings after #13021) into tick() after the ThreadPoolExecutor has joined every future. At that point there are no in-flight sessions from this tick, so sweeping the orphan set is always safe. Net effect: zero regression for healthy sessions, and orphan MCP servers no longer accumulate between gateway restarts. Made-with: Cursor	2026-04-26 18:21:20 -07:00
ghostmfr	e818ec520a	fix(slack): harden attachment handling Multiple overlapping Slack attachment improvements: 1. Upload retry with backoff on transient errors (429, 5xx, connection reset, rate_limited, service unavailable). New _is_retryable_upload_error helper covers three upload paths: _upload_file, send_video, send_document. Up to 3 attempts with 1.5s * attempt backoff. 2. Thread participation tracking: successful file uploads now add the thread_ts to _bot_message_ts, mirroring how text replies are tracked. This lets follow-up thread messages auto-trigger the bot (same engagement rules as replied threads). 3. Thread metadata preservation in the image redirect-guard fallback (send_image → send text fallback) and in two gateway.run.py send paths (image + document fallback calls). 4. HTML response rejection in _download_slack_file_bytes. Parallels the existing check in _download_slack_file. Guards against Slack returning a sign-in / redirect page as document bytes when scopes are missing, so the agent doesn't get HTML-as-a-PDF. 5. File lifecycle event acks (file_shared / file_created / file_change). These events arrive around snippet uploads. Acking them silences the slack_bolt 'Unhandled request' 404 warnings without changing behavior. 6. Post-loop message type classification so a mixed image+document upload classifies as PHOTO (or VOICE if no image), falling back to DOCUMENT. Previously, the per-file classification in the inbound loop could be overwritten unpredictably. 7. Expanded text-inject whitelist in inbound document handling to cover .csv, .json, .xml, .yaml, .yml, .toml, .ini, .cfg (up to 100KB) so snippets and config files are directly visible to the agent, not just cached as opaque uploads. Paired with new MIME entries in SUPPORTED_DOCUMENT_TYPES in base.py. Squashed from two commits in #11819 so the single commit carries the contributor's GitHub attribution (the original commits were authored under a local dev hostname).	2026-04-26 18:20:17 -07:00
Teknium	b16f9d438b	feat(telegram): send fresh finals for stale preview streams (port openclaw#72038) (#16261 ) Ports openclaw/openclaw#72038 to hermes-agent. Telegram's `editMessageText` preserves the original message timestamp, so a long-running streamed reply (reasoning models that take 60+ seconds to finish) would keep the first-token timestamp even after completion. Users can't tell how long a task actually took. When a preview message has been visible for >= 60s (configurable via `streaming.fresh_final_after_seconds`), finalize by sending a fresh message instead of editing in place, then best-effort delete the stale preview. Short previews still edit in place (the existing fast path). Implementation notes adapted from OpenClaw's TypeScript original: - `StreamConsumerConfig` gains `fresh_final_after_seconds` (default 0 = legacy edit-in-place). Gateway-level `StreamingConfig` defaults to 60. - `GatewayStreamConsumer` tracks `_message_created_ts` at first-send and checks it in `_send_or_edit` on `finalize=True`. New helpers `_should_send_fresh_final` + `_try_fresh_final`. - `BasePlatformAdapter` gains optional `delete_message(chat_id, message_id)` returning False by default. `TelegramAdapter` implements it via `_bot.delete_message`. - `gateway/run.py` only enables fresh-final for `Platform.TELEGRAM`; other platforms ignore the setting (they don't have the stale-edit timestamp problem or edit-then-read works cheaply). - Fallback to normal edit on any fresh-send failure — no user-visible regression if Telegram rate-limits a send or the message is gone. Tests: 15 new cases in tests/gateway/test_stream_consumer_fresh_final.py covering short/long previews, config plumbing, delete-support absent, send-failure fallback, __no_edit__ sentinel safety, and StreamingConfig round-trip. Co-authored-by: Hermes Agent <agent@nousresearch.com>	2026-04-26 17:26:37 -07:00
Brooklyn Nicholson	cb7cfba6de	fix(cli): surface last_active in search_sessions so -c works	2026-04-26 16:21:57 -05:00
Brooklyn Nicholson	bde89c169b	fix(cli): -c picks the most recently used session	2026-04-26 16:17:39 -05:00
Brooklyn Nicholson	1566f1eecc	fix(tui): report actual session on exit	2026-04-26 15:55:01 -05:00
Brooklyn Nicholson	d4dde6b5f2	fix(tui): restore resumed transcript lineage	2026-04-26 15:16:12 -05:00
Wang-tianhao	6087e04043	fix(slack): extract rich_text quotes/lists and link unfurl previews Slack's modern composer sends messages with a 'blocks' array that contains rich_text elements. When a user forwards or quotes another message, the quoted content shows up in the rich_text_quote children of that array — and is NOT included in the plain 'text' field. The agent saw only the lossy plain text and was blind to forwarded / quoted content. Same story for link unfurl previews (Notion, docs, GitHub, etc.) which Slack puts in the 'attachments' array. Two fixes in the inbound handler: 1. _extract_text_from_slack_blocks walks rich_text / rich_text_quote / rich_text_list / rich_text_preformatted trees and renders readable text ('> quoted', '• bullet', code fences), dedupes against the plain text field, and appends the extracted content so the agent sees everything. 2. Link unfurl / attachment preview extraction reads title, url, body, and footer from the 'attachments' array and appends a '📎 [title](url)\n body\n _footer_' section per preview. Skips is_msg_unfurl to avoid echoing our own Slack replies back. Routing is careful not to trust augmented text: mention gating (is_mentioned) and slash-command detection both run against the original 'text' field, so forwarded content containing '<@bot>' or '/deploy' in a quote can't trick the bot into responding in a channel it shouldn't or classifying a normal message as a command. Adjustment from original PR: dropped _serialize_slack_blocks_for_agent, which inlined a redacted JSON dump of non-rich_text blocks (section, accessory, actions, etc.) — the agent would see the raw Block Kit structure for UI-heavy alerts. It added up to 6000 characters to the prompt context on every qualifying message with no opt-out. The rich_text extraction and attachment unfurls cover the common bug-fix case (quoted/forwarded content + link previews) without the prefill tax. If a user needs block inspection later, it can return as a config opt-in. Also updates the Slack platform notes in session.py to accurately describe what the gateway inlines.	2026-04-26 13:02:51 -07:00
Teknium	4921b26945	fix(cron): keep homeassistant toolset enabled when HASS_TOKEN is set (#16208 ) After #14798 made cron honor per-platform `hermes tools` config, the `_DEFAULT_OFF_TOOLSETS` filter silently stripped `homeassistant` from cron jobs for users who'd been relying on the previous blanket toolset. Norbert's HA cron reports regressed as a result. The HA toolset is already runtime-gated by its `check_fn` (requires HASS_TOKEN to register any tools). When HASS_TOKEN is set the user has explicitly opted in — `_DEFAULT_OFF_TOOLSETS` adds nothing in that case, so stop double-gating and restore HA for cron / cli / other platforms without an explicit saved toolset list. moa and rl stay off by default (original #14798 goal preserved). Fixes HA cron regression reported by Norbert.	2026-04-26 12:55:58 -07:00
maxims-oss	18beb69b49	fix(memory): close embedded Hindsight async client cleanly HindsightEmbedded.close() delegates to its sync client.close(). When Hermes created/used that client on the shared async loop, closing it from the main thread raises 'attached to a different loop' before aiohttp releases the session — so the ClientSession / TCPConnector leak past provider teardown. Close the embedded inner async client on the shared loop first via _run_sync(inner_client.aclose()), then let the wrapper's sync close() do its daemon/UI bookkeeping. Salvage of #14605: test placement rebased — appended TestShutdown class after TestSharedEventLoopLifecycle (which landed on main after the PR was written). Original author attribution preserved.	2026-04-26 12:54:46 -07:00
Tranquil-Flow	bf05b8f4a2	fix(gateway): clean up cached agents on shutdown (#11205 )	2026-04-26 12:51:53 -07:00
Zainan Victor Zhou	778fd1898e	fix(slack): surface attachment access diagnostics Translate Slack attachment failures into actionable user-facing notices instead of generic download errors. When a scope/auth/permission issue breaks attachment processing, the user sees: [Slack attachment notice] - Slack attachment access failed for photo.jpg. Missing scope: files:read. Update the Slack app scopes/settings and reinstall the app to the workspace. Two helpers do the translation: _describe_slack_api_error — handles SlackApiError responses (missing_scope, invalid_auth, file_not_found, access_denied, etc.) _describe_slack_download_failure — handles httpx.HTTPStatusError (401/403/404) and Slack-returns-HTML-sign-in fallbacks Wired into three existing call sites: - the Slack Connect files.info path (PR #11111) so scope errors surface instead of being logged as generic "files.info failed" - the image, audio, and document download paths so 401/403 and HTML-body responses translate into actionable notices Adjustment from original PR: dropped _probe_slack_file_access_issue, the proactive pre-download files.info probe. It added one extra Slack API call per attachment even on healthy ones, and overlapped with the existing files.info call from PR #11111. The post-failure translation path covers the same user-facing diagnostic value without the per-message tax. Also documents files:read scope more prominently in the Slack setup guide and troubleshooting table. Contributed back from https://github.com/xinbenlv/zn-hermes-agent. Closes #7015. Co-authored-by: xinbenlv <zzn+pa@zzn.im>	2026-04-26 12:47:43 -07:00
Teknium	45bfcb9e71	test: update bare-agent helper for live-runtime attrs added by #16099 Background review fork now inherits session_id, credential_pool, and status_callback from the parent (added in #16099 after this PR was written). Extend the bare-agent helper so the regression test keeps reaching the cleanup assertions instead of failing in the runtime resolver. Signed-off-by: Teknium <8425893+teknium1@users.noreply.github.com>	2026-04-26 12:45:39 -07:00
MRHwick	2d86e97a7e	fix(run_agent): shut down background review memory providers Temporary background review agents can initialize Hindsight-backed memory clients, but close() alone skips provider teardown. Shut the memory provider down before closing so aiohttp sessions do not leak at process exit. Made-with: Cursor	2026-04-26 12:45:39 -07:00
Satoshi-agi	c0d25df311	fix(slack): preserve thread-parent context when cron/bot posted the parent The Slack thread-context fetcher used to drop every message with a bot_id, which silently erased the thread parent whenever a cron job (or any other bot) had posted it. As a result, replies to a cron-posted summary lost all context and the agent answered as if from a blank thread. Changes: 1. gateway/platforms/slack.py::_fetch_thread_context - Keep the thread parent even when it was posted by a bot (e.g. cron summaries, third-party integrations). - Only skip our own prior bot replies to avoid circular context, matching the per-workspace bot user id via _team_bot_user_ids so multi-workspace deployments stay correct. - Keep non-self bot children (useful third-party context). 2. gateway/platforms/slack.py::_handle_slack_message - Populate MessageEvent.reply_to_text for thread replies (parity with Telegram/Discord/Feishu/WeCom). gateway.run uses this field to inject a [Replying to: "..."] prefix when the parent is not already in the session history, which is exactly the scenario triggered by cron-generated thread parents. - New helper _fetch_thread_parent_text reuses the existing thread- context cache (and its 60s TTL) to avoid duplicate conversations.replies calls; falls back to a cheap limit=1 fetch when the cache is cold. Tests: - Updated TestSlackThreadContext::test_skips_bot_messages to reflect the new behaviour (self-bot child dropped, third-party bot kept). - Added: * test_fetch_thread_context_includes_bot_parent * test_fetch_thread_context_excludes_self_bot_replies * test_fetch_thread_context_multi_workspace * test_fetch_thread_context_current_ts_excluded (regression guard) * test_fetch_thread_parent_text_from_cache * test_slack_reply_to_text_set_on_thread_reply * test_slack_reply_to_text_none_for_top_level_message Full Slack suite: 176 passed (was 169).	2026-04-26 12:35:16 -07:00
helix4u	10e36188da	fix(cli): wire approvals in background tasks	2026-04-26 12:29:48 -07:00
bde3249023	75d3eaa0e4	fix(slack): exclude U/W user IDs from explicit target regex Slack's chat.postMessage API rejects user IDs (U...) and workspace IDs (W...) — they are not valid conversation IDs. Posting to them fails because the API requires a channel ID (C/G/D). To DM a user, the sender must first call conversations.open to obtain a D... ID. Tighten _SLACK_TARGET_RE from [CGDUW] to [CGD] so the send path rejects U/W values as explicit targets and instead falls through to channel- name resolution (where they'll fail with a clear 'could not resolve' error rather than silently getting stuck in a retry loop on the API). Flip the corresponding regression test to assert U/W values are not explicit. Matches the narrower regex briandevans proposed in #15939. Co-authored-by: briandevans <brian@bde.io>	2026-04-26 12:29:02 -07:00
hhuang91	802c7acb81	fix(Slack): resolve Slack channels by raw ID and enumerate joined channels send_message(target='slack:<channel_id>') failed with "Could not resolve" because _parse_target_ref had no Slack branch — Slack's uppercase alphanumeric IDs fell through to channel-name resolution, which only matched by name. As a fallback, the agent would retry with bare target='slack' and post to the home channel instead. Three fixes: - _parse_target_ref recognizes Slack IDs (C/G/D/U/W prefix) as explicit targets so the name-resolver is bypassed entirely. - resolve_channel_name tries a case-sensitive raw-ID match before the existing name match, so any platform's IDs resolve cleanly. - _build_slack now actually calls users.conversations against each workspace's AsyncWebClient (paginated), instead of only returning session-history entries. This populates the directory with public and private channels the bot has joined, so action='list' shows them and they can also be addressed by name. Errors from one workspace don't block others. build_channel_directory becomes async (Slack web calls require it). The two async-context callers in gateway/run.py are awaited; the cron ticker thread call bridges via asyncio.run_coroutine_threadsafe. Slack bot needs channels:read and groups:read scopes for full enumeration; missing scopes degrade gracefully per-workspace. addressing #15927	2026-04-26 12:29:02 -07:00
Teknium	4d119bb62a	test: blank platform-gating env vars in hermetic fixture load_gateway_config() has a side effect: when config.yaml contains platform-gating keys (slack.require_mention, slack.strict_mention, slack.free_response_channels, slack.allow_bots, slack.reactions, plus analogous keys for discord/telegram/whatsapp/dingtalk/matrix), it calls os.environ[KEY] = ... to bridge them to env-var form. monkeypatch.delenv doesn't track direct os.environ mutations made inside the test body, so tests that call load_gateway_config() leak those env vars into later tests on the same xdist worker. The failure mode is flaky seed-dependent: test_top_level_message_requires_mention_ even_with_session (and siblings in TestThreadReplyHandling) pass when SLACK_REQUIRE_MENTION is unset but fail when a leaked value of 'false' is present. Add the gating env vars to _HERMES_BEHAVIORAL_VARS so the hermetic autouse fixture blanks them on every test setup, closing the leak regardless of which test sets them.	2026-04-26 12:23:20 -07:00
Honza Stepanovsky	50dd67c680	fix(slack): skip _mentioned_threads registration when strict_mention is on Extends the strict_mention feature so an @mention in strict mode no longer persistently tags the thread as 'mentioned'. Without this, the thread's first mention would permanently auto-trigger the bot on every subsequent message — which is exactly what strict_mention is designed to prevent. Closes the agent-to-agent ack loop hole hhhonzik identified in #14117. Co-authored-by: hhhonzik <me@janstepanovsky.cz>	2026-04-26 12:23:20 -07:00
Ching	aea4a90f0e	feat(slack): add opt-in slack.strict_mention gate for channel threads Adds a strict_mention config option that, when enabled, requires an explicit @-mention on every message in channel threads. Disables the 'once mentioned, forever in the thread' and session-presence auto-triggers. - New _slack_strict_mention() helper (config.extra + SLACK_STRICT_MENTION env) - Bridged top-level slack.strict_mention yaml to SLACK_STRICT_MENTION env, matching require_mention/allow_bots bridging - Unit tests for the helper + config bridge	2026-04-26 12:23:20 -07:00
Teknium	4b5a88d714	fix(slack): honor reply_in_thread=false for top-level channel messages Top-level channel messages arrive at _resolve_thread_ts with metadata.thread_id set to the message's own ts, because the inbound handler in _handle_message_event uses 'event.ts' as a session-keying fallback when event.thread_ts is absent. That made metadata alone insufficient to distinguish a real thread reply from a top-level message, so reply_in_thread=false only took effect in DMs. Use reply_to (== incoming message_id == ts for top-level messages) as the tiebreaker: when metadata.thread_id == reply_to the 'thread' is the synthetic session-keying fallback, not a real parent, so we reply directly in the channel. Real thread replies (reply_to != thread_id) still resolve to the parent thread and preserve conversation context. Closes #9268.	2026-04-26 12:04:46 -07:00
bde3249023	b1be86ef96	fix(gateway): bridge slack.reply_in_thread config	2026-04-26 12:04:46 -07:00
sgaofen	c730f6cc0b	test(gateway): cover Slack vs non-Slack home-channel onboarding hint Parameterize the test helpers in test_status_command.py to accept a Platform and add two regression tests ensuring the first-run home-channel onboarding uses '/hermes sethome' on Slack and '/sethome' everywhere else. Co-authored-by: sgaofen <135070653+sgaofen@users.noreply.github.com>	2026-04-26 11:56:23 -07:00
Teknium	1dfcc2ffc3	fix(gateway): /queue is now a true FIFO — each invocation gets its own turn (#16175 ) Repeated /queue commands now each produce a full agent turn, in order, with no merging. Previously the second /queue overwrote the first because the handler wrote directly into the adapter's single-slot _pending_messages dict. - GatewayRunner grows a _queued_events overflow buffer (dict of list). - /queue puts new items in the adapter's next-up slot when free, otherwise appends to the overflow. After each run's drain consumes the slot, the next overflow item is promoted so the recursive run picks it up. - /new and /reset clear the overflow. - /status now reports queue depth when non-zero. - Ack message shows the depth once it exceeds 1. Helpers (_enqueue_fifo, _promote_queued_event, _queue_depth) use the getattr default-fallback pattern so existing tests that build bare GatewayRunner instances via object.__new__ keep working.	2026-04-26 11:55:09 -07:00
Teknium	5b2c59559a	feat(terminal): collapse subagent task_ids to shared container (#16177 ) Before: delegate_task children each allocated their own terminal sandbox keyed by child task_id. Starting extra containers (or Modal sandboxes / Daytona workspaces) is expensive, and the subagent's work is invisible to the parent — files written by the child in its container don't exist in the parent's when the subagent returns. After: a single `_resolve_container_task_id` helper maps any tool-call task_id to "default" UNLESS an env override is registered for it. The parent agent and all delegate_task children therefore share one long-lived sandbox — installed packages, cwd, /workspace files, and /tmp scratch carry over freely between them. RL and benchmark environments (TerminalBench2, HermesSweEnv, ...) opt in to isolation via `register_task_env_overrides(task_id, {...})`; those task_ids survive the collapse and get their own sandbox, preserving the per-task Docker image behavior these benchmarks rely on. file_state / active-subagents registry / TUI events still key off the original child task_id, so the 'subagent wrote a file the parent read' warning and UI per-subagent panels keep working. Tradeoff: parallel delegate_task children (tasks=[...]) now share one bash/container. Concurrent cd, env-var mutations, and writes to the same path will collide. If that bites a specific workflow, the subagent can opt back into isolation via register_task_env_overrides. Applied at four lookup sites: - tools/terminal_tool.py terminal_tool() and get_active_env() - tools/file_tools.py _get_file_ops() and _get_live_tracking_cwd() - tools/code_execution_tool.py _get_or_create_environment() Docs: website/docs/user-guide/configuration.md updated to reflect the shared-container reality and document the RL/benchmark carve-out. Tests: tests/tools/test_shared_container_task_id.py (9 cases).	2026-04-26 11:55:02 -07:00
Brooklyn Nicholson	4a21920b5e	fix(tui): address copilot review nits	2026-04-26 13:43:08 -05:00
Brooklyn Nicholson	cc16d0ef77	Merge remote-tracking branch 'origin/main' into bb/tui-long-session-perf # Conflicts: # ui-tui/src/app/interfaces.ts	2026-04-26 13:39:57 -05:00
Teknium	087e74d4d7	feat(slack): register every gateway command as a native slash (Discord/Telegram parity) (#16164 ) Every command in COMMAND_REGISTRY (/btw, /stop, /model, /help, /new, /bg, /reset, ...) is now a first-class Slack slash command instead of a /hermes <subcommand>. Users get the same autocomplete-driven slash picker experience Slack users expect and that Discord and Telegram already provide. Previously Slack registered ONE native slash (/hermes) and split on the first word, so typing /btw in Slack's composer got 'couldn't find an app for /btw' because the workspace manifest never declared it. Changes - hermes_cli/commands.py: slack_native_slashes() + slack_app_manifest() generate a Slack manifest from the registry (canonical names + aliases + plugin commands), clamped to Slack's 50-slash cap with /hermes reserved as the catch-all. - gateway/platforms/slack.py: single regex matcher dispatches every registered slash to _handle_slash_command, which dispatches on command['command']. Legacy /hermes <subcommand> keeps working for backward compat with older workspace manifests. - hermes_cli/slack_cli.py + hermes_cli/main.py: new 'hermes slack manifest' command prints/writes a full manifest (display info, OAuth scopes, event subs, socket mode, slash commands) ready to paste into 'Create from manifest' or Features → App Manifest. - hermes_cli/setup.py: _setup_slack() now writes the manifest up-front and points users at the 'From an app manifest' flow; also offers to refresh the manifest on reconfigure for picking up new commands. - Tests: 14 new tests covering native-slash dispatch (/btw, /stop, /model), legacy /hermes <sub> compat, manifest structure, and telegram<->slack parity (every Telegram command must also register as a Slack slash). Existing /hermes-registration test updated to assert the new regex matches /hermes, /btw, /stop, /model, /help. - Docs: slack.md gains a 'Slash Commands' section + Option A manifest flow in Step 1; cli-commands.md documents 'hermes slack manifest'. Users pick up the new slashes by running 'hermes slack manifest --write' and pasting into Features → App Manifest → Edit in their Slack app config, then Save (Slack prompts for reinstall if scopes changed).	2026-04-26 11:38:32 -07:00
Teknium	9662e3218a	fix(tui): call maybe_auto_title for TUI sessions (#15949 ) (#16151 ) * fix(tui): call maybe_auto_title for TUI sessions (#15961) The maybe_auto_title() helper is called from cli.py and gateway/run.py but was never wired into tui_gateway/server.py, so every session started via 'hermes --tui' landed in state.db with an empty title. Evidence from the issue reporter: 0/154 TUI sessions titled vs 91/383 CLI. Mirror the CLI/Gateway pattern: after emitting message.complete, when the turn finished cleanly, fire-and-forget title generation using the session key, user prompt, agent response, and current history. Fixes #15949. Co-authored-by: math0r-be <math0r-be@github.com> * chore(release): map math0r-be placeholder email in AUTHOR_MAP --------- Co-authored-by: math0r-be <math0r-be@github.com>	2026-04-26 10:44:22 -07:00
Teknium	0824ba6a9d	fix(/branch): redirect session_log_file and expose branch sessions in list (#14854 ) (#16150 ) * fix(/branch): redirect session_log_file and expose branch sessions in list Two bugs when using /branch: 1. cli.py _handle_branch_command updated agent.session_id but not agent.session_log_file, so all messages written after branching landed in the original session's JSON file and the branch never got its own session_{id}.json on disk. Fix: mirror the compression-split path (run_agent.py:7579) and update session_log_file immediately after changing session_id. 2. hermes_state.py list_sessions_rich filtered out every session with parent_session_id IS NOT NULL to hide sub-agent runs and compression continuations. Branch sessions share this column, so they became invisible to `hermes sessions list` and `sessions browse`. Fix: also include branch children — those whose parent ended with end_reason='branched' AND whose started_at >= parent.ended_at (the same timing condition that get_compression_tip uses to distinguish continuations from live-spawned subagents). Fixes #14854 Co-Authored-By: Octopus <liyuan851277048@icloud.com> * chore(release): map octo-patch placeholder email in AUTHOR_MAP --------- Co-authored-by: octo-patch <octo-patch@github.com> Co-authored-by: Octopus <liyuan851277048@icloud.com>	2026-04-26 10:28:19 -07:00
Teknium	42c076d349	feat(browser): auto-spawn local Chromium for LAN/localhost URLs in cloud mode (#16136 ) When a cloud browser provider (Browserbase / Browser-Use / Firecrawl) is configured, browser_navigate now transparently spawns a local Chromium sidecar for URLs whose host resolves to a private/loopback/LAN address (localhost, 127.0.0.1, 192.168.x.x, 10.x.x.x, .local, .lan, *.internal, ::1, 169.254.x.x). Public URLs continue to use the cloud provider in the same conversation. Previously, setting BROWSERBASE_API_KEY / cloud_provider: browserbase pinned the whole tool to cloud for the process — localhost URLs were either SSRF-blocked (default) or sent to Browserbase (where they 404'd because the cloud can't reach your LAN). Users who wanted 'cloud for public, local for localhost' had no way to express it short of toggling providers mid-session. Implementation uses a composite session key scheme: the bare task_id serves the cloud session, and a '{task_id}::local' sidecar serves the local Chromium. _last_active_session_key[task_id] tracks which of the two served the most recent nav so snapshot/click/fill/etc. hit the correct one. cleanup_browser(bare_task_id) reaps both. Feature is on by default. Opt out via: browser: auto_local_for_private_urls: false The cloud provider never sees private URLs. Post-redirect SSRF guard is preserved: redirects from public onto private addresses still block.	2026-04-26 09:57:58 -07:00
Teknium	0e2a53eab2	feat(skills): show enabled/disabled status in 'skills list' (#16129 ) 'hermes skills list' now shows every skill's enabled/disabled status and accepts --enabled-only to filter down to what will actually load for the active profile: hermes -p dario skills list --enabled-only Previously the command was a flat catalog — it did not apply skills.disabled from config.yaml, so there was no way to see the live skill set for a profile without reading config by hand. Profile switching already works via -p (swaps HERMES_HOME); this just surfaces the result visibly. Changes: - hermes_cli/skills_hub.py: do_list adds a Status column and an enabled_only filter; summary reports enabled/disabled split - hermes_cli/main.py: --enabled-only flag on 'skills list' - /skills list slash command accepts --enabled-only too - tests: 4 new (status column, disabled marking, enabled-only hiding, no platform leakage into get_disabled_skill_names); existing fixtures updated to accept skip_disabled kwarg Reported by @mochizukimr on X.	2026-04-26 09:20:53 -07:00
briandevans	4e356098d2	fixup! fix(gateway): preserve inactivity clock on interrupt-recursive cached-agent turns (#15654 ) Address Copilot review findings: 1. Gate _last_activity_desc on interrupt_depth == 0 alongside _last_activity_ts. Both fields are semantically paired — desc describes the activity at ts. Updating desc without ts made get_activity_summary() report "starting new turn (cached)" for 20+ minutes while the timestamp showed the true stale duration, producing misleading diagnostic output. 2. Monkeypatch gateway.run.time.time to a fixed epoch in tests that assert on _last_activity_ts values. Real time.time() comparisons were latently flaky under slow CI or NTP adjustments. _FAKE_NOW = 10_000.0 is used as the reference; assertions are now exact equality rather than >=. 3. Add test_fresh_turn_resets_desc and test_interrupt_turn_preserves_desc to directly cover the gated desc behaviour introduced by (1). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 08:45:44 -07:00
briandevans	de24315978	fix(gateway): preserve inactivity clock on interrupt-recursive cached-agent turns (#15654 ) _last_activity_ts was unconditionally reset to time.time() on every _agent_cache hit. For interrupt-recursive _run_agent calls (_interrupt_depth > 0) this silently reset the inactivity watchdog's idle clock on each re-entry, preventing the 30-min timeout from ever firing when a turn got stuck in an interrupt loop. A stuck session would emit "Still working... iteration 0/60, starting new turn (cached)" heartbeats indefinitely instead of timing out. Gate the reset on _interrupt_depth == 0 only. Fresh external turns still receive the reset so a session idle for 29 min doesn't trip the watchdog before the new turn makes its first API call (#9051). The per-turn reset logic is extracted into a static helper _init_cached_agent_for_turn() to make it directly testable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 08:45:44 -07:00
Teknium	f2d655529a	fix(auth): hoist get_env_value import + strengthen .env fallback tests Follow-up to cherry-picked PR #15920: - agent/credential_pool.py: hoist 'from hermes_cli.config import get_env_value' to module top instead of inline try/except in each seed site (3 sites). No import cycle — hermes_cli/config.py doesn't depend on agent.credential_pool. - hermes_cli/auth.py: same hoist for the _resolve_api_key_provider_secret loop. - tests/tools/test_credential_pool_env_fallback.py: replace smoke-only tests with real .env file I/O. Each test writes a temp ~/.hermes/.env, verifies _seed_from_env / _resolve_api_key_provider_secret read from it, and asserts the full priority chain: os.environ > .env > credential_pool. Uses 'deepseek' as the test provider since 'openai' isn't in PROVIDER_REGISTRY and _seed_from_env's generic path requires a real pconfig lookup.	2026-04-26 08:32:09 -07:00
阿泥豆	27f4dba5ce	test: add unit tests for credential pool env fallback	2026-04-26 08:32:09 -07:00
Teknium	06f81752ed	Revert "feat(kanban): durable multi-profile collaboration board (#16081 )" (#16098 ) This reverts commit `15937a6b46`.	2026-04-26 08:29:37 -07:00
Teknium	15937a6b46	feat(kanban): durable multi-profile collaboration board (#16081 ) New `hermes kanban` CLI subcommand + `/kanban` slash command + skills for worker and orchestrator profiles. SQLite-backed task board (~/.hermes/kanban.db) shared across all profiles on the host. Zero changes to run_agent.py, no new core tools, no tool-schema bloat. Motivation: delegate_task is a function call — sync fork/join, anonymous subagent, no resumability, no human-in-the-loop. Kanban is the durable shape needed for research triage, scheduled ops, digital twins, engineering pipelines, and fleet work. They coexist (workers may call delegate_task internally). What this adds - hermes_cli/kanban_db.py — schema, CAS claim, dependency resolution, dispatcher, workspace resolution, worker-context builder. - hermes_cli/kanban.py — 15-verb CLI surface and shared run_slash() entry point used by both CLI and gateway. - skills/devops/kanban-worker — how a profile should work a claimed task. - skills/devops/kanban-orchestrator — "you are a dispatcher, not a worker" template with anti-temptation rules. - /kanban slash command wired into cli.py and gateway/run.py. Bypasses the running-agent guard (board writes don't touch agent state), so /kanban unblock can free a stuck worker mid-conversation. - Design spec at docs/hermes-kanban-v1-spec.pdf — comparative analysis vs Cline Kanban, Paperclip, NanoClaw, Gemini Enterprise; 8 patterns; 4 user stories; implementation plan; concurrency correctness. - Docs: website/docs/user-guide/features/kanban.md, CLI reference updated, sidebar entry added. Architecture highlights - Three planes: control (user + gateway), state (board + dispatcher), execution (pool of profile processes). - Every worker is a full OS process, spawned as `hermes -p <profile>`. No in-process subagent swarms — solves NanoClaw's SDK-lifecycle failure class. - Atomic claim via SQLite CAS in a BEGIN IMMEDIATE transaction; stale claims reclaimed 15 min after their TTL expires. - Tenant namespacing via one nullable column — one specialist fleet can serve many businesses with data isolation by workspace path. Tests: 60 targeted tests (schema, CAS atomicity, dependency resolution, dispatcher, workspace kinds, tenancy, CLI + slash surface). All pass hermetic via scripts/run_tests.sh.	2026-04-26 08:24:26 -07:00
Teknium	454d883e69	refactor: drop persist_session plumbing + fix broken btw mid-turn bypass (#16075 ) Follow-up to PR #16053 (/btw as /background alias). Cleans up the plumbing added exclusively for the old ephemeral /btw handler and repairs a broken btw bypass that landed between my refactor and this follow-up. run_agent.py: - Remove persist_session kwarg, instance attr, and _persist_session short-circuit. Only /btw ever passed persist_session=False; with /btw gone the default (always persist) is the only behavior anyone ever wanted. gateway/run.py: - Remove the unreachable 'if _cmd_def_inner.name == "btw"' block (PR #16059). Canonical name for a /btw message is 'background' after alias resolution — the comparison could never be true, and it called _handle_btw_command which no longer exists. The /background branch above it already dispatches /btw correctly. tests/gateway/test_running_agent_session_toggles.py: - Fix test_btw_dispatches_mid_run to mock _handle_background_command (the real dispatch target for /btw) instead of the deleted _handle_btw_command.	2026-04-26 07:15:23 -07:00
Teknium	70f56e7605	fix(gateway): let /btw dispatch mid-turn instead of being rejected /btw spawns a parallel ephemeral side-question task (self-guarded against concurrent /btw on the same chat) — exactly like /background. But it was missing from the running-agent bypass list in _handle_message(), so it fell through to the catch-all and returned: ⏳ Agent is running — /btw can't run mid-turn. Wait for the current response or /stop first. That's the opposite of what /btw is for — asking a side question while the main turn is still working. Add the bypass next to /background and a regression test covering the mid-turn dispatch path. Reported by @IuriiTiunov on Telegram.	2026-04-26 07:11:10 -07:00
Teknium	9a70260490	Revert "feat(onboarding): port first-touch hints to the TUI (#16054 )" (#16062 ) This reverts commit `ffd2621039`.	2026-04-26 06:31:37 -07:00
Teknium	ffd2621039	feat(onboarding): port first-touch hints to the TUI (#16054 ) PR #16046 added /busy and /verbose hints to the classic CLI and the gateway runner but skipped the Ink TUI (and therefore the dashboard /chat page, which embeds the TUI via PTY). This extends the same latch to the TUI with TUI-native wording. The TUI's busy-input model is not the /busy knob from the CLI — single Enter while busy auto-queues, double Enter on an empty line interrupts. The new busy-input hint teaches THAT gesture instead of telling the user to flip a config that does not apply. Changes: - agent/onboarding.py — add busy_input_hint_tui() + tool_progress_hint_tui() - tui_gateway/server.py — onboarding.claim JSON-RPC (Ink triggers busy hint on enqueue) + _maybe_emit_onboarding_hint helper hooked into _on_tool_complete for the 30s/tool_progress=all path. Same config.yaml latch so each hint fires at most once per install across CLI, gateway, and TUI combined. - ui-tui/src/gatewayTypes.ts — OnboardingClaimResponse + onboarding.hint event - ui-tui/src/app/createGatewayEventHandler.ts — render the hint event as sys() - ui-tui/src/app/useSubmission.ts — claim busy_input_prompt on first busy enqueue - tests/agent/test_onboarding.py — +3 cases for TUI hint shape - tests/tui_gateway/test_protocol.py — +4 cases for onboarding.claim - website/docs/user-guide/tui.md — new 'Interrupting and queueing' section explaining the TUI's double-Enter model and the hints Validation: scripts/run_tests.sh tests/agent/test_onboarding.py \ tests/tui_gateway/test_protocol.py \ tests/gateway/test_busy_session_ack.py -> 66 passed npm --prefix ui-tui run type-check -> clean npm --prefix ui-tui run lint -> clean npm --prefix ui-tui run build -> clean	2026-04-26 06:24:19 -07:00
Teknium	1e37ddc929	feat(cli): add 'hermes fallback' command to manage fallback providers (#16052 ) Manage the fallback_providers chain from the CLI instead of hand-editing config.yaml. The picker reuses select_provider_and_model() from 'hermes model' — same provider list, same credential prompts, same model picker. hermes fallback [list] Show the current chain (primary + fallbacks) hermes fallback add Run the model picker, append selection to chain hermes fallback remove Pick an entry to delete (arrow-key menu) hermes fallback clear Remove all entries (with confirmation) 'add' snapshots config['model'] before calling the picker, extracts the user's selection from the post-picker state, then restores the primary and appends {provider, model, base_url?, api_mode?} to fallback_providers. Auth store's active_provider is snapshot/restored too so OAuth-provider fallbacks don't silently deactivate the user's primary. Duplicates and self-as-fallback are rejected. Legacy single-dict 'fallback_model' entries are auto-migrated to the list format on first write.	2026-04-26 06:19:04 -07:00
Teknium	83c1c201f6	feat(onboarding): contextual first-touch hints for /busy and /verbose (#16046 ) Instead of a blocking first-run questionnaire, show a one-time hint the first time the user hits each behavior fork: 1. First message while the agent is working — appends a hint to the busy-ack explaining the /busy queue vs /busy interrupt knob, phrased to match the mode that was just applied (don't tell a queue-mode user to switch to queue). 2. First tool that runs for >= 30s in the noisiest progress mode (tool_progress: all) — prints a hint about /verbose to cycle display modes (all -> new -> off -> verbose). Gated on /verbose actually being usable on the surface: always shown on CLI; on gateway only shown when display.tool_progress_command is enabled. Each hint is latched in config.yaml under onboarding.seen.<flag>, so it fires exactly once per install across CLI, gateway, and cron, then never again. Users can wipe the section to re-see hints. New: - agent/onboarding.py — is_seen / mark_seen / hint strings, shared by both CLI and gateway. - onboarding.seen in DEFAULT_CONFIG (hermes_cli/config.py) and in load_cli_config defaults (cli.py). No _config_version bump — deep merge handles new keys. Wired: - gateway/run.py: _handle_active_session_busy_message appends the hint after building the ack. progress_callback tracks tool.completed duration and queues the tool-progress hint into the progress bubble. - cli.py: CLI input loop appends the busy-input hint on the first busy Enter; _on_tool_progress appends the tool-progress hint on the first >=30s tool completion. In-memory CLI_CONFIG is also updated so subsequent fires in the same process are suppressed immediately. All writes go through atomic_yaml_write and are wrapped in try/except so onboarding can never break the input/busy-ack paths.	2026-04-26 06:06:27 -07:00
Teknium	4bda9dcade	fix(gateway): honor voice.auto_tts config in auto-TTS gate (#16007 ) (#16039 ) The base adapter's auto-TTS path fired on any voice message unless the chat had explicitly run /voice off — it never read voice.auto_tts from config.yaml, so users who set auto_tts: false still got audio replies. Gate the base adapter on a three-layer decision instead: 1. chat in _auto_tts_enabled_chats (explicit /voice on\|tts) → fire 2. chat in _auto_tts_disabled_chats (explicit /voice off) → suppress 3. else → voice.auto_tts global default Runner now pushes voice.auto_tts onto the adapter as _auto_tts_default and mirrors /voice on\|tts chats into _auto_tts_enabled_chats via the existing _sync_voice_mode_state_to_adapter path. /voice off still wins. Closes #16007.	2026-04-26 05:52:05 -07:00
Teknium	35c57cc46b	fix(gateway): suppress tool-progress bubbles after interrupt (#16034 ) When the LLM response carries N parallel tool calls, the agent fires N tool.started events back-to-back before its interrupt check runs. A user sending /stop mid-batch would see the '⚡ Interrupting current task' ack followed by a trail of 🔍 web_search bubbles for the remaining events in the batch — making the interrupt feel ignored. progress_callback and the drain loop in send_progress_messages now check agent.is_interrupted (via agent_holder[0], the existing cross-scope handle). Events that arrive after interrupt are dropped at both the queueing and rendering stages. The '⚡ Interrupting' message is sent through a separate adapter path and is unaffected.	2026-04-26 05:47:37 -07:00
Teknium	855366909f	feat(models): remote model catalog manifest for OpenRouter + Nous Portal (#16033 ) OpenRouter and Nous Portal curated picker lists now resolve via a JSON manifest served by the docs site, falling back to the in-repo snapshot when unreachable. Lets us update model lists without shipping a release. Live URL: https://hermes-agent.nousresearch.com/docs/api/model-catalog.json (source at website/static/api/model-catalog.json; auto-deploys via the existing deploy-site.yml GitHub Pages pipeline on every merge to main). Schema (v1) carries id + optional description + free-form metadata at manifest, provider, and model levels. Pricing and context length stay live-fetched via existing machinery (/v1/models endpoints, models.dev). Config (new model_catalog section, default enabled): model_catalog.url master manifest URL model_catalog.ttl_hours disk cache TTL (default 24h) model_catalog.providers.<name>.url optional per-provider override Fetch pipeline: in-process cache -> disk cache (fresh < TTL) -> HTTP fetch -> disk-cache-on-failure fallback -> in-repo snapshot as last resort. Never raises to callers; at worst returns the bundled list. Changes: - website/static/api/model-catalog.json initial manifest (35 OR + 31 Nous) - scripts/build_model_catalog.py regenerator from in-repo lists - hermes_cli/model_catalog.py fetch + validate + cache module - hermes_cli/models.py fetch_openrouter_models() + new get_curated_nous_model_ids() - hermes_cli/main.py, hermes_cli/auth.py Nous flows use the helper - hermes_cli/config.py model_catalog defaults - website/docs/reference/model-catalog.md + sidebars.ts - tests/hermes_cli/test_model_catalog.py 21 tests (validation, fetch success/failure, accessors, disabled, overrides, integration)	2026-04-26 05:46:43 -07:00
Teknium	d09ab8ff13	fix(mcp-oauth): preserve server_url path for protected-resource validation (#16031 ) Stop pre-stripping the path from the configured MCP server URL before constructing OAuthClientProvider. The MCP SDK strips the path itself via OAuthContext.get_authorization_base_url() for authorization-server discovery, but uses the full server_url through resource_url_from_server_url() + check_resource_allowed() to validate against the server's RFC 9728 Protected Resource Metadata. For servers whose PRM advertises a path-scoped resource (e.g. Notion's https://mcp.notion.com/mcp), our _parse_base_url() collapsed the URL to the origin, so check_resource_allowed() saw requested='/' vs configured='/mcp/' and refused the token. Fixes OAuth against Notion MCP (and any other path-scoped resource). Closes #16015.	2026-04-26 05:43:54 -07:00
Teknium	438db0c7b0	fix(cli): /model picker honors provider-specific context caps (#16030 ) `_apply_model_switch_result` (the interactive `/model` picker's confirmation path) printed `ModelInfo.context_window` straight from models.dev, which reports the vendor-wide value (1.05M for gpt-5.5 on openai). ChatGPT Codex OAuth caps the same slug at 272K, so the picker showed 1M while the runtime (compressor, gateway `/model`, typed `/model <name>`) correctly used 272K — the classic 'sometimes 1M, sometimes 272K' mismatch on a single model. Both display paths now go through `resolve_display_context_length()`, matching the fix that `_handle_model_switch` received earlier. Also bump the stale last-resort fallback in DEFAULT_CONTEXT_LENGTHS (`gpt-5.5: 400000 -> 1050000`) to match the real OpenAI API value; the 272K Codex cap is already enforced via the Codex-OAuth branch, so the fallback now reflects what every non-Codex probe-miss should see. Tests: adds `test_apply_model_switch_result_context.py` with three scenarios (Codex cap wins, OpenRouter shows 1.05M, resolver-empty falls back to ModelInfo). Updates the existing non-Codex fallback test to assert 1.05M (the correct value). ## Validation \| path \| before \| after \| \|-------------------------------\|-----------\|-----------\| \| picker -> gpt-5.5 on Codex \| 1,050,000 \| 272,000 \| \| picker -> gpt-5.5 on OpenAI \| 1,050,000 \| 1,050,000 \| \| picker -> gpt-5.5 on OpenRouter \| 1,050,000 \| 1,050,000 \| \| typed /model gpt-5.5 on Codex \| 272,000 \| 272,000 \|	2026-04-26 05:43:31 -07:00

1 2 3 4 5 ...

2652 Commits