1239 Commits

Author SHA1 Message Date
Teknium
6993e566ba fix(whatsapp_identity): pin identifier regex to ASCII, clarify it's defense-in-depth
Follow-up on top of #16243. Two small tweaks:

- Compile the regex once as `_SAFE_IDENTIFIER_RE` and pin it to
  `[A-Za-z0-9@.+\-]`. The previous `\w` accepts Unicode word chars
  (full-width digits, accented letters) which aren't valid WhatsApp
  identifiers and shouldn't reach the mapping-file lookup.
- Add a comment clarifying this is defense-in-depth, not a live
  traversal. The hardcoded `lid-mapping-{current}{suffix}.json`
  prefix already prevents escape via pathlib's component split —
  with `current='../secrets'`, the first path component under
  `session/` is the literal directory name `lid-mapping-..`,
  which the attacker cannot create.

E2E verified: legit mapping chains still resolve, all probed attack
shapes (`../`, absolute paths, shell metacharacters, Unicode digit
tricks) are rejected before any file access.
2026-04-26 20:48:31 -07:00
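
A minimal sketch of the guard described in 6993e566ba, assuming the check is a full-string match against the ASCII class quoted in the message. The helper names `is_safe_identifier` and `mapping_path` are hypothetical; only `_SAFE_IDENTIFIER_RE` and the `lid-mapping-{current}{suffix}.json` pattern come from the commit text. It also shows the pathlib component split the message relies on.

```python
import re
from pathlib import PurePosixPath

# Character class taken from the commit message; compiling it once and
# matching the whole string are assumptions about the real code.
_SAFE_IDENTIFIER_RE = re.compile(r"[A-Za-z0-9@.+\-]+")


def is_safe_identifier(identifier: str) -> bool:
    """Return True only for non-empty identifiers made of the ASCII class above."""
    return _SAFE_IDENTIFIER_RE.fullmatch(identifier) is not None


def mapping_path(session_dir: str, current: str, suffix: str = "") -> PurePosixPath:
    """Build the mapping-file path the way the commit message describes it."""
    return PurePosixPath(session_dir) / f"lid-mapping-{current}{suffix}.json"


# Unicode full-width digits pass \w but fail the pinned ASCII class.
fullwidth = "\uff11\uff12\uff13"
print(re.fullmatch(r"\w+", fullwidth) is not None, is_safe_identifier(fullwidth))

# Even without the guard, the hardcoded prefix keeps '..' from being its own component.
print(mapping_path("session", "../secrets").parts)
```

The second print illustrates why the commit calls the regex defense-in-depth: the first component under `session/` is the literal name `lid-mapping-..`, not a parent-directory reference.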
sprmn24
91512b8210 fix(whatsapp_identity): guard against path traversal and silent mapping errors
expand_whatsapp_aliases() interpolated untrusted identifiers directly
into filenames (lid-mapping-{current}.json) without validation.
An identifier containing ../ or / could escape the session directory.

Also replaced bare except Exception: continue with targeted
(OSError, json.JSONDecodeError) and a debug log so mapping
corruption is diagnosable instead of silently skipped.

Fixes:
- Reject identifiers with unsafe characters via re.match guard
- Replace broad exception swallow with specific catch + debug log
2026-04-26 20:48:31 -07:00
Teknium
478444c262 feat(checkpoints): auto-prune orphan and stale shadow repos at startup (#16303)
Every working dir hermes ever touches gets its own shadow git repo under
~/.hermes/checkpoints/{sha256(abs_dir)[:16]}/.  The per-repo _prune is a
no-op (comment in CheckpointManager._prune says so), so abandoned repos
from deleted/moved projects or one-off tmp dirs pile up forever.  Field
reports put the typical offender at 1000+ repos / ~12 GB on active
contributor machines.

Adds an opt-in startup sweep that mirrors the sessions.auto_prune
pattern from #13861 / #16286:

- tools/checkpoint_manager.py: new prune_checkpoints() and
  maybe_auto_prune_checkpoints() helpers.  Deletes shadow repos that
  are orphan (HERMES_WORKDIR marker points to a path that no longer
  exists) or stale (newest in-repo mtime older than retention_days).
  Idempotent via a CHECKPOINT_BASE/.last_prune marker file so it only
  runs once per min_interval_hours regardless of how many hermes
  processes start up.
- hermes_cli/config.py: new checkpoints.auto_prune /
  retention_days / delete_orphans / min_interval_hours knobs.
  Default auto_prune: false so users who rely on /rollback against
  long-ago sessions never lose data silently.
- cli.py / gateway/run.py: startup hooks gated on checkpoints.auto_prune,
  called right next to the existing state.db maintenance block.
- Docs updated with the new config knobs.
- 11 regression tests: orphan/stale deletion, precedence, byte-freed
  tracking, non-shadow dir skip, interval gating, corrupt marker
  recovery.

Refs #3015 (session-file disk growth was fixed in #16286; this covers
the checkpoint side noted out-of-scope there).
2026-04-26 19:05:52 -07:00
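
The orphan/stale rules in 478444c262 can be sketched roughly as follows. This assumes `HERMES_WORKDIR` is a file inside each shadow repo holding the original working-directory path; the function names and return labels are hypothetical and the real `prune_checkpoints()` may differ.

```python
import hashlib
import time
from pathlib import Path
from typing import Optional


def shadow_repo_name(abs_dir: str) -> str:
    """Directory name from the commit message: sha256(abs_dir)[:16]."""
    return hashlib.sha256(abs_dir.encode("utf-8")).hexdigest()[:16]


def classify_shadow_repo(repo: Path, retention_days: int, now: Optional[float] = None) -> str:
    """Return 'skip', 'orphan', 'stale', or 'keep' for one candidate directory."""
    now = time.time() if now is None else now
    marker = repo / "HERMES_WORKDIR"  # assumed to be a file containing the workdir path
    if not marker.is_file():
        return "skip"  # not a shadow repo; never delete it
    workdir = marker.read_text(encoding="utf-8").strip()
    if not Path(workdir).exists():
        return "orphan"  # orphan takes precedence over stale in this sketch
    # Newest mtime of anything inside the repo, falling back to the repo itself.
    newest = max((p.stat().st_mtime for p in repo.rglob("*")), default=repo.stat().st_mtime)
    if now - newest > retention_days * 86400:
        return "stale"
    return "keep"
```

A caller would delete only `orphan` and `stale` entries, and only when `checkpoints.auto_prune` is enabled.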
Teknium
77d4766602 fix(gateway): clear pending model note on auto-reset paths too
PR #16013 plugged the leak in `/new`, but two sibling session-boundary
resets had the same bug:

1. Inactivity / suspended-session auto-reset (top of `_handle_message`)
   previously cleared only reasoning. Now drops model override and the
   queued "/model switched" note as well.
2. Compression-exhaustion auto-reset now also drops the pending note
   alongside the existing model/reasoning cleanup.

All three session-boundary sites now use the identical cleanup idiom.
2026-04-26 19:01:50 -07:00
johnncenae
00c6480a05 fix(gateway): clear stale pending model note on session reset
2026-04-26 19:01:50 -07:00
simbam99
cebf95854b Fix MessageDeduplicator max_size enforcement
2026-04-26 18:51:51 -07:00
Teknium
ab6879634e yuanbao platform (#16298)
Co-authored-by: loongzhao <loongzhao@tencent.com>
2026-04-26 18:50:49 -07:00
Teknium
90c84c6dba fix(gateway): unblock update subprocess on recognized-command bypass
When the gateway intercepts a pending /update prompt and the user sends
a recognized slash command (/new, /help, ...), the command now dispatches
normally AND the detached update subprocess is unblocked by writing a
blank .update_response. _gateway_prompt reads '' → strips → returns the
prompt's default (typically a safe 'n' / skip), so the update process
exits cleanly instead of blocking on stdin until the 30-minute watcher
timeout.

Also clears _update_prompt_pending[session_key] on this path so stray
future input for the same session isn't re-intercepted.

Extends PR #15849 with tests for the new cancel-write + a regression
test pinning the legacy behavior of unrecognized /foo slash commands
still being consumed as the response.
2026-04-26 18:39:44 -07:00
Yukipukii1
bdaf56a94d fix(gateway): bypass slash commands during pending update prompts
2026-04-26 18:39:44 -07:00
Badgerbees
55f212a7a2 fix(slack): honor NO_PROXY for Slack transport
2026-04-26 18:33:35 -07:00
Xnbi
7eaad06a87 fix(gateway): default Slack tool_progress to off
Slack Bolt posts are not editable like CLI spinners, so the medium-tier `new` setting still emitted a permanent line per tool start (issue #14663).

- Built-in slack default: off; other tier-2 platforms unchanged.

- Adjust the /verbose isolation test for the off-to-new cycle.

- Migration tests: read/write config.yaml as UTF-8 (Windows locale).
2026-04-26 18:33:35 -07:00
haru398801
a01e767b24 fix(gateway): respect config.yaml slack.enabled when SLACK_BOT_TOKEN env var is set
Previously, setting SLACK_BOT_TOKEN in .env would unconditionally enable
the Slack gateway adapter regardless of `slack.enabled: false` in config.yaml.
This caused spurious "SLACK_APP_TOKEN not set" errors when the token was
used only by skills (e.g. cron jobs that send Slack messages) rather than
for the Hermes messaging gateway.

Now, enabled: false in config.yaml is respected — the token is stored so
skills can still use it, but the gateway adapter is not activated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-26 18:33:35 -07:00
hharry11
fd474d0f00 fix(gateway): avoid cross-user mirror writes in per-user group sessions
2026-04-26 18:31:24 -07:00
Yang Zhi
3b60abb6bb fix(sessions): delete on-disk transcript files during prune and delete (#3015)
`delete_session()` and `prune_sessions()` only removed SQLite records,
leaving .json/.jsonl transcript files on disk forever. Over time this
causes unbounded disk growth (~27MB/day observed).

Changes:
- Add `_remove_session_files()` static helper that cleans up
  `{session_id}.json`, `.jsonl`, and `request_dump_{session_id}_*.json`
- `delete_session()` accepts optional `sessions_dir` param and removes
  files for the deleted session and its children
- `prune_sessions()` accepts optional `sessions_dir` param and removes
  files for all pruned sessions after the DB transaction
- Wire up CLI `hermes sessions delete` and `hermes sessions prune` to
  pass `sessions_dir`
- File cleanup is best-effort (OSError silenced) so DB operations are
  never blocked by filesystem issues
- Fully backward-compatible: `sessions_dir=None` (default) preserves
  existing behavior
2026-04-26 18:31:07 -07:00
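
A best-effort cleanup like the one 3b60abb6bb describes might look roughly like this. The function name mirrors `_remove_session_files()`, but the signature and the returned count are assumptions for illustration.

```python
from pathlib import Path


def remove_session_files(sessions_dir: Path, session_id: str) -> int:
    """Delete transcript files for one session; return how many were removed."""
    candidates = [
        sessions_dir / f"{session_id}.json",
        sessions_dir / f"{session_id}.jsonl",
        *sessions_dir.glob(f"request_dump_{session_id}_*.json"),
    ]
    removed = 0
    for path in candidates:
        try:
            path.unlink()
            removed += 1
        except OSError:
            # Covers missing files and permission errors alike, so the
            # database operation is never blocked by the filesystem.
            pass
    return removed
```

Running the file cleanup after the DB transaction, as the commit notes, means a failed unlink can leave a stray file but never a half-deleted session record.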
mewwts
8fb861ea6e feat(gateway/slack): support channel_skill_bindings
Extends the existing channel_skill_bindings mechanism (previously
Discord-only) to Slack, so a channel or DM can auto-load one or more
skills at session start without relying on the model's skill selector
for every short reply.

Motivation: Mats's German flashcards DM pushes a cron-driven card
5x/day; he responds with one-word guesses like 'work'. Previously each
reply required the main agent to decide whether to load german-flashcards
(full opus turn just to pick a skill). With the binding configured per
Slack channel, the skill is injected at session start and grading runs
directly.

Changes:
- Extract resolve_channel_skills() from DiscordAdapter._resolve_channel_skills
  into gateway.platforms.base (now shared across adapters).
- DiscordAdapter._resolve_channel_skills delegates to the shared helper
  (behavior preserved — existing test suite still passes unchanged).
- SlackAdapter: resolve channel_skill_bindings on each message and attach
  auto_skill to MessageEvent. gateway/run.py already handles auto-skill
  injection on new sessions; this just wires Slack through it.
- gateway/config.py: accept channel_skill_bindings in slack: block of
  config.yaml (was Discord-only).
- Tests: new tests/gateway/test_slack_channel_skills.py with 11 cases
  covering DM/thread/parent resolution, single-vs-list skills, dedup,
  malformed entries. Discord suite unchanged.
- Docs: add 'Per-Channel Skill Bindings' section to Slack user guide.

Config example:
  slack:
    channel_skill_bindings:
      - id: "D0ATH9TQ0G6"
        skills: ["german-flashcards"]
2026-04-26 18:25:41 -07:00
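
For illustration, the shared helper from 8fb861ea6e could resolve bindings along these lines. This is a sketch under assumptions: the real `resolve_channel_skills()` signature is not shown in the commit, and the lookup order across candidate IDs (for example thread, then parent channel) is hypothetical.

```python
from typing import Any, Iterable, List


def resolve_channel_skills(bindings: Any, channel_ids: Iterable[str]) -> List[str]:
    """Return de-duplicated skills for the first candidate id that has a binding."""
    if not isinstance(bindings, list):
        return []
    for channel_id in channel_ids:
        for entry in bindings:
            # Malformed entries (non-dicts, wrong id) are skipped silently.
            if not isinstance(entry, dict) or str(entry.get("id")) != channel_id:
                continue
            skills = entry.get("skills")
            if isinstance(skills, str):
                skills = [skills]  # accept a single skill as well as a list
            if not isinstance(skills, list):
                continue
            names = [s for s in skills if isinstance(s, str) and s]
            return list(dict.fromkeys(names))  # dedup, preserving order
    return []


bindings = [{"id": "D0ATH9TQ0G6", "skills": ["german-flashcards"]}]
print(resolve_channel_skills(bindings, ["D0ATH9TQ0G6"]))
```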
Teknium
635253b918 feat(busy): add 'steer' as a third display.busy_input_mode option (#16279)
Enter while the agent is busy can now inject the typed text via /steer —
arriving at the agent after the next tool call — instead of interrupting
(current default) or queueing for the next turn.

Changes:
- cli.py: keybinding honors busy_input_mode='steer' by calling
  agent.steer(text) on the UI thread (thread-safe), with automatic
  fallback to 'queue' when the agent is missing, steer() is unavailable,
  images are attached, or steer() rejects the payload. /busy accepts
  'steer' as a fourth argument alongside queue/interrupt/status.
- gateway/run.py: busy-message handler and the PRIORITY running-agent
  path both route through running_agent.steer() when the mode is 'steer',
  with the same fallback-to-queue safety net. Ack wording tells users
  their message was steered into the current run. Restart-drain queueing
  now also activates for 'steer' so messages aren't lost across restarts.
- agent/onboarding.py: first-touch hint has a steer branch for both
  CLI and gateway.
- hermes_cli/commands.py: /busy args_hint updated to include steer,
  and 'steer' is registered as a subcommand (completions).
- hermes_cli/web_server.py: dashboard select widget offers steer.
- hermes_cli/config.py, cli-config.yaml.example, hermes_cli/tips.py:
  inline docs updated.
- website/docs/user-guide/cli.md + messaging/index.md: documented.
- Tests: steer set/status path for /busy; onboarding hints;
  _load_busy_input_mode accepts steer; busy-session ack exercises
  steer success + two fallback-to-queue branches.

Requested on X by @CodingAcct.

Default is unchanged (interrupt).
2026-04-26 18:21:29 -07:00
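
The steer-with-fallback behaviour in 635253b918 can be sketched as a small dispatcher. Everything here is hypothetical apart from the mode names and `agent.steer(text)`; in particular, treating a falsy return value as "rejected the payload" is an assumption.

```python
def dispatch_busy_input(agent, text, mode, has_images=False):
    """Return the action actually taken: 'steer', 'queue', or 'interrupt'."""
    if mode != "steer":
        return mode  # 'queue' and 'interrupt' are handled elsewhere
    steer = getattr(agent, "steer", None)
    if agent is None or not callable(steer) or has_images:
        return "queue"  # agent missing, steer() unavailable, or images attached
    try:
        accepted = steer(text)
    except Exception:
        return "queue"  # any steer failure degrades to queueing
    return "steer" if accepted else "queue"
```

Falling back to `queue` rather than `interrupt` keeps the typed message from being lost while leaving the running turn untouched.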
ghostmfr
e818ec520a fix(slack): harden attachment handling
Multiple overlapping Slack attachment improvements:

1. Upload retry with backoff on transient errors (429, 5xx, connection
   reset, rate_limited, service unavailable). New _is_retryable_upload_error
   helper covers three upload paths: _upload_file, send_video,
   send_document. Up to 3 attempts with 1.5s * attempt backoff.

2. Thread participation tracking: successful file uploads now add the
   thread_ts to _bot_message_ts, mirroring how text replies are tracked.
   This lets follow-up thread messages auto-trigger the bot (same
   engagement rules as replied threads).

3. Thread metadata preservation in the image redirect-guard fallback
   (send_image → send text fallback) and in two gateway.run.py send
   paths (image + document fallback calls).

4. HTML response rejection in _download_slack_file_bytes. Parallels
   the existing check in _download_slack_file. Guards against Slack
   returning a sign-in / redirect page as document bytes when scopes
   are missing, so the agent doesn't get HTML-as-a-PDF.

5. File lifecycle event acks (file_shared / file_created / file_change).
   These events arrive around snippet uploads. Acking them silences the
   slack_bolt 'Unhandled request' 404 warnings without changing behavior.

6. Post-loop message type classification so a mixed image+document upload
   classifies as PHOTO (or VOICE if no image), falling back to DOCUMENT.
   Previously, the per-file classification in the inbound loop could be
   overwritten unpredictably.

7. Expanded text-inject whitelist in inbound document handling to cover
   .csv, .json, .xml, .yaml, .yml, .toml, .ini, .cfg (up to 100KB) so
   snippets and config files are directly visible to the agent, not just
   cached as opaque uploads. Paired with new MIME entries in
   SUPPORTED_DOCUMENT_TYPES in base.py.

Squashed from two commits in #11819 so the single commit carries the
contributor's GitHub attribution (the original commits were authored
under a local dev hostname).
2026-04-26 18:20:17 -07:00
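
Item 1 of e818ec520a (up to 3 attempts with `1.5s * attempt` backoff) can be sketched as below. The string-matching classifier is an assumption about how `_is_retryable_upload_error` might work, and `upload_with_retry` is a hypothetical wrapper.

```python
import re
import time

# Markers taken from the commit message; matching on the error text is an assumption.
_RETRYABLE_MARKERS = ("429", "rate_limited", "service unavailable", "connection reset")


def is_retryable_upload_error(exc: Exception) -> bool:
    """Heuristically decide whether an upload error looks transient."""
    text = str(exc).lower()
    return any(m in text for m in _RETRYABLE_MARKERS) or re.search(r"\b5\d\d\b", text) is not None


def upload_with_retry(upload, attempts=3, base_delay=1.5, sleep=time.sleep):
    """Call upload() up to `attempts` times, sleeping base_delay * attempt between tries."""
    for attempt in range(1, attempts + 1):
        try:
            return upload()
        except Exception as exc:
            if attempt == attempts or not is_retryable_upload_error(exc):
                raise
            sleep(base_delay * attempt)
```

Passing `sleep` as a parameter keeps the backoff testable without real delays.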
Teknium
b16f9d438b feat(telegram): send fresh finals for stale preview streams (port openclaw#72038) (#16261)
Ports openclaw/openclaw#72038 to hermes-agent.

Telegram's `editMessageText` preserves the original message timestamp,
so a long-running streamed reply (reasoning models that take 60+ seconds
to finish) would keep the first-token timestamp even after completion.
Users can't tell how long a task actually took.

When a preview message has been visible for >= 60s (configurable via
`streaming.fresh_final_after_seconds`), finalize by sending a fresh
message instead of editing in place, then best-effort delete the stale
preview. Short previews still edit in place (the existing fast path).

Implementation notes adapted from OpenClaw's TypeScript original:
- `StreamConsumerConfig` gains `fresh_final_after_seconds` (default 0 =
  legacy edit-in-place). Gateway-level `StreamingConfig` defaults to 60.
- `GatewayStreamConsumer` tracks `_message_created_ts` at first-send and
  checks it in `_send_or_edit` on `finalize=True`. New helpers
  `_should_send_fresh_final` + `_try_fresh_final`.
- `BasePlatformAdapter` gains optional `delete_message(chat_id, message_id)`
  returning False by default. `TelegramAdapter` implements it via
  `_bot.delete_message`.
- `gateway/run.py` only enables fresh-final for `Platform.TELEGRAM`;
  other platforms ignore the setting (they don't have the stale-edit
  timestamp problem or edit-then-read works cheaply).
- Fallback to normal edit on any fresh-send failure — no user-visible
  regression if Telegram rate-limits a send or the message is gone.

Tests: 15 new cases in tests/gateway/test_stream_consumer_fresh_final.py
covering short/long previews, config plumbing, delete-support absent,
send-failure fallback, __no_edit__ sentinel safety, and StreamingConfig
round-trip.

Co-authored-by: Hermes Agent <agent@nousresearch.com>
2026-04-26 17:26:37 -07:00
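
The fresh-final decision in b16f9d438b reduces to a small age check. A minimal sketch, assuming the helper receives the preview's creation timestamp and the configured threshold directly (the real `_should_send_fresh_final` is a method whose signature is not shown in the commit):

```python
from typing import Optional


def should_send_fresh_final(
    message_created_ts: Optional[float],
    now: float,
    fresh_final_after_seconds: float,
) -> bool:
    """True when the preview has been visible long enough to warrant a new message."""
    if fresh_final_after_seconds <= 0:
        return False  # 0 keeps the legacy edit-in-place behaviour
    if message_created_ts is None:
        return False  # nothing was sent yet, so there is no stale preview
    return (now - message_created_ts) >= fresh_final_after_seconds


print(should_send_fresh_final(100.0, 170.0, 60))  # preview is 70s old
```

On a `True` result the caller would send a new message and then try to delete the preview, falling back to a normal edit if the send fails.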
Wang-tianhao
6087e04043 fix(slack): extract rich_text quotes/lists and link unfurl previews
Slack's modern composer sends messages with a 'blocks' array that
contains rich_text elements. When a user forwards or quotes another
message, the quoted content shows up in the rich_text_quote children
of that array — and is NOT included in the plain 'text' field. The
agent saw only the lossy plain text and was blind to forwarded /
quoted content. Same story for link unfurl previews (Notion, docs,
GitHub, etc.) which Slack puts in the 'attachments' array.

Two fixes in the inbound handler:

1. _extract_text_from_slack_blocks walks rich_text / rich_text_quote /
   rich_text_list / rich_text_preformatted trees and renders readable
   text ('> quoted', '• bullet', code fences), dedupes against the
   plain text field, and appends the extracted content so the agent
   sees everything.

2. Link unfurl / attachment preview extraction reads title, url,
   body, and footer from the 'attachments' array and appends a
   '📎 [title](url)\n   body\n   _footer_' section per preview.
   Skips is_msg_unfurl to avoid echoing our own Slack replies back.

Routing is careful not to trust augmented text: mention gating
(is_mentioned) and slash-command detection both run against the
original 'text' field, so forwarded content containing '<@bot>' or
'/deploy' in a quote can't trick the bot into responding in a
channel it shouldn't or classifying a normal message as a command.

Adjustment from original PR: dropped _serialize_slack_blocks_for_agent,
which inlined a redacted JSON dump of non-rich_text blocks (section,
accessory, actions, etc.) — the agent would see the raw Block Kit
structure for UI-heavy alerts. It added up to 6000 characters to the
prompt context on every qualifying message with no opt-out. The
rich_text extraction and attachment unfurls cover the common bug-fix
case (quoted/forwarded content + link previews) without the prefill
tax. If a user needs block inspection later, it can return as a
config opt-in.

Also updates the Slack platform notes in session.py to accurately
describe what the gateway inlines.
2026-04-26 13:02:51 -07:00
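
A rough sketch of the rich_text walk from 6087e04043, assuming Slack's documented Block Kit shapes (`rich_text` blocks containing `rich_text_section`, `rich_text_quote`, `rich_text_list`, and `rich_text_preformatted` elements). The function names are hypothetical, and the dedup against the plain `text` field is omitted.

```python
def _inline_text(elements):
    """Flatten inline rich_text elements (text, link, user) into a string."""
    parts = []
    for el in elements or []:
        kind = el.get("type")
        if kind == "text":
            parts.append(el.get("text", ""))
        elif kind == "link":
            parts.append(el.get("text") or el.get("url", ""))
        elif kind == "user":
            parts.append("<@" + el.get("user_id", "") + ">")
    return "".join(parts)


def extract_text_from_blocks(blocks):
    """Render quotes, bullets, and code from rich_text blocks as readable lines."""
    fence = "`" * 3
    lines = []
    for block in blocks or []:
        if block.get("type") != "rich_text":
            continue  # section/actions/etc. are deliberately not serialized
        for el in block.get("elements", []):
            kind = el.get("type")
            body = _inline_text(el.get("elements"))
            if kind == "rich_text_section":
                lines.append(body)
            elif kind == "rich_text_quote":
                lines.extend("> " + line for line in body.splitlines())
            elif kind == "rich_text_list":
                for item in el.get("elements", []):
                    lines.append("\u2022 " + _inline_text(item.get("elements")))
            elif kind == "rich_text_preformatted":
                lines.append(fence + "\n" + body + "\n" + fence)
    return "\n".join(lines)
```

As the commit stresses, mention gating and slash-command detection should keep reading the original `text` field, never this augmented output.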
Tranquil-Flow
bf05b8f4a2 fix(gateway): clean up cached agents on shutdown (#11205)
2026-04-26 12:51:53 -07:00
Zainan Victor Zhou
778fd1898e fix(slack): surface attachment access diagnostics
Translate Slack attachment failures into actionable user-facing notices
instead of generic download errors. When a scope/auth/permission issue
breaks attachment processing, the user sees:

  [Slack attachment notice]
  - Slack attachment access failed for photo.jpg. Missing scope:
    files:read. Update the Slack app scopes/settings and reinstall
    the app to the workspace.

Two helpers do the translation:

  _describe_slack_api_error — handles SlackApiError responses
    (missing_scope, invalid_auth, file_not_found, access_denied, etc.)

  _describe_slack_download_failure — handles httpx.HTTPStatusError
    (401/403/404) and Slack-returns-HTML-sign-in fallbacks

Wired into three existing call sites:
 - the Slack Connect files.info path (PR #11111) so scope errors
   surface instead of being logged as generic "files.info failed"
 - the image, audio, and document download paths so 401/403 and
   HTML-body responses translate into actionable notices

Adjustment from original PR: dropped _probe_slack_file_access_issue,
the proactive pre-download files.info probe. It added one extra
Slack API call per attachment even on healthy ones, and overlapped
with the existing files.info call from PR #11111. The post-failure
translation path covers the same user-facing diagnostic value
without the per-message tax.

Also documents files:read scope more prominently in the Slack setup
guide and troubleshooting table.

Contributed back from https://github.com/xinbenlv/zn-hermes-agent.

Closes #7015.
Co-authored-by: xinbenlv <zzn+pa@zzn.im>
2026-04-26 12:47:43 -07:00
kunlabs
f9885130b4 fix(slack): download files in Slack Connect channels
Slack Connect channels return file objects with file_access="check_file_info"
and no url_private_download field (see
https://docs.slack.dev/reference/objects/file-object/#slack_connect_files).
These stub objects must be resolved via files.info before download can
proceed. Without this the agent silently skips attachments posted in
Slack Connect channels.

Call files.info on every file whose file_access is check_file_info,
replace the stub with the full file object, and let the existing
download path continue. Warn and skip on files.info failures.

Closes #11095.
2026-04-26 12:35:16 -07:00
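
The stub-resolution step from f9885130b4 can be sketched with an injected lookup function. `resolve_connect_files` and the `files_info` callable are hypothetical; the sketch assumes the callable returns a `files.info`-style response whose `file` key holds the full file object.

```python
def resolve_connect_files(files, files_info):
    """Replace Slack Connect stub objects with full file objects before download."""
    resolved = []
    for f in files:
        if f.get("file_access") == "check_file_info":
            try:
                f = files_info(f["id"])["file"]
            except Exception:
                # The commit says to warn and skip; logging is left out of this sketch.
                continue
        resolved.append(f)
    return resolved
```

Files that already carry download URLs pass through unchanged, so the existing download path needs no Slack Connect special-casing.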
flobo3
f414df3a56 fix(slack): include team_id in thread-context cache key
2026-04-26 12:35:16 -07:00
Satoshi-agi
c0d25df311 fix(slack): preserve thread-parent context when cron/bot posted the parent
The Slack thread-context fetcher used to drop every message with a
bot_id, which silently erased the thread parent whenever a cron job (or
any other bot) had posted it. As a result, replies to a cron-posted
summary lost all context and the agent answered as if from a blank
thread.

Changes:

1. gateway/platforms/slack.py::_fetch_thread_context
   - Keep the thread parent even when it was posted by a bot
     (e.g. cron summaries, third-party integrations).
   - Only skip *our own* prior bot replies to avoid circular context,
     matching the per-workspace bot user id via _team_bot_user_ids so
     multi-workspace deployments stay correct.
   - Keep non-self bot children (useful third-party context).

2. gateway/platforms/slack.py::_handle_slack_message
   - Populate MessageEvent.reply_to_text for thread replies (parity
     with Telegram/Discord/Feishu/WeCom). gateway.run uses this field
     to inject a [Replying to: "..."] prefix when the parent is not
     already in the session history, which is exactly the scenario
     triggered by cron-generated thread parents.
   - New helper _fetch_thread_parent_text reuses the existing thread-
     context cache (and its 60s TTL) to avoid duplicate
     conversations.replies calls; falls back to a cheap limit=1 fetch
     when the cache is cold.

Tests:

- Updated TestSlackThreadContext::test_skips_bot_messages to reflect
  the new behaviour (self-bot child dropped, third-party bot kept).
- Added:
    * test_fetch_thread_context_includes_bot_parent
    * test_fetch_thread_context_excludes_self_bot_replies
    * test_fetch_thread_context_multi_workspace
    * test_fetch_thread_context_current_ts_excluded (regression guard)
    * test_fetch_thread_parent_text_from_cache
    * test_slack_reply_to_text_set_on_thread_reply
    * test_slack_reply_to_text_none_for_top_level_message

Full Slack suite: 176 passed (was 169).
2026-04-26 12:35:16 -07:00
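
The filtering rules from c0d25df311 can be illustrated as follows. This is a sketch, not the actual `_fetch_thread_context`: the function name and arguments are hypothetical, and it assumes a message from our own bot is identifiable by `bot_id` being set and `user` matching the workspace's bot user id.

```python
def filter_thread_messages(messages, thread_ts, current_ts, self_bot_user_id):
    """Keep the parent and third-party bots; drop our own replies and the current message."""
    kept = []
    for msg in messages:
        ts = msg.get("ts")
        if ts == current_ts:
            continue  # the message being answered is not context
        is_parent = ts == thread_ts
        is_self_bot = bool(msg.get("bot_id")) and msg.get("user") == self_bot_user_id
        if is_self_bot and not is_parent:
            continue  # avoid circular context from our own prior replies
        kept.append(msg)
    return kept
```

In a multi-workspace deployment, `self_bot_user_id` would be looked up per workspace (the commit mentions `_team_bot_user_ids`) before calling a helper like this.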
hhuang91
802c7acb81 fix(Slack): resolve Slack channels by raw ID and enumerate joined channels
send_message(target='slack:<channel_id>') failed with "Could not
resolve" because _parse_target_ref had no Slack branch — Slack's
uppercase alphanumeric IDs fell through to channel-name resolution,
which only matched by name. As a fallback, the agent would retry with
bare target='slack' and post to the home channel instead.

Three fixes:

- _parse_target_ref recognizes Slack IDs (C/G/D/U/W prefix) as
  explicit targets so the name-resolver is bypassed entirely.
- resolve_channel_name tries a case-sensitive raw-ID match before
  the existing name match, so any platform's IDs resolve cleanly.
- _build_slack now actually calls users.conversations against each
  workspace's AsyncWebClient (paginated), instead of only returning
  session-history entries. This populates the directory with public
  and private channels the bot has joined, so action='list' shows
  them and they can also be addressed by name. Errors from one
  workspace don't block others.

build_channel_directory becomes async (Slack web calls require it).
The two async-context callers in gateway/run.py are awaited; the
cron ticker thread call bridges via asyncio.run_coroutine_threadsafe.

Slack bot needs channels:read and groups:read scopes for full
enumeration; missing scopes degrade gracefully per-workspace.

Addresses #15927.
2026-04-26 12:29:02 -07:00
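
A possible shape for the Slack branch added to `_parse_target_ref` in 802c7acb81. The C/G/D/U/W prefixes come from the commit; the exact pattern (uppercase alphanumerics, minimum length) and the helper name are assumptions.

```python
import re
from typing import Optional

# Prefixes from the commit message; the length bound is a guess for illustration.
_SLACK_ID_RE = re.compile(r"[CGDUW][A-Z0-9]{8,}")


def parse_slack_target(target: str) -> Optional[str]:
    """Return the raw Slack ID for 'slack:<ID>' targets, else None."""
    platform, _, ref = target.partition(":")
    if platform != "slack" or not ref:
        return None
    return ref if _SLACK_ID_RE.fullmatch(ref) else None


print(parse_slack_target("slack:D0ATH9TQ0G6"))
print(parse_slack_target("slack:general"))
```

A `None` result would send the target through name resolution, which per the commit now also tries a case-sensitive raw-ID match first.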
Honza Stepanovsky
50dd67c680 fix(slack): skip _mentioned_threads registration when strict_mention is on
Extends the strict_mention feature so an @mention in strict mode no
longer persistently tags the thread as 'mentioned'. Without this, the
thread's first mention would permanently auto-trigger the bot on every
subsequent message — which is exactly what strict_mention is designed
to prevent. Closes the agent-to-agent ack loop hole hhhonzik identified
in #14117.

Co-authored-by: hhhonzik <me@janstepanovsky.cz>
2026-04-26 12:23:20 -07:00
Ching
aea4a90f0e feat(slack): add opt-in slack.strict_mention gate for channel threads
Adds a strict_mention config option that, when enabled, requires an
explicit @-mention on every message in channel threads. Disables the
'once mentioned, forever in the thread' and session-presence auto-triggers.

- New _slack_strict_mention() helper (config.extra + SLACK_STRICT_MENTION env)
- Bridged top-level slack.strict_mention yaml to SLACK_STRICT_MENTION env,
  matching require_mention/allow_bots bridging
- Unit tests for the helper + config bridge
2026-04-26 12:23:20 -07:00
Teknium
4b5a88d714 fix(slack): honor reply_in_thread=false for top-level channel messages
Top-level channel messages arrive at _resolve_thread_ts with
metadata.thread_id set to the message's own ts, because the inbound
handler in _handle_message_event uses 'event.ts' as a session-keying
fallback when event.thread_ts is absent. That made metadata alone
insufficient to distinguish a real thread reply from a top-level
message, so reply_in_thread=false only took effect in DMs.

Use reply_to (== incoming message_id == ts for top-level messages) as
the tiebreaker: when metadata.thread_id == reply_to the 'thread' is the
synthetic session-keying fallback, not a real parent, so we reply
directly in the channel. Real thread replies (reply_to != thread_id)
still resolve to the parent thread and preserve conversation context.

Closes #9268.
2026-04-26 12:04:46 -07:00
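
The tiebreaker in 4b5a88d714 can be written as a small pure function. A sketch under assumptions: the real `_resolve_thread_ts` reads these values from message metadata and config, and DM handling is not modeled here.

```python
from typing import Optional


def resolve_thread_ts(thread_id: Optional[str], reply_to: Optional[str], reply_in_thread: bool) -> Optional[str]:
    """Return the thread_ts to reply in, or None to post directly in the channel."""
    if not thread_id:
        return None
    if not reply_in_thread and thread_id == reply_to:
        # thread_id is only the synthetic session-keying fallback (the
        # message's own ts), so this is a top-level message, not a thread.
        return None
    return thread_id


print(resolve_thread_ts("1714.1", "1714.1", reply_in_thread=False))  # top-level message
print(resolve_thread_ts("1714.1", "1714.9", reply_in_thread=False))  # real thread reply
```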
bde3249023
b1be86ef96 fix(gateway): bridge slack.reply_in_thread config
2026-04-26 12:04:46 -07:00
Zhi Yan Liu
d993a3f450 fix(gateway): use /hermes sethome in onboarding hint on Slack
Slack's adapter registers a single parent slash command /hermes and
dispatches subcommands via slack_subcommand_map(). Bare /sethome is
not a registered command on Slack and fails with 'app did not
respond', logging 'Unhandled request' in slack_bolt.AsyncApp.

Show /hermes sethome in the first-run onboarding hint when the
source platform is Slack; keep /sethome for Telegram, Discord,
Matrix, Mattermost, and other platforms that register it directly.

Fixes #14632
2026-04-26 11:56:23 -07:00
Teknium
1dfcc2ffc3 fix(gateway): /queue is now a true FIFO — each invocation gets its own turn (#16175)
Repeated /queue commands now each produce a full agent turn, in order,
with no merging.  Previously the second /queue overwrote the first
because the handler wrote directly into the adapter's single-slot
_pending_messages dict.

- GatewayRunner grows a _queued_events overflow buffer (dict of list).
- /queue puts new items in the adapter's next-up slot when free,
  otherwise appends to the overflow.  After each run's drain consumes
  the slot, the next overflow item is promoted so the recursive run
  picks it up.
- /new and /reset clear the overflow.
- /status now reports queue depth when non-zero.
- Ack message shows the depth once it exceeds 1.

Helpers (_enqueue_fifo, _promote_queued_event, _queue_depth) use the
getattr default-fallback pattern so existing tests that build bare
GatewayRunner instances via object.__new__ keep working.
2026-04-26 11:55:09 -07:00
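
The slot-plus-overflow design from 1dfcc2ffc3 can be modeled in a few lines. The class and method names are hypothetical stand-ins for `_enqueue_fifo`, `_promote_queued_event`, and `_queue_depth`, and the adapter's `_pending_messages` dict is represented by a plain dict.

```python
from collections import defaultdict


class QueueSketch:
    """Single next-up slot per session, backed by a FIFO overflow list."""

    def __init__(self):
        self.pending = {}                  # stands in for the adapter's single-slot dict
        self.overflow = defaultdict(list)  # stands in for the _queued_events buffer

    def enqueue(self, key, event):
        """Use the slot when free, otherwise append to overflow; return the depth."""
        if key not in self.pending:
            self.pending[key] = event
        else:
            self.overflow[key].append(event)
        return self.depth(key)

    def drain(self, key):
        """Consume the slot, then promote the oldest overflow item into it."""
        event = self.pending.pop(key, None)
        if event is not None and self.overflow[key]:
            self.pending[key] = self.overflow[key].pop(0)
        return event

    def depth(self, key):
        return (1 if key in self.pending else 0) + len(self.overflow.get(key, []))

    def clear(self, key):
        """What /new and /reset would do to the queue for this session."""
        self.pending.pop(key, None)
        self.overflow.pop(key, None)
```

Because promotion happens only after a drain, each queued item still gets its own full turn, in arrival order.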
Teknium
087e74d4d7 feat(slack): register every gateway command as a native slash (Discord/Telegram parity) (#16164)
Every command in COMMAND_REGISTRY (/btw, /stop, /model, /help, /new,
/bg, /reset, ...) is now a first-class Slack slash command instead of
a /hermes <subcommand>. Users get the same autocomplete-driven slash
picker experience Slack users expect and that Discord and Telegram
already provide.

Previously Slack registered ONE native slash (/hermes) and split on
the first word, so typing /btw in Slack's composer got 'couldn't find
an app for /btw' because the workspace manifest never declared it.

Changes
- hermes_cli/commands.py: slack_native_slashes() + slack_app_manifest()
  generate a Slack manifest from the registry (canonical names +
  aliases + plugin commands), clamped to Slack's 50-slash cap with
  /hermes reserved as the catch-all.
- gateway/platforms/slack.py: single regex matcher dispatches every
  registered slash to _handle_slash_command, which dispatches on
  command['command']. Legacy /hermes <subcommand> keeps working for
  backward compat with older workspace manifests.
- hermes_cli/slack_cli.py + hermes_cli/main.py: new 'hermes slack
  manifest' command prints/writes a full manifest (display info,
  OAuth scopes, event subs, socket mode, slash commands) ready to
  paste into 'Create from manifest' or Features → App Manifest.
- hermes_cli/setup.py: _setup_slack() now writes the manifest up-front
  and points users at the 'From an app manifest' flow; also offers
  to refresh the manifest on reconfigure for picking up new commands.
- Tests: 14 new tests covering native-slash dispatch (/btw, /stop,
  /model), legacy /hermes <sub> compat, manifest structure, and
  telegram<->slack parity (every Telegram command must also register
  as a Slack slash). Existing /hermes-registration test updated to
  assert the new regex matches /hermes, /btw, /stop, /model, /help.
- Docs: slack.md gains a 'Slash Commands' section + Option A manifest
  flow in Step 1; cli-commands.md documents 'hermes slack manifest'.

Users pick up the new slashes by running 'hermes slack manifest --write'
and pasting into Features → App Manifest → Edit in their Slack app
config, then Save (Slack prompts for reinstall if scopes changed).
2026-04-26 11:38:32 -07:00
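
The clamping rule from 087e74d4d7 (Slack's 50-slash cap with /hermes reserved as the catch-all) might be sketched like this. Only the function name `slack_native_slashes` and the cap come from the commit; the argument list and ordering policy are assumptions.

```python
SLACK_SLASH_CAP = 50  # cap cited in the commit message


def slack_native_slashes(command_names, cap=SLACK_SLASH_CAP):
    """Return at most `cap` slash names, always keeping /hermes first."""
    ordered = ["hermes"]  # reserved catch-all for legacy /hermes <subcommand>
    for name in command_names:
        if name not in ordered:
            ordered.append(name)  # drop duplicates such as repeated aliases
    return ["/" + name for name in ordered[:cap]]


print(slack_native_slashes(["btw", "stop", "model", "help", "new"]))
```

Commands dropped by the cap would remain reachable through `/hermes <subcommand>`, which is why the catch-all is placed ahead of everything else.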
briandevans
4e356098d2 fixup! fix(gateway): preserve inactivity clock on interrupt-recursive cached-agent turns (#15654)
Address Copilot review findings:

1. Gate _last_activity_desc on interrupt_depth == 0 alongside _last_activity_ts.
   Both fields are semantically paired — desc describes the activity *at* ts.
   Updating desc without ts made get_activity_summary() report "starting new
   turn (cached)" for 20+ minutes while the timestamp showed the true stale
   duration, producing misleading diagnostic output.

2. Monkeypatch gateway.run.time.time to a fixed epoch in tests that assert
   on _last_activity_ts values.  Real time.time() comparisons were latently
   flaky under slow CI or NTP adjustments.  _FAKE_NOW = 10_000.0 is used
   as the reference; assertions are now exact equality rather than >=.

3. Add test_fresh_turn_resets_desc and test_interrupt_turn_preserves_desc to
   directly cover the gated desc behaviour introduced by (1).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 08:45:44 -07:00
briandevans
de24315978 fix(gateway): preserve inactivity clock on interrupt-recursive cached-agent turns (#15654)
_last_activity_ts was unconditionally reset to time.time() on every
_agent_cache hit.  For interrupt-recursive _run_agent calls
(_interrupt_depth > 0) this silently reset the inactivity watchdog's
idle clock on each re-entry, preventing the 30-min timeout from ever
firing when a turn got stuck in an interrupt loop.  A stuck session
would emit "Still working... iteration 0/60, starting new turn (cached)"
heartbeats indefinitely instead of timing out.

Gate the reset on _interrupt_depth == 0 only.  Fresh external turns
still receive the reset so a session idle for 29 min doesn't trip the
watchdog before the new turn makes its first API call (#9051).

The per-turn reset logic is extracted into a static helper
_init_cached_agent_for_turn() to make it directly testable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 08:45:44 -07:00
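
The gating described in de24315978 and its fixup 4e356098d2 can be sketched as a single helper. The name mirrors `_init_cached_agent_for_turn()`, but the signature, the injectable `now`, and the description string placement are assumptions.

```python
import time
from types import SimpleNamespace


def init_cached_agent_for_turn(agent, interrupt_depth, now=None):
    """Reset the inactivity clock only for fresh external turns."""
    if interrupt_depth == 0:
        # ts and desc are paired, so both are updated together or not at all.
        agent._last_activity_ts = time.time() if now is None else now
        agent._last_activity_desc = "starting new turn (cached)"
    return agent


agent = SimpleNamespace(_last_activity_ts=100.0, _last_activity_desc="tool call")
init_cached_agent_for_turn(agent, interrupt_depth=1, now=10_000.0)
print(agent._last_activity_ts, agent._last_activity_desc)  # unchanged on interrupt re-entry
```

Leaving both fields untouched on interrupt-recursive calls lets the watchdog see the true idle duration and eventually time out a stuck session.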
Teknium
20cb706e03 chore: extend [SYSTEM:→[IMPORTANT: rename + AUTHOR_MAP
Follow-up to #6616 covering the remaining user-injected prompt markers that
the original PR did not touch (reporter's second comment on #6576 explicitly
flagged these). Azure OpenAI Default/DefaultV2 content filters treat any
bracketed [SYSTEM: ...] as prompt-injection and reject with HTTP 400.

Remaining call sites renamed:
- cli.py: background-process notifications (watch_disabled, watch_match,
  completion), MCP reload notice (4 live + 1 docstring)
- gateway/run.py: same notification paths + auto-loaded skill banner +
  MCP reload notice (5 live + 1 docstring)
- tools/process_registry.py: comment reference

Not renamed:
- environments/hermes_base_env.py '[SYSTEM]\n{content}' — RL training
  trajectory rendering only, never sent to Azure, part of a symmetric
  [USER]/[ASSISTANT]/[TOOL] scheme.

AUTHOR_MAP: buraysandro9@gmail.com -> ygd58.
2026-04-26 08:44:58 -07:00
Teknium
06f81752ed Revert "feat(kanban): durable multi-profile collaboration board (#16081)" (#16098)
This reverts commit 15937a6b46.
2026-04-26 08:29:37 -07:00
Teknium
15937a6b46 feat(kanban): durable multi-profile collaboration board (#16081)
New `hermes kanban` CLI subcommand + `/kanban` slash command + skills for
worker and orchestrator profiles. SQLite-backed task board
(~/.hermes/kanban.db) shared across all profiles on the host. Zero
changes to run_agent.py, no new core tools, no tool-schema bloat.

Motivation: delegate_task is a function call — sync fork/join, anonymous
subagent, no resumability, no human-in-the-loop. Kanban is the durable
shape needed for research triage, scheduled ops, digital twins,
engineering pipelines, and fleet work. They coexist (workers may call
delegate_task internally).

What this adds
- hermes_cli/kanban_db.py — schema, CAS claim, dependency resolution,
  dispatcher, workspace resolution, worker-context builder.
- hermes_cli/kanban.py — 15-verb CLI surface and shared run_slash()
  entry point used by both CLI and gateway.
- skills/devops/kanban-worker — how a profile should work a claimed task.
- skills/devops/kanban-orchestrator — "you are a dispatcher, not a
  worker" template with anti-temptation rules.
- /kanban slash command wired into cli.py and gateway/run.py. Bypasses
  the running-agent guard (board writes don't touch agent state), so
  /kanban unblock can free a stuck worker mid-conversation.
- Design spec at docs/hermes-kanban-v1-spec.pdf — comparative analysis
  vs Cline Kanban, Paperclip, NanoClaw, Gemini Enterprise; 8 patterns;
  4 user stories; implementation plan; concurrency correctness.
- Docs: website/docs/user-guide/features/kanban.md, CLI reference
  updated, sidebar entry added.

Architecture highlights
- Three planes: control (user + gateway), state (board + dispatcher),
  execution (pool of profile processes).
- Every worker is a full OS process, spawned as `hermes -p <profile>`.
  No in-process subagent swarms — solves NanoClaw's SDK-lifecycle
  failure class.
- Atomic claim via SQLite CAS in a BEGIN IMMEDIATE transaction; stale
  claims reclaimed 15 min after their TTL expires.
- Tenant namespacing via one nullable column — one specialist fleet
  can serve many businesses with data isolation by workspace path.
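
The atomic-claim bullet above can be sketched roughly as follows. The `tasks` table, its columns, and both function names are assumptions made for illustration; the actual schema in hermes_cli/kanban_db.py is not shown here, and TTL-based reclaim is left out.

```python
import sqlite3


def open_board(path=":memory:"):
    # isolation_level=None disables implicit transactions, so the
    # BEGIN/COMMIT statements below are issued exactly as written.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "id INTEGER PRIMARY KEY, claimed_by TEXT)"
    )
    return conn


def claim_task(conn, task_id, worker):
    """Return True if this worker won the claim, False if it was taken."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
    try:
        # Compare-and-set: the row is updated only while still unclaimed.
        cur = conn.execute(
            "UPDATE tasks SET claimed_by = ? "
            "WHERE id = ? AND claimed_by IS NULL",
            (worker, task_id),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return cur.rowcount == 1
```

Because the `WHERE` clause carries the expected old value, a second worker racing for the same task updates zero rows and simply moves on.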

Tests: 60 targeted tests (schema, CAS atomicity, dependency resolution,
dispatcher, workspace kinds, tenancy, CLI + slash surface). All pass
hermetically via scripts/run_tests.sh.
2026-04-26 08:24:26 -07:00
Teknium
454d883e69 refactor: drop persist_session plumbing + fix broken btw mid-turn bypass (#16075)
Follow-up to PR #16053 (/btw as /background alias). Cleans up the
plumbing added exclusively for the old ephemeral /btw handler and
repairs a broken btw bypass that landed between my refactor and this
follow-up.

run_agent.py:
- Remove persist_session kwarg, instance attr, and _persist_session
  short-circuit. Only /btw ever passed persist_session=False; with
  /btw gone the default (always persist) is the only behavior anyone
  ever wanted.

gateway/run.py:
- Remove the unreachable 'if _cmd_def_inner.name == "btw"' block
  (PR #16059). Canonical name for a /btw message is 'background' after
  alias resolution — the comparison could never be true, and it called
  _handle_btw_command which no longer exists. The /background branch
  above it already dispatches /btw correctly.

tests/gateway/test_running_agent_session_toggles.py:
- Fix test_btw_dispatches_mid_run to mock _handle_background_command
  (the real dispatch target for /btw) instead of the deleted
  _handle_btw_command.
2026-04-26 07:15:23 -07:00
Teknium
70f56e7605 fix(gateway): let /btw dispatch mid-turn instead of being rejected
/btw spawns a parallel ephemeral side-question task (self-guarded against
concurrent /btw on the same chat) — exactly like /background. But it was
missing from the running-agent bypass list in _handle_message(), so it
fell through to the catch-all and returned:

  Agent is running — /btw can't run mid-turn. Wait for the current
  response or /stop first.

That's the opposite of what /btw is for — asking a side question while
the main turn is still working. Add the bypass next to /background and a
regression test covering the mid-turn dispatch path.

Reported by @IuriiTiunov on Telegram.
2026-04-26 07:11:10 -07:00
Teknium
7fa70b6c87 refactor: /btw is now an alias for /background (#16053)
The ephemeral no-tools side-question variant of /btw confused users who
expected 'by-the-way' to mean 'run this off to the side with tools' —
they'd type /btw and get a toolless agent that couldn't do the work.
/bg worked because it was /background with full tools.

Collapse the two: /btw and /bg both alias to /background. One command,
one behavior, no more gotchas about which variant has tools.

Removed:
- _handle_btw_command in cli.py and gateway/run.py
- _run_btw_task + _active_btw_tasks state in gateway/run.py
- prompt.btw JSON-RPC method + btw.complete event in tui_gateway
- BtwStartResponse type + btw.complete case in ui-tui
- Standalone /btw slash tree registration in Discord
- Standalone btw CommandDef in hermes_cli/commands.py

Updated:
- background CommandDef aliases: (bg,) -> (bg, btw)
- TUI session.ts: local btw handler merged into background
- Docs and tips updated to describe /btw as a /background alias
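
For illustration, alias resolution of this shape might look like the sketch below. The `CommandDef` fields and `resolve_command` are simplified assumptions, not the definitions in hermes_cli/commands.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommandDef:
    # Hypothetical, reduced to the two fields this commit talks about.
    name: str
    aliases: tuple = ()


COMMANDS = [CommandDef("background", aliases=("bg", "btw"))]


def resolve_command(token):
    """Map a typed slash command (with or without '/') to its canonical def."""
    word = token.lstrip("/").lower()
    for cmd in COMMANDS:
        if word == cmd.name or word in cmd.aliases:
            return cmd
    return None
```

After resolution the handler only ever sees the canonical name, so `/btw` and `/bg` both arrive as `background`.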
2026-04-26 07:11:08 -07:00
Teknium
83c1c201f6 feat(onboarding): contextual first-touch hints for /busy and /verbose (#16046)
Instead of a blocking first-run questionnaire, show a one-time hint the first
time the user hits each behavior fork:

1. First message while the agent is working — appends a hint to the busy-ack
   explaining the /busy queue vs /busy interrupt knob, phrased to match the
   mode that was just applied (don't tell a queue-mode user to switch to
   queue).

2. First tool that runs for >= 30s in the noisiest progress mode
   (tool_progress: all) — prints a hint about /verbose to cycle display
   modes (all -> new -> off -> verbose). Gated on /verbose actually being
   usable on the surface: always shown on CLI; on gateway only shown when
   display.tool_progress_command is enabled.

Each hint is latched in config.yaml under onboarding.seen.<flag>, so it
fires exactly once per install across CLI, gateway, and cron, then never
again. Users can wipe the section to re-see hints.
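
A rough sketch of the one-time latch, operating on an already-loaded config dict. The flag name `busy_input` and the `maybe_hint` wrapper are hypothetical, and persistence to config.yaml is deliberately omitted.

```python
def is_seen(config, flag):
    """True once the hint identified by `flag` has been shown."""
    return bool(config.get("onboarding", {}).get("seen", {}).get(flag, False))


def mark_seen(config, flag):
    """Latch the flag under onboarding.seen.<flag> in the in-memory config."""
    config.setdefault("onboarding", {}).setdefault("seen", {})[flag] = True


def maybe_hint(config, flag, hint):
    """Return the hint text the first time only, then None forever after."""
    if is_seen(config, flag):
        return None
    mark_seen(config, flag)
    return hint


config = {}
first = maybe_hint(config, "busy_input", "Tip: see /busy")   # hint text
second = maybe_hint(config, "busy_input", "Tip: see /busy")  # None
```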

New:
- agent/onboarding.py — is_seen / mark_seen / hint strings, shared by
  both CLI and gateway.
- onboarding.seen in DEFAULT_CONFIG (hermes_cli/config.py) and in
  load_cli_config defaults (cli.py). No _config_version bump — deep
  merge handles new keys.

Wired:
- gateway/run.py: _handle_active_session_busy_message appends the hint
  after building the ack.  progress_callback tracks tool.completed
  duration and queues the tool-progress hint into the progress bubble.
- cli.py: CLI input loop appends the busy-input hint on the first busy
  Enter; _on_tool_progress appends the tool-progress hint on the first
  >=30s tool completion.  In-memory CLI_CONFIG is also updated so
  subsequent fires in the same process are suppressed immediately.

All writes go through atomic_yaml_write and are wrapped in try/except
so onboarding can never break the input/busy-ack paths.
2026-04-26 06:06:27 -07:00
Teknium
4bda9dcade fix(gateway): honor voice.auto_tts config in auto-TTS gate (#16007) (#16039)
The base adapter's auto-TTS path fired on any voice message unless the
chat had explicitly run /voice off — it never read voice.auto_tts from
config.yaml, so users who set auto_tts: false still got audio replies.

Gate the base adapter on a three-layer decision instead:
  1. chat in _auto_tts_enabled_chats (explicit /voice on|tts) → fire
  2. chat in _auto_tts_disabled_chats (explicit /voice off)  → suppress
  3. else → voice.auto_tts global default

Runner now pushes voice.auto_tts onto the adapter as _auto_tts_default
and mirrors /voice on|tts chats into _auto_tts_enabled_chats via the
existing _sync_voice_mode_state_to_adapter path. /voice off still wins.
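
The three-layer decision can be sketched as a pure function. The function name and parameter names are illustrative; in the adapter the inputs would come from `_auto_tts_enabled_chats`, `_auto_tts_disabled_chats`, and `_auto_tts_default`.

```python
def should_auto_tts(chat_id, enabled_chats, disabled_chats, default):
    """Decide whether a voice message in `chat_id` gets an audio reply."""
    if chat_id in enabled_chats:    # 1. explicit /voice on|tts
        return True
    if chat_id in disabled_chats:   # 2. explicit /voice off
        return False
    return bool(default)            # 3. voice.auto_tts global default
```

The layers are checked in the order listed above, so the global default only applies to chats that never ran a /voice command.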

Closes #16007.
2026-04-26 05:52:05 -07:00
Teknium
35c57cc46b fix(gateway): suppress tool-progress bubbles after interrupt (#16034)
When the LLM response carries N parallel tool calls, the agent fires
N tool.started events back-to-back before its interrupt check runs.
A user sending /stop mid-batch would see the 'Interrupting current
task' ack followed by a trail of 🔍 web_search bubbles for the remaining
events in the batch — making the interrupt feel ignored.

progress_callback and the drain loop in send_progress_messages now
check agent.is_interrupted (via agent_holder[0], the existing
cross-scope handle). Events that arrive after interrupt are dropped
at both the queueing and rendering stages. The 'Interrupting'
message is sent through a separate adapter path and is unaffected.
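
A simplified sketch of the two drop points, assuming `agent_holder` is a one-element list and `is_interrupted` reads as a boolean attribute (it may be a property in the real agent). The queue type and both function names are illustrative.

```python
import queue
from types import SimpleNamespace


def _interrupted(agent_holder):
    agent = agent_holder[0]
    return agent is not None and bool(getattr(agent, "is_interrupted", False))


def make_progress_callback(agent_holder, progress_queue):
    def progress_callback(event):
        # Queueing stage: drop events that arrive after the interrupt.
        if _interrupted(agent_holder):
            return
        progress_queue.put(event)
    return progress_callback


def drain_progress(agent_holder, progress_queue):
    """Return the events that would still be rendered as bubbles."""
    rendered = []
    while True:
        try:
            event = progress_queue.get_nowait()
        except queue.Empty:
            break
        # Rendering stage: events queued just before the interrupt are
        # discarded here instead of being shown.
        if _interrupted(agent_holder):
            continue
        rendered.append(event)
    return rendered


holder = [SimpleNamespace(is_interrupted=False)]
events = queue.Queue()
callback = make_progress_callback(holder, events)
```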
2026-04-26 05:47:37 -07:00
Teknium
125de02056 fix(context): honor custom_providers context_length on /model switch + bump probe tier to 256K (#15844)
Fixes #15779. Custom-provider per-model context_length (`custom_providers[].models.<id>.context_length`) is now honored across every resolution path, not just agent startup. Also adds 256K as the top probe tier and default fallback.

## What changed

New helper `hermes_cli.config.get_custom_provider_context_length()` — single source of truth for the per-model override lookup, with trailing-slash-insensitive base-url matching.

`agent.model_metadata.get_model_context_length()` gains an optional `custom_providers=` kwarg (step 0b — runs after explicit `config_context_length` but before every other probe).

Wired through five call sites that previously either duplicated the lookup or ignored it entirely:
- `run_agent.py` startup — refactored to use the new helper (dedups legacy inline loop, keeps invalid-value warning)
- `AIAgent.switch_model()` — re-reads custom_providers from live config on every /model switch
- `hermes_cli.model_switch.resolve_display_context_length()` — new `custom_providers=` kwarg
- `gateway/run.py` /model confirmation (picker callback + text path)
- `gateway/run.py` `_format_session_info` (/info)

## Context probe tiers

`CONTEXT_PROBE_TIERS = [256_000, 128_000, 64_000, 32_000, 16_000, 8_000]` — was `[128_000, ...]`. `DEFAULT_FALLBACK_CONTEXT` follows tier[0], so unknown models now default to 256K. The stale `128000` literal in the OpenRouter metadata-miss path is replaced with `DEFAULT_FALLBACK_CONTEXT` for consistency.

## Repro (from #15779)

```yaml
custom_providers:
  - name: my-custom-endpoint
    base_url: https://example.invalid/v1
    model: gpt-5.5
    models:
      gpt-5.5:
        context_length: 1050000
```

`/model gpt-5.5 --provider custom:my-custom-endpoint` → previously "Context: 128,000", now "Context: 1,050,000".
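
As a sketch of the lookup, run against the repro config above. `lookup_context_length` is a hypothetical stand-in; the real `get_custom_provider_context_length()` signature is not shown in this message. The tier list is copied from the section above.

```python
CONTEXT_PROBE_TIERS = [256_000, 128_000, 64_000, 32_000, 16_000, 8_000]
DEFAULT_FALLBACK_CONTEXT = CONTEXT_PROBE_TIERS[0]  # follows tier[0]


def lookup_context_length(custom_providers, base_url, model):
    """Return the per-model override, or None when nothing matches."""
    target = base_url.rstrip("/")  # trailing-slash-insensitive match
    for provider in custom_providers or []:
        if str(provider.get("base_url", "")).rstrip("/") != target:
            continue
        entry = (provider.get("models") or {}).get(model) or {}
        value = entry.get("context_length")
        # Accept only positive integers; bool is excluded explicitly
        # because it is a subclass of int.
        if isinstance(value, int) and not isinstance(value, bool) and value > 0:
            return value
    return None


providers = [{
    "name": "my-custom-endpoint",
    "base_url": "https://example.invalid/v1",
    "model": "gpt-5.5",
    "models": {"gpt-5.5": {"context_length": 1050000}},
}]
override = lookup_context_length(providers, "https://example.invalid/v1/", "gpt-5.5")
context = override if override is not None else DEFAULT_FALLBACK_CONTEXT
```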

## Tests

- `tests/hermes_cli/test_custom_provider_context_length.py` — new file, 19 tests covering the helper, step-0b integration, and the 256K tier invariants
- `tests/hermes_cli/test_model_switch_context_display.py` — added regression tests for #15779 through the display resolver
- `tests/gateway/test_session_info.py` — updated default-fallback assertion (128K → 256K)
- `tests/agent/test_model_metadata.py` — updated tier assertions for the new top tier
2026-04-25 18:47:53 -07:00
Teknium
01535a4732 fix(api_server): cap stop-run wait at 5s so interrupt can't hang handler
task.cancel() can't preempt the run_in_executor thread running
run_conversation(), so we rely on agent.interrupt() to wake the loop.
Without a timeout, a slow/unresponsive interrupt blocks the HTTP
response indefinitely. Wrap the await in wait_for(shield(task), 5.0)
and log a warning on timeout.
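
A minimal sketch of the bounded wait. `wait_for_stop` and the log text are illustrative, and handling of the task's own exception or cancellation is omitted.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def wait_for_stop(task, timeout=5.0):
    """Wait up to `timeout` seconds for `task`; return True if it finished."""
    try:
        # shield() keeps the timeout from cancelling the task itself;
        # only the wait is abandoned.
        await asyncio.wait_for(asyncio.shield(task), timeout)
        return True
    except asyncio.TimeoutError:
        logger.warning("run did not stop within %.1fs", timeout)
        return False
```

Without `shield()`, `wait_for()` would cancel the task when the timeout expires, and that cancellation still could not preempt the executor thread.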

Also tidy one extra space in the module docstring's /stop entry.
2026-04-25 18:40:35 -07:00
ekko
0a15dbdc43 feat(api_server): add POST /v1/runs/{run_id}/stop endpoint
Add ability to interrupt a running agent via the runs API. Previously
/v1/runs could start a run and subscribe to events, but there was no
way to cancel it. The new endpoint stores agent and task references
during execution, calls agent.interrupt() to stop LLM calls, then
cancels the asyncio task.

Includes 15 tests covering start, events, and stop scenarios.
2026-04-25 18:40:35 -07:00
nerijusas
81e01f6ee9 fix(agent): preserve Codex message items for replay
2026-04-25 18:22:06 -07:00
Iris Jin
25ba6a4a74 fix(gateway): make reasoning session-scoped by default
2026-04-25 18:01:31 -07:00
kshitijk4poor
7c17accb29 fix: /stop now immediately aborts streaming retry loop
When a user sends /stop during a streaming API call, the outer poll loop
detects _interrupt_requested and closes the HTTP connection. However, the
inner _call() thread catches the connection error and enters its retry
loop — opening a FRESH connection without checking the interrupt flag.

On slow providers like ollama-cloud, each retry attempt blocks for the
full stream-read timeout (120s+). With 3 retry attempts this caused
510+ second delays between /stop and actual response — the agent appeared
completely unresponsive despite the stop being acknowledged.

Fix: add an _interrupt_requested check at the top of the streaming retry
loop so the agent exits immediately instead of retrying.
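
The shape of the fix, sketched with hypothetical names. `open_stream` and `interrupt_requested` are caller-supplied callables standing in for the real streaming call and the `_interrupt_requested` flag.

```python
class StreamInterrupted(Exception):
    """Raised when a stop request arrives before the next attempt."""


def stream_with_retries(open_stream, interrupt_requested, max_attempts=3):
    last_error = None
    for _attempt in range(max_attempts):
        # Check the flag before opening a fresh connection, so a /stop
        # that closed the previous one is not followed by a retry.
        if interrupt_requested():
            raise StreamInterrupted("stop requested; not retrying")
        try:
            return open_stream()
        except ConnectionError as exc:
            last_error = exc
    raise last_error or RuntimeError("no attempts were made")
```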

Also fix log truncation: all session key logging in gateway/run.py used
[:20] or [:30] slices, which truncated 'agent:main:telegram:dm:5690190437'
(33 chars) to 'agent:main:telegram:' — losing the identifying chat type
and user ID. Replace with full keys to make logs debuggable.

Reported by user Sidharth Pulipaka via Telegram on ollama-cloud provider.
2026-04-25 09:51:39 -07:00
Teknium
ea01bdcebe refactor(memory): remove flush_memories entirely (#15696)
The AIAgent.flush_memories pre-compression save, the gateway
_flush_memories_for_session, and everything feeding them are
obsolete now that the background memory/skill review handles
persistent memory extraction.

Problems with flush_memories:

- Pre-dates the background review loop.  It was the only memory-save
  path when introduced; the background review now fires every 10 user
  turns on CLI and gateway alike, which is far more frequent than
  compression or session reset ever triggered flush.
- Blocking and synchronous.  Pre-compression flush ran on the live agent
  before compression, blocking the user-visible response.
- Cache-breaking.  Flush built a temporary conversation prefix
  (system prompt + memory-only tool list) that diverged from the live
  conversation's cached prefix, invalidating prompt caching.  The
  gateway variant spawned a fresh AIAgent with its own clean prompt
  for each finalized session — still cache-breaking, just in a
  different process.
- Redundant.  Background review runs in the live conversation's
  session context, gets the same content, writes to the same memory
  store, and doesn't break the cache.  Everything flush_memories
  claimed to preserve is already covered.

What this removes:

- AIAgent.flush_memories() method (~248 LOC in run_agent.py)
- Pre-compression flush call in _compress_context
- flush_memories call sites in cli.py (/new + exit)
- GatewayRunner._flush_memories_for_session + _async_flush_memories
  (and the 3 call sites: session expiry watcher, /new, /resume)
- 'flush_memories' entry from DEFAULT_CONFIG auxiliary tasks,
  hermes tools UI task list, auxiliary_client docstrings
- _memory_flush_min_turns config + init
- #15631's headroom-deduction math in
  _check_compression_model_feasibility (headroom was only needed
  because flush dragged the full main-agent system prompt along;
  the compression summariser sends a single user-role prompt so
  new_threshold = aux_context is safe again)
- The dedicated test files and assertions that exercised
  flush-specific paths

What this renames (with read-time backcompat on sessions.json):

- SessionEntry.memory_flushed -> SessionEntry.expiry_finalized.
  The session-expiry watcher still uses the flag to avoid re-running
  finalize/eviction on the same expired session; the new name
  reflects what it now actually gates.  from_dict() reads
  'expiry_finalized' first, falls back to the legacy 'memory_flushed'
  key so existing sessions.json files upgrade seamlessly.
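
The read-time fallback might look like this sketch. `SessionEntry` is reduced here to two fields, and `session_key` is a hypothetical field name included only so the example is self-contained.

```python
from dataclasses import dataclass


@dataclass
class SessionEntry:
    session_key: str
    expiry_finalized: bool = False

    @classmethod
    def from_dict(cls, data):
        # Prefer the new key; fall back to the legacy one so older
        # sessions.json files load without a migration step.
        if "expiry_finalized" in data:
            flag = data["expiry_finalized"]
        else:
            flag = data.get("memory_flushed", False)
        return cls(
            session_key=data.get("session_key", ""),
            expiry_finalized=bool(flag),
        )


legacy = SessionEntry.from_dict({"session_key": "s1", "memory_flushed": True})
current = SessionEntry.from_dict({"session_key": "s2", "expiry_finalized": False})
```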

Supersedes #15631 and #15638.

Tested: 383 targeted tests pass across run_agent/, agent/, cli/,
and gateway/ session-boundary suites.  No behavior regressions —
background memory review continues to handle persistent memory
extraction on both CLI and gateway.
2026-04-25 08:21:14 -07:00