The previous bare except swallowed every exception from app.reply()
silently. Log at debug so real failures (auth, chat gone) leave a
trace while keeping the group-chat 400 fallback working. Also fix
the Teams entry's indentation in the messaging flowchart.
Group chats return 400 for threaded sends. Catch the error and
fall back to a flat send so messages always get delivered.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Wire reply_to into send() using App.reply(conv_id, msg_id, content)
which constructs the threaded conversation ID internally.
Threads supported in channels and group chats.
Update comparison table: Threads ✅
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Follow-up polish to the kanban dashboard from #19864 and #19705.
**Home-channel toggle contrast.** The `.hermes-kanban-home-sub--on`
class previously used `color-mix(var(--color-ring) 14%, transparent)`
which was nearly invisible on both the default teal and NERV themes —
the on/off distinction relied almost entirely on the ✓ prefix glyph.
Bump to 32% fill + full-opacity ring border + inner ring shadow +
font-weight 600. Still theme-scoped (no hardcoded colors), but reads
at a glance on both tested themes.
**Drop the → running status action.** Since #19705, `PATCH /tasks/:id`
rejects `status=running` with HTTP 400 — only the dispatcher's
`claim_task` path legitimately enters that state (so the run row,
claim lock, and worker PID are created atomically). The UI button was
still present and produced a 400 on click, which is a confusing dead
affordance. Remove it from `StatusActions`; add a comment pointing to
#19535 so future editors know why it's missing.
Live-tested on the default Hermes Teal theme. 53/53 kanban dashboard
plugin tests still pass.
* revert: auto-subscribe gateway chat on tool-driven kanban_create (#19718)
Reverts ff3d2773e2. Teknium reviewed the merged PR and decided this
behavior isn't wanted — tool-driven kanban_create should not mirror
the slash-command path's auto-subscribe. Orchestrators that want
their originating chat notified can call kanban_notify-subscribe
explicitly; we're not going to make it implicit.
* feat(kanban-dashboard): per-platform home-channel notification toggles
Adds a "Notify home channels" section to the task drawer in the kanban
dashboard plugin. Each platform where the user has set a home channel
(/sethome, TELEGRAM_HOME_CHANNEL env var, gateway.platforms.<p>.home_channel
in config.yaml) gets a toggle pill. Toggling on writes a kanban_notify_subs
row keyed to that platform's home (chat_id + thread_id); toggling off
removes it. The existing gateway notifier watcher delivers completed /
blocked / gave_up events without any new plumbing — this is purely a GUI
surface over existing machinery.
Replaces the reverted auto-subscribe behavior from #19718 with an explicit,
per-task, per-platform, user-controlled opt-in. No implicit subscription
on tool-driven kanban_create; no CLI commands; no slash commands. Just a
toggle in the drawer.
Backend (plugins/kanban/dashboard/plugin_api.py):
- GET /api/plugins/kanban/home-channels[?task_id=X]
Returns every platform with a configured home, plus a per-entry
subscribed: bool relative to task_id (false when task_id omitted).
Reads the live GatewayConfig via load_gateway_config() so env-var
overlays stay honored.
- POST /api/plugins/kanban/tasks/:id/home-subscribe/:platform
Idempotent add_notify_sub keyed to the platform's home.
- DELETE /api/plugins/kanban/tasks/:id/home-subscribe/:platform
remove_notify_sub for the same tuple.
- 404 when the platform has no home configured, or task_id doesn't
exist (POST only).
Frontend (plugins/kanban/dashboard/dist/index.js):
- TaskDrawer fetches /home-channels on open, keyed on task_id.
- HomeSubsSection renders nothing when zero platforms have a home (so
users who haven't set one up don't see an empty UI block).
- Optimistic toggle with busy flag + revert-on-failure. One pill per
platform; ✓ prefix and --on class indicate the subscribed state.
CSS (plugins/kanban/dashboard/dist/style.css):
- .hermes-kanban-home-subs flex row + .hermes-kanban-home-sub pill
style + --on subscribed variant (subtle ring-colored background).
Live-tested against a dashboard with TELEGRAM + DISCORD_BOT_TOKEN /
HOME_CHANNEL env vars set: drawer shows both pills, toggling each
flips its visual state AND writes/removes the correct kanban_notify_subs
row (verified via direct DB read).
Tests (tests/plugins/test_kanban_dashboard_plugin.py, 11 new, 53/53
pass total):
- home-channels lists only platforms with a home (slack with a
token but no home is excluded)
- no task_id -> all subscribed=false
- subscribe creates notify_sub row with correct chat/thread/platform
- subscribed=true reflected in subsequent GET
- idempotent re-subscribe
- unknown platform -> 404
- unknown task -> 404
- unsubscribe removes the row
- telegram + discord subscribe/unsubscribe independent
- zero homes -> empty list
The Teams adapter's interactive_setup() tried to import prompt,
prompt_yes_no, print_info, print_success, and print_warning from
hermes_cli.config, but those helpers live in hermes_cli.cli_output.
Only get_env_value/save_env_value live in hermes_cli.config.
This caused 'hermes setup' to crash with ImportError as soon as the
user picked Teams in the messaging-platforms wizard.
Split the import accordingly.
* feat(achievements): share card render on unlocked badges
Adds a Share button to each unlocked achievement card that opens a
modal and renders a 1200x630 PNG share card client-side via Canvas2D
(no backend, no network, no new deps). Two actions: Download PNG and
Copy image to clipboard.
Card layout mirrors the in-dashboard visual language: tier-colored
glow, icon from the existing LUCIDE sprite set, achievement name,
tier badge pill, description, progress stat line, and a Hermes Agent
watermark. Sized for X/Twitter, Discord, LinkedIn, Bluesky link
previews.
Vendored on top of the upstream @PCinkusz bundle; the 'in-progress
scan banner' precedent already established this divergence pattern.
Manifest bumped 0.3.1 -> 0.4.0.
* feat(achievements): share-on-X as primary action on share dialog
Adds a 'Share on X' button as the primary action in the share dialog.
Opens https://x.com/intent/post with a pre-filled tweet referencing
the achievement name, tier, @NousResearch, and the Hermes docs URL.
Copy image and Download PNG become secondary actions: users who want
the badge attached can Copy image, paste into the X composer, post.
Primary button styled as X's signature black-on-white fill so the
action is unambiguous.
The PATCH /tasks/:id endpoint allows setting status='running' via
_set_status_direct(), bypassing the dispatcher/claim path that creates
run rows, claim locks, expiry, and worker process metadata. This can
leave tasks stuck in 'running' with no active worker.
Fix: reject status='running' with HTTP 400, requiring all transitions
to 'running' to go through the canonical claim_task() path.
Closes#19535
Adds first-class board support to kanban so users can separate unrelated
streams of work (projects, repos, domains) into isolated queues. Single-
project users stay on the 'default' board and see no UI change.
Isolation model
---------------
- Each board is a directory at `~/.hermes/kanban/boards/<slug>/` with
its own `kanban.db`, `workspaces/`, and `logs/`. The 'default' board
keeps its legacy path (`~/.hermes/kanban.db`) for back-compat — fresh
installs and pre-boards users get zero migration.
- Workers spawned by the dispatcher have `HERMES_KANBAN_BOARD` pinned in
their env alongside the existing `HERMES_KANBAN_DB` /
`HERMES_KANBAN_WORKSPACES_ROOT` pins, so workers physically cannot see
other boards' tasks.
- The gateway's single dispatcher loop now sweeps every board per tick;
per-tick cost is a few extra filesystem stats.
- CAS concurrency guarantees are preserved per-board (each board is its
own SQLite DB, same WAL+IMMEDIATE machinery as before).
CLI
---
hermes kanban boards list|create|switch|show|rename|rm
hermes kanban --board <slug> <any-subcommand>
Board resolution order: `--board` flag → `HERMES_KANBAN_BOARD` env →
`~/.hermes/kanban/current` file → `default`. Slug validation is strict:
lowercase alphanumerics + hyphens + underscores, 1-64 chars, starts with
alphanumeric. Uppercase is auto-downcased; slashes / dots / `..` /
control chars are rejected so boards can't name their way out of the
boards/ directory.
Passive discoverability: when more than one board exists, `hermes kanban
list` prints a one-line header ("Board: foo (2 other boards …)") so
users who stumble across multi-project never have to hunt for the
feature. Invisible for single-board installs.
Dashboard
---------
- New `BoardSwitcher` component at the top of the Kanban tab: dropdown
with all boards + task counts, `+ New board` button, `Archive`
button (non-default only). Hidden entirely when only `default` exists
and is empty — single-project users never see it.
- New `NewBoardDialog` modal: slug / display name / description / icon
+ "switch to this board after creating" checkbox.
- Selected board persists to `localStorage` so browser users don't
shift the CLI's active board out from under a terminal they left open.
- New `?board=<slug>` query param on every existing endpoint plus a
new `/boards` CRUD surface (`GET /boards`, `POST /boards`,
`PATCH /boards/<slug>`, `DELETE /boards/<slug>`,
`POST /boards/<slug>/switch`).
- Events WebSocket is pinned to a board at connection time; switching
opens a fresh WS against the new board.
Also fixes a pre-existing bug in the plugin's tenant / assignee
filters: the SDK's `Select` uses `onValueChange(value)`, not
native `onChange(event)`, so those filters silently didn't work.
New `selectChangeHandler` helper wires both signatures.
Tests
-----
49 new tests in `tests/hermes_cli/test_kanban_boards.py` covering:
slug validation (valid / invalid / auto-downcase), path resolution
(default = legacy path, named = `boards/<slug>/`, env var override),
current-board resolution chain (env > file > default), board CRUD +
archive / hard-delete, per-board connection isolation (tasks don't
leak), worker spawn env injection (`HERMES_KANBAN_BOARD`,
`HERMES_KANBAN_DB`, `HERMES_KANBAN_WORKSPACES_ROOT` all point at the
right board), and end-to-end CLI surface.
Regression surface: all 264 pre-existing kanban tests continue to pass.
Live-tested via the dashboard: created 3 boards (default,
hermes-agent, atm10-server), created tasks on each via both CLI
(`--board <slug> create`) and dashboard (inline create on the Ready
column), confirmed zero cross-board leakage, confirmed `BoardSwitcher`
+ `NewBoardDialog` work end-to-end in the browser.
Closes#18718. Exposes the existing `workspace_kind` + `workspace_path`
fields (already accepted by POST /api/plugins/kanban/tasks) in the
dashboard's per-column inline-create form so users can create tasks
targeting a git worktree or an explicit directory without dropping
back to the CLI.
- Add a workspace-kind Select (scratch / worktree / dir) to
InlineCreate in plugins/kanban/dashboard/dist/index.js.
- Conditionally render a workspace_path Input next to the select when
kind != scratch; placeholder tells the user whether the path is
required (dir) or optional (worktree — derived from assignee when
blank).
- Submit wires `workspace_kind` / `workspace_path` into the POST body
only when they're non-default, keeping the request shape small and
interoperable with older dispatcher versions.
E2E verified in a dashboard pointed at the worktree: selecting dir +
typing /tmp/test-18718 produces a POST body with
{workspace_kind: 'dir', workspace_path: '/tmp/test-18718'} and the
task lands in sqlite with those fields set. 42/42 kanban dashboard
plugin tests pass.
Closes#18576. Addresses three of four complaints from the readability
report; live-verified in a dashboard against a seeded task with body,
comments, and run history.
- Drawer default width 480px → 640px, exposed as the CSS var
`--hermes-kanban-drawer-width` so deployments / user themes can
override without forking the plugin.
- Bump body/meta/pre/log/run-history font sizes from the 0.65-0.75rem
cluster to the 0.78-0.85rem cluster. Long paths and code snippets in
task bodies, run metadata, and worker logs are legible again instead
of requiring a squint.
- Fix the black-text-on-dark-theme regression in fenced markdown code
blocks. Root cause: themes that don't define `--color-foreground`
(NERV, at least) leave `color: var(--color-foreground)` resolving
empty on <code>, which then falls back to the UA default (near-black)
instead of inheriting from the drawer's <body>. Fix: force
`color: inherit` on both inline and fenced code, and give the fenced
block background via `currentColor` instead of `--color-foreground`
so there's a visible card even when the theme var is absent.
Out of scope for this PR (comments added to #18576):
- Draggable resize handle (structural JS work; plugin ships built-only,
no src/ in-tree).
- Live worker-log viewer for running tasks (backend WS + component).
- Sibling fix: themes like NERV should define --color-foreground. The
current changes make the drawer robust against that gap, but the
root fix belongs in the theme layer.
Four callsites hardcoded Path.home() / '.hermes' with no HERMES_HOME
check, breaking Docker deployments and profile isolation (hermes -p):
- plugins/hermes-achievements/dashboard/plugin_api.py:
state_path(), snapshot_path(), checkpoint_path() bare-literal paths
- scripts/profile-tui.py:
DEFAULT_STATE_DB and DEFAULT_LOG defaults ignored HERMES_HOME
- hermes_cli/slack_cli.py:
except-Exception fallback for slack-manifest.json dump
- optional-skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py:
--target argparse default
Use get_hermes_home() (with an ImportError shim for the standalone
scripts) or 'os.environ.get("HERMES_HOME") or str(Path.home()/".hermes")'
where importing hermes_constants is impractical.
E2E-verified: with HERMES_HOME=/tmp/x all three achievements paths and
both profile-tui defaults route under /tmp/x.
Salvaged from #18068 (original scope was broader mechanical cleanup
claiming 23 callsites were buggy; most were already respecting
HERMES_HOME via os.environ.get(key, default) — only these 4 had no env
check at all). Credit: @web-dev0521.
_get_peer() and _get_or_create_honcho_session() accessed _peers_cache
and _sessions_cache without holding _cache_lock, while other paths
in the same class use the lock consistently. Under concurrent tool
calls or prefetch threads, this can produce stale reads or lost
cache updates.
Wrap both unguarded cache read sites in _cache_lock. Network calls
(honcho.peer() and honcho.session()) remain outside the lock to
avoid holding it during I/O.
Three int() calls in HonchoClient.from_global_config() parsed
dialecticMaxChars, messageMaxChars, and dialecticMaxInputChars
directly without guards. A malformed value in honcho.json would
raise ValueError and abort provider initialization entirely.
Add _parse_int_config() helper following the existing
_parse_context_tokens() pattern, and replace all three raw
int() calls with it.
microsoft-teams-apps 2.0.0 added the `client` option to AppOptions,
accepting a ClientOptions instance. Use it to set the User-Agent
header to "Hermes" on all outgoing HTTP requests.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a deterministic pre-check on top of htsh's exception-based fallback:
before calling /content/abstract or /content/overview on a non-pseudo URI,
probe /api/v1/fs/stat. If the server says the URI is a file, route straight
to /content/read instead of eating a failing 500 round-trip.
This is the same idea pty819 and chennest independently landed in PRs
#12757 and #12937 — merged here on top of htsh's broader fix so we keep
pseudo-URI normalization and v0.3.3 browse-shape handling while avoiding
the slow exception path on servers that return a raised 500 every time.
The exception fallback from #5886 stays in place for environments where
fs/stat is unavailable or returns an unfamiliar shape.
Also credits pty819, chennest, and htsh in AUTHOR_MAP so future release
notes attribute them correctly.
OpenViking returns 500 for /content/abstract and /content/overview when URI points to mem_*.md files.
Add resilient fallback to /content/read for non-pseudo summary file URIs while preserving pseudo summary normalization.
Also add regression tests for fallback behavior.
OpenViking v0.3.3 expects directory URIs for abstract/overview reads.
Passing pseudo-files like /.overview.md and /.abstract.md to
/api/v1/content/overview|abstract triggers HTTP 500.
This change normalizes those pseudo-URIs to their parent directory for
abstract/overview requests, preserves full reads, and hardens parsing for
wrapped/unwrapped result payloads and fs list response shapes.
Replace the Azure portal credential prompts with the teams CLI
workflow: install @microsoft/teams.cli, run teams app create,
paste the output credentials. Matches the setup docs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pass cmd/desc in button action data so the card response can
reconstruct the original body. Clicking a button now replaces
only the actions with a status line, keeping the command and
reason text visible.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The gateway calls send_image_file() for locally cached images
(e.g. from image_gen tools). Without this override the base class
falls back to sending the file path as plain text. Delegate to
send_image() which already handles base64 encoding local paths.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Teams doesn't render markdown image syntax. Send images using the SDK's
Attachment API instead — base64 data URI for local files, direct URL
for remote images.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Hello! I am the maintainer of the microsoft-teams-apps Python SDK and
I built this Teams adapter to integrate Microsoft Teams into Hermes.
Adds a `plugins/platforms/teams` platform plugin using the new
PlatformRegistry system from #17751. The adapter self-registers via
`register(ctx)` — no hardcoding in run.py, toolsets.py, or any
other core file.
Key features:
- Supports personal DMs, group chats, and channel posts
- Adaptive Card approval prompts with in-place button replacement
(Allow Once / Allow Session / Always Allow / Deny)
- aiohttp webhook server bridged from the Teams SDK to avoid
the fastapi/uvicorn dependency
- ConversationReference caching for correct proactive sends in
non-DM chats
- `interactive_setup()` for `hermes gateway setup` integration
- `platform_hint` for LLM context (Teams markdown subset)
- 34 tests covering adapter init, send, message handling, and
plugin registration
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(plugins): bundle hermes-achievements, scan full session history
Ships @PCinkusz's hermes-achievements dashboard plugin (https://github.com/PCinkusz/hermes-achievements) as a bundled plugin at plugins/hermes-achievements/ and fixes a bug in the scan path that made the plugin only see the first 200 sessions — making lifetime badges (50k tool calls, 75k errors, etc.) unreachable on long-running installs.
Changes:
- plugins/hermes-achievements/: vendor v0.3.1 verbatim (manifest, dist/, plugin_api.py, tests, docs, README).
- plugins/hermes-achievements/dashboard/plugin_api.py:
* scan_sessions(): limit=None now scans ALL sessions via SQLite LIMIT -1. Previously capped at 200, so users with 8000+ sessions saw ~2% of their history.
* evaluate_all(): first-ever scans run in a background thread so the dashboard request path never blocks. Stale snapshots serve immediately while a background refresh runs. force=True still blocks synchronously for manual /rescan.
* _build_pending_snapshot(), _start_background_scan(), _run_scan_and_update_cache(): supporting plumbing + idempotent thread spawn.
- tests/plugins/test_achievements_plugin.py: new tests covering the 200-cap regression, the background-scan first-run flow, stale-serve-plus-background-refresh, forced sync rescan, and scan-thread idempotency.
- website/docs/user-guide/features/built-in-plugins.md: lists hermes-achievements in the bundled-plugins table and documents API endpoints, state files, and performance characteristics.
E2E validated against a real 8564-session ~6.4GB state.db:
* Cold scan: 13m 19s (one-time, backgrounded — UI never blocks)
* Warm rescan: 1.47s (8563/8564 sessions reused from checkpoint cache)
* 57/60 achievements unlocked, 3 discovered — aggregates like total_tool_calls=259958, total_errors=164213, skill_events=368243 correctly surface lifetime badges that the 200-cap made unreachable.
Original credit: @PCinkusz (MIT-licensed). Upstream repo remains the staging ground for new badges; this bundle keeps the dashboard feature parity with Hermes core changes.
* feat(achievements): publish partial snapshots during cold scan
Previously a cold scan on a large session DB (13min on 8564 sessions)
showed zero badges for the entire duration, then every badge at once
when the scan completed. A dashboard refresh mid-scan was indistinguishable
from a fresh install with no history.
Now the scanner publishes a partial snapshot to _SNAPSHOT_CACHE every
250 sessions, so each refresh during a cold scan surfaces more badges
incrementally.
Mechanism:
- scan_sessions() takes an optional progress_callback fired every
progress_every sessions with (sessions_so_far, scanned, total).
- _compute_from_scan() is extracted from compute_all() and gains an
is_partial flag that skips writing to state.json — we don't want
to record unlocked_at based on a half-complete aggregate that a
later session might rebalance.
- _run_scan_and_update_cache() installs a publisher callback that
builds a partial snapshot, marks it mode='in_progress', and writes
it to the cache with age=0 so the UI keeps polling /scan-status
and picks up the final snapshot when the scan completes.
- Manual /rescan (force=True) disables partial publishing — the
caller is blocking on the final result anyway.
E2E against real 8564-session state.db (polled cache every 10s):
t=10s: cache empty
t=20s: 250/8564 scanned, 35 unlocked, 25 discovered
t=40s: 500/8564 scanned, 42 unlocked, 18 discovered
t=60s: 1000/8564 scanned, 49 unlocked, 11 discovered
...
Tests: 9/9 pass (2 new — partial snapshot publication + no-persist-on-partial).
Upstream unittest suite: 10/10 pass.
* feat(achievements): in-progress scan banner with live % progress
Previously the dashboard showed zero badges silently during long cold
scans (13min on 8564 sessions). The backend was publishing partial
snapshots every 250 sessions, but the bundled UI didn't surface any
indicator that a scan was running — it just rendered the main page
with whatever counts were currently published and no way for the user
to know more progress was coming.
UI changes (dist/index.js, dist/style.css):
- Added a scan-in-progress banner rendered between the hero and stats
when scan_meta.mode is 'pending' or 'in_progress'. Shows:
BUILDING ACHIEVEMENT PROFILE…
Scanned 1,750 of 8,564 sessions · 20%. Badges unlock as more history streams in.
with a pulsing teal indicator and a filling teal/cyan progress bar.
Disappears the moment the backend flips to 'full' or 'incremental'.
- Added an auto-poller via useEffect — while scanInFlight is true the
page re-fetches /achievements every 4s WITHOUT toggling the loading
skeleton, so unlock counts tick up visibly without the user refreshing.
The effect cleans itself up when the scan finishes.
- Added refresh() (re-fetch, no loading flip) alongside the existing
load() (full reload, used by the Rescan button).
Attribution preserved:
- Added a header comment to index.js crediting @PCinkusz
(https://github.com/PCinkusz/hermes-achievements, MIT) as the
original author, noting the banner is a layered addition on top
of the original dist bundle.
- Matching header comment in style.css, flagging the new
.ha-scan-banner* rules as the local addition.
Live-verified end to end:
- Spun up `hermes dashboard --port 9229 --no-open` against a fresh
HERMES_HOME symlinked to the real 8564-session state.db.
- Opened /achievements in a browser, confirmed the banner renders with
live progress: 'Scanned 1,000 of 8,564 sessions · 11%' → updates to
'1,250 ... · 14%' → '1,750 ... · 20%' without user interaction,
matching the backend's partial publications.
- Stats row simultaneously climbed from 35 → 49 → 53 unlocked as
more history streamed in.
- Vision analysis of the rendered page confirms the banner styling
matches the rest of the dashboard (dark card bg, teal accent, same
small-caps typography, pulsing indicator reusing ha-pulse keyframes).
Platform plugins shipped in-repo under plugins/platforms/ should be
available out of the box — users shouldn't have to add 'irc-platform'
to plugins.enabled before they can pick IRC from the gateway setup menu.
Adds a new ``kind: platform`` plugin type that mirrors the existing
``kind: backend`` auto-load semantics:
- Bundled (shipped in the hermes-agent repo): auto-load unconditionally.
- User-installed (~/.hermes/plugins/): still opt-in via plugins.enabled
so untrusted code doesn't silently run.
Changes:
* hermes_cli/plugins.py: add 'platform' to _VALID_PLUGIN_KINDS, document
the new kind in the PluginManifest docstring, extend the bundled auto-
load rule from 'backend only' to 'backend or platform'.
* plugins/platforms/irc/plugin.yaml: declare kind: platform.
* hermes_cli/gateway.py: remove the now-redundant
_load_bundled_platform_plugins_for_enumeration() helper and the
_enable_plugin_for_platform() helper. The setup menu's _all_platforms()
just calls discover_plugins() and reads the registry — bundled
platforms are already loaded at that point. Drops the 'needs_enable'
flag and the 'plugin disabled — select to enable' status string.
* hermes_cli/setup.py: relax the "gateway is configured" detector used
during OpenClaw migration. Switching to _platform_status() in an
earlier commit tightened the check to require an exact "configured"
match, dropping platforms whose status is "enabled, not paired",
"partially configured", "configured + E2EE", etc. Now any non-"not
configured" status counts — the user has already started setup there
and we shouldn't force the section to rerun.
* tests/hermes_cli/test_setup_irc.py: drop the TestIRCPluginDisabledFlow
class and test_configure_platform_enables_disabled_plugin_first — the
no-longer-existent flow they were testing.
* tests/hermes_cli/test_setup_openclaw_migration.py: patch both
setup.get_env_value and gateway.get_env_value in the 4 gateway-section
tests that reach _platform_status() through the unified setup flow;
switch WHATSAPP_ENABLED to the literal "true" in the registry-parity
test so WhatsApp's value-shape validator matches.
Verified via fresh-install smoke (empty plugins.enabled, no env vars):
IRC plugin loads, Platform('irc') resolves, _all_platforms() lists IRC
with status 'not configured'. 160 targeted tests pass.
feat(gateway): refine Platform._missing_ and platform-connected dispatch
Restricts plugin-name acceptance to bundled plugin scan + registry
(no arbitrary string -> enum-pollution), pulls per-platform connectivity
checks into a _PLATFORM_CONNECTED_CHECKERS lambda map with a clean
_is_platform_connected method, and adds tests covering the checker map,
plugin platform interface, and IRC setup wizard.
Merge the two gateway setup paths (hermes setup gateway + hermes gateway
setup) to use a single _unified_platforms() list that merges built-in
_PLATFORMS with dynamically registered plugin entries from
platform_registry.
- Add setup_fn field to PlatformEntry for plugin setup flows
- _unified_platforms() merges built-ins with registry entries by key
- setup_gateway() now uses unified list instead of hardcoded
_GATEWAY_PLATFORMS tuple list
- gateway_setup() uses same unified list, plugin entries appear
alongside built-ins with no [plugin] suffix
- _platform_status() handles plugin platforms via registry check_fn
- Plugin platforms with setup_fn get called directly; plugins without
get a generic env-var display fallback
IRC and other plugin platforms now appear automatically in the setup
menu when registered via platform_registry.register().
feat(gateway): surface disabled platform plugins in setup and auto-enable on select
Platform plugins under plugins/platforms/* (IRC, etc.) were gated behind
plugins.enabled, so `hermes gateway setup` wouldn't list them until the
user ran `hermes plugins enable <name>` first. Now the setup menu always
surfaces them as "plugin disabled — select to enable", and picking one
adds it to plugins.enabled before running its setup flow.
Along the way, unify the two gateway setup flows so `hermes setup gateway`
and `hermes gateway setup` both read from the same platform list (built-in
_PLATFORMS + platform_registry entries), dispatch through a single
_configure_platform() helper, and share _platform_status(). Deletes the
dead bespoke wrappers in setup.py (_setup_whatsapp, _setup_weixin,
_setup_email, etc.) that duplicated logic now covered by the registry
path or _setup_standard_platform.
Also:
- PlatformEntry gains a plugin_name field so the registry knows which
plugin owns each entry (required for auto-enable).
- PluginContext.register_platform auto-stamps plugin_name from the
manifest so plugins don't have to pass it explicitly.
- PluginManager now scans plugins/platforms/* as its own category root,
one level below the bundled plugin scan.
- Fix IRC plugin discovery: rename PLUGIN.yaml → plugin.yaml (the
scanner is case-sensitive) and add the missing __init__.py that
_load_directory_module requires.
Closes remaining functional gaps and adds documentation.
webhook.py: Cross-platform delivery now checks the plugin registry
for unknown platform names instead of hardcoding 15 names in a tuple.
Plugin platforms can receive webhook-routed deliveries.
prompt_builder: Platform hints (system prompt LLM guidance) now fall
back to the plugin registry's platform_hint field. Plugin platforms
can tell the LLM 'you're on IRC, no markdown.'
PlatformEntry: Added platform_hint field for LLM guidance injection.
IRC adapter: Added acquire_scoped_lock/release_scoped_lock in
connect/disconnect to prevent two profiles from using the same IRC
identity. Added platform_hint for IRC-specific LLM guidance.
Removed dead token-empty-warning extension for plugin platforms
(plugin adapters handle their own env vars via check_fn).
website/docs/developer-guide/adding-platform-adapters.md:
- Added 'Plugin Path (Recommended)' section with full code examples,
PLUGIN.yaml template, config.yaml examples, and a table showing all
18 integration points the plugin system handles automatically
- Renamed built-in checklist to clarify it's for core contributors
gateway/platforms/ADDING_A_PLATFORM.md:
- Added Plugin Path section pointing to the reference implementation
and full docs guide
- Clarified built-in path is for core contributors only
Extends the platform plugin interface from Phase 1 to cover every
touchpoint where built-in platforms have hardcoded behavior.
- allowed_users_env / allow_all_env: per-platform auth env vars
- max_message_length: smart-chunking for send_message tool
- pii_safe: session PII redaction flag
- emoji: CLI/gateway display
- allow_update_command: /update access control
send_message tool (tools/send_message_tool.py):
- Replaced hardcoded platform_map dict with Platform() call
- Added _send_via_adapter() for plugin platforms — routes through
live gateway adapter when available
- Registry-aware max message length for smart chunking
Cron delivery (cron/scheduler.py):
- Replaced hardcoded 15-entry platform_map with Platform() call
- Plugin platforms now work as cron delivery targets
User authorization (gateway/run.py _is_user_authorized):
- Registry fallback: checks PlatformEntry.allowed_users_env and
allow_all_env when platform not in hardcoded maps
- Plugin platforms get per-platform auth support
_UPDATE_ALLOWED_PLATFORMS: checks registry allow_update_command flag
Channel directory: includes plugin platforms in session enumeration
Orphaned config warning: descriptive message when plugin platform is
in config but no plugin registered it
Gateway weakref: _gateway_runner_ref for cross-module adapter access
hermes status: shows plugin platforms with (plugin) tag
hermes gateway setup: plugin platforms appear in menu with setup hints
hermes_cli/platforms.py: get_all_platforms() merges with registry,
platform_label() falls back to registry for plugin names
- 8 new tests (extended fields, cron resolution, platforms merge)
- Updated 3 tests for new Platform() based resolution
- 2829 passed, 24 pre-existing failures, zero new failures
Adds a platform adapter plugin interface so anyone can create new gateway
platforms (IRC, Viber, Line, etc.) as drop-in plugins without modifying
core gateway code.
- PlatformEntry dataclass: name, label, adapter_factory, check_fn,
validate_config, required_env, install_hint, source
- PlatformRegistry singleton with register/unregister/create_adapter
- _create_adapter() in gateway/run.py checks registry first, falls
through to existing if/elif chain for built-in platforms
- Platform._missing_() accepts unknown string values, creating cached
pseudo-members so Platform('irc') is Platform('irc') holds true
- GatewayConfig.from_dict() now parses plugin platform names from
config.yaml without rejecting them
- get_connected_platforms() delegates to registry for unknown platforms
- PluginContext.register_platform() for plugin authors
- Mirrors the existing register_tool() / register_hook() pattern
- Full async IRC adapter using stdlib asyncio (zero external deps)
- Connects via TLS, handles PING/PONG, nick collision, NickServ auth
- Channel messages require addressing (nick: msg), DMs always dispatch
- Markdown stripping for IRC-clean output, message splitting for
512-byte line limit
- Config via config.yaml extra dict or IRC_* env vars
- Platform enum dynamic members (identity stability, case normalization)
- PlatformRegistry (register, unregister, create, validation, factory)
- GatewayConfig integration (from_dict parsing, get_connected_platforms)
- IRC adapter (init, send, protocol parsing, markdown, requirements)
No existing platform adapters were migrated — the if/elif chain is
untouched. This is Phase 1: prove the interface with a real plugin.
Follow-up to the cherry-picked PR #17447. The original flush spawned a
bare threading.Thread for the buffer-flush path, overwriting
self._sync_thread — which is aliased to the long-lived writer thread.
Two consequences:
1. No serialization with the writer queue. If old-session retains were
still queued in _retain_queue, the flush ran concurrently with the
writer and both threads could call aretain_batch against the same
document_id.
2. The pre-spawn 'self._sync_thread.join(timeout=5.0)' tried to join the
long-lived writer, which never exits, so the join was a no-op that
just timed out — never actually serialized anything.
Fix: enqueue the flush closure on _retain_queue via _ensure_writer +
put(). Natural FIFO ordering behind any pending retains, no new thread,
no broken join. Shutdown-aware so it doesn't enqueue after teardown.
Tests updated to drain via _retain_queue.join() instead of the stale
_sync_thread.join(). Added regression guard
test_flush_serializes_behind_pending_retains_via_writer_queue that
blocks the writer mid-retain to prove the flush waits in FIFO behind
the old retain.
Also seeds _retain_queue / _shutting_down / stubbed _ensure_writer on
the bare-object test helper in test_memory_session_switch.py so that
path doesn't blow up under the new queue-enqueue.
tests/plugins/memory/test_hindsight_provider.py + tests/agent/test_memory_session_switch.py: 103/103 passing.
Two data-loss / leak gaps in HindsightMemoryProvider.on_session_switch
introduced by #17409.
1. Buffered turns silently lost when retain_every_n_turns > 1.
on_session_switch unconditionally cleared _session_turns without
flushing. Users who batched every N>1 turns and switched mid-batch
(/reset, /new, /resume, /branch, or context compression) had those
buffered turns disappear. Same data-loss class as the shutdown race,
different lifecycle event.
Note commit_memory_session() -> on_session_end() runs *before*
on_session_switch on /reset, but Hindsight doesn't implement
on_session_end so the buffer survives that step and dies at clear
time. /resume, /branch, and compression skip commit_memory_session
entirely so an on_session_end impl wouldn't help them anyway.
Fix: snapshot the old _session_id, _document_id, _parent_session_id,
_turn_index, and _session_turns; spawn one final retain that lands
under the OLD document_id; then rotate state. Metadata is built
synchronously against the old self._* so session_id / lineage tags
on the flushed item all reference the prior session consistently.
2. Stale _prefetch_result leaks across switch.
If queue_prefetch ran in the old session and the result hadn't been
consumed by prefetch() yet, on_session_switch left the cached recall
text in place. The next session's first prefetch() call would return
text mined from the prior session's bank/query.
Fix: join any in-flight _prefetch_thread (3s bounded — matches
shutdown()), then clear _prefetch_result under _prefetch_lock before
rotating session_id.
Tests
-----
- tests/plugins/memory/test_hindsight_provider.py (TestSessionSwitchBufferFlush):
- buffered turns flushed under OLD document_id with OLD lineage tags
- empty buffer => no spurious retain
- _prefetch_result cleared on switch
- in-flight prefetch thread is awaited before clear (no race)
- tests/agent/test_memory_session_switch.py: factory extended to seed the
attrs the new flush path reads (_retain_source, _platform, _bank_id,
prefetch state, etc.) and stub _run_hindsight_operation so existing
switch-state assertions keep passing without network setup.
The plugin used to spawn one daemon thread per sync_turn() to do the
aretain_batch network write. On CLI exit, that pattern raced interpreter
shutdown — the last retain could reach aiohttp after asyncio's
"cannot schedule new futures" guard had fired, producing noisy logs and
silently losing the final unsaved turn:
WARNING ... Hindsight sync failed: cannot schedule new futures after
interpreter shutdown
ERROR asyncio: Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x...>
Switch to a single-writer model: each provider owns one long-lived
writer thread plus a queue. sync_turn() snapshots state and enqueues a
job; the writer drains sequentially. Once shutdown() is called:
- new sync_turn() / queue_prefetch() calls are dropped, not enqueued
- a sentinel wakes the writer so it finishes in-flight work
- shutdown joins the writer (10s) before nulling the client
Also register an idempotent atexit hook from the first sync_turn(), so
exit paths that don't go through MemoryManager.shutdown_all() (Ctrl-C,
abrupt exit) still get a chance to drain.
Tests: keep _sync_thread as a legacy alias to the writer, swap join()
calls to _retain_queue.join() (canonical wait-for-drain), add a new
TestShutdownRace suite covering single-writer reuse, post-shutdown drop,
queue draining, and shutdown idempotency.
Fixes#6672
Memory providers now receive on_session_switch() whenever AIAgent.session_id
rotates mid-process — /resume, /branch, /reset, /new, and context
compression. Before this, providers that cached per-session state in
initialize() (Hindsight's _session_id, _document_id, accumulated
_session_turns, _turn_counter) kept writing into the old session's
record after the agent had moved on.
MemoryProvider ABC
------------------
- New optional hook on_session_switch(new_session_id, *,
parent_session_id='', reset=False, **kwargs) with no-op default for
backward compat. reset=True signals /reset or /new — providers should
flush accumulated per-session buffers. reset=False for /resume,
/branch, compression where the logical conversation continues.
MemoryManager
-------------
- on_session_switch() fans the hook out to every registered provider.
Isolated try/except per provider — one bad provider can't block others.
- Empty/None new_session_id is a no-op to avoid corrupting provider state
during shutdown paths.
run_agent.py
------------
- _sync_external_memory_for_turn now passes session_id=self.session_id
into sync_all() and queue_prefetch_all(). Providers with defensive
session_id updates in sync_turn (Hindsight already had this at
plugins/memory/hindsight/__init__.py:1199) now actually receive the
current id.
- Compression block at ~L8884 already notified the context engine of
the rollover; now also calls
_memory_manager.on_session_switch(reason='compression').
cli.py
------
- new_session() fires reset=True, reason='new_session' so providers
flush buffers.
- _handle_resume_command fires reset=False, reason='resume' with the
previous session as parent_session_id.
- _handle_branch_command fires reset=False, reason='branch' with the
parent session_id already captured for the DB parent link.
gateway/run.py
--------------
- _handle_resume_command now evicts the cached AIAgent, mirroring
/branch and /reset. The next message rebuilds a fresh agent whose
memory provider initialize() runs with the correct session_id —
matches the pattern the gateway already uses for provider state
cross-session transitions.
Hindsight reference implementation
----------------------------------
- plugins/memory/hindsight/__init__.py adds on_session_switch that:
updates _session_id, mints a fresh _document_id (prevents
vectorize-io/hindsight#1303 overwrite), and clears _session_turns /
_turn_counter / _turn_index so in-flight batches don't flush under
the new document id. parent_session_id only overwritten when provided
(avoids clobbering on a bare switch).
Tests
-----
- tests/agent/test_memory_session_switch.py: new dedicated file. ABC
default no-op, manager fan-out, failure isolation, empty-id no-op,
session_id propagation through sync_all/queue_prefetch_all, Hindsight
state transitions for every reset/non-reset case, parent preservation.
- tests/cli/test_branch_command.py: new test verifying /branch fires
the hook with correct parent_session_id + reset=False + reason.
- tests/gateway/test_resume_command.py: new test verifying /resume
evicts the cached agent.
- tests/run_agent/test_memory_sync_interrupted.py: updated existing
assertions to account for the session_id kwarg on sync_all and
queue_prefetch_all.
E2E verified (real imports, tmp HERMES_HOME):
- /resume: session_id updates, doc_id fresh, buffers cleared, parent set
- /branch: session_id forks, parent links to original
- /new: reset=True clears accumulated state
- compression: reason='compression' propagated, lineage preserved
- Empty id: no-op, state preserved
- Legacy provider without on_session_switch: no crash
Reported by @nicoloboschi (Hindsight maintainer); related scope-widening
comment by @kidonng extending coverage to compression.
Completes the cfg_get migration started in PR #17304. Covers the
remaining hermes_cli/ and plugins/ config-access sites that the first
PR intentionally left opportunistic.
Migrated (33 sites across 14 files):
hermes_cli/setup.py 13 sites (terminal.*, agent.*, display.*, compression.*, tts.*)
hermes_cli/tools_config.py 7 sites (tts.*, browser.*, web.*, platform_toolsets.*)
hermes_cli/plugins_cmd.py 3 sites (plugins.*, memory.*, context.*)
plugins/memory/honcho/cli.py 3 sites (hosts.*)
hermes_cli/web_server.py 1 site (dashboard.*)
hermes_cli/skills_config.py 1 site (platform_disabled)
hermes_cli/plugins.py 1 site (plugins.disabled)
hermes_cli/status.py 1 site (terminal.backend)
hermes_cli/mcp_config.py 1 site (mcp_servers.*)
hermes_cli/webhook.py 1 site (platforms.webhook)
plugins/memory/__init__.py 1 site (memory.provider)
plugins/memory/hindsight/ 1 site (banks.hermes)
plugins/memory/holographic/ 1 site (plugins.hermes-memory-store)
run_agent.py 1 site (auxiliary.compression)
The helper supports non-literal keys too, so e.g.
cfg.get('hosts', {}).get(HOST, {})
becomes
cfg_get(cfg, 'hosts', HOST, default={})
Migration bugs caught and fixed during this PR:
1. An AST-based batch rewrite naïvely captured the first word token in
a chain, which corrupted 'self._config.get(...).get(...)' into
'self.cfg_get(_config, ...)' (dropping 'self.', creating a broken
method call). Plugins/memory/hindsight caught it via its test suite.
Fixed manually to 'cfg_get(self._config, ...)'.
2. Import-extension heuristic rewrote multi-line parenthesized imports
('from X import (\n A,\n B,\n)') as
'from X import cfg_get, (' — syntactically broken. Fixed by inserting
cfg_get as the first name inside the parentheses.
Combined with PR #17304, the cfg_get migration now covers:
PR #17304 (first batch): 20 sites in tools/ + gateway/
PR #17317 (this one): 33 sites in hermes_cli/ + plugins/ + run_agent.py
Total: 53 sites migrated. Remaining ~8 sites are either:
- Function-call chains (e.g. '_load_stt_config().get(...).get(...)')
that would need double-evaluation or a local binding to migrate
cleanly — intentionally deferred.
- JSON response-navigation (e.g. 'response_data.get('data',{}).get('web'))
which is unrelated to config access and shouldn't use cfg_get.
Verified:
- 412/412 tests/plugins/ pass (including the hindsight test that caught
the self.X regex bug before commit)
- 3181/3189 tests/hermes_cli/ pass (8 pre-existing failures on main,
verified by git-stash comparison)
- Live 'hermes status' and 'hermes config' render correctly (exercise
the migrated terminal.backend, tts.provider, browser.cloud_provider,
compression.threshold, display.tool_progress sites)
- Live 'hermes chat': 1 turn + /quit, zero errors in 11-line log window
No semantic changes — cfg_get was already proven to be a 1:1 match for
the original .get("X",{}).get("Y",default) pattern in PR #17304.
Opt-in Langfuse tracing for Hermes conversations — LLM calls, tool
usage, usage/cost breakdown per span. Hooks into pre/post_api_request,
pre/post_llm_call, pre/post_tool_call. SDK is optional; missing SDK or
credentials renders the plugin inert.
Salvaged from PR #16845 by @kshitijk4poor, who wrote the plugin
(~875 LOC, 6 hooks, Langfuse usage-details/cost-details normalization,
read_file payload summarization).
Salvage scope (why this isn't PR #16845 as-authored):
- Lives at plugins/observability/langfuse/ (standalone kind, opt-in via
plugins.enabled) instead of a new parallel optional-plugins/
directory. Standalone bundled plugins are already opt-in — only their
plugin.yaml is scanned at startup; the Python module is not imported
unless the user enables it. The premise of optional-plugins/ (avoid
import cost for users who don't want it) is already solved by the
existing plugin system.
- Dropped the triple activation gate (plugins.enabled +
plugins.langfuse.enabled + HERMES_LANGFUSE_ENABLED). The Hermes plugin
system's own enable/disable is authoritative; runtime credentials
gate whether the hook actually traces.
- Rewrote _is_enabled() → cached _get_langfuse() with an _INIT_FAILED
sentinel. The original called hermes_cli.config.load_config() from
every hook invocation (full yaml parse + deep merge + env expansion
on every pre/post_tool_call, potentially 100+ times per turn). The
cached version reads env once and returns the cached client or None
on every subsequent call with zero further work.
- hermes tools → Langfuse Observability post-setup adds
observability/langfuse to plugins.enabled directly (via
_save_enabled_set) instead of going through an install-copy flow.
Enable:
hermes tools # interactive
hermes plugins enable observability/langfuse # manual
Required env (set by `hermes tools` or in ~/.hermes/.env):
HERMES_LANGFUSE_PUBLIC_KEY
HERMES_LANGFUSE_SECRET_KEY
HERMES_LANGFUSE_BASE_URL # optional
Co-authored-by: kshitijk4poor <kshitijk4poor@gmail.com>
Closed PR #5137 addressed the retrieval path (peer cards via get_card()
instead of the session-scoped lookup that returned empty for per-session
messaging flows) — that architectural fix is already in main as
_fetch_peer_card / _fetch_peer_context.
What never got fixed is the user-visible side: honcho_profile returning
a flat 'No profile facts available yet.' leaves the model to guess at
why. The model then often surfaces it to the user as a cryptic error.
Adds a diagnostic hint next to the existing 'result' message, enumerating
the likely causes in rough order of frequency:
1. Observation disabled for this peer (user_observe_me/others off)
2. Peer card hasn't accumulated yet (fresh peer / dialectic cadence
hasn't fired enough turns — cards build over time)
3. Generic fallback: self-hosted Honcho < 3.x lacks peer cards
The hint also suggests alternative tools (honcho_reasoning / honcho_search)
so the model can route around the empty card rather than giving up.
Schema description updated so the model knows the hint field exists and
that an empty card is NOT an error state.
7 tests cover the hint paths: warmup, observation-disabled for user + ai,
generic fallback, populated card still returns plain result (no hint),
alternative-tool suggestion present.
The scheme-validation commit (e77a3f2c) was too strict: a user with
legacy ''baseUrl: localhost:8000'' (no ''http://'' prefix) in their
''~/.honcho/config.json'' would get ''No API key configured'' from the
CLI after that change, even though their setup worked before.
urlparse on a schemeless host:port treats the host segment as the
scheme and leaves netloc empty, so the http/https check rejected it.
Falls back to a lenient check for schemeless strings that look like
hosts: contain '.' or ':', aren't a boolean/null literal, aren't pure
digits. The SDK still rejects truly malformed URLs at connect time
with a clearer error than ours.
Three new tests: legacy schemeless hosts accepted; obvious garbage
literals (''true'', ''null'', ''12345'') still rejected. Reviewer
noted concern #1: schemeless regression for self-hosters with old
configs.
main's 6a957a74 added an optional 'metadata' kwarg to
MemoryProvider.on_memory_write so providers can distinguish tool-driven
memory writes from background-review writes. MemoryManager already
does a getfullargspec-based introspection, so the old 3-arg signature
didn't break at runtime — but it missed the origin hint entirely.
Updates HonchoMemoryProvider.on_memory_write to accept the kwarg. The
metadata isn't yet threaded into Honcho's create_conclusion payload —
that's worth its own PR once the consolidation lands and the new
metadata shape stabilises.
Two small follow-ups to the PR review:
- Hoist hashlib import from _enforce_session_id_limit() to module top.
stdlib imports are free after first cache, but keeping all imports at
module top matches the rest of the codebase.
- _resolve_api_key now URL-parses baseUrl and requires http/https +
non-empty netloc before returning the 'local' sentinel. A typo like
baseUrl: 'true' (or bare 'localhost') no longer silently passes the
credential guard; the CLI correctly reports 'not configured'.
Three new tests cover the new validation (garbage strings, non-http
schemes, valid https).
new_session() was popping the old cached session, releasing the lock,
calling get_or_create, then re-acquiring the lock to insert. A concurrent
caller could observe the empty-cache window and race-create its own
session, producing two divergent session objects for the same key.
_cache_lock is an RLock, so nested reacquisition inside get_or_create is
safe. Hold it across the whole pop/create/insert sequence.
Follow-up to #13510 (@hekaru-agent).
When no explicit timeout is configured (HonchoClientConfig.timeout,
honcho.timeout / requestTimeout, or HONCHO_TIMEOUT), get_honcho_client
previously constructed the SDK with no timeout kwarg, letting the
underlying httpx client hang indefinitely if the Honcho backend
became unreachable mid-request.
This is a silent-failure hazard on the post-response path of
run_conversation: the memory_manager.sync_all() / queue_prefetch_all()
calls fire after the agent has already generated its final reply, so
a stalled Honcho request blocks run_conversation from returning.
The gateway never logs "response ready" and never delivers the
response to the platform (Telegram, etc.), even though the text is
already saved to the session file.
Repro: unplug the network or block app.honcho.dev mid-turn after
the model has produced its final message. Without this change,
_run_agent never returns. With it, the call aborts after 30s,
run_conversation returns, and the gateway delivers the response
(Honcho sync failure is logged and swallowed as before).
The default applies only when nothing is configured, so any
deployment that has explicitly set timeout / HONCHO_TIMEOUT /
honcho.timeout / honcho.requestTimeout keeps its existing value.
Self-hosted deployments that genuinely need a longer ceiling can
still override via any of those knobs.
_resolve_api_key() only checks for apiKey / HONCHO_API_KEY, so all
CLI subcommands (identity --show, status, migrate, etc.) bail with
"No API key configured" on self-hosted instances that use baseUrl
without an API key.
Return "local" when baseUrl or HONCHO_BASE_URL is set, matching the
client.py behavior that already handles this case for the SDK.
Tested on: macOS, self-hosted Honcho (Docker, localhost:8000).
Wraps _session_cache mutations in threading.RLock. Without this, concurrent
gateway sessions (e.g., Telegram + Discord hitting Honcho at the same time)
can race on the cache and silently lose conclusions or memory writes.
Adopted from #13510 by @hekaru-agent; the off-topic cron/jobs.py cleanup
hunk from that PR is dropped here for scope isolation. Resolved a small
conflict with the pinPeerName guard (kept both).