hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-28 06:51:16 +08:00

Author	SHA1	Message	Date
dontcallmejames	f1ba4014e1	fix: harden memory-context leak boundaries	2026-04-27 12:37:33 -07:00
hekaru-agent	dad0217450	fix(honcho): thread-safe session cache via RLock Wraps _session_cache mutations in threading.RLock. Without this, concurrent gateway sessions (e.g., Telegram + Discord hitting Honcho at the same time) can race on the cache and silently lose conclusions or memory writes. Adopted from #13510 by @hekaru-agent; the off-topic cron/jobs.py cleanup hunk from that PR is dropped here for scope isolation. Resolved a small conflict with the pinPeerName guard (kept both).	2026-04-27 12:37:33 -07:00
Sanjays2402	cd1c4812ab	fix(honcho): truncate resolve_session_name output to Honcho's 100-char limit (#13868 ) Gateway session keys (Matrix "!room:server" + thread event IDs, Telegram supergroup reply chains, Slack thread IDs with long workspace prefixes) can exceed Honcho's 100-character session ID limit after sanitization. Every Honcho API call for those sessions then 400s with "session_id too long". Add a helper that enforces the 100-char limit after sanitization: short keys (the common case) short-circuit unchanged; over-limit keys keep a prefix and append a deterministic `-<8 hex>` SHA-256 suffix over the original key so two long keys sharing a leading segment can't collide onto the same truncated ID. Adds 7 regression tests in tests/honcho_plugin/test_client.py covering short / exact-limit / long / deterministic / collision-resistant / allowlist-preserving / hash-suffix-present cases.	2026-04-27 12:37:33 -07:00
Brian D. Evans	326c9daa69	fix(honcho): require strict True for pin_peer_name to survive MagicMock configs (#15162 ) CI caught that ``test_session_manager_prefers_runtime_user_id_over_config_peer_name`` in ``tests/agent/test_memory_user_id.py`` failed after this branch: that test passes a ``MagicMock`` for ``config``, where ``mock.pin_peer_name`` silently returns another ``MagicMock`` — truthy by default. My ``getattr(..., "pin_peer_name", False)`` fallback was supposed to guard against callers that haven't added the new attr, but MagicMock does have the attr — it just returns a live mock for it. Tightened the gate to ``getattr(..., False) is True``. Real configs built via ``HonchoClientConfig.from_global_config`` always yield a proper boolean, so strict equality matches the pinned case and rejects both the unset-attr fallback and MagicMock stand-ins. Added a comment explaining why ``is True`` is intentional, not paranoid. Also tightened the ``peer_name`` existence check to ``getattr(..., None)`` so a MagicMock with ``peer_name`` left at its default (also truthy) doesn't spuriously enable pinning either. Verified against both the new ``test_pin_peer_name.py`` suite (13/13 pass) and the previously-failing ``TestHonchoUserIdScoping`` (3/3 pass). Zero behaviour change for real ``HonchoClientConfig`` values. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 12:37:33 -07:00
Brian D. Evans	d03c6fcc45	fix(honcho): pinPeerName opt-in keeps memory unified across platforms (#14984 ) When a gateway drives Hermes (Telegram, Discord, Slack, ...), it passes the platform-native user ID as ``runtime_user_peer_name`` into the Honcho session manager. That ID wins over ``peer_name`` in ``honcho.json``, so a single user who connects over three platforms ends up as three separate Honcho peers — one per platform — with fragmented memory and no cross- platform context continuity. For multi-user bots this is correct (and must not change): each user gets their own peer scope. For the vast majority of personal Hermes deployments the configured ``peer_name`` is an unambiguous identity, though, so the reporter asked for an opt-in knob that pins the user peer to that value. Fix: new ``pinPeerName`` boolean on the host config, default ``false``. When ``true`` AND ``peerName`` is set, the configured peer_name beats the gateway's runtime identity; every other resolution case is unchanged. honcho.json: { "peerName": "Igor", "hosts": { "hermes": { "pinPeerName": true } } } session.py (resolution order, pinned case): runtime_user_peer_name → skipped (opt-in flag active) config.peer_name → WINS "Igor" session-key fallback → unreached Parsing follows the same host-block-overrides-root pattern as every other flag in HonchoClientConfig.from_global_config (``_resolve_bool`` helper). Tests (tests/honcho_plugin/test_pin_peer_name.py — 13 cases, 5 groups): - Config parsing: default, root true, host-block true, host overrides root, explicit false. - Peer resolution: runtime wins by default (regression guard for multi- user bots), config wins when pinned, pin-without-peer_name is a no-op (prevents silent peer-id collapse to session-key fallback), CLI path where runtime is absent, deepest fallback intact, assistant peer untouched by the flag. - Cross-platform unification: Telegram UID + Discord snowflake collapse to one peer when pinned; negative control confirms two distinct runtime IDs still produce two peers when unpinned. 244 honcho_plugin tests pass, 3 pre-existing skips, zero regressions. Defensive detail: session.py uses ``getattr(self._config, "pin_peer_name", False)`` so callers building partial config objects (several test fixtures across the codebase do this) don't break if they haven't updated yet. Runtime cost: one attr lookup per new session. Closes #14984 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 12:37:33 -07:00
Teknium	df3c9593f8	feat(plugins): google_meet \u2014 join, transcribe, speak, follow up (#16364 ) * feat(plugins): google_meet — bundled plugin for join+transcribe Meet calls v1 shipping transcribe-only. Spawns headless Chromium via Playwright, joins an explicit https://meet.google.com/ URL, enables live captions, and scrapes them into a transcript file the agent can read across turns. The agent then has the meeting content in context and can do followup work (send recap, file issues, schedule followups) with its regular tools. Surface: - Tools: meet_join, meet_status, meet_transcript, meet_leave, meet_say (meet_say is a v1 stub — returns not-implemented; v2 will wire realtime duplex audio via OpenAI Realtime / Gemini Live + BlackHole / PulseAudio null-sink.) - CLI: hermes meet setup \| auth \| join \| status \| transcript \| stop - Lifecycle: on_session_end auto-leaves any still-running bot. Safety: - URL regex rejects anything that isn't https://meet.google.com/... - No calendar scanning, no auto-dial, no auto-consent announcement. - Single active meeting per install; a second meet_join leaves the first. - Platform-gated to Linux + macOS (Windows audio routing for v2 untested). - Opt-in: standalone plugin, user must add 'google_meet' to plugins.enabled in config.yaml. Zero core changes. Plugin uses existing register_tool / register_cli_command / register_hook surfaces. 21 new unit tests cover the URL safety gate, transcript dedup + status round-trip, process-manager refusals/start/stop paths, tool-handler JSON shape under each branch, session-end cleanup, and platform-gated register(). * feat(plugins/google_meet): v2 realtime audio + v3 remote node host v2 \u2014 agent speaks in-meeting audio_bridge.py: PulseAudio null-sink (Linux) + BlackHole probe (macOS). On Linux we load pactl module-null-sink + module-virtual-source, track module ids for teardown; Chrome gets PULSE_SOURCE=<virt src> env so its fake mic reads what we write to the sink. macOS just probes BlackHole 2ch and returns its device name \u2014 the plugin refuses to switch the user's default audio input (that would surprise them). realtime/openai_client.py: sync WebSocket client for the OpenAI Realtime API. RealtimeSession.speak(text) sends conversation.item.create + response.create, accumulates response.audio.delta PCM bytes, appends them to a file. RealtimeSpeaker runs a JSONL-queue loop consuming meet_say calls. 'websockets' is an optional dep imported lazily. meet_bot.py: when HERMES_MEET_MODE=realtime, provisions AudioBridge, starts RealtimeSession + speaker thread, spawns paplay to pump PCM into the null-sink, then cleans everything up on SIGTERM. If any realtime setup step fails, falls back cleanly to transcribe mode with an error flagged in status.json. process_manager.enqueue_say(): writes a JSONL line to say_queue.jsonl; refuses when no active meeting or active meeting is transcribe-only. tools.meet_say: real implementation; requires active mode='realtime'. meet_join: adds mode='transcribe'\|'realtime' param. v3 \u2014 remote node host node/protocol.py: JSON envelope (type, id, token, payload) + validate. node/registry.py: $HERMES_HOME/workspace/meetings/nodes.json, with resolve() auto-selecting the sole registered node when name is None. node/server.py: NodeServer \u2014 websockets.serve, bearer-token auth, dispatches start_bot/stop/status/transcript/say/ping onto the local process_manager. Token auto-generated + persisted on first run. node/client.py: NodeClient \u2014 short-lived sync WS per RPC, raises RuntimeError on error envelopes, clean API matching the server. node/cli.py: 'hermes meet node {run,list,approve,remove,status,ping}' subtree; wired into the main meet CLI by cli.py so 'hermes meet node' Just Works. tools.py: every meet_* tool accepts node='<name>'\|'auto'; when set, routes through NodeClient to the remote bot instead of running locally. Unknown node \u2192 clear 'no registered meet node matches ...' error. cli.py: 'hermes meet join --node my-mac --mode realtime' and 'hermes meet say "..." --node my-mac' route to the node; 'hermes meet node approve <name> <url> <token>' registers one. Tests 21 v1 tests updated (meet_say is no longer a stub; active-record now carries mode). 20 new audio_bridge + realtime tests. 42 new node tests (protocol/registry/server/client/cli). 17 new v1/v2/v3 integration tests at the plugin level covering enqueue_say edge cases, env var passthrough, mode validation, node routing (known/unknown/auto/ambiguous), and argparse wiring for `hermes meet say` + `hermes meet node` + --mode/--node flags. Total: 100 plugin tests + 58 plugin-system tests = 158 passing. E2E verified on Linux with fresh HERMES_HOME: plugin loads, 5 tools register, on_session_end hook wires, 'hermes meet' CLI tree wires including the node subtree, NodeRegistry round-trips, meet_join routes correctly to NodeClient under node='my-mac' with mode='realtime', enqueue_say accepts realtime/rejects transcribe, argparse parses every new flag cleanly. Zero changes to core. All new code lives under plugins/google_meet/. * feat(plugins/google_meet): auto-install, admission detect, mac PCM pump, barge-in, richer status Ready-for-live-test follow-up on PR #16364. Five additions that matter for the first live run on a real Meet, in priority order: 1. hermes meet install [--realtime] [--yes] pip install playwright websockets + python -m playwright install chromium --realtime: installs platform audio deps (pulseaudio-utils on Linux via sudo apt, blackhole-2ch + ffmpeg on macOS via brew). Prompts before sudo/brew unless --yes. Refuses on Windows. Refuses to auto-flip the macOS default input — user still selects BlackHole in System Settings (deliberate; surprise audio rerouting is worse than a manual step). 2. Admission detection _detect_admission(page): Leave-button visible OR caption region attached OR participants list present → we're in-call. _detect_denied(page): 'You can\'t join this video call' / 'You were removed' / 'No one responded to your request' → bail out. HERMES_MEET_LOBBY_TIMEOUT (default 300s) caps how long we sit in the lobby before giving up. in_call stays False until admitted. Status surfaces leaveReason: duration_expired \| lobby_timeout \| denied \| page_closed. 3. macOS PCM pump ffmpeg reads speaker.pcm (24kHz s16le mono) and writes to the BlackHole AVFoundation output via -f audiotoolbox -audio_device_index <N>. _mac_audio_device_index() probes ffmpeg -f avfoundation -list_devices true to resolve 'BlackHole 2ch' → numeric index. Falls back to index 0 on probe failure. Linux paplay pump unchanged. 4. Richer status dict _BotState now tracks realtime, realtimeReady, realtimeDevice, audioBytesOut, lastAudioOutAt, lastBargeInAt, joinAttemptedAt, leaveReason. RealtimeSession.audio_bytes_out / last_audio_out_at counters fold into the status file once a second so meet_status() can show the agent's voice activity in near-real-time. 5. Barge-in RealtimeSession.cancel_response() sends type='response.cancel' over the same WS (lock-guarded so it's safe to call from the caption thread while speak() is reading frames). Handles response.cancelled as a terminal frame type. _looks_like_human_speaker() gates triggers so the bot's own name, 'You', 'Unknown', and blanks don't self-cancel. Called from the caption drain loop: when a new caption arrives attributed to a real participant while rt.session exists, we fire cancel_response() and stamp lastBargeInAt. Tests: 20 new unit tests across _BotState telemetry, barge-in gating, admission/denied probe error handling, cancel_response with and without a connected WS, and `hermes meet install` CLI wiring (flag parsing + end-to-end subprocess.run verification + Linux-already-installed fast path). Total 171 passing across all google_meet test files + the plugin-system regression suite. E2E verified on Linux: plugin loads, all 5 tools register, `hermes meet install --realtime --yes` parses, fresh-bot status.json has every new telemetry key, cancel_response on a disconnected session returns False without raising, barge-in helper gates the bot's own name correctly. Still out of scope (for a future PR, not blocking live test): mic → Realtime duplex (the agent listening to meeting audio via WebRTC), node-host TLS/pairing UX, Windows audio, Meet create+Twilio. Docs updated: SKILL.md now lists the installer subcommand, lobby timeout, barge-in caveat, and the full status-dict reference table. README.md quick-start uses hermes meet install.	2026-04-27 06:22:25 -07:00
Wysie	64a497bfa9	fix(hindsight): preserve setup config on blank input	2026-04-27 03:34:58 -07:00
Wysie	0ba6471dd1	fix: recover hindsight embedded daemon after idle shutdown	2026-04-26 18:29:11 -07:00
maxims-oss	18beb69b49	fix(memory): close embedded Hindsight async client cleanly HindsightEmbedded.close() delegates to its sync client.close(). When Hermes created/used that client on the shared async loop, closing it from the main thread raises 'attached to a different loop' before aiohttp releases the session — so the ClientSession / TCPConnector leak past provider teardown. Close the embedded inner async client on the shared loop first via _run_sync(inner_client.aclose()), then let the wrapper's sync close() do its daemon/UI bookkeeping. Salvage of #14605: test placement rebased — appended TestShutdown class after TestSharedEventLoopLifecycle (which landed on main after the PR was written). Original author attribution preserved.	2026-04-26 12:54:46 -07:00
Teknium	ea01bdcebe	refactor(memory): remove flush_memories entirely (#15696 ) The AIAgent.flush_memories pre-compression save, the gateway _flush_memories_for_session, and everything feeding them are obsolete now that the background memory/skill review handles persistent memory extraction. Problems with flush_memories: - Pre-dates the background review loop. It was the only memory-save path when introduced; the background review now fires every 10 user turns on CLI and gateway alike, which is far more frequent than compression or session reset ever triggered flush. - Blocking and synchronous. Pre-compression flush ran on the live agent before compression, blocking the user-visible response. - Cache-breaking. Flush built a temporary conversation prefix (system prompt + memory-only tool list) that diverged from the live conversation's cached prefix, invalidating prompt caching. The gateway variant spawned a fresh AIAgent with its own clean prompt for each finalized session — still cache-breaking, just in a different process. - Redundant. Background review runs in the live conversation's session context, gets the same content, writes to the same memory store, and doesn't break the cache. Everything flush_memories claimed to preserve is already covered. What this removes: - AIAgent.flush_memories() method (~248 LOC in run_agent.py) - Pre-compression flush call in _compress_context - flush_memories call sites in cli.py (/new + exit) - GatewayRunner._flush_memories_for_session + _async_flush_memories (and the 3 call sites: session expiry watcher, /new, /resume) - 'flush_memories' entry from DEFAULT_CONFIG auxiliary tasks, hermes tools UI task list, auxiliary_client docstrings - _memory_flush_min_turns config + init - #15631's headroom-deduction math in _check_compression_model_feasibility (headroom was only needed because flush dragged the full main-agent system prompt along; the compression summariser sends a single user-role prompt so new_threshold = aux_context is safe again) - The dedicated test files and assertions that exercised flush-specific paths What this renames (with read-time backcompat on sessions.json): - SessionEntry.memory_flushed -> SessionEntry.expiry_finalized. The session-expiry watcher still uses the flag to avoid re-running finalize/eviction on the same expired session; the new name reflects what it now actually gates. from_dict() reads 'expiry_finalized' first, falls back to the legacy 'memory_flushed' key so existing sessions.json files upgrade seamlessly. Supersedes #15631 and #15638. Tested: 383 targeted tests pass across run_agent/, agent/, cli/, and gateway/ session-boundary suites. No behavior regressions — background memory review continues to handle persistent memory extraction on both CLI and gateway.	2026-04-25 08:21:14 -07:00
Teknium	af22421e87	feat(dashboard): page-scoped plugin slots for built-in pages (#15658 ) * fix(terminal): three-layer defense against watch_patterns notification spam Background processes that stack notify_on_complete=True with watch_patterns can flood the user with duplicate, delayed notifications — matches deliver asynchronously via the completion queue and continue arriving minutes after the process has exited. The docstring warning against this (PR #12113) has proven insufficient; agents still misuse the combination. Three layered defenses, each sufficient on its own: 1. Mutual exclusion (terminal_tool.py): When both flags are set on a background process, drop watch_patterns with a warning. notify_on_complete wins because 'let me know when it's done' is the more useful signal and fires exactly once. Extracted as _resolve_notification_flag_conflict() so the rule is testable in isolation. 2. Suppress-after-exit (process_registry.py): _check_watch_patterns() now bails the moment session.exited is True. Post-exit chunks (buffered reads draining after the process is gone) no longer produce notifications. This is the fix flagged as future work in session 20260418_020302_79881c. 3. Global circuit breaker (process_registry.py): Per-session rate limits don't catch the sibling-flood case — N concurrent processes can each stay under 8/10s and still collectively spam. New WATCH_GLOBAL_MAX_PER_WINDOW=15 cap trips a 30-second cooldown across ALL sessions, emits a single watch_overflow_tripped event, silently counts dropped events, and emits a watch_overflow_released summary when the cooldown ends. Also updates the tool schema + docstring to document the new behavior. Tests: 8 new tests covering all three fixes (suppress-after-exit x2, mutual-exclusion resolver x4, global breaker trip/cooldown/release x2). All 60 tests across test_watch_patterns.py, test_notify_on_complete.py, test_terminal_tool.py pass. Real-world trigger: self-inflicted in session 20260425_051924 — three concurrent hermes-sweeper review subprocesses each set watch_patterns= ['failed validation', 'errored'] AND notify_on_complete=True, then iterated over multiple items, producing enough matches per process to defeat the per-session cap while staying under the global cap that didn't yet exist. * fix(terminal): aggressive 1-per-15s watch_patterns rate limit + strike-3 promotion Per Teknium's direction, the watch_patterns rate limit is now much more aggressive and self-healing. ## New rule — per session - HARD cap: 1 watch-match notification per 15 seconds per process. - Any match arriving inside the cooldown window is dropped and counts as ONE strike for that window (many drops in the same window still = 1 strike). - After 3 consecutive strike windows, watch_patterns is permanently disabled for the session and the session is auto-promoted to notify_on_complete semantics — exactly one notification when the process actually exits. - A cooldown window that expires with zero drops resets the consecutive strike counter — healthy cadence is forgiven. ## Schema + docstring rewritten The tool schema description now gives the model explicit guidance: - notify_on_complete is 'the right choice for almost every long-running task' - watch_patterns is for RARE one-shot signals on LONG-LIVED processes - Do NOT use watch_patterns with loops/batch jobs — error patterns fire every iteration and will hit the strike limit fast - Mutual exclusion is stated on both parameter descriptions - 1/15s cooldown and 3-strike promotion are stated in the watch_patterns description so the model sees the contract every turn ## Removed - WATCH_MAX_PER_WINDOW (8/10s) and WATCH_OVERLOAD_KILL_SECONDS (45) — the new 1/15s limit subsumes both; keeping them would double-count. - _watch_window_hits / _watch_window_start / _watch_overload_since fields on ProcessSession. Replaced by _watch_last_emit_at / _watch_cooldown_until / _watch_strike_candidate / _watch_consecutive_strikes. ## Kept - Global circuit breaker across all sessions (15/10s → 30s cooldown) as a secondary safety net for concurrent siblings. Still valuable when 20 short-lived processes each fire once — none individually violates the per-session limit. - Suppress-after-exit guard. - Mutual exclusion resolver at the tool entry point. ## Tests - 6 new tests in TestPerSessionRateLimit covering: first match delivers, second in cooldown suppressed, multi-drop = single strike, 3 strikes disables + promotes, clean window resets counter, suppressed count carried to next emit. - Global circuit breaker tests rewritten to use fresh sessions instead of hacking removed per-window fields. - 50/50 watch_patterns + notify_on_complete tests pass. - 60/60 including test_terminal_tool.py pass. * feat(dashboard): page-scoped plugin slots for built-in pages Dashboard plugins can now inject components into specific built-in pages (Sessions, Analytics, Logs, Cron, Skills, Config, Env, Docs, Chat) without overriding the whole route. Previously, plugins could only: 1. Add new tabs (tab.path) 2. Replace whole built-in pages (tab.override) 3. Inject into global shell slots (header-, footer-, pre-main, ...) None of those let a plugin add a banner, card, or widget to an existing page. The new <page>:top / <page>:bottom slots close that gap, reusing the existing registerSlot() API. Changes - web/src/plugins/slots.ts: 18 new KNOWN_SLOT_NAMES entries (sessions:top, sessions:bottom, analytics:top, ..., chat:bottom), grouped under "Shell-wide" vs "Page-scoped" in the docblock - web/src/pages/*: each built-in page now renders <PluginSlot name="<page>:top" /> as the first child of its outer wrapper and <PluginSlot name="<page>:bottom" /> as the last child -- zero visual cost when no plugin registers - plugins/example-dashboard: registers a demo banner into sessions:top via registerSlot(), with matching slots entry in the manifest -- so freshly-setup users can see what page-scoped slots look like without writing any plugin code - website/docs: new "Page-scoped slots" table in the plugin authoring guide, with a worked example - tests/hermes_cli/test_web_server.py: round-trip test for colon-bearing slot names (sessions:top, analytics:bottom, ...) Validation - npm run build: clean (tsc -b + vite build, 2761 modules) - scripts/run_tests.sh tests/hermes_cli/test_web_server.py::TestDashboardPluginManifestExtensions: 5/5 pass	2026-04-25 06:55:35 -07:00
Teknium	8d12fb1e6b	refactor(spotify): convert to built-in bundled plugin under plugins/spotify (#15174 ) Moves the Spotify integration from tools/ into plugins/spotify/, matching the existing pattern established by plugins/image_gen/ for third-party service integrations. Why: - tools/ should be reserved for foundational capabilities (terminal, read_file, web_search, etc.). tools/providers/ was a one-off directory created solely for spotify_client.py. - plugins/ is already the home for image_gen backends, memory providers, context engines, and standalone hook-based plugins. Spotify is a third-party service integration and belongs alongside those, not in tools/. - Future service integrations (eventually: Deezer, Apple Music, etc.) now have a pattern to copy. Changes: - tools/spotify_tool.py → plugins/spotify/tools.py (handlers + schemas) - tools/providers/spotify_client.py → plugins/spotify/client.py - tools/providers/ removed (was only used for Spotify) - New plugins/spotify/__init__.py with register(ctx) calling ctx.register_tool() × 7. The handler/check_fn wiring is unchanged. - New plugins/spotify/plugin.yaml (kind: backend, bundled, auto-load). - tests/tools/test_spotify_client.py: import paths updated. tools_config fix — _DEFAULT_OFF_TOOLSETS now wins over plugin auto-enable: - _get_platform_tools() previously auto-enabled unknown plugin toolsets for new platforms. That was fine for image_gen (which has no toolset of its own) but bad for Spotify, which explicitly requires opt-in (don't ship 7 tool schemas to users who don't use it). Added a check: if a plugin toolset is in _DEFAULT_OFF_TOOLSETS, it stays off until the user picks it in 'hermes tools'. Pre-existing test bug fix: - tests/hermes_cli/test_plugins.py::test_list_returns_sorted asserted names were sorted, but list_plugins() sorts by key (path-derived, e.g. image_gen/openai). With only image_gen plugins bundled, name and key order happened to agree. Adding plugins/spotify broke that coincidence (spotify sorts between openai-codex and xai by name but after xai by key). Updated test to assert key order, which is what the code actually documents. Validation: - scripts/run_tests.sh tests/hermes_cli/test_plugins.py \ tests/hermes_cli/test_tools_config.py \ tests/hermes_cli/test_spotify_auth.py \ tests/tools/test_spotify_client.py \ tests/tools/test_registry.py → 143 passed - E2E plugin load: 'spotify' appears in loaded plugins, all 7 tools register into the spotify toolset, check_fn gating intact.	2026-04-24 07:06:11 -07:00
Nicolò Boschi	edff2fbe7e	feat(hindsight): optional bank_id_template for per-agent / per-user banks Adds an optional bank_id_template config that derives the bank name at initialize() time from runtime context. Existing users with a static bank_id keep the current behavior (template is empty by default). Supported placeholders: {profile} — active Hermes profile (agent_identity kwarg) {workspace} — Hermes workspace (agent_workspace kwarg) {platform} — cli, telegram, discord, etc. {user} — platform user id (gateway sessions) {session} — session id Unsafe characters in placeholder values are sanitized, and empty placeholders collapse cleanly (e.g. "hermes-{user}" with no user becomes "hermes"). If the template renders empty, the static bank_id is used as a fallback. Common uses: bank_id_template: hermes-{profile} # isolate per Hermes profile bank_id_template: {workspace}-{profile} # workspace + profile scoping bank_id_template: hermes-{user} # per-user banks for gateway	2026-04-24 03:38:17 -07:00
Nicolò Boschi	f9c6c5ab84	fix(hindsight): scope document_id per process to avoid resume overwrite (#6602 ) Reusing session_id as document_id caused data loss on /resume: when the session is loaded again, _session_turns starts empty and the next retain replaces the entire previously stored content. Now each process lifecycle gets its own document_id formed as {session_id}-{startup_timestamp}, so: - Same session, same process: turns accumulate into one document (existing behavior) - Resume (new process, same session): writes a new document, old one preserved - Forks: child process gets its own document; parent's doc is untouched Also adds session lineage tags so all processes for the same session (or its parent) can still be filtered together via recall: - session:<session_id> on every retain - parent:<parent_session_id> when initialized with parent_session_id Closes #6602	2026-04-24 03:38:17 -07:00
tekgnosis-net	f1ba2f0c0b	fix(hindsight): use configured timeout in _run_sync for all async operations The previous commit added HINDSIGHT_TIMEOUT as a configurable env var, but _run_sync still used the hardcoded _DEFAULT_TIMEOUT (120s). All async operations (recall, retain, reflect, aclose) now go through an instance method that uses self._timeout, so the configured value is actually applied. Also: added backward-compatible alias comment for the module-level function.	2026-04-24 03:36:02 -07:00
tekgnosis-net	403c82b6b6	feat(hindsight): add configurable HINDSIGHT_TIMEOUT env var The Hindsight Cloud API can take 30-40 seconds per request. The hardcoded 30s timeout was too aggressive and caused frequent timeout errors. This patch: 1. Adds HINDSIGHT_TIMEOUT environment variable (default: 120s) 2. Adds timeout to the config schema for setup wizard visibility 3. Uses the configurable timeout in both _run_sync() and client creation 4. Reads from config.json or env var, falling back to 120s default This makes the timeout upgrade-proof — users can set it via env var or config without patching source code. Signed-off-by: Kumar <kumar@tekgnosis.net>	2026-04-24 03:36:02 -07:00
Jason Perlow	93a74f74bf	fix(hindsight): preserve shared event loop across provider shutdowns The module-global `_loop` / `_loop_thread` pair is shared across every `HindsightMemoryProvider` instance in the process — the plugin loader creates one provider per `AIAgent`, and the gateway creates one `AIAgent` per concurrent chat session (Telegram/Discord/Slack/CLI). `HindsightMemoryProvider.shutdown()` stopped the shared loop when any one session ended. That stranded the aiohttp `ClientSession` and `TCPConnector` owned by every sibling provider on a now-dead loop — they were never reachable for close and surfaced as the `Unclosed client session` / `Unclosed connector` warnings reported in #11923. Fix: stop stopping the shared loop in `shutdown()`. Per-provider cleanup still closes that provider's own client via `self._client.aclose()`. The loop runs on a daemon thread and is reclaimed on process exit; keeping it alive between provider shutdowns means sibling providers can drain their own sessions cleanly. Regression tests in `tests/plugins/memory/test_hindsight_provider.py` (`TestSharedEventLoopLifecycle`): - `test_shutdown_does_not_stop_shared_event_loop` — two providers share the loop; shutting down one leaves the loop live for the other. This test reproduces the #11923 leak on `main` and passes with the fix. - `test_client_aclose_called_on_cloud_mode_shutdown` — each provider's own aiohttp session is still closed via `aclose()`. Fixes #11923.	2026-04-24 03:34:12 -07:00
LeonSGP43	df55660e3c	fix(hindsight): disable broken local runtime on unsupported CPUs	2026-04-24 03:33:14 -07:00
bwjoke	3e994e38f7	[verified] fix: materialize hindsight profile env during setup	2026-04-24 03:30:11 -07:00
JC的AI分身	127048e643	fix(hindsight): accept snake_case api_key config	2026-04-24 03:30:03 -07:00
harryplusplus	d6b65bbc47	fix(hindsight): preserve non-ASCII text in retained conversation turns	2026-04-24 03:29:58 -07:00
Chris Danis	a5c7422f23	fix(hindsight): always write HINDSIGHT_LLM_API_KEY to .env, even when empty When user runs ✓ Memory provider: built-in only Saved to config.yaml and leaves the API key blank, the old code skipped writing it entirely. This caused the uvx daemon launcher to fail at startup because it couldn't distinguish between "key not configured" and "explicitly blank key." Now HINDSIGHT_LLM_API_KEY is always written to .env so the value is either set or explicitly empty.	2026-04-24 03:29:53 -07:00
Teknium	f593c367be	feat(dashboard): reskin extension points for themes and plugins (#14776 ) Themes and plugins can now pull off arbitrary dashboard reskins (cockpit HUD, retro terminal, etc.) without touching core code. Themes gain four new fields: - layoutVariant: standard \| cockpit \| tiled — shell layout selector - assets: {bg, hero, logo, crest, sidebar, header, custom: {...}} — artwork URLs exposed as --theme-asset-* CSS vars - customCSS: raw CSS injected as a scoped <style> tag on theme apply (32 KiB cap, cleaned up on theme switch) - componentStyles: per-component CSS-var overrides (clipPath, borderImage, background, boxShadow, ...) for card/header/sidebar/ backdrop/tab/progress/badge/footer/page Plugin manifests gain three new fields: - tab.override: replaces a built-in route instead of adding a tab - tab.hidden: register component + slots without adding a nav entry - slots: declares shell slots the plugin populates 10 named shell slots: backdrop, header-left/right/banner, sidebar, pre-main, post-main, footer-left/right, overlay. Plugins register via window.__HERMES_PLUGINS__.registerSlot(name, slot, Component). A <PluginSlot> React helper is exported on the plugin SDK. Ships a full demo at plugins/strike-freedom-cockpit/ — theme YAML + slot-only plugin that reproduces a Gundam cockpit dashboard: MS-STATUS sidebar with live telemetry, COMPASS crest in header, notched card corners via componentStyles, scanline overlay via customCSS, gold/cyan palette, Orbitron typography. Validation: - 15 new tests in test_web_server.py covering every extended field - tests/hermes_cli/: 2615 passed (3 pre-existing unrelated failures) - tsc -b --noEmit: clean - vite build: 418 kB bundle, ~2 kB delta for slots/theme extensions Co-authored-by: Teknium <p@nousresearch.com>	2026-04-23 15:31:01 -07:00
teknium1	9599271180	fix(xai-image): drop unreachable editing code path The agent-facing image_generate tool only passes prompt + aspect_ratio to provider.generate() (see tools/image_generation_tool.py:953). The editing block (reference_images / edit_image kwargs) could never fire from the tool surface, and the xAI edits endpoint is /images/edits with a different payload shape anyway — not /images/generations as submitted. - Remove reference_images / edit_image kwargs handling from generate() - Remove matching test_with_reference_images case - Update docstring + plugin.yaml description to text-to-image only - Surface resolution in the success extras Follow-up to PR #14547. Tests: 18/18 pass.	2026-04-23 15:13:34 -07:00
Julien Talbot	a5e4a86ebe	feat(xai): add xAI image generation provider (grok-imagine-image) Add xAI as a plugin-based image generation backend using grok-imagine-image. Follows the existing ImageGenProvider ABC pattern used by OpenAI and FAL. Changes: - plugins/image_gen/xai/__init__.py: xAI provider implementation - Uses xAI /images/generations endpoint - Supports text-to-image and image editing with reference images - Multiple aspect ratios (1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3) - Multiple resolutions (1K, 2K) - Base64 output saved to cache - Config via config.yaml image_gen.xai section - plugins/image_gen/xai/plugin.yaml: plugin metadata - tests/plugins/image_gen/test_xai_provider.py: 19 unit tests - Provider class (name, display_name, is_available, list_models, setup_schema) - Config (default model, resolution, custom model) - Generate (missing key, success b64/url, API error, timeout, empty response, reference images, auth header) - Registration Requires XAI_API_KEY in ~/.hermes/.env. To use: set image_gen.provider: xai in config.yaml.	2026-04-23 15:13:34 -07:00
Teknium	eda5ae5a5e	feat(image_gen): add openai-codex plugin (gpt-image-2 via Codex OAuth) (#14317 ) New built-in image_gen backend at plugins/image_gen/openai-codex/ that exposes the same gpt-image-2 low/medium/high tier catalog as the existing 'openai' plugin, but routes generation through the ChatGPT/ Codex Responses image_generation tool path. Available whenever the user has Codex OAuth signed in; no OPENAI_API_KEY required. The two plugins are independent — users select between them via 'hermes tools' → Image Generation, and image_gen.provider in config.yaml. The existing 'openai' (API-key) plugin is unchanged. Reuses _read_codex_access_token() and _codex_cloudflare_headers() from agent.auxiliary_client so token expiry / cred-pool / Cloudflare originator handling stays in one place. Inspired by #14047 by @Hygaard, but re-implemented as a separate plugin instead of an in-place fork of the openai plugin. Closes #11195	2026-04-22 20:43:21 -07:00
Abner	b66644f0ec	feat(hindsight): richer session-scoped retain metadata - Add configurable retain_tags / retain_source / retain_user_prefix / retain_assistant_prefix knobs for native Hindsight. - Thread gateway session identity (user_name, chat_id, chat_name, chat_type, thread_id) through AIAgent and MemoryManager into MemoryProvider.initialize kwargs so providers can scope and tag retained memories. - Hindsight attaches the new identity fields as retain metadata, merges per-call tool tags with configured default tags, and uses the configurable transcript labels for auto-retained turns. Co-authored-by: Abner <abner.the.foreman@agentmail.to>	2026-04-22 05:27:10 -07:00
Teknium	ff9752410a	feat(plugins): pluggable image_gen backends + OpenAI provider (#13799 ) * feat(plugins): pluggable image_gen backends + OpenAI provider Adds a ImageGenProvider ABC so image generation backends register as bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner gains three primitives to make this work generically: - `kind:` manifest field (`standalone` \| `backend` \| `exclusive`). Bundled `kind: backend` plugins auto-load — no `plugins.enabled` incantation. User-installed backends stay opt-in. - Path-derived keys: `plugins/image_gen/openai/` gets key `image_gen/openai`, so a future `tts/openai` cannot collide. - Depth-2 recursion into category namespaces (parent dirs without a `plugin.yaml` of their own). Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5 default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64 responses save to `$HERMES_HOME/cache/images/`; URL responses pass through. FAL stays in-tree for this PR — a follow-up ports it into `plugins/image_gen/fal/` so the in-tree `image_generation_tool.py` slims down. The dispatch shim in `_handle_image_generate` only fires when `image_gen.provider` is explicitly set to a non-FAL value, so existing FAL setups are untouched. - 41 unit tests (scanner recursion, kind parsing, gate logic, registry, OpenAI payload shapes) - E2E smoke verified: bundled plugin autoloads, registers, and `_handle_image_generate` routes to OpenAI when configured * fix(image_gen/openai): don't send response_format to gpt-image-* The live API rejects it: 'Unknown parameter: response_format' (verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return b64_json unconditionally, so the parameter was both unnecessary and actively broken. * feat(image_gen/openai): gpt-image-2 only, drop legacy catalog gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21) and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 / dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward (dall-e-2 squares only). Trim the catalog down to a single model. Live-verified end-to-end: landscape 1536x1024 render of a Moog-style synth matches prompt exactly, 2.4MB PNG saved to cache. * feat(image_gen/openai): expose gpt-image-2 as three quality tiers Users pick speed/fidelity via the normal model picker instead of a hidden quality knob. All three tier IDs resolve to the single underlying gpt-image-2 API model with a different quality parameter: gpt-image-2-low ~15s fast iteration gpt-image-2-medium ~40s default gpt-image-2-high ~2min highest fidelity Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the same 1024x1024 prompt. Config: image_gen.openai.model: gpt-image-2-high # or image_gen.model: gpt-image-2-low # or env var for scripts/tests OPENAI_IMAGE_MODEL=gpt-image-2-medium Live-verified end-to-end with the low tier: 18.8s landscape render of a golden retriever in wildflowers, vision-confirmed exact match. * feat(tools_config): plugin image_gen providers inject themselves into picker 'hermes tools' → Image Generation now shows plugin-registered backends alongside Nous Subscription and FAL.ai without tools_config.py needing to know about them. OpenAI appears as a third option today; future backends appear automatically as they're added. Mechanism: - ImageGenProvider gains an optional get_setup_schema() hook (name, badge, tag, env_vars). Default derived from display_name. - tools_config._plugin_image_gen_providers() pulls the schemas from every registered non-FAL plugin provider. - _visible_providers() appends those rows when rendering the Image Generation category. - _configure_provider() handles the new image_gen_plugin_name marker: writes image_gen.provider and routes to the plugin's list_models() catalog for the model picker. - _toolset_needs_configuration_prompt('image_gen') stops demanding a FAL key when any plugin provider reports is_available(). FAL is skipped in the plugin path because it already has hardcoded TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up PR the hardcoded rows go away and it surfaces through the same path as OpenAI. Verified live: picker shows Nous Subscription / FAL.ai / OpenAI. Picking OpenAI prompts for OPENAI_API_KEY, then shows the gpt-image-2-low/medium/high model picker sourced from the plugin. 397 tests pass across plugins/, tools_config, registry, and picker. * fix(image_gen): close final gaps for plugin-backend parity with FAL Two small places that still hardcoded FAL: - hermes_cli/setup.py status line: an OpenAI-only setup showed 'Image Generation: missing FAL_KEY'. Now probes plugin providers and reports '(OpenAI)' when one is_available() — or falls back to 'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured. - image_generate tool schema description: said 'using FAL.ai, default FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are user-configured' — and notes the 'image' field can be a URL or an absolute path, which the gateway delivers either way via extract_local_files().	2026-04-21 21:30:10 -07:00
Teknium	a25c8c6a56	docs(plugins): rename disk-guardian to disk-cleanup + bundled-plugins docs The original name was cute but non-obvious; disk-cleanup says what it does. Plugin directory, script, state path, log lines, slash command, and test module all renamed. No user-visible state exists yet, so no migration path is needed. New website page "Built-in Plugins" documents the <repo>/plugins/<name>/ source, how discovery interacts with user/project plugins, the HERMES_DISABLE_BUNDLED_PLUGINS escape hatch, disk-cleanup's hook behaviour and deletion rules, and guidance on when a plugin belongs bundled vs. user-installable. Added to the Features → Core sidebar next to the main Plugins page, with a cross-reference from plugins.md.	2026-04-20 04:46:45 -07:00
Teknium	1386e277e5	feat(plugins): convert disk-guardian skill into a bundled plugin Rewires @LVT382009's disk-guardian (PR #12212) from a skill-plus-script into a plugin that runs entirely via hooks — no agent compliance needed. - post_tool_call hook auto-tracks files created by write_file / terminal / patch when they match test_/tmp_/.test. patterns under HERMES_HOME - on_session_end hook runs cmd_quick cleanup when test files were auto-tracked during the turn; stays quiet otherwise - /disk-guardian slash command keeps status / dry-run / quick / deep / track / forget for manual use - Deterministic cleanup rules, path safety, atomic writes, and audit logging preserved from the original contribution - Protect well-known top-level state dirs (logs/, memories/, sessions/, cron/, cache/, etc.) from empty-dir removal so fresh installs don't get gutted on first session end The plugin system gains a bundled-plugin discovery path (<repo>/plugins/ <name>/) alongside user/project/entry-point sources. Memory and context_engine subdirs are skipped — they keep their own discovery paths. HERMES_DISABLE_BUNDLED_PLUGINS=1 suppresses the scan; the test conftest sets it by default so existing plugin tests stay clean. Co-authored-by: LVT382009 <levantam.98.2324@gmail.com>	2026-04-20 04:46:45 -07:00
Erosika	21d5ef2f17	feat(honcho): wizard cadence default 2, surface reasoning level, backwards-compat fallback Setup wizard now always writes dialecticCadence=2 on new configs and surfaces the reasoning level as an explicit step with all five options (minimal / low / medium / high / max), always writing dialecticReasoningLevel. Code keeps a backwards-compat fallback of 1 when dialecticCadence is unset so existing honcho.json configs that predate the setting keep firing every turn on upgrade. New setups via the wizard get 2 explicitly; docs show 2 as the default. Also scrubs editorial lines from code and docs ("max is reserved for explicit tool-path selection", "Unset → every turn; wizard pre-fills 2", and similar process-exposing phrasing) and adds an inline link to app.honcho.dev where the server-side observation sync is mentioned in honcho.md. Recommended cadence range updated to 1-5 across docs and wizard copy.	2026-04-18 22:50:55 -07:00
LeonSGP43	5b6792f04d	fix(honcho): scope gateway sessions by runtime user id	2026-04-18 22:50:55 -07:00
Erosika	c630dfcdac	feat(honcho): dialectic liveness — stale-thread watchdog, stale-result discard, empty-streak backoff Hardens the dialectic lifecycle against three failure modes that could leave the prefetch pipeline stuck or injecting stale content: - Stale-thread watchdog: _thread_is_live() treats any prefetch thread older than timeout × 2.0 as dead. A hung Honcho call can no longer block subsequent fires indefinitely. - Stale-result discard: pending _prefetch_result is tagged with its fire turn. prefetch() discards the result if more than cadence × 2 turns passed before a consumer read it (e.g. a run of trivial-prompt turns between fire and read). - Empty-streak backoff: consecutive empty dialectic returns widen the effective cadence (dialectic_cadence + streak, capped at cadence × 8). A healthy fire resets the streak. Prevents the plugin from hammering the backend every turn when the peer graph is cold. - liveness_snapshot() on the provider exposes current turn, last fire, pending fire-at, empty streak, effective cadence, and thread status for in-process diagnostics. - system_prompt_block: nudge the model that honcho_reasoning accepts reasoning_level minimal/low/medium/high/max per call. - hermes honcho status: surface base reasoning level, cap, and heuristic toggle so config drift is visible at a glance. Tests: 550 passed. - TestDialecticLiveness (8 tests): stale-thread recovery, stale-result discard, fresh-result retention, backoff widening, backoff ceiling, streak reset on success, streak increment on empty, snapshot shape. - Existing TestDialecticCadenceAdvancesOnSuccess::test_in_flight_thread_is_not_stacked updated to set _prefetch_thread_started_at so it tests the fresh-thread-blocks branch (stale path covered separately). - test_cli TestCmdStatus fake updated with the new config attrs surfaced in the status block.	2026-04-18 22:50:55 -07:00
Erosika	098efde848	docs(honcho): wizard cadence default 2, prewarm/depth + observation + multi-peer - cli: setup wizard pre-fills dialecticCadence=2 (code default stays 1 so unset → every turn) - honcho.md: fix stale dialecticCadence default in tables, add Session-Start Prewarm subsection (depth runs at init), add Query-Adaptive Reasoning Level subsection, expand Observation section with directional vs unified semantics and per-peer patterns - memory-providers.md: fix stale default, rename Multi-agent/Profiles to Multi-peer setup, add concrete walkthrough for new profiles and sync, document observation toggles + presets, link to honcho.md - SKILL.md: fix stale defaults, add Depth at session start callout	2026-04-18 22:50:55 -07:00
Erosika	5f9907c116	chore(honcho): drop docs from PR scope, scrub commentary - Revert website/docs and SKILL.md changes; docs unification handled separately - Scrub commit/PR refs and process narration from code comments and test docstrings (no behavior change)	2026-04-18 22:50:55 -07:00
Erosika	78586ce036	fix(honcho): dialectic lifecycle — defaults, retry, prewarm consumption Several correctness and cost-safety fixes to the Honcho dialectic path after a multi-turn investigation surfaced a chain of silent failures: - dialecticCadence default flipped 3 → 1. PR #10619 changed this from 1 to 3 for cost, but existing installs with no explicit config silently went from per-turn dialectic to every-3-turns on upgrade. Restores pre-#10619 behavior; 3+ remains available for cost-conscious setups. Docs + wizard + status output updated to match. - Session-start prewarm now consumed. Previously fired a .chat() on init whose result landed in HonchoSessionManager._dialectic_cache and was never read — pop_dialectic_result had zero call sites. Turn 1 paid for a duplicate synchronous dialectic. Prewarm now writes directly to the plugin's _prefetch_result via _prefetch_lock so turn 1 consumes it with no extra call. - Prewarm is now dialecticDepth-aware. A single-pass prewarm can return weak output on cold peers; the multi-pass audit/reconcile cycle is exactly the case dialecticDepth was built for. Prewarm now runs the full configured depth in the background. - Silent dialectic failure no longer burns the cadence window. _last_dialectic_turn now advances only when the result is non-empty. Empty result → next eligible turn retries immediately instead of waiting the full cadence gap. - Thread pile-up guard. queue_prefetch skips when a prior dialectic thread is still in-flight, preventing stacked races on _prefetch_result. - First-turn sync timeout is recoverable. Previously on timeout the background thread's result was stored in a dead local list. Now the thread writes into _prefetch_result under lock so the next turn picks it up. - Cadence gate applies uniformly. At cadence=1 the old "cadence > 1" guard let first-turn sync + same-turn queue_prefetch both fire. Gate now always applies. - Restored query-length reasoning-level scaling, dropped in 9a0ab34c. Scales dialecticReasoningLevel up on longer queries (+1 at ≥120 chars, +2 at ≥400), clamped at reasoningLevelCap. Two new config keys: `reasoningHeuristic` (bool, default true) and `reasoningLevelCap` (string, default "high"; previously parsed but never enforced). Respects dialecticDepthLevels and proportional lighter-early passes. - Restored short-prompt skip, dropped in `ef7f3156`. One-word acknowledgements ("ok", "y", "thanks") and slash commands bypass both injection and dialectic fire. - Purged dead code in session.py: prefetch_dialectic, _dialectic_cache, set_dialectic_result, pop_dialectic_result — all unused after prewarm refactor. Tests: 542 passed across honcho_plugin/, agent/test_memory_provider.py, and run_agent/test_run_agent.py. New coverage: - TestTrivialPromptHeuristic (classifier + prefetch/queue skip) - TestDialecticCadenceAdvancesOnSuccess (empty-result retry, pile-up guard) - TestSessionStartDialecticPrewarm (prewarm consumed, sync fallback) - TestReasoningHeuristic (length bumps, cap clamp, interaction with depth) - TestDialecticLifecycleSmoke (end-to-end 8-turn session walk)	2026-04-18 22:50:55 -07:00
kshitijk4poor	fe3e68f572	fix(honcho): strip whitespace from conclusion and delete_id inputs Models may send whitespace-only strings like {"conclusion": " "} which pass bool() but create meaningless conclusions. Strip both inputs so whitespace-only values are treated as empty. Adds tests for whitespace-only conclusion and delete_id. Reviewed-by: @erosika	2026-04-16 09:50:10 -07:00
ogzerber	4377d7da0d	fix(honcho): improve conclude descriptions and add exactly-one validation Improve honcho_conclude tool descriptions to explicitly tell the model not to send both params together. Add runtime validation that rejects calls with both or neither of conclusion/delete_id. Add schema regression test and both-params rejection test. Consolidates #10847 by @ygd58, #10864 by @cola-runner, #10870 by @vominh1919, and #10952 by @ogzerber. The anyOf removal itself was already merged; this adds the runtime validation and tests those PRs contributed. Co-authored-by: ygd58 <ygd58@users.noreply.github.com> Co-authored-by: cola-runner <cola-runner@users.noreply.github.com> Co-authored-by: vominh1919 <vominh1919@users.noreply.github.com>	2026-04-16 09:50:10 -07:00
Teknium	50d438d125	fix(honcho): drop anyOf schema — breaks Fireworks and other providers The honcho_conclude tool schema used anyOf with nested required fields which is unsupported by Fireworks AI, MiniMax, and other providers that only handle basic JSON Schema. The handler already validates that conclusion or delete_id is present (line 1018-1020), so the schema constraint was redundant. Replace with required: [] and let the handler reject bad calls.	2026-04-16 04:10:36 -07:00
Teknium	01214a7f73	feat: dashboard plugin system — extend the web UI with custom tabs Add a plugin system that lets plugins add new tabs to the dashboard. Plugins live in ~/.hermes/plugins/<name>/dashboard/ alongside any existing CLI/gateway plugin code. Plugin structure: plugins/<name>/dashboard/ manifest.json # name, label, icon, tab config, entry point dist/index.js # pre-built JS bundle (IIFE, uses SDK globals) plugin_api.py # optional FastAPI router mounted at /api/plugins/<name>/ Backend (hermes_cli/web_server.py): - Plugin discovery: scans plugins/*/dashboard/manifest.json from user, bundled, and project plugin directories - GET /api/dashboard/plugins — returns discovered plugin manifests - GET /api/dashboard/plugins/rescan — force re-discovery - GET /dashboard-plugins/<name>/<path> — serves plugin static assets with path traversal protection - Optional API route mounting: imports plugin_api.py and mounts its router under /api/plugins/<name>/ - Plugin API routes bypass session token auth (localhost-only) Frontend (web/src/plugins/): - Plugin SDK exposed on window.__HERMES_PLUGIN_SDK__ — provides React, hooks, UI components (Card, Badge, Button, etc.), API client, fetchJSON, theme/i18n hooks, and utilities - Plugin registry on window.__HERMES_PLUGINS__.register(name, Component) - usePlugins() hook: fetches manifests, loads JS/CSS, resolves components - App.tsx dynamically adds nav items and routes for discovered plugins - Icon resolution via static map of 20 common Lucide icons (no tree- shaking penalty — bundle only +5KB over baseline) Example plugin (plugins/example-dashboard/): - Demonstrates SDK usage: Card components, backend API call, SDK reference - Backend route: GET /api/plugins/example/hello Tested: plugin discovery, static serving, API routes, path traversal blocking, unknown plugin 404, bundle size (400KB vs 394KB baseline).	2026-04-16 04:10:06 -07:00
Teknium	cc6e8941db	feat(honcho): context injection overhaul, 5-tool surface, cost safety, session isolation (#10619 ) Salvaged from PR #9884 by erosika. Cherry-picked plugin changes onto current main with minimal core modifications. Plugin changes (plugins/memory/honcho/): - New honcho_reasoning tool (5th tool, splits LLM calls from honcho_context) - Two-layer context injection: base context (summary + representation + card) on contextCadence, dialectic supplement on dialecticCadence - Multi-pass dialectic depth (1-3 passes) with early bail-out on strong signal - Cold/warm prompt selection based on session state - dialecticCadence defaults to 3 (was 1) — ~66% fewer Honcho LLM calls - Session summary injection for conversational continuity - Bidirectional peer targeting on all 5 tools - Correctness fixes: peer param fallback, None guard on set_peer_card, schema validation, signal_sufficient anchored regex, mid->medium level fix Core changes (~20 lines across 3 files): - agent/memory_manager.py: Enhanced sanitize_context() to strip full <memory-context> blocks and system notes (prevents leak from saveMessages) - run_agent.py: gateway_session_key param for stable per-chat Honcho sessions, on_turn_start() call before prefetch_all() for cadence tracking, sanitize_context() on user messages to strip leaked memory blocks - gateway/run.py: skip_memory=True on 2 temp agents (prevents orphan sessions), gateway_session_key threading to main agent Tests: 509 passed (3 skipped — honcho SDK not installed locally) Docs: Updated honcho.md, memory-providers.md, tools-reference.md, SKILL.md Co-authored-by: erosika <erosika@users.noreply.github.com>	2026-04-15 19:12:19 -07:00
Teknium	e402906d48	fix: five HERMES_HOME profile-isolation leaks (#10570 ) * fix: show correct env var name in provider API key error (#9506) The error message for missing provider API keys dynamically built the env var name as PROVIDER_API_KEY (e.g. ALIBABA_API_KEY), but some providers use different names (alibaba uses DASHSCOPE_API_KEY). Users following the error message set the wrong variable. Fix: look up the actual env var from PROVIDER_REGISTRY before building the error. Falls back to the dynamic name if the registry lookup fails. Closes #9506 * fix: five HERMES_HOME profile-isolation leaks (#5947) Bug A: Thread session_title from session_db to memory provider init kwargs so honcho can derive chat-scoped session keys instead of falling back to cwd-based naming that merges all gateway users into one session. Bug B: Replace 14 hardcoded ~/.hermes/skills/ paths across 10 skill files with HERMES_HOME-aware alternatives (${HERMES_HOME:-$HOME/.hermes} in shell, os.environ.get('HERMES_HOME', ...) in Python). Bug C: install.sh now respects HERMES_HOME env var and adds --hermes-home flag. Previously --dir only set INSTALL_DIR while HERMES_HOME was always hardcoded to $HOME/.hermes. Bug D: Remove hardcoded ~/.hermes/honcho.json fallback in resolve_config_path(). Non-default profiles no longer silently inherit the default profile's honcho config. Falls through to ~/.honcho/config.json (global) instead. Bug E: Guard _edit_skill, _patch_skill, _delete_skill, _write_file, and _remove_file against writing to skills found in external_dirs. Skills outside the local SKILLS_DIR are now read-only from the agent's perspective. Closes #5947	2026-04-15 17:09:41 -07:00
Teknium	a9197f9bb1	fix(memory): discover user-installed memory providers from $HERMES_HOME/plugins/ (#10529 ) Memory provider discovery (discover_memory_providers, load_memory_provider) only scanned the bundled plugins/memory/ directory. User-installed providers at $HERMES_HOME/plugins/<name>/ were invisible, forcing users to symlink into the repo source tree — which broke on hermes update and created a dual-registration path causing duplicate tool names (400 errors on strict providers like Xiaomi MiMo). Changes: - Add _get_user_plugins_dir(), _is_memory_provider_dir(), _iter_provider_dirs(), and find_provider_dir() helpers to plugins/memory/__init__.py - discover_memory_providers() now scans both bundled and user dirs - load_memory_provider() uses find_provider_dir() (bundled-first) - discover_plugin_cli_commands() uses find_provider_dir() - _install_dependencies() in memory_setup.py uses find_provider_dir() - User plugins use _hermes_user_memory namespace to avoid sys.modules collisions - Non-memory user plugins filtered via source text heuristic - Bundled providers always take precedence on name collisions Fixes #4956, #9099. Supersedes #4987, #9123, #9130, #9132, #9982.	2026-04-15 14:25:40 -07:00
zhiheng.liu	7cb06e3bb3	refactor(memory): drop on_session_reset — commit-only is enough OV transparently handles message history across /new and /compress: old messages stay in the same session and extraction is idempotent, so there's no need to rebind providers to a new session_id. The only thing the session boundary actually needs is to trigger extraction. - MemoryProvider / MemoryManager: remove on_session_reset hook - OpenViking: remove on_session_reset override (nothing to do) - AIAgent: replace rotate_memory_session with commit_memory_session (just calls on_session_end, no rebind) - cli.py / run_agent.py: single commit_memory_session call at the session boundary before session_id rotates - tests: replace on_session_reset coverage with routing tests for MemoryManager.on_session_end Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-15 11:28:45 -07:00
zhiheng.liu	8275fa597a	refactor(memory): promote on_session_reset to base provider hook Replace hasattr-forked OpenViking-specific paths with a proper base-class hook. Collapse the two agent wrappers into a single rotate_memory_session so callers don't orchestrate commit + rebind themselves. - MemoryProvider: add on_session_reset(new_session_id) as a default no-op - MemoryManager: on_session_reset fans out unconditionally (no hasattr, no builtin skip — base no-op covers it) - OpenViking: rename reset_session -> on_session_reset; drop the explicit POST /api/v1/sessions (OV auto-creates on first message) and the two debug raise_for_status wrappers - AIAgent: collapse commit_memory_session + reinitialize_memory_session into rotate_memory_session(new_sid, messages) - cli.py / run_agent.py: replace hasattr blocks and the split calls with a single unconditional rotate_memory_session call; compression path now passes the real messages list instead of [] - tests: align with on_session_reset, assert reset does NOT POST /sessions Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-15 11:28:45 -07:00
zhiheng.liu	7856d304f2	fix(openviking): commit session on /new and context compression The OpenViking memory provider extracts memories when its session is committed (POST /api/v1/sessions/{id}/commit). Before this fix, the CLI had two code paths that changed the active session_id without ever committing the outgoing OpenViking session: 1. /new (new_session() in cli.py) — called flush_memories() to write MEMORY.md, then immediately discarded the old session_id. The accumulated OpenViking session was never committed, so all context from that session was lost before extraction could run. 2. /compress and auto-compress (_compress_context() in run_agent.py) — split the SQLite session (new session_id) but left the OpenViking provider pointing at the old session_id with no commit, meaning all messages synced to OpenViking were silently orphaned. The gateway already handles session commit on /new and /reset via shutdown_memory_provider() on the cached agent; the CLI path did not. Fix: introduce a lightweight session-transition lifecycle alongside the existing full shutdown path: - OpenVikingMemoryProvider.reset_session(new_session_id): waits for in-flight background threads, resets per-session counters, and creates the new OV session via POST /api/v1/sessions — without tearing down the HTTP client (avoids connection overhead on /new). - MemoryManager.restart_session(new_session_id): calls reset_session() on providers that implement it; falls back to initialize() for providers that do not. Skips the builtin provider (no per-session state). - AIAgent.commit_memory_session(messages): wraps memory_manager.on_session_end() without shutdown — commits OV session for extraction but leaves the provider alive for the next session. - AIAgent.reinitialize_memory_session(new_session_id): wraps memory_manager.restart_session() — transitions all external providers to the new session after session_id has been assigned. Call sites: - cli.py new_session(): commit BEFORE session_id changes, reinitialize AFTER — ensuring OV extraction runs on the correct session and the new session is immediately ready for the next turn. - run_agent._compress_context(): same pattern, inside the if self._session_db: block where the session_id split happens. /compress and auto-compress are functionally identical at this layer: both call _compress_context(), so both are fixed by the same change. Tests added to tests/agent/test_memory_provider.py: - TestMemoryManagerRestartSession: reset_session() routing, builtin skip, initialize() fallback, failure tolerance, empty-manager noop. - TestOpenVikingResetSession: session_id update, per-session state clear, POST /api/v1/sessions call, API failure tolerance, no-client noop. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 11:28:45 -07:00
zhiheng.liu	f3ec4b3a16	Fix OpenViking integration issues: explicit session creation, better error logging	2026-04-15 11:28:45 -07:00
ZaynJarvis	5082a9f66c	fix: wire agent/account/user params through _VikingClient - Fix copy-paste bug: `self._agent = user` → `self._agent = agent` with new `agent` parameter in `_VikingClient.__init__` - Read account/user/agent env vars in `initialize()` and pass them to all 4 `_VikingClient` instantiations so identity headers are consistently applied across health check, prefetch, sync, and memory write paths Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-15 11:28:45 -07:00
Zayn Jarvis	0c30385be2	chore: update doc	2026-04-15 11:28:45 -07:00
Zayn Jarvis	8b167af66b	feat: add ov agent header	2026-04-15 11:28:45 -07:00

1 2

87 Commits