hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-06 02:37:05 +08:00

Author	SHA1	Message	Date
Leon	19eebf6e0d	fix(openrouter): treat xiaomi models as reasoning-capable	2026-05-05 06:07:44 -07:00
vominh1919	96514de472	fix(auxiliary): avoid locking into custom path when api_key is empty When auxiliary.<task> config has base_url set but api_key is empty (common when user expects env var fallback), _resolve_task_provider_model() returned provider="custom" with api_key=None. This caused downstream client construction to make API calls without an Authorization header, resulting in HTTP 401 errors. Fix: only return "custom" when BOTH cfg_base_url AND cfg_api_key are non-empty. When base_url is set without api_key but with a known provider (e.g. "openrouter"), pass through to that provider so it can resolve credentials from environment variables. Fixes #16829	2026-05-05 06:07:07 -07:00
Teknium	c7fc5af122	chore: AUTHOR_MAP entry for tangyuanjc	2026-05-05 06:04:20 -07:00
JC的AI分身	80b386a472	fix(feishu): refresh bot identity during hydration	2026-05-05 06:04:20 -07:00
Teknium	314361733f	test(api_server): _run_agent result now carries session_id for #16938	2026-05-05 06:01:03 -07:00
vominh1919	7f735b4db2	fix: return effective session_id after context compression (#16938 ) When context compression rotates the agent's session_id to a new child session, the API server was still returning the stale parent session_id in the X-Hermes-Session-Id response header. This caused external clients to keep sending the old session_id, loading uncompressed parent history instead of the compressed continuation. Fix: _run_agent() now includes the effective session_id in its result dict, and the response header uses it instead of the original provided session_id.	2026-05-05 06:01:03 -07:00
Hafiy Zakaria	34c6f93496	fix: resolve model.aliases from config.yaml in /model alias resolution hermes config set model.aliases.xxx commands write to the model.aliases nested key, but _load_direct_aliases() only read from the top-level model_aliases key. This meant aliases set via hermes config set were invisible to the /model command, and unrecognised inputs fell through to the DeepSeek normaliser which mapped everything to deepseek-chat. Add a second pass in _load_direct_aliases() that reads model.aliases and converts string-value entries (provider/model format) into DirectAlias objects. The provider is parsed from the slash prefix; if no slash, the current default provider from config is used. Also prevent simple aliases from overriding explicit model_aliases dict entries when both exist.	2026-05-05 05:49:01 -07:00
briandevans	c1a2710a32	test(aux): cover effort: 0 fallback in Codex reasoning translation Copilot review on PR #17012 noted the docstring/comment lists `0` among the falsy effort values that fall back to `medium`, but the existing regression tests only cover `None` and `""`. Add the third case to lock in the full contract. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 05:47:50 -07:00
briandevans	9e893d16d1	fix(aux): default Codex reasoning effort to medium when extra_body.reasoning.effort is falsy auxiliary.<task>.extra_body.reasoning, but the new translation path in _CodexCompletionsAdapter.create() reads the effort with ``reasoning_cfg.get("effort", "medium")``. That returns the configured value verbatim when the key is present, so ``effort: null`` / ``effort: ""`` (both common YAML shapes) flow through as ``{"effort": null, "summary": "auto"}`` and Codex rejects the request with "Invalid value for parameter ``reasoning.effort``". agent/transports/codex.py::build_kwargs() — which the new adapter is documented to mirror — uses a truthy check (``elif reasoning_config.get("effort"):``) so the same falsy values keep the "medium" default. Switch the auxiliary adapter to the same ``or "medium"`` truthy form so identical config produces identical requests on both paths. - [x] Two new regression tests cover ``effort: None`` and ``effort: ""`` and assert the request goes out as ``{"effort": "medium", "summary": "auto"}``. - [x] Old behaviour fails the new tests (``{'effort': None} != {'effort': 'medium'}``); fixed behaviour passes all 11 tests in the ``TestCodexAdapterReasoningTranslation`` class. - [x] Adjacent suites green: ``tests/agent/test_auxiliary_client.py`` (108 passed) and ``tests/agent/transports/test_codex_transport.py + test_chat_completions.py`` (73 passed). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 05:47:50 -07:00
vominh1919	44cf33449d	fix(mcp): add periodic keepalive to _wait_for_lifecycle_event Sends a lightweight list_tools() probe every 3 minutes during idle periods to prevent TCP connections from going stale behind LB / NAT idle timeouts (commonly 300-600s). When the keepalive fails, the reconnect event fires so the transport rebuilds the session cleanly. Salvages the keepalive portion of @vominh1919's PR #17016. The circuit-breaker half-open recovery from the same PR was independently landed on main via #benbarclay's commit `8cc3cebca` ("fix(mcp): add half-open state to circuit breaker", Apr 21); only the keepalive is salvaged here. Fixes #17003.	2026-05-05 05:47:33 -07:00
Teknium	005b2f4c5d	chore: AUTHOR_MAP entry for beardthelion	2026-05-05 05:46:16 -07:00
beardthelion	f15b0fbb4f	fix: add PLATFORM_HINTS entry for api_server platform The API server is a documented, first-class messaging platform with its own gateway adapter, docs pages, and toolset. But it's the only messaging platform missing from PLATFORM_HINTS in agent/prompt_builder.py. Without a platform hint, the agent has no context about the API server's rendering environment and defaults to markdown-heavy document-style outputs (code fences, bold, bullet points) — which break on the plain-text frontends most API server consumers wrap (Open WebUI, custom agents, third-party bridges). Adds a generic api_server entry that describes the medium (unknown rendering, assume plain text) without encoding any specific use case. Individual consumers can layer additional style guidance via ephemeral system prompts. Before (DeepSeek V4 Pro via API server, no hint): Sendblue bridge at /opt/sendblue-bridge - 68MB on disk After (same prompt, with hint): Sendblue bridge at /opt/sendblue-bridge, 68MB on disk No breaking changes — new dict entry only. Existing API server consumers see no behavioral change except for models that previously defaulted to markdown formatting, which now produce cleaner plain-text output.	2026-05-05 05:46:16 -07:00
Teknium	b10e38e392	fix(skills): pin protects against deletion only, not edits (#20220 ) Previously, pinning a skill blocked every skill_manage write action (edit, patch, delete, write_file, remove_file). The 'hard fence' design conflated two concerns: 1. Pin as deletion protection — don't let the curator archive or the agent delete a stable skill. 2. Pin as content freeze — don't let the agent rewrite it mid-conversation. In practice (1) is what users pin for: they want a skill to survive curator passes. (2) created friction — agents finding a new pitfall in a pinned skill had to ask the user to unpin, then the agent patches, then the user re-pins. The dance discouraged skill maintenance and pinned skills went stale. This narrows the _pinned_guard to skill_manage(action='delete') only. Patches, edits, and supporting-file writes go through on pinned skills so the agent can keep improving them. The curator's own pinned-skip behavior (agent/curator.py:271 for auto-archive, line 349 for the LLM review prompt) is unchanged — curator still never touches pinned skills. Changes: - tools/skill_manager_tool.py: remove _pinned_guard calls from _edit_skill, _patch_skill, _write_file, _remove_file; keep on _delete_skill. Updated _pinned_guard docstring and error message. - tools/skill_manager_tool.py: updated skill_manage model-facing tool description to reflect the new semantic. - website/docs/user-guide/features/curator.md: updated pinning section. - tests/tools/test_skill_manager_tool.py: flipped refuses-pinned tests for edit/patch/write_file/remove_file into allowed-when-pinned; kept test_delete_refuses_pinned (strengthened assertion to check the 'cannot be deleted' wording). Closes #18354	2026-05-05 05:43:10 -07:00
Teknium	fe8560fc12	feat(api-server): X-Hermes-Session-Key header for long-term memory scoping (#20199 ) * feat(api-server): X-Hermes-Session-Key header for long-term memory scoping API Server integrations (Open WebUI, custom web UIs) can now pass a stable per-channel identifier via X-Hermes-Session-Key that scopes long-term memory (Honcho, etc.) independently of the transcript-scoped X-Hermes-Session-Id. This matches the native gateway's session_key / session_id split: one stable key per assistant channel, many independent transcripts that rotate on /new. - _create_agent and _run_agent accept gateway_session_key and pass it to AIAgent(gateway_session_key=...), which is already honored by the Honcho memory provider (plugins/memory/honcho/client.py resolve_session_name). - New shared helper _parse_session_key_header applies the same API-key gate, control-character sanitization, and a 256-char length cap as the existing session-id header. - All three agent endpoints honor the header: /v1/chat/completions, /v1/responses, /v1/runs. JSON and SSE responses echo it back. - /v1/capabilities advertises session_key_header so clients can feature-detect. Closes #20060. Co-authored-by: Andy Stewart <lazycat.manatee@gmail.com> * chore: AUTHOR_MAP entry for manateelazycat --------- Co-authored-by: Andy Stewart <lazycat.manatee@gmail.com>	2026-05-05 05:34:47 -07:00
Teknium	436672de0e	feat(curator): add archive and prune subcommands (#20200 ) * fix(curator): protect hub skills by frontmatter name * test(skill_usage): add mark_agent_created to regression test The cherry-picked test predates #19618/#19621 which rewrote list_agent_created_skill_names() to require an explicit created_by: 'agent' provenance marker. Without mark_agent_created(), my-skill is excluded from the list and the positive assertion fails. * feat(curator): add archive and prune subcommands Adds 'hermes curator archive <skill>' and 'hermes curator prune [--days N] [--yes] [--dry-run]' alongside the existing status, run, pause, resume, pin, unpin, restore, backup, rollback verbs. These are the two genuinely new user-facing verbs requested in #19384. The other verbs proposed there ('stats' and 'restore') already exist as 'curator status' and 'curator restore', so no duplicate surface is added — all skill lifecycle commands live under the single 'hermes curator' namespace. - archive: manual archive of an agent-created skill. Refuses pinned skills with a hint pointing at 'hermes curator unpin'. - prune: bulk-archive unpinned skills idle for >= N days (default 90). Falls back to created_at when last_activity_at is null so never-used skills can still be pruned. --dry-run previews, --yes skips prompt. Adapted from @elmatadorgh's PR #19454 which placed the same verbs under 'hermes skills' with a separate hermes_cli/skills_config.py handler and rich table for stats. The 'stats' and 'restore' parts of that PR duplicated existing surface, so only archive and prune are kept, rewritten to match hermes_cli/curator.py's existing plain-text handler style. Tests rewritten from scratch against the new handlers. Closes #19384 Co-authored-by: elmatadorgh <coktinbaran5@gmail.com> --------- Co-authored-by: LeonSGP43 <cine.dreamer.one@gmail.com> Co-authored-by: elmatadorgh <coktinbaran5@gmail.com>	2026-05-05 05:15:54 -07:00
Teknium	4f76166cf0	chore: AUTHOR_MAP entry for qxxaa	2026-05-05 05:01:12 -07:00
qxxaa	0a7cc85eab	fix(honcho): pass user_message as search_query in get_prefetch_context The user_message parameter was accepted by get_prefetch_context but intentionally discarded, with the rationale that passing it would expose conversation content in server access logs. This rationale is inconsistent: Honcho already persists every message in full via saveMessages. The content is already in the database. A search query in an access log adds negligible additional exposure, and is moot for self-hosted Honcho deployments where the operator owns the logs. Without search_query, Honcho returns the full peer representation - all observations, deductive/inductive layers, and peer card - in insertion order. When contextTokens is set, the most useful parts (peer card, dialectic conclusions) are truncated because raw observations fill the budget first. Passing user_message as search_query enables Honcho's semantic retrieval to return only conclusions relevant to the current session topic, reducing injection noise and improving context quality on cold starts. The _fetch_peer_context method already accepts and passes search_query to the Honcho API. This change simply connects the two.	2026-05-05 05:01:12 -07:00
Teknium	046c293183	chore: AUTHOR_MAP entry for chengoak	2026-05-05 05:00:41 -07:00
chengoak	8f4c0bf088	fix(wecom): pad base64 AES key before decode WeCom doesn't pad base64 aeskey, causing Python strict mode decode failure on media/image/file messages. Add automatic padding before base64 decode: aes_key + '=' * ((4 - len(aes_key) % 4) % 4). Salvages the AES padding fix from @chengoak's PR #17040. The SSRF whitelist entry for a private COS bucket hostname was dropped as it belongs in user config, not the built-in trusted-private-IP-hosts list. The debug-level full-body info log was dropped to avoid logging potentially sensitive message content at INFO level.	2026-05-05 05:00:41 -07:00
Teknium	83a07f4759	chore: AUTHOR_MAP entry for happy5318	2026-05-05 05:00:05 -07:00
Teknium	9e0ef2a1bc	test: pin per-turn reasoning extraction semantics Covers four scenarios for the reasoning-box extraction loop: - simple turn with reasoning - simple turn with no reasoning - tool-calling turn where reasoning lives on the tool-call step - prior turn had reasoning, current turn does not (the stale-display bug the fix exists for) - tool-calling turn where reasoning lives on BOTH steps (latest wins) - empty-string reasoning treated as missing Also updates the four inline replica loops in tests/cli/test_reasoning_command.py to match the new turn-boundary shape so the test file reflects production semantics.	2026-05-05 05:00:05 -07:00
happy5318	efe1cb00c8	fix: prevent stale reasoning from being reused across turns The reasoning-box extraction loop in run_conversation() walked backwards through the entire message history looking for any assistant message with a non-empty 'reasoning' field. When the current turn produced no reasoning (e.g. the provider returned reasoning_content=null for a trivial response), the loop walked past the current turn and showed reasoning from a prior turn — stale text from minutes or hours ago displayed as if it belonged to the current reply. Fix: stop the walk at the user message that started the current turn. That picks the most recent reasoning WITHIN the turn (correct for tool-calling turns where reasoning lands on the tool-call step and the final-answer step has reasoning=None — common on Claude thinking, DeepSeek v4, Codex Responses), and returns None cleanly when the current turn genuinely had no reasoning. Co-authored-by: happy5318 <happy5318@users.noreply.github.com>	2026-05-05 05:00:05 -07:00
Teknium	4577f392f9	chore: AUTHOR_MAP entry for ashermorse	2026-05-05 04:58:23 -07:00
Asher Morse	6b76ea4707	fix(gateway): load reply_to_mode from config.yaml for Discord and Telegram The YAML-to-env-var bridge in load_gateway_config() mapped every Discord and Telegram config key (require_mention, auto_thread, reactions, etc.) except reply_to_mode. Users setting discord.reply_to_mode or telegram.reply_to_mode in ~/.hermes/config.yaml got no effect — the adapter only read the env var, which nothing populated from YAML. Add the missing bridge for both platforms, following the existing pattern. Top-level <platform>.reply_to_mode preferred, falls back to <platform>.extra.reply_to_mode, env var never overwritten. Handles YAML 1.1 bare `off` → Python False coercion. This is a re-submission of the work from #9837 and #13930, which both implemented the same fix but neither landed (see co-authors below). Co-authored-by: Matteo De Agazio <hypnosis.mda@gmail.com> Co-authored-by: ishardo <239075732+ishardo@users.noreply.github.com>	2026-05-05 04:58:23 -07:00
LeonSGP43	354502ee48	fix(kanban): preserve dashboard completion summaries	2026-05-05 04:57:38 -07:00
Teknium	cca8587d35	docs(quickstart): link Onchain AI Garage Hermes tutorials playlist (#20192 ) * revert(gateway): remove stale-code self-check and auto-restart Removes the _detect_stale_code / _trigger_stale_code_restart mechanism introduced in #17648 and iterated in #19740. On every incoming message the gateway compared the boot-time git HEAD SHA to the current SHA on disk, and if they differed it would reply with Gateway code was updated in the background -- restarting this gateway so your next message runs on the new code. Please retry in a moment. and then kick off a graceful restart. This is unwanted behaviour: users who run a long-lived gateway and do their own ad-hoc git operations on the checkout end up with their chat interrupted and the current message dropped every time HEAD moves, with no way to opt out. If an operator really needs the old protection against stale sys.modules after "hermes update", the SIGKILL-survivor sweep in hermes update (hermes_cli/main.py, also tagged #17648) already handles the supervisor-respawn case on its own. Removed: gateway/run.py: - _STALE_CODE_SENTINELS, _GIT_SHA_CACHE_TTL_SECS - _read_git_head_sha(), _compute_repo_mtime() module helpers - class-level _boot_wall_time / _boot_repo_mtime / _boot_git_sha / _stale_code_restart_triggered defaults - __init__ boot-snapshot block (_boot_, _cached_current_sha, _repo_root_for_staleness, _stale_code_notified) - _current_git_sha_cached(), _detect_stale_code(), _trigger_stale_code_restart() methods - stale-code check + user-facing restart notice at the top of _handle_message() tests/gateway/test_stale_code_self_check.py (deleted, 412 lines) No new logic added. Zero remaining references to any removed symbol. Gateway test suite passes the same 4589 tests it passed before; the 3 pre-existing unrelated failures (discord free-channel, feishu bot admission, teams typing) are unchanged by this commit. * docs(quickstart): link Onchain AI Garage Hermes tutorials playlist Adds a 'Prefer to watch?' tip callout near the top of the quickstart page pointing to @OnchainAIGarage's Hermes Agent Tutorials + Use Cases playlist, which includes a Masterclass series covering install, setup, and basic commands. * docs(quickstart): embed Masterclass video in Prefer to watch section Swaps the plain-link tip callout for an inline responsive YouTube embed of the Hermes Agent Masterclass (R3YOGfTBcQg) plus a kept link to the full Onchain AI Garage tutorials playlist.	2026-05-05 04:56:54 -07:00
Teknium	4d0f59fa5a	test(skill_usage): add mark_agent_created to regression test The cherry-picked test predates #19618/#19621 which rewrote list_agent_created_skill_names() to require an explicit created_by: 'agent' provenance marker. Without mark_agent_created(), my-skill is excluded from the list and the positive assertion fails.	2026-05-05 04:55:22 -07:00
LeonSGP43	68c1a08ad1	fix(curator): protect hub skills by frontmatter name	2026-05-05 04:55:22 -07:00
Teknium	5168226d60	feat(file_tools): post-write delta lint on write_file + patch, add JSON/YAML/TOML/Python in-process linters (#20191 ) Closes the gap where write_file skipped the post-edit syntax check that patch already ran, so silent file corruption (bad quote escaping, truncated writes, etc.) would persist on disk until a later read. ## Changes tools/file_operations.py: - Add in-process linters for .py, .json, .yaml, .toml (LINTERS_INPROC). Python uses ast.parse, JSON/YAML/TOML use stdlib/PyYAML parsers. Zero subprocess overhead; preferred over shell linters when both apply. - _check_lint() now accepts optional content and routes to in-process linter first. Shell linter (py_compile, node --check, tsc, go vet, rustfmt) remains the fallback for languages without an in-process equivalent. - New _check_lint_delta() implements the post-first/pre-lazy pattern borrowed from Cline and OpenCode: lint post-write state first; only if errors are found AND pre-content was captured does it lint the pre-state and diff. If the pre-existing file had the SAME errors the edit didn't introduce anything new, so the file is reported as 'still broken, pre-existing' with success=False but a message explaining the errors were pre-existing. If the edit introduced genuinely new errors, those are surfaced and pre-existing ones are filtered out. - WriteResult gains a lint field. - write_file() captures pre-content for in-process-lintable extensions and calls _check_lint_delta after a successful write. - patch_replace() switches from _check_lint to _check_lint_delta, reusing the pre-edit content it already has in scope. tools/file_tools.py: - Update write_file schema description to mention the post-write lint. tests/tools/test_file_operations_edge_cases.py: - Update existing brace-path tests to use .js (shell linter) now that .py is in-process. - Add TestCheckLintInproc (9 tests) covering Python/JSON/YAML/TOML in-process linters. - Add TestCheckLintDelta (5 tests) covering the post-first/pre-lazy short-circuit, new-file path, and the single-error-parser caveat. ## Performance In-process linters are microseconds per call (ast.parse, json.loads). The hot path (clean write) runs exactly one lint — matches main's cost for patch. Pre-state capture is skipped when the file has no applicable linter. Measured 4.89ms/write average over 100 .py writes including lint. ## Inspiration - Cline's DiffViewProvider.getNewDiagnosticProblems() — filters pre-write diagnostics from post-write diagnostics (src/integrations/editor/DiffViewProvider.ts). - OpenCode's WriteTool — runs lsp.diagnostics() after write and appends errors to tool output (packages/opencode/src/tool/write.ts). - Claude Code's DiagnosticTrackingService — captures baseline via beforeFileEdited() and returns new-diagnostics-only from getNewDiagnostics() (src/services/diagnosticTracking.ts). ## Validation - tests/tools/test_file_operations.py + test_file_operations_edge_cases.py + test_file_tools.py + test_file_tools_live.py + test_file_write_safety.py + test_write_deny.py + test_patch_parser.py + test_file_ops_cwd_tracking.py: 228 passed locally. - Live E2E reproduction of the tips.py corruption incident: broken content written; lint field surfaces 'SyntaxError: invalid syntax. Perhaps you forgot a comma? (line 6, column 5)' — the exact error that would have self-corrected the bug on the next turn.	2026-05-05 04:54:17 -07:00
Teknium	b93643c8fe	chore: AUTHOR_MAP entry for wmagev	2026-05-05 04:51:29 -07:00
wmagev	2eef395e1c	fix(compaction): mark end of context summary in role=user fallback When the head ends with assistant/tool and the tail starts with assistant, the summary is inserted as a standalone role="user" message. The body's verbatim "## Active Task" quote then gets read as fresh user input by weak/local models (#11475, #14521). The merge-into-tail path already appends an explicit end-of-summary marker for this reason. Mirror it on the standalone path so both insertion routes give the model the same "summary above, not new input" signal.	2026-05-05 04:51:29 -07:00
Teknium	c725d7d648	chore: AUTHOR_MAP entry for TheEpTic	2026-05-05 04:45:32 -07:00
Nexus	660ce7c54b	fix(ui-tui): prevent React effect cleanup from killing python TUI gateway subprocess The useEffect at useMainApp.ts:546-565 calls gw.kill() in its cleanup function. React calls cleanup on every re-render when the dependency array ([gw, sys]) shifts — which happens whenever sys changes identity (any system message). This sends SIGTERM to the Python TUI gateway subprocess, silently killing the backend mid-session. The kill path was already handled by entry.tsx's setupGracefulExit for real app exits (SIGINT, uncaught exception). The die() function also calls gw.kill() for explicit user exit. Removing the cleanup kill leaves all exit paths covered while preventing accidental mid-session kills on ordinary React re-renders.	2026-05-05 04:45:32 -07:00
LeonSGP43	1a03e3b1c6	fix(kanban): detect darwin zombie workers	2026-05-05 04:43:40 -07:00
0xsir0000	f6b68f0f50	fix(gateway): keep DoH-confirmed Telegram IPs that match system DNS (#14520 ) discover_fallback_ips() filtered out any DoH-resolved IP that also appeared in the system resolver's answer set, on the assumption that the system IP was unreachable. When DoH and system DNS agreed (a common case), the function returned the hardcoded _SEED_FALLBACK_IPS list instead — and on networks where those seed addresses are not routable, the Telegram fallback transport had nothing usable to retry against and polling failed. Drop the system_ips exclusion so DoH-confirmed IPs are preserved regardless of system DNS overlap. The TelegramFallbackTransport already tries the primary path first via system DNS, then falls through to the IP-rewrite path on connect failure; including the same IP in both lanes lets a transient primary failure recover via the explicit IP route instead of escalating to seed addresses. Update the two tests that codified the old exclusion to reflect the new, inclusion-by-default behaviour. Fixes #14520	2026-05-05 04:42:59 -07:00
revaraver	aacf36e943	fix(cli): persist manual compress handoff	2026-05-05 04:42:48 -07:00
Teknium	fe8dc26bc9	chore: AUTHOR_MAP entry for revaraver noreply	2026-05-05 04:42:44 -07:00
revaraver	4a3e3e20e5	fix(compression): preserve iterative summary continuity	2026-05-05 04:42:44 -07:00
Teknium	f8a6db68ca	test(kanban): isolate HERMES_KANBAN_BOARD writes in pin-env tests The helper under test writes to os.environ directly, bypassing monkeypatch tracking. Without an explicit snapshot/restore fixture, the mutation leaks into subsequent tests and breaks TestSharedBoardPaths (kanban path resolution reads HERMES_KANBAN_BOARD and routes through boards/<leaked-slug>/ instead of the test's own HERMES_HOME). Add an autouse fixture that snapshots the env var before the test and restores (or pops) it after, regardless of what the helper did.	2026-05-05 04:37:47 -07:00
0xDevNinja	b22b3f506a	fix(cli): pin HERMES_KANBAN_BOARD at chat boot to stop subprocess board drift Without an explicit pin, in-process kanban tools and shelled-out `hermes kanban …` subprocesses resolve the active board on different paths: the env var when set, otherwise the global `<root>/kanban/current` file. When a concurrent session toggles the current-board pointer mid-turn, the same chat ends up routing tool calls to board A while its shell calls hit board B, surfacing as phantom "no such task" errors. Pin the resolved board into env once at `cmd_chat` boot when HERMES_KANBAN_BOARD isn't already set. Mirrors what the dispatcher does for spawned workers (kanban_db.py:2622-2623). Idempotent and a no-op when the env is already pinned by the caller. Closes #20074	2026-05-05 04:37:47 -07:00
Teknium	d472d697cd	chore(release): map stevekelly622@gmail.com → @steezkelly	2026-05-05 04:34:45 -07:00
Steve Kelly	8c82d0664d	fix(kanban): ignore stale current board pointers	2026-05-05 04:34:45 -07:00
Teknium	2a285d5ec2	fix(agent): stateful streaming scrubber for reasoning-block leaks (#17924 ) (#20184 ) * revert(gateway): remove stale-code self-check and auto-restart Removes the _detect_stale_code / _trigger_stale_code_restart mechanism introduced in #17648 and iterated in #19740. On every incoming message the gateway compared the boot-time git HEAD SHA to the current SHA on disk, and if they differed it would reply with Gateway code was updated in the background -- restarting this gateway so your next message runs on the new code. Please retry in a moment. and then kick off a graceful restart. This is unwanted behaviour: users who run a long-lived gateway and do their own ad-hoc git operations on the checkout end up with their chat interrupted and the current message dropped every time HEAD moves, with no way to opt out. If an operator really needs the old protection against stale sys.modules after "hermes update", the SIGKILL-survivor sweep in hermes update (hermes_cli/main.py, also tagged #17648) already handles the supervisor-respawn case on its own. Removed: gateway/run.py: - _STALE_CODE_SENTINELS, _GIT_SHA_CACHE_TTL_SECS - _read_git_head_sha(), _compute_repo_mtime() module helpers - class-level _boot_wall_time / _boot_repo_mtime / _boot_git_sha / _stale_code_restart_triggered defaults - __init__ boot-snapshot block (_boot_, _cached_current_sha, _repo_root_for_staleness, _stale_code_notified) - _current_git_sha_cached(), _detect_stale_code(), _trigger_stale_code_restart() methods - stale-code check + user-facing restart notice at the top of _handle_message() tests/gateway/test_stale_code_self_check.py (deleted, 412 lines) No new logic added. Zero remaining references to any removed symbol. Gateway test suite passes the same 4589 tests it passed before; the 3 pre-existing unrelated failures (discord free-channel, feishu bot admission, teams typing) are unchanged by this commit. * fix(agent): stateful streaming scrubber for reasoning-block leaks (#17924) Per-delta _strip_think_blocks ran at _fire_stream_delta and destroyed downstream state. When MiniMax-M2.7 / DeepSeek / Qwen3 streamed a tag split across deltas (delta1='<think>', delta2='Let me check'), the regex case-2 match erased delta1 entirely, so CLI/gateway state machines never learned a block was open and leaked delta2 as content. Raw consumers (ACP, api_server, TTS) had no downstream defense at all. Replace the per-delta regex with a stateful StreamingThinkScrubber that survives delta boundaries: - Closed <tag>X</tag> pairs always stripped (matches _strip_think_blocks case 1). - Unterminated open at block boundary enters a block; content discarded until close tag arrives. At end-of-stream, held content is dropped. - Orphan close tags stripped without boundary gating. - Partial tags at delta boundaries held back until resolved. - Block-boundary rule (start-of-stream, after \n, or whitespace-only since last \n) preserves prose that mentions tag names. Reset at turn start alongside the existing context scrubber; flush at turn end so a benign '<' held back at end-of-stream reaches the UI. E2E-verified on live OpenRouter->MiniMax-m2 streams: closed pairs strip cleanly, first word of post-block content is preserved, pure content passes through unchanged. Stefan's screenshot case (#17924) — 'Let me check' getting chopped to ' me check' — no longer happens. Final _strip_think_blocks calls on completed strings (final_response, replay, compression) are preserved; only the streaming per-delta call site switched to the scrubber.	2026-05-05 04:33:38 -07:00
Chris Danis	28f4d6db63	fix(tool-schemas): reactive strip of pattern/format on llama.cpp grammar 400s MCP servers commonly emit JSON Schema `pattern` (e.g. `\\d{4}-\\d{2}-\\d{2}` for date-time params) and `format` keywords. llama.cpp's `json-schema-to-grammar` converter rejects regex escape classes (\\d/\\w/\\s) and most format values, returning HTTP 400 "parse: error parsing grammar: unknown escape at \\d" — the whole request fails. Cloud providers (OpenAI, Anthropic, OpenRouter, Gemini) accept these keywords fine and use them as prompting hints. Stripping unconditionally loses useful hints for every cloud user to fix a llama.cpp-only bug. Approach: classify the llama.cpp grammar-parse 400 in the error classifier, and on match do a one-shot in-place strip of pattern/format from `self.tools`, then retry. Follows the existing `thinking_signature` recovery pattern. Cloud users hit zero overhead; llama.cpp users pay one failed request per session. Changes - agent/error_classifier.py: new `FailoverReason.llama_cpp_grammar_pattern` + narrow HTTP-400 branch matching "error parsing grammar", "json-schema-to-grammar", or "unable to generate parser ... template". - tools/schema_sanitizer.py: new `strip_pattern_and_format()` helper — reactive, walks schema nodes, skips property names (search_files.pattern survives). Returns strip count for logging. - run_agent.py: new one-shot recovery block in the retry loop. Strips, logs, continues. Falls through to normal retry if nothing to strip. - tests: 4 classifier tests (3 variants + 1 non-400 negative), 7 strip tests including the property-name preservation and idempotency checks. Co-authored-by: Chris Danis <cdanis@gmail.com>	2026-05-05 04:25:18 -07:00
Interstellar-code	542e06c789	fix: include default profile in kanban assignees	2026-05-05 04:25:05 -07:00
Teknium	fc4aa66ee4	feat(tips): add 100 new CLI startup tips (#20168 ) Expands TIPS corpus from 280 to 380 entries covering untapped territory across slash commands, CLI flags, env vars, config keys, and platform features. Every tip verified against real code and docs. Batch 1 (50): advanced slash commands (/steer, /goal, /snapshot, /copy, /redraw, /agents, /footer, /busy, /topic, /approve, /restart, /kanban, /reload), no-agent cron, gateway hooks, curator, credential pools, provider routing, TUI/dashboard env vars and themes, checkpoints, Piper TTS, API server, GATEWAY_PROXY_URL, MATRIX_DEVICE_ID, TELEGRAM_WEBHOOK_SECRET, batch_runner --resume. Batch 2 (50): lesser-known slash commands (/new, /clear, /history, /save, /status, /image, /platforms, /commands, /toolsets, /gquota, /voice tts, /reload-skills, /indicator, /debug), CLI subcommands (hermes -z, --pass-session-id, --image, --ignore-user-config, --source tool, dump --show-keys, sessions rename/delete, import, fallback, pairing, setup, status --deep), agent behavior env vars (HERMES_AGENT_TIMEOUT, HERMES_ENABLE_PROJECT_PLUGINS, HERMES_DISABLE_FILE_STATE_GUARD, HERMES_ALLOW_PRIVATE_URLS, HERMES_OPTIONAL_SKILLS, HERMES_BUNDLED_SKILLS, HERMES_DUMP_REQUEST_STDOUT, HERMES_OAUTH_TRACE, HERMES_STREAM_RETRIES), gateway env vars, image_gen config, auxiliary.session_search, tirith_fail_open, source tool filtering, API_SERVER_MODEL_NAME, dashboard plugins.	2026-05-05 04:15:58 -07:00
Brecht-H	f25d3ec917	fix(kanban): suppress dispatcher stuck-warn when ready queue holds only non-spawnable assignees After PR #20105 (dispatcher skips ready tasks whose assignee fails ``profile_exists()`` to prevent the orion-cc/orion-research crash loop), the gateway and CLI emit a spurious "kanban dispatcher stuck: ready queue non-empty for N consecutive ticks but 0 workers spawned" warning every 5 minutes on multi-lane setups where the queue is steadily full of human-pulled work assigned to terminal lanes. The warn is intended to catch real failure modes (broken PATH, missing venv, credential loss for a real Hermes profile). On a multi-lane host it fires forever even though everything is healthy: the dispatcher correctly chose not to spawn, and there is nothing for the operator to fix. Changes: * ``DispatchResult`` gains a ``skipped_nonspawnable`` field (separate from ``skipped_unassigned``) so callers can distinguish "task missing an owner — operator should route it" from "task owned by a control-plane lane — terminal will pull it". * ``dispatch_once`` routes the ``not profile_exists(assignee)`` skip into the new bucket (was lumped into ``skipped_unassigned``). * New helper ``has_spawnable_ready(conn)`` returns True iff at least one ready+assigned+unclaimed task in the DB has an assignee that maps to a real Hermes profile. Falls back to legacy "any ready+assigned" when ``profile_exists`` is unimportable so degraded installs still surface the original warn. * The gateway dispatcher (``gateway/run.py``) and the CLI standalone daemon (``hermes_cli/kanban.py``) both swap their cheap ``ready_nonempty`` probe to use ``has_spawnable_ready``. Stuck-warn now fires only when there is genuine spawnable work the dispatcher failed to start. * CLI dispatch output prints ``Skipped (non-spawnable assignee — terminal lane, OK)`` for visibility without alarm. Tests: * New ``has_spawnable_ready`` cases (empty queue, terminal-lane only, mixed real+terminal). * New ``test_dispatch_skips_nonspawnable_into_separate_bucket`` verifies the bucketing change. * Updated ``test_dispatch_skips_unassigned`` to assert no cross-leak. * Added ``all_assignees_spawnable`` fixture in ``tests/hermes_cli/conftest.py`` and threaded it through dispatcher tests that use synthetic assignees ("alice", "bob"). PR #20105 (the parent commit) silently broke 8 such tests by routing those assignees into ``skipped_nonspawnable`` instead of spawning; this PR repairs them as part of the same code area. Verified locally: 246/246 kanban-suite tests pass. Stacks on top of fix/kanban-dispatcher-skip-missing-profile-2026-05-05 (PR #20105). Reviewer: this PR is meant to merge AFTER #20105. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 04:13:12 -07:00
Brecht-H	ca5595fe7b	fix(kanban): dispatcher skips ready tasks whose assignee is not a real profile The kanban dispatcher's `_default_spawn` invokes ``hermes -p <task.assignee> chat -q ...``. When ``assignee`` names a control-plane lane (e.g. an interactive Claude Code terminal like ``orion-cc`` / ``orion-research``) instead of a real Hermes profile, the subprocess fails on startup with "Profile 'X' does not exist", gets reaped as a zombie, the TTL/crash detector marks the task back to ``ready``, and the next tick re-spawns the same crashing worker. Result: a permanent crash loop emitting ``spawned=2 crashed=2 every tick`` in the gateway log and burning CPU forever. Reproduce on a fresh Hermes-agent install: # 1. Create a kanban task whose assignee names a non-profile. hermes kanban create --assignee orion-cc --status ready \ --title "Review PR #N" --body "..." # 2. Start the gateway with the embedded dispatcher. hermes gateway run # gateway.log lines every minute: # kanban dispatcher: tick spawned=1 reclaimed=0 crashed=1 ... # 3. ps -ef \| grep '[h]ermes.*defunct' shows zombies. Fix --- ``dispatch_once()`` now pre-checks ``hermes_cli.profiles. profile_exists(assignee)`` before claiming. If False, the row is added to ``skipped_unassigned`` (it's effectively "unassigned-to-an-executable-profile") and the dispatcher moves on without claiming, spawning, or counting a crash. The check is opt-in safe: if the import fails (e.g. test isolation, profile module restructured), ``profile_exists`` falls back to ``None`` and the original behaviour is preserved unchanged. This addresses the explicit hint in the kanban task body (``t_2bab06e3``): "Should ready-state tasks auto-spawn at all, or only on explicit orion-cc claim? If spurious, gate the auto-spawn behind a config flag (e.g. only assignee=hermes or assignee=auto)." Profile-existence is a tighter gate than a config flag — it self-documents (the user already knows whether they have an ``orion-cc`` profile), and it doesn't require Mac to maintain an allowlist as new lane names appear. New lanes that ARE real profiles (created via ``hermes profile create``) auto- qualify the moment the profile dir is created. Validated live -------------- On Orion's hermes-agent install, two ``orion-research``- assigned tasks (Bug A and Bug C investigations) had been crash-looping since 2026-05-05 06:58 local. After applying the patch + restarting the gateway: - Stale ``running`` claims released to ``ready`` cleanly. - New gateway emitted ``kanban dispatcher: embedded`` and has ticked silently for 2+ minutes — no spawned=, crashed=, or stuck= log lines (all spawn skips are quiet). - Tasks remain ``ready`` with ``claim_lock=None``, ``worker_pid=None``, ``spawn_failures=0``. - Dashboard + telegram + freqtrade unaffected. Confidence: high (live verified on Orion). Scope-risk: narrow (additive guard inside one function). Not-tested: behaviour when a profile is renamed mid-tick — current code re-imports ``profile_exists`` per row so a freshly created profile auto-qualifies on the next tick. Machine: orion-terminal Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 04:13:12 -07:00
Teknium	91ce8fc000	fix(setup): offer Keep/Replace/Clear when API key already exists hermes setup / hermes model used to silently skip the key prompt when any value was present in .env — even a malformed paste — leaving users with a stuck '✓' and no way to recover without hand-editing .env. Replace the silent acknowledgement at all three API-key provider flows (Kimi, Stepfun, generic) with a single [K]eep / [R]eplace / [C]lear menu via a shared `_prompt_api_key` helper. - K / Enter / Ctrl-C / unknown input → keep (never destroys the key) - R → getpass for new key; empty input cancels and preserves existing - C → clears the env var, tells user to rerun hermes setup, aborts flow LM Studio's no-auth-placeholder substitution stays on first-time entry only; on Replace an empty input means 'cancel', not 'overwrite with dummy key'. 11 unit tests cover all branches incl. garbage-input-keeps-key, Ctrl-C at the choice prompt, Replace-cancel preserving the old key, Clear wiping only the target env var, and lmstudio placeholder semantics. Fixes #16394 Reshapes #18355 — original PR pasted the menu inline at 3 sites with no tests; this consolidates to one helper (+88/-66) with coverage. Co-authored-by: Feranmi10 <89228157+Feranmi10@users.noreply.github.com>	2026-05-05 04:08:11 -07:00
simbam99	8ad5e98f8d	fix(gateway): preserve pending update prompts across restarts	2026-05-05 03:59:39 -07:00

1 2 3 4 5 ...

7260 Commits