hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-28 06:51:16 +08:00

Author	SHA1	Message	Date
Teknium	13ca9ee665	Merge remote-tracking branch 'origin/main' into hermes/curator-infra	2026-04-26 07:16:35 -07:00
Teknium	83b22637af	feat(curator): hook into the gateway's cron-ticker thread Long-running gateways need the curator to fire on cadence without restarts. Piggy-back on the existing cron ticker thread (which already runs image/document cache cleanup every hour on the same pattern) instead of spawning a dedicated timer thread. - New CURATOR_EVERY = 60 ticks (poll hourly at default 60s interval). The inner config.interval_hours gate controls the real cadence, so all but one of the hourly pokes in each interval are cheap no-ops and one runs the review. - Removed the boot-time call added in the prior commit; the ticker covers boot + every hour thereafter. Avoids double-running. Handles the weekly-default-on-24/7-gateway gap flagged in review.	2026-04-26 07:16:27 -07:00
Teknium	454d883e69	refactor: drop persist_session plumbing + fix broken btw mid-turn bypass (#16075 ) Follow-up to PR #16053 (/btw as /background alias). Cleans up the plumbing added exclusively for the old ephemeral /btw handler and repairs a broken btw bypass that landed between my refactor and this follow-up. run_agent.py: - Remove persist_session kwarg, instance attr, and _persist_session short-circuit. Only /btw ever passed persist_session=False; with /btw gone the default (always persist) is the only behavior anyone ever wanted. gateway/run.py: - Remove the unreachable 'if _cmd_def_inner.name == "btw"' block (PR #16059). Canonical name for a /btw message is 'background' after alias resolution — the comparison could never be true, and it called _handle_btw_command which no longer exists. The /background branch above it already dispatches /btw correctly. tests/gateway/test_running_agent_session_toggles.py: - Fix test_btw_dispatches_mid_run to mock _handle_background_command (the real dispatch target for /btw) instead of the deleted _handle_btw_command.	2026-04-26 07:15:23 -07:00
Teknium	70f56e7605	fix(gateway): let /btw dispatch mid-turn instead of being rejected /btw spawns a parallel ephemeral side-question task (self-guarded against concurrent /btw on the same chat) — exactly like /background. But it was missing from the running-agent bypass list in _handle_message(), so it fell through to the catch-all and returned: ⏳ Agent is running — /btw can't run mid-turn. Wait for the current response or /stop first. That's the opposite of what /btw is for — asking a side question while the main turn is still working. Add the bypass next to /background and a regression test covering the mid-turn dispatch path. Reported by @IuriiTiunov on Telegram.	2026-04-26 07:11:10 -07:00
Teknium	7fa70b6c87	refactor: /btw is now an alias for /background (#16053 ) The ephemeral no-tools side-question variant of /btw confused users who expected 'by-the-way' to mean 'run this off to the side with tools' — they'd type /btw and get a toolless agent that couldn't do the work. /bg worked because it was /background with full tools. Collapse the two: /btw and /bg both alias to /background. One command, one behavior, no more gotchas about which variant has tools. Removed: - _handle_btw_command in cli.py and gateway/run.py - _run_btw_task + _active_btw_tasks state in gateway/run.py - prompt.btw JSON-RPC method + btw.complete event in tui_gateway - BtwStartResponse type + btw.complete case in ui-tui - Standalone /btw slash tree registration in Discord - Standalone btw CommandDef in hermes_cli/commands.py Updated: - background CommandDef aliases: (bg,) -> (bg, btw) - TUI session.ts: local btw handler merged into background - Docs and tips updated to describe /btw as a /background alias	2026-04-26 07:11:08 -07:00
Teknium	0be51452fa	fix(curator): default cycle is every 7 days, not 24 hours Weekly is closer to how skill churn actually works — most agent-created skills don't change multiple times per day, so a daily review is pure cost without benefit. Bumping the default to 7 days reduces aux-model spend while still catching drift and staleness on the timescales that matter (30d stale, 90d archive). Changes: - DEFAULT_INTERVAL_HOURS: 24 -> 168 (7 days) - config.yaml default: interval_hours: 24 -> 24 * 7 - CLI status line renders as '7d' when interval is a whole-day multiple - Test `test_old_run_eligible` decoupled from the exact default: it now uses 2 * get_interval_hours() so future tweaks don't break it	2026-04-26 06:32:18 -07:00
Teknium	9a70260490	Revert "feat(onboarding): port first-touch hints to the TUI (#16054)" (#16062) This reverts commit `ffd2621039`.	2026-04-26 06:31:37 -07:00
Teknium	ffd2621039	feat(onboarding): port first-touch hints to the TUI (#16054 ) PR #16046 added /busy and /verbose hints to the classic CLI and the gateway runner but skipped the Ink TUI (and therefore the dashboard /chat page, which embeds the TUI via PTY). This extends the same latch to the TUI with TUI-native wording. The TUI's busy-input model is not the /busy knob from the CLI — single Enter while busy auto-queues, double Enter on an empty line interrupts. The new busy-input hint teaches THAT gesture instead of telling the user to flip a config that does not apply. Changes: - agent/onboarding.py — add busy_input_hint_tui() + tool_progress_hint_tui() - tui_gateway/server.py — onboarding.claim JSON-RPC (Ink triggers busy hint on enqueue) + _maybe_emit_onboarding_hint helper hooked into _on_tool_complete for the 30s/tool_progress=all path. Same config.yaml latch so each hint fires at most once per install across CLI, gateway, and TUI combined. - ui-tui/src/gatewayTypes.ts — OnboardingClaimResponse + onboarding.hint event - ui-tui/src/app/createGatewayEventHandler.ts — render the hint event as sys() - ui-tui/src/app/useSubmission.ts — claim busy_input_prompt on first busy enqueue - tests/agent/test_onboarding.py — +3 cases for TUI hint shape - tests/tui_gateway/test_protocol.py — +4 cases for onboarding.claim - website/docs/user-guide/tui.md — new 'Interrupting and queueing' section explaining the TUI's double-Enter model and the hints Validation: scripts/run_tests.sh tests/agent/test_onboarding.py \ tests/tui_gateway/test_protocol.py \ tests/gateway/test_busy_session_ack.py -> 66 passed npm --prefix ui-tui run type-check -> clean npm --prefix ui-tui run lint -> clean npm --prefix ui-tui run build -> clean	2026-04-26 06:24:19 -07:00
Teknium	1e37ddc929	feat(cli): add 'hermes fallback' command to manage fallback providers (#16052 ) Manage the fallback_providers chain from the CLI instead of hand-editing config.yaml. The picker reuses select_provider_and_model() from 'hermes model' — same provider list, same credential prompts, same model picker. hermes fallback [list] Show the current chain (primary + fallbacks) hermes fallback add Run the model picker, append selection to chain hermes fallback remove Pick an entry to delete (arrow-key menu) hermes fallback clear Remove all entries (with confirmation) 'add' snapshots config['model'] before calling the picker, extracts the user's selection from the post-picker state, then restores the primary and appends {provider, model, base_url?, api_mode?} to fallback_providers. Auth store's active_provider is snapshot/restored too so OAuth-provider fallbacks don't silently deactivate the user's primary. Duplicates and self-as-fallback are rejected. Legacy single-dict 'fallback_model' entries are auto-migrated to the list format on first write.	2026-04-26 06:19:04 -07:00
Teknium	76df76477f	fix(curator): defense-in-depth gates against bundled/hub skills Previous invariants only gated the primary entry points (apply_automatic_transitions, archive_skill, CLI pin). Several paths were unprotected: - bump_view / bump_use / bump_patch / set_state / set_pinned wrote usage records unconditionally, which is confusing noise in .usage.json even though the review list filtered them out - restore_skill did not check whether a bundled skill now shadows the archived name - CLI unpin was asymmetric with CLI pin — it had no gate Fixes: - _mutate() (the shared counter / state writer) now drops silently when the skill is not agent-created. .usage.json never gains a record for a bundled or hub-installed skill. - restore_skill() refuses to restore under a name that is now bundled or hub-installed (would shadow upstream). - CLI unpin gate matches CLI pin. New tests: - 5 provenance-guard tests on skill_usage (one per mutator) - 1 end-to-end test that hammers every mutator at a bundled skill and a hub skill, asserts both are untouched on disk, and asserts the sidecar stays clean - 2 CLI tests proving pin/unpin refuse bundled skills symmetrically 64/64 tests passing (29 skill_usage + 27 curator + 8 new guards).	2026-04-26 06:17:01 -07:00
Teknium	f40ccece11	refactor(curator): point review prompt at existing tools The LLM review prompt mentioned bespoke `archive_skill` and `pin_skill` tools that are not registered as model tools. Swap the prompt to rely on the real surface: - skill_manage action=patch — for patching and consolidation - terminal — to `mv` skill dirs into .archive/ Also drop `pin` from the model's decision list — pinning is a user opt-out for `hermes curator pin <skill>`, not something the model should do autonomously. Decision list is now: keep / patch / consolidate / archive. Tests updated: prompt-invariant test now asserts the existing tools are referenced and that bespoke tool names do NOT appear. New test prevents `pin` from being re-added as a model decision.	2026-04-26 06:13:09 -07:00
Teknium	9dd59cb637	feat(curator): background skill maintenance (issue #7816) Adds the Curator, an auxiliary-model background task that periodically reviews AGENT-CREATED skills and keeps the collection tidy: tracks usage, transitions unused skills through active → stale → archived, and spawns a forked AIAgent to consolidate overlaps and patch drift. Default: enabled, inactivity-triggered (no cron daemon). Runs on CLI startup and gateway boot when the last run is older than interval_hours (default 24) AND the agent has been idle for min_idle_hours (default 2). Invariants (all load-bearing): - Never touches bundled or hub-installed skills (.bundled_manifest + .hub/lock.json double-filter) - Never auto-deletes: archive only. Archives are recoverable via `hermes curator restore <skill>` - Pinned skills bypass all auto-transitions - Uses the aux client; never touches the main session's prompt cache New files: - tools/skill_usage.py: sidecar .usage.json telemetry, atomic writes, provenance filter - agent/curator.py: orchestrator covering config, idle gating, state-machine transitions (pure, no LLM), forked-agent review prompt - hermes_cli/curator.py: `hermes curator {status,run,pause,resume,pin,unpin,restore}` subcommand - tests/tools/test_skill_usage.py: 29 tests - tests/agent/test_curator.py: 25 tests Modified files (surgical patches): - tools/skills_tool.py: bump view_count on successful skill_view - tools/skill_manager_tool.py: bump patch_count on skill_manage patch/edit/write_file/remove_file; forget record on delete - hermes_cli/config.py: add the curator: section to DEFAULT_CONFIG - hermes_cli/commands.py: add /curator CommandDef with subcommands - hermes_cli/main.py: register `hermes curator` subparser via register_cli() from hermes_cli.curator - cli.py: /curator slash-command dispatch + startup hook - gateway/run.py: gateway-boot hook (mirrors CLI) Validation: - 54 new tests across skill_usage + curator, all passing in 3s - 346 tests across all touched files' neighbors green - 2783 tests across hermes_cli/ + gateway/test_run_progress_topics.py green - CLI smoke: `hermes curator status/pause/resume` work end-to-end Companion to PR #16026 (class-first skill review prompt). Together they form a loop: the review prompt stops near-duplicate skill creation at the source, and the curator prunes/consolidates what still accumulates. Refs #7816.	2026-04-26 06:08:39 -07:00
Teknium	83c1c201f6	feat(onboarding): contextual first-touch hints for /busy and /verbose (#16046 ) Instead of a blocking first-run questionnaire, show a one-time hint the first time the user hits each behavior fork: 1. First message while the agent is working — appends a hint to the busy-ack explaining the /busy queue vs /busy interrupt knob, phrased to match the mode that was just applied (don't tell a queue-mode user to switch to queue). 2. First tool that runs for >= 30s in the noisiest progress mode (tool_progress: all) — prints a hint about /verbose to cycle display modes (all -> new -> off -> verbose). Gated on /verbose actually being usable on the surface: always shown on CLI; on gateway only shown when display.tool_progress_command is enabled. Each hint is latched in config.yaml under onboarding.seen.<flag>, so it fires exactly once per install across CLI, gateway, and cron, then never again. Users can wipe the section to re-see hints. New: - agent/onboarding.py — is_seen / mark_seen / hint strings, shared by both CLI and gateway. - onboarding.seen in DEFAULT_CONFIG (hermes_cli/config.py) and in load_cli_config defaults (cli.py). No _config_version bump — deep merge handles new keys. Wired: - gateway/run.py: _handle_active_session_busy_message appends the hint after building the ack. progress_callback tracks tool.completed duration and queues the tool-progress hint into the progress bubble. - cli.py: CLI input loop appends the busy-input hint on the first busy Enter; _on_tool_progress appends the tool-progress hint on the first >=30s tool completion. In-memory CLI_CONFIG is also updated so subsequent fires in the same process are suppressed immediately. All writes go through atomic_yaml_write and are wrapped in try/except so onboarding can never break the input/busy-ack paths.	2026-04-26 06:06:27 -07:00
Teknium	4bda9dcade	fix(gateway): honor voice.auto_tts config in auto-TTS gate (#16007) (#16039) The base adapter's auto-TTS path fired on any voice message unless the chat had explicitly run /voice off; it never read voice.auto_tts from config.yaml, so users who set auto_tts: false still got audio replies. Gate the base adapter on a three-layer decision instead: 1. chat in _auto_tts_enabled_chats (explicit /voice on|tts) → fire 2. chat in _auto_tts_disabled_chats (explicit /voice off) → suppress 3. else → voice.auto_tts global default Runner now pushes voice.auto_tts onto the adapter as _auto_tts_default and mirrors /voice on|tts chats into _auto_tts_enabled_chats via the existing _sync_voice_mode_state_to_adapter path. /voice off still wins. Closes #16007.	2026-04-26 05:52:05 -07:00
Teknium	67dcace412	docs(config): show options in comments for display settings (#16038 ) Users who run `hermes setup` get `cli-config.yaml.example` copied verbatim (including comments) to ~/.hermes/config.yaml. But several display settings had thin comments that didn't enumerate the valid options, so users couldn't tell from reading their config what values each key accepts. - busy_input_mode: widen from 'CLI' to 'CLI and gateway platforms'; note /stop as gateway equivalent of Ctrl+C; add /busy_input_mode runtime hint - compact, interim_assistant_messages, bell_on_complete, show_reasoning, streaming: add true/false option lines showing effect of each value - skin: refresh the built-in skin list (was missing daylight, warm-lightmode, poseidon, sisyphus, charizard — 5 of 9 built-ins undocumented)	2026-04-26 05:51:37 -07:00
Teknium	35c57cc46b	fix(gateway): suppress tool-progress bubbles after interrupt (#16034 ) When the LLM response carries N parallel tool calls, the agent fires N tool.started events back-to-back before its interrupt check runs. A user sending /stop mid-batch would see the '⚡ Interrupting current task' ack followed by a trail of 🔍 web_search bubbles for the remaining events in the batch — making the interrupt feel ignored. progress_callback and the drain loop in send_progress_messages now check agent.is_interrupted (via agent_holder[0], the existing cross-scope handle). Events that arrive after interrupt are dropped at both the queueing and rendering stages. The '⚡ Interrupting' message is sent through a separate adapter path and is unaffected.	2026-04-26 05:47:37 -07:00
Teknium	e8441c4c0f	fix(clipboard): report native/tmux success, keep Ctrl+Shift+C on dashboard Follow-up on #16020 salvage. Three corrections: 1. Truth signal for /copy Before: success was 'OSC 52 sequence was emitted to stdout'. That's false on local Linux inside tmux (emitSequence=false), so /copy kept printing 'clipboard copy failed' to users whose xclip/wl-copy had already succeeded fire-and-forget. Fix: setClipboard() now returns { sequence, success } where success = native-fired OR tmux-buffer-loaded OR osc52-emitted. copyNative() returns a boolean telling setClipboard whether a native attempt was made. /copy only shows 'failed' when literally no path was taken. 2. Dashboard keybinding Before: Ctrl+C for copy on non-Mac (Ctrl+Shift+C for paste). That swallows SIGINT when a stale selection is present and breaks the xterm/gnome-terminal/konsole/Windows-Terminal convention where Ctrl+C in a terminal emulator is always SIGINT. The real bug was that clipboard writes lost user-gesture through OSC-52 round-trips, which the direct writeText already fixes. Fix: revert copyModifier to Ctrl+Shift+C on non-Mac. Direct writeText in the keydown handler preserves user gesture. term.write Escape replaced with term.clearSelection() (works without relying on TUI input mode). 3. Error toast text Before: 'see HERMES_TUI_DEBUG_CLIPBOARD' — tells users how to debug but not how to fix. Fix: point users at HERMES_TUI_FORCE_OSC52=1 first (the actual escape hatch), mention the debug var second.	2026-04-26 05:46:45 -07:00
Harry Riddle	2511207cb0	chore: revert docs	2026-04-26 05:46:45 -07:00
Harry Riddle	0f3a6f0fb3	fix(clipboard): dashboard Ctrl+C direct copy; TUI honest feedback; HERMES_TUI_FORCE_OSC52 - Dashboard copy: direct Clipboard API on Ctrl+C/Cmd+C (user gesture); send Escape to TUI to clear selection; Ctrl+Shift+C kept as fallback. - TUI /copy: copySelection() async; only reports success if OSC52 emitted. - Add HERMES_TUI_FORCE_OSC52 env var to override native-tool detection. - Fixes "copied N chars" false-positive when clipboard backend absent. Changes: web/src/pages/ChatPage.tsx — direct navigator.clipboard.writeText ui-tui/packages/hermes-ink/src/ink/ink.tsx — async copySelection ui-tui/packages/hermes-ink/src/ink/termio/osc.ts — HERMES_TUI_FORCE_OSC52 ui-tui/src/app/slash/commands/core.ts — async /copy with honest feedback	2026-04-26 05:46:45 -07:00
Harry Riddle	a562420383	fix(tui): robust clipboard handling with debug logging and headless detection Problem: Ctrl+C in Hermes TUI shows 'copied' but clipboard often empty. Root causes: - Native Linux tools (xclip, wl-copy) require DISPLAY/WAYLAND_DISPLAY; in headless Docker/SSH they fail or hang. - OSC 52 fallback requires terminal emulator support; when absent, sequence is dropped silently. - Dashboard OSC 52 → Clipboard API path fails due to missing user gesture; errors were silently caught. - User feedback 'copied selection' was shown unconditionally, regardless of success. Solution implemented: - Short-circuit Linux native clipboard probing when no display server is present (no DISPLAY and no WAYLAND_DISPLAY). Avoids futile attempts and timeouts. - Add HERMES_TUI_DEBUG_CLIPBOARD env var (1/true). When set, TUI logs to stderr which clipboard path is used, probe results on Linux, and whether OSC 52 was emitted. Greatly improves diagnosability. - Improve dashboard clipboard error handling: replace empty catch blocks with console.warn messages for OSC 52 decode/write failures and direct copy/paste errors. Makes browser permission/user-gesture failures visible in DevTools. - Add comprehensive clipboard troubleshooting documentation to README and AGENTS, covering OSC 52 verification, tmux config, Docker/headless constraints, env vars, dashboard caveats, and fallback strategies. Technical details: - in ui-tui/packages/hermes-ink/src/ink/termio/osc.ts: - Early return on Linux if both DISPLAY and WAYLAND_DISPLAY unset. - Refactor probe sequence to async with 500ms timeout, caching result; subsequent copies use cached tool immediately. - Emit debug logs when HERMES_TUI_DEBUG_CLIPBOARD=1. - in ink.tsx: log when OSC 52 not emitted (native or tmux path in use) in debug mode. - : OSC 52 handler and Ctrl+Shift+C handler now log warnings to console on Clipboard API rejection with error message. - Documentation: new 'Clipboard Troubleshooting' section in README; new 'Clipboard environment variables and pitfalls' subsection in AGENTS.md (Known Pitfalls). Tests: full ui-tui test suite (292 tests) passes; clipboard and OSC tests unaffected. No breaking changes. Files changed: - ui-tui/packages/hermes-ink/src/ink/termio/osc.ts - ui-tui/packages/hermes-ink/src/ink/ink.tsx - web/src/pages/ChatPage.tsx - README.md - AGENTS.md - CHANGELOG.md (new)	2026-04-26 05:46:45 -07:00
Teknium	855366909f	feat(models): remote model catalog manifest for OpenRouter + Nous Portal (#16033 ) OpenRouter and Nous Portal curated picker lists now resolve via a JSON manifest served by the docs site, falling back to the in-repo snapshot when unreachable. Lets us update model lists without shipping a release. Live URL: https://hermes-agent.nousresearch.com/docs/api/model-catalog.json (source at website/static/api/model-catalog.json; auto-deploys via the existing deploy-site.yml GitHub Pages pipeline on every merge to main). Schema (v1) carries id + optional description + free-form metadata at manifest, provider, and model levels. Pricing and context length stay live-fetched via existing machinery (/v1/models endpoints, models.dev). Config (new model_catalog section, default enabled): model_catalog.url master manifest URL model_catalog.ttl_hours disk cache TTL (default 24h) model_catalog.providers.<name>.url optional per-provider override Fetch pipeline: in-process cache -> disk cache (fresh < TTL) -> HTTP fetch -> disk-cache-on-failure fallback -> in-repo snapshot as last resort. Never raises to callers; at worst returns the bundled list. Changes: - website/static/api/model-catalog.json initial manifest (35 OR + 31 Nous) - scripts/build_model_catalog.py regenerator from in-repo lists - hermes_cli/model_catalog.py fetch + validate + cache module - hermes_cli/models.py fetch_openrouter_models() + new get_curated_nous_model_ids() - hermes_cli/main.py, hermes_cli/auth.py Nous flows use the helper - hermes_cli/config.py model_catalog defaults - website/docs/reference/model-catalog.md + sidebars.ts - tests/hermes_cli/test_model_catalog.py 21 tests (validation, fetch success/failure, accessors, disabled, overrides, integration)	2026-04-26 05:46:43 -07:00
Teknium	d09ab8ff13	fix(mcp-oauth): preserve server_url path for protected-resource validation (#16031 ) Stop pre-stripping the path from the configured MCP server URL before constructing OAuthClientProvider. The MCP SDK strips the path itself via OAuthContext.get_authorization_base_url() for authorization-server discovery, but uses the full server_url through resource_url_from_server_url() + check_resource_allowed() to validate against the server's RFC 9728 Protected Resource Metadata. For servers whose PRM advertises a path-scoped resource (e.g. Notion's https://mcp.notion.com/mcp), our _parse_base_url() collapsed the URL to the origin, so check_resource_allowed() saw requested='/' vs configured='/mcp/' and refused the token. Fixes OAuth against Notion MCP (and any other path-scoped resource). Closes #16015.	2026-04-26 05:43:54 -07:00
Teknium	438db0c7b0	fix(cli): /model picker honors provider-specific context caps (#16030) `_apply_model_switch_result` (the interactive `/model` picker's confirmation path) printed `ModelInfo.context_window` straight from models.dev, which reports the vendor-wide value (1.05M for gpt-5.5 on openai). ChatGPT Codex OAuth caps the same slug at 272K, so the picker showed 1M while the runtime (compressor, gateway `/model`, typed `/model <name>`) correctly used 272K: the classic 'sometimes 1M, sometimes 272K' mismatch on a single model. Both display paths now go through `resolve_display_context_length()`, matching the fix that `_handle_model_switch` received earlier. Also bump the stale last-resort fallback in DEFAULT_CONTEXT_LENGTHS (`gpt-5.5: 400000 -> 1050000`) to match the real OpenAI API value; the 272K Codex cap is already enforced via the Codex-OAuth branch, so the fallback now reflects what every non-Codex probe-miss should see. Tests: adds `test_apply_model_switch_result_context.py` with three scenarios (Codex cap wins, OpenRouter shows 1.05M, resolver-empty falls back to ModelInfo). Updates the existing non-Codex fallback test to assert 1.05M (the correct value). Validation (path: before -> after): - picker -> gpt-5.5 on Codex: 1,050,000 -> 272,000 - picker -> gpt-5.5 on OpenAI: 1,050,000 -> 1,050,000 - picker -> gpt-5.5 on OpenRouter: 1,050,000 -> 1,050,000 - typed /model gpt-5.5 on Codex: 272,000 -> 272,000	2026-04-26 05:43:31 -07:00
zkl	2ccdadcca6	fix(deepseek): bump V4 family context window to 1M tokens #14934 added deepseek-v4-pro / deepseek-v4-flash to the DeepSeek native provider but the context-window lookup still falls back to the existing "deepseek" substring entry (128K). DeepSeek V4 ships with a 1M context window, so any caller relying on get_model_context_length() for pre-flight token budgeting (compression, context warnings) under-counts by ~8x. Add explicit lowercase entries for the four DeepSeek model ids that ship 1M context: - deepseek-v4-pro - deepseek-v4-flash - deepseek-chat (legacy alias, server-side maps to v4-flash non-thinking) - deepseek-reasoner (legacy alias, server-side maps to v4-flash thinking) Longest-key-first substring matching means these explicit entries also cover the vendor-prefixed forms (deepseek/deepseek-v4-pro on OpenRouter and Nous Portal) without regressing the existing 128K fallback for older / unknown DeepSeek model ids on custom endpoints. Source: https://api-docs.deepseek.com/zh-cn/quick_start/pricing	2026-04-26 05:32:54 -07:00
Teknium	76042f5867	feat(review): class-first skill review prompt (#16026 ) The background skill-review prompt (spawned after N user turns) now instructs the reviewer to SURVEY existing skills first, identify the CLASS of task, and PREFER updating/generalizing an existing skill over creating a new narrow one. This reduces near-duplicate skill accumulation at the source. Catches the common failure mode where repeated tasks of the same class each spawn their own specific skill ("fix-my-tauri-error", "fix-my-electron-error") instead of a single class-level skill ("desktop-app-build-troubleshooting"). Applied to both _SKILL_REVIEW_PROMPT and the Skills half of _COMBINED_REVIEW_PROMPT. Memory-only review prompt unchanged. Groundwork for the Curator feature (issue #7816) — the creation-side fix. Curator handles the retirement/consolidation side in a follow-up PR. Tests assert the behavioral instructions are present (survey, class, update- over-create, overlap-flagging, opt-out clause) rather than snapshotting the full prompt text.	2026-04-26 05:17:10 -07:00
Teknium	192e7eb21f	fix(nous): don't trip cross-session rate breaker on upstream-capacity 429s (#15898 ) Nous Portal multiplexes multiple upstream providers (DeepSeek, Kimi, MiMo, Hermes) behind one endpoint. Before this fix, any 429 on any of those models recorded a cross-session file breaker that blocked EVERY model on Nous for the cooldown window -- even though the caller's own RPM/RPH/TPM/TPH buckets were healthy. Users hit a DeepSeek V4 Pro capacity error, restarted, switched to Kimi 2.6, and still got 'Nous Portal rate limit active -- resets in 46m 53s'. Nous already emits the full x-ratelimit-* header suite on every response (captured by rate_limit_tracker into agent._rate_limit_state). We now gate the breaker on that data: trip it only when either the 429's own headers or the last-known-good state show a bucket with remaining == 0 AND a reset window >= 60s. Upstream-capacity 429s (healthy buckets everywhere, but upstream out of capacity) fall through to normal retry/fallback and the breaker is never written. Note: the in-memory 'restart TUI/gateway to clear' workaround circulated in Discord does NOT work -- the breaker is file-backed at ~/.hermes/rate_limits/nous.json. The workaround for users still affected by a bad state file is to delete it. Reported in Discord by CrazyDok1 and KYSIV (Apr 2026).	2026-04-26 04:53:42 -07:00
Teknium	59b56d445c	feat(hooks): add duration_ms to post_tool_call + transform_tool_result (#15429 ) Plugin hooks fired after a tool dispatch now receive an integer duration_ms kwarg measuring how long the tool's registry.dispatch() call took (time.monotonic() before/after). Inspired by Claude Code 2.1.119 which added the same field to PostToolUse hook inputs. Wire points: - model_tools.py: measure dispatch latency, pass duration_ms to invoke_hook("post_tool_call", ...) and invoke_hook("transform_tool_result", ...) - hermes_cli/hooks.py: include duration_ms in the synthetic payload used by 'hermes hooks test' and 'hermes hooks doctor' so shell-hook authors see the same shape at development time as runtime - shell hooks (agent/shell_hooks.py): no code change needed; _serialize_payload already surfaces non-top-level kwargs under payload['extra'], so duration_ms lands at extra.duration_ms for shell-hook scripts Plugin authors can now build latency dashboards, per-tool SLO alerts, and regression canaries without having to wrap every tool manually. Test: tests/test_model_tools.py::test_post_tool_call_receives_non_negative_integer_duration_ms E2E: real PluginManager + dispatch monkey-patched with a 50ms sleep, hook callback observes duration_ms=50 (int). Refs: https://code.claude.com/docs/en/changelog (2.1.119, Apr 23 2026)	2026-04-25 22:13:12 -07:00
Teknium	eb28145f36	feat(approval): hardline blocklist for unrecoverable commands (#15878) Adds a floor below --yolo: a tiny set of commands so catastrophic they should never run via the agent, regardless of --yolo, gateway /yolo, approvals.mode=off, or cron approve mode. Opting into yolo is trusting the agent with your files and services, not trusting it to wipe the disk or power the box off. The list is deliberately small (12 patterns), covering only unrecoverable ops: - rm -rf targeting /, /home, /etc, /usr, /var, /boot, /bin, /sbin, /lib, ~, $HOME - mkfs (any variant) - dd + redirection to raw block devices (/dev/sd, /dev/nvme, etc.) - fork bomb - kill -1 / kill -9 -1 - shutdown, reboot, halt, poweroff, init 0/6, telinit 0/6, systemctl poweroff/reboot/halt/kexec Recoverable-but-costly commands (git reset --hard, rm -rf /tmp/x, chmod -R 777, curl | sh) stay in DANGEROUS_PATTERNS where yolo can still pass them through; that's what yolo is for. Container backends (docker/singularity/modal/daytona) continue to bypass both hardline and dangerous checks, since nothing they do can touch the host. Inspired by Mercury Agent's permission-hardened blocklist.	2026-04-25 22:07:12 -07:00
Teknium	a55de5bcd0	feat(setup): auto-reconfigure on existing installs (#15879) Bare `hermes setup` on a returning user now drops straight into the full reconfigure wizard: every prompt shows the current value as its default, press Enter to keep or type a new value to change it. The returning-user menu is gone. Behavior: - First-time user: first-time wizard (unchanged) - Returning user, bare command: full reconfigure wizard (new default) - Returning user, `--quick`: only prompt for missing/unset items - Returning user, one section: `hermes setup model|terminal|gateway|tools|agent` - `--reconfigure`: preserved as backwards-compat alias (no-op since it's now default) The section functions already used current values as prompt defaults; this change just removes the extra click to get to them. The 'Quick Setup - configure missing items only' menu option is now exposed as the explicit `--quick` flag; it's the narrow case of filling in missing config (e.g. after a partial OpenClaw migration or when a required API key got cleared). Inspired by Mercury Agent's `mercury doctor` UX. Also removes: - RETURNING_USER_MENU_SECTION_KEYS (orphaned constant) - Two returning-user menu tests in test_setup_noninteractive.py (guarding behavior that no longer exists; covered by test_setup_reconfigure.py instead)	2026-04-25 22:02:02 -07:00
brooklyn!	cec0af02ad	Merge pull request #15870 from NousResearch/bb/fix-skills-search fix(tui): restore skills search RPC	2026-04-25 22:13:28 -05:00
Brooklyn Nicholson	91a7a0acbe	fix(tui): restore skills search RPC	2026-04-25 22:11:52 -05:00
Teknium	7c50ed707c	docs(azure-foundry): add provider guide, env vars, release AUTHOR_MAP - New website/docs/guides/azure-foundry.md covering both OpenAI-style and Anthropic-style endpoints, auto-detection behaviour, gpt-5.x routing, /v1 stripping, api-version query forwarding, and the provider: anthropic + Azure URL alternative setup. - environment-variables.md picks up AZURE_FOUNDRY_API_KEY, AZURE_FOUNDRY_BASE_URL, AZURE_ANTHROPIC_KEY. - cli-commands.md includes azure-foundry in the provider choices list. - configuration.md lists azure-foundry among auxiliary-task providers. - sidebars.ts wires the new guide into the Guides section. - scripts/release.py AUTHOR_MAP entries for TechPrototyper, HangGlidersRule (noreply), and pein892 so the contributor-attribution CI check does not reject the salvage.	2026-04-25 18:48:43 -07:00
Teknium	731e1ef8cb	feat(azure-foundry): auto-detect transport, models, context length The azure-foundry wizard now probes the endpoint before asking the user to pick anything by hand: 1. URL path sniff — endpoints ending in /anthropic are Azure Foundry Claude routes and skip to anthropic_messages. 2. GET <base>/models probe — if the endpoint returns an OpenAI-shaped model list, we switch to chat_completions and prefill the picker with the returned deployment/model IDs. 3. Anthropic Messages probe — fallback for endpoints that don't expose /models but do speak the Anthropic Messages shape. 4. Manual fallback — private endpoints / custom routes still work; the user picks API mode + types a deployment name. Context length for the selected model is resolved through the existing agent.model_metadata.get_model_context_length chain (models.dev, provider metadata, hardcoded family fallbacks) and stored in model.context_length when a non-default value is found. Also refactors runtime_provider so Azure Foundry resolution is reused between the explicit-credentials path and the default top-level path — previously the /v1 strip for Anthropic-style Azure only ran when the caller passed explicit_* args, which meant config-driven sessions hit a double-/v1 URL. New module hermes_cli/azure_detect.py with 19 unit tests covering: - path sniff, model ID extraction, probe fallbacks - HTTP error handling (URLError, HTTPError) - context-length lookup passthrough - DEFAULT_FALLBACK_CONTEXT rejection New runtime tests cover: - OpenAI-style Azure Foundry - Anthropic-style Azure Foundry with /v1 stripping - Missing base_url / API key raising AuthError Rationale: Microsoft confirms there's no pure-API-key endpoint to list Azure deployments (that requires ARM management auth). The v1 Azure OpenAI endpoint does expose /models with the resource's available model catalog, which is good enough for picker prefill in the common case. Users on private/gated endpoints fall through to manual entry.	2026-04-25 18:48:43 -07:00
akhater	ac57114284	fix(agent): support Azure OpenAI gpt-5.x on chat/completions endpoint Azure OpenAI exposes an OpenAI-compatible endpoint at `{resource}.openai.azure.com/openai/v1` that accepts the standard `openai` Python client. Two issues prevented gpt-5.x models from working: 1. `_max_tokens_param()` only sent `max_completion_tokens` for `api.openai.com` URLs. Azure also requires `max_completion_tokens` for gpt-5.x models. 2. The `codex_responses` upgrade gate unconditionally upgraded gpt-5.x to Responses API. Azure does NOT support the Responses API — it serves gpt-5.x on the regular `/chat/completions` path, causing a 404. Fix: add `_is_azure_openai_url()` that matches `openai.azure.com` URLs. - `_max_tokens_param()` now returns `max_completion_tokens` for Azure. - The `codex_responses` upgrade gate skips Azure so gpt-5.x stays on `chat_completions` where Azure actually serves it. - The fallback-provider api_mode picker also recognises Azure and stays on chat_completions. - Tests cover max_tokens routing, api_mode behaviour, and URL detection. gpt-4.x models on Azure are unaffected (already used chat_completions + max_tokens, which Azure accepts for those models). Salvage of PR #10086 — rewritten against current main where the codex_responses upgrade gate gained copilot-acp / explicit-api_mode exclusions.	2026-04-25 18:48:43 -07:00
pein892	24b4b24d79	fix: preserve URL query params for Azure OpenAI and custom endpoints Azure OpenAI requires an `api-version` query parameter on every request. When users include it in the base_url (e.g. `?api-version=2025-04-01-preview`), the OpenAI SDK silently drops it during URL construction, causing 404 errors. Extract query params from base_url and pass them via `default_query` so the SDK appends them to every request. This is a generic solution that works for any custom endpoint requiring query parameters, not just Azure. No-op for URLs without query params — fully backward compatible.	2026-04-25 18:48:43 -07:00
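The query-parameter fix described in 24b4b24d79 can be illustrated with a minimal Python sketch. It is not the code from the commit: the helper name `split_base_url_query` and the `my-resource` host are hypothetical, and only the idea (strip the query from `base_url`, hand it to the SDK as `default_query`) comes from the commit message.

```python
from urllib.parse import parse_qsl, urlsplit, urlunsplit


def split_base_url_query(base_url: str) -> tuple[str, dict[str, str]]:
    """Return (base_url without its query string, query params as a dict).

    Hypothetical helper for illustration; the real change may be structured differently.
    """
    parts = urlsplit(base_url)
    # keep_blank_values=True so a parameter with an empty value is not dropped
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    # Rebuild the URL with an empty query component
    clean = urlunsplit((parts.scheme, parts.netloc, parts.path, "", parts.fragment))
    return clean, query


clean_url, default_query = split_base_url_query(
    "https://my-resource.openai.azure.com/openai/v1?api-version=2025-04-01-preview"
)
# The two values could then be passed to the OpenAI Python client, e.g.
# OpenAI(base_url=clean_url, default_query=default_query or None, api_key=...)
```

For a URL without a query string the helper returns the URL unchanged and an empty dict, which matches the "no-op for URLs without query params" behaviour the commit describes.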
HangGlidersRule	c15064fa37	fix: pass api-version as default_query param, not in base_url — SDK was producing malformed URLs like /anthropic?api-version=.../v1/messages	2026-04-25 18:48:43 -07:00
HangGlidersRule	7bfa9442de	fix: skip OAuth token refresh for Azure Anthropic endpoints — prevents ~/.claude/.credentials.json from overwriting Azure key mid-session	2026-04-25 18:48:43 -07:00
HangGlidersRule	d8e4c7214e	fix: Azure Anthropic short-circuit in resolve_runtime_provider — bypass custom runtime when provider=anthropic + azure.com URL	2026-04-25 18:48:43 -07:00
HangGlidersRule	6ef3a47ce5	fix: use Azure API key directly for Azure endpoints, bypass OAuth token priority chain	2026-04-25 18:48:43 -07:00
TechPrototyper	3a7653dd1f	feat: Add Azure Foundry provider with OpenAI/Anthropic API mode selection Add support for Azure Foundry as a new inference provider. Azure Foundry endpoints can use either OpenAI-style (/v1/chat/completions) or Anthropic-style (/v1/messages) API formats. Changes: - Add azure-foundry to PROVIDER_REGISTRY (auth.py) - Add azure-foundry overlay in HERMES_OVERLAYS (providers.py) - Add empty model list for azure-foundry (models.py) - Add _model_flow_azure_foundry() interactive setup (main.py) - Add azure-foundry runtime resolution with api_mode support (runtime_provider.py) - Add AZURE_FOUNDRY_API_KEY and AZURE_FOUNDRY_BASE_URL env vars (config.py) Usage: hermes model -> More providers -> Azure Foundry The setup wizard prompts for: - Endpoint URL - API format (OpenAI or Anthropic-style) - API key - Model name Configuration is saved to config.yaml (model.provider, model.base_url, model.api_mode, model.default) and ~/.hermes/.env (AZURE_FOUNDRY_API_KEY).	2026-04-25 18:48:43 -07:00
Teknium	125de02056	fix(context): honor custom_providers context_length on /model switch + bump probe tier to 256K (#15844) Fixes #15779. Custom-provider per-model context_length (`custom_providers[].models.<id>.context_length`) is now honored across every resolution path, not just agent startup. Also adds 256K as the top probe tier and default fallback. ## What changed New helper `hermes_cli.config.get_custom_provider_context_length()` — single source of truth for the per-model override lookup, with trailing-slash-insensitive base-url matching. `agent.model_metadata.get_model_context_length()` gains an optional `custom_providers=` kwarg (step 0b — runs after explicit `config_context_length` but before every other probe). Wired through five call sites that previously either duplicated the lookup or ignored it entirely: - `run_agent.py` startup — refactored to use the new helper (dedups legacy inline loop, keeps invalid-value warning) - `AIAgent.switch_model()` — re-reads custom_providers from live config on every /model switch - `hermes_cli.model_switch.resolve_display_context_length()` — new `custom_providers=` kwarg - `gateway/run.py` /model confirmation (picker callback + text path) - `gateway/run.py` `_format_session_info` (/info) ## Context probe tiers `CONTEXT_PROBE_TIERS = [256_000, 128_000, 64_000, 32_000, 16_000, 8_000]` — was `[128_000, ...]`. `DEFAULT_FALLBACK_CONTEXT` follows tier[0], so unknown models now default to 256K. The stale `128000` literal in the OpenRouter metadata-miss path is replaced with `DEFAULT_FALLBACK_CONTEXT` for consistency. ## Repro (from #15779) ```yaml custom_providers: - name: my-custom-endpoint base_url: https://example.invalid/v1 model: gpt-5.5 models: gpt-5.5: context_length: 1050000 ``` `/model gpt-5.5 --provider custom:my-custom-endpoint` → previously "Context: 128,000", now "Context: 1,050,000". ## Tests - `tests/hermes_cli/test_custom_provider_context_length.py` — new file, 19 tests covering the helper, step-0b integration, and the 256K tier invariants - `tests/hermes_cli/test_model_switch_context_display.py` — added regression tests for #15779 through the display resolver - `tests/gateway/test_session_info.py` — updated default-fallback assertion (128K → 256K) - `tests/agent/test_model_metadata.py` — updated tier assertions for the new top tier	2026-04-25 18:47:53 -07:00
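The per-model override lookup from 125de02056 can be sketched roughly as below. This is an assumption-laden illustration, not the real `get_custom_provider_context_length()`: the function name `lookup_context_length`, its signature, and the validation rule are hypothetical; only the config shape and the trailing-slash-insensitive matching come from the commit message.

```python
from typing import Any, Optional


def lookup_context_length(
    custom_providers: list[dict[str, Any]], base_url: str, model: str
) -> Optional[int]:
    """Hypothetical stand-in for the per-model context_length override lookup."""
    wanted = base_url.rstrip("/")  # compare base URLs without trailing slashes
    for provider in custom_providers:
        if str(provider.get("base_url", "")).rstrip("/") != wanted:
            continue
        entry = (provider.get("models") or {}).get(model)
        if not isinstance(entry, dict):
            continue
        value = entry.get("context_length")
        # Assumed rule: accept only positive integers; anything else falls through
        if isinstance(value, int) and not isinstance(value, bool) and value > 0:
            return value
    return None


# Same shape as the repro config quoted in the commit message
providers = [
    {
        "name": "my-custom-endpoint",
        "base_url": "https://example.invalid/v1",
        "model": "gpt-5.5",
        "models": {"gpt-5.5": {"context_length": 1050000}},
    }
]
override = lookup_context_length(providers, "https://example.invalid/v1/", "gpt-5.5")
```

Returning `None` when nothing matches lets a caller continue to the next resolution step, which is how the commit describes the probe chain.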
Teknium	4c591c2819	chore(release): map fqsy1416@gmail.com to EKKOLearnAI	2026-04-25 18:40:35 -07:00
Teknium	01535a4732	fix(api_server): cap stop-run wait at 5s so interrupt can't hang handler task.cancel() can't preempt the run_in_executor thread running run_conversation(), so we rely on agent.interrupt() to wake the loop. Without a timeout, a slow/unresponsive interrupt blocks the HTTP response indefinitely. Wrap the await in wait_for(shield(task), 5.0) and log a warning on timeout. Also tidy one extra space in the module docstring's /stop entry.	2026-04-25 18:40:35 -07:00
ekko	0a15dbdc43	feat(api_server): add POST /v1/runs/{run_id}/stop endpoint Add ability to interrupt a running agent via the runs API. Previously /v1/runs could start a run and subscribe to events, but there was no way to cancel it. The new endpoint stores agent and task references during execution, calls agent.interrupt() to stop LLM calls, then cancels the asyncio task. Includes 15 tests covering start, events, and stop scenarios.	2026-04-25 18:40:35 -07:00
Teknium	ce0513dd2e	chore(release): map Feranmi10 personal email	2026-04-25 18:39:55 -07:00
Oluwadare Feranmi	dc5e02ea7f	feat(cli): implement hermes update --check flag (fixes #10318)	2026-04-25 18:39:55 -07:00
brooklyn!	ff851ba7b9	Merge pull request #15821 from NousResearch/fix/tui-ctrl-g-editor fix: external editor handoff in CLI/TUI	2026-04-25 20:37:05 -05:00
Brooklyn Nicholson	14dd8e9a72	fix(tui): address Copilot review on editor handoff - resolveEditor() now returns argv (string[]) so EDITOR='code --wait' and VISUAL='emacsclient -t' tokenize correctly into spawnSync's separate command + args. Previously the whole string was passed as argv[0] and would ENOENT. - Skip the POSIX X_OK PATH walk on Windows; return ['notepad.exe'] there since fs.constants.X_OK is not meaningful and PATHEXT-based resolution would need its own implementation. - Surface openEditor() rejections via actions.sys instead of letting them become unhandled promise rejections in the useInput callback. - Hotkey docs/comment now say Cmd/Ctrl+G to match isAction()'s platform-action-modifier behavior (Cmd on macOS, Ctrl elsewhere).	2026-04-25 20:34:24 -05:00
Wysie	1d80e92c7e	test(discord): add guild to fake e2e messages	2026-04-25 18:25:56 -07:00
Teknium	edce7522a5	chore(release): add AUTHOR_MAP entry for voidborne-d personal email	2026-04-25 18:25:13 -07:00

5983 Commits