Point users to xAI's custom voices feature — clone your voice in the
console, paste the voice_id into tts.xai.voice_id. No code changes
needed; the existing TTS pipeline already handles arbitrary voice IDs.
- config.py: link to xAI custom voices docs in voice_id comment
- setup.py: prompt accepts custom voice IDs during xAI TTS setup
- tts.md: short section linking to xAI console and docs
* fix(gateway): config.yaml wins over .env for agent/display/timezone settings
Regression from the silent config→env bridge. The bridge at module import
time is correct for max_turns (unconditional overwrite), but every other
agent.*, display.*, timezone, and security bridge key was guarded by
'if X not in os.environ' — so a stale .env entry from an old 'hermes setup'
run would shadow the user's current config.yaml indefinitely.
Symptom: agent.max_turns: 500 in config.yaml, HERMES_MAX_ITERATIONS=60
in .env from an old setup, and the gateway silently capped at 60
iterations per turn. Gateway logs confirmed api_calls never exceeded 60.
Three changes:
1. gateway/run.py: drop the 'not in os.environ' guards for all agent.*,
display.*, timezone, and security.* bridge keys. config.yaml is now
authoritative for these settings — same semantics already in place
for max_turns, terminal.*, and auxiliary.*. Also surface the bridge
failure (previously 'except Exception: pass') to stderr so operators
see bridge errors instead of silently falling back to .env.
2. gateway/run.py: INFO-log the resolved max_iterations at gateway
start so operators can verify the config→env bridge did the right
thing instead of chasing a phantom budget ceiling.
3. hermes_cli/setup.py: stop writing HERMES_MAX_ITERATIONS to .env in
the setup wizard. config.yaml is the single source of truth. Also
clean up any stale .env entry left behind by pre-fix setups.
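A minimal sketch of the bridge change, assuming placeholder env-var names for
the non-max_turns keys and a generic cfg_get accessor (the real bridge lives in
gateway/run.py and covers many more keys):

```python
import os
import sys

# Illustration only: BRIDGE_KEYS uses placeholder env-var names except
# HERMES_MAX_ITERATIONS; the real mapping is larger.
BRIDGE_KEYS = {
    "agent.max_turns": "HERMES_MAX_ITERATIONS",
    "display.show_tool_calls": "HERMES_SHOW_TOOL_CALLS",   # hypothetical
    "timezone": "HERMES_TIMEZONE",                          # hypothetical
}

def bridge_config_to_env(cfg_get):
    """Copy config.yaml values into the process env for legacy readers."""
    for cfg_key, env_key in BRIDGE_KEYS.items():
        value = cfg_get(cfg_key)
        if value is None:
            continue
        # Old behavior: `if env_key not in os.environ:` let a stale .env entry
        # shadow config.yaml forever. New behavior: config.yaml always wins.
        os.environ[env_key] = str(value)

try:
    bridge_config_to_env(lambda key: None)  # real caller passes the config reader
except Exception as exc:
    # Surface bridge failures instead of the old silent `except Exception: pass`.
    print(f"config->env bridge failed: {exc}", file=sys.stderr)
```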
Regression tests in tests/gateway/test_config_env_bridge_authority.py
guard each config→env key against the 'stale .env shadows config' bug.
* fix(gateway): shutdown + restart hygiene (drain timeout, false-fatal, success log)
Three issues observed in production gateway.log during a rapid restart
chain on 2026-05-02, all fixed here.
1. _send_restart_notification logged unconditional success
adapter.send() catches provider errors (e.g. Telegram 'Chat not found')
and returns SendResult(success=False); it never raises. The caller
ignored the return value and always logged 'Sent restart notification
to <chat>' at INFO, producing a misleading success line directly
below the 'Failed to send Telegram message' traceback on every boot.
The caller now inspects result.success and logs a WARNING with the
error string on failure.
2. WhatsApp bridge SIGTERM on shutdown classified as fatal error
_check_managed_bridge_exit() saw the bridge's returncode -15 (our own
SIGTERM from disconnect()) and fired the full fatal-error path,
producing 'ERROR ... WhatsApp bridge process exited unexpectedly' plus
'Fatal whatsapp adapter error (whatsapp_bridge_exited)' on every
planned shutdown, immediately before the normal '✓ whatsapp
disconnected'. Fix: disconnect() now sets a _shutting_down flag before
terminating the bridge, and _check_managed_bridge_exit() returns None
for returncode in {0, -2, -15} while shutting down (see the sketch
after this list). OOM-kill (137) and other unexpected exits still hit
the fatal path.
3. restart_drain_timeout default 60s → 180s
On 2026-05-02 01:43:27 a user /restart fired while three agents were
mid-API-call (82s, 112s, 154s into their turns). The 60s drain budget
expired and all three were force-interrupted. 180s covers realistic
in-flight agent turns; users on very-long-reasoning models can still
raise it further via agent.restart_drain_timeout in config.yaml.
Existing explicit user values are preserved by deep-merge.
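A minimal sketch of the issue-2 suppression, with simplified class and method
shapes (the real adapter tracks more state and logging):

```python
# Exit codes treated as expected during a planned shutdown: clean exit,
# SIGINT, and our own SIGTERM from disconnect().
_EXPECTED_SHUTDOWN_CODES = {0, -2, -15}

class WhatsAppAdapterSketch:
    def disconnect(self):
        self._shutting_down = True        # set BEFORE terminating the bridge
        self._terminate_bridge()

    def _check_managed_bridge_exit(self, returncode):
        # getattr with a default keeps test helpers that bypass __init__ working.
        shutting_down = getattr(self, "_shutting_down", False)
        if shutting_down and returncode in _EXPECTED_SHUTDOWN_CODES:
            return None                   # planned shutdown, not a fatal error
        return f"whatsapp_bridge_exited (returncode={returncode})"  # e.g. 137

    def _terminate_bridge(self):
        pass                              # stands in for the real SIGTERM path
```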
Tests
- tests/gateway/test_restart_notification.py: two new tests assert INFO
is only logged on SendResult(success=True) and WARNING with the error
string is logged on SendResult(success=False).
- tests/gateway/test_whatsapp_connect.py: parametrized test for
returncode in {0, -2, -15} proves shutdown-time exits are suppressed;
separate test proves returncode 137 (SIGKILL/OOM) still surfaces as
fatal even when _shutting_down is set.
- _check_managed_bridge_exit() reads _shutting_down via getattr-with-
default so existing _make_adapter() test helpers that bypass __init__
(pitfall #17 in AGENTS.md) keep working unmodified.
Discord's per-command name limit is 32 chars. When two skill slugs
share the same first 32 chars (or a skill slug clamps onto a reserved
gateway command name), only the first seen wins — the second is
dropped from the /skill autocomplete. The old behavior incremented a
``hidden`` counter silently, so skill authors had no way to discover
the drop short of noticing their skill was missing from the picker.
Not an actively-biting bug today (no collisions on the default catalog
as of 2026-05), but a landmine the moment someone ships a skill with a
long name. The earlier series in #18745 / #18753 / #18754 dropped the
other silent data-loss paths in the Discord /skill collector; this one
lights up the last remaining one.
Fix: promote ``_names_used`` from a set to a dict keyed by the clamped
name, mapping to the source cmd_key (or a ``"<reserved>"`` sentinel
for names inherited via ``reserved_names``). On collision, log a
WARNING naming both sides — the winner, the loser, the clamped name,
and what to rename.
Two phrasings:
* skill-vs-skill — "both clamp to X on Discord's 32-char command-name
limit; only the winner appears in /skill. Rename one skill's
frontmatter ``name:`` to differ in its first 32 chars."
* skill-vs-reserved — "collides with a reserved gateway command name;
the skill will not appear in /skill. Rename the skill's frontmatter
``name:``."
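A sketch of the collision bookkeeping, assuming a simplified collector shape;
the real code emits the full warning strings quoted above:

```python
import logging

log = logging.getLogger(__name__)

DISCORD_NAME_LIMIT = 32
RESERVED_SENTINEL = "<reserved>"

def register_clamped_names(cmd_keys, reserved_names):
    """Map each clamped Discord name back to its source so collisions can be named."""
    names_used = {name: RESERVED_SENTINEL for name in reserved_names}
    hidden = 0
    for cmd_key in cmd_keys:
        clamped = cmd_key[:DISCORD_NAME_LIMIT]
        winner = names_used.get(clamped)
        if winner is None:
            names_used[clamped] = cmd_key
            continue
        hidden += 1
        if winner == RESERVED_SENTINEL:
            log.warning("%s collides with reserved gateway command %r; "
                        "it will not appear in /skill", cmd_key, clamped)
        else:
            log.warning("%s and %s both clamp to %r under Discord's 32-char limit; "
                        "only %s appears in /skill", winner, cmd_key, clamped, winner)
    return names_used, hidden
```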
Tests: three cases in
``tests/hermes_cli/test_discord_skill_clamp_warning.py`` —
skill-vs-skill collision (warning names both cmd_keys + clamped prefix),
skill-vs-reserved collision (warning uses the distinct phrasing), and a
no-collision negative (zero warnings emitted).
``discord_skill_commands_by_category`` was lagging the flat
``discord_skill_commands`` collector on two counts. Both were actively
dropping skills from Discord's ``/skill`` autocomplete dropdown.
1. External-dir skills were filtered out. #18741 widened the flat
collector to accept ``SKILLS_DIR + skills.external_dirs`` but left
this sibling collector — the one ``_register_skill_group`` actually
uses on Discord — still matching ``SKILLS_DIR`` only. External
skills were visible in ``hermes skills list`` and the agent's
``/skill-name`` dispatch but silently absent from Discord's
``/skill`` picker. Widen the accepted roots to match, and derive
categories from whichever root the skill lives under so
``<ext>/mlops/foo/SKILL.md`` still lands in the ``mlops`` group.
2. 25-group × 25-subcommand caps were still applied. PR #11580
refactored ``/skill`` to a flat autocomplete (whose options Discord
fetches dynamically — no per-command payload concern) and its
docstring promises "no hidden skills." The collector kept the old
nested-layout caps anyway, silently dropping anything past the 25th
alphabetical category. On installs with 29 category dirs today (real
example: tail categories ``social-media``, ``software-development``,
``yuanbao`` going missing) this was biting immediately. Remove the
caps; ``hidden`` now reports only 32-char name-clamp collisions
against reserved names.
Tests: guard both behaviors. ``test_no_legacy_25x25_cap`` builds 30
categories × 30 skills each and asserts all 900 are returned.
``test_external_dirs_skills_included`` monkeypatches
``get_external_skills_dirs`` and asserts an external-dir skill makes
it into the result grouped under its own top-level directory.
Path.read_text() uses the system locale by default. On Windows CN/JP/KR
locales (GBK/CP932/CP949), reading a UTF-8 .env raises UnicodeDecodeError
as soon as it contains any non-ASCII byte (e.g. an em dash).
Pin encoding="utf-8" on every .env read in hermes_cli to match how the
rest of the codebase (load_dotenv at doctor.py:26) already decodes it.
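The shape of the fix, shown on a hypothetical .env path (actual callsites vary
across hermes_cli):

```python
from pathlib import Path

env_path = Path.home() / ".hermes" / ".env"   # illustrative path
if env_path.exists():
    # Before: env_path.read_text() decoded with the locale's preferred encoding
    # (GBK/CP932/CP949 on CJK Windows), so any non-ASCII byte in a UTF-8 .env
    # raised UnicodeDecodeError. After: pin UTF-8, matching load_dotenv.
    text = env_path.read_text(encoding="utf-8")
```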
Adds a regression test that monkeypatches Path.read_text to simulate a
GBK locale and asserts 'hermes doctor' no longer raises.
Refs #18637
Skills configured through `skills.external_dirs` in config.yaml were
visible via `hermes skills list`, `get_skill_commands()`, and the
agent's `/skill-name` dispatch, but silently excluded from the
Telegram and Discord slash-command menus. The filter in
`_collect_gateway_skill_entries` only accepted skills whose
`skill_md_path` started with `SKILLS_DIR`, so anything under an
external directory fell through.
Widen the accepted-prefix set to include all configured external
dirs alongside the local skills dir. Every prefix is now
slash-terminated so `/my-skills` cannot also admit
`/my-skills-extra`. Also guard against empty `skill_md_path`
values so they can't accidentally match.
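A minimal sketch of the widened filter, with simplified argument names:

```python
import os

def skill_is_accepted(skill_md_path, skills_dir, external_dirs):
    """Sketch of the prefix filter; the real check lives in the gateway collector."""
    if not skill_md_path:
        return False  # empty paths must never match a prefix
    # Slash-terminate every root so "/my-skills" cannot also admit "/my-skills-extra".
    roots = [os.path.join(os.path.abspath(d), "") for d in [skills_dir, *external_dirs]]
    path = os.path.abspath(skill_md_path)
    return any(path.startswith(root) for root in roots)
```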
Fixes #8110
Salvages #8790 by luyao618.
Co-authored-by: Yao <34041715+luyao618@users.noreply.github.com>
* fix(curator): authoritative absorbed_into declarations on skill delete
Closes #18671. The classification pipeline that feeds cron-ref rewriting
used to infer consolidation vs pruning from two brittle signals: the
curator model's post-hoc YAML summary block, and a substring heuristic
scanning other tool calls for the removed skill's name. Both miss in
real consolidations — the model forgets the YAML under reasoning
pressure, and the heuristic misses when the umbrella's patch content
describes the absorbed behavior abstractly instead of naming the old
slug. When both miss, the skill falls through to the 'no-evidence
fallback' and is classified as pruned, and #18253's cron rewriter drops
the cron ref entirely instead
of mapping it to the umbrella. Same observable symptom as pre-#18253:
'Skill(s) not found and skipped' at the next cron run.
The fix makes the model declare intent at the moment of deletion.
skill_manage(action='delete') now accepts absorbed_into:
- absorbed_into='<umbrella>' -> consolidated, target must exist on disk
- absorbed_into='' -> explicit prune, no forwarding target
- missing -> legacy path, falls through to heuristic/YAML
The curator reconciler reads these declarations off llm_meta.tool_calls
BEFORE either the YAML block or the substring heuristic. Declaration
wins. Fallback logic stays intact for backward compat with any caller
(human or older curator conversation) that doesn't populate the arg.
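A sketch of the extraction and precedence order; the tool-call structure and
the 'skill_name' argument field are simplified placeholders:

```python
def extract_absorbed_into_declarations(tool_calls):
    """Collect absorbed_into declarations from skill_manage(delete) calls."""
    declarations = {}
    for call in tool_calls:
        if call.get("name") != "skill_manage":
            continue
        args = call.get("arguments", {})
        if args.get("action") != "delete" or "absorbed_into" not in args:
            continue  # arg missing: legacy path, YAML/heuristic decide later
        declarations[args.get("skill_name")] = args["absorbed_into"]
    return declarations

def classify_removed_skill(slug, declarations, fallback_classify):
    """Declarations are authoritative; the fallback runs only when none exists."""
    if slug in declarations:
        target = declarations[slug]
        return ("consolidated", target) if target else ("pruned", None)
    return fallback_classify(slug)  # existing YAML-block / substring-heuristic path
```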
Changes
- tools/skill_manager_tool.py: add absorbed_into param to skill_manage
+ _delete_skill. Validate target exists when non-empty. Reject
absorbed_into=<self>. Wire through dispatcher + registry + schema.
- agent/curator.py: new _extract_absorbed_into_declarations() walks
tool calls for skill_manage(delete) with the arg. _reconcile_classification
accepts absorbed_declarations= and treats them as authoritative. Curator
prompt updated to require the arg on every delete.
- Tests: 7 new skill_manager tests covering the tool contract (valid
target, empty string, nonexistent target, self-reference, whitespace,
backward compat, dispatcher plumbing). 11 new curator tests covering
the extractor + authoritative reconciler path + mixed-legacy-and-
declared runs.
Validation
- 307/307 targeted tests pass (curator + cron + skill_manager suites).
- E2E #18671 repro: 3 narrow skills, 1 umbrella, cron job referencing
all 3. Model emits NO YAML block. Heuristic misses (patch prose
doesn't name old slugs). Delete calls carry absorbed_into. Result:
both PR skills correctly classified 'consolidated' + cron rewritten
['pr-review-format', 'pr-review-checklist', 'stale-junk'] ->
['hermes-agent-dev']; stale-junk pruned via absorbed_into=''.
- E2E backward-compat: delete without absorbed_into, model emits YAML
-> routed via existing 'model' source, cron still rewritten correctly.
* feat(curator): capture + restore cron skill links across snapshot/rollback
Before this, rolling back a curator run restored the skills tree but cron
jobs still pointed at the umbrella skills the curator had rewritten them
to. The user would see their old narrow skills back on disk, but their
cron jobs were still configured with the merged umbrella — not actually
'back to how it was'.
Snapshot side: snapshot_skills() now captures ~/.hermes/cron/jobs.json
alongside the skills tarball, as cron-jobs.json. The manifest gets a new
'cron_jobs' block with {backed_up, jobs_count} so rollback (and the CLI
confirm dialog) can surface what's in the snapshot. If jobs.json is
missing/unreadable/malformed, snapshot proceeds without cron data — the
skills backup is the core guarantee; cron is additive.
Rollback side: after the skills extract succeeds, the new
_restore_cron_skill_links() reconciles the backed-up jobs into the live
jobs.json SURGICALLY. Only 'skills' and 'skill' fields are restored, and
only on jobs matched by id. Everything else about a cron job — schedule,
last_run_at, next_run_at, enabled, prompt, workdir, hooks — is live
state the user or scheduler has modified since the snapshot; overwriting
it would regress unrelated activity.
Reconciliation rules:
- Job in backup AND live, skills differ → skills restored.
- Job in backup AND live, skills match → no-op.
- Job in backup, NOT in live → skipped (user deleted it after snapshot;
  their choice is later than the snapshot).
- Job in live, NOT in backup → untouched (user created it after snapshot).
- Snapshot missing cron-jobs.json at all → rollback still succeeds and
  reports 'not captured' (older pre-feature snapshots keep working).
Writes go through cron.jobs.save_jobs under the same _jobs_file_lock the
scheduler uses, so rollback doesn't race tick().
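A sketch of the surgical reconciliation, assuming jobs are plain dicts keyed
by 'id' (field names simplified):

```python
def restore_cron_skill_links(backup_jobs, live_jobs):
    """Restore only skill-link fields onto live jobs matched by id."""
    live_by_id = {job["id"]: job for job in live_jobs}
    restored, skipped = [], []
    for backed_up in backup_jobs:
        live = live_by_id.get(backed_up["id"])
        if live is None:
            skipped.append(backed_up["id"])    # user deleted the job after snapshot
            continue
        for field in ("skills", "skill"):      # only skill-link fields are restored
            if field in backed_up and live.get(field) != backed_up[field]:
                live[field] = backed_up[field]
                restored.append((backed_up["id"], field))
        # schedule, last_run_at, enabled, prompt, workdir, hooks: live state, untouched
    return {"restored": restored, "skipped_deleted": skipped}
```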
Also:
- hermes_cli/curator.py: rollback confirm dialog now shows
'cron jobs: N (will be restored for skill-link fields only)' when the
snapshot has cron data, or 'not in snapshot (<reason>)' otherwise.
- rollback()'s message string includes a 'cron links: ...' clause
summarizing the reconciliation outcome.
Tests
- 9 new cases: snapshot-with-cron, snapshot-without-cron, malformed-json
captured-as-raw, full rollback-restores-skills-and-cron, rollback
touches only skill fields, rollback skips user-deleted jobs, rollback
leaves user-created jobs untouched, rollback still works with
pre-feature snapshot that has no cron-jobs.json, standalone unit test
on _restore_cron_skill_links exercising the full report shape.
Validation
- 484/484 targeted tests pass (curator + cron + skill_manager suites).
- E2E: real snapshot_skills, real cron rewrite, real rollback. Before:
['pr-review-format', 'pr-review-checklist', 'pr-triage-salvage'].
After curator: ['hermes-agent-dev']. After rollback: ['pr-review-format',
'pr-review-checklist', 'pr-triage-salvage']. Non-skill fields (id,
name, prompt) preserved across the round trip.
The old defaults (StartLimitIntervalSec=600, StartLimitBurst=5,
RestartSec=30) meant any network outage over ~5 minutes would
permanently kill the gateway until manual intervention.
Changes:
- StartLimitIntervalSec=0 (never give up)
- Restart=always (not just on-failure)
- RestartSec=60 with RestartMaxDelaySec=300, RestartSteps=5
(exponential backoff: 60 → 120 → 180 → 240 → 300s cap)
- After=network-online.target + Wants=network-online.target (both units
  now wait for actual connectivity, not just network.target)
Power outage → internet down → internet back = auto-recovery.
When the dashboard is bound to 0.0.0.0 with --insecure (e.g. behind
Tailscale Serve), WebSocket endpoints (/api/pty, /api/ws, /api/pub,
/api/events) rejected connections from non-loopback client IPs with
code 4403 — causing 'events feed disconnected' in the UI.
Extract the repeated loopback check into _ws_client_is_allowed() which
respects the public bind flag. Session token auth still guards all
endpoints regardless of bind mode.
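A sketch of the extracted check, with simplified flag names; the real
endpoints still require the session token on top of this:

```python
import ipaddress

def ws_client_is_allowed(client_host, public_insecure_bind):
    """Allow non-loopback WebSocket clients only on a public --insecure bind."""
    if public_insecure_bind:
        return True  # bound to 0.0.0.0 with --insecure (e.g. behind Tailscale Serve)
    try:
        return ipaddress.ip_address(client_host).is_loopback
    except ValueError:
        return client_host == "localhost"  # hostnames: accept only the loopback name
```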
Slack has built-in slash commands (e.g. /status, /me, /join) that apps
cannot register. When running `hermes slack manifest --write`, the
generated manifest included /status, causing Slack to reject the entire
manifest with a reserved-command error.
Add _SLACK_RESERVED_COMMANDS frozenset of all known Slack built-ins and
skip them in slack_native_slashes(). Affected commands remain reachable
via /hermes <command>.
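A sketch of the filter; the frozenset below is abbreviated, not the full set
of Slack built-ins shipped in the fix:

```python
_SLACK_RESERVED_COMMANDS = frozenset({"status", "me", "join", "away", "topic"})

def slack_native_slashes(command_names):
    """Yield only commands Slack will accept as app-registered slash commands."""
    for name in command_names:
        if name.lstrip("/") in _SLACK_RESERVED_COMMANDS:
            continue  # still reachable via /hermes <command>
        yield name

print(list(slack_native_slashes(["/status", "/new", "/stop"])))  # ['/new', '/stop']
```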
Tests updated:
- New test_excludes_slack_reserved_commands validates no leaks
- test_includes_canonical_commands no longer asserts /status
- test_telegram_parity accounts for expected Slack-only exclusions
Long-running gateway processes that survive 'hermes update' keep
pre-update modules cached in sys.modules. When new tool files on
disk then try to 'from hermes_cli.config import cfg_get' (added in
PR #17304), the import resolves against the stale module object
and raises ImportError — hitting users on Matrix, Telegram, Feishu,
and other platforms.
Two defenses:
1. Gateway self-check (gateway/run.py). On __init__, snapshot the
newest mtime across sentinel source files (hermes_cli/config.py,
run_agent.py, gateway/run.py, etc.). On every inbound message,
re-read those mtimes; if any is newer than boot time + 2s slack,
request a graceful restart via the normal drain path and return
a one-line ack to the user (see the sketch after this list).
Idempotent; works regardless of how the update happened (hermes
update, manual git pull, installer).
2. Post-restart survivor sweep ('hermes update'). After the existing
restart loop, sleep 3s, rescan for gateway PIDs we already tried
to kill, and SIGKILL any survivors. The detached profile watchers
and systemd then relaunch with fresh code instead of waiting out
the 120s watcher timeout.
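A minimal sketch of defense 1's self-check, with an illustrative sentinel list
and a simplified restart hook:

```python
import time
from pathlib import Path

SENTINELS = ["hermes_cli/config.py", "run_agent.py", "gateway/run.py"]  # illustrative
SLACK_SECONDS = 2.0

def newest_sentinel_mtime(repo_root):
    paths = [Path(repo_root) / p for p in SENTINELS]
    return max((p.stat().st_mtime for p in paths if p.exists()), default=0.0)

class StaleCodeCheck:
    def __init__(self, repo_root):
        self.repo_root = repo_root
        self.boot_time = time.time()   # snapshot taken once at gateway __init__

    def code_changed_since_boot(self):
        """Checked on every inbound message; True requests a graceful restart."""
        return newest_sentinel_mtime(self.repo_root) > self.boot_time + SLACK_SECONDS
```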
Closes #17648.
* fix(curator): defer first run and add --dry-run preview (#18373)
Curator was meant to run 7 days after install, not on the very first
gateway tick. On a fresh install (no .curator_state), should_run_now()
returned True immediately because last_run_at was None — so the gateway
cron ticker fired Curator against a fresh skill library moments after
'hermes update'. Combined with the binary 'agent-created' provenance
model (anything not bundled and not hub-installed), this consolidated
hand-authored user workflow skills without consent.
Changes:
- should_run_now(): first observation seeds last_run_at='now' and returns
  False (sketched below). The next real pass fires one full interval_hours
  later (7 days by default), matching the original design intent.
- hermes curator run --dry-run: produces the same review report without
applying automatic transitions OR permitting the LLM to call
skill_manage / terminal mv. A DRY-RUN banner is prepended to the
prompt and the caller skips apply_automatic_transitions. State is
NOT advanced so a preview doesn't defer the next scheduled real pass.
- hermes update: prints a one-liner on fresh installs pointing at
--dry-run, pause, and the docs. Silent on steady state.
- Docs: curator.md and cli-commands.md explain the deferred first-run
behavior and warn that hand-written SKILL.md files share the
'agent-created' bucket, with guidance to pin or preview before the
first pass.
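A minimal sketch of the deferred first run (the real .curator_state carries
more fields than last_run_at):

```python
import json
import time
from pathlib import Path

def should_run_now(state_path, interval_hours=24 * 7):
    """Defer the first pass: seed state on first observation, run a full interval later."""
    state_file = Path(state_path)
    if not state_file.exists():
        state_file.write_text(json.dumps({"last_run_at": time.time()}))
        return False                      # fresh install: never fire on the first tick
    last_run_at = json.loads(state_file.read_text()).get("last_run_at", 0)
    return time.time() - last_run_at >= interval_hours * 3600
```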
Tests:
- test_first_run_defers replaces the old 'first run always eligible'
assertion — same fixture, inverted expectation.
- test_maybe_run_curator_defers_on_fresh_install covers the gateway tick
path end-to-end.
- Three new dry-run tests cover state-advance suppression, prompt
banner injection, and apply_automatic_transitions skipping.
Fixes #18373.
* feat(curator): pre-run backup + rollback (#18373)
Every real curator pass now snapshots ~/.hermes/skills/ into
~/.hermes/skills/.curator_backups/<utc-iso>/skills.tar.gz before calling
apply_automatic_transitions or the LLM review. If a run consolidates or
archives something the user didn't want touched, 'hermes curator
rollback' restores the tree in one command. Dry-run is skipped — no
mutation means no snapshot needed.
Changes:
- agent/curator_backup.py (new): tar.gz snapshot + safe rollback. The
snapshot excludes .curator_backups/ (would recurse) and .hub/ (managed
by the skills hub). Extract refuses absolute paths and .. components,
and uses tarfile's filter='data' on Python 3.12+. Rollback takes a
pre-rollback safety snapshot FIRST, stages the current tree into
.rollback-staging-<ts>/ so the extract lands in an empty dir, and
cleans the staging dir on success. A failed extract restores the
staged contents.
- agent/curator.py: run_curator_review() calls
  curator_backup.snapshot_skills(reason='pre-curator-run') before
  apply_automatic_transitions. Best-effort — a failed snapshot logs at
  debug and the run continues (a transient disk issue shouldn't
  silently disable curator forever).
- hermes_cli/curator.py: new 'hermes curator backup' and 'hermes curator
rollback' subcommands. rollback supports --list, --id <ts>, -y.
- hermes_cli/config.py: curator.backup.{enabled, keep} config block
with sane defaults (enabled=true, keep=5).
- Docs: curator.md gets a 'Backups and rollback' section; the
  cli-commands.md table gets the new rows.
Tests (new file tests/agent/test_curator_backup.py, 16 cases):
- snapshot creates tarball + manifest with correct counts
- snapshot excludes .curator_backups/ (recursion guard) and .hub/
- snapshot disabled via config returns None without creating anything
- snapshot uniquifies ids within the same second (-01 suffix)
- prune honors keep count, newest-first
- list_backups + _resolve_backup cover newest-default and unknown-id
- rollback restores a deleted skill with content intact
- rollback is itself undoable — safety snapshot shows up in list_backups
- rollback with no snapshots returns an error
- rollback refuses tarballs with absolute paths or .. components
- real curator runs take a 'pre-curator-run' snapshot; dry-runs do not
All curator tests: 210 passing locally.
Prevents ghost sessions from accumulating in state.db when the TUI/web
dashboard is opened and closed without sending a message.
Changes:
- run_agent.py: Add _ensure_db_session() gate method, called at
run_conversation() entry. Remove eager create_session() from __init__.
Handle compression rotation flag correctly.
- tui_gateway/server.py: Remove eager db.create_session() in
_start_agent_build(). Add post-first-message pending_title re-apply.
- hermes_state.py: Extract _insert_session_row() shared helper (DRY).
Add prune_empty_ghost_sessions() for one-time migration.
- cli.py: One-time ghost session prune on startup. Fix _pending_title
to call _ensure_db_session() before set_session_title().
- hermes_cli/main.py: Guard TUI exit summary on message_count > 0.
- tests: Update test_860_dedup to call _ensure_db_session() before
direct _flush_messages_to_session_db() calls.
Closes: ghost session clutter in hermes sessions list and web dashboard.
Four callsites hardcoded Path.home() / '.hermes' with no HERMES_HOME
check, breaking Docker deployments and profile isolation (hermes -p):
- plugins/hermes-achievements/dashboard/plugin_api.py:
state_path(), snapshot_path(), checkpoint_path() bare-literal paths
- scripts/profile-tui.py:
DEFAULT_STATE_DB and DEFAULT_LOG defaults ignored HERMES_HOME
- hermes_cli/slack_cli.py:
except-Exception fallback for slack-manifest.json dump
- optional-skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py:
--target argparse default
Use get_hermes_home() (with an ImportError shim for the standalone
scripts) or 'os.environ.get("HERMES_HOME") or str(Path.home()/".hermes")'
where importing hermes_constants is impractical.
E2E-verified: with HERMES_HOME=/tmp/x all three achievements paths and
both profile-tui defaults route under /tmp/x.
Salvaged from #18068 (original scope was broader mechanical cleanup
claiming 23 callsites were buggy; most were already respecting
HERMES_HOME via os.environ.get(key, default) — only these 4 had no env
check at all). Credit: @web-dev0521.
Add a standing-goal slash command that keeps Hermes working toward a
user-stated objective across turns until it is achieved, paused, or
the turn budget runs out. Our take on the Ralph loop — cf. Codex CLI
0.128.0's /goal.
After each turn, a lightweight auxiliary-model judge call asks 'is
this goal satisfied by the assistant's last response?'. If not, and
we're under the turn budget (default 20), Hermes feeds a continuation
prompt back into the same session as a normal user message. Any real
user message preempts the continuation loop automatically.
Judge failures fail OPEN (continue) so a flaky judge never wedges
progress — the turn budget is the real backstop.
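A minimal sketch of the post-turn hook; the callables and field names are
illustrative injection points, not the real GoalManager API:

```python
def maybe_continue_goal(goal, last_response, turns_used, run_judge, send_user_message):
    """After a completed turn, decide whether to feed a continuation prompt back in."""
    if goal is None or goal.get("paused"):
        return
    if turns_used >= goal.get("max_turns", 20):
        return  # the turn budget is the hard backstop
    try:
        satisfied = run_judge(goal["text"], last_response)  # auxiliary-model judge call
    except Exception:
        satisfied = False  # judge failures fail OPEN: keep working toward the goal
    if not satisfied:
        send_user_message(f"Continue working toward the goal: {goal['text']}")
```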
### Commands
- `/goal <text>` — set a standing goal (kicks off the first turn)
- `/goal` or `/goal status` — show current state
- `/goal pause` — pause the continuation loop
- `/goal resume` — resume (resets turn counter)
- `/goal clear` — drop the goal
Works on both CLI and gateway platforms via the central CommandDef
registry.
### Design invariants preserved
- **Prompt cache**: continuation prompts are regular user-role
messages appended to history. No system-prompt mutation, no toolset
swap.
- **Role alternation**: continuation is a user turn, never injected
mid-tool-loop.
- **Session persistence**: goal state lives in SessionDB.state_meta
keyed by `goal:<session_id>`, so `/resume` picks it up.
- **Mid-run safety**: on the gateway, `/goal status|pause|clear` are
allowed mid-run (control-plane only); setting a new goal requires
`/stop` first so we don't race a second continuation prompt against
the current turn.
### Files
- `hermes_cli/goals.py` (new, 380 lines) — GoalManager + judge + state
- `hermes_cli/commands.py` — CommandDef entry
- `hermes_cli/config.py` — `goals.max_turns` default
- `hermes_cli/web_server.py` — dashboard category merge
- `cli.py` — /goal handler + post-turn continuation hook in
process_loop
- `gateway/run.py` — /goal handler + post-turn continuation hook
wrapping _handle_message_with_agent
- `tests/hermes_cli/test_goals.py` (new, 26 tests) — judge parsing,
fail-open semantics, lifecycle, persistence, budget exhaustion
- `website/docs/reference/slash-commands.md` — docs entry
hermes update had two interactive [Y/n] prompts with no bypass:
1. Config migration (after new env/config options are added)
2. Autostash restore (when uncommitted work was stashed before pull)
hermes uninstall already has --yes/-y; mirrors that.
Under --yes:
- Config-migrate prompt → auto-yes, migrate_config(interactive=False)
so new config fields are applied but API-key prompts are skipped
(user runs 'hermes config migrate' later for those). Matches
gateway-mode semantics.
- Stash-restore prompt → auto-yes, git stash apply runs automatically.
Closes the 'can I hermes update -y, No ! Fix' gap reported by @murelux.
Adds opt-in auto-deletion for slash-command reply messages like
"New session started!", "Restarting gateway…", "Stopped.", and
YOLO toggles. After the TTL elapses the gateway calls the adapter's
delete_message; on platforms without a delete API (everything except
Telegram today) the TTL is silently ignored and the message stays.
Requested on Twitter by @charlesmcdowell — tool-call bubbles are useful
real-time, but system notices clutter the thread once the agent finishes.
Implementation:
- EphemeralReply(str) sentinel in gateway/platforms/base.py. Subclasses
str so existing 'X' in response / response.startswith(...) checks in
tests and call sites keep working unchanged; isinstance() still
distinguishes it for the send path.
- _process_message_background and both busy-session bypass paths
(in base.py) call _unwrap_ephemeral() on the handler return, send
the unwrapped text, and schedule a detached delete task when the
TTL > 0 AND the adapter class overrides delete_message.
- display.ephemeral_system_ttl (default 0 = disabled) in DEFAULT_CONFIG.
Handler can pass ttl_seconds explicitly to override.
- Wrapped the highest-noise return sites: /new, /reset, /stop,
/yolo on/off, /restart success + "already in progress". Draining
notices and /help output left as plain strings — those are
informational and users want to read them.
Backward-compat: default TTL 0 → no scheduling, no behavior change
for existing users. Platforms without delete_message silently no-op.
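A minimal sketch of the sentinel and the send-side handling; the adapter
methods and the delete-capability check are simplified relative to the real
base.py (which checks whether the adapter class overrides delete_message):

```python
import asyncio

class EphemeralReply(str):
    """str subclass, so 'X' in response and response.startswith(...) keep working."""
    def __new__(cls, text, ttl_seconds=None):
        obj = super().__new__(cls, text)
        obj.ttl_seconds = ttl_seconds          # optional per-reply TTL override
        return obj

async def send_response(adapter, chat_id, response, default_ttl=0):
    text = str(response)                       # unwrap: a plain string goes out
    message_id = await adapter.send(chat_id, text)
    ttl = getattr(response, "ttl_seconds", None) or default_ttl
    can_delete = getattr(adapter, "delete_message", None) is not None
    if isinstance(response, EphemeralReply) and ttl > 0 and can_delete:
        async def _expire():
            await asyncio.sleep(ttl)
            await adapter.delete_message(chat_id, message_id)
        asyncio.create_task(_expire())         # detached delete task
```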
`hermes update` ran the config migration (11 → 17) successfully then
crashed at `agent/skill_utils.py:340` during the post-migration
skill-config prompt. User @FlockonUS reported this on Twitter.
Root cause: `get_missing_skill_config_vars` in hermes_cli/config.py
only guarded the import of `discover_all_skill_config_vars`, not the
call. Any runtime exception inside the skill scan (malformed SKILL.md,
unreadable external skill dir, etc.) propagated up through
`migrate_config` and aborted `hermes update` after the version bump.
Wrap the call in try/except so skill-config prompting — which is a
post-migration nicety — can never block the migration itself.
Refactor tool resolution logic in model_tools.py to ensure that
disabled_toolsets are always subtracted at the end, preventing
composite toolsets (e.g. 'browser') from implicitly enabling tools
that should be hidden.
- Added 'disabled_toolsets' to DEFAULT_CONFIG in hermes_cli/config.py
- Updated HermesCLI in cli.py to load and propagate disabled toolsets to AIAgent
- Implemented robust two-phase resolution (additive then subtractive) in model_tools.py
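A minimal sketch of the two-phase resolution, with a toy toolset map:

```python
def resolve_tools(enabled_toolsets, disabled_toolsets, toolset_map):
    """Expand toolsets additively, then subtract disabled ones last."""
    enabled = set()
    for name in enabled_toolsets:
        enabled.update(toolset_map.get(name, [name]))   # composites expand here
    disabled = set()
    for name in disabled_toolsets:
        disabled.update(toolset_map.get(name, [name]))
    # Subtracting last means a composite like 'browser' can never re-enable a
    # tool the user explicitly disabled.
    return sorted(enabled - disabled)

toolset_map = {"browser": ["browser_navigate", "browser_screenshot"]}
print(resolve_tools(["browser", "terminal"], ["browser_screenshot"], toolset_map))
# ['browser_navigate', 'terminal']
```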
When a user defines `custom_providers: [{name: kimi, ...}]` and references
`provider: kimi` from fallback_model or the main config, the built-in alias
rewriting (`kimi` → `kimi-coding`) was hijacking the request before the
named-custom lookup ran. `_get_named_custom_provider` also refused to
return a match when the raw name resolved to any built-in (including aliases),
so the custom endpoint was unreachable.
Fix at both layers of the resolution chain so every caller benefits, not
just `_try_activate_fallback`:
- hermes_cli/runtime_provider.py: narrow `_get_named_custom_provider`'s
built-in-wins guard to canonical provider names only. An alias like
`kimi` that resolves to a different canonical (`kimi-coding`) no longer
blocks the custom lookup; a canonical name like `nous` still does.
- agent/auxiliary_client.py: in `resolve_provider_client`, try the named-
custom lookup with the original (pre-alias-normalization) name before the
alias-normalized one, so aliased requests reach the user's custom entry.
Also honour `explicit_base_url` and `explicit_api_key` in the API-key
provider branch so callers that pass explicit hints (e.g. fallback
activation) can override the registered defaults.
Tests added for:
- custom `kimi` shadowing built-in alias (regression for #15743)
- custom `nous` NOT shadowing canonical built-in (behaviour preserved)
- bare `kimi` without any custom entry still routing to built-in
- explicit base_url/api_key override on the API-key provider branch
Original PR #17827 by @Feranmi10 identified the same bug class and
implemented a narrower fix in `_try_activate_fallback`; this reshapes the
fix to live in the shared resolution layer so all callers benefit.
Fixes #15743
Co-authored-by: Feranmi10 <89228157+Feranmi10@users.noreply.github.com>
The PR wired in a detached watcher that respawns manual profile gateways
after they exit. Pair that with a SIGUSR1 graceful drain (same path
systemd/launchd use) so in-flight agent runs finish instead of getting
SIGTERM'd. Fall back to SIGTERM if SIGUSR1 isn't wired or the gateway
doesn't exit within the drain budget — the watcher sees the exit and
relaunches either way.
Tested end-to-end against an orphaned gateway: graceful drain exits in
0.5s and the watcher fires the relaunch command.
Follow-up to #17963. The threaded branch of resolve_plugin_command_result
previously called Event.wait() with no timeout — a hung async plugin
handler would wedge the terminal indefinitely. Cap the wait at 30s and
raise TimeoutError instead. Added a regression test covering the hung
handler path.
Closes #16082
The `hermes status` command listed provider API keys under the
◆ API Keys section but NVIDIA_API_KEY was absent. Users configured
with NVIDIA NIM had no way to verify their key was set from status
output. Add it alongside the other inference provider keys.
The switch_model override logic incorrectly iterated over user_providers
as if it were a list of dicts, but it's actually a dict mapping
provider_slug -> config. This meant private models defined in a provider's
`models:` section (e.g. nahcrof-dedicated with discover_models: false)
were never accepted when the API /models list didn't include them.
Fix: iterate over user_providers.items(), match by slug, and handle both
dict and list forms of the models config.
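A minimal sketch of the corrected lookup; the 'models' and 'name' key shapes
are assumptions about the config schema, not the exact structure:

```python
def model_in_user_providers(model_name, provider_slug, user_providers):
    """user_providers maps provider_slug -> config, so iterate items(), not entries."""
    for slug, cfg in user_providers.items():
        if slug != provider_slug:
            continue
        models = cfg.get("models", {})
        if isinstance(models, dict):       # {'model-name': {...}, ...}
            return model_name in models
        if isinstance(models, list):       # ['model-name', ...] or [{'name': ...}, ...]
            return any(
                m == model_name or (isinstance(m, dict) and m.get("name") == model_name)
                for m in models
            )
    return False
```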
Vercel AI Gateway was sitting at position 4 of the `hermes model` list, ahead of Anthropic,
OpenAI, Xiaomi, and other first-class API providers. Move it to the end of
CANONICAL_PROVIDERS and drop the "(200+ models, $5 free credit, no markup)"
parenthetical so the entry just reads "Vercel AI Gateway".
- New config key: dashboard.hidden_plugins (list of plugin names)
- GET /api/dashboard/plugins now filters out hidden plugins from sidebar
- POST /api/dashboard/plugins/{name}/visibility toggles visibility
- Hub response includes user_hidden boolean per plugin row
- Eye/EyeOff toggle on plugin cards with dashboard manifests
- i18n: 'Show in sidebar' / 'Hide from sidebar' (en/zh)
- Add _validate_plugin_name() guard on all {name} path param endpoints
(rejects /, \, .. before reaching plugin logic)
- Strip after_install_path from install response (no internal paths to client)
- Update nix/tui.nix lockfile hash to match committed package-lock.json
- New PluginsPage.tsx: full plugin management UI (list, enable/disable,
install from git, remove, git pull updates, provider picker)
- Backend: dashboard_set_agent_plugin_enabled now also toggles the
plugin's toolset in platform_toolsets so enabling actually makes
tools visible in agent sessions
- Backend: /api/dashboard/plugins/hub returns auth_required + auth_command
per plugin (checks tool registry check_fn)
- Frontend: auth_required shown as Badge + CommandBlock with copy-able
auth command
- Fix: Select overflow in providers card (min-w-0 grid cells, removed
truncate/overflow-hidden that clipped dropdown)
- Refactor: _install_plugin_core extracted for non-interactive reuse,
PluginOperationError for structured error handling
- i18n: en/zh/types updated with all new plugin page strings
Alongside the existing 'least recently used' section, surface two more
rankings so users can see which of their agent-created skills actually
get exercised:
- 'most used (top 5)' — sorted by use_count descending. Hidden when every
skill has use_count=0 (noise suppression on fresh installs).
- 'least used (top 5)' — sorted by use_count ascending. Always shown
when the catalog is non-empty.
use_count started tracking real agent skill activation in PR #17932
(bump_use wired into skill_view tool + slash invocation + --skill
preload), so these rankings are now meaningful.
Tests: 3 new in tests/hermes_cli/test_curator_status.py — happy path
with mixed use_counts, zero-use suppression of the most-used section,
and the no-skills clean-empty case.
Treat skill views and edits as activity when curator reports and applies
lifecycle transitions, so recently loaded or patched skills are not
displayed or transitioned as never used.

Adds regression tests for activity derivation, automatic transitions, and
CLI status output.
Four fixes bundled for curator reliability on existing installs and
broken/partial installs:
1. run_agent.py: defer `import fire` into the __main__ block. `fire` is
only used by `fire.Fire(main)` when running run_agent.py directly as
a CLI — it is NOT needed for library usage. Importing it at module
top made `from run_agent import AIAgent` from a daemon thread (e.g.
the curator's forked review agent) crash with ModuleNotFoundError
on broken/partial installs where `fire` isn't present.
2. hermes_cli/config.py: add version 22 → 23 migration that writes the
`curator` + `auxiliary.curator` sections to config.yaml with their
defaults, only filling keys the user hasn't overridden. Existing
configs from before PR #16049 / the April 2026 `auxiliary.curator`
unification had neither section on disk, so users couldn't see or
edit the settings in their config.yaml (runtime deep-merge papered
over it at read time, but the file never reflected reality).
3. hermes_cli/config.py: `ensure_hermes_home()` now pre-creates
`~/.hermes/logs/curator/` alongside cron/sessions/logs/memories on
every CLI launch. Managed-mode (NixOS) variant mkdir's it
defensively after the activation-script existence checks, since the
activation script may not know about this subpath.
4. agent/curator.py: `_reports_root()` mkdir's the dir at call time as
belt-and-suspenders for entry paths that bypass both
ensure_hermes_home() and the v23 migration (gateway-only installs,
bare library use).
E2E validated in isolated HERMES_HOME: fresh install gets full defaults
seeded; partial-override config keeps user's `enabled: false` and
custom `interval_hours` while filling the missing keys; re-running the
migration is a no-op.
_set_nested unconditionally replaced any non-dict value with an empty
dict when walking the dotted path, which silently destroyed list-typed
config nodes the moment someone set a value with a numeric index
(e.g. 'hermes config set custom_providers.0.api_key NEW'). Any sibling
entries and any fields inside the targeted entry that the user didn't
write were lost.
Fix:
- _set_nested now detects list nodes and navigates by numeric index,
and preserves both dicts AND lists at intermediate positions (scalars
are still replaced so bare-scalar -> nested overrides keep working).
- set_config_value drops its duplicated navigation logic and calls
_set_nested instead -- single source of truth for the rules.
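A sketch of the list-aware walk (simplified; the real _set_nested covers more
edge cases than shown here):

```python
def set_nested(root, dotted_key, value):
    parts = dotted_key.split(".")
    node = root
    for part in parts[:-1]:
        if isinstance(node, list):
            node = node[int(part)]          # navigate lists by numeric index
        else:
            nxt = node.get(part)
            if not isinstance(nxt, (dict, list)):
                nxt = node[part] = {}       # scalars are still replaced
            node = nxt
    last = parts[-1]
    if isinstance(node, list):
        node[int(last)] = value
    else:
        node[last] = value

cfg = {"custom_providers": [{"name": "kimi", "api_key": "OLD"}, {"name": "other"}]}
set_nested(cfg, "custom_providers.0.api_key", "NEW")
# Entry 0 keeps its other fields; sibling entry 1 survives untouched.
```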
Regression tests (tests/hermes_cli/test_set_config_value.py):
- test_indexed_set_preserves_sibling_list_entries -- exact #17876 repro
- test_indexed_set_preserves_non_targeted_fields -- inner-dict fields survive
- test_deeper_nesting_through_list -- dict -> list -> dict -> scalar path
35/35 existing + new tests pass.
E2E-verified with the issue's repro against a real on-disk config.yaml --
list stays a list, entry 0 updated, entry 1 intact.
Closes #17876
When hermes model picker switches to a custom_providers entry, the slug
assignment can write the literal string 'custom' to model.provider if a
prior failed switch already left that value in config.yaml.
Two fixes:
1. model_switch.py: filter out bare 'custom' in slug assignment, always
resolve to canonical custom:<name> form
2. providers.py: resolve_custom_provider() self-heals bare 'custom' by
falling back to the first valid custom_providers entry
Closes #17478
When a user sets model.context_length in config.yaml, the value was only
used for Hermes' internal compression decisions (context_compressor) but
NOT for Ollama's num_ctx parameter. Ollama auto-detects context from GGUF
metadata (often 256K+) and allocates that much VRAM regardless of the
user's config — causing OOM on smaller GPUs like the P100 (16GB).
Root cause: two separate context values existed independently:
- context_compressor.context_length = config value (e.g. 65536) ✓
- _ollama_num_ctx = GGUF metadata value (e.g. 256000) ✗ ignored config
Changes:
1. Cap Ollama num_ctx to config context_length (run_agent.py)
When model.context_length is explicitly set and no explicit
ollama_num_ctx override exists, cap the auto-detected GGUF value
to the user's context_length. This is the core fix — it prevents
Ollama from allocating more VRAM than the user budgeted.
2. Pass config_context_length through all secondary call sites
Several paths called get_model_context_length() without the config
override, falling through to the 256K default fallback:
- cli.py: @-reference expansion and /model switch display
- gateway/run.py: @-reference expansion and /model switch display
- tui_gateway/server.py: @-reference expansion
- hermes_cli/model_switch.py: resolve_display_context_length()
3. Normalize root-level context_length in config (hermes_cli/config.py)
_normalize_root_model_keys() now migrates root-level context_length
into the model section, matching existing behavior for provider and
base_url. Users who wrote `context_length: 65536` at the YAML root
instead of under `model:` had it silently ignored.
4. Fix misleading comments (agent/model_metadata.py)
DEFAULT_FALLBACK_CONTEXT is 256K (CONTEXT_PROBE_TIERS[0]), not 128K
as two comments stated.
Tests: 3 new tests for root-level context_length normalization.
All existing context_length tests pass (96 tests).
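A minimal sketch of the cap from change 1, with simplified argument names:

```python
def effective_ollama_num_ctx(gguf_num_ctx, config_context_length, explicit_num_ctx=None):
    """An explicit ollama_num_ctx override wins; otherwise cap the GGUF value."""
    if explicit_num_ctx is not None:
        return explicit_num_ctx
    if config_context_length is not None:
        return min(gguf_num_ctx, config_context_length)
    return gguf_num_ctx

print(effective_ollama_num_ctx(256000, 65536))   # 65536: fits the user's VRAM budget
print(effective_ollama_num_ctx(256000, None))    # 256000: no config value, unchanged
```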
Merge: resolved conflicts in web/src/{i18n/{en,zh,types}.ts,lib/api.ts}
by keeping both this branch's `profiles` additions and upstream's new
`models` page additions.
Copilot review feedback:
- Implement POST /api/profiles/{name}/open-terminal endpoint (already
present); align Windows branch to `cmd.exe /c start "" <cmd>` so it
matches the new test and spawns a fresh window instead of /k reusing
the parent console.
- Move backslash escaping out of the macOS AppleScript f-string
expression (Python <3.12 disallows backslashes inside f-string
expression parts).
- Patch `_get_wrapper_dir` via monkeypatch in
test_profiles_create_creates_wrapper_alias_when_safe so the test no
longer writes to the real `~/.local/bin`.
- Extend test_dashboard_browser_safe_imports to scan `.ts` files in
addition to `.tsx`.
- Switch upstream's new ModelsPage.tsx away from the `@nous-research/ui`
root barrel onto per-component subpaths to satisfy the stricter scan.
- Fix NouiTypography `leading-1.4` -> `leading-[1.4]` so Tailwind
actually emits the line-height for the `sm` variant.
- Guard ProfilesPage.openSoulEditor against out-of-order responses by
tracking the latest requested profile via a ref.
- Replace ProfilesPage's hand-rolled setup command with a fetch to
`/api/profiles/{name}/setup-command` so the copied command always
matches what the backend would actually run (handles wrapper-alias
collisions and reserved names correctly).
- Wire SOUL.md textarea label `htmlFor` -> textarea `id` so screen
readers and clicking the label work as expected.
The v11→v12 migrate_config step writes the API mode for every entry
under the new transport: field (per the v12+ schema in
_normalize_custom_provider_entry). _get_named_custom_provider
read the legacy api_mode: spelling only, so for every migrated
config the lookup returned None for the api mode.
Downstream, _resolve_named_custom_runtime then falls back through
custom_provider.get("api_mode") or _detect_api_mode_for_url(base_url)
or "chat_completions". For loopback URLs (proxies, local servers)
or unknown hostnames, the URL detector returns None and the resolver
silently downgrades the configured codex_responses /
anthropic_messages transport to chat_completions. Requests
get sent to /v1/chat/completions instead of /v1/responses or
/v1/messages and the provider 404s — or worse, returns a usable
chat_completions response while skipping the model's reasoning /
caching surface.
Fix: read both field names — entry.get("api_mode") or
entry.get("transport") — at the two match-by-key + match-by-name
branches in _get_named_custom_provider. The runtime normaliser
_normalize_custom_provider_entry already accepts both spellings;
this lifts the same compat into the direct-dict reader so v12+
configs work without going through the shim.
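The shape of the dual-spelling read, shown on hypothetical entries:

```python
def entry_api_mode(entry):
    """Accept both the v12+ 'transport' spelling and the legacy 'api_mode'."""
    return entry.get("api_mode") or entry.get("transport")

migrated = {"name": "my-proxy", "transport": "codex_responses",
            "base_url": "http://127.0.0.1:8080"}       # v12+ migrated config
hand_edited = {"name": "my-proxy", "api_mode": "anthropic_messages"}  # legacy spelling

print(entry_api_mode(migrated))     # codex_responses, no chat_completions downgrade
print(entry_api_mode(hand_edited))  # anthropic_messages
```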
Adds three regression tests under
tests/hermes_cli/test_user_providers_model_switch.py:
- transport field is read on the match-by-key branch
- legacy api_mode spelling still works for hand-edited configs
- transport is read on the match-by-display-name branch