hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-28 06:51:16 +08:00

Author	SHA1	Message	Date
Teknium	a9033c9220	feat(backup): exclude checkpoints/ from backups (#16572) Session-local trajectory cache — keyed by session hash, regenerated per-session, won't port to another machine anyway. On a large install this was multiple GB of pure noise in every zip. Also adds a regression test for the pre-existing backups/ exclusion so the two machine-local dirs share coverage.	2026-04-27 06:40:18 -07:00
Teknium	ea3c5a14c3	feat(update): make pre-update backup opt-in (off by default) (#16566) The zip backup could add minutes to every 'hermes update' on large HERMES_HOME directories. Flip the default to off and add a --backup flag for one-off opt-in runs. - updates.pre_update_backup default: True -> False - hermes update: new --backup flag (opposite of existing --no-backup) - Silent no-op when disabled (no message spam on every update) - Existing --no-backup still works and wins over --backup - Users who explicitly set pre_update_backup: true keep the old behavior - Tests updated to cover default-off, --backup opt-in, and config-enabled paths	2026-04-27 06:36:35 -07:00
Teknium	ec671c4154	feat(image-input): native multimodal routing based on model vision capability (#16506 ) * feat(image-input): native multimodal routing based on model vision capability Attach user-sent images as OpenAI-style content parts on the user turn when the active model supports native vision, so vision-capable models see real pixels instead of a lossy text description from vision_analyze. Routing decision (agent/image_routing.py::decide_image_input_mode): agent.image_input_mode = auto \| native \| text (default: auto) In auto mode: - If auxiliary.vision.provider/model is explicitly configured, keep the text pipeline (user paid for a dedicated vision backend). - Else if models.dev reports supports_vision=True for the active provider/model, attach natively. - Else fall back to text (current behaviour). Call sites updated: gateway/run.py (all messaging platforms), tui_gateway (dashboard/Ink), cli.py (interactive /attach + drag-drop). run_agent.py changes: - _prepare_anthropic_messages_for_api now passes image parts through unchanged when the model supports vision — the Anthropic adapter translates them to native image blocks. Previous behaviour (vision_analyze → text) only runs for non-vision Anthropic models. - New _prepare_messages_for_non_vision_model mirrors the same contract for chat.completions and codex_responses paths, so non-vision models on any provider get text-fallback instead of failing at the provider. - New _model_supports_vision() helper reads models.dev caps. vision_analyze description rewritten: positions it as a tool for images NOT already visible in the conversation (URLs, tool output, deeper inspection). Prevents the model from redundantly calling it on images already attached natively. Config default: agent.image_input_mode = auto. Tests: 35 new (test_image_routing.py + test_vision_aware_preprocessing.py), all existing tests that reference _prepare_anthropic_messages_for_api still pass (198 targeted + new tests green). 
* feat(image-input): size-cap + resize oversized images, charge image tokens in compressor Two follow-ups that make the native image routing safer for long / heavy sessions: 1) Oversize handling in build_native_content_parts: - 20 MB ceiling per image (matches vision_tools._MAX_BASE64_BYTES, the most restrictive provider — Gemini inline data). - Delegates to vision_tools._resize_image_for_vision (Pillow-based, already battle-tested) to downscale to 5 MB first-try. - If Pillow is missing or resize still overshoots, the image is dropped and reported back in skipped[]; caller falls back to text enrichment for that image. 2) Image-token accounting in context_compressor: - New _IMAGE_TOKEN_ESTIMATE = 1600 (matches Claude Code's constant; within the realistic range for Anthropic/GPT-4o/Gemini billing). - _content_length_for_budget() helper: sums text-part lengths and charges _IMAGE_CHAR_EQUIVALENT (1600 * 4 chars) per image/image_url/ input_image part. Base64 payload inside image_url is NOT counted as chars — dimensions don't matter, only image-presence. - Both tail-cut sites (_prune_old_tool_results L527 and _find_tail_cut_by_tokens L1126) now call the helper so multi-image conversations don't slip past compression budget. Tests: 9 new in test_image_routing.py (oversize triggers resize, resize-fails-returns-None, oversize-skipped-reported), 11 new in test_compressor_image_tokens.py (flat charge per image, multiple images, Responses-API / Anthropic-native / OpenAI-chat shapes, no-inflation on raw base64, bounds-check on the constant, integration test that an image-heavy tail actually gets trimmed). * fix(image-input): replace blanket 20MB ceiling with empirically-verified per-provider limits The previous commit imposed a hardcoded 20 MB base64 ceiling on all providers, triggering auto-resize on anything larger. This was wrong in both directions: * Too loose for Anthropic — actual limit is 5 MB (returns HTTP 400 'image exceeds 5 MB maximum' above that). 
* Too strict for OpenAI / Codex / OpenRouter — accept 49 MB+ without complaint (empirically verified April 2026 with progressive PNG sizes). New behaviour: * _PROVIDER_BASE64_CEILING table: only anthropic and bedrock have a ceiling (5 MB, since bedrock-on-Claude shares Anthropic's decoder). * Providers NOT in the table get no ceiling — images attach at native size and we trust the provider to return its own error if it disagrees. A provider-specific 400 message is clearer than us guessing wrong and silently degrading image quality. * build_native_content_parts() gains a keyword-only provider arg; gateway/CLI/TUI pass the active provider so Anthropic users get auto-resize protection while OpenAI users don't pay it. * Resize target dropped from 5 MB to 4 MB to slide safely under Anthropic's boundary with header overhead. Empirical measurements (direct API, no Hermes in the loop): image b64 anthropic openrouter/gpt5.5 codex-oauth/gpt5.5 0.19 MB ✓ ✓ ✓ 12.37 MB ✗ 400 5MB ✓ ✓ 23.85 MB ✗ 400 5MB ✓ ✓ 49.46 MB ✗ 413 ✓ ✓ Tests: rewrote TestOversizeHandling (5 tests): no-ceiling pass-through, Anthropic resize fires, Anthropic skip on resize-fail, build_native_parts routes ceiling by provider, unknown provider gets no ceiling. All 52 targeted tests pass. * refactor(image-input): attempt native, shrink-and-retry on provider reject Replace proactive per-provider size ceilings with a reactive shrink path on the provider's actual rejection. All providers now attempt native full-size attachment first; if the provider returns an image-too-large error, the agent silently shrinks and retries once. Why the previous design was wrong: hardcoding provider ceilings (anthropic=5MB, others=unlimited) meant OpenAI users on a 10MB image paid no tax, but Anthropic users lost quality on anything >5MB even though the empirical behaviour at provider-reject time is the same (shrink + retry). Baking the table into the routing layer also requires updating Hermes every time a provider's limit changes. 
Reactive design: - image_routing.py: _file_to_data_url encodes native size, no ceiling. build_native_content_parts drops its provider kwarg. - error_classifier.py: new FailoverReason.image_too_large + pattern match ("image exceeds", "image too large", etc.) checked BEFORE context_overflow so Anthropic's 5MB rejection lands in the right bucket. - run_agent.py: new _try_shrink_image_parts_in_messages walks api messages in-place, re-encodes oversized data: URL image parts through vision_tools._resize_image_for_vision to fit under 4MB, handles both chat.completions (dict image_url) and Responses (string image_url) shapes, ignores http URLs (provider-fetched). New image_shrink_retry_attempted flag in the retry loop fires the shrink exactly once per turn after credential-pool recovery but before auth retries. E2E verified live against Anthropic claude-sonnet-4-6: - 17.9MB PNG (23.9MB b64) attached at native size - Anthropic returns 400 "image exceeds 5 MB maximum" - Agent logs '📐 Image(s) exceeded provider size limit — shrank and retrying...' - Retry succeeds, correct response delivered in 6.8s total. Tests: 12 new (8 shrink-helper shapes + 4 classifier signals), replaces 5 proactive-ceiling tests with 3 simpler 'native attach works' tests. 181 targeted tests pass. test_enum_members_exist in test_error_classifier.py updated for the new enum value.	2026-04-27 06:27:59 -07:00
Teknium	8ed599dc05	feat(update): auto-backup HERMES_HOME before hermes update (#16539 ) Every 'hermes update' now runs a full backup of ~/.hermes/ first, so users can always roll back to the exact state they had before the update if anything goes wrong (corrupted sessions.db, broken skills, config migrations that don't round-trip, etc.). Changes: - hermes_cli/backup.py: new create_pre_update_backup() helper. Writes to <HERMES_HOME>/backups/pre-update-<stamp>.zip using the same exclusion rules and SQLite safe-copy as 'hermes backup'. Auto-rotates (keep last N, pre-update-*.zip only — hand-dropped zips in backups/ are untouched). Adds 'backups' to _EXCLUDED_DIRS so subsequent backups don't nest prior ones. - hermes_cli/main.py: _run_pre_update_backup() wired into _cmd_update_impl before any git operation. Prints save path, restore command, and how to disable. Swallows failures so a broken backup never blocks the update itself. New --no-backup flag on 'hermes update' for one-off override. - hermes_cli/config.py: new 'updates' section in DEFAULT_CONFIG with pre_update_backup (default true) and backup_keep (default 5). Auto-surfaces in the dashboard config UI. - tests/hermes_cli/test_backup.py: +11 tests covering backup location, content parity with 'hermes backup', no-recursion, rotation, manual file preservation, config gate, --no-backup flag, flag-wins-over-config.	2026-04-27 05:36:19 -07:00
Teknium	bb00b783fb	fix(cli): eliminate ghost status-bar + DSR input leaks from terminal drift The CLI renders through prompt_toolkit in non-full-screen mode, so every repaint uses the renderer's tracked _cursor_pos.y to cursor_up() + erase before drawing the new frame. Any time that tracked position drifts from terminal reality, redraws stack on top of stale content instead of overwriting it. Four user-visible bugs share this root cause. Fixes: - #5474 (SIGWINCH ghosts): the resize wrapper previously only handled column-shrink reflow. Generalize it to force a full screen-clear (erase_screen + cursor_goto(0,0)) and renderer.reset() on every resize — covers widen, row-shrink, and multiplexer SIGWINCH-less redraws. - #8688 (cmux/tmux tab switch): no SIGWINCH fires on focus regain, so prompt_toolkit has no signal to recover. Add a _force_full_redraw() helper, bound to Ctrl+L (standard bash/zsh/vim convention) and exposed as /redraw. Users can manually clear drift without restarting Hermes. - #14692 (DSR response leaks — ^[[53;1R): resize storms make prompt_toolkit's CSI 6n queries race past the input parser; the terminal's reply ends up as literal input text. Add a sibling of the bracketed-paste sanitizer that strips \x1b[<row>;<col>R and the caret-escape visible form from paste text, buffer text-filter, and the input-processing loop. The idle-redraw removal (#12641) is in the preceding commit from @foxion37 — keeping them as separate commits preserves attribution.	2026-04-27 05:31:47 -07:00
Teknium	90a3e73daf	fix(debug): sweep expired paste.rs uploads on a real timer (#16431) Previously 'hermes debug share' uploads only got DELETEd when the user ran 'hermes debug share' again — opportunistic-sweep-on-invoke was the only cleanup path. A user who uploaded once and never ran debug again left pastes up until paste.rs's retention kicked in (which, empirically, never actually expires them). Hook _sweep_expired_pastes into the gateway cron ticker at the same hourly cadence as the image/document cache cleanups. The opportunistic sweep in 'hermes debug share' stays as a fallback for CLI-only users who never start the gateway.	2026-04-27 00:36:33 -07:00
Teknium	21f503c23c	feat(update): snapshot pairing data before git pull (#16383) Quick state snapshot now includes pairing JSONs (generic + legacy + Feishu comment pairing), and `hermes update` takes a pre-update snapshot labeled `pre-update` before pulling. Pairing data lives outside state.db in platform-specific JSONs under ~/.hermes/pairing/, ~/.hermes/platforms/pairing/, and ~/.hermes/feishu_comment_pairing.json. The update command already couldn't touch $HERMES_HOME, but #15733 reports lost pairing after an update — this gives users something to restore from via `/snapshot list` / `/snapshot restore <id>` if anything clobbers the approved-user lists. - Extend _QUICK_STATE_FILES with pairing paths (files + dirs) - Snapshot walks directories recursively and records each file in the manifest individually so restore logic is unchanged - _cmd_update_impl calls create_quick_snapshot(label='pre-update') after 'Found N new commits' and before 'Pulling updates' - Snapshot failures are logged at debug and never block the update Refs #15733.	2026-04-27 00:19:12 -07:00
Teknium	8258f4dcb7	fix(model): avoid persisting key_env-resolved secrets to providers entry (#16372) When 'hermes model' runs against a providers: (keyed-schema) entry that relies only on key_env, the picker resolves the env var for the live /models request and then wrote a synthesized 'api_key: ${KEY_ENV}' back to the providers.<key> entry. That's redundant — the runtime already resolves from key_env directly — and it clutters configs that intentionally keep credentials out of config.yaml. Only persist provider_entry['api_key'] when the user originally had an inline value (literal secret or ${VAR} template). Entries that declared only key_env stay clean on save. Fixes #15803.	2026-04-26 21:52:12 -07:00
Teknium	c5781d50c7	fix(azure-foundry): auto-route gpt-5.x / codex / o-series to Responses API (#16361 ) Azure Foundry deploys GPT-5.x, codex-*, and o1/o3/o4 reasoning models as Responses-API-only. Calling /chat/completions against these deployments returns 400 'The requested operation is unsupported.', which broke any user who ran 'hermes model' on Azure, picked a gpt-5/codex deployment, and kept the default api_mode: chat_completions. Verified in a user debug bundle on 2026-04-26: gpt-5.3-codex failed on synopsisse.openai.azure.com with that exact payload while gpt-4o-pure on the same endpoint worked. Adds azure_foundry_model_api_mode(model_name) that returns codex_responses when the model name starts with gpt-5, codex, o1, o3, or o4 — otherwise None so chat_completions / anthropic_messages stay untouched for gpt-4o, Llama, Claude-via-Anthropic, etc. Resolver (both the direct Azure Foundry path and the pool-entry path) consults it and upgrades api_mode unless the user explicitly picked anthropic_messages. target_model (from /model mid-session switch) takes precedence over the persisted default so switching from gpt-4o to gpt-5.3-codex routes correctly before the next request. Docs: correct the azure-foundry guide which previously claimed Azure keeps gpt-5.x on chat completions — that was only true for early Azure OpenAI, not Azure Foundry codex/o-series deployments. Tests: 14 unit tests for azure_foundry_model_api_mode + 6 integration tests in TestAzureFoundryResolution covering Bob's exact scenario, target_model override, anthropic_messages guard, and o3-mini.	2026-04-26 21:33:31 -07:00
brooklyn!	e63929d4f3	Merge pull request #15926 from NousResearch/bb/tui-long-session-perf perf(tui): stabilize long-session scrolling	2026-04-26 23:10:08 -05:00
Teknium	9c416e20ab	feat(skills): install skills from a direct HTTP(S) URL (#16323 ) * feat(skills): install skills from a direct HTTP(S) URL Adds UrlSource adapter so `hermes skills install <url-to-SKILL.md>` and `/skills install <url>` work as first-class operations — no more improvising with curl + patch + cp. - Claims identifiers that start with http(s):// and end in .md - Skips /.well-known/skills/ URLs (WellKnownSkillSource handles those) - Skill name from YAML frontmatter, URL-slug fallback - Single-file SKILL.md only (v1 scope — multi-file skills need a manifest) - Trust level 'community'; full security scan still runs - Lock file stores the URL as identifier so `hermes skills update` re-fetches from the same URL cleanly Scope matches real user need from @versun's docx feedback where `https://sharethis.chat/SKILL.md` had no first-class install path. * feat(skills): interactive name/category for URL installs + --name override Follow-up to the UrlSource adapter. The previous commit fell back to weak heuristics when frontmatter had no ``name:`` and could produce garbage names like ``SKILL`` or ``unnamed-skill``. Now: tools/skills_hub.py - ``UrlSource._is_valid_skill_name()`` — strict identifier check (``^[a-z][a-z0-9_-]*$``), rejects sentinel values (``SKILL``, ``README``, ``INDEX``, ``unnamed-skill``, empty, non-strings). - ``_resolve_skill_name()`` returns ``Optional[str]`` — ``None`` when nothing valid is resolvable. Also ignores unsafe frontmatter names (``../evil``) and falls through to URL slug instead of returning None immediately, so a URL with a bad frontmatter but a good path still works. - ``fetch()``/``inspect()`` carry an ``awaiting_name=True`` marker in metadata/extra when resolution fails, letting ``do_install`` decide whether to prompt, apply an override, or error out. hermes_cli/skills_hub.py - ``do_install`` gains a ``name_override`` parameter. - On URL-sourced bundles with ``awaiting_name=True``: 1. If ``name_override`` is valid → use it. 2. 
If ``name_override`` is invalid → refuse with a clear error. 3. Else if ``skip_confirm=True`` (non-interactive: slash / TUI / gateway / scripts) → refuse with an actionable retry hint pointing at ``--name <your-name>`` on both CLI and slash forms. 4. Else (interactive TTY) → prompt for the name. - Interactive TTY also prompts for a category when none is given for a URL-sourced install, hinting existing category buckets so users can reuse ``productivity``, ``devops``, etc. Empty input → flat install. - ``_existing_categories()`` scans ``~/.hermes/skills/`` for subdirs that look like category buckets (contain nested SKILL.md files); skips top-level skills and hidden dirs. - ``_prompt_for_skill_name()`` / ``_prompt_for_category()`` helpers (EOF/Ctrl-C-safe, match the existing ``Confirm [y/N]`` prompt style). hermes_cli/main.py - ``hermes skills install`` argparse gains ``--name <name>``. hermes_cli/skills_hub.py (slash) - ``/skills install <url> --name <x>`` parsing added. Tests - tests/tools/test_skills_hub.py: updated ``UrlSource`` tests to assert the new ``awaiting_name`` metadata; added 4 new tests for ``_is_valid_skill_name`` rejection sets and the awaiting-name marker. - tests/hermes_cli/test_skills_hub.py: 8 new tests covering --name override accept/reject, non-interactive error, interactive name prompt, interactive category prompt, cancel-aborts-install, and ``_existing_categories`` scan behavior (buckets vs flat skills). - E2E verified all four paths (no-name/no-override → error; --name override → install; frontmatter name → install; invalid --name → rejection). --------- Co-authored-by: teknium1 <teknium@noreply.github.com>	2026-04-26 20:57:10 -07:00
Teknium	366351b94d	refactor(timeouts): drop redundant ImportError in except clause Exception already covers ImportError; (ImportError, Exception) was a cosmetic wart from the bugfix. Pure no-op.	2026-04-26 20:48:20 -07:00
sprmn24	16e243e067	fix(timeouts): guard load_config() call against runtime exceptions Both get_provider_request_timeout() and get_provider_stale_timeout() wrapped the load_config import in try/except ImportError but left the actual load_config() call unprotected. A corrupt config file, YAML parse error, or permission failure would raise instead of returning None safely. Move load_config() inside the try block so any exception returns None.	2026-04-26 20:48:20 -07:00
Brooklyn Nicholson	3e1664923d	Revert "fix(tui): report actual session on exit" This reverts commit `1566f1eecc`.	2026-04-26 22:43:34 -05:00
Brooklyn Nicholson	c23463fce9	chore(tui): keep MRU resume split out of perf PR - remove the temporary -c MRU logic and companion test from this branch so PR #15926 stays focused on TUI perf work - keep the resume-ordering change isolated in the dedicated follow-up PR	2026-04-26 22:40:35 -05:00
Brooklyn Nicholson	625c31fcea	fix(tui): run built TUI with production React by default CPU profiling showed the built TUI loading React development modules unless NODE_ENV was set. Default CLI and dashboard TUI children to production while preserving explicit user overrides.	2026-04-26 21:34:31 -05:00
Brooklyn Nicholson	7da2f07641	Merge remote-tracking branch 'origin/main' into bb/tui-long-session-perf	2026-04-26 21:07:15 -05:00
Teknium	478444c262	feat(checkpoints): auto-prune orphan and stale shadow repos at startup (#16303 ) Every working dir hermes ever touches gets its own shadow git repo under ~/.hermes/checkpoints/{sha256(abs_dir)[:16]}/. The per-repo _prune is a no-op (comment in CheckpointManager._prune says so), so abandoned repos from deleted/moved projects or one-off tmp dirs pile up forever. Field reports put the typical offender at 1000+ repos / ~12 GB on active contributor machines. Adds an opt-in startup sweep that mirrors the sessions.auto_prune pattern from #13861 / #16286: - tools/checkpoint_manager.py: new prune_checkpoints() and maybe_auto_prune_checkpoints() helpers. Deletes shadow repos that are orphan (HERMES_WORKDIR marker points to a path that no longer exists) or stale (newest in-repo mtime older than retention_days). Idempotent via a CHECKPOINT_BASE/.last_prune marker file so it only runs once per min_interval_hours regardless of how many hermes processes start up. - hermes_cli/config.py: new checkpoints.auto_prune / retention_days / delete_orphans / min_interval_hours knobs. Default auto_prune: false so users who rely on /rollback against long-ago sessions never lose data silently. - cli.py / gateway/run.py: startup hooks gated on checkpoints.auto_prune, called right next to the existing state.db maintenance block. - Docs updated with the new config knobs. - 11 regression tests: orphan/stale deletion, precedence, byte-freed tracking, non-shadow dir skip, interval gating, corrupt marker recovery. Refs #3015 (session-file disk growth was fixed in #16286; this covers the checkpoint side noted out-of-scope there).	2026-04-26 19:05:52 -07:00
Teknium	87610ce380	fix(tools): coerce quoted use_gateway in image_gen UI detection Follow-up to #15960 — the provider-active detection in tools_config.py also read use_gateway with raw truthiness (is False, not dict.get), so quoted 'false' caused the FAL-direct row to show wrong active status in the hermes tools picker. Route both sites through is_truthy_value().	2026-04-26 19:02:55 -07:00
Yoimex	f66ebe64e8	fix(cli): coerce use_gateway config flags in tool routing	2026-04-26 19:02:55 -07:00
Teknium	34eb1aaa9a	fix(update): use npm ci to stop rewriting package-lock on every update (#16295) `npm install --silent` (used by `_build_web_ui` and `_update_node_dependencies`) silently rewrites package-lock.json on npm ≥ 10 (strips "peer": true etc.), leaving the working tree dirty after every `hermes update`. The next update then detects the dirty lockfile and stashes it — producing a trail of hermes-update-autostash entries for web/package-lock.json, ui-tui/package-lock.json, and root package-lock.json. Switch to `npm ci` (strict, lockfile-preserving) via a new `_run_npm_install_deterministic` helper that falls back to `npm install` when the lockfile is missing or out of sync (WIP forks). Verified locally: all three lockfiles stay byte-identical after the real _build_web_ui / _update_node_dependencies run twice back-to-back. Fallback path tested with a deliberately out-of-sync lockfile and a no-lockfile case.	2026-04-26 18:51:31 -07:00
Teknium	ab6879634e	yuanbao platform (#16298) Co-authored-by: loongzhao <loongzhao@tencent.com>	2026-04-26 18:50:49 -07:00
Teknium	5eb6cd82b2	fix(sessions): /save lands under $HERMES_HOME, widen browse+TUI picker, force-refresh ollama-cloud on setup (#16296 ) Four independent session-UX bugs reported by an external user (#16294). /save wrote hermes_conversation_<ts>.json to CWD — invisible to 'hermes sessions browse' and easy to lose. Snapshots now write under ~/.hermes/sessions/saved/ and the command prints the absolute path plus a 'hermes --resume <id>' hint for the live DB-indexed session. 'hermes sessions browse' default --limit raised from 50 to 500. With the old ceiling, users with moderately long histories saw only the most recent 50 rows and assumed older sessions had been lost. TUI session.list (`/resume` picker) switched from a hardcoded allow-list of 13 gateway source names to a deny-list of just { 'tool' }. Sessions tagged acp / webhook / user-defined HERMES_SESSION_SOURCE values and any newly-added platform now surface. Default limit 20 → 200. ollama-cloud provider setup passes force_refresh=True to fetch_ollama_cloud_models() so a user entering their API key sees the fresh catalog (e.g. deepseek v4 flash, kimi k2.6) immediately instead of waiting up to an hour for the disk cache TTL to expire. Closes #16294.	2026-04-26 18:49:48 -07:00
Teknium	55e9329ee6	feat(config): register bundled-skill API keys in OPTIONAL_ENV_VARS Adds NOTION_API_KEY, LINEAR_API_KEY, TENOR_API_KEY, and AIRTABLE_API_KEY to OPTIONAL_ENV_VARS so: - They persist to ~/.hermes/.env via save_env_value like every other key Hermes knows about, instead of being ad-hoc variables the user has to hand-edit the dotfile for. - load_env() / reload_env() populate os.environ from .env on every startup — the user sets the key once, skills keep working across restarts without losing access. - hermes setup / hermes config show surface them as known optional vars with the correct signup URL (linear.app/settings/api, airtable.com/create/tokens, etc.). These four entries use category="skill" (new) rather than "tool". tools/environments/local.py auto-adds every category=tool/messaging entry to _HERMES_PROVIDER_ENV_BLOCKLIST, which stops env passthrough from leaking provider credentials into the execute_code sandbox (GHSA-rhgp-j443-p4rf). Skill API keys are the opposite case — the point is for the agent's subprocess to see them so curl can read Authorization headers — so they must be outside the blocklist. The new category is inert for that check. All four entries are advanced=True: they show up in 'hermes config' and 'hermes status' displays, but do not nag users who have never touched those skills during setup checklists. E2E verified: save_env_value → reload_env → os.environ populated → skill_view reports setup_needed=False → env_passthrough registers the key for subprocess inheritance.	2026-04-26 18:45:15 -07:00
George Glessner	5b5a53a155	fix(cli): check hermes_cli/web_dist/ not web/dist/ for build staleness _web_ui_build_needed() in PR #14914 checked web_dir/"dist" as the sentinel, but vite.config.ts sets outDir: "../hermes_cli/web_dist" so the build output lands in hermes_cli/web_dist/, never in web/dist/. The sentinel was therefore always missing → _web_ui_build_needed always returned True → npm install + Vite build ran on every startup → OOM on low-memory VPS persisted unchanged. Fix: derive dist_dir as web_dir.parent / "hermes_cli" / "web_dist" so the sentinel points to the actual build output directory. Fixes #14898	2026-04-26 18:43:57 -07:00
Yang Zhi	3b60abb6bb	fix(sessions): delete on-disk transcript files during prune and delete (#3015) `delete_session()` and `prune_sessions()` only removed SQLite records, leaving .json/.jsonl transcript files on disk forever. Over time this causes unbounded disk growth (~27MB/day observed). Changes: - Add `_remove_session_files()` static helper that cleans up `{session_id}.json`, `.jsonl`, and `request_dump_{session_id}_*.json` - `delete_session()` accepts optional `sessions_dir` param and removes files for the deleted session and its children - `prune_sessions()` accepts optional `sessions_dir` param and removes files for all pruned sessions after the DB transaction - Wire up CLI `hermes sessions delete` and `hermes sessions prune` to pass `sessions_dir` - File cleanup is best-effort (OSError silenced) so DB operations are never blocked by filesystem issues - Fully backward-compatible: `sessions_dir=None` (default) preserves existing behavior	2026-04-26 18:31:07 -07:00
Teknium	635253b918	feat(busy): add 'steer' as a third display.busy_input_mode option (#16279 ) Enter while the agent is busy can now inject the typed text via /steer — arriving at the agent after the next tool call — instead of interrupting (current default) or queueing for the next turn. Changes: - cli.py: keybinding honors busy_input_mode='steer' by calling agent.steer(text) on the UI thread (thread-safe), with automatic fallback to 'queue' when the agent is missing, steer() is unavailable, images are attached, or steer() rejects the payload. /busy accepts 'steer' as a fourth argument alongside queue/interrupt/status. - gateway/run.py: busy-message handler and the PRIORITY running-agent path both route through running_agent.steer() when the mode is 'steer', with the same fallback-to-queue safety net. Ack wording tells users their message was steered into the current run. Restart-drain queueing now also activates for 'steer' so messages aren't lost across restarts. - agent/onboarding.py: first-touch hint has a steer branch for both CLI and gateway. - hermes_cli/commands.py: /busy args_hint updated to include steer, and 'steer' is registered as a subcommand (completions). - hermes_cli/web_server.py: dashboard select widget offers steer. - hermes_cli/config.py, cli-config.yaml.example, hermes_cli/tips.py: inline docs updated. - website/docs/user-guide/cli.md + messaging/index.md: documented. - Tests: steer set/status path for /busy; onboarding hints; _load_busy_input_mode accepts steer; busy-session ack exercises steer success + two fallback-to-queue branches. Requested on X by @CodingAcct. Default is unchanged (interrupt).	2026-04-26 18:21:29 -07:00
Brooklyn Nicholson	bde89c169b	fix(cli): -c picks the most recently used session	2026-04-26 16:17:39 -05:00
Brooklyn Nicholson	1566f1eecc	fix(tui): report actual session on exit	2026-04-26 15:55:01 -05:00
Teknium	4921b26945	fix(cron): keep homeassistant toolset enabled when HASS_TOKEN is set (#16208) After #14798 made cron honor per-platform `hermes tools` config, the `_DEFAULT_OFF_TOOLSETS` filter silently stripped `homeassistant` from cron jobs for users who'd been relying on the previous blanket toolset. Norbert's HA cron reports regressed as a result. The HA toolset is already runtime-gated by its `check_fn` (requires HASS_TOKEN to register any tools). When HASS_TOKEN is set the user has explicitly opted in — `_DEFAULT_OFF_TOOLSETS` adds nothing in that case, so stop double-gating and restore HA for cron / cli / other platforms without an explicit saved toolset list. moa and rl stay off by default (original #14798 goal preserved). Fixes HA cron regression reported by Norbert.	2026-04-26 12:55:58 -07:00
Teknium	541cd732e8	chore(models): drop deepseek from OpenRouter and Nous Portal curated picker lists (#16197) Removes deepseek/deepseek-v4-pro and deepseek/deepseek-v4-flash from OPENROUTER_MODELS and _PROVIDER_MODELS['nous'], then regenerates website/static/api/model-catalog.json so the hosted picker JSON drops them too. Direct-API deepseek provider support is unchanged.	2026-04-26 12:28:17 -07:00
Teknium	897dc3a2bb	fix(install+update): add /usr/local/bin PATH guard for RHEL root non-login shells (#16191 ) * fix(install): add /usr/local/bin PATH guard for RHEL root non-login shells The FHS-layout branch assumed /usr/local/bin is on PATH for every standard shell. That holds for login shells (via /etc/profile's pathmunge) but breaks on RHEL/CentOS/Rocky/Alma 8+ root in non-login interactive shells (su, sudo -s, tmux panes, some web terminals) — /etc/bashrc does not add /usr/local/bin and /root/.bash_profile doesn't either. Result: hermes command links to /usr/local/bin/hermes but the user has to type the absolute path each time. Probe a fresh 'bash -i -c' (non-login interactive, matching the user scenario) after symlinking. If hermes isn't resolvable, append an idempotent PATH guard to /root/.bashrc and /root/.bash_profile, same grep pattern already used by the ~/.local/bin branch below. No change on distros where /usr/local/bin is already inherited. * fix(update): repair RHEL root PATH on hermes update Existing RHEL/CentOS/Rocky/Alma root installs won't be repaired by the install.sh fix alone because 'hermes update' is an in-place git pull, not a rerun of install.sh. Port the same probe + idempotent .bashrc write into cmd_update so affected users get fixed automatically on next update. _ensure_fhs_path_guard() runs after 'Update complete!': - Linux + root + FHS-layout install (command at /usr/local/bin/hermes) only - Probe: env -i bash -i -c 'command -v hermes' — fresh non-login interactive shell, same scenario the user reports - On failure, append PATH guard to /root/.bashrc and /root/.bash_profile, skipping if any uncommented PATH line already mentions /usr/local/bin - Silent no-op on macOS, non-root, legacy layout, or shells that already resolve hermes	2026-04-26 12:22:37 -07:00
Teknium	087e74d4d7	feat(slack): register every gateway command as a native slash (Discord/Telegram parity) (#16164 ) Every command in COMMAND_REGISTRY (/btw, /stop, /model, /help, /new, /bg, /reset, ...) is now a first-class Slack slash command instead of a /hermes <subcommand>. Users get the same autocomplete-driven slash picker experience Slack users expect and that Discord and Telegram already provide. Previously Slack registered ONE native slash (/hermes) and split on the first word, so typing /btw in Slack's composer got 'couldn't find an app for /btw' because the workspace manifest never declared it. Changes - hermes_cli/commands.py: slack_native_slashes() + slack_app_manifest() generate a Slack manifest from the registry (canonical names + aliases + plugin commands), clamped to Slack's 50-slash cap with /hermes reserved as the catch-all. - gateway/platforms/slack.py: single regex matcher dispatches every registered slash to _handle_slash_command, which dispatches on command['command']. Legacy /hermes <subcommand> keeps working for backward compat with older workspace manifests. - hermes_cli/slack_cli.py + hermes_cli/main.py: new 'hermes slack manifest' command prints/writes a full manifest (display info, OAuth scopes, event subs, socket mode, slash commands) ready to paste into 'Create from manifest' or Features → App Manifest. - hermes_cli/setup.py: _setup_slack() now writes the manifest up-front and points users at the 'From an app manifest' flow; also offers to refresh the manifest on reconfigure for picking up new commands. - Tests: 14 new tests covering native-slash dispatch (/btw, /stop, /model), legacy /hermes <sub> compat, manifest structure, and telegram<->slack parity (every Telegram command must also register as a Slack slash). Existing /hermes-registration test updated to assert the new regex matches /hermes, /btw, /stop, /model, /help. 
- Docs: slack.md gains a 'Slash Commands' section + Option A manifest flow in Step 1; cli-commands.md documents 'hermes slack manifest'. Users pick up the new slashes by running 'hermes slack manifest --write' and pasting into Features → App Manifest → Edit in their Slack app config, then Save (Slack prompts for reinstall if scopes changed).	2026-04-26 11:38:32 -07:00
Teknium	42c076d349	feat(browser): auto-spawn local Chromium for LAN/localhost URLs in cloud mode (#16136 ) When a cloud browser provider (Browserbase / Browser-Use / Firecrawl) is configured, browser_navigate now transparently spawns a local Chromium sidecar for URLs whose host resolves to a private/loopback/LAN address (localhost, 127.0.0.1, 192.168.x.x, 10.x.x.x, .local, .lan, *.internal, ::1, 169.254.x.x). Public URLs continue to use the cloud provider in the same conversation. Previously, setting BROWSERBASE_API_KEY / cloud_provider: browserbase pinned the whole tool to cloud for the process — localhost URLs were either SSRF-blocked (default) or sent to Browserbase (where they 404'd because the cloud can't reach your LAN). Users who wanted 'cloud for public, local for localhost' had no way to express it short of toggling providers mid-session. Implementation uses a composite session key scheme: the bare task_id serves the cloud session, and a '{task_id}::local' sidecar serves the local Chromium. _last_active_session_key[task_id] tracks which of the two served the most recent nav so snapshot/click/fill/etc. hit the correct one. cleanup_browser(bare_task_id) reaps both. Feature is on by default. Opt out via: browser: auto_local_for_private_urls: false The cloud provider never sees private URLs. Post-redirect SSRF guard is preserved: redirects from public onto private addresses still block.	2026-04-26 09:57:58 -07:00
Teknium	0e2a53eab2	feat(skills): show enabled/disabled status in 'skills list' (#16129) 'hermes skills list' now shows every skill's enabled/disabled status and accepts --enabled-only to filter down to what will actually load for the active profile: hermes -p dario skills list --enabled-only Previously the command was a flat catalog — it did not apply skills.disabled from config.yaml, so there was no way to see the live skill set for a profile without reading config by hand. Profile switching already works via -p (swaps HERMES_HOME); this just surfaces the result visibly. Changes: - hermes_cli/skills_hub.py: do_list adds a Status column and an enabled_only filter; summary reports enabled/disabled split - hermes_cli/main.py: --enabled-only flag on 'skills list' - /skills list slash command accepts --enabled-only too - tests: 4 new (status column, disabled marking, enabled-only hiding, no platform leakage into get_disabled_skill_names); existing fixtures updated to accept skip_disabled kwarg Reported by @mochizukimr on X.	2026-04-26 09:20:53 -07:00
Teknium	f2d655529a	fix(auth): hoist get_env_value import + strengthen .env fallback tests Follow-up to cherry-picked PR #15920: - agent/credential_pool.py: hoist 'from hermes_cli.config import get_env_value' to module top instead of inline try/except in each seed site (3 sites). No import cycle — hermes_cli/config.py doesn't depend on agent.credential_pool. - hermes_cli/auth.py: same hoist for the _resolve_api_key_provider_secret loop. - tests/tools/test_credential_pool_env_fallback.py: replace smoke-only tests with real .env file I/O. Each test writes a temp ~/.hermes/.env, verifies _seed_from_env / _resolve_api_key_provider_secret read from it, and asserts the full priority chain: os.environ > .env > credential_pool. Uses 'deepseek' as the test provider since 'openai' isn't in PROVIDER_REGISTRY and _seed_from_env's generic path requires a real pconfig lookup.	2026-04-26 08:32:09 -07:00
阿泥豆	8443998dc3	fix(auth): resolve API keys from ~/.hermes/.env and credential_pool _resolve_api_key_provider_secret() and _seed_from_env() only checked os.environ for provider API keys. When keys exist in ~/.hermes/.env but are not loaded into the process environment (e.g. ACP adapter entry point, post-session-start .env edits, or non-CLI entry points), the resolution returns an empty string, causing HTTP 401 failures. Changes: - credential_pool._seed_from_env: use get_env_value() which checks both os.environ and ~/.hermes/.env file, preventing _prune_stale_seeded_entries from removing valid entries whose env var isn't in os.environ - credential_pool._seed_from_env: same fix for openrouter and base_url_env_var resolution - auth._resolve_api_key_provider_secret: use get_env_value() instead of os.getenv(), and add credential_pool fallback when env resolution fails Fixes #15914	2026-04-26 08:32:09 -07:00
Teknium	06f81752ed	Revert "feat(kanban): durable multi-profile collaboration board (#16081)" (#16098) This reverts commit `15937a6b46`.	2026-04-26 08:29:37 -07:00
Teknium	15937a6b46	feat(kanban): durable multi-profile collaboration board (#16081 ) New `hermes kanban` CLI subcommand + `/kanban` slash command + skills for worker and orchestrator profiles. SQLite-backed task board (~/.hermes/kanban.db) shared across all profiles on the host. Zero changes to run_agent.py, no new core tools, no tool-schema bloat. Motivation: delegate_task is a function call — sync fork/join, anonymous subagent, no resumability, no human-in-the-loop. Kanban is the durable shape needed for research triage, scheduled ops, digital twins, engineering pipelines, and fleet work. They coexist (workers may call delegate_task internally). What this adds - hermes_cli/kanban_db.py — schema, CAS claim, dependency resolution, dispatcher, workspace resolution, worker-context builder. - hermes_cli/kanban.py — 15-verb CLI surface and shared run_slash() entry point used by both CLI and gateway. - skills/devops/kanban-worker — how a profile should work a claimed task. - skills/devops/kanban-orchestrator — "you are a dispatcher, not a worker" template with anti-temptation rules. - /kanban slash command wired into cli.py and gateway/run.py. Bypasses the running-agent guard (board writes don't touch agent state), so /kanban unblock can free a stuck worker mid-conversation. - Design spec at docs/hermes-kanban-v1-spec.pdf — comparative analysis vs Cline Kanban, Paperclip, NanoClaw, Gemini Enterprise; 8 patterns; 4 user stories; implementation plan; concurrency correctness. - Docs: website/docs/user-guide/features/kanban.md, CLI reference updated, sidebar entry added. Architecture highlights - Three planes: control (user + gateway), state (board + dispatcher), execution (pool of profile processes). - Every worker is a full OS process, spawned as `hermes -p <profile>`. No in-process subagent swarms — solves NanoClaw's SDK-lifecycle failure class. - Atomic claim via SQLite CAS in a BEGIN IMMEDIATE transaction; stale claims reclaimed 15 min after their TTL expires. 
- Tenant namespacing via one nullable column — one specialist fleet can serve many businesses with data isolation by workspace path. Tests: 60 targeted tests (schema, CAS atomicity, dependency resolution, dispatcher, workspace kinds, tenancy, CLI + slash surface). All pass hermetic via scripts/run_tests.sh.	2026-04-26 08:24:26 -07:00
Teknium	7fa70b6c87	refactor: /btw is now an alias for /background (#16053) The ephemeral no-tools side-question variant of /btw confused users who expected 'by-the-way' to mean 'run this off to the side with tools' — they'd type /btw and get a toolless agent that couldn't do the work. /bg worked because it was /background with full tools. Collapse the two: /btw and /bg both alias to /background. One command, one behavior, no more gotchas about which variant has tools. Removed: - _handle_btw_command in cli.py and gateway/run.py - _run_btw_task + _active_btw_tasks state in gateway/run.py - prompt.btw JSON-RPC method + btw.complete event in tui_gateway - BtwStartResponse type + btw.complete case in ui-tui - Standalone /btw slash tree registration in Discord - Standalone btw CommandDef in hermes_cli/commands.py Updated: - background CommandDef aliases: (bg,) -> (bg, btw) - TUI session.ts: local btw handler merged into background - Docs and tips updated to describe /btw as a /background alias	2026-04-26 07:11:08 -07:00
Teknium	1e37ddc929	feat(cli): add 'hermes fallback' command to manage fallback providers (#16052) Manage the fallback_providers chain from the CLI instead of hand-editing config.yaml. The picker reuses select_provider_and_model() from 'hermes model' — same provider list, same credential prompts, same model picker. hermes fallback [list] Show the current chain (primary + fallbacks) hermes fallback add Run the model picker, append selection to chain hermes fallback remove Pick an entry to delete (arrow-key menu) hermes fallback clear Remove all entries (with confirmation) 'add' snapshots config['model'] before calling the picker, extracts the user's selection from the post-picker state, then restores the primary and appends {provider, model, base_url?, api_mode?} to fallback_providers. Auth store's active_provider is snapshot/restored too so OAuth-provider fallbacks don't silently deactivate the user's primary. Duplicates and self-as-fallback are rejected. Legacy single-dict 'fallback_model' entries are auto-migrated to the list format on first write.	2026-04-26 06:19:04 -07:00
Teknium	83c1c201f6	feat(onboarding): contextual first-touch hints for /busy and /verbose (#16046 ) Instead of a blocking first-run questionnaire, show a one-time hint the first time the user hits each behavior fork: 1. First message while the agent is working — appends a hint to the busy-ack explaining the /busy queue vs /busy interrupt knob, phrased to match the mode that was just applied (don't tell a queue-mode user to switch to queue). 2. First tool that runs for >= 30s in the noisiest progress mode (tool_progress: all) — prints a hint about /verbose to cycle display modes (all -> new -> off -> verbose). Gated on /verbose actually being usable on the surface: always shown on CLI; on gateway only shown when display.tool_progress_command is enabled. Each hint is latched in config.yaml under onboarding.seen.<flag>, so it fires exactly once per install across CLI, gateway, and cron, then never again. Users can wipe the section to re-see hints. New: - agent/onboarding.py — is_seen / mark_seen / hint strings, shared by both CLI and gateway. - onboarding.seen in DEFAULT_CONFIG (hermes_cli/config.py) and in load_cli_config defaults (cli.py). No _config_version bump — deep merge handles new keys. Wired: - gateway/run.py: _handle_active_session_busy_message appends the hint after building the ack. progress_callback tracks tool.completed duration and queues the tool-progress hint into the progress bubble. - cli.py: CLI input loop appends the busy-input hint on the first busy Enter; _on_tool_progress appends the tool-progress hint on the first >=30s tool completion. In-memory CLI_CONFIG is also updated so subsequent fires in the same process are suppressed immediately. All writes go through atomic_yaml_write and are wrapped in try/except so onboarding can never break the input/busy-ack paths.	2026-04-26 06:06:27 -07:00
Teknium	855366909f	feat(models): remote model catalog manifest for OpenRouter + Nous Portal (#16033 ) OpenRouter and Nous Portal curated picker lists now resolve via a JSON manifest served by the docs site, falling back to the in-repo snapshot when unreachable. Lets us update model lists without shipping a release. Live URL: https://hermes-agent.nousresearch.com/docs/api/model-catalog.json (source at website/static/api/model-catalog.json; auto-deploys via the existing deploy-site.yml GitHub Pages pipeline on every merge to main). Schema (v1) carries id + optional description + free-form metadata at manifest, provider, and model levels. Pricing and context length stay live-fetched via existing machinery (/v1/models endpoints, models.dev). Config (new model_catalog section, default enabled): model_catalog.url master manifest URL model_catalog.ttl_hours disk cache TTL (default 24h) model_catalog.providers.<name>.url optional per-provider override Fetch pipeline: in-process cache -> disk cache (fresh < TTL) -> HTTP fetch -> disk-cache-on-failure fallback -> in-repo snapshot as last resort. Never raises to callers; at worst returns the bundled list. Changes: - website/static/api/model-catalog.json initial manifest (35 OR + 31 Nous) - scripts/build_model_catalog.py regenerator from in-repo lists - hermes_cli/model_catalog.py fetch + validate + cache module - hermes_cli/models.py fetch_openrouter_models() + new get_curated_nous_model_ids() - hermes_cli/main.py, hermes_cli/auth.py Nous flows use the helper - hermes_cli/config.py model_catalog defaults - website/docs/reference/model-catalog.md + sidebars.ts - tests/hermes_cli/test_model_catalog.py 21 tests (validation, fetch success/failure, accessors, disabled, overrides, integration)	2026-04-26 05:46:43 -07:00
Teknium	59b56d445c	feat(hooks): add duration_ms to post_tool_call + transform_tool_result (#15429 ) Plugin hooks fired after a tool dispatch now receive an integer duration_ms kwarg measuring how long the tool's registry.dispatch() call took (time.monotonic() before/after). Inspired by Claude Code 2.1.119 which added the same field to PostToolUse hook inputs. Wire points: - model_tools.py: measure dispatch latency, pass duration_ms to invoke_hook("post_tool_call", ...) and invoke_hook("transform_tool_result", ...) - hermes_cli/hooks.py: include duration_ms in the synthetic payload used by 'hermes hooks test' and 'hermes hooks doctor' so shell-hook authors see the same shape at development time as runtime - shell hooks (agent/shell_hooks.py): no code change needed; _serialize_payload already surfaces non-top-level kwargs under payload['extra'], so duration_ms lands at extra.duration_ms for shell-hook scripts Plugin authors can now build latency dashboards, per-tool SLO alerts, and regression canaries without having to wrap every tool manually. Test: tests/test_model_tools.py::test_post_tool_call_receives_non_negative_integer_duration_ms E2E: real PluginManager + dispatch monkey-patched with a 50ms sleep, hook callback observes duration_ms=50 (int). Refs: https://code.claude.com/docs/en/changelog (2.1.119, Apr 23 2026)	2026-04-25 22:13:12 -07:00
Teknium	a55de5bcd0	feat(setup): auto-reconfigure on existing installs (#15879 ) Bare `hermes setup` on a returning user now drops straight into the full reconfigure wizard — every prompt shows the current value as its default, press Enter to keep or type a new value to change it. The returning-user menu is gone. Behavior: - First-time user: first-time wizard (unchanged) - Returning user, bare command: full reconfigure wizard (new default) - Returning user, `--quick`: only prompt for missing/unset items - Returning user, one section: `hermes setup model\|terminal\|gateway\|tools\|agent` - `--reconfigure`: preserved as backwards-compat alias (no-op since it's now default) The section functions already used current values as prompt defaults — this change just removes the extra click to get to them. The 'Quick Setup - configure missing items only' menu option is now exposed as the explicit `--quick` flag; it's the narrow case of filling in missing config (e.g. after a partial OpenClaw migration or when a required API key got cleared). Inspired by Mercury Agent's `mercury doctor` UX. Also removes: - RETURNING_USER_MENU_SECTION_KEYS (orphaned constant) - Two returning-user menu tests in test_setup_noninteractive.py (guarding behavior that no longer exists — covered by test_setup_reconfigure.py instead)	2026-04-25 22:02:02 -07:00
Teknium	731e1ef8cb	feat(azure-foundry): auto-detect transport, models, context length The azure-foundry wizard now probes the endpoint before asking the user to pick anything by hand: 1. URL path sniff — endpoints ending in /anthropic are Azure Foundry Claude routes and skip to anthropic_messages. 2. GET <base>/models probe — if the endpoint returns an OpenAI-shaped model list, we switch to chat_completions and prefill the picker with the returned deployment/model IDs. 3. Anthropic Messages probe — fallback for endpoints that don't expose /models but do speak the Anthropic Messages shape. 4. Manual fallback — private endpoints / custom routes still work; the user picks API mode + types a deployment name. Context length for the selected model is resolved through the existing agent.model_metadata.get_model_context_length chain (models.dev, provider metadata, hardcoded family fallbacks) and stored in model.context_length when a non-default value is found. Also refactors runtime_provider so Azure Foundry resolution is reused between the explicit-credentials path and the default top-level path — previously the /v1 strip for Anthropic-style Azure only ran when the caller passed explicit_* args, which meant config-driven sessions hit a double-/v1 URL. New module hermes_cli/azure_detect.py with 19 unit tests covering: - path sniff, model ID extraction, probe fallbacks - HTTP error handling (URLError, HTTPError) - context-length lookup passthrough - DEFAULT_FALLBACK_CONTEXT rejection New runtime tests cover: - OpenAI-style Azure Foundry - Anthropic-style Azure Foundry with /v1 stripping - Missing base_url / API key raising AuthError Rationale: Microsoft confirms there's no pure-API-key endpoint to list Azure deployments (that requires ARM management auth). The v1 Azure OpenAI endpoint does expose /models with the resource's available model catalog, which is good enough for picker prefill in the common case. Users on private/gated endpoints fall through to manual entry.	2026-04-25 18:48:43 -07:00
HangGlidersRule	d8e4c7214e	fix: Azure Anthropic short-circuit in resolve_runtime_provider — bypass custom runtime when provider=anthropic + azure.com URL	2026-04-25 18:48:43 -07:00
HangGlidersRule	6ef3a47ce5	fix: use Azure API key directly for Azure endpoints, bypass OAuth token priority chain	2026-04-25 18:48:43 -07:00
TechPrototyper	3a7653dd1f	feat: Add Azure Foundry provider with OpenAI/Anthropic API mode selection Add support for Azure Foundry as a new inference provider. Azure Foundry endpoints can use either OpenAI-style (/v1/chat/completions) or Anthropic-style (/v1/messages) API formats. Changes: - Add azure-foundry to PROVIDER_REGISTRY (auth.py) - Add azure-foundry overlay in HERMES_OVERLAYS (providers.py) - Add empty model list for azure-foundry (models.py) - Add _model_flow_azure_foundry() interactive setup (main.py) - Add azure-foundry runtime resolution with api_mode support (runtime_provider.py) - Add AZURE_FOUNDRY_API_KEY and AZURE_FOUNDRY_BASE_URL env vars (config.py) Usage: hermes model -> More providers -> Azure Foundry The setup wizard prompts for: - Endpoint URL - API format (OpenAI or Anthropic-style) - API key - Model name Configuration is saved to config.yaml (model.provider, model.base_url, model.api_mode, model.default) and ~/.hermes/.env (AZURE_FOUNDRY_API_KEY).	2026-04-25 18:48:43 -07:00
Teknium	125de02056	fix(context): honor custom_providers context_length on /model switch + bump probe tier to 256K (#15844 ) Fixes #15779. Custom-provider per-model context_length (`custom_providers[].models.<id>.context_length`) is now honored across every resolution path, not just agent startup. Also adds 256K as the top probe tier and default fallback. ## What changed New helper `hermes_cli.config.get_custom_provider_context_length()` — single source of truth for the per-model override lookup, with trailing-slash-insensitive base-url matching. `agent.model_metadata.get_model_context_length()` gains an optional `custom_providers=` kwarg (step 0b — runs after explicit `config_context_length` but before every other probe). Wired through five call sites that previously either duplicated the lookup or ignored it entirely: - `run_agent.py` startup — refactored to use the new helper (dedups legacy inline loop, keeps invalid-value warning) - `AIAgent.switch_model()` — re-reads custom_providers from live config on every /model switch - `hermes_cli.model_switch.resolve_display_context_length()` — new `custom_providers=` kwarg - `gateway/run.py` /model confirmation (picker callback + text path) - `gateway/run.py` `_format_session_info` (/info) ## Context probe tiers `CONTEXT_PROBE_TIERS = [256_000, 128_000, 64_000, 32_000, 16_000, 8_000]` — was `[128_000, ...]`. `DEFAULT_FALLBACK_CONTEXT` follows tier[0], so unknown models now default to 256K. The stale `128000` literal in the OpenRouter metadata-miss path is replaced with `DEFAULT_FALLBACK_CONTEXT` for consistency. ## Repro (from #15779) ```yaml custom_providers: - name: my-custom-endpoint base_url: https://example.invalid/v1 model: gpt-5.5 models: gpt-5.5: context_length: 1050000 ``` `/model gpt-5.5 --provider custom:my-custom-endpoint` → previously "Context: 128,000", now "Context: 1,050,000". 
## Tests - `tests/hermes_cli/test_custom_provider_context_length.py` — new file, 19 tests covering the helper, step-0b integration, and the 256K tier invariants - `tests/hermes_cli/test_model_switch_context_display.py` — added regression tests for #15779 through the display resolver - `tests/gateway/test_session_info.py` — updated default-fallback assertion (128K → 256K) - `tests/agent/test_model_metadata.py` — updated tier assertions for the new top tier	2026-04-25 18:47:53 -07:00
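
The per-model override lookup described in 125de02056 can be sketched roughly as follows. This is a minimal illustration, not the project's code: the function names and signatures are hypothetical stand-ins (the log does not show the real signature of `get_custom_provider_context_length()`), and the intermediate probes in the real resolution chain (models.dev, provider metadata, family fallbacks) are skipped. Only the tier values, the config shape, and the trailing-slash-insensitive base-url match come from the commit message.

```python
from typing import Any, Optional

# Tier values as listed in the commit message; the fallback follows tier[0].
CONTEXT_PROBE_TIERS = [256_000, 128_000, 64_000, 32_000, 16_000, 8_000]
DEFAULT_FALLBACK_CONTEXT = CONTEXT_PROBE_TIERS[0]


def lookup_custom_context_length(
    custom_providers: Optional[list[dict[str, Any]]],
    base_url: str,
    model_id: str,
) -> Optional[int]:
    """Hypothetical helper: return the per-model context_length override, if any.

    Base URLs are compared with trailing slashes stripped, so
    'https://host/v1' and 'https://host/v1/' count as the same endpoint.
    """
    wanted = base_url.rstrip("/")
    for entry in custom_providers or []:
        if str(entry.get("base_url", "")).rstrip("/") != wanted:
            continue
        model_cfg = (entry.get("models") or {}).get(model_id)
        if not isinstance(model_cfg, dict):
            continue
        value = model_cfg.get("context_length")
        # Accept only positive integers; bool is excluded because it subclasses int.
        if isinstance(value, int) and not isinstance(value, bool) and value > 0:
            return value
    return None


def resolve_context_length(
    custom_providers: Optional[list[dict[str, Any]]],
    base_url: str,
    model_id: str,
) -> int:
    """Simplified chain (an assumption): override first, then the default fallback."""
    override = lookup_custom_context_length(custom_providers, base_url, model_id)
    return override if override is not None else DEFAULT_FALLBACK_CONTEXT


# Config mirroring the YAML repro from #15779.
providers = [
    {
        "name": "my-custom-endpoint",
        "base_url": "https://example.invalid/v1",
        "model": "gpt-5.5",
        "models": {"gpt-5.5": {"context_length": 1050000}},
    }
]

print(resolve_context_length(providers, "https://example.invalid/v1/", "gpt-5.5"))  # 1050000
print(resolve_context_length(providers, "https://example.invalid/v1", "unknown"))   # 256000
```

With the repro config, the override wins even when the caller's base URL carries a trailing slash, and a model without an override falls back to the 256K top tier.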

1446 Commits