Files
hermes-agent/website/docs/user-guide/features/fallback-providers.md

373 lines
14 KiB
Markdown
Raw Normal View History

docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
---
title: Fallback Providers
description: Configure automatic failover to backup LLM providers when your primary model is unavailable.
sidebar_label: Fallback Providers
sidebar_position: 8
---
# Fallback Providers
feat(auth): same-provider credential pools with rotation, custom endpoint support, and interactive CLI (#2647) * feat(auth): add same-provider credential pools and rotation UX Add same-provider credential pooling so Hermes can rotate across multiple credentials for a single provider, recover from exhausted credentials without jumping providers immediately, and configure that behavior directly in hermes setup. - agent/credential_pool.py: persisted per-provider credential pools - hermes auth add/list/remove/reset CLI commands - 429/402/401 recovery with pool rotation in run_agent.py - Setup wizard integration for pool strategy configuration - Auto-seeding from env vars and existing OAuth state Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com> Salvaged from PR #2647 * fix(tests): prevent pool auto-seeding from host env in credential pool tests Tests for non-pool Anthropic paths and auth remove were failing when host env vars (ANTHROPIC_API_KEY) or file-backed OAuth credentials were present. The pool auto-seeding picked these up, causing unexpected pool entries in tests. - Mock _select_pool_entry in auxiliary_client OAuth flag tests - Clear Anthropic env vars and mock _seed_from_singletons in auth remove test * feat(auth): add thread safety, least_used strategy, and request counting - Add threading.Lock to CredentialPool for gateway thread safety (concurrent requests from multiple gateway sessions could race on pool state mutations without this) - Add 'least_used' rotation strategy that selects the credential with the lowest request_count, distributing load more evenly - Add request_count field to PooledCredential for usage tracking - Add mark_used() method to increment per-credential request counts - Wrap select(), mark_exhausted_and_rotate(), and try_refresh_current() with lock acquisition - Add tests: least_used selection, mark_used counting, concurrent thread safety (4 threads × 20 selects with no corruption) * feat(auth): add interactive mode for bare 'hermes auth' command When 'hermes auth' is called without a subcommand, it now launches an interactive wizard that: 1. Shows full credential pool status across all providers 2. Offers a menu: add, remove, reset cooldowns, set strategy 3. For OAuth-capable providers (anthropic, nous, openai-codex), the add flow explicitly asks 'API key or OAuth login?' — making it clear that both auth types are supported for the same provider 4. Strategy picker shows all 4 options (fill_first, round_robin, least_used, random) with the current selection marked 5. Remove flow shows entries with indices for easy selection The subcommand paths (hermes auth add/list/remove/reset) still work exactly as before for scripted/non-interactive use. * fix(tests): update runtime_provider tests for config.yaml source of truth (#4165) Tests were using OPENAI_BASE_URL env var which is no longer consulted after #4165. Updated to use model config (provider, base_url, api_key) which is the new single source of truth for custom endpoint URLs. * feat(auth): support custom endpoint credential pools keyed by provider name Custom OpenAI-compatible endpoints all share provider='custom', making the provider-keyed pool useless. Now pools for custom endpoints are keyed by 'custom:<normalized_name>' where the name comes from the custom_providers config list (auto-generated from URL hostname). - Pool key format: 'custom:together.ai', 'custom:local-(localhost:8080)' - load_pool('custom:name') seeds from custom_providers api_key AND model.api_key when base_url matches - hermes auth add/list now shows custom endpoints alongside registry providers - _resolve_openrouter_runtime and _resolve_named_custom_runtime check pool before falling back to single config key - 6 new tests covering custom pool keying, seeding, and listing * docs: add Excalidraw diagram of full credential pool flow Comprehensive architecture diagram showing: - Credential sources (env vars, auth.json OAuth, config.yaml, CLI) - Pool storage and auto-seeding - Runtime resolution paths (registry, custom, OpenRouter) - Error recovery (429 retry-then-rotate, 402 immediate, 401 refresh) - CLI management commands and strategy configuration Open at: https://excalidraw.com/#json=2Ycqhqpi6f12E_3ITyiwh,c7u9jSt5BwrmiVzHGbm87g * fix(tests): update setup wizard pool tests for unified select_provider_and_model flow The setup wizard now delegates to select_provider_and_model() instead of using its own prompt_choice-based provider picker. Tests needed: - Mock select_provider_and_model as no-op (provider pre-written to config) - Call _stub_tts BEFORE custom prompt_choice mock (it overwrites it) - Pre-write model.provider to config so the pool step is reached * docs: add comprehensive credential pool documentation - New page: website/docs/user-guide/features/credential-pools.md Full guide covering quick start, CLI commands, rotation strategies, error recovery, custom endpoint pools, auto-discovery, thread safety, architecture, and storage format. - Updated fallback-providers.md to reference credential pools as the first layer of resilience (same-provider rotation before cross-provider) - Added hermes auth to CLI commands reference with usage examples - Added credential_pool_strategies to configuration guide * chore: remove excalidraw diagram from repo (external link only) * refactor: simplify credential pool code — extract helpers, collapse extras, dedup patterns - _load_config_safe(): replace 4 identical try/except/import blocks - _iter_custom_providers(): shared generator for custom provider iteration - PooledCredential.extra dict: collapse 11 round-trip-only fields (token_type, scope, client_id, portal_base_url, obtained_at, expires_in, agent_key_id, agent_key_expires_in, agent_key_reused, agent_key_obtained_at, tls) into a single extra dict with __getattr__ for backward-compatible access - _available_entries(): shared exhaustion-check between select and peek - Dedup anthropic OAuth seeding (hermes_pkce + claude_code identical) - SimpleNamespace replaces class _Args boilerplate in auth_commands - _try_resolve_from_custom_pool(): shared pool-check in runtime_provider Net -17 lines. All 383 targeted tests pass. --------- Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-31 03:10:01 -07:00
Hermes Agent has three layers of resilience that keep your sessions running when providers hit issues:
docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
feat(auth): same-provider credential pools with rotation, custom endpoint support, and interactive CLI (#2647) * feat(auth): add same-provider credential pools and rotation UX Add same-provider credential pooling so Hermes can rotate across multiple credentials for a single provider, recover from exhausted credentials without jumping providers immediately, and configure that behavior directly in hermes setup. - agent/credential_pool.py: persisted per-provider credential pools - hermes auth add/list/remove/reset CLI commands - 429/402/401 recovery with pool rotation in run_agent.py - Setup wizard integration for pool strategy configuration - Auto-seeding from env vars and existing OAuth state Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com> Salvaged from PR #2647 * fix(tests): prevent pool auto-seeding from host env in credential pool tests Tests for non-pool Anthropic paths and auth remove were failing when host env vars (ANTHROPIC_API_KEY) or file-backed OAuth credentials were present. The pool auto-seeding picked these up, causing unexpected pool entries in tests. - Mock _select_pool_entry in auxiliary_client OAuth flag tests - Clear Anthropic env vars and mock _seed_from_singletons in auth remove test * feat(auth): add thread safety, least_used strategy, and request counting - Add threading.Lock to CredentialPool for gateway thread safety (concurrent requests from multiple gateway sessions could race on pool state mutations without this) - Add 'least_used' rotation strategy that selects the credential with the lowest request_count, distributing load more evenly - Add request_count field to PooledCredential for usage tracking - Add mark_used() method to increment per-credential request counts - Wrap select(), mark_exhausted_and_rotate(), and try_refresh_current() with lock acquisition - Add tests: least_used selection, mark_used counting, concurrent thread safety (4 threads × 20 selects with no corruption) * feat(auth): add interactive mode for bare 'hermes auth' command When 'hermes auth' is called without a subcommand, it now launches an interactive wizard that: 1. Shows full credential pool status across all providers 2. Offers a menu: add, remove, reset cooldowns, set strategy 3. For OAuth-capable providers (anthropic, nous, openai-codex), the add flow explicitly asks 'API key or OAuth login?' — making it clear that both auth types are supported for the same provider 4. Strategy picker shows all 4 options (fill_first, round_robin, least_used, random) with the current selection marked 5. Remove flow shows entries with indices for easy selection The subcommand paths (hermes auth add/list/remove/reset) still work exactly as before for scripted/non-interactive use. * fix(tests): update runtime_provider tests for config.yaml source of truth (#4165) Tests were using OPENAI_BASE_URL env var which is no longer consulted after #4165. Updated to use model config (provider, base_url, api_key) which is the new single source of truth for custom endpoint URLs. * feat(auth): support custom endpoint credential pools keyed by provider name Custom OpenAI-compatible endpoints all share provider='custom', making the provider-keyed pool useless. Now pools for custom endpoints are keyed by 'custom:<normalized_name>' where the name comes from the custom_providers config list (auto-generated from URL hostname). - Pool key format: 'custom:together.ai', 'custom:local-(localhost:8080)' - load_pool('custom:name') seeds from custom_providers api_key AND model.api_key when base_url matches - hermes auth add/list now shows custom endpoints alongside registry providers - _resolve_openrouter_runtime and _resolve_named_custom_runtime check pool before falling back to single config key - 6 new tests covering custom pool keying, seeding, and listing * docs: add Excalidraw diagram of full credential pool flow Comprehensive architecture diagram showing: - Credential sources (env vars, auth.json OAuth, config.yaml, CLI) - Pool storage and auto-seeding - Runtime resolution paths (registry, custom, OpenRouter) - Error recovery (429 retry-then-rotate, 402 immediate, 401 refresh) - CLI management commands and strategy configuration Open at: https://excalidraw.com/#json=2Ycqhqpi6f12E_3ITyiwh,c7u9jSt5BwrmiVzHGbm87g * fix(tests): update setup wizard pool tests for unified select_provider_and_model flow The setup wizard now delegates to select_provider_and_model() instead of using its own prompt_choice-based provider picker. Tests needed: - Mock select_provider_and_model as no-op (provider pre-written to config) - Call _stub_tts BEFORE custom prompt_choice mock (it overwrites it) - Pre-write model.provider to config so the pool step is reached * docs: add comprehensive credential pool documentation - New page: website/docs/user-guide/features/credential-pools.md Full guide covering quick start, CLI commands, rotation strategies, error recovery, custom endpoint pools, auto-discovery, thread safety, architecture, and storage format. - Updated fallback-providers.md to reference credential pools as the first layer of resilience (same-provider rotation before cross-provider) - Added hermes auth to CLI commands reference with usage examples - Added credential_pool_strategies to configuration guide * chore: remove excalidraw diagram from repo (external link only) * refactor: simplify credential pool code — extract helpers, collapse extras, dedup patterns - _load_config_safe(): replace 4 identical try/except/import blocks - _iter_custom_providers(): shared generator for custom provider iteration - PooledCredential.extra dict: collapse 11 round-trip-only fields (token_type, scope, client_id, portal_base_url, obtained_at, expires_in, agent_key_id, agent_key_expires_in, agent_key_reused, agent_key_obtained_at, tls) into a single extra dict with __getattr__ for backward-compatible access - _available_entries(): shared exhaustion-check between select and peek - Dedup anthropic OAuth seeding (hermes_pkce + claude_code identical) - SimpleNamespace replaces class _Args boilerplate in auth_commands - _try_resolve_from_custom_pool(): shared pool-check in runtime_provider Net -17 lines. All 383 targeted tests pass. --------- Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-31 03:10:01 -07:00
1. **[Credential pools](./credential-pools.md)** — rotate across multiple API keys for the *same* provider (tried first)
2. **Primary model fallback** — automatically switches to a *different* provider:model when your main model fails
3. **Auxiliary task fallback** — independent provider resolution for side tasks like vision, compression, and web extraction
docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
feat(auth): same-provider credential pools with rotation, custom endpoint support, and interactive CLI (#2647) * feat(auth): add same-provider credential pools and rotation UX Add same-provider credential pooling so Hermes can rotate across multiple credentials for a single provider, recover from exhausted credentials without jumping providers immediately, and configure that behavior directly in hermes setup. - agent/credential_pool.py: persisted per-provider credential pools - hermes auth add/list/remove/reset CLI commands - 429/402/401 recovery with pool rotation in run_agent.py - Setup wizard integration for pool strategy configuration - Auto-seeding from env vars and existing OAuth state Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com> Salvaged from PR #2647 * fix(tests): prevent pool auto-seeding from host env in credential pool tests Tests for non-pool Anthropic paths and auth remove were failing when host env vars (ANTHROPIC_API_KEY) or file-backed OAuth credentials were present. The pool auto-seeding picked these up, causing unexpected pool entries in tests. - Mock _select_pool_entry in auxiliary_client OAuth flag tests - Clear Anthropic env vars and mock _seed_from_singletons in auth remove test * feat(auth): add thread safety, least_used strategy, and request counting - Add threading.Lock to CredentialPool for gateway thread safety (concurrent requests from multiple gateway sessions could race on pool state mutations without this) - Add 'least_used' rotation strategy that selects the credential with the lowest request_count, distributing load more evenly - Add request_count field to PooledCredential for usage tracking - Add mark_used() method to increment per-credential request counts - Wrap select(), mark_exhausted_and_rotate(), and try_refresh_current() with lock acquisition - Add tests: least_used selection, mark_used counting, concurrent thread safety (4 threads × 20 selects with no corruption) * feat(auth): add interactive mode for bare 'hermes auth' command When 'hermes auth' is called without a subcommand, it now launches an interactive wizard that: 1. Shows full credential pool status across all providers 2. Offers a menu: add, remove, reset cooldowns, set strategy 3. For OAuth-capable providers (anthropic, nous, openai-codex), the add flow explicitly asks 'API key or OAuth login?' — making it clear that both auth types are supported for the same provider 4. Strategy picker shows all 4 options (fill_first, round_robin, least_used, random) with the current selection marked 5. Remove flow shows entries with indices for easy selection The subcommand paths (hermes auth add/list/remove/reset) still work exactly as before for scripted/non-interactive use. * fix(tests): update runtime_provider tests for config.yaml source of truth (#4165) Tests were using OPENAI_BASE_URL env var which is no longer consulted after #4165. Updated to use model config (provider, base_url, api_key) which is the new single source of truth for custom endpoint URLs. * feat(auth): support custom endpoint credential pools keyed by provider name Custom OpenAI-compatible endpoints all share provider='custom', making the provider-keyed pool useless. Now pools for custom endpoints are keyed by 'custom:<normalized_name>' where the name comes from the custom_providers config list (auto-generated from URL hostname). - Pool key format: 'custom:together.ai', 'custom:local-(localhost:8080)' - load_pool('custom:name') seeds from custom_providers api_key AND model.api_key when base_url matches - hermes auth add/list now shows custom endpoints alongside registry providers - _resolve_openrouter_runtime and _resolve_named_custom_runtime check pool before falling back to single config key - 6 new tests covering custom pool keying, seeding, and listing * docs: add Excalidraw diagram of full credential pool flow Comprehensive architecture diagram showing: - Credential sources (env vars, auth.json OAuth, config.yaml, CLI) - Pool storage and auto-seeding - Runtime resolution paths (registry, custom, OpenRouter) - Error recovery (429 retry-then-rotate, 402 immediate, 401 refresh) - CLI management commands and strategy configuration Open at: https://excalidraw.com/#json=2Ycqhqpi6f12E_3ITyiwh,c7u9jSt5BwrmiVzHGbm87g * fix(tests): update setup wizard pool tests for unified select_provider_and_model flow The setup wizard now delegates to select_provider_and_model() instead of using its own prompt_choice-based provider picker. Tests needed: - Mock select_provider_and_model as no-op (provider pre-written to config) - Call _stub_tts BEFORE custom prompt_choice mock (it overwrites it) - Pre-write model.provider to config so the pool step is reached * docs: add comprehensive credential pool documentation - New page: website/docs/user-guide/features/credential-pools.md Full guide covering quick start, CLI commands, rotation strategies, error recovery, custom endpoint pools, auto-discovery, thread safety, architecture, and storage format. - Updated fallback-providers.md to reference credential pools as the first layer of resilience (same-provider rotation before cross-provider) - Added hermes auth to CLI commands reference with usage examples - Added credential_pool_strategies to configuration guide * chore: remove excalidraw diagram from repo (external link only) * refactor: simplify credential pool code — extract helpers, collapse extras, dedup patterns - _load_config_safe(): replace 4 identical try/except/import blocks - _iter_custom_providers(): shared generator for custom provider iteration - PooledCredential.extra dict: collapse 11 round-trip-only fields (token_type, scope, client_id, portal_base_url, obtained_at, expires_in, agent_key_id, agent_key_expires_in, agent_key_reused, agent_key_obtained_at, tls) into a single extra dict with __getattr__ for backward-compatible access - _available_entries(): shared exhaustion-check between select and peek - Dedup anthropic OAuth seeding (hermes_pkce + claude_code identical) - SimpleNamespace replaces class _Args boilerplate in auth_commands - _try_resolve_from_custom_pool(): shared pool-check in runtime_provider Net -17 lines. All 383 targeted tests pass. --------- Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-31 03:10:01 -07:00
Credential pools handle same-provider rotation (e.g., multiple OpenRouter keys). This page covers cross-provider fallback. Both are optional and work independently.
docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
## Primary Model Fallback
When your main LLM provider encounters errors — rate limits, server overload, auth failures, connection drops — Hermes can automatically switch to a backup provider:model pair mid-session without losing your conversation.
### Configuration
Add a `fallback_model` section to `~/.hermes/config.yaml`:
```yaml
fallback_model:
provider: openrouter
model: anthropic/claude-sonnet-4
```
Both `provider` and `model` are **required**. If either is missing, the fallback is disabled.
### Supported Providers
| Provider | Value | Requirements |
|----------|-------|-------------|
| AI Gateway | `ai-gateway` | `AI_GATEWAY_API_KEY` |
docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
| OpenRouter | `openrouter` | `OPENROUTER_API_KEY` |
| Nous Portal | `nous` | `hermes auth` (OAuth) |
docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
| OpenAI Codex | `openai-codex` | `hermes model` (ChatGPT OAuth) |
| GitHub Copilot | `copilot` | `COPILOT_GITHUB_TOKEN`, `GH_TOKEN`, or `GITHUB_TOKEN` |
| GitHub Copilot ACP | `copilot-acp` | External process (editor integration) |
docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
| Anthropic | `anthropic` | `ANTHROPIC_API_KEY` or Claude Code credentials |
| z.ai / GLM | `zai` | `GLM_API_KEY` |
| Kimi / Moonshot | `kimi-coding` | `KIMI_API_KEY` |
| MiniMax | `minimax` | `MINIMAX_API_KEY` |
| MiniMax (China) | `minimax-cn` | `MINIMAX_CN_API_KEY` |
| DeepSeek | `deepseek` | `DEEPSEEK_API_KEY` |
| NVIDIA NIM | `nvidia` | `NVIDIA_API_KEY` (optional: `NVIDIA_BASE_URL`) |
| Ollama Cloud | `ollama-cloud` | `OLLAMA_API_KEY` |
| Google Gemini (OAuth) | `google-gemini-cli` | `hermes model` (Google OAuth; optional: `HERMES_GEMINI_PROJECT_ID`) |
docs: correctness audit — fix wrong values, add missing coverage (#11972) Comprehensive audit of every reference/messaging/feature doc page against the live code registries (PROVIDER_REGISTRY, OPTIONAL_ENV_VARS, COMMAND_REGISTRY, TOOLSETS, tool registry, on-disk skills). Every fix was verified against code before writing. ### Wrong values fixed (users would paste-and-fail) - reference/environment-variables.md: - DASHSCOPE_BASE_URL default was `coding-intl.dashscope.aliyuncs.com/v1` \u2192 actual `dashscope-intl.aliyuncs.com/compatible-mode/v1`. - MINIMAX_BASE_URL and MINIMAX_CN_BASE_URL defaults were `/v1` \u2192 actual `/anthropic` (Hermes calls MiniMax via its Anthropic Messages endpoint). - reference/toolsets-reference.md MCP example used the non-existent nested `mcp: servers:` key \u2192 real key is the flat `mcp_servers:`. - reference/skills-catalog.md listed ~20 bundled skills that no longer exist on disk (all moved to `optional-skills/`). Regenerated the whole bundled section from `skills/**/SKILL.md` \u2014 79 skills, accurate paths and names. - messaging/slack.md ":::info" callout claimed Slack has no `free_response_channels` equivalent; both the env var and the yaml key are in fact read. - messaging/qqbot.md documented `QQ_MARKDOWN_SUPPORT` as an env var, but the adapter only reads `extra.markdown_support` from config.yaml. Removed the env var row and noted config-only nature. - messaging/qqbot.md `hermes setup gateway` \u2192 `hermes gateway setup`. ### Missing coverage added - Providers: AWS Bedrock and Qwen Portal (qwen-oauth) \u2014 both in PROVIDER_REGISTRY but undocumented everywhere. Added sections to integrations/providers.md, rows to quickstart.md and fallback-providers.md. - integrations/providers.md "Fallback Model" provider list now includes gemini, google-gemini-cli, qwen-oauth, xai, nvidia, ollama-cloud, bedrock. - reference/cli-commands.md `--provider` enum and HERMES_INFERENCE_PROVIDER enum in env-vars now include the same set. - reference/slash-commands.md: added `/agents` (alias `/tasks`) and `/copy`. Removed duplicate rows for `/snapshot`, `/fast` (\u00d72), `/debug`. - reference/tools-reference.md: fixed "47 built-in tools" \u2192 52. Added `feishu_doc` and `feishu_drive` toolset sections. - reference/toolsets-reference.md: added `feishu_doc` / `feishu_drive` core rows + all missing `hermes-<platform>` toolsets in the platform table (bluebubbles, dingtalk, feishu, qqbot, wecom, wecom-callback, weixin, homeassistant, webhook, gateway). Fixed the `debugging` composite to describe the actual `includes=[...]` mechanism. - reference/optional-skills-catalog.md: added `fitness-nutrition`. - reference/environment-variables.md: added NOUS_BASE_URL, NOUS_INFERENCE_BASE_URL, NVIDIA_API_KEY/BASE_URL, OLLAMA_API_KEY/BASE_URL, XAI_API_KEY/BASE_URL, MISTRAL_API_KEY, AWS_REGION/AWS_PROFILE, BEDROCK_BASE_URL, HERMES_QWEN_BASE_URL, DISCORD_ALLOWED_CHANNELS, DISCORD_PROXY, TELEGRAM_REPLY_TO_MODE, MATRIX_DEVICE_ID, MATRIX_REACTIONS, QQBOT_HOME_CHANNEL_NAME, QQ_SANDBOX. - messaging/discord.md: documented DISCORD_ALLOWED_CHANNELS, DISCORD_PROXY, HERMES_DISCORD_TEXT_BATCH_DELAY_SECONDS and HERMES_DISCORD_TEXT_BATCH_SPLIT _DELAY_SECONDS (all actively read by the adapter). - messaging/matrix.md: documented MATRIX_REACTIONS (default true). - messaging/telegram.md: removed the redundant second Webhook Mode section that invented a `telegram.webhook_mode: true` yaml key the adapter does not read. - user-guide/features/hooks.md: added `on_session_finalize` and `on_session_reset` (both emitted via invoke_hook but undocumented). - user-guide/features/api-server.md: documented GET /health/detailed, the `/api/jobs/*` CRUD surface, POST /v1/runs, and GET /v1/runs/{id}/events (10 routes that were live but undocumented). - user-guide/features/fallback-providers.md: added `approval` and `title_generation` auxiliary-task rows; added gemini, bedrock, qwen-oauth to the supported-providers table. - user-guide/features/tts.md: "seven providers" \u2192 "eight" (post-xAI add oversight in #11942). - user-guide/configuration.md: TTS provider enum gains `xai` and `gemini`; yaml example block gains `mistral:`, `gemini:`, `xai:` subsections. Auxiliary-provider enum now enumerates all real registry entries. - reference/faq.md: stale AIAgent/config examples bumped from `nous/hermes-3-llama-3.1-70b` and `claude-sonnet-4.6` to `claude-opus-4.7`. ### Docs-site integrity - guides/build-a-hermes-plugin.md referenced two nonexistent hooks (`pre_api_request`, `post_api_request`). Replaced with the real `on_session_finalize` / `on_session_reset` entries. - messaging/open-webui.md and features/api-server.md had pre-existing broken links to `/docs/user-guide/features/profiles` (actual path is `/docs/user-guide/profiles`). Fixed. - reference/skills-catalog.md had one `<1%` literal that MDX parsed as a JSX tag. Escaped to `&lt;1%`. ### False positives filtered out (not changed, verified correct) - `/set-home` is a registered alias of `/sethome` \u2014 docs were fine. - `hermes setup gateway` is valid syntax (`hermes setup \<section\>`); changed in qqbot.md for cross-doc consistency, not as a bug fix. - Telegram reactions "disabled by default" matches code (default `"false"`). - Matrix encryption "opt-in" matches code (empty env default \u2192 disabled). - `pre_api_request` / `post_api_request` hooks do NOT exist in current code; documented instead the real `on_session_finalize` / `on_session_reset`. - SIGNAL_IGNORE_STORIES is already in env-vars.md (subagent missed it). Validation: - `docusaurus build` \u2014 passes (only pre-existing nix-setup anchor warning). - `ascii-guard lint docs` \u2014 124 files, 0 errors. - 22 files changed, +317 / \u2212158.
2026-04-18 01:45:48 -07:00
| Google AI Studio | `gemini` | `GOOGLE_API_KEY` (alias: `GEMINI_API_KEY`) |
| xAI (Grok) | `xai` (alias `grok`) | `XAI_API_KEY` (optional: `XAI_BASE_URL`) |
docs: correctness audit — fix wrong values, add missing coverage (#11972) Comprehensive audit of every reference/messaging/feature doc page against the live code registries (PROVIDER_REGISTRY, OPTIONAL_ENV_VARS, COMMAND_REGISTRY, TOOLSETS, tool registry, on-disk skills). Every fix was verified against code before writing. ### Wrong values fixed (users would paste-and-fail) - reference/environment-variables.md: - DASHSCOPE_BASE_URL default was `coding-intl.dashscope.aliyuncs.com/v1` \u2192 actual `dashscope-intl.aliyuncs.com/compatible-mode/v1`. - MINIMAX_BASE_URL and MINIMAX_CN_BASE_URL defaults were `/v1` \u2192 actual `/anthropic` (Hermes calls MiniMax via its Anthropic Messages endpoint). - reference/toolsets-reference.md MCP example used the non-existent nested `mcp: servers:` key \u2192 real key is the flat `mcp_servers:`. - reference/skills-catalog.md listed ~20 bundled skills that no longer exist on disk (all moved to `optional-skills/`). Regenerated the whole bundled section from `skills/**/SKILL.md` \u2014 79 skills, accurate paths and names. - messaging/slack.md ":::info" callout claimed Slack has no `free_response_channels` equivalent; both the env var and the yaml key are in fact read. - messaging/qqbot.md documented `QQ_MARKDOWN_SUPPORT` as an env var, but the adapter only reads `extra.markdown_support` from config.yaml. Removed the env var row and noted config-only nature. - messaging/qqbot.md `hermes setup gateway` \u2192 `hermes gateway setup`. ### Missing coverage added - Providers: AWS Bedrock and Qwen Portal (qwen-oauth) \u2014 both in PROVIDER_REGISTRY but undocumented everywhere. Added sections to integrations/providers.md, rows to quickstart.md and fallback-providers.md. - integrations/providers.md "Fallback Model" provider list now includes gemini, google-gemini-cli, qwen-oauth, xai, nvidia, ollama-cloud, bedrock. - reference/cli-commands.md `--provider` enum and HERMES_INFERENCE_PROVIDER enum in env-vars now include the same set. - reference/slash-commands.md: added `/agents` (alias `/tasks`) and `/copy`. Removed duplicate rows for `/snapshot`, `/fast` (\u00d72), `/debug`. - reference/tools-reference.md: fixed "47 built-in tools" \u2192 52. Added `feishu_doc` and `feishu_drive` toolset sections. - reference/toolsets-reference.md: added `feishu_doc` / `feishu_drive` core rows + all missing `hermes-<platform>` toolsets in the platform table (bluebubbles, dingtalk, feishu, qqbot, wecom, wecom-callback, weixin, homeassistant, webhook, gateway). Fixed the `debugging` composite to describe the actual `includes=[...]` mechanism. - reference/optional-skills-catalog.md: added `fitness-nutrition`. - reference/environment-variables.md: added NOUS_BASE_URL, NOUS_INFERENCE_BASE_URL, NVIDIA_API_KEY/BASE_URL, OLLAMA_API_KEY/BASE_URL, XAI_API_KEY/BASE_URL, MISTRAL_API_KEY, AWS_REGION/AWS_PROFILE, BEDROCK_BASE_URL, HERMES_QWEN_BASE_URL, DISCORD_ALLOWED_CHANNELS, DISCORD_PROXY, TELEGRAM_REPLY_TO_MODE, MATRIX_DEVICE_ID, MATRIX_REACTIONS, QQBOT_HOME_CHANNEL_NAME, QQ_SANDBOX. - messaging/discord.md: documented DISCORD_ALLOWED_CHANNELS, DISCORD_PROXY, HERMES_DISCORD_TEXT_BATCH_DELAY_SECONDS and HERMES_DISCORD_TEXT_BATCH_SPLIT _DELAY_SECONDS (all actively read by the adapter). - messaging/matrix.md: documented MATRIX_REACTIONS (default true). - messaging/telegram.md: removed the redundant second Webhook Mode section that invented a `telegram.webhook_mode: true` yaml key the adapter does not read. - user-guide/features/hooks.md: added `on_session_finalize` and `on_session_reset` (both emitted via invoke_hook but undocumented). - user-guide/features/api-server.md: documented GET /health/detailed, the `/api/jobs/*` CRUD surface, POST /v1/runs, and GET /v1/runs/{id}/events (10 routes that were live but undocumented). - user-guide/features/fallback-providers.md: added `approval` and `title_generation` auxiliary-task rows; added gemini, bedrock, qwen-oauth to the supported-providers table. - user-guide/features/tts.md: "seven providers" \u2192 "eight" (post-xAI add oversight in #11942). - user-guide/configuration.md: TTS provider enum gains `xai` and `gemini`; yaml example block gains `mistral:`, `gemini:`, `xai:` subsections. Auxiliary-provider enum now enumerates all real registry entries. - reference/faq.md: stale AIAgent/config examples bumped from `nous/hermes-3-llama-3.1-70b` and `claude-sonnet-4.6` to `claude-opus-4.7`. ### Docs-site integrity - guides/build-a-hermes-plugin.md referenced two nonexistent hooks (`pre_api_request`, `post_api_request`). Replaced with the real `on_session_finalize` / `on_session_reset` entries. - messaging/open-webui.md and features/api-server.md had pre-existing broken links to `/docs/user-guide/features/profiles` (actual path is `/docs/user-guide/profiles`). Fixed. - reference/skills-catalog.md had one `<1%` literal that MDX parsed as a JSX tag. Escaped to `&lt;1%`. ### False positives filtered out (not changed, verified correct) - `/set-home` is a registered alias of `/sethome` \u2014 docs were fine. - `hermes setup gateway` is valid syntax (`hermes setup \<section\>`); changed in qqbot.md for cross-doc consistency, not as a bug fix. - Telegram reactions "disabled by default" matches code (default `"false"`). - Matrix encryption "opt-in" matches code (empty env default \u2192 disabled). - `pre_api_request` / `post_api_request` hooks do NOT exist in current code; documented instead the real `on_session_finalize` / `on_session_reset`. - SIGNAL_IGNORE_STORIES is already in env-vars.md (subagent missed it). Validation: - `docusaurus build` \u2014 passes (only pre-existing nix-setup anchor warning). - `ascii-guard lint docs` \u2014 124 files, 0 errors. - 22 files changed, +317 / \u2212158.
2026-04-18 01:45:48 -07:00
| AWS Bedrock | `bedrock` | Standard boto3 auth (`AWS_REGION` + `AWS_PROFILE` or `AWS_ACCESS_KEY_ID`) |
| Qwen Portal (OAuth) | `qwen-oauth` | `hermes model` (Qwen Portal OAuth; optional: `HERMES_QWEN_BASE_URL`) |
| OpenCode Zen | `opencode-zen` | `OPENCODE_ZEN_API_KEY` |
| OpenCode Go | `opencode-go` | `OPENCODE_GO_API_KEY` |
| Kilo Code | `kilocode` | `KILOCODE_API_KEY` |
| Xiaomi MiMo | `xiaomi` | `XIAOMI_API_KEY` |
| Arcee AI | `arcee` | `ARCEEAI_API_KEY` |
| Alibaba / DashScope | `alibaba` | `DASHSCOPE_API_KEY` |
| Hugging Face | `huggingface` | `HF_TOKEN` |
| Custom endpoint | `custom` | `base_url` + `key_env` (see below) |
docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
### Custom Endpoint Fallback
For a custom OpenAI-compatible endpoint, add `base_url` and optionally `key_env`:
docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
```yaml
fallback_model:
provider: custom
model: my-local-model
base_url: http://localhost:8000/v1
key_env: MY_LOCAL_KEY # env var name containing the API key
docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
```
### When Fallback Triggers
The fallback activates automatically when the primary model fails with:
- **Rate limits** (HTTP 429) — after exhausting retry attempts
- **Server errors** (HTTP 500, 502, 503) — after exhausting retry attempts
- **Auth failures** (HTTP 401, 403) — immediately (no point retrying)
- **Not found** (HTTP 404) — immediately
- **Invalid responses** — when the API returns malformed or empty responses repeatedly
When triggered, Hermes:
1. Resolves credentials for the fallback provider
2. Builds a new API client
3. Swaps the model, provider, and client in-place
4. Resets the retry counter and continues the conversation
The switch is seamless — your conversation history, tool calls, and context are preserved. The agent continues from exactly where it left off, just using a different model.
:::info One-Shot
Fallback activates **at most once** per session. If the fallback provider also fails, normal error handling takes over (retries, then error message). This prevents cascading failover loops.
:::
### Examples
**OpenRouter as fallback for Anthropic native:**
```yaml
model:
provider: anthropic
default: claude-sonnet-4-6
fallback_model:
provider: openrouter
model: anthropic/claude-sonnet-4
```
**Nous Portal as fallback for OpenRouter:**
```yaml
model:
provider: openrouter
default: anthropic/claude-opus-4
fallback_model:
provider: nous
model: nous-hermes-3
```
**Local model as fallback for cloud:**
```yaml
fallback_model:
provider: custom
model: llama-3.1-70b
base_url: http://localhost:8000/v1
key_env: LOCAL_API_KEY
docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
```
**Codex OAuth as fallback:**
```yaml
fallback_model:
provider: openai-codex
model: gpt-5.3-codex
```
### Where Fallback Works
| Context | Fallback Supported |
|---------|-------------------|
| CLI sessions | ✔ |
| Messaging gateway (Telegram, Discord, etc.) | ✔ |
| Subagent delegation | ✘ (subagents do not inherit fallback config) |
| Cron jobs | ✘ (run with a fixed provider) |
| Auxiliary tasks (vision, compression) | ✘ (use their own provider chain — see below) |
:::tip
There are no environment variables for `fallback_model` — it is configured exclusively through `config.yaml`. This is intentional: fallback configuration is a deliberate choice, not something a stale shell export should override.
:::
---
## Auxiliary Task Fallback
Hermes uses separate lightweight models for side tasks. Each task has its own provider resolution chain that acts as a built-in fallback system.
### Tasks with Independent Provider Resolution
| Task | What It Does | Config Key |
|------|-------------|-----------|
| Vision | Image analysis, browser screenshots | `auxiliary.vision` |
| Web Extract | Web page summarization | `auxiliary.web_extract` |
docs: comprehensive update for recent merged PRs (#9019) Audit and update documentation across 12 files to match changes from ~50 recently merged PRs. Key updates: Slash commands (slash-commands.md): - Add 5 missing commands: /snapshot, /fast, /image, /debug, /restart - Fix /status incorrectly labeled as messaging-only (available in both) - Add --global flag to /model docs - Add [focus topic] arg to /compress docs CLI commands (cli-commands.md): - Add hermes debug share section with options and examples - Add hermes backup section with --quick and --label flags - Add hermes import section Feature docs: - TTS: document global tts.speed and per-provider speed for Edge/OpenAI - Web dashboard: add docs for 5 missing pages (Sessions, Logs, Analytics, Cron, Skills) and 15+ API endpoints - WhatsApp: add streaming, 4K chunking, and markdown formatting docs - Skills: add GitHub rate-limit/GITHUB_TOKEN troubleshooting tip - Budget: document CLI notification on iteration budget exhaustion Config migration (compression.summary_* → auxiliary.compression.*): - Update configuration.md, environment-variables.md, fallback-providers.md, cli.md, and context-compression-and-caching.md - Replace legacy compression.summary_model/provider/base_url references with auxiliary.compression.model/provider/base_url - Add legacy migration info boxes explaining auto-migration Minor fixes: - wecom-callback.md: clarify 'text only' limitation (input only) - Escape {session_id}/{job_id} in web-dashboard.md headings for MDX
2026-04-13 10:50:59 -07:00
| Compression | Context compression summaries | `auxiliary.compression` |
docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
| Session Search | Past session summarization | `auxiliary.session_search` |
| Skills Hub | Skill search and discovery | `auxiliary.skills_hub` |
| MCP | MCP helper operations | `auxiliary.mcp` |
| Memory Flush | Memory consolidation | `auxiliary.flush_memories` |
docs: correctness audit — fix wrong values, add missing coverage (#11972) Comprehensive audit of every reference/messaging/feature doc page against the live code registries (PROVIDER_REGISTRY, OPTIONAL_ENV_VARS, COMMAND_REGISTRY, TOOLSETS, tool registry, on-disk skills). Every fix was verified against code before writing. ### Wrong values fixed (users would paste-and-fail) - reference/environment-variables.md: - DASHSCOPE_BASE_URL default was `coding-intl.dashscope.aliyuncs.com/v1` \u2192 actual `dashscope-intl.aliyuncs.com/compatible-mode/v1`. - MINIMAX_BASE_URL and MINIMAX_CN_BASE_URL defaults were `/v1` \u2192 actual `/anthropic` (Hermes calls MiniMax via its Anthropic Messages endpoint). - reference/toolsets-reference.md MCP example used the non-existent nested `mcp: servers:` key \u2192 real key is the flat `mcp_servers:`. - reference/skills-catalog.md listed ~20 bundled skills that no longer exist on disk (all moved to `optional-skills/`). Regenerated the whole bundled section from `skills/**/SKILL.md` \u2014 79 skills, accurate paths and names. - messaging/slack.md ":::info" callout claimed Slack has no `free_response_channels` equivalent; both the env var and the yaml key are in fact read. - messaging/qqbot.md documented `QQ_MARKDOWN_SUPPORT` as an env var, but the adapter only reads `extra.markdown_support` from config.yaml. Removed the env var row and noted config-only nature. - messaging/qqbot.md `hermes setup gateway` \u2192 `hermes gateway setup`. ### Missing coverage added - Providers: AWS Bedrock and Qwen Portal (qwen-oauth) \u2014 both in PROVIDER_REGISTRY but undocumented everywhere. Added sections to integrations/providers.md, rows to quickstart.md and fallback-providers.md. - integrations/providers.md "Fallback Model" provider list now includes gemini, google-gemini-cli, qwen-oauth, xai, nvidia, ollama-cloud, bedrock. - reference/cli-commands.md `--provider` enum and HERMES_INFERENCE_PROVIDER enum in env-vars now include the same set. - reference/slash-commands.md: added `/agents` (alias `/tasks`) and `/copy`. Removed duplicate rows for `/snapshot`, `/fast` (\u00d72), `/debug`. - reference/tools-reference.md: fixed "47 built-in tools" \u2192 52. Added `feishu_doc` and `feishu_drive` toolset sections. - reference/toolsets-reference.md: added `feishu_doc` / `feishu_drive` core rows + all missing `hermes-<platform>` toolsets in the platform table (bluebubbles, dingtalk, feishu, qqbot, wecom, wecom-callback, weixin, homeassistant, webhook, gateway). Fixed the `debugging` composite to describe the actual `includes=[...]` mechanism. - reference/optional-skills-catalog.md: added `fitness-nutrition`. - reference/environment-variables.md: added NOUS_BASE_URL, NOUS_INFERENCE_BASE_URL, NVIDIA_API_KEY/BASE_URL, OLLAMA_API_KEY/BASE_URL, XAI_API_KEY/BASE_URL, MISTRAL_API_KEY, AWS_REGION/AWS_PROFILE, BEDROCK_BASE_URL, HERMES_QWEN_BASE_URL, DISCORD_ALLOWED_CHANNELS, DISCORD_PROXY, TELEGRAM_REPLY_TO_MODE, MATRIX_DEVICE_ID, MATRIX_REACTIONS, QQBOT_HOME_CHANNEL_NAME, QQ_SANDBOX. - messaging/discord.md: documented DISCORD_ALLOWED_CHANNELS, DISCORD_PROXY, HERMES_DISCORD_TEXT_BATCH_DELAY_SECONDS and HERMES_DISCORD_TEXT_BATCH_SPLIT _DELAY_SECONDS (all actively read by the adapter). - messaging/matrix.md: documented MATRIX_REACTIONS (default true). - messaging/telegram.md: removed the redundant second Webhook Mode section that invented a `telegram.webhook_mode: true` yaml key the adapter does not read. - user-guide/features/hooks.md: added `on_session_finalize` and `on_session_reset` (both emitted via invoke_hook but undocumented). - user-guide/features/api-server.md: documented GET /health/detailed, the `/api/jobs/*` CRUD surface, POST /v1/runs, and GET /v1/runs/{id}/events (10 routes that were live but undocumented). - user-guide/features/fallback-providers.md: added `approval` and `title_generation` auxiliary-task rows; added gemini, bedrock, qwen-oauth to the supported-providers table. - user-guide/features/tts.md: "seven providers" \u2192 "eight" (post-xAI add oversight in #11942). - user-guide/configuration.md: TTS provider enum gains `xai` and `gemini`; yaml example block gains `mistral:`, `gemini:`, `xai:` subsections. Auxiliary-provider enum now enumerates all real registry entries. - reference/faq.md: stale AIAgent/config examples bumped from `nous/hermes-3-llama-3.1-70b` and `claude-sonnet-4.6` to `claude-opus-4.7`. ### Docs-site integrity - guides/build-a-hermes-plugin.md referenced two nonexistent hooks (`pre_api_request`, `post_api_request`). Replaced with the real `on_session_finalize` / `on_session_reset` entries. - messaging/open-webui.md and features/api-server.md had pre-existing broken links to `/docs/user-guide/features/profiles` (actual path is `/docs/user-guide/profiles`). Fixed. - reference/skills-catalog.md had one `<1%` literal that MDX parsed as a JSX tag. Escaped to `&lt;1%`. ### False positives filtered out (not changed, verified correct) - `/set-home` is a registered alias of `/sethome` \u2014 docs were fine. - `hermes setup gateway` is valid syntax (`hermes setup \<section\>`); changed in qqbot.md for cross-doc consistency, not as a bug fix. - Telegram reactions "disabled by default" matches code (default `"false"`). - Matrix encryption "opt-in" matches code (empty env default \u2192 disabled). - `pre_api_request` / `post_api_request` hooks do NOT exist in current code; documented instead the real `on_session_finalize` / `on_session_reset`. - SIGNAL_IGNORE_STORIES is already in env-vars.md (subagent missed it). Validation: - `docusaurus build` \u2014 passes (only pre-existing nix-setup anchor warning). - `ascii-guard lint docs` \u2014 124 files, 0 errors. - 22 files changed, +317 / \u2212158.
2026-04-18 01:45:48 -07:00
| Approval | Smart command-approval classification | `auxiliary.approval` |
| Title Generation | Session title summaries | `auxiliary.title_generation` |
docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
### Auto-Detection Chain
When a task's provider is set to `"auto"` (the default), Hermes tries providers in order until one works:
**For text tasks (compression, web extract, etc.):**
```text
OpenRouter → Nous Portal → Custom endpoint → Codex OAuth →
API-key providers (z.ai, Kimi, MiniMax, Xiaomi MiMo, Hugging Face, Anthropic) → give up
docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
```
**For vision tasks:**
```text
Main provider (if vision-capable) → OpenRouter → Nous Portal →
Codex OAuth → Anthropic → Custom endpoint → give up
```
If the resolved provider fails at call time, Hermes also has an internal retry: if the provider is not OpenRouter and no explicit `base_url` is set, it tries OpenRouter as a last-resort fallback.
### Configuring Auxiliary Providers
Each task can be configured independently in `config.yaml`:
```yaml
auxiliary:
vision:
provider: "auto" # auto | openrouter | nous | codex | main | anthropic
model: "" # e.g. "openai/gpt-4o"
base_url: "" # direct endpoint (takes precedence over provider)
api_key: "" # API key for base_url
web_extract:
provider: "auto"
model: ""
compression:
provider: "auto"
model: ""
session_search:
provider: "auto"
model: ""
timeout: 30
max_concurrency: 3
extra_body: {}
docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
skills_hub:
provider: "auto"
model: ""
mcp:
provider: "auto"
model: ""
flush_memories:
provider: "auto"
model: ""
```
docs: comprehensive update for recent merged PRs (#9019) Audit and update documentation across 12 files to match changes from ~50 recently merged PRs. Key updates: Slash commands (slash-commands.md): - Add 5 missing commands: /snapshot, /fast, /image, /debug, /restart - Fix /status incorrectly labeled as messaging-only (available in both) - Add --global flag to /model docs - Add [focus topic] arg to /compress docs CLI commands (cli-commands.md): - Add hermes debug share section with options and examples - Add hermes backup section with --quick and --label flags - Add hermes import section Feature docs: - TTS: document global tts.speed and per-provider speed for Edge/OpenAI - Web dashboard: add docs for 5 missing pages (Sessions, Logs, Analytics, Cron, Skills) and 15+ API endpoints - WhatsApp: add streaming, 4K chunking, and markdown formatting docs - Skills: add GitHub rate-limit/GITHUB_TOKEN troubleshooting tip - Budget: document CLI notification on iteration budget exhaustion Config migration (compression.summary_* → auxiliary.compression.*): - Update configuration.md, environment-variables.md, fallback-providers.md, cli.md, and context-compression-and-caching.md - Replace legacy compression.summary_model/provider/base_url references with auxiliary.compression.model/provider/base_url - Add legacy migration info boxes explaining auto-migration Minor fixes: - wecom-callback.md: clarify 'text only' limitation (input only) - Escape {session_id}/{job_id} in web-dashboard.md headings for MDX
2026-04-13 10:50:59 -07:00
Every task above follows the same **provider / model / base_url** pattern. Context compression is configured under `auxiliary.compression`:
```yaml
docs: comprehensive update for recent merged PRs (#9019) Audit and update documentation across 12 files to match changes from ~50 recently merged PRs. Key updates: Slash commands (slash-commands.md): - Add 5 missing commands: /snapshot, /fast, /image, /debug, /restart - Fix /status incorrectly labeled as messaging-only (available in both) - Add --global flag to /model docs - Add [focus topic] arg to /compress docs CLI commands (cli-commands.md): - Add hermes debug share section with options and examples - Add hermes backup section with --quick and --label flags - Add hermes import section Feature docs: - TTS: document global tts.speed and per-provider speed for Edge/OpenAI - Web dashboard: add docs for 5 missing pages (Sessions, Logs, Analytics, Cron, Skills) and 15+ API endpoints - WhatsApp: add streaming, 4K chunking, and markdown formatting docs - Skills: add GitHub rate-limit/GITHUB_TOKEN troubleshooting tip - Budget: document CLI notification on iteration budget exhaustion Config migration (compression.summary_* → auxiliary.compression.*): - Update configuration.md, environment-variables.md, fallback-providers.md, cli.md, and context-compression-and-caching.md - Replace legacy compression.summary_model/provider/base_url references with auxiliary.compression.model/provider/base_url - Add legacy migration info boxes explaining auto-migration Minor fixes: - wecom-callback.md: clarify 'text only' limitation (input only) - Escape {session_id}/{job_id} in web-dashboard.md headings for MDX
2026-04-13 10:50:59 -07:00
auxiliary:
compression:
provider: main # Same provider options as other auxiliary tasks
model: google/gemini-3-flash-preview
base_url: null # Custom OpenAI-compatible endpoint
```
And the fallback model uses:
```yaml
fallback_model:
provider: openrouter
model: anthropic/claude-sonnet-4
# base_url: http://localhost:8000/v1 # Optional custom endpoint
docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
```
For `auxiliary.session_search`, Hermes also supports:
- `max_concurrency` to limit how many session summaries run at once
- `extra_body` to pass provider-specific OpenAI-compatible request fields through on the summarization calls
Example:
```yaml
auxiliary:
session_search:
provider: main
model: glm-4.5-air
max_concurrency: 2
extra_body:
enable_thinking: false
```
If your provider does not support a native OpenAI-compatible reasoning-control field, `extra_body` will not help for that part; in that case `max_concurrency` is still useful for reducing request-burst 429s.
All three — auxiliary, compression, fallback — work the same way: set `provider` to pick who handles the request, `model` to pick which model, and `base_url` to point at a custom endpoint (overrides provider).
docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
### Provider Options for Auxiliary Tasks
These options apply to `auxiliary:`, `compression:`, and `fallback_model:` configs only — `"main"` is **not** a valid value for your top-level `model.provider`. For custom endpoints, use `provider: custom` in your `model:` section (see [AI Providers](/docs/integrations/providers)).
docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
| Provider | Description | Requirements |
|----------|-------------|-------------|
| `"auto"` | Try providers in order until one works (default) | At least one provider configured |
| `"openrouter"` | Force OpenRouter | `OPENROUTER_API_KEY` |
| `"nous"` | Force Nous Portal | `hermes auth` |
docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
| `"codex"` | Force Codex OAuth | `hermes model` → Codex |
| `"main"` | Use whatever provider the main agent uses (auxiliary tasks only) | Active main provider configured |
docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
| `"anthropic"` | Force Anthropic native | `ANTHROPIC_API_KEY` or Claude Code credentials |
### Direct Endpoint Override
For any auxiliary task, setting `base_url` bypasses provider resolution entirely and sends requests directly to that endpoint:
```yaml
auxiliary:
vision:
base_url: "http://localhost:1234/v1"
api_key: "local-key"
model: "qwen2.5-vl"
```
`base_url` takes precedence over `provider`. Hermes uses the configured `api_key` for authentication, falling back to `OPENAI_API_KEY` if not set. It does **not** reuse `OPENROUTER_API_KEY` for custom endpoints.
---
## Context Compression Fallback
docs: comprehensive update for recent merged PRs (#9019) Audit and update documentation across 12 files to match changes from ~50 recently merged PRs. Key updates: Slash commands (slash-commands.md): - Add 5 missing commands: /snapshot, /fast, /image, /debug, /restart - Fix /status incorrectly labeled as messaging-only (available in both) - Add --global flag to /model docs - Add [focus topic] arg to /compress docs CLI commands (cli-commands.md): - Add hermes debug share section with options and examples - Add hermes backup section with --quick and --label flags - Add hermes import section Feature docs: - TTS: document global tts.speed and per-provider speed for Edge/OpenAI - Web dashboard: add docs for 5 missing pages (Sessions, Logs, Analytics, Cron, Skills) and 15+ API endpoints - WhatsApp: add streaming, 4K chunking, and markdown formatting docs - Skills: add GitHub rate-limit/GITHUB_TOKEN troubleshooting tip - Budget: document CLI notification on iteration budget exhaustion Config migration (compression.summary_* → auxiliary.compression.*): - Update configuration.md, environment-variables.md, fallback-providers.md, cli.md, and context-compression-and-caching.md - Replace legacy compression.summary_model/provider/base_url references with auxiliary.compression.model/provider/base_url - Add legacy migration info boxes explaining auto-migration Minor fixes: - wecom-callback.md: clarify 'text only' limitation (input only) - Escape {session_id}/{job_id} in web-dashboard.md headings for MDX
2026-04-13 10:50:59 -07:00
Context compression uses the `auxiliary.compression` config block to control which model and provider handles summarization:
docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
```yaml
docs: comprehensive update for recent merged PRs (#9019) Audit and update documentation across 12 files to match changes from ~50 recently merged PRs. Key updates: Slash commands (slash-commands.md): - Add 5 missing commands: /snapshot, /fast, /image, /debug, /restart - Fix /status incorrectly labeled as messaging-only (available in both) - Add --global flag to /model docs - Add [focus topic] arg to /compress docs CLI commands (cli-commands.md): - Add hermes debug share section with options and examples - Add hermes backup section with --quick and --label flags - Add hermes import section Feature docs: - TTS: document global tts.speed and per-provider speed for Edge/OpenAI - Web dashboard: add docs for 5 missing pages (Sessions, Logs, Analytics, Cron, Skills) and 15+ API endpoints - WhatsApp: add streaming, 4K chunking, and markdown formatting docs - Skills: add GitHub rate-limit/GITHUB_TOKEN troubleshooting tip - Budget: document CLI notification on iteration budget exhaustion Config migration (compression.summary_* → auxiliary.compression.*): - Update configuration.md, environment-variables.md, fallback-providers.md, cli.md, and context-compression-and-caching.md - Replace legacy compression.summary_model/provider/base_url references with auxiliary.compression.model/provider/base_url - Add legacy migration info boxes explaining auto-migration Minor fixes: - wecom-callback.md: clarify 'text only' limitation (input only) - Escape {session_id}/{job_id} in web-dashboard.md headings for MDX
2026-04-13 10:50:59 -07:00
auxiliary:
compression:
provider: "auto" # auto | openrouter | nous | main
model: "google/gemini-3-flash-preview"
docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
```
docs: comprehensive update for recent merged PRs (#9019) Audit and update documentation across 12 files to match changes from ~50 recently merged PRs. Key updates: Slash commands (slash-commands.md): - Add 5 missing commands: /snapshot, /fast, /image, /debug, /restart - Fix /status incorrectly labeled as messaging-only (available in both) - Add --global flag to /model docs - Add [focus topic] arg to /compress docs CLI commands (cli-commands.md): - Add hermes debug share section with options and examples - Add hermes backup section with --quick and --label flags - Add hermes import section Feature docs: - TTS: document global tts.speed and per-provider speed for Edge/OpenAI - Web dashboard: add docs for 5 missing pages (Sessions, Logs, Analytics, Cron, Skills) and 15+ API endpoints - WhatsApp: add streaming, 4K chunking, and markdown formatting docs - Skills: add GitHub rate-limit/GITHUB_TOKEN troubleshooting tip - Budget: document CLI notification on iteration budget exhaustion Config migration (compression.summary_* → auxiliary.compression.*): - Update configuration.md, environment-variables.md, fallback-providers.md, cli.md, and context-compression-and-caching.md - Replace legacy compression.summary_model/provider/base_url references with auxiliary.compression.model/provider/base_url - Add legacy migration info boxes explaining auto-migration Minor fixes: - wecom-callback.md: clarify 'text only' limitation (input only) - Escape {session_id}/{job_id} in web-dashboard.md headings for MDX
2026-04-13 10:50:59 -07:00
:::info Legacy migration
Older configs with `compression.summary_model` / `compression.summary_provider` / `compression.summary_base_url` are automatically migrated to `auxiliary.compression.*` on first load (config version 17).
:::
docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
If no provider is available for compression, Hermes drops middle conversation turns without generating a summary rather than failing the session.
---
## Delegation Provider Override
Subagents spawned by `delegate_task` do **not** use the primary fallback model. However, they can be routed to a different provider:model pair for cost optimization:
```yaml
delegation:
provider: "openrouter" # override provider for all subagents
model: "google/gemini-3-flash-preview" # override model
# base_url: "http://localhost:1234/v1" # or use a direct endpoint
# api_key: "local-key"
```
See [Subagent Delegation](/docs/user-guide/features/delegation) for full configuration details.
---
## Cron Job Providers
Cron jobs run with whatever provider is configured at execution time. They do not support a fallback model. To use a different provider for cron jobs, configure `provider` and `model` overrides on the cron job itself:
```python
cronjob(
action="create",
schedule="every 2h",
prompt="Check server status",
provider="openrouter",
model="google/gemini-3-flash-preview"
)
```
See [Scheduled Tasks (Cron)](/docs/user-guide/features/cron) for full configuration details.
---
## Summary
| Feature | Fallback Mechanism | Config Location |
|---------|-------------------|----------------|
| Main agent model | `fallback_model` in config.yaml — one-shot failover on errors | `fallback_model:` (top-level) |
| Vision | Auto-detection chain + internal OpenRouter retry | `auxiliary.vision` |
| Web extraction | Auto-detection chain + internal OpenRouter retry | `auxiliary.web_extract` |
docs: comprehensive update for recent merged PRs (#9019) Audit and update documentation across 12 files to match changes from ~50 recently merged PRs. Key updates: Slash commands (slash-commands.md): - Add 5 missing commands: /snapshot, /fast, /image, /debug, /restart - Fix /status incorrectly labeled as messaging-only (available in both) - Add --global flag to /model docs - Add [focus topic] arg to /compress docs CLI commands (cli-commands.md): - Add hermes debug share section with options and examples - Add hermes backup section with --quick and --label flags - Add hermes import section Feature docs: - TTS: document global tts.speed and per-provider speed for Edge/OpenAI - Web dashboard: add docs for 5 missing pages (Sessions, Logs, Analytics, Cron, Skills) and 15+ API endpoints - WhatsApp: add streaming, 4K chunking, and markdown formatting docs - Skills: add GitHub rate-limit/GITHUB_TOKEN troubleshooting tip - Budget: document CLI notification on iteration budget exhaustion Config migration (compression.summary_* → auxiliary.compression.*): - Update configuration.md, environment-variables.md, fallback-providers.md, cli.md, and context-compression-and-caching.md - Replace legacy compression.summary_model/provider/base_url references with auxiliary.compression.model/provider/base_url - Add legacy migration info boxes explaining auto-migration Minor fixes: - wecom-callback.md: clarify 'text only' limitation (input only) - Escape {session_id}/{job_id} in web-dashboard.md headings for MDX
2026-04-13 10:50:59 -07:00
| Context compression | Auto-detection chain, degrades to no-summary if unavailable | `auxiliary.compression` |
docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
| Session search | Auto-detection chain | `auxiliary.session_search` |
| Skills hub | Auto-detection chain | `auxiliary.skills_hub` |
| MCP helpers | Auto-detection chain | `auxiliary.mcp` |
| Memory flush | Auto-detection chain | `auxiliary.flush_memories` |
docs: correctness audit — fix wrong values, add missing coverage (#11972) Comprehensive audit of every reference/messaging/feature doc page against the live code registries (PROVIDER_REGISTRY, OPTIONAL_ENV_VARS, COMMAND_REGISTRY, TOOLSETS, tool registry, on-disk skills). Every fix was verified against code before writing. ### Wrong values fixed (users would paste-and-fail) - reference/environment-variables.md: - DASHSCOPE_BASE_URL default was `coding-intl.dashscope.aliyuncs.com/v1` \u2192 actual `dashscope-intl.aliyuncs.com/compatible-mode/v1`. - MINIMAX_BASE_URL and MINIMAX_CN_BASE_URL defaults were `/v1` \u2192 actual `/anthropic` (Hermes calls MiniMax via its Anthropic Messages endpoint). - reference/toolsets-reference.md MCP example used the non-existent nested `mcp: servers:` key \u2192 real key is the flat `mcp_servers:`. - reference/skills-catalog.md listed ~20 bundled skills that no longer exist on disk (all moved to `optional-skills/`). Regenerated the whole bundled section from `skills/**/SKILL.md` \u2014 79 skills, accurate paths and names. - messaging/slack.md ":::info" callout claimed Slack has no `free_response_channels` equivalent; both the env var and the yaml key are in fact read. - messaging/qqbot.md documented `QQ_MARKDOWN_SUPPORT` as an env var, but the adapter only reads `extra.markdown_support` from config.yaml. Removed the env var row and noted config-only nature. - messaging/qqbot.md `hermes setup gateway` \u2192 `hermes gateway setup`. ### Missing coverage added - Providers: AWS Bedrock and Qwen Portal (qwen-oauth) \u2014 both in PROVIDER_REGISTRY but undocumented everywhere. Added sections to integrations/providers.md, rows to quickstart.md and fallback-providers.md. - integrations/providers.md "Fallback Model" provider list now includes gemini, google-gemini-cli, qwen-oauth, xai, nvidia, ollama-cloud, bedrock. - reference/cli-commands.md `--provider` enum and HERMES_INFERENCE_PROVIDER enum in env-vars now include the same set. - reference/slash-commands.md: added `/agents` (alias `/tasks`) and `/copy`. Removed duplicate rows for `/snapshot`, `/fast` (\u00d72), `/debug`. - reference/tools-reference.md: fixed "47 built-in tools" \u2192 52. Added `feishu_doc` and `feishu_drive` toolset sections. - reference/toolsets-reference.md: added `feishu_doc` / `feishu_drive` core rows + all missing `hermes-<platform>` toolsets in the platform table (bluebubbles, dingtalk, feishu, qqbot, wecom, wecom-callback, weixin, homeassistant, webhook, gateway). Fixed the `debugging` composite to describe the actual `includes=[...]` mechanism. - reference/optional-skills-catalog.md: added `fitness-nutrition`. - reference/environment-variables.md: added NOUS_BASE_URL, NOUS_INFERENCE_BASE_URL, NVIDIA_API_KEY/BASE_URL, OLLAMA_API_KEY/BASE_URL, XAI_API_KEY/BASE_URL, MISTRAL_API_KEY, AWS_REGION/AWS_PROFILE, BEDROCK_BASE_URL, HERMES_QWEN_BASE_URL, DISCORD_ALLOWED_CHANNELS, DISCORD_PROXY, TELEGRAM_REPLY_TO_MODE, MATRIX_DEVICE_ID, MATRIX_REACTIONS, QQBOT_HOME_CHANNEL_NAME, QQ_SANDBOX. - messaging/discord.md: documented DISCORD_ALLOWED_CHANNELS, DISCORD_PROXY, HERMES_DISCORD_TEXT_BATCH_DELAY_SECONDS and HERMES_DISCORD_TEXT_BATCH_SPLIT _DELAY_SECONDS (all actively read by the adapter). - messaging/matrix.md: documented MATRIX_REACTIONS (default true). - messaging/telegram.md: removed the redundant second Webhook Mode section that invented a `telegram.webhook_mode: true` yaml key the adapter does not read. - user-guide/features/hooks.md: added `on_session_finalize` and `on_session_reset` (both emitted via invoke_hook but undocumented). - user-guide/features/api-server.md: documented GET /health/detailed, the `/api/jobs/*` CRUD surface, POST /v1/runs, and GET /v1/runs/{id}/events (10 routes that were live but undocumented). - user-guide/features/fallback-providers.md: added `approval` and `title_generation` auxiliary-task rows; added gemini, bedrock, qwen-oauth to the supported-providers table. - user-guide/features/tts.md: "seven providers" \u2192 "eight" (post-xAI add oversight in #11942). - user-guide/configuration.md: TTS provider enum gains `xai` and `gemini`; yaml example block gains `mistral:`, `gemini:`, `xai:` subsections. Auxiliary-provider enum now enumerates all real registry entries. - reference/faq.md: stale AIAgent/config examples bumped from `nous/hermes-3-llama-3.1-70b` and `claude-sonnet-4.6` to `claude-opus-4.7`. ### Docs-site integrity - guides/build-a-hermes-plugin.md referenced two nonexistent hooks (`pre_api_request`, `post_api_request`). Replaced with the real `on_session_finalize` / `on_session_reset` entries. - messaging/open-webui.md and features/api-server.md had pre-existing broken links to `/docs/user-guide/features/profiles` (actual path is `/docs/user-guide/profiles`). Fixed. - reference/skills-catalog.md had one `<1%` literal that MDX parsed as a JSX tag. Escaped to `&lt;1%`. ### False positives filtered out (not changed, verified correct) - `/set-home` is a registered alias of `/sethome` \u2014 docs were fine. - `hermes setup gateway` is valid syntax (`hermes setup \<section\>`); changed in qqbot.md for cross-doc consistency, not as a bug fix. - Telegram reactions "disabled by default" matches code (default `"false"`). - Matrix encryption "opt-in" matches code (empty env default \u2192 disabled). - `pre_api_request` / `post_api_request` hooks do NOT exist in current code; documented instead the real `on_session_finalize` / `on_session_reset`. - SIGNAL_IGNORE_STORIES is already in env-vars.md (subagent missed it). Validation: - `docusaurus build` \u2014 passes (only pre-existing nix-setup anchor warning). - `ascii-guard lint docs` \u2014 124 files, 0 errors. - 22 files changed, +317 / \u2212158.
2026-04-18 01:45:48 -07:00
| Approval classification | Auto-detection chain | `auxiliary.approval` |
| Title generation | Auto-detection chain | `auxiliary.title_generation` |
docs: fallback providers + /background command documentation * docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
| Delegation | Provider override only (no automatic fallback) | `delegation.provider` / `delegation.model` |
| Cron jobs | Per-job provider override only (no automatic fallback) | Per-job `provider` / `model` |