Use OPENROUTER_MODELS (28 curated) and _PROVIDER_MODELS from models.py
instead of the raw models.dev catalog (167 OpenRouter models). These are
hand-picked models that work as agent backends.
Before: 167 models from models.dev sorted by capability score
After: 28 curated models in our recommended order
/model (no args) now lists every provider the user has credentials for,
plus all user-defined endpoints from config.yaml providers: section.
Each entry shows:
- Provider display name
- The --provider slug to use
- (current) tag on the active provider
- Top models sorted by capability (tool_call + context window)
- Model count for providers with large catalogs
- URL for user-defined endpoints
Example output:
  OpenRouter [--provider openrouter] (current):
    anthropic/claude-sonnet-4.6, openai/gpt-5.2-codex, ... (+161 more)
  Anthropic [--provider anthropic]:
    claude-opus-4-6, claude-sonnet-4-6, ... (+16 more)
  My Ollama [--provider my-ollama]:
    http://localhost:11434/v1
Detection works by checking env vars from models.dev provider metadata
and auth store entries for OAuth providers. User-defined endpoints are
always shown.
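
A minimal sketch of the detection rule described above, assuming provider metadata reduces to a slug, a display name, and a list of env var names. `ProviderEntry`, `detect_providers`, `format_header`, and the sample slugs are hypothetical names used for illustration; the real code works from models.dev metadata and the auth store.

```python
from dataclasses import dataclass, field


@dataclass
class ProviderEntry:
    """Hypothetical stand-in for provider metadata (not the real ProviderInfo)."""
    slug: str
    name: str
    env_vars: list[str] = field(default_factory=list)
    user_defined: bool = False


def detect_providers(entries, environ, auth_store=frozenset()):
    """Return the entries the user can use right now.

    An entry is kept when any of its env vars is set to a non-empty value,
    when its slug has an auth store entry (OAuth providers), or when it is
    a user-defined endpoint (always shown).
    """
    found = []
    for entry in entries:
        has_env = any(environ.get(var) for var in entry.env_vars)
        if has_env or entry.slug in auth_store or entry.user_defined:
            found.append(entry)
    return found


def format_header(entry, current_slug):
    """Render the first line of a listing entry, with the (current) tag if active."""
    tag = " (current)" if entry.slug == current_slug else ""
    return f"{entry.name} [--provider {entry.slug}]{tag}:"
```

In practice `environ` would be `os.environ`; passing it in as a plain mapping keeps the function easy to test.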
Major overhaul of the model/provider system:
## models.dev as primary database (agent/models_dev.py)
- Full ModelInfo dataclass: context window, max output, cost/M tokens,
capabilities (reasoning, tools, vision, PDF, audio, structured output),
modalities, knowledge cutoff, open_weights, family, status
- Full ProviderInfo dataclass: name, base URL, env vars, doc link
- New queries: get_provider_info(), get_model_info(), list_all_providers(),
get_providers_for_env_var(), get_model_info_any_provider(),
list_provider_model_infos()
- 109 providers, 4000+ models with exact metadata
- Backward-compatible: existing ModelCapabilities API unchanged
## Hermes overlay system (hermes_cli/providers.py)
- HermesOverlay: transport type, auth patterns, aggregator flags
- Merge chain: models.dev + overlay + user config = complete ProviderDef
- User-defined endpoints via config.yaml providers: section
- resolve_provider_full() — single entry point for --provider resolution
- Works for built-in, models.dev-only, AND user-defined providers
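
The merge chain above can be sketched as a layered dict merge. This is an illustration only: the function name, the field names, and the precedence (later layers override earlier ones, with user config winning) are assumptions, not the actual ProviderDef construction in providers.py.

```python
def merge_provider_layers(models_dev, overlay, user_config):
    """Merge three optional dict layers into one provider definition.

    Assumed precedence: models.dev < overlay < user config.
    A None value in a layer means "no opinion" and does not override.
    """
    merged = {}
    for layer in (models_dev or {}, overlay or {}, user_config or {}):
        for key, value in layer.items():
            if value is not None:
                merged[key] = value
    return merged


# Hypothetical field names, for illustration only.
provider = merge_provider_layers(
    {"name": "OpenRouter", "base_url": "https://openrouter.ai/api/v1"},
    {"is_aggregator": True, "base_url": None},
    {"base_url": "http://localhost:8080/v1"},
)
```

Passing `None` for a missing layer covers the three cases listed above: built-in providers (all layers), models.dev-only providers (no overlay), and user-defined endpoints (user config only).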
## --provider flag (hermes_cli/model_switch.py)
- parse_model_flags() extracts --provider and --global cleanly
- No more colon-based provider:model syntax (colons reserved for
OpenRouter :free/:extended/:fast/:beta suffixes)
- Explicit provider path: resolve → credentials → alias on target
- Implicit path: alias → fallback → catalog → detect_provider
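
A rough sketch of the flag extraction, assuming whitespace-separated arguments and a `(model, provider, is_global)` return shape. The function is deliberately named `parse_model_flags_sketch` because the real parse_model_flags() signature and return type are not shown here and may differ.

```python
def parse_model_flags_sketch(raw_args):
    """Split '/model' arguments into (model, provider, is_global).

    Only '--provider <slug>' and '--global' are recognized; everything else
    is treated as part of the model name, so colon suffixes such as ':free'
    pass through untouched.
    """
    tokens = raw_args.split()
    provider = None
    is_global = False
    rest = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token == "--global":
            is_global = True
        elif token == "--provider":
            # Consume the following token as the provider slug, if present.
            if i + 1 < len(tokens):
                provider = tokens[i + 1]
                i += 1
        else:
            rest.append(token)
        i += 1
    return " ".join(rest), provider, is_global
```

Because colons are never interpreted, a model id like `vendor/some-model:free` reaches the resolver unchanged.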
## Rich metadata display (cli.py, gateway/run.py)
- /model (no args) shows: context, max output, cost, capabilities
- /model switch shows: full metadata from models.dev
- Fallback to old context length lookup when models.dev has no data
## Config (hermes_cli/config.py)
- Added providers: {} to DEFAULT_CONFIG for user-defined endpoints
New foundation files:
- hermes_cli/providers.py: single source of truth for provider identity,
aliases, labels, transport types, api_mode determination
- hermes_cli/model_normalize.py: per-provider model name normalization
(anthropic uses hyphens, openrouter uses vendor/ prefix, etc.)
- agent/models_dev.py: extended with ModelCapabilities, get_model_capabilities(),
list_provider_models(), search_models_dev()
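
To illustrate per-provider normalization, here is a small sketch covering only the two rules named above (hyphens for anthropic, a vendor/ prefix for openrouter). `normalize_for_provider` and its `default_vendor` parameter are hypothetical; the real model_normalize.py likely handles more providers and edge cases.

```python
def normalize_for_provider(provider, model, default_vendor=None):
    """Return the model name in the form the given provider expects."""
    if provider == "anthropic":
        # Drop any vendor prefix and use hyphens instead of dots.
        bare = model.split("/", 1)[-1]
        return bare.replace(".", "-")
    if provider == "openrouter":
        # Keep an existing vendor/ prefix; add one when the vendor is known.
        if "/" in model or default_vendor is None:
            return model
        return f"{default_vendor}/{model}"
    # Assumption: other providers take the name unchanged.
    return model
```

For example, `anthropic/claude-sonnet-4.6` becomes `claude-sonnet-4-6` for the anthropic provider.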
Rebuilt model_switch.py:
- Dynamic alias resolution from catalog (no hardcoded versions)
- Aggregator-aware resolution (stays on OpenRouter, doesn't hijack to opencode-zen)
- Vendor:model conversion on aggregators (openai:gpt-5.4 -> openai/gpt-5.4)
- Per-provider model name normalization
- Capability metadata from models.dev
- Fuzzy suggestions on error
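
The vendor:model conversion on aggregators might look roughly like the sketch below. The function name is hypothetical, and leaving the OpenRouter variant suffixes alone is an assumption about how the two uses of the colon are kept apart.

```python
# Variant suffixes that must not be mistaken for a vendor separator.
OPENROUTER_SUFFIXES = {"free", "extended", "fast", "beta"}


def vendor_colon_to_slash(model):
    """Convert 'vendor:model' to 'vendor/model' for aggregator providers.

    Names that already contain a slash, or whose only colon introduces a
    variant suffix, are returned unchanged.
    """
    if "/" in model or ":" not in model:
        return model
    head, tail = model.split(":", 1)
    if tail in OPENROUTER_SUFFIXES:
        return model
    return f"{head}/{tail}"
```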
AIAgent.switch_model(): in-place model swap that follows the
_try_activate_fallback() pattern. It updates the primary runtime, invalidates
the system prompt, and rebuilds the client for cross-api-mode switches. It
uses determine_api_mode() from providers.py.
/model command:
- Session-only by default (no config.yaml write)
- --global flag to persist permanently
- Confirmation shows model, provider, context, capabilities, cache status
- Running-agent guard on gateway
- Gateway stores session overrides in _session_model_overrides dict
- Works on all platforms: CLI, Telegram, Discord, Slack, Matrix, and others
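
The session-only versus --global behavior can be pictured with the sketch below. `OverrideStore` is a hypothetical class: a plain dict stands in for config.yaml, and clearing the session override on a global switch is an assumption rather than documented behavior.

```python
class OverrideStore:
    """Toy model of session-scoped model overrides layered over a config."""

    def __init__(self, config):
        self.config = config          # stands in for config.yaml contents
        self.session_overrides = {}   # session id -> model name

    def set_model(self, session_id, model, persist=False):
        if persist:
            # --global: write to the config and drop any session override
            # (assumed) so the new default takes effect in this session too.
            self.config["model"] = model
            self.session_overrides.pop(session_id, None)
        else:
            # Default: affect only this session; the config is untouched.
            self.session_overrides[session_id] = model

    def effective_model(self, session_id):
        return self.session_overrides.get(session_id, self.config["model"])
```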
OpenCode Zen and Go are mixed-API-surface providers: the models behind
them use different API surfaces. On Zen, GPT uses codex_responses and
Claude uses anthropic_messages; on Go, MiniMax uses anthropic_messages
and GLM/Kimi use chat_completions.
Changes:
- Add normalize_opencode_model_id() and opencode_model_api_mode() to
models.py for model ID normalization and API surface routing
- Add _provider_supports_explicit_api_mode() to runtime_provider.py
to prevent stale api_mode from leaking across provider switches
- Wire opencode routing into all three api_mode resolution paths:
pool entry, api_key provider, and explicit runtime
- Add api_mode field to ModelSwitchResult for propagation through the
switch pipeline
- Consolidate _PROVIDER_MODELS from main.py into models.py (single
source of truth, eliminates duplicate dict)
- Add opencode normalization to setup wizard and model picker flows
- Add opencode block to _normalize_model_for_provider in CLI
- Add opencode-zen/go fallback model lists to setup.py
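
The routing described above can be sketched as a prefix match on the model name. This is not the actual opencode_model_api_mode() from models.py: the prefix matching, the `opencode-go` slug, the sample model ids, and the chat_completions default for unlisted models are all assumptions.

```python
def opencode_api_mode_sketch(provider, model):
    """Pick an API surface for a model behind OpenCode Zen or Go."""
    # Ignore any vendor/ prefix and compare case-insensitively.
    name = model.split("/", 1)[-1].lower()
    if provider == "opencode-zen":
        if name.startswith("gpt"):
            return "codex_responses"
        if name.startswith("claude"):
            return "anthropic_messages"
    elif provider == "opencode-go":
        if name.startswith("minimax"):
            return "anthropic_messages"
    # GLM and Kimi on Go use chat_completions; treating it as the default
    # for everything else is an assumption of this sketch.
    return "chat_completions"
```

Deriving the mode from the provider and model on every switch, instead of reusing a stored value, is one way to avoid the stale api_mode problem mentioned above.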
Tests: 160 targeted tests pass (26 new tests covering normalization,
api_mode routing per provider/model, persistence, and setup wizard
normalization).
Based on PR #3017 by SaM13997.
Co-authored-by: SaM13997 <139419381+SaM13997@users.noreply.github.com>
Phase 4 of the /model command overhaul.
Both the CLI (cli.py) and gateway (gateway/run.py) /model handlers
had ~50 lines of duplicated core logic: parsing, provider detection,
credential resolution, and model validation. This extracts that
pipeline into hermes_cli/model_switch.py.
New module exports:
- ModelSwitchResult: dataclass with all fields both handlers need
- CustomAutoResult: dataclass for bare '/model custom' results
- switch_model(): core pipeline — parse → detect → resolve → validate
- switch_to_custom_provider(): resolve endpoint + auto-detect model
The shared functions are pure (no I/O side effects). Each caller
handles its own platform-specific concerns:
- CLI: sets self.model/provider/etc, calls save_config_value(), prints
- Gateway: writes config.yaml directly, sets env vars, returns markdown
Net result: -244 lines from handlers, +234 lines in shared module.
The handlers are now ~80 lines each (down from 150+) and can't drift
apart on core logic.
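
A simplified sketch of the shared-pipeline idea: a pure function returns a result dataclass, and each caller decides what to do with it. `SwitchResultSketch`, `switch_model_sketch`, and the catalog shape are hypothetical; the real ModelSwitchResult has more fields and the real switch_model() also resolves credentials.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SwitchResultSketch:
    success: bool
    model: str = ""
    provider: str = ""
    error: Optional[str] = None


def switch_model_sketch(raw_input, current_provider, catalog):
    """parse -> detect -> validate, with no printing and no config writes.

    `catalog` maps provider slug -> collection of model names.
    """
    model = raw_input.strip()
    if not model:
        return SwitchResultSketch(False, error="no model given")
    # Prefer the current provider, then any other provider listing the model.
    candidates = [current_provider] + [p for p in catalog if p != current_provider]
    for provider in candidates:
        if model in catalog.get(provider, ()):
            return SwitchResultSketch(True, model=model, provider=provider)
    return SwitchResultSketch(False, error=f"unknown model: {model}")
```

A CLI handler would print the result and update its own state, while a gateway handler would format the same result as markdown; neither needs to repeat the resolution logic.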