Commit Graph

7 Commits

Author SHA1 Message Date
Teknium
fdd8d0515e fix: /model listing uses curated model lists, not full models.dev catalog
Use OPENROUTER_MODELS (28 curated) and _PROVIDER_MODELS from models.py
instead of the raw models.dev catalog (167 OpenRouter models). These are
hand-picked agentic models that work as agent backends.

Before: 167 models from models.dev sorted by capability score
After: 28 curated models in our recommended order
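
The before/after behaviour can be sketched roughly as follows. This is a minimal illustration, not the code from models.py: `CURATED_MODELS` and `models_for_listing` are hypothetical names, the two entries are borrowed from the example output elsewhere in this log, and falling back to the full catalog when no curated list exists is an assumption.

```python
from typing import Dict, List

# Hypothetical stand-in for the curated lists named in the commit message
# (OPENROUTER_MODELS / _PROVIDER_MODELS); the real lists are longer.
CURATED_MODELS: Dict[str, List[str]] = {
    "openrouter": ["anthropic/claude-sonnet-4.6", "openai/gpt-5.2-codex"],
}


def models_for_listing(provider: str, catalog: Dict[str, List[str]]) -> List[str]:
    """Return the curated list in its hand-picked order when one exists.

    Assumption: providers without a curated list fall back to the catalog.
    """
    curated = CURATED_MODELS.get(provider)
    if curated:
        return list(curated)
    return list(catalog.get(provider, []))
```
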
2026-04-05 01:01:15 -07:00
Teknium
7b60189206 feat: /model shows authenticated providers with top models
/model (no args) now lists every provider the user has credentials for,
plus all user-defined endpoints from config.yaml providers: section.

Each entry shows:
- Provider display name
- The --provider slug to use
- (current) tag on the active provider
- Top models sorted by capability (tool_call + context window)
- Model count for providers with large catalogs
- URL for user-defined endpoints

Example output:
  OpenRouter [--provider openrouter] (current):
    anthropic/claude-sonnet-4.6, openai/gpt-5.2-codex, ...  (+161 more)

  Anthropic [--provider anthropic]:
    claude-opus-4-6, claude-sonnet-4-6, ...  (+16 more)

  My Ollama [--provider my-ollama]:
    http://localhost:11434/v1

Detection works by checking env vars from models.dev provider metadata
and auth store entries for OAuth providers. User-defined endpoints are
always shown.
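
The detection rule described above might look something like the sketch below. The function name, its parameters, and the shape of the inputs are all assumptions made for illustration; the real lookup reads models.dev provider metadata and the auth store, which are not shown in this log.

```python
import os
from typing import Iterable, List, Mapping, Optional


def detect_authenticated_providers(
    provider_env_vars: Mapping[str, Iterable[str]],
    oauth_store: Iterable[str] = (),
    user_endpoints: Iterable[str] = (),
    environ: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return provider slugs the user appears to have credentials for.

    provider_env_vars: slug -> env var names (hypothetical shape).
    oauth_store: slugs with a stored OAuth entry (hypothetical shape).
    user_endpoints: slugs of user-defined endpoints, always included.
    """
    env = os.environ if environ is None else environ
    found: List[str] = []
    # 1. A provider counts as authenticated if any of its env vars is set.
    for slug, env_vars in provider_env_vars.items():
        if any(env.get(name) for name in env_vars):
            found.append(slug)
    # 2. OAuth providers are detected through the auth store instead.
    for slug in oauth_store:
        if slug not in found:
            found.append(slug)
    # 3. User-defined endpoints are listed unconditionally.
    for slug in user_endpoints:
        if slug not in found:
            found.append(slug)
    return found
```

Passing `environ` explicitly keeps the function easy to test without touching the real process environment.
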
2026-04-05 01:01:15 -07:00
Teknium
1cc264a76e feat: models.dev as primary database + --provider flag + full metadata
Major overhaul of the model/provider system:

## models.dev as primary database (agent/models_dev.py)
- Full ModelInfo dataclass: context window, max output, cost/M tokens,
  capabilities (reasoning, tools, vision, PDF, audio, structured output),
  modalities, knowledge cutoff, open_weights, family, status
- Full ProviderInfo dataclass: name, base URL, env vars, doc link
- New queries: get_provider_info(), get_model_info(), list_all_providers(),
  get_providers_for_env_var(), get_model_info_any_provider(),
  list_provider_model_infos()
- 109 providers, 4000+ models with exact metadata
- Backward-compatible: existing ModelCapabilities API unchanged

## Hermes overlay system (hermes_cli/providers.py)
- HermesOverlay: transport type, auth patterns, aggregator flags
- Merge chain: models.dev + overlay + user config = complete ProviderDef
- User-defined endpoints via config.yaml providers: section
- resolve_provider_full() — single entry point for --provider resolution
- Works for built-in, models.dev-only, AND user-defined providers
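
As a rough illustration of the merge chain above, the layers can be thought of as dictionaries applied in order. This sketch assumes later layers override earlier ones and that `None` values are skipped; the actual precedence rules and the fields of ProviderDef are not stated in this log, so `merge_provider_def` and the keys used with it are hypothetical.

```python
from typing import Any, Dict, Optional


def merge_provider_def(
    models_dev: Optional[Dict[str, Any]],
    overlay: Optional[Dict[str, Any]],
    user_config: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Merge models.dev data, the overlay, and user config into one dict.

    Assumption: layers are applied left to right, so user config wins.
    """
    merged: Dict[str, Any] = {}
    for layer in (models_dev, overlay, user_config):
        if layer:
            # Skip None so a missing value never erases an earlier layer.
            merged.update({k: v for k, v in layer.items() if v is not None})
    return merged
```

Any of the three layers may be absent, which is how the same function can serve built-in, models.dev-only, and user-defined providers.
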

## --provider flag (hermes_cli/model_switch.py)
- parse_model_flags() extracts --provider and --global cleanly
- No more colon-based provider:model syntax (colons reserved for
  OpenRouter :free/:extended/:fast/:beta suffixes)
- Explicit provider path: resolve → credentials → alias on target
- Implicit path: alias → fallback → catalog → detect_provider
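
Flag extraction of the kind described above could be sketched like this. The function is deliberately named `parse_flags_sketch` because the real parse_model_flags() signature and return type are not shown here; the tuple layout and the `--provider=value` form are assumptions.

```python
from typing import List, Optional, Tuple


def parse_flags_sketch(raw: str) -> Tuple[str, Optional[str], bool]:
    """Split '/model' arguments into (model, provider, is_global).

    Colons are left untouched, so suffixes on the model name survive.
    """
    tokens = raw.split()
    provider: Optional[str] = None
    is_global = False
    rest: List[str] = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "--global":
            is_global = True
        elif tok == "--provider":
            if i + 1 >= len(tokens):
                raise ValueError("--provider requires a value")
            provider = tokens[i + 1]
            i += 1  # consume the flag's value as well
        elif tok.startswith("--provider="):
            provider = tok.split("=", 1)[1]
        else:
            rest.append(tok)
        i += 1
    return " ".join(rest), provider, is_global
```
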

## Rich metadata display (cli.py, gateway/run.py)
- /model (no args) shows: context, max output, cost, capabilities
- /model switch shows: full metadata from models.dev
- Fallback to old context length lookup when models.dev has no data

## Config (hermes_cli/config.py)
- Added providers: {} to DEFAULT_CONFIG for user-defined endpoints
2026-04-05 01:01:15 -07:00
Teknium
afcbab5323 feat: /model command — full provider+model system overhaul
New foundation files:
- hermes_cli/providers.py: single source of truth for provider identity,
  aliases, labels, transport types, api_mode determination
- hermes_cli/model_normalize.py: per-provider model name normalization
  (anthropic uses hyphens, openrouter uses vendor/ prefix, etc.)
- agent/models_dev.py: extended with ModelCapabilities, get_model_capabilities(),
  list_provider_models(), search_models_dev()

Rebuilt model_switch.py:
- Dynamic alias resolution from catalog (no hardcoded versions)
- Aggregator-aware resolution (stays on OpenRouter, doesn't hijack to opencode-zen)
- Vendor:model conversion on aggregators (openai:gpt-5.4 -> openai/gpt-5.4)
- Per-provider model name normalization
- Capability metadata from models.dev
- Fuzzy suggestions on error
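
The vendor:model conversion listed above can be illustrated with a small helper. `to_aggregator_model_id` is a hypothetical name, and the rule of leaving names that already contain a slash unchanged is an assumption; only the `openai:gpt-5.4 -> openai/gpt-5.4` mapping comes from the commit message.

```python
def to_aggregator_model_id(name: str) -> str:
    """Rewrite 'vendor:model' as 'vendor/model' for aggregator providers."""
    # Already in vendor/model form, or no vendor prefix at all: leave as is.
    if "/" in name or ":" not in name:
        return name
    vendor, model = name.split(":", 1)
    if not vendor or not model:
        return name
    return f"{vendor}/{model}"
```
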

AIAgent.switch_model(): in-place model swap following _try_activate_fallback()
pattern. Updates primary runtime, invalidates system prompt, rebuilds client
for cross-api-mode switches. Uses determine_api_mode() from providers.py.

/model command:
- Session-only by default (no config.yaml write)
- --global flag to persist permanently
- Confirmation shows model, provider, context, capabilities, cache status
- Running-agent guard on gateway
- Gateway stores session overrides in _session_model_overrides dict
- Works across CLI, Telegram, Discord, Slack, Matrix, all platforms
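
The session-only default and the --global flag can be pictured with the sketch below. `ModelSettings` is a hypothetical class: `persisted_model` stands in for the value stored in config.yaml, and clearing the session override when persisting is an assumption, not something the commit message states.

```python
from typing import Dict


class ModelSettings:
    """Toy model of session overrides layered on a persisted default."""

    def __init__(self, default_model: str) -> None:
        self.persisted_model = default_model  # stands in for config.yaml
        self._session_overrides: Dict[str, str] = {}

    def set_model(self, session_id: str, model: str, persist: bool = False) -> None:
        if persist:
            # --global: change the stored default for every session.
            self.persisted_model = model
            self._session_overrides.pop(session_id, None)
        else:
            # Default: affect only this session, nothing is written out.
            self._session_overrides[session_id] = model

    def model_for(self, session_id: str) -> str:
        return self._session_overrides.get(session_id, self.persisted_model)
```
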
2026-04-05 01:01:15 -07:00
Teknium
28a073edc6 fix: repair OpenCode model routing and selection (#4508)
OpenCode Zen and Go are mixed-API-surface providers — different models
behind them use different API surfaces (GPT on Zen uses codex_responses,
Claude on Zen uses anthropic_messages, MiniMax on Go uses
anthropic_messages, GLM/Kimi on Go use chat_completions).
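
The mapping in the paragraph above could be expressed as a small routing function. This is a sketch under assumptions: the function name is hypothetical, the `opencode-go` slug is inferred from "opencode-zen/go", matching on a lowercase name prefix is a guess at how models are recognised, and defaulting to chat_completions for anything unlisted is also an assumption.

```python
def opencode_api_mode_sketch(provider: str, model_id: str) -> str:
    """Pick an API surface for a model behind OpenCode Zen or Go."""
    name = model_id.lower()
    if provider == "opencode-zen":
        if name.startswith("gpt"):
            return "codex_responses"
        if name.startswith("claude"):
            return "anthropic_messages"
    elif provider == "opencode-go":
        if name.startswith("minimax"):
            return "anthropic_messages"
    # GLM/Kimi on Go use chat_completions; also used here as the default.
    return "chat_completions"
```

Deriving the mode from the provider and model on every switch, rather than reusing a stored value, is one way to avoid a stale api_mode carrying over between providers.
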

Changes:
- Add normalize_opencode_model_id() and opencode_model_api_mode() to
  models.py for model ID normalization and API surface routing
- Add _provider_supports_explicit_api_mode() to runtime_provider.py
  to prevent stale api_mode from leaking across provider switches
- Wire opencode routing into all three api_mode resolution paths:
  pool entry, api_key provider, and explicit runtime
- Add api_mode field to ModelSwitchResult for propagation through the
  switch pipeline
- Consolidate _PROVIDER_MODELS from main.py into models.py (single
  source of truth, eliminates duplicate dict)
- Add opencode normalization to setup wizard and model picker flows
- Add opencode block to _normalize_model_for_provider in CLI
- Add opencode-zen/go fallback model lists to setup.py

Tests: 160 targeted tests pass (26 new tests covering normalization,
api_mode routing per provider/model, persistence, and setup wizard
normalization).

Based on PR #3017 by SaM13997.

Co-authored-by: SaM13997 <139419381+SaM13997@users.noreply.github.com>
2026-04-02 09:36:24 -07:00
Teknium
8bb1d15da4 chore: remove ~100 unused imports across 55 files (#3016)
Automated cleanup via pyflakes + autoflake with manual review.

Changes:
- Removed unused stdlib imports (os, sys, json, pathlib.Path, etc.)
- Removed unused typing imports (List, Dict, Any, Optional, Tuple, Set, etc.)
- Removed unused internal imports (hermes_cli.auth, hermes_cli.config, etc.)
- Fixed cli.py: removed 8 shadowed banner imports (imported from hermes_cli.banner
  then immediately redefined locally — only build_welcome_banner is actually used)
- Added noqa comments to imports that appear unused but serve a purpose:
  - Re-exports (gateway/session.py SessionResetPolicy, tools/terminal_tool.py
    is_interrupted/_interrupt_event)
  - SDK presence checks in try/except (daytona, fal_client, discord)
  - Test mock targets (auxiliary_client.py Path, mcp_config.py get_hermes_home)
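
For readers unfamiliar with the pattern, the two most common cases above look roughly like this. The example uses standard-library modules as stand-ins so it runs anywhere; the exact noqa form used in the commit is not shown in this log, and `F401` is flake8's code for an unused import.

```python
# Re-export: imported only so other modules can import it from here.
from collections import OrderedDict  # noqa: F401

# SDK presence check: the import itself is the test, so the name is
# never used afterwards. `json` stands in for an optional third-party SDK.
try:
    import json as _optional_sdk  # noqa: F401

    HAS_OPTIONAL_SDK = True
except ImportError:
    HAS_OPTIONAL_SDK = False
```
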

Zero behavioral changes. Full test suite passes (6162/6162, 2 pre-existing
streaming test failures unrelated to this change).
2026-03-25 15:02:03 -07:00
Teknium
2e524272b1 refactor(model): extract shared switch_model() from CLI and gateway handlers
Phase 4 of the /model command overhaul.

Both the CLI (cli.py) and gateway (gateway/run.py) /model handlers
had ~50 lines of duplicated core logic: parsing, provider detection,
credential resolution, and model validation. This extracts that
pipeline into hermes_cli/model_switch.py.

New module exports:
- ModelSwitchResult: dataclass with all fields both handlers need
- CustomAutoResult: dataclass for bare '/model custom' results
- switch_model(): core pipeline — parse → detect → resolve → validate
- switch_to_custom_provider(): resolve endpoint + auto-detect model

The shared functions are pure (no I/O side effects). Each caller
handles its own platform-specific concerns:
- CLI: sets self.model/provider/etc, calls save_config_value(), prints
- Gateway: writes config.yaml directly, sets env vars, returns markdown

Net result: -244 lines from handlers, +234 lines in shared module.
The handlers are now ~80 lines each (down from ~150+) and can't drift
apart on core logic.
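
The split between a pure shared pipeline and platform-specific callers might be sketched as follows. `SwitchResult` and `switch_model_sketch` are simplified, hypothetical stand-ins: the real ModelSwitchResult has more fields, and the real pipeline also performs provider detection and credential resolution, which are reduced here to an injected `is_valid` callback.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class SwitchResult:
    """Everything a caller needs; the pipeline itself changes no state."""

    success: bool
    model: str = ""
    provider: str = ""
    error: Optional[str] = None


def switch_model_sketch(
    raw_input: str,
    current_provider: str,
    is_valid: Callable[[str, str], bool],
) -> SwitchResult:
    """Parse and validate a model switch without touching config or env."""
    model = raw_input.strip()
    if not model:
        return SwitchResult(False, error="no model given")
    provider = current_provider  # detection is omitted in this sketch
    if not is_valid(provider, model):
        return SwitchResult(
            False, model=model, provider=provider, error=f"unknown model: {model}"
        )
    return SwitchResult(True, model=model, provider=provider)
```

A CLI caller would then apply a successful result to its own state and print, while a gateway caller would persist it and return markdown, so the shared logic stays in one place.
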
2026-03-24 07:08:07 -07:00