hermes-agent/agent at 48cb8d20b25885a0899aa3dab110d43ce36cfaf4 - hermes-agent - ling

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-28 15:01:34 +08:00

taeng0204 6f79b8f01d fix(kimi): route temperature override by base_url — kimi-k2.5 needs 1.0 on api.moonshot.ai

Follow-up to #12144.  That PR standardized the kimi-k2.* temperature lock
against the Coding Plan endpoint (api.kimi.com/coding/v1) docs, where
non-thinking models require 0.6.  Verified empirically against Moonshot
(April 2026) that the public chat endpoint (api.moonshot.ai/v1) has a
different contract for kimi-k2.5: it only accepts temperature=1, and rejects
0.6 with:

    HTTP 400 "invalid temperature: only 1 is allowed for this model"

Users hit the public endpoint when KIMI_API_KEY is a legacy sk-* key (the
sk-kimi-* prefix routes to Coding Plan — see hermes_cli/auth.py).  So for
Coding Plan subscribers the fix from #12144 is correct, but for public-API
users it reintroduces the exact 400 reported in #9125.
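
The key-prefix routing described above can be sketched roughly as follows. This is a minimal illustration, not the code in hermes_cli/auth.py: the helper name `sketch_kimi_base_url` and the constant names are hypothetical, and the `https://` scheme is assumed (the message gives only host and path).

```python
# Endpoints named in the commit message; the https:// scheme is an assumption.
CODING_PLAN_BASE_URL = "https://api.kimi.com/coding/v1"
PUBLIC_BASE_URL = "https://api.moonshot.ai/v1"


def sketch_kimi_base_url(api_key: str) -> str:
    """Hypothetical helper: pick an endpoint from the KIMI_API_KEY prefix."""
    # Check the more specific prefix first, since "sk-kimi-" also starts with "sk-".
    if api_key.startswith("sk-kimi-"):
        return CODING_PLAN_BASE_URL
    # Legacy sk-* keys land on the public Moonshot endpoint.
    return PUBLIC_BASE_URL
```

Under these assumptions, a legacy `sk-*` key resolves to the public endpoint, which is where the 0.6 default from #12144 is rejected for kimi-k2.5.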

Reproduction on api.moonshot.ai/v1 + kimi-k2.5:
  temperature=1.0 → 200 OK
  temperature=0.6 → 400 "only 1 is allowed"     ← #12144 default
  temperature=None → 200 OK

Other kimi-k2.* models are unaffected empirically — turbo-preview accepts
0.6 and thinking-turbo accepts 1.0 on both endpoints — so only kimi-k2.5
diverges.

Fix: thread the client's actual base_url through _build_call_kwargs (the
parameter already existed but callers passed config-level resolved_base_url;
for auto-detected routes that was often empty).  _fixed_temperature_for_model
now checks api.moonshot.ai first via an explicit _KIMI_PUBLIC_API_OVERRIDES
map, then falls back to the Coding Plan defaults.  Tests parametrize over
endpoint + model to lock both contracts.
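
A minimal sketch of the lookup order described above, for illustration only. The function name `sketch_fixed_temperature`, the table contents, and the rule that thinking variants get 1.0 are assumptions inferred from this message; the real `_fixed_temperature_for_model` and `_KIMI_PUBLIC_API_OVERRIDES` in auxiliary_client.py may differ in signature and contents.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical stand-in for _KIMI_PUBLIC_API_OVERRIDES: only kimi-k2.5
# is reported to diverge on api.moonshot.ai.
KIMI_PUBLIC_API_OVERRIDES = {"kimi-k2.5": 1.0}

# Coding Plan default for non-thinking kimi-k2.* models, per the message.
CODING_PLAN_DEFAULT = 0.6


def sketch_fixed_temperature(model: str, base_url: Optional[str]) -> Optional[float]:
    """Return a locked temperature for kimi-k2.* models, or None for others."""
    name = model.lower().rsplit("/", 1)[-1]  # drop any "vendor/" prefix
    if not name.startswith("kimi-k2"):
        return None

    # 1. Public-endpoint overrides win when the client really talks to Moonshot.
    host = urlparse(base_url or "").hostname or ""
    if host == "api.moonshot.ai" and name in KIMI_PUBLIC_API_OVERRIDES:
        return KIMI_PUBLIC_API_OVERRIDES[name]

    # 2. Otherwise fall back to the Coding Plan defaults.
    #    Assumption: thinking variants use 1.0, non-thinking variants 0.6.
    if "thinking" in name:
        return 1.0
    return CODING_PLAN_DEFAULT
```

With an empty or missing `base_url` the sketch falls through to the Coding Plan default, which is why passing the client's actual base URL (rather than a possibly empty config-level value) matters for auto-detected routes.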

Closes #9125.

2026-04-19 18:54:35 -07:00

__init__.py

Refactor Terminal and AIAgent cleanup

2026-02-21 22:31:43 -08:00

anthropic_adapter.py

feat(providers): extend request_timeout_seconds to all client paths

2026-04-19 11:23:00 -07:00

auxiliary_client.py

fix(kimi): route temperature override by base_url — kimi-k2.5 needs 1.0 on api.moonshot.ai

2026-04-19 18:54:35 -07:00

bedrock_adapter.py

feat: native AWS Bedrock provider via Converse API

2026-04-15 16:17:17 -07:00

context_compressor.py

feat(compression): summaries now respect the conversation's language

2026-04-19 11:05:14 -07:00

context_engine.py

refactor: remove dead code — 1,784 lines across 77 files (#9180)

2026-04-13 16:32:04 -07:00

context_references.py

fix(agent): preserve quoted @file references with spaces

2026-04-10 13:05:01 -07:00

copilot_acp_client.py

fix: handle httpx.Timeout object in CopilotACPClient (#11058)

2026-04-16 12:05:11 -07:00

credential_pool.py

fix(codex): Hermes owns its own Codex auth; stop touching ~/.codex/auth.json (#12360)

2026-04-18 19:19:46 -07:00

display.py

fix: remove context pressure warnings entirely (#11039)

2026-04-16 06:44:23 -07:00

error_classifier.py

feat: native AWS Bedrock provider via Converse API

2026-04-15 16:17:17 -07:00

gemini_cloudcode_adapter.py

fix(gemini-cli): surface MODEL_CAPACITY_EXHAUSTED cleanly + drop retired gemma-4-26b (#11833)

2026-04-17 15:34:12 -07:00

gemini_native_adapter.py

fix(gemini): tighten native routing and streaming replay

2026-04-19 12:40:08 -07:00

google_code_assist.py

fix(gemini-cli): surface MODEL_CAPACITY_EXHAUSTED cleanly + drop retired gemma-4-26b (#11833)

2026-04-17 15:34:12 -07:00

google_oauth.py

feat(gemini): add Google Gemini CLI OAuth provider via Cloud Code Assist (free + paid tiers) (#11270)

2026-04-16 16:49:00 -07:00

insights.py

fix(insights): hide cache read/write and cost metrics from display (#11477)

2026-04-17 01:02:06 -07:00

manual_compression_feedback.py

fix(gateway): make manual compression feedback truthful

2026-04-10 21:16:53 -07:00

memory_manager.py

feat(honcho): context injection overhaul, 5-tool surface, cost safety, session isolation (#10619)

2026-04-15 19:12:19 -07:00

memory_provider.py

refactor(memory): drop on_session_reset — commit-only is enough

2026-04-15 11:28:45 -07:00

model_metadata.py

fix(gemini-cli): surface MODEL_CAPACITY_EXHAUSTED cleanly + drop retired gemma-4-26b (#11833)

2026-04-17 15:34:12 -07:00

models_dev.py

fix(gemini): hide stale and low-TPM Google models

2026-04-18 12:52:01 -07:00

nous_rate_guard.py

fix: Nous Portal rate limit guard — prevent retry amplification (#10568)

2026-04-15 16:31:48 -07:00

prompt_builder.py

docs(memory): steer agents to save declarative facts, not instructions (#12665)

2026-04-19 12:00:53 -07:00

prompt_caching.py

fix(prompt-caching): skip top-level cache_control on role:tool for OpenRouter

2026-03-21 16:54:43 -07:00

rate_limit_tracker.py

refactor: remove dead code — 1,784 lines across 77 files (#9180)

2026-04-13 16:32:04 -07:00

redact.py

fix(security): add JWT token and Discord mention redaction (#10547)

2026-04-15 16:08:52 -07:00

retry_utils.py

feat(agent): add jittered retry backoff

2026-04-08 00:41:36 -07:00

skill_commands.py

fix: use absolute skill_dir for external skills (#10313) (#10587)

2026-04-15 17:22:55 -07:00

skill_utils.py

feat(plugins): namespaced skill registration for plugin skill bundles

2026-04-14 10:42:58 -07:00

subdirectory_hints.py

fix(agent): catch PermissionError in subdirectory hint discovery

2026-04-09 03:10:30 -07:00

title_generator.py

fix: title_generator no longer logs as 'compression' task

2026-04-12 04:17:18 -07:00

trajectory.py

Refactor Terminal and AIAgent cleanup

2026-02-21 22:31:43 -08:00

usage_pricing.py

feat: native AWS Bedrock provider via Converse API

2026-04-15 16:17:17 -07:00