Universal reactive fix for 'HTTP 400: Unsupported parameter: temperature' across all providers/models

Not just Codex Responses: the same backend can accept `temperature` for some models and reject it for others (e.g. gpt-5.4 accepts but gpt-5.5 rejects on the same OpenAI endpoint; similar patterns appear on Copilot, OpenRouter reasoning routes, and Anthropic Opus 4.7+ via OAI-compat). An allow/deny list keyed on model name does not scale.

`call_llm` / `async_call_llm` now detect the concrete 'unsupported parameter: temperature' 400 and transparently retry once without `temperature`. Kimi's server-managed omission and Opus 4.7+'s proactive strip stay in place; this is the safety net for everything else.

Changes:
- agent/auxiliary_client.py: add a `_is_unsupported_temperature_error` helper and wire it into both the sync and async `call_llm` paths, ahead of the existing max_tokens/payment/auth retry ladder (see the sketch below)
- tests/agent/test_unsupported_temperature_retry.py: 19 tests covering detector phrasings, sync and async retry, no retry when `temperature` was not sent, and non-temperature 400s not triggering the retry

Builds on PR #15620 (codex_responses fallback), which stripped `temperature` up front for that one api_mode. This PR closes the gap for every other provider/model combination via reactive retry.

Credit: the retry approach and detector originate from @BlueBirdBack's PR #15578.

Co-authored-by: BlueBirdBack <BlueBirdBack@users.noreply.github.com>
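For illustration, a minimal sketch of the reactive-retry shape, assuming an OpenAI-style client that raises an error carrying the HTTP status and message. The helper name `_is_unsupported_temperature_error` matches the PR; the `LLMHTTPError` type, the `client.complete` call, and the regex phrasing are illustrative stand-ins, not the actual auxiliary_client internals.

```python
# Sketch only: the real code lives in agent/auxiliary_client.py; the error
# type and call shape here are hypothetical stand-ins.
import re


class LLMHTTPError(Exception):
    """Stand-in for whatever HTTP error the client raises (hypothetical)."""

    def __init__(self, status_code: int, message: str):
        super().__init__(message)
        self.status_code = status_code
        self.message = message


# Providers phrase the rejection slightly differently, so match loosely on
# "unsupported ... temperature" rather than on one exact string.
_UNSUPPORTED_TEMPERATURE = re.compile(
    r"unsupported\s+(?:parameter|value)?.{0,40}temperature", re.IGNORECASE
)


def _is_unsupported_temperature_error(exc: Exception) -> bool:
    """True only for the concrete HTTP 400 that rejects `temperature`."""
    return (
        isinstance(exc, LLMHTTPError)
        and exc.status_code == 400
        and _UNSUPPORTED_TEMPERATURE.search(exc.message) is not None
    )


def call_llm(client, payload: dict) -> dict:
    """One transparent retry without `temperature`, then re-raise."""
    try:
        return client.complete(payload)
    except Exception as exc:
        # Only retry when we actually sent temperature and the 400 names it;
        # any other error falls through to the existing retry ladder.
        if "temperature" in payload and _is_unsupported_temperature_error(exc):
            retry_payload = {k: v for k, v in payload.items() if k != "temperature"}
            return client.complete(retry_payload)
        raise
```

The retry is deliberately single-shot rather than a loop: if the stripped request still fails, the error surfaces immediately instead of masking an unrelated problem. The async path mirrors this with `await` around the two calls.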