hermes-agent/tests at acdcb167fb1cb9ba36cf6e36e98c2ff83ba11d70 - hermes-agent - ling

ling/hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-04 01:37:34 +08:00


Teknium 51f4c9827f fix(context): resolve real Codex OAuth context windows (272k, not 1M) (#14935)

On ChatGPT Codex OAuth every gpt-5.x slug actually caps at 272,000 tokens,
but Hermes was resolving gpt-5.5 / gpt-5.4 to 1,050,000 (from models.dev)
because openai-codex aliases to the openai entry there. At 1.05M the
compressor never fires and requests hard-fail with 'context window
exceeded' around the real 272k boundary.
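To make the failure mode concrete, here is a minimal sketch that assumes the compressor fires once the prompt reaches a fixed percentage of the resolved context window. The 85% trigger and the name `should_compress` are illustrative assumptions, not Hermes's actual threshold or API; only the two window sizes come from the commit message.

```python
# Illustrative only: the trigger percentage and function name are assumptions.
REAL_CODEX_LIMIT = 272_000      # cap reported by the Codex OAuth backend
MODELS_DEV_LIMIT = 1_050_000    # value previously resolved via models.dev
COMPRESS_PERCENT = 85           # hypothetical trigger point


def should_compress(prompt_tokens: int, context_length: int) -> bool:
    """Return True once the prompt reaches the compression threshold."""
    threshold = context_length * COMPRESS_PERCENT // 100
    return prompt_tokens >= threshold


prompt_tokens = 260_000  # close to the real 272k boundary

# Wrong window: threshold is 892,500 tokens, so compression is not triggered
# before the backend rejects the request near 272k.
print(should_compress(prompt_tokens, MODELS_DEV_LIMIT))

# Real window: threshold is 231,200 tokens, so compression runs in time.
print(should_compress(prompt_tokens, REAL_CODEX_LIMIT))
```

Whatever the real trigger is, any threshold derived from 1.05M sits far above the 272k hard limit, which is why the request fails before compression can help.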

Verified live against chatgpt.com/backend-api/codex/models:
  gpt-5.5, gpt-5.4, gpt-5.4-mini, gpt-5.3-codex, gpt-5.2-codex,
  gpt-5.2, gpt-5.1-codex-max → context_window = 272000

Changes:
- agent/model_metadata.py:
  * _fetch_codex_oauth_context_lengths() — probe the Codex /models
    endpoint with the OAuth bearer token and read context_window per
    slug (1h in-memory TTL).
  * _resolve_codex_oauth_context_length() — prefer the live probe,
    fall back to hardcoded _CODEX_OAUTH_CONTEXT_FALLBACK (all 272k).
  * Wire into get_model_context_length() when provider=='openai-codex',
    running BEFORE the models.dev lookup (which returns 1.05M). Result
    persists via save_context_length() so subsequent lookups skip the
    probe entirely.
  * Fixed the now-wrong comment on the DEFAULT_CONTEXT_LENGTHS gpt-5.5
    entry (400k was never right for Codex; it's the catch-all for
    providers we can't probe live).
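The resolution order described in the list above can be sketched roughly as follows. This is not the code from agent/model_metadata.py: the function names, the injected `probe` callable, and the `models_dev` dictionary are hypothetical stand-ins, and persistence via save_context_length() is left out. Only the 272k values, the slug list, the 1h TTL, and the "Codex branch before models.dev" ordering come from the commit message.

```python
import time
from typing import Callable, Dict, Optional

# Slugs and the 272k cap are taken from the commit message.
CODEX_OAUTH_CONTEXT_FALLBACK: Dict[str, int] = {
    slug: 272_000
    for slug in (
        "gpt-5.5", "gpt-5.4", "gpt-5.4-mini", "gpt-5.3-codex",
        "gpt-5.2-codex", "gpt-5.2", "gpt-5.1-codex-max",
    )
}
CACHE_TTL_SECONDS = 3600  # 1h in-memory TTL

# Hypothetical probe signature: token -> {slug: context_window}, or None on failure.
Probe = Callable[[str], Optional[Dict[str, int]]]

_cache: Dict[str, int] = {}
_cache_expires_at = 0.0


def fetch_context_lengths(token: str, probe: Probe) -> Dict[str, int]:
    """Return live slug -> context_window data, cached for up to one hour."""
    global _cache, _cache_expires_at
    now = time.monotonic()
    if _cache and now < _cache_expires_at:
        return _cache
    result = probe(token)  # e.g. None for a non-200 response
    if result:
        _cache = dict(result)
        _cache_expires_at = now + CACHE_TTL_SECONDS
        return _cache
    return {}


def resolve_codex_context_length(
    model: str, token: Optional[str], probe: Probe
) -> Optional[int]:
    """Prefer the live probe, then fall back to the hardcoded table."""
    if token:
        live = fetch_context_lengths(token, probe)
        if model in live:
            return live[model]
    return CODEX_OAUTH_CONTEXT_FALLBACK.get(model)


def get_model_context_length(
    model: str,
    provider: str,
    token: Optional[str] = None,
    probe: Probe = lambda _token: None,
    models_dev: Optional[Dict[str, int]] = None,
) -> Optional[int]:
    """Run the Codex branch before the generic (models.dev-style) lookup."""
    if provider == "openai-codex":
        resolved = resolve_codex_context_length(model, token, probe)
        if resolved is not None:
            return resolved
    return (models_dev or {}).get(model)


models_dev = {"gpt-5.5": 1_050_000}  # stand-in for the generic catalog
print(get_model_context_length("gpt-5.5", "openai-codex", models_dev=models_dev))
print(get_model_context_length("gpt-5.5", "openrouter", models_dev=models_dev))
```

Injecting the probe as a callable keeps the sketch free of network access and makes the three Codex cases (no token, failed probe, successful probe) easy to exercise in isolation.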

Tests (4 new in TestCodexOAuthContextLength):
- fallback table used when no token is available (no models.dev leakage)
- live probe overrides the fallback
- probe failure (non-200) falls back to hardcoded 272k
- non-codex providers (openrouter, direct openai) unaffected

Non-codex context resolution is unchanged — the Codex branch only fires
when provider=='openai-codex'.

2026-04-23 22:39:47 -07:00


fix(acp): wire approval callback + make it thread-local (#13525)

2026-04-21 06:20:40 -07:00

fix(context): resolve real Codex OAuth context windows (272k, not 1M) (#14935)

2026-04-23 22:39:47 -07:00

test(approval): regression guards for thread-local callback contract

2026-04-21 14:29:08 -07:00

feat(cron): honor hermes tools config for the cron platform (#14798)

2026-04-23 15:48:50 -07:00

fix: follow-up for salvaged PRs #6293, #7387, #9091, #13131

2026-04-20 14:56:04 -07:00

environments/benchmarks

fix(security): consolidated security hardening — SSRF, timing attack, tar traversal, credential leakage (#5944)

2026-04-07 17:28:37 -07:00

…

fix(tui): restore voice/panic handlers + scope fuzzy paths to cwd

2026-04-23 19:38:33 -05:00

feat(tui): match CLI's voice slash + VAD-continuous recording model

2026-04-23 16:18:15 -07:00

feat(honcho): wizard cadence default 2, surface reasoning level, backwards-compat fallback

2026-04-18 22:50:55 -07:00

fix(discord): strip RTP padding before DAVE/Opus decode (#11267)

2026-04-16 16:50:15 -07:00

fix(xai-image): drop unreachable editing code path

2026-04-23 15:13:34 -07:00

test,chore: cover stringified array/object coercion + AUTHOR_MAP entry

2026-04-23 16:38:38 -07:00

fix(google-workspace): normalize authorized user token writes

2026-04-16 04:22:16 -07:00

feat(browser): CDP supervisor — dialog detection + response + cross-origin iframe eval (#14540)

2026-04-23 22:23:37 -07:00

Merge branch 'main' into fix/tui-provider-resolution

2026-04-22 11:47:49 -07:00

__init__.py

…

conftest.py

test(conftest): reset module-level state + unset platform allowlists (#13400)

2026-04-21 01:33:10 -07:00

run_interrupt_test.py

fix: thread safety for concurrent subagent delegation (#1672)

2026-03-17 02:53:33 -07:00

test_account_usage.py

feat(account-usage): add per-provider account limits module

2026-04-21 01:56:35 -07:00

test_base_url_hostname.py

security(runtime_provider): close OLLAMA_API_KEY substring-leak sweep miss (#13522)

2026-04-21 06:06:16 -07:00

test_batch_runner_checkpoint.py

fix(batch_runner): mark discarded no-reasoning prompts as completed (#9950)

2026-04-20 04:56:06 -07:00

test_cli_file_drop.py

fix(tui): improve macOS paste and shortcut parity

2026-04-21 08:00:00 -07:00

test_cli_skin_integration.py

fix: align status bar skin tests with upstream main

2026-04-22 13:20:02 -07:00

test_ctx_halving_fix.py

fix(tests): fix 78 CI test failures and remove dead test (#9036)

2026-04-13 10:50:24 -07:00

test_empty_model_fallback.py

fix: fall back to provider's default model when model config is empty (#8303)

2026-04-12 03:53:30 -07:00

test_evidence_store.py

…

test_hermes_constants.py

fix(gateway): harden Docker/container gateway pathway

2026-04-12 16:36:11 -07:00

test_hermes_logging.py

fix(tests): fix 78 CI test failures and remove dead test (#9036)

2026-04-13 10:50:24 -07:00

test_hermes_state.py

feat(dashboard): track real API call count per session

2026-04-22 05:51:58 -07:00

test_honcho_client_config.py

feat(memory): pluggable memory provider interface with profile isolation, review fixes, and honcho CLI restoration (#4623)

2026-04-02 15:33:51 -07:00

test_ipv4_preference.py

feat: add network.force_ipv4 config to fix IPv6 timeout issues (#8196)

2026-04-11 23:12:11 -07:00

test_mcp_serve.py

feat: add MCP server mode — hermes mcp serve (#3795)

2026-03-29 15:47:19 -07:00

test_mini_swe_runner.py

fix(kimi): omit temperature entirely for Kimi/Moonshot models (#13157)

2026-04-20 12:23:05 -07:00

test_minimax_model_validation.py

fix(models): validate MiniMax models against static catalog (#12611, #12460, #12399, #12547)

2026-04-19 22:44:47 -07:00

test_minisweagent_path.py

chore: remove all remaining mini-swe-agent references

2026-03-24 08:19:23 -07:00

test_model_picker_scroll.py

fix: CLI/UX batch — ChatConsole errors, curses scroll, skin-aware banner, git state banner (#5974)

2026-04-07 17:59:42 -07:00

test_model_tools_async_bridge.py

fix(core): ensure non-blocking executor shutdown on async timeout

2026-04-22 14:42:32 -07:00

test_model_tools.py

feat(plugins): add transform_tool_result hook for generic tool-result rewriting (#12972)

2026-04-20 03:48:08 -07:00

test_ollama_num_ctx.py

fix: provider/model resolution — salvage 4 PRs + MiniMax aux URL fix (#5983)

2026-04-07 22:23:28 -07:00

test_packaging_metadata.py

chore: prepare Hermes for Homebrew packaging (#4099)

2026-03-30 17:34:43 -07:00

test_plugin_skills.py

fix(tests): attach caplog to specific logger in 3 order-dependent tests (#11453)

2026-04-17 00:20:40 -07:00

test_project_metadata.py

build(deps): add qrcode to dingtalk + feishu extras (parity with messaging) (#11627)

2026-04-17 13:31:53 -07:00

test_retry_utils.py

feat(agent): add jittered retry backoff

2026-04-08 00:41:36 -07:00

test_sql_injection.py

fix(security): eliminate SQL string formatting in execute() calls

2026-03-19 15:16:35 +01:00

test_subprocess_home_isolation.py

fix: per-profile subprocess HOME isolation (#4426) (#7357)

2026-04-10 13:37:45 -07:00

test_timezone.py

test: speed up slow tests (backoff + subprocess + IMDS network) (#11797)

2026-04-17 14:21:22 -07:00

test_toolset_distributions.py

…

test_toolsets.py

fix(ci): unblock test suite + cut ~2s of dead Z.AI probes from every AIAgent

2026-04-19 19:18:19 -07:00

test_trajectory_compressor_async.py

fix(kimi): omit temperature entirely for Kimi/Moonshot models (#13157)

2026-04-20 12:23:05 -07:00

test_trajectory_compressor.py

fix(kimi): omit temperature entirely for Kimi/Moonshot models (#13157)

2026-04-20 12:23:05 -07:00

test_transform_tool_result_hook.py

test: stop testing mutable data — convert change-detectors to invariants (#13363)

2026-04-20 23:20:33 -07:00

test_tui_gateway_server.py

Merge pull request #14135 from helix4u/fix/tui-state-db-optional

2026-04-22 20:11:07 -05:00

test_utils_truthy_values.py

Gate tool-gateway behind an env var, so it's not in users' faces until we're ready. Even if users enable it, it'll be blocked server-side for now, until we unlock for non-admin users on tool-gateway.

2026-03-30 13:28:10 +09:00