hermes-agent/tests/run_agent at 4093ee9c62571d79c834a3bd72a8a32d8221a65b - hermes-agent - ling

ling/hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-28 06:51:16 +08:00

Teknium 4093ee9c62 fix(codex): detect leaked tool-call text in assistant content (#15347)

gpt-5.x on the Codex Responses API sometimes degenerates and emits
Harmony-style `to=functions.<name> {json}` serialization as plain
assistant-message text instead of a structured `function_call` item.
The intent never makes it into `response.output` as a function_call,
so `tool_calls` is empty and `_normalize_codex_response()` returns
the leaked text as the final content. Downstream (e.g. delegate_task),
this surfaces as a confident-looking summary with `tool_trace: []`
because no tools actually ran. This is the failure described in the
Taiwan-embassy-email bug report.

Detect the pattern, scrub the content, and return
`finish_reason='incomplete'` so the existing Codex-incomplete
continuation path (run_agent.py:11331, 3 retries) gets a chance to
re-elicit a proper function_call item. Encrypted reasoning items are
preserved so the model keeps its chain-of-thought on the retry.

Regression tests: leaked text triggers incomplete, real tool calls
alongside leak-looking text are preserved, clean responses pass
through unchanged.
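
The detect, scrub, and retry-signal behaviour described above can be sketched roughly as follows. This is an illustration only, not the code in run_agent.py: `LEAK_RE`, `Normalized`, and `normalize` are hypothetical names, and the regex is a guess at what "Harmony-style `to=functions.<name> {json}`" text looks like.

```python
import re
from dataclasses import dataclass, field

# Assumption: a leak looks like `to=functions.<name> {` somewhere in the text.
# The real detector in hermes-agent may use a different pattern.
LEAK_RE = re.compile(r"to=functions\.[\w.-]+\s*\{")


@dataclass
class Normalized:
    """Hypothetical stand-in for a normalized Codex response."""
    content: str
    tool_calls: list = field(default_factory=list)
    finish_reason: str = "stop"


def normalize(content: str, tool_calls: list) -> Normalized:
    """Sketch of the three cases named in the commit message."""
    if tool_calls:
        # Real structured tool calls are preserved, even if the
        # accompanying text happens to look like a leak.
        return Normalized(content, tool_calls, "tool_calls")
    if LEAK_RE.search(content or ""):
        # Scrub the leaked text and report 'incomplete' so that a
        # caller's continuation/retry path can ask the model again.
        return Normalized("", [], "incomplete")
    # Clean responses pass through unchanged.
    return Normalized(content, [], "stop")


if __name__ == "__main__":
    leaked = normalize('to=functions.send_email {"to": "a@example.com"}', [])
    print(leaked.finish_reason)
```

In this sketch a caller would treat `finish_reason == "incomplete"` as a cue to re-issue the request rather than return the (now empty) content as a final answer.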

Reported on Discord (gpt-5.4 / openai-codex).

2026-04-24 14:39:59 -07:00

__init__.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

conftest.py

test: speed up slow tests (backoff + subprocess + IMDS network) (#11797)

2026-04-17 14:21:22 -07:00

test_413_compression.py

test: speed up slow tests (backoff + subprocess + IMDS network) (#11797)

2026-04-17 14:21:22 -07:00

test_860_dedup.py

fix(tests): make AIAgent constructor calls self-contained (#11755)

2026-04-17 12:32:03 -07:00

test_1630_context_overflow_loop.py

fix(tests): make AIAgent constructor calls self-contained (#11755)

2026-04-17 12:32:03 -07:00

test_agent_guardrails.py

fix(delegate): make max_concurrent_children configurable + error on excess

2026-04-10 13:38:14 -07:00

test_agent_loop_tool_calling.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_agent_loop_vllm.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_agent_loop.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_anthropic_error_handling.py

feat(providers): extend request_timeout_seconds to all client paths

2026-04-19 11:23:00 -07:00

test_anthropic_prompt_cache_policy.py

fix(cache): enable prompt caching for Qwen on OpenCode/OpenCode-Go/Alibaba (#13528)

2026-04-21 06:40:58 -07:00

test_anthropic_third_party_oauth_guard.py

fix(anthropic): complete third-party Anthropic-compatible provider support (#12846)

2026-04-19 22:43:09 -07:00

test_anthropic_truncation_continuation.py

refactor: remove _nr_to_assistant_message shim + fix flush_memories guard

2026-04-23 02:30:05 -07:00

test_api_max_retries_config.py

feat(agent): make API retry count configurable via agent.api_max_retries (#14730)

2026-04-23 13:59:32 -07:00

test_async_httpx_del_neuter.py

fix: bound auxiliary client cache to prevent fd exhaustion in long-running gateways (#10200) (#10470)

2026-04-15 13:16:28 -07:00

test_background_review_summary.py

fix(agent): exclude prior-history tool messages from background review summary

2026-04-24 03:10:19 -07:00

test_compress_focus_plugin_fallback.py

fix(compress): don't reach into ContextCompressor privates from /compress (#15039)

2026-04-24 02:55:43 -07:00

test_compression_boundary.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_compression_feasibility.py

fix(compression): enforce 64k floor on aux model + auto-correct threshold (#12898)

2026-04-20 00:56:04 -07:00

test_compression_persistence.py

fix(tests): make AIAgent constructor calls self-contained (#11755)

2026-04-17 12:32:03 -07:00

test_compression_trigger_excludes_reasoning.py

fix(compression): exclude completion tokens from compression trigger (#12026)

2026-04-20 05:12:10 -07:00

test_compressor_fallback_update.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_concurrent_interrupt.py

fix(tests): resolve 17 persistent CI test failures (#15084)

2026-04-24 03:46:46 -07:00

test_context_token_tracking.py

feat(providers): extend request_timeout_seconds to all client paths

2026-04-19 11:23:00 -07:00

test_create_openai_client_kwargs_isolation.py

fix(tests): make AIAgent constructor calls self-contained (#11755)

2026-04-17 12:32:03 -07:00

test_create_openai_client_proxy_env.py

test(proxy): regression tests for NO_PROXY bypass on keepalive client

2026-04-24 03:04:42 -07:00

test_create_openai_client_reuse.py

fix(tests): make AIAgent constructor calls self-contained (#11755)

2026-04-17 12:32:03 -07:00

test_dict_tool_call_args.py

fix(tests): fix 78 CI test failures and remove dead test (#9036)

2026-04-13 10:50:24 -07:00

test_exit_cleanup_interrupt.py

test: speed up slow tests (backoff + subprocess + IMDS network) (#11797)

2026-04-17 14:21:22 -07:00

test_fallback_model.py

test: speed up slow tests (backoff + subprocess + IMDS network) (#11797)

2026-04-17 14:21:22 -07:00

test_flush_memories_codex.py

fix(memory): add write origin metadata

2026-04-24 14:37:55 -07:00

test_interactive_interrupt.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_interrupt_propagation.py

test: stop testing mutable data — convert change-detectors to invariants (#13363)

2026-04-20 23:20:33 -07:00

test_invalid_context_length_warning.py

fix(tests): resolve CI test failures — pool auto-seeding, stale assertions, mock isolation

2026-04-15 22:05:21 -07:00

test_jsondecodeerror_retryable.py

fix(agent): retry on json.JSONDecodeError instead of treating it as a local validation error (#15107)

2026-04-24 05:02:58 -07:00

test_long_context_tier_429.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_memory_provider_init.py

fix(memory): keep Honcho provider opt-in

2026-04-18 22:50:55 -07:00

test_openai_client_lifecycle.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_percentage_clamp.py

fix: update 6 test files broken by dead code removal

2026-04-10 03:44:43 -07:00

test_plugin_context_engine_init.py

fix(tests): make AIAgent constructor calls self-contained (#11755)

2026-04-17 12:32:03 -07:00

test_primary_runtime_restore.py

fix(agent): only set rate-limit cooldown when leaving primary; add tests

2026-04-24 05:35:43 -07:00

test_provider_attribution_headers.py

fix(providers): send user agent to routermint endpoints

2026-04-24 03:02:16 -07:00

test_provider_fallback.py

fix(agent): fall back on rate limit when pool has no rotation room

2026-04-24 05:20:05 -07:00

test_provider_parity.py

feat: add ResponsesApiTransport + wire all Codex transport paths

2026-04-21 19:48:56 -07:00

test_real_interrupt_subagent.py

fix(tests): fix 78 CI test failures and remove dead test (#9036)

2026-04-13 10:50:24 -07:00

test_redirect_stdout_issue.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_repair_tool_call_arguments.py

fix: extract _repair_tool_call_arguments helper, add tests, bound loop

2026-04-20 05:12:55 -07:00

test_repair_tool_call_name.py

fix(agent): repair CamelCase + _tool suffix tool-call emissions (#15124)

2026-04-24 05:32:08 -07:00

test_run_agent_codex_responses.py

fix(codex): detect leaked tool-call text in assistant content (#15347)

2026-04-24 14:39:59 -07:00

test_run_agent_multimodal_prologue.py

refactor: unify transport dispatch + collapse normalize shims

2026-04-22 18:34:25 -07:00

test_run_agent.py

feat: read prompt caching cache_ttl from config

2026-04-24 03:21:29 -07:00

test_sequential_chats_live.py

test: regression guards for the keepalive/transport bug class (#10933) (#11266)

2026-04-16 16:36:33 -07:00

test_session_meta_filtering.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_session_reset_fix.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_steer.py

refactor(steer): simplify injection marker to 'User guidance:' prefix (#13340)

2026-04-20 22:18:49 -07:00

test_streaming.py

fix(streaming): silent retry when stream dies mid tool-call (#14151)

2026-04-22 13:47:33 -07:00

test_strict_api_validation.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_strip_reasoning_tags_cli.py

fix(display): strip standalone tool-call XML tags from visible text

2026-04-22 18:12:42 -07:00

test_switch_model_context.py

fix: pass config_context_length to switch_model context compressor

2026-04-10 05:52:45 -07:00

test_switch_model_fallback_prune.py

fix(agent): default missing fallback chain on switch

2026-04-24 05:35:43 -07:00

test_token_persistence_non_cli.py

fix(tests): make AIAgent constructor calls self-contained (#11755)

2026-04-17 12:32:03 -07:00

test_tool_arg_coercion.py

test,chore: cover stringified array/object coercion + AUTHOR_MAP entry

2026-04-23 16:38:38 -07:00

test_unicode_ascii_codec.py

fix: always retry on ASCII codec UnicodeEncodeError — don't gate on per-component sanitization

2026-04-15 15:03:28 -07:00