The ASCII-locale recovery path in run_agent.py sanitized the canonical
'messages' list but left 'api_messages' untouched. api_messages is a
separate API-copy built before the retry loop and may carry extra fields
(reasoning_content, extra_body entries) that are not present in
'messages'. This caused the retry to still raise UnicodeEncodeError even
after the 'System encoding is ASCII — stripped...' log line appeared.
Two changes:
- _sanitize_messages_non_ascii now walks all extra top-level string fields
in each message dict (any key not in {content, name, tool_calls, role})
so reasoning_content and future extras are cleaned in both 'messages'
and 'api_messages'.
- The ASCII-codec recovery block now also calls sanitize on api_messages
and api_kwargs so no non-ASCII survives into the next retry attempt.
Adds regression tests covering:
- reasoning_content with non-ASCII in api_messages
- extra_body with non-ASCII in api_kwargs
- canonical messages clean but api_messages dirty
Fixes#6843