refactor(steer): simplify injection marker to 'User guidance:' prefix (#13340)

The mid-run steer marker was '[USER STEER (injected mid-run, not tool
output): <text>]'. Replaced with a plain two-newline-prefixed
'User guidance: <text>' suffix.

Rationale: the marker lives inside the tool result's content string
regardless of whether the tool returned JSON, plain text, an MCP
result, or a plugin result. The bracketed tag read like structured
metadata that some tools (terminal, execute_code) could confuse with
their own output formatting. A plain labelled suffix works uniformly
across every content shape we produce.

Behavior unchanged:
- Still injected into the last tool-role message's content.
- Still preserves multimodal (Anthropic) content-block lists by
  appending a text block.
- Still drained at both sites added in #12959 and #13205 — per-tool
  drain between individual calls, and pre-API-call drain at the top
  of each main-loop iteration.

Checked Codex's equivalent (pending_input / inject_user_message_without_turn
in codex-rs/core): they record mid-turn user input as a real role:user
message via record_user_prompt_and_emit_turn_item(). That's cleaner for
their Responses-API model but not portable to Chat Completions where
role alternation after tool_calls is strict. Embedding the guidance in
the last tool result remains the correct placement for us.

Validation: all 21 tests in tests/run_agent/test_steer.py pass.
This commit is contained in:
Teknium
2026-04-20 22:18:49 -07:00
committed by GitHub
parent dbb7e00e7e
commit 70d7f79bef
2 changed files with 13 additions and 12 deletions

View File

@@ -85,7 +85,7 @@ class TestSteerInjection:
# The LAST tool result is modified; earlier ones are untouched.
assert messages[2]["content"] == "ls output A"
assert "ls output B" in messages[3]["content"]
assert "[USER STEER" in messages[3]["content"]
assert "User guidance:" in messages[3]["content"]
assert "please also check auth.log" in messages[3]["content"]
# And pending_steer is consumed.
assert agent._pending_steer is None
@@ -107,18 +107,19 @@ class TestSteerInjection:
# Steer should remain pending (nothing to drain into)
assert agent._pending_steer == "steer"
def test_marker_is_unambiguous_about_origin(self):
"""The injection marker must make clear the text is from the user
and not tool output — this is the cache-safe way to signal
provenance without violating message-role alternation.
def test_marker_labels_text_as_user_guidance(self):
"""The injection marker must label the appended text as user
guidance so the model attributes it to the user rather than
confusing it with tool output. This is the cache-safe way to
signal provenance without violating message-role alternation.
"""
agent = _bare_agent()
agent.steer("stop after next step")
messages = [{"role": "tool", "content": "x", "tool_call_id": "1"}]
agent._apply_pending_steer_to_tool_results(messages, num_tool_msgs=1)
content = messages[-1]["content"]
assert "USER STEER" in content
assert "not tool output" in content.lower() or "injected mid-run" in content.lower()
assert "User guidance:" in content
assert "stop after next step" in content
def test_multimodal_content_list_preserved(self):
"""Anthropic-style list content should be preserved, with the steer
@@ -226,9 +227,9 @@ class TestPreApiCallSteerDrain:
# Inject into last tool msg (mirrors the new code in run_conversation)
for _si in range(len(messages) - 1, -1, -1):
if messages[_si].get("role") == "tool":
messages[_si]["content"] += f"\n\n[USER STEER (injected mid-run, not tool output): {_pre_api_steer}]"
messages[_si]["content"] += f"\n\nUser guidance: {_pre_api_steer}"
break
assert "[USER STEER" in messages[-1]["content"]
assert "User guidance:" in messages[-1]["content"]
assert "focus on error handling" in messages[-1]["content"]
assert agent._pending_steer is None
@@ -270,7 +271,7 @@ class TestPreApiCallSteerDrain:
assert _pre_api_steer is not None
for _si in range(len(messages) - 1, -1, -1):
if messages[_si].get("role") == "tool":
messages[_si]["content"] += f"\n\n[USER STEER (injected mid-run, not tool output): {_pre_api_steer}]"
messages[_si]["content"] += f"\n\nUser guidance: {_pre_api_steer}"
break
assert "change approach" in messages[2]["content"]