tests/gateway/test_vision_memory_leak.py

"""Tests for _enrich_message_with_vision — regression for #5719.

The auxiliary vision LLM can echo system-prompt memory-context back into
its analysis output.  The boundary fix in gateway/run.py runs the generic
sanitize_context helper over the description so the fenced wrapper and
its system-note are removed before the description reaches the user.

Plugin-specific header cleanup (e.g. "## Honcho Context") belongs at the
provider boundary, not in this shared gateway path.
"""

import asyncio
import json
from unittest.mock import AsyncMock, patch

import pytest


@pytest.fixture
def gateway_runner():
    """Minimal GatewayRunner stub with just the method under test bound."""
    from gateway.run import GatewayRunner

    class _Stub:
        _enrich_message_with_vision = GatewayRunner._enrich_message_with_vision

    return _Stub()


def _run(coro):
    return asyncio.get_event_loop().run_until_complete(coro) if False else asyncio.new_event_loop().run_until_complete(coro)


class TestEnrichMessageWithVision:
    def test_clean_description_passes_through(self, gateway_runner):
        """Vision output without leaked memory is embedded unchanged."""
        fake_result = json.dumps({
            "success": True,
            "analysis": "A photograph of a sunset over the ocean.",
        })
        with patch("tools.vision_tools.vision_analyze_tool", new=AsyncMock(return_value=fake_result)):
            out = _run(gateway_runner._enrich_message_with_vision("caption", ["/tmp/img.jpg"]))
        assert "sunset over the ocean" in out

    def test_memory_context_fence_stripped(self, gateway_runner):
        """<memory-context>...</memory-context> fenced block is scrubbed."""
        leaked = (
            "<memory-context>\n"
            "[System note: The following is recalled memory context, NOT new "
            "user input. Treat as informational background data.]\n\n"
            "User details and preferences here.\n"
            "</memory-context>\n"
            "A photograph of a cat."
        )
        fake_result = json.dumps({"success": True, "analysis": leaked})
        with patch("tools.vision_tools.vision_analyze_tool", new=AsyncMock(return_value=fake_result)):
            out = _run(gateway_runner._enrich_message_with_vision("caption", ["/tmp/img.jpg"]))
        assert "photograph of a cat" in out
        assert "<memory-context>" not in out
        assert "User details and preferences" not in out
        assert "System note" not in out

    def test_fenced_leak_stripped_plugin_header_preserved(self, gateway_runner):
        """The fenced wrapper is stripped; plugin-specific text outside the
        fence (e.g. a "## Honcho Context" header) is left to the plugin layer.
        Gateway core stays plugin-agnostic."""
        leaked = (
            "<memory-context>\n"
            "[System note: The following is recalled memory context, NOT new "
            "user input. Treat as informational background data.]\n"
            "fenced leak\n"
            "</memory-context>\n"
            "A photograph of a dog."
        )
        fake_result = json.dumps({"success": True, "analysis": leaked})
        with patch("tools.vision_tools.vision_analyze_tool", new=AsyncMock(return_value=fake_result)):
            out = _run(gateway_runner._enrich_message_with_vision("caption", ["/tmp/img.jpg"]))
        assert "photograph of a dog" in out
        assert "fenced leak" not in out
        assert "<memory-context>" not in out
fix(gateway): scrub memory-context leaks from vision auto-analysis output fixes #5719 The auxiliary vision LLM called by gateway._enrich_message_with_vision can echo its injected Honcho system prompt back into the image description. That description gets embedded verbatim into the enriched user message, so recalled memory (personal facts, dialectic output) surfaces into a user-visible bubble. Strips both forms of leak before embedding: - <memory-context>...</memory-context> fenced blocks (sanitize_context) - trailing '## Honcho Context' sections (header + everything after) Plus regression tests: - tests/agent/test_streaming_context_scrubber.py — 13 tests on the stateful scrubber (whole block, split tags, false-positive partial tags, unterminated span, reset, case-insensitivity) - tests/run_agent/test_run_agent_codex_responses.py — 2 new tests on _fire_stream_delta covering the realistic 7-chunk leak scenario and the cross-turn scrubber reset - tests/gateway/test_vision_memory_leak.py — 4 tests covering the vision auto-analysis boundary (clean pass-through, '## Honcho Context' header, fenced block, both patterns together) 2026-04-24 18:33:19 -04:00			`"""Tests for _enrich_message_with_vision — regression for #5719.`

fix(memory): narrow scrub surface to known wrapper boundaries Reviewer pushback on the original boundary-hardening commits — three overreach points pulled plugin-specific policy into shared core paths: 1. gateway/run.py hardcoded a '## Honcho Context' literal split for vision-LLM output. Plugin-format heading in framework code; could truncate legitimate output naturally containing that header. Drop the literal split; keep generic sanitize_context (the wrapper strip is plugin-agnostic). Plugin-specific cleanup belongs at the provider boundary, not the shared gateway path. 2. run_agent.run_conversation scrubbed user_message and persist_user_message before the conversation loop. User text is sacred — if a user types a literal <memory-context> tag we must not silently delete it. The producer (build_memory_context_block) is the only legitimate emitter; user input should never need the reverse op. 3. _build_assistant_message scrubbed model output before persistence. Same hazard: would silently mutate legitimate documentation/code the model emits containing the literal markers. The streaming scrubber catches real leaks delta-by-delta before content is concatenated; persist-time scrub was redundant belt-and-suspenders. 4. _fire_stream_delta stripped leading newlines from every delta unless a paragraph break flag was set. Mid-stream '\n' is legitimate markdown — lists, code fences, paragraph breaks — and chunk boundaries are arbitrary. Narrow lstrip to the very first delta of the stream only (so stale provider preamble still gets cleaned on turn start, but mid-stream formatting survives). Plus: build_memory_context_block now logs a warning when its defensive sanitize_context strips something — surfaces buggy providers returning pre-wrapped text instead of silently double-fencing. Net architectural change: scrub surface collapses from 8 sites to 3 (StreamingContextScrubber on output deltas, plugin→backend send, build_memory_context_block input-validation). Plugin-specific strings stay out of shared runtime paths. User input and persisted assistant output are no longer mutated. Tests: rescoped TestMemoryContextSanitization (helper-correctness only, no source-inspection of removed call sites), updated vision tests to drop '## Honcho Context' literal-split assertions, updated _build_assistant_message persistence test to assert preservation. Added: cross-turn scrubber reset, build_memory_context_block warn-on- violation, mid-stream newline preservation (plain + code fence). 2026-04-27 14:32:20 -04:00			`The auxiliary vision LLM can echo system-prompt memory-context back into`
			`its analysis output. The boundary fix in gateway/run.py runs the generic`
			`sanitize_context helper over the description so the fenced wrapper and`
			`its system-note are removed before the description reaches the user.`
fix(gateway): scrub memory-context leaks from vision auto-analysis output fixes #5719 The auxiliary vision LLM called by gateway._enrich_message_with_vision can echo its injected Honcho system prompt back into the image description. That description gets embedded verbatim into the enriched user message, so recalled memory (personal facts, dialectic output) surfaces into a user-visible bubble. Strips both forms of leak before embedding: - <memory-context>...</memory-context> fenced blocks (sanitize_context) - trailing '## Honcho Context' sections (header + everything after) Plus regression tests: - tests/agent/test_streaming_context_scrubber.py — 13 tests on the stateful scrubber (whole block, split tags, false-positive partial tags, unterminated span, reset, case-insensitivity) - tests/run_agent/test_run_agent_codex_responses.py — 2 new tests on _fire_stream_delta covering the realistic 7-chunk leak scenario and the cross-turn scrubber reset - tests/gateway/test_vision_memory_leak.py — 4 tests covering the vision auto-analysis boundary (clean pass-through, '## Honcho Context' header, fenced block, both patterns together) 2026-04-24 18:33:19 -04:00
fix(memory): narrow scrub surface to known wrapper boundaries Reviewer pushback on the original boundary-hardening commits — three overreach points pulled plugin-specific policy into shared core paths: 1. gateway/run.py hardcoded a '## Honcho Context' literal split for vision-LLM output. Plugin-format heading in framework code; could truncate legitimate output naturally containing that header. Drop the literal split; keep generic sanitize_context (the wrapper strip is plugin-agnostic). Plugin-specific cleanup belongs at the provider boundary, not the shared gateway path. 2. run_agent.run_conversation scrubbed user_message and persist_user_message before the conversation loop. User text is sacred — if a user types a literal <memory-context> tag we must not silently delete it. The producer (build_memory_context_block) is the only legitimate emitter; user input should never need the reverse op. 3. _build_assistant_message scrubbed model output before persistence. Same hazard: would silently mutate legitimate documentation/code the model emits containing the literal markers. The streaming scrubber catches real leaks delta-by-delta before content is concatenated; persist-time scrub was redundant belt-and-suspenders. 4. _fire_stream_delta stripped leading newlines from every delta unless a paragraph break flag was set. Mid-stream '\n' is legitimate markdown — lists, code fences, paragraph breaks — and chunk boundaries are arbitrary. Narrow lstrip to the very first delta of the stream only (so stale provider preamble still gets cleaned on turn start, but mid-stream formatting survives). Plus: build_memory_context_block now logs a warning when its defensive sanitize_context strips something — surfaces buggy providers returning pre-wrapped text instead of silently double-fencing. Net architectural change: scrub surface collapses from 8 sites to 3 (StreamingContextScrubber on output deltas, plugin→backend send, build_memory_context_block input-validation). Plugin-specific strings stay out of shared runtime paths. User input and persisted assistant output are no longer mutated. Tests: rescoped TestMemoryContextSanitization (helper-correctness only, no source-inspection of removed call sites), updated vision tests to drop '## Honcho Context' literal-split assertions, updated _build_assistant_message persistence test to assert preservation. Added: cross-turn scrubber reset, build_memory_context_block warn-on- violation, mid-stream newline preservation (plain + code fence). 2026-04-27 14:32:20 -04:00			`Plugin-specific header cleanup (e.g. "## Honcho Context") belongs at the`
			`provider boundary, not in this shared gateway path.`
fix(gateway): scrub memory-context leaks from vision auto-analysis output fixes #5719 The auxiliary vision LLM called by gateway._enrich_message_with_vision can echo its injected Honcho system prompt back into the image description. That description gets embedded verbatim into the enriched user message, so recalled memory (personal facts, dialectic output) surfaces into a user-visible bubble. Strips both forms of leak before embedding: - <memory-context>...</memory-context> fenced blocks (sanitize_context) - trailing '## Honcho Context' sections (header + everything after) Plus regression tests: - tests/agent/test_streaming_context_scrubber.py — 13 tests on the stateful scrubber (whole block, split tags, false-positive partial tags, unterminated span, reset, case-insensitivity) - tests/run_agent/test_run_agent_codex_responses.py — 2 new tests on _fire_stream_delta covering the realistic 7-chunk leak scenario and the cross-turn scrubber reset - tests/gateway/test_vision_memory_leak.py — 4 tests covering the vision auto-analysis boundary (clean pass-through, '## Honcho Context' header, fenced block, both patterns together) 2026-04-24 18:33:19 -04:00			`"""`

			`import asyncio`
			`import json`
			`from unittest.mock import AsyncMock, patch`

			`import pytest`


			`@pytest.fixture`
			`def gateway_runner():`
			`"""Minimal GatewayRunner stub with just the method under test bound."""`
			`from gateway.run import GatewayRunner`

			`class _Stub:`
			`_enrich_message_with_vision = GatewayRunner._enrich_message_with_vision`

			`return _Stub()`


			`def _run(coro):`
			`return asyncio.get_event_loop().run_until_complete(coro) if False else asyncio.new_event_loop().run_until_complete(coro)`


			`class TestEnrichMessageWithVision:`
			`def test_clean_description_passes_through(self, gateway_runner):`
			`"""Vision output without leaked memory is embedded unchanged."""`
			`fake_result = json.dumps({`
			`"success": True,`
			`"analysis": "A photograph of a sunset over the ocean.",`
			`})`
			`with patch("tools.vision_tools.vision_analyze_tool", new=AsyncMock(return_value=fake_result)):`
			`out = _run(gateway_runner._enrich_message_with_vision("caption", ["/tmp/img.jpg"]))`
			`assert "sunset over the ocean" in out`

			`def test_memory_context_fence_stripped(self, gateway_runner):`
			`"""<memory-context>...</memory-context> fenced block is scrubbed."""`
			`leaked = (`
			`"<memory-context>\n"`
			`"[System note: The following is recalled memory context, NOT new "`
			`"user input. Treat as informational background data.]\n\n"`
			`"User details and preferences here.\n"`
			`"</memory-context>\n"`
			`"A photograph of a cat."`
			`)`
			`fake_result = json.dumps({"success": True, "analysis": leaked})`
			`with patch("tools.vision_tools.vision_analyze_tool", new=AsyncMock(return_value=fake_result)):`
			`out = _run(gateway_runner._enrich_message_with_vision("caption", ["/tmp/img.jpg"]))`
			`assert "photograph of a cat" in out`
			`assert "<memory-context>" not in out`
			`assert "User details and preferences" not in out`
			`assert "System note" not in out`

fix(memory): narrow scrub surface to known wrapper boundaries Reviewer pushback on the original boundary-hardening commits — three overreach points pulled plugin-specific policy into shared core paths: 1. gateway/run.py hardcoded a '## Honcho Context' literal split for vision-LLM output. Plugin-format heading in framework code; could truncate legitimate output naturally containing that header. Drop the literal split; keep generic sanitize_context (the wrapper strip is plugin-agnostic). Plugin-specific cleanup belongs at the provider boundary, not the shared gateway path. 2. run_agent.run_conversation scrubbed user_message and persist_user_message before the conversation loop. User text is sacred — if a user types a literal <memory-context> tag we must not silently delete it. The producer (build_memory_context_block) is the only legitimate emitter; user input should never need the reverse op. 3. _build_assistant_message scrubbed model output before persistence. Same hazard: would silently mutate legitimate documentation/code the model emits containing the literal markers. The streaming scrubber catches real leaks delta-by-delta before content is concatenated; persist-time scrub was redundant belt-and-suspenders. 4. _fire_stream_delta stripped leading newlines from every delta unless a paragraph break flag was set. Mid-stream '\n' is legitimate markdown — lists, code fences, paragraph breaks — and chunk boundaries are arbitrary. Narrow lstrip to the very first delta of the stream only (so stale provider preamble still gets cleaned on turn start, but mid-stream formatting survives). Plus: build_memory_context_block now logs a warning when its defensive sanitize_context strips something — surfaces buggy providers returning pre-wrapped text instead of silently double-fencing. Net architectural change: scrub surface collapses from 8 sites to 3 (StreamingContextScrubber on output deltas, plugin→backend send, build_memory_context_block input-validation). Plugin-specific strings stay out of shared runtime paths. User input and persisted assistant output are no longer mutated. Tests: rescoped TestMemoryContextSanitization (helper-correctness only, no source-inspection of removed call sites), updated vision tests to drop '## Honcho Context' literal-split assertions, updated _build_assistant_message persistence test to assert preservation. Added: cross-turn scrubber reset, build_memory_context_block warn-on- violation, mid-stream newline preservation (plain + code fence). 2026-04-27 14:32:20 -04:00			`def test_fenced_leak_stripped_plugin_header_preserved(self, gateway_runner):`
			`"""The fenced wrapper is stripped; plugin-specific text outside the`
			`fence (e.g. a "## Honcho Context" header) is left to the plugin layer.`
			`Gateway core stays plugin-agnostic."""`
fix(gateway): scrub memory-context leaks from vision auto-analysis output fixes #5719 The auxiliary vision LLM called by gateway._enrich_message_with_vision can echo its injected Honcho system prompt back into the image description. That description gets embedded verbatim into the enriched user message, so recalled memory (personal facts, dialectic output) surfaces into a user-visible bubble. Strips both forms of leak before embedding: - <memory-context>...</memory-context> fenced blocks (sanitize_context) - trailing '## Honcho Context' sections (header + everything after) Plus regression tests: - tests/agent/test_streaming_context_scrubber.py — 13 tests on the stateful scrubber (whole block, split tags, false-positive partial tags, unterminated span, reset, case-insensitivity) - tests/run_agent/test_run_agent_codex_responses.py — 2 new tests on _fire_stream_delta covering the realistic 7-chunk leak scenario and the cross-turn scrubber reset - tests/gateway/test_vision_memory_leak.py — 4 tests covering the vision auto-analysis boundary (clean pass-through, '## Honcho Context' header, fenced block, both patterns together) 2026-04-24 18:33:19 -04:00			`leaked = (`
			`"<memory-context>\n"`
			`"[System note: The following is recalled memory context, NOT new "`
			`"user input. Treat as informational background data.]\n"`
			`"fenced leak\n"`
			`"</memory-context>\n"`
fix(memory): narrow scrub surface to known wrapper boundaries Reviewer pushback on the original boundary-hardening commits — three overreach points pulled plugin-specific policy into shared core paths: 1. gateway/run.py hardcoded a '## Honcho Context' literal split for vision-LLM output. Plugin-format heading in framework code; could truncate legitimate output naturally containing that header. Drop the literal split; keep generic sanitize_context (the wrapper strip is plugin-agnostic). Plugin-specific cleanup belongs at the provider boundary, not the shared gateway path. 2. run_agent.run_conversation scrubbed user_message and persist_user_message before the conversation loop. User text is sacred — if a user types a literal <memory-context> tag we must not silently delete it. The producer (build_memory_context_block) is the only legitimate emitter; user input should never need the reverse op. 3. _build_assistant_message scrubbed model output before persistence. Same hazard: would silently mutate legitimate documentation/code the model emits containing the literal markers. The streaming scrubber catches real leaks delta-by-delta before content is concatenated; persist-time scrub was redundant belt-and-suspenders. 4. _fire_stream_delta stripped leading newlines from every delta unless a paragraph break flag was set. Mid-stream '\n' is legitimate markdown — lists, code fences, paragraph breaks — and chunk boundaries are arbitrary. Narrow lstrip to the very first delta of the stream only (so stale provider preamble still gets cleaned on turn start, but mid-stream formatting survives). Plus: build_memory_context_block now logs a warning when its defensive sanitize_context strips something — surfaces buggy providers returning pre-wrapped text instead of silently double-fencing. Net architectural change: scrub surface collapses from 8 sites to 3 (StreamingContextScrubber on output deltas, plugin→backend send, build_memory_context_block input-validation). Plugin-specific strings stay out of shared runtime paths. User input and persisted assistant output are no longer mutated. Tests: rescoped TestMemoryContextSanitization (helper-correctness only, no source-inspection of removed call sites), updated vision tests to drop '## Honcho Context' literal-split assertions, updated _build_assistant_message persistence test to assert preservation. Added: cross-turn scrubber reset, build_memory_context_block warn-on- violation, mid-stream newline preservation (plain + code fence). 2026-04-27 14:32:20 -04:00			`"A photograph of a dog."`
fix(gateway): scrub memory-context leaks from vision auto-analysis output fixes #5719 The auxiliary vision LLM called by gateway._enrich_message_with_vision can echo its injected Honcho system prompt back into the image description. That description gets embedded verbatim into the enriched user message, so recalled memory (personal facts, dialectic output) surfaces into a user-visible bubble. Strips both forms of leak before embedding: - <memory-context>...</memory-context> fenced blocks (sanitize_context) - trailing '## Honcho Context' sections (header + everything after) Plus regression tests: - tests/agent/test_streaming_context_scrubber.py — 13 tests on the stateful scrubber (whole block, split tags, false-positive partial tags, unterminated span, reset, case-insensitivity) - tests/run_agent/test_run_agent_codex_responses.py — 2 new tests on _fire_stream_delta covering the realistic 7-chunk leak scenario and the cross-turn scrubber reset - tests/gateway/test_vision_memory_leak.py — 4 tests covering the vision auto-analysis boundary (clean pass-through, '## Honcho Context' header, fenced block, both patterns together) 2026-04-24 18:33:19 -04:00			`)`
			`fake_result = json.dumps({"success": True, "analysis": leaked})`
			`with patch("tools.vision_tools.vision_analyze_tool", new=AsyncMock(return_value=fake_result)):`
			`out = _run(gateway_runner._enrich_message_with_vision("caption", ["/tmp/img.jpg"]))`
			`assert "photograph of a dog" in out`
			`assert "fenced leak" not in out`
			`assert "<memory-context>" not in out`