fix(approval): close remaining prompt_toolkit deadlock vectors (#15216)

PR #13734 fixed the concurrent-tool-executor vector (ThreadPoolExecutor workers didn't inherit the CLI's TLS approval callback). Two vectors remained that could still land in the deadlocking input() fallback:

1. _spawn_background_review spawns a raw threading.Thread with no approval callback installed, so any dangerous-command guard the review agent trips falls back to input() -> deadlock against the parent's prompt_toolkit TUI (same class as delegate_task subagents, fixed in 023b1bff1 / #15491). Install a _bg_review_auto_deny callback at thread start, clear on finally.

2. prompt_dangerous_approval's fallback unconditionally spawned a daemon thread calling input() when approval_callback was None. That fallback can never succeed under prompt_toolkit because the user's Enter goes to pt's raw-mode stdin capture. Detect an active pt Application via get_app_or_none() and fail closed (deny + log) instead, so future threads that forget to install a callback degrade gracefully instead of hanging 60s invisibly.

Regression guards:

- tests/run_agent/test_background_review.py verifies the review worker thread sees a callable auto-deny callback mid-run and that the slot is cleared in the finally block.
- tests/tools/test_approval.py TestFailClosedUnderPromptToolkit verifies prompt_dangerous_approval returns 'deny' fast under a mocked pt Application, and that a real callback still wins over the guard.
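
The two fixes described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `set_approval_callback`, `get_approval_callback`, `auto_deny`, `run_in_background`, `request_approval`, and the `"approve"` return value are hypothetical names standing in for the real helpers, and the 60s-timeout daemon thread of the real fallback is reduced to a plain `input()` call. Only `get_app_or_none()` is a real prompt_toolkit function; the detector is injectable so the sketch also runs without prompt_toolkit installed.

```python
import logging
import threading

logger = logging.getLogger(__name__)

# Hypothetical per-thread slot for the approval callback.
_tls = threading.local()


def set_approval_callback(cb):
    _tls.approval_callback = cb


def get_approval_callback():
    return getattr(_tls, "approval_callback", None)


def auto_deny(command, reason):
    """Non-interactive callback: always refuse, never prompt."""
    return "deny"


def run_in_background(work):
    """Vector 1: install auto-deny at thread start, clear it in finally."""
    def _target():
        set_approval_callback(auto_deny)
        try:
            work()
        finally:
            set_approval_callback(None)

    thread = threading.Thread(target=_target, daemon=True)
    thread.start()
    return thread


def _pt_app_reported():
    """Ask prompt_toolkit (if importable) whether an Application is active."""
    try:
        from prompt_toolkit.application import get_app_or_none
    except ImportError:
        return False
    return get_app_or_none() is not None


def request_approval(command, reason, pt_active=_pt_app_reported):
    """Vector 2: a real callback wins; otherwise fail closed under a TUI."""
    cb = get_approval_callback()
    if cb is not None:
        return cb(command, reason)
    if pt_active():
        # input() could never receive the user's Enter here, so deny and log.
        logger.warning("No approval callback under a TUI; denying %r", command)
        return "deny"
    # Plain terminal: an interactive prompt is safe ("approve" is hypothetical).
    answer = input(f"Allow {command!r} ({reason})? [y/N] ")
    return "approve" if answer.strip().lower() == "y" else "deny"
```

Checking the callback before the TUI guard preserves the ordering the second regression test describes: a thread that installed a real callback still gets to answer, and only the no-callback path is short-circuited to a denial.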
2026-04-28 06:51:16 +08:00 · 2026-04-27 06:41:02 -07:00
parent 0046d170dc
commit 008860a23f
4 changed files with 163 additions and 0 deletions
--- a/tests/run_agent/test_background_review.py
+++ b/tests/run_agent/test_background_review.py
@@ -71,3 +71,59 @@ def test_background_review_shuts_down_memory_provider_before_close(monkeypatch):
        "shutdown_memory_provider",
        "close",
    ]
+
+
+def test_background_review_installs_auto_deny_approval_callback(monkeypatch):
+    """Regression guard for #15216.
+
+    The background review thread must install a non-interactive approval
+    callback. If it doesn't, any dangerous-command guard the review agent
+    trips falls back to input() on a daemon thread, which deadlocks against
+    the parent's prompt_toolkit TUI.
+    """
+    import tools.terminal_tool as tt
+
+    observed: dict = {"during_run": "<unread>", "after_finally": "<unread>"}
+
+    class FakeReviewAgent:
+        def __init__(self, **kwargs):
+            self._session_messages = []
+
+        def run_conversation(self, **kwargs):
+            # Capture what the callback looks like mid-run. It must be
+            # a callable (the auto-deny) -- not None.
+            observed["during_run"] = tt._get_approval_callback()
+
+        def shutdown_memory_provider(self):
+            pass
+
+        def close(self):
+            pass
+
+    monkeypatch.setattr(run_agent_module, "AIAgent", FakeReviewAgent)
+    monkeypatch.setattr(run_agent_module.threading, "Thread", ImmediateThread)
+
+    # Start from a clean slot.
+    tt.set_approval_callback(None)
+    agent = _bare_agent()
+
+    AIAgent._spawn_background_review(
+        agent,
+        messages_snapshot=[{"role": "user", "content": "hello"}],
+        review_memory=True,
+    )
+
+    observed["after_finally"] = tt._get_approval_callback()
+
+    assert callable(observed["during_run"]), (
+        "Background review did not install an approval callback on its "
+        "worker thread; dangerous-command prompts will deadlock against "
+        "the parent TUI (#15216)."
+    )
+    # The installed callback must deny (it's a safety gate, not a prompt).
+    assert observed["during_run"]("rm -rf /", "test") == "deny"
+
+    assert observed["after_finally"] is None, (
+        "Background review leaked its approval callback into the worker "
+        "thread's TLS slot; a recycled thread-id could reuse it."
+    )