fix(approval): close remaining prompt_toolkit deadlock vectors (#15216)

PR #13734 fixed the concurrent-tool-executor vector (ThreadPoolExecutor
workers didn't inherit the CLI's TLS approval callback). Two vectors
remained that could still land in the deadlocking input() fallback:

1. _spawn_background_review spawns a raw threading.Thread with no
   approval callback installed, so any dangerous-command guard the
   review agent trips falls back to input() -> deadlock against the
   parent's prompt_toolkit TUI (same class as delegate_task subagents,
   fixed in 023b1bff1 / #15491). Install a _bg_review_auto_deny
   callback at thread start and clear it in the finally block.

2. prompt_dangerous_approval's fallback unconditionally spawned a
   daemon thread calling input() when approval_callback was None.
   That fallback can never succeed under prompt_toolkit because the
   user's Enter goes to pt's raw-mode stdin capture. Detect an active
   pt Application via get_app_or_none() and fail closed (deny + log)
   instead, so future threads that forget to install a callback
   degrade gracefully instead of hanging 60s invisibly.
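
The two fixes above can be sketched together as follows. This is a minimal
illustration under stated assumptions, not the repository's code: `_tls`,
`set_approval_callback`, `get_approval_callback`, `auto_deny`,
`run_background_review`, and `prompt_dangerous_approval_sketch` are
hypothetical stand-ins, the `"approve"` return value is invented for the
sketch, and the `app_probe` parameter exists only so the guard can be
exercised without a live TUI. `get_app_or_none()` is the prompt_toolkit
helper named above.

```python
import logging
import threading

try:
    # prompt_toolkit helper: returns the running Application, or None.
    from prompt_toolkit.application.current import get_app_or_none
except ImportError:  # keeps the sketch runnable without prompt_toolkit
    def get_app_or_none():
        return None

logger = logging.getLogger(__name__)
_tls = threading.local()  # hypothetical per-thread approval-callback slot


def set_approval_callback(cb):
    _tls.callback = cb


def get_approval_callback():
    return getattr(_tls, "callback", None)


def auto_deny(command, description):
    """Non-interactive callback: always refuse, never prompt."""
    return "deny"


def run_background_review(work):
    """Vector 1: install auto-deny at thread start, clear it in finally."""
    def _target():
        set_approval_callback(auto_deny)
        try:
            work()
        finally:
            set_approval_callback(None)

    worker = threading.Thread(target=_target, daemon=True)
    worker.start()
    return worker


def prompt_dangerous_approval_sketch(command, description,
                                     app_probe=get_app_or_none):
    """Vector 2: no callback plus an active TUI means deny, not input()."""
    callback = get_approval_callback()
    if callback is not None:
        return callback(command, description)  # a real callback wins
    if app_probe() is not None:
        logger.warning("No approval callback under prompt_toolkit; "
                       "denying %r", command)
        return "deny"
    # No TUI owns stdin here, so a plain prompt can still be answered.
    answer = input(f"Allow {description}? [y/N] ")
    # "approve" is a hypothetical value; the source only confirms "deny".
    return "approve" if answer.strip().lower() == "y" else "deny"
```

Checking the callback before probing for a TUI keeps the guard a last
resort: threads that installed a callback behave exactly as before, and
only the callback-less path changes from a hang to a logged denial.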

Regression guards:
- tests/run_agent/test_background_review.py verifies the review
  worker thread sees a callable auto-deny callback mid-run and that
  the slot is cleared in the finally block.
- tests/tools/test_approval.py TestFailClosedUnderPromptToolkit
  verifies prompt_dangerous_approval returns 'deny' fast under a
  mocked pt Application, and that a real callback still wins over
  the guard.
Teknium
2026-04-27 06:41:02 -07:00
committed by Teknium
parent 0046d170dc
commit 008860a23f
4 changed files with 163 additions and 0 deletions

@@ -71,3 +71,59 @@ def test_background_review_shuts_down_memory_provider_before_close(monkeypatch):
"shutdown_memory_provider",
"close",
]
def test_background_review_installs_auto_deny_approval_callback(monkeypatch):
    """Regression guard for #15216.

    The background review thread must install a non-interactive approval
    callback. If it doesn't, any dangerous-command guard the review agent
    trips falls back to input() on a daemon thread, which deadlocks against
    the parent's prompt_toolkit TUI.
    """
    import tools.terminal_tool as tt

    observed: dict = {"during_run": "<unread>", "after_finally": "<unread>"}

    class FakeReviewAgent:
        def __init__(self, **kwargs):
            self._session_messages = []

        def run_conversation(self, **kwargs):
            # Capture what the callback looks like mid-run. It must be
            # a callable (the auto-deny) -- not None.
            observed["during_run"] = tt._get_approval_callback()

        def shutdown_memory_provider(self):
            pass

        def close(self):
            pass

    monkeypatch.setattr(run_agent_module, "AIAgent", FakeReviewAgent)
    monkeypatch.setattr(run_agent_module.threading, "Thread", ImmediateThread)

    # Start from a clean slot.
    tt.set_approval_callback(None)

    agent = _bare_agent()
    AIAgent._spawn_background_review(
        agent,
        messages_snapshot=[{"role": "user", "content": "hello"}],
        review_memory=True,
    )
    observed["after_finally"] = tt._get_approval_callback()

    assert callable(observed["during_run"]), (
        "Background review did not install an approval callback on its "
        "worker thread; dangerous-command prompts will deadlock against "
        "the parent TUI (#15216)."
    )
    # The installed callback must deny (it's a safety gate, not a prompt).
    assert observed["during_run"]("rm -rf /", "test") == "deny"
    assert observed["after_finally"] is None, (
        "Background review leaked its approval callback into the worker "
        "thread's TLS slot; a recycled thread-id could reuse it."
    )