hermes-agent/website/docs/reference
Teknium 4b8272f549 feat(browser): add browser_dialog for native JS dialog handling
Ergonomic wrapper over CDP's Page.handleJavaScriptDialog that accepts
or dismisses alert/confirm/prompt/beforeunload dialogs blocking a page.
Unsticks pages whose JS thread is frozen by an unhandled dialog. The
symptom is that browser_snapshot, browser_console, browser_click, etc.
start hanging or erroring.

- action='accept'|'dismiss' required; prompt_text optional for prompt()
- target_id auto-resolves when exactly one page tab is open; with
  multiple page tabs, errors with the tab list so the agent picks one
- Shares browser_cdp's check_fn gate — only appears when CDP is
  reachable (/browser connect or browser.cdp_url in config). Hidden
  otherwise so backends that can't use it don't see it.
- Safe as a probe: CDP returns a clean 'No dialog is showing' error
  when nothing's pending, which we pass through verbatim
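
A minimal sketch of the argument handling described in the list above,
not the actual implementation. The helper names build_dialog_params and
resolve_target_id are hypothetical. The 'accept' and 'promptText' keys
are the parameters of CDP's Page.handleJavaScriptDialog, and the
'targetId', 'type' and 'url' fields follow the TargetInfo objects
returned by Target.getTargets.

```python
from typing import Optional

VALID_ACTIONS = ("accept", "dismiss")


def build_dialog_params(action: str, prompt_text: Optional[str] = None) -> dict:
    """Map the tool arguments onto Page.handleJavaScriptDialog params."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"action must be one of {VALID_ACTIONS}, got {action!r}")
    params = {"accept": action == "accept"}
    # promptText only matters for prompt() dialogs, so send it only when given.
    if prompt_text is not None:
        params["promptText"] = prompt_text
    return params


def resolve_target_id(targets: list, target_id: Optional[str] = None) -> str:
    """Pick the tab to act on (hypothetical helper).

    `targets` is assumed to be a list of TargetInfo-style dicts.
    """
    if target_id:
        return target_id
    pages = [t for t in targets if t.get("type") == "page"]
    if len(pages) == 1:
        return pages[0]["targetId"]
    if not pages:
        raise LookupError("no page tabs are open")
    # Several tabs: report them so the caller can choose one explicitly.
    listing = ", ".join(f"{t['targetId']} ({t.get('url', '?')})" for t in pages)
    raise LookupError(f"multiple page tabs open, pass target_id: {listing}")
```

Keeping both steps as pure functions means they can be checked without
a browser; only the final CDP send needs a live connection.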

Dialog detection (knowing a dialog is open without being told) is NOT
included: it requires persistent CDP subscriptions per session, a
larger architectural change. Documented as a follow-up; until then,
agents infer a blocking dialog from the symptoms above and use this
tool to recover.
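
The symptom-driven recovery could look roughly like the sketch below.
It assumes a hypothetical callable handle_dialog that returns a dict
with an 'error' string on failure; the real tool's return shape may
differ. The only detail taken from this commit is the 'No dialog is
showing' text that CDP reports when nothing is pending.

```python
NO_DIALOG_MARKER = "No dialog is showing"


def probe_for_dialog(handle_dialog) -> str:
    """Try to dismiss a dialog and classify the outcome.

    `handle_dialog` is a hypothetical stand-in for the browser_dialog
    tool: called with action=..., it returns a dict that carries an
    'error' key when the CDP call failed.
    """
    result = handle_dialog(action="dismiss")
    error = result.get("error")
    if error is None:
        return "dismissed"   # a dialog was pending and is now gone
    if NO_DIALOG_MARKER in error:
        return "no_dialog"   # nothing was blocking; look elsewhere
    return "failed"          # some other CDP or connection problem
```

Note that the probe is only side-effect free when no dialog is pending;
if one is open, this call dismisses it.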

Tests: 11 new unit tests against mock CDP server covering the wrapper
(action validation, auto-resolve with 0/1/multiple page targets,
explicit target_id accept/dismiss flow, prompt_text passthrough, shared
gate with browser_cdp, registry dispatch). E2E probe case against real
headless Chrome passes. Positive-case real-Chrome E2E is blocked by
Chromium's headless auto-dismiss behavior when no persistent listener
is attached. The unit tests exercise the exact CDP protocol we send, so
the handling path is protocol-verified. Headful real-browser usage
(the actual /browser connect case) keeps dialogs alive via the Chrome
UI.
2026-04-19 05:20:51 -07:00