Mirror of https://github.com/NousResearch/hermes-agent.git (synced 2026-06-29 05:06:48 +08:00)
Brings desktop control to Windows hosts: UI Automation element discovery with SOM overlays, SendInput mouse/keyboard (virtual-desktop-normalized absolute coords, Unicode typing), and focus-free set_value via UIA value/selection/range patterns.

Backend selection is platform-aware (HERMES_COMPUTER_USE_BACKEND still overrides) and check_computer_use_requirements() now gates per platform. Windows session-killing key combos (win+l, ctrl+alt+del, alt+f4) are hard-blocked alongside the macOS list.

Unlike cua-driver on macOS there is no background input injection on Windows: pointer/keyboard actions briefly foreground the target window, and the platform-aware tool schema tells the model so.

Requires uiautomation (+comtypes) in the venv; windows_backend degrades to unavailable when imports fail.

118 computer_use tests pass incl. 21 new dependency-free Windows tests; verified live against Notepad (capture/SOM/type/set_value/key).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
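
Illustrative note on the "virtual-desktop-normalized absolute coords" mentioned above: absolute mouse input through SendInput takes coordinates in a normalized 0..65535 range, and with the virtual-desktop flag that range spans all monitors, so the virtual desktop's origin can be negative. The sketch below shows one common way to do that mapping as a pure function. The function name, parameter names, and the (size - 1) scaling are assumptions made for illustration; this is not the code from this commit.

```python
from typing import Tuple

# Upper bound of the normalized range used by absolute mouse input.
NORMALIZED_MAX = 65535


def normalize_to_virtual_desktop(
    x: int,
    y: int,
    origin_x: int,
    origin_y: int,
    width: int,
    height: int,
) -> Tuple[int, int]:
    """Map a pixel position to the 0..65535 range over the virtual desktop.

    Hypothetical helper: origin_x/origin_y are the top-left corner of the
    virtual desktop (negative when a monitor sits left of or above the
    primary one), width/height are its size in pixels.
    """
    if width < 2 or height < 2:
        raise ValueError("virtual desktop must be at least 2x2 pixels")

    # Shift into virtual-desktop space, then scale so that the last pixel
    # (size - 1) lands exactly on NORMALIZED_MAX. This scaling is one common
    # convention, assumed here rather than taken from the commit.
    nx = round((x - origin_x) * NORMALIZED_MAX / (width - 1))
    ny = round((y - origin_y) * NORMALIZED_MAX / (height - 1))

    # Clamp so that off-desktop requests cannot produce out-of-range values.
    nx = max(0, min(NORMALIZED_MAX, nx))
    ny = max(0, min(NORMALIZED_MAX, ny))
    return nx, ny


if __name__ == "__main__":
    # Two 1920x1080 monitors side by side, the secondary one on the left,
    # so the virtual desktop starts at x = -1920 and is 3840 pixels wide.
    print(normalize_to_virtual_desktop(-1920, 0, -1920, 0, 3840, 1080))
    print(normalize_to_virtual_desktop(1919, 1079, -1920, 0, 3840, 1080))
```

On a real Windows host the origin and size would typically come from GetSystemMetrics (SM_XVIRTUALSCREEN, SM_YVIRTUALSCREEN, SM_CXVIRTUALSCREEN, SM_CYVIRTUALSCREEN), and the result would be passed with the MOUSEEVENTF_ABSOLUTE and MOUSEEVENTF_VIRTUALDESK flags. Keeping the arithmetic in a pure function lets it be tested without Windows dependencies.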