Mirror of https://github.com/NousResearch/hermes-agent.git (synced 2026-06-16 15:11:18 +08:00)
Brings desktop control to Windows hosts: UI Automation element discovery with SOM overlays, SendInput mouse/keyboard (virtual-desktop-normalized absolute coords, Unicode typing), and focus-free set_value via UIA value/selection/range patterns.

Backend selection is platform-aware (HERMES_COMPUTER_USE_BACKEND still overrides) and check_computer_use_requirements() now gates per platform. Windows session-killing key combos (win+l, ctrl+alt+del, alt+f4) are hard-blocked alongside the macOS list.

Unlike cua-driver on macOS, there is no background input injection on Windows: pointer/keyboard actions briefly foreground the target window, and the platform-aware tool schema tells the model so.

Requires uiautomation (+comtypes) in the venv; windows_backend degrades to unavailable when imports fail.

118 computer_use tests pass incl. 21 new dependency-free Windows tests; verified live against Notepad (capture/SOM/type/set_value/key).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
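The "virtual-desktop-normalized absolute coords" mentioned above can be illustrated with a small sketch. The Windows backend itself is not shown on this page, so this is not its code. The function name and parameters are hypothetical, and the formula is the commonly used mapping for SendInput with MOUSEEVENTF_ABSOLUTE | MOUSEEVENTF_VIRTUALDESK, where 0..65535 spans the whole virtual desktop. On a real host the bounds would presumably come from GetSystemMetrics (SM_XVIRTUALSCREEN, SM_YVIRTUALSCREEN, SM_CXVIRTUALSCREEN, SM_CYVIRTUALSCREEN). Here they are plain arguments, so the sketch runs on any platform.

```python
from __future__ import annotations


def normalize_to_virtual_desktop(
    x: int, y: int, left: int, top: int, width: int, height: int
) -> tuple[int, int]:
    """Map a pixel position to the 0..65535 range spanning the virtual desktop.

    (left, top) is the virtual desktop origin, which is negative when a
    monitor sits to the left of or above the primary one. Hypothetical helper.
    """
    if width < 2 or height < 2:
        raise ValueError("virtual desktop must be at least 2x2 pixels")
    nx = round((x - left) * 65535 / (width - 1))
    ny = round((y - top) * 65535 / (height - 1))
    # Clamp so off-screen requests cannot produce out-of-range values.
    return max(0, min(65535, nx)), max(0, min(65535, ny))


# Two 1920x1080 monitors side by side, the secondary one on the left.
top_left = normalize_to_virtual_desktop(-1920, 0, -1920, 0, 3840, 1080)
bottom_right = normalize_to_virtual_desktop(1919, 1079, -1920, 0, 3840, 1080)
```

Subtracting the origin before scaling is what lets the same call address monitors with negative coordinates.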
41 lines
1.3 KiB
Python
"""Shim for tool discovery. Registers `computer_use` with tools.registry.

The real implementation lives in the `tools/computer_use/` package to keep
the file structure clean. This shim exists because tools.registry auto-imports
`tools/*.py` — we need a top-level module to trigger the registration.
"""

from __future__ import annotations

from tools.computer_use.schema import COMPUTER_USE_SCHEMA
from tools.computer_use.tool import (
    check_computer_use_requirements,
    handle_computer_use,
    set_approval_callback,
)
from tools.registry import registry


registry.register(
    name="computer_use",
    toolset="computer_use",
    schema=COMPUTER_USE_SCHEMA,
    handler=lambda args, **kw: handle_computer_use(args, **kw),
    check_fn=check_computer_use_requirements,
    requires_env=[],
    description=(
        "Universal desktop control. Works with any tool-capable model "
        "(Anthropic, OpenAI, OpenRouter, local vLLM, etc.). macOS: "
        "background computer-use via cua-driver (does NOT steal the user's "
        "cursor or keyboard focus). Windows: UI Automation + SendInput "
        "(actions briefly foreground the target window)."
    ),
)


__all__ = [
    "handle_computer_use",
    "set_approval_callback",
    "check_computer_use_requirements",
]
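To show what a registration like the one in this shim might accomplish, here is a minimal stand-in registry. The real `tools.registry` is not shown on this page, so `SketchRegistry`, `ToolEntry`, `available`, and `dispatch` are hypothetical names. Only the keyword arguments to `register` are taken from the shim, and the assumption that `check_fn` gates availability follows from the commit message rather than from the registry's code.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ToolEntry:
    """One registered tool; fields mirror the keyword arguments in the shim."""

    name: str
    toolset: str
    schema: dict[str, Any]
    handler: Callable[..., Any]
    check_fn: Callable[[], bool]
    requires_env: list[str] = field(default_factory=list)
    description: str = ""


class SketchRegistry:
    """Hypothetical stand-in for tools.registry.registry."""

    def __init__(self) -> None:
        self._tools: dict[str, ToolEntry] = {}

    def register(self, **kwargs: Any) -> None:
        entry = ToolEntry(**kwargs)
        self._tools[entry.name] = entry

    def available(self) -> list[str]:
        # Only tools whose requirement check passes are offered to the model.
        return [name for name, entry in self._tools.items() if entry.check_fn()]

    def dispatch(self, name: str, args: dict[str, Any], **kw: Any) -> Any:
        entry = self._tools[name]
        if not entry.check_fn():
            raise RuntimeError(f"tool {name!r} is unavailable on this host")
        return entry.handler(args, **kw)


demo = SketchRegistry()
demo.register(
    name="computer_use",
    toolset="computer_use",
    schema={},
    handler=lambda args, **kw: {"echo": args},  # placeholder handler
    check_fn=lambda: True,  # placeholder for a per-platform requirements check
    requires_env=[],
    description="demo entry",
)
result = demo.dispatch("computer_use", {"action": "capture"})
```

Registering at import time is what makes a top-level shim module sufficient: importing the file is enough to run the `register` call.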