mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-02 16:57:36 +08:00
feat: Computer Use Tool — macOS desktop control via Anthropic native API
Salvaged from PR #3816 by 0xbyt4. Stripped unrelated changes (telegram thread retry, cache logging in quiet_mode), preserved existing beta headers (interleaved-thinking, fine-grained-tool-streaming), and rebased onto current main. New computer_use toolset: - Screenshot capture via macOS native screencapture + sips - Mouse: click, double/triple/right/middle click, drag, move - Keyboard: type text (clipboard paste for Unicode), key combos - Zoom for inspecting small screen regions at full resolution - Auto-screenshot after destructive actions (saves API round-trips) Architecture: - Dual-schema: stub (OpenAI format) for dispatch + native (computer_20251124) injected into Anthropic API calls - Provider gating: stripped from non-Anthropic providers at init - Beta API routing: messages.create → beta.messages.create when native tools present (both streaming and non-streaming) - Multimodal results: _anthropic_content_blocks on tool messages, content stays string for session DB / trajectory compatibility Token optimization: - Server-side context editing (context-management-2025-06-27 beta) - Client-side screenshot-aware pruning in context compressor - Image eviction: keeps only 3 most recent screenshots - Image-aware token estimation (flat 1500 tokens per image) Safety: - Hard-blocked key combos (empty trash, force delete, lock screen) - Blocked type patterns (curl|bash, sudo -S -p '' rm -rf, privilege escalation) - Anti-injection system prompt guidance - Approval callback wired (disabled during beta) Includes: 102 tests, 657-line macOS workflow skill (auto-loaded), feature docs page, reference catalog updates.
This commit is contained in:
@@ -13,6 +13,7 @@ Toolsets are named bundles of tools that you can enable with `hermes chat --tool
|
||||
| `browser` | core | `browser_back`, `browser_click`, `browser_close`, `browser_console`, `browser_get_images`, `browser_navigate`, `browser_press`, `browser_scroll`, `browser_snapshot`, `browser_type`, `browser_vision`, `web_search` |
|
||||
| `clarify` | core | `clarify` |
|
||||
| `code_execution` | core | `execute_code` |
|
||||
| `computer_use` | core | `computer` |
|
||||
| `cronjob` | core | `cronjob` |
|
||||
| `debugging` | composite | `patch`, `process`, `read_file`, `search_files`, `terminal`, `web_extract`, `web_search`, `write_file` |
|
||||
| `delegation` | core | `delegate_task` |
|
||||
|
||||
Reference in New Issue
Block a user