hermes-agent/website/docs/reference/toolsets-reference.md
Teknium 8e3803f3ce feat: Computer Use Tool — macOS desktop control via Anthropic native API
Salvaged from PR #3816 by 0xbyt4. Stripped unrelated changes (telegram
thread retry, cache logging in quiet_mode), preserved existing beta
headers (interleaved-thinking, fine-grained-tool-streaming), and
rebased onto current main.

New computer_use toolset:
- Screenshot capture via macOS native screencapture + sips
- Mouse: click, double/triple/right/middle click, drag, move
- Keyboard: type text (clipboard paste for Unicode), key combos
- Zoom for inspecting small screen regions at full resolution
- Auto-screenshot after destructive actions (saves API round-trips)

Architecture:
- Dual-schema: stub (OpenAI format) for dispatch + native
  (computer_20251124) injected into Anthropic API calls
- Provider gating: stripped from non-Anthropic providers at init
- Beta API routing: messages.create → beta.messages.create when
  native tools present (both streaming and non-streaming)
- Multimodal results: _anthropic_content_blocks on tool messages,
  content stays string for session DB / trajectory compatibility
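A minimal sketch of the provider gating and beta routing listed above. The names `NATIVE_TOOL_NAMES`, `gate_tools`, and `pick_create` are hypothetical and not taken from the Hermes source; the only outside fact assumed is that the Anthropic Python SDK exposes both `messages.create` and `beta.messages.create`.

```python
from typing import Any, Callable

# Hypothetical: tools that only work through Anthropic's native schema.
NATIVE_TOOL_NAMES = {"computer"}


def gate_tools(provider: str, tool_names: list[str]) -> list[str]:
    """Drop native-only tools when the provider is not Anthropic."""
    if provider == "anthropic":
        return list(tool_names)
    return [t for t in tool_names if t not in NATIVE_TOOL_NAMES]


def pick_create(client: Any, tool_names: list[str]) -> Callable[..., Any]:
    """Route to the beta endpoint only when a native tool is present."""
    if any(t in NATIVE_TOOL_NAMES for t in tool_names):
        return client.beta.messages.create
    return client.messages.create
```

Gating at init rather than per request keeps the stub schema from ever reaching a provider that cannot honor it.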

Token optimization:
- Server-side context editing (context-management-2025-06-27 beta)
- Client-side screenshot-aware pruning in context compressor
- Image eviction: keeps only 3 most recent screenshots
- Image-aware token estimation (flat 1500 tokens per image)
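The eviction and estimation rules above can be sketched as follows. This is an illustration under assumptions, not the Hermes compressor: messages are modeled as dicts whose `content` is either a string or a list of Anthropic-style blocks, the placeholder text is made up, and the 4-characters-per-token figure for text is a common rough heuristic rather than something stated in this commit.

```python
from typing import Any

IMAGE_TOKENS = 1500   # flat per-image estimate, as stated above
MAX_SCREENSHOTS = 3   # number of recent screenshots kept, as stated above
PLACEHOLDER = {"type": "text", "text": "[screenshot evicted]"}  # hypothetical


def evict_old_images(messages: list[dict[str, Any]],
                     keep: int = MAX_SCREENSHOTS) -> list[dict[str, Any]]:
    """Return a copy with all but the newest `keep` images replaced by text."""
    remaining = keep
    result = []
    for msg in reversed(messages):              # walk newest to oldest
        content = msg.get("content")
        if isinstance(content, list):
            new_blocks = []
            for block in reversed(content):
                if block.get("type") == "image":
                    if remaining > 0:
                        remaining -= 1
                    else:
                        block = dict(PLACEHOLDER)
                new_blocks.append(block)
            msg = {**msg, "content": list(reversed(new_blocks))}
        result.append(msg)
    return list(reversed(result))


def estimate_tokens(messages: list[dict[str, Any]]) -> int:
    """Rough estimate: flat cost per image, about 4 characters per text token."""
    total = 0
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, list):
            blocks = content
        else:
            blocks = [{"type": "text", "text": str(content or "")}]
        for block in blocks:
            if block.get("type") == "image":
                total += IMAGE_TOKENS
            else:
                total += len(block.get("text", "")) // 4
    return total
```

Running `evict_old_images` before `estimate_tokens` means the estimate reflects what would actually be sent, not the full screenshot history.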

Safety:
- Hard-blocked key combos (empty trash, force delete, lock screen)
- Blocked type patterns (curl|bash, sudo -S -p '' rm -rf, privilege escalation)
- Anti-injection system prompt guidance
- Approval callback wired (disabled during beta)
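A sketch of how the hard blocks above might be checked before an action is dispatched. The blocklist entries are illustrative only (standard macOS shortcuts and two simple regexes); the actual lists and function names in the Hermes source are not shown here, so `is_blocked_combo` and `is_blocked_text` are hypothetical.

```python
import re

# Illustrative entries only; the real blocklists live in the Hermes source.
BLOCKED_KEY_COMBOS = {
    frozenset({"cmd", "shift", "delete"}),   # Finder: empty trash
    frozenset({"cmd", "option", "delete"}),  # Finder: delete immediately
    frozenset({"ctrl", "cmd", "q"}),         # lock screen
}

BLOCKED_TYPE_PATTERNS = [
    re.compile(r"curl\b[^|\n]*\|\s*(ba|z)?sh\b"),  # download piped into a shell
    re.compile(r"\bsudo\b.*\brm\s+-rf\b"),         # sudo ... rm -rf
]


def is_blocked_combo(combo: str) -> bool:
    """Compare a combo such as 'cmd+shift+delete' regardless of key order or case."""
    keys = frozenset(k.strip().lower() for k in combo.split("+"))
    return keys in BLOCKED_KEY_COMBOS


def is_blocked_text(text: str) -> bool:
    """True if the text to be typed matches any blocked pattern."""
    return any(p.search(text) for p in BLOCKED_TYPE_PATTERNS)
```

Normalizing combos to a `frozenset` makes the check independent of the order in which the model lists modifiers.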

Includes: 102 tests, 657-line macOS workflow skill (auto-loaded),
feature docs page, reference catalog updates.
2026-04-02 01:59:32 -07:00

---
sidebar_position: 4
title: "Toolsets Reference"
description: "Reference for Hermes core, composite, platform, and dynamic toolsets"
---

Toolsets Reference

Toolsets are named bundles of tools that you can enable with hermes chat --toolsets ..., configure per platform, or resolve inside the agent runtime.

| Toolset | Kind | Resolves to |
|---|---|---|
| browser | core | browser_back, browser_click, browser_close, browser_console, browser_get_images, browser_navigate, browser_press, browser_scroll, browser_snapshot, browser_type, browser_vision, web_search |
| clarify | core | clarify |
| code_execution | core | execute_code |
| computer_use | core | computer |
| cronjob | core | cronjob |
| debugging | composite | patch, process, read_file, search_files, terminal, web_extract, web_search, write_file |
| delegation | core | delegate_task |
| file | core | patch, read_file, search_files, write_file |
| hermes-acp | platform | browser_back, browser_click, browser_close, browser_console, browser_get_images, browser_navigate, browser_press, browser_scroll, browser_snapshot, browser_type, browser_vision, delegate_task, execute_code, memory, patch, process, read_file, search_files, session_search, skill_manage, skill_view, skills_list, terminal, todo, vision_analyze, web_extract, web_search, write_file |
| hermes-cli | platform | browser_back, browser_click, browser_close, browser_console, browser_get_images, browser_navigate, browser_press, browser_scroll, browser_snapshot, browser_type, browser_vision, clarify, cronjob, delegate_task, execute_code, ha_call_service, ha_get_state, ha_list_entities, ha_list_services, honcho_conclude, honcho_context, honcho_profile, honcho_search, image_generate, memory, mixture_of_agents, patch, process, read_file, search_files, send_message, session_search, skill_manage, skill_view, skills_list, terminal, text_to_speech, todo, vision_analyze, web_extract, web_search, write_file |
| hermes-api-server | platform | browser_back, browser_click, browser_close, browser_console, browser_get_images, browser_navigate, browser_press, browser_scroll, browser_snapshot, browser_type, browser_vision, cronjob, delegate_task, execute_code, ha_call_service, ha_get_state, ha_list_entities, ha_list_services, honcho_conclude, honcho_context, honcho_profile, honcho_search, image_generate, memory, mixture_of_agents, patch, process, read_file, search_files, session_search, skill_manage, skill_view, skills_list, terminal, todo, vision_analyze, web_extract, web_search, write_file |
| hermes-dingtalk | platform | (same as hermes-cli) |
| hermes-feishu | platform | (same as hermes-cli) |
| hermes-wecom | platform | (same as hermes-cli) |
| hermes-discord | platform | (same as hermes-cli) |
| hermes-email | platform | (same as hermes-cli) |
| hermes-gateway | composite | Union of all messaging platform toolsets |
| hermes-homeassistant | platform | (same as hermes-cli) |
| hermes-matrix | platform | (same as hermes-cli) |
| hermes-mattermost | platform | (same as hermes-cli) |
| hermes-signal | platform | (same as hermes-cli) |
| hermes-slack | platform | (same as hermes-cli) |
| hermes-sms | platform | (same as hermes-cli) |
| hermes-telegram | platform | (same as hermes-cli) |
| hermes-whatsapp | platform | (same as hermes-cli) |
| homeassistant | core | ha_call_service, ha_get_state, ha_list_entities, ha_list_services |
| honcho | core | honcho_conclude, honcho_context, honcho_profile, honcho_search |
| image_gen | core | image_generate |
| memory | core | memory |
| messaging | core | send_message |
| moa | core | mixture_of_agents |
| rl | core | rl_check_status, rl_edit_config, rl_get_current_config, rl_get_results, rl_list_environments, rl_list_runs, rl_select_environment, rl_start_training, rl_stop_training, rl_test_inference |
| safe | composite | image_generate, mixture_of_agents, vision_analyze, web_extract, web_search |
| search | core | web_search |
| session_search | core | session_search |
| skills | core | skill_manage, skill_view, skills_list |
| terminal | core | process, terminal |
| todo | core | todo |
| tts | core | text_to_speech |
| vision | core | vision_analyze |
| web | core | web_extract, web_search |
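To make the "Resolves to" column concrete, here is a minimal sketch of resolving a toolset name to its tools. The registry shape (`tools` / `includes` keys) and the `resolve` function are hypothetical; in particular, defining `debugging` as the union of `file`, `terminal`, and `web` is an assumption that happens to reproduce the tool list documented in the table.

```python
# Subset of the table above. The "includes" list for the composite is an
# assumption, chosen because its union matches the documented tool list.
TOOLSETS = {
    "file": {"tools": ["patch", "read_file", "search_files", "write_file"]},
    "terminal": {"tools": ["process", "terminal"]},
    "web": {"tools": ["web_extract", "web_search"]},
    "debugging": {"includes": ["file", "terminal", "web"]},
}


def resolve(name: str, seen: frozenset[str] = frozenset()) -> list[str]:
    """Return the sorted, de-duplicated tools a toolset expands to."""
    if name in seen:
        raise ValueError(f"cycle through toolset {name!r}")
    if name not in TOOLSETS:
        raise KeyError(f"unknown toolset {name!r}")
    entry = TOOLSETS[name]
    tools = set(entry.get("tools", []))
    for child in entry.get("includes", []):
        tools.update(resolve(child, seen | {name}))
    return sorted(tools)


print(resolve("debugging"))
```

Under these assumptions, `resolve("debugging")` yields the same eight tools the table lists for `debugging`.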

Dynamic toolsets

  • mcp-<server> — generated at runtime for each configured MCP server.
  • Custom toolsets can be created in configuration and resolved at startup.
  • Wildcards: all and * expand to every registered toolset.
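The naming and wildcard rules above can be sketched as follows. Both helper names are hypothetical, and `github` is just an example MCP server name; only the `mcp-<server>` pattern and the `all` / `*` wildcards come from this page.

```python
def mcp_toolset_name(server: str) -> str:
    """Name of the dynamic toolset generated for one configured MCP server."""
    return f"mcp-{server}"


def expand_selection(requested: list[str], registered: list[str]) -> list[str]:
    """Expand `all` and `*` to every registered toolset, keeping first-seen order."""
    out: list[str] = []
    for name in requested:
        names = registered if name in ("all", "*") else [name]
        for n in names:
            if n not in out:
                out.append(n)
    return out


registered = ["web", "terminal", mcp_toolset_name("github")]
print(expand_selection(["*"], registered))
```

De-duplicating while preserving order means a request like `web,all` does not list `web` twice.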