mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-02 00:41:43 +08:00
Broad drift audit against origin/main (b52b63396).
Reference pages (most user-visible drift):
- slash-commands: add /busy, /curator, /footer, /indicator, /redraw, /steer
that were missing; drop non-existent /terminal-setup; fix /q footnote
(resolves to /queue, not /quit); extend CLI-only list with all 24
CLI-only commands in the registry
- cli-commands: add dedicated sections for hermes curator / fallback /
hooks (new subcommands not previously documented); remove stale
hermes honcho standalone section (the plugin registers dynamically
via hermes memory); list curator/fallback/hooks in top-level table;
fix completion to include fish
- toolsets-reference: document the real 52-toolset count; split browser
vs browser-cdp; add discord / discord_admin / spotify / yuanbao;
correct hermes-cli tool count from 36 to 38; fix misleading claim
that hermes-homeassistant adds tools (it's identical to hermes-cli)
- tools-reference: bump tool count 55 -> 68; add 7 Spotify, 5 Yuanbao,
2 Discord toolsets; move browser_cdp/browser_dialog to their own
browser-cdp toolset section
- environment-variables: add 40+ user-facing HERMES_* vars that were
undocumented (--yolo, --accept-hooks, --ignore-*, inference model
override, agent/stream/checkpoint timeouts, OAuth trace, per-platform
batch tuning for Telegram/Discord/Matrix/Feishu/WeCom, cron knobs,
gateway restart/connect timeouts); dedupe the Cron Scheduler section;
replace stale QQ_SANDBOX with QQ_PORTAL_HOST
User-guide (top level):
- cli.md: compression preserves last 20 turns, not 4 (protect_last_n: 20)
- configuration.md: display.platforms is the canonical per-platform
override key; tool_progress_overrides is deprecated and auto-migrated
- profiles.md: model.default is the config key, not model.model
- sessions.md: CLI/TUI session IDs use 6-char hex, gateway uses 8
- checkpoints-and-rollback.md: destructive-command list now matches
_DESTRUCTIVE_PATTERNS (adds rmdir, cp, install, dd)
- docker.md: the container runs as non-root hermes (UID 10000) via
gosu; fix install command (uv pip); add missing --insecure on the
dashboard compose example (required for non-loopback bind)
- security.md: systemctl danger pattern also matches 'restart'
- index.md: built-in tool count 47 -> 68
- integrations/index.md: 6 STT providers, 8 memory providers
- integrations/providers.md: drop fictional dashscope/qwen aliases
Features:
- overview.md: 9 image models (not 8), 9 TTS providers (not 5),
8 memory providers (Supermemory was missing)
- tool-gateway.md: 9 image models
- tools.md: extend common-toolsets list with search / messaging /
spotify / discord / debugging / safe
- fallback-providers.md: add 6 real providers from PROVIDER_REGISTRY
(lmstudio, kimi-coding-cn, stepfun, alibaba-coding-plan,
tencent-tokenhub, azure-foundry)
- plugins.md: Available Hooks table now includes on_session_finalize,
on_session_reset, subagent_stop
- built-in-plugins.md: add the 7 bundled plugins the page didn't
mention (spotify, google_meet, three image_gen providers, two
dashboard examples)
- web-dashboard.md: add --insecure and --tui flags
- cron.md: hermes cron create takes positional schedule/prompt, not
flags
Messaging:
- telegram.md: TELEGRAM_WEBHOOK_SECRET is now REQUIRED when
TELEGRAM_WEBHOOK_URL is set (gateway refuses to start without it
per GHSA-3vpc-7q5r-276h). Biggest user-visible drift in the batch.
- discord.md: HERMES_DISCORD_TEXT_BATCH_SPLIT_DELAY_SECONDS default
is 2.0, not 0.1
- dingtalk.md: document DINGTALK_REQUIRE_MENTION /
FREE_RESPONSE_CHATS / MENTION_PATTERNS / HOME_CHANNEL /
ALLOW_ALL_USERS that the adapter supports
- bluebubbles.md: drop fictional BLUEBUBBLES_SEND_READ_RECEIPTS env
var; the setting lives in platforms.bluebubbles.extra only
- qqbot.md: drop dead QQ_SANDBOX; add real QQ_PORTAL_HOST and
QQ_GROUP_ALLOWED_USERS
- wecom-callback.md: replace 'hermes gateway start' (service-only)
with 'hermes gateway' for first-time setup
Developer-guide:
- architecture.md: refresh tool/toolset counts (61/52), terminal
backend count (7), line counts for run_agent.py (~13.7k), cli.py
(~11.5k), main.py (~10.4k), setup.py (~3.5k), gateway/run.py
(~12.2k), mcp_tool.py (~3.1k); add yuanbao adapter, bump platform
adapter count 18 -> 20
- agent-loop.md: run_agent.py line count 10.7k -> 13.7k
- tools-runtime.md: add vercel_sandbox backend
- adding-tools.md: remove stale 'Discovery import added to
model_tools.py' checklist item (registry auto-discovery)
- adding-platform-adapters.md: mark send_typing / get_chat_info as
concrete base methods; only connect/disconnect/send are abstract
- acp-internals.md: ACP sessions now persist to SessionDB
(~/.hermes/state.db); acp.run_agent call uses
use_unstable_protocol=True
- cron-internals.md: gateway runs scheduler in a dedicated background
thread via _start_cron_ticker, not on a maintenance cycle; locking
is cross-process via fcntl.flock (Unix) / msvcrt.locking (Windows)
- gateway-internals.md: gateway/run.py ~12k lines
- provider-runtime.md: cron DOES support fallback (run_job reads
fallback_providers from config)
- session-storage.md: SCHEMA_VERSION = 11 (not 9); add migrations
10 and 11 (trigram FTS, inline-mode FTS5 re-index); add
api_call_count column to Sessions DDL; document messages_fts_trigram
and state_meta in the architecture tree
- context-compression-and-caching.md: remove the obsolete 'context
pressure warnings' section (warnings were removed for causing
models to give up early)
- context-engine-plugin.md: compress() signature now includes
focus_topic param
- extending-the-cli.md: _build_tui_layout_children signature now
includes model_picker_widget; add to default layout
Also fixed three pre-existing broken links/anchors the build warned
about (docker.md -> api-server.md, yuanbao.md -> cron-jobs.md and
tips#background-tasks, nix-setup.md -> #container-aware-cli).
Regenerated per-skill pages via website/scripts/generate-skill-docs.py
so catalog tables and sidebar are consistent with current SKILL.md
frontmatter.
docusaurus build: clean, no broken links or anchors.
280 lines
16 KiB
Markdown
280 lines
16 KiB
Markdown
---
|
|
sidebar_position: 1
|
|
title: "Architecture"
|
|
description: "Hermes Agent internals — major subsystems, execution paths, data flow, and where to read next"
|
|
---
|
|
|
|
# Architecture
|
|
|
|
This page is the top-level map of Hermes Agent internals. Use it to orient yourself in the codebase, then dive into subsystem-specific docs for implementation details.
|
|
|
|
## System Overview
|
|
|
|
```text
|
|
┌─────────────────────────────────────────────────────────────────────┐
|
|
│ Entry Points │
|
|
│ │
|
|
│ CLI (cli.py) Gateway (gateway/run.py) ACP (acp_adapter/) │
|
|
│ Batch Runner API Server Python Library │
|
|
└──────────┬──────────────┬───────────────────────┬───────────────────┘
|
|
│ │ │
|
|
▼ ▼ ▼
|
|
┌─────────────────────────────────────────────────────────────────────┐
|
|
│ AIAgent (run_agent.py) │
|
|
│ │
|
|
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
|
|
│ │ Prompt │ │ Provider │ │ Tool │ │
|
|
│ │ Builder │ │ Resolution │ │ Dispatch │ │
|
|
│ │ (prompt_ │ │ (runtime_ │ │ (model_ │ │
|
|
│ │ builder.py) │ │ provider.py)│ │ tools.py) │ │
|
|
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
|
|
│ │ │ │ │
|
|
│ ┌──────┴───────┐ ┌──────┴───────┐ ┌──────┴───────┐ │
|
|
│ │ Compression │ │ 3 API Modes │ │ Tool Registry│ │
|
|
│ │ & Caching │ │ chat_compl. │ │ (registry.py)│ │
|
|
│ │ │ │ codex_resp. │ │ 61 tools │ │
|
|
│ │ │ │ anthropic │ │ 52 toolsets │ │
|
|
│ └──────────────┘ └──────────────┘ └──────────────┘ │
|
|
└─────────┴─────────────────┴─────────────────┴───────────────────────┘
|
|
│ │
|
|
▼ ▼
|
|
┌───────────────────┐ ┌──────────────────────┐
|
|
│ Session Storage │ │ Tool Backends │
|
|
│ (SQLite + FTS5) │ │ Terminal (7 backends) │
|
|
│ hermes_state.py │ │ Browser (5 backends) │
|
|
│ gateway/session.py│ │ Web (4 backends) │
|
|
└───────────────────┘ │ MCP (dynamic) │
|
|
│ File, Vision, etc. │
|
|
└──────────────────────┘
|
|
```
|
|
|
|
## Directory Structure
|
|
|
|
```text
|
|
hermes-agent/
|
|
├── run_agent.py # AIAgent — core conversation loop (~13,700 lines)
|
|
├── cli.py # HermesCLI — interactive terminal UI (~11,500 lines)
|
|
├── model_tools.py # Tool discovery, schema collection, dispatch
|
|
├── toolsets.py # Tool groupings and platform presets
|
|
├── hermes_state.py # SQLite session/state database with FTS5
|
|
├── hermes_constants.py # HERMES_HOME, profile-aware paths
|
|
├── batch_runner.py # Batch trajectory generation
|
|
│
|
|
├── agent/ # Agent internals
|
|
│ ├── prompt_builder.py # System prompt assembly
|
|
│ ├── context_engine.py # ContextEngine ABC (pluggable)
|
|
│ ├── context_compressor.py # Default engine — lossy summarization
|
|
│ ├── prompt_caching.py # Anthropic prompt caching
|
|
│ ├── auxiliary_client.py # Auxiliary LLM for side tasks (vision, summarization)
|
|
│ ├── model_metadata.py # Model context lengths, token estimation
|
|
│ ├── models_dev.py # models.dev registry integration
|
|
│ ├── anthropic_adapter.py # Anthropic Messages API format conversion
|
|
│ ├── display.py # KawaiiSpinner, tool preview formatting
|
|
│ ├── skill_commands.py # Skill slash commands
|
|
│ ├── memory_manager.py # Memory manager orchestration
|
|
│ ├── memory_provider.py # Memory provider ABC
|
|
│ └── trajectory.py # Trajectory saving helpers
|
|
│
|
|
├── hermes_cli/ # CLI subcommands and setup
|
|
│ ├── main.py # Entry point — all `hermes` subcommands (~10,400 lines)
|
|
│ ├── config.py # DEFAULT_CONFIG, OPTIONAL_ENV_VARS, migration
|
|
│ ├── commands.py # COMMAND_REGISTRY — central slash command definitions
|
|
│ ├── auth.py # PROVIDER_REGISTRY, credential resolution
|
|
│ ├── runtime_provider.py # Provider → api_mode + credentials
|
|
│ ├── models.py # Model catalog, provider model lists
|
|
│ ├── model_switch.py # /model command logic (CLI + gateway shared)
|
|
│ ├── setup.py # Interactive setup wizard (~3,500 lines)
|
|
│ ├── skin_engine.py # CLI theming engine
|
|
│ ├── skills_config.py # hermes skills — enable/disable per platform
|
|
│ ├── skills_hub.py # /skills slash command
|
|
│ ├── tools_config.py # hermes tools — enable/disable per platform
|
|
│ ├── plugins.py # PluginManager — discovery, loading, hooks
|
|
│ ├── callbacks.py # Terminal callbacks (clarify, sudo, approval)
|
|
│ └── gateway.py # hermes gateway start/stop
|
|
│
|
|
├── tools/ # Tool implementations (one file per tool)
|
|
│ ├── registry.py # Central tool registry
|
|
│ ├── approval.py # Dangerous command detection
|
|
│ ├── terminal_tool.py # Terminal orchestration
|
|
│ ├── process_registry.py # Background process management
|
|
│ ├── file_tools.py # read_file, write_file, patch, search_files
|
|
│ ├── web_tools.py # web_search, web_extract
|
|
│ ├── browser_tool.py # 10 browser automation tools
|
|
│ ├── code_execution_tool.py # execute_code sandbox
|
|
│ ├── delegate_tool.py # Subagent delegation
|
|
│ ├── mcp_tool.py # MCP client (~3,100 lines)
|
|
│ ├── credential_files.py # File-based credential passthrough
|
|
│ ├── env_passthrough.py # Env var passthrough for sandboxes
|
|
│ ├── ansi_strip.py # ANSI escape stripping
|
|
│ └── environments/ # Terminal backends (local, docker, ssh, modal, daytona, singularity)
|
|
│
|
|
├── gateway/ # Messaging platform gateway
|
|
│ ├── run.py # GatewayRunner — message dispatch (~12,200 lines)
|
|
│ ├── session.py # SessionStore — conversation persistence
|
|
│ ├── delivery.py # Outbound message delivery
|
|
│ ├── pairing.py # DM pairing authorization
|
|
│ ├── hooks.py # Hook discovery and lifecycle events
|
|
│ ├── mirror.py # Cross-session message mirroring
|
|
│ ├── status.py # Token locks, profile-scoped process tracking
|
|
│ ├── builtin_hooks/ # Extension point for always-registered hooks (none shipped)
|
|
│ └── platforms/ # 20 adapters: telegram, discord, slack, whatsapp,
|
|
│ # signal, matrix, mattermost, email, sms,
|
|
│ # dingtalk, feishu, wecom, wecom_callback, weixin,
|
|
│ # bluebubbles, qqbot, homeassistant, webhook, api_server,
|
|
│ # yuanbao
|
|
│
|
|
├── acp_adapter/ # ACP server (VS Code / Zed / JetBrains)
|
|
├── cron/ # Scheduler (jobs.py, scheduler.py)
|
|
├── plugins/memory/ # Memory provider plugins
|
|
├── plugins/context_engine/ # Context engine plugins
|
|
├── environments/ # RL training environments (Atropos)
|
|
├── skills/ # Bundled skills (always available)
|
|
├── optional-skills/ # Official optional skills (install explicitly)
|
|
├── website/ # Docusaurus documentation site
|
|
└── tests/ # Pytest suite (~3,000+ tests)
|
|
```
|
|
|
|
## Data Flow
|
|
|
|
### CLI Session
|
|
|
|
```text
|
|
User input → HermesCLI.process_input()
|
|
→ AIAgent.run_conversation()
|
|
→ prompt_builder.build_system_prompt()
|
|
→ runtime_provider.resolve_runtime_provider()
|
|
→ API call (chat_completions / codex_responses / anthropic_messages)
|
|
→ tool_calls? → model_tools.handle_function_call() → loop
|
|
→ final response → display → save to SessionDB
|
|
```
|
|
|
|
### Gateway Message
|
|
|
|
```text
|
|
Platform event → Adapter.on_message() → MessageEvent
|
|
→ GatewayRunner._handle_message()
|
|
→ authorize user
|
|
→ resolve session key
|
|
→ create AIAgent with session history
|
|
→ AIAgent.run_conversation()
|
|
→ deliver response back through adapter
|
|
```
|
|
|
|
### Cron Job
|
|
|
|
```text
|
|
Scheduler tick → load due jobs from jobs.json
|
|
→ create fresh AIAgent (no history)
|
|
→ inject attached skills as context
|
|
→ run job prompt
|
|
→ deliver response to target platform
|
|
→ update job state and next_run
|
|
```
|
|
|
|
## Recommended Reading Order
|
|
|
|
If you are new to the codebase:
|
|
|
|
1. **This page** — orient yourself
|
|
2. **[Agent Loop Internals](./agent-loop.md)** — how AIAgent works
|
|
3. **[Prompt Assembly](./prompt-assembly.md)** — system prompt construction
|
|
4. **[Provider Runtime Resolution](./provider-runtime.md)** — how providers are selected
|
|
5. **[Adding Providers](./adding-providers.md)** — practical guide to adding a new provider
|
|
6. **[Tools Runtime](./tools-runtime.md)** — tool registry, dispatch, environments
|
|
7. **[Session Storage](./session-storage.md)** — SQLite schema, FTS5, session lineage
|
|
8. **[Gateway Internals](./gateway-internals.md)** — messaging platform gateway
|
|
9. **[Context Compression & Prompt Caching](./context-compression-and-caching.md)** — compression and caching
|
|
10. **[ACP Internals](./acp-internals.md)** — IDE integration
|
|
11. **[Environments, Benchmarks & Data Generation](./environments.md)** — RL training
|
|
|
|
## Major Subsystems
|
|
|
|
### Agent Loop
|
|
|
|
The synchronous orchestration engine (`AIAgent` in `run_agent.py`). Handles provider selection, prompt construction, tool execution, retries, fallback, callbacks, compression, and persistence. Supports three API modes for different provider backends.
|
|
|
|
→ [Agent Loop Internals](./agent-loop.md)
|
|
|
|
### Prompt System
|
|
|
|
Prompt construction and maintenance across the conversation lifecycle:
|
|
|
|
- **`prompt_builder.py`** — Assembles the system prompt from: personality (SOUL.md), memory (MEMORY.md, USER.md), skills, context files (AGENTS.md, .hermes.md), tool-use guidance, and model-specific instructions
|
|
- **`prompt_caching.py`** — Applies Anthropic cache breakpoints for prefix caching
|
|
- **`context_compressor.py`** — Summarizes middle conversation turns when context exceeds thresholds
|
|
|
|
→ [Prompt Assembly](./prompt-assembly.md), [Context Compression & Prompt Caching](./context-compression-and-caching.md)
|
|
|
|
### Provider Resolution
|
|
|
|
A shared runtime resolver used by CLI, gateway, cron, ACP, and auxiliary calls. Maps `(provider, model)` tuples to `(api_mode, api_key, base_url)`. Handles 18+ providers, OAuth flows, credential pools, and alias resolution.
|
|
|
|
→ [Provider Runtime Resolution](./provider-runtime.md)
|
|
|
|
### Tool System
|
|
|
|
Central tool registry (`tools/registry.py`) with 61 registered tools across 52 toolsets. Each tool file self-registers at import time. The registry handles schema collection, dispatch, availability checking, and error wrapping. Terminal tools support 7 backends (local, Docker, SSH, Daytona, Modal, Singularity, Vercel Sandbox).
|
|
|
|
→ [Tools Runtime](./tools-runtime.md)
|
|
|
|
### Session Persistence
|
|
|
|
SQLite-based session storage with FTS5 full-text search. Sessions have lineage tracking (parent/child across compressions), per-platform isolation, and atomic writes with contention handling.
|
|
|
|
→ [Session Storage](./session-storage.md)
|
|
|
|
### Messaging Gateway
|
|
|
|
Long-running process with 20 platform adapters, unified session routing, user authorization (allowlists + DM pairing), slash command dispatch, hook system, cron ticking, and background maintenance.
|
|
|
|
→ [Gateway Internals](./gateway-internals.md)
|
|
|
|
### Plugin System
|
|
|
|
Three discovery sources: `~/.hermes/plugins/` (user), `.hermes/plugins/` (project), and pip entry points. Plugins register tools, hooks, and CLI commands through a context API. Two specialized plugin types exist: memory providers (`plugins/memory/`) and context engines (`plugins/context_engine/`). Both are single-select — only one of each can be active at a time, configured via `hermes plugins` or `config.yaml`.
|
|
|
|
→ [Plugin Guide](/docs/guides/build-a-hermes-plugin), [Memory Provider Plugin](./memory-provider-plugin.md)
|
|
|
|
### Cron
|
|
|
|
First-class agent tasks (not shell tasks). Jobs store in JSON, support multiple schedule formats, can attach skills and scripts, and deliver to any platform.
|
|
|
|
→ [Cron Internals](./cron-internals.md)
|
|
|
|
### ACP Integration
|
|
|
|
Exposes Hermes as an editor-native agent over stdio/JSON-RPC for VS Code, Zed, and JetBrains.
|
|
|
|
→ [ACP Internals](./acp-internals.md)
|
|
|
|
### RL / Environments / Trajectories
|
|
|
|
Full environment framework for evaluation and RL training. Integrates with Atropos, supports multiple tool-call parsers, and generates ShareGPT-format trajectories.
|
|
|
|
→ [Environments, Benchmarks & Data Generation](./environments.md), [Trajectories & Training Format](./trajectory-format.md)
|
|
|
|
## Design Principles
|
|
|
|
| Principle | What it means in practice |
|
|
|-----------|--------------------------|
|
|
| **Prompt stability** | System prompt doesn't change mid-conversation. No cache-breaking mutations except explicit user actions (`/model`). |
|
|
| **Observable execution** | Every tool call is visible to the user via callbacks. Progress updates in CLI (spinner) and gateway (chat messages). |
|
|
| **Interruptible** | API calls and tool execution can be cancelled mid-flight by user input or signals. |
|
|
| **Platform-agnostic core** | One AIAgent class serves CLI, gateway, ACP, batch, and API server. Platform differences live in the entry point, not the agent. |
|
|
| **Loose coupling** | Optional subsystems (MCP, plugins, memory providers, RL environments) use registry patterns and check_fn gating, not hard dependencies. |
|
|
| **Profile isolation** | Each profile (`hermes -p <name>`) gets its own HERMES_HOME, config, memory, sessions, and gateway PID. Multiple profiles run concurrently. |
|
|
|
|
## File Dependency Chain
|
|
|
|
```text
|
|
tools/registry.py (no deps — imported by all tool files)
|
|
↑
|
|
tools/*.py (each calls registry.register() at import time)
|
|
↑
|
|
model_tools.py (imports tools/registry + triggers tool discovery)
|
|
↑
|
|
run_agent.py, cli.py, batch_runner.py, environments/
|
|
```
|
|
|
|
This chain means tool registration happens at import time, before any agent instance is created. Any `tools/*.py` file with a top-level `registry.register()` call is auto-discovered — no manual import list needed.
|