diff --git a/website/docs/developer-guide/agent-loop.md b/website/docs/developer-guide/agent-loop.md index 4728a634b36..b07fa047896 100644 --- a/website/docs/developer-guide/agent-loop.md +++ b/website/docs/developer-guide/agent-loop.md @@ -226,7 +226,8 @@ After each turn: |------|---------| | `run_agent.py` | AIAgent class — the complete agent loop (~9,200 lines) | | `agent/prompt_builder.py` | System prompt assembly from memory, skills, context files, personality | -| `agent/context_compressor.py` | Conversation compression algorithm | +| `agent/context_engine.py` | ContextEngine ABC — pluggable context management | +| `agent/context_compressor.py` | Default engine — lossy summarization algorithm | | `agent/prompt_caching.py` | Anthropic prompt caching markers and cache metrics | | `agent/auxiliary_client.py` | Auxiliary LLM client for side tasks (vision, summarization) | | `model_tools.py` | Tool schema collection, `handle_function_call()` dispatch | diff --git a/website/docs/developer-guide/architecture.md b/website/docs/developer-guide/architecture.md index 38802a04919..13f08b7db42 100644 --- a/website/docs/developer-guide/architecture.md +++ b/website/docs/developer-guide/architecture.md @@ -62,7 +62,8 @@ hermes-agent/ │ ├── agent/ # Agent internals │ ├── prompt_builder.py # System prompt assembly -│ ├── context_compressor.py # Conversation compression algorithm +│ ├── context_engine.py # ContextEngine ABC (pluggable) +│ ├── context_compressor.py # Default engine — lossy summarization │ ├── prompt_caching.py # Anthropic prompt caching │ ├── auxiliary_client.py # Auxiliary LLM for side tasks (vision, summarization) │ ├── model_metadata.py # Model context lengths, token estimation @@ -123,6 +124,7 @@ hermes-agent/ ├── acp_adapter/ # ACP server (VS Code / Zed / JetBrains) ├── cron/ # Scheduler (jobs.py, scheduler.py) ├── plugins/memory/ # Memory provider plugins +├── plugins/context_engine/ # Context engine plugins ├── environments/ # RL training environments 
(Atropos) ├── skills/ # Bundled skills (always available) ├── optional-skills/ # Official optional skills (install explicitly) @@ -227,7 +229,7 @@ Long-running process with 14 platform adapters, unified session routing, user au ### Plugin System -Three discovery sources: `~/.hermes/plugins/` (user), `.hermes/plugins/` (project), and pip entry points. Plugins register tools, hooks, and CLI commands through a context API. Memory providers are a specialized plugin type under `plugins/memory/`. +Three discovery sources: `~/.hermes/plugins/` (user), `.hermes/plugins/` (project), and pip entry points. Plugins register tools, hooks, and CLI commands through a context API. Two specialized plugin types exist: memory providers (`plugins/memory/`) and context engines (`plugins/context_engine/`). Both are single-select — only one of each can be active at a time, configured via `hermes plugins` or `config.yaml`. → [Plugin Guide](/docs/guides/build-a-hermes-plugin), [Memory Provider Plugin](./memory-provider-plugin.md) diff --git a/website/docs/developer-guide/context-compression-and-caching.md b/website/docs/developer-guide/context-compression-and-caching.md index 583844645ac..98dc0a6e2ad 100644 --- a/website/docs/developer-guide/context-compression-and-caching.md +++ b/website/docs/developer-guide/context-compression-and-caching.md @@ -3,10 +3,37 @@ Hermes Agent uses a dual compression system and Anthropic prompt caching to manage context window usage efficiently across long conversations. -Source files: `agent/context_compressor.py`, `agent/prompt_caching.py`, -`gateway/run.py` (session hygiene), `run_agent.py` (search for `_compress_context`) +Source files: `agent/context_engine.py` (ABC), `agent/context_compressor.py` (default engine), +`agent/prompt_caching.py`, `gateway/run.py` (session hygiene), `run_agent.py` (search for `_compress_context`) +## Pluggable Context Engine + +Context management is built on the `ContextEngine` ABC (`agent/context_engine.py`). 
The built-in `ContextCompressor` is the default implementation, but plugins can replace it with alternative engines (e.g., Lossless Context Management). + +```yaml +context: + engine: "compressor" # default — built-in lossy summarization + # engine: "lcm" # example — plugin providing lossless context +``` + +The engine is responsible for: +- Deciding when compaction should fire (`should_compress()`) +- Performing compaction (`compress()`) +- Optionally exposing tools the agent can call (e.g., `lcm_grep`) +- Tracking token usage from API responses + +Selection is config-driven via `context.engine` in `config.yaml`. The resolution order is: +1. Check `plugins/context_engine//` directory +2. Check general plugin system (`register_context_engine()`) +3. Fall back to built-in `ContextCompressor` + +Plugin engines are **never auto-activated** — the user must explicitly set `context.engine` to the plugin's name. The default value `"compressor"` always selects the built-in engine. + +Configure via `hermes plugins` → Provider Plugins → Context Engine, or edit `config.yaml` directly. + +For building a context engine plugin, see [Context Engine Plugins](/docs/developer-guide/context-engine-plugin). + ## Dual Compression System Hermes has two separate compression layers that operate independently: diff --git a/website/docs/developer-guide/context-engine-plugin.md b/website/docs/developer-guide/context-engine-plugin.md new file mode 100644 index 00000000000..5a606f8ea0c --- /dev/null +++ b/website/docs/developer-guide/context-engine-plugin.md @@ -0,0 +1,189 @@ +--- +sidebar_position: 9 +title: "Context Engine Plugins" +description: "How to build a context engine plugin that replaces the built-in ContextCompressor" +--- + +# Building a Context Engine Plugin + +Context engine plugins replace the built-in `ContextCompressor` with an alternative strategy for managing conversation context. For example, a Lossless Context Management (LCM) engine builds a knowledge DAG instead of relying on lossy summarization. 
+ +## How it works + +The agent's context management is built on the `ContextEngine` ABC (`agent/context_engine.py`). The built-in `ContextCompressor` is the default implementation. Plugin engines must implement the same interface. + +Only **one** context engine can be active at a time. Selection is config-driven: + +```yaml +# config.yaml +context: + engine: "compressor" # default built-in + # engine: "lcm" # activates a plugin engine named "lcm" +``` + +Plugin engines are **never auto-activated** — the user must explicitly set `context.engine` to the plugin's name. + +## Directory structure + +Each context engine lives in `plugins/context_engine//`: + +``` +plugins/context_engine/lcm/ +├── __init__.py # exports the ContextEngine subclass +├── plugin.yaml # metadata (name, description, version) +└── ... # any other modules your engine needs +``` + +## The ContextEngine ABC + +Your engine must implement these **required** methods: + +```python +from agent.context_engine import ContextEngine + +class LCMEngine(ContextEngine): + + @property + def name(self) -> str: + """Short identifier, e.g. 'lcm'. Must match config.yaml value.""" + return "lcm" + + def update_from_response(self, usage: dict) -> None: + """Called after every LLM call with the usage dict. + + Update self.last_prompt_tokens, self.last_completion_tokens, + self.last_total_tokens from the response. + """ + + def should_compress(self, prompt_tokens: int = None) -> bool: + """Return True if compaction should fire this turn.""" + + def compress(self, messages: list, current_tokens: int = None) -> list: + """Compact the message list and return a new (possibly shorter) list. + + The returned list must be a valid OpenAI-format message sequence. 
+ """ +``` + +### Class attributes your engine must maintain + +The agent reads these directly for display and logging: + +```python +last_prompt_tokens: int = 0 +last_completion_tokens: int = 0 +last_total_tokens: int = 0 +threshold_tokens: int = 0 # when compression triggers +context_length: int = 0 # model's full context window +compression_count: int = 0 # how many times compress() has run +``` + +### Optional methods + +These have sensible defaults in the ABC. Override as needed: + +| Method | Default | Override when | +|--------|---------|--------------| +| `on_session_start(session_id, **kwargs)` | No-op | You need to load persisted state (DAG, DB) | +| `on_session_end(session_id, messages)` | No-op | You need to flush state, close connections | +| `on_session_reset()` | Resets token counters | You have per-session state to clear | +| `update_model(model, context_length, ...)` | Updates context_length + threshold | You need to recalculate budgets on model switch | +| `get_tool_schemas()` | Returns `[]` | Your engine provides agent-callable tools (e.g., `lcm_grep`) | +| `handle_tool_call(name, args, **kwargs)` | Returns error JSON | You implement tool handlers | +| `should_compress_preflight(messages)` | Returns `False` | You can do a cheap pre-API-call estimate | +| `get_status()` | Standard token/threshold dict | You have custom metrics to expose | + +## Engine tools + +Context engines can expose tools the agent calls directly. 
Return schemas from `get_tool_schemas()` and handle calls in `handle_tool_call()`: + +```python +import json + +# Both functions are methods on your ContextEngine subclass. +def get_tool_schemas(self): + return [{ + "name": "lcm_grep", + "description": "Search the context knowledge graph", + "parameters": { + "type": "object", + "properties": { + "query": {"type": "string", "description": "Search query"} + }, + "required": ["query"], + }, + }] + +def handle_tool_call(self, name, args, **kwargs): + if name == "lcm_grep": + # _search_dag is an engine-specific helper that you implement + results = self._search_dag(args["query"]) + return json.dumps({"results": results}) + return json.dumps({"error": f"Unknown tool: {name}"}) +``` + +Engine tools are injected into the agent's tool list at startup and dispatched automatically — no registry registration needed. + +## Registration + +### Via directory (recommended) + +Place your engine in `plugins/context_engine//`. The `__init__.py` must export a `ContextEngine` subclass. The discovery system finds and instantiates it automatically. + +### Via general plugin system + +A general plugin can also register a context engine: + +```python +def register(ctx): + engine = LCMEngine(context_length=200000) + ctx.register_context_engine(engine) +``` + +Only one engine can be registered. A second plugin attempting to register is rejected with a warning. + +## Lifecycle + +``` +1. Engine instantiated (plugin load or directory discovery) +2. on_session_start() — conversation begins +3. update_from_response() — after each API call +4. should_compress() — checked each turn +5. compress() — called when should_compress() returns True +6. on_session_end() — session boundary (CLI exit, /reset, gateway expiry) +``` + +`on_session_reset()` is called on `/new` or `/reset` to clear per-session state without a full shutdown. 
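The lifecycle order above can be sketched as a small driver loop. This is a minimal illustration under stated assumptions, not the agent's actual loop: `ContextEngineStub` is a hypothetical stand-in for `agent.context_engine.ContextEngine` so the snippet runs on its own, and `KeepLastTwoEngine` and `run_session` are invented names used only for this example.

```python
from abc import ABC, abstractmethod
from typing import Optional


class ContextEngineStub(ABC):
    """Hypothetical stand-in for agent.context_engine.ContextEngine."""

    last_prompt_tokens: int = 0
    threshold_tokens: int = 0
    compression_count: int = 0

    @abstractmethod
    def update_from_response(self, usage: dict) -> None: ...

    @abstractmethod
    def should_compress(self, prompt_tokens: Optional[int] = None) -> bool: ...

    @abstractmethod
    def compress(self, messages: list, current_tokens: Optional[int] = None) -> list: ...

    def on_session_start(self, session_id: str, **kwargs) -> None:
        """No-op by default, mirroring the optional-method table."""

    def on_session_end(self, session_id: str, messages: list) -> None:
        """No-op by default, mirroring the optional-method table."""


class KeepLastTwoEngine(ContextEngineStub):
    """Toy engine: keeps only the last two messages when compacting.

    A real engine must also keep the result a valid OpenAI-format
    sequence (for example, not orphaning tool results).
    """

    def __init__(self, threshold_tokens: int) -> None:
        self.threshold_tokens = threshold_tokens

    def update_from_response(self, usage: dict) -> None:
        self.last_prompt_tokens = usage.get("prompt_tokens", 0)

    def should_compress(self, prompt_tokens: Optional[int] = None) -> bool:
        tokens = self.last_prompt_tokens if prompt_tokens is None else prompt_tokens
        return tokens >= self.threshold_tokens

    def compress(self, messages: list, current_tokens: Optional[int] = None) -> list:
        self.compression_count += 1
        return messages[-2:]


def run_session(engine: ContextEngineStub, session_id: str, turns: list) -> list:
    """Drive lifecycle steps 2-6; `turns` is a list of (message, usage) pairs."""
    messages: list = []
    engine.on_session_start(session_id)         # step 2
    for message, usage in turns:
        messages.append(message)
        engine.update_from_response(usage)      # step 3
        if engine.should_compress():            # step 4
            messages = engine.compress(messages)  # step 5
    engine.on_session_end(session_id, messages)  # step 6
    return messages
```

Calling `run_session(KeepLastTwoEngine(threshold_tokens=100), "demo", turns)` with usage dicts whose `prompt_tokens` cross 100 triggers exactly one `compress()` call per crossing turn. The real agent may check `should_compress()` at a different point in the turn; the sketch only mirrors the ordering listed above.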
+ +## Configuration + +Users select your engine via `hermes plugins` → Provider Plugins → Context Engine, or by editing `config.yaml`: + +```yaml +context: + engine: "lcm" # must match your engine's name property +``` + +The `compression` config block (`compression.threshold`, `compression.protect_last_n`, etc.) is specific to the built-in `ContextCompressor`. Your engine should define its own config format if needed, reading from `config.yaml` during initialization. + +## Testing + +```python +from agent.context_engine import ContextEngine + +def test_engine_satisfies_abc(): + engine = YourEngine(context_length=200000) + assert isinstance(engine, ContextEngine) + assert engine.name == "your-name" + +def test_compress_returns_valid_messages(): + engine = YourEngine(context_length=200000) + msgs = [{"role": "user", "content": "hello"}] + result = engine.compress(msgs) + assert isinstance(result, list) + assert all("role" in m for m in result) +``` + +See `tests/agent/test_context_engine.py` for the full ABC contract test suite. + +## See also + +- [Context Compression and Caching](/docs/developer-guide/context-compression-and-caching) — how the built-in compressor works +- [Memory Provider Plugins](/docs/developer-guide/memory-provider-plugin) — analogous single-select plugin system for memory +- [Plugins](/docs/user-guide/features/plugins) — general plugin system overview diff --git a/website/docs/developer-guide/memory-provider-plugin.md b/website/docs/developer-guide/memory-provider-plugin.md index b5c6a3a3025..d08022a44a1 100644 --- a/website/docs/developer-guide/memory-provider-plugin.md +++ b/website/docs/developer-guide/memory-provider-plugin.md @@ -8,6 +8,10 @@ description: "How to build a memory provider plugin for Hermes Agent" Memory provider plugins give Hermes Agent persistent, cross-session knowledge beyond the built-in MEMORY.md and USER.md. This guide covers how to build one. +:::tip +Memory providers are one of two **provider plugin** types. 
The other is [Context Engine Plugins](/docs/developer-guide/context-engine-plugin), which replace the built-in context compressor. Both follow the same pattern: single-select, config-driven, managed via `hermes plugins`. +::: + ## Directory Structure Each memory provider lives in `plugins/memory//`: diff --git a/website/docs/guides/build-a-hermes-plugin.md b/website/docs/guides/build-a-hermes-plugin.md index 85b1c8177c8..e79cf2ee799 100644 --- a/website/docs/guides/build-a-hermes-plugin.md +++ b/website/docs/guides/build-a-hermes-plugin.md @@ -547,6 +547,12 @@ After registration, users can run `hermes my-plugin status`, `hermes my-plugin c **Active-provider gating:** Memory plugin CLI commands only appear when their provider is the active `memory.provider` in config. If a user hasn't set up your provider, your CLI commands won't clutter the help output. +:::tip +This guide covers **general plugins** (tools, hooks, CLI commands). For specialized plugin types, see: +- [Memory Provider Plugins](/docs/developer-guide/memory-provider-plugin) — cross-session knowledge backends +- [Context Engine Plugins](/docs/developer-guide/context-engine-plugin) — alternative context management strategies +::: + ### Distribute via pip For sharing plugins publicly, add an entry point to your Python package: diff --git a/website/docs/reference/cli-commands.md b/website/docs/reference/cli-commands.md index a7362b06ff7..132da079ce4 100644 --- a/website/docs/reference/cli-commands.md +++ b/website/docs/reference/cli-commands.md @@ -586,11 +586,14 @@ See [MCP Config Reference](./mcp-config-reference.md), [Use MCP with Hermes](../ hermes plugins [subcommand] ``` -Manage Hermes Agent plugins. Running `hermes plugins` with no subcommand launches an interactive curses checklist to enable/disable installed plugins. +Unified plugin management — general plugins, memory providers, and context engines in one place. 
Running `hermes plugins` with no subcommand opens a composite interactive screen with two sections: + +- **General Plugins** — multi-select checkboxes to enable/disable installed plugins +- **Provider Plugins** — single-select configuration for Memory Provider and Context Engine. Press ENTER on a category to open a radio picker. | Subcommand | Description | |------------|-------------| -| *(none)* | Interactive toggle UI — enable/disable plugins with arrow keys and space. | +| *(none)* | Composite interactive UI — general plugin toggles + provider plugin configuration. | | `install [--force]` | Install a plugin from a Git URL or `owner/repo`. | | `update ` | Pull latest changes for an installed plugin. | | `remove ` (aliases: `rm`, `uninstall`) | Remove an installed plugin. | @@ -598,7 +601,11 @@ Manage Hermes Agent plugins. Running `hermes plugins` with no subcommand launche | `disable ` | Disable a plugin without removing it. | | `list` (alias: `ls`) | List installed plugins with enabled/disabled status. | -Disabled plugins are stored in `config.yaml` under `plugins.disabled` and skipped during loading. +Provider plugin selections are saved to `config.yaml`: +- `memory.provider` — active memory provider (empty = built-in only) +- `context.engine` — active context engine (`"compressor"` = built-in default) + +General plugin disabled list is stored in `config.yaml` under `plugins.disabled`. See [Plugins](../user-guide/features/plugins.md) and [Build a Hermes Plugin](../guides/build-a-hermes-plugin.md). diff --git a/website/docs/user-guide/configuration.md b/website/docs/user-guide/configuration.md index 6c52645e190..a8cb23f99ab 100644 --- a/website/docs/user-guide/configuration.md +++ b/website/docs/user-guide/configuration.md @@ -482,6 +482,26 @@ Points at a custom OpenAI-compatible endpoint. Uses `OPENAI_API_KEY` for auth. 
The `summary_model` must support a context length at least as large as your main model's, since it receives the full middle section of the conversation for compression. +## Context Engine + +The context engine controls how conversations are managed when approaching the model's token limit. The built-in `compressor` engine uses lossy summarization (see [Context Compression](/docs/developer-guide/context-compression-and-caching)). Plugin engines can replace it with alternative strategies. + +```yaml +context: + engine: "compressor" # default — built-in lossy summarization +``` + +To use a plugin engine (e.g., LCM for lossless context management): + +```yaml +context: + engine: "lcm" # must match the plugin's name +``` + +Plugin engines are **never auto-activated** — you must explicitly set `context.engine` to the plugin name. Available engines can be browsed and selected via `hermes plugins` → Provider Plugins → Context Engine. + +See [Memory Providers](/docs/user-guide/features/memory-providers) for the analogous single-select system for memory plugins. + ## Iteration Budget Pressure When the agent is working on a complex task with many tool calls, it can burn through its iteration budget (default: 90 turns) without realizing it's running low. Budget pressure automatically warns the model as it approaches the limit: diff --git a/website/docs/user-guide/features/memory-providers.md b/website/docs/user-guide/features/memory-providers.md index e76a05414ff..f9db4ab5777 100644 --- a/website/docs/user-guide/features/memory-providers.md +++ b/website/docs/user-guide/features/memory-providers.md @@ -16,6 +16,8 @@ hermes memory status # check what's active hermes memory off # disable external provider ``` +You can also select the active memory provider via `hermes plugins` → Provider Plugins → Memory Provider. 
+ Or set manually in `~/.hermes/config.yaml`: ```yaml diff --git a/website/docs/user-guide/features/overview.md b/website/docs/user-guide/features/overview.md index 9d9c7b2c507..2d26e153ae7 100644 --- a/website/docs/user-guide/features/overview.md +++ b/website/docs/user-guide/features/overview.md @@ -48,4 +48,4 @@ Hermes Agent includes a rich set of capabilities that extend far beyond basic ch - **[Personality & SOUL.md](personality.md)** — Fully customizable agent personality. `SOUL.md` is the primary identity file — the first thing in the system prompt — and you can swap in built-in or custom `/personality` presets per session. - **[Skins & Themes](skins.md)** — Customize the CLI's visual presentation: banner colors, spinner faces and verbs, response-box labels, branding text, and the tool activity prefix. -- **[Plugins](plugins.md)** — Add custom tools, hooks, and integrations without modifying core code. Drop a directory into `~/.hermes/plugins/` with a `plugin.yaml` and Python code. +- **[Plugins](plugins.md)** — Add custom tools, hooks, and integrations without modifying core code. Three plugin types: general plugins (tools/hooks), memory providers (cross-session knowledge), and context engines (alternative context management). Managed via the unified `hermes plugins` interactive UI. diff --git a/website/docs/user-guide/features/plugins.md b/website/docs/user-guide/features/plugins.md index a8f984fed4a..b7352c629cb 100644 --- a/website/docs/user-guide/features/plugins.md +++ b/website/docs/user-guide/features/plugins.md @@ -111,10 +111,22 @@ Plugins can register callbacks for these lifecycle events. 
See the **[Event Hook | [`on_session_start`](/docs/user-guide/features/hooks#on_session_start) | New session created (first turn only) | | [`on_session_end`](/docs/user-guide/features/hooks#on_session_end) | End of every `run_conversation` call + CLI exit handler | +## Plugin types + +Hermes has three kinds of plugins: + +| Type | What it does | Selection | Location | +|------|-------------|-----------|----------| +| **General plugins** | Add tools, hooks, CLI commands | Multi-select (enable/disable) | `~/.hermes/plugins/` | +| **Memory providers** | Replace or augment built-in memory | Single-select (one active) | `plugins/memory/` | +| **Context engines** | Replace the built-in context compressor | Single-select (one active) | `plugins/context_engine/` | + +Memory providers and context engines are **provider plugins** — only one of each type can be active at a time. General plugins can be enabled in any combination. + ## Managing plugins ```bash -hermes plugins # interactive toggle UI — enable/disable with checkboxes +hermes plugins # unified interactive UI hermes plugins list # table view with enabled/disabled status hermes plugins install user/repo # install from Git hermes plugins update my-plugin # pull latest @@ -123,7 +135,37 @@ hermes plugins enable my-plugin # re-enable a disabled plugin hermes plugins disable my-plugin # disable without removing ``` -Running `hermes plugins` with no arguments launches an interactive curses checklist (same UI as `hermes tools`) where you can toggle plugins on/off with arrow keys and space. 
+### Interactive UI + +Running `hermes plugins` with no arguments opens a composite interactive screen: + +``` +Plugins + ↑↓ navigate SPACE toggle ENTER configure/confirm ESC done + + General Plugins + → [✓] my-tool-plugin — Custom search tool + [ ] webhook-notifier — Event hooks + + Provider Plugins + Memory Provider ▸ honcho + Context Engine ▸ compressor +``` + +- **General Plugins section** — checkboxes, toggle with SPACE +- **Provider Plugins section** — shows current selection. Press ENTER to drill into a radio picker where you choose one active provider. + +Provider plugin selections are saved to `config.yaml`: + +```yaml +memory: + provider: "honcho" # empty string = built-in only + +context: + engine: "compressor" # default built-in compressor +``` + +### Disabling general plugins Disabled plugins remain installed but are skipped during loading. The disabled list is stored in `config.yaml` under `plugins.disabled`: diff --git a/website/sidebars.ts b/website/sidebars.ts index 87538359617..52fd589c7f6 100644 --- a/website/sidebars.ts +++ b/website/sidebars.ts @@ -176,6 +176,7 @@ const sidebars: SidebarsConfig = { 'developer-guide/adding-tools', 'developer-guide/adding-providers', 'developer-guide/memory-provider-plugin', + 'developer-guide/context-engine-plugin', 'developer-guide/creating-skills', 'developer-guide/extending-the-cli', ],