Files
hermes-agent/website/docs/developer-guide/adding-platform-adapters.md

404 lines
14 KiB
Markdown
Raw Normal View History

---
sidebar_position: 9
---
# Adding a Platform Adapter
This guide covers adding a new messaging platform to the Hermes gateway. A platform adapter connects Hermes to an external messaging service (Telegram, Discord, WeCom, etc.) so users can interact with the agent through that service.
:::tip
There are two ways to add a platform:
- **Plugin** (recommended for community/third-party): Drop a plugin directory into `~/.hermes/plugins/` — zero core code changes needed. See [Plugin Path](#plugin-path-recommended) below.
- **Built-in**: Modify 20+ files across code, config, and docs. Use the [Built-in Checklist](#step-by-step-checklist) below.
:::
## Architecture Overview
```
User ↔ Messaging Platform ↔ Platform Adapter ↔ Gateway Runner ↔ AIAgent
```
Every adapter extends `BasePlatformAdapter` from `gateway/platforms/base.py` and implements:
docs: resync reference, user-guide, developer-guide, and messaging pages against code (#17738) Broad drift audit against origin/main (b52b63396). Reference pages (most user-visible drift): - slash-commands: add /busy, /curator, /footer, /indicator, /redraw, /steer that were missing; drop non-existent /terminal-setup; fix /q footnote (resolves to /queue, not /quit); extend CLI-only list with all 24 CLI-only commands in the registry - cli-commands: add dedicated sections for hermes curator / fallback / hooks (new subcommands not previously documented); remove stale hermes honcho standalone section (the plugin registers dynamically via hermes memory); list curator/fallback/hooks in top-level table; fix completion to include fish - toolsets-reference: document the real 52-toolset count; split browser vs browser-cdp; add discord / discord_admin / spotify / yuanbao; correct hermes-cli tool count from 36 to 38; fix misleading claim that hermes-homeassistant adds tools (it's identical to hermes-cli) - tools-reference: bump tool count 55 -> 68; add 7 Spotify, 5 Yuanbao, 2 Discord toolsets; move browser_cdp/browser_dialog to their own browser-cdp toolset section - environment-variables: add 40+ user-facing HERMES_* vars that were undocumented (--yolo, --accept-hooks, --ignore-*, inference model override, agent/stream/checkpoint timeouts, OAuth trace, per-platform batch tuning for Telegram/Discord/Matrix/Feishu/WeCom, cron knobs, gateway restart/connect timeouts); dedupe the Cron Scheduler section; replace stale QQ_SANDBOX with QQ_PORTAL_HOST User-guide (top level): - cli.md: compression preserves last 20 turns, not 4 (protect_last_n: 20) - configuration.md: display.platforms is the canonical per-platform override key; tool_progress_overrides is deprecated and auto-migrated - profiles.md: model.default is the config key, not model.model - sessions.md: CLI/TUI session IDs use 6-char hex, gateway uses 8 - checkpoints-and-rollback.md: destructive-command list now matches _DESTRUCTIVE_PATTERNS (adds rmdir, cp, install, dd) - docker.md: the container runs as non-root hermes (UID 10000) via gosu; fix install command (uv pip); add missing --insecure on the dashboard compose example (required for non-loopback bind) - security.md: systemctl danger pattern also matches 'restart' - index.md: built-in tool count 47 -> 68 - integrations/index.md: 6 STT providers, 8 memory providers - integrations/providers.md: drop fictional dashscope/qwen aliases Features: - overview.md: 9 image models (not 8), 9 TTS providers (not 5), 8 memory providers (Supermemory was missing) - tool-gateway.md: 9 image models - tools.md: extend common-toolsets list with search / messaging / spotify / discord / debugging / safe - fallback-providers.md: add 6 real providers from PROVIDER_REGISTRY (lmstudio, kimi-coding-cn, stepfun, alibaba-coding-plan, tencent-tokenhub, azure-foundry) - plugins.md: Available Hooks table now includes on_session_finalize, on_session_reset, subagent_stop - built-in-plugins.md: add the 7 bundled plugins the page didn't mention (spotify, google_meet, three image_gen providers, two dashboard examples) - web-dashboard.md: add --insecure and --tui flags - cron.md: hermes cron create takes positional schedule/prompt, not flags Messaging: - telegram.md: TELEGRAM_WEBHOOK_SECRET is now REQUIRED when TELEGRAM_WEBHOOK_URL is set (gateway refuses to start without it per GHSA-3vpc-7q5r-276h). Biggest user-visible drift in the batch. - discord.md: HERMES_DISCORD_TEXT_BATCH_SPLIT_DELAY_SECONDS default is 2.0, not 0.1 - dingtalk.md: document DINGTALK_REQUIRE_MENTION / FREE_RESPONSE_CHATS / MENTION_PATTERNS / HOME_CHANNEL / ALLOW_ALL_USERS that the adapter supports - bluebubbles.md: drop fictional BLUEBUBBLES_SEND_READ_RECEIPTS env var; the setting lives in platforms.bluebubbles.extra only - qqbot.md: drop dead QQ_SANDBOX; add real QQ_PORTAL_HOST and QQ_GROUP_ALLOWED_USERS - wecom-callback.md: replace 'hermes gateway start' (service-only) with 'hermes gateway' for first-time setup Developer-guide: - architecture.md: refresh tool/toolset counts (61/52), terminal backend count (7), line counts for run_agent.py (~13.7k), cli.py (~11.5k), main.py (~10.4k), setup.py (~3.5k), gateway/run.py (~12.2k), mcp_tool.py (~3.1k); add yuanbao adapter, bump platform adapter count 18 -> 20 - agent-loop.md: run_agent.py line count 10.7k -> 13.7k - tools-runtime.md: add vercel_sandbox backend - adding-tools.md: remove stale 'Discovery import added to model_tools.py' checklist item (registry auto-discovery) - adding-platform-adapters.md: mark send_typing / get_chat_info as concrete base methods; only connect/disconnect/send are abstract - acp-internals.md: ACP sessions now persist to SessionDB (~/.hermes/state.db); acp.run_agent call uses use_unstable_protocol=True - cron-internals.md: gateway runs scheduler in a dedicated background thread via _start_cron_ticker, not on a maintenance cycle; locking is cross-process via fcntl.flock (Unix) / msvcrt.locking (Windows) - gateway-internals.md: gateway/run.py ~12k lines - provider-runtime.md: cron DOES support fallback (run_job reads fallback_providers from config) - session-storage.md: SCHEMA_VERSION = 11 (not 9); add migrations 10 and 11 (trigram FTS, inline-mode FTS5 re-index); add api_call_count column to Sessions DDL; document messages_fts_trigram and state_meta in the architecture tree - context-compression-and-caching.md: remove the obsolete 'context pressure warnings' section (warnings were removed for causing models to give up early) - context-engine-plugin.md: compress() signature now includes focus_topic param - extending-the-cli.md: _build_tui_layout_children signature now includes model_picker_widget; add to default layout Also fixed three pre-existing broken links/anchors the build warned about (docker.md -> api-server.md, yuanbao.md -> cron-jobs.md and tips#background-tasks, nix-setup.md -> #container-aware-cli). Regenerated per-skill pages via website/scripts/generate-skill-docs.py so catalog tables and sidebar are consistent with current SKILL.md frontmatter. docusaurus build: clean, no broken links or anchors.
2026-04-29 20:55:59 -07:00
- **`connect()`** — Establish connection (WebSocket, long-poll, HTTP server, etc.) *(abstract)*
- **`disconnect()`** — Clean shutdown *(abstract)*
- **`send()`** — Send a text message to a chat *(abstract)*
- **`send_typing()`** — Show typing indicator (optional override)
- **`get_chat_info()`** — Return chat metadata (optional override)
Inbound messages are received by the adapter and forwarded via `self.handle_message(event)`, which the base class routes to the gateway runner.
## Plugin Path (Recommended)
The plugin system lets you add a platform adapter without modifying any core Hermes code. Your plugin is a directory with two files:
```
~/.hermes/plugins/my-platform/
PLUGIN.yaml # Plugin metadata
adapter.py # Adapter class + register() entry point
```
### PLUGIN.yaml
```yaml
name: my-platform
version: 1.0.0
description: My custom messaging platform adapter
requires_env:
- MY_PLATFORM_TOKEN
- MY_PLATFORM_CHANNEL
```
### adapter.py
```python
import os
from gateway.platforms.base import (
BasePlatformAdapter, SendResult, MessageEvent, MessageType,
)
from gateway.config import Platform, PlatformConfig
class MyPlatformAdapter(BasePlatformAdapter):
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform("my_platform"))
extra = config.extra or {}
self.token = os.getenv("MY_PLATFORM_TOKEN") or extra.get("token", "")
async def connect(self) -> bool:
# Connect to the platform API, start listeners
self._mark_connected()
return True
async def disconnect(self) -> None:
self._mark_disconnected()
async def send(self, chat_id, content, reply_to=None, metadata=None):
# Send message via platform API
return SendResult(success=True, message_id="...")
async def get_chat_info(self, chat_id):
return {"name": chat_id, "type": "dm"}
def check_requirements() -> bool:
return bool(os.getenv("MY_PLATFORM_TOKEN"))
def validate_config(config) -> bool:
extra = getattr(config, "extra", {}) or {}
return bool(os.getenv("MY_PLATFORM_TOKEN") or extra.get("token"))
def register(ctx):
"""Plugin entry point — called by the Hermes plugin system."""
ctx.register_platform(
name="my_platform",
label="My Platform",
adapter_factory=lambda cfg: MyPlatformAdapter(cfg),
check_fn=check_requirements,
validate_config=validate_config,
required_env=["MY_PLATFORM_TOKEN"],
install_hint="pip install my-platform-sdk",
# Per-platform user authorization env vars
allowed_users_env="MY_PLATFORM_ALLOWED_USERS",
allow_all_env="MY_PLATFORM_ALLOW_ALL_USERS",
# Message length limit for smart chunking (0 = no limit)
max_message_length=4000,
# LLM guidance injected into system prompt
platform_hint=(
"You are chatting via My Platform. "
"It supports markdown formatting."
),
# Display
emoji="💬",
)
# Optional: register platform-specific tools
ctx.register_tool(
name="my_platform_search",
toolset="my_platform",
schema={...},
handler=my_search_handler,
)
```
### Configuration
Users configure the platform in `config.yaml`:
```yaml
gateway:
platforms:
my_platform:
enabled: true
extra:
token: "..."
channel: "#general"
```
Or via environment variables (which the adapter reads in `__init__`).
### What the Plugin System Handles Automatically
When you call `ctx.register_platform()`, the following integration points are handled for you — no core code changes needed:
| Integration point | How it works |
|---|---|
| Gateway adapter creation | Registry checked before built-in if/elif chain |
| Config parsing | `Platform._missing_()` accepts any platform name |
| Connected platform validation | Registry `validate_config()` called |
| User authorization | `allowed_users_env` / `allow_all_env` checked |
| Cron delivery | `Platform()` resolves any registered name |
| send_message tool | Routes through live gateway adapter |
| Webhook cross-platform delivery | Registry checked for known platforms |
| `/update` command access | `allow_update_command` flag |
| Channel directory | Plugin platforms included in enumeration |
| System prompt hints | `platform_hint` injected into LLM context |
| Message chunking | `max_message_length` for smart splitting |
| PII redaction | `pii_safe` flag |
| `hermes status` | Shows plugin platforms with `(plugin)` tag |
| `hermes gateway setup` | Plugin platforms appear in setup menu |
| `hermes tools` / `hermes skills` | Plugin platforms in per-platform config |
| Token lock (multi-profile) | Use `acquire_scoped_lock()` in your `connect()` |
| Orphaned config warning | Descriptive log when plugin is missing |
### Reference Implementation
See `plugins/platforms/irc/` in the repo for a complete working example — a full async IRC adapter with zero external dependencies.
---
## Step-by-Step Checklist (Built-in Path)
:::note
This checklist is for adding a platform directly to the Hermes core codebase — typically done by core contributors for officially supported platforms. Community/third-party platforms should use the [Plugin Path](#plugin-path-recommended) above.
:::
### 1. Platform Enum
Add your platform to the `Platform` enum in `gateway/config.py`:
```python
class Platform(str, Enum):
# ... existing platforms ...
NEWPLAT = "newplat"
```
### 2. Adapter File
Create `gateway/platforms/newplat.py`:
```python
from gateway.config import Platform, PlatformConfig
from gateway.platforms.base import (
BasePlatformAdapter, MessageEvent, MessageType, SendResult,
)
def check_newplat_requirements() -> bool:
"""Return True if dependencies are available."""
return SOME_SDK_AVAILABLE
class NewPlatAdapter(BasePlatformAdapter):
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.NEWPLAT)
# Read config from config.extra dict
extra = config.extra or {}
self._api_key = extra.get("api_key") or os.getenv("NEWPLAT_API_KEY", "")
async def connect(self) -> bool:
# Set up connection, start polling/webhook
self._mark_connected()
return True
async def disconnect(self) -> None:
self._running = False
self._mark_disconnected()
async def send(self, chat_id, content, reply_to=None, metadata=None):
# Send message via platform API
return SendResult(success=True, message_id="...")
async def get_chat_info(self, chat_id):
return {"name": chat_id, "type": "dm"}
```
For inbound messages, build a `MessageEvent` and call `self.handle_message(event)`:
```python
source = self.build_source(
chat_id=chat_id,
chat_name=name,
chat_type="dm", # or "group"
user_id=user_id,
user_name=user_name,
)
event = MessageEvent(
text=content,
message_type=MessageType.TEXT,
source=source,
message_id=msg_id,
)
await self.handle_message(event)
```
### 3. Gateway Config (`gateway/config.py`)
Three touchpoints:
1. **`get_connected_platforms()`** — Add a check for your platform's required credentials
2. **`load_gateway_config()`** — Add token env map entry: `Platform.NEWPLAT: "NEWPLAT_TOKEN"`
3. **`_apply_env_overrides()`** — Map all `NEWPLAT_*` env vars to config
### 4. Gateway Runner (`gateway/run.py`)
Five touchpoints:
1. **`_create_adapter()`** — Add an `elif platform == Platform.NEWPLAT:` branch
2. **`_is_user_authorized()` allowed_users map** — `Platform.NEWPLAT: "NEWPLAT_ALLOWED_USERS"`
3. **`_is_user_authorized()` allow_all map** — `Platform.NEWPLAT: "NEWPLAT_ALLOW_ALL_USERS"`
4. **Early env check `_any_allowlist` tuple** — Add `"NEWPLAT_ALLOWED_USERS"`
5. **Early env check `_allow_all` tuple** — Add `"NEWPLAT_ALLOW_ALL_USERS"`
6. **`_UPDATE_ALLOWED_PLATFORMS` frozenset** — Add `Platform.NEWPLAT`
### 5. Cross-Platform Delivery
1. **`gateway/platforms/webhook.py`** — Add `"newplat"` to the delivery type tuple
2. **`cron/scheduler.py`** — Add to `_KNOWN_DELIVERY_PLATFORMS` frozenset and `_deliver_result()` platform map
### 6. CLI Integration
1. **`hermes_cli/config.py`** — Add all `NEWPLAT_*` vars to `_EXTRA_ENV_KEYS`
2. **`hermes_cli/gateway.py`** — Add entry to `_PLATFORMS` list with key, label, emoji, token_var, setup_instructions, and vars
3. **`hermes_cli/platforms.py`** — Add `PlatformInfo` entry with label and default_toolset (used by `skills_config` and `tools_config` TUIs)
4. **`hermes_cli/setup.py`** — Add `_setup_newplat()` function (can delegate to `gateway.py`) and add tuple to the messaging platforms list
5. **`hermes_cli/status.py`** — Add platform detection entry: `"NewPlat": ("NEWPLAT_TOKEN", "NEWPLAT_HOME_CHANNEL")`
6. **`hermes_cli/dump.py`** — Add `"newplat": "NEWPLAT_TOKEN"` to platform detection dict
### 7. Tools
1. **`tools/send_message_tool.py`** — Add `"newplat": Platform.NEWPLAT` to platform map
2. **`tools/cronjob_tools.py`** — Add `newplat` to the delivery target description string
### 8. Toolsets
1. **`toolsets.py`** — Add `"hermes-newplat"` toolset definition with `_HERMES_CORE_TOOLS`
2. **`toolsets.py`** — Add `"hermes-newplat"` to the `"hermes-gateway"` includes list
### 9. Optional: Platform Hints
**`agent/prompt_builder.py`** — If your platform has specific rendering limitations (no markdown, message length limits, etc.), add an entry to the `_PLATFORM_HINTS` dict. This injects platform-specific guidance into the system prompt:
```python
_PLATFORM_HINTS = {
# ...
"newplat": (
"You are chatting via NewPlat. It supports markdown formatting "
"but has a 4000-character message limit."
),
}
```
Not all platforms need hints — only add one if the agent's behavior should differ.
### 10. Tests
Create `tests/gateway/test_newplat.py` covering:
- Adapter construction from config
- Message event building
- Send method (mock the external API)
- Platform-specific features (encryption, routing, etc.)
### 11. Documentation
| File | What to add |
|------|-------------|
| `website/docs/user-guide/messaging/newplat.md` | Full platform setup page |
| `website/docs/user-guide/messaging/index.md` | Platform comparison table, architecture diagram, toolsets table, security section, next-steps link |
| `website/docs/reference/environment-variables.md` | All NEWPLAT_* env vars |
| `website/docs/reference/toolsets-reference.md` | hermes-newplat toolset |
| `website/docs/integrations/index.md` | Platform link |
| `website/sidebars.ts` | Sidebar entry for the docs page |
| `website/docs/developer-guide/architecture.md` | Adapter count + listing |
| `website/docs/developer-guide/gateway-internals.md` | Adapter file listing |
## Parity Audit
Before marking a new platform PR as complete, run a parity audit against an established platform:
```bash
# Find every .py file mentioning the reference platform
search_files "bluebubbles" output_mode="files_only" file_glob="*.py"
# Find every .py file mentioning the new platform
search_files "newplat" output_mode="files_only" file_glob="*.py"
# Any file in the first set but not the second is a potential gap
```
Repeat for `.md` and `.ts` files. Investigate each gap — is it a platform enumeration (needs updating) or a platform-specific reference (skip)?
## Common Patterns
### Long-Poll Adapters
If your adapter uses long-polling (like Telegram or Weixin), use a polling loop task:
```python
async def connect(self):
self._poll_task = asyncio.create_task(self._poll_loop())
self._mark_connected()
async def _poll_loop(self):
while self._running:
messages = await self._fetch_updates()
for msg in messages:
await self.handle_message(self._build_event(msg))
```
### Callback/Webhook Adapters
If the platform pushes messages to your endpoint (like WeCom Callback), run an HTTP server:
```python
async def connect(self):
self._app = web.Application()
self._app.router.add_post("/callback", self._handle_callback)
# ... start aiohttp server
self._mark_connected()
async def _handle_callback(self, request):
event = self._build_event(await request.text())
await self._message_queue.put(event)
return web.Response(text="success") # Acknowledge immediately
```
For platforms with tight response deadlines (e.g., WeCom's 5-second limit), always acknowledge immediately and deliver the agent's reply proactively via API later. Agent sessions run 330 minutes — inline replies within a callback response window are not feasible.
### Token Locks
If the adapter holds a persistent connection with a unique credential, add a scoped lock to prevent two profiles from using the same credential:
```python
from gateway.status import acquire_scoped_lock, release_scoped_lock
async def connect(self):
if not acquire_scoped_lock("newplat", self._token):
logger.error("Token already in use by another profile")
return False
# ... connect
async def disconnect(self):
release_scoped_lock("newplat", self._token)
```
## Reference Implementations
| Adapter | Pattern | Complexity | Good reference for |
|---------|---------|------------|-------------------|
| `bluebubbles.py` | REST + webhook | Medium | Simple REST API integration |
| `weixin.py` | Long-poll + CDN | High | Media handling, encryption |
| `wecom_callback.py` | Callback/webhook | Medium | HTTP server, AES crypto, multi-app |
| `telegram.py` | Long-poll + Bot API | High | Full-featured adapter with groups, threads |