Covers ~60 merged PRs from Apr 15–29 that shipped user-visible behavior without docs coverage. No functional code changes; docs + static manifest regeneration only. Highlights: Stale / incorrect: - configuration.md: auxiliary auto-routing line was wrong since #11900; now correctly states auto routes to the main model, with a note on the cost trade-off and per-task override pattern. - integrations/providers.md + configuration.md compression intro: removed stale 'Gemini Flash via OpenRouter' claim. - website/static/api/model-catalog.json: rebuilt from hermes_cli/models.py so the live manifest picks up tencent/hy3-preview (and remains in sync for future model-catalog PRs). Platform messaging (#17417 #16997 #16193 #14315 #13151 #11794 #10610 #10283 #10246 #11564 #13178): - Signal: native formatting (bodyRanges), reply quotes, reactions. - Telegram: table rendering (bullets + code-block fallback), disable_link_previews, group_allowed_chats. - Slack: strict_mention config. - Discord: slash_commands disable, send_animation GIF, send_message native media attachments. - DingTalk: require_mention + allowed_users. CLI (#16052 #16539 #16566 #15841 #14798 #10043): - New 'hermes fallback' interactive manager. - New 'hermes update --check', '--backup' flag, and pre-update pairing snapshot behavior. - 'hermes gateway start/restart --all' multi-profile flag. - cron.md: 'hermes tools' as a platform, per-job enabled_toolsets, wakeAgent gate, context_from chaining. Config keys / env vars (#17305 #17026 #17000 #15077 #14557 #14227 #14166 #14730 #17008): - terminal.docker_run_as_host_user, display.runtime_metadata_footer, compression.hygiene_hard_message_limit, HINDSIGHT_TIMEOUT, skills.guard_agent_created, TAVILY_BASE_URL, security.allow_private_urls, agent.api_max_retries, gateway hot-reload of compression/context_length config edits. TUI / CLI UX (#17130 #17113 #17175 #17150 #16707 #12312 #12305 #12934 #14810 #14045 #17286 #17126): - HERMES_TUI_RESUME, HERMES_TUI_THEME, LaTeX rendering, busy-indicator styles, ctrl-x queued-message delete, git branch in status bar, per- prompt elapsed stopwatch, external-editor keybind, markdown stripping, TUI voice-mode parity, /agents overlay, /reload + /mouse. Gateway features (#16506 #15027 #13428 #12116): - Native multimodal image routing based on vision capability. - /usage account-limits section. - /steer slash command (added to reference + explanation in CLI). Plugins / hooks (#12929 #12972 #10763 #16364): - transform_tool_result, transform_terminal_output plugin hooks. - PluginContext.dispatch_tool() documented with slash-command example. - google_meet bundled plugin entry under built-in-plugins.md. Other (#16576 #16572 #16383 #15878 #15608 #15606 #14809 #14767 #14231 #14232 #14307 #13683 #12373 #11891 #11291 #10066): - hermes backup exclusions (WAL/SHM/journal + checkpoints/). - security.md hardline blocklist (floor below --yolo). - FHS install layout for root installs. - openssh-client + docker-cli baked into the Docker image. - MEDIA: tag supported extensions table (docs/office/archives/pdf). - Remote-to-host file sync on SSH/Modal/Daytona teardown. - 'hermes model' -> Configure Auxiliary Models interactive picker. - Podman support via HERMES_DOCKER_BINARY. Providers / STT / one-shot (#15045 #14473 #15704): - alibaba-coding-plan first-class provider entry. - xAI Grok STT as a 6th transcription option. - 'hermes -z' scripted one-shot mode + HERMES_INFERENCE_MODEL. Build: 'docusaurus build' succeeds. No new broken links/anchors; pre-existing warnings unchanged.
51 KiB
sidebar_position, title, description
| sidebar_position | title | description |
|---|---|---|
| 6 | Event Hooks | Run custom code at key lifecycle points — log activity, send alerts, post to webhooks |
Event Hooks
Hermes has three hook systems that run custom code at key lifecycle points:
| System | Registered via | Runs in | Use case |
|---|---|---|---|
| Gateway hooks | HOOK.yaml + handler.py in ~/.hermes/hooks/ |
Gateway only | Logging, alerts, webhooks |
| Plugin hooks | ctx.register_hook() in a plugin |
CLI + Gateway | Tool interception, metrics, guardrails |
| Shell hooks | hooks: block in ~/.hermes/config.yaml pointing at shell scripts |
CLI + Gateway | Drop-in scripts for blocking, auto-formatting, context injection |
All three systems are non-blocking — errors in any hook are caught and logged, never crashing the agent.
Gateway Event Hooks
Gateway hooks fire automatically during gateway operation (Telegram, Discord, Slack, WhatsApp) without blocking the main agent pipeline.
Creating a Hook
Each hook is a directory under ~/.hermes/hooks/ containing two files:
~/.hermes/hooks/
└── my-hook/
├── HOOK.yaml # Declares which events to listen for
└── handler.py # Python handler function
HOOK.yaml
name: my-hook
description: Log all agent activity to a file
events:
- agent:start
- agent:end
- agent:step
The events list determines which events trigger your handler. You can subscribe to any combination of events, including wildcards like command:*.
handler.py
import json
from datetime import datetime
from pathlib import Path
LOG_FILE = Path.home() / ".hermes" / "hooks" / "my-hook" / "activity.log"
async def handle(event_type: str, context: dict):
"""Called for each subscribed event. Must be named 'handle'."""
entry = {
"timestamp": datetime.now().isoformat(),
"event": event_type,
**context,
}
with open(LOG_FILE, "a") as f:
f.write(json.dumps(entry) + "\n")
Handler rules:
- Must be named
handle - Receives
event_type(string) andcontext(dict) - Can be
async defor regulardef— both work - Errors are caught and logged, never crashing the agent
Available Events
| Event | When it fires | Context keys |
|---|---|---|
gateway:startup |
Gateway process starts | platforms (list of active platform names) |
session:start |
New messaging session created | platform, user_id, session_id, session_key |
session:end |
Session ended (before reset) | platform, user_id, session_key |
session:reset |
User ran /new or /reset |
platform, user_id, session_key |
agent:start |
Agent begins processing a message | platform, user_id, session_id, message |
agent:step |
Each iteration of the tool-calling loop | platform, user_id, session_id, iteration, tool_names |
agent:end |
Agent finishes processing | platform, user_id, session_id, message, response |
command:* |
Any slash command executed | platform, user_id, command, args |
Wildcard Matching
Handlers registered for command:* fire for any command: event (command:model, command:reset, etc.). Monitor all slash commands with a single subscription.
Examples
Telegram Alert on Long Tasks
Send yourself a message when the agent takes more than 10 steps:
# ~/.hermes/hooks/long-task-alert/HOOK.yaml
name: long-task-alert
description: Alert when agent is taking many steps
events:
- agent:step
# ~/.hermes/hooks/long-task-alert/handler.py
import os
import httpx
THRESHOLD = 10
BOT_TOKEN = os.getenv("TELEGRAM_BOT_TOKEN")
CHAT_ID = os.getenv("TELEGRAM_HOME_CHANNEL")
async def handle(event_type: str, context: dict):
iteration = context.get("iteration", 0)
if iteration == THRESHOLD and BOT_TOKEN and CHAT_ID:
tools = ", ".join(context.get("tool_names", []))
text = f"⚠️ Agent has been running for {iteration} steps. Last tools: {tools}"
async with httpx.AsyncClient() as client:
await client.post(
f"https://api.telegram.org/bot{BOT_TOKEN}/sendMessage",
json={"chat_id": CHAT_ID, "text": text},
)
Command Usage Logger
Track which slash commands are used:
# ~/.hermes/hooks/command-logger/HOOK.yaml
name: command-logger
description: Log slash command usage
events:
- command:*
# ~/.hermes/hooks/command-logger/handler.py
import json
from datetime import datetime
from pathlib import Path
LOG = Path.home() / ".hermes" / "logs" / "command_usage.jsonl"
def handle(event_type: str, context: dict):
LOG.parent.mkdir(parents=True, exist_ok=True)
entry = {
"ts": datetime.now().isoformat(),
"command": context.get("command"),
"args": context.get("args"),
"platform": context.get("platform"),
"user": context.get("user_id"),
}
with open(LOG, "a") as f:
f.write(json.dumps(entry) + "\n")
Session Start Webhook
POST to an external service on new sessions:
# ~/.hermes/hooks/session-webhook/HOOK.yaml
name: session-webhook
description: Notify external service on new sessions
events:
- session:start
- session:reset
# ~/.hermes/hooks/session-webhook/handler.py
import httpx
WEBHOOK_URL = "https://your-service.example.com/hermes-events"
async def handle(event_type: str, context: dict):
async with httpx.AsyncClient() as client:
await client.post(WEBHOOK_URL, json={
"event": event_type,
**context,
}, timeout=5)
Tutorial: BOOT.md — Run a Startup Checklist on Every Gateway Boot
A popular pattern from the community: drop a Markdown checklist at ~/.hermes/BOOT.md, and have the agent run it once every time the gateway starts. Useful for "on every boot, check overnight cron failures and ping me on Discord if anything failed," or "summarize the last 24h of deploy.log and post it to Slack #ops."
This tutorial shows how to build it yourself as a user-defined hook. Hermes does not ship a built-in BOOT.md hook — you wire up exactly the behavior you want.
What we're building
- A file at
~/.hermes/BOOT.mdwith natural-language startup instructions. - A gateway hook that fires on
gateway:startup, spawns a one-shot agent with your gateway's resolved model/credentials, and runs the BOOT.md instructions. - A
[SILENT]convention so the agent can opt out of sending a message when there's nothing to report.
Step 1: Write your checklist
Create ~/.hermes/BOOT.md. Write it as if you were giving instructions to a human assistant:
# Startup Checklist
1. Run `hermes cron list` and check if any scheduled jobs failed overnight.
2. If any failed, send a summary to Discord #ops using the `send_message` tool.
3. Check if `/opt/app/deploy.log` has any ERROR lines from the last 24 hours. If yes, summarize them and include in the same Discord message.
4. If nothing went wrong, reply with only `[SILENT]` so no message is sent.
The agent sees this as part of its prompt, so anything you can describe in plain language works — tool calls, shell commands, sending messages, summarizing files.
Step 2: Create the hook
~/.hermes/hooks/boot-md/
├── HOOK.yaml
└── handler.py
~/.hermes/hooks/boot-md/HOOK.yaml
name: boot-md
description: Run ~/.hermes/BOOT.md on gateway startup
events:
- gateway:startup
~/.hermes/hooks/boot-md/handler.py
"""Run ~/.hermes/BOOT.md on every gateway startup."""
import logging
import threading
from pathlib import Path
logger = logging.getLogger("hooks.boot-md")
BOOT_FILE = Path.home() / ".hermes" / "BOOT.md"
def _build_prompt(content: str) -> str:
return (
"You are running a startup boot checklist. Follow the instructions "
"below exactly.\n\n"
"---\n"
f"{content}\n"
"---\n\n"
"Execute each instruction. Use the send_message tool to deliver any "
"messages to platforms like Discord or Slack.\n"
"If nothing needs attention and there is nothing to report, reply "
"with ONLY: [SILENT]"
)
def _run_boot_agent(content: str) -> None:
"""Spawn a one-shot agent and execute the checklist.
Uses the gateway's resolved model and runtime credentials so this works
against custom endpoints, aggregators, and OAuth-based providers alike.
"""
try:
from gateway.run import _resolve_gateway_model, _resolve_runtime_agent_kwargs
from run_agent import AIAgent
agent = AIAgent(
model=_resolve_gateway_model(),
**_resolve_runtime_agent_kwargs(),
platform="gateway",
quiet_mode=True,
skip_context_files=True,
skip_memory=True,
max_iterations=20,
)
result = agent.run_conversation(_build_prompt(content))
response = result.get("final_response", "")
if response and "[SILENT]" not in response:
logger.info("boot-md completed: %s", response[:200])
else:
logger.info("boot-md completed (nothing to report)")
except Exception as e:
logger.error("boot-md agent failed: %s", e)
async def handle(event_type: str, context: dict) -> None:
if not BOOT_FILE.exists():
return
content = BOOT_FILE.read_text(encoding="utf-8").strip()
if not content:
return
logger.info("Running BOOT.md (%d chars)", len(content))
# Background thread so gateway startup isn't blocked on a full agent turn.
thread = threading.Thread(
target=_run_boot_agent,
args=(content,),
name="boot-md",
daemon=True,
)
thread.start()
The two key lines:
_resolve_gateway_model()reads the gateway's currently-configured model._resolve_runtime_agent_kwargs()resolves provider credentials the same way a normal gateway turn does — including API keys, base URLs, OAuth tokens, and credential pools.
Without these, a bare AIAgent() falls back to built-in defaults and will 401 against any non-default endpoint.
Step 3: Test it
Restart the gateway:
hermes gateway restart
Watch the logs:
hermes logs --follow --level INFO | grep boot-md
You should see Running BOOT.md (N chars) followed by either boot-md completed: ... (summary of what the agent did) or boot-md completed (nothing to report) when the agent replied [SILENT].
Delete ~/.hermes/BOOT.md to disable the checklist — the hook stays loaded but silently skips when the file isn't there.
Extending the pattern
- Schedule-aware checklists: key off
datetime.now().weekday()inside BOOT.md's instructions ("if it's Monday, also check the weekly deploy log"). The instructions are free-form text, so anything the agent can reason about is fair game. - Multiple checklists: point the hook at a different file (
STARTUP.md,MORNING.md, etc.) and register separate hook directories for each. - Non-agent variant: if you don't need a full agent loop, skip
AIAgententirely and have the handler post a fixed notification directly viahttpx. Cheaper, faster, and has no provider dependency.
Why this isn't a built-in
An earlier version of Hermes shipped this as a built-in hook and silently spawned an agent with bare defaults on every gateway boot. That surprised users with custom endpoints and made the feature invisible to users who didn't know it was running. Keeping it as a documented pattern — built by you, in your hooks directory — means you see exactly what it does and opt in by writing the files.
How It Works
- On gateway startup,
HookRegistry.discover_and_load()scans~/.hermes/hooks/ - Each subdirectory with
HOOK.yaml+handler.pyis loaded dynamically - Handlers are registered for their declared events
- At each lifecycle point,
hooks.emit()fires all matching handlers - Errors in any handler are caught and logged — a broken hook never crashes the agent
:::info Gateway hooks only fire in the gateway (Telegram, Discord, Slack, WhatsApp). The CLI does not load gateway hooks. For hooks that work everywhere, use plugin hooks. :::
Plugin Hooks
Plugins can register hooks that fire in both CLI and gateway sessions. These are registered programmatically via ctx.register_hook() in your plugin's register() function.
def register(ctx):
ctx.register_hook("pre_tool_call", my_tool_observer)
ctx.register_hook("post_tool_call", my_tool_logger)
ctx.register_hook("pre_llm_call", my_memory_callback)
ctx.register_hook("post_llm_call", my_sync_callback)
ctx.register_hook("on_session_start", my_init_callback)
ctx.register_hook("on_session_end", my_cleanup_callback)
General rules for all hooks:
- Callbacks receive keyword arguments. Always accept
**kwargsfor forward compatibility — new parameters may be added in future versions without breaking your plugin. - If a callback crashes, it's logged and skipped. Other hooks and the agent continue normally. A misbehaving plugin can never break the agent.
- Two hooks' return values affect behavior:
pre_tool_callcan block the tool, andpre_llm_callcan inject context into the LLM call. All other hooks are fire-and-forget observers.
Quick reference
| Hook | Fires when | Returns |
|---|---|---|
pre_tool_call |
Before any tool executes | {"action": "block", "message": str} to veto the call |
post_tool_call |
After any tool returns | ignored |
pre_llm_call |
Once per turn, before the tool-calling loop | {"context": str} to prepend context to the user message |
post_llm_call |
Once per turn, after the tool-calling loop | ignored |
on_session_start |
New session created (first turn only) | ignored |
on_session_end |
Session ends | ignored |
on_session_finalize |
CLI/gateway tears down an active session (flush, save, stats) | ignored |
on_session_reset |
Gateway swaps in a fresh session key (e.g. /new, /reset) |
ignored |
subagent_stop |
A delegate_task child has exited |
ignored |
pre_gateway_dispatch |
Gateway received a user message, before auth + dispatch | {"action": "skip" | "rewrite" | "allow", ...} to influence flow |
pre_approval_request |
Dangerous command needs user approval, before the prompt/notification is sent | ignored |
post_approval_response |
User responded to an approval prompt (or it timed out) | ignored |
transform_tool_result |
After any tool returns, before the result is handed back to the model | str to replace the result, None to leave unchanged |
transform_terminal_output |
Inside the terminal tool, before truncation/ANSI-strip/redact |
str to replace the raw output, None to leave unchanged |
pre_tool_call
Fires immediately before every tool execution — built-in tools and plugin tools alike.
Callback signature:
def my_callback(tool_name: str, args: dict, task_id: str, **kwargs):
| Parameter | Type | Description |
|---|---|---|
tool_name |
str |
Name of the tool about to execute (e.g. "terminal", "web_search", "read_file") |
args |
dict |
The arguments the model passed to the tool |
task_id |
str |
Session/task identifier. Empty string if not set. |
Fires: In model_tools.py, inside handle_function_call(), before the tool's handler runs. Fires once per tool call — if the model calls 3 tools in parallel, this fires 3 times.
Return value — veto the call:
return {"action": "block", "message": "Reason the tool call was blocked"}
The agent short-circuits the tool with message as the error returned to the model. The first matching block directive wins (Python plugins registered first, then shell hooks). Any other return value is ignored, so existing observer-only callbacks keep working unchanged.
Use cases: Logging, audit trails, tool call counters, blocking dangerous operations, rate limiting, per-user policy enforcement.
Example — tool call audit log:
import json, logging
from datetime import datetime
logger = logging.getLogger(__name__)
def audit_tool_call(tool_name, args, task_id, **kwargs):
logger.info("TOOL_CALL session=%s tool=%s args=%s",
task_id, tool_name, json.dumps(args)[:200])
def register(ctx):
ctx.register_hook("pre_tool_call", audit_tool_call)
Example — warn on dangerous tools:
DANGEROUS = {"terminal", "write_file", "patch"}
def warn_dangerous(tool_name, **kwargs):
if tool_name in DANGEROUS:
print(f"⚠ Executing potentially dangerous tool: {tool_name}")
def register(ctx):
ctx.register_hook("pre_tool_call", warn_dangerous)
post_tool_call
Fires immediately after every tool execution returns.
Callback signature:
def my_callback(tool_name: str, args: dict, result: str, task_id: str,
duration_ms: int, **kwargs):
| Parameter | Type | Description |
|---|---|---|
tool_name |
str |
Name of the tool that just executed |
args |
dict |
The arguments the model passed to the tool |
result |
str |
The tool's return value (always a JSON string) |
task_id |
str |
Session/task identifier. Empty string if not set. |
duration_ms |
int |
How long the tool's dispatch took, in milliseconds (measured with time.monotonic() around registry.dispatch()). |
Fires: In model_tools.py, inside handle_function_call(), after the tool's handler returns. Fires once per tool call. Does not fire if the tool raised an unhandled exception (the error is caught and returned as an error JSON string instead, and post_tool_call fires with that error string as result).
Return value: Ignored.
Use cases: Logging tool results, metrics collection, tracking tool success/failure rates, latency dashboards, per-tool budget alerts, sending notifications when specific tools complete.
Example — track tool usage metrics:
from collections import Counter, defaultdict
import json
_tool_counts = Counter()
_error_counts = Counter()
_latency_ms = defaultdict(list)
def track_metrics(tool_name, result, duration_ms=0, **kwargs):
_tool_counts[tool_name] += 1
_latency_ms[tool_name].append(duration_ms)
try:
parsed = json.loads(result)
if "error" in parsed:
_error_counts[tool_name] += 1
except (json.JSONDecodeError, TypeError):
pass
def register(ctx):
ctx.register_hook("post_tool_call", track_metrics)
pre_llm_call
Fires once per turn, before the tool-calling loop begins. This is the only hook whose return value is used — it can inject context into the current turn's user message.
Callback signature:
def my_callback(session_id: str, user_message: str, conversation_history: list,
is_first_turn: bool, model: str, platform: str, **kwargs):
| Parameter | Type | Description |
|---|---|---|
session_id |
str |
Unique identifier for the current session |
user_message |
str |
The user's original message for this turn (before any skill injection) |
conversation_history |
list |
Copy of the full message list (OpenAI format: [{"role": "user", "content": "..."}]) |
is_first_turn |
bool |
True if this is the first turn of a new session, False on subsequent turns |
model |
str |
The model identifier (e.g. "anthropic/claude-sonnet-4.6") |
platform |
str |
Where the session is running: "cli", "telegram", "discord", etc. |
Fires: In run_agent.py, inside run_conversation(), after context compression but before the main while loop. Fires once per run_conversation() call (i.e. once per user turn), not once per API call within the tool loop.
Return value: If the callback returns a dict with a "context" key, or a plain non-empty string, the text is appended to the current turn's user message. Return None for no injection.
# Inject context
return {"context": "Recalled memories:\n- User likes Python\n- Working on hermes-agent"}
# Plain string (equivalent)
return "Recalled memories:\n- User likes Python"
# No injection
return None
Where context is injected: Always the user message, never the system prompt. This preserves the prompt cache — the system prompt stays identical across turns, so cached tokens are reused. The system prompt is Hermes's territory (model guidance, tool enforcement, personality, skills). Plugins contribute context alongside the user's input.
All injected context is ephemeral — added at API call time only. The original user message in the conversation history is never mutated, and nothing is persisted to the session database.
When multiple plugins return context, their outputs are joined with double newlines in plugin discovery order (alphabetical by directory name).
Use cases: Memory recall, RAG context injection, guardrails, per-turn analytics.
Example — memory recall:
import httpx
MEMORY_API = "https://your-memory-api.example.com"
def recall(session_id, user_message, is_first_turn, **kwargs):
try:
resp = httpx.post(f"{MEMORY_API}/recall", json={
"session_id": session_id,
"query": user_message,
}, timeout=3)
memories = resp.json().get("results", [])
if not memories:
return None
text = "Recalled context:\n" + "\n".join(f"- {m['text']}" for m in memories)
return {"context": text}
except Exception:
return None
def register(ctx):
ctx.register_hook("pre_llm_call", recall)
Example — guardrails:
POLICY = "Never execute commands that delete files without explicit user confirmation."
def guardrails(**kwargs):
return {"context": POLICY}
def register(ctx):
ctx.register_hook("pre_llm_call", guardrails)
post_llm_call
Fires once per turn, after the tool-calling loop completes and the agent has produced a final response. Only fires on successful turns — does not fire if the turn was interrupted.
Callback signature:
def my_callback(session_id: str, user_message: str, assistant_response: str,
conversation_history: list, model: str, platform: str, **kwargs):
| Parameter | Type | Description |
|---|---|---|
session_id |
str |
Unique identifier for the current session |
user_message |
str |
The user's original message for this turn |
assistant_response |
str |
The agent's final text response for this turn |
conversation_history |
list |
Copy of the full message list after the turn completed |
model |
str |
The model identifier |
platform |
str |
Where the session is running |
Fires: In run_agent.py, inside run_conversation(), after the tool loop exits with a final response. Guarded by if final_response and not interrupted — so it does not fire when the user interrupts mid-turn or the agent hits the iteration limit without producing a response.
Return value: Ignored.
Use cases: Syncing conversation data to an external memory system, computing response quality metrics, logging turn summaries, triggering follow-up actions.
Example — sync to external memory:
import httpx
MEMORY_API = "https://your-memory-api.example.com"
def sync_memory(session_id, user_message, assistant_response, **kwargs):
try:
httpx.post(f"{MEMORY_API}/store", json={
"session_id": session_id,
"user": user_message,
"assistant": assistant_response,
}, timeout=5)
except Exception:
pass # best-effort
def register(ctx):
ctx.register_hook("post_llm_call", sync_memory)
Example — track response lengths:
import logging
logger = logging.getLogger(__name__)
def log_response_length(session_id, assistant_response, model, **kwargs):
logger.info("RESPONSE session=%s model=%s chars=%d",
session_id, model, len(assistant_response or ""))
def register(ctx):
ctx.register_hook("post_llm_call", log_response_length)
on_session_start
Fires once when a brand-new session is created. Does not fire on session continuation (when the user sends a second message in an existing session).
Callback signature:
def my_callback(session_id: str, model: str, platform: str, **kwargs):
| Parameter | Type | Description |
|---|---|---|
session_id |
str |
Unique identifier for the new session |
model |
str |
The model identifier |
platform |
str |
Where the session is running |
Fires: In run_agent.py, inside run_conversation(), during the first turn of a new session — specifically after the system prompt is built but before the tool loop starts. The check is if not conversation_history (no prior messages = new session).
Return value: Ignored.
Use cases: Initializing session-scoped state, warming caches, registering the session with an external service, logging session starts.
Example — initialize a session cache:
_session_caches = {}
def init_session(session_id, model, platform, **kwargs):
_session_caches[session_id] = {
"model": model,
"platform": platform,
"tool_calls": 0,
"started": __import__("datetime").datetime.now().isoformat(),
}
def register(ctx):
ctx.register_hook("on_session_start", init_session)
on_session_end
Fires at the very end of every run_conversation() call, regardless of outcome. Also fires from the CLI's exit handler if the agent was mid-turn when the user quit.
Callback signature:
def my_callback(session_id: str, completed: bool, interrupted: bool,
model: str, platform: str, **kwargs):
| Parameter | Type | Description |
|---|---|---|
session_id |
str |
Unique identifier for the session |
completed |
bool |
True if the agent produced a final response, False otherwise |
interrupted |
bool |
True if the turn was interrupted (user sent new message, /stop, or quit) |
model |
str |
The model identifier |
platform |
str |
Where the session is running |
Fires: In two places:
run_agent.py— at the end of everyrun_conversation()call, after all cleanup. Always fires, even if the turn errored.cli.py— in the CLI's atexit handler, but only if the agent was mid-turn (_agent_running=True) when the exit occurred. This catches Ctrl+C and/exitduring processing. In this case,completed=Falseandinterrupted=True.
Return value: Ignored.
Use cases: Flushing buffers, closing connections, persisting session state, logging session duration, cleanup of resources initialized in on_session_start.
Example — flush and cleanup:
_session_caches = {}
def cleanup_session(session_id, completed, interrupted, **kwargs):
cache = _session_caches.pop(session_id, None)
if cache:
# Flush accumulated data to disk or external service
status = "completed" if completed else ("interrupted" if interrupted else "failed")
print(f"Session {session_id} ended: {status}, {cache['tool_calls']} tool calls")
def register(ctx):
ctx.register_hook("on_session_end", cleanup_session)
Example — session duration tracking:
import time, logging
logger = logging.getLogger(__name__)
_start_times = {}
def on_start(session_id, **kwargs):
_start_times[session_id] = time.time()
def on_end(session_id, completed, interrupted, **kwargs):
start = _start_times.pop(session_id, None)
if start:
duration = time.time() - start
logger.info("SESSION_DURATION session=%s seconds=%.1f completed=%s interrupted=%s",
session_id, duration, completed, interrupted)
def register(ctx):
ctx.register_hook("on_session_start", on_start)
ctx.register_hook("on_session_end", on_end)
on_session_finalize
Fires when the CLI or gateway tears down an active session — for example, when the user runs /new, the gateway GC'd an idle session, or the CLI quit with an active agent. This is the last chance to flush state tied to the outgoing session before its identity is gone.
Callback signature:
def my_callback(session_id: str | None, platform: str, **kwargs):
| Parameter | Type | Description |
|---|---|---|
session_id |
str or None |
The outgoing session ID. May be None if no active session existed. |
platform |
str |
"cli" or the messaging platform name ("telegram", "discord", etc.). |
Fires: In cli.py (on /new / CLI exit) and gateway/run.py (when a session is reset or GC'd). Always paired with on_session_reset on the gateway side.
Return value: Ignored.
Use cases: Persist final session metrics before the session ID is discarded, close per-session resources, emit a final telemetry event, drain queued writes.
on_session_reset
Fires when the gateway swaps in a new session key for an active chat — the user invoked /new, /reset, /clear, or the adapter picked a fresh session after an idle window. This lets plugins react to the fact that conversation state has been wiped without waiting for the next on_session_start.
Callback signature:
def my_callback(session_id: str, platform: str, **kwargs):
| Parameter | Type | Description |
|---|---|---|
session_id |
str |
The new session's ID (already rotated to the fresh value). |
platform |
str |
The messaging platform name. |
Fires: In gateway/run.py, immediately after the new session key is allocated but before the next inbound message is processed. On the gateway, the order is: on_session_finalize(old_id) → swap → on_session_reset(new_id) → on_session_start(new_id) on the first inbound turn.
Return value: Ignored.
Use cases: Reset per-session caches keyed by session_id, emit "session rotated" analytics, prime a fresh state bucket.
See the Build a Plugin guide for the full walkthrough including tool schemas, handlers, and advanced hook patterns.
subagent_stop
Fires once per child agent after delegate_task finishes. Whether you delegated a single task or a batch of three, this hook fires once for each child, serialised on the parent thread.
Callback signature:
def my_callback(parent_session_id: str, child_role: str | None,
child_summary: str | None, child_status: str,
duration_ms: int, **kwargs):
| Parameter | Type | Description |
|---|---|---|
parent_session_id |
str |
Session ID of the delegating parent agent |
child_role |
str | None |
Orchestrator role tag set on the child (None if the feature isn't enabled) |
child_summary |
str | None |
The final response the child returned to the parent |
child_status |
str |
"completed", "failed", "interrupted", or "error" |
duration_ms |
int |
Wall-clock time spent running the child, in milliseconds |
Fires: In tools/delegate_tool.py, after ThreadPoolExecutor.as_completed() drains all child futures. Firing is marshalled to the parent thread so hook authors don't have to reason about concurrent callback execution.
Return value: Ignored.
Use cases: Logging orchestration activity, accumulating child durations for billing, writing post-delegation audit records.
Example — log orchestrator activity:
import logging
logger = logging.getLogger(__name__)
def log_subagent(parent_session_id, child_role, child_status, duration_ms, **kwargs):
logger.info(
"SUBAGENT parent=%s role=%s status=%s duration_ms=%d",
parent_session_id, child_role, child_status, duration_ms,
)
def register(ctx):
ctx.register_hook("subagent_stop", log_subagent)
:::info
With heavy delegation (e.g. orchestrator roles × 5 leaves × nested depth), subagent_stop fires many times per turn. Keep your callback fast; push expensive work to a background queue.
:::
pre_gateway_dispatch
Fires once per incoming MessageEvent in the gateway, after the internal-event guard but before auth/pairing and agent dispatch. This is the interception point for gateway-level message-flow policies (listen-only windows, human handover, per-chat routing, etc.) that don't fit cleanly into any single platform adapter.
Callback signature:
def my_callback(event, gateway, session_store, **kwargs):
| Parameter | Type | Description |
|---|---|---|
event |
MessageEvent |
The normalized inbound message (has .text, .source, .message_id, .internal, etc.). |
gateway |
GatewayRunner |
The active gateway runner, so plugins can call gateway.adapters[platform].send(...) for side-channel replies (owner notifications, etc.). |
session_store |
SessionStore |
For silent transcript ingestion via session_store.append_to_transcript(...). |
Fires: In gateway/run.py, inside GatewayRunner._handle_message(), immediately after is_internal is computed. Internal events skip the hook entirely (they are system-generated — background-process completions, etc. — and must not be gate-kept by user-facing policy).
Return value: None or a dict. The first recognized action dict wins; remaining plugin results are ignored. Exceptions in plugin callbacks are caught and logged; the gateway always falls through to normal dispatch on error.
| Return | Effect |
|---|---|
{"action": "skip", "reason": "..."} |
Drop the message — no agent reply, no pairing flow, no auth. Plugin is assumed to have handled it (e.g. silent-ingested into the transcript). |
{"action": "rewrite", "text": "new text"} |
Replace event.text, then continue normal dispatch with the modified event. Useful for collapsing buffered ambient messages into a single prompt. |
{"action": "allow"} / None |
Normal dispatch — runs the full auth / pairing / agent-loop chain. |
Use cases: Listen-only group chats (only respond when tagged; buffer ambient messages into context); human handover (silent-ingest customer messages while owner handles the chat manually); per-profile rate limiting; policy-driven routing.
Example — drop unauthorized DMs silently without triggering the pairing code:
def deny_unauthorized_dms(event, **kwargs):
src = event.source
if src.chat_type == "dm" and not _is_approved_user(src.user_id):
return {"action": "skip", "reason": "unauthorized-dm"}
return None
def register(ctx):
ctx.register_hook("pre_gateway_dispatch", deny_unauthorized_dms)
Example — rewrite an ambient-message buffer into a single prompt on mention:
_buffers = {}
def buffer_or_rewrite(event, **kwargs):
key = (event.source.platform, event.source.chat_id)
buf = _buffers.setdefault(key, [])
if _bot_mentioned(event.text):
combined = "\n".join(buf + [event.text])
buf.clear()
return {"action": "rewrite", "text": combined}
buf.append(event.text)
return {"action": "skip", "reason": "ambient-buffered"}
def register(ctx):
ctx.register_hook("pre_gateway_dispatch", buffer_or_rewrite)
pre_approval_request
Fires immediately before an approval request is shown to the user — covers every surface: interactive CLI, the Ink TUI, gateway platforms (Telegram, Discord, Slack, WhatsApp, Matrix, etc.), and ACP clients (VS Code, Zed, JetBrains).
This is the right place to wire a custom notifier — for example, a macOS menu-bar app that pops an allow/deny notification, or an audit log that records every approval request with context.
Callback signature:
def my_callback(
command: str,
description: str,
pattern_key: str,
pattern_keys: list[str],
session_key: str,
surface: str,
**kwargs,
):
| Parameter | Type | Description |
|---|---|---|
command |
str |
The shell command awaiting approval |
description |
str |
Human-readable reason(s) the command is flagged (combined when multiple patterns match) |
pattern_key |
str |
Primary pattern key that triggered the approval (e.g. "rm_rf", "sudo") |
pattern_keys |
list[str] |
All pattern keys that matched |
session_key |
str |
Session identifier, useful for scoping notifications per-chat |
surface |
str |
"cli" for interactive CLI/TUI prompts, "gateway" for async platform approvals |
Return value: ignored. Hooks here are observer-only; they cannot veto or pre-answer the approval. Use pre_tool_call to block a tool before it reaches the approval system.
Use cases: Desktop notifications, push alerts, audit logging, Slack webhooks, escalation routing, metrics.
Example — desktop notification on macOS:
import subprocess
def notify_approval(command, description, session_key, **kwargs):
title = "Hermes needs approval"
body = f"{description}: {command[:80]}"
subprocess.Popen([
"osascript", "-e",
f'display notification "{body}" with title "{title}"',
])
def register(ctx):
ctx.register_hook("pre_approval_request", notify_approval)
post_approval_response
Fires after the user responds to an approval prompt (or the prompt times out).
Callback signature:
def my_callback(
command: str,
description: str,
pattern_key: str,
pattern_keys: list[str],
session_key: str,
surface: str,
choice: str,
**kwargs,
):
Same kwargs as pre_approval_request, plus:
| Parameter | Type | Description |
|---|---|---|
choice |
str |
One of "once", "session", "always", "deny", or "timeout" |
Return value: ignored.
Use cases: Close the matching desktop notification, record the final decision in an audit log, update metrics, roll forward a rate limiter.
def log_decision(command, choice, session_key, **kwargs):
logger.info("approval %s: %s for session %s", choice, command[:60], session_key)
def register(ctx):
ctx.register_hook("post_approval_response", log_decision)
transform_tool_result
Fires after a tool returns and before the result is appended to the conversation. Lets a plugin rewrite ANY tool's result string — not just terminal output — before the model sees it.
Callback signature:
def my_callback(
tool_name: str,
arguments: dict,
result: str,
task_id: str | None,
**kwargs,
) -> str | None:
| Parameter | Type | Description |
|---|---|---|
tool_name |
str |
Tool that produced the result (read_file, web_extract, delegate_task, …). |
arguments |
dict |
Arguments the model called the tool with. |
result |
str |
The tool's raw result string, post-truncation and post-ANSI-strip. |
task_id |
str | None |
Task/session ID when running inside RL/benchmark environments. |
Return value: str to replace the result (the returned string is what the model sees), None to leave it unchanged.
Use cases: Redact organization-specific PII from web_extract output, wrap long JSON tool responses in a summary header, inject retrieval-augmented hints into read_file results, rewrite delegate_task subagent reports into a project-specific schema.
import re
SECRET = re.compile(r"sk-[A-Za-z0-9]{32,}")
def redact_secrets(tool_name, result, **kwargs):
if SECRET.search(result):
return SECRET.sub("[REDACTED]", result)
return None
def register(ctx):
ctx.register_hook("transform_tool_result", redact_secrets)
Applies to every tool. For terminal-only rewriting see transform_terminal_output below — it's narrower and runs earlier in the pipeline (pre-truncation, pre-redaction).
transform_terminal_output
Fires inside the terminal tool's foreground-output pipeline, before the default 50 KB truncation, ANSI strip, and secret redaction. Lets plugins rewrite the raw stdout/stderr of a shell command before any downstream processing touches it.
Callback signature:
def my_callback(
command: str,
output: str,
exit_code: int,
cwd: str,
task_id: str | None,
**kwargs,
) -> str | None:
| Parameter | Type | Description |
|---|---|---|
command |
str |
The shell command that produced the output. |
output |
str |
Raw combined stdout/stderr (may be very large — truncation happens after the hook). |
exit_code |
int |
Process exit code. |
cwd |
str |
Working directory the command ran in. |
Return value: str to replace the output, None to leave it unchanged.
Use cases: Inject summaries for commands that produce massive output (du -ah, find, tree), tag output with a project-specific marker so downstream hooks know how to handle it, strip timing noise that flaps between runs and defeats prompt caching.
def summarize_find(command, output, **kwargs):
if command.startswith("find ") and len(output) > 50_000:
lines = output.count("\n")
head = "\n".join(output.splitlines()[:40])
return f"{head}\n\n[summary: {lines} paths total, showing first 40]"
return None
def register(ctx):
ctx.register_hook("transform_terminal_output", summarize_find)
Pairs well with transform_tool_result (which covers every other tool).
Shell Hooks
Declare shell-script hooks in your cli-config.yaml and Hermes will run them as subprocesses whenever the corresponding plugin-hook event fires — in both CLI and gateway sessions. No Python plugin authoring required.
Use shell hooks when you want a drop-in, single-file script (Bash, Python, anything with a shebang) to:
- Block a tool call — reject dangerous
terminalcommands, enforce per-directory policies, require approval for destructivewrite_file/patchoperations. - Run after a tool call — auto-format Python or TypeScript files that the agent just wrote, log API calls, trigger a CI workflow.
- Inject context into the next LLM turn — prepend
git statusoutput, the current weekday, or retrieved documents to the user message (seepre_llm_call). - Observe lifecycle events — write a log line when a subagent completes (
subagent_stop) or a session starts (on_session_start).
Shell hooks are registered by calling agent.shell_hooks.register_from_config(cfg) at both CLI startup (hermes_cli/main.py) and gateway startup (gateway/run.py). They compose naturally with Python plugin hooks — both flow through the same dispatcher.
Comparison at a glance
| Dimension | Shell hooks | Plugin hooks | Gateway hooks |
|---|---|---|---|
| Declared in | hooks: block in ~/.hermes/config.yaml |
register() in a plugin.yaml plugin |
HOOK.yaml + handler.py directory |
| Lives under | ~/.hermes/agent-hooks/ (by convention) |
~/.hermes/plugins/<name>/ |
~/.hermes/hooks/<name>/ |
| Language | Any (Bash, Python, Go binary, …) | Python only | Python only |
| Runs in | CLI + Gateway | CLI + Gateway | Gateway only |
| Events | VALID_HOOKS (incl. subagent_stop) |
VALID_HOOKS |
Gateway lifecycle (gateway:startup, agent:*, command:*) |
| Can block a tool call | Yes (pre_tool_call) |
Yes (pre_tool_call) |
No |
| Can inject LLM context | Yes (pre_llm_call) |
Yes (pre_llm_call) |
No |
| Consent | First-use prompt per (event, command) pair |
Implicit (Python plugin trust) | Implicit (dir trust) |
| Inter-process isolation | Yes (subprocess) | No (in-process) | No (in-process) |
Configuration schema
hooks:
<event_name>: # Must be in VALID_HOOKS
- matcher: "<regex>" # Optional; used for pre/post_tool_call only
command: "<shell command>" # Required; runs via shlex.split, shell=False
timeout: <seconds> # Optional; default 60, capped at 300
hooks_auto_accept: false # See "Consent model" below
Event names must be one of the plugin hook events; typos produce a "Did you mean X?" warning and are skipped. Unknown keys inside a single entry are ignored; missing command is a skip-with-warning. timeout > 300 is clamped with a warning.
JSON wire protocol
Each time the event fires, Hermes spawns a subprocess for every matching hook (matcher permitting), pipes a JSON payload to stdin, and reads stdout back as JSON.
stdin — payload the script receives:
{
"hook_event_name": "pre_tool_call",
"tool_name": "terminal",
"tool_input": {"command": "rm -rf /"},
"session_id": "sess_abc123",
"cwd": "/home/user/project",
"extra": {"task_id": "...", "tool_call_id": "..."}
}
tool_name and tool_input are null for non-tool events (pre_llm_call, subagent_stop, session lifecycle). The extra dict carries all event-specific kwargs (user_message, conversation_history, child_role, duration_ms, …). Unserialisable values are stringified rather than omitted.
stdout — optional response:
// Block a pre_tool_call (both shapes accepted; normalised internally):
{"decision": "block", "reason": "Forbidden: rm -rf"} // Claude-Code style
{"action": "block", "message": "Forbidden: rm -rf"} // Hermes-canonical
// Inject context for pre_llm_call:
{"context": "Today is Friday, 2026-04-17"}
// Silent no-op — any empty / non-matching output is fine:
Malformed JSON, non-zero exit codes, and timeouts log a warning but never abort the agent loop.
Worked examples
1. Auto-format Python files after every write
# ~/.hermes/config.yaml
hooks:
post_tool_call:
- matcher: "write_file|patch"
command: "~/.hermes/agent-hooks/auto-format.sh"
#!/usr/bin/env bash
# ~/.hermes/agent-hooks/auto-format.sh
payload="$(cat -)"
path=$(echo "$payload" | jq -r '.tool_input.path // empty')
[[ "$path" == *.py ]] && command -v black >/dev/null && black "$path" 2>/dev/null
printf '{}\n'
The agent's in-context view of the file is not re-read automatically — the reformat only affects the file on disk. Subsequent read_file calls pick up the formatted version.
2. Block destructive terminal commands
hooks:
pre_tool_call:
- matcher: "terminal"
command: "~/.hermes/agent-hooks/block-rm-rf.sh"
timeout: 5
#!/usr/bin/env bash
# ~/.hermes/agent-hooks/block-rm-rf.sh
payload="$(cat -)"
cmd=$(echo "$payload" | jq -r '.tool_input.command // empty')
if echo "$cmd" | grep -qE 'rm[[:space:]]+-rf?[[:space:]]+/'; then
printf '{"decision": "block", "reason": "blocked: rm -rf / is not permitted"}\n'
else
printf '{}\n'
fi
3. Inject git status into every turn (Claude-Code UserPromptSubmit equivalent)
hooks:
pre_llm_call:
- command: "~/.hermes/agent-hooks/inject-cwd-context.sh"
#!/usr/bin/env bash
# ~/.hermes/agent-hooks/inject-cwd-context.sh
cat - >/dev/null # discard stdin payload
if status=$(git status --porcelain 2>/dev/null) && [[ -n "$status" ]]; then
jq --null-input --arg s "$status" \
'{context: ("Uncommitted changes in cwd:\n" + $s)}'
else
printf '{}\n'
fi
Claude Code's UserPromptSubmit event is intentionally not a separate Hermes event — pre_llm_call fires at the same place and already supports context injection. Use it here.
4. Log every subagent completion
hooks:
subagent_stop:
- command: "~/.hermes/agent-hooks/log-orchestration.sh"
#!/usr/bin/env bash
# ~/.hermes/agent-hooks/log-orchestration.sh
log=~/.hermes/logs/orchestration.log
jq -c '{ts: now, parent: .session_id, extra: .extra}' < /dev/stdin >> "$log"
printf '{}\n'
Consent model
Each unique (event, command) pair prompts the user for approval the first time Hermes sees it, then persists the decision to ~/.hermes/shell-hooks-allowlist.json. Subsequent runs (CLI or gateway) skip the prompt.
Three escape hatches bypass the interactive prompt — any one is sufficient:
--accept-hooksflag on the CLI (e.g.hermes --accept-hooks chat)HERMES_ACCEPT_HOOKS=1environment variablehooks_auto_accept: trueincli-config.yaml
Non-TTY runs (gateway, cron, CI) need one of these three — otherwise any newly-added hook silently stays un-registered and logs a warning.
Script edits are silently trusted. The allowlist keys on the exact command string, not the script's hash, so editing the script on disk does not invalidate consent. hermes hooks doctor flags mtime drift so you can spot edits and decide whether to re-approve.
The hermes hooks CLI
| Command | What it does |
|---|---|
hermes hooks list |
Dump configured hooks with matcher, timeout, and consent status |
hermes hooks test <event> [--for-tool X] [--payload-file F] |
Fire every matching hook against a synthetic payload and print the parsed response |
hermes hooks revoke <command> |
Remove every allowlist entry matching <command> (takes effect on next restart) |
hermes hooks doctor |
For every configured hook: check exec bit, allowlist status, mtime drift, JSON output validity, and rough execution time |
Security
Shell hooks run with your full user credentials — same trust boundary as a cron entry or a shell alias. Treat the hooks: block in config.yaml as privileged configuration:
- Only reference scripts you wrote or fully reviewed.
- Keep scripts inside
~/.hermes/agent-hooks/so the path is easy to audit. - Re-run
hermes hooks doctorafter you pull a shared config to spot newly-added hooks before they register. - If your config.yaml is version-controlled across a team, review PRs that change the
hooks:section the same way you'd review CI config.
Ordering and precedence
Both Python plugin hooks and shell hooks flow through the same invoke_hook() dispatcher. Python plugins are registered first (discover_and_load()), shell hooks second (register_from_config()), so Python pre_tool_call block decisions take precedence in tie cases. The first valid block wins — the aggregator returns as soon as any callback produces {"action": "block", "message": str} with a non-empty message.