diff --git a/website/docs/getting-started/installation.md b/website/docs/getting-started/installation.md index 219c1e7d555..5ff5489f874 100644 --- a/website/docs/getting-started/installation.md +++ b/website/docs/getting-started/installation.md @@ -41,6 +41,17 @@ Native Windows is **not supported**. Please install [WSL2](https://learn.microso The installer handles everything automatically — all dependencies (Python, Node.js, ripgrep, ffmpeg), the repo clone, virtual environment, global `hermes` command setup, and LLM provider configuration. By the end, you're ready to chat. +#### Install Layout + +Where the installer puts things depends on whether you're installing as a normal user or as root: + +| Installer | Code lives at | `hermes` binary | Data directory | +|---|---|---|---| +| Per-user (normal) | `~/.hermes/hermes-agent/` | `~/.local/bin/hermes` (symlink) | `~/.hermes/` | +| Root-mode (`sudo curl … \| sudo bash`) | `/usr/local/lib/hermes-agent/` | `/usr/local/bin/hermes` | `/root/.hermes/` (or `$HERMES_HOME`) | + +The root-mode **FHS layout** (`/usr/local/lib/…`, `/usr/local/bin/hermes`) matches where other system-wide developer tools land on Linux. It's useful for shared-machine deployments where one system install should serve every user. Per-user config (auth, skills, sessions) still lives under each user's `~/.hermes/` or explicit `HERMES_HOME`. + ### After Installation Reload your shell and start chatting: diff --git a/website/docs/getting-started/updating.md b/website/docs/getting-started/updating.md index eb74427a0a0..8550f89b797 100644 --- a/website/docs/getting-started/updating.md +++ b/website/docs/getting-started/updating.md @@ -24,10 +24,33 @@ This pulls the latest code, updates dependencies, and prompts you to configure a When you run `hermes update`, the following steps occur: -1. **Git pull** — pulls the latest code from the `main` branch and updates submodules -2. 
**Dependency install** — runs `uv pip install -e ".[all]"` to pick up new or changed dependencies -3. **Config migration** — detects new config options added since your version and prompts you to set them -4. **Gateway auto-restart** — if the gateway service is running (systemd on Linux, launchd on macOS), it is **automatically restarted** after the update completes so the new code takes effect immediately +1. **Pairing-data snapshot** — Hermes saves a lightweight pre-update state snapshot covering `~/.hermes/pairing/`, Feishu comment rules, and other state files that are modified at runtime. Roll it back with `hermes backup restore --state pre-update`. +2. **Git pull** — pulls the latest code from the `main` branch and updates submodules +3. **Dependency install** — runs `uv pip install -e ".[all]"` to pick up new or changed dependencies +4. **Config migration** — detects new config options added since your version and prompts you to set them +5. **Gateway auto-restart** — if the gateway service is running (systemd on Linux, launchd on macOS), it is **automatically restarted** after the update completes so the new code takes effect immediately + +### Preview-only: `hermes update --check` + +Want to know if you're behind `origin/main` before actually pulling? Run `hermes update --check` — it fetches, prints your local commit and the latest remote commit side-by-side, and exits `0` if in sync or `1` if behind. No files are modified and no gateway is restarted. This is useful in scripts and cron jobs that gate on whether an update is available. 
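A minimal sketch of how a cron job or CI step might gate on that exit code. It assumes `hermes` is on `PATH` and relies only on the `0`/`1` exit codes described above; the function name `update_available` and the injectable `run` parameter are illustrative and not part of Hermes.

```python
import subprocess
from typing import Callable, Sequence


def update_available(
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
    command: Sequence[str] = ("hermes", "update", "--check"),
) -> bool:
    """Return True when `hermes update --check` reports the install is behind.

    Exit code 0 means in sync and 1 means behind; any other code is
    surfaced as an error instead of being guessed at.
    """
    result = run(list(command), capture_output=True, text=True)
    if result.returncode == 0:
        return False
    if result.returncode == 1:
        return True
    raise RuntimeError(f"unexpected exit code {result.returncode}: {result.stderr}")
```

A scheduled job could call `update_available()` and only then run `hermes update`. The `run` parameter exists so the logic can be exercised without a real install.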
+ +### Full pre-update backup: `--backup` + +For high-value profiles (production gateways, shared team installs) you can opt into a full pre-pull backup of `HERMES_HOME` (config, auth, sessions, skills, pairing): + +```bash +hermes update --backup +``` + +Or make it the default for every run: + +```yaml +# ~/.hermes/config.yaml +update: + backup: true +``` + +`--backup` was the always-on behavior in earlier builds, but it added minutes to every update on large homes, so it's now opt-in. The lightweight pairing-data snapshot above still runs unconditionally. Expected output looks like: diff --git a/website/docs/guides/build-a-hermes-plugin.md b/website/docs/guides/build-a-hermes-plugin.md index 0c401033f93..3b1afb48709 100644 --- a/website/docs/guides/build-a-hermes-plugin.md +++ b/website/docs/guides/build-a-hermes-plugin.md @@ -242,8 +242,27 @@ def register(ctx): - `ctx.register_tool()` puts your tool in the registry — the model sees it immediately - `ctx.register_hook()` subscribes to lifecycle events - `ctx.register_cli_command()` registers a CLI subcommand (e.g. `hermes my-plugin `) +- `ctx.register_command()` registers an in-session slash command (e.g. `/myplugin ` inside CLI / gateway chat) — see [Register slash commands](#register-slash-commands) below +- `ctx.dispatch_tool(name, arguments)` — call any other tool (built-in or from another plugin) with the parent agent's context (approvals, credentials, task_id) wired up automatically. Useful from slash-command handlers that need to invoke `terminal`, `read_file`, or any other tool as if the model had called it directly. - If this function crashes, the plugin is disabled but Hermes continues fine +**`dispatch_tool` example — a slash command that runs a tool:** + +```python +import shlex + +def handle_scan(ctx, argstr): + """Implement /scan by invoking the terminal tool through the registry.""" + # Quote the user-supplied glob so quotes or spaces cannot break out of the shell command. + result = ctx.dispatch_tool("terminal", {"command": f"find . -name {shlex.quote(argstr)}"})
+ return result # returned to the caller's chat UI + +def register(ctx): + ctx.register_command("scan", handle_scan, help="Find files matching a glob") +``` + +The dispatched tool goes through the normal approval, redaction, and budget pipelines — it's a real tool invocation, not a shortcut around them. + ## Step 6: Test it Start Hermes: diff --git a/website/docs/integrations/providers.md b/website/docs/integrations/providers.md index a989d938fed..7cbb9db5e2c 100644 --- a/website/docs/integrations/providers.md +++ b/website/docs/integrations/providers.md @@ -29,6 +29,7 @@ You need at least one way to connect to an LLM. Use `hermes model` to switch pro | **MiniMax** | `MINIMAX_API_KEY` in `~/.hermes/.env` (provider: `minimax`) | | **MiniMax China** | `MINIMAX_CN_API_KEY` in `~/.hermes/.env` (provider: `minimax-cn`) | | **Alibaba Cloud** | `DASHSCOPE_API_KEY` in `~/.hermes/.env` (provider: `alibaba`, aliases: `dashscope`, `qwen`) | +| **Alibaba Coding Plan** | `DASHSCOPE_API_KEY` in `~/.hermes/.env` (provider: `alibaba-coding-plan`, alias: `alibaba_coding`) — separate billing SKU, different endpoint | | **Kilo Code** | `KILOCODE_API_KEY` in `~/.hermes/.env` (provider: `kilocode`) | | **Xiaomi MiMo** | `XIAOMI_API_KEY` in `~/.hermes/.env` (provider: `xiaomi`, aliases: `mimo`, `xiaomi-mimo`) | | **Tencent TokenHub** | `TOKENHUB_API_KEY` in `~/.hermes/.env` (provider: `tencent-tokenhub`, aliases: `tencent`, `tokenhub`, `tencentmaas`) | @@ -136,7 +137,7 @@ The OpenAI Codex provider authenticates via device code (open a URL, enter a cod ::: :::warning -Even when using Nous Portal, Codex, or a custom endpoint, some tools (vision, web summarization, MoA) use a separate "auxiliary" model — by default Gemini Flash via OpenRouter. An `OPENROUTER_API_KEY` enables these tools automatically. You can also configure which model and provider these tools use — see [Auxiliary Models](/docs/user-guide/configuration#auxiliary-models). 
+Even when using Nous Portal, Codex, or a custom endpoint, some tools (vision, web summarization, MoA) use a separate "auxiliary" model. By default (`auxiliary.*.provider: "auto"`), Hermes routes these tasks to your **main chat model** — the same model you picked in `hermes model`. You can override each task individually to route it to a cheaper/faster model (e.g. Gemini Flash on OpenRouter) — see [Auxiliary Models](/docs/user-guide/configuration#auxiliary-models). ::: :::tip Nous Tool Gateway @@ -411,6 +412,24 @@ Set `HERMES_QWEN_BASE_URL` only if the portal endpoint relocates (default: `http `qwen-oauth` uses the consumer-facing Qwen Portal with OAuth login — ideal for individual users. The `alibaba` provider uses DashScope's enterprise API with a `DASHSCOPE_API_KEY` — ideal for programmatic / production workloads. Both route to Qwen-family models but live at different endpoints. ::: +### Alibaba Coding Plan + +If you're subscribed to Alibaba's **Coding Plan** (a pricing SKU separate from standard DashScope API access), Hermes exposes it as its own first-class provider: `alibaba-coding-plan`. Endpoint: `https://coding-intl.dashscope.aliyuncs.com/v1`. It's OpenAI-compatible like the regular `alibaba` provider but with a different base URL and billing surface. + +```yaml +model: + provider: alibaba_coding # alias for alibaba-coding-plan + model: qwen3-coder-plus +``` + +Or from the CLI: + +```bash +hermes chat --provider alibaba_coding --model qwen3-coder-plus +``` + +`alibaba_coding` uses the same `DASHSCOPE_API_KEY` your `alibaba` entry already uses — no separate key needed, just a different routing target. Before this provider was registered, users who set `provider: alibaba_coding` in `config.yaml` silently fell through to OpenRouter routing. + ### MiniMax (OAuth) MiniMax-M2.7 via browser OAuth login — no API key needed. Pick **MiniMax (OAuth)** in `hermes model`, sign in through the browser, and Hermes persists the access + refresh tokens. 
Uses the Anthropic Messages-compatible endpoint (`/anthropic`) under the hood. diff --git a/website/docs/reference/cli-commands.md b/website/docs/reference/cli-commands.md index 074f7ee830a..ac600bbeef8 100644 --- a/website/docs/reference/cli-commands.md +++ b/website/docs/reference/cli-commands.md @@ -64,12 +64,13 @@ hermes [global-options] [subcommand/options] | `hermes tools` | Configure enabled tools per platform. | | `hermes sessions` | Browse, export, prune, rename, and delete sessions. | | `hermes insights` | Show token/cost/activity analytics. | +| `hermes fallback` | Interactive manager for the fallback provider chain. | | `hermes claw` | OpenClaw migration helpers. | | `hermes dashboard` | Launch the web dashboard for managing config, API keys, and sessions. | | `hermes profile` | Manage profiles — multiple isolated Hermes instances. | | `hermes completion` | Print shell completion scripts (bash/zsh). | | `hermes version` | Show version information. | -| `hermes update` | Pull latest code and reinstall dependencies. | +| `hermes update` | Pull latest code and reinstall dependencies. `--check` prints commit diff without pulling; `--backup` takes a pre-pull `HERMES_HOME` snapshot. | | `hermes uninstall` | Remove Hermes from the system. | ## `hermes chat` @@ -85,7 +86,7 @@ Common options: | `-q`, `--query "..."` | One-shot, non-interactive prompt. | | `-m`, `--model ` | Override the model for this run. | | `-t`, `--toolsets ` | Enable a comma-separated set of toolsets. | -| `--provider ` | Force a provider: `auto`, `openrouter`, `nous`, `openai-codex`, `copilot-acp`, `copilot`, `anthropic`, `gemini`, `google-gemini-cli`, `huggingface`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `minimax-oauth`, `kilocode`, `xiaomi`, `arcee`, `gmi`, `alibaba`, `deepseek`, `nvidia`, `ollama-cloud`, `xai` (alias `grok`), `qwen-oauth`, `bedrock`, `opencode-zen`, `opencode-go`, `ai-gateway`, `azure-foundry`. 
| +| `--provider ` | Force a provider: `auto`, `openrouter`, `nous`, `openai-codex`, `copilot-acp`, `copilot`, `anthropic`, `gemini`, `google-gemini-cli`, `huggingface`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `minimax-oauth`, `kilocode`, `xiaomi`, `arcee`, `gmi`, `alibaba`, `alibaba-coding-plan` (alias `alibaba_coding`), `deepseek`, `nvidia`, `ollama-cloud`, `xai` (alias `grok`), `qwen-oauth`, `bedrock`, `opencode-zen`, `opencode-go`, `ai-gateway`, `azure-foundry`, `tencent-tokenhub` (alias `tencent`, `tokenhub`). | | `-s`, `--skills ` | Preload one or more skills for the session (can be repeated or comma-separated). | | `-v`, `--verbose` | Verbose output. | | `-Q`, `--quiet` | Programmatic mode: suppress banner/spinner/tool previews. | @@ -112,6 +113,33 @@ hermes chat --worktree -q "Review this repo and open a PR" hermes chat --ignore-user-config --ignore-rules -q "Repro without my personal setup" ``` +### `hermes -z ` — scripted one-shot + +For programmatic callers (shell scripts, CI, cron, parent processes piping in a prompt), `hermes -z` is the purest one-shot entry point: **single prompt in, final response text out, nothing else on stdout or stderr.** No banner, no spinner, no tool previews, no `Session:` line — just the agent's final reply as plain text. + +```bash +hermes -z "What's the capital of France?" +# → Paris. 
+ +# Parent scripts can cleanly capture the response: +answer=$(hermes -z "summarize this" < /path/to/file.txt) +``` + +Per-run overrides (no mutation to `~/.hermes/config.yaml`): + +| Flag | Equivalent env var | Purpose | +|---|---|---| +| `-m` / `--model ` | `HERMES_INFERENCE_MODEL` | Override the model for this run | +| `--provider ` | `HERMES_INFERENCE_PROVIDER` | Override the provider for this run | + +```bash +hermes -z "…" --provider openrouter --model openai/gpt-5.5 +# or: +HERMES_INFERENCE_MODEL=anthropic/claude-sonnet-4.6 hermes -z "…" +``` + +Same agent, same tools, same skills — just strips every interactive / cosmetic layer. If you need tool output in the transcript too, use `hermes chat -q` instead; `-z` is explicitly for "I only want the final answer". + ## `hermes model` Interactive provider + model selector. **This is the command for adding new providers, setting up API keys, and running OAuth flows.** Run it from your terminal — not from inside an active Hermes chat session. @@ -181,6 +209,12 @@ Subcommands: | `uninstall` | Remove the installed service. | | `setup` | Interactive messaging-platform setup. | +Options: + +| Option | Description | +|--------|-------------| +| `--all` | On `start` / `restart` / `stop`: act on **every profile's** gateway, not just the active `HERMES_HOME`. Useful if you run multiple profiles side-by-side and want to restart them all after `hermes update`. | + :::tip WSL users Use `hermes gateway run` instead of `hermes gateway start` — WSL's systemd support is unreliable. Wrap it in tmux for persistence: `tmux new -s hermes 'hermes gateway run'`. See [WSL FAQ](/docs/reference/faq#wsl-gateway-keeps-disconnecting-or-hermes-gateway-start-fails) for details. ::: @@ -462,6 +496,12 @@ Create a zip archive of your Hermes configuration, skills, sessions, and data. T The backup uses SQLite's `backup()` API for safe copying, so it works correctly even when Hermes is running (WAL-mode safe). 
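For readers unfamiliar with the technique, here is a minimal sketch of an online SQLite copy using Python's standard `sqlite3.Connection.backup()` method. It illustrates the general approach only: the function name `snapshot_db` and the paths are hypothetical, and this is not Hermes's actual backup code.

```python
import sqlite3
from pathlib import Path


def snapshot_db(src_path: Path, dest_path: Path) -> None:
    """Copy a live SQLite database to dest_path using the online backup API."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        # backup() reads through the source connection, so committed data
        # that still sits in the WAL ends up in the destination file.
        src.backup(dest)
    finally:
        dest.close()
        src.close()
```

Because the copy goes through SQLite itself instead of a raw file copy, the destination is a standalone `.db` file that does not depend on the source's sidecar files.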
+**What's excluded from the zip:** + +- `*.db-wal`, `*.db-shm`, `*.db-journal` — SQLite's WAL / shared-memory / journal sidecars. The `*.db` file already holds a consistent snapshot taken via SQLite's `backup()` API; shipping the live sidecars alongside it would let a restore see a half-committed state. +- `checkpoints/` — per-session trajectory caches. They are hash-keyed and regenerated per session, so they would not port cleanly to another install anyway. +- The `hermes-agent` code itself (this is a user-data backup, not a repo snapshot). + ### Examples ```bash @@ -910,6 +950,44 @@ hermes completion bash >> ~/.bashrc hermes completion zsh >> ~/.zshrc ``` +## `hermes update` + +```bash +hermes update [--check] [--backup] [--restart-gateway] +``` + +Pulls the latest `hermes-agent` code and reinstalls dependencies in your venv, then re-runs the post-install hooks (MCP servers, skills sync, completion install). Safe to run on a live install. + +| Option | Description | +|--------|-------------| +| `--check` | Print the current commit and the latest `origin/main` commit side by side, and exit `0` if in sync or `1` if behind. Does not pull, install, or restart anything. | +| `--backup` | Create a labeled pre-update snapshot of `HERMES_HOME` (config, auth, sessions, skills, pairing data) before pulling. Default is **off** — the previous always-backup behavior added minutes to every update on large homes. Turn it on permanently via `update.backup: true` in `config.yaml`. | +| `--restart-gateway` | After a successful update, restart the running gateway service. Implies `--all` semantics if multiple profiles are installed. | + +Additional behavior: + +- **Pairing data snapshot.** Even when `--backup` is off, `hermes update` takes a lightweight snapshot of `~/.hermes/pairing/` and the Feishu comment rules before `git pull`. You can roll it back with `hermes backup restore --state pre-update` if a pull rewrites a file you were editing. 
+- **Legacy `hermes.service` warning.** If Hermes detects a pre-rename `hermes.service` systemd unit (instead of the current `hermes-gateway.service`), it prints a one-time migration hint so you can avoid flap-loop issues. +- **Exit codes.** `0` on success, `1` on pull/install/post-install errors, `2` on unexpected working-tree changes that block `git pull`. + +## `hermes fallback` + +```bash +hermes fallback # interactive manager +``` + +Manage the fallback provider chain (used when your primary provider hits a rate limit or returns a fatal error) without hand-editing `config.yaml`. Reuses the provider picker from `hermes model` — same provider list, same credential prompts, same validation. + +Typical session: + +1. Press `a` to add a fallback → pick a provider (OAuth-based providers open a browser; API-key providers prompt for the key), then pick the specific model. +2. Use `↑`/`↓` to reorder fallbacks (first-in-list is tried first). +3. Press `d` to remove one. + +All changes persist to `fallback_providers:` under `model:` in `config.yaml`. Interacts with [Credential Pools](/docs/user-guide/features/credential-pools): pools rotate keys *within* a provider, fallbacks switch to a *different* provider entirely. + +See [Fallback Providers](/docs/user-guide/features/fallback-providers) for behavior details and interaction with `fallback_model` (legacy single-fallback key). 
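The pools-versus-fallbacks distinction above can be sketched as two nested loops. This is a conceptual illustration only: `Provider`, `ProviderError`, `call_with_fallbacks`, and the `send` callback are hypothetical names, and the retry policy shown is an assumption, not Hermes's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


class ProviderError(Exception):
    """Raised by the hypothetical send() on a rate limit or fatal error."""


@dataclass
class Provider:
    name: str
    keys: Sequence[str]  # credential pool: several keys for one provider


def call_with_fallbacks(
    chain: Sequence[Provider],
    prompt: str,
    send: Callable[[str, str, str], str],
) -> str:
    """Try providers in list order; within each provider, try every pooled key."""
    last_error: Optional[Exception] = None
    for provider in chain:         # fallback: move on to a different provider
        for key in provider.keys:  # pool: rotate keys within the same provider
            try:
                return send(provider.name, key, prompt)
            except ProviderError as exc:
                last_error = exc
    raise RuntimeError("all providers in the chain failed") from last_error
```

The inner loop never leaves the provider, while the outer loop only advances once that provider's keys are exhausted, which is why the first entry in the list is tried first.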
+ ## Maintenance commands | Command | Description | diff --git a/website/docs/reference/environment-variables.md b/website/docs/reference/environment-variables.md index 497c62adcfd..671b66573f0 100644 --- a/website/docs/reference/environment-variables.md +++ b/website/docs/reference/environment-variables.md @@ -93,7 +93,7 @@ For native Anthropic auth, Hermes prefers Claude Code's own credential files whe | Variable | Description | |----------|-------------| -| `HERMES_INFERENCE_PROVIDER` | Override provider selection: `auto`, `custom`, `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `huggingface`, `gemini`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `minimax-oauth` (browser OAuth login — no API key required; see [MiniMax OAuth guide](../guides/minimax-oauth.md)), `kilocode`, `xiaomi`, `arcee`, `gmi`, `alibaba`, `deepseek`, `nvidia`, `ollama-cloud`, `xai` (alias `grok`), `google-gemini-cli`, `qwen-oauth`, `bedrock`, `opencode-zen`, `opencode-go`, `ai-gateway` (default: `auto`) | +| `HERMES_INFERENCE_PROVIDER` | Override provider selection: `auto`, `custom`, `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `huggingface`, `gemini`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `minimax-oauth` (browser OAuth login — no API key required; see [MiniMax OAuth guide](../guides/minimax-oauth.md)), `kilocode`, `xiaomi`, `arcee`, `gmi`, `alibaba`, `alibaba-coding-plan` (alias `alibaba_coding`), `deepseek`, `nvidia`, `ollama-cloud`, `xai` (alias `grok`), `google-gemini-cli`, `qwen-oauth`, `bedrock`, `opencode-zen`, `opencode-go`, `ai-gateway`, `tencent-tokenhub` (default: `auto`) | | `HERMES_PORTAL_BASE_URL` | Override Nous Portal URL (for development/testing) | | `NOUS_INFERENCE_BASE_URL` | Override Nous inference API URL | | `HERMES_NOUS_MIN_KEY_TTL_SECONDS` | Min agent key TTL before re-mint (default: 1800 = 30min) | @@ -110,6 +110,7 @@ For native Anthropic auth, Hermes prefers 
Claude Code's own credential files whe | `FIRECRAWL_API_KEY` | Web scraping and cloud browser ([firecrawl.dev](https://firecrawl.dev/)) | | `FIRECRAWL_API_URL` | Custom Firecrawl API endpoint for self-hosted instances (optional) | | `TAVILY_API_KEY` | Tavily API key for AI-native web search, extract, and crawl ([app.tavily.com](https://app.tavily.com/home)) | +| `TAVILY_BASE_URL` | Override the Tavily API endpoint. Useful for corporate proxies and self-hosted Tavily-compatible search backends. Same pattern as `GROQ_BASE_URL`. | | `EXA_API_KEY` | Exa API key for AI-native web search and contents ([exa.ai](https://exa.ai/)) | | `BROWSERBASE_API_KEY` | Browser automation ([browserbase.com](https://browserbase.com/)) | | `BROWSERBASE_PROJECT_ID` | Browserbase project ID | @@ -128,6 +129,7 @@ For native Anthropic auth, Hermes prefers Claude Code's own credential files whe | `GITHUB_TOKEN` | GitHub token for Skills Hub (higher API rate limits, skill publish) | | `HONCHO_API_KEY` | Cross-session user modeling ([honcho.dev](https://honcho.dev/)) | | `HONCHO_BASE_URL` | Base URL for self-hosted Honcho instances (default: Honcho cloud). No API key required for local instances | +| `HINDSIGHT_TIMEOUT` | Timeout in seconds for Hindsight memory-provider API calls (default: `60`). Bump this if your Hindsight instance is slow to respond during `/sync` or `on_session_switch` and you're seeing timeouts in `errors.log`. 
| | `SUPERMEMORY_API_KEY` | Semantic long-term memory with profile recall and session ingest ([supermemory.ai](https://supermemory.ai)) | | `TINKER_API_KEY` | RL training ([tinker-console.thinkingmachines.ai](https://tinker-console.thinkingmachines.ai/)) | | `WANDB_API_KEY` | RL training metrics ([wandb.ai](https://wandb.ai/)) | @@ -169,6 +171,7 @@ These variables configure the [Tool Gateway](/docs/user-guide/features/tool-gate | Variable | Description | |----------|-------------| | `TERMINAL_ENV` | Backend: `local`, `docker`, `ssh`, `singularity`, `modal`, `daytona`, `vercel_sandbox` | +| `HERMES_DOCKER_BINARY` | Override the container binary Hermes shells out to (e.g. `podman`, `/usr/local/bin/docker`). When unset, Hermes auto-discovers `docker` or `podman` on `PATH`. Needed when both are installed and you want the non-default, or when the binary lives outside `PATH`. | | `TERMINAL_DOCKER_IMAGE` | Docker image (default: `nikolaik/python-nodejs:python3.11-nodejs20`) | | `TERMINAL_DOCKER_FORWARD_ENV` | JSON array of env var names to explicitly forward into Docker terminal sessions. Note: skill-declared `required_environment_variables` are forwarded automatically — you only need this for vars not declared by any skill. | | `TERMINAL_DOCKER_VOLUMES` | Additional Docker volume mounts (comma-separated `host:container` pairs) | @@ -404,6 +407,9 @@ For cloud sandbox backends, persistence is filesystem-oriented. `TERMINAL_LIFETI |----------|-------------| | `HERMES_TUI` | Launch the [TUI](../user-guide/tui.md) instead of the classic CLI when set to `1`. Equivalent to passing `--tui`. | | `HERMES_TUI_DIR` | Path to a prebuilt `ui-tui/` directory (must contain `dist/entry.js` and populated `node_modules`). Used by distros and Nix to skip the first-launch `npm install`. | +| `HERMES_TUI_RESUME` | Resume a specific TUI session by ID on launch. 
When set, `hermes --tui` skips forging a fresh session and picks up the named session instead — useful for re-attaching after a disconnect or terminal crash. | +| `HERMES_TUI_THEME` | Force the TUI color theme: `light`, `dark`, or a raw 6-character background hex (e.g. `ffffff` or `1a1a2e`). When unset, Hermes auto-detects using `COLORFGBG` and terminal background queries; this variable overrides detection on terminals (Ghostty, Warp, iTerm2, etc.) that don't set `COLORFGBG`. | +| `HERMES_INFERENCE_MODEL` | Force the model for `hermes -z` / `hermes chat` without mutating `config.yaml`. Pairs with `HERMES_INFERENCE_PROVIDER`. Useful for scripted callers (sweeper, CI, batch runners) that need to override the default model per run. | ## Cron Scheduler diff --git a/website/docs/reference/slash-commands.md b/website/docs/reference/slash-commands.md index d88705eec50..0678725d46f 100644 --- a/website/docs/reference/slash-commands.md +++ b/website/docs/reference/slash-commands.md @@ -33,6 +33,7 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in | `/snapshot [create\|restore \|prune]` (alias: `/snap`) | Create or restore state snapshots of Hermes config/state. `create [label]` saves a snapshot, `restore ` reverts to it, `prune [N]` removes old snapshots, or list all with no args. | | `/stop` | Kill all running background processes | | `/queue ` (alias: `/q`) | Queue a prompt for the next turn (doesn't interrupt the current agent response). **Note:** `/q` is claimed by both `/queue` and `/quit`; the last registration wins, so `/q` resolves to `/quit` in practice. Use `/queue` explicitly. | +| `/steer ` | Inject a mid-run note that arrives at the agent **after the next tool call** — no interrupt, no new user turn. The text is appended to the last tool result's content once the current tool completes, giving the agent new context without breaking the current tool-calling loop. Use this to nudge direction mid-task (e.g. 
"focus on the auth module" while the agent is running tests). | | `/resume [name]` | Resume a previously-named session | | `/status` | Show session info | | `/agents` (alias: `/tasks`) | Show active agents and running tasks across the current session. | @@ -72,7 +73,7 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in | Command | Description | |---------|-------------| | `/help` | Show this help message | -| `/usage` | Show token usage, cost breakdown, and session duration | +| `/usage` | Show token usage, cost breakdown, session duration, and — when available from the active provider — an **Account limits** section with remaining quota / credits / plan usage pulled live from the provider's API. | | `/insights` | Show usage insights and analytics (last 30 days) | | `/platforms` (alias: `/gateway`) | Show gateway/messaging platform status | | `/paste` | Attach a clipboard image | @@ -140,7 +141,7 @@ The messaging gateway supports the following built-in commands inside Telegram, | `/compress [focus topic]` | Manually compress conversation context. Optional focus topic narrows what the summary preserves. | | `/title [name]` | Set or show the session title. | | `/resume [name]` | Resume a previously named session. | -| `/usage` | Show token usage, estimated cost breakdown (input/output), context window state, and session duration. | +| `/usage` | Show token usage, estimated cost breakdown (input/output), context window state, session duration, and — when available from the active provider — an **Account limits** section with remaining quota / credits pulled live from the provider's API. | | `/insights [days]` | Show usage analytics. | | `/reasoning [level\|show\|hide]` | Change reasoning effort or toggle reasoning display. | | `/voice [on\|off\|tts\|join\|channel\|leave\|status]` | Control spoken replies in chat. `join`/`channel`/`leave` manage Discord voice-channel mode. 
| @@ -159,7 +160,7 @@ The messaging gateway supports the following built-in commands inside Telegram, ## Notes -- `/skin`, `/snapshot`, `/gquota`, `/reload`, `/tools`, `/toolsets`, `/browser`, `/config`, `/cron`, `/skills`, `/platforms`, `/paste`, `/image`, `/terminal-setup`, `/statusbar`, and `/plugins` are **CLI-only** commands. +- `/skin`, `/snapshot`, `/gquota`, `/reload`, `/tools`, `/toolsets`, `/browser`, `/config`, `/cron`, `/skills`, `/platforms`, `/paste`, `/image`, `/terminal-setup`, `/statusbar`, `/mouse`, `/plugins`, and `/steer` are **CLI-only** (the `/mouse` command is TUI-exclusive; `/steer` works in both classic CLI and TUI). - `/verbose` is **CLI-only by default**, but can be enabled for messaging platforms by setting `display.tool_progress_command: true` in `config.yaml`. When enabled, it cycles the `display.tool_progress` mode and saves to config. - `/sethome`, `/update`, `/restart`, `/approve`, `/deny`, and `/commands` are **messaging-only** commands. - `/status`, `/background`, `/voice`, `/reload-mcp`, `/rollback`, `/debug`, `/fast`, and `/yolo` work in **both** the CLI and the messaging gateway. diff --git a/website/docs/user-guide/cli.md b/website/docs/user-guide/cli.md index c3db8961bb9..527a49225f0 100644 --- a/website/docs/user-guide/cli.md +++ b/website/docs/user-guide/cli.md @@ -96,11 +96,17 @@ When resuming a previous session (`hermes -c` or `hermes --resume `), a "Pre | `Alt+V` | Paste an image from the clipboard when supported by the terminal | | `Ctrl+V` | Paste text and opportunistically attach clipboard images | | `Ctrl+B` | Start/stop voice recording when voice mode is enabled (`voice.record_key`, default: `ctrl+b`) | +| `Ctrl+G` | Open the current input buffer in `$EDITOR` (vim/nvim/nano/VS Code/etc.). Save and quit to send the edited text as the next prompt — ideal for long, multi-paragraph prompts. | +| `Ctrl+X Ctrl+E` | Emacs-style alternate binding for the external editor (same behavior as `Ctrl+G`). 
| | `Ctrl+C` | Interrupt agent (double-press within 2s to force exit) | | `Ctrl+D` | Exit | | `Ctrl+Z` | Suspend Hermes to background (Unix only). Run `fg` in the shell to resume. | | `Tab` | Accept auto-suggestion (ghost text) or autocomplete slash commands | +**Multiline paste preview.** When you paste a multi-line block, the CLI echoes a compact single-line preview (`[pasted: 47 lines, 1,842 chars — press Enter to send]`) instead of dumping the whole payload into the scrollback. The full content is still what gets sent; this is just display polish. + +**Markdown stripping in final responses.** The CLI strips the most verbose markdown fences and `**bold**` / `*italic*` wrappers from *final* agent replies so they render as readable terminal prose rather than raw source. Code blocks and lists are preserved. This does not affect gateway platforms or tool results — they keep their markdown for native rendering. + ## Slash Commands Type `/` to see the autocomplete dropdown. Hermes supports a large set of CLI slash commands, dynamic skill commands, and user-defined quick commands. diff --git a/website/docs/user-guide/configuration.md b/website/docs/user-guide/configuration.md index 7eeec950bd3..99c3fc6a502 100644 --- a/website/docs/user-guide/configuration.md +++ b/website/docs/user-guide/configuration.md @@ -132,6 +132,7 @@ terminal: backend: docker docker_image: "nikolaik/python-nodejs:python3.11-nodejs20" docker_mount_cwd_to_workspace: false # Mount launch dir into /workspace + docker_run_as_host_user: false # See "Running container as host user" below docker_forward_env: # Env vars to forward into container - "GITHUB_TOKEN" docker_volumes: # Host directory mounts @@ -145,7 +146,7 @@ terminal: container_persistent: true # Persist /workspace and /root across sessions ``` -**Requirements:** Docker Desktop or Docker Engine installed and running. 
Hermes probes `$PATH` plus common macOS install locations (`/usr/local/bin/docker`, `/opt/homebrew/bin/docker`, Docker Desktop app bundle). +**Requirements:** Docker Desktop or Docker Engine installed and running. Hermes probes `$PATH` plus common macOS install locations (`/usr/local/bin/docker`, `/opt/homebrew/bin/docker`, Docker Desktop app bundle). Podman is supported out of the box: set `HERMES_DOCKER_BINARY=podman` (or the full path) to force it when both are installed. **Container lifecycle:** Hermes reuses a single long-lived container (`docker run -d ... sleep 2h`) for every terminal and file-tool call, across sessions, `/new`, `/reset`, and `delegate_task` subagents, for the lifetime of the Hermes process. Commands run via `docker exec` with a login shell, so working-directory changes, installed packages, and files in `/workspace` all persist from one tool call to the next. The container is stopped and removed on Hermes shutdown (or when the idle-sweep reclaims it). @@ -301,6 +302,23 @@ If terminal commands fail immediately or the terminal tool is reported as disabl When in doubt, set `terminal.backend` back to `local` and verify that commands run there first. +### Remote-to-Host File Sync on Teardown + +For the **SSH**, **Modal**, and **Daytona** backends (anywhere the agent's working tree lives on a different machine than the host running Hermes), Hermes tracks files the agent touched inside the remote sandbox and, on session teardown / sandbox cleanup, **syncs the modified files back to the host** under `~/.hermes/cache/remote-syncs//`. + +- Triggers on: session close, `/new`, `/reset`, gateway message timeout, `delegate_task` subagent completion when the child used a remote backend. +- Covers the whole tree the agent modified, not just files it explicitly opened. Additions, edits, and deletions are all captured. 
+- The remote sandbox may have been torn down by the time you go looking; the local `~/.hermes/cache/remote-syncs/…` copy is the authoritative record of what the agent changed. +- Large binary outputs (model checkpoints, raw datasets) are capped by size — the sync skips files over `file_sync_max_mb` (default `100`). Bump that if you expect bigger artifacts to come back. + +```yaml +terminal: + file_sync_max_mb: 100 # default — sync files up to 100 MB each + file_sync_enabled: true # default — set false to skip the sync entirely +``` + +This is how you recover results from ephemeral cloud sandboxes that get destroyed after the session ends, without having to tell the agent to explicitly `scp` or `modal volume put` every artifact. + ### Docker Volume Mounts When using the Docker backend, `docker_volumes` lets you share host directories with the container. Each entry uses standard Docker `-v` syntax: `host_path:container_path[:options]`. @@ -355,6 +373,20 @@ Hermes resolves each listed variable from your current shell first, then falls b Anything listed in `docker_forward_env` becomes visible to commands run inside the container. Only forward credentials you are comfortable exposing to the terminal session. ::: +### Running the Container as Your Host User + +By default Docker containers run as `root` (UID 0). Files created inside `/workspace` or other bind-mounts end up owned by root on the host, so after a session you have to `sudo chown` them before you can edit them from your host editor. The `terminal.docker_run_as_host_user` flag fixes this: + +```yaml +terminal: + backend: docker + docker_run_as_host_user: true # default: false +``` + +When enabled, Hermes appends `--user $(id -u):$(id -g)` to the `docker run` command so files written into bind-mounted directories (`/workspace`, `/root`, anything in `docker_volumes`) are owned by your host user, not root. 
The trade-off: the container can no longer `apt install` or write to root-owned paths like `/root/.npm` — use a base image whose `HOME` is owned by a non-root user (or add your required tooling at image build time) if you need both. + +Leave this `false` (the default) for backwards-compatible behavior. Turn it on when your workflow is mostly "edit mounted host files" and you're tired of `sudo chown -R`. + ### Optional: Mount the Launch Directory into `/workspace` Docker sandboxes stay isolated by default. Hermes does **not** pass your current host working directory into the container unless you explicitly opt in. @@ -447,6 +479,17 @@ hermes config set skills.config.myplugin.path ~/myplugin-data For details on declaring config settings in your own skills, see [Creating Skills — Config Settings](/docs/developer-guide/creating-skills#config-settings-configyaml). +### Guard on agent-created skill writes + +When the agent uses `skill_manage` to create, edit, patch, or delete a skill, Hermes can optionally scan the new/updated content for dangerous keyword patterns (credential harvesting, obvious prompt injection, exfil instructions). The scanner is **off by default** — real agent workflows that legitimately touch `~/.ssh/` or mention `$OPENAI_API_KEY` were tripping the heuristic too often. Turn it back on if you want the scanner to prompt you before the agent's skill writes land: + +```yaml +skills: + guard_agent_created: true # default: false +``` + +When on, any flagged `skill_manage` write surfaces as an approval prompt with the scanner's rationale. Accepted writes land; denied writes return an explanatory error to the agent. 
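
As a rough illustration of the kind of keyword heuristic described above, and of why it misfires on legitimate content, here is a minimal sketch. The pattern list, the `scan_skill_content` name, and the rationale strings are all hypothetical; Hermes's actual scanner and its rules are not shown in this document.

```python
import re

# Hypothetical patterns for illustration only; not Hermes's real rule set.
PATTERNS = [
    (re.compile(r"~/\.ssh/|id_rsa"), "references SSH key material"),
    (re.compile(r"\$[A-Z_]*API_KEY"), "references an API key variable"),
    (re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
     "looks like prompt injection"),
]


def scan_skill_content(text: str) -> list[str]:
    """Return a rationale string for every pattern that matches the text."""
    return [reason for pattern, reason in PATTERNS if pattern.search(text)]


# A legitimate deployment skill still trips two patterns, which is the
# kind of false positive that led to the scanner being off by default.
benign = "Copy the deploy key from ~/.ssh/ and export $OPENAI_API_KEY first."
flags = scan_skill_content(benign)
print(flags)
```

With `skills.guard_agent_created: true`, a non-empty list like `flags` is the sort of rationale that would accompany the approval prompt; an empty list would let the write land without prompting.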
+ ## Memory Configuration ```yaml @@ -560,6 +603,7 @@ compression: threshold: 0.50 # Compress at this % of context limit target_ratio: 0.20 # Fraction of threshold to preserve as recent tail protect_last_n: 20 # Min recent messages to keep uncompressed + hygiene_hard_message_limit: 400 # Gateway safety valve — see below # The summarization model/provider is configured under auxiliary: auxiliary: @@ -573,6 +617,12 @@ auxiliary: Older configs with `compression.summary_model`, `compression.summary_provider`, and `compression.summary_base_url` are automatically migrated to `auxiliary.compression.*` on first load (config version 17). No manual action needed. ::: +`hygiene_hard_message_limit` is a gateway-only **pre-compression safety valve**. Runaway sessions with thousands of messages can hit model context limits before the normal percent-of-context threshold fires; when message count crosses this ceiling, Hermes forces compression regardless of token usage. Default `400` — raise it for platforms where very long sessions are normal, lower it to force more aggressive compression. Editing this value on a running gateway takes effect on the next message (see below). + +:::tip Gateway hot-reload of compression and context length +As of recent releases, editing `model.context_length` or any `compression.*` key in `config.yaml` on a running gateway takes effect on the next message — no gateway restart, no `/reset`, no session rotation required. The cached-agent signature includes these keys, so the gateway transparently rebuilds the agent when it sees a change. API keys and tool/skill config still require the usual reload paths. +::: + ### Common setups **Default (auto-detect) — no configuration needed:** @@ -581,7 +631,7 @@ compression: enabled: true threshold: 0.50 ``` -Uses the first available provider (OpenRouter → Nous → Codex) with Gemini Flash. +Uses your main provider and main model. Override per-task (e.g. 
`auxiliary.compression.provider: openrouter` + `model: google/gemini-2.5-flash`) if you want compression on a cheaper model than your main chat model. **Force a specific provider** (OAuth or API-key based): ```yaml @@ -647,12 +697,15 @@ Warnings are injected into the last tool result's JSON (as a `_budget_warning` f ```yaml agent: max_turns: 90 # Max iterations per conversation turn (default: 90) + api_max_retries: 2 # Retries per provider before fallback engages (default: 2) ``` Budget pressure is enabled by default. The agent sees warnings naturally as part of tool results, encouraging it to consolidate its work and deliver a response before running out of iterations. When the iteration budget is fully exhausted, the CLI shows a notification to the user: `⚠ Iteration budget reached (90/90) — response may be incomplete`. If the budget runs out during active work, the agent generates a summary of what was accomplished before stopping. +`agent.api_max_retries` controls how many times Hermes retries a provider API call on transient errors (rate limits, connection drops, 5xx) **before** fallback-provider switching engages. The default is `2` — three attempts total, matching the OpenAI SDK default. If you have [fallback providers](/docs/user-guide/features/fallback-providers) configured and want to fail over faster, drop this to `0` so the first transient error on your primary immediately hands off to the fallback instead of churning retries against the flaky endpoint. + ### API Timeouts Hermes has separate timeout layers for streaming, plus a stale detector for non-streaming calls. The stale detectors auto-adjust for local providers only when you leave them at their implicit defaults. @@ -709,7 +762,29 @@ Options: `fill_first` (default), `round_robin`, `least_used`, `random`. See [Cre ## Auxiliary Models -Hermes uses lightweight "auxiliary" models for side tasks like image analysis, web page summarization, and browser screenshot analysis. 
By default, these use **Gemini Flash** via auto-detection — you don't need to configure anything. +Hermes uses "auxiliary" models for side tasks like image analysis, web page summarization, browser screenshot analysis, session-title generation, and context compression. By default (`auxiliary.*.provider: "auto"`), Hermes routes every auxiliary task to your **main chat model** — the same provider/model you picked in `hermes model`. You don't need to configure anything to get started, but be aware that on expensive reasoning models (Opus, MiniMax M2.7, etc.) auxiliary tasks add meaningful cost. If you want cheap-and-fast side tasks regardless of your main model, set `auxiliary..provider` and `auxiliary..model` explicitly (for example, Gemini Flash on OpenRouter for vision and web extraction). + +:::note Why "auto" uses your main model +Earlier builds split aggregator users (OpenRouter, Nous Portal) onto a cheap provider-side default. That was surprising — users who paid for an aggregator subscription would see a different model handling their auxiliary traffic. `auto` now uses the main model for everyone, and per-task overrides in `config.yaml` still win (see [Full auxiliary config reference](#full-auxiliary-config-reference) below). +::: + +### Configuring auxiliary models interactively + +Instead of hand-editing YAML, run `hermes model` and pick **"Configure auxiliary models"** from the menu. You'll get an interactive per-task picker: + +``` +$ hermes model +→ Configure auxiliary models + +[ ] vision currently: auto / main model +[ ] web_extract currently: auto / main model +[ ] session_search currently: openrouter / google/gemini-2.5-flash +[ ] title_generation currently: openrouter / google/gemini-3-flash-preview +[ ] compression currently: auto / main model +[ ] approval currently: auto / main model +``` + +Select a task, pick a provider (OAuth flows open a browser; API-key providers prompt), pick a model. The change persists to `auxiliary..*` in `config.yaml`. 
Same machinery as the main-model picker — no extra syntax to learn. ### Video Tutorial @@ -1088,6 +1163,7 @@ display: streaming: false # Stream tokens to terminal as they arrive (real-time output) show_cost: false # Show estimated $ cost in the CLI status bar tool_preview_length: 0 # Max chars for tool call previews (0 = no limit, show full paths/commands) + runtime_metadata_footer: false # Gateway: append a runtime-context footer to final replies ``` | Mode | What you see | @@ -1099,6 +1175,23 @@ display: In the CLI, cycle through these modes with `/verbose`. To use `/verbose` in messaging platforms (Telegram, Discord, Slack, etc.), set `tool_progress_command: true` in the `display` section above. The command will then cycle the mode and save to config. +### Runtime-metadata footer (gateway only) + +When `display.runtime_metadata_footer: true`, Hermes appends a small runtime-context footer to the **final** message of each gateway turn — same info the CLI shows in its status bar (model, session duration, tokens, cost). Off by default; opt in per-gateway if your team wants every reply to include the provenance. + +```yaml +display: + runtime_metadata_footer: true +``` + +Example footer appended to a Telegram/Discord/Slack reply: + +``` +— claude-opus-4.7 · 12 tool calls · 2m 14s · $0.042 +``` + +Only the **final** message of a turn gets the footer; interim updates stay clean. + ### Per-platform progress overrides Different platforms have different verbosity needs. For example, Signal can't edit messages, so each progress update becomes a separate message — noisy. 
Use `tool_progress_overrides` to set per-platform modes: diff --git a/website/docs/user-guide/docker.md b/website/docs/user-guide/docker.md index 8dac4bea38b..17bd714aacd 100644 --- a/website/docs/user-guide/docker.md +++ b/website/docs/user-guide/docker.md @@ -263,6 +263,8 @@ The official image is based on `debian:13.4` and includes: - Node.js + npm (for browser automation and WhatsApp bridge) - Playwright with Chromium (`npx playwright install --with-deps chromium`) - ripgrep and ffmpeg as system utilities +- **`docker-cli`** — so agents running inside the container can drive the host's Docker daemon (bind-mount `/var/run/docker.sock` to opt in) for `docker build`, `docker run`, container inspection, etc. +- **`openssh-client`** — enables the [SSH terminal backend](/docs/user-guide/configuration#ssh-backend) from inside the container. The SSH backend shells out to the system `ssh` binary; without this, it failed silently in containerized installs. - The WhatsApp bridge (`scripts/whatsapp-bridge/`) The entrypoint script (`docker/entrypoint.sh`) bootstraps the data volume on first run: diff --git a/website/docs/user-guide/features/built-in-plugins.md b/website/docs/user-guide/features/built-in-plugins.md index 20c88df68c0..eb4d27e7281 100644 --- a/website/docs/user-guide/features/built-in-plugins.md +++ b/website/docs/user-guide/features/built-in-plugins.md @@ -162,6 +162,36 @@ Hermes-prefixed and standard SDK env vars (`LANGFUSE_PUBLIC_KEY`, `LANGFUSE_SECR **Disabling:** `hermes plugins disable observability/langfuse`. The plugin module is still discovered, but no module code runs until you re-enable. +### google_meet + +Lets the agent **join, transcribe, and participate in Google Meet calls** — take notes on a meeting, summarize the back-and-forth after, follow up on specific points, and (optionally) speak replies back into the call via TTS. 
+ +**What it adds:** + +- A headless virtual participant that joins a Meet URL using browser automation +- Live transcription of the meeting audio via the configured STT provider +- A `meet_summarize` / `meet_speak` / `meet_followup` toolset the agent invokes to act on what it heard +- Post-meeting artifacts (transcript, speaker-attributed notes, action items) saved under `~/.hermes/cache/google_meet//` + +**Setup:** + +```bash +hermes plugins enable google_meet +# Prompts you to sign in via the plugin's OAuth flow on first use — +# needs a Google account with Meet access. Host approval may be required +# if the meeting enforces "only invited participants can join". +``` + +Usage from chat: + +> "Join meet.google.com/abc-defg-hij and take notes. After the call, send me a summary with action items." + +The agent kicks off the meeting join, streams the transcription back into its context as the call proceeds, and produces a structured summary when the meeting ends (or when you tell it to stop). + +**When to use it:** recurring standups where you want a bot to transcribe + summarize for async attendees; deposition-style interviews where you want structured notes; any case where you'd otherwise need Fireflies / Otter / Grain. When you'd rather not have an AI listening in — don't enable it. + +**Disabling:** `hermes plugins disable google_meet`. Any cached transcripts and recordings stay in `~/.hermes/cache/google_meet/` until you remove them. + ## Adding a bundled plugin Bundled plugins are written exactly like any other Hermes plugin — see [Build a Hermes Plugin](/docs/guides/build-a-hermes-plugin). 
The only differences are: diff --git a/website/docs/user-guide/features/cron.md b/website/docs/user-guide/features/cron.md index 6eb7580bf58..368c4a47cf8 100644 --- a/website/docs/user-guide/features/cron.md +++ b/website/docs/user-guide/features/cron.md @@ -366,6 +366,64 @@ cronjob(action="remove", job_id="...") For `update`, pass `skills=[]` to remove all attached skills. +## Toolsets available to cron jobs + +Cron runs each job in a fresh agent session with no chat platform attached. By default the cron agent gets **the toolset you configured for the `cron` platform in `hermes tools`** — not the CLI default, not everything under the sun. + +```bash +hermes tools +# → pick the "cron" platform in the curses UI +# → toggle toolsets on/off just like you would for Telegram/Discord/etc. +``` + +Tighter per-job control is available via the `enabled_toolsets` field on `cronjob.create` (or on an existing job via `cronjob.update`): + +```text +cronjob(action="create", name="weekly-news-summary", + schedule="every sunday 9am", + enabled_toolsets=["web", "file"], # just web + file, no terminal/browser/etc. + prompt="Summarize this week's AI news: ...") +``` + +When `enabled_toolsets` is set on a job it wins; otherwise the `hermes tools` cron-platform config wins; otherwise Hermes falls back to the built-in defaults. This matters for cost control: carrying `moa`, `browser`, `delegation` into every tiny "fetch news" job bloats the tool-schema prompt on every LLM call. + +### Skipping the agent entirely: `wakeAgent` + +If your cron job attaches a pre-check script (via `script=`), the script can decide at runtime whether Hermes should even invoke the agent. Emit a final stdout line of the form: + +```text +{"wakeAgent": false} +``` + +…and cron skips the agent run entirely for this tick. Useful for frequent polls (every 1–5 min) that only need to wake the LLM when state actually changed — otherwise you pay for zero-content agent turns over and over. 
+ +```python +# pre-check script +# fetch_latest_issue_count, read_state, and write_state are placeholders for your own helpers +import json, sys +latest = fetch_latest_issue_count() +prev = read_state("issue_count") +if latest == prev: + print(json.dumps({"wakeAgent": False})) # skip this tick + sys.exit(0) +write_state("issue_count", latest) +print(json.dumps({"wakeAgent": True, "context": {"new_issues": latest - prev}})) +``` + +When `wakeAgent` is omitted, the default is `true` (wake the agent as usual). + +### Chaining jobs: `context_from` + +A cron job can consume the most recent successful output of one or more other jobs by listing their names (or IDs) in `context_from`: + +```text +cronjob(action="create", name="daily-digest", + schedule="every day 7am", + context_from=["ai-news-fetch", "github-prs-fetch"], + prompt="Write the daily digest using the outputs above.") +``` + +The referenced jobs' most recent completed outputs are injected above the prompt as context for this run. Each upstream entry must be a valid job ID or name (see `cronjob(action="list")`). Note: chaining reads the *most recent completed* output — it does not wait for upstream jobs that are running in the same tick. + ## Job storage Jobs are stored in `~/.hermes/cron/jobs.json`. Output from job runs is saved to `~/.hermes/cron/output/{job_id}/{timestamp}.md`. diff --git a/website/docs/user-guide/features/delegation.md b/website/docs/user-guide/features/delegation.md index f3c832bff0f..ec09d148f94 100644 --- a/website/docs/user-guide/features/delegation.md +++ b/website/docs/user-guide/features/delegation.md @@ -173,6 +173,32 @@ delegate_task( ) ``` +## Child Timeout + +Subagents are killed as stuck if they go quiet for more than `delegation.child_timeout_seconds` wall-clock seconds. The default is **600** (10 minutes) — bumped up from 300s in earlier releases because high-reasoning models on non-trivial research tasks were getting killed mid-think.
Tune it per-install: + +```yaml +delegation: + child_timeout_seconds: 600 # default +``` + +Lower it for fast local models; raise it for slow reasoning models on hard problems. The timer resets every time the child makes an API call or tool call — only genuinely idle workers trigger the kill. + +:::tip Diagnostic dump on zero-call timeout +If a subagent times out having made **zero** API calls (usually: provider unreachable, auth failure, or tool-schema rejection), `delegate_task` writes a structured diagnostic to `~/.hermes/logs/subagent-timeout--.log` containing the subagent's config snapshot, credential-resolution trace, and any early error messages. This is much easier to root-cause than the previous silent-timeout behavior. +::: + +## Monitoring Running Subagents (`/agents`) + +The TUI ships a `/agents` overlay (alias `/tasks`) that turns recursive `delegate_task` fan-out into a first-class audit surface: + +- Live tree view of running and recently-finished subagents, grouped by parent +- Per-branch cost, token, and file-touched rollups +- Kill and pause controls — cancel a specific subagent mid-flight without interrupting its siblings +- Post-hoc review: step through each subagent's turn-by-turn history even after they've returned to the parent + +In the classic CLI, `/agents` just prints a text summary; the TUI is where the overlay shines. See [TUI — Slash commands](/docs/user-guide/tui#slash-commands). + ## Depth Limit and Nested Orchestration By default, delegation is **flat**: a parent (depth 0) spawns children (depth 1), and those children cannot delegate further. This prevents runaway recursive delegation.
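
The flat default can be pictured as a simple depth check at spawn time. This is a minimal sketch, assuming a hypothetical `spawn_child` helper and a `MAX_DEPTH` constant; neither is Hermes's actual API, and the real limit may be enforced differently.

```python
from dataclasses import dataclass

MAX_DEPTH = 1  # assumed: depth 0 is the parent, depth 1 its children


@dataclass
class Agent:
    name: str
    depth: int = 0


def spawn_child(parent: Agent, name: str) -> Agent:
    """Create a child one level deeper, refusing to exceed MAX_DEPTH."""
    if parent.depth >= MAX_DEPTH:
        raise RuntimeError(
            f"{parent.name} is at depth {parent.depth} and cannot delegate"
        )
    return Agent(name=name, depth=parent.depth + 1)


root = Agent("parent")
child = spawn_child(root, "researcher")  # allowed: depth 0 -> 1
try:
    spawn_child(child, "sub-researcher")  # rejected: would be depth 2
    nested_allowed = True
except RuntimeError:
    nested_allowed = False
```

Because the check happens where the child is created, a depth-1 worker never gets the chance to fan out further, which is what keeps recursion bounded.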
diff --git a/website/docs/user-guide/features/fallback-providers.md b/website/docs/user-guide/features/fallback-providers.md index ea8fc3fc8b9..5ead44af14f 100644 --- a/website/docs/user-guide/features/fallback-providers.md +++ b/website/docs/user-guide/features/fallback-providers.md @@ -21,7 +21,15 @@ When your main LLM provider encounters errors — rate limits, server overload, ### Configuration -Add a `fallback_model` section to `~/.hermes/config.yaml`: +The easiest path is the interactive manager: + +```bash +hermes fallback +``` + +`hermes fallback` reuses the provider picker from `hermes model` — same provider list, same credential prompts, same validation. Press `a` to add a fallback, `↑`/`↓` to reorder, `d` to remove, `q` to save and exit. Changes persist under `model.fallback_providers` in `config.yaml`. + +If you'd rather edit the YAML directly, add a `fallback_model` section to `~/.hermes/config.yaml`: ```yaml fallback_model: @@ -31,6 +39,10 @@ fallback_model: Both `provider` and `model` are **required**. If either is missing, the fallback is disabled. +:::note `fallback_model` vs `fallback_providers` +`fallback_model` (singular) is the legacy single-fallback key — Hermes still honors it for back-compat. `fallback_providers` (plural, list) supports multiple fallbacks tried in order; `hermes fallback` writes to this key. When both are set, Hermes merges them with `fallback_providers` taking priority. 
+::: + ### Supported Providers | Provider | Value | Requirements | diff --git a/website/docs/user-guide/features/hooks.md b/website/docs/user-guide/features/hooks.md index 3412255992f..e3893c0a239 100644 --- a/website/docs/user-guide/features/hooks.md +++ b/website/docs/user-guide/features/hooks.md @@ -385,6 +385,8 @@ def register(ctx): | [`pre_gateway_dispatch`](#pre_gateway_dispatch) | Gateway received a user message, before auth + dispatch | `{"action": "skip" \| "rewrite" \| "allow", ...}` to influence flow | | [`pre_approval_request`](#pre_approval_request) | Dangerous command needs user approval, before the prompt/notification is sent | ignored | | [`post_approval_response`](#post_approval_response) | User responded to an approval prompt (or it timed out) | ignored | +| [`transform_tool_result`](#transform_tool_result) | After any tool returns, before the result is handed back to the model | `str` to replace the result, `None` to leave unchanged | +| [`transform_terminal_output`](#transform_terminal_output) | Inside the `terminal` tool, before truncation/ANSI-strip/redact | `str` to replace the raw output, `None` to leave unchanged | --- @@ -1003,6 +1005,94 @@ def register(ctx): --- +### `transform_tool_result` + +Fires **after** a tool returns and **before** the result is appended to the conversation. Lets a plugin rewrite ANY tool's result string — not just terminal output — before the model sees it. + +**Callback signature:** + +```python +def my_callback( + tool_name: str, + arguments: dict, + result: str, + task_id: str | None, + **kwargs, +) -> str | None: +``` + +| Parameter | Type | Description | +|-----------|------|-------------| +| `tool_name` | `str` | Tool that produced the result (`read_file`, `web_extract`, `delegate_task`, …). | +| `arguments` | `dict` | Arguments the model called the tool with. | +| `result` | `str` | The tool's raw result string, post-truncation and post-ANSI-strip. 
| +| `task_id` | `str \| None` | Task/session ID when running inside RL/benchmark environments. | + +**Return value:** `str` to replace the result (the returned string is what the model sees), `None` to leave it unchanged. + +**Use cases:** Redact organization-specific PII from `web_extract` output, wrap long JSON tool responses in a summary header, inject retrieval-augmented hints into `read_file` results, rewrite `delegate_task` subagent reports into a project-specific schema. + +```python +import re +SECRET = re.compile(r"sk-[A-Za-z0-9]{32,}") + +def redact_secrets(tool_name, result, **kwargs): + if SECRET.search(result): + return SECRET.sub("[REDACTED]", result) + return None + +def register(ctx): + ctx.register_hook("transform_tool_result", redact_secrets) +``` + +Applies to every tool. For terminal-only rewriting see `transform_terminal_output` below — it's narrower and runs earlier in the pipeline (pre-truncation, pre-redaction). + +--- + +### `transform_terminal_output` + +Fires inside the `terminal` tool's foreground-output pipeline, **before** the default 50 KB truncation, ANSI strip, and secret redaction. Lets plugins rewrite the raw stdout/stderr of a shell command before any downstream processing touches it. + +**Callback signature:** + +```python +def my_callback( + command: str, + output: str, + exit_code: int, + cwd: str, + task_id: str | None, + **kwargs, +) -> str | None: +``` + +| Parameter | Type | Description | +|-----------|------|-------------| +| `command` | `str` | The shell command that produced the output. | +| `output` | `str` | Raw combined stdout/stderr (may be very large — truncation happens after the hook). | +| `exit_code` | `int` | Process exit code. | +| `cwd` | `str` | Working directory the command ran in. | + +**Return value:** `str` to replace the output, `None` to leave it unchanged. 
+ +**Use cases:** Inject summaries for commands that produce massive output (`du -ah`, `find`, `tree`), tag output with a project-specific marker so downstream hooks know how to handle it, strip timing noise that flaps between runs and defeats prompt caching. + +```python +def summarize_find(command, output, **kwargs): + if command.startswith("find ") and len(output) > 50_000: + lines = output.count("\n") + head = "\n".join(output.splitlines()[:40]) + return f"{head}\n\n[summary: {lines} paths total, showing first 40]" + return None + +def register(ctx): + ctx.register_hook("transform_terminal_output", summarize_find) +``` + +Pairs well with `transform_tool_result` (which covers every other tool). + +--- + ## Shell Hooks Declare shell-script hooks in your `cli-config.yaml` and Hermes will run them as subprocesses whenever the corresponding plugin-hook event fires — in both CLI and gateway sessions. No Python plugin authoring required. diff --git a/website/docs/user-guide/features/tts.md b/website/docs/user-guide/features/tts.md index 2bf6430ff7c..0a49dc69834 100644 --- a/website/docs/user-guide/features/tts.md +++ b/website/docs/user-guide/features/tts.md @@ -135,13 +135,15 @@ Local transcription works out of the box when `faster-whisper` is installed. If ```yaml # In ~/.hermes/config.yaml stt: - provider: "local" # "local" | "groq" | "openai" | "mistral" + provider: "local" # "local" | "groq" | "openai" | "mistral" | "xai" local: model: "base" # tiny, base, small, medium, large-v3 openai: model: "whisper-1" # whisper-1, gpt-4o-mini-transcribe, gpt-4o-transcribe mistral: model: "voxtral-mini-latest" # voxtral-mini-latest, voxtral-mini-2602 + xai: + model: "grok-stt" # xAI Grok STT ``` ### Provider Details @@ -162,6 +164,8 @@ stt: **Mistral API (Voxtral Transcribe)** — Requires `MISTRAL_API_KEY`. Uses Mistral's [Voxtral Transcribe](https://docs.mistral.ai/capabilities/audio/speech_to_text/) models. 
Supports 13 languages, speaker diarization, and word-level timestamps. Install with `pip install hermes-agent[mistral]`. +**xAI Grok STT** — Requires `XAI_API_KEY`. Posts to `https://api.x.ai/v1/stt` as multipart/form-data. Good choice if you're already using xAI for chat or TTS and want one API key for everything. Auto-detection order puts it after Groq — explicitly set `stt.provider: xai` to force it. + **Custom local CLI fallback** — Set `HERMES_LOCAL_STT_COMMAND` if you want Hermes to call a local transcription command directly. The command template supports `{input_path}`, `{output_dir}`, `{language}`, and `{model}` placeholders. ### Fallback Behavior diff --git a/website/docs/user-guide/features/vision.md b/website/docs/user-guide/features/vision.md index 0ef77128d13..51cfe57bd10 100644 --- a/website/docs/user-guide/features/vision.md +++ b/website/docs/user-guide/features/vision.md @@ -189,3 +189,16 @@ Image paste works with any vision-capable model. The image is sent as a base64-e ``` Most modern models support this format, including GPT-4 Vision, Claude (with vision), Gemini, and open-source multimodal models served through OpenRouter. + +## Image Routing (Vision-Capable vs Text-Only Models) + +When a user attaches an image — from the CLI clipboard, the gateway (Telegram/Discord photo), or any other entry point — Hermes routes it based on whether your current model actually supports vision: + +| Your model | What happens to the image | +|---|---| +| **Vision-capable** (GPT-4V, Claude with vision, Gemini, Qwen-VL, MiMo-VL, etc.) | Sent as **real pixels** using the provider's native image content format above. No text summary layer. | +| **Text-only** (DeepSeek V3, smaller open-source models, older chat-only endpoints) | Routed through the `vision_analyze` auxiliary tool — an auxiliary vision model describes the image, and the text description is injected into the conversation. 
| + +You don't configure this — Hermes looks up your current model's capability in the provider metadata and picks the right path automatically. The practical effect: you can switch between vision and non-vision models mid-session and image handling "just works" without changing your workflow. Text-only models get coherent context about the image rather than a broken multimodal payload they'd have to reject. + +Which auxiliary model handles the text-description path is configurable under `auxiliary.vision` — see [Auxiliary Models](/docs/user-guide/configuration#auxiliary-models). diff --git a/website/docs/user-guide/features/voice-mode.md b/website/docs/user-guide/features/voice-mode.md index b82718cf048..2b45141d07f 100644 --- a/website/docs/user-guide/features/voice-mode.md +++ b/website/docs/user-guide/features/voice-mode.md @@ -105,6 +105,8 @@ If `faster-whisper` is installed, voice mode works with **zero API keys** for ST ## CLI Voice Mode +Voice mode is available in both the **classic CLI** (`hermes chat`) and the **TUI** (`hermes --tui`). Behavior is identical across both — same slash commands, same VAD silence detection, same streaming TTS, same hallucination filter. The TUI additionally forwards crash-forensic logs to `~/.hermes/logs/` so push-to-talk failures on exotic audio backends can be reported with a full stack trace rather than disappearing silently. + ### Quick Start Start the CLI and enable voice mode: diff --git a/website/docs/user-guide/messaging/dingtalk.md b/website/docs/user-guide/messaging/dingtalk.md index 9e8e74ee26f..4dd51b8b706 100644 --- a/website/docs/user-guide/messaging/dingtalk.md +++ b/website/docs/user-guide/messaging/dingtalk.md @@ -129,9 +129,25 @@ Optional behavior settings in `~/.hermes/config.yaml`: ```yaml group_sessions_per_user: true + +gateway: + platforms: + dingtalk: + extra: + # Require @mention in groups before the bot replies (parity with Slack/Telegram/Discord). 
+ # DMs ignore this — the bot always replies in 1:1 chats. + require_mention: true + + # Per-platform allowlist. When set, only these DingTalk user IDs can interact with the bot + # (same semantics as DINGTALK_ALLOWED_USERS, but scoped here instead of in .env). + allowed_users: + - user-id-1 + - user-id-2 ``` - `group_sessions_per_user: true` keeps each participant's context isolated inside shared group chats +- `require_mention: true` prevents the bot from responding to every group message — it only answers when someone @-mentions it +- `allowed_users` under `dingtalk.extra` is an alternative to `DINGTALK_ALLOWED_USERS`; if both are set, they're merged ### Start the Gateway diff --git a/website/docs/user-guide/messaging/discord.md b/website/docs/user-guide/messaging/discord.md index d2b06f02379..898d8e7c6f4 100644 --- a/website/docs/user-guide/messaging/discord.md +++ b/website/docs/user-guide/messaging/discord.md @@ -482,6 +482,34 @@ Hermes automatically registers installed skills as **native Discord Application No extra configuration is needed — any skill installed via `hermes skills install` is automatically registered as a Discord slash command on the next gateway restart. +### Disabling Slash Command Registration + +If you run multiple Hermes gateways against the same Discord application (e.g. staging + production), only one of them should own the global slash-command registration — otherwise the last startup wins and the registrations flap. Turn slash registration off on the "follower" gateway: + +```yaml +gateway: + platforms: + discord: + extra: + slash_commands: false # default: true +``` + +Leaving this at `true` on the "primary" gateway keeps the normal behavior — global `/`-menu commands for built-ins and installed skills. 
+ +## Sending Media (`send_message` + `MEDIA:` tags) + +The Discord adapter supports native file uploads for every common media type via the `send_message` tool and inline `MEDIA:/path/to/file` tags emitted by the agent: + +| Type | How it's delivered | +|---|---| +| Images (PNG/JPG/WebP) | Native Discord image attachment with inline preview | +| Animated GIFs | `send_animation` uploads as `animation.gif` so Discord plays it inline (not as a static thumbnail) | +| Video (MP4/MOV) | `send_video` — native video player | +| Audio / Voice | `send_voice` — native voice message when possible, file attachment otherwise | +| Documents (PDF/ZIP/docx/etc.) | `send_document` — native attachment with download button | + +Discord's per-upload size limit depends on the server's boost tier (25 MB free, up to 500 MB). If Hermes gets an HTTP 413, the adapter falls back to a link pointing at the local cache path rather than failing silently. + ## Home Channel You can designate a "home channel" where the bot sends proactive messages (such as cron job output, reminders, and notifications). There are two ways to set it: diff --git a/website/docs/user-guide/messaging/signal.md b/website/docs/user-guide/messaging/signal.md index bc72c27b207..f7653819462 100644 --- a/website/docs/user-guide/messaging/signal.md +++ b/website/docs/user-guide/messaging/signal.md @@ -168,6 +168,16 @@ All outgoing media goes through Signal's standard attachment API. Unlike some pl Attachment size limit: **100 MB** (both directions). +### Native Formatting, Reply Quotes, and Reactions + +Signal messages render with **native formatting** instead of literal markdown characters. The adapter converts markdown (`**bold**`, `*italic*`, `` `code` ``, `~~strike~~`, `||spoiler||`, headings) into Signal `bodyRanges` so the text shows up with real styling on the recipient's client rather than as visible `**` / `` ` `` characters. 
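
As a rough illustration of the markdown-to-`bodyRanges` conversion described above, the following Python sketch strips a few inline markers and records where each styled span lands in the plain text. The function name `markdown_to_body_ranges`, the marker table, and the `(start, length, style)` tuple layout are hypothetical and are not taken from the Hermes adapter. The style names follow Signal's text-style vocabulary, and counting offsets in UTF-16 code units is an assumption about what the receiving side expects. Nested markers and headings are deliberately not handled.

```python
import re

# Hypothetical marker table: (pattern, Signal style name).
# "**" is listed before "*" so bold is tried before italic.
_MARKERS = [
    (re.compile(r"\*\*(.+?)\*\*"), "BOLD"),
    (re.compile(r"~~(.+?)~~"), "STRIKETHROUGH"),
    (re.compile(r"\|\|(.+?)\|\|"), "SPOILER"),
    (re.compile(r"`(.+?)`"), "MONOSPACE"),
    (re.compile(r"\*(.+?)\*"), "ITALIC"),
]


def _utf16_len(text):
    """Length of text in UTF-16 code units (assumed offset unit)."""
    return len(text.encode("utf-16-le")) // 2


def markdown_to_body_ranges(source):
    """Return (plain_text, ranges) where ranges is a list of
    (start, length, style) tuples over the plain text."""
    plain = []
    ranges = []
    offset = 0  # UTF-16 length of the plain text emitted so far
    pos = 0
    while pos < len(source):
        for pattern, style in _MARKERS:
            match = pattern.match(source, pos)
            if match:
                inner = match.group(1)
                length = _utf16_len(inner)
                ranges.append((offset, length, style))
                plain.append(inner)
                offset += length
                pos = match.end()
                break
        else:
            # No marker starts here: copy the character through unchanged.
            char = source[pos]
            plain.append(char)
            offset += _utf16_len(char)
            pos += 1
    return "".join(plain), ranges


text, styles = markdown_to_body_ranges("Use **bold** and `code`")
print(text)
print(styles)
```

The marker characters never reach the recipient in this sketch; only the plain text and the list of ranges would be handed to the sending layer.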
+ +**Reply quotes.** When Hermes replies to a specific message, it now posts a native reply that quotes the original — same UI affordance Signal users see when they use "Reply" themselves. This is automatic for replies generated in response to an inbound message. + +**Reactions.** The agent can react to messages via the standard reaction API; reactions surface in Signal as emoji reactions on the referenced message rather than as extra text. + +None of this requires additional config — it ships on by default in recent signal-cli builds. If your `signal-cli` version is too old, Hermes falls back to plaintext delivery and logs a one-time warning. + ### Typing Indicators The bot sends typing indicators while processing messages, refreshing every 8 seconds. diff --git a/website/docs/user-guide/messaging/slack.md b/website/docs/user-guide/messaging/slack.md index 72e22db2327..f5b29c9d132 100644 --- a/website/docs/user-guide/messaging/slack.md +++ b/website/docs/user-guide/messaging/slack.md @@ -347,6 +347,14 @@ slack: # but you can set this explicitly for consistency with other platforms) require_mention: true + # Prevent thread auto-engagement: only reply to channel messages that + # contain an explicit @mention. With this OFF (default), Slack can + # "auto-engage" — remembering past mentions in a thread and following + # up on bot-message replies, and resuming active sessions without a + # fresh mention. With strict_mention ON, every new channel message + # must @mention the bot before Hermes will respond. 
+ strict_mention: false + # Custom mention patterns that trigger the bot # (in addition to the default @mention detection) mention_patterns: @@ -357,6 +365,10 @@ slack: reply_prefix: "" ``` +:::tip When to use `strict_mention` +Set this to `true` in busy workspaces where Slack's default "the bot remembers this thread" behavior surprises users — for example, a long tech-support thread where the bot helped at the start and you'd rather it stay silent unless explicitly pinged again. DMs and active interactive sessions are unaffected. +::: + :::info Slack supports both patterns: `@mention` required to start a conversation by default, but you can opt specific channels out via `SLACK_FREE_RESPONSE_CHANNELS` (comma-separated channel IDs) or `slack.free_response_channels` in `config.yaml`. Once the bot has an active session in a thread, subsequent thread replies do not require a mention. In DMs the bot always responds without needing a mention. ::: diff --git a/website/docs/user-guide/messaging/telegram.md b/website/docs/user-guide/messaging/telegram.md index dbdfc3f4ac4..ad1ec492bf9 100644 --- a/website/docs/user-guide/messaging/telegram.md +++ b/website/docs/user-guide/messaging/telegram.md @@ -144,6 +144,22 @@ Then: If you already have a `docker_volumes:` section, add the new mount to the same list. YAML duplicate keys silently override earlier ones. +### Supported `MEDIA:` file extensions + +The gateway extracts `MEDIA:/path/to/file` tags from agent replies and ships the referenced file as a platform-native attachment. 
Supported extensions across all gateway platforms: + +| Category | Extensions | +|---|---| +| Images | `png`, `jpg`, `jpeg`, `gif`, `webp`, `bmp`, `tiff`, `svg` | +| Audio | `mp3`, `wav`, `ogg`, `m4a`, `opus`, `flac`, `aac` | +| Video | `mp4`, `mov`, `webm`, `mkv`, `avi` | +| **Documents** | `pdf`, `txt`, `md`, `csv`, `json`, `xml`, `html`, `yaml`, `yml`, `log` | +| **Office** | `docx`, `xlsx`, `pptx`, `odt`, `ods`, `odp` | +| **Archives** | `zip`, `rar`, `7z`, `tar`, `gz`, `bz2` | +| **Books / packages** | `epub`, `apk`, `ipa` | + +Anything on this list is delivered as a native attachment on platforms that support it (Telegram, Discord, Signal, Slack, WhatsApp, Feishu, Matrix, etc.); on platforms without native support it falls back to a link or plain-text indicator. The **bold** categories were added in the last few releases — if you were relying on the model saying `here is the file: /path/to/report.docx` instead, swap to `MEDIA:/path/to/report.docx` for native delivery. + ## Webhook Mode By default, Hermes connects to Telegram using **long polling** — the gateway makes outbound requests to Telegram's servers to fetch new updates. This works well for local and always-on deployments. @@ -451,6 +467,50 @@ To find a topic's `thread_id`, open the topic in Telegram Web or Desktop and loo - **Privacy policy:** Telegram now requires bots to have a privacy policy. Set one via BotFather with `/setprivacy_policy`, or Telegram may auto-generate a placeholder. This is particularly important if your bot is public-facing. - **Message streaming:** Bot API 9.x added support for streaming long responses, which can improve perceived latency for lengthy agent replies. +## Rendering: Tables and Link Previews + +Telegram's MarkdownV2 has no native table syntax — pipe tables render as backslash-escaped noise if passed through raw.
Hermes normalizes markdown tables automatically: + +- **Small tables** are flattened into **row-group bullets** — each row becomes a readable bulleted list under the column headings. Good for 2–4 columns and short cells. +- **Larger or wider tables** fall back to a **fenced code block** with aligned columns so nothing collapses. A one-line prompt hint is added so the agent knows to prefer prose follow-ups over more tables on Telegram. + +There's nothing to configure — the adapter picks the right fallback per message. If you want the legacy "always code-block" behavior, disable table normalization by setting `telegram.pretty_tables: false` in `config.yaml` (default: `true`). + +**Link previews.** Telegram auto-generates link previews for URLs in bot messages. If you'd rather suppress those (long `/tools` output, agent reply that mentions ten links, etc.): + +```yaml +gateway: + platforms: + telegram: + extra: + disable_link_previews: true +``` + +When enabled, Hermes attaches Telegram's `LinkPreviewOptions(is_disabled=True)` to every outgoing message and falls back to the legacy `disable_web_page_preview` parameter on older `python-telegram-bot` versions. + +## Group Allowlisting by Chat ID + +In addition to per-user access control via `TELEGRAM_ALLOWED_USERS`, you can allowlist entire group chats (and forum topics) by their numeric chat ID. Useful for team/support bots where any group member should be able to chat, but only in certain groups or topics. + +```yaml +gateway: + platforms: + telegram: + extra: + group_allowed_chats: + - -1001234567890 # supergroup — all members allowed + - -1001234567891/42 # supergroup + forum thread_id 42 only +``` + +Equivalent env var: `TELEGRAM_GROUP_ALLOWED_USERS="-1001234567890,-1001234567891/42"` (comma-separated; the `/` suffix is optional). + +Behavior: + +- A chat that appears in `group_allowed_chats` bypasses `TELEGRAM_ALLOWED_USERS` for its members — anyone in the group can interact with the bot. 
+- Omit the `/` suffix to allow the whole group; include it to allow just that forum topic. +- DMs still require the user ID to be in `TELEGRAM_ALLOWED_USERS`. +- This layers cleanly on top of `group_topics` (for topic-scoped skill binding) and `ignored_threads` (for silencing specific topics). + ## Interactive Model Picker When you send `/model` with no arguments in a Telegram chat, Hermes shows an interactive inline keyboard for switching models: diff --git a/website/docs/user-guide/security.md b/website/docs/user-guide/security.md index 3f2fe665bd4..dfb35dd520e 100644 --- a/website/docs/user-guide/security.md +++ b/website/docs/user-guide/security.md @@ -65,9 +65,31 @@ The `/yolo` command is a **toggle** — each use flips the mode on or off: YOLO mode is available in both CLI and gateway sessions. Internally, it sets the `HERMES_YOLO_MODE` environment variable which is checked before every command execution. :::danger -YOLO mode disables **all** dangerous command safety checks for the session. Use only when you fully trust the commands being generated (e.g., well-tested automation scripts in disposable environments). +YOLO mode disables **all** dangerous command safety checks for the session — **except** the hardline blocklist (see below). Use only when you fully trust the commands being generated (e.g., well-tested automation scripts in disposable environments). ::: +### Hardline Blocklist (Always-On Floor) + +Some commands are so catastrophic — irreversible filesystem wipes, fork bombs, direct block-device writes — that Hermes refuses to run them **regardless** of: + +- `--yolo` / `/yolo` toggled on +- `approvals.mode: off` +- Cron jobs running in headless `approve` mode +- User explicitly clicking "allow always" + +The blocklist is the floor below `--yolo`. It trips **before** the approval layer even sees the command, and there's no override flag. 
Patterns currently covered (not exhaustive; kept in sync with `tools/approval.py::UNRECOVERABLE_BLOCKLIST`): + +| Pattern | Why it's hardline | +|---|---| +| `rm -rf /` and obvious variants | Wipes the filesystem root | +| `rm -rf --no-preserve-root /` | The explicit "yes I mean root" variant | +| `:(){ :\|:& };:` (bash fork bomb) | Pegs the host until reboot | +| `mkfs.*` on a mounted root device | Formats the live system | +| `dd if=/dev/zero of=/dev/sd*` | Zeroes a physical disk | +| Piping untrusted URLs to `sh` at the rootfs top level | Remote-code-execution attack vector too broad to approve | + +If you hit the blocklist, the tool call returns an explanatory error to the agent and nothing runs. If a legitimate workflow needs one of these commands (you're the operator of a wipe-and-reinstall pipeline, for example), run it outside the agent. + ### Approval Timeout When a dangerous command prompt appears, the user has a configurable amount of time to respond. If no response is given within the timeout, the command is **denied** by default (fail-closed). @@ -479,7 +501,20 @@ All URL-capable tools (web search, web extract, vision, browser) validate URLs b - **Cloud metadata hostnames**: `metadata.google.internal`, `metadata.goog` - **Reserved, multicast, and unspecified addresses** -SSRF protection is always active and cannot be disabled. DNS failures are treated as blocked (fail-closed). Redirect chains are re-validated at each hop to prevent redirect-based bypasses. +SSRF protection is always active for internet-facing use and DNS failures are treated as blocked (fail-closed). Redirect chains are re-validated at each hop to prevent redirect-based bypasses. + +#### Intentionally allowing private URLs + +Some setups legitimately need private/internal URL access — home networks that resolve `home.arpa` to RFC 1918 space, LAN-only Ollama/llama.cpp endpoints, internal wikis, cloud metadata debugging, and the like. 
For those cases there's a global opt-out: + +```yaml +security: + allow_private_urls: true # default: false +``` + +When on, web tools, the browser, vision URL fetches, and gateway media downloads no longer reject RFC 1918 / loopback / link-local / CGNAT / cloud-metadata destinations. **This is a deliberate trust boundary** — only enable it on machines where the agent running arbitrary prompt-injected URLs against the local network is an acceptable risk. Public-facing gateways should leave it off. + +The host-substring guard (which blocks lookalike Unicode domain tricks even when the underlying IP is public) stays on regardless of this setting. ### Tirith Pre-Exec Security Scanning diff --git a/website/docs/user-guide/tui.md b/website/docs/user-guide/tui.md index 8c1b179b674..c7f0eeb8442 100644 --- a/website/docs/user-guide/tui.md +++ b/website/docs/user-guide/tui.md @@ -76,6 +76,8 @@ Keybindings match the [Classic CLI](cli.md#keybindings) exactly. The only behavi - **`Cmd+V` / `Ctrl+V`** first tries normal text paste, then falls back to OSC52/native clipboard reads, and finally image attach when the clipboard or pasted payload resolves to an image. - **`/terminal-setup`** installs local VS Code / Cursor / Windsurf terminal bindings for better `Cmd+Enter` and undo/redo parity on macOS. - **Slash autocompletion** opens as a floating panel with descriptions, not an inline dropdown. +- **`Ctrl+X`** — when a queued message is highlighted (sent while the agent was still running), delete it from the queue. **`Esc`** cancels editing and unhighlights without deleting. +- **`Ctrl+G` / `Ctrl+X Ctrl+E`** — open the current input buffer in `$EDITOR` for multi-line / long-prompt composition; save-and-exit sends the contents back as the prompt. ## Slash commands @@ -89,9 +91,56 @@ All slash commands work unchanged. 
A few are TUI-owned — they produce richer o | `/skin` | Live preview — theme change applies as you browse | | `/details` | Toggle verbose tool-call details (global or per-section) | | `/usage` | Rich token / cost / context panel | +| `/agents` (alias `/tasks`) | Observability overlay — live subagent tree with kill/pause controls, per-branch cost / token / file rollups, turn-by-turn history | +| `/reload` | Re-reads `~/.hermes/.env` into the running TUI process so newly added API keys take effect without a restart | +| `/mouse` | Toggle mouse tracking on/off at runtime (also persists to `display.mouse_tracking` in `config.yaml`) | Every other slash command (including installed skills, quick commands, and personality toggles) works identically to the classic CLI. See [Slash Commands Reference](../reference/slash-commands.md). +## LaTeX math rendering + +The TUI's markdown pipeline renders LaTeX math inline: `$E = mc^2$` and `$$\frac{a}{b}$$` render as Unicode-formatted math instead of the raw TeX source. Works for inline and block math; unsupported syntax falls back to showing the literal TeX wrapped in a code span so it remains copyable. + +This is always-on — nothing to configure. Classic CLI keeps the raw TeX. + +## Light-terminal detection + +The TUI auto-detects light terminals and swaps to the light theme accordingly. Detection works in three layers: + +1. `HERMES_TUI_THEME` env var — highest priority. Values: `light`, `dark`, or a raw 6-char background hex (e.g. `ffffff`, `1a1a2e`). +2. `COLORFGBG` env var — the classic "what's my background color?" hint used by xterm-derived terminals. +3. Terminal background probe via OSC 11 — works on modern terminals (Ghostty, Warp, iTerm2, WezTerm, Kitty) that don't set `COLORFGBG`. 
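
The three detection layers above can be sketched as a simple priority chain. This is a minimal illustration, not the TUI's actual code: the function name `detect_theme`, the injected `probe_background` callback (standing in for the OSC 11 query), the luma threshold, the treatment of `COLORFGBG` background indices 7 and 15 as light, and the final fallback to `dark` are all assumptions made for the example.

```python
import re

_HEX6 = re.compile(r"^[0-9a-fA-F]{6}$")


def _is_light_hex(hex_color):
    """Classify a 6-char background hex by approximate luma (assumed threshold)."""
    red, green, blue = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    luma = 0.299 * red + 0.587 * green + 0.114 * blue
    return luma >= 128


def detect_theme(env, probe_background=None):
    """Return "light" or "dark" using the three layers in priority order.

    env is a mapping such as os.environ; probe_background is a hypothetical
    callable returning a 6-char hex string (or None) for the OSC 11 layer.
    """
    # Layer 1: explicit override, either a theme name or a raw background hex.
    override = env.get("HERMES_TUI_THEME", "").strip().lower()
    if override in ("light", "dark"):
        return override
    if _HEX6.match(override):
        return "light" if _is_light_hex(override) else "dark"

    # Layer 2: COLORFGBG looks like "fg;bg" (sometimes "fg;default;bg").
    background = env.get("COLORFGBG", "").rsplit(";", 1)[-1]
    if background.isdigit():
        # Assumption: ANSI indices 7 and 15 are the light backgrounds.
        return "light" if int(background) in (7, 15) else "dark"

    # Layer 3: ask the terminal, via whatever probe the caller supplies.
    if probe_background is not None:
        probed = probe_background()
        if probed and _HEX6.match(probed):
            return "light" if _is_light_hex(probed) else "dark"

    # Assumed fallback when no layer produces an answer.
    return "dark"


print(detect_theme({"COLORFGBG": "0;15"}))
```

Passing the environment and the probe in as arguments keeps the sketch testable without a real terminal; a caller would typically pass `os.environ` and a function that performs the actual terminal query.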
+ +If you want the light theme permanently regardless of terminal: + +```bash +export HERMES_TUI_THEME=light +``` + +## Busy indicator styles + +The status-bar FaceTicker is pluggable — the default rotates Hermes' kawaii face palette every 2.5 seconds during agent work. Pick a different style (or `none` for a minimal dot) via config: + +```yaml +display: + busy_indicator: + style: kawaii # kawaii | minimal | dots | wings | none +``` + +Styles ship with matched glyph widths so the rest of the status bar doesn't jitter on rotation. + +## Auto-resume + +By default, `hermes --tui` starts a fresh session each launch. To re-attach to the most recent TUI session automatically (useful when your terminal or SSH connection drops unexpectedly), opt in: + +```bash +export HERMES_TUI_RESUME=1 # most-recent TUI session +# or: +export HERMES_TUI_RESUME= # specific session +``` + +Unset the variable or pass `--resume ` explicitly to override on a per-launch basis. + ## Status line The TUI's status line tracks agent state in real time: @@ -106,6 +155,11 @@ The TUI's status line tracks agent state in real time: The per-skin status-bar colors and thresholds are shared with the classic CLI — see [Skins](features/skins.md) for customization. +The status line also shows: + +- **Working directory with git branch** — `~/projects/hermes-agent (docs/two-week-gap-sweep)`. The branch suffix updates when you `git checkout` in a side terminal (mtime-cached) so the TUI reflects your actual active branch, not whatever it was at launch. +- **Per-prompt elapsed time** — `⏱ 12s/3m 45s` while the turn is running (live), frozen to `⏲ 32s / 3m 45s` after the turn completes. First number is time since last user message; second is total session duration. Resets on every new prompt. + ## Configuration The TUI respects all standard Hermes config: `~/.hermes/config.yaml`, profiles, personalities, skins, quick commands, credential pools, memory providers, tool/skill enablement. 
No TUI-specific config file exists. diff --git a/website/static/api/model-catalog.json b/website/static/api/model-catalog.json index b874ac06a3a..0845f7339ac 100644 --- a/website/static/api/model-catalog.json +++ b/website/static/api/model-catalog.json @@ -1,6 +1,6 @@ { "version": 1, - "updated_at": "2026-04-26T19:27:12Z", + "updated_at": "2026-04-30T03:06:09Z", "metadata": { "source": "hermes-agent repo", "docs": "https://hermes-agent.nousresearch.com/docs/reference/model-catalog" @@ -165,6 +165,9 @@ { "id": "xiaomi/mimo-v2.5" }, + { + "id": "tencent/hy3-preview" + }, { "id": "anthropic/claude-opus-4.7" },