mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-01 16:31:56 +08:00
docs: two-week gap sweep — platforms, CLI, config, TUI, hooks, providers (#17727)
Covers ~60 merged PRs from Apr 15–29 that shipped user-visible behavior without docs coverage. No functional code changes; docs + static manifest regeneration only.

Highlights:

Stale / incorrect:
- configuration.md: auxiliary auto-routing line was wrong since #11900; now correctly states auto routes to the main model, with a note on the cost trade-off and per-task override pattern.
- integrations/providers.md + configuration.md compression intro: removed stale 'Gemini Flash via OpenRouter' claim.
- website/static/api/model-catalog.json: rebuilt from hermes_cli/models.py so the live manifest picks up tencent/hy3-preview (and remains in sync for future model-catalog PRs).

Platform messaging (#17417 #16997 #16193 #14315 #13151 #11794 #10610 #10283 #10246 #11564 #13178):
- Signal: native formatting (bodyRanges), reply quotes, reactions.
- Telegram: table rendering (bullets + code-block fallback), disable_link_previews, group_allowed_chats.
- Slack: strict_mention config.
- Discord: slash_commands disable, send_animation GIF, send_message native media attachments.
- DingTalk: require_mention + allowed_users.

CLI (#16052 #16539 #16566 #15841 #14798 #10043):
- New 'hermes fallback' interactive manager.
- New 'hermes update --check', '--backup' flag, and pre-update pairing snapshot behavior.
- 'hermes gateway start/restart --all' multi-profile flag.
- cron.md: 'hermes tools' as a platform, per-job enabled_toolsets, wakeAgent gate, context_from chaining.

Config keys / env vars (#17305 #17026 #17000 #15077 #14557 #14227 #14166 #14730 #17008):
- terminal.docker_run_as_host_user, display.runtime_metadata_footer, compression.hygiene_hard_message_limit, HINDSIGHT_TIMEOUT, skills.guard_agent_created, TAVILY_BASE_URL, security.allow_private_urls, agent.api_max_retries, gateway hot-reload of compression/context_length config edits.

TUI / CLI UX (#17130 #17113 #17175 #17150 #16707 #12312 #12305 #12934 #14810 #14045 #17286 #17126):
- HERMES_TUI_RESUME, HERMES_TUI_THEME, LaTeX rendering, busy-indicator styles, ctrl-x queued-message delete, git branch in status bar, per-prompt elapsed stopwatch, external-editor keybind, markdown stripping, TUI voice-mode parity, /agents overlay, /reload + /mouse.

Gateway features (#16506 #15027 #13428 #12116):
- Native multimodal image routing based on vision capability.
- /usage account-limits section.
- /steer slash command (added to reference + explanation in CLI).

Plugins / hooks (#12929 #12972 #10763 #16364):
- transform_tool_result, transform_terminal_output plugin hooks.
- PluginContext.dispatch_tool() documented with slash-command example.
- google_meet bundled plugin entry under built-in-plugins.md.

Other (#16576 #16572 #16383 #15878 #15608 #15606 #14809 #14767 #14231 #14232 #14307 #13683 #12373 #11891 #11291 #10066):
- hermes backup exclusions (WAL/SHM/journal + checkpoints/).
- security.md hardline blocklist (floor below --yolo).
- FHS install layout for root installs.
- openssh-client + docker-cli baked into the Docker image.
- MEDIA: tag supported extensions table (docs/office/archives/pdf).
- Remote-to-host file sync on SSH/Modal/Daytona teardown.
- 'hermes model' -> Configure Auxiliary Models interactive picker.
- Podman support via HERMES_DOCKER_BINARY.

Providers / STT / one-shot (#15045 #14473 #15704):
- alibaba-coding-plan first-class provider entry.
- xAI Grok STT as a 6th transcription option.
- 'hermes -z' scripted one-shot mode + HERMES_INFERENCE_MODEL.

Build: 'docusaurus build' succeeds. No new broken links/anchors; pre-existing warnings unchanged.
@@ -132,6 +132,7 @@ terminal:
  backend: docker
  docker_image: "nikolaik/python-nodejs:python3.11-nodejs20"
  docker_mount_cwd_to_workspace: false  # Mount launch dir into /workspace
  docker_run_as_host_user: false  # See "Running container as host user" below
  docker_forward_env:  # Env vars to forward into container
    - "GITHUB_TOKEN"
  docker_volumes:  # Host directory mounts
@@ -145,7 +146,7 @@ terminal:
  container_persistent: true  # Persist /workspace and /root across sessions
```

**Requirements:** Docker Desktop or Docker Engine installed and running. Hermes probes `$PATH` plus common macOS install locations (`/usr/local/bin/docker`, `/opt/homebrew/bin/docker`, Docker Desktop app bundle).
**Requirements:** Docker Desktop or Docker Engine installed and running. Hermes probes `$PATH` plus common macOS install locations (`/usr/local/bin/docker`, `/opt/homebrew/bin/docker`, Docker Desktop app bundle). Podman is supported out of the box: set `HERMES_DOCKER_BINARY=podman` (or the full path) to force it when both are installed.

**Container lifecycle:** Hermes reuses a single long-lived container (`docker run -d ... sleep 2h`) for every terminal and file-tool call, across sessions, `/new`, `/reset`, and `delegate_task` subagents, for the lifetime of the Hermes process. Commands run via `docker exec` with a login shell, so working-directory changes, installed packages, and files in `/workspace` all persist from one tool call to the next. The container is stopped and removed on Hermes shutdown (or when the idle-sweep reclaims it).
@@ -301,6 +302,23 @@ If terminal commands fail immediately or the terminal tool is reported as disabl

When in doubt, set `terminal.backend` back to `local` and verify that commands run there first.

### Remote-to-Host File Sync on Teardown

For the **SSH**, **Modal**, and **Daytona** backends (anywhere the agent's working tree lives on a different machine than the host running Hermes), Hermes tracks files the agent touched inside the remote sandbox and, on session teardown / sandbox cleanup, **syncs the modified files back to the host** under `~/.hermes/cache/remote-syncs/<session-id>/`.

- Triggers on: session close, `/new`, `/reset`, gateway message timeout, `delegate_task` subagent completion when the child used a remote backend.
- Covers the whole tree the agent modified, not just files it explicitly opened. Additions, edits, and deletions are all captured.
- The remote sandbox may have been torn down by the time you go looking; the local `~/.hermes/cache/remote-syncs/…` copy is the authoritative record of what the agent changed.
- Large binary outputs (model checkpoints, raw datasets) are capped by size — the sync skips files over `file_sync_max_mb` (default `100`). Bump that if you expect bigger artifacts to come back.

```yaml
terminal:
  file_sync_max_mb: 100  # default — sync files up to 100 MB each
  file_sync_enabled: true  # default — set false to skip the sync entirely
```

This is how you recover results from ephemeral cloud sandboxes that get destroyed after the session ends, without having to tell the agent to explicitly `scp` or `modal volume put` every artifact.
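The size cap described above can be pictured with a small sketch. This is not Hermes's implementation: `RemoteFile` and `select_files_for_sync` are hypothetical names, and treating 1 MB as 1024 × 1024 bytes is an assumption. It only illustrates how `file_sync_max_mb` and `file_sync_enabled` might partition the files the agent touched.

```python
from dataclasses import dataclass

BYTES_PER_MB = 1024 * 1024  # assumption: the cap is measured in binary megabytes


@dataclass(frozen=True)
class RemoteFile:
    """Hypothetical record for a file the agent touched in the sandbox."""
    path: str
    size_bytes: int


def select_files_for_sync(files, max_mb=100, enabled=True):
    """Split touched files into (to_sync, skipped) using the size cap."""
    if not enabled:
        # Mirrors file_sync_enabled: false, so nothing is copied back.
        return [], list(files)
    limit = max_mb * BYTES_PER_MB
    to_sync = [f for f in files if f.size_bytes <= limit]
    skipped = [f for f in files if f.size_bytes > limit]
    return to_sync, skipped
```

With the default cap, a 4 KB report would come back to the host while a 250 MB checkpoint would be skipped; raising `max_mb` to 500 would bring both back.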
### Docker Volume Mounts

When using the Docker backend, `docker_volumes` lets you share host directories with the container. Each entry uses standard Docker `-v` syntax: `host_path:container_path[:options]`.
@@ -355,6 +373,20 @@ Hermes resolves each listed variable from your current shell first, then falls b
Anything listed in `docker_forward_env` becomes visible to commands run inside the container. Only forward credentials you are comfortable exposing to the terminal session.
:::

### Running the Container as Your Host User

By default Docker containers run as `root` (UID 0). Files created inside `/workspace` or other bind-mounts end up owned by root on the host, so after a session you have to `sudo chown` them before you can edit them from your host editor. The `terminal.docker_run_as_host_user` flag fixes this:

```yaml
terminal:
  backend: docker
  docker_run_as_host_user: true  # default: false
```

When enabled, Hermes appends `--user $(id -u):$(id -g)` to the `docker run` command so files written into bind-mounted directories (`/workspace`, `/root`, anything in `docker_volumes`) are owned by your host user, not root. The trade-off: the container can no longer `apt install` or write to root-owned paths like `/root/.npm` — use a base image whose `HOME` is owned by a non-root user (or add your required tooling at image build time) if you need both.

Leave this `false` (the default) for backwards-compatible behavior. Turn it on when your workflow is mostly "edit mounted host files" and you're tired of `sudo chown -R`.
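As a rough illustration of what the flag changes, the sketch below builds the extra `docker run` arguments. `docker_user_args` is a hypothetical helper, not a Hermes function; it uses `os.getuid()` and `os.getgid()`, which exist only on Unix-like hosts, as the Python equivalent of `$(id -u):$(id -g)`.

```python
import os


def docker_user_args(run_as_host_user, uid=None, gid=None):
    """Return the extra `docker run` arguments for the host-user option."""
    if not run_as_host_user:
        # Default behavior: no --user flag, the container runs as root (UID 0).
        return []
    # Fall back to the current host user when no explicit IDs are given.
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid
    return ["--user", f"{uid}:{gid}"]
```

For a host user with UID and GID 1000, `docker_user_args(True, 1000, 1000)` yields `["--user", "1000:1000"]`, which is why files written to bind mounts end up owned by that user.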
### Optional: Mount the Launch Directory into `/workspace`

Docker sandboxes stay isolated by default. Hermes does **not** pass your current host working directory into the container unless you explicitly opt in.
@@ -447,6 +479,17 @@ hermes config set skills.config.myplugin.path ~/myplugin-data

For details on declaring config settings in your own skills, see [Creating Skills — Config Settings](/docs/developer-guide/creating-skills#config-settings-configyaml).

### Guard on agent-created skill writes

When the agent uses `skill_manage` to create, edit, patch, or delete a skill, Hermes can optionally scan the new/updated content for dangerous keyword patterns (credential harvesting, obvious prompt injection, exfil instructions). The scanner is **off by default** — real agent workflows that legitimately touch `~/.ssh/` or mention `$OPENAI_API_KEY` were tripping the heuristic too often. Turn it back on if you want the scanner to prompt you before the agent's skill writes land:

```yaml
skills:
  guard_agent_created: true  # default: false
```

When on, any flagged `skill_manage` write surfaces as an approval prompt with the scanner's rationale. Accepted writes land; denied writes return an explanatory error to the agent.

## Memory Configuration

```yaml
@@ -560,6 +603,7 @@ compression:
  threshold: 0.50  # Compress at this % of context limit
  target_ratio: 0.20  # Fraction of threshold to preserve as recent tail
  protect_last_n: 20  # Min recent messages to keep uncompressed
  hygiene_hard_message_limit: 400  # Gateway safety valve — see below

# The summarization model/provider is configured under auxiliary:
auxiliary:
@@ -573,6 +617,12 @@ auxiliary:
Older configs with `compression.summary_model`, `compression.summary_provider`, and `compression.summary_base_url` are automatically migrated to `auxiliary.compression.*` on first load (config version 17). No manual action needed.
:::

`hygiene_hard_message_limit` is a gateway-only **pre-compression safety valve**. Runaway sessions with thousands of messages can hit model context limits before the normal percent-of-context threshold fires; when message count crosses this ceiling, Hermes forces compression regardless of token usage. Default `400` — raise it for platforms where very long sessions are normal, lower it to force more aggressive compression. Editing this value on a running gateway takes effect on the next message (see below).
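The interaction between the two triggers can be sketched as follows. This is an illustrative model only: `should_compress` is a hypothetical function, and whether the real check uses `>` or `>=` at either boundary is an assumption.

```python
def should_compress(message_count, tokens_used, context_length,
                    threshold=0.50, hard_message_limit=400):
    """Return the name of the trigger that fires, or None if neither does."""
    # Safety valve: message count alone forces compression,
    # regardless of how many tokens the session is using.
    if message_count > hard_message_limit:
        return "hygiene_hard_message_limit"
    # Normal path: compress once usage reaches the configured
    # fraction of the model's context window.
    if tokens_used >= threshold * context_length:
        return "threshold"
    return None
```

A session with 401 short messages and only 10,000 tokens would still be compressed under this model, which is the runaway-session case the setting exists for.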
:::tip Gateway hot-reload of compression and context length
As of recent releases, editing `model.context_length` or any `compression.*` key in `config.yaml` on a running gateway takes effect on the next message — no gateway restart, no `/reset`, no session rotation required. The cached-agent signature includes these keys, so the gateway transparently rebuilds the agent when it sees a change. API keys and tool/skill config still require the usual reload paths.
:::
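One way to picture the cached-agent signature is a hash over the config keys that should invalidate the cache. The sketch below is a simplification under stated assumptions: `agent_signature` is a hypothetical name, the nested-dict config shape is assumed, and the real signature presumably covers more than these two entries.

```python
import hashlib
import json


def agent_signature(config):
    """Hash only the config keys that should trigger an agent rebuild."""
    relevant = {
        "context_length": config.get("model", {}).get("context_length"),
        "compression": config.get("compression", {}),
    }
    # sort_keys makes the hash independent of key order in the file.
    payload = json.dumps(relevant, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

A gateway following this pattern would compare the signature of the freshly loaded config with the one stored beside the cached agent, and rebuild only when they differ. Edits to unrelated sections leave the signature unchanged.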
### Common setups

**Default (auto-detect) — no configuration needed:**
@@ -581,7 +631,7 @@ compression:
  enabled: true
  threshold: 0.50
```
Uses the first available provider (OpenRouter → Nous → Codex) with Gemini Flash.
Uses your main provider and main model. Override per-task (e.g. `auxiliary.compression.provider: openrouter` + `model: google/gemini-2.5-flash`) if you want compression on a cheaper model than your main chat model.

**Force a specific provider** (OAuth or API-key based):
```yaml
@@ -647,12 +697,15 @@ Warnings are injected into the last tool result's JSON (as a `_budget_warning` f
```yaml
agent:
  max_turns: 90  # Max iterations per conversation turn (default: 90)
  api_max_retries: 2  # Retries per provider before fallback engages (default: 2)
```

Budget pressure is enabled by default. The agent sees warnings naturally as part of tool results, encouraging it to consolidate its work and deliver a response before running out of iterations.

When the iteration budget is fully exhausted, the CLI shows a notification to the user: `⚠ Iteration budget reached (90/90) — response may be incomplete`. If the budget runs out during active work, the agent generates a summary of what was accomplished before stopping.

`agent.api_max_retries` controls how many times Hermes retries a provider API call on transient errors (rate limits, connection drops, 5xx) **before** fallback-provider switching engages. The default is `2` — three attempts total, matching the OpenAI SDK default. If you have [fallback providers](/docs/user-guide/features/fallback-providers) configured and want to fail over faster, drop this to `0` so the first transient error on your primary immediately hands off to the fallback instead of churning retries against the flaky endpoint.
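The attempt arithmetic is easier to see in code. The sketch below is a toy model, not Hermes's retry loop: `TransientError` and `call_with_fallback` are hypothetical names, providers are plain callables, and backoff between attempts is omitted.

```python
class TransientError(Exception):
    """Stand-in for rate limits, connection drops, and 5xx responses."""


def call_with_fallback(providers, api_max_retries=2):
    """Try each provider in order; return (result, attempts_made)."""
    attempts = 0
    last_error = None
    for call in providers:
        # One initial attempt plus `api_max_retries` retries per provider.
        for _ in range(api_max_retries + 1):
            attempts += 1
            try:
                return call(), attempts
            except TransientError as exc:
                last_error = exc
    raise RuntimeError("all providers failed") from last_error
```

With a primary that always fails and a healthy fallback, the default of `2` spends three attempts on the primary before the fourth call succeeds on the fallback. Setting `api_max_retries=0` hands off after a single failed attempt.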
### API Timeouts

Hermes has separate timeout layers for streaming, plus a stale detector for non-streaming calls. The stale detectors auto-adjust for local providers only when you leave them at their implicit defaults.
@@ -709,7 +762,29 @@ Options: `fill_first` (default), `round_robin`, `least_used`, `random`. See [Cre

## Auxiliary Models

Hermes uses lightweight "auxiliary" models for side tasks like image analysis, web page summarization, and browser screenshot analysis. By default, these use **Gemini Flash** via auto-detection — you don't need to configure anything.
Hermes uses "auxiliary" models for side tasks like image analysis, web page summarization, browser screenshot analysis, session-title generation, and context compression. By default (`auxiliary.*.provider: "auto"`), Hermes routes every auxiliary task to your **main chat model** — the same provider/model you picked in `hermes model`. You don't need to configure anything to get started, but be aware that on expensive reasoning models (Opus, MiniMax M2.7, etc.) auxiliary tasks add meaningful cost. If you want cheap-and-fast side tasks regardless of your main model, set `auxiliary.<task>.provider` and `auxiliary.<task>.model` explicitly (for example, Gemini Flash on OpenRouter for vision and web extraction).

:::note Why "auto" uses your main model
Earlier builds split aggregator users (OpenRouter, Nous Portal) onto a cheap provider-side default. That was surprising — users who paid for an aggregator subscription would see a different model handling their auxiliary traffic. `auto` now uses the main model for everyone, and per-task overrides in `config.yaml` still win (see [Full auxiliary config reference](#full-auxiliary-config-reference) below).
:::
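The routing rule above, where `auto` means the main model and explicit per-task settings win, can be sketched like this. `resolve_auxiliary` and the config shape (a `model` section holding `provider` and `model`) are assumptions made for illustration; the real config layout may differ.

```python
def resolve_auxiliary(task, config):
    """Return the (provider, model) pair an auxiliary task would use."""
    main = config["model"]
    override = config.get("auxiliary", {}).get(task, {})
    provider = override.get("provider", "auto")
    if provider == "auto":
        # "auto" (or no entry at all) routes to the main chat model.
        return main["provider"], main["model"]
    # An explicit provider wins; fall back to the main model name
    # only if the override did not name a model.
    return provider, override.get("model", main["model"])
```

Under this model, a config that overrides only `vision` would send vision traffic to the cheaper model while compression and every unlisted task keep using the main provider and model.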
### Configuring auxiliary models interactively

Instead of hand-editing YAML, run `hermes model` and pick **"Configure auxiliary models"** from the menu. You'll get an interactive per-task picker:

```
$ hermes model
→ Configure auxiliary models

[ ] vision            currently: auto / main model
[ ] web_extract       currently: auto / main model
[ ] session_search    currently: openrouter / google/gemini-2.5-flash
[ ] title_generation  currently: openrouter / google/gemini-3-flash-preview
[ ] compression       currently: auto / main model
[ ] approval          currently: auto / main model
```

Select a task, pick a provider (OAuth flows open a browser; API-key providers prompt), pick a model. The change persists to `auxiliary.<task>.*` in `config.yaml`. Same machinery as the main-model picker — no extra syntax to learn.

### Video Tutorial

@@ -1088,6 +1163,7 @@ display:
  streaming: false  # Stream tokens to terminal as they arrive (real-time output)
  show_cost: false  # Show estimated $ cost in the CLI status bar
  tool_preview_length: 0  # Max chars for tool call previews (0 = no limit, show full paths/commands)
  runtime_metadata_footer: false  # Gateway: append a runtime-context footer to final replies
```

| Mode | What you see |
@@ -1099,6 +1175,23 @@ display:

In the CLI, cycle through these modes with `/verbose`. To use `/verbose` in messaging platforms (Telegram, Discord, Slack, etc.), set `tool_progress_command: true` in the `display` section above. The command will then cycle the mode and save to config.

### Runtime-metadata footer (gateway only)

When `display.runtime_metadata_footer: true`, Hermes appends a small runtime-context footer to the **final** message of each gateway turn — same info the CLI shows in its status bar (model, session duration, tokens, cost). Off by default; opt in per-gateway if your team wants every reply to include the provenance.

```yaml
display:
  runtime_metadata_footer: true
```

Example footer appended to a Telegram/Discord/Slack reply:

```
— claude-opus-4.7 · 12 tool calls · 2m 14s · $0.042
```

Only the **final** message of a turn gets the footer; interim updates stay clean.
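For readers who want to parse or reproduce the footer, the sketch below assembles the fields from the example above, without the leading dash. `format_footer` is a hypothetical helper; the field order and number formatting are inferred from that single example and may not match the gateway's actual formatter.

```python
def format_footer(model, tool_calls, elapsed_seconds, cost_usd):
    """Join the runtime fields with the same separator the example footer uses."""
    minutes, seconds = divmod(int(elapsed_seconds), 60)
    fields = [
        model,
        f"{tool_calls} tool calls",
        f"{minutes}m {seconds}s",
        f"${cost_usd:.3f}",  # three decimals, as in the example
    ]
    return " · ".join(fields)
```

Calling `format_footer("claude-opus-4.7", 12, 134, 0.042)` returns `claude-opus-4.7 · 12 tool calls · 2m 14s · $0.042`.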
### Per-platform progress overrides

Different platforms have different verbosity needs. For example, Signal can't edit messages, so each progress update becomes a separate message — noisy. Use `tool_progress_overrides` to set per-platform modes: