skill: point hermes-agent at llms-full.txt bundle + capability inventory

Christian's 1020→90 line trim swapped the bloated reference for a 37-row mapping table. Replacing the table with a pointer at the existing website/static/llms-full.txt bundle (auto-regenerated by website/scripts/generate-llms-txt.py on every docs build) and a one-shot capability inventory.

- Mapping table dropped; agent greps llms-full.txt for targeted lookups.
- Capability inventory section retains 'I can do this' priors that bare navigation indexes lose (slash commands, spawning, durable systems, voice, browser, security defaults, platforms, plugins, MCP).
- Tighter intro, same key rules.
- Final: 82 lines / 5.4KB.

Doc fixes on top of the cherry-pick:

- configuring-models.md vision aux chain corrected: actual order is main → OpenRouter → Nous (only). Anthropic and custom endpoint are NOT in _VISION_AUTO_PROVIDER_ORDER.
- Text aux chain reworded: Anthropic is part of the api-key bucket (_resolve_api_key_provider), not a separate step.
refactor: lightweight hermes-agent skill as doc navigation index
2026-06-10 04:08:28 +08:00 · 2026-05-24 15:05:25 -07:00 · 2026-05-24 15:04:05 -07:00
4 changed files with 129 additions and 985 deletions
--- a/skills/autonomous-ai-agents/hermes-agent/SKILL.md
+++ b/skills/autonomous-ai-agents/hermes-agent/SKILL.md
--- a/website/docs/developer-guide/adding-tools.md
+++ b/website/docs/developer-guide/adding-tools.md
@@ -200,6 +200,28 @@ OPTIONAL_ENV_VARS = {
 }
 ```

+## After Adding: Restart Required
+
+New tools are discovered at process startup. Slash commands like `/reset` and `/new` only reset the conversation thread — they do **not** reload the tool palette. For a new tool to become available:
+
+```bash
+hermes gateway stop && hermes gateway start
+```
+
+**Do NOT use `hermes gateway run --replace`** — it can leave stale Python import state that causes the new process to inherit the old tool palette. Always use `stop` + `start` when registering new tools.
+
+To verify the tool is properly registered before restarting:
+
+```bash
+cd ~/.hermes/hermes-agent
+source venv/bin/activate
+python3 -c "
+from tools.registry import registry, discover_builtin_tools
+loaded = discover_builtin_tools()
+print('Registered:', 'your_tool_name' in registry.get_all_tool_names())
+"
+```
+
 ## Checklist

 - [ ] Tool file created with handler, schema, check function, and registration
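
A toy model of the restart requirement described in the adding-tools.md hunk above: a process that reads the registry once at startup does not see tools registered later, and clearing the conversation does not change that. `ToyRegistry` and `ToyProcess` are hypothetical names invented for this sketch; they are not Hermes classes, and the real discovery logic may differ.

```python
class ToyRegistry:
    """Hypothetical stand-in for a tool registry (not the Hermes implementation)."""

    def __init__(self):
        self._tools = {}

    def register(self, name, handler):
        self._tools[name] = handler

    def get_all_tool_names(self):
        return sorted(self._tools)


class ToyProcess:
    """Hypothetical long-running process that reads the registry once, at startup."""

    def __init__(self, registry):
        self.palette = tuple(registry.get_all_tool_names())  # snapshot taken at startup
        self.history = []

    def reset_conversation(self):
        # Models /reset or /new: clears the thread, leaves the palette alone.
        self.history.clear()


registry = ToyRegistry()
registry.register("search", lambda query: query)

running = ToyProcess(registry)
registry.register("your_tool_name", lambda: "ok")  # tool added after startup
running.reset_conversation()

restarted = ToyProcess(registry)  # models stop + start
seen_after_reset = "your_tool_name" in running.palette
seen_after_restart = "your_tool_name" in restarted.palette
```

Under these assumptions only the restarted process lists the new tool, which matches the `stop` + `start` guidance in the hunk.
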
--- a/website/docs/guides/cron-troubleshooting.md
+++ b/website/docs/guides/cron-troubleshooting.md
@@ -159,6 +159,28 @@ Likely a delivery target issue (see Delivery Failures above) or a silently suppr
 **Job hangs or times out**
 The scheduler uses an inactivity-based timeout (default 600s, configurable via `HERMES_CRON_TIMEOUT` env var, `0` for unlimited). The agent can run as long as it's actively calling tools — the timer only fires after sustained inactivity. Long-running jobs should use scripts to handle data collection and deliver only the result.

+**Import errors after gateway has been running for days**
+
+Symptom: `cannot import name 'cfg_get' from 'hermes_cli.config'` across multiple cron jobs, even though the function exists in the source file.
+
+Root cause: Module-level imports like `from hermes_cli.config import cfg_get` in tool files can hit stale namespace ordering in long-running gateway processes. The function exists but isn't available yet when the import runs.
+
+Immediate fix: Restart the gateway.
+```bash
+hermes gateway restart
+```
+
+Definitive fix: Move these imports inside the functions that use them (lazy import pattern):
+```python
+# Before — fails in long-running gateway:
+from hermes_cli.config import cfg_get
+
+# After — lazy import, definitive fix:
+def my_tool():
+    from hermes_cli.config import cfg_get
+```
+Python caches imported modules in `sys.modules`, so after the first call the function-level import costs only a dictionary lookup.
+
 ### Check 3: Lock contention

 The scheduler uses file-based locking to prevent overlapping ticks. If two gateway instances are running (or a CLI session conflicts with a gateway), jobs may be delayed or skipped.
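
A minimal, runnable sketch of the lazy import pattern from the cron-troubleshooting.md hunk above. The stdlib `json` module stands in for `hermes_cli.config`, and `parse_payload` is a hypothetical function name; the point is only that the import resolves when the function runs and that later calls reuse the cached module.

```python
import sys


def parse_payload(raw: str) -> dict:
    # Lazy import: resolved when the function is called, not at module load time.
    import json
    return json.loads(raw)


first = parse_payload('{"job": "nightly"}')
second = parse_payload('{"job": "hourly"}')

# After the first call the module sits in the import cache and is reused.
cached = "json" in sys.modules
```

The second call does not re-execute the module; it finds the entry already in `sys.modules`.
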
--- a/website/docs/user-guide/configuring-models.md
+++ b/website/docs/user-guide/configuring-models.md
@@ -47,7 +47,17 @@ Click **Show auxiliary** to reveal the eight task slots:

 ![Auxiliary panel expanded](/img/docs/dashboard-models/auxiliary-expanded.png)

-Every auxiliary task defaults to `auto` — meaning Hermes uses your main model for that job too. Override a specific task when you want a cheaper or faster model for a side-job.
+Every auxiliary task defaults to `auto`, which uses a resolution chain starting with your main provider:
+
+- **Text tasks** (compression, session_search, web_extract, approval, MCP, skills_hub): main provider → OpenRouter → Nous Portal → custom endpoint → direct API-key providers (Anthropic, DeepSeek, etc.).
+- **Vision tasks**: main provider (if vision-capable) → OpenRouter → Nous Portal.
+
+To force auxiliaries to use a specific provider, set it explicitly:
+```bash
+hermes config set auxiliary.compression.provider openrouter
+```
+
+Override a specific task when you want a cheaper or faster model for a side-job.

 ### Common override patterns

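
The `auto` resolution order described in the configuring-models.md hunk above can be read as a first-match walk over an ordered list. The sketch below is an illustration under that assumption only: `TEXT_CHAIN`, `VISION_CHAIN`, `resolve_auto`, and the provider labels are hypothetical and do not reproduce Hermes' actual resolver.

```python
from typing import Optional, Sequence, Set

# Orders taken from the doc text above; the string labels are illustrative.
TEXT_CHAIN = ("main", "openrouter", "nous", "custom", "api_key")
VISION_CHAIN = ("main", "openrouter", "nous")


def resolve_auto(chain: Sequence[str], available: Set[str]) -> Optional[str]:
    """Return the first provider in the chain that is available, else None."""
    for provider in chain:
        if provider in available:
            return provider
    return None


# Only a custom endpoint and a direct API key are configured.
configured = {"custom", "api_key"}
text_choice = resolve_auto(TEXT_CHAIN, configured)
vision_choice = resolve_auto(VISION_CHAIN, configured)
```

In this toy setup a text task lands on the custom endpoint, while a vision task finds nothing, because the vision chain stops at Nous Portal. A main provider that is not vision-capable would be modelled by leaving `"main"` out of `available`.
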
@@ -151,6 +161,34 @@ Three things to check:

 On OpenRouter (or any aggregator), bare model names resolve *within* the aggregator first. So `claude-sonnet-4` on OpenRouter becomes `anthropic/claude-sonnet-4.6`, staying on your OpenRouter auth. But if you typed `claude-sonnet-4` on a native Anthropic auth, it would stay as `claude-sonnet-4-6`. If you see an unexpected provider switch, check that your current provider is what you expect — the picker always shows the current main at the top of the dialog.

+### model.context_length applies globally, not per-model
+
+Setting `model.context_length` in config.yaml caps **every** model the agent uses — main and auxiliaries. For per-model context limits, use `custom_providers` instead:
+
+```yaml
+custom_providers:
+  - name: "My Server"
+    models:
+      my-model:
+        context_length: 32768  # per-model only
+```
+
+To remove a global cap: `hermes config set model.context_length ""`.
+
+### model.max_tokens above the model's output ceiling has no effect
+
+Every model has a hard output ceiling, and the figure varies widely between models (e.g., Claude Sonnet 4 = 64K tokens, GPT-4o = 16K). Setting `model.max_tokens` above that ceiling achieves nothing: the provider silently truncates. Check the provider's model documentation for the exact limit. The default (`null`) works fine for most tasks.
+
+### OpenRouter response cache
+
+OpenRouter response caching is **enabled by default** (`response_cache: true`). Identical requests return cached responses for free (zero billing). To disable or adjust the TTL:
+
+```bash
+hermes config set openrouter.response_cache false
+# Default TTL is 300 seconds (5 min)
+hermes config set openrouter.response_cache_ttl 600
+```
+
 ## Alternative methods

 ### CLI slash command