Compare commits

..

2 Commits

Author SHA1 Message Date
teknium1
e1369d1936 docs: add /insights to all help menus and documentation
- website/docs/reference/cli-commands.md: Added 'hermes insights' terminal
  command section with --days and --source flags, plus /insights slash command
  in the Conversation section
- website/docs/user-guide/cli.md: Added /insights to slash commands table
- website/docs/user-guide/messaging/index.md: Added /insights to gateway
  chat commands table
- website/docs/user-guide/sessions.md: Added cross-reference to hermes
  insights from the sessions stats section
2026-03-06 16:03:20 -08:00
teknium1
64133814a2 fix: restore all removed bundled skills + fix skills sync system
- Restored 21 skills removed in commits 757d012 and 740dd92:
  accelerate, audiocraft, code-review, faiss, flash-attention, gguf,
  grpo-rl-training, guidance, llava, nemo-curator, obliteratus, peft,
  pytorch-fsdp, pytorch-lightning, simpo, slime, stable-diffusion,
  tensorrt-llm, torchtitan, trl-fine-tuning, whisper

- Rewrote sync_skills() with proper update semantics:
  * New skills (not in manifest): copied to user dir
  * Existing skills (in manifest + on disk): updated via hash comparison
  * User-deleted skills (in manifest, not on disk): respected, not re-added
  * Stale manifest entries (removed from bundled): cleaned from manifest

- Added sync_skills() to CLI startup (cmd_chat) and gateway startup
  (start_gateway) — previously only ran during 'hermes update'

- Updated cmd_update output to show new/updated/cleaned counts

- Rewrote tests: 20 tests covering manifest CRUD, dir hashing, fresh
  install, user deletion respect, update detection, stale cleanup, and
  name collision handling

75 bundled skills total. 2002 tests pass.
2026-03-06 15:57:12 -08:00
509 changed files with 5763 additions and 54711 deletions

View File

@@ -13,38 +13,6 @@ OPENROUTER_API_KEY=
# Examples: anthropic/claude-opus-4.6, openai/gpt-4o, google/gemini-3-flash-preview, zhipuai/glm-4-plus
LLM_MODEL=anthropic/claude-opus-4.6
# =============================================================================
# LLM PROVIDER (z.ai / GLM)
# =============================================================================
# z.ai provides access to ZhipuAI GLM models (GLM-4-Plus, etc.)
# Get your key at: https://z.ai or https://open.bigmodel.cn
GLM_API_KEY=
# GLM_BASE_URL=https://api.z.ai/api/paas/v4 # Override default base URL
# =============================================================================
# LLM PROVIDER (Kimi / Moonshot)
# =============================================================================
# Kimi Code provides access to Moonshot AI coding models (kimi-k2.5, etc.)
# Get your key at: https://platform.kimi.ai (Kimi Code console)
# Keys prefixed sk-kimi- use the Kimi Code API (api.kimi.com) by default.
# Legacy keys from platform.moonshot.ai need KIMI_BASE_URL override below.
KIMI_API_KEY=
# KIMI_BASE_URL=https://api.kimi.com/coding/v1 # Default for sk-kimi- keys
# KIMI_BASE_URL=https://api.moonshot.ai/v1 # For legacy Moonshot keys
# KIMI_BASE_URL=https://api.moonshot.cn/v1 # For Moonshot China keys
# =============================================================================
# LLM PROVIDER (MiniMax)
# =============================================================================
# MiniMax provides access to MiniMax models (global endpoint)
# Get your key at: https://www.minimax.io
MINIMAX_API_KEY=
# MINIMAX_BASE_URL=https://api.minimax.io/v1 # Override default base URL
# MiniMax China endpoint (for users in mainland China)
MINIMAX_CN_API_KEY=
# MINIMAX_CN_BASE_URL=https://api.minimaxi.com/v1 # Override default base URL
# =============================================================================
# TOOL API KEYS
# =============================================================================
@@ -53,6 +21,10 @@ MINIMAX_CN_API_KEY=
# Get at: https://firecrawl.dev/
FIRECRAWL_API_KEY=
# Nous Research API Key - Vision analysis and multi-model reasoning
# Get at: https://inference-api.nousresearch.com/
NOUS_API_KEY=
# FAL.ai API Key - Image generation
# Get at: https://fal.ai/
FAL_KEY=
@@ -201,18 +173,6 @@ VOICE_TOOLS_OPENAI_KEY=
# WHATSAPP_ENABLED=false
# WHATSAPP_ALLOWED_USERS=15551234567
# Email (IMAP/SMTP — send and receive emails as Hermes)
# For Gmail: enable 2FA → create App Password at https://myaccount.google.com/apppasswords
# EMAIL_ADDRESS=hermes@gmail.com
# EMAIL_PASSWORD=xxxx xxxx xxxx xxxx
# EMAIL_IMAP_HOST=imap.gmail.com
# EMAIL_IMAP_PORT=993
# EMAIL_SMTP_HOST=smtp.gmail.com
# EMAIL_SMTP_PORT=587
# EMAIL_POLL_INTERVAL=15
# EMAIL_ALLOWED_USERS=your@email.com
# EMAIL_HOME_ADDRESS=your@email.com
# Gateway-wide: allow ALL users without an allowlist (default: false = deny)
# Only set to true if you intentionally want open access.
# GATEWAY_ALLOW_ALL_USERS=false

View File

@@ -34,7 +34,7 @@ jobs:
- name: Run tests
run: |
source .venv/bin/activate
python -m pytest tests/ -q --ignore=tests/integration --tb=short -n auto
python -m pytest tests/ -q --ignore=tests/integration --tb=short
env:
# Ensure tests don't accidentally call real APIs
OPENROUTER_API_KEY: ""

4
.gitignore vendored
View File

@@ -47,6 +47,4 @@ cli-config.yaml
# Skills Hub state (lives in ~/.hermes/skills/.hub/ at runtime, but just in case)
skills/.hub/
ignored/
.worktrees/
environments/benchmarks/evals/
ignored/

View File

@@ -1,291 +0,0 @@
# OpenAI-Compatible API Server for Hermes Agent
## Motivation
Every major chat frontend (Open WebUI 126k★, LobeChat 73k★, LibreChat 34k★,
AnythingLLM 56k★, NextChat 87k★, ChatBox 39k★, Jan 26k★, HF Chat-UI 8k★,
big-AGI 7k★) connects to backends via the OpenAI-compatible REST API with
SSE streaming. By exposing this endpoint, hermes-agent becomes instantly
usable as a backend for all of them — no custom adapters needed.
## What It Enables
```
┌──────────────────┐
│ Open WebUI │──┐
│ LobeChat │ │ POST /v1/chat/completions
│ LibreChat │ ├──► Authorization: Bearer <key> ┌─────────────────┐
│ AnythingLLM │ │ {"messages": [...]} │ hermes-agent │
│ NextChat │ │ │ gateway │
│ Any OAI client │──┘ ◄── SSE streaming response │ (API server) │
└──────────────────┘ └─────────────────┘
```
A user would:
1. Set `API_SERVER_ENABLED=true` in `~/.hermes/.env`
2. Run `hermes gateway` (API server starts alongside Telegram/Discord/etc.)
3. Point Open WebUI (or any frontend) at `http://localhost:8642/v1`
4. Chat with hermes-agent through any OpenAI-compatible UI
## Endpoints
| Method | Path | Purpose |
|--------|------|---------|
| POST | `/v1/chat/completions` | Chat with the agent (streaming + non-streaming) |
| GET | `/v1/models` | List available "models" (returns hermes-agent as a model) |
| GET | `/health` | Health check |
## Architecture
### Option A: Gateway Platform Adapter (recommended)
Create `gateway/platforms/api_server.py` as a new platform adapter that
extends `BasePlatformAdapter`. This is the cleanest approach because:
- Reuses all gateway infrastructure (session management, auth, context building)
- Runs in the same async loop as other adapters
- Gets message handling, interrupt support, and session persistence for free
- Follows the established pattern (like Telegram, Discord, etc.)
- Uses `aiohttp.web` (already a dependency) for the HTTP server
The adapter would start an `aiohttp.web.Application` server in `connect()`
and route incoming HTTP requests through the standard `handle_message()` pipeline.
### Option B: Standalone Component
A separate HTTP server class in `gateway/api_server.py` that creates its own
AIAgent instances directly. Simpler but duplicates session/auth logic.
**Recommendation: Option A** — fits the existing architecture, less code to
maintain, gets all gateway features for free.
## Request/Response Format
### Chat Completions (non-streaming)
```
POST /v1/chat/completions
Authorization: Bearer hermes-api-key-here
Content-Type: application/json
{
"model": "hermes-agent",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What files are in the current directory?"}
],
"stream": false,
"temperature": 0.7
}
```
Response:
```json
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"created": 1710000000,
"model": "hermes-agent",
"choices": [{
"index": 0,
"message": {
"role": "assistant",
"content": "Here are the files in the current directory:\n..."
},
"finish_reason": "stop"
}],
"usage": {
"prompt_tokens": 50,
"completion_tokens": 200,
"total_tokens": 250
}
}
```
### Chat Completions (streaming)
Same request with `"stream": true`. Response is SSE:
```
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Here "},"finish_reason":null}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"are "},"finish_reason":null}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
data: [DONE]
```
### Models List
```
GET /v1/models
Authorization: Bearer hermes-api-key-here
```
Response:
```json
{
"object": "list",
"data": [{
"id": "hermes-agent",
"object": "model",
"created": 1710000000,
"owned_by": "hermes-agent"
}]
}
```
## Key Design Decisions
### 1. Session Management
The OpenAI API is stateless — each request includes the full conversation.
But hermes-agent sessions have persistent state (memory, skills, tool context).
**Approach: Hybrid**
- Default: Stateless. Each request is independent. The `messages` array IS
the conversation. No session persistence between requests.
- Opt-in persistent sessions via `X-Session-ID` header. When provided, the
server maintains session state across requests (conversation history,
memory context, tool state). This enables richer agent behavior.
- The session ID also enables interrupt support — a subsequent request with
the same session ID while one is running triggers an interrupt.
### 2. Streaming
The agent's `run_conversation()` is synchronous and returns the full response.
For real SSE streaming, we need to emit chunks as they're generated.
**Phase 1 (MVP):** Run agent in a thread, return the complete response as
a single SSE chunk + `[DONE]`. This works with all frontends — they just see
a fast single-chunk response. Not true streaming but functional.
**Phase 2:** Add a response callback to AIAgent that emits text chunks as the
LLM generates them. The API server captures these via a queue and streams them
as SSE events. This gives real token-by-token streaming.
**Phase 3:** Stream tool execution progress too — emit tool call/result events
as the agent works, giving frontends visibility into what the agent is doing.
### 3. Tool Transparency
Two modes:
- **Opaque (default):** Frontends see only the final response. Tool calls
happen server-side and are invisible. Best for general-purpose UIs.
- **Transparent (opt-in via header):** Tool calls are emitted as OpenAI-format
tool_call/tool_result messages in the stream. Useful for agent-aware frontends.
### 4. Authentication
- Bearer token via `Authorization: Bearer <key>` header
- Token configured via `API_SERVER_KEY` env var
- Optional: allow unauthenticated local-only access (127.0.0.1 bind)
- Follows the same pattern as other platform adapters
### 5. Model Mapping
Frontends send `"model": "hermes-agent"` (or whatever). The actual LLM model
used is configured server-side in config.yaml. The API server maps any
requested model name to the configured hermes-agent model.
Optionally, allow model passthrough: if the frontend sends
`"model": "anthropic/claude-sonnet-4"`, the agent uses that model. Controlled
by a config flag.
## Configuration
```yaml
# In config.yaml
api_server:
enabled: true
port: 8642
host: "127.0.0.1" # localhost only by default
key: "your-secret-key" # or via API_SERVER_KEY env var
allow_model_override: false # let clients choose the model
max_concurrent: 5 # max simultaneous requests
```
Environment variables:
```bash
API_SERVER_ENABLED=true
API_SERVER_PORT=8642
API_SERVER_HOST=127.0.0.1
API_SERVER_KEY=your-secret-key
```
## Implementation Plan
### Phase 1: MVP (non-streaming) — PR
1. `gateway/platforms/api_server.py` — new adapter
- aiohttp.web server with endpoints:
- `POST /v1/chat/completions` — Chat Completions API (universal compat)
- `POST /v1/responses` — Responses API (server-side state, tool preservation)
- `GET /v1/models` — list available models
- `GET /health` — health check
- Bearer token auth middleware
- Non-streaming responses (run agent, return full result)
- Chat Completions: stateless, messages array is the conversation
- Responses API: server-side conversation storage via previous_response_id
- Store full internal conversation (including tool calls) keyed by response ID
- On subsequent requests, reconstruct full context from stored chain
- Frontend system prompt layered on top of hermes-agent's core prompt
2. `gateway/config.py` — add `Platform.API_SERVER` enum + config
3. `gateway/run.py` — register adapter in `_create_adapter()`
4. Tests in `tests/gateway/test_api_server.py`
### Phase 2: SSE Streaming
1. Add response streaming to both endpoints
- Chat Completions: `choices[0].delta.content` SSE format
- Responses API: semantic events (response.output_text.delta, etc.)
- Run agent in thread, collect output via callback queue
- Handle client disconnect (cancel agent)
2. Add `stream_callback` parameter to `AIAgent.run_conversation()`
### Phase 3: Enhanced Features
1. Tool call transparency mode (opt-in)
2. Model passthrough/override
3. Concurrent request limiting
4. Usage tracking / rate limiting
5. CORS headers for browser-based frontends
6. GET /v1/responses/{id} — retrieve stored response
7. DELETE /v1/responses/{id} — delete stored response
## Files Changed
| File | Change |
|------|--------|
| `gateway/platforms/api_server.py` | NEW — main adapter (~300 lines) |
| `gateway/config.py` | Add Platform.API_SERVER + config (~20 lines) |
| `gateway/run.py` | Register adapter in _create_adapter() (~10 lines) |
| `tests/gateway/test_api_server.py` | NEW — tests (~200 lines) |
| `cli-config.yaml.example` | Add api_server section |
| `README.md` | Mention API server in platform list |
## Compatibility Matrix
Once implemented, hermes-agent works as a drop-in backend for:
| Frontend | Stars | How to Connect |
|----------|-------|---------------|
| Open WebUI | 126k | Settings → Connections → Add OpenAI API, URL: `http://localhost:8642/v1` |
| NextChat | 87k | BASE_URL env var |
| LobeChat | 73k | Custom provider endpoint |
| AnythingLLM | 56k | LLM Provider → Generic OpenAI |
| Oobabooga | 42k | Already a backend, not a frontend |
| ChatBox | 39k | API Host setting |
| LibreChat | 34k | librechat.yaml custom endpoint |
| Chatbot UI | 29k | Custom API endpoint |
| Jan | 26k | Remote model config |
| AionUI | 18k | Custom API endpoint |
| HF Chat-UI | 8k | OPENAI_BASE_URL env var |
| big-AGI | 7k | Custom endpoint |

View File

@@ -1,705 +0,0 @@
# Streaming LLM Response Support for Hermes Agent
## Overview
Add token-by-token streaming of LLM responses across all platforms. When enabled,
users see the response typing out live instead of waiting for the full generation.
Streaming is opt-in via config, defaults to off, and all existing non-streaming
code paths remain intact as the default.
## Design Principles
1. **Feature-flagged**: `streaming.enabled: true` in config.yaml. Off by default.
When off, all existing code paths are unchanged — zero risk to current behavior.
2. **Callback-based**: A simple `stream_callback(text_delta: str)` function injected
into AIAgent. The agent doesn't know or care what the consumer does with tokens.
3. **Graceful degradation**: If the provider doesn't support streaming, or streaming
fails for any reason, silently fall back to the non-streaming path.
4. **Platform-agnostic core**: The streaming mechanism in AIAgent works the same
regardless of whether the consumer is CLI, Telegram, Discord, or the API server.
---
## Architecture
```
stream_callback(delta)
┌─────────────┐ ┌─────────────▼──────────────┐
│ LLM API │ │ queue.Queue() │
│ (stream) │───►│ thread-safe bridge between │
│ │ │ agent thread & consumer │
└─────────────┘ └─────────────┬──────────────┘
┌──────────────┼──────────────┐
│ │ │
┌─────▼─────┐ ┌─────▼─────┐ ┌─────▼─────┐
│ CLI │ │ Gateway │ │ API Server│
│ print to │ │ edit msg │ │ SSE event │
│ terminal │ │ on Tg/Dc │ │ to client │
└───────────┘ └───────────┘ └───────────┘
```
The agent runs in a thread. The callback puts tokens into a thread-safe queue.
Each consumer reads the queue in its own context (async task, main thread, etc.).
---
## Configuration
### config.yaml
```yaml
streaming:
enabled: false # Master switch. Default off.
# Per-platform overrides (optional):
# cli: true # Override for CLI only
# telegram: true # Override for Telegram only
# discord: false # Keep Discord non-streaming
# api_server: true # Override for API server
```
### Environment variables
```
HERMES_STREAMING_ENABLED=true # Master switch via env
```
### How the flag is read
- **CLI**: `load_cli_config()` reads `streaming.enabled`, sets env var. AIAgent
checks at init time.
- **Gateway**: `_run_agent()` reads config, decides whether to pass
`stream_callback` to the AIAgent constructor.
- **API server**: For Chat Completions `stream=true` requests, always uses streaming
regardless of config (the client is explicitly requesting it). For non-stream
requests, uses config.
### Precedence
1. API server: client's `stream` field overrides everything
2. Per-platform config override (e.g., `streaming.telegram: true`)
3. Master `streaming.enabled` flag
4. Default: off
---
## Implementation Plan
### Phase 1: Core streaming infrastructure in AIAgent
**File: run_agent.py**
#### 1a. Add stream_callback parameter to __init__ (~5 lines)
```python
def __init__(self, ..., stream_callback: callable = None, ...):
self.stream_callback = stream_callback
```
No other init changes. The callback is optional — when None, everything
works exactly as before.
#### 1b. Add _run_streaming_chat_completion() method (~65 lines)
New method for Chat Completions API streaming:
```python
def _run_streaming_chat_completion(self, api_kwargs: dict):
"""Stream a chat completion, emitting text tokens via stream_callback.
Returns a fake response object compatible with the non-streaming code path.
Falls back to non-streaming on any error.
"""
stream_kwargs = dict(api_kwargs)
stream_kwargs["stream"] = True
stream_kwargs["stream_options"] = {"include_usage": True}
accumulated_content = []
accumulated_tool_calls = {} # index -> {id, name, arguments}
final_usage = None
try:
stream = self.client.chat.completions.create(**stream_kwargs)
for chunk in stream:
if not chunk.choices:
# Usage-only chunk (final)
if chunk.usage:
final_usage = chunk.usage
continue
delta = chunk.choices[0].delta
# Text content — emit via callback
if delta.content:
accumulated_content.append(delta.content)
if self.stream_callback:
try:
self.stream_callback(delta.content)
except Exception:
pass
# Tool call deltas — accumulate silently
if delta.tool_calls:
for tc_delta in delta.tool_calls:
idx = tc_delta.index
if idx not in accumulated_tool_calls:
accumulated_tool_calls[idx] = {
"id": tc_delta.id or "",
"name": "", "arguments": ""
}
if tc_delta.function:
if tc_delta.function.name:
accumulated_tool_calls[idx]["name"] = tc_delta.function.name
if tc_delta.function.arguments:
accumulated_tool_calls[idx]["arguments"] += tc_delta.function.arguments
# Build fake response compatible with existing code
tool_calls = []
for idx in sorted(accumulated_tool_calls):
tc = accumulated_tool_calls[idx]
if tc["name"]:
tool_calls.append(SimpleNamespace(
id=tc["id"], type="function",
function=SimpleNamespace(name=tc["name"], arguments=tc["arguments"]),
))
return SimpleNamespace(
choices=[SimpleNamespace(
message=SimpleNamespace(
content="".join(accumulated_content) or "",
tool_calls=tool_calls or None,
role="assistant",
),
finish_reason="tool_calls" if tool_calls else "stop",
)],
usage=final_usage,
model=self.model,
)
except Exception as e:
logger.debug("Streaming failed, falling back to non-streaming: %s", e)
return self.client.chat.completions.create(**api_kwargs)
```
#### 1c. Modify _run_codex_stream() for Responses API (~10 lines)
The method already iterates the stream. Add callback emission:
```python
def _run_codex_stream(self, api_kwargs: dict):
with self.client.responses.stream(**api_kwargs) as stream:
for event in stream:
# Emit text deltas if streaming callback is set
if self.stream_callback and hasattr(event, 'type'):
if event.type == 'response.output_text.delta':
try:
self.stream_callback(event.delta)
except Exception:
pass
return stream.get_final_response()
```
#### 1d. Modify _interruptible_api_call() (~5 lines)
Add the streaming branch:
```python
def _call():
try:
if self.api_mode == "codex_responses":
result["response"] = self._run_codex_stream(api_kwargs)
elif self.stream_callback is not None:
result["response"] = self._run_streaming_chat_completion(api_kwargs)
else:
result["response"] = self.client.chat.completions.create(**api_kwargs)
except Exception as e:
result["error"] = e
```
#### 1e. Signal end-of-stream to consumers (~5 lines)
After the API call returns, signal the callback that streaming is done
so consumers can finalize (remove cursor, close SSE, etc.):
```python
# In run_conversation(), after _interruptible_api_call returns:
if self.stream_callback:
try:
self.stream_callback(None) # None = end of stream signal
except Exception:
pass
```
Consumers check: `if delta is None: finalize()`
**Tests for Phase 1:** (~150 lines)
- Test _run_streaming_chat_completion with mocked stream
- Test fallback to non-streaming on error
- Test tool_call accumulation during streaming
- Test stream_callback receives correct deltas
- Test None signal at end of stream
- Test streaming disabled when callback is None
---
### Phase 2: Gateway consumers (Telegram, Discord, etc.)
**File: gateway/run.py**
#### 2a. Read streaming config (~15 lines)
In `_run_agent()`, before creating the AIAgent:
```python
# Read streaming config
_streaming_enabled = False
try:
# Check per-platform override first
platform_key = source.platform.value if source.platform else ""
_stream_cfg = {} # loaded from config.yaml streaming section
if _stream_cfg.get(platform_key) is not None:
_streaming_enabled = bool(_stream_cfg[platform_key])
else:
_streaming_enabled = bool(_stream_cfg.get("enabled", False))
except Exception:
pass
# Env var override
if os.getenv("HERMES_STREAMING_ENABLED", "").lower() in ("true", "1", "yes"):
_streaming_enabled = True
```
#### 2b. Set up queue + callback (~15 lines)
```python
_stream_q = None
_stream_done = None
_stream_msg_id = [None] # mutable ref for the async task
if _streaming_enabled:
import queue as _q
_stream_q = _q.Queue()
_stream_done = threading.Event()
def _on_token(delta):
if delta is None:
_stream_done.set()
else:
_stream_q.put(delta)
```
Pass `stream_callback=_on_token` to the AIAgent constructor.
#### 2c. Telegram/Discord stream preview task (~50 lines)
```python
async def stream_preview():
"""Progressively edit a message with streaming tokens."""
if not _stream_q:
return
adapter = self.adapters.get(source.platform)
if not adapter:
return
accumulated = []
token_count = 0
last_edit = 0.0
MIN_TOKENS = 20 # Don't show until enough context
EDIT_INTERVAL = 1.5 # Respect Telegram rate limits
try:
while not _stream_done.is_set():
try:
chunk = _stream_q.get(timeout=0.1)
accumulated.append(chunk)
token_count += 1
except queue.Empty:
continue
now = time.monotonic()
if token_count >= MIN_TOKENS and (now - last_edit) >= EDIT_INTERVAL:
preview = "".join(accumulated) + ""
if _stream_msg_id[0] is None:
r = await adapter.send(
chat_id=source.chat_id,
content=preview,
metadata=_thread_metadata,
)
if r.success and r.message_id:
_stream_msg_id[0] = r.message_id
else:
await adapter.edit_message(
chat_id=source.chat_id,
message_id=_stream_msg_id[0],
content=preview,
)
last_edit = now
# Drain remaining tokens
while not _stream_q.empty():
accumulated.append(_stream_q.get_nowait())
# Final edit — remove cursor, show complete text
if _stream_msg_id[0] and accumulated:
await adapter.edit_message(
chat_id=source.chat_id,
message_id=_stream_msg_id[0],
content="".join(accumulated),
)
except asyncio.CancelledError:
# Clean up on cancel
if _stream_msg_id[0] and accumulated:
try:
await adapter.edit_message(
chat_id=source.chat_id,
message_id=_stream_msg_id[0],
content="".join(accumulated),
)
except Exception:
pass
except Exception as e:
logger.debug("stream_preview error: %s", e)
```
#### 2d. Skip final send if already streamed (~10 lines)
In `_process_message_background()` (base.py), after getting the response,
if streaming was active and `_stream_msg_id[0]` is set, the final response
was already delivered via progressive edits. Skip the normal `self.send()`
call to avoid duplicating the message.
This is the most delicate integration point — we need to communicate from
the gateway's `_run_agent` back to the base adapter's response sender that
the response was already delivered. Options:
- **Option A**: Return a special marker in the result dict:
`result["_streamed_msg_id"] = _stream_msg_id[0]`
The base adapter checks this and skips `send()`.
- **Option B**: Edit the already-sent message with the final response
(which may differ slightly from accumulated tokens due to think-block
stripping, etc.) and don't send a new one.
- **Option C**: The stream preview task handles the FULL final response
(including any post-processing), and the handler returns None to skip
the normal send path.
Recommended: **Option A** — cleanest separation. The result dict already
carries metadata; adding one more field is low-risk.
**Platform-specific considerations:**
| Platform | Edit support | Rate limits | Streaming approach |
|----------|-------------|-------------|-------------------|
| Telegram | ✅ edit_message_text | ~20 edits/min | Edit every 1.5s |
| Discord | ✅ message.edit | 5 edits/5s per message | Edit every 1.2s |
| Slack | ✅ chat.update | Tier 3 (~50/min) | Edit every 1.5s |
| WhatsApp | ❌ no edit support | N/A | Skip streaming, use normal path |
| HomeAssistant | ❌ no edit | N/A | Skip streaming |
| API Server | ✅ SSE native | No limit | Real SSE events |
WhatsApp and HomeAssistant fall back to non-streaming automatically because
they don't support message editing.
**Tests for Phase 2:** (~100 lines)
- Test stream_preview sends/edits correctly
- Test skip-final-send when streaming delivered
- Test WhatsApp/HA graceful fallback
- Test streaming disabled per-platform config
- Test thread_id metadata forwarded in stream messages
---
### Phase 3: CLI streaming
**File: cli.py**
#### 3a. Set up callback in the CLI chat loop (~20 lines)
In `_chat_once()` or wherever the agent is invoked:
```python
if streaming_enabled:
_stream_q = queue.Queue()
_stream_done = threading.Event()
def _cli_stream_callback(delta):
if delta is None:
_stream_done.set()
else:
_stream_q.put(delta)
agent.stream_callback = _cli_stream_callback
```
#### 3b. Token display thread/task (~30 lines)
Start a thread that reads the queue and prints tokens:
```python
def _stream_display():
"""Print tokens to terminal as they arrive."""
first_token = True
while not _stream_done.is_set():
try:
delta = _stream_q.get(timeout=0.1)
except queue.Empty:
continue
if first_token:
# Print response box top border
_cprint(f"\n{top}")
first_token = False
sys.stdout.write(delta)
sys.stdout.flush()
# Drain remaining
while not _stream_q.empty():
sys.stdout.write(_stream_q.get_nowait())
sys.stdout.flush()
# Print bottom border
_cprint(f"\n\n{bot}")
```
**Integration challenge: prompt_toolkit**
The CLI uses prompt_toolkit which controls the terminal. Writing directly
to stdout while prompt_toolkit is active can cause display corruption.
The existing KawaiiSpinner already solves this by using prompt_toolkit's
`patch_stdout` context. The streaming display would need to do the same.
Alternative: use `_cprint()` for each token chunk (routes through
prompt_toolkit's renderer). But this might be slow for individual tokens.
Recommended approach: accumulate tokens in small batches (e.g., every 50ms)
and `_cprint()` the batch. This balances display responsiveness with
prompt_toolkit compatibility.
**Tests for Phase 3:** (~50 lines)
- Test CLI streaming callback setup
- Test response box borders with streaming
- Test fallback when streaming disabled
---
### Phase 4: API Server real streaming
**File: gateway/platforms/api_server.py**
Replace the pseudo-streaming `_write_sse_chat_completion()` with real
token-by-token SSE when the agent supports it.
#### 4a. Wire streaming callback for stream=true requests (~20 lines)
```python
if stream:
_stream_q = queue.Queue()
def _api_stream_callback(delta):
_stream_q.put(delta) # None = done
# Pass callback to _run_agent
result, usage = await self._run_agent(
..., stream_callback=_api_stream_callback,
)
```
#### 4b. Real SSE writer (~40 lines)
```python
async def _write_real_sse(self, request, completion_id, model, stream_q):
response = web.StreamResponse(
headers={"Content-Type": "text/event-stream", "Cache-Control": "no-cache"},
)
await response.prepare(request)
# Role chunk
await response.write(...)
# Stream content chunks as they arrive
while True:
try:
delta = await asyncio.get_event_loop().run_in_executor(
None, lambda: stream_q.get(timeout=0.1)
)
except queue.Empty:
continue
if delta is None: # End of stream
break
chunk = {"id": completion_id, "object": "chat.completion.chunk", ...
"choices": [{"delta": {"content": delta}, ...}]}
await response.write(f"data: {json.dumps(chunk)}\n\n".encode())
# Finish + [DONE]
await response.write(...)
await response.write(b"data: [DONE]\n\n")
return response
```
**Challenge: concurrent execution**
The agent runs in a thread executor. SSE writing happens in the async event
loop. The queue bridges them. But `_run_agent()` currently awaits the full
result before returning. For real streaming, we need to start the agent in
the background and stream tokens while it runs:
```python
# Start agent in background
agent_task = asyncio.create_task(self._run_agent_async(...))
# Stream tokens while agent runs
await self._write_real_sse(request, ..., stream_q)
# Agent is done by now (stream_q received None)
result, usage = await agent_task
```
This requires splitting `_run_agent` into an async version that doesn't
block waiting for the result, or running it in a separate task.
**Responses API SSE format:**
For `/v1/responses` with `stream=true`, the SSE events are different:
```
event: response.output_text.delta
data: {"type":"response.output_text.delta","delta":"Hello"}
event: response.completed
data: {"type":"response.completed","response":{...}}
```
This needs a separate SSE writer that emits Responses API format events.
**Tests for Phase 4:** (~80 lines)
- Test real SSE streaming with mocked agent
- Test SSE event format (Chat Completions vs Responses)
- Test client disconnect during streaming
- Test fallback to pseudo-streaming when callback not available
---
## Integration Issues & Edge Cases
### 1. Tool calls during streaming
When the model returns tool calls instead of text, no text tokens are emitted.
The stream_callback is simply never called with text. After tools execute, the
next API call may produce the final text response — streaming picks up again.
The stream preview task needs to handle this: if no tokens arrive during a
tool-call round, don't send/edit any message. The tool progress messages
continue working as before.
### 2. Duplicate messages
The biggest risk: the agent sends the final response normally (via the
existing send path) AND the stream preview already showed it. The user
sees the response twice.
Prevention: when streaming is active and tokens were delivered, the final
response send must be suppressed. The `result["_streamed_msg_id"]` marker
tells the base adapter to skip its normal send.
### 3. Response post-processing
The final response may differ from the accumulated streamed tokens:
- Think block stripping (`<think>...</think>` removed)
- Trailing whitespace cleanup
- Tool result media tag appending
The stream preview shows raw tokens. The final edit should use the
post-processed version. This means the final edit (removing the cursor)
should use the post-processed `final_response`, not just the accumulated
stream text.
### 4. Context compression during streaming
If the agent triggers context compression mid-conversation, the streaming
tokens from BEFORE compression are from a different context than those
after. This isn't a problem in practice — compression happens between
API calls, not during streaming.
### 5. Interrupt during streaming
User sends a new message while streaming → interrupt. The stream is killed
(HTTP connection closed), accumulated tokens are shown as-is (no cursor),
and the interrupt message is processed normally. This is already handled by
`_interruptible_api_call` closing the client.
### 6. Multi-model / fallback
If the primary model fails and the agent falls back to a different model,
streaming state resets. The fallback call may or may not support streaming.
The graceful fallback in `_run_streaming_chat_completion` handles this.
### 7. Rate limiting on edits
Telegram: ~20 edits/minute (~1 every 3 seconds to be safe)
Discord: 5 edits per 5 seconds per message
Slack: ~50 API calls/minute
The 1.5s edit interval is conservative enough for all platforms. If we get
429 rate limit errors on edits, just skip that edit cycle and try next time.
---
## Files Changed Summary
| File | Phase | Changes |
|------|-------|---------|
| `run_agent.py` | 1 | +stream_callback param, +_run_streaming_chat_completion(), modify _run_codex_stream(), modify _interruptible_api_call() |
| `gateway/run.py` | 2 | +streaming config reader, +queue/callback setup, +stream_preview task, +skip-final-send logic |
| `gateway/platforms/base.py` | 2 | +check for _streamed_msg_id in response handler |
| `cli.py` | 3 | +streaming setup, +token display, +response box integration |
| `gateway/platforms/api_server.py` | 4 | +real SSE writer, +streaming callback wiring |
| `hermes_cli/config.py` | 1 | +streaming config defaults |
| `cli-config.yaml.example` | 1 | +streaming section |
| `tests/test_streaming.py` | 1-4 | NEW — ~380 lines of tests |
**Total new code**: ~500 lines across all phases
**Total test code**: ~380 lines
---
## Rollout Plan
1. **Phase 1** (core): Merge to main. Streaming disabled by default.
Zero impact on existing behavior. Can be tested with env var.
2. **Phase 2** (gateway): Merge to main. Test on Telegram manually.
Enable per-platform: `streaming.telegram: true` in config.
3. **Phase 3** (CLI): Merge to main. Test in terminal.
Enable: `streaming.cli: true` or `streaming.enabled: true`.
4. **Phase 4** (API server): Merge to main. Test with Open WebUI.
Auto-enabled when client sends `stream: true`.
Each phase is independently mergeable and testable. Streaming stays
off by default throughout. Once all phases are stable, consider
changing the default to enabled.
---
## Config Reference (final state)
```yaml
# config.yaml
streaming:
enabled: false # Master switch (default: off)
cli: true # Per-platform override
telegram: true
discord: true
slack: true
api_server: true # API server always streams when client requests it
edit_interval: 1.5 # Seconds between message edits (default: 1.5)
min_tokens: 20 # Tokens before first display (default: 20)
```
```bash
# Environment variable override
HERMES_STREAMING_ENABLED=true
```

830
AGENTS.md
View File

@@ -1,67 +1,78 @@
# Hermes Agent - Development Guide
Instructions for AI coding assistants and developers working on the hermes-agent codebase.
Instructions for AI coding assistants (GitHub Copilot, Cursor, etc.) and human developers.
Hermes Agent is an AI agent harness with tool-calling capabilities, interactive CLI, messaging integrations, and scheduled tasks.
## Development Environment
**IMPORTANT**: Always use the virtual environment if it exists:
```bash
source .venv/bin/activate # ALWAYS activate before running Python
source venv/bin/activate # Before running any Python commands
```
## Project Structure
```
hermes-agent/
├── run_agent.py # AIAgent class — core conversation loop
├── model_tools.py # Tool orchestration, _discover_tools(), handle_function_call()
├── toolsets.py # Toolset definitions, _HERMES_CORE_TOOLS list
├── cli.py # HermesCLI class — interactive CLI orchestrator
├── hermes_state.py # SessionDB — SQLite session store (FTS5 search)
├── agent/ # Agent internals
│ ├── prompt_builder.py # System prompt assembly
├── agent/ # Agent internals (extracted from run_agent.py)
├── model_metadata.py # Model context lengths, token estimation
│ ├── context_compressor.py # Auto context compression
│ ├── prompt_caching.py # Anthropic prompt caching
│ ├── auxiliary_client.py # Auxiliary LLM client (vision, summarization)
│ ├── model_metadata.py # Model context lengths, token estimation
│ ├── prompt_builder.py # System prompt assembly (identity, skills index, context files)
│ ├── display.py # KawaiiSpinner, tool preview formatting
│ ├── skill_commands.py # Skill slash commands (shared CLI/gateway)
│ └── trajectory.py # Trajectory saving helpers
├── hermes_cli/ # CLI subcommands and setup
│ ├── main.py # Entry point — all `hermes` subcommands
│ ├── config.py # DEFAULT_CONFIG, OPTIONAL_ENV_VARS, migration
│ ├── commands.py # Slash command definitions + SlashCommandCompleter
│ ├── callbacks.py # Terminal callbacks (clarify, sudo, approval)
├── hermes_cli/ # CLI implementation
│ ├── main.py # Entry point, command dispatcher
│ ├── banner.py # Welcome banner, ASCII art, skills summary
│ ├── commands.py # Slash command definitions + autocomplete
│ ├── callbacks.py # Interactive prompt callbacks (clarify, sudo, approval)
│ ├── setup.py # Interactive setup wizard
│ ├── skin_engine.py # Skin/theme engine — CLI visual customization
│ ├── skills_config.py # `hermes skills` — enable/disable skills per platform
│ ├── tools_config.py # `hermes tools` — enable/disable tools per platform
│ ├── skills_hub.py # `/skills` slash command (search, browse, install)
│ ├── models.py # Model catalog, provider model lists
── auth.py # Provider credential resolution
├── tools/ # Tool implementations (one file per tool)
│ ├── registry.py # Central tool registry (schemas, handlers, dispatch)
│ ├── approval.py # Dangerous command detection
│ ├── terminal_tool.py # Terminal orchestration
│ ├── process_registry.py # Background process management
│ ├── file_tools.py # File read/write/search/patch
│ ├── web_tools.py # Firecrawl search/extract
│ ├── browser_tool.py # Browserbase browser automation
│ ├── code_execution_tool.py # execute_code sandbox
│ ├── delegate_tool.py # Subagent delegation
│ ├── mcp_tool.py # MCP client (~1050 lines)
│ └── environments/ # Terminal backends (local, docker, ssh, modal, daytona, singularity)
├── gateway/ # Messaging platform gateway
│ ├── run.py # Main loop, slash commands, message dispatch
│ ├── session.py # SessionStore — conversation persistence
│ └── platforms/ # Adapters: telegram, discord, slack, whatsapp, homeassistant, signal
├── acp_adapter/ # ACP server (VS Code / Zed / JetBrains integration)
├── cron/ # Scheduler (jobs.py, scheduler.py)
├── environments/ # RL training environments (Atropos)
├── tests/ # Pytest suite (~3000 tests)
│ ├── config.py # Config management & migration
│ ├── status.py # Status display
│ ├── doctor.py # Diagnostics
│ ├── gateway.py # Gateway management
│ ├── uninstall.py # Uninstaller
── cron.py # Cron job management
│ └── skills_hub.py # Skills Hub CLI + /skills slash command
├── tools/ # Tool implementations
│ ├── registry.py # Central tool registry (schemas, handlers, dispatch)
│ ├── approval.py # Dangerous command detection + per-session approval
│ ├── environments/ # Terminal execution backends
│ ├── base.py # BaseEnvironment ABC
│ ├── local.py # Local execution with interrupt support
│ ├── docker.py # Docker container execution
│ ├── ssh.py # SSH remote execution
│ ├── singularity.py # Singularity/Apptainer + SIF management
│ ├── modal.py # Modal cloud execution
│ └── daytona.py # Daytona cloud sandboxes
│ ├── terminal_tool.py # Terminal orchestration (sudo, lifecycle, factory)
│ ├── todo_tool.py # Planning & task management
│ ├── process_registry.py # Background process management
│ └── ... # Other tool files
├── gateway/ # Messaging platform adapters
│ ├── platforms/ # Platform-specific adapters (telegram, discord, slack, whatsapp)
│ └── ...
├── cron/ # Scheduler implementation
├── environments/ # RL training environments (Atropos integration)
├── skills/ # Bundled skill sources
├── optional-skills/ # Official optional skills (not activated by default)
├── cli.py # Interactive CLI orchestrator (HermesCLI class)
├── run_agent.py # AIAgent class (core conversation loop)
├── model_tools.py # Tool orchestration (thin layer over tools/registry.py)
├── toolsets.py # Tool groupings
├── toolset_distributions.py # Probability-based tool selection
└── batch_runner.py # Parallel batch processing
```
**User config:** `~/.hermes/config.yaml` (settings), `~/.hermes/.env` (API keys)
**User Configuration** (stored in `~/.hermes/`):
- `~/.hermes/config.yaml` - Settings (model, terminal, toolsets, etc.)
- `~/.hermes/.env` - API keys and secrets
- `~/.hermes/pairing/` - DM pairing data
- `~/.hermes/hooks/` - Custom event hooks
- `~/.hermes/image_cache/` - Cached user images
- `~/.hermes/audio_cache/` - Cached user voice messages
- `~/.hermes/sticker_cache.json` - Telegram sticker descriptions
## File Dependency Chain
@@ -75,275 +86,600 @@ model_tools.py (imports tools/registry + triggers tool discovery)
run_agent.py, cli.py, batch_runner.py, environments/
```
Each tool file co-locates its schema, handler, and registration. `model_tools.py` is a thin orchestration layer.
---
## AIAgent Class (run_agent.py)
## AIAgent Class
The main agent is implemented in `run_agent.py`:
```python
class AIAgent:
def __init__(self,
model: str = "anthropic/claude-opus-4.6",
max_iterations: int = 90,
def __init__(
self,
model: str = "anthropic/claude-sonnet-4",
api_key: str = None,
base_url: str = "https://openrouter.ai/api/v1",
max_iterations: int = 60, # Max tool-calling loops
enabled_toolsets: list = None,
disabled_toolsets: list = None,
quiet_mode: bool = False,
save_trajectories: bool = False,
platform: str = None, # "cli", "telegram", etc.
session_id: str = None,
skip_context_files: bool = False,
skip_memory: bool = False,
# ... plus provider, api_mode, callbacks, routing params
): ...
def chat(self, message: str) -> str:
"""Simple interface — returns final response string."""
def run_conversation(self, user_message: str, system_message: str = None,
conversation_history: list = None, task_id: str = None) -> dict:
"""Full interface — returns dict with final_response + messages."""
verbose_logging: bool = False,
quiet_mode: bool = False, # Suppress progress output
tool_progress_callback: callable = None, # Called on each tool use
):
# Initialize OpenAI client, load tools based on toolsets
...
def chat(self, user_message: str, task_id: str = None) -> str:
# Main entry point - runs the agent loop
...
```
### Agent Loop
The core loop is inside `run_conversation()` — entirely synchronous:
The core loop in `_run_agent_loop()`:
```
1. Add user message to conversation
2. Call LLM with tools
3. If LLM returns tool calls:
- Execute each tool
- Add tool results to conversation
- Go to step 2
4. If LLM returns text response:
- Return response to user
```
```python
while api_call_count < self.max_iterations and self.iteration_budget.remaining > 0:
response = client.chat.completions.create(model=model, messages=messages, tools=tool_schemas)
while turns < max_turns:
response = client.chat.completions.create(
model=model,
messages=messages,
tools=tool_schemas,
)
if response.tool_calls:
for tool_call in response.tool_calls:
result = handle_function_call(tool_call.name, tool_call.args, task_id)
result = await execute_tool(tool_call)
messages.append(tool_result_message(result))
api_call_count += 1
turns += 1
else:
return response.content
```
Messages follow OpenAI format: `{"role": "system/user/assistant/tool", ...}`. Reasoning content is stored in `assistant_msg["reasoning"]`.
### Conversation Management
Messages are stored as a list of dicts following OpenAI format:
```python
messages = [
{"role": "system", "content": "You are a helpful assistant..."},
{"role": "user", "content": "Search for Python tutorials"},
{"role": "assistant", "content": None, "tool_calls": [...]},
{"role": "tool", "tool_call_id": "...", "content": "..."},
{"role": "assistant", "content": "Here's what I found..."},
]
```
### Reasoning Model Support
For models that support chain-of-thought reasoning:
- Extract `reasoning_content` from API responses
- Store in `assistant_msg["reasoning"]` for trajectory export
- Pass back via `reasoning_content` field on subsequent turns
---
## CLI Architecture (cli.py)
- **Rich** for banner/panels, **prompt_toolkit** for input with autocomplete
- **KawaiiSpinner** (`agent/display.py`) — animated faces during API calls, `┊` activity feed for tool results
- `load_cli_config()` in cli.py merges hardcoded defaults + user config YAML
- **Skin engine** (`hermes_cli/skin_engine.py`) — data-driven CLI theming; initialized from `display.skin` config key at startup; skins customize banner colors, spinner faces/verbs/wings, tool prefix, response box, branding text
- `process_command()` is a method on `HermesCLI` (not in commands.py)
- Skill slash commands: `agent/skill_commands.py` scans `~/.hermes/skills/`, injects as **user message** (not system prompt) to preserve prompt caching
The interactive CLI uses:
- **Rich** - For the welcome banner and styled panels
- **prompt_toolkit** - For fixed input area with history, `patch_stdout`, slash command autocomplete, and floating completion menus
- **KawaiiSpinner** (in run_agent.py) - Animated kawaii faces during API calls; clean `┊` activity feed for tool execution results
Key components:
- `HermesCLI` class - Main CLI controller with commands and conversation loop
- `SlashCommandCompleter` - Autocomplete dropdown for `/commands` (type `/` to see all)
- `agent/skill_commands.py` - Scans skills and builds invocation messages (shared with gateway)
- `load_cli_config()` - Loads config, sets environment variables for terminal
- `build_welcome_banner()` - Displays ASCII art logo, tools, and skills summary
CLI UX notes:
- Thinking spinner (during LLM API call) shows animated kawaii face + verb (`(⌐■_■) deliberating...`)
- When LLM returns tool calls, the spinner clears silently (no "got it!" noise)
- Tool execution results appear as a clean activity feed: `┊ {emoji} {verb} {detail} {duration}`
- "got it!" only appears when the LLM returns a final text response (`⚕ ready`)
- The prompt shows `⚕ ` when the agent is working, `` when idle
- Pasting 5+ lines auto-saves to `~/.hermes/pastes/` and collapses to a reference
- Multi-line input via Alt+Enter or Ctrl+J
- `/commands` - Process user commands like `/help`, `/clear`, `/personality`, etc.
- `/skill-name` - Invoke installed skills directly (e.g., `/axolotl`, `/gif-search`)
CLI uses `quiet_mode=True` when creating AIAgent to suppress verbose logging.
### Skill Slash Commands
Every installed skill in `~/.hermes/skills/` is automatically registered as a slash command.
The skill name (from frontmatter or folder name) becomes the command: `axolotl``/axolotl`.
Implementation (`agent/skill_commands.py`, shared between CLI and gateway):
1. `scan_skill_commands()` scans all SKILL.md files at startup
2. `build_skill_invocation_message()` loads the SKILL.md content and builds a user-turn message
3. The message includes the full skill content, a list of supporting files (not loaded), and the user's instruction
4. Supporting files can be loaded on demand via the `skill_view` tool
5. Injected as a **user message** (not system prompt) to preserve prompt caching
### Adding CLI Commands
1. Add to `COMMANDS` dict in `hermes_cli/commands.py`
2. Add handler in `HermesCLI.process_command()` in `cli.py`
3. For persistent settings, use `save_config_value()` in `cli.py`
1. Add to `COMMANDS` dict with description
2. Add handler in `process_command()` method
3. For persistent settings, use `save_config_value()` to update config
---
## Hermes CLI Commands
The unified `hermes` command provides all functionality:
| Command | Description |
|---------|-------------|
| `hermes` | Interactive chat (default) |
| `hermes chat -q "..."` | Single query mode |
| `hermes setup` | Configure API keys and settings |
| `hermes config` | View current configuration |
| `hermes config edit` | Open config in editor |
| `hermes config set KEY VAL` | Set a specific value |
| `hermes config check` | Check for missing config |
| `hermes config migrate` | Prompt for missing config interactively |
| `hermes status` | Show configuration status |
| `hermes doctor` | Diagnose issues |
| `hermes update` | Update to latest (checks for new config) |
| `hermes uninstall` | Uninstall (can keep configs for reinstall) |
| `hermes gateway` | Start gateway (messaging + cron scheduler) |
| `hermes gateway setup` | Configure messaging platforms interactively |
| `hermes gateway install` | Install gateway as system service |
| `hermes cron list` | View scheduled jobs |
| `hermes cron status` | Check if cron scheduler is running |
| `hermes version` | Show version info |
| `hermes pairing list/approve/revoke` | Manage DM pairing codes |
---
## Messaging Gateway
The gateway connects Hermes to Telegram, Discord, Slack, and WhatsApp.
### Setup
The interactive setup wizard handles platform configuration:
```bash
hermes gateway setup # Arrow-key menu of all platforms, configure tokens/allowlists/home channels
```
This is the recommended way to configure messaging. It shows which platforms are already set up, walks through each one interactively, and offers to start/restart the gateway service at the end.
Platforms can also be configured manually in `~/.hermes/.env`:
### Configuration (in `~/.hermes/.env`):
```bash
# Telegram
TELEGRAM_BOT_TOKEN=123456:ABC-DEF... # From @BotFather
TELEGRAM_ALLOWED_USERS=123456789,987654 # Comma-separated user IDs (from @userinfobot)
# Discord
DISCORD_BOT_TOKEN=MTIz... # From Developer Portal
DISCORD_ALLOWED_USERS=123456789012345678 # Comma-separated user IDs
# Agent Behavior
HERMES_MAX_ITERATIONS=60 # Max tool-calling iterations
MESSAGING_CWD=/home/myuser # Terminal working directory for messaging
# Tool progress is configured in config.yaml (display.tool_progress: off|new|all|verbose)
```
### Working Directory Behavior
- **CLI (`hermes` command)**: Uses current directory (`.``os.getcwd()`)
- **Messaging (Telegram/Discord)**: Uses `MESSAGING_CWD` (default: home directory)
This is intentional: CLI users are in a terminal and expect the agent to work in their current directory, while messaging users need a consistent starting location.
### Security (User Allowlists):
**IMPORTANT**: By default, the gateway denies all users who are not in an allowlist or paired via DM.
The gateway checks `{PLATFORM}_ALLOWED_USERS` environment variables:
- If set: Only listed user IDs can interact with the bot
- If unset: All users are denied unless `GATEWAY_ALLOW_ALL_USERS=true` is set
Users can find their IDs:
- **Telegram**: Message [@userinfobot](https://t.me/userinfobot)
- **Discord**: Enable Developer Mode, right-click name → Copy ID
### DM Pairing System
Instead of static allowlists, users can pair via one-time codes:
1. Unknown user DMs the bot → receives pairing code
2. Owner runs `hermes pairing approve <platform> <code>`
3. User is permanently authorized
Security: 8-char codes, 1-hour expiry, rate-limited (1/10min/user), max 3 pending per platform, lockout after 5 failed attempts, `chmod 0600` on data files.
Files: `gateway/pairing.py`, `hermes_cli/pairing.py`
### Event Hooks
Hooks fire at lifecycle points. Place hook directories in `~/.hermes/hooks/`:
```
~/.hermes/hooks/my-hook/
├── HOOK.yaml # name, description, events list
└── handler.py # async def handle(event_type, context): ...
```
Events: `gateway:startup`, `session:start`, `session:reset`, `agent:start`, `agent:step`, `agent:end`, `command:*`
The `agent:step` event fires each iteration of the tool-calling loop with tool names and results.
Files: `gateway/hooks.py`
### Tool Progress Notifications
When `tool_progress` is enabled in `config.yaml`, the bot sends status messages as it works:
- `💻 \`ls -la\`...` (terminal commands show the actual command)
- `🔍 web_search...`
- `📄 web_extract...`
- `🐍 execute_code...` (programmatic tool calling sandbox)
- `🔀 delegate_task...` (subagent delegation)
- `❓ clarify...` (user question, CLI-only)
Modes:
- `new`: Only when switching to a different tool (less spam)
- `all`: Every single tool call
### Typing Indicator
The gateway keeps the "typing..." indicator active throughout processing, refreshing every 4 seconds. This lets users know the bot is working even during long tool-calling sequences.
### Platform Toolsets:
Each platform has a dedicated toolset in `toolsets.py`:
- `hermes-telegram`: Full tools including terminal (with safety checks)
- `hermes-discord`: Full tools including terminal
- `hermes-whatsapp`: Full tools including terminal
---
## Configuration System
Configuration files are stored in `~/.hermes/` for easy user access:
- `~/.hermes/config.yaml` - All settings (model, terminal, compression, etc.)
- `~/.hermes/.env` - API keys and secrets
### Adding New Configuration Options
When adding new configuration variables, you MUST follow this process:
#### For config.yaml options:
1. Add to `DEFAULT_CONFIG` in `hermes_cli/config.py`
2. **CRITICAL**: Bump `_config_version` in `DEFAULT_CONFIG` when adding required fields
3. This triggers migration prompts for existing users on next `hermes update` or `hermes setup`
Example:
```python
DEFAULT_CONFIG = {
# ... existing config ...
"new_feature": {
"enabled": True,
"option": "default_value",
},
# BUMP THIS when adding required fields
"_config_version": 2, # Was 1, now 2
}
```
#### For .env variables (API keys/secrets):
1. Add to `REQUIRED_ENV_VARS` or `OPTIONAL_ENV_VARS` in `hermes_cli/config.py`
2. Include metadata for the migration system:
```python
OPTIONAL_ENV_VARS = {
# ... existing vars ...
"NEW_API_KEY": {
"description": "What this key is for",
"prompt": "Display name in prompts",
"url": "https://where-to-get-it.com/",
"tools": ["tools_it_enables"], # What tools need this
"password": True, # Mask input
},
}
```
#### Update related files:
- `hermes_cli/setup.py` - Add prompts in the setup wizard
- `cli-config.yaml.example` - Add example with comments
- Update README.md if user-facing
### Config Version Migration
The system uses `_config_version` to detect outdated configs:
1. `check_for_missing_config()` compares user config to `DEFAULT_CONFIG`
2. `migrate_config()` interactively prompts for missing values
3. Called automatically by `hermes update` and optionally by `hermes setup`
---
## Environment Variables
API keys are loaded from `~/.hermes/.env`:
- `OPENROUTER_API_KEY` - Main LLM API access (primary provider)
- `FIRECRAWL_API_KEY` - Web search/extract tools
- `FIRECRAWL_API_URL` - Self-hosted Firecrawl endpoint (optional)
- `BROWSERBASE_API_KEY` / `BROWSERBASE_PROJECT_ID` - Browser automation
- `FAL_KEY` - Image generation (FLUX model)
- `NOUS_API_KEY` - Vision and Mixture-of-Agents tools
Terminal tool configuration (in `~/.hermes/config.yaml`):
- `terminal.backend` - Backend: local, docker, singularity, modal, daytona, or ssh
- `terminal.cwd` - Working directory ("." = host CWD for local only; for remote backends set an absolute path inside the target, or omit to use the backend's default)
- `terminal.docker_image` - Image for Docker backend
- `terminal.singularity_image` - Image for Singularity backend
- `terminal.modal_image` - Image for Modal backend
- `terminal.daytona_image` - Image for Daytona backend
- `DAYTONA_API_KEY` - API key for Daytona backend (in .env)
- SSH: `TERMINAL_SSH_HOST`, `TERMINAL_SSH_USER`, `TERMINAL_SSH_KEY` in .env
Agent behavior (in `~/.hermes/.env`):
- `HERMES_MAX_ITERATIONS` - Max tool-calling iterations (default: 60)
- `MESSAGING_CWD` - Working directory for messaging platforms (default: ~)
- `display.tool_progress` in config.yaml - Tool progress: `off`, `new`, `all`, `verbose`
- `OPENAI_API_KEY` - Voice transcription (Whisper STT)
- `SLACK_BOT_TOKEN` / `SLACK_APP_TOKEN` - Slack integration (Socket Mode)
- `SLACK_ALLOWED_USERS` - Comma-separated Slack user IDs
- `HERMES_HUMAN_DELAY_MODE` - Response pacing: off/natural/custom
- `HERMES_HUMAN_DELAY_MIN_MS` / `HERMES_HUMAN_DELAY_MAX_MS` - Custom delay range
### Dangerous Command Approval
The terminal tool includes safety checks for potentially destructive commands (e.g., `rm -rf`, `DROP TABLE`, `chmod 777`, etc.):
**Behavior by Backend:**
- **Docker/Singularity/Modal**: Commands run unrestricted (isolated containers)
- **Local/SSH**: Dangerous commands trigger approval flow
**Approval Flow (CLI):**
```
⚠️ Potentially dangerous command detected: recursive delete
rm -rf /tmp/test
[o]nce | [s]ession | [a]lways | [d]eny
Choice [o/s/a/D]:
```
**Approval Flow (Messaging):**
- Command is blocked with explanation
- Agent explains the command was blocked for safety
- User must add the pattern to their allowlist via `hermes config edit` or run the command directly on their machine
**Configuration:**
- `command_allowlist` in `~/.hermes/config.yaml` stores permanently allowed patterns
- Add patterns via "always" approval or edit directly
**Sudo Handling (Messaging):**
- If sudo fails over messaging, output includes tip to add `SUDO_PASSWORD` to `~/.hermes/.env`
---
## Background Process Management
The `process` tool works alongside `terminal` for managing long-running background processes:
**Starting a background process:**
```python
terminal(command="pytest -v tests/", background=true)
# Returns: {"session_id": "proc_abc123", "pid": 12345, ...}
```
**Managing it with the process tool:**
- `process(action="list")` -- show all running/recent processes
- `process(action="poll", session_id="proc_abc123")` -- check status + new output
- `process(action="log", session_id="proc_abc123")` -- full output with pagination
- `process(action="wait", session_id="proc_abc123", timeout=600)` -- block until done
- `process(action="kill", session_id="proc_abc123")` -- terminate
- `process(action="write", session_id="proc_abc123", data="y")` -- send stdin
- `process(action="submit", session_id="proc_abc123", data="yes")` -- send + Enter
**Key behaviors:**
- Background processes execute through the configured terminal backend (local/Docker/Modal/Daytona/SSH/Singularity) -- never directly on the host unless `TERMINAL_ENV=local`
- The `wait` action blocks the tool call until the process finishes, times out, or is interrupted by a new user message
- PTY mode (`pty=true` on terminal) enables interactive CLI tools (Codex, Claude Code)
- In RL training, background processes are auto-killed when the episode ends (`tool_context.cleanup()`)
- In the gateway, sessions with active background processes are exempt from idle reset
- The process registry checkpoints to `~/.hermes/processes.json` for crash recovery
Files: `tools/process_registry.py` (registry + handler), `tools/terminal_tool.py` (spawn integration)
---
## Adding New Tools
Requires changes in **3 files**:
Adding a tool requires changes in **2 files** (the tool file and `toolsets.py`):
1. **Create `tools/your_tool.py`** with handler, schema, check function, and registry call:
**1. Create `tools/your_tool.py`:**
```python
import json, os
# tools/example_tool.py
import json
import os
from tools.registry import registry
def check_requirements() -> bool:
def check_example_requirements() -> bool:
"""Check if required API keys/dependencies are available."""
return bool(os.getenv("EXAMPLE_API_KEY"))
def example_tool(param: str, task_id: str = None) -> str:
return json.dumps({"success": True, "data": "..."})
"""Execute the tool and return JSON string result."""
try:
result = {"success": True, "data": "..."}
return json.dumps(result, ensure_ascii=False)
except Exception as e:
return json.dumps({"error": str(e)}, ensure_ascii=False)
EXAMPLE_SCHEMA = {
"name": "example_tool",
"description": "Does something useful.",
"parameters": {
"type": "object",
"properties": {
"param": {"type": "string", "description": "The parameter"}
},
"required": ["param"]
}
}
registry.register(
name="example_tool",
toolset="example",
schema={"name": "example_tool", "description": "...", "parameters": {...}},
handler=lambda args, **kw: example_tool(param=args.get("param", ""), task_id=kw.get("task_id")),
check_fn=check_requirements,
schema=EXAMPLE_SCHEMA,
handler=lambda args, **kw: example_tool(
param=args.get("param", ""), task_id=kw.get("task_id")),
check_fn=check_example_requirements,
requires_env=["EXAMPLE_API_KEY"],
)
```
**2. Add import** in `model_tools.py` `_discover_tools()` list.
2. **Add to `toolsets.py`**: Add `"example_tool"` to `_HERMES_CORE_TOOLS` if it should be in all platform toolsets, or create a new toolset entry.
**3. Add to `toolsets.py`** — either `_HERMES_CORE_TOOLS` (all platforms) or a new toolset.
3. **Add discovery import** in `model_tools.py`'s `_discover_tools()` list: `"tools.example_tool"`.
The registry handles schema collection, dispatch, availability checking, and error wrapping. All handlers MUST return a JSON string.
That's it. The registry handles schema collection, dispatch, availability checking, and error wrapping automatically. No edits to `TOOLSET_REQUIREMENTS`, `handle_function_call()`, `get_all_tool_names()`, or any other data structure.
**Agent-level tools** (todo, memory): intercepted by `run_agent.py` before `handle_function_call()`. See `todo_tool.py` for the pattern.
**Optional:** Add to `OPTIONAL_ENV_VARS` in `hermes_cli/config.py` for the setup wizard, and to `toolset_distributions.py` for batch processing.
**Special case: tools that need agent-level state** (like `todo`, `memory`):
These are intercepted by `run_agent.py`'s tool dispatch loop *before* `handle_function_call()`. The registry still holds their schemas, but dispatch returns a stub error as a safety fallback. See `todo_tool.py` for the pattern.
All tool handlers MUST return a JSON string. The registry's `dispatch()` wraps all exceptions in `{"error": "..."}` automatically.
### Dynamic Tool Availability
Tools declare their requirements at registration time via `check_fn` and `requires_env`. The registry checks `check_fn()` when building tool definitions -- tools whose check fails are silently excluded.
### Stateful Tools
Tools that maintain state (terminal, browser) require:
- `task_id` parameter for session isolation between concurrent tasks
- `cleanup_*()` function to release resources
- Cleanup is called automatically in run_agent.py after conversation completes
---
## Adding Configuration
## Trajectory Format
### config.yaml options:
1. Add to `DEFAULT_CONFIG` in `hermes_cli/config.py`
2. Bump `_config_version` (currently 5) to trigger migration for existing users
### .env variables:
1. Add to `OPTIONAL_ENV_VARS` in `hermes_cli/config.py` with metadata:
```python
"NEW_API_KEY": {
"description": "What it's for",
"prompt": "Display name",
"url": "https://...",
"password": True,
"category": "tool", # provider, tool, messaging, setting
},
Conversations are saved in ShareGPT format for training:
```json
{"from": "system", "value": "System prompt with <tools>...</tools>"}
{"from": "human", "value": "User message"}
{"from": "gpt", "value": "<think>reasoning</think>\n<tool_call>{...}</tool_call>"}
{"from": "tool", "value": "<tool_response>{...}</tool_response>"}
{"from": "gpt", "value": "Final response"}
```
### Config loaders (two separate systems):
Tool calls use `<tool_call>` XML tags, responses use `<tool_response>` tags, reasoning uses `<think>` tags.
| Loader | Used by | Location |
|--------|---------|----------|
| `load_cli_config()` | CLI mode | `cli.py` |
| `load_config()` | `hermes tools`, `hermes setup` | `hermes_cli/config.py` |
| Direct YAML load | Gateway | `gateway/run.py` |
---
## Skin/Theme System
The skin engine (`hermes_cli/skin_engine.py`) provides data-driven CLI visual customization. Skins are **pure data** — no code changes needed to add a new skin.
### Architecture
```
hermes_cli/skin_engine.py # SkinConfig dataclass, built-in skins, YAML loader
~/.hermes/skins/*.yaml # User-installed custom skins (drop-in)
```
- `init_skin_from_config()` — called at CLI startup, reads `display.skin` from config
- `get_active_skin()` — returns cached `SkinConfig` for the current skin
- `set_active_skin(name)` — switches skin at runtime (used by `/skin` command)
- `load_skin(name)` — loads from user skins first, then built-ins, then falls back to default
- Missing skin values inherit from the `default` skin automatically
### What skins customize
| Element | Skin Key | Used By |
|---------|----------|---------|
| Banner panel border | `colors.banner_border` | `banner.py` |
| Banner panel title | `colors.banner_title` | `banner.py` |
| Banner section headers | `colors.banner_accent` | `banner.py` |
| Banner dim text | `colors.banner_dim` | `banner.py` |
| Banner body text | `colors.banner_text` | `banner.py` |
| Response box border | `colors.response_border` | `cli.py` |
| Spinner faces (waiting) | `spinner.waiting_faces` | `display.py` |
| Spinner faces (thinking) | `spinner.thinking_faces` | `display.py` |
| Spinner verbs | `spinner.thinking_verbs` | `display.py` |
| Spinner wings (optional) | `spinner.wings` | `display.py` |
| Tool output prefix | `tool_prefix` | `display.py` |
| Agent name | `branding.agent_name` | `banner.py`, `cli.py` |
| Welcome message | `branding.welcome` | `cli.py` |
| Response box label | `branding.response_label` | `cli.py` |
| Prompt symbol | `branding.prompt_symbol` | `cli.py` |
### Built-in skins
- `default` — Classic Hermes gold/kawaii (the current look)
- `ares` — Crimson/bronze war-god theme with custom spinner wings
- `mono` — Clean grayscale monochrome
- `slate` — Cool blue developer-focused theme
### Adding a built-in skin
Add to `_BUILTIN_SKINS` dict in `hermes_cli/skin_engine.py`:
### Trajectory Export
```python
"mytheme": {
"name": "mytheme",
"description": "Short description",
"colors": { ... },
"spinner": { ... },
"branding": { ... },
"tool_prefix": "",
},
agent = AIAgent(save_trajectories=True)
agent.chat("Do something")
# Saves to trajectories/*.jsonl in ShareGPT format
```
### User skins (YAML)
Users create `~/.hermes/skins/<name>.yaml`:
```yaml
name: cyberpunk
description: Neon-soaked terminal theme
colors:
banner_border: "#FF00FF"
banner_title: "#00FFFF"
banner_accent: "#FF1493"
spinner:
thinking_verbs: ["jacking in", "decrypting", "uploading"]
wings:
- ["⟨⚡", "⚡⟩"]
branding:
agent_name: "Cyber Agent"
response_label: " ⚡ Cyber "
tool_prefix: "▏"
```
Activate with `/skin cyberpunk` or `display.skin: cyberpunk` in config.yaml.
---
## Important Policies
## Batch Processing (batch_runner.py)
### Prompt Caching Must Not Break
Hermes-Agent ensures caching remains valid throughout a conversation. **Do NOT implement changes that would:**
- Alter past context mid-conversation
- Change toolsets mid-conversation
- Reload memories or rebuild system prompts mid-conversation
Cache-breaking forces dramatically higher costs. The ONLY time we alter context is during context compression.
### Working Directory Behavior
- **CLI**: Uses current directory (`.``os.getcwd()`)
- **Messaging**: Uses `MESSAGING_CWD` env var (default: home directory)
### Background Process Notifications (Gateway)
When `terminal(background=true, check_interval=...)` is used, the gateway runs a watcher that
pushes status updates to the user's chat. Control verbosity with `display.background_process_notifications`
in config.yaml (or `HERMES_BACKGROUND_NOTIFICATIONS` env var):
- `all` — running-output updates + final message (default)
- `result` — only the final completion message
- `error` — only the final message when exit code != 0
- `off` — no watcher messages at all
---
## Known Pitfalls
### DO NOT use `simple_term_menu` for interactive menus
Rendering bugs in tmux/iTerm2 — ghosting on scroll. Use `curses` (stdlib) instead. See `hermes_cli/tools_config.py` for the pattern.
### DO NOT use `\033[K` (ANSI erase-to-EOL) in spinner/display code
Leaks as literal `?[K` text under `prompt_toolkit`'s `patch_stdout`. Use space-padding: `f"\r{line}{' ' * pad}"`.
### `_last_resolved_tool_names` is a process-global in `model_tools.py`
When subagents overwrite this global, `execute_code` calls after delegation may fail with missing tool imports. Known bug.
### Tests must not write to `~/.hermes/`
The `_isolate_hermes_home` autouse fixture in `tests/conftest.py` redirects `HERMES_HOME` to a temp dir. Never hardcode `~/.hermes/` paths in tests.
---
## Testing
For processing multiple prompts:
- Parallel execution with multiprocessing
- Content-based resume for fault tolerance (matches on prompt text, not indices)
- Toolset distributions control probabilistic tool availability per prompt
- Output: `data/<run_name>/trajectories.jsonl` (combined) + individual batch files
```bash
source .venv/bin/activate
python -m pytest tests/ -q # Full suite (~3000 tests, ~3 min)
python -m pytest tests/test_model_tools.py -q # Toolset resolution
python -m pytest tests/test_cli_init.py -q # CLI config loading
python -m pytest tests/gateway/ -q # Gateway tests
python -m pytest tests/tools/ -q # Tool-level tests
python batch_runner.py \
--dataset_file=prompts.jsonl \
--batch_size=20 \
--num_workers=4 \
--run_name=my_run
```
Always run the full suite before pushing changes.
---
## Skills System
Skills are on-demand knowledge documents the agent can load. Compatible with the [agentskills.io](https://agentskills.io/specification) open standard.
```
skills/
├── mlops/ # Category folder
│ ├── axolotl/ # Skill folder
│ │ ├── SKILL.md # Main instructions (required)
│ │ ├── references/ # Additional docs, API specs
│ │ ├── templates/ # Output formats, configs
│ │ └── assets/ # Supplementary files (agentskills.io)
│ └── vllm/
│ └── SKILL.md
├── .hub/ # Skills Hub state (gitignored)
│ ├── lock.json # Installed skill provenance
│ ├── quarantine/ # Pending security review
│ ├── audit.log # Security scan history
│ ├── taps.json # Custom source repos
│ └── index-cache/ # Cached remote indexes
```
**Progressive disclosure** (token-efficient):
1. `skills_categories()` - List category names (~50 tokens)
2. `skills_list(category)` - Name + description per skill (~3k tokens)
3. `skill_view(name)` - Full content + tags + linked files
SKILL.md files use YAML frontmatter (agentskills.io format):
```yaml
---
name: skill-name
description: Brief description for listing
version: 1.0.0
metadata:
hermes:
tags: [tag1, tag2]
related_skills: [other-skill]
---
# Skill Content...
```
**Skills Hub** — user-driven skill search/install from online registries and official optional skills. Sources: official optional skills (shipped with repo, labeled "official"), GitHub (openai/skills, anthropics/skills, custom taps), ClawHub, Claude marketplace, LobeHub. Not exposed as an agent tool — the model cannot search for or install skills. Users manage skills via `hermes skills browse/search/install` CLI commands or the `/skills` slash command in chat.
Key files:
- `tools/skills_tool.py` — Agent-facing skill list/view (progressive disclosure)
- `tools/skills_guard.py` — Security scanner (regex + LLM audit, trust-aware install policy)
- `tools/skills_hub.py` — Source adapters (OptionalSkillSource, GitHub, ClawHub, Claude marketplace, LobeHub), lock file, auth
- `hermes_cli/skills_hub.py` — CLI subcommands + `/skills` slash command handler
---
## Testing Changes
After making changes:
1. Run `hermes doctor` to check setup
2. Run `hermes config check` to verify config
3. Test with `hermes chat -q "test message"`
4. For new config options, test fresh install: `rm -rf ~/.hermes && hermes setup`

View File

@@ -118,7 +118,7 @@ hermes-agent/
├── cli.py # HermesCLI class — interactive TUI, prompt_toolkit integration
├── model_tools.py # Tool orchestration (thin layer over tools/registry.py)
├── toolsets.py # Tool groupings and presets (hermes-cli, hermes-telegram, etc.)
├── hermes_state.py # SQLite session database with FTS5 full-text search, session titles
├── hermes_state.py # SQLite session database with FTS5 full-text search
├── batch_runner.py # Parallel batch processing for trajectory generation
├── agent/ # Agent internals (extracted modules)
@@ -139,8 +139,7 @@ hermes-agent/
│ ├── commands.py # Slash command definitions + autocomplete
│ ├── callbacks.py # Interactive callbacks (clarify, sudo, approval)
│ ├── doctor.py # Diagnostics
── skills_hub.py # Skills Hub CLI + /skills slash command
│ └── skin_engine.py # Skin/theme engine — data-driven CLI visual customization
── skills_hub.py # Skills Hub CLI + /skills slash command
├── tools/ # Tool implementations (self-registering)
│ ├── registry.py # Central tool registry (schemas, handlers, dispatch)
@@ -219,7 +218,7 @@ User message → AIAgent._run_agent_loop()
- **Self-registering tools**: Each tool file calls `registry.register()` at import time. `model_tools.py` triggers discovery by importing all tool modules.
- **Toolset grouping**: Tools are grouped into toolsets (`web`, `terminal`, `file`, `browser`, etc.) that can be enabled/disabled per platform.
- **Session persistence**: All conversations are stored in SQLite (`hermes_state.py`) with full-text search and unique session titles. JSON logs go to `~/.hermes/sessions/`.
- **Session persistence**: All conversations are stored in SQLite (`hermes_state.py`) with full-text search. JSON logs go to `~/.hermes/sessions/`.
- **Ephemeral injection**: System prompts and prefill messages are injected at API call time, never persisted to the database or logs.
- **Provider abstraction**: The agent works with any OpenAI-compatible API. Provider resolution happens at init time (Nous Portal OAuth, OpenRouter API key, or custom endpoint).
- **Provider routing**: When using OpenRouter, `provider_routing` in config.yaml controls provider selection (sort by throughput/latency/price, allow/ignore specific providers, data retention policies). These are injected as `extra_body.provider` in API requests.
@@ -326,9 +325,6 @@ description: Brief description (shown in skill search results)
version: 1.0.0
author: Your Name
license: MIT
platforms: [macos, linux] # Optional — restrict to specific OS platforms
# Valid: macos, linux, windows
# Omit to load on all platforms (default)
metadata:
hermes:
tags: [Category, Subcategory, Keywords]
@@ -355,18 +351,6 @@ Known failure modes and how to handle them.
How the agent confirms it worked.
```
### Platform-specific skills
Skills can declare which OS platforms they support via the `platforms` frontmatter field. Skills with this field are automatically hidden from the system prompt, `skills_list()`, and slash commands on incompatible platforms.
```yaml
platforms: [macos] # macOS only (e.g., iMessage, Apple Reminders)
platforms: [macos, linux] # macOS and Linux
platforms: [windows] # Windows only
```
If the field is omitted or empty, the skill loads on all platforms (backward compatible). See `skills/apple/` for examples of macOS-only skills.
### Skill guidelines
- **No external dependencies unless absolutely necessary.** Prefer stdlib Python, curl, and existing Hermes tools (`web_extract`, `terminal`, `read_file`).
@@ -376,56 +360,6 @@ If the field is omitted or empty, the skill loads on all platforms (backward com
---
## Adding a Skin / Theme
Hermes uses a data-driven skin system — no code changes needed to add a new skin.
**Option A: User skin (YAML file)**
Create `~/.hermes/skins/<name>.yaml`:
```yaml
name: mytheme
description: Short description of the theme
colors:
banner_border: "#HEX" # Panel border color
banner_title: "#HEX" # Panel title color
banner_accent: "#HEX" # Section header color
banner_dim: "#HEX" # Muted/dim text color
banner_text: "#HEX" # Body text color
response_border: "#HEX" # Response box border
spinner:
waiting_faces: ["(⚔)", "(⛨)"]
thinking_faces: ["(⚔)", "(⌁)"]
thinking_verbs: ["forging", "plotting"]
wings: # Optional left/right decorations
- ["⟪⚔", "⚔⟫"]
branding:
agent_name: "My Agent"
welcome: "Welcome message"
response_label: " ⚔ Agent "
prompt_symbol: "⚔ "
tool_prefix: "╎" # Tool output line prefix
```
All fields are optional — missing values inherit from the default skin.
**Option B: Built-in skin**
Add to `_BUILTIN_SKINS` dict in `hermes_cli/skin_engine.py`. Use the same schema as above but as a Python dict. Built-in skins ship with the package and are always available.
**Activating:**
- CLI: `/skin mytheme` or set `display.skin: mytheme` in config.yaml
- Config: `display: { skin: mytheme }`
See `hermes_cli/skin_engine.py` for the full schema and existing skins as examples.
---
## Cross-Platform Compatibility
Hermes runs on Linux, macOS, and Windows. When writing code that touches the OS:

21
LICENSE
View File

@@ -1,21 +0,0 @@
MIT License
Copyright (c) 2025 Nous Research
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

View File

@@ -13,11 +13,11 @@
**The self-improving AI agent built by [Nous Research](https://nousresearch.com).** It's the only agent with a built-in learning loop — it creates skills from experience, improves them during use, nudges itself to persist knowledge, searches its own past conversations, and builds a deepening model of who you are across sessions. Run it on a $5 VPS, a GPU cluster, or serverless infrastructure that costs nearly nothing when idle. It's not tied to your laptop — talk to it from Telegram while it works on a cloud VM.
Use any model you want — [Nous Portal](https://portal.nousresearch.com), [OpenRouter](https://openrouter.ai) (200+ models), [z.ai/GLM](https://z.ai), [Kimi/Moonshot](https://platform.moonshot.ai), [MiniMax](https://www.minimax.io), OpenAI, or your own endpoint. Switch with `hermes model` — no code changes, no lock-in.
Use any model you want — [Nous Portal](https://portal.nousresearch.com), [OpenRouter](https://openrouter.ai) (200+ models), OpenAI, or your own endpoint. Switch with `hermes model` — no code changes, no lock-in.
<table>
<tr><td><b>A real terminal interface</b></td><td>Full TUI with multiline editing, slash-command autocomplete, conversation history, interrupt-and-redirect, and streaming tool output.</td></tr>
<tr><td><b>Lives where you do</b></td><td>Telegram, Discord, Slack, WhatsApp, Signal, and CLI — all from a single gateway process. Voice memo transcription, cross-platform conversation continuity.</td></tr>
<tr><td><b>Lives where you do</b></td><td>Telegram, Discord, Slack, WhatsApp, and CLI — all from a single gateway process. Voice memo transcription, cross-platform conversation continuity.</td></tr>
<tr><td><b>A closed learning loop</b></td><td>Agent-curated memory with periodic nudges. Autonomous skill creation after complex tasks. Skills self-improve during use. FTS5 session search with LLM summarization for cross-session recall. <a href="https://github.com/plastic-labs/honcho">Honcho</a> dialectic user modeling. Compatible with the <a href="https://agentskills.io">agentskills.io</a> open standard.</td></tr>
<tr><td><b>Scheduled automations</b></td><td>Built-in cron scheduler with delivery to any platform. Daily reports, nightly backups, weekly audits — all in natural language, running unattended.</td></tr>
<tr><td><b>Delegates and parallelizes</b></td><td>Spawn isolated subagents for parallel workstreams. Write Python scripts that call tools via RPC, collapsing multi-step pipelines into zero-context-cost turns.</td></tr>
@@ -41,6 +41,7 @@ After installation:
```bash
source ~/.bashrc # reload shell (or: source ~/.zshrc)
hermes setup # configure your LLM provider
hermes # start chatting!
```
@@ -50,11 +51,9 @@ hermes # start chatting!
```bash
hermes # Interactive CLI — start a conversation
hermes model # Choose your LLM provider and model
hermes tools # Configure which tools are enabled
hermes config set # Set individual config values
hermes model # Switch provider or model
hermes setup # Re-run the setup wizard
hermes gateway # Start the messaging gateway (Telegram, Discord, etc.)
hermes setup # Run the full setup wizard (configures everything at once)
hermes update # Update to the latest version
hermes doctor # Diagnose any issues
```
@@ -72,7 +71,7 @@ All documentation lives at **[hermes-agent.nousresearch.com/docs](https://hermes
| [Quickstart](https://hermes-agent.nousresearch.com/docs/getting-started/quickstart) | Install → setup → first conversation in 2 minutes |
| [CLI Usage](https://hermes-agent.nousresearch.com/docs/user-guide/cli) | Commands, keybindings, personalities, sessions |
| [Configuration](https://hermes-agent.nousresearch.com/docs/user-guide/configuration) | Config file, providers, models, all options |
| [Messaging Gateway](https://hermes-agent.nousresearch.com/docs/user-guide/messaging) | Telegram, Discord, Slack, WhatsApp, Signal, Home Assistant |
| [Messaging Gateway](https://hermes-agent.nousresearch.com/docs/user-guide/messaging) | Telegram, Discord, Slack, WhatsApp, Home Assistant |
| [Security](https://hermes-agent.nousresearch.com/docs/user-guide/security) | Command approval, DM pairing, container isolation |
| [Tools & Toolsets](https://hermes-agent.nousresearch.com/docs/user-guide/features/tools) | 40+ tools, toolset system, terminal backends |
| [Skills System](https://hermes-agent.nousresearch.com/docs/user-guide/features/skills) | Procedural memory, Skills Hub, creating skills |

129
TODO.md Normal file
View File

@@ -0,0 +1,129 @@
# Hermes Agent - Future Improvements
---
## 3. Local Browser Control via CDP 🌐
**Status:** Not started (currently Browserbase cloud only)
**Priority:** Medium
Support local Chrome/Chromium via Chrome DevTools Protocol alongside existing Browserbase cloud backend.
**What other agents do:**
- **OpenClaw**: Full CDP-based Chrome control with snapshots, actions, uploads, profiles, file chooser, PDF save, console messages, tab management. Uses local Chrome for persistent login sessions.
- **Cline**: Headless browser with Computer Use (click, type, scroll, screenshot, console logs)
**Our approach:**
- Add a `local` backend option to `browser_tool.py` using Playwright or raw CDP
- Config toggle: `browser.backend: local | browserbase | auto`
- `auto` mode: try local first, fall back to Browserbase
- Local advantages: free, persistent login sessions, no API key needed
- Local disadvantages: no CAPTCHA solving, no stealth mode, requires Chrome installed
- Reuse the same 10-tool interface -- just swap the backend
- Later: Chrome profile management for persistent sessions across restarts
---
## 4. Signal Integration 📡
**Status:** Not started
**Priority:** Low
New platform adapter using signal-cli daemon (JSON-RPC HTTP + SSE). Requires Java runtime and phone number registration.
**Reference:** OpenClaw has Signal support via signal-cli.
---
## 5. Plugin/Extension System 🔌
**Status:** Partially implemented (event hooks exist in `gateway/hooks.py`)
**Priority:** Medium
Full Python plugin interface that goes beyond the current hook system.
**What other agents do:**
- **OpenClaw**: Plugin SDK with tool-send capabilities, lifecycle phase hooks (before-agent-start, after-tool-call, model-override), plugin registry with install/uninstall.
- **Pi**: Extensions are TypeScript modules that can register tools, commands, keyboard shortcuts, custom UI widgets, overlays, status lines, dialogs, compaction hooks, raw terminal input listeners. Extremely comprehensive.
- **OpenCode**: MCP client support (stdio, SSE, StreamableHTTP), OAuth auth for MCP servers. Also has Copilot/Codex plugins.
- **Codex**: Full MCP integration with skill dependencies.
- **Cline**: MCP integration + lifecycle hooks with cancellation support.
**Our approach (phased):**
### Phase 1: Enhanced hooks
- Expand the existing `gateway/hooks.py` to support more events: `before-tool-call`, `after-tool-call`, `before-response`, `context-compress`, `session-end`
- Allow hooks to modify tool results (e.g., filter sensitive output)
### Phase 2: Plugin interface
- `~/.hermes/plugins/<name>/plugin.yaml` + `handler.py`
- Plugins can: register new tools, add CLI commands, subscribe to events, inject system prompt sections
- `hermes plugin list|install|uninstall|create` CLI commands
- Plugin discovery and validation on startup
### Phase 3: MCP support (industry standard) ✅ DONE
- ✅ MCP client that connects to external MCP servers (stdio + HTTP/StreamableHTTP)
- ✅ Config: `mcp_servers` in config.yaml with connection details
- ✅ Each MCP server's tools auto-registered as a dynamic toolset
- Future: Resources, Prompts, Progress notifications, `hermes mcp` CLI command
---
## 6. MCP (Model Context Protocol) Support 🔗 ✅ DONE
**Status:** Implemented (PR #301)
**Priority:** Complete
Native MCP client support with stdio and HTTP/StreamableHTTP transports, auto-discovery, reconnection with exponential backoff, env var filtering, and credential stripping. See `docs/mcp.md` for full documentation.
**Still TODO:**
- `hermes mcp` CLI subcommand (list/test/status)
- `hermes tools` UI integration for MCP toolsets
- MCP Resources and Prompts support
- OAuth authentication for remote servers
- Progress notifications for long-running tools
---
## 8. Filesystem Checkpointing / Rollback 🔄
**Status:** Not started
**Priority:** Low-Medium
Automatic filesystem snapshots after each agent loop iteration so the user can roll back destructive changes to their project.
**What other agents do:**
- **Cline**: Workspace checkpoints at each step with Compare/Restore UI
- **OpenCode**: Git-backed workspace snapshots per step, with weekly gc
- **Codex**: Sandboxed execution with commit-per-step, rollback on failure
**Our approach:**
- After each tool call (or batch of tool calls in a single turn) that modifies files, create a lightweight checkpoint of the affected files
- Git-based when the project is a repo: auto-commit to a detached/temporary branch (`hermes/checkpoints/<session>`) after each agent turn, squash or discard on session end
- Non-git fallback: tar snapshots of changed files in `~/.hermes/checkpoints/<session_id>/`
- `hermes rollback` CLI command to restore to a previous checkpoint
- Agent-accessible via a `checkpoint` tool: `list` (show available restore points), `restore` (roll back to a named point), `diff` (show what changed since a checkpoint)
- Configurable: off by default (opt-in via `config.yaml`), since auto-committing can be surprising
- Cleanup: checkpoints expire after session ends (or configurable retention period)
- Integration with the terminal backend: works with local, SSH, and Docker backends (snapshots happen on the execution host)
---
## Implementation Priority Order
### Tier 1: Next Up
1. ~~MCP Support -- #6~~ ✅ Done (PR #301)
### Tier 2: Quality of Life
3. Local Browser Control via CDP -- #3
4. Plugin/Extension System -- #5
### Tier 3: Nice to Have
5. Session Branching / Checkpoints -- #7
6. Filesystem Checkpointing / Rollback -- #8
7. Signal Integration -- #4

View File

@@ -4,29 +4,18 @@ Provides a single resolution chain so every consumer (context compression,
session search, web extraction, vision analysis, browser vision) picks up
the best available backend without duplicating fallback logic.
Resolution order for text tasks (auto mode):
Resolution order for text tasks:
1. OpenRouter (OPENROUTER_API_KEY)
2. Nous Portal (~/.hermes/auth.json active provider)
3. Custom endpoint (OPENAI_BASE_URL + OPENAI_API_KEY)
4. Codex OAuth (Responses API via chatgpt.com with gpt-5.3-codex,
wrapped to look like a chat.completions client)
5. Direct API-key providers (z.ai/GLM, Kimi/Moonshot, MiniMax, MiniMax-CN)
— checked via PROVIDER_REGISTRY entries with auth_type='api_key'
6. None
5. None
Resolution order for vision/multimodal tasks (auto mode):
Resolution order for vision/multimodal tasks:
1. OpenRouter
2. Nous Portal
3. None (steps 3-5 are skipped — they may not support multimodal)
Per-task provider overrides (e.g. AUXILIARY_VISION_PROVIDER,
CONTEXT_COMPRESSION_PROVIDER) can force a specific provider for each task:
"openrouter", "nous", "codex", or "main" (= steps 3-5).
Default "auto" follows the chains above.
Per-task model overrides (e.g. AUXILIARY_VISION_MODEL,
AUXILIARY_WEB_EXTRACT_MODEL) let callers use a different model slug
than the provider's default.
3. None (custom endpoints can't substitute for Gemini multimodal)
"""
import json
@@ -42,14 +31,6 @@ from hermes_constants import OPENROUTER_BASE_URL
logger = logging.getLogger(__name__)
# Default auxiliary models for direct API-key providers (cheap/fast for side tasks)
_API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = {
"zai": "glm-4.5-flash",
"kimi-coding": "kimi-k2-turbo-preview",
"minimax": "MiniMax-M2.5-highspeed",
"minimax-cn": "MiniMax-M2.5-highspeed",
}
# OpenRouter app attribution headers
_OR_HEADERS = {
"HTTP-Referer": "https://github.com/NousResearch/hermes-agent",
@@ -82,55 +63,6 @@ _CODEX_AUX_BASE_URL = "https://chatgpt.com/backend-api/codex"
# read response.choices[0].message.content. This adapter translates those
# calls to the Codex Responses API so callers don't need any changes.
def _convert_content_for_responses(content: Any) -> Any:
"""Convert chat.completions content to Responses API format.
chat.completions uses:
{"type": "text", "text": "..."}
{"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}}
Responses API uses:
{"type": "input_text", "text": "..."}
{"type": "input_image", "image_url": "data:image/png;base64,..."}
If content is a plain string, it's returned as-is (the Responses API
accepts strings directly for text-only messages).
"""
if isinstance(content, str):
return content
if not isinstance(content, list):
return str(content) if content else ""
converted: List[Dict[str, Any]] = []
for part in content:
if not isinstance(part, dict):
continue
ptype = part.get("type", "")
if ptype == "text":
converted.append({"type": "input_text", "text": part.get("text", "")})
elif ptype == "image_url":
# chat.completions nests the URL: {"image_url": {"url": "..."}}
image_data = part.get("image_url", {})
url = image_data.get("url", "") if isinstance(image_data, dict) else str(image_data)
entry: Dict[str, Any] = {"type": "input_image", "image_url": url}
# Preserve detail if specified
detail = image_data.get("detail") if isinstance(image_data, dict) else None
if detail:
entry["detail"] = detail
converted.append(entry)
elif ptype in ("input_text", "input_image"):
# Already in Responses format — pass through
converted.append(part)
else:
# Unknown content type — try to preserve as text
text = part.get("text", "")
if text:
converted.append({"type": "input_text", "text": text})
return converted or ""
class _CodexCompletionsAdapter:
"""Drop-in shim that accepts chat.completions.create() kwargs and
routes them through the Codex Responses streaming API."""
@@ -144,31 +76,30 @@ class _CodexCompletionsAdapter:
model = kwargs.get("model", self._model)
temperature = kwargs.get("temperature")
# Separate system/instructions from conversation messages.
# Convert chat.completions multimodal content blocks to Responses
# API format (input_text / input_image instead of text / image_url).
# Separate system/instructions from conversation messages
instructions = "You are a helpful assistant."
input_msgs: List[Dict[str, Any]] = []
for msg in messages:
role = msg.get("role", "user")
content = msg.get("content") or ""
if role == "system":
instructions = content if isinstance(content, str) else str(content)
instructions = content
else:
input_msgs.append({
"role": role,
"content": _convert_content_for_responses(content),
})
input_msgs.append({"role": role, "content": content})
resp_kwargs: Dict[str, Any] = {
"model": model,
"instructions": instructions,
"input": input_msgs or [{"role": "user", "content": ""}],
"stream": True,
"store": False,
}
# Note: the Codex endpoint (chatgpt.com/backend-api/codex) does NOT
# support max_output_tokens or temperature — omit to avoid 400 errors.
max_tokens = kwargs.get("max_output_tokens") or kwargs.get("max_completion_tokens") or kwargs.get("max_tokens")
if max_tokens is not None:
resp_kwargs["max_output_tokens"] = int(max_tokens)
if temperature is not None:
resp_kwargs["temperature"] = temperature
# Tools support for flush_memories and similar callers
tools = kwargs.get("tools")
@@ -351,173 +282,53 @@ def _read_codex_access_token() -> Optional[str]:
return None
def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
"""Try each API-key provider in PROVIDER_REGISTRY order.
Returns (client, model) for the first provider whose env var is set,
or (None, None) if none are configured.
"""
try:
from hermes_cli.auth import PROVIDER_REGISTRY
except ImportError:
logger.debug("Could not import PROVIDER_REGISTRY for API-key fallback")
return None, None
for provider_id, pconfig in PROVIDER_REGISTRY.items():
if pconfig.auth_type != "api_key":
continue
# Check if any of the provider's env vars are set
api_key = ""
for env_var in pconfig.api_key_env_vars:
val = os.getenv(env_var, "").strip()
if val:
api_key = val
break
if not api_key:
continue
# Resolve base URL (with optional env-var override)
# Kimi Code keys (sk-kimi-) need api.kimi.com/coding/v1
env_url = ""
if pconfig.base_url_env_var:
env_url = os.getenv(pconfig.base_url_env_var, "").strip()
if env_url:
base_url = env_url.rstrip("/")
elif provider_id == "kimi-coding" and api_key.startswith("sk-kimi-"):
base_url = "https://api.kimi.com/coding/v1"
else:
base_url = pconfig.inference_base_url
model = _API_KEY_PROVIDER_AUX_MODELS.get(provider_id, "default")
logger.debug("Auxiliary text client: %s (%s)", pconfig.name, model)
extra = {}
if "api.kimi.com" in base_url.lower():
extra["default_headers"] = {"User-Agent": "KimiCLI/1.0"}
return OpenAI(api_key=api_key, base_url=base_url, **extra), model
return None, None
# ── Provider resolution helpers ─────────────────────────────────────────────
def _get_auxiliary_provider(task: str = "") -> str:
"""Read the provider override for a specific auxiliary task.
Checks AUXILIARY_{TASK}_PROVIDER first (e.g. AUXILIARY_VISION_PROVIDER),
then CONTEXT_{TASK}_PROVIDER (for the compression section's summary_provider),
then falls back to "auto". Returns one of: "auto", "openrouter", "nous", "main".
"""
if task:
for prefix in ("AUXILIARY_", "CONTEXT_"):
val = os.getenv(f"{prefix}{task.upper()}_PROVIDER", "").strip().lower()
if val and val != "auto":
return val
return "auto"
def _try_openrouter() -> Tuple[Optional[OpenAI], Optional[str]]:
or_key = os.getenv("OPENROUTER_API_KEY")
if not or_key:
return None, None
logger.debug("Auxiliary client: OpenRouter")
return OpenAI(api_key=or_key, base_url=OPENROUTER_BASE_URL,
default_headers=_OR_HEADERS), _OPENROUTER_MODEL
def _try_nous() -> Tuple[Optional[OpenAI], Optional[str]]:
nous = _read_nous_auth()
if not nous:
return None, None
global auxiliary_is_nous
auxiliary_is_nous = True
logger.debug("Auxiliary client: Nous Portal")
return (
OpenAI(api_key=_nous_api_key(nous), base_url=_nous_base_url()),
_NOUS_MODEL,
)
def _try_custom_endpoint() -> Tuple[Optional[OpenAI], Optional[str]]:
custom_base = os.getenv("OPENAI_BASE_URL")
custom_key = os.getenv("OPENAI_API_KEY")
if not custom_base or not custom_key:
return None, None
model = os.getenv("OPENAI_MODEL") or os.getenv("LLM_MODEL") or "gpt-4o-mini"
logger.debug("Auxiliary client: custom endpoint (%s)", model)
return OpenAI(api_key=custom_key, base_url=custom_base), model
def _try_codex() -> Tuple[Optional[Any], Optional[str]]:
codex_token = _read_codex_access_token()
if not codex_token:
return None, None
logger.debug("Auxiliary client: Codex OAuth (%s via Responses API)", _CODEX_AUX_MODEL)
real_client = OpenAI(api_key=codex_token, base_url=_CODEX_AUX_BASE_URL)
return CodexAuxiliaryClient(real_client, _CODEX_AUX_MODEL), _CODEX_AUX_MODEL
def _resolve_forced_provider(forced: str) -> Tuple[Optional[OpenAI], Optional[str]]:
"""Resolve a specific forced provider. Returns (None, None) if creds missing."""
if forced == "openrouter":
client, model = _try_openrouter()
if client is None:
logger.warning("auxiliary.provider=openrouter but OPENROUTER_API_KEY not set")
return client, model
if forced == "nous":
client, model = _try_nous()
if client is None:
logger.warning("auxiliary.provider=nous but Nous Portal not configured (run: hermes login)")
return client, model
if forced == "codex":
client, model = _try_codex()
if client is None:
logger.warning("auxiliary.provider=codex but no Codex OAuth token found (run: hermes model)")
return client, model
if forced == "main":
# "main" = skip OpenRouter/Nous, use the main chat model's credentials.
for try_fn in (_try_custom_endpoint, _try_codex, _resolve_api_key_provider):
client, model = try_fn()
if client is not None:
return client, model
logger.warning("auxiliary.provider=main but no main endpoint credentials found")
return None, None
# Unknown provider name — fall through to auto
logger.warning("Unknown auxiliary.provider=%r, falling back to auto", forced)
return None, None
def _resolve_auto() -> Tuple[Optional[OpenAI], Optional[str]]:
"""Full auto-detection chain: OpenRouter → Nous → custom → Codex → API-key → None."""
for try_fn in (_try_openrouter, _try_nous, _try_custom_endpoint,
_try_codex, _resolve_api_key_provider):
client, model = try_fn()
if client is not None:
return client, model
logger.debug("Auxiliary client: none available")
return None, None
# ── Public API ──────────────────────────────────────────────────────────────
def get_text_auxiliary_client(task: str = "") -> Tuple[Optional[OpenAI], Optional[str]]:
"""Return (client, default_model_slug) for text-only auxiliary tasks.
def get_text_auxiliary_client() -> Tuple[Optional[OpenAI], Optional[str]]:
"""Return (client, model_slug) for text-only auxiliary tasks.
Args:
task: Optional task name ("compression", "web_extract") to check
for a task-specific provider override.
Callers may override the returned model with a per-task env var
(e.g. CONTEXT_COMPRESSION_MODEL, AUXILIARY_WEB_EXTRACT_MODEL).
Falls through OpenRouter -> Nous Portal -> custom endpoint -> Codex OAuth -> (None, None).
"""
forced = _get_auxiliary_provider(task)
if forced != "auto":
return _resolve_forced_provider(forced)
return _resolve_auto()
# 1. OpenRouter
or_key = os.getenv("OPENROUTER_API_KEY")
if or_key:
logger.debug("Auxiliary text client: OpenRouter")
return OpenAI(api_key=or_key, base_url=OPENROUTER_BASE_URL,
default_headers=_OR_HEADERS), _OPENROUTER_MODEL
# 2. Nous Portal
nous = _read_nous_auth()
if nous:
global auxiliary_is_nous
auxiliary_is_nous = True
logger.debug("Auxiliary text client: Nous Portal")
return (
OpenAI(api_key=_nous_api_key(nous), base_url=_nous_base_url()),
_NOUS_MODEL,
)
# 3. Custom endpoint (both base URL and key must be set)
custom_base = os.getenv("OPENAI_BASE_URL")
custom_key = os.getenv("OPENAI_API_KEY")
if custom_base and custom_key:
model = os.getenv("OPENAI_MODEL") or os.getenv("LLM_MODEL") or "gpt-4o-mini"
logger.debug("Auxiliary text client: custom endpoint (%s)", model)
return OpenAI(api_key=custom_key, base_url=custom_base), model
# 4. Codex OAuth -- uses the Responses API (only endpoint the token
# can access), wrapped to look like a chat.completions client.
codex_token = _read_codex_access_token()
if codex_token:
logger.debug("Auxiliary text client: Codex OAuth (%s via Responses API)", _CODEX_AUX_MODEL)
real_client = OpenAI(api_key=codex_token, base_url=_CODEX_AUX_BASE_URL)
return CodexAuxiliaryClient(real_client, _CODEX_AUX_MODEL), _CODEX_AUX_MODEL
# 5. Nothing available
logger.debug("Auxiliary text client: none available")
return None, None
def get_async_text_auxiliary_client(task: str = ""):
def get_async_text_auxiliary_client():
"""Return (async_client, model_slug) for async consumers.
For standard providers returns (AsyncOpenAI, model). For Codex returns
@@ -526,7 +337,7 @@ def get_async_text_auxiliary_client(task: str = ""):
"""
from openai import AsyncOpenAI
sync_client, model = get_text_auxiliary_client(task)
sync_client, model = get_text_auxiliary_client()
if sync_client is None:
return None, None
@@ -539,36 +350,32 @@ def get_async_text_auxiliary_client(task: str = ""):
}
if "openrouter" in str(sync_client.base_url).lower():
async_kwargs["default_headers"] = dict(_OR_HEADERS)
elif "api.kimi.com" in str(sync_client.base_url).lower():
async_kwargs["default_headers"] = {"User-Agent": "KimiCLI/1.0"}
return AsyncOpenAI(**async_kwargs), model
def get_vision_auxiliary_client() -> Tuple[Optional[OpenAI], Optional[str]]:
"""Return (client, default_model_slug) for vision/multimodal auxiliary tasks.
"""Return (client, model_slug) for vision/multimodal auxiliary tasks.
Checks AUXILIARY_VISION_PROVIDER for a forced provider, otherwise
auto-detects. Callers may override the returned model with
AUXILIARY_VISION_MODEL.
In auto mode, only providers known to support multimodal are tried:
OpenRouter, Nous Portal, and Codex OAuth (gpt-5.3-codex supports
vision via the Responses API). Custom endpoints and API-key
providers are skipped — they may not handle vision input. To use
them, set AUXILIARY_VISION_PROVIDER explicitly.
Only OpenRouter and Nous Portal qualify — custom endpoints cannot
substitute for Gemini multimodal.
"""
forced = _get_auxiliary_provider("vision")
if forced != "auto":
return _resolve_forced_provider(forced)
# Auto: try providers known to support multimodal first, then fall
# back to the user's custom endpoint. Many local models (Qwen-VL,
# LLaVA, Pixtral, etc.) support vision — skipping them entirely
# caused silent failures for local-only users.
for try_fn in (_try_openrouter, _try_nous, _try_codex,
_try_custom_endpoint):
client, model = try_fn()
if client is not None:
return client, model
# 1. OpenRouter
or_key = os.getenv("OPENROUTER_API_KEY")
if or_key:
logger.debug("Auxiliary vision client: OpenRouter")
return OpenAI(api_key=or_key, base_url=OPENROUTER_BASE_URL,
default_headers=_OR_HEADERS), _OPENROUTER_MODEL
# 2. Nous Portal
nous = _read_nous_auth()
if nous:
logger.debug("Auxiliary vision client: Nous Portal")
return (
OpenAI(api_key=_nous_api_key(nous), base_url=_nous_base_url()),
_NOUS_MODEL,
)
# 3. Nothing suitable
logger.debug("Auxiliary vision client: none available")
return None, None

View File

@@ -7,7 +7,7 @@ protecting head and tail context.
import logging
import os
from typing import Any, Dict, List, Optional
from typing import Any, Dict, List
from agent.auxiliary_client import get_text_auxiliary_client
from agent.model_metadata import (
@@ -53,7 +53,7 @@ class ContextCompressor:
self.last_completion_tokens = 0
self.last_total_tokens = 0
self.client, default_model = get_text_auxiliary_client("compression")
self.client, default_model = get_text_auxiliary_client()
self.summary_model = summary_model_override or default_model
def update_from_response(self, usage: Dict[str, Any]):
@@ -82,14 +82,11 @@ class ContextCompressor:
"compression_count": self.compression_count,
}
def _generate_summary(self, turns_to_summarize: List[Dict[str, Any]]) -> Optional[str]:
"""Generate a concise summary of conversation turns.
def _generate_summary(self, turns_to_summarize: List[Dict[str, Any]]) -> str:
"""Generate a concise summary of conversation turns using a fast model."""
if not self.client:
return "[CONTEXT SUMMARY]: Previous conversation turns have been compressed to save space. The assistant performed various actions and received responses."
Tries the auxiliary model first, then falls back to the user's main
model. Returns None if all attempts fail — the caller should drop
the middle turns without a summary rather than inject a useless
placeholder.
"""
parts = []
for msg in turns_to_summarize:
role = msg.get("role", "unknown")
@@ -120,28 +117,28 @@ TURNS TO SUMMARIZE:
Write only the summary, starting with "[CONTEXT SUMMARY]:" prefix."""
# 1. Try the auxiliary model (cheap/fast)
if self.client:
try:
return self._call_summary_model(self.client, self.summary_model, prompt)
except Exception as e:
logging.warning(f"Failed to generate context summary with auxiliary model: {e}")
try:
return self._call_summary_model(self.client, self.summary_model, prompt)
except Exception as e:
logging.warning(f"Failed to generate context summary with auxiliary model: {e}")
# 2. Fallback: try the user's main model endpoint
fallback_client, fallback_model = self._get_fallback_client()
if fallback_client is not None:
try:
logger.info("Retrying context summary with main model (%s)", fallback_model)
summary = self._call_summary_model(fallback_client, fallback_model, prompt)
self.client = fallback_client
self.summary_model = fallback_model
return summary
except Exception as fallback_err:
logging.warning(f"Main model summary also failed: {fallback_err}")
# Fallback: try the main model's endpoint. This handles the common
# case where the user switched providers (e.g. OpenRouter → local LLM)
# but a stale API key causes the auxiliary client to pick the old
# provider which then fails (402, auth error, etc.).
fallback_client, fallback_model = self._get_fallback_client()
if fallback_client is not None:
try:
logger.info("Retrying context summary with fallback client (%s)", fallback_model)
summary = self._call_summary_model(fallback_client, fallback_model, prompt)
# Success — swap in the working client for future compressions
self.client = fallback_client
self.summary_model = fallback_model
return summary
except Exception as fallback_err:
logging.warning(f"Fallback summary model also failed: {fallback_err}")
# 3. All models failed — return None so the caller drops turns without a summary
logging.warning("Context compression: no model available for summary. Middle turns will be dropped without summary.")
return None
return "[CONTEXT SUMMARY]: Previous conversation turns have been compressed. The assistant performed tool calls and received responses."
def _call_summary_model(self, client, model: str, prompt: str) -> str:
"""Make the actual LLM call to generate a summary. Raises on failure."""
@@ -199,111 +196,10 @@ Write only the summary, starting with "[CONTEXT SUMMARY]:" prefix."""
logger.debug("Could not build fallback auxiliary client: %s", exc)
return None, None
# ------------------------------------------------------------------
# Tool-call / tool-result pair integrity helpers
# ------------------------------------------------------------------
@staticmethod
def _get_tool_call_id(tc) -> str:
"""Extract the call ID from a tool_call entry (dict or SimpleNamespace)."""
if isinstance(tc, dict):
return tc.get("id", "")
return getattr(tc, "id", "") or ""
def _sanitize_tool_pairs(self, messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
"""Fix orphaned tool_call / tool_result pairs after compression.
Two failure modes:
1. A tool *result* references a call_id whose assistant tool_call was
removed (summarized/truncated). The API rejects this with
"No tool call found for function call output with call_id ...".
2. An assistant message has tool_calls whose results were dropped.
The API rejects this because every tool_call must be followed by
a tool result with the matching call_id.
This method removes orphaned results and inserts stub results for
orphaned calls so the message list is always well-formed.
"""
surviving_call_ids: set = set()
for msg in messages:
if msg.get("role") == "assistant":
for tc in msg.get("tool_calls") or []:
cid = self._get_tool_call_id(tc)
if cid:
surviving_call_ids.add(cid)
result_call_ids: set = set()
for msg in messages:
if msg.get("role") == "tool":
cid = msg.get("tool_call_id")
if cid:
result_call_ids.add(cid)
# 1. Remove tool results whose call_id has no matching assistant tool_call
orphaned_results = result_call_ids - surviving_call_ids
if orphaned_results:
messages = [
m for m in messages
if not (m.get("role") == "tool" and m.get("tool_call_id") in orphaned_results)
]
if not self.quiet_mode:
logger.info("Compression sanitizer: removed %d orphaned tool result(s)", len(orphaned_results))
# 2. Add stub results for assistant tool_calls whose results were dropped
missing_results = surviving_call_ids - result_call_ids
if missing_results:
patched: List[Dict[str, Any]] = []
for msg in messages:
patched.append(msg)
if msg.get("role") == "assistant":
for tc in msg.get("tool_calls") or []:
cid = self._get_tool_call_id(tc)
if cid in missing_results:
patched.append({
"role": "tool",
"content": "[Result from earlier conversation — see context summary above]",
"tool_call_id": cid,
})
messages = patched
if not self.quiet_mode:
logger.info("Compression sanitizer: added %d stub tool result(s)", len(missing_results))
return messages
def _align_boundary_forward(self, messages: List[Dict[str, Any]], idx: int) -> int:
"""Push a compress-start boundary forward past any orphan tool results.
If ``messages[idx]`` is a tool result, slide forward until we hit a
non-tool message so we don't start the summarised region mid-group.
"""
while idx < len(messages) and messages[idx].get("role") == "tool":
idx += 1
return idx
def _align_boundary_backward(self, messages: List[Dict[str, Any]], idx: int) -> int:
"""Pull a compress-end boundary backward to avoid splitting a
tool_call / result group.
If the message just before ``idx`` is an assistant message with
tool_calls, those tool results will start at ``idx`` and would be
separated from their parent. Move backwards to include the whole
group in the summarised region.
"""
if idx <= 0 or idx >= len(messages):
return idx
prev = messages[idx - 1]
if prev.get("role") == "assistant" and prev.get("tool_calls"):
# The results for this assistant turn sit at idx..idx+k.
# Include the assistant message in the summarised region too.
idx -= 1
return idx
def compress(self, messages: List[Dict[str, Any]], current_tokens: int = None) -> List[Dict[str, Any]]:
"""Compress conversation messages by summarizing middle turns.
Keeps first N + last N turns, summarizes everything in between.
After compression, orphaned tool_call / tool_result pairs are cleaned
up so the API never receives mismatched IDs.
"""
n_messages = len(messages)
if n_messages <= self.protect_first_n + self.protect_last_n + 1:
@@ -316,12 +212,6 @@ Write only the summary, starting with "[CONTEXT SUMMARY]:" prefix."""
if compress_start >= compress_end:
return messages
# Adjust boundaries to avoid splitting tool_call/result groups.
compress_start = self._align_boundary_forward(messages, compress_start)
compress_end = self._align_boundary_backward(messages, compress_end)
if compress_start >= compress_end:
return messages
turns_to_summarize = messages[compress_start:compress_end]
display_tokens = current_tokens if current_tokens else self.last_prompt_tokens or estimate_messages_tokens_rough(messages)
@@ -329,6 +219,24 @@ Write only the summary, starting with "[CONTEXT SUMMARY]:" prefix."""
print(f"\n📦 Context compression triggered ({display_tokens:,} tokens ≥ {self.threshold_tokens:,} threshold)")
print(f" 📊 Model context limit: {self.context_length:,} tokens ({self.threshold_percent*100:.0f}% = {self.threshold_tokens:,})")
# Truncation fallback when no auxiliary model is available
if self.client is None:
print("⚠️ Context compression: no auxiliary model available. Falling back to message truncation.")
# Keep system message(s) at the front and the protected tail;
# simply drop the oldest non-system messages until under threshold.
kept = []
for msg in messages:
if msg.get("role") == "system":
kept.append(msg.copy())
else:
break
tail = messages[-self.protect_last_n:]
kept.extend(m.copy() for m in tail)
self.compression_count += 1
if not self.quiet_mode:
print(f" ✂️ Truncated: {len(messages)}{len(kept)} messages (dropped middle turns)")
return kept
if not self.quiet_mode:
print(f" 🗜️ Summarizing turns {compress_start+1}-{compress_end} ({len(turns_to_summarize)} turns)")
@@ -341,21 +249,13 @@ Write only the summary, starting with "[CONTEXT SUMMARY]:" prefix."""
msg["content"] = (msg.get("content") or "") + "\n\n[Note: Some earlier conversation turns may be summarized to preserve context space.]"
compressed.append(msg)
if summary:
last_head_role = messages[compress_start - 1].get("role", "user") if compress_start > 0 else "user"
summary_role = "user" if last_head_role in ("assistant", "tool") else "assistant"
compressed.append({"role": summary_role, "content": summary})
else:
if not self.quiet_mode:
print(" ⚠️ No summary model available — middle turns dropped without summary")
compressed.append({"role": "user", "content": summary})
for i in range(compress_end, n_messages):
compressed.append(messages[i].copy())
self.compression_count += 1
compressed = self._sanitize_tool_pairs(compressed)
if not self.quiet_mode:
new_estimate = estimate_messages_tokens_rough(compressed)
saved_estimate = display_tokens - new_estimate

View File

@@ -5,8 +5,8 @@ Used by AIAgent._execute_tool_calls for CLI feedback.
"""
import json
import logging
import os
import random
import sys
import threading
import time
@@ -15,49 +15,6 @@ import time
_RED = "\033[31m"
_RESET = "\033[0m"
logger = logging.getLogger(__name__)
# =========================================================================
# Skin-aware helpers (lazy import to avoid circular deps)
# =========================================================================
def _get_skin():
"""Get the active skin config, or None if not available."""
try:
from hermes_cli.skin_engine import get_active_skin
return get_active_skin()
except Exception:
return None
def get_skin_faces(key: str, default: list) -> list:
"""Get spinner face list from active skin, falling back to default."""
skin = _get_skin()
if skin:
faces = skin.get_spinner_list(key)
if faces:
return faces
return default
def get_skin_verbs() -> list:
"""Get thinking verbs from active skin."""
skin = _get_skin()
if skin:
verbs = skin.get_spinner_list("thinking_verbs")
if verbs:
return verbs
return KawaiiSpinner.THINKING_VERBS
def get_skin_tool_prefix() -> str:
"""Get tool output prefix character from active skin."""
skin = _get_skin()
if skin:
return skin.tool_prefix
return ""
# =========================================================================
# Tool preview (one-line summary of a tool call's primary argument)
@@ -65,8 +22,6 @@ def get_skin_tool_prefix() -> str:
def build_tool_preview(tool_name: str, args: dict, max_len: int = 40) -> str:
"""Build a short preview of a tool call's primary argument for display."""
if not args:
return None
primary_args = {
"terminal": "command", "web_search": "query", "web_extract": "urls",
"read_file": "path", "write_file": "path", "patch": "path",
@@ -208,7 +163,6 @@ class KawaiiSpinner:
self.frame_idx = 0
self.start_time = None
self.last_line_len = 0
self._last_flush_time = 0.0 # Rate-limit flushes for patch_stdout compat
# Capture stdout NOW, before any redirect_stdout(devnull) from
# child agents can replace sys.stdout with a black hole.
self._out = sys.stdout
@@ -223,34 +177,15 @@ class KawaiiSpinner:
pass
def _animate(self):
# Cache skin wings at start (avoid per-frame imports)
skin = _get_skin()
wings = skin.get_spinner_wings() if skin else []
while self.running:
if os.getenv("HERMES_SPINNER_PAUSE"):
time.sleep(0.1)
continue
frame = self.spinner_frames[self.frame_idx % len(self.spinner_frames)]
elapsed = time.time() - self.start_time
if wings:
left, right = wings[self.frame_idx % len(wings)]
line = f" {left} {frame} {self.message} {right} ({elapsed:.1f}s)"
else:
line = f" {frame} {self.message} ({elapsed:.1f}s)"
line = f" {frame} {self.message} ({elapsed:.1f}s)"
pad = max(self.last_line_len - len(line), 0)
# Rate-limit flush() calls to avoid spinner spam under
# prompt_toolkit's patch_stdout. Each flush() pushes a queue
# item that may trigger a separate run_in_terminal() call; if
# items are processed one-at-a-time the \r overwrite is lost
# and every frame appears on its own line. By flushing at
# most every 0.4s we guarantee multiple \r-frames are batched
# into a single write, so the terminal collapses them correctly.
now = time.time()
should_flush = (now - self._last_flush_time) >= 0.4
self._write(f"\r{line}{' ' * pad}", end='', flush=should_flush)
if should_flush:
self._last_flush_time = now
self._write(f"\r{line}{' ' * pad}", end='', flush=True)
self.last_line_len = len(line)
self.frame_idx += 1
time.sleep(0.12)
@@ -365,7 +300,7 @@ def _detect_tool_failure(tool_name: str, result: str | None) -> tuple[bool, str]
if exit_code is not None and exit_code != 0:
return True, f" [exit {exit_code}]"
except (json.JSONDecodeError, TypeError, AttributeError):
logger.debug("Could not parse terminal result as JSON for exit code check")
pass
return False, ""
# Memory-specific: distinguish "full" from real errors
@@ -375,7 +310,7 @@ def _detect_tool_failure(tool_name: str, result: str | None) -> tuple[bool, str]
if data.get("success") is False and "exceed the limit" in data.get("error", ""):
return True, " [full]"
except (json.JSONDecodeError, TypeError, AttributeError):
logger.debug("Could not parse memory result as JSON for capacity check")
pass
# Generic heuristic for non-terminal tools
lower = result[:500].lower()
@@ -397,7 +332,6 @@ def get_cute_tool_message(
"""
dur = f"{duration:.1f}s"
is_failure, failure_suffix = _detect_tool_failure(tool_name, result)
skin_prefix = get_skin_tool_prefix()
def _trunc(s, n=40):
s = str(s)
@@ -408,9 +342,7 @@ def get_cute_tool_message(
return ("..." + p[-(n-3):]) if len(p) > n else p
def _wrap(line: str) -> str:
"""Apply skin tool prefix and failure suffix."""
if skin_prefix != "":
line = line.replace("", skin_prefix, 1)
"""Append failure suffix when the tool failed."""
if not is_failure:
return line
return f"{line}{failure_suffix}"

View File

@@ -55,20 +55,6 @@ MODEL_PRICING = {
# Meta (via providers)
"llama-4-maverick": {"input": 0.50, "output": 0.70},
"llama-4-scout": {"input": 0.20, "output": 0.30},
# Z.AI / GLM (direct provider — pricing not published externally, treat as local)
"glm-5": {"input": 0.0, "output": 0.0},
"glm-4.7": {"input": 0.0, "output": 0.0},
"glm-4.5": {"input": 0.0, "output": 0.0},
"glm-4.5-flash": {"input": 0.0, "output": 0.0},
# Kimi / Moonshot (direct provider — pricing not published externally, treat as local)
"kimi-k2.5": {"input": 0.0, "output": 0.0},
"kimi-k2-thinking": {"input": 0.0, "output": 0.0},
"kimi-k2-turbo-preview": {"input": 0.0, "output": 0.0},
"kimi-k2-0905-preview": {"input": 0.0, "output": 0.0},
# MiniMax (direct provider — pricing not published externally, treat as local)
"MiniMax-M2.5": {"input": 0.0, "output": 0.0},
"MiniMax-M2.5-highspeed": {"input": 0.0, "output": 0.0},
"MiniMax-M2.1": {"input": 0.0, "output": 0.0},
}
# Fallback: unknown/custom models get zero cost (we can't assume pricing

View File

@@ -49,17 +49,6 @@ DEFAULT_CONTEXT_LENGTHS = {
"meta-llama/llama-3.3-70b-instruct": 131072,
"deepseek/deepseek-chat-v3": 65536,
"qwen/qwen-2.5-72b-instruct": 32768,
"glm-4.7": 202752,
"glm-5": 202752,
"glm-4.5": 131072,
"glm-4.5-flash": 131072,
"kimi-k2.5": 262144,
"kimi-k2-thinking": 262144,
"kimi-k2-turbo-preview": 262144,
"kimi-k2-0905-preview": 131072,
"MiniMax-M2.5": 204800,
"MiniMax-M2.5-highspeed": 204800,
"MiniMax-M2.1": 204800,
}

View File

@@ -66,8 +66,7 @@ DEFAULT_AGENT_IDENTITY = (
"range of tasks including answering questions, writing and editing code, "
"analyzing information, creative work, and executing actions via your tools. "
"You communicate clearly, admit uncertainty when appropriate, and prioritize "
"being genuinely useful over being verbose unless otherwise directed below. "
"Be targeted and efficient in your exploration and investigations."
"being genuinely useful over being verbose unless otherwise directed below."
)
MEMORY_GUIDANCE = (
@@ -103,41 +102,12 @@ PLATFORM_HINTS = {
"You are on a text messaging communication platform, Telegram. "
"Please do not use markdown as it does not render. "
"You can send media files natively: to deliver a file to the user, "
"include MEDIA:/absolute/path/to/file in your response. Images "
"(.png, .jpg, .webp) appear as photos, audio (.ogg) sends as voice "
"bubbles, and videos (.mp4) play inline. You can also include image "
"URLs in markdown format ![alt](url) and they will be sent as native photos."
"include MEDIA:/absolute/path/to/file in your response. Audio "
"(.ogg) sends as voice bubbles. You can also include image URLs "
"in markdown format ![alt](url) and they will be sent as native photos."
),
"discord": (
"You are in a Discord server or group chat communicating with your user. "
"You can send media files natively: include MEDIA:/absolute/path/to/file "
"in your response. Images (.png, .jpg, .webp) are sent as photo "
"attachments, audio as file attachments. You can also include image URLs "
"in markdown format ![alt](url) and they will be sent as attachments."
),
"slack": (
"You are in a Slack workspace communicating with your user. "
"You can send media files natively: include MEDIA:/absolute/path/to/file "
"in your response. Images (.png, .jpg, .webp) are uploaded as photo "
"attachments, audio as file attachments. You can also include image URLs "
"in markdown format ![alt](url) and they will be uploaded as attachments."
),
"signal": (
"You are on a text messaging communication platform, Signal. "
"Please do not use markdown as it does not render. "
"You can send media files natively: to deliver a file to the user, "
"include MEDIA:/absolute/path/to/file in your response. Images "
"(.png, .jpg, .webp) appear as photos, audio as attachments, and other "
"files arrive as downloadable documents. You can also include image "
"URLs in markdown format ![alt](url) and they will be sent as photos."
),
"email": (
"You are communicating via email. Write clear, well-structured responses "
"suitable for email. Use plain text formatting (no markdown). "
"Keep responses concise but complete. You can send file attachments — "
"include MEDIA:/absolute/path/to/file in your response. The subject line "
"is preserved for threading. Do not include greetings or sign-offs unless "
"contextually appropriate."
"You are in a Discord server or group chat communicating with your user."
),
"cli": (
"You are a CLI AI Agent. Try not to use markdown but simple text "
@@ -167,24 +137,9 @@ def _read_skill_description(skill_file: Path, max_chars: int = 60) -> str:
if len(desc) > max_chars:
desc = desc[:max_chars - 3] + "..."
return desc
except Exception as e:
logger.debug("Failed to read skill description from %s: %s", skill_file, e)
return ""
def _skill_is_platform_compatible(skill_file: Path) -> bool:
"""Quick check if a SKILL.md is compatible with the current OS platform.
Reads just enough to parse the ``platforms`` frontmatter field.
Skills without the field (the vast majority) are always compatible.
"""
try:
from tools.skills_tool import _parse_frontmatter, skill_matches_platform
raw = skill_file.read_text(encoding="utf-8")[:2000]
frontmatter, _ = _parse_frontmatter(raw)
return skill_matches_platform(frontmatter)
except Exception:
return True # Err on the side of showing the skill
pass
return ""
def build_skills_system_prompt() -> str:
@@ -193,7 +148,6 @@ def build_skills_system_prompt() -> str:
Scans ~/.hermes/skills/ for SKILL.md files grouped by category.
Includes per-skill descriptions from frontmatter so the model can
match skills by meaning, not just name.
Filters out skills incompatible with the current OS platform.
"""
hermes_home = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
skills_dir = hermes_home / "skills"
@@ -203,23 +157,13 @@ def build_skills_system_prompt() -> str:
# Collect skills with descriptions, grouped by category
# Each entry: (skill_name, description)
# Supports sub-categories: skills/mlops/training/axolotl/SKILL.md
# → category "mlops/training", skill "axolotl"
skills_by_category: dict[str, list[tuple[str, str]]] = {}
for skill_file in skills_dir.rglob("SKILL.md"):
# Skip skills incompatible with the current OS platform
if not _skill_is_platform_compatible(skill_file):
continue
rel_path = skill_file.relative_to(skills_dir)
parts = rel_path.parts
if len(parts) >= 2:
# Category is everything between skills_dir and the skill folder
# e.g. parts = ("mlops", "training", "axolotl", "SKILL.md")
# → category = "mlops/training", skill_name = "axolotl"
# e.g. parts = ("github", "github-auth", "SKILL.md")
# → category = "github", skill_name = "github-auth"
category = parts[0]
skill_name = parts[-2]
category = "/".join(parts[:-2]) if len(parts) > 2 else parts[0]
else:
category = "general"
skill_name = skill_file.parent.name
@@ -230,11 +174,9 @@ def build_skills_system_prompt() -> str:
return ""
# Read category-level descriptions from DESCRIPTION.md
# Checks both the exact category path and parent directories
category_descriptions = {}
for category in skills_by_category:
cat_path = Path(category)
desc_file = skills_dir / cat_path / "DESCRIPTION.md"
desc_file = skills_dir / category / "DESCRIPTION.md"
if desc_file.exists():
try:
content = desc_file.read_text(encoding="utf-8")

View File

@@ -8,14 +8,14 @@ the first 6 and last 4 characters for debuggability.
"""
import logging
import os
import re
from typing import Optional
logger = logging.getLogger(__name__)
# Known API key prefixes -- match the prefix + contiguous token chars
_PREFIX_PATTERNS = [
r"sk-[A-Za-z0-9_-]{10,}", # OpenAI / OpenRouter / Anthropic (sk-ant-*)
r"sk-[A-Za-z0-9_-]{10,}", # OpenAI / OpenRouter
r"ghp_[A-Za-z0-9]{10,}", # GitHub PAT (classic)
r"github_pat_[A-Za-z0-9_]{10,}", # GitHub PAT (fine-grained)
r"xox[baprs]-[A-Za-z0-9-]{10,}", # Slack tokens
@@ -25,18 +25,6 @@ _PREFIX_PATTERNS = [
r"fc-[A-Za-z0-9]{10,}", # Firecrawl
r"bb_live_[A-Za-z0-9_-]{10,}", # BrowserBase
r"gAAAA[A-Za-z0-9_=-]{20,}", # Codex encrypted tokens
r"AKIA[A-Z0-9]{16}", # AWS Access Key ID
r"sk_live_[A-Za-z0-9]{10,}", # Stripe secret key (live)
r"sk_test_[A-Za-z0-9]{10,}", # Stripe secret key (test)
r"rk_live_[A-Za-z0-9]{10,}", # Stripe restricted key
r"SG\.[A-Za-z0-9_-]{10,}", # SendGrid API key
r"hf_[A-Za-z0-9]{10,}", # HuggingFace token
r"r8_[A-Za-z0-9]{10,}", # Replicate API token
r"npm_[A-Za-z0-9]{10,}", # npm access token
r"pypi-[A-Za-z0-9_-]{10,}", # PyPI API token
r"dop_v1_[A-Za-z0-9]{10,}", # DigitalOcean PAT
r"doo_v1_[A-Za-z0-9]{10,}", # DigitalOcean OAuth
r"am_[A-Za-z0-9_-]{10,}", # AgentMail API key
]
# ENV assignment patterns: KEY=value where KEY contains a secret-like name
@@ -59,28 +47,11 @@ _AUTH_HEADER_RE = re.compile(
re.IGNORECASE,
)
# Telegram bot tokens: bot<digits>:<token> or <digits>:<token>,
# where token part is restricted to [-A-Za-z0-9_] and length >= 30
# Telegram bot tokens: bot<digits>:<token> or <digits>:<alphanum>
_TELEGRAM_RE = re.compile(
r"(bot)?(\d{8,}):([-A-Za-z0-9_]{30,})",
)
# Private key blocks: -----BEGIN RSA PRIVATE KEY----- ... -----END RSA PRIVATE KEY-----
_PRIVATE_KEY_RE = re.compile(
r"-----BEGIN[A-Z ]*PRIVATE KEY-----[\s\S]*?-----END[A-Z ]*PRIVATE KEY-----"
)
# Database connection strings: protocol://user:PASSWORD@host
# Catches postgres, mysql, mongodb, redis, amqp URLs and redacts the password
_DB_CONNSTR_RE = re.compile(
r"((?:postgres(?:ql)?|mysql|mongodb(?:\+srv)?|redis|amqp)://[^:]+:)([^@]+)(@)",
re.IGNORECASE,
)
# E.164 phone numbers: +<country><number>, 7-15 digits
# Negative lookahead prevents matching hex strings or identifiers
_SIGNAL_PHONE_RE = re.compile(r"(\+[1-9]\d{6,14})(?![A-Za-z0-9])")
# Compile known prefix patterns into one alternation
_PREFIX_RE = re.compile(
r"(?<![A-Za-z0-9_-])(" + "|".join(_PREFIX_PATTERNS) + r")(?![A-Za-z0-9_-])"
@@ -98,12 +69,9 @@ def redact_sensitive_text(text: str) -> str:
"""Apply all redaction patterns to a block of text.
Safe to call on any string -- non-matching text passes through unchanged.
Disabled when security.redact_secrets is false in config.yaml.
"""
if not text:
return text
if os.getenv("HERMES_REDACT_SECRETS", "").lower() in ("0", "false", "no", "off"):
return text
# Known prefixes (sk-, ghp_, etc.)
text = _PREFIX_RE.sub(lambda m: _mask_token(m.group(1)), text)
@@ -133,20 +101,6 @@ def redact_sensitive_text(text: str) -> str:
return f"{prefix}{digits}:***"
text = _TELEGRAM_RE.sub(_redact_telegram, text)
# Private key blocks
text = _PRIVATE_KEY_RE.sub("[REDACTED PRIVATE KEY]", text)
# Database connection string passwords
text = _DB_CONNSTR_RE.sub(lambda m: f"{m.group(1)}***{m.group(3)}", text)
# E.164 phone numbers (Signal, WhatsApp)
def _redact_phone(m):
phone = m.group(1)
if len(phone) <= 8:
return phone[:2] + "****" + phone[-2:]
return phone[:4] + "****" + phone[-4:]
text = _SIGNAL_PHONE_RE.sub(_redact_phone, text)
return text

View File

@@ -22,7 +22,7 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
global _skill_commands
_skill_commands = {}
try:
from tools.skills_tool import SKILLS_DIR, _parse_frontmatter, skill_matches_platform
from tools.skills_tool import SKILLS_DIR, _parse_frontmatter
if not SKILLS_DIR.exists():
return _skill_commands
for skill_md in SKILLS_DIR.rglob("SKILL.md"):
@@ -31,9 +31,6 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
try:
content = skill_md.read_text(encoding='utf-8')
frontmatter, body = _parse_frontmatter(content)
# Skip skills incompatible with the current OS platform
if not skill_matches_platform(frontmatter):
continue
name = frontmatter.get('name', skill_md.parent.name)
description = frontmatter.get('description', '')
if not description:

View File

@@ -606,7 +606,7 @@ class BatchRunner:
# Create batches
self.batches = self._create_batches()
print("📊 Batch Runner Initialized")
print(f"📊 Batch Runner Initialized")
print(f" Dataset: {self.dataset_file} ({len(self.dataset)} prompts)")
print(f" Batch size: {self.batch_size}")
print(f" Total batches: {len(self.batches)}")
@@ -826,7 +826,7 @@ class BatchRunner:
print("=" * 70)
print(f" Original dataset size: {len(self.dataset):,} prompts")
print(f" Already completed: {len(skipped_indices):,} prompts")
print(" ─────────────────────────────────────────")
print(f" ─────────────────────────────────────────")
print(f" 🎯 RESUMING WITH: {len(filtered_entries):,} prompts")
print(f" New batches created: {len(batches_to_process)}")
print("=" * 70 + "\n")
@@ -888,7 +888,7 @@ class BatchRunner:
]
print(f"✅ Created {len(tasks)} batch tasks")
print("🚀 Starting parallel batch processing...\n")
print(f"🚀 Starting parallel batch processing...\n")
# Use rich Progress for better visual tracking with persistent bottom bar
# redirect_stdout/stderr lets rich manage all output so progress bar stays clean
@@ -1057,7 +1057,7 @@ class BatchRunner:
print(f"✅ Total trajectories in merged file: {total_entries - filtered_entries}")
print(f"✅ Total batch files merged: {batch_files_found}")
print(f"⏱️ Total duration: {round(time.time() - start_time, 2)}s")
print("\n📈 Tool Usage Statistics:")
print(f"\n📈 Tool Usage Statistics:")
print("-" * 70)
if total_tool_stats:
@@ -1084,7 +1084,7 @@ class BatchRunner:
# Print reasoning coverage stats
total_discarded = sum(r.get("discarded_no_reasoning", 0) for r in results)
print("\n🧠 Reasoning Coverage:")
print(f"\n🧠 Reasoning Coverage:")
print("-" * 70)
total_turns = total_reasoning_stats["total_assistant_turns"]
with_reasoning = total_reasoning_stats["turns_with_reasoning"]
@@ -1101,8 +1101,8 @@ class BatchRunner:
print(f" 🚫 Samples discarded (zero reasoning): {total_discarded:,}")
print(f"\n💾 Results saved to: {self.output_dir}")
print(" - Trajectories: trajectories.jsonl (combined)")
print(" - Individual batches: batch_*.jsonl (for debugging)")
print(f" - Trajectories: trajectories.jsonl (combined)")
print(f" - Individual batches: batch_*.jsonl (for debugging)")
print(f" - Statistics: {self.stats_file.name}")
print(f" - Checkpoint: {self.checkpoint_file.name}")
@@ -1112,7 +1112,7 @@ def main(
batch_size: int = None,
run_name: str = None,
distribution: str = "default",
model: str = "anthropic/claude-sonnet-4.6",
model: str = "anthropic/claude-sonnet-4-20250514",
api_key: str = None,
base_url: str = "https://openrouter.ai/api/v1",
max_turns: int = 10,
@@ -1155,7 +1155,7 @@ def main(
providers_order (str): Comma-separated list of OpenRouter providers to try in order (e.g. "anthropic,openai,google")
provider_sort (str): Sort providers by "price", "throughput", or "latency" (OpenRouter only)
max_tokens (int): Maximum tokens for model responses (optional, uses model default if not set)
reasoning_effort (str): OpenRouter reasoning effort level: "xhigh", "high", "medium", "low", "minimal", "none" (default: "medium")
reasoning_effort (str): OpenRouter reasoning effort level: "xhigh", "high", "medium", "low", "minimal", "none" (default: "xhigh")
reasoning_disabled (bool): Completely disable reasoning/thinking tokens (default: False)
prefill_messages_file (str): Path to JSON file containing prefill messages (list of {role, content} dicts)
max_samples (int): Only process the first N samples from the dataset (optional, processes all if not set)
@@ -1216,7 +1216,7 @@ def main(
providers_order_list = [p.strip() for p in providers_order.split(",")] if providers_order else None
# Build reasoning_config from CLI flags
# --reasoning_disabled takes priority, then --reasoning_effort, then default (medium)
# --reasoning_disabled takes priority, then --reasoning_effort, then default (xhigh)
reasoning_config = None
if reasoning_disabled:
# Completely disable reasoning/thinking tokens
@@ -1238,7 +1238,7 @@ def main(
with open(prefill_messages_file, 'r', encoding='utf-8') as f:
prefill_messages = json.load(f)
if not isinstance(prefill_messages, list):
print("❌ Error: prefill_messages_file must contain a JSON array of messages")
print(f"❌ Error: prefill_messages_file must contain a JSON array of messages")
return
print(f"💬 Loaded {len(prefill_messages)} prefill messages from {prefill_messages_file}")
except Exception as e:

View File

@@ -11,13 +11,8 @@ model:
# Inference provider selection:
# "auto" - Use Nous Portal if logged in, otherwise OpenRouter/env vars (default)
# "nous-api" - Use Nous Portal via API key (requires: NOUS_API_KEY)
# "openrouter" - Always use OpenRouter API key from OPENROUTER_API_KEY
# "nous" - Always use Nous Portal (requires: hermes login)
# "zai" - Use z.ai / ZhipuAI GLM models (requires: GLM_API_KEY)
# "kimi-coding"- Use Kimi / Moonshot AI models (requires: KIMI_API_KEY)
# "minimax" - Use MiniMax global endpoint (requires: MINIMAX_API_KEY)
# "minimax-cn" - Use MiniMax China endpoint (requires: MINIMAX_CN_API_KEY)
# Can also be overridden with --provider flag or HERMES_INFERENCE_PROVIDER env var.
provider: "auto"
@@ -51,16 +46,6 @@ model:
# # Data policy: "allow" (default) or "deny" to exclude providers that may store data
# # data_collection: "deny"
# =============================================================================
# Git Worktree Isolation
# =============================================================================
# When enabled, each CLI session creates an isolated git worktree so multiple
# agents can work on the same repo concurrently without file collisions.
# Equivalent to always passing --worktree / -w on the command line.
#
# worktree: true # Always create a worktree when in a git repo
# worktree: false # Default — only create when -w flag is passed
# =============================================================================
# Terminal Tool Configuration
# =============================================================================
@@ -210,58 +195,8 @@ compression:
threshold: 0.85
# Model to use for generating summaries (fast/cheap recommended)
# This model compresses the middle turns into a concise summary.
# IMPORTANT: it receives the full middle section of the conversation, so it
# MUST support a context length at least as large as your main model's.
# This model compresses the middle turns into a concise summary
summary_model: "google/gemini-3-flash-preview"
# Provider for the summary model (default: "auto")
# Options: "auto", "openrouter", "nous", "main"
# summary_provider: "auto"
# =============================================================================
# Auxiliary Models (Advanced — Experimental)
# =============================================================================
# Hermes uses lightweight "auxiliary" models for side tasks: image analysis,
# browser screenshot analysis, web page summarization, and context compression.
#
# By default these use Gemini Flash via OpenRouter or Nous Portal and are
# auto-detected from your credentials. You do NOT need to change anything
# here for normal usage.
#
# WARNING: Overriding these with providers other than OpenRouter or Nous Portal
# is EXPERIMENTAL and may not work. Not all models/providers support vision,
# produce usable summaries, or accept the same API format. Change at your own
# risk — if things break, reset to "auto" / empty values.
#
# Each task has its own provider + model pair so you can mix providers.
# For example: OpenRouter for vision (needs multimodal), but your main
# local endpoint for compression (just needs text).
#
# Provider options:
# "auto" - Best available: OpenRouter → Nous Portal → main endpoint (default)
# "openrouter" - Force OpenRouter (requires OPENROUTER_API_KEY)
# "nous" - Force Nous Portal (requires: hermes login)
# "codex" - Force Codex OAuth (requires: hermes model → Codex).
# Uses gpt-5.3-codex which supports vision.
# "main" - Use your custom endpoint (OPENAI_BASE_URL + OPENAI_API_KEY).
# Works with OpenAI API, local models, or any OpenAI-compatible
# endpoint. Also falls back to Codex OAuth and API-key providers.
#
# Model: leave empty to use the provider's default. When empty, OpenRouter
# uses "google/gemini-3-flash-preview" and Nous uses "gemini-3-flash".
# Other providers pick a sensible default automatically.
#
# auxiliary:
# # Image analysis: vision_analyze tool + browser screenshots
# vision:
# provider: "auto"
# model: "" # e.g. "google/gemini-2.5-flash", "openai/gpt-4o"
#
# # Web page scraping / summarization + browser page text extraction
# web_extract:
# provider: "auto"
# model: ""
# =============================================================================
# Persistent Memory
@@ -346,7 +281,7 @@ agent:
# Reasoning effort level (OpenRouter and Nous Portal)
# Controls how much "thinking" the model does before responding.
# Options: "xhigh" (max), "high", "medium", "low", "minimal", "none" (disable)
reasoning_effort: "medium"
reasoning_effort: "xhigh"
# Predefined personalities (use with /personality command)
personalities:
@@ -403,13 +338,11 @@ agent:
# discord: [web, vision, skills, todo]
#
# If not set, defaults are:
# cli: hermes-cli (everything + cronjob management)
# telegram: hermes-telegram (terminal, file, web, vision, image, tts, browser, skills, todo, cronjob, messaging)
# discord: hermes-discord (same as telegram)
# whatsapp: hermes-whatsapp (same as telegram)
# slack: hermes-slack (same as telegram)
# signal: hermes-signal (same as telegram)
# homeassistant: hermes-homeassistant (same as telegram)
# cli: hermes-cli (everything + cronjob management)
# telegram: hermes-telegram (terminal, file, web, vision, image, tts, browser, skills, todo, cronjob, messaging)
# discord: hermes-discord (same as telegram)
# whatsapp: hermes-whatsapp (same as telegram)
# slack: hermes-slack (same as telegram)
#
platform_toolsets:
cli: [hermes-cli]
@@ -417,8 +350,6 @@ platform_toolsets:
discord: [hermes-discord]
whatsapp: [hermes-whatsapp]
slack: [hermes-slack]
signal: [hermes-signal]
homeassistant: [hermes-homeassistant]
# ─────────────────────────────────────────────────────────────────────────────
# Available toolsets (use these names in platform_toolsets or the toolsets list)
@@ -560,21 +491,6 @@ toolsets:
# args: ["-y", "@modelcontextprotocol/server-github"]
# env:
# GITHUB_PERSONAL_ACCESS_TOKEN: "ghp_..."
#
# Sampling (server-initiated LLM requests) — enabled by default.
# Per-server config under the 'sampling' key:
# analysis:
# command: npx
# args: ["-y", "analysis-server"]
# sampling:
# enabled: true # default: true
# model: "gemini-3-flash" # override model (optional)
# max_tokens_cap: 4096 # max tokens per request
# timeout: 30 # LLM call timeout (seconds)
# max_rpm: 10 # max requests per minute
# allowed_models: [] # model whitelist (empty = all)
# max_tool_rounds: 5 # tool loop limit (0 = disable)
# log_level: "info" # audit verbosity
# =============================================================================
# Voice Transcription (Speech-to-Text)
@@ -626,10 +542,6 @@ code_execution:
delegation:
max_iterations: 50 # Max tool-calling turns per child (default: 50)
default_toolsets: ["terminal", "file", "web"] # Default toolsets for subagents
# model: "google/gemini-3-flash-preview" # Override model for subagents (empty = inherit parent)
# provider: "openrouter" # Override provider for subagents (empty = inherit parent)
# # Resolves full credentials (base_url, api_key) automatically.
# # Supported: openrouter, nous, zai, kimi-coding, minimax
# =============================================================================
# Honcho Integration (Cross-Session User Modeling)
@@ -659,63 +571,3 @@ display:
# verbose: Full args, results, and debug logs (same as /verbose)
# Toggle at runtime with /verbose in the CLI
tool_progress: all
# Background process notifications (gateway/messaging only).
# Controls how chatty the process watcher is when you use
# terminal(background=true, check_interval=...) from Telegram/Discord/etc.
# off: No watcher messages at all
# result: Only the final completion message
# error: Only the final message when exit code != 0
# all: Running output updates + final message (default)
background_process_notifications: all
# Play terminal bell when agent finishes a response.
# Useful for long-running tasks — your terminal will ding when the agent is done.
# Works over SSH. Most terminals can be configured to flash the taskbar or play a sound.
bell_on_complete: false
# Show model reasoning/thinking before each response.
# When enabled, a dim box shows the model's thought process above the response.
# Toggle at runtime with /reasoning show or /reasoning hide.
show_reasoning: false
# ───────────────────────────────────────────────────────────────────────────
# Skin / Theme
# ───────────────────────────────────────────────────────────────────────────
# Customize CLI visual appearance — banner colors, spinner faces, tool prefix,
# response box label, and branding text. Change at runtime with /skin <name>.
#
# Built-in skins:
# default — Classic Hermes gold/kawaii
# ares — Crimson/bronze war-god theme with spinner wings
# mono — Clean grayscale monochrome
# slate — Cool blue developer-focused
#
# Custom skins: drop a YAML file in ~/.hermes/skins/<name>.yaml
# Schema (all fields optional, missing values inherit from default):
#
# name: my-theme
# description: Short description
# colors:
# banner_border: "#HEX" # Panel border
# banner_title: "#HEX" # Panel title
# banner_accent: "#HEX" # Section headers (Available Tools, etc.)
# banner_dim: "#HEX" # Dim/muted text
# banner_text: "#HEX" # Body text (tool names, skill names)
# ui_accent: "#HEX" # UI accent color
# response_border: "#HEX" # Response box border color
# spinner:
# waiting_faces: ["(⚔)", "(⛨)"] # Faces shown while waiting
# thinking_faces: ["(⚔)", "(⌁)"] # Faces shown while thinking
# thinking_verbs: ["forging", "plotting"] # Verbs for spinner messages
# wings: # Optional left/right spinner decorations
# - ["⟪⚔", "⚔⟫"]
# - ["⟪▲", "▲⟫"]
# branding:
# agent_name: "My Agent" # Banner title and branding
# welcome: "Welcome message" # Shown at CLI startup
# response_label: " ⚔ Agent " # Response box header label
# prompt_symbol: "⚔ " # Prompt symbol
# tool_prefix: "╎" # Tool output line prefix (default: ┊)
#
skin: default

1955
cli.py

File diff suppressed because it is too large Load Diff

View File

@@ -14,8 +14,6 @@ from datetime import datetime, timedelta
from pathlib import Path
from typing import Optional, Dict, List, Any
from hermes_time import now as _hermes_now
try:
from croniter import croniter
HAS_CRONITER = True
@@ -26,35 +24,16 @@ except ImportError:
# Configuration
# =============================================================================
HERMES_DIR = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
HERMES_DIR = Path.home() / ".hermes"
CRON_DIR = HERMES_DIR / "cron"
JOBS_FILE = CRON_DIR / "jobs.json"
OUTPUT_DIR = CRON_DIR / "output"
def _secure_dir(path: Path):
"""Set directory to owner-only access (0700). No-op on Windows."""
try:
os.chmod(path, 0o700)
except (OSError, NotImplementedError):
pass # Windows or other platforms where chmod is not supported
def _secure_file(path: Path):
"""Set file to owner-only read/write (0600). No-op on Windows."""
try:
if path.exists():
os.chmod(path, 0o600)
except (OSError, NotImplementedError):
pass
def ensure_dirs():
"""Ensure cron directories exist with secure permissions."""
"""Ensure cron directories exist."""
CRON_DIR.mkdir(parents=True, exist_ok=True)
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
_secure_dir(CRON_DIR)
_secure_dir(OUTPUT_DIR)
# =============================================================================
@@ -149,7 +128,7 @@ def parse_schedule(schedule: str) -> Dict[str, Any]:
# Duration like "30m", "2h", "1d" → one-shot from now
try:
minutes = parse_duration(schedule)
run_at = _hermes_now() + timedelta(minutes=minutes)
run_at = datetime.now() + timedelta(minutes=minutes)
return {
"kind": "once",
"run_at": run_at.isoformat(),
@@ -167,50 +146,37 @@ def parse_schedule(schedule: str) -> Dict[str, Any]:
)
def _ensure_aware(dt: datetime) -> datetime:
"""Make a naive datetime tz-aware using the configured timezone.
Handles backward compatibility: timestamps stored before timezone support
are naive (server-local). We assume they were in the same timezone as
the current configuration so comparisons work without crashing.
"""
if dt.tzinfo is None:
tz = _hermes_now().tzinfo
return dt.replace(tzinfo=tz)
return dt
def compute_next_run(schedule: Dict[str, Any], last_run_at: Optional[str] = None) -> Optional[str]:
"""
Compute the next run time for a schedule.
Returns ISO timestamp string, or None if no more runs.
"""
now = _hermes_now()
now = datetime.now()
if schedule["kind"] == "once":
run_at = _ensure_aware(datetime.fromisoformat(schedule["run_at"]))
run_at = datetime.fromisoformat(schedule["run_at"])
# If in the future, return it; if in the past, no more runs
return schedule["run_at"] if run_at > now else None
elif schedule["kind"] == "interval":
minutes = schedule["minutes"]
if last_run_at:
# Next run is last_run + interval
last = _ensure_aware(datetime.fromisoformat(last_run_at))
last = datetime.fromisoformat(last_run_at)
next_run = last + timedelta(minutes=minutes)
else:
# First run is now + interval
next_run = now + timedelta(minutes=minutes)
return next_run.isoformat()
elif schedule["kind"] == "cron":
if not HAS_CRONITER:
return None
cron = croniter(schedule["expr"], now)
next_run = cron.get_next(datetime)
return next_run.isoformat()
return None
@@ -238,11 +204,10 @@ def save_jobs(jobs: List[Dict[str, Any]]):
fd, tmp_path = tempfile.mkstemp(dir=str(JOBS_FILE.parent), suffix='.tmp', prefix='.jobs_')
try:
with os.fdopen(fd, 'w', encoding='utf-8') as f:
json.dump({"jobs": jobs, "updated_at": _hermes_now().isoformat()}, f, indent=2)
json.dump({"jobs": jobs, "updated_at": datetime.now().isoformat()}, f, indent=2)
f.flush()
os.fsync(f.fileno())
os.replace(tmp_path, JOBS_FILE)
_secure_file(JOBS_FILE)
except BaseException:
try:
os.unlink(tmp_path)
@@ -284,7 +249,7 @@ def create_job(
deliver = "origin" if origin else "local"
job_id = uuid.uuid4().hex[:12]
now = _hermes_now().isoformat()
now = datetime.now().isoformat()
job = {
"id": job_id,
@@ -363,7 +328,7 @@ def mark_job_run(job_id: str, success: bool, error: Optional[str] = None):
jobs = load_jobs()
for i, job in enumerate(jobs):
if job["id"] == job_id:
now = _hermes_now().isoformat()
now = datetime.now().isoformat()
job["last_run_at"] = now
job["last_status"] = "ok" if success else "error"
job["last_error"] = error if not success else None
@@ -396,7 +361,7 @@ def mark_job_run(job_id: str, success: bool, error: Optional[str] = None):
def get_due_jobs() -> List[Dict[str, Any]]:
"""Get all jobs that are due to run now."""
now = _hermes_now()
now = datetime.now()
jobs = load_jobs()
due = []
@@ -408,7 +373,7 @@ def get_due_jobs() -> List[Dict[str, Any]]:
if not next_run:
continue
next_run_dt = _ensure_aware(datetime.fromisoformat(next_run))
next_run_dt = datetime.fromisoformat(next_run)
if next_run_dt <= now:
due.append(job)
@@ -420,13 +385,11 @@ def save_job_output(job_id: str, output: str):
ensure_dirs()
job_output_dir = OUTPUT_DIR / job_id
job_output_dir.mkdir(parents=True, exist_ok=True)
_secure_dir(job_output_dir)
timestamp = _hermes_now().strftime("%Y-%m-%d_%H-%M-%S")
timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
output_file = job_output_dir / f"{timestamp}.md"
with open(output_file, 'w', encoding='utf-8') as f:
f.write(output)
_secure_file(output_file)
return output_file

View File

@@ -27,8 +27,6 @@ from datetime import datetime
from pathlib import Path
from typing import Optional
from hermes_time import now as _hermes_now
logger = logging.getLogger(__name__)
# Add parent directory to path for imports
@@ -45,7 +43,7 @@ _LOCK_FILE = _LOCK_DIR / ".tick.lock"
def _resolve_origin(job: dict) -> Optional[dict]:
"""Extract origin info from a job, preserving any extra routing metadata."""
"""Extract origin info from a job, returning {platform, chat_id, chat_name} or None."""
origin = job.get("origin")
if not origin:
return None
@@ -69,8 +67,6 @@ def _deliver_result(job: dict, content: str) -> None:
if deliver == "local":
return
thread_id = None
# Resolve target platform + chat_id
if deliver == "origin":
if not origin:
@@ -78,7 +74,6 @@ def _deliver_result(job: dict, content: str) -> None:
return
platform_name = origin["platform"]
chat_id = origin["chat_id"]
thread_id = origin.get("thread_id")
elif ":" in deliver:
platform_name, chat_id = deliver.split(":", 1)
else:
@@ -86,7 +81,6 @@ def _deliver_result(job: dict, content: str) -> None:
platform_name = deliver
if origin and origin.get("platform") == platform_name:
chat_id = origin["chat_id"]
thread_id = origin.get("thread_id")
else:
# Fall back to home channel
chat_id = os.getenv(f"{platform_name.upper()}_HOME_CHANNEL", "")
@@ -102,8 +96,6 @@ def _deliver_result(job: dict, content: str) -> None:
"discord": Platform.DISCORD,
"slack": Platform.SLACK,
"whatsapp": Platform.WHATSAPP,
"signal": Platform.SIGNAL,
"email": Platform.EMAIL,
}
platform = platform_map.get(platform_name.lower())
if not platform:
@@ -123,13 +115,13 @@ def _deliver_result(job: dict, content: str) -> None:
# Run the async send in a fresh event loop (safe from any thread)
try:
result = asyncio.run(_send_to_platform(platform, pconfig, chat_id, content, thread_id=thread_id))
result = asyncio.run(_send_to_platform(platform, pconfig, chat_id, content))
except RuntimeError:
# asyncio.run() fails if there's already a running loop in this thread;
# spin up a new thread to avoid that.
import concurrent.futures
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
future = pool.submit(asyncio.run, _send_to_platform(platform, pconfig, chat_id, content, thread_id=thread_id))
future = pool.submit(asyncio.run, _send_to_platform(platform, pconfig, chat_id, content))
result = future.result(timeout=30)
except Exception as e:
logger.error("Job '%s': delivery to %s:%s failed: %s", job["id"], platform_name, chat_id, e)
@@ -142,9 +134,9 @@ def _deliver_result(job: dict, content: str) -> None:
# Mirror the delivered content into the target's gateway session
try:
from gateway.mirror import mirror_to_session
mirror_to_session(platform_name, chat_id, content, source_label="cron", thread_id=thread_id)
except Exception as e:
logger.warning("Job '%s': mirror_to_session failed: %s", job["id"], e)
mirror_to_session(platform_name, chat_id, content, source_label="cron")
except Exception:
pass
def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
@@ -182,8 +174,6 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
model = os.getenv("HERMES_MODEL") or os.getenv("LLM_MODEL") or "anthropic/claude-opus-4.6"
# Load config.yaml for model, reasoning, prefill, toolsets, provider routing
_cfg = {}
try:
import yaml
_cfg_path = str(_hermes_home / "config.yaml")
@@ -195,44 +185,8 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
model = _model_cfg
elif isinstance(_model_cfg, dict):
model = _model_cfg.get("default", model)
except Exception as e:
logger.warning("Job '%s': failed to load config.yaml, using defaults: %s", job_id, e)
# Reasoning config from env or config.yaml
reasoning_config = None
effort = os.getenv("HERMES_REASONING_EFFORT", "")
if not effort:
effort = str(_cfg.get("agent", {}).get("reasoning_effort", "")).strip()
if effort and effort.lower() != "none":
valid = ("xhigh", "high", "medium", "low", "minimal")
if effort.lower() in valid:
reasoning_config = {"enabled": True, "effort": effort.lower()}
elif effort.lower() == "none":
reasoning_config = {"enabled": False}
# Prefill messages from env or config.yaml
prefill_messages = None
prefill_file = os.getenv("HERMES_PREFILL_MESSAGES_FILE", "") or _cfg.get("prefill_messages_file", "")
if prefill_file:
import json as _json
pfpath = Path(prefill_file).expanduser()
if not pfpath.is_absolute():
pfpath = _hermes_home / pfpath
if pfpath.exists():
try:
with open(pfpath, "r", encoding="utf-8") as _pf:
prefill_messages = _json.load(_pf)
if not isinstance(prefill_messages, list):
prefill_messages = None
except Exception as e:
logger.warning("Job '%s': failed to parse prefill messages file '%s': %s", job_id, pfpath, e)
prefill_messages = None
# Max iterations
max_iterations = _cfg.get("agent", {}).get("max_turns") or _cfg.get("max_turns") or 90
# Provider routing
pr = _cfg.get("provider_routing", {})
except Exception:
pass
from hermes_cli.runtime_provider import (
resolve_runtime_provider,
@@ -252,15 +206,8 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
base_url=runtime.get("base_url"),
provider=runtime.get("provider"),
api_mode=runtime.get("api_mode"),
max_iterations=max_iterations,
reasoning_config=reasoning_config,
prefill_messages=prefill_messages,
providers_allowed=pr.get("only"),
providers_ignored=pr.get("ignore"),
providers_order=pr.get("order"),
provider_sort=pr.get("sort"),
quiet_mode=True,
session_id=f"cron_{job_id}_{_hermes_now().strftime('%Y%m%d_%H%M%S')}"
session_id=f"cron_{job_id}_{datetime.now().strftime('%Y%m%d_%H%M%S')}"
)
result = agent.run_conversation(prompt)
@@ -272,7 +219,7 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
output = f"""# Cron Job: {job_name}
**Job ID:** {job_id}
**Run Time:** {_hermes_now().strftime('%Y-%m-%d %H:%M:%S')}
**Run Time:** {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}
**Schedule:** {job.get('schedule_display', 'N/A')}
## Prompt
@@ -294,7 +241,7 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
output = f"""# Cron Job: {job_name} (FAILED)
**Job ID:** {job_id}
**Run Time:** {_hermes_now().strftime('%Y-%m-%d %H:%M:%S')}
**Run Time:** {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}
**Schedule:** {job.get('schedule_display', 'N/A')}
## Prompt
@@ -350,11 +297,11 @@ def tick(verbose: bool = True) -> int:
due_jobs = get_due_jobs()
if verbose and not due_jobs:
logger.info("%s - No jobs due", _hermes_now().strftime('%H:%M:%S'))
logger.info("%s - No jobs due", datetime.now().strftime('%H:%M:%S'))
return 0
if verbose:
logger.info("%s - %s job(s) due", _hermes_now().strftime('%H:%M:%S'), len(due_jobs))
logger.info("%s - %s job(s) due", datetime.now().strftime('%H:%M:%S'), len(due_jobs))
executed = 0
for job in due_jobs:

View File

@@ -1,46 +0,0 @@
# datagen-config-examples/web_research.yaml
#
# Batch data generation config for WebResearchEnv.
# Generates tool-calling trajectories for multi-step web research tasks.
#
# Usage:
# python batch_runner.py \
# --config datagen-config-examples/web_research.yaml \
# --run_name web_research_v1
environment: web-research
# Toolsets available to the agent during data generation
toolsets:
- web
- file
# How many parallel workers to use
num_workers: 4
# Questions per batch
batch_size: 20
# Total trajectories to generate (comment out to run full dataset)
max_items: 500
# Model to use for generation (override with --model flag)
model: openrouter/nousresearch/hermes-3-llama-3.1-405b
# System prompt additions (ephemeral — not saved to trajectories)
ephemeral_system_prompt: |
You are a highly capable research agent. When asked a factual question,
always use web_search to find current, accurate information before answering.
Cite at least 2 sources. Be concise and accurate.
# Output directory
output_dir: data/web_research_v1
# Trajectory compression settings (for fitting into training token budgets)
compression:
enabled: true
target_max_tokens: 16000
# Eval settings
eval_every: 100 # Run eval every N trajectories
eval_size: 25 # Number of held-out questions per eval run

7
docs/README.md Normal file
View File

@@ -0,0 +1,7 @@
# Documentation
All documentation has moved to the website:
**📖 [hermes-agent.nousresearch.com/docs](https://hermes-agent.nousresearch.com/docs/)**
The documentation source files live in [`website/docs/`](../website/docs/).

View File

@@ -0,0 +1,344 @@
# send_file Integration Map — Hermes Agent Codebase Deep Dive
## 1. environments/tool_context.py — Base64 File Transfer Implementation
### upload_file() (lines 153-205)
- Reads local file as raw bytes, base64-encodes to ASCII string
- Creates parent dirs in sandbox via `self.terminal(f"mkdir -p {parent}")`
- **Chunk size:** 60,000 chars (~60KB per shell command)
- **Small files (<=60KB b64):** Single `printf '%s' '{b64}' | base64 -d > {remote_path}`
- **Large files:** Writes chunks to `/tmp/_hermes_upload.b64` via `printf >> append`, then `base64 -d` to target
- **Error handling:** Checks local file exists; returns `{exit_code, output}`
- **Size limits:** No explicit limit, but shell arg limit ~2MB means chunking is necessary for files >~45KB raw
- **No theoretical max** — but very large files would be slow (many terminal round trips)
### download_file() (lines 234-278)
- Runs `base64 {remote_path}` inside sandbox, captures stdout
- Strips output, base64-decodes to raw bytes
- Writes to host filesystem with parent dir creation
- **Error handling:** Checks exit code, empty output, decode errors
- Returns `{success: bool, bytes: int}` or `{success: false, error: str}`
- **Size limit:** Bounded by terminal output buffer (practical limit ~few MB via base64 terminal output)
### Promotion potential:
- These methods work via `self.terminal()` — they're environment-agnostic
- Could be directly lifted into a new tool that operates on the agent's current sandbox
- For send_file, this `download_file()` pattern is the key: it extracts files from sandbox → host
## 2. tools/environments/base.py — BaseEnvironment Interface
### Current methods:
- `execute(command, cwd, timeout, stdin_data)``{output, returncode}`
- `cleanup()` — release resources
- `stop()` — alias for cleanup
- `_prepare_command()` — sudo transformation
- `_build_run_kwargs()` — subprocess kwargs
- `_timeout_result()` — standard timeout dict
### What would need to be added for file transfer:
- **Nothing required at this level.** File transfer can be implemented via `execute()` (base64 over terminal, like ToolContext does) or via environment-specific methods.
- Optional: `upload_file(local_path, remote_path)` and `download_file(remote_path, local_path)` methods could be added to BaseEnvironment for optimized per-backend transfers, but the base64-over-terminal approach already works universally.
## 3. tools/environments/docker.py — Docker Container Details
### Container ID tracking:
- `self._container_id` stored at init from `self._inner.container_id`
- Inner is `minisweagent.environments.docker.DockerEnvironment`
- Container ID is a standard Docker container hash
### docker cp feasibility:
- **YES**, `docker cp` could be used for optimized file transfer:
- `docker cp {container_id}:{remote_path} {local_path}` (download)
- `docker cp {local_path} {container_id}:{remote_path}` (upload)
- Much faster than base64-over-terminal for large files
- Container ID is directly accessible via `env._container_id` or `env._inner.container_id`
### Volumes mounted:
- **Persistent mode:** Bind mounts at `~/.hermes/sandboxes/docker/{task_id}/workspace``/workspace` and `.../home``/root`
- **Ephemeral mode:** tmpfs at `/workspace` (10GB), `/home` (1GB), `/root` (1GB)
- **User volumes:** From `config.yaml docker_volumes` (arbitrary `-v` mounts)
- **Security tmpfs:** `/tmp` (512MB), `/var/tmp` (256MB), `/run` (64MB)
### Direct host access for persistent mode:
- If persistent, files at `/workspace/foo.txt` are just `~/.hermes/sandboxes/docker/{task_id}/workspace/foo.txt` on host — no transfer needed!
## 4. tools/environments/ssh.py — SSH Connection Management
### Connection management:
- Uses SSH ControlMaster for persistent connection
- Control socket at `/tmp/hermes-ssh/{user}@{host}:{port}.sock`
- ControlPersist=300 (5 min keepalive)
- BatchMode=yes (non-interactive)
- Stores: `self.host`, `self.user`, `self.port`, `self.key_path`
### SCP/SFTP feasibility:
- **YES**, SCP can piggyback on the ControlMaster socket:
- `scp -o ControlPath={socket} {user}@{host}:{remote} {local}` (download)
- `scp -o ControlPath={socket} {local} {user}@{host}:{remote}` (upload)
- Same SSH key and connection reuse — zero additional auth
- Would be much faster than base64-over-terminal for large files
## 5. tools/environments/modal.py — Modal Sandbox Filesystem
### Filesystem API exposure:
- **Not directly.** The inner `SwerexModalEnvironment` wraps Modal's sandbox
- The sandbox object is accessible at: `env._inner.deployment._sandbox`
- Modal's Python SDK exposes `sandbox.open()` for file I/O — but only via async API
- Currently only used for `snapshot_filesystem()` during cleanup
- **Could use:** `sandbox.open(path, "rb")` to read files or `sandbox.open(path, "wb")` to write
- **Alternative:** Base64-over-terminal already works via `execute()` — simpler, no SDK dependency
## 6. gateway/platforms/base.py — MEDIA: Tag Flow (Complete)
### extract_media() (lines 587-620):
- **Pattern:** `MEDIA:\S+` — extracts file paths after MEDIA: prefix
- **Voice flag:** `[[audio_as_voice]]` global directive sets `is_voice=True` for all media in message
- Returns `List[Tuple[str, bool]]` (path, is_voice) and cleaned content
### _process_message_background() media routing (lines 752-786):
- After extracting MEDIA tags, routes by file extension:
- `.ogg .opus .mp3 .wav .m4a``send_voice()`
- `.mp4 .mov .avi .mkv .3gp``send_video()`
- `.jpg .jpeg .png .webp .gif``send_image_file()`
- **Everything else** → `send_document()`
- This routing already supports arbitrary files!
### send_* method inventory (base class):
- `send(chat_id, content, reply_to, metadata)` — ABSTRACT, text
- `send_image(chat_id, image_url, caption, reply_to)` — URL-based images
- `send_animation(chat_id, animation_url, caption, reply_to)` — GIF animations
- `send_voice(chat_id, audio_path, caption, reply_to)` — voice messages
- `send_video(chat_id, video_path, caption, reply_to)` — video files
- `send_document(chat_id, file_path, caption, file_name, reply_to)` — generic files
- `send_image_file(chat_id, image_path, caption, reply_to)` — local image files
- `send_typing(chat_id)` — typing indicator
- `edit_message(chat_id, message_id, content)` — edit sent messages
### What's missing:
- **Telegram:** No override for `send_document` or `send_image_file` — falls back to text!
- **Discord:** No override for `send_document` — falls back to text!
- **WhatsApp:** Has `send_document` and `send_image_file` via bridge — COMPLETE.
- The base class defaults just send "📎 File: /path" as text — useless for actual file delivery.
## 7. gateway/platforms/telegram.py — Send Method Analysis
### Implemented send methods:
- `send()` — MarkdownV2 text with fallback to plain
- `send_voice()``.ogg`/`.opus` as `send_voice()`, others as `send_audio()`
- `send_image()` — URL-based via `send_photo()`
- `send_animation()` — GIF via `send_animation()`
- `send_typing()` — "typing" chat action
- `edit_message()` — edit text messages
### MISSING:
- **`send_document()` NOT overridden** — Need to add `self._bot.send_document(chat_id, document=open(file_path, 'rb'), ...)`
- **`send_image_file()` NOT overridden** — Need to add `self._bot.send_photo(chat_id, photo=open(path, 'rb'), ...)`
- **`send_video()` NOT overridden** — Need to add `self._bot.send_video(...)`
## 8. gateway/platforms/discord.py — Send Method Analysis
### Implemented send methods:
- `send()` — text messages with chunking
- `send_voice()` — discord.File attachment
- `send_image()` — downloads URL, creates discord.File attachment
- `send_typing()` — channel.typing()
- `edit_message()` — edit text messages
### MISSING:
- **`send_document()` NOT overridden** — Need to add discord.File attachment
- **`send_image_file()` NOT overridden** — Need to add discord.File from local path
- **`send_video()` NOT overridden** — Need to add discord.File attachment
## 9. gateway/run.py — User File Attachment Handling
### Current attachment flow:
1. **Telegram photos** (line 509-529): Download via `photo.get_file()``cache_image_from_bytes()` → vision auto-analysis
2. **Telegram voice** (line 532-541): Download → `cache_audio_from_bytes()` → STT transcription
3. **Telegram audio** (line 542-551): Same pattern
4. **Telegram documents** (line 553-617): Extension validation against `SUPPORTED_DOCUMENT_TYPES`, 20MB limit, content injection for text files
5. **Discord attachments** (line 717-751): Content-type detection, image/audio caching, URL fallback for other types
6. **Gateway run.py** (lines 818-883): Auto-analyzes images with vision, transcribes audio, enriches document messages with context notes
### Key insight: Files are always cached to host filesystem first, then processed. The agent sees local file paths.
## 10. tools/terminal_tool.py — Terminal Tool & Environment Interaction
### How it manages environments:
- Global dict `_active_environments: Dict[str, Any]` keyed by task_id
- Per-task creation locks prevent duplicate sandbox creation
- Auto-cleanup thread kills idle environments after `TERMINAL_LIFETIME_SECONDS`
- `_get_env_config()` reads all TERMINAL_* env vars for backend selection
- `_create_environment()` factory creates the right backend type
### Could send_file piggyback?
- **YES.** send_file needs access to the same environment to extract files from sandboxes.
- It can reuse `_active_environments[task_id]` to get the environment, then:
- Docker: Use `docker cp` via `env._container_id`
- SSH: Use `scp` via `env.control_socket`
- Local: Just read the file directly
- Modal: Use base64-over-terminal via `env.execute()`
- The file_tools.py module already does this with `ShellFileOperations` — read_file/write_file/search/patch all share the same env instance.
## 11. tools/tts_tool.py — Working Example of File Delivery
### Flow:
1. Generate audio file to `~/.hermes/audio_cache/tts_TIMESTAMP.{ogg,mp3}`
2. Return JSON with `media_tag: "MEDIA:/path/to/file"`
3. For Telegram voice: prepend `[[audio_as_voice]]` directive
4. The LLM includes the MEDIA tag in its response text
5. `BasePlatformAdapter._process_message_background()` calls `extract_media()` to find the tag
6. Routes by extension → `send_voice()` for audio files
7. Platform adapter sends the file natively
### Key pattern: Tool saves file to host → returns MEDIA: path → LLM echoes it → gateway extracts → platform delivers
## 12. tools/image_generation_tool.py — Working Example of Image Delivery
### Flow:
1. Call FAL.ai API → get image URL
2. Return JSON with `image: "https://fal.media/..."` URL
3. The LLM includes the URL in markdown: `![description](URL)`
4. `BasePlatformAdapter.extract_images()` finds `![alt](url)` patterns
5. Routes through `send_image()` (URL) or `send_animation()` (GIF)
6. Platform downloads and sends natively
### Key difference from TTS: Images are URL-based, not local files. The gateway downloads at send time.
---
# INTEGRATION MAP: Where send_file Hooks In
## Architecture Decision: MEDIA: Tag Protocol vs. New Tool
The MEDIA: tag protocol is already the established pattern for file delivery. Two options:
### Option A: Pure MEDIA: Tag (Minimal Change)
- No new tool needed
- Agent downloads file from sandbox to host using terminal (base64)
- Saves to known location (e.g., `~/.hermes/file_cache/`)
- Includes `MEDIA:/path` in response text
- Existing routing in `_process_message_background()` handles delivery
- **Problem:** Agent has to manually do base64 dance + know about MEDIA: convention
### Option B: Dedicated send_file Tool (Recommended)
- New tool that the agent calls with `(file_path, caption?)`
- Tool handles the sandbox → host extraction automatically
- Returns MEDIA: tag that gets routed through existing pipeline
- Much cleaner agent experience
## Implementation Plan for Option B
### Files to CREATE:
1. **`tools/send_file_tool.py`** — The new tool
- Accepts: `file_path` (path in sandbox), `caption` (optional)
- Detects environment backend from `_active_environments`
- Extracts file from sandbox:
- **local:** `shutil.copy()` or direct path
- **docker:** `docker cp {container_id}:{path} {local_cache}/`
- **ssh:** `scp -o ControlPath=... {user}@{host}:{path} {local_cache}/`
- **modal:** base64-over-terminal via `env.execute("base64 {path}")`
- Saves to `~/.hermes/file_cache/{uuid}_{filename}`
- Returns: `MEDIA:/cached/path` in response for gateway to pick up
- Register with `registry.register(name="send_file", toolset="file", ...)`
### Files to MODIFY:
2. **`gateway/platforms/telegram.py`** — Add missing send methods:
```python
async def send_document(self, chat_id, file_path, caption=None, file_name=None, reply_to=None):
with open(file_path, "rb") as f:
msg = await self._bot.send_document(
chat_id=int(chat_id), document=f,
caption=caption, filename=file_name or os.path.basename(file_path))
return SendResult(success=True, message_id=str(msg.message_id))
async def send_image_file(self, chat_id, image_path, caption=None, reply_to=None):
with open(image_path, "rb") as f:
msg = await self._bot.send_photo(chat_id=int(chat_id), photo=f, caption=caption)
return SendResult(success=True, message_id=str(msg.message_id))
async def send_video(self, chat_id, video_path, caption=None, reply_to=None):
with open(video_path, "rb") as f:
msg = await self._bot.send_video(chat_id=int(chat_id), video=f, caption=caption)
return SendResult(success=True, message_id=str(msg.message_id))
```
3. **`gateway/platforms/discord.py`** — Add missing send methods:
```python
async def send_document(self, chat_id, file_path, caption=None, file_name=None, reply_to=None):
channel = self._client.get_channel(int(chat_id)) or await self._client.fetch_channel(int(chat_id))
with open(file_path, "rb") as f:
file = discord.File(io.BytesIO(f.read()), filename=file_name or os.path.basename(file_path))
msg = await channel.send(content=caption, file=file)
return SendResult(success=True, message_id=str(msg.id))
async def send_image_file(self, chat_id, image_path, caption=None, reply_to=None):
# Same pattern as send_document with image filename
async def send_video(self, chat_id, video_path, caption=None, reply_to=None):
# Same pattern, discord renders video attachments inline
```
4. **`toolsets.py`** — Add `"send_file"` to `_HERMES_CORE_TOOLS` list
5. **`agent/prompt_builder.py`** — Update platform hints to mention send_file tool
### Code that can be REUSED (zero rewrite):
- `BasePlatformAdapter.extract_media()` — Already extracts MEDIA: tags
- `BasePlatformAdapter._process_message_background()` — Already routes by extension
- `ToolContext.download_file()` — Base64-over-terminal extraction pattern
- `tools/terminal_tool.py` _active_environments dict — Environment access
- `tools/registry.py` — Tool registration infrastructure
- `gateway/platforms/base.py` send_document/send_image_file/send_video signatures — Already defined
### Code that needs to be WRITTEN from scratch:
1. `tools/send_file_tool.py` (~150 lines):
- File extraction from each environment backend type
- Local file cache management
- Registry registration
2. Telegram `send_document` + `send_image_file` + `send_video` overrides (~40 lines)
3. Discord `send_document` + `send_image_file` + `send_video` overrides (~50 lines)
### Total effort: ~240 lines of new code, ~5 lines of config changes
## Key Environment-Specific Extract Strategies
| Backend | Extract Method | Speed | Complexity |
|------------|-------------------------------|----------|------------|
| local | shutil.copy / direct path | Instant | None |
| docker | `docker cp container:path .` | Fast | Low |
| docker+vol | Direct host path access | Instant | None |
| ssh | `scp -o ControlPath=...` | Fast | Low |
| modal | base64-over-terminal | Moderate | Medium |
| singularity| Direct path (overlay mount) | Fast | Low |
## Data Flow Summary
```
Agent calls send_file(file_path="/workspace/output.pdf", caption="Here's the report")
send_file_tool.py:
1. Get environment from _active_environments[task_id]
2. Detect backend type (docker/ssh/modal/local)
3. Extract file to ~/.hermes/file_cache/{uuid}_{filename}
4. Return: '{"success": true, "media_tag": "MEDIA:/home/user/.hermes/file_cache/abc123_output.pdf"}'
LLM includes MEDIA: tag in its response text
BasePlatformAdapter._process_message_background():
1. extract_media(response) → finds MEDIA:/path
2. Checks extension: .pdf → send_document()
3. Calls platform-specific send_document(chat_id, file_path, caption)
TelegramAdapter.send_document() / DiscordAdapter.send_document():
Opens file, sends via platform API as native document attachment
User receives downloadable file in chat
```

View File

@@ -1,89 +0,0 @@
# ============================================================================
# Hermes Agent — Example Skin Template
# ============================================================================
#
# Copy this file to ~/.hermes/skins/<name>.yaml to create a custom skin.
# All fields are optional — missing values inherit from the default skin.
# Activate with: /skin <name> or display.skin: <name> in config.yaml
#
# See hermes_cli/skin_engine.py for the full schema reference.
# ============================================================================
# Required: unique skin name (used in /skin command and config)
name: example
description: An example custom skin — copy and modify this template
# ── Colors ──────────────────────────────────────────────────────────────────
# Hex color values for Rich markup. These control the CLI's visual palette.
colors:
# Banner panel (the startup welcome box)
banner_border: "#CD7F32" # Panel border
banner_title: "#FFD700" # Panel title text
banner_accent: "#FFBF00" # Section headers (Available Tools, Skills, etc.)
banner_dim: "#B8860B" # Dim/muted text (separators, model info)
banner_text: "#FFF8DC" # Body text (tool names, skill names)
# UI elements
ui_accent: "#FFBF00" # General accent color
ui_label: "#4dd0e1" # Labels
ui_ok: "#4caf50" # Success indicators
ui_error: "#ef5350" # Error indicators
ui_warn: "#ffa726" # Warning indicators
# Input area
prompt: "#FFF8DC" # Prompt text color
input_rule: "#CD7F32" # Horizontal rule around input
# Response box
response_border: "#FFD700" # Response box border (ANSI color)
# Session display
session_label: "#DAA520" # Session label
session_border: "#8B8682" # Session ID dim color
# ── Spinner ─────────────────────────────────────────────────────────────────
# Customize the animated spinner shown during API calls and tool execution.
spinner:
# Faces shown while waiting for the API response
waiting_faces:
- "(。◕‿◕。)"
- "(◕‿◕✿)"
- "٩(◕‿◕。)۶"
# Faces shown during extended thinking/reasoning
thinking_faces:
- "(。•́︿•̀。)"
- "(◔_◔)"
- "(¬‿¬)"
# Verbs used in spinner messages (e.g., "pondering your request...")
thinking_verbs:
- "pondering"
- "contemplating"
- "musing"
- "ruminating"
# Optional: left/right decorations around the spinner
# Each entry is a [left, right] pair. Omit entirely for no wings.
# wings:
# - ["⟪⚔", "⚔⟫"]
# - ["⟪▲", "▲⟫"]
# ── Branding ────────────────────────────────────────────────────────────────
# Text strings used throughout the CLI interface.
branding:
agent_name: "Hermes Agent" # Banner title, about display
welcome: "Welcome! Type your message or /help for commands."
goodbye: "Goodbye! ⚕" # Exit message
response_label: " ⚕ Hermes " # Response box header label
prompt_symbol: " " # Input prompt symbol
help_header: "(^_^)? Available Commands" # /help header text
# ── Tool Output ─────────────────────────────────────────────────────────────
# Character used as the prefix for tool output lines.
# Default is "┊" (thin dotted vertical line). Some alternatives:
# "╎" (light triple dash vertical)
# "▏" (left one-eighth block)
# "│" (box drawing light vertical)
# "┃" (box drawing heavy vertical)
tool_prefix: "┊"

View File

@@ -195,12 +195,8 @@ environments/
│ └── hermes_swe_env.py
└── benchmarks/ # Evaluation benchmarks
── terminalbench_2/ # 89 terminal tasks, Modal sandboxes
└── terminalbench2_env.py
├── tblite/ # 100 calibrated tasks (fast TB2 proxy)
│ └── tblite_env.py
└── yc_bench/ # Long-horizon strategic benchmark
└── yc_bench_env.py
── terminalbench_2/
└── terminalbench2_env.py
```
## Concrete Environments

View File

@@ -18,14 +18,9 @@ Benchmarks (eval-only):
- benchmarks/terminalbench_2/: Terminal-Bench 2.0 evaluation
"""
try:
from environments.agent_loop import AgentResult, HermesAgentLoop
from environments.tool_context import ToolContext
from environments.hermes_base_env import HermesAgentBaseEnv, HermesAgentEnvConfig
except ImportError:
# atroposlib not installed — environments are unavailable but
# submodules like tool_call_parsers can still be imported directly.
pass
from environments.agent_loop import AgentResult, HermesAgentLoop
from environments.tool_context import ToolContext
from environments.hermes_base_env import HermesAgentBaseEnv, HermesAgentEnvConfig
__all__ = [
"AgentResult",

View File

@@ -249,62 +249,23 @@ class HermesAgentLoop:
reasoning = _extract_reasoning_from_message(assistant_msg)
reasoning_per_turn.append(reasoning)
# Check for tool calls -- standard OpenAI spec.
# Fallback: if response has no structured tool_calls but content
# contains raw tool call tags (e.g. <tool_call>), parse them using
# hermes-agent's standalone parsers. This handles the case where
# ManagedServer's ToolCallTranslator couldn't parse because vLLM
# isn't installed.
if (
not assistant_msg.tool_calls
and assistant_msg.content
and self.tool_schemas
and "<tool_call>" in (assistant_msg.content or "")
):
try:
from environments.tool_call_parsers import get_parser
fallback_parser = get_parser("hermes")
parsed_content, parsed_calls = fallback_parser.parse(
assistant_msg.content
)
if parsed_calls:
assistant_msg.tool_calls = parsed_calls
if parsed_content is not None:
assistant_msg.content = parsed_content
logger.debug(
"Fallback parser extracted %d tool calls from raw content",
len(parsed_calls),
)
except Exception:
pass # Fall through to no tool calls
# Check for tool calls -- standard OpenAI spec
if assistant_msg.tool_calls:
# Normalize tool calls to dicts — they may come as objects
# (OpenAI API) or dicts (vLLM ToolCallTranslator).
def _tc_to_dict(tc):
if isinstance(tc, dict):
return {
"id": tc.get("id", f"call_{uuid.uuid4().hex[:8]}"),
"type": "function",
"function": {
"name": tc.get("function", {}).get("name", tc.get("name", "")),
"arguments": tc.get("function", {}).get("arguments", tc.get("arguments", "{}")),
},
}
return {
"id": tc.id,
"type": "function",
"function": {
"name": tc.function.name,
"arguments": tc.function.arguments,
},
}
# Build the assistant message dict for conversation history
msg_dict: Dict[str, Any] = {
"role": "assistant",
"content": assistant_msg.content or "",
"tool_calls": [_tc_to_dict(tc) for tc in assistant_msg.tool_calls],
"tool_calls": [
{
"id": tc.id,
"type": "function",
"function": {
"name": tc.function.name,
"arguments": tc.function.arguments,
},
}
for tc in assistant_msg.tool_calls
],
}
# Preserve reasoning_content for multi-turn chat template handling
@@ -317,13 +278,8 @@ class HermesAgentLoop:
# Execute each tool call via hermes-agent's dispatch
for tc in assistant_msg.tool_calls:
# Handle both object (OpenAI) and dict (vLLM) formats
if isinstance(tc, dict):
tool_name = tc.get("function", {}).get("name", tc.get("name", ""))
tool_args_raw = tc.get("function", {}).get("arguments", tc.get("arguments", "{}"))
else:
tool_name = tc.function.name
tool_args_raw = tc.function.arguments
tool_name = tc.function.name
tool_args_raw = tc.function.arguments
# Validate tool name
if tool_name not in self.valid_tool_names:
@@ -434,11 +390,10 @@ class HermesAgentLoop:
pass
# Add tool response to conversation
tc_id = tc.get("id", "") if isinstance(tc, dict) else tc.id
messages.append(
{
"role": "tool",
"tool_call_id": tc_id,
"tool_call_id": tc.id,
"content": tool_result,
}
)

View File

@@ -1,38 +0,0 @@
# OpenThoughts-TBLite Evaluation -- Docker Backend (Local Compute)
#
# Runs tasks in Docker containers on the local machine.
# Sandboxed like Modal but no cloud costs. Good for dev/testing.
#
# Usage:
# python environments/benchmarks/tblite/tblite_env.py evaluate \
# --config environments/benchmarks/tblite/local.yaml
#
# # Override concurrency:
# python environments/benchmarks/tblite/tblite_env.py evaluate \
# --config environments/benchmarks/tblite/local.yaml \
# --env.eval_concurrency 4
env:
enabled_toolsets: ["terminal", "file"]
max_agent_turns: 60
max_token_length: 32000
agent_temperature: 0.8
terminal_backend: "docker"
terminal_timeout: 300
tool_pool_size: 16
dataset_name: "NousResearch/openthoughts-tblite"
test_timeout: 600
task_timeout: 1200
eval_concurrency: 8 # max 8 tasks at once
tokenizer_name: "NousResearch/Hermes-3-Llama-3.1-8B"
use_wandb: false
wandb_name: "openthoughts-tblite-local"
ensure_scores_are_not_same: false
data_dir_to_save_evals: "environments/benchmarks/evals/openthoughts-tblite-local"
openai:
base_url: "https://openrouter.ai/api/v1"
model_name: "anthropic/claude-sonnet-4"
server_type: "openai"
health_check: false
# api_key loaded from OPENROUTER_API_KEY in .env

View File

@@ -1,40 +0,0 @@
# OpenThoughts-TBLite Evaluation -- Local vLLM Backend
#
# Runs against a local vLLM server with Docker sandboxes.
#
# Start the vLLM server from the atropos directory:
# python -m example_trainer.vllm_api_server \
# --model Qwen/Qwen3-4B-Instruct-2507 \
# --port 9001 \
# --gpu-memory-utilization 0.8 \
# --max-model-len=32000
#
# Then run:
# python environments/benchmarks/tblite/tblite_env.py evaluate \
# --config environments/benchmarks/tblite/local_vllm.yaml
env:
enabled_toolsets: ["terminal", "file"]
max_agent_turns: 60
max_token_length: 16000
agent_temperature: 0.6
terminal_backend: "docker"
terminal_timeout: 300
tool_pool_size: 16
dataset_name: "NousResearch/openthoughts-tblite"
test_timeout: 600
task_timeout: 1200
eval_concurrency: 8
tool_call_parser: "hermes"
system_prompt: "You are an expert terminal agent. You MUST use the provided tools to complete tasks. Use the terminal tool to run shell commands, read_file to read files, write_file to write files, search_files to search, and patch to edit files. Do NOT write out solutions as text - execute them using the tools. Always start by exploring the environment with terminal commands."
tokenizer_name: "Qwen/Qwen3-4B-Instruct-2507"
use_wandb: false
wandb_name: "tblite-qwen3-4b-instruct"
ensure_scores_are_not_same: false
data_dir_to_save_evals: "environments/benchmarks/evals/tblite-qwen3-4b-local"
openai:
base_url: "http://localhost:9001"
model_name: "Qwen/Qwen3-4B-Instruct-2507"
server_type: "vllm"
health_check: false

View File

@@ -29,10 +29,6 @@ env:
wandb_name: "terminal-bench-2"
ensure_scores_are_not_same: false
data_dir_to_save_evals: "environments/benchmarks/evals/terminal-bench-2"
# CRITICAL: Limit concurrent Modal sandbox creations to avoid deadlocks.
# Modal's blocking calls (App.lookup, etc.) deadlock when too many sandboxes
# are created simultaneously inside thread pool workers via asyncio.run().
max_concurrent_tasks: 8
openai:
base_url: "https://openrouter.ai/api/v1"

View File

@@ -118,23 +118,6 @@ class TerminalBench2EvalConfig(HermesAgentEnvConfig):
"Tasks exceeding this are scored as FAIL. Default 30 minutes.",
)
# --- Concurrency control ---
max_concurrent_tasks: int = Field(
default=8,
description="Maximum number of tasks to run concurrently. "
"Limits concurrent Modal sandbox creations to avoid async/threading deadlocks. "
"Modal has internal limits and creating too many sandboxes simultaneously "
"causes blocking calls to deadlock inside the thread pool.",
)
# --- Eval concurrency ---
eval_concurrency: int = Field(
default=0,
description="Maximum number of tasks to evaluate in parallel. "
"0 means unlimited (all tasks run concurrently). "
"Set to 8 for local backends to avoid overwhelming the machine.",
)
# Tasks that cannot run properly on Modal and are excluded from scoring.
MODAL_INCOMPATIBLE_TASKS = {
@@ -209,7 +192,7 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
# Agent settings -- TB2 tasks are complex, need many turns
max_agent_turns=60,
max_token_length=***
max_token_length=16000,
agent_temperature=0.6,
system_prompt=None,
@@ -233,7 +216,7 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
steps_per_eval=1,
total_steps=1,
tokenizer_name="NousRe...1-8B",
tokenizer_name="NousResearch/Hermes-3-Llama-3.1-8B",
use_wandb=True,
wandb_name="terminal-bench-2",
ensure_scores_are_not_same=False, # Binary rewards may all be 0 or 1
@@ -245,7 +228,7 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
base_url="https://openrouter.ai/api/v1",
model_name="anthropic/claude-sonnet-4",
server_type="openai",
api_key=os.get...EY", ""),
api_key=os.getenv("OPENROUTER_API_KEY", ""),
health_check=False,
)
]
@@ -446,14 +429,8 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
"error": "no_image",
}
# --- 2. Register per-task image override ---
# Set both modal_image and docker_image so the task image is used
# regardless of which backend is configured.
register_task_env_overrides(task_id, {
"modal_image": modal_image,
"docker_image": modal_image,
"cwd": "/app",
})
# --- 2. Register per-task Modal image override ---
register_task_env_overrides(task_id, {"modal_image": modal_image})
logger.info(
"Task %s: registered image override for task_id %s",
task_name, task_id[:8],
@@ -468,37 +445,17 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
messages.append({"role": "user", "content": self.format_prompt(eval_item)})
# --- 4. Run agent loop ---
# Use ManagedServer (Phase 2) for vLLM/SGLang backends to get
# token-level tracking via /generate. Falls back to direct
# ServerManager (Phase 1) for OpenAI endpoints.
if self._use_managed_server():
async with self.server.managed_server(
tokenizer=self.tokenizer,
preserve_think_blocks=bool(self.config.thinking_mode),
) as managed:
agent = HermesAgentLoop(
server=managed,
tool_schemas=tools,
valid_tool_names=valid_names,
max_turns=self.config.max_agent_turns,
task_id=task_id,
temperature=self.config.agent_temperature,
max_tokens=self.config.max_token_length,
extra_body=self.config.extra_body,
)
result = await agent.run(messages)
else:
agent = HermesAgentLoop(
server=self.server,
tool_schemas=tools,
valid_tool_names=valid_names,
max_turns=self.config.max_agent_turns,
task_id=task_id,
temperature=self.config.agent_temperature,
max_tokens=self.config.max_token_length,
extra_body=self.config.extra_body,
)
result = await agent.run(messages)
agent = HermesAgentLoop(
server=self.server,
tool_schemas=tools,
valid_tool_names=valid_names,
max_turns=self.config.max_agent_turns,
task_id=task_id,
temperature=self.config.agent_temperature,
max_tokens=self.config.max_token_length,
extra_body=self.config.extra_body,
)
result = await agent.run(messages)
# --- 5. Verify -- run test suite in the agent's sandbox ---
# Skip verification if the agent produced no meaningful output
@@ -513,3 +470,435 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
reward = 0.0
else:
# Run tests in a thread so the blocking ctx.terminal() calls
# don't freeze the entire event loop (which would stall all
# other tasks, tqdm updates, and timeout timers).
ctx = ToolContext(task_id)
try:
loop = asyncio.get_event_loop()
reward = await loop.run_in_executor(
None, # default thread pool
self._run_tests, eval_item, ctx, task_name,
)
except Exception as e:
logger.error("Task %s: test verification failed: %s", task_name, e)
reward = 0.0
finally:
ctx.cleanup()
passed = reward == 1.0
status = "PASS" if passed else "FAIL"
elapsed = time.time() - task_start
tqdm.write(f" [{status}] {task_name} (turns={result.turns_used}, {elapsed:.0f}s)")
logger.info(
"Task %s: reward=%.1f, turns=%d, finished=%s",
task_name, reward, result.turns_used, result.finished_naturally,
)
out = {
"passed": passed,
"reward": reward,
"task_name": task_name,
"category": category,
"turns_used": result.turns_used,
"finished_naturally": result.finished_naturally,
"messages": result.messages,
}
self._save_result(out)
return out
except Exception as e:
elapsed = time.time() - task_start
logger.error("Task %s: rollout failed: %s", task_name, e, exc_info=True)
tqdm.write(f" [ERROR] {task_name}: {e} ({elapsed:.0f}s)")
out = {
"passed": False, "reward": 0.0,
"task_name": task_name, "category": category,
"error": str(e),
}
self._save_result(out)
return out
finally:
# --- Cleanup: clear overrides, sandbox, and temp files ---
clear_task_env_overrides(task_id)
try:
cleanup_vm(task_id)
except Exception as e:
logger.debug("VM cleanup for %s: %s", task_id[:8], e)
if task_dir and task_dir.exists():
shutil.rmtree(task_dir, ignore_errors=True)
def _run_tests(
self, item: Dict[str, Any], ctx: ToolContext, task_name: str
) -> float:
"""
Upload and execute the test suite in the agent's sandbox, then
download the verifier output locally to read the reward.
Follows Harbor's verification pattern:
1. Upload tests/ directory into the sandbox
2. Execute test.sh inside the sandbox
3. Download /logs/verifier/ directory to a local temp dir
4. Read reward.txt locally with native Python I/O
Downloading locally avoids issues with the file_read tool on
the Modal VM and matches how Harbor handles verification.
TB2 test scripts (test.sh) typically:
1. Install pytest via uv/pip
2. Run pytest against the test files in /tests/
3. Write results to /logs/verifier/reward.txt
Args:
item: The TB2 task dict (contains tests_tar, test_sh)
ctx: ToolContext scoped to this task's sandbox
task_name: For logging
Returns:
1.0 if tests pass, 0.0 otherwise
"""
tests_tar = item.get("tests_tar", "")
test_sh = item.get("test_sh", "")
if not test_sh:
logger.warning("Task %s: no test_sh content, reward=0", task_name)
return 0.0
# Create required directories in the sandbox
ctx.terminal("mkdir -p /tests /logs/verifier")
# Upload test files into the sandbox (binary-safe via base64)
if tests_tar:
tests_temp = Path(tempfile.mkdtemp(prefix=f"tb2-tests-{task_name}-"))
try:
_extract_base64_tar(tests_tar, tests_temp)
ctx.upload_dir(str(tests_temp), "/tests")
except Exception as e:
logger.warning("Task %s: failed to upload test files: %s", task_name, e)
finally:
shutil.rmtree(tests_temp, ignore_errors=True)
# Write the test runner script (test.sh)
ctx.write_file("/tests/test.sh", test_sh)
ctx.terminal("chmod +x /tests/test.sh")
# Execute the test suite
logger.info(
"Task %s: running test suite (timeout=%ds)",
task_name, self.config.test_timeout,
)
test_result = ctx.terminal(
"bash /tests/test.sh",
timeout=self.config.test_timeout,
)
exit_code = test_result.get("exit_code", -1)
output = test_result.get("output", "")
# Download the verifier output directory locally, then read reward.txt
# with native Python I/O. This avoids issues with file_read on the
# Modal VM and matches Harbor's verification pattern.
reward = 0.0
local_verifier_dir = Path(tempfile.mkdtemp(prefix=f"tb2-verifier-{task_name}-"))
try:
ctx.download_dir("/logs/verifier", str(local_verifier_dir))
reward_file = local_verifier_dir / "reward.txt"
if reward_file.exists() and reward_file.stat().st_size > 0:
content = reward_file.read_text().strip()
if content == "1":
reward = 1.0
elif content == "0":
reward = 0.0
else:
# Unexpected content -- try parsing as float
try:
reward = float(content)
except (ValueError, TypeError):
logger.warning(
"Task %s: reward.txt content unexpected (%r), "
"falling back to exit_code=%d",
task_name, content, exit_code,
)
reward = 1.0 if exit_code == 0 else 0.0
else:
# reward.txt not written -- fall back to exit code
logger.warning(
"Task %s: reward.txt not found after download, "
"falling back to exit_code=%d",
task_name, exit_code,
)
reward = 1.0 if exit_code == 0 else 0.0
except Exception as e:
logger.warning(
"Task %s: failed to download verifier dir: %s, "
"falling back to exit_code=%d",
task_name, e, exit_code,
)
reward = 1.0 if exit_code == 0 else 0.0
finally:
shutil.rmtree(local_verifier_dir, ignore_errors=True)
# Log test output for debugging failures
if reward == 0.0:
output_preview = output[-500:] if output else "(no output)"
logger.info(
"Task %s: FAIL (exit_code=%d)\n%s",
task_name, exit_code, output_preview,
)
return reward
# =========================================================================
# Evaluate -- main entry point for the eval subcommand
# =========================================================================
async def _eval_with_timeout(self, item: Dict[str, Any]) -> Dict:
"""
Wrap rollout_and_score_eval with a per-task wall-clock timeout.
If the task exceeds task_timeout seconds, it's automatically scored
as FAIL. This prevents any single task from hanging indefinitely.
"""
task_name = item.get("task_name", "unknown")
category = item.get("category", "unknown")
try:
return await asyncio.wait_for(
self.rollout_and_score_eval(item),
timeout=self.config.task_timeout,
)
except asyncio.TimeoutError:
from tqdm import tqdm
elapsed = self.config.task_timeout
tqdm.write(f" [TIMEOUT] {task_name} (exceeded {elapsed}s wall-clock limit)")
logger.error("Task %s: wall-clock timeout after %ds", task_name, elapsed)
out = {
"passed": False, "reward": 0.0,
"task_name": task_name, "category": category,
"error": f"timeout ({elapsed}s)",
}
self._save_result(out)
return out
async def evaluate(self, *args, **kwargs) -> None:
"""
Run Terminal-Bench 2.0 evaluation over all tasks.
This is the main entry point when invoked via:
python environments/terminalbench2_env.py evaluate
Runs all tasks through rollout_and_score_eval() via asyncio.gather()
(same pattern as GPQA and other Atropos eval envs). Each task is
wrapped with a wall-clock timeout so hung tasks auto-fail.
Suppresses noisy Modal/terminal output (HERMES_QUIET) so the tqdm
bar stays visible.
"""
start_time = time.time()
# Route all logging through tqdm.write() so the progress bar stays
# pinned at the bottom while log lines scroll above it.
from tqdm import tqdm
class _TqdmHandler(logging.Handler):
def emit(self, record):
try:
tqdm.write(self.format(record))
except Exception:
self.handleError(record)
handler = _TqdmHandler()
handler.setFormatter(logging.Formatter(
"%(asctime)s [%(name)s] %(levelname)s: %(message)s",
datefmt="%H:%M:%S",
))
root = logging.getLogger()
root.handlers = [handler] # Replace any existing handlers
root.setLevel(logging.INFO)
# Silence noisy third-party loggers that flood the output
logging.getLogger("httpx").setLevel(logging.WARNING) # Every HTTP request
logging.getLogger("openai").setLevel(logging.WARNING) # OpenAI client retries
logging.getLogger("rex-deploy").setLevel(logging.WARNING) # Swerex deployment
logging.getLogger("rex_image_builder").setLevel(logging.WARNING) # Image builds
print(f"\n{'='*60}")
print("Starting Terminal-Bench 2.0 Evaluation")
print(f"{'='*60}")
print(f" Dataset: {self.config.dataset_name}")
print(f" Total tasks: {len(self.all_eval_items)}")
print(f" Max agent turns: {self.config.max_agent_turns}")
print(f" Task timeout: {self.config.task_timeout}s")
print(f" Terminal backend: {self.config.terminal_backend}")
print(f" Tool thread pool: {self.config.tool_pool_size}")
print(f" Terminal timeout: {self.config.terminal_timeout}s/cmd")
print(f" Terminal lifetime: {self.config.terminal_lifetime}s (auto: task_timeout + 120)")
print(f"{'='*60}\n")
# Fire all tasks with wall-clock timeout, track live accuracy on the bar
total_tasks = len(self.all_eval_items)
eval_tasks = [
asyncio.ensure_future(self._eval_with_timeout(item))
for item in self.all_eval_items
]
results = []
passed_count = 0
pbar = tqdm(total=total_tasks, desc="Evaluating TB2", dynamic_ncols=True)
try:
for coro in asyncio.as_completed(eval_tasks):
result = await coro
results.append(result)
if result and result.get("passed"):
passed_count += 1
done = len(results)
pct = (passed_count / done * 100) if done else 0
pbar.set_postfix_str(f"pass={passed_count}/{done} ({pct:.1f}%)")
pbar.update(1)
except (KeyboardInterrupt, asyncio.CancelledError):
pbar.close()
print(f"\n\nInterrupted! Cleaning up {len(eval_tasks)} tasks...")
# Cancel all pending tasks
for task in eval_tasks:
task.cancel()
# Let cancellations propagate (finally blocks run cleanup_vm)
await asyncio.gather(*eval_tasks, return_exceptions=True)
# Belt-and-suspenders: clean up any remaining sandboxes
from tools.terminal_tool import cleanup_all_environments
cleanup_all_environments()
print("All sandboxes cleaned up.")
return
finally:
pbar.close()
end_time = time.time()
# Filter out None results (shouldn't happen, but be safe)
valid_results = [r for r in results if r is not None]
if not valid_results:
print("Warning: No valid evaluation results obtained")
return
# ---- Compute metrics ----
total = len(valid_results)
passed = sum(1 for r in valid_results if r.get("passed"))
overall_pass_rate = passed / total if total > 0 else 0.0
# Per-category breakdown
cat_results: Dict[str, List[Dict]] = defaultdict(list)
for r in valid_results:
cat_results[r.get("category", "unknown")].append(r)
# Build metrics dict
eval_metrics = {
"eval/pass_rate": overall_pass_rate,
"eval/total_tasks": total,
"eval/passed_tasks": passed,
"eval/evaluation_time_seconds": end_time - start_time,
}
# Per-category metrics
for category, cat_items in sorted(cat_results.items()):
cat_passed = sum(1 for r in cat_items if r.get("passed"))
cat_total = len(cat_items)
cat_pass_rate = cat_passed / cat_total if cat_total > 0 else 0.0
cat_key = category.replace(" ", "_").replace("-", "_").lower()
eval_metrics[f"eval/pass_rate_{cat_key}"] = cat_pass_rate
# Store metrics for wandb_log
self.eval_metrics = [(k, v) for k, v in eval_metrics.items()]
# ---- Print summary ----
print(f"\n{'='*60}")
print("Terminal-Bench 2.0 Evaluation Results")
print(f"{'='*60}")
print(f"Overall Pass Rate: {overall_pass_rate:.4f} ({passed}/{total})")
print(f"Evaluation Time: {end_time - start_time:.1f} seconds")
print("\nCategory Breakdown:")
for category, cat_items in sorted(cat_results.items()):
cat_passed = sum(1 for r in cat_items if r.get("passed"))
cat_total = len(cat_items)
cat_rate = cat_passed / cat_total if cat_total > 0 else 0.0
print(f" {category}: {cat_rate:.1%} ({cat_passed}/{cat_total})")
# Print individual task results
print("\nTask Results:")
for r in sorted(valid_results, key=lambda x: x.get("task_name", "")):
status = "PASS" if r.get("passed") else "FAIL"
turns = r.get("turns_used", "?")
error = r.get("error", "")
extra = f" (error: {error})" if error else ""
print(f" [{status}] {r['task_name']} (turns={turns}){extra}")
print(f"{'='*60}\n")
# Build sample records for evaluate_log (includes full conversations)
samples = [
{
"task_name": r.get("task_name"),
"category": r.get("category"),
"passed": r.get("passed"),
"reward": r.get("reward"),
"turns_used": r.get("turns_used"),
"error": r.get("error"),
"messages": r.get("messages"),
}
for r in valid_results
]
# Log evaluation results
try:
await self.evaluate_log(
metrics=eval_metrics,
samples=samples,
start_time=start_time,
end_time=end_time,
generation_parameters={
"temperature": self.config.agent_temperature,
"max_tokens": self.config.max_token_length,
"max_agent_turns": self.config.max_agent_turns,
"terminal_backend": self.config.terminal_backend,
},
)
except Exception as e:
print(f"Error logging evaluation results: {e}")
# Close streaming file
if hasattr(self, "_streaming_file") and not self._streaming_file.closed:
self._streaming_file.close()
print(f" Live results saved to: {self._streaming_path}")
# Kill all remaining sandboxes. Timed-out tasks leave orphaned thread
# pool workers still executing commands -- cleanup_all stops them.
from tools.terminal_tool import cleanup_all_environments
print("\nCleaning up all sandboxes...")
cleanup_all_environments()
# Shut down the tool thread pool so orphaned workers from timed-out
# tasks are killed immediately instead of retrying against dead
# sandboxes and spamming the console with TimeoutError warnings.
from environments.agent_loop import _tool_executor
_tool_executor.shutdown(wait=False, cancel_futures=True)
print("Done.")
# =========================================================================
# Wandb logging
# =========================================================================
async def wandb_log(self, wandb_metrics: Optional[Dict] = None):
"""Log TB2-specific metrics to wandb."""
if wandb_metrics is None:
wandb_metrics = {}
# Add stored eval metrics
for metric_name, metric_value in self.eval_metrics:
wandb_metrics[metric_name] = metric_value
self.eval_metrics = []
await super().wandb_log(wandb_metrics)
if __name__ == "__main__":
TerminalBench2EvalEnv.cli()

View File

@@ -1,115 +0,0 @@
# YC-Bench: Long-Horizon Agent Benchmark
[YC-Bench](https://github.com/collinear-ai/yc-bench) by [Collinear AI](https://collinear.ai/) is a deterministic, long-horizon benchmark that tests LLM agents' ability to act as a tech startup CEO. The agent manages a simulated company over 1-3 years, making compounding decisions about resource allocation, cash flow, task management, and prestige specialisation across 4 skill domains.
Unlike TerminalBench2 (which evaluates per-task coding ability with binary pass/fail), YC-Bench measures **long-term strategic coherence** — whether an agent can maintain consistent strategy, manage compounding consequences, and adapt plans over hundreds of turns.
## Setup
```bash
# Install yc-bench (optional dependency)
pip install "hermes-agent[yc-bench]"
# Or install from source
git clone https://github.com/collinear-ai/yc-bench
cd yc-bench && pip install -e .
# Verify
yc-bench --help
```
## Running
```bash
# From the repo root:
bash environments/benchmarks/yc_bench/run_eval.sh
# Or directly:
python environments/benchmarks/yc_bench/yc_bench_env.py evaluate \
--config environments/benchmarks/yc_bench/default.yaml
# Override model:
bash environments/benchmarks/yc_bench/run_eval.sh \
--openai.model_name anthropic/claude-opus-4-20250514
# Quick single-preset test:
bash environments/benchmarks/yc_bench/run_eval.sh \
--env.presets '["fast_test"]' --env.seeds '[1]'
```
## How It Works
### Architecture
```
HermesAgentLoop (our agent)
-> terminal tool -> subprocess("yc-bench company status") -> JSON output
-> terminal tool -> subprocess("yc-bench task accept --task-id X") -> JSON
-> terminal tool -> subprocess("yc-bench sim resume") -> JSON (advance time)
-> ... (100-500 turns per run)
```
The environment initialises the simulation via `yc-bench sim init` (NOT `yc-bench run`, which would start yc-bench's own built-in agent loop). Our `HermesAgentLoop` then drives all interaction through CLI commands.
### Simulation Mechanics
- **4 skill domains**: research, inference, data_environment, training
- **Prestige system** (1.0-10.0): Gates access to higher-paying tasks
- **Employee management**: Junior/Mid/Senior with domain-specific skill rates
- **Throughput splitting**: `effective_rate = base_rate / N` active tasks per employee
- **Financial pressure**: Monthly payroll, bankruptcy = game over
- **Deterministic**: SHA256-based RNG — same seed + preset = same world
### Difficulty Presets
| Preset | Employees | Tasks | Focus |
|-----------|-----------|-------|-------|
| tutorial | 3 | 50 | Basic loop mechanics |
| easy | 5 | 100 | Throughput awareness |
| **medium**| 5 | 150 | Prestige climbing + domain specialisation |
| **hard** | 7 | 200 | Precise ETA reasoning |
| nightmare | 8 | 300 | Sustained perfection under payroll pressure |
| fast_test | (varies) | (varies) | Quick validation (~50 turns) |
Default eval runs **fast_test + medium + hard** × 3 seeds = 9 runs.
### Scoring
```
composite = 0.5 × survival + 0.5 × normalised_funds
```
- **Survival** (binary): Did the company avoid bankruptcy?
- **Normalised funds** (0.0-1.0): Log-scale relative to initial $250K capital
## Configuration
Key fields in `default.yaml`:
| Field | Default | Description |
|-------|---------|-------------|
| `presets` | `["fast_test", "medium", "hard"]` | Which presets to evaluate |
| `seeds` | `[1, 2, 3]` | RNG seeds per preset |
| `max_agent_turns` | 200 | Max LLM calls per run |
| `run_timeout` | 3600 | Wall-clock timeout per run (seconds) |
| `survival_weight` | 0.5 | Weight of survival in composite score |
| `funds_weight` | 0.5 | Weight of normalised funds in composite |
| `horizon_years` | null | Override horizon (null = auto from preset) |
## Cost & Time Estimates
Each run is 100-500 LLM turns. Approximate costs per run at typical API rates:
| Preset | Turns | Time | Est. Cost |
|--------|-------|------|-----------|
| fast_test | ~50 | 5-10 min | $1-5 |
| medium | ~200 | 20-40 min | $5-15 |
| hard | ~300 | 30-60 min | $10-25 |
Full default eval (9 runs): ~3-6 hours, $50-200 depending on model.
## References
- [collinear-ai/yc-bench](https://github.com/collinear-ai/yc-bench) — Official repository
- [Collinear AI](https://collinear.ai/) — Company behind yc-bench
- [TerminalBench2](../terminalbench_2/) — Per-task coding benchmark (complementary)

View File

@@ -1,43 +0,0 @@
# YC-Bench Evaluation -- Default Configuration
#
# Long-horizon agent benchmark: agent plays CEO of an AI startup over
# a simulated 1-3 year run, interacting via yc-bench CLI subcommands.
#
# Requires: pip install "hermes-agent[yc-bench]"
#
# Usage:
# python environments/benchmarks/yc_bench/yc_bench_env.py evaluate \
# --config environments/benchmarks/yc_bench/default.yaml
#
# # Override model:
# python environments/benchmarks/yc_bench/yc_bench_env.py evaluate \
# --config environments/benchmarks/yc_bench/default.yaml \
# --openai.model_name anthropic/claude-opus-4-20250514
env:
enabled_toolsets: ["terminal"]
max_agent_turns: 200
max_token_length: 32000
agent_temperature: 0.0
terminal_backend: "local"
terminal_timeout: 60
presets: ["fast_test", "medium", "hard"]
seeds: [1, 2, 3]
run_timeout: 3600 # 60 min wall-clock per run, auto-FAIL if exceeded
survival_weight: 0.5 # weight of binary survival in composite score
funds_weight: 0.5 # weight of normalised final funds in composite score
db_dir: "/tmp/yc_bench_dbs"
company_name: "BenchCo"
start_date: "01/01/2025" # MM/DD/YYYY (yc-bench convention)
tokenizer_name: "NousResearch/Hermes-3-Llama-3.1-8B"
use_wandb: true
wandb_name: "yc-bench"
ensure_scores_are_not_same: false
data_dir_to_save_evals: "environments/benchmarks/evals/yc-bench"
openai:
base_url: "https://openrouter.ai/api/v1"
model_name: "anthropic/claude-sonnet-4.6"
server_type: "openai"
health_check: false
# api_key loaded from OPENROUTER_API_KEY in .env

View File

@@ -1,34 +0,0 @@
#!/bin/bash
# YC-Bench Evaluation
#
# Requires: pip install "hermes-agent[yc-bench]"
#
# Run from repo root:
# bash environments/benchmarks/yc_bench/run_eval.sh
#
# Override model:
# bash environments/benchmarks/yc_bench/run_eval.sh \
# --openai.model_name anthropic/claude-opus-4-20250514
#
# Run a single preset:
# bash environments/benchmarks/yc_bench/run_eval.sh \
# --env.presets '["fast_test"]' --env.seeds '[1]'
set -euo pipefail
mkdir -p logs evals/yc-bench
LOG_FILE="logs/yc_bench_$(date +%Y%m%d_%H%M%S).log"
echo "YC-Bench Evaluation"
echo "Log: $LOG_FILE"
echo ""
PYTHONUNBUFFERED=1 LOGLEVEL="${LOGLEVEL:-INFO}" \
python environments/benchmarks/yc_bench/yc_bench_env.py evaluate \
--config environments/benchmarks/yc_bench/default.yaml \
"$@" \
2>&1 | tee "$LOG_FILE"
echo ""
echo "Log saved to: $LOG_FILE"

View File

@@ -1,847 +0,0 @@
"""
YCBenchEvalEnv -- YC-Bench Long-Horizon Agent Benchmark Environment
Evaluates agentic LLMs on YC-Bench: a deterministic, long-horizon benchmark
where the agent acts as CEO of an AI startup over a simulated 1-3 year run.
The agent manages cash flow, employees, tasks, and prestige across 4 domains,
interacting exclusively via CLI subprocess calls against a SQLite-backed
discrete-event simulation.
Unlike TerminalBench2 (per-task binary pass/fail), YC-Bench measures sustained
multi-turn strategic coherence -- whether an agent can manage compounding
decisions over hundreds of turns without going bankrupt.
This is an eval-only environment. Run via:
python environments/benchmarks/yc_bench/yc_bench_env.py evaluate \
--config environments/benchmarks/yc_bench/default.yaml
The evaluate flow:
1. setup() -- Verifies yc-bench installed, builds eval matrix (preset x seed)
2. evaluate() -- Iterates over all runs sequentially through:
a. rollout_and_score_eval() -- Per-run agent loop
- Initialises a fresh yc-bench simulation via `sim init` (NOT `run`)
- Runs HermesAgentLoop with terminal tool only
- Reads final SQLite DB to extract score
- Returns survival (0/1) + normalised funds score
b. Aggregates per-preset and overall metrics
c. Logs results via evaluate_log() and wandb
Key features:
- CLI-only interface: agent calls yc-bench subcommands via terminal tool
- Deterministic: same seed + preset = same world (SHA256-based RNG)
- Multi-dimensional scoring: survival + normalised final funds
- Per-preset difficulty breakdown in results
- Isolated SQLite DB per run (no cross-run state leakage)
Requires: pip install hermes-agent[yc-bench]
"""
import asyncio
import datetime
import json
import logging
import math
import os
import sqlite3
import subprocess
import sys
import threading
import time
import uuid
from collections import defaultdict
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
_repo_root = Path(__file__).resolve().parent.parent.parent.parent
if str(_repo_root) not in sys.path:
sys.path.insert(0, str(_repo_root))
from pydantic import Field
from atroposlib.envs.base import EvalHandlingEnum
from atroposlib.envs.server_handling.server_manager import APIServerConfig
from environments.agent_loop import HermesAgentLoop
from environments.hermes_base_env import HermesAgentBaseEnv, HermesAgentEnvConfig
logger = logging.getLogger(__name__)
# =============================================================================
# System prompt
# =============================================================================
YC_BENCH_SYSTEM_PROMPT = """\
You are the autonomous CEO of an early-stage AI startup in a deterministic
business simulation. You manage the company exclusively through the `yc-bench`
CLI tool. Your primary goal is to **survive** until the simulation horizon ends
without going bankrupt, while **maximising final funds**.
## Simulation Mechanics
- **Funds**: You start with $250,000 seed capital. Revenue comes from completing
tasks. Rewards scale with your prestige: `base × (1 + scale × (prestige 1))`.
- **Domains**: There are 4 skill domains: **research**, **inference**,
**data_environment**, and **training**. Each has its own prestige level
(1.0-10.0). Higher prestige unlocks better-paying tasks.
- **Employees**: You have employees (Junior/Mid/Senior) with domain-specific
skill rates. **Throughput splits**: `effective_rate = base_rate / N` where N
is the number of active tasks assigned to that employee. Focus beats breadth.
- **Payroll**: Deducted automatically on the first business day of each month.
Running out of funds = bankruptcy = game over.
- **Time**: The simulation runs on business days (Mon-Fri), 09:00-18:00.
Time only advances when you call `yc-bench sim resume`.
## Task Lifecycle
1. Browse market tasks with `market browse`
2. Accept a task with `task accept` (this sets its deadline)
3. Assign employees with `task assign`
4. Dispatch with `task dispatch` to start work
5. Call `sim resume` to advance time and let employees make progress
6. Tasks complete when all domain requirements are fulfilled
**Penalties for failure vary by difficulty preset.** Completing a task on time
earns full reward + prestige gain. Missing a deadline or cancelling a task
incurs prestige penalties -- cancelling is always more costly than letting a
task fail, so cancel only as a last resort.
## CLI Commands
### Observe
- `yc-bench company status` -- funds, prestige, runway
- `yc-bench employee list` -- skills, salary, active tasks
- `yc-bench market browse [--domain D] [--required-prestige-lte N]` -- available tasks
- `yc-bench task list [--status active|planned]` -- your tasks
- `yc-bench task inspect --task-id UUID` -- progress, deadline, assignments
- `yc-bench finance ledger [--category monthly_payroll|task_reward]` -- transaction history
- `yc-bench report monthly` -- monthly P&L
### Act
- `yc-bench task accept --task-id UUID` -- accept from market
- `yc-bench task assign --task-id UUID --employee-id UUID` -- assign employee
- `yc-bench task dispatch --task-id UUID` -- start work (needs >=1 assignment)
- `yc-bench task cancel --task-id UUID --reason "text"` -- cancel (prestige penalty)
- `yc-bench sim resume` -- advance simulation clock
### Memory (persists across context truncation)
- `yc-bench scratchpad read` -- read your persistent notes
- `yc-bench scratchpad write --content "text"` -- overwrite notes
- `yc-bench scratchpad append --content "text"` -- append to notes
- `yc-bench scratchpad clear` -- clear notes
## Strategy Guidelines
1. **Specialise in 2-3 domains** to climb the prestige ladder faster and unlock
high-reward tasks. Don't spread thin across all 4 domains early on.
2. **Focus employees** -- assigning one employee to many tasks halves their
throughput per additional task. Keep assignments concentrated.
3. **Use the scratchpad** to track your strategy, upcoming deadlines, and
employee assignments. This persists even if conversation context is truncated.
4. **Monitor runway** -- always know how many months of payroll you can cover.
Accept high-reward tasks before payroll dates.
5. **Don't over-accept** -- taking too many tasks and missing deadlines cascades
into prestige loss, locking you out of profitable contracts.
6. Use `finance ledger` and `report monthly` to track revenue trends.
## Your Turn
Each turn:
1. Call `yc-bench company status` and `yc-bench task list` to orient yourself.
2. Check for completed tasks and pending deadlines.
3. Browse market for profitable tasks within your prestige level.
4. Accept, assign, and dispatch tasks strategically.
5. Call `yc-bench sim resume` to advance time.
6. Repeat until the simulation ends.
Think step by step before acting."""
# Starting funds in cents ($250,000)
INITIAL_FUNDS_CENTS = 25_000_000
# Default horizon per preset (years)
_PRESET_HORIZONS = {
"tutorial": 1,
"easy": 1,
"medium": 1,
"hard": 1,
"nightmare": 1,
"fast_test": 1,
"default": 3,
"high_reward": 1,
}
# =============================================================================
# Configuration
# =============================================================================
class YCBenchEvalConfig(HermesAgentEnvConfig):
"""
Configuration for the YC-Bench evaluation environment.
Extends HermesAgentEnvConfig with YC-Bench-specific settings for
preset selection, seed control, scoring, and simulation parameters.
"""
presets: List[str] = Field(
default=["fast_test", "medium", "hard"],
description="YC-Bench preset names to evaluate.",
)
seeds: List[int] = Field(
default=[1, 2, 3],
description="Random seeds -- each preset x seed = one run.",
)
run_timeout: int = Field(
default=3600,
description="Maximum wall-clock seconds per run. Default 60 minutes.",
)
survival_weight: float = Field(
default=0.5,
description="Weight of survival (0/1) in composite score.",
)
funds_weight: float = Field(
default=0.5,
description="Weight of normalised final funds in composite score.",
)
db_dir: str = Field(
default="/tmp/yc_bench_dbs",
description="Directory for per-run SQLite databases.",
)
horizon_years: Optional[int] = Field(
default=None,
description=(
"Simulation horizon in years. If None (default), inferred from "
"preset name (1 year for most, 3 for 'default')."
),
)
company_name: str = Field(
default="BenchCo",
description="Name of the simulated company.",
)
start_date: str = Field(
default="01/01/2025",
description="Simulation start date in MM/DD/YYYY format (yc-bench convention).",
)
# =============================================================================
# Scoring helpers
# =============================================================================
def _read_final_score(db_path: str) -> Dict[str, Any]:
"""
Read final game state from a YC-Bench SQLite database.
Returns dict with final_funds_cents (int), survived (bool),
terminal_reason (str).
Note: yc-bench table names are plural -- 'companies' not 'company',
'sim_events' not 'simulation_log'.
"""
if not os.path.exists(db_path):
logger.warning("DB not found at %s", db_path)
return {
"final_funds_cents": 0,
"survived": False,
"terminal_reason": "db_missing",
}
conn = None
try:
conn = sqlite3.connect(db_path)
cur = conn.cursor()
# Read final funds from the 'companies' table
cur.execute("SELECT funds_cents FROM companies LIMIT 1")
row = cur.fetchone()
funds = row[0] if row else 0
# Determine terminal reason from 'sim_events' table
terminal_reason = "unknown"
try:
cur.execute(
"SELECT event_type FROM sim_events "
"WHERE event_type IN ('bankruptcy', 'horizon_end') "
"ORDER BY scheduled_at DESC LIMIT 1"
)
event_row = cur.fetchone()
if event_row:
terminal_reason = event_row[0]
except sqlite3.OperationalError:
# Table may not exist if simulation didn't progress
pass
survived = funds >= 0 and terminal_reason != "bankruptcy"
return {
"final_funds_cents": funds,
"survived": survived,
"terminal_reason": terminal_reason,
}
except Exception as e:
logger.error("Failed to read DB %s: %s", db_path, e)
return {
"final_funds_cents": 0,
"survived": False,
"terminal_reason": f"db_error: {e}",
}
finally:
if conn:
conn.close()
def _compute_composite_score(
final_funds_cents: int,
survived: bool,
survival_weight: float = 0.5,
funds_weight: float = 0.5,
initial_funds_cents: int = INITIAL_FUNDS_CENTS,
) -> float:
"""
Compute composite score from survival and final funds.
Score = survival_weight * survival_score
+ funds_weight * normalised_funds_score
Normalised funds uses log-scale relative to initial capital:
- funds <= 0: 0.0
- funds == initial: ~0.15
- funds == 10x: ~0.52
- funds == 100x: 1.0
"""
survival_score = 1.0 if survived else 0.0
if final_funds_cents <= 0:
funds_score = 0.0
else:
max_ratio = 100.0
ratio = final_funds_cents / max(initial_funds_cents, 1)
funds_score = min(math.log1p(ratio) / math.log1p(max_ratio), 1.0)
return survival_weight * survival_score + funds_weight * funds_score
# =============================================================================
# Main Environment
# =============================================================================
class YCBenchEvalEnv(HermesAgentBaseEnv):
"""
YC-Bench long-horizon agent benchmark environment (eval-only).
Each eval item is a (preset, seed) pair. The environment initialises the
simulation via ``yc-bench sim init`` (NOT ``yc-bench run`` which would start
a competing built-in agent loop). The HermesAgentLoop then drives the
interaction by calling individual yc-bench CLI commands via the terminal tool.
After the agent loop ends, the SQLite DB is read to extract the final score.
Scoring:
composite = 0.5 * survival + 0.5 * normalised_funds
"""
name = "yc-bench"
env_config_cls = YCBenchEvalConfig
@classmethod
def config_init(cls) -> Tuple[YCBenchEvalConfig, List[APIServerConfig]]:
env_config = YCBenchEvalConfig(
enabled_toolsets=["terminal"],
disabled_toolsets=None,
distribution=None,
max_agent_turns=200,
max_token_length=32000,
agent_temperature=0.0,
system_prompt=YC_BENCH_SYSTEM_PROMPT,
terminal_backend="local",
terminal_timeout=60,
presets=["fast_test", "medium", "hard"],
seeds=[1, 2, 3],
run_timeout=3600,
survival_weight=0.5,
funds_weight=0.5,
db_dir="/tmp/yc_bench_dbs",
eval_handling=EvalHandlingEnum.STOP_TRAIN,
group_size=1,
steps_per_eval=1,
total_steps=1,
tokenizer_name="NousResearch/Hermes-3-Llama-3.1-8B",
use_wandb=True,
wandb_name="yc-bench",
ensure_scores_are_not_same=False,
)
server_configs = [
APIServerConfig(
base_url="https://openrouter.ai/api/v1",
model_name="anthropic/claude-sonnet-4.6",
server_type="openai",
api_key=os.getenv("OPENROUTER_API_KEY", ""),
health_check=False,
)
]
return env_config, server_configs
# =========================================================================
# Setup
# =========================================================================
async def setup(self):
"""Verify yc-bench is installed and build the eval matrix."""
# Verify yc-bench CLI is available
try:
result = subprocess.run(
["yc-bench", "--help"], capture_output=True, text=True, timeout=10
)
if result.returncode != 0:
raise FileNotFoundError
except (FileNotFoundError, subprocess.TimeoutExpired):
raise RuntimeError(
"yc-bench CLI not found. Install with:\n"
' pip install "hermes-agent[yc-bench]"\n'
"Or: git clone https://github.com/collinear-ai/yc-bench "
"&& cd yc-bench && pip install -e ."
)
print("yc-bench CLI verified.")
# Build eval matrix: preset x seed
self.all_eval_items = [
{"preset": preset, "seed": seed}
for preset in self.config.presets
for seed in self.config.seeds
]
self.iter = 0
os.makedirs(self.config.db_dir, exist_ok=True)
self.eval_metrics: List[Tuple[str, float]] = []
# Streaming JSONL log for crash-safe result persistence
log_dir = os.path.join(os.path.dirname(__file__), "logs")
os.makedirs(log_dir, exist_ok=True)
run_ts = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
self._streaming_path = os.path.join(log_dir, f"samples_{run_ts}.jsonl")
self._streaming_file = open(self._streaming_path, "w")
self._streaming_lock = threading.Lock()
print(f"\nYC-Bench eval matrix: {len(self.all_eval_items)} runs")
for item in self.all_eval_items:
print(f" preset={item['preset']!r} seed={item['seed']}")
print(f"Streaming results to: {self._streaming_path}\n")
def _save_result(self, result: Dict[str, Any]):
"""Write a single run result to the streaming JSONL file immediately."""
if not hasattr(self, "_streaming_file") or self._streaming_file.closed:
return
with self._streaming_lock:
self._streaming_file.write(
json.dumps(result, ensure_ascii=False, default=str) + "\n"
)
self._streaming_file.flush()
# =========================================================================
# Training pipeline stubs (eval-only -- not used)
# =========================================================================
async def get_next_item(self):
item = self.all_eval_items[self.iter % len(self.all_eval_items)]
self.iter += 1
return item
def format_prompt(self, item: Dict[str, Any]) -> str:
preset = item["preset"]
seed = item["seed"]
return (
f"A new YC-Bench simulation has been initialized "
f"(preset='{preset}', seed={seed}).\n"
f"Your company '{self.config.company_name}' is ready.\n\n"
"Begin by calling:\n"
"1. `yc-bench company status` -- see your starting funds and prestige\n"
"2. `yc-bench employee list` -- see your team and their skills\n"
"3. `yc-bench market browse --required-prestige-lte 1` -- find tasks "
"you can take\n\n"
"Then accept 2-3 tasks, assign employees, dispatch them, and call "
"`yc-bench sim resume` to advance time. Repeat this loop until the "
"simulation ends (horizon reached or bankruptcy)."
)
async def compute_reward(self, item, result, ctx) -> float:
return 0.0
async def collect_trajectories(self, item):
return None, []
async def score(self, rollout_group_data):
return None
# =========================================================================
# Per-run evaluation
# =========================================================================
async def rollout_and_score_eval(self, eval_item: Dict[str, Any]) -> Dict:
"""
Evaluate a single (preset, seed) run.
1. Sets DATABASE_URL and YC_BENCH_EXPERIMENT env vars
2. Initialises the simulation via ``yc-bench sim init`` (NOT ``run``)
3. Runs HermesAgentLoop with terminal tool
4. Reads SQLite DB to compute final score
5. Returns result dict with survival, funds, and composite score
"""
preset = eval_item["preset"]
seed = eval_item["seed"]
run_id = str(uuid.uuid4())[:8]
run_key = f"{preset}_seed{seed}_{run_id}"
from tqdm import tqdm
tqdm.write(f" [START] preset={preset!r} seed={seed} (run_id={run_id})")
run_start = time.time()
# Isolated DB per run -- prevents cross-run state leakage
db_path = os.path.join(self.config.db_dir, f"yc_bench_{run_key}.db")
os.environ["DATABASE_URL"] = f"sqlite:///{db_path}"
os.environ["YC_BENCH_EXPERIMENT"] = preset
# Determine horizon: explicit config override > preset lookup > default 1
horizon = self.config.horizon_years or _PRESET_HORIZONS.get(preset, 1)
try:
# ----------------------------------------------------------
# Step 1: Initialise the simulation via CLI
# IMPORTANT: We use `sim init`, NOT `yc-bench run`.
# `yc-bench run` starts yc-bench's own LLM agent loop (via
# LiteLLM), which would compete with our HermesAgentLoop.
# `sim init` just sets up the world and returns.
# ----------------------------------------------------------
init_cmd = [
"yc-bench", "sim", "init",
"--seed", str(seed),
"--start-date", self.config.start_date,
"--company-name", self.config.company_name,
"--horizon-years", str(horizon),
]
init_result = subprocess.run(
init_cmd, capture_output=True, text=True, timeout=30,
)
if init_result.returncode != 0:
error_msg = (init_result.stderr or init_result.stdout).strip()
raise RuntimeError(f"yc-bench sim init failed: {error_msg}")
tqdm.write(f" Simulation initialized (horizon={horizon}yr)")
# ----------------------------------------------------------
# Step 2: Run the HermesAgentLoop
# ----------------------------------------------------------
tools, valid_names = self._resolve_tools_for_group()
messages: List[Dict[str, Any]] = [
{"role": "system", "content": YC_BENCH_SYSTEM_PROMPT},
{"role": "user", "content": self.format_prompt(eval_item)},
]
agent = HermesAgentLoop(
server=self.server,
tool_schemas=tools,
valid_tool_names=valid_names,
max_turns=self.config.max_agent_turns,
task_id=run_id,
temperature=self.config.agent_temperature,
max_tokens=self.config.max_token_length,
extra_body=self.config.extra_body,
)
result = await agent.run(messages)
# ----------------------------------------------------------
# Step 3: Read final score from the simulation DB
# ----------------------------------------------------------
score_data = _read_final_score(db_path)
final_funds = score_data["final_funds_cents"]
survived = score_data["survived"]
terminal_reason = score_data["terminal_reason"]
composite = _compute_composite_score(
final_funds_cents=final_funds,
survived=survived,
survival_weight=self.config.survival_weight,
funds_weight=self.config.funds_weight,
)
elapsed = time.time() - run_start
status = "SURVIVED" if survived else "BANKRUPT"
if final_funds >= 0:
funds_str = f"${final_funds / 100:,.0f}"
else:
funds_str = f"-${abs(final_funds) / 100:,.0f}"
tqdm.write(
f" [{status}] preset={preset!r} seed={seed} "
f"funds={funds_str} score={composite:.3f} "
f"turns={result.turns_used} ({elapsed:.0f}s)"
)
out = {
"preset": preset,
"seed": seed,
"survived": survived,
"final_funds_cents": final_funds,
"final_funds_usd": final_funds / 100,
"terminal_reason": terminal_reason,
"composite_score": composite,
"turns_used": result.turns_used,
"finished_naturally": result.finished_naturally,
"elapsed_seconds": elapsed,
"db_path": db_path,
"messages": result.messages,
}
self._save_result(out)
return out
except Exception as e:
elapsed = time.time() - run_start
logger.error("Run %s failed: %s", run_key, e, exc_info=True)
tqdm.write(
f" [ERROR] preset={preset!r} seed={seed}: {e} ({elapsed:.0f}s)"
)
out = {
"preset": preset,
"seed": seed,
"survived": False,
"final_funds_cents": 0,
"final_funds_usd": 0.0,
"terminal_reason": f"error: {e}",
"composite_score": 0.0,
"turns_used": 0,
"error": str(e),
"elapsed_seconds": elapsed,
}
self._save_result(out)
return out
# =========================================================================
# Evaluate
# =========================================================================
async def _run_with_timeout(self, item: Dict[str, Any]) -> Dict:
"""Wrap a single rollout with a wall-clock timeout."""
preset = item["preset"]
seed = item["seed"]
try:
return await asyncio.wait_for(
self.rollout_and_score_eval(item),
timeout=self.config.run_timeout,
)
except asyncio.TimeoutError:
from tqdm import tqdm
tqdm.write(
f" [TIMEOUT] preset={preset!r} seed={seed} "
f"(exceeded {self.config.run_timeout}s)"
)
out = {
"preset": preset,
"seed": seed,
"survived": False,
"final_funds_cents": 0,
"final_funds_usd": 0.0,
"terminal_reason": f"timeout ({self.config.run_timeout}s)",
"composite_score": 0.0,
"turns_used": 0,
"error": "timeout",
}
self._save_result(out)
return out
async def evaluate(self, *args, **kwargs) -> None:
"""
Run YC-Bench evaluation over all (preset, seed) combinations.
Runs sequentially -- each run is 100-500 turns, parallelising would
be prohibitively expensive and cause env var conflicts.
"""
start_time = time.time()
from tqdm import tqdm
# --- tqdm-compatible logging handler (TB2 pattern) ---
class _TqdmHandler(logging.Handler):
def emit(self, record):
try:
tqdm.write(self.format(record))
except Exception:
self.handleError(record)
root = logging.getLogger()
handler = _TqdmHandler()
handler.setFormatter(
logging.Formatter("%(levelname)s %(name)s: %(message)s")
)
root.handlers = [handler]
for noisy in ("httpx", "openai"):
logging.getLogger(noisy).setLevel(logging.WARNING)
# --- Print config summary ---
print(f"\n{'='*60}")
print("Starting YC-Bench Evaluation")
print(f"{'='*60}")
print(f" Presets: {self.config.presets}")
print(f" Seeds: {self.config.seeds}")
print(f" Total runs: {len(self.all_eval_items)}")
print(f" Max turns/run: {self.config.max_agent_turns}")
print(f" Run timeout: {self.config.run_timeout}s")
print(f"{'='*60}\n")
results = []
pbar = tqdm(
total=len(self.all_eval_items), desc="YC-Bench", dynamic_ncols=True
)
try:
for item in self.all_eval_items:
result = await self._run_with_timeout(item)
results.append(result)
survived_count = sum(1 for r in results if r.get("survived"))
pbar.set_postfix_str(
f"survived={survived_count}/{len(results)}"
)
pbar.update(1)
except (KeyboardInterrupt, asyncio.CancelledError):
tqdm.write("\n[INTERRUPTED] Stopping evaluation...")
pbar.close()
try:
from tools.terminal_tool import cleanup_all_environments
cleanup_all_environments()
except Exception:
pass
if hasattr(self, "_streaming_file") and not self._streaming_file.closed:
self._streaming_file.close()
return
pbar.close()
end_time = time.time()
# --- Compute metrics ---
valid = [r for r in results if r is not None]
if not valid:
print("Warning: No valid results.")
return
total = len(valid)
survived_total = sum(1 for r in valid if r.get("survived"))
survival_rate = survived_total / total if total else 0.0
avg_score = (
sum(r.get("composite_score", 0) for r in valid) / total
if total
else 0.0
)
preset_results: Dict[str, List[Dict]] = defaultdict(list)
for r in valid:
preset_results[r["preset"]].append(r)
eval_metrics = {
"eval/survival_rate": survival_rate,
"eval/avg_composite_score": avg_score,
"eval/total_runs": total,
"eval/survived_runs": survived_total,
"eval/evaluation_time_seconds": end_time - start_time,
}
for preset, items in sorted(preset_results.items()):
ps = sum(1 for r in items if r.get("survived"))
pt = len(items)
pa = (
sum(r.get("composite_score", 0) for r in items) / pt
if pt
else 0
)
key = preset.replace("-", "_")
eval_metrics[f"eval/survival_rate_{key}"] = ps / pt if pt else 0
eval_metrics[f"eval/avg_score_{key}"] = pa
self.eval_metrics = [(k, v) for k, v in eval_metrics.items()]
# --- Print summary ---
print(f"\n{'='*60}")
print("YC-Bench Evaluation Results")
print(f"{'='*60}")
print(
f"Overall survival rate: {survival_rate:.1%} "
f"({survived_total}/{total})"
)
print(f"Average composite score: {avg_score:.4f}")
print(f"Evaluation time: {end_time - start_time:.1f}s")
print("\nPer-preset breakdown:")
for preset, items in sorted(preset_results.items()):
ps = sum(1 for r in items if r.get("survived"))
pt = len(items)
pa = (
sum(r.get("composite_score", 0) for r in items) / pt
if pt
else 0
)
print(f" {preset}: {ps}/{pt} survived avg_score={pa:.4f}")
for r in items:
status = "SURVIVED" if r.get("survived") else "BANKRUPT"
funds = r.get("final_funds_usd", 0)
print(
f" seed={r['seed']} [{status}] "
f"${funds:,.0f} "
f"score={r.get('composite_score', 0):.3f}"
)
print(f"{'='*60}\n")
# --- Log results ---
samples = [
{k: v for k, v in r.items() if k != "messages"} for r in valid
]
try:
await self.evaluate_log(
metrics=eval_metrics,
samples=samples,
start_time=start_time,
end_time=end_time,
generation_parameters={
"temperature": self.config.agent_temperature,
"max_tokens": self.config.max_token_length,
"max_agent_turns": self.config.max_agent_turns,
},
)
except Exception as e:
print(f"Error logging results: {e}")
# --- Cleanup (TB2 pattern) ---
if hasattr(self, "_streaming_file") and not self._streaming_file.closed:
self._streaming_file.close()
print(f"Results saved to: {self._streaming_path}")
try:
from tools.terminal_tool import cleanup_all_environments
cleanup_all_environments()
except Exception:
pass
try:
from environments.agent_loop import _tool_executor
_tool_executor.shutdown(wait=False, cancel_futures=True)
except Exception:
pass
# =========================================================================
# Wandb logging
# =========================================================================
async def wandb_log(self, wandb_metrics: Optional[Dict] = None):
"""Log YC-Bench-specific metrics to wandb."""
if wandb_metrics is None:
wandb_metrics = {}
for k, v in self.eval_metrics:
wandb_metrics[k] = v
self.eval_metrics = []
await super().wandb_log(wandb_metrics)
if __name__ == "__main__":
YCBenchEvalEnv.cli()

View File

@@ -229,12 +229,6 @@ class HermesAgentBaseEnv(BaseEnv):
from environments.agent_loop import resize_tool_pool
resize_tool_pool(config.tool_pool_size)
# Set tool_parser on the ServerManager so ManagedServer uses it
# for bidirectional tool call translation (raw text ↔ OpenAI tool_calls).
if hasattr(self.server, 'tool_parser'):
self.server.tool_parser = config.tool_call_parser
print(f"🔧 Tool parser: {config.tool_call_parser}")
# Current group's resolved tools (set in collect_trajectories)
self._current_group_tools: Optional[Tuple[List[Dict], Set[str]]] = None
@@ -472,14 +466,22 @@ class HermesAgentBaseEnv(BaseEnv):
# Run the agent loop
result: AgentResult
if self._use_managed_server():
# Phase 2: ManagedServer with ToolCallTranslator -- exact tokens + logprobs
# tool_parser is set on ServerManager in __init__ and passed through
# to ManagedServer, which uses ToolCallTranslator for bidirectional
# translation between raw text and OpenAI tool_calls.
# Phase 2: ManagedServer with parser -- exact tokens + logprobs
# Load the tool call parser from registry based on config
from environments.tool_call_parsers import get_parser
try:
tc_parser = get_parser(self.config.tool_call_parser)
except KeyError:
logger.warning(
"Tool call parser '%s' not found, falling back to 'hermes'",
self.config.tool_call_parser,
)
tc_parser = get_parser("hermes")
try:
async with self.server.managed_server(
tokenizer=self.tokenizer,
preserve_think_blocks=bool(self.config.thinking_mode),
tool_call_parser=tc_parser,
) as managed:
agent = HermesAgentLoop(
server=managed,

View File

@@ -114,27 +114,11 @@ def _patch_swerex_modal():
self._worker = _AsyncWorker()
self._worker.start()
# Pre-build a modal.Image with pip fix for Modal's legacy image builder.
# Modal requires `python -m pip` to work during image build, but some
# task images (e.g., TBLite's broken-python) have intentionally broken pip.
# Fix: remove stale pip dist-info and reinstall via ensurepip before Modal
# tries to use it. This is a no-op for images where pip already works.
import modal as _modal
image_spec = self.config.image
if isinstance(image_spec, str):
image_spec = _modal.Image.from_registry(
image_spec,
setup_dockerfile_commands=[
"RUN rm -rf /usr/local/lib/python*/site-packages/pip* 2>/dev/null; "
"python -m ensurepip --upgrade --default-pip 2>/dev/null || true",
],
)
# Create AND start the deployment entirely on the worker's loop/thread
# so all gRPC channels and async state are bound to that loop
async def _create_and_start():
deployment = ModalDeployment(
image=image_spec,
image=self.config.image,
startup_timeout=self.config.startup_timeout,
runtime_timeout=self.config.runtime_timeout,
deployment_timeout=self.config.deployment_timeout,

View File

@@ -1,718 +0,0 @@
"""
WebResearchEnv — RL Environment for Multi-Step Web Research
============================================================
Trains models to do accurate, efficient, multi-source web research.
Reward signals:
- Answer correctness (LLM judge, 0.01.0)
- Source diversity (used ≥2 distinct domains)
- Efficiency (penalizes excessive tool calls)
- Tool usage (bonus for actually using web tools)
Dataset: FRAMES benchmark (Google, 2024) — multi-hop factual questions
HuggingFace: google/frames-benchmark
Fallback: built-in sample questions (no HF token needed)
Usage:
# Phase 1 (OpenAI-compatible server)
python environments/web_research_env.py serve \\
--openai.base_url http://localhost:8000/v1 \\
--openai.model_name YourModel \\
--openai.server_type openai
# Process mode (offline data generation)
python environments/web_research_env.py process \\
--env.data_path_to_save_groups data/web_research.jsonl
# Standalone eval
python environments/web_research_env.py evaluate \\
--openai.base_url http://localhost:8000/v1 \\
--openai.model_name YourModel
Built by: github.com/jackx707
Inspired by: GroceryMind — production Hermes agent doing live web research
across German grocery stores (firecrawl + hermes-agent)
"""
from __future__ import annotations
import asyncio
import json
import logging
import os
import random
import re
import sys
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
from urllib.parse import urlparse
from pydantic import Field
# Ensure hermes-agent root is on path
_repo_root = Path(__file__).resolve().parent.parent
if str(_repo_root) not in sys.path:
sys.path.insert(0, str(_repo_root))
# ---------------------------------------------------------------------------
# Optional HuggingFace datasets import
# ---------------------------------------------------------------------------
try:
from datasets import load_dataset
HF_AVAILABLE = True
except ImportError:
HF_AVAILABLE = False
from atroposlib.envs.base import ScoredDataGroup
from atroposlib.envs.server_handling.server_manager import APIServerConfig
from atroposlib.type_definitions import Item
from environments.hermes_base_env import HermesAgentBaseEnv, HermesAgentEnvConfig
from environments.agent_loop import AgentResult
from environments.tool_context import ToolContext
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Fallback sample dataset (used when HuggingFace is unavailable)
# Multi-hop questions requiring real web search to answer.
# ---------------------------------------------------------------------------
SAMPLE_QUESTIONS = [
{
"question": "What is the current population of the capital city of the country that won the 2022 FIFA World Cup?",
"answer": "Buenos Aires has approximately 3 million people in the city proper, or around 15 million in the greater metro area.",
"difficulty": "medium",
"hops": 2,
},
{
"question": "Who is the CEO of the company that makes the most widely used open-source container orchestration platform?",
"answer": "The Linux Foundation oversees Kubernetes. CNCF (Cloud Native Computing Foundation) is the specific body — it does not have a traditional CEO but has an executive director.",
"difficulty": "medium",
"hops": 2,
},
{
"question": "What programming language was used to write the original version of the web framework used by Instagram?",
"answer": "Django, which Instagram was built on, is written in Python.",
"difficulty": "easy",
"hops": 2,
},
{
"question": "In what year was the university founded where the inventor of the World Wide Web currently holds a professorship?",
"answer": "Tim Berners-Lee holds a professorship at MIT (founded 1861) and the University of Southampton (founded 1952).",
"difficulty": "hard",
"hops": 3,
},
{
"question": "What is the latest stable version of the programming language that ranks #1 on the TIOBE index as of this year?",
"answer": "Python is currently #1 on TIOBE. The latest stable version should be verified via the official python.org site.",
"difficulty": "medium",
"hops": 2,
},
{
"question": "How many employees does the parent company of Instagram have?",
"answer": "Meta Platforms (parent of Instagram) employs approximately 70,000+ people as of recent reports.",
"difficulty": "medium",
"hops": 2,
},
{
"question": "What is the current interest rate set by the central bank of the country where the Eiffel Tower is located?",
"answer": "The European Central Bank sets rates for France/eurozone. The current rate should be verified — it has changed frequently in 2023-2025.",
"difficulty": "hard",
"hops": 2,
},
{
"question": "Which company acquired the startup founded by the creator of Oculus VR?",
"answer": "Palmer Luckey founded Oculus VR, which was acquired by Facebook (now Meta). He later founded Anduril Industries.",
"difficulty": "medium",
"hops": 2,
},
{
"question": "What is the market cap of the company that owns the most popular search engine in Russia?",
"answer": "Yandex (now split into separate entities after 2024 restructuring). Current market cap should be verified via financial sources.",
"difficulty": "hard",
"hops": 2,
},
{
"question": "What was the GDP growth rate of the country that hosted the most recent Summer Olympics?",
"answer": "Paris, France hosted the 2024 Summer Olympics. France's recent GDP growth should be verified via World Bank or IMF data.",
"difficulty": "hard",
"hops": 2,
},
]
# ---------------------------------------------------------------------------
# Configuration
# ---------------------------------------------------------------------------
class WebResearchEnvConfig(HermesAgentEnvConfig):
"""Configuration for the web research RL environment."""
# Reward weights
correctness_weight: float = Field(
default=0.6,
description="Weight for answer correctness in reward (LLM judge score).",
)
tool_usage_weight: float = Field(
default=0.2,
description="Weight for tool usage signal (did the model actually use web tools?).",
)
efficiency_weight: float = Field(
default=0.2,
description="Weight for efficiency signal (penalizes excessive tool calls).",
)
diversity_bonus: float = Field(
default=0.1,
description="Bonus reward for citing ≥2 distinct domains.",
)
# Efficiency thresholds
efficient_max_calls: int = Field(
default=5,
description="Maximum tool calls before efficiency penalty begins.",
)
heavy_penalty_calls: int = Field(
default=10,
description="Tool call count where efficiency penalty steepens.",
)
# Eval
eval_size: int = Field(
default=20,
description="Number of held-out items for evaluation.",
)
eval_split_ratio: float = Field(
default=0.1,
description="Fraction of dataset to hold out for evaluation (0.01.0).",
)
# Dataset
dataset_name: str = Field(
default="google/frames-benchmark",
description="HuggingFace dataset name for research questions.",
)
# ---------------------------------------------------------------------------
# Environment
# ---------------------------------------------------------------------------
class WebResearchEnv(HermesAgentBaseEnv):
"""
RL environment for training multi-step web research skills.
The model is given a factual question requiring 2-3 hops of web research
and must use web_search / web_extract tools to find and synthesize the answer.
Reward is multi-signal:
60% — answer correctness (LLM judge)
20% — tool usage (did the model actually search the web?)
20% — efficiency (penalizes >5 tool calls)
Bonus +0.1 for source diversity (≥2 distinct domains cited).
"""
name = "web-research"
env_config_cls = WebResearchEnvConfig
# Default toolsets for this environment — web + file for saving notes
default_toolsets = ["web", "file"]
@classmethod
def config_init(cls) -> Tuple[WebResearchEnvConfig, List[APIServerConfig]]:
"""Default configuration for the web research environment."""
env_config = WebResearchEnvConfig(
enabled_toolsets=["web", "file"],
max_agent_turns=15,
agent_temperature=1.0,
system_prompt=(
"You are a highly capable research agent. When asked a factual question, "
"always use web_search to find current, accurate information before answering. "
"Cite at least 2 sources. Be concise and accurate."
),
group_size=4,
total_steps=1000,
steps_per_eval=100,
use_wandb=True,
wandb_name="web-research",
)
server_configs = [
APIServerConfig(
base_url="https://openrouter.ai/api/v1",
model_name="anthropic/claude-sonnet-4.5",
server_type="openai",
api_key=os.getenv("OPENROUTER_API_KEY", ""),
health_check=False,
)
]
return env_config, server_configs
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self._items: list[dict] = []
self._eval_items: list[dict] = []
self._index: int = 0
# Metrics tracking for wandb
self._reward_buffer: list[float] = []
self._correctness_buffer: list[float] = []
self._tool_usage_buffer: list[float] = []
self._efficiency_buffer: list[float] = []
self._diversity_buffer: list[float] = []
# ------------------------------------------------------------------
# 1. Setup — load dataset
# ------------------------------------------------------------------
async def setup(self) -> None:
"""Load the FRAMES benchmark or fall back to built-in samples."""
if HF_AVAILABLE:
try:
logger.info("Loading FRAMES benchmark from HuggingFace...")
ds = load_dataset(self.config.dataset_name, split="test")
self._items = [
{
"question": row["Prompt"],
"answer": row["Answer"],
"difficulty": row.get("reasoning_types", "unknown"),
"hops": 2,
}
for row in ds
]
# Hold out for eval
eval_size = max(
self.config.eval_size,
int(len(self._items) * self.config.eval_split_ratio),
)
random.shuffle(self._items)
self._eval_items = self._items[:eval_size]
self._items = self._items[eval_size:]
logger.info(
f"Loaded {len(self._items)} train / {len(self._eval_items)} eval items "
f"from FRAMES benchmark."
)
return
except Exception as e:
logger.warning(f"Could not load FRAMES from HuggingFace: {e}. Using built-in samples.")
# Fallback
random.shuffle(SAMPLE_QUESTIONS)
split = max(1, len(SAMPLE_QUESTIONS) * 8 // 10)
self._items = SAMPLE_QUESTIONS[:split]
self._eval_items = SAMPLE_QUESTIONS[split:]
logger.info(
f"Using built-in sample dataset: {len(self._items)} train / "
f"{len(self._eval_items)} eval items."
)
# ------------------------------------------------------------------
# 2. get_next_item — return the next question
# ------------------------------------------------------------------
async def get_next_item(self) -> dict:
"""Return the next item, cycling through the dataset."""
if not self._items:
raise RuntimeError("Dataset is empty. Did you call setup()?")
item = self._items[self._index % len(self._items)]
self._index += 1
return item
# ------------------------------------------------------------------
# 3. format_prompt — build the user-facing prompt
# ------------------------------------------------------------------
def format_prompt(self, item: dict) -> str:
"""Format the research question as a task prompt."""
return (
f"Research the following question thoroughly using web search. "
f"You MUST search the web to find current, accurate information — "
f"do not rely solely on your training data.\n\n"
f"Question: {item['question']}\n\n"
f"Requirements:\n"
f"- Use web_search and/or web_extract tools to find information\n"
f"- Search at least 2 different sources\n"
f"- Provide a concise, accurate answer (2-4 sentences)\n"
f"- Cite the sources you used"
)
# ------------------------------------------------------------------
# 4. compute_reward — multi-signal scoring
# ------------------------------------------------------------------
async def compute_reward(
self,
item: dict,
result: AgentResult,
ctx: ToolContext,
) -> float:
"""
Multi-signal reward function:
correctness_weight * correctness — LLM judge comparing answer to ground truth
tool_usage_weight * tool_used — binary: did the model use web tools?
efficiency_weight * efficiency — penalizes wasteful tool usage
+ diversity_bonus — source diversity (≥2 distinct domains)
"""
# Extract final response from messages (last assistant message with content)
final_response = ""
tools_used: list[str] = []
for msg in reversed(result.messages):
if msg.get("role") == "assistant" and msg.get("content") and not final_response:
final_response = msg["content"]
# Collect tool names from tool call messages
if msg.get("role") == "assistant" and msg.get("tool_calls"):
for tc in msg["tool_calls"]:
fn = tc.get("function", {}) if isinstance(tc, dict) else {}
name = fn.get("name", "")
if name:
tools_used.append(name)
tool_call_count: int = result.turns_used or len(tools_used)
cfg = self.config
# ---- Signal 1: Answer correctness (LLM judge) ----------------
correctness = await self._llm_judge(
question=item["question"],
expected=item["answer"],
model_answer=final_response,
)
# ---- Signal 2: Web tool usage --------------------------------
web_tools = {"web_search", "web_extract", "search", "firecrawl"}
tool_used = 1.0 if any(t in web_tools for t in tools_used) else 0.0
# ---- Signal 3: Efficiency ------------------------------------
if tool_call_count <= cfg.efficient_max_calls:
efficiency = 1.0
elif tool_call_count <= cfg.heavy_penalty_calls:
efficiency = 1.0 - (tool_call_count - cfg.efficient_max_calls) * 0.08
else:
efficiency = max(0.0, 1.0 - (tool_call_count - cfg.efficient_max_calls) * 0.12)
# ---- Bonus: Source diversity ---------------------------------
domains = self._extract_domains(final_response)
diversity = cfg.diversity_bonus if len(domains) >= 2 else 0.0
# ---- Combine ------------------------------------------------
reward = (
cfg.correctness_weight * correctness
+ cfg.tool_usage_weight * tool_used
+ cfg.efficiency_weight * efficiency
+ diversity
)
reward = min(1.0, max(0.0, reward)) # clamp to [0, 1]
# Track for wandb
self._reward_buffer.append(reward)
self._correctness_buffer.append(correctness)
self._tool_usage_buffer.append(tool_used)
self._efficiency_buffer.append(efficiency)
self._diversity_buffer.append(diversity)
logger.debug(
f"Reward breakdown — correctness={correctness:.2f}, "
f"tool_used={tool_used:.1f}, efficiency={efficiency:.2f}, "
f"diversity={diversity:.1f} → total={reward:.3f}"
)
return reward
# ------------------------------------------------------------------
# 5. evaluate — run on held-out eval split
# ------------------------------------------------------------------
async def evaluate(self, *args, **kwargs) -> None:
"""Run evaluation on the held-out split using the full agent loop with tools.
Each eval item runs through the same agent loop as training —
the model can use web_search, web_extract, etc. to research answers.
This measures actual agentic research capability, not just knowledge.
"""
import time
import uuid
from environments.agent_loop import HermesAgentLoop
from environments.tool_context import ToolContext
items = self._eval_items
if not items:
logger.warning("No eval items available.")
return
eval_size = min(self.config.eval_size, len(items))
eval_items = items[:eval_size]
logger.info(f"Running eval on {len(eval_items)} questions (with agent loop + tools)...")
start_time = time.time()
samples = []
# Resolve tools once for all eval items
tools, valid_names = self._resolve_tools_for_group()
for i, item in enumerate(eval_items):
task_id = str(uuid.uuid4())
logger.info(f"Eval [{i+1}/{len(eval_items)}]: {item['question'][:80]}...")
try:
# Build messages
messages: List[Dict[str, Any]] = []
if self.config.system_prompt:
messages.append({"role": "system", "content": self.config.system_prompt})
messages.append({"role": "user", "content": self.format_prompt(item)})
# Run the full agent loop with tools
agent = HermesAgentLoop(
server=self.server,
tool_schemas=tools,
valid_tool_names=valid_names,
max_turns=self.config.max_agent_turns,
task_id=task_id,
temperature=0.0, # Deterministic for eval
max_tokens=self.config.max_token_length,
extra_body=self.config.extra_body,
)
result = await agent.run(messages)
# Extract final response and tool usage from messages
final_response = ""
tool_call_count = 0
for msg in reversed(result.messages):
if msg.get("role") == "assistant" and msg.get("content") and not final_response:
final_response = msg["content"]
if msg.get("role") == "assistant" and msg.get("tool_calls"):
tool_call_count += len(msg["tool_calls"])
# Compute reward (includes LLM judge for correctness)
# Temporarily save buffer lengths so we can extract the
# correctness score without calling judge twice, and avoid
# polluting training metric buffers with eval data.
buf_len = len(self._correctness_buffer)
ctx = ToolContext(task_id)
try:
reward = await self.compute_reward(item, result, ctx)
finally:
ctx.cleanup()
# Extract correctness from the buffer (compute_reward appended it)
# then remove eval entries from training buffers
correctness = (
self._correctness_buffer[buf_len]
if len(self._correctness_buffer) > buf_len
else 0.0
)
# Roll back buffers to avoid polluting training metrics
for buf in (
self._reward_buffer, self._correctness_buffer,
self._tool_usage_buffer, self._efficiency_buffer,
self._diversity_buffer,
):
if len(buf) > buf_len:
buf.pop()
samples.append({
"prompt": item["question"],
"response": final_response[:500],
"expected": item["answer"],
"correctness": correctness,
"reward": reward,
"tool_calls": tool_call_count,
"turns": result.turns_used,
})
logger.info(
f" → correctness={correctness:.2f}, reward={reward:.3f}, "
f"tools={tool_call_count}, turns={result.turns_used}"
)
except Exception as e:
logger.error(f"Eval error on item: {e}")
samples.append({
"prompt": item["question"],
"response": f"ERROR: {e}",
"expected": item["answer"],
"correctness": 0.0,
"reward": 0.0,
"tool_calls": 0,
"turns": 0,
})
end_time = time.time()
# Compute aggregate metrics
correctness_scores = [s["correctness"] for s in samples]
rewards = [s["reward"] for s in samples]
tool_counts = [s["tool_calls"] for s in samples]
n = len(samples)
eval_metrics = {
"eval/mean_correctness": sum(correctness_scores) / n if n else 0.0,
"eval/mean_reward": sum(rewards) / n if n else 0.0,
"eval/mean_tool_calls": sum(tool_counts) / n if n else 0.0,
"eval/tool_usage_rate": sum(1 for t in tool_counts if t > 0) / n if n else 0.0,
"eval/n_items": n,
}
logger.info(
f"Eval complete — correctness={eval_metrics['eval/mean_correctness']:.3f}, "
f"reward={eval_metrics['eval/mean_reward']:.3f}, "
f"tool_usage={eval_metrics['eval/tool_usage_rate']:.0%}"
)
await self.evaluate_log(
metrics=eval_metrics,
samples=samples,
start_time=start_time,
end_time=end_time,
)
# ------------------------------------------------------------------
# 6. wandb_log — custom metrics
# ------------------------------------------------------------------
async def wandb_log(self, wandb_metrics: Optional[Dict] = None) -> None:
"""Log reward breakdown metrics to wandb."""
if wandb_metrics is None:
wandb_metrics = {}
if self._reward_buffer:
n = len(self._reward_buffer)
wandb_metrics["train/mean_reward"] = sum(self._reward_buffer) / n
wandb_metrics["train/mean_correctness"] = sum(self._correctness_buffer) / n
wandb_metrics["train/mean_tool_usage"] = sum(self._tool_usage_buffer) / n
wandb_metrics["train/mean_efficiency"] = sum(self._efficiency_buffer) / n
wandb_metrics["train/mean_diversity"] = sum(self._diversity_buffer) / n
wandb_metrics["train/total_rollouts"] = n
# Accuracy buckets
wandb_metrics["train/correct_rate"] = (
sum(1 for c in self._correctness_buffer if c >= 0.7) / n
)
wandb_metrics["train/tool_usage_rate"] = (
sum(1 for t in self._tool_usage_buffer if t > 0) / n
)
# Clear buffers
self._reward_buffer.clear()
self._correctness_buffer.clear()
self._tool_usage_buffer.clear()
self._efficiency_buffer.clear()
self._diversity_buffer.clear()
await super().wandb_log(wandb_metrics)
# ------------------------------------------------------------------
# Private helpers
# ------------------------------------------------------------------
async def _llm_judge(
self,
question: str,
expected: str,
model_answer: str,
) -> float:
"""
Use the server's LLM to judge answer correctness.
Falls back to keyword heuristic if LLM call fails.
"""
if not model_answer or not model_answer.strip():
return 0.0
judge_prompt = (
"You are an impartial judge evaluating the quality of an AI research answer.\n\n"
f"Question: {question}\n\n"
f"Reference answer: {expected}\n\n"
f"Model answer: {model_answer}\n\n"
"Score the model answer on a scale from 0.0 to 1.0 where:\n"
" 1.0 = fully correct and complete\n"
" 0.7 = mostly correct with minor gaps\n"
" 0.4 = partially correct\n"
" 0.1 = mentions relevant topic but wrong or very incomplete\n"
" 0.0 = completely wrong or no answer\n\n"
"Consider: factual accuracy, completeness, and relevance.\n"
'Respond with ONLY a JSON object: {"score": <float>, "reason": "<one sentence>"}'
)
try:
response = await self.server.chat_completion(
messages=[{"role": "user", "content": judge_prompt}],
n=1,
max_tokens=150,
temperature=0.0,
split="eval",
)
text = response.choices[0].message.content if response.choices else ""
parsed = self._parse_judge_json(text)
if parsed is not None:
return float(parsed)
except Exception as e:
logger.debug(f"LLM judge failed: {e}. Using heuristic.")
return self._heuristic_score(expected, model_answer)
@staticmethod
def _parse_judge_json(text: str) -> Optional[float]:
"""Extract the score float from LLM judge JSON response."""
try:
clean = re.sub(r"```(?:json)?|```", "", text).strip()
data = json.loads(clean)
score = float(data.get("score", -1))
if 0.0 <= score <= 1.0:
return score
except Exception:
match = re.search(r'"score"\s*:\s*([0-9.]+)', text)
if match:
score = float(match.group(1))
if 0.0 <= score <= 1.0:
return score
return None
@staticmethod
def _heuristic_score(expected: str, model_answer: str) -> float:
"""Lightweight keyword overlap score as fallback."""
stopwords = {
"the", "a", "an", "is", "are", "was", "were", "of", "in", "on",
"at", "to", "for", "with", "and", "or", "but", "it", "its",
"this", "that", "as", "by", "from", "be", "has", "have", "had",
}
def tokenize(text: str) -> set:
tokens = re.findall(r'\b\w+\b', text.lower())
return {t for t in tokens if t not in stopwords and len(t) > 2}
expected_tokens = tokenize(expected)
answer_tokens = tokenize(model_answer)
if not expected_tokens:
return 0.5
overlap = len(expected_tokens & answer_tokens)
union = len(expected_tokens | answer_tokens)
jaccard = overlap / union if union > 0 else 0.0
recall = overlap / len(expected_tokens)
return min(1.0, 0.4 * jaccard + 0.6 * recall)
@staticmethod
def _extract_domains(text: str) -> set:
"""Extract unique domains from URLs cited in the response."""
urls = re.findall(r'https?://[^\s\)>\]"\']+', text)
domains = set()
for url in urls:
try:
parsed = urlparse(url)
domain = parsed.netloc.lower().lstrip("www.")
if domain:
domains.add(domain)
except Exception:
pass
return domains
# ---------------------------------------------------------------------------
# Entry point
# ---------------------------------------------------------------------------
if __name__ == "__main__":
WebResearchEnv.cli()

View File

@@ -17,26 +17,6 @@ logger = logging.getLogger(__name__)
DIRECTORY_PATH = Path.home() / ".hermes" / "channel_directory.json"
def _session_entry_id(origin: Dict[str, Any]) -> Optional[str]:
chat_id = origin.get("chat_id")
if not chat_id:
return None
thread_id = origin.get("thread_id")
if thread_id:
return f"{chat_id}:{thread_id}"
return str(chat_id)
def _session_entry_name(origin: Dict[str, Any]) -> str:
base_name = origin.get("chat_name") or origin.get("user_name") or str(origin.get("chat_id"))
thread_id = origin.get("thread_id")
if not thread_id:
return base_name
topic_label = origin.get("chat_topic") or f"topic {thread_id}"
return f"{base_name} / {topic_label}"
# ---------------------------------------------------------------------------
# Build / refresh
# ---------------------------------------------------------------------------
@@ -60,8 +40,8 @@ def build_channel_directory(adapters: Dict[Any, Any]) -> Dict[str, Any]:
except Exception as e:
logger.warning("Channel directory: failed to build %s: %s", platform.value, e)
# Telegram, WhatsApp & Signal can't enumerate chats -- pull from session history
for plat_name in ("telegram", "whatsapp", "signal", "email"):
# Telegram & WhatsApp can't enumerate chats -- pull from session history
for plat_name in ("telegram", "whatsapp"):
if plat_name not in platforms:
platforms[plat_name] = _build_from_sessions(plat_name)
@@ -72,7 +52,7 @@ def build_channel_directory(adapters: Dict[Any, Any]) -> Dict[str, Any]:
try:
DIRECTORY_PATH.parent.mkdir(parents=True, exist_ok=True)
with open(DIRECTORY_PATH, "w", encoding="utf-8") as f:
with open(DIRECTORY_PATH, "w") as f:
json.dump(directory, f, indent=2, ensure_ascii=False)
except Exception as e:
logger.warning("Channel directory: failed to write: %s", e)
@@ -135,7 +115,7 @@ def _build_from_sessions(platform_name: str) -> List[Dict[str, str]]:
entries = []
try:
with open(sessions_path, encoding="utf-8") as f:
with open(sessions_path) as f:
data = json.load(f)
seen_ids = set()
@@ -143,15 +123,14 @@ def _build_from_sessions(platform_name: str) -> List[Dict[str, str]]:
origin = session.get("origin") or {}
if origin.get("platform") != platform_name:
continue
entry_id = _session_entry_id(origin)
if not entry_id or entry_id in seen_ids:
chat_id = origin.get("chat_id")
if not chat_id or chat_id in seen_ids:
continue
seen_ids.add(entry_id)
seen_ids.add(chat_id)
entries.append({
"id": entry_id,
"name": _session_entry_name(origin),
"id": str(chat_id),
"name": origin.get("chat_name") or origin.get("user_name") or str(chat_id),
"type": session.get("chat_type", "dm"),
"thread_id": origin.get("thread_id"),
})
except Exception as e:
logger.debug("Channel directory: failed to read sessions for %s: %s", platform_name, e)
@@ -168,7 +147,7 @@ def load_directory() -> Dict[str, Any]:
if not DIRECTORY_PATH.exists():
return {"updated_at": None, "platforms": {}}
try:
with open(DIRECTORY_PATH, encoding="utf-8") as f:
with open(DIRECTORY_PATH) as f:
return json.load(f)
except Exception:
return {"updated_at": None, "platforms": {}}

View File

@@ -26,9 +26,7 @@ class Platform(Enum):
DISCORD = "discord"
WHATSAPP = "whatsapp"
SLACK = "slack"
SIGNAL = "signal"
HOMEASSISTANT = "homeassistant"
EMAIL = "email"
@dataclass
@@ -157,19 +155,7 @@ class GatewayConfig:
"""Return list of platforms that are enabled and configured."""
connected = []
for platform, config in self.platforms.items():
if not config.enabled:
continue
# Platforms that use token/api_key auth
if config.token or config.api_key:
connected.append(platform)
# WhatsApp uses enabled flag only (bridge handles auth)
elif platform == Platform.WHATSAPP:
connected.append(platform)
# Signal uses extra dict for config (http_url + account)
elif platform == Platform.SIGNAL and config.extra.get("http_url"):
connected.append(platform)
# Email uses extra dict for config (address + imap_host + smtp_host)
elif platform == Platform.EMAIL and config.extra.get("address"):
if config.enabled and (config.token or config.api_key):
connected.append(platform)
return connected
@@ -274,7 +260,7 @@ def load_gateway_config() -> GatewayConfig:
gateway_config_path = Path.home() / ".hermes" / "gateway.json"
if gateway_config_path.exists():
try:
with open(gateway_config_path, "r", encoding="utf-8") as f:
with open(gateway_config_path, "r") as f:
data = json.load(f)
config = GatewayConfig.from_dict(data)
except Exception as e:
@@ -287,7 +273,7 @@ def load_gateway_config() -> GatewayConfig:
import yaml
config_yaml_path = Path.home() / ".hermes" / "config.yaml"
if config_yaml_path.exists():
with open(config_yaml_path, encoding="utf-8") as f:
with open(config_yaml_path) as f:
yaml_cfg = yaml.safe_load(f) or {}
sr = yaml_cfg.get("session_reset")
if sr and isinstance(sr, dict):
@@ -393,26 +379,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
name=os.getenv("SLACK_HOME_CHANNEL_NAME", ""),
)
# Signal
signal_url = os.getenv("SIGNAL_HTTP_URL")
signal_account = os.getenv("SIGNAL_ACCOUNT")
if signal_url and signal_account:
if Platform.SIGNAL not in config.platforms:
config.platforms[Platform.SIGNAL] = PlatformConfig()
config.platforms[Platform.SIGNAL].enabled = True
config.platforms[Platform.SIGNAL].extra.update({
"http_url": signal_url,
"account": signal_account,
"ignore_stories": os.getenv("SIGNAL_IGNORE_STORIES", "true").lower() in ("true", "1", "yes"),
})
signal_home = os.getenv("SIGNAL_HOME_CHANNEL")
if signal_home:
config.platforms[Platform.SIGNAL].home_channel = HomeChannel(
platform=Platform.SIGNAL,
chat_id=signal_home,
name=os.getenv("SIGNAL_HOME_CHANNEL_NAME", "Home"),
)
# Home Assistant
hass_token = os.getenv("HASS_TOKEN")
if hass_token:
@@ -424,28 +390,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
if hass_url:
config.platforms[Platform.HOMEASSISTANT].extra["url"] = hass_url
# Email
email_addr = os.getenv("EMAIL_ADDRESS")
email_pwd = os.getenv("EMAIL_PASSWORD")
email_imap = os.getenv("EMAIL_IMAP_HOST")
email_smtp = os.getenv("EMAIL_SMTP_HOST")
if all([email_addr, email_pwd, email_imap, email_smtp]):
if Platform.EMAIL not in config.platforms:
config.platforms[Platform.EMAIL] = PlatformConfig()
config.platforms[Platform.EMAIL].enabled = True
config.platforms[Platform.EMAIL].extra.update({
"address": email_addr,
"imap_host": email_imap,
"smtp_host": email_smtp,
})
email_home = os.getenv("EMAIL_HOME_ADDRESS")
if email_home:
config.platforms[Platform.EMAIL].home_channel = HomeChannel(
platform=Platform.EMAIL,
chat_id=email_home,
name=os.getenv("EMAIL_HOME_ADDRESS_NAME", "Home"),
)
# Session settings
idle_minutes = os.getenv("SESSION_IDLE_MINUTES")
if idle_minutes:
@@ -467,5 +411,5 @@ def save_gateway_config(config: GatewayConfig) -> None:
gateway_config_path = Path.home() / ".hermes" / "gateway.json"
gateway_config_path.parent.mkdir(parents=True, exist_ok=True)
with open(gateway_config_path, "w", encoding="utf-8") as f:
with open(gateway_config_path, "w") as f:
json.dump(config.to_dict(), f, indent=2)

View File

@@ -37,7 +37,6 @@ class DeliveryTarget:
"""
platform: Platform
chat_id: Optional[str] = None # None means use home channel
thread_id: Optional[str] = None
is_origin: bool = False
is_explicit: bool = False # True if chat_id was explicitly specified
@@ -59,7 +58,6 @@ class DeliveryTarget:
return cls(
platform=origin.platform,
chat_id=origin.chat_id,
thread_id=origin.thread_id,
is_origin=True,
)
else:
@@ -152,7 +150,7 @@ class DeliveryRouter:
continue
# Deduplicate
key = (target.platform, target.chat_id, target.thread_id)
key = (target.platform, target.chat_id)
if key not in seen_platforms:
seen_platforms.add(key)
targets.append(target)
@@ -287,10 +285,7 @@ class DeliveryRouter:
+ f"\n\n... [truncated, full output saved to {saved_path}]"
)
send_metadata = dict(metadata or {})
if target.thread_id and "thread_id" not in send_metadata:
send_metadata["thread_id"] = target.thread_id
return await adapter.send(target.chat_id, content, metadata=send_metadata or None)
return await adapter.send(target.chat_id, content, metadata=metadata)
def parse_deliver_spec(

View File

@@ -26,7 +26,6 @@ def mirror_to_session(
chat_id: str,
message_text: str,
source_label: str = "cli",
thread_id: Optional[str] = None,
) -> bool:
"""
Append a delivery-mirror message to the target session's transcript.
@@ -38,9 +37,9 @@ def mirror_to_session(
All errors are caught -- this is never fatal.
"""
try:
session_id = _find_session_id(platform, str(chat_id), thread_id=thread_id)
session_id = _find_session_id(platform, str(chat_id))
if not session_id:
logger.debug("Mirror: no session found for %s:%s:%s", platform, chat_id, thread_id)
logger.debug("Mirror: no session found for %s:%s", platform, chat_id)
return False
mirror_msg = {
@@ -58,11 +57,11 @@ def mirror_to_session(
return True
except Exception as e:
logger.debug("Mirror failed for %s:%s:%s: %s", platform, chat_id, thread_id, e)
logger.debug("Mirror failed for %s:%s: %s", platform, chat_id, e)
return False
def _find_session_id(platform: str, chat_id: str, thread_id: Optional[str] = None) -> Optional[str]:
def _find_session_id(platform: str, chat_id: str) -> Optional[str]:
"""
Find the active session_id for a platform + chat_id pair.
@@ -74,7 +73,7 @@ def _find_session_id(platform: str, chat_id: str, thread_id: Optional[str] = Non
return None
try:
with open(_SESSIONS_INDEX, encoding="utf-8") as f:
with open(_SESSIONS_INDEX) as f:
data = json.load(f)
except Exception:
return None
@@ -92,9 +91,6 @@ def _find_session_id(platform: str, chat_id: str, thread_id: Optional[str] = Non
origin_chat_id = str(origin.get("chat_id", ""))
if origin_chat_id == str(chat_id):
origin_thread_id = origin.get("thread_id")
if thread_id is not None and str(origin_thread_id or "") != str(thread_id):
continue
updated = entry.get("updated_at", "")
if updated > best_updated:
best_updated = updated
@@ -107,7 +103,7 @@ def _append_to_jsonl(session_id: str, message: dict) -> None:
"""Append a message to the JSONL transcript file."""
transcript_path = _SESSIONS_DIR / f"{session_id}.jsonl"
try:
with open(transcript_path, "a", encoding="utf-8") as f:
with open(transcript_path, "a") as f:
f.write(json.dumps(message, ensure_ascii=False) + "\n")
except Exception as e:
logger.debug("Mirror JSONL write failed: %s", e)
@@ -115,7 +111,6 @@ def _append_to_jsonl(session_id: str, message: dict) -> None:
def _append_to_sqlite(session_id: str, message: dict) -> None:
"""Append a message to the SQLite session database."""
db = None
try:
from hermes_state import SessionDB
db = SessionDB()
@@ -126,6 +121,3 @@ def _append_to_sqlite(session_id: str, message: dict) -> None:
)
except Exception as e:
logger.debug("Mirror SQLite write failed: %s", e)
finally:
if db is not None:
db.close()

View File

@@ -1,313 +0,0 @@
# Adding a New Messaging Platform
Checklist for integrating a new messaging platform into the Hermes gateway.
Use this as a reference when building a new adapter — every item here is a
real integration point that exists in the codebase. Missing any of them will
cause broken functionality, missing features, or inconsistent behavior.
---
## 1. Core Adapter (`gateway/platforms/<platform>.py`)
The adapter is a subclass of `BasePlatformAdapter` from `gateway/platforms/base.py`.
### Required methods
| Method | Purpose |
|--------|---------|
| `__init__(self, config)` | Parse config, init state. Call `super().__init__(config, Platform.YOUR_PLATFORM)` |
| `connect() -> bool` | Connect to the platform, start listeners. Return True on success |
| `disconnect()` | Stop listeners, close connections, cancel tasks |
| `send(chat_id, text, ...) -> SendResult` | Send a text message |
| `send_typing(chat_id)` | Send typing indicator |
| `send_image(chat_id, image_url, caption) -> SendResult` | Send an image |
| `get_chat_info(chat_id) -> dict` | Return `{name, type, chat_id}` for a chat |
### Optional methods (have default stubs in base)
| Method | Purpose |
|--------|---------|
| `send_document(chat_id, path, caption)` | Send a file attachment |
| `send_voice(chat_id, path)` | Send a voice message |
| `send_video(chat_id, path, caption)` | Send a video |
| `send_animation(chat_id, path, caption)` | Send a GIF/animation |
| `send_image_file(chat_id, path, caption)` | Send image from local file |
### Required function
```python
def check_<platform>_requirements() -> bool:
"""Check if this platform's dependencies are available."""
```
### Key patterns to follow
- Use `self.build_source(...)` to construct `SessionSource` objects
- Call `self.handle_message(event)` to dispatch inbound messages to the gateway
- Use `MessageEvent`, `MessageType`, `SendResult` from base
- Use `cache_image_from_bytes`, `cache_audio_from_bytes`, `cache_document_from_bytes` for attachments
- Filter self-messages (prevent reply loops)
- Filter sync/echo messages if the platform has them
- Redact sensitive identifiers (phone numbers, tokens) in all log output
- Implement reconnection with exponential backoff + jitter for streaming connections
- Set `MAX_MESSAGE_LENGTH` if the platform has message size limits
---
## 2. Platform Enum (`gateway/config.py`)
Add the platform to the `Platform` enum:
```python
class Platform(Enum):
...
YOUR_PLATFORM = "your_platform"
```
Add env var loading in `_apply_env_overrides()`:
```python
# Your Platform
your_token = os.getenv("YOUR_PLATFORM_TOKEN")
if your_token:
if Platform.YOUR_PLATFORM not in config.platforms:
config.platforms[Platform.YOUR_PLATFORM] = PlatformConfig()
config.platforms[Platform.YOUR_PLATFORM].enabled = True
config.platforms[Platform.YOUR_PLATFORM].token = your_token
```
Update `get_connected_platforms()` if your platform doesn't use token/api_key
(e.g., WhatsApp uses `enabled` flag, Signal uses `extra` dict).
---
## 3. Adapter Factory (`gateway/run.py`)
Add to `_create_adapter()`:
```python
elif platform == Platform.YOUR_PLATFORM:
from gateway.platforms.your_platform import YourAdapter, check_your_requirements
if not check_your_requirements():
logger.warning("Your Platform: dependencies not met")
return None
return YourAdapter(config)
```
---
## 4. Authorization Maps (`gateway/run.py`)
Add to BOTH dicts in `_is_user_authorized()`:
```python
platform_env_map = {
...
Platform.YOUR_PLATFORM: "YOUR_PLATFORM_ALLOWED_USERS",
}
platform_allow_all_map = {
...
Platform.YOUR_PLATFORM: "YOUR_PLATFORM_ALLOW_ALL_USERS",
}
```
---
## 5. Session Source (`gateway/session.py`)
If your platform needs extra identity fields (e.g., Signal's UUID alongside
phone number), add them to the `SessionSource` dataclass with `Optional` defaults,
and update `to_dict()`, `from_dict()`, and `build_source()` in base.py.
---
## 6. System Prompt Hints (`agent/prompt_builder.py`)
Add a `PLATFORM_HINTS` entry so the agent knows what platform it's on:
```python
PLATFORM_HINTS = {
...
"your_platform": (
"You are on Your Platform. "
"Describe formatting capabilities, media support, etc."
),
}
```
Without this, the agent won't know it's on your platform and may use
inappropriate formatting (e.g., markdown on platforms that don't render it).
---
## 7. Toolset (`toolsets.py`)
Add a named toolset for your platform:
```python
"hermes-your-platform": {
"description": "Your Platform bot toolset",
"tools": _HERMES_CORE_TOOLS,
"includes": []
},
```
And add it to the `hermes-gateway` composite:
```python
"hermes-gateway": {
"includes": [..., "hermes-your-platform"]
}
```
---
## 8. Cron Delivery (`cron/scheduler.py`)
Add to `platform_map` in `_deliver_result()`:
```python
platform_map = {
...
"your_platform": Platform.YOUR_PLATFORM,
}
```
Without this, `schedule_cronjob(deliver="your_platform")` silently fails.
---
## 9. Send Message Tool (`tools/send_message_tool.py`)
Add to `platform_map` in `send_message_tool()`:
```python
platform_map = {
...
"your_platform": Platform.YOUR_PLATFORM,
}
```
Add routing in `_send_to_platform()`:
```python
elif platform == Platform.YOUR_PLATFORM:
return await _send_your_platform(pconfig, chat_id, message)
```
Implement `_send_your_platform()` — a standalone async function that sends
a single message without requiring the full adapter (for use by cron jobs
and the send_message tool outside the gateway process).
Update the tool schema `target` description to include your platform example.
---
## 10. Cronjob Tool Schema (`tools/cronjob_tools.py`)
Update the `deliver` parameter description and docstring to mention your
platform as a delivery option.
---
## 11. Channel Directory (`gateway/channel_directory.py`)
If your platform can't enumerate chats (most can't), add it to the
session-based discovery list:
```python
for plat_name in ("telegram", "whatsapp", "signal", "your_platform"):
```
---
## 12. Status Display (`hermes_cli/status.py`)
Add to the `platforms` dict in the Messaging Platforms section:
```python
platforms = {
...
"Your Platform": ("YOUR_PLATFORM_TOKEN", "YOUR_PLATFORM_HOME_CHANNEL"),
}
```
---
## 13. Gateway Setup Wizard (`hermes_cli/gateway.py`)
Add to the `_PLATFORMS` list:
```python
{
"key": "your_platform",
"label": "Your Platform",
"emoji": "📱",
"token_var": "YOUR_PLATFORM_TOKEN",
"setup_instructions": [...],
"vars": [...],
}
```
If your platform needs custom setup logic (connectivity testing, QR codes,
policy choices), add a `_setup_your_platform()` function and route to it
in the platform selection switch.
Update `_platform_status()` if your platform's "configured" check differs
from the standard `bool(get_env_value(token_var))`.
---
## 14. Phone/ID Redaction (`agent/redact.py`)
If your platform uses sensitive identifiers (phone numbers, etc.), add a
regex pattern and redaction function to `agent/redact.py`. This ensures
identifiers are masked in ALL log output, not just your adapter's logs.
---
## 15. Documentation
| File | What to update |
|------|---------------|
| `README.md` | Platform list in feature table + documentation table |
| `AGENTS.md` | Gateway description + env var config section |
| `website/docs/user-guide/messaging/<platform>.md` | **NEW** — Full setup guide (see existing platform docs for template) |
| `website/docs/user-guide/messaging/index.md` | Architecture diagram, toolset table, security examples, Next Steps links |
| `website/docs/reference/environment-variables.md` | All env vars for the platform |
---
## 16. Tests (`tests/gateway/test_<platform>.py`)
Recommended test coverage:
- Platform enum exists with correct value
- Config loading from env vars via `_apply_env_overrides`
- Adapter init (config parsing, allowlist handling, default values)
- Helper functions (redaction, parsing, file type detection)
- Session source round-trip (to_dict → from_dict)
- Authorization integration (platform in allowlist maps)
- Send message tool routing (platform in platform_map)
Optional but valuable:
- Async tests for message handling flow (mock the platform API)
- SSE/WebSocket reconnection logic
- Attachment processing
- Group message filtering
---
## Quick Verification
After implementing everything, verify with:
```bash
# All tests pass
python -m pytest tests/ -q
# Grep for your platform name to find any missed integration points
grep -r "telegram\|discord\|whatsapp\|slack" gateway/ tools/ agent/ cron/ hermes_cli/ toolsets.py \
--include="*.py" -l | sort -u
# Check each file in the output — if it mentions other platforms but not yours, you missed it
```

View File

@@ -24,7 +24,7 @@ from pathlib import Path as _Path
sys.path.insert(0, str(_Path(__file__).resolve().parents[2]))
from gateway.config import Platform, PlatformConfig
from gateway.session import SessionSource, build_session_key
from gateway.session import SessionSource
# ---------------------------------------------------------------------------
@@ -252,7 +252,6 @@ def cleanup_document_cache(max_age_hours: int = 24) -> int:
class MessageType(Enum):
"""Types of incoming messages."""
TEXT = "text"
LOCATION = "location"
PHOTO = "photo"
VIDEO = "video"
AUDIO = "audio"
@@ -413,12 +412,11 @@ class BasePlatformAdapter(ABC):
"""
return SendResult(success=False, error="Not supported")
async def send_typing(self, chat_id: str, metadata=None) -> None:
async def send_typing(self, chat_id: str) -> None:
"""
Send a typing indicator.
Override in subclasses if the platform supports it.
metadata: optional dict with platform-specific context (e.g. thread_id for Slack).
"""
pass
@@ -516,7 +514,6 @@ class BasePlatformAdapter(ABC):
audio_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
**kwargs,
) -> SendResult:
"""
Send an audio file as a native voice message via the platform API.
@@ -536,7 +533,6 @@ class BasePlatformAdapter(ABC):
video_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
**kwargs,
) -> SendResult:
"""
Send a video natively via the platform API.
@@ -556,7 +552,6 @@ class BasePlatformAdapter(ABC):
caption: Optional[str] = None,
file_name: Optional[str] = None,
reply_to: Optional[str] = None,
**kwargs,
) -> SendResult:
"""
Send a document/file natively via the platform API.
@@ -575,7 +570,6 @@ class BasePlatformAdapter(ABC):
image_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
**kwargs,
) -> SendResult:
"""
Send a local image file natively via the platform API.
@@ -625,7 +619,7 @@ class BasePlatformAdapter(ABC):
return media, cleaned
async def _keep_typing(self, chat_id: str, interval: float = 2.0, metadata=None) -> None:
async def _keep_typing(self, chat_id: str, interval: float = 2.0) -> None:
"""
Continuously send typing indicator until cancelled.
@@ -634,7 +628,7 @@ class BasePlatformAdapter(ABC):
"""
try:
while True:
await self.send_typing(chat_id, metadata=metadata)
await self.send_typing(chat_id)
await asyncio.sleep(interval)
except asyncio.CancelledError:
pass # Normal cancellation when handler completes
@@ -650,7 +644,7 @@ class BasePlatformAdapter(ABC):
if not self._message_handler:
return
session_key = build_session_key(event.source)
session_key = event.source.chat_id
# Check if there's already an active handler for this session
if session_key in self._active_sessions:
@@ -692,8 +686,7 @@ class BasePlatformAdapter(ABC):
self._active_sessions[session_key] = interrupt_event
# Start continuous typing indicator (refreshes every 2 seconds)
_thread_metadata = {"thread_id": event.source.thread_id} if event.source.thread_id else None
typing_task = asyncio.create_task(self._keep_typing(event.source.chat_id, metadata=_thread_metadata))
typing_task = asyncio.create_task(self._keep_typing(event.source.chat_id))
try:
# Call the handler (this can take a while with tool calls)
@@ -708,8 +701,6 @@ class BasePlatformAdapter(ABC):
# Extract image URLs and send them as native platform attachments
images, text_content = self.extract_images(response)
if images:
logger.info("[%s] extract_images found %d image(s) in response (%d chars)", self.name, len(images), len(response))
# Send the text portion first (if any remains after extractions)
if text_content:
@@ -717,8 +708,7 @@ class BasePlatformAdapter(ABC):
result = await self.send(
chat_id=event.source.chat_id,
content=text_content,
reply_to=event.message_id,
metadata=_thread_metadata,
reply_to=event.message_id
)
# Log send failures (don't raise - user already saw tool progress)
@@ -728,8 +718,7 @@ class BasePlatformAdapter(ABC):
fallback_result = await self.send(
chat_id=event.source.chat_id,
content=f"(Response formatting failed, plain text:)\n\n{text_content[:3500]}",
reply_to=event.message_id,
metadata=_thread_metadata,
reply_to=event.message_id
)
if not fallback_result.success:
print(f"[{self.name}] Fallback send also failed: {fallback_result.error}")
@@ -738,32 +727,27 @@ class BasePlatformAdapter(ABC):
human_delay = self._get_human_delay()
# Send extracted images as native attachments
if images:
logger.info("[%s] Extracted %d image(s) to send as attachments", self.name, len(images))
for image_url, alt_text in images:
if human_delay > 0:
await asyncio.sleep(human_delay)
try:
logger.info("[%s] Sending image: %s (alt=%s)", self.name, image_url[:80], alt_text[:30] if alt_text else "")
# Route animated GIFs through send_animation for proper playback
if self._is_animation_url(image_url):
img_result = await self.send_animation(
chat_id=event.source.chat_id,
animation_url=image_url,
caption=alt_text if alt_text else None,
metadata=_thread_metadata,
)
else:
img_result = await self.send_image(
chat_id=event.source.chat_id,
image_url=image_url,
caption=alt_text if alt_text else None,
metadata=_thread_metadata,
)
if not img_result.success:
logger.error("[%s] Failed to send image: %s", self.name, img_result.error)
print(f"[{self.name}] Failed to send image: {img_result.error}")
except Exception as img_err:
logger.error("[%s] Error sending image: %s", self.name, img_err, exc_info=True)
print(f"[{self.name}] Error sending image: {img_err}")
# Send extracted media files — route by file type
_AUDIO_EXTS = {'.ogg', '.opus', '.mp3', '.wav', '.m4a'}
@@ -779,25 +763,21 @@ class BasePlatformAdapter(ABC):
media_result = await self.send_voice(
chat_id=event.source.chat_id,
audio_path=media_path,
metadata=_thread_metadata,
)
elif ext in _VIDEO_EXTS:
media_result = await self.send_video(
chat_id=event.source.chat_id,
video_path=media_path,
metadata=_thread_metadata,
)
elif ext in _IMAGE_EXTS:
media_result = await self.send_image_file(
chat_id=event.source.chat_id,
image_path=media_path,
metadata=_thread_metadata,
)
else:
media_result = await self.send_document(
chat_id=event.source.chat_id,
file_path=media_path,
metadata=_thread_metadata,
)
if not media_result.success:
@@ -853,8 +833,6 @@ class BasePlatformAdapter(ABC):
user_name: Optional[str] = None,
thread_id: Optional[str] = None,
chat_topic: Optional[str] = None,
user_id_alt: Optional[str] = None,
chat_id_alt: Optional[str] = None,
) -> SessionSource:
"""Helper to build a SessionSource for this platform."""
# Normalize empty topic to None
@@ -869,8 +847,6 @@ class BasePlatformAdapter(ABC):
user_name=user_name,
thread_id=str(thread_id) if thread_id else None,
chat_topic=chat_topic.strip() if chat_topic else None,
user_id_alt=user_id_alt,
chat_id_alt=chat_id_alt,
)
@abstractmethod

View File

@@ -72,11 +72,11 @@ class DiscordAdapter(BasePlatformAdapter):
async def connect(self) -> bool:
"""Connect to Discord and start receiving events."""
if not DISCORD_AVAILABLE:
logger.error("[%s] discord.py not installed. Run: pip install discord.py", self.name)
print(f"[{self.name}] discord.py not installed. Run: pip install discord.py")
return False
if not self.config.token:
logger.error("[%s] No bot token configured", self.name)
print(f"[{self.name}] No bot token configured")
return False
try:
@@ -105,7 +105,7 @@ class DiscordAdapter(BasePlatformAdapter):
# Register event handlers
@self._client.event
async def on_ready():
logger.info("[%s] Connected as %s", adapter_self.name, adapter_self._client.user)
print(f"[{adapter_self.name}] Connected as {adapter_self._client.user}")
# Resolve any usernames in the allowed list to numeric IDs
await adapter_self._resolve_allowed_usernames()
@@ -113,30 +113,16 @@ class DiscordAdapter(BasePlatformAdapter):
# Sync slash commands with Discord
try:
synced = await adapter_self._client.tree.sync()
logger.info("[%s] Synced %d slash command(s)", adapter_self.name, len(synced))
except Exception as e: # pragma: no cover - defensive logging
logger.warning("[%s] Slash command sync failed: %s", adapter_self.name, e, exc_info=True)
print(f"[{adapter_self.name}] Synced {len(synced)} slash command(s)")
except Exception as e:
print(f"[{adapter_self.name}] Slash command sync failed: {e}")
adapter_self._ready_event.set()
@self._client.event
async def on_message(message: DiscordMessage):
# Always ignore our own messages
# Ignore bot's own messages
if message.author == self._client.user:
return
# Bot message filtering (DISCORD_ALLOW_BOTS):
# "none" — ignore all other bots (default)
# "mentions" — accept bot messages only when they @mention us
# "all" — accept all bot messages
if getattr(message.author, "bot", False):
allow_bots = os.getenv("DISCORD_ALLOW_BOTS", "none").lower().strip()
if allow_bots == "none":
return
elif allow_bots == "mentions":
if not self._client.user or self._client.user not in message.mentions:
return
# "all" falls through to handle_message
await self._handle_message(message)
# Register slash commands
@@ -152,10 +138,10 @@ class DiscordAdapter(BasePlatformAdapter):
return True
except asyncio.TimeoutError:
logger.error("[%s] Timeout waiting for connection to Discord", self.name, exc_info=True)
print(f"[{self.name}] Timeout waiting for connection")
return False
except Exception as e: # pragma: no cover - defensive logging
logger.error("[%s] Failed to connect to Discord: %s", self.name, e, exc_info=True)
except Exception as e:
print(f"[{self.name}] Failed to connect: {e}")
return False
async def disconnect(self) -> None:
@@ -163,13 +149,13 @@ class DiscordAdapter(BasePlatformAdapter):
if self._client:
try:
await self._client.close()
except Exception as e: # pragma: no cover - defensive logging
logger.warning("[%s] Error during disconnect: %s", self.name, e, exc_info=True)
except Exception as e:
print(f"[{self.name}] Error during disconnect: {e}")
self._running = False
self._client = None
self._ready_event.clear()
logger.info("[%s] Disconnected", self.name)
print(f"[{self.name}] Disconnected")
async def send(
self,
@@ -218,8 +204,7 @@ class DiscordAdapter(BasePlatformAdapter):
raw_response={"message_ids": message_ids}
)
except Exception as e: # pragma: no cover - defensive logging
logger.error("[%s] Failed to send Discord message: %s", self.name, e, exc_info=True)
except Exception as e:
return SendResult(success=False, error=str(e))
async def edit_message(
@@ -241,8 +226,7 @@ class DiscordAdapter(BasePlatformAdapter):
formatted = formatted[:self.MAX_MESSAGE_LENGTH - 3] + "..."
await msg.edit(content=formatted)
return SendResult(success=True, message_id=message_id)
except Exception as e: # pragma: no cover - defensive logging
logger.error("[%s] Failed to edit Discord message %s: %s", self.name, message_id, e, exc_info=True)
except Exception as e:
return SendResult(success=False, error=str(e))
async def send_voice(
@@ -279,47 +263,10 @@ class DiscordAdapter(BasePlatformAdapter):
)
return SendResult(success=True, message_id=str(msg.id))
except Exception as e: # pragma: no cover - defensive logging
logger.error("[%s] Failed to send audio, falling back to base adapter: %s", self.name, e, exc_info=True)
except Exception as e:
print(f"[{self.name}] Failed to send audio: {e}")
return await super().send_voice(chat_id, audio_path, caption, reply_to)
async def send_image_file(
self,
chat_id: str,
image_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""Send a local image file natively as a Discord file attachment."""
if not self._client:
return SendResult(success=False, error="Not connected")
try:
import io
channel = self._client.get_channel(int(chat_id))
if not channel:
channel = await self._client.fetch_channel(int(chat_id))
if not channel:
return SendResult(success=False, error=f"Channel {chat_id} not found")
if not os.path.exists(image_path):
return SendResult(success=False, error=f"Image file not found: {image_path}")
filename = os.path.basename(image_path)
with open(image_path, "rb") as f:
file = discord.File(io.BytesIO(f.read()), filename=filename)
msg = await channel.send(
content=caption if caption else None,
file=file,
)
return SendResult(success=True, message_id=str(msg.id))
except Exception as e: # pragma: no cover - defensive logging
logger.error("[%s] Failed to send local image, falling back to base adapter: %s", self.name, e, exc_info=True)
return await super().send_image_file(chat_id, image_path, caption, reply_to)
async def send_image(
self,
chat_id: str,
@@ -369,22 +316,13 @@ class DiscordAdapter(BasePlatformAdapter):
return SendResult(success=True, message_id=str(msg.id))
except ImportError:
logger.warning(
"[%s] aiohttp not installed, falling back to URL. Run: pip install aiohttp",
self.name,
exc_info=True,
)
print(f"[{self.name}] aiohttp not installed, falling back to URL. Run: pip install aiohttp")
return await super().send_image(chat_id, image_url, caption, reply_to)
except Exception as e: # pragma: no cover - defensive logging
logger.error(
"[%s] Failed to send image attachment, falling back to URL: %s",
self.name,
e,
exc_info=True,
)
except Exception as e:
print(f"[{self.name}] Failed to send image attachment, falling back to URL: {e}")
return await super().send_image(chat_id, image_url, caption, reply_to)
async def send_typing(self, chat_id: str, metadata=None) -> None:
async def send_typing(self, chat_id: str) -> None:
"""Send typing indicator."""
if self._client:
try:
@@ -429,8 +367,7 @@ class DiscordAdapter(BasePlatformAdapter):
"guild_id": str(channel.guild.id) if hasattr(channel, "guild") and channel.guild else None,
"guild_name": channel.guild.name if hasattr(channel, "guild") and channel.guild else None,
}
except Exception as e: # pragma: no cover - defensive logging
logger.error("[%s] Failed to get chat info for %s: %s", self.name, chat_id, e, exc_info=True)
except Exception as e:
return {"name": str(chat_id), "type": "dm", "error": str(e)}
async def _resolve_allowed_usernames(self) -> None:
@@ -618,89 +555,6 @@ class DiscordAdapter(BasePlatformAdapter):
except Exception as e:
logger.debug("Discord followup failed: %s", e)
@tree.command(name="compress", description="Compress conversation context")
async def slash_compress(interaction: discord.Interaction):
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, "/compress")
await self.handle_message(event)
try:
await interaction.followup.send("Done~", ephemeral=True)
except Exception as e:
logger.debug("Discord followup failed: %s", e)
@tree.command(name="title", description="Set or show the session title")
@discord.app_commands.describe(name="Session title. Leave empty to show current.")
async def slash_title(interaction: discord.Interaction, name: str = ""):
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, f"/title {name}".strip())
await self.handle_message(event)
try:
await interaction.followup.send("Done~", ephemeral=True)
except Exception as e:
logger.debug("Discord followup failed: %s", e)
@tree.command(name="resume", description="Resume a previously-named session")
@discord.app_commands.describe(name="Session name to resume. Leave empty to list sessions.")
async def slash_resume(interaction: discord.Interaction, name: str = ""):
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, f"/resume {name}".strip())
await self.handle_message(event)
try:
await interaction.followup.send("Done~", ephemeral=True)
except Exception as e:
logger.debug("Discord followup failed: %s", e)
@tree.command(name="usage", description="Show token usage for this session")
async def slash_usage(interaction: discord.Interaction):
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, "/usage")
await self.handle_message(event)
try:
await interaction.followup.send("Done~", ephemeral=True)
except Exception as e:
logger.debug("Discord followup failed: %s", e)
@tree.command(name="provider", description="Show available providers")
async def slash_provider(interaction: discord.Interaction):
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, "/provider")
await self.handle_message(event)
try:
await interaction.followup.send("Done~", ephemeral=True)
except Exception as e:
logger.debug("Discord followup failed: %s", e)
@tree.command(name="help", description="Show available commands")
async def slash_help(interaction: discord.Interaction):
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, "/help")
await self.handle_message(event)
try:
await interaction.followup.send("Done~", ephemeral=True)
except Exception as e:
logger.debug("Discord followup failed: %s", e)
@tree.command(name="insights", description="Show usage insights and analytics")
@discord.app_commands.describe(days="Number of days to analyze (default: 7)")
async def slash_insights(interaction: discord.Interaction, days: int = 7):
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, f"/insights {days}")
await self.handle_message(event)
try:
await interaction.followup.send("Done~", ephemeral=True)
except Exception as e:
logger.debug("Discord followup failed: %s", e)
@tree.command(name="reload-mcp", description="Reload MCP servers from config")
async def slash_reload_mcp(interaction: discord.Interaction):
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, "/reload-mcp")
await self.handle_message(event)
try:
await interaction.followup.send("Done~", ephemeral=True)
except Exception as e:
logger.debug("Discord followup failed: %s", e)
@tree.command(name="update", description="Update Hermes Agent to the latest version")
async def slash_update(interaction: discord.Interaction):
await interaction.response.defer(ephemeral=True)

View File

@@ -1,533 +0,0 @@
"""
Email platform adapter for the Hermes gateway.
Allows users to interact with Hermes by sending emails.
Uses IMAP to receive and SMTP to send messages.
Environment variables:
EMAIL_IMAP_HOST — IMAP server host (e.g., imap.gmail.com)
EMAIL_IMAP_PORT — IMAP server port (default: 993)
EMAIL_SMTP_HOST — SMTP server host (e.g., smtp.gmail.com)
EMAIL_SMTP_PORT — SMTP server port (default: 587)
EMAIL_ADDRESS — Email address for the agent
EMAIL_PASSWORD — Email password or app-specific password
EMAIL_POLL_INTERVAL — Seconds between mailbox checks (default: 15)
EMAIL_ALLOWED_USERS — Comma-separated list of allowed sender addresses
"""
import asyncio
import email as email_lib
import imaplib
import logging
import os
import re
import smtplib
import uuid
from datetime import datetime
from email.header import decode_header
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.base import MIMEBase
from email import encoders
from pathlib import Path
from typing import Any, Dict, List, Optional
from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
MessageType,
SendResult,
cache_document_from_bytes,
cache_image_from_bytes,
)
from gateway.config import Platform, PlatformConfig
logger = logging.getLogger(__name__)
# Gmail-safe max length per email body
MAX_MESSAGE_LENGTH = 50_000
# Supported image extensions for inline detection
_IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}
def check_email_requirements() -> bool:
"""Check if email platform dependencies are available."""
addr = os.getenv("EMAIL_ADDRESS")
pwd = os.getenv("EMAIL_PASSWORD")
imap = os.getenv("EMAIL_IMAP_HOST")
smtp = os.getenv("EMAIL_SMTP_HOST")
if not all([addr, pwd, imap, smtp]):
return False
return True
def _decode_header_value(raw: str) -> str:
"""Decode an RFC 2047 encoded email header into a plain string."""
parts = decode_header(raw)
decoded = []
for part, charset in parts:
if isinstance(part, bytes):
decoded.append(part.decode(charset or "utf-8", errors="replace"))
else:
decoded.append(part)
return " ".join(decoded)
def _extract_text_body(msg: email_lib.message.Message) -> str:
"""Extract the plain-text body from a potentially multipart email."""
if msg.is_multipart():
for part in msg.walk():
content_type = part.get_content_type()
disposition = str(part.get("Content-Disposition", ""))
# Skip attachments
if "attachment" in disposition:
continue
if content_type == "text/plain":
payload = part.get_payload(decode=True)
if payload:
charset = part.get_content_charset() or "utf-8"
return payload.decode(charset, errors="replace")
# Fallback: try text/html and strip tags
for part in msg.walk():
content_type = part.get_content_type()
disposition = str(part.get("Content-Disposition", ""))
if "attachment" in disposition:
continue
if content_type == "text/html":
payload = part.get_payload(decode=True)
if payload:
charset = part.get_content_charset() or "utf-8"
html = payload.decode(charset, errors="replace")
return _strip_html(html)
return ""
else:
payload = msg.get_payload(decode=True)
if payload:
charset = msg.get_content_charset() or "utf-8"
text = payload.decode(charset, errors="replace")
if msg.get_content_type() == "text/html":
return _strip_html(text)
return text
return ""
def _strip_html(html: str) -> str:
"""Naive HTML tag stripper for fallback text extraction."""
text = re.sub(r"<br\s*/?>", "\n", html, flags=re.IGNORECASE)
text = re.sub(r"<p[^>]*>", "\n", text, flags=re.IGNORECASE)
text = re.sub(r"</p>", "\n", text, flags=re.IGNORECASE)
text = re.sub(r"<[^>]+>", "", text)
text = re.sub(r"&nbsp;", " ", text)
text = re.sub(r"&amp;", "&", text)
text = re.sub(r"&lt;", "<", text)
text = re.sub(r"&gt;", ">", text)
text = re.sub(r"\n{3,}", "\n\n", text)
return text.strip()
def _extract_email_address(raw: str) -> str:
"""Extract bare email address from 'Name <addr>' format."""
match = re.search(r"<([^>]+)>", raw)
if match:
return match.group(1).strip().lower()
return raw.strip().lower()
def _extract_attachments(msg: email_lib.message.Message) -> List[Dict[str, Any]]:
"""Extract attachment metadata and cache files locally."""
attachments = []
if not msg.is_multipart():
return attachments
for part in msg.walk():
disposition = str(part.get("Content-Disposition", ""))
if "attachment" not in disposition and "inline" not in disposition:
continue
# Skip text/plain and text/html body parts
content_type = part.get_content_type()
if content_type in ("text/plain", "text/html") and "attachment" not in disposition:
continue
filename = part.get_filename()
if filename:
filename = _decode_header_value(filename)
else:
ext = part.get_content_subtype() or "bin"
filename = f"attachment.{ext}"
payload = part.get_payload(decode=True)
if not payload:
continue
ext = Path(filename).suffix.lower()
if ext in _IMAGE_EXTS:
cached_path = cache_image_from_bytes(payload, ext)
attachments.append({
"path": cached_path,
"filename": filename,
"type": "image",
"media_type": content_type,
})
else:
cached_path = cache_document_from_bytes(payload, filename)
attachments.append({
"path": cached_path,
"filename": filename,
"type": "document",
"media_type": content_type,
})
return attachments
class EmailAdapter(BasePlatformAdapter):
"""Email gateway adapter using IMAP (receive) and SMTP (send)."""
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.EMAIL)
self._address = os.getenv("EMAIL_ADDRESS", "")
self._password = os.getenv("EMAIL_PASSWORD", "")
self._imap_host = os.getenv("EMAIL_IMAP_HOST", "")
self._imap_port = int(os.getenv("EMAIL_IMAP_PORT", "993"))
self._smtp_host = os.getenv("EMAIL_SMTP_HOST", "")
self._smtp_port = int(os.getenv("EMAIL_SMTP_PORT", "587"))
self._poll_interval = int(os.getenv("EMAIL_POLL_INTERVAL", "15"))
# Track message IDs we've already processed to avoid duplicates
self._seen_uids: set = set()
self._poll_task: Optional[asyncio.Task] = None
# Map chat_id (sender email) -> last subject + message-id for threading
self._thread_context: Dict[str, Dict[str, str]] = {}
logger.info("[Email] Adapter initialized for %s", self._address)
async def connect(self) -> bool:
"""Connect to the IMAP server and start polling for new messages."""
try:
# Test IMAP connection
imap = imaplib.IMAP4_SSL(self._imap_host, self._imap_port)
imap.login(self._address, self._password)
# Mark all existing messages as seen so we only process new ones
imap.select("INBOX")
status, data = imap.search(None, "ALL")
if status == "OK" and data[0]:
for uid in data[0].split():
self._seen_uids.add(uid)
imap.logout()
logger.info("[Email] IMAP connection test passed. %d existing messages skipped.", len(self._seen_uids))
except Exception as e:
logger.error("[Email] IMAP connection failed: %s", e)
return False
try:
# Test SMTP connection
smtp = smtplib.SMTP(self._smtp_host, self._smtp_port)
smtp.starttls()
smtp.login(self._address, self._password)
smtp.quit()
logger.info("[Email] SMTP connection test passed.")
except Exception as e:
logger.error("[Email] SMTP connection failed: %s", e)
return False
self._running = True
self._poll_task = asyncio.create_task(self._poll_loop())
print(f"[Email] Connected as {self._address}")
return True
async def disconnect(self) -> None:
"""Stop polling and disconnect."""
self._running = False
if self._poll_task:
self._poll_task.cancel()
try:
await self._poll_task
except asyncio.CancelledError:
pass
self._poll_task = None
logger.info("[Email] Disconnected.")
async def _poll_loop(self) -> None:
"""Poll IMAP for new messages at regular intervals."""
while self._running:
try:
await self._check_inbox()
except asyncio.CancelledError:
break
except Exception as e:
logger.error("[Email] Poll error: %s", e)
await asyncio.sleep(self._poll_interval)
async def _check_inbox(self) -> None:
"""Check INBOX for unseen messages and dispatch them."""
# Run IMAP operations in a thread to avoid blocking the event loop
loop = asyncio.get_running_loop()
messages = await loop.run_in_executor(None, self._fetch_new_messages)
for msg_data in messages:
await self._dispatch_message(msg_data)
def _fetch_new_messages(self) -> List[Dict[str, Any]]:
"""Fetch new (unseen) messages from IMAP. Runs in executor thread."""
results = []
try:
imap = imaplib.IMAP4_SSL(self._imap_host, self._imap_port)
imap.login(self._address, self._password)
imap.select("INBOX")
status, data = imap.search(None, "UNSEEN")
if status != "OK" or not data[0]:
imap.logout()
return results
for uid in data[0].split():
if uid in self._seen_uids:
continue
self._seen_uids.add(uid)
status, msg_data = imap.fetch(uid, "(RFC822)")
if status != "OK":
continue
raw_email = msg_data[0][1]
msg = email_lib.message_from_bytes(raw_email)
sender_raw = msg.get("From", "")
sender_addr = _extract_email_address(sender_raw)
sender_name = _decode_header_value(sender_raw)
# Remove email from name if present
if "<" in sender_name:
sender_name = sender_name.split("<")[0].strip().strip('"')
subject = _decode_header_value(msg.get("Subject", "(no subject)"))
message_id = msg.get("Message-ID", "")
in_reply_to = msg.get("In-Reply-To", "")
body = _extract_text_body(msg)
attachments = _extract_attachments(msg)
results.append({
"uid": uid,
"sender_addr": sender_addr,
"sender_name": sender_name,
"subject": subject,
"message_id": message_id,
"in_reply_to": in_reply_to,
"body": body,
"attachments": attachments,
"date": msg.get("Date", ""),
})
imap.logout()
except Exception as e:
logger.error("[Email] IMAP fetch error: %s", e)
return results
async def _dispatch_message(self, msg_data: Dict[str, Any]) -> None:
"""Convert a fetched email into a MessageEvent and dispatch it."""
sender_addr = msg_data["sender_addr"]
# Skip self-messages
if sender_addr == self._address.lower():
return
subject = msg_data["subject"]
body = msg_data["body"].strip()
attachments = msg_data["attachments"]
# Build message text: include subject as context
text = body
if subject and not subject.startswith("Re:"):
text = f"[Subject: {subject}]\n\n{body}"
# Determine message type and media
media_urls = []
media_types = []
msg_type = MessageType.TEXT
for att in attachments:
media_urls.append(att["path"])
media_types.append(att["media_type"])
if att["type"] == "image":
msg_type = MessageType.PHOTO
# Store thread context for reply threading
self._thread_context[sender_addr] = {
"subject": subject,
"message_id": msg_data["message_id"],
}
source = self.build_source(
chat_id=sender_addr,
chat_name=msg_data["sender_name"] or sender_addr,
chat_type="dm",
user_id=sender_addr,
user_name=msg_data["sender_name"] or sender_addr,
)
event = MessageEvent(
text=text or "(empty email)",
message_type=msg_type,
source=source,
message_id=msg_data["message_id"],
media_urls=media_urls,
media_types=media_types,
reply_to_message_id=msg_data["in_reply_to"] or None,
)
logger.info("[Email] New message from %s: %s", sender_addr, subject)
await self.handle_message(event)
async def send(
self,
chat_id: str,
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send an email reply to the given address."""
try:
loop = asyncio.get_running_loop()
message_id = await loop.run_in_executor(
None, self._send_email, chat_id, content, reply_to
)
return SendResult(success=True, message_id=message_id)
except Exception as e:
logger.error("[Email] Send failed to %s: %s", chat_id, e)
return SendResult(success=False, error=str(e))
def _send_email(
self,
to_addr: str,
body: str,
reply_to_msg_id: Optional[str] = None,
) -> str:
"""Send an email via SMTP. Runs in executor thread."""
msg = MIMEMultipart()
msg["From"] = self._address
msg["To"] = to_addr
# Thread context for reply
ctx = self._thread_context.get(to_addr, {})
subject = ctx.get("subject", "Hermes Agent")
if not subject.startswith("Re:"):
subject = f"Re: {subject}"
msg["Subject"] = subject
# Threading headers
original_msg_id = reply_to_msg_id or ctx.get("message_id")
if original_msg_id:
msg["In-Reply-To"] = original_msg_id
msg["References"] = original_msg_id
msg_id = f"<hermes-{uuid.uuid4().hex[:12]}@{self._address.split('@')[1]}>"
msg["Message-ID"] = msg_id
msg.attach(MIMEText(body, "plain", "utf-8"))
smtp = smtplib.SMTP(self._smtp_host, self._smtp_port)
smtp.starttls()
smtp.login(self._address, self._password)
smtp.send_message(msg)
smtp.quit()
logger.info("[Email] Sent reply to %s (subject: %s)", to_addr, subject)
return msg_id
async def send_typing(self, chat_id: str) -> None:
"""Email has no typing indicator — no-op."""
pass
async def send_image(
self,
chat_id: str,
image_url: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""Send an image URL as part of an email body."""
text = caption or ""
text += f"\n\nImage: {image_url}"
return await self.send(chat_id, text.strip(), reply_to)
async def send_document(
self,
chat_id: str,
file_path: str,
caption: Optional[str] = None,
file_name: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""Send a file as an email attachment."""
try:
loop = asyncio.get_running_loop()
message_id = await loop.run_in_executor(
None,
self._send_email_with_attachment,
chat_id,
caption or "",
file_path,
file_name,
)
return SendResult(success=True, message_id=message_id)
except Exception as e:
logger.error("[Email] Send document failed: %s", e)
return SendResult(success=False, error=str(e))
def _send_email_with_attachment(
self,
to_addr: str,
body: str,
file_path: str,
file_name: Optional[str] = None,
) -> str:
"""Send an email with a file attachment via SMTP."""
msg = MIMEMultipart()
msg["From"] = self._address
msg["To"] = to_addr
ctx = self._thread_context.get(to_addr, {})
subject = ctx.get("subject", "Hermes Agent")
if not subject.startswith("Re:"):
subject = f"Re: {subject}"
msg["Subject"] = subject
original_msg_id = ctx.get("message_id")
if original_msg_id:
msg["In-Reply-To"] = original_msg_id
msg["References"] = original_msg_id
msg_id = f"<hermes-{uuid.uuid4().hex[:12]}@{self._address.split('@')[1]}>"
msg["Message-ID"] = msg_id
if body:
msg.attach(MIMEText(body, "plain", "utf-8"))
# Attach file
p = Path(file_path)
fname = file_name or p.name
with open(p, "rb") as f:
part = MIMEBase("application", "octet-stream")
part.set_payload(f.read())
encoders.encode_base64(part)
part.add_header("Content-Disposition", f"attachment; filename={fname}")
msg.attach(part)
smtp = smtplib.SMTP(self._smtp_host, self._smtp_port)
smtp.starttls()
smtp.login(self._address, self._password)
smtp.send_message(msg)
smtp.quit()
return msg_id
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
"""Return basic info about the email chat."""
ctx = self._thread_context.get(chat_id, {})
return {
"name": chat_id,
"type": "dm",
"chat_id": chat_id,
"subject": ctx.get("subject", ""),
}

View File

@@ -419,7 +419,7 @@ class HomeAssistantAdapter(BasePlatformAdapter):
except Exception as e:
return SendResult(success=False, error=str(e))
async def send_typing(self, chat_id: str, metadata=None) -> None:
async def send_typing(self, chat_id: str) -> None:
"""No typing indicator for Home Assistant."""
pass

View File

@@ -1,727 +0,0 @@
"""Signal messenger platform adapter.
Connects to a signal-cli daemon running in HTTP mode.
Inbound messages arrive via SSE (Server-Sent Events) streaming.
Outbound messages and actions use JSON-RPC 2.0 over HTTP.
Based on PR #268 by ibhagwan, rebuilt with bug fixes.
Requires:
- signal-cli installed and running: signal-cli daemon --http 127.0.0.1:8080
- SIGNAL_HTTP_URL and SIGNAL_ACCOUNT environment variables set
"""
import asyncio
import base64
import json
import logging
import os
import random
import re
import time
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, List, Optional, Any
from urllib.parse import unquote
import httpx
from gateway.config import Platform, PlatformConfig
from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
MessageType,
SendResult,
cache_image_from_bytes,
cache_audio_from_bytes,
cache_document_from_bytes,
cache_image_from_url,
)
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Constants
# ---------------------------------------------------------------------------
SIGNAL_MAX_ATTACHMENT_SIZE = 100 * 1024 * 1024 # 100 MB
MAX_MESSAGE_LENGTH = 8000 # Signal message size limit
TYPING_INTERVAL = 8.0 # seconds between typing indicator refreshes
SSE_RETRY_DELAY_INITIAL = 2.0
SSE_RETRY_DELAY_MAX = 60.0
HEALTH_CHECK_INTERVAL = 30.0 # seconds between health checks
HEALTH_CHECK_STALE_THRESHOLD = 120.0 # seconds without SSE activity before concern
# E.164 phone number pattern for redaction
_PHONE_RE = re.compile(r"\+[1-9]\d{6,14}")
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _redact_phone(phone: str) -> str:
"""Redact a phone number for logging: +15551234567 -> +155****4567."""
if not phone:
return "<none>"
if len(phone) <= 8:
return phone[:2] + "****" + phone[-2:] if len(phone) > 4 else "****"
return phone[:4] + "****" + phone[-4:]
def _parse_comma_list(value: str) -> List[str]:
"""Split a comma-separated string into a list, stripping whitespace."""
return [v.strip() for v in value.split(",") if v.strip()]
def _guess_extension(data: bytes) -> str:
"""Guess file extension from magic bytes."""
if data[:4] == b"\x89PNG":
return ".png"
if data[:2] == b"\xff\xd8":
return ".jpg"
if data[:4] == b"GIF8":
return ".gif"
if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WEBP":
return ".webp"
if data[:4] == b"%PDF":
return ".pdf"
if len(data) >= 8 and data[4:8] == b"ftyp":
return ".mp4"
if data[:4] == b"OggS":
return ".ogg"
if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
return ".mp3"
if data[:2] == b"PK":
return ".zip"
return ".bin"
def _is_image_ext(ext: str) -> bool:
return ext.lower() in (".jpg", ".jpeg", ".png", ".gif", ".webp")
def _is_audio_ext(ext: str) -> bool:
return ext.lower() in (".mp3", ".wav", ".ogg", ".m4a", ".aac")
_EXT_TO_MIME = {
".jpg": "image/jpeg", ".jpeg": "image/jpeg", ".png": "image/png",
".gif": "image/gif", ".webp": "image/webp",
".ogg": "audio/ogg", ".mp3": "audio/mpeg", ".wav": "audio/wav",
".m4a": "audio/mp4", ".aac": "audio/aac",
".mp4": "video/mp4", ".pdf": "application/pdf", ".zip": "application/zip",
}
def _ext_to_mime(ext: str) -> str:
"""Map file extension to MIME type."""
return _EXT_TO_MIME.get(ext.lower(), "application/octet-stream")
def _render_mentions(text: str, mentions: list) -> str:
"""Replace Signal mention placeholders (\\uFFFC) with readable @identifiers.
Signal encodes @mentions as the Unicode object replacement character
with out-of-band metadata containing the mentioned user's UUID/number.
"""
if not mentions or "\uFFFC" not in text:
return text
# Sort mentions by start position (reverse) to replace from end to start
# so indices don't shift as we replace
sorted_mentions = sorted(mentions, key=lambda m: m.get("start", 0), reverse=True)
for mention in sorted_mentions:
start = mention.get("start", 0)
length = mention.get("length", 1)
# Use the mention's number or UUID as the replacement
identifier = mention.get("number") or mention.get("uuid") or "user"
replacement = f"@{identifier}"
text = text[:start] + replacement + text[start + length:]
return text
def check_signal_requirements() -> bool:
"""Check if Signal is configured (has URL and account)."""
return bool(os.getenv("SIGNAL_HTTP_URL") and os.getenv("SIGNAL_ACCOUNT"))
# ---------------------------------------------------------------------------
# Signal Adapter
# ---------------------------------------------------------------------------
class SignalAdapter(BasePlatformAdapter):
"""Signal messenger adapter using signal-cli HTTP daemon."""
platform = Platform.SIGNAL
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.SIGNAL)
extra = config.extra or {}
self.http_url = extra.get("http_url", "http://127.0.0.1:8080").rstrip("/")
self.account = extra.get("account", "")
self.ignore_stories = extra.get("ignore_stories", True)
# Parse allowlists — group policy is derived from presence of group allowlist
group_allowed_str = os.getenv("SIGNAL_GROUP_ALLOWED_USERS", "")
self.group_allow_from = set(_parse_comma_list(group_allowed_str))
# HTTP client
self.client: Optional[httpx.AsyncClient] = None
# Background tasks
self._sse_task: Optional[asyncio.Task] = None
self._health_monitor_task: Optional[asyncio.Task] = None
self._typing_tasks: Dict[str, asyncio.Task] = {}
self._running = False
self._last_sse_activity = 0.0
self._sse_response: Optional[httpx.Response] = None
# Normalize account for self-message filtering
self._account_normalized = self.account.strip()
logger.info("Signal adapter initialized: url=%s account=%s groups=%s",
self.http_url, _redact_phone(self.account),
"enabled" if self.group_allow_from else "disabled")
# ------------------------------------------------------------------
# Lifecycle
# ------------------------------------------------------------------
async def connect(self) -> bool:
"""Connect to signal-cli daemon and start SSE listener."""
if not self.http_url or not self.account:
logger.error("Signal: SIGNAL_HTTP_URL and SIGNAL_ACCOUNT are required")
return False
self.client = httpx.AsyncClient(timeout=30.0)
# Health check — verify signal-cli daemon is reachable
try:
resp = await self.client.get(f"{self.http_url}/api/v1/check", timeout=10.0)
if resp.status_code != 200:
logger.error("Signal: health check failed (status %d)", resp.status_code)
return False
except Exception as e:
logger.error("Signal: cannot reach signal-cli at %s: %s", self.http_url, e)
return False
self._running = True
self._last_sse_activity = time.time()
self._sse_task = asyncio.create_task(self._sse_listener())
self._health_monitor_task = asyncio.create_task(self._health_monitor())
logger.info("Signal: connected to %s", self.http_url)
return True
async def disconnect(self) -> None:
"""Stop SSE listener and clean up."""
self._running = False
if self._sse_task:
self._sse_task.cancel()
try:
await self._sse_task
except asyncio.CancelledError:
pass
if self._health_monitor_task:
self._health_monitor_task.cancel()
try:
await self._health_monitor_task
except asyncio.CancelledError:
pass
# Cancel all typing tasks
for task in self._typing_tasks.values():
task.cancel()
self._typing_tasks.clear()
if self.client:
await self.client.aclose()
self.client = None
logger.info("Signal: disconnected")
# ------------------------------------------------------------------
# SSE Streaming (inbound messages)
# ------------------------------------------------------------------
async def _sse_listener(self) -> None:
"""Listen for SSE events from signal-cli daemon."""
url = f"{self.http_url}/api/v1/events?account={self.account}"
backoff = SSE_RETRY_DELAY_INITIAL
while self._running:
try:
logger.debug("Signal SSE: connecting to %s", url)
async with self.client.stream(
"GET", url,
headers={"Accept": "text/event-stream"},
timeout=None,
) as response:
self._sse_response = response
backoff = SSE_RETRY_DELAY_INITIAL # Reset on successful connection
self._last_sse_activity = time.time()
logger.info("Signal SSE: connected")
buffer = ""
async for chunk in response.aiter_text():
if not self._running:
break
buffer += chunk
while "\n" in buffer:
line, buffer = buffer.split("\n", 1)
line = line.strip()
if not line:
continue
# Parse SSE data lines
if line.startswith("data:"):
data_str = line[5:].strip()
if not data_str:
continue
self._last_sse_activity = time.time()
try:
data = json.loads(data_str)
await self._handle_envelope(data)
except json.JSONDecodeError:
logger.debug("Signal SSE: invalid JSON: %s", data_str[:100])
except Exception:
logger.exception("Signal SSE: error handling event")
except asyncio.CancelledError:
break
except httpx.HTTPError as e:
if self._running:
logger.warning("Signal SSE: HTTP error: %s (reconnecting in %.0fs)", e, backoff)
except Exception as e:
if self._running:
logger.warning("Signal SSE: error: %s (reconnecting in %.0fs)", e, backoff)
if self._running:
# Add 20% jitter to prevent thundering herd on reconnection
jitter = backoff * 0.2 * random.random()
await asyncio.sleep(backoff + jitter)
backoff = min(backoff * 2, SSE_RETRY_DELAY_MAX)
self._sse_response = None
# ------------------------------------------------------------------
# Health Monitor
# ------------------------------------------------------------------
async def _health_monitor(self) -> None:
"""Monitor SSE connection health and force reconnect if stale."""
while self._running:
await asyncio.sleep(HEALTH_CHECK_INTERVAL)
if not self._running:
break
elapsed = time.time() - self._last_sse_activity
if elapsed > HEALTH_CHECK_STALE_THRESHOLD:
logger.warning("Signal: SSE idle for %.0fs, checking daemon health", elapsed)
try:
resp = await self.client.get(
f"{self.http_url}/api/v1/check", timeout=10.0
)
if resp.status_code == 200:
# Daemon is alive but SSE is idle — update activity to
# avoid repeated warnings (connection may just be quiet)
self._last_sse_activity = time.time()
logger.debug("Signal: daemon healthy, SSE idle")
else:
logger.warning("Signal: health check failed (%d), forcing reconnect", resp.status_code)
self._force_reconnect()
except Exception as e:
logger.warning("Signal: health check error: %s, forcing reconnect", e)
self._force_reconnect()
def _force_reconnect(self) -> None:
"""Force SSE reconnection by closing the current response."""
if self._sse_response and not self._sse_response.is_stream_consumed:
try:
asyncio.create_task(self._sse_response.aclose())
except Exception:
pass
self._sse_response = None
# ------------------------------------------------------------------
# Message Handling
# ------------------------------------------------------------------
async def _handle_envelope(self, envelope: dict) -> None:
"""Process an incoming signal-cli envelope."""
# Unwrap nested envelope if present
envelope_data = envelope.get("envelope", envelope)
# Filter syncMessage envelopes (sent transcripts, read receipts, etc.)
# signal-cli may set syncMessage to null vs omitting it, so check key existence
if "syncMessage" in envelope_data:
return
# Extract sender info
sender = (
envelope_data.get("sourceNumber")
or envelope_data.get("sourceUuid")
or envelope_data.get("source")
)
sender_name = envelope_data.get("sourceName", "")
sender_uuid = envelope_data.get("sourceUuid", "")
if not sender:
logger.debug("Signal: ignoring envelope with no sender")
return
# Self-message filtering — prevent reply loops
if self._account_normalized and sender == self._account_normalized:
return
# Filter stories
if self.ignore_stories and envelope_data.get("storyMessage"):
return
# Get data message — also check editMessage (edited messages contain
# their updated dataMessage inside editMessage.dataMessage)
data_message = (
envelope_data.get("dataMessage")
or (envelope_data.get("editMessage") or {}).get("dataMessage")
)
if not data_message:
return
# Check for group message
group_info = data_message.get("groupInfo")
group_id = group_info.get("groupId") if group_info else None
is_group = bool(group_id)
# Group message filtering — derived from SIGNAL_GROUP_ALLOWED_USERS:
# - No env var set → groups disabled (default safe behavior)
# - Env var set with group IDs → only those groups allowed
# - Env var set with "*" → all groups allowed
# DM auth is fully handled by run.py (_is_user_authorized)
if is_group:
if not self.group_allow_from:
logger.debug("Signal: ignoring group message (no SIGNAL_GROUP_ALLOWED_USERS)")
return
if "*" not in self.group_allow_from and group_id not in self.group_allow_from:
logger.debug("Signal: group %s not in allowlist", group_id[:8] if group_id else "?")
return
# Build chat info
chat_id = sender if not is_group else f"group:{group_id}"
chat_type = "group" if is_group else "dm"
# Extract text and render mentions
text = data_message.get("message", "")
mentions = data_message.get("mentions", [])
if text and mentions:
text = _render_mentions(text, mentions)
# Process attachments
attachments_data = data_message.get("attachments", [])
media_urls = []
media_types = []
if attachments_data and not getattr(self, "ignore_attachments", False):
for att in attachments_data:
att_id = att.get("id")
att_size = att.get("size", 0)
if not att_id:
continue
if att_size > SIGNAL_MAX_ATTACHMENT_SIZE:
logger.warning("Signal: attachment too large (%d bytes), skipping", att_size)
continue
try:
cached_path, ext = await self._fetch_attachment(att_id)
if cached_path:
# Use contentType from Signal if available, else map from extension
content_type = att.get("contentType") or _ext_to_mime(ext)
media_urls.append(cached_path)
media_types.append(content_type)
except Exception:
logger.exception("Signal: failed to fetch attachment %s", att_id)
# Build session source
source = self.build_source(
chat_id=chat_id,
chat_name=group_info.get("groupName") if group_info else sender_name,
chat_type=chat_type,
user_id=sender,
user_name=sender_name or sender,
user_id_alt=sender_uuid if sender_uuid else None,
chat_id_alt=group_id if is_group else None,
)
# Determine message type from media
msg_type = MessageType.TEXT
if media_types:
if any(mt.startswith("audio/") for mt in media_types):
msg_type = MessageType.VOICE
elif any(mt.startswith("image/") for mt in media_types):
msg_type = MessageType.IMAGE
# Parse timestamp from envelope data (milliseconds since epoch)
ts_ms = envelope_data.get("timestamp", 0)
if ts_ms:
try:
timestamp = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
except (ValueError, OSError):
timestamp = datetime.now(tz=timezone.utc)
else:
timestamp = datetime.now(tz=timezone.utc)
# Build and dispatch event
event = MessageEvent(
source=source,
text=text or "",
message_type=msg_type,
media_urls=media_urls,
media_types=media_types,
timestamp=timestamp,
)
logger.debug("Signal: message from %s in %s: %s",
_redact_phone(sender), chat_id[:20], (text or "")[:50])
await self.handle_message(event)
# ------------------------------------------------------------------
# Attachment Handling
# ------------------------------------------------------------------
async def _fetch_attachment(self, attachment_id: str) -> tuple:
"""Fetch an attachment via JSON-RPC and cache it. Returns (path, ext)."""
result = await self._rpc("getAttachment", {
"account": self.account,
"attachmentId": attachment_id,
})
if not result:
return None, ""
# Result is base64-encoded file content
raw_data = base64.b64decode(result)
ext = _guess_extension(raw_data)
if _is_image_ext(ext):
path = cache_image_from_bytes(raw_data, ext)
elif _is_audio_ext(ext):
path = cache_audio_from_bytes(raw_data, ext)
else:
path = cache_document_from_bytes(raw_data, ext)
return path, ext
# ------------------------------------------------------------------
# JSON-RPC Communication
# ------------------------------------------------------------------
async def _rpc(self, method: str, params: dict, rpc_id: str = None) -> Any:
"""Send a JSON-RPC 2.0 request to signal-cli daemon."""
if not self.client:
logger.warning("Signal: RPC called but client not connected")
return None
if rpc_id is None:
rpc_id = f"{method}_{int(time.time() * 1000)}"
payload = {
"jsonrpc": "2.0",
"method": method,
"params": params,
"id": rpc_id,
}
try:
resp = await self.client.post(
f"{self.http_url}/api/v1/rpc",
json=payload,
timeout=30.0,
)
resp.raise_for_status()
data = resp.json()
if "error" in data:
logger.warning("Signal RPC error (%s): %s", method, data["error"])
return None
return data.get("result")
except Exception as e:
logger.warning("Signal RPC %s failed: %s", method, e)
return None
# ------------------------------------------------------------------
# Sending
# ------------------------------------------------------------------
async def send(
self,
chat_id: str,
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send a text message."""
await self._stop_typing_indicator(chat_id)
params: Dict[str, Any] = {
"account": self.account,
"message": content,
}
if chat_id.startswith("group:"):
params["groupId"] = chat_id[6:]
else:
params["recipient"] = [chat_id]
result = await self._rpc("send", params)
if result is not None:
return SendResult(success=True)
return SendResult(success=False, error="RPC send failed")
async def send_typing(self, chat_id: str, metadata=None) -> None:
"""Send a typing indicator."""
params: Dict[str, Any] = {
"account": self.account,
}
if chat_id.startswith("group:"):
params["groupId"] = chat_id[6:]
else:
params["recipient"] = [chat_id]
await self._rpc("sendTyping", params, rpc_id="typing")
async def send_image(
self,
chat_id: str,
image_url: str,
caption: Optional[str] = None,
**kwargs,
) -> SendResult:
"""Send an image. Supports http(s):// and file:// URLs."""
await self._stop_typing_indicator(chat_id)
# Resolve image to local path
if image_url.startswith("file://"):
file_path = unquote(image_url[7:])
else:
# Download remote image to cache
try:
file_path = await cache_image_from_url(image_url)
except Exception as e:
logger.warning("Signal: failed to download image: %s", e)
return SendResult(success=False, error=str(e))
if not file_path or not Path(file_path).exists():
return SendResult(success=False, error="Image file not found")
# Validate size
file_size = Path(file_path).stat().st_size
if file_size > SIGNAL_MAX_ATTACHMENT_SIZE:
return SendResult(success=False, error=f"Image too large ({file_size} bytes)")
params: Dict[str, Any] = {
"account": self.account,
"message": caption or "",
"attachments": [file_path],
}
if chat_id.startswith("group:"):
params["groupId"] = chat_id[6:]
else:
params["recipient"] = [chat_id]
result = await self._rpc("send", params)
if result is not None:
return SendResult(success=True)
return SendResult(success=False, error="RPC send with attachment failed")
async def send_document(
self,
chat_id: str,
file_path: str,
caption: Optional[str] = None,
filename: Optional[str] = None,
**kwargs,
) -> SendResult:
"""Send a document/file attachment."""
await self._stop_typing_indicator(chat_id)
if not Path(file_path).exists():
return SendResult(success=False, error="File not found")
params: Dict[str, Any] = {
"account": self.account,
"message": caption or "",
"attachments": [file_path],
}
if chat_id.startswith("group:"):
params["groupId"] = chat_id[6:]
else:
params["recipient"] = [chat_id]
result = await self._rpc("send", params)
if result is not None:
return SendResult(success=True)
return SendResult(success=False, error="RPC send document failed")
# ------------------------------------------------------------------
# Typing Indicators
# ------------------------------------------------------------------
async def _start_typing_indicator(self, chat_id: str) -> None:
"""Start a typing indicator loop for a chat."""
if chat_id in self._typing_tasks:
return # Already running
async def _typing_loop():
try:
while True:
await self.send_typing(chat_id)
await asyncio.sleep(TYPING_INTERVAL)
except asyncio.CancelledError:
pass
self._typing_tasks[chat_id] = asyncio.create_task(_typing_loop())
async def _stop_typing_indicator(self, chat_id: str) -> None:
"""Stop a typing indicator loop for a chat."""
task = self._typing_tasks.pop(chat_id, None)
if task:
task.cancel()
try:
await task
except asyncio.CancelledError:
pass
# ------------------------------------------------------------------
# Chat Info
# ------------------------------------------------------------------
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
"""Get information about a chat/contact."""
if chat_id.startswith("group:"):
return {
"name": chat_id,
"type": "group",
"chat_id": chat_id,
}
# Try to resolve contact name
result = await self._rpc("getContact", {
"account": self.account,
"contactAddress": chat_id,
})
name = chat_id
if result and isinstance(result, dict):
name = result.get("name") or result.get("profileName") or chat_id
return {
"name": name,
"type": "dm",
"chat_id": chat_id,
}

View File

@@ -9,9 +9,7 @@ Uses slack-bolt (Python) with Socket Mode for:
"""
import asyncio
import logging
import os
import re
from typing import Dict, List, Optional, Any
try:
@@ -35,16 +33,11 @@ from gateway.platforms.base import (
MessageEvent,
MessageType,
SendResult,
SUPPORTED_DOCUMENT_TYPES,
cache_document_from_bytes,
cache_image_from_url,
cache_audio_from_url,
)
logger = logging.getLogger(__name__)
def check_slack_requirements() -> bool:
"""Check if Slack dependencies are available."""
return SLACK_AVAILABLE
@@ -77,19 +70,17 @@ class SlackAdapter(BasePlatformAdapter):
async def connect(self) -> bool:
"""Connect to Slack via Socket Mode."""
if not SLACK_AVAILABLE:
logger.error(
"[Slack] slack-bolt not installed. Run: pip install slack-bolt",
)
print("[Slack] slack-bolt not installed. Run: pip install slack-bolt")
return False
bot_token = self.config.token
app_token = os.getenv("SLACK_APP_TOKEN")
if not bot_token:
logger.error("[Slack] SLACK_BOT_TOKEN not set")
print("[Slack] SLACK_BOT_TOKEN not set")
return False
if not app_token:
logger.error("[Slack] SLACK_APP_TOKEN not set")
print("[Slack] SLACK_APP_TOKEN not set")
return False
try:
@@ -105,13 +96,6 @@ class SlackAdapter(BasePlatformAdapter):
async def handle_message_event(event, say):
await self._handle_slack_message(event)
# Acknowledge app_mention events to prevent Bolt 404 errors.
# The "message" handler above already processes @mentions in
# channels, so this is intentionally a no-op to avoid duplicates.
@self._app.event("app_mention")
async def handle_app_mention(event, say):
pass
# Register slash command handler
@self._app.command("/hermes")
async def handle_hermes_command(ack, command):
@@ -123,22 +107,19 @@ class SlackAdapter(BasePlatformAdapter):
asyncio.create_task(self._handler.start_async())
self._running = True
logger.info("[Slack] Connected as @%s (Socket Mode)", bot_name)
print(f"[Slack] Connected as @{bot_name} (Socket Mode)")
return True
except Exception as e: # pragma: no cover - defensive logging
logger.error("[Slack] Connection failed: %s", e, exc_info=True)
except Exception as e:
print(f"[Slack] Connection failed: {e}")
return False
async def disconnect(self) -> None:
"""Disconnect from Slack."""
if self._handler:
try:
await self._handler.close_async()
except Exception as e: # pragma: no cover - defensive logging
logger.warning("[Slack] Error while closing Socket Mode handler: %s", e, exc_info=True)
await self._handler.close_async()
self._running = False
logger.info("[Slack] Disconnected")
print("[Slack] Disconnected")
async def send(
self,
@@ -171,8 +152,8 @@ class SlackAdapter(BasePlatformAdapter):
raw_response=result,
)
except Exception as e: # pragma: no cover - defensive logging
logger.error("[Slack] Send error: %s", e, exc_info=True)
except Exception as e:
print(f"[Slack] Send error: {e}")
return SendResult(success=False, error=str(e))
async def edit_message(
@@ -191,55 +172,13 @@ class SlackAdapter(BasePlatformAdapter):
text=content,
)
return SendResult(success=True, message_id=message_id)
except Exception as e: # pragma: no cover - defensive logging
logger.error(
"[Slack] Failed to edit message %s in channel %s: %s",
message_id,
chat_id,
e,
exc_info=True,
)
except Exception as e:
return SendResult(success=False, error=str(e))
async def send_typing(self, chat_id: str, metadata=None) -> None:
async def send_typing(self, chat_id: str) -> None:
"""Slack doesn't have a direct typing indicator API for bots."""
pass
async def send_image_file(
self,
chat_id: str,
image_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""Send a local image file to Slack by uploading it."""
if not self._app:
return SendResult(success=False, error="Not connected")
try:
import os
if not os.path.exists(image_path):
return SendResult(success=False, error=f"Image file not found: {image_path}")
result = await self._app.client.files_upload_v2(
channel=chat_id,
file=image_path,
filename=os.path.basename(image_path),
initial_comment=caption or "",
thread_ts=reply_to,
)
return SendResult(success=True, raw_response=result)
except Exception as e: # pragma: no cover - defensive logging
logger.error(
"[%s] Failed to send local Slack image %s: %s",
self.name,
image_path,
e,
exc_info=True,
)
return await super().send_image_file(chat_id, image_path, caption, reply_to)
async def send_image(
self,
chat_id: str,
@@ -269,13 +208,7 @@ class SlackAdapter(BasePlatformAdapter):
return SendResult(success=True, raw_response=result)
except Exception as e: # pragma: no cover - defensive logging
logger.warning(
"[Slack] Failed to upload image from URL %s, falling back to text: %s",
image_url,
e,
exc_info=True,
)
except Exception as e:
# Fall back to sending the URL as text
text = f"{caption}\n{image_url}" if caption else image_url
return await self.send(chat_id=chat_id, content=text, reply_to=reply_to)
@@ -301,86 +234,9 @@ class SlackAdapter(BasePlatformAdapter):
)
return SendResult(success=True, raw_response=result)
except Exception as e: # pragma: no cover - defensive logging
logger.error(
"[Slack] Failed to send audio file %s: %s",
audio_path,
e,
exc_info=True,
)
except Exception as e:
return SendResult(success=False, error=str(e))
async def send_video(
self,
chat_id: str,
video_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""Send a video file to Slack."""
if not self._app:
return SendResult(success=False, error="Not connected")
if not os.path.exists(video_path):
return SendResult(success=False, error=f"Video file not found: {video_path}")
try:
result = await self._app.client.files_upload_v2(
channel=chat_id,
file=video_path,
filename=os.path.basename(video_path),
initial_comment=caption or "",
thread_ts=reply_to,
)
return SendResult(success=True, raw_response=result)
except Exception as e: # pragma: no cover - defensive logging
logger.error(
"[%s] Failed to send video %s: %s",
self.name,
video_path,
e,
exc_info=True,
)
return await super().send_video(chat_id, video_path, caption, reply_to)
async def send_document(
self,
chat_id: str,
file_path: str,
caption: Optional[str] = None,
file_name: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""Send a document/file attachment to Slack."""
if not self._app:
return SendResult(success=False, error="Not connected")
if not os.path.exists(file_path):
return SendResult(success=False, error=f"File not found: {file_path}")
display_name = file_name or os.path.basename(file_path)
try:
result = await self._app.client.files_upload_v2(
channel=chat_id,
file=file_path,
filename=display_name,
initial_comment=caption or "",
thread_ts=reply_to,
)
return SendResult(success=True, raw_response=result)
except Exception as e: # pragma: no cover - defensive logging
logger.error(
"[%s] Failed to send document %s: %s",
self.name,
file_path,
e,
exc_info=True,
)
return await super().send_document(chat_id, file_path, caption, file_name, reply_to)
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
"""Get information about a Slack channel."""
if not self._app:
@@ -394,13 +250,7 @@ class SlackAdapter(BasePlatformAdapter):
"name": channel.get("name", chat_id),
"type": "dm" if is_dm else "group",
}
except Exception as e: # pragma: no cover - defensive logging
logger.error(
"[Slack] Failed to fetch chat info for %s: %s",
chat_id,
e,
exc_info=True,
)
except Exception:
return {"name": chat_id, "type": "unknown"}
# ----- Internal handlers -----
@@ -455,8 +305,8 @@ class SlackAdapter(BasePlatformAdapter):
media_urls.append(cached)
media_types.append(mimetype)
msg_type = MessageType.PHOTO
except Exception as e: # pragma: no cover - defensive logging
logger.warning("[Slack] Failed to cache image from %s: %s", url, e, exc_info=True)
except Exception as e:
print(f"[Slack] Failed to cache image: {e}", flush=True)
elif mimetype.startswith("audio/") and url:
try:
ext = "." + mimetype.split("/")[-1].split(";")[0]
@@ -466,60 +316,8 @@ class SlackAdapter(BasePlatformAdapter):
media_urls.append(cached)
media_types.append(mimetype)
msg_type = MessageType.VOICE
except Exception as e: # pragma: no cover - defensive logging
logger.warning("[Slack] Failed to cache audio from %s: %s", url, e, exc_info=True)
elif url:
# Try to handle as a document attachment
try:
original_filename = f.get("name", "")
ext = ""
if original_filename:
_, ext = os.path.splitext(original_filename)
ext = ext.lower()
# Fallback: reverse-lookup from MIME type
if not ext and mimetype:
mime_to_ext = {v: k for k, v in SUPPORTED_DOCUMENT_TYPES.items()}
ext = mime_to_ext.get(mimetype, "")
if ext not in SUPPORTED_DOCUMENT_TYPES:
continue # Skip unsupported file types silently
# Check file size (Slack limit: 20 MB for bots)
file_size = f.get("size", 0)
MAX_DOC_BYTES = 20 * 1024 * 1024
if not file_size or file_size > MAX_DOC_BYTES:
logger.warning("[Slack] Document too large or unknown size: %s", file_size)
continue
# Download and cache
raw_bytes = await self._download_slack_file_bytes(url)
cached_path = cache_document_from_bytes(
raw_bytes, original_filename or f"document{ext}"
)
doc_mime = SUPPORTED_DOCUMENT_TYPES[ext]
media_urls.append(cached_path)
media_types.append(doc_mime)
msg_type = MessageType.DOCUMENT
logger.debug("[Slack] Cached user document: %s", cached_path)
# Inject text content for .txt/.md files (capped at 100 KB)
MAX_TEXT_INJECT_BYTES = 100 * 1024
if ext in (".md", ".txt") and len(raw_bytes) <= MAX_TEXT_INJECT_BYTES:
try:
text_content = raw_bytes.decode("utf-8")
display_name = original_filename or f"document{ext}"
display_name = re.sub(r'[^\w.\- ]', '_', display_name)
injection = f"[Content of {display_name}]:\n{text_content}"
if text:
text = f"{injection}\n\n{text}"
else:
text = injection
except UnicodeDecodeError:
pass # Binary content, skip injection
except Exception as e: # pragma: no cover - defensive logging
logger.warning("[Slack] Failed to cache document from %s: %s", url, e, exc_info=True)
except Exception as e:
print(f"[Slack] Failed to cache audio: {e}", flush=True)
# Build source
source = self.build_source(
@@ -600,16 +398,3 @@ class SlackAdapter(BasePlatformAdapter):
else:
from gateway.platforms.base import cache_image_from_bytes
return cache_image_from_bytes(response.content, ext)
async def _download_slack_file_bytes(self, url: str) -> bytes:
"""Download a Slack file and return raw bytes."""
import httpx
bot_token = self.config.token
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
response = await client.get(
url,
headers={"Authorization": f"Bearer {bot_token}"},
)
response.raise_for_status()
return response.content

View File

@@ -8,13 +8,10 @@ Uses python-telegram-bot library for:
"""
import asyncio
import logging
import os
import re
from typing import Dict, List, Optional, Any
logger = logging.getLogger(__name__)
try:
from telegram import Update, Bot, Message
from telegram.ext import (
@@ -76,22 +73,6 @@ def _escape_mdv2(text: str) -> str:
return _MDV2_ESCAPE_RE.sub(r'\\\1', text)
def _strip_mdv2(text: str) -> str:
"""Strip MarkdownV2 escape backslashes to produce clean plain text.
Also removes MarkdownV2 bold markers (*text* -> text) so the fallback
doesn't show stray asterisks from header/bold conversion.
"""
# Remove escape backslashes before special characters
cleaned = re.sub(r'\\([_*\[\]()~`>#\+\-=|{}.!\\])', r'\1', text)
# Remove MarkdownV2 bold markers that format_message converted from **bold**
cleaned = re.sub(r'\*([^*]+)\*', r'\1', cleaned)
# Remove MarkdownV2 italic markers that format_message converted from *italic*
# Use word boundary (\b) to avoid breaking snake_case like my_variable_name
cleaned = re.sub(r'(?<!\w)_([^_]+)_(?!\w)', r'\1', cleaned)
return cleaned
class TelegramAdapter(BasePlatformAdapter):
"""
Telegram bot adapter.
@@ -114,14 +95,11 @@ class TelegramAdapter(BasePlatformAdapter):
async def connect(self) -> bool:
"""Connect to Telegram and start polling for updates."""
if not TELEGRAM_AVAILABLE:
logger.error(
"[%s] python-telegram-bot not installed. Run: pip install python-telegram-bot",
self.name,
)
print(f"[{self.name}] python-telegram-bot not installed. Run: pip install python-telegram-bot")
return False
if not self.config.token:
logger.error("[%s] No bot token configured", self.name)
print(f"[{self.name}] No bot token configured")
return False
try:
@@ -138,10 +116,6 @@ class TelegramAdapter(BasePlatformAdapter):
filters.COMMAND,
self._handle_command
))
self._app.add_handler(TelegramMessageHandler(
filters.LOCATION | getattr(filters, "VENUE", filters.LOCATION),
self._handle_location_message
))
self._app.add_handler(TelegramMessageHandler(
filters.PHOTO | filters.VIDEO | filters.AUDIO | filters.VOICE | filters.Document.ALL | filters.Sticker.ALL,
self._handle_media_message
@@ -165,30 +139,17 @@ class TelegramAdapter(BasePlatformAdapter):
BotCommand("status", "Show session info"),
BotCommand("stop", "Stop the running agent"),
BotCommand("sethome", "Set this chat as the home channel"),
BotCommand("compress", "Compress conversation context"),
BotCommand("title", "Set or show the session title"),
BotCommand("resume", "Resume a previously-named session"),
BotCommand("usage", "Show token usage for this session"),
BotCommand("provider", "Show available providers"),
BotCommand("insights", "Show usage insights and analytics"),
BotCommand("update", "Update Hermes to the latest version"),
BotCommand("reload_mcp", "Reload MCP servers from config"),
BotCommand("help", "Show available commands"),
])
except Exception as e:
logger.warning(
"[%s] Could not register Telegram command menu: %s",
self.name,
e,
exc_info=True,
)
print(f"[{self.name}] Could not register command menu: {e}")
self._running = True
logger.info("[%s] Connected and polling for Telegram updates", self.name)
print(f"[{self.name}] Connected and polling for updates")
return True
except Exception as e:
logger.error("[%s] Failed to connect to Telegram: %s", self.name, e, exc_info=True)
print(f"[{self.name}] Failed to connect: {e}")
return False
async def disconnect(self) -> None:
@@ -199,12 +160,12 @@ class TelegramAdapter(BasePlatformAdapter):
await self._app.stop()
await self._app.shutdown()
except Exception as e:
logger.warning("[%s] Error during Telegram disconnect: %s", self.name, e, exc_info=True)
print(f"[{self.name}] Error during disconnect: {e}")
self._running = False
self._app = None
self._bot = None
logger.info("[%s] Disconnected from Telegram", self.name)
print(f"[{self.name}] Disconnected")
async def send(
self,
@@ -238,13 +199,9 @@ class TelegramAdapter(BasePlatformAdapter):
except Exception as md_error:
# Markdown parsing failed, try plain text
if "parse" in str(md_error).lower() or "markdown" in str(md_error).lower():
logger.warning("[%s] MarkdownV2 parse failed, falling back to plain text: %s", self.name, md_error)
# Strip MDV2 escape backslashes so the user doesn't
# see raw backslashes littered through the message.
plain_chunk = _strip_mdv2(chunk)
msg = await self._bot.send_message(
chat_id=int(chat_id),
text=plain_chunk,
text=chunk,
parse_mode=None, # Plain text
reply_to_message_id=int(reply_to) if reply_to and i == 0 else None,
message_thread_id=int(thread_id) if thread_id else None,
@@ -260,7 +217,6 @@ class TelegramAdapter(BasePlatformAdapter):
)
except Exception as e:
logger.error("[%s] Failed to send Telegram message: %s", self.name, e, exc_info=True)
return SendResult(success=False, error=str(e))
async def edit_message(
@@ -290,13 +246,6 @@ class TelegramAdapter(BasePlatformAdapter):
)
return SendResult(success=True, message_id=message_id)
except Exception as e:
logger.error(
"[%s] Failed to edit Telegram message %s: %s",
self.name,
message_id,
e,
exc_info=True,
)
return SendResult(success=False, error=str(e))
async def send_voice(
@@ -305,7 +254,6 @@ class TelegramAdapter(BasePlatformAdapter):
audio_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send audio as a native Telegram voice message or audio file."""
if not self._bot:
@@ -319,186 +267,49 @@ class TelegramAdapter(BasePlatformAdapter):
with open(audio_path, "rb") as audio_file:
# .ogg files -> send as voice (round playable bubble)
if audio_path.endswith(".ogg") or audio_path.endswith(".opus"):
_voice_thread = metadata.get("thread_id") if metadata else None
msg = await self._bot.send_voice(
chat_id=int(chat_id),
voice=audio_file,
caption=caption[:1024] if caption else None,
reply_to_message_id=int(reply_to) if reply_to else None,
message_thread_id=int(_voice_thread) if _voice_thread else None,
)
else:
# .mp3 and others -> send as audio file
_audio_thread = metadata.get("thread_id") if metadata else None
msg = await self._bot.send_audio(
chat_id=int(chat_id),
audio=audio_file,
caption=caption[:1024] if caption else None,
reply_to_message_id=int(reply_to) if reply_to else None,
message_thread_id=int(_audio_thread) if _audio_thread else None,
)
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e:
logger.error(
"[%s] Failed to send Telegram voice/audio, falling back to base adapter: %s",
self.name,
e,
exc_info=True,
)
print(f"[{self.name}] Failed to send voice/audio: {e}")
return await super().send_voice(chat_id, audio_path, caption, reply_to)
async def send_image_file(
self,
chat_id: str,
image_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
**kwargs,
) -> SendResult:
"""Send a local image file natively as a Telegram photo."""
if not self._bot:
return SendResult(success=False, error="Not connected")
try:
import os
if not os.path.exists(image_path):
return SendResult(success=False, error=f"Image file not found: {image_path}")
with open(image_path, "rb") as image_file:
msg = await self._bot.send_photo(
chat_id=int(chat_id),
photo=image_file,
caption=caption[:1024] if caption else None,
reply_to_message_id=int(reply_to) if reply_to else None,
)
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e:
logger.error(
"[%s] Failed to send Telegram local image, falling back to base adapter: %s",
self.name,
e,
exc_info=True,
)
return await super().send_image_file(chat_id, image_path, caption, reply_to)
async def send_document(
self,
chat_id: str,
file_path: str,
caption: Optional[str] = None,
file_name: Optional[str] = None,
reply_to: Optional[str] = None,
**kwargs,
) -> SendResult:
"""Send a document/file natively as a Telegram file attachment."""
if not self._bot:
return SendResult(success=False, error="Not connected")
try:
if not os.path.exists(file_path):
return SendResult(success=False, error=f"File not found: {file_path}")
display_name = file_name or os.path.basename(file_path)
with open(file_path, "rb") as f:
msg = await self._bot.send_document(
chat_id=int(chat_id),
document=f,
filename=display_name,
caption=caption[:1024] if caption else None,
reply_to_message_id=int(reply_to) if reply_to else None,
)
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e:
print(f"[{self.name}] Failed to send document: {e}")
return await super().send_document(chat_id, file_path, caption, file_name, reply_to)
async def send_video(
self,
chat_id: str,
video_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
**kwargs,
) -> SendResult:
"""Send a video natively as a Telegram video message."""
if not self._bot:
return SendResult(success=False, error="Not connected")
try:
if not os.path.exists(video_path):
return SendResult(success=False, error=f"Video file not found: {video_path}")
with open(video_path, "rb") as f:
msg = await self._bot.send_video(
chat_id=int(chat_id),
video=f,
caption=caption[:1024] if caption else None,
reply_to_message_id=int(reply_to) if reply_to else None,
)
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e:
print(f"[{self.name}] Failed to send video: {e}")
return await super().send_video(chat_id, video_path, caption, reply_to)
async def send_image(
self,
chat_id: str,
image_url: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send an image natively as a Telegram photo.
Tries URL-based send first (fast, works for <5MB images).
Falls back to downloading and uploading as file (supports up to 10MB).
"""
"""Send an image natively as a Telegram photo."""
if not self._bot:
return SendResult(success=False, error="Not connected")
try:
# Telegram can send photos directly from URLs (up to ~5MB)
_photo_thread = metadata.get("thread_id") if metadata else None
# Telegram can send photos directly from URLs
msg = await self._bot.send_photo(
chat_id=int(chat_id),
photo=image_url,
caption=caption[:1024] if caption else None, # Telegram caption limit
reply_to_message_id=int(reply_to) if reply_to else None,
message_thread_id=int(_photo_thread) if _photo_thread else None,
)
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e:
logger.warning(
"[%s] URL-based send_photo failed, trying file upload: %s",
self.name,
e,
exc_info=True,
)
# Fallback: download and upload as file (supports up to 10MB)
try:
import httpx
async with httpx.AsyncClient(timeout=30.0) as client:
resp = await client.get(image_url)
resp.raise_for_status()
image_data = resp.content
msg = await self._bot.send_photo(
chat_id=int(chat_id),
photo=image_data,
caption=caption[:1024] if caption else None,
reply_to_message_id=int(reply_to) if reply_to else None,
)
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e2:
logger.error(
"[%s] File upload send_photo also failed: %s",
self.name,
e2,
exc_info=True,
)
# Final fallback: send URL as text
return await super().send_image(chat_id, image_url, caption, reply_to)
print(f"[{self.name}] Failed to send photo, falling back to URL: {e}")
# Fallback: send as text link
return await super().send_image(chat_id, image_url, caption, reply_to)
async def send_animation(
self,
@@ -506,50 +317,34 @@ class TelegramAdapter(BasePlatformAdapter):
animation_url: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send an animated GIF natively as a Telegram animation (auto-plays inline)."""
if not self._bot:
return SendResult(success=False, error="Not connected")
try:
_anim_thread = metadata.get("thread_id") if metadata else None
msg = await self._bot.send_animation(
chat_id=int(chat_id),
animation=animation_url,
caption=caption[:1024] if caption else None,
reply_to_message_id=int(reply_to) if reply_to else None,
message_thread_id=int(_anim_thread) if _anim_thread else None,
)
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e:
logger.error(
"[%s] Failed to send Telegram animation, falling back to photo: %s",
self.name,
e,
exc_info=True,
)
print(f"[{self.name}] Failed to send animation, falling back to photo: {e}")
# Fallback: try as a regular photo
return await self.send_image(chat_id, animation_url, caption, reply_to)
async def send_typing(self, chat_id: str, metadata: Optional[Dict[str, Any]] = None) -> None:
async def send_typing(self, chat_id: str) -> None:
"""Send typing indicator."""
if self._bot:
try:
_typing_thread = metadata.get("thread_id") if metadata else None
await self._bot.send_chat_action(
chat_id=int(chat_id),
action="typing",
message_thread_id=int(_typing_thread) if _typing_thread else None,
)
except Exception as e:
# Typing failures are non-fatal; log at debug level only.
logger.debug(
"[%s] Failed to send Telegram typing indicator: %s",
self.name,
e,
exc_info=True,
action="typing"
)
except Exception:
pass # Ignore typing indicator failures
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
"""Get information about a Telegram chat."""
@@ -576,13 +371,6 @@ class TelegramAdapter(BasePlatformAdapter):
"is_forum": getattr(chat, "is_forum", False),
}
except Exception as e:
logger.error(
"[%s] Failed to get Telegram chat info for %s: %s",
self.name,
chat_id,
e,
exc_info=True,
)
return {"name": str(chat_id), "type": "dm", "error": str(e)}
def format_message(self, content: str) -> str:
@@ -681,41 +469,6 @@ class TelegramAdapter(BasePlatformAdapter):
event = self._build_message_event(update.message, MessageType.COMMAND)
await self.handle_message(event)
async def _handle_location_message(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle incoming location/venue pin messages."""
if not update.message:
return
msg = update.message
venue = getattr(msg, "venue", None)
location = getattr(venue, "location", None) if venue else getattr(msg, "location", None)
if not location:
return
lat = getattr(location, "latitude", None)
lon = getattr(location, "longitude", None)
if lat is None or lon is None:
return
# Build a text message with coordinates and context
parts = ["[The user shared a location pin.]"]
if venue:
title = getattr(venue, "title", None)
address = getattr(venue, "address", None)
if title:
parts.append(f"Venue: {title}")
if address:
parts.append(f"Address: {address}")
parts.append(f"latitude: {lat}")
parts.append(f"longitude: {lon}")
parts.append(f"Map: https://www.google.com/maps/search/?api=1&query={lat},{lon}")
parts.append("Ask what they'd like to find nearby (restaurants, cafes, etc.) and any preferences.")
event = self._build_message_event(msg, MessageType.LOCATION)
event.text = "\n".join(parts)
await self.handle_message(event)
async def _handle_media_message(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle incoming media messages, downloading images to local cache."""
if not update.message:
@@ -771,9 +524,9 @@ class TelegramAdapter(BasePlatformAdapter):
cached_path = cache_image_from_bytes(bytes(image_bytes), ext=ext)
event.media_urls = [cached_path]
event.media_types = [f"image/{ext.lstrip('.')}"]
logger.info("[Telegram] Cached user photo at %s", cached_path)
print(f"[Telegram] Cached user photo: {cached_path}", flush=True)
except Exception as e:
logger.warning("[Telegram] Failed to cache photo: %s", e, exc_info=True)
print(f"[Telegram] Failed to cache photo: {e}", flush=True)
# Download voice/audio messages to cache for STT transcription
if msg.voice:
@@ -783,9 +536,9 @@ class TelegramAdapter(BasePlatformAdapter):
cached_path = cache_audio_from_bytes(bytes(audio_bytes), ext=".ogg")
event.media_urls = [cached_path]
event.media_types = ["audio/ogg"]
logger.info("[Telegram] Cached user voice at %s", cached_path)
print(f"[Telegram] Cached user voice: {cached_path}", flush=True)
except Exception as e:
logger.warning("[Telegram] Failed to cache voice: %s", e, exc_info=True)
print(f"[Telegram] Failed to cache voice: {e}", flush=True)
elif msg.audio:
try:
file_obj = await msg.audio.get_file()
@@ -793,9 +546,9 @@ class TelegramAdapter(BasePlatformAdapter):
cached_path = cache_audio_from_bytes(bytes(audio_bytes), ext=".mp3")
event.media_urls = [cached_path]
event.media_types = ["audio/mp3"]
logger.info("[Telegram] Cached user audio at %s", cached_path)
print(f"[Telegram] Cached user audio: {cached_path}", flush=True)
except Exception as e:
logger.warning("[Telegram] Failed to cache audio: %s", e, exc_info=True)
print(f"[Telegram] Failed to cache audio: {e}", flush=True)
# Download document files to cache for agent processing
elif msg.document:
@@ -820,7 +573,7 @@ class TelegramAdapter(BasePlatformAdapter):
f"Unsupported document type '{ext or 'unknown'}'. "
f"Supported types: {supported_list}"
)
logger.info("[Telegram] Unsupported document type: %s", ext or "unknown")
print(f"[Telegram] Unsupported document type: {ext or 'unknown'}", flush=True)
await self.handle_message(event)
return
@@ -831,7 +584,7 @@ class TelegramAdapter(BasePlatformAdapter):
"The document is too large or its size could not be verified. "
"Maximum: 20 MB."
)
logger.info("[Telegram] Document too large: %s bytes", doc.file_size)
print(f"[Telegram] Document too large: {doc.file_size} bytes", flush=True)
await self.handle_message(event)
return
@@ -843,7 +596,7 @@ class TelegramAdapter(BasePlatformAdapter):
mime_type = SUPPORTED_DOCUMENT_TYPES[ext]
event.media_urls = [cached_path]
event.media_types = [mime_type]
logger.info("[Telegram] Cached user document at %s", cached_path)
print(f"[Telegram] Cached user document: {cached_path}", flush=True)
# For text files, inject content into event.text (capped at 100 KB)
MAX_TEXT_INJECT_BYTES = 100 * 1024
@@ -858,13 +611,10 @@ class TelegramAdapter(BasePlatformAdapter):
else:
event.text = injection
except UnicodeDecodeError:
logger.warning(
"[Telegram] Could not decode text file as UTF-8, skipping content injection",
exc_info=True,
)
print(f"[Telegram] Could not decode text file as UTF-8, skipping content injection", flush=True)
except Exception as e:
logger.warning("[Telegram] Failed to cache document: %s", e, exc_info=True)
print(f"[Telegram] Failed to cache document: {e}", flush=True)
await self.handle_message(event)
@@ -899,7 +649,7 @@ class TelegramAdapter(BasePlatformAdapter):
event.text = build_sticker_injection(
cached["description"], cached.get("emoji", emoji), cached.get("set_name", set_name)
)
logger.info("[Telegram] Sticker cache hit: %s", sticker.file_unique_id)
print(f"[Telegram] Sticker cache hit: {sticker.file_unique_id}", flush=True)
return
# Cache miss -- download and analyze
@@ -907,7 +657,7 @@ class TelegramAdapter(BasePlatformAdapter):
file_obj = await sticker.get_file()
image_bytes = await file_obj.download_as_bytearray()
cached_path = cache_image_from_bytes(bytes(image_bytes), ext=".webp")
logger.info("[Telegram] Analyzing sticker at %s", cached_path)
print(f"[Telegram] Analyzing sticker: {cached_path}", flush=True)
from tools.vision_tools import vision_analyze_tool
import json as _json
@@ -929,7 +679,7 @@ class TelegramAdapter(BasePlatformAdapter):
emoji, set_name,
)
except Exception as e:
logger.warning("[Telegram] Sticker analysis error: %s", e, exc_info=True)
print(f"[Telegram] Sticker analysis error: {e}", flush=True)
event.text = build_sticker_injection(
f"a sticker with emoji {emoji}" if emoji else "a sticker",
emoji, set_name,

View File

@@ -181,8 +181,8 @@ class WhatsAppAdapter(BasePlatformAdapter):
# Kill any orphaned bridge from a previous gateway run
_kill_port_process(self._bridge_port)
import asyncio
await asyncio.sleep(1)
import time
time.sleep(1)
# Start the bridge process in its own process group.
# Route output to a log file so QR codes, errors, and reconnection
@@ -493,7 +493,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
file_name or os.path.basename(file_path),
)
async def send_typing(self, chat_id: str, metadata=None) -> None:
async def send_typing(self, chat_id: str) -> None:
"""Send typing indicator via bridge."""
if not self._running:
return

File diff suppressed because it is too large Load Diff

View File

@@ -45,8 +45,6 @@ class SessionSource:
user_name: Optional[str] = None
thread_id: Optional[str] = None # For forum topics, Discord threads, etc.
chat_topic: Optional[str] = None # Channel topic/description (Discord, Slack)
user_id_alt: Optional[str] = None # Signal UUID (alternative to phone number)
chat_id_alt: Optional[str] = None # Signal group internal ID
@property
def description(self) -> str:
@@ -70,7 +68,7 @@ class SessionSource:
return ", ".join(parts)
def to_dict(self) -> Dict[str, Any]:
d = {
return {
"platform": self.platform.value,
"chat_id": self.chat_id,
"chat_name": self.chat_name,
@@ -80,11 +78,6 @@ class SessionSource:
"thread_id": self.thread_id,
"chat_topic": self.chat_topic,
}
if self.user_id_alt:
d["user_id_alt"] = self.user_id_alt
if self.chat_id_alt:
d["chat_id_alt"] = self.chat_id_alt
return d
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "SessionSource":
@@ -97,8 +90,6 @@ class SessionSource:
user_name=data.get("user_name"),
thread_id=data.get("thread_id"),
chat_topic=data.get("chat_topic"),
user_id_alt=data.get("user_id_alt"),
chat_id_alt=data.get("chat_id_alt"),
)
@classmethod
@@ -241,9 +232,6 @@ class SessionEntry:
output_tokens: int = 0
total_tokens: int = 0
# Last API-reported prompt tokens (for accurate compression pre-check)
last_prompt_tokens: int = 0
# Set when a session was created because the previous one expired;
# consumed once by the message handler to inject a notice into context
was_auto_reset: bool = False
@@ -260,7 +248,6 @@ class SessionEntry:
"input_tokens": self.input_tokens,
"output_tokens": self.output_tokens,
"total_tokens": self.total_tokens,
"last_prompt_tokens": self.last_prompt_tokens,
}
if self.origin:
result["origin"] = self.origin.to_dict()
@@ -276,8 +263,8 @@ class SessionEntry:
if data.get("platform"):
try:
platform = Platform(data["platform"])
except ValueError as e:
logger.debug("Unknown platform value %r: %s", data["platform"], e)
except ValueError:
pass
return cls(
session_key=data["session_key"],
@@ -291,7 +278,6 @@ class SessionEntry:
input_tokens=data.get("input_tokens", 0),
output_tokens=data.get("output_tokens", 0),
total_tokens=data.get("total_tokens", 0),
last_prompt_tokens=data.get("last_prompt_tokens", 0),
)
@@ -306,8 +292,6 @@ def build_session_key(source: SessionSource) -> str:
if platform == "whatsapp" and source.chat_id:
return f"agent:main:{platform}:dm:{source.chat_id}"
return f"agent:main:{platform}:dm"
if source.thread_id:
return f"agent:main:{platform}:{source.chat_type}:{source.chat_id}:{source.thread_id}"
return f"agent:main:{platform}:{source.chat_type}:{source.chat_id}"
@@ -327,9 +311,7 @@ class SessionStore:
self._entries: Dict[str, SessionEntry] = {}
self._loaded = False
self._has_active_processes_fn = has_active_processes_fn
# on_auto_reset is deprecated — memory flush now runs proactively
# via the background session expiry watcher in GatewayRunner.
self._pre_flushed_sessions: set = set() # session_ids already flushed by watcher
self._on_auto_reset = on_auto_reset # callback(old_entry) before auto-reset
# Initialize SQLite session database
self._db = None
@@ -349,7 +331,7 @@ class SessionStore:
if sessions_file.exists():
try:
with open(sessions_file, "r", encoding="utf-8") as f:
with open(sessions_file, "r") as f:
data = json.load(f)
for key, entry_data in data.items():
self._entries[key] = SessionEntry.from_dict(entry_data)
@@ -360,69 +342,17 @@ class SessionStore:
def _save(self) -> None:
"""Save sessions index to disk (kept for session key -> ID mapping)."""
import tempfile
self.sessions_dir.mkdir(parents=True, exist_ok=True)
sessions_file = self.sessions_dir / "sessions.json"
data = {key: entry.to_dict() for key, entry in self._entries.items()}
fd, tmp_path = tempfile.mkstemp(
dir=str(self.sessions_dir), suffix=".tmp", prefix=".sessions_"
)
try:
with os.fdopen(fd, "w", encoding="utf-8") as f:
json.dump(data, f, indent=2)
f.flush()
os.fsync(f.fileno())
os.replace(tmp_path, sessions_file)
except BaseException:
try:
os.unlink(tmp_path)
except OSError as e:
logger.debug("Could not remove temp file %s: %s", tmp_path, e)
raise
with open(sessions_file, "w") as f:
json.dump(data, f, indent=2)
def _generate_session_key(self, source: SessionSource) -> str:
"""Generate a session key from a source."""
return build_session_key(source)
def _is_session_expired(self, entry: SessionEntry) -> bool:
"""Check if a session has expired based on its reset policy.
Works from the entry alone — no SessionSource needed.
Used by the background expiry watcher to proactively flush memories.
Sessions with active background processes are never considered expired.
"""
if self._has_active_processes_fn:
if self._has_active_processes_fn(entry.session_key):
return False
policy = self.config.get_reset_policy(
platform=entry.platform,
session_type=entry.chat_type,
)
if policy.mode == "none":
return False
now = datetime.now()
if policy.mode in ("idle", "both"):
idle_deadline = entry.updated_at + timedelta(minutes=policy.idle_minutes)
if now > idle_deadline:
return True
if policy.mode in ("daily", "both"):
today_reset = now.replace(
hour=policy.at_hour,
minute=0, second=0, microsecond=0,
)
if now.hour < policy.at_hour:
today_reset -= timedelta(days=1)
if entry.updated_at < today_reset:
return True
return False
def _should_reset(self, entry: SessionEntry, source: SessionSource) -> bool:
"""
Check if a session should be reset based on policy.
@@ -509,11 +439,13 @@ class SessionStore:
self._save()
return entry
else:
# Session is being auto-reset. The background expiry watcher
# should have already flushed memories proactively; discard
# the marker so it doesn't accumulate.
# Session is being auto-reset — flush memories before destroying
was_auto_reset = True
self._pre_flushed_sessions.discard(entry.session_id)
if self._on_auto_reset:
try:
self._on_auto_reset(entry)
except Exception as e:
logger.debug("Auto-reset callback failed: %s", e)
if self._db:
try:
self._db.end_session(entry.session_id, "session_reset")
@@ -557,8 +489,7 @@ class SessionStore:
self,
session_key: str,
input_tokens: int = 0,
output_tokens: int = 0,
last_prompt_tokens: int = None,
output_tokens: int = 0
) -> None:
"""Update a session's metadata after an interaction."""
self._ensure_loaded()
@@ -568,8 +499,6 @@ class SessionStore:
entry.updated_at = datetime.now()
entry.input_tokens += input_tokens
entry.output_tokens += output_tokens
if last_prompt_tokens is not None:
entry.last_prompt_tokens = last_prompt_tokens
entry.total_tokens = entry.input_tokens + entry.output_tokens
self._save()
@@ -626,49 +555,7 @@ class SessionStore:
logger.debug("Session DB operation failed: %s", e)
return new_entry
def switch_session(self, session_key: str, target_session_id: str) -> Optional[SessionEntry]:
"""Switch a session key to point at an existing session ID.
Used by ``/resume`` to restore a previously-named session.
Ends the current session in SQLite (like reset), but instead of
generating a fresh session ID, re-uses ``target_session_id`` so the
old transcript is loaded on the next message.
"""
self._ensure_loaded()
if session_key not in self._entries:
return None
old_entry = self._entries[session_key]
# Don't switch if already on that session
if old_entry.session_id == target_session_id:
return old_entry
# End the current session in SQLite
if self._db:
try:
self._db.end_session(old_entry.session_id, "session_switch")
except Exception as e:
logger.debug("Session DB end_session failed: %s", e)
now = datetime.now()
new_entry = SessionEntry(
session_key=session_key,
session_id=target_session_id,
created_at=now,
updated_at=now,
origin=old_entry.origin,
display_name=old_entry.display_name,
platform=old_entry.platform,
chat_type=old_entry.chat_type,
)
self._entries[session_key] = new_entry
self._save()
return new_entry
def list_sessions(self, active_minutes: Optional[int] = None) -> List[SessionEntry]:
"""List all sessions, optionally filtered by activity."""
self._ensure_loaded()
@@ -687,17 +574,10 @@ class SessionStore:
"""Get the path to a session's legacy transcript file."""
return self.sessions_dir / f"{session_id}.jsonl"
def append_to_transcript(self, session_id: str, message: Dict[str, Any], skip_db: bool = False) -> None:
"""Append a message to a session's transcript (SQLite + legacy JSONL).
Args:
skip_db: When True, only write to JSONL and skip the SQLite write.
Used when the agent already persisted messages to SQLite
via its own _flush_messages_to_session_db(), preventing
the duplicate-write bug (#860).
"""
# Write to SQLite (unless the agent already handled it)
if self._db and not skip_db:
def append_to_transcript(self, session_id: str, message: Dict[str, Any]) -> None:
"""Append a message to a session's transcript (SQLite + legacy JSONL)."""
# Write to SQLite
if self._db:
try:
self._db.append_message(
session_id=session_id,
@@ -712,7 +592,7 @@ class SessionStore:
# Also write legacy JSONL (keeps existing tooling working during transition)
transcript_path = self.get_transcript_path(session_id)
with open(transcript_path, "a", encoding="utf-8") as f:
with open(transcript_path, "a") as f:
f.write(json.dumps(message, ensure_ascii=False) + "\n")
def rewrite_transcript(self, session_id: str, messages: List[Dict[str, Any]]) -> None:
@@ -739,7 +619,7 @@ class SessionStore:
# JSONL: overwrite the file
transcript_path = self.get_transcript_path(session_id)
with open(transcript_path, "w", encoding="utf-8") as f:
with open(transcript_path, "w") as f:
for msg in messages:
f.write(json.dumps(msg, ensure_ascii=False) + "\n")
@@ -761,7 +641,7 @@ class SessionStore:
return []
messages = []
with open(transcript_path, "r", encoding="utf-8") as f:
with open(transcript_path, "r") as f:
for line in f:
line = line.strip()
if line:

View File

@@ -23,7 +23,6 @@ import stat
import base64
import hashlib
import subprocess
import threading
import time
import uuid
import webbrowser
@@ -45,10 +44,6 @@ try:
import fcntl
except Exception:
fcntl = None
try:
import msvcrt
except Exception:
msvcrt = None
# =============================================================================
# Constants
@@ -77,19 +72,15 @@ CODEX_ACCESS_TOKEN_REFRESH_SKEW_SECONDS = 120
@dataclass
class ProviderConfig:
"""Describes a known inference provider."""
"""Describes a known OAuth provider."""
id: str
name: str
auth_type: str # "oauth_device_code", "oauth_external", or "api_key"
auth_type: str # "oauth_device_code" or "api_key"
portal_base_url: str = ""
inference_base_url: str = ""
client_id: str = ""
scope: str = ""
extra: Dict[str, Any] = field(default_factory=dict)
# For API-key providers: env vars to check (in priority order)
api_key_env_vars: tuple = ()
# Optional env var for base URL override
base_url_env_var: str = ""
PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
@@ -108,126 +99,9 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
auth_type="oauth_external",
inference_base_url=DEFAULT_CODEX_BASE_URL,
),
"nous-api": ProviderConfig(
id="nous-api",
name="Nous Portal (API Key)",
auth_type="api_key",
inference_base_url="https://inference-api.nousresearch.com/v1",
api_key_env_vars=("NOUS_API_KEY",),
base_url_env_var="NOUS_BASE_URL",
),
"zai": ProviderConfig(
id="zai",
name="Z.AI / GLM",
auth_type="api_key",
inference_base_url="https://api.z.ai/api/paas/v4",
api_key_env_vars=("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY"),
base_url_env_var="GLM_BASE_URL",
),
"kimi-coding": ProviderConfig(
id="kimi-coding",
name="Kimi / Moonshot",
auth_type="api_key",
inference_base_url="https://api.moonshot.ai/v1",
api_key_env_vars=("KIMI_API_KEY",),
base_url_env_var="KIMI_BASE_URL",
),
"minimax": ProviderConfig(
id="minimax",
name="MiniMax",
auth_type="api_key",
inference_base_url="https://api.minimax.io/v1",
api_key_env_vars=("MINIMAX_API_KEY",),
base_url_env_var="MINIMAX_BASE_URL",
),
"minimax-cn": ProviderConfig(
id="minimax-cn",
name="MiniMax (China)",
auth_type="api_key",
inference_base_url="https://api.minimaxi.com/v1",
api_key_env_vars=("MINIMAX_CN_API_KEY",),
base_url_env_var="MINIMAX_CN_BASE_URL",
),
}
# =============================================================================
# Kimi Code Endpoint Detection
# =============================================================================
# Kimi Code (platform.kimi.ai) issues keys prefixed "sk-kimi-" that only work
# on api.kimi.com/coding/v1. Legacy keys from platform.moonshot.ai work on
# api.moonshot.ai/v1 (the default). Auto-detect when user hasn't set
# KIMI_BASE_URL explicitly.
KIMI_CODE_BASE_URL = "https://api.kimi.com/coding/v1"
def _resolve_kimi_base_url(api_key: str, default_url: str, env_override: str) -> str:
"""Return the correct Kimi base URL based on the API key prefix.
If the user has explicitly set KIMI_BASE_URL, that always wins.
Otherwise, sk-kimi- prefixed keys route to api.kimi.com/coding/v1.
"""
if env_override:
return env_override
if api_key.startswith("sk-kimi-"):
return KIMI_CODE_BASE_URL
return default_url
# =============================================================================
# Z.AI Endpoint Detection
# =============================================================================
# Z.AI has separate billing for general vs coding plans, and global vs China
# endpoints. A key that works on one may return "Insufficient balance" on
# another. We probe at setup time and store the working endpoint.
ZAI_ENDPOINTS = [
# (id, base_url, default_model, label)
("global", "https://api.z.ai/api/paas/v4", "glm-5", "Global"),
("cn", "https://open.bigmodel.cn/api/paas/v4", "glm-5", "China"),
("coding-global", "https://api.z.ai/api/coding/paas/v4", "glm-4.7", "Global (Coding Plan)"),
("coding-cn", "https://open.bigmodel.cn/api/coding/paas/v4", "glm-4.7", "China (Coding Plan)"),
]
def detect_zai_endpoint(api_key: str, timeout: float = 8.0) -> Optional[Dict[str, str]]:
"""Probe z.ai endpoints to find one that accepts this API key.
Returns {"id": ..., "base_url": ..., "model": ..., "label": ...} for the
first working endpoint, or None if all fail.
"""
for ep_id, base_url, model, label in ZAI_ENDPOINTS:
try:
resp = httpx.post(
f"{base_url}/chat/completions",
headers={
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json",
},
json={
"model": model,
"stream": False,
"max_tokens": 1,
"messages": [{"role": "user", "content": "ping"}],
},
timeout=timeout,
)
if resp.status_code == 200:
logger.debug("Z.AI endpoint probe: %s (%s) OK", ep_id, base_url)
return {
"id": ep_id,
"base_url": base_url,
"model": model,
"label": label,
}
logger.debug("Z.AI endpoint probe: %s returned %s", ep_id, resp.status_code)
except Exception as exc:
logger.debug("Z.AI endpoint probe: %s failed: %s", ep_id, exc)
return None
# =============================================================================
# Error Types
# =============================================================================
@@ -312,64 +186,31 @@ def _auth_lock_path() -> Path:
return _auth_file_path().with_suffix(".lock")
_auth_lock_holder = threading.local()
@contextmanager
def _auth_store_lock(timeout_seconds: float = AUTH_LOCK_TIMEOUT_SECONDS):
"""Cross-process advisory lock for auth.json reads+writes. Reentrant."""
# Reentrant: if this thread already holds the lock, just yield.
if getattr(_auth_lock_holder, "depth", 0) > 0:
_auth_lock_holder.depth += 1
try:
yield
finally:
_auth_lock_holder.depth -= 1
return
"""Cross-process advisory lock for auth.json reads+writes."""
lock_path = _auth_lock_path()
lock_path.parent.mkdir(parents=True, exist_ok=True)
if fcntl is None and msvcrt is None:
_auth_lock_holder.depth = 1
try:
with lock_path.open("a+") as lock_file:
if fcntl is None:
yield
finally:
_auth_lock_holder.depth = 0
return
return
# On Windows, msvcrt.locking needs the file to have content and the
# file pointer at position 0. Ensure the lock file has at least 1 byte.
if msvcrt and (not lock_path.exists() or lock_path.stat().st_size == 0):
lock_path.write_text(" ", encoding="utf-8")
with lock_path.open("r+" if msvcrt else "a+") as lock_file:
deadline = time.time() + max(1.0, timeout_seconds)
while True:
try:
if fcntl:
fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
else:
lock_file.seek(0)
msvcrt.locking(lock_file.fileno(), msvcrt.LK_NBLCK, 1)
fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
break
except (BlockingIOError, OSError, PermissionError):
except BlockingIOError:
if time.time() >= deadline:
raise TimeoutError("Timed out waiting for auth store lock")
time.sleep(0.05)
_auth_lock_holder.depth = 1
try:
yield
finally:
_auth_lock_holder.depth = 0
if fcntl:
fcntl.flock(lock_file.fileno(), fcntl.LOCK_UN)
elif msvcrt:
try:
lock_file.seek(0)
msvcrt.locking(lock_file.fileno(), msvcrt.LK_UNLCK, 1)
except (OSError, IOError):
pass
fcntl.flock(lock_file.fileno(), fcntl.LOCK_UN)
def _load_auth_store(auth_file: Optional[Path] = None) -> Dict[str, Any]:
@@ -514,20 +355,10 @@ def resolve_provider(
1. active_provider in auth.json with valid credentials
2. Explicit CLI api_key/base_url -> "openrouter"
3. OPENAI_API_KEY or OPENROUTER_API_KEY env vars -> "openrouter"
4. Provider-specific API keys (GLM, Kimi, MiniMax) -> that provider
5. Fallback: "openrouter"
4. Fallback: "openrouter"
"""
normalized = (requested or "auto").strip().lower()
# Normalize provider aliases
_PROVIDER_ALIASES = {
"nous_api": "nous-api", "nousapi": "nous-api", "nous-portal-api": "nous-api",
"glm": "zai", "z-ai": "zai", "z.ai": "zai", "zhipu": "zai",
"kimi": "kimi-coding", "moonshot": "kimi-coding",
"minimax-china": "minimax-cn", "minimax_cn": "minimax-cn",
}
normalized = _PROVIDER_ALIASES.get(normalized, normalized)
if normalized in {"openrouter", "custom"}:
return "openrouter"
if normalized in PROVIDER_REGISTRY:
@@ -556,14 +387,6 @@ def resolve_provider(
if os.getenv("OPENAI_API_KEY") or os.getenv("OPENROUTER_API_KEY"):
return "openrouter"
# Auto-detect API-key providers by checking their env vars
for pid, pconfig in PROVIDER_REGISTRY.items():
if pconfig.auth_type != "api_key":
continue
for env_var in pconfig.api_key_env_vars:
if os.getenv(env_var, "").strip():
return pid
return "openrouter"
@@ -1103,19 +926,6 @@ def fetch_nous_models(
continue
model_ids.append(mid)
# Sort: prefer opus > pro > haiku/flash > sonnet (sonnet is cheap/fast,
# users who want the best model should see opus first).
def _model_priority(mid: str) -> tuple:
low = mid.lower()
if "opus" in low:
return (0, mid)
if "pro" in low and "sonnet" not in low:
return (1, mid)
if "sonnet" in low:
return (3, mid)
return (2, mid)
model_ids.sort(key=_model_priority)
return list(dict.fromkeys(model_ids))
@@ -1420,42 +1230,6 @@ def get_codex_auth_status() -> Dict[str, Any]:
}
def get_api_key_provider_status(provider_id: str) -> Dict[str, Any]:
"""Status snapshot for API-key providers (z.ai, Kimi, MiniMax)."""
pconfig = PROVIDER_REGISTRY.get(provider_id)
if not pconfig or pconfig.auth_type != "api_key":
return {"configured": False}
api_key = ""
key_source = ""
for env_var in pconfig.api_key_env_vars:
val = os.getenv(env_var, "").strip()
if val:
api_key = val
key_source = env_var
break
env_url = ""
if pconfig.base_url_env_var:
env_url = os.getenv(pconfig.base_url_env_var, "").strip()
if provider_id == "kimi-coding":
base_url = _resolve_kimi_base_url(api_key, pconfig.inference_base_url, env_url)
elif env_url:
base_url = env_url
else:
base_url = pconfig.inference_base_url
return {
"configured": bool(api_key),
"provider": provider_id,
"name": pconfig.name,
"key_source": key_source,
"base_url": base_url,
"logged_in": bool(api_key), # compat with OAuth status shape
}
def get_auth_status(provider_id: Optional[str] = None) -> Dict[str, Any]:
"""Generic auth status dispatcher."""
target = provider_id or get_active_provider()
@@ -1463,54 +1237,9 @@ def get_auth_status(provider_id: Optional[str] = None) -> Dict[str, Any]:
return get_nous_auth_status()
if target == "openai-codex":
return get_codex_auth_status()
# API-key providers
pconfig = PROVIDER_REGISTRY.get(target)
if pconfig and pconfig.auth_type == "api_key":
return get_api_key_provider_status(target)
return {"logged_in": False}
def resolve_api_key_provider_credentials(provider_id: str) -> Dict[str, Any]:
"""Resolve API key and base URL for an API-key provider.
Returns dict with: provider, api_key, base_url, source.
"""
pconfig = PROVIDER_REGISTRY.get(provider_id)
if not pconfig or pconfig.auth_type != "api_key":
raise AuthError(
f"Provider '{provider_id}' is not an API-key provider.",
provider=provider_id,
code="invalid_provider",
)
api_key = ""
key_source = ""
for env_var in pconfig.api_key_env_vars:
val = os.getenv(env_var, "").strip()
if val:
api_key = val
key_source = env_var
break
env_url = ""
if pconfig.base_url_env_var:
env_url = os.getenv(pconfig.base_url_env_var, "").strip()
if provider_id == "kimi-coding":
base_url = _resolve_kimi_base_url(api_key, pconfig.inference_base_url, env_url)
elif env_url:
base_url = env_url.rstrip("/")
else:
base_url = pconfig.inference_base_url
return {
"provider": provider_id,
"api_key": api_key,
"base_url": base_url.rstrip("/"),
"source": key_source or "default",
}
# =============================================================================
# External credential detection
# =============================================================================
@@ -1684,11 +1413,11 @@ def _save_model_choice(model_id: str) -> None:
from hermes_cli.config import save_config, load_config, save_env_value
config = load_config()
# Always use dict format so provider/base_url can be stored alongside
# Handle both string and dict model formats
if isinstance(config.get("model"), dict):
config["model"]["default"] = model_id
else:
config["model"] = {"default": model_id}
config["model"] = model_id
save_config(config)
save_env_value("LLM_MODEL", model_id)

View File

@@ -1,15 +1,10 @@
"""Welcome banner, ASCII art, skills summary, and update check for the CLI.
"""Welcome banner, ASCII art, and skills summary for the CLI.
Pure display functions with no HermesCLI state dependency.
"""
import json
import logging
import os
import subprocess
import time
from pathlib import Path
from typing import Dict, List, Any, Optional
from typing import Dict, List, Any
from rich.console import Console
from rich.panel import Panel
@@ -18,8 +13,6 @@ from rich.table import Table
from prompt_toolkit import print_formatted_text as _pt_print
from prompt_toolkit.formatted_text import ANSI as _PT_ANSI
logger = logging.getLogger(__name__)
# =========================================================================
# ANSI building blocks for conversation display
@@ -36,28 +29,6 @@ def cprint(text: str):
_pt_print(_PT_ANSI(text))
# =========================================================================
# Skin-aware color helpers
# =========================================================================
def _skin_color(key: str, fallback: str) -> str:
"""Get a color from the active skin, or return fallback."""
try:
from hermes_cli.skin_engine import get_active_skin
return get_active_skin().get_color(key, fallback)
except Exception:
return fallback
def _skin_branding(key: str, fallback: str) -> str:
"""Get a branding string from the active skin, or return fallback."""
try:
from hermes_cli.skin_engine import get_active_skin
return get_active_skin().get_branding(key, fallback)
except Exception:
return fallback
# =========================================================================
# ASCII Art & Branding
# =========================================================================
@@ -124,72 +95,6 @@ def get_available_skills() -> Dict[str, List[str]]:
return skills_by_category
# =========================================================================
# Update check
# =========================================================================
# Cache update check results for 6 hours to avoid repeated git fetches
_UPDATE_CHECK_CACHE_SECONDS = 6 * 3600
def check_for_updates() -> Optional[int]:
"""Check how many commits behind origin/main the local repo is.
Does a ``git fetch`` at most once every 6 hours (cached to
``~/.hermes/.update_check``). Returns the number of commits behind,
or ``None`` if the check fails or isn't applicable.
"""
hermes_home = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
repo_dir = hermes_home / "hermes-agent"
cache_file = hermes_home / ".update_check"
# Must be a git repo
if not (repo_dir / ".git").exists():
return None
# Read cache
now = time.time()
try:
if cache_file.exists():
cached = json.loads(cache_file.read_text())
if now - cached.get("ts", 0) < _UPDATE_CHECK_CACHE_SECONDS:
return cached.get("behind")
except Exception:
pass
# Fetch latest refs (fast — only downloads ref metadata, no files)
try:
subprocess.run(
["git", "fetch", "origin", "--quiet"],
capture_output=True, timeout=10,
cwd=str(repo_dir),
)
except Exception:
pass # Offline or timeout — use stale refs, that's fine
# Count commits behind
try:
result = subprocess.run(
["git", "rev-list", "--count", "HEAD..origin/main"],
capture_output=True, text=True, timeout=5,
cwd=str(repo_dir),
)
if result.returncode == 0:
behind = int(result.stdout.strip())
else:
behind = None
except Exception:
behind = None
# Write cache
try:
cache_file.write_text(json.dumps({"ts": now, "behind": behind}))
except Exception:
pass
return behind
# =========================================================================
# Welcome banner
# =========================================================================
@@ -239,24 +144,18 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
layout_table.add_column("left", justify="center")
layout_table.add_column("right", justify="left")
# Resolve skin colors once for the entire banner
accent = _skin_color("banner_accent", "#FFBF00")
dim = _skin_color("banner_dim", "#B8860B")
text = _skin_color("banner_text", "#FFF8DC")
session_color = _skin_color("session_border", "#8B8682")
left_lines = ["", HERMES_CADUCEUS, ""]
model_short = model.split("/")[-1] if "/" in model else model
if len(model_short) > 28:
model_short = model_short[:25] + "..."
ctx_str = f" [dim {dim}]·[/] [dim {dim}]{_format_context_length(context_length)} context[/]" if context_length else ""
left_lines.append(f"[{accent}]{model_short}[/]{ctx_str} [dim {dim}]·[/] [dim {dim}]Nous Research[/]")
left_lines.append(f"[dim {dim}]{cwd}[/]")
ctx_str = f" [dim #B8860B]·[/] [dim #B8860B]{_format_context_length(context_length)} context[/]" if context_length else ""
left_lines.append(f"[#FFBF00]{model_short}[/]{ctx_str} [dim #B8860B]·[/] [dim #B8860B]Nous Research[/]")
left_lines.append(f"[dim #B8860B]{cwd}[/]")
if session_id:
left_lines.append(f"[dim {session_color}]Session: {session_id}[/]")
left_lines.append(f"[dim #8B8682]Session: {session_id}[/]")
left_content = "\n".join(left_lines)
right_lines = [f"[bold {accent}]Available Tools[/]"]
right_lines = ["[bold #FFBF00]Available Tools[/]"]
toolsets_dict: Dict[str, list] = {}
for tool in tools:
@@ -284,7 +183,7 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
if name in disabled_tools:
colored_names.append(f"[red]{name}[/]")
else:
colored_names.append(f"[{text}]{name}[/]")
colored_names.append(f"[#FFF8DC]{name}[/]")
tools_str = ", ".join(colored_names)
if len(", ".join(sorted(tool_names))) > 45:
@@ -303,7 +202,7 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
elif name in disabled_tools:
colored_names.append(f"[red]{name}[/]")
else:
colored_names.append(f"[{text}]{name}[/]")
colored_names.append(f"[#FFF8DC]{name}[/]")
tools_str = ", ".join(colored_names)
right_lines.append(f"[dim #B8860B]{toolset}:[/] {tools_str}")
@@ -334,7 +233,7 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
)
right_lines.append("")
right_lines.append(f"[bold {accent}]Available Skills[/]")
right_lines.append("[bold #FFBF00]Available Skills[/]")
skills_by_category = get_available_skills()
total_skills = sum(len(s) for s in skills_by_category.values())
@@ -348,9 +247,9 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
skills_str = ", ".join(skill_names)
if len(skills_str) > 50:
skills_str = skills_str[:47] + "..."
right_lines.append(f"[dim {dim}]{category}:[/] [{text}]{skills_str}[/]")
right_lines.append(f"[dim #B8860B]{category}:[/] [#FFF8DC]{skills_str}[/]")
else:
right_lines.append(f"[dim {dim}]No skills installed[/]")
right_lines.append("[dim #B8860B]No skills installed[/]")
right_lines.append("")
mcp_connected = sum(1 for s in mcp_status if s["connected"]) if mcp_status else 0
@@ -358,30 +257,15 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
if mcp_connected:
summary_parts.append(f"{mcp_connected} MCP servers")
summary_parts.append("/help for commands")
right_lines.append(f"[dim {dim}]{' · '.join(summary_parts)}[/]")
# Update check — show if behind origin/main
try:
behind = check_for_updates()
if behind and behind > 0:
commits_word = "commit" if behind == 1 else "commits"
right_lines.append(
f"[bold yellow]⚠ {behind} {commits_word} behind[/]"
f"[dim yellow] — run [bold]hermes update[/bold] to update[/]"
)
except Exception:
pass # Never break the banner over an update check
right_lines.append(f"[dim #B8860B]{' · '.join(summary_parts)}[/]")
right_content = "\n".join(right_lines)
layout_table.add_row(left_content, right_content)
agent_name = _skin_branding("agent_name", "Hermes Agent")
title_color = _skin_color("banner_title", "#FFD700")
border_color = _skin_color("banner_border", "#CD7F32")
outer_panel = Panel(
layout_table,
title=f"[bold {title_color}]{agent_name} {VERSION}[/]",
border_style=border_color,
title=f"[bold #FFD700]Hermes Agent {VERSION}[/]",
border_style="#CD7F32",
padding=(0, 2),
)

View File

@@ -254,7 +254,6 @@ def _wayland_save(dest: Path) -> bool:
)
if not dest.exists() or dest.stat().st_size == 0:
dest.unlink(missing_ok=True)
return False
# BMP needs conversion to PNG (common in WSLg where only BMP
@@ -286,27 +285,20 @@ def _convert_to_png(path: Path) -> bool:
logger.debug("Pillow BMP→PNG conversion failed: %s", e)
# Fall back to ImageMagick convert
tmp = path.with_suffix(".bmp")
try:
tmp = path.with_suffix(".bmp")
path.rename(tmp)
r = subprocess.run(
["convert", str(tmp), "png:" + str(path)],
capture_output=True, timeout=5,
)
tmp.unlink(missing_ok=True)
if r.returncode == 0 and path.exists() and path.stat().st_size > 0:
tmp.unlink(missing_ok=True)
return True
else:
# Convert failed — restore the original file
tmp.rename(path)
except FileNotFoundError:
logger.debug("ImageMagick not installed — cannot convert BMP to PNG")
if tmp.exists() and not path.exists():
tmp.rename(path)
except Exception as e:
logger.debug("ImageMagick BMP→PNG conversion failed: %s", e)
if tmp.exists() and not path.exists():
tmp.rename(path)
# Can't convert — BMP is still usable as-is for most APIs
return path.exists() and path.stat().st_size > 0

View File

@@ -47,7 +47,7 @@ def _fetch_models_from_api(access_token: str) -> List[str]:
if item.get("supported_in_api") is False:
continue
visibility = item.get("visibility", "")
if isinstance(visibility, str) and visibility.strip().lower() in ("hide", "hidden"):
if isinstance(visibility, str) and visibility.strip().lower() == "hide":
continue
priority = item.get("priority")
rank = int(priority) if isinstance(priority, (int, float)) else 10_000
@@ -94,10 +94,12 @@ def _read_cache_models(codex_home: Path) -> List[str]:
if not isinstance(slug, str) or not slug.strip():
continue
slug = slug.strip()
if "codex" not in slug.lower():
continue
if item.get("supported_in_api") is False:
continue
visibility = item.get("visibility")
if isinstance(visibility, str) and visibility.strip().lower() in ("hide", "hidden"):
if isinstance(visibility, str) and visibility.strip().lower() == "hidden":
continue
priority = item.get("priority")
rank = int(priority) if isinstance(priority, (int, float)) else 10_000

View File

@@ -1,121 +1,52 @@
"""Slash command definitions and autocomplete for the Hermes CLI.
Contains the shared built-in ``COMMANDS`` dict and ``SlashCommandCompleter``.
The completer can optionally include dynamic skill slash commands supplied by the
interactive CLI.
Contains the COMMANDS dict and the SlashCommandCompleter class.
These are pure data/UI with no HermesCLI state dependency.
"""
from __future__ import annotations
from collections.abc import Callable, Mapping
from typing import Any
from prompt_toolkit.completion import Completer, Completion
# Commands organized by category for better help display
COMMANDS_BY_CATEGORY = {
"Session": {
"/new": "Start a new conversation (reset history)",
"/reset": "Reset conversation only (keep screen)",
"/clear": "Clear screen and reset conversation (fresh start)",
"/history": "Show conversation history",
"/save": "Save the current conversation",
"/retry": "Retry the last message (resend to agent)",
"/undo": "Remove the last user/assistant exchange",
"/title": "Set a title for the current session (usage: /title My Session Name)",
"/compress": "Manually compress conversation context (flush memories + summarize)",
"/rollback": "List or restore filesystem checkpoints (usage: /rollback [number])",
"/background": "Run a prompt in the background (usage: /background <prompt>)",
},
"Configuration": {
"/config": "Show current configuration",
"/model": "Show or change the current model",
"/provider": "Show available providers and current provider",
"/prompt": "View/set custom system prompt",
"/personality": "Set a predefined personality",
"/verbose": "Cycle tool progress display: off → new → all → verbose",
"/reasoning": "Manage reasoning effort and display (usage: /reasoning [level|show|hide])",
"/skin": "Show or change the display skin/theme",
},
"Tools & Skills": {
"/tools": "List available tools",
"/toolsets": "List available toolsets",
"/skills": "Search, install, inspect, or manage skills from online registries",
"/cron": "Manage scheduled tasks (list, add, remove)",
"/reload-mcp": "Reload MCP servers from config.yaml",
},
"Info": {
"/help": "Show this help message",
"/usage": "Show token usage for the current session",
"/insights": "Show usage insights and analytics (last 30 days)",
"/platforms": "Show gateway/messaging platform status",
"/paste": "Check clipboard for an image and attach it",
},
"Exit": {
"/quit": "Exit the CLI (also: /exit, /q)",
},
COMMANDS = {
"/help": "Show this help message",
"/tools": "List available tools",
"/toolsets": "List available toolsets",
"/model": "Show or change the current model",
"/prompt": "View/set custom system prompt",
"/personality": "Set a predefined personality",
"/clear": "Clear screen and reset conversation (fresh start)",
"/history": "Show conversation history",
"/new": "Start a new conversation (reset history)",
"/reset": "Reset conversation only (keep screen)",
"/retry": "Retry the last message (resend to agent)",
"/undo": "Remove the last user/assistant exchange",
"/save": "Save the current conversation",
"/config": "Show current configuration",
"/cron": "Manage scheduled tasks (list, add, remove)",
"/skills": "Search, install, inspect, or manage skills from online registries",
"/platforms": "Show gateway/messaging platform status",
"/verbose": "Cycle tool progress display: off → new → all → verbose",
"/compress": "Manually compress conversation context (flush memories + summarize)",
"/usage": "Show token usage for the current session",
"/insights": "Show usage insights and analytics (last 30 days)",
"/quit": "Exit the CLI (also: /exit, /q)",
}
# Flat dict for backwards compatibility and autocomplete
COMMANDS = {}
for category_commands in COMMANDS_BY_CATEGORY.values():
COMMANDS.update(category_commands)
class SlashCommandCompleter(Completer):
"""Autocomplete for built-in slash commands and optional skill commands."""
def __init__(
self,
skill_commands_provider: Callable[[], Mapping[str, dict[str, Any]]] | None = None,
) -> None:
self._skill_commands_provider = skill_commands_provider
def _iter_skill_commands(self) -> Mapping[str, dict[str, Any]]:
if self._skill_commands_provider is None:
return {}
try:
return self._skill_commands_provider() or {}
except Exception:
return {}
@staticmethod
def _completion_text(cmd_name: str, word: str) -> str:
"""Return replacement text for a completion.
When the user has already typed the full command exactly (``/help``),
returning ``help`` would be a no-op and prompt_toolkit suppresses the
menu. Appending a trailing space keeps the dropdown visible and makes
backspacing retrigger it naturally.
"""
return f"{cmd_name} " if cmd_name == word else cmd_name
"""Autocomplete for /commands in the input area."""
def get_completions(self, document, complete_event):
text = document.text_before_cursor
if not text.startswith("/"):
return
word = text[1:]
for cmd, desc in COMMANDS.items():
cmd_name = cmd[1:]
if cmd_name.startswith(word):
yield Completion(
self._completion_text(cmd_name, word),
cmd_name,
start_position=-len(word),
display=cmd,
display_meta=desc,
)
for cmd, info in self._iter_skill_commands().items():
cmd_name = cmd[1:]
if cmd_name.startswith(word):
description = str(info.get("description", "Skill command"))
short_desc = description[:50] + ("..." if len(description) > 50 else "")
yield Completion(
self._completion_text(cmd_name, word),
start_position=-len(word),
display=cmd,
display_meta=f"{short_desc}",
)

View File

@@ -14,9 +14,8 @@ This module provides:
import os
import platform
import stat
import subprocess
import sys
import subprocess
from pathlib import Path
from typing import Dict, Any, Optional, List, Tuple
@@ -47,32 +46,13 @@ def get_project_root() -> Path:
"""Get the project installation directory."""
return Path(__file__).parent.parent.resolve()
def _secure_dir(path):
"""Set directory to owner-only access (0700). No-op on Windows."""
try:
os.chmod(path, 0o700)
except (OSError, NotImplementedError):
pass
def _secure_file(path):
"""Set file to owner-only read/write (0600). No-op on Windows."""
try:
if os.path.exists(str(path)):
os.chmod(path, 0o600)
except (OSError, NotImplementedError):
pass
def ensure_hermes_home():
"""Ensure ~/.hermes directory structure exists with secure permissions."""
"""Ensure ~/.hermes directory structure exists."""
home = get_hermes_home()
home.mkdir(parents=True, exist_ok=True)
_secure_dir(home)
for subdir in ("cron", "sessions", "logs", "memories"):
d = home / subdir
d.mkdir(parents=True, exist_ok=True)
_secure_dir(d)
(home / "cron").mkdir(parents=True, exist_ok=True)
(home / "sessions").mkdir(parents=True, exist_ok=True)
(home / "logs").mkdir(parents=True, exist_ok=True)
(home / "memories").mkdir(parents=True, exist_ok=True)
# =============================================================================
@@ -82,9 +62,7 @@ def ensure_hermes_home():
DEFAULT_CONFIG = {
"model": "anthropic/claude-opus-4.6",
"toolsets": ["hermes-cli"],
"agent": {
"max_turns": 90,
},
"max_turns": 100,
"terminal": {
"backend": "local",
@@ -99,52 +77,21 @@ DEFAULT_CONFIG = {
"container_memory": 5120, # MB (default 5GB)
"container_disk": 51200, # MB (default 50GB)
"container_persistent": True, # Persist filesystem across sessions
# Docker volume mounts — share host directories with the container.
# Each entry is "host_path:container_path" (standard Docker -v syntax).
# Example: ["/home/user/projects:/workspace/projects", "/data:/data"]
"docker_volumes": [],
},
"browser": {
"inactivity_timeout": 120,
"record_sessions": False, # Auto-record browser sessions as WebM videos
},
# Filesystem checkpoints — automatic snapshots before destructive file ops.
# When enabled, the agent takes a snapshot of the working directory once per
# conversation turn (on first write_file/patch call). Use /rollback to restore.
"checkpoints": {
"enabled": False,
"max_snapshots": 50, # Max checkpoints to keep per directory
},
"compression": {
"enabled": True,
"threshold": 0.85,
"summary_model": "google/gemini-3-flash-preview",
"summary_provider": "auto",
},
# Auxiliary model overrides (advanced). By default Hermes auto-selects
# the provider and model for each side task. Set these to override.
"auxiliary": {
"vision": {
"provider": "auto", # auto | openrouter | nous | main
"model": "", # e.g. "google/gemini-2.5-flash", "gpt-4o"
},
"web_extract": {
"provider": "auto",
"model": "",
},
},
"display": {
"compact": False,
"personality": "kawaii",
"resume_display": "full",
"bell_on_complete": False,
"show_reasoning": False,
"skin": "default",
},
# Text-to-speech configuration
@@ -183,16 +130,7 @@ DEFAULT_CONFIG = {
"memory_char_limit": 2200, # ~800 tokens at 2.75 chars/token
"user_char_limit": 1375, # ~500 tokens at 2.75 chars/token
},
# Subagent delegation — override the provider:model used by delegate_task
# so child agents can run on a different (cheaper/faster) provider and model.
# Uses the same runtime provider resolution as CLI/gateway startup, so all
# configured providers (OpenRouter, Nous, Z.ai, Kimi, etc.) are supported.
"delegation": {
"model": "", # e.g. "google/gemini-3-flash-preview" (empty = inherit parent model)
"provider": "", # e.g. "openrouter" (empty = inherit parent provider + credentials)
},
# Ephemeral prefill messages file — JSON list of {role, content} dicts
# injected at the start of every API call for few-shot priming.
# Never saved to sessions, logs, or trajectories.
@@ -203,36 +141,17 @@ DEFAULT_CONFIG = {
# (apiKey, workspace, peerName, sessions, enabled) comes from the global config.
"honcho": {},
# IANA timezone (e.g. "Asia/Kolkata", "America/New_York").
# Empty string means use server-local time.
"timezone": "",
# Permanently allowed dangerous command patterns (added via "always" approval)
"command_allowlist": [],
# User-defined quick commands that bypass the agent loop (type: exec only)
"quick_commands": {},
# Custom personalities — add your own entries here
# Supports string format: {"name": "system prompt"}
# Or dict format: {"name": {"description": "...", "system_prompt": "...", "tone": "...", "style": "..."}}
"personalities": {},
# Config schema version - bump this when adding new required fields
"_config_version": 6,
"_config_version": 5,
}
# =============================================================================
# Config Migration System
# =============================================================================
# Track which env vars were introduced in each config version.
# Migration only mentions vars new since the user's previous version.
ENV_VARS_BY_VERSION: Dict[int, List[str]] = {
3: ["FIRECRAWL_API_KEY", "BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID", "FAL_KEY"],
4: ["VOICE_TOOLS_OPENAI_KEY", "ELEVENLABS_API_KEY"],
5: ["WHATSAPP_ENABLED", "WHATSAPP_MODE", "WHATSAPP_ALLOWED_USERS",
"SLACK_BOT_TOKEN", "SLACK_APP_TOKEN", "SLACK_ALLOWED_USERS"],
}
# Required environment variables with metadata for migration prompts.
# LLM provider is required but handled in the setup wizard's provider
# selection step (Nous Portal / OpenRouter / Custom endpoint), so this
@@ -242,22 +161,6 @@ REQUIRED_ENV_VARS = {}
# Optional environment variables that enhance functionality
OPTIONAL_ENV_VARS = {
# ── Provider (handled in provider selection, not shown in checklists) ──
"NOUS_API_KEY": {
"description": "Nous Portal API key (direct API key access to Nous inference)",
"prompt": "Nous Portal API key",
"url": "https://portal.nousresearch.com",
"password": True,
"category": "provider",
"advanced": True,
},
"NOUS_BASE_URL": {
"description": "Nous Portal base URL override",
"prompt": "Nous Portal base URL (leave empty for default)",
"url": None,
"password": False,
"category": "provider",
"advanced": True,
},
"OPENROUTER_API_KEY": {
"description": "OpenRouter API key (for vision, web scraping helpers, and MoA)",
"prompt": "OpenRouter API key",
@@ -267,86 +170,6 @@ OPTIONAL_ENV_VARS = {
"category": "provider",
"advanced": True,
},
"GLM_API_KEY": {
"description": "Z.AI / GLM API key (also recognized as ZAI_API_KEY / Z_AI_API_KEY)",
"prompt": "Z.AI / GLM API key",
"url": "https://z.ai/",
"password": True,
"category": "provider",
"advanced": True,
},
"ZAI_API_KEY": {
"description": "Z.AI API key (alias for GLM_API_KEY)",
"prompt": "Z.AI API key",
"url": "https://z.ai/",
"password": True,
"category": "provider",
"advanced": True,
},
"Z_AI_API_KEY": {
"description": "Z.AI API key (alias for GLM_API_KEY)",
"prompt": "Z.AI API key",
"url": "https://z.ai/",
"password": True,
"category": "provider",
"advanced": True,
},
"GLM_BASE_URL": {
"description": "Z.AI / GLM base URL override",
"prompt": "Z.AI / GLM base URL (leave empty for default)",
"url": None,
"password": False,
"category": "provider",
"advanced": True,
},
"KIMI_API_KEY": {
"description": "Kimi / Moonshot API key",
"prompt": "Kimi API key",
"url": "https://platform.moonshot.cn/",
"password": True,
"category": "provider",
"advanced": True,
},
"KIMI_BASE_URL": {
"description": "Kimi / Moonshot base URL override",
"prompt": "Kimi base URL (leave empty for default)",
"url": None,
"password": False,
"category": "provider",
"advanced": True,
},
"MINIMAX_API_KEY": {
"description": "MiniMax API key (international)",
"prompt": "MiniMax API key",
"url": "https://www.minimax.io/",
"password": True,
"category": "provider",
"advanced": True,
},
"MINIMAX_BASE_URL": {
"description": "MiniMax base URL override",
"prompt": "MiniMax base URL (leave empty for default)",
"url": None,
"password": False,
"category": "provider",
"advanced": True,
},
"MINIMAX_CN_API_KEY": {
"description": "MiniMax API key (China endpoint)",
"prompt": "MiniMax (China) API key",
"url": "https://www.minimaxi.com/",
"password": True,
"category": "provider",
"advanced": True,
},
"MINIMAX_CN_BASE_URL": {
"description": "MiniMax (China) base URL override",
"prompt": "MiniMax (China) base URL (leave empty for default)",
"url": None,
"password": False,
"category": "provider",
"advanced": True,
},
# ── Tool API keys ──
"FIRECRAWL_API_KEY": {
@@ -366,7 +189,7 @@ OPTIONAL_ENV_VARS = {
"advanced": True,
},
"BROWSERBASE_API_KEY": {
"description": "Browserbase API key for cloud browser (optional — local browser works without this)",
"description": "Browserbase API key for browser automation",
"prompt": "Browserbase API key",
"url": "https://browserbase.com/",
"tools": ["browser_navigate", "browser_click"],
@@ -374,7 +197,7 @@ OPTIONAL_ENV_VARS = {
"category": "tool",
},
"BROWSERBASE_PROJECT_ID": {
"description": "Browserbase project ID (optional — only needed for cloud browser)",
"description": "Browserbase project ID",
"prompt": "Browserbase project ID",
"url": "https://browserbase.com/",
"tools": ["browser_navigate", "browser_click"],
@@ -468,18 +291,14 @@ OPTIONAL_ENV_VARS = {
"category": "messaging",
},
"SLACK_BOT_TOKEN": {
"description": "Slack bot token (xoxb-). Get from OAuth & Permissions after installing your app. "
"Required scopes: chat:write, app_mentions:read, channels:history, groups:history, "
"im:history, im:read, im:write, users:read, files:write",
"description": "Slack bot integration",
"prompt": "Slack Bot Token (xoxb-...)",
"url": "https://api.slack.com/apps",
"password": True,
"category": "messaging",
},
"SLACK_APP_TOKEN": {
"description": "Slack app-level token (xapp-) for Socket Mode. Get from Basic Information → "
"App-Level Tokens. Also ensure Event Subscriptions include: message.im, "
"message.channels, message.groups, app_mention",
"description": "Slack Socket Mode connection",
"prompt": "Slack App Token (xapp-...)",
"url": "https://api.slack.com/apps",
"password": True,
@@ -510,7 +329,7 @@ OPTIONAL_ENV_VARS = {
"category": "setting",
},
"HERMES_MAX_ITERATIONS": {
"description": "Maximum tool-calling iterations per conversation (default: 90)",
"description": "Maximum tool-calling iterations per conversation (default: 60)",
"prompt": "Max iterations",
"url": None,
"password": False,
@@ -666,22 +485,6 @@ def migrate_config(interactive: bool = True, quiet: bool = False) -> Dict[str, A
if not quiet:
print(f" ✓ Migrated tool progress to config.yaml: {display['tool_progress']}")
# ── Version 4 → 5: add timezone field ──
if current_ver < 5:
config = load_config()
if "timezone" not in config:
old_tz = os.getenv("HERMES_TIMEZONE", "")
if old_tz and old_tz.strip():
config["timezone"] = old_tz.strip()
results["config_added"].append(f"timezone={old_tz.strip()} (from HERMES_TIMEZONE)")
else:
config["timezone"] = ""
results["config_added"].append("timezone= (empty, uses server-local)")
save_config(config)
if not quiet:
tz_display = config["timezone"] or "(server-local)"
print(f" ✓ Added timezone to config.yaml: {tz_display}")
if current_ver < latest_ver and not quiet:
print(f"Config version: {current_ver}{latest_ver}")
@@ -722,47 +525,34 @@ def migrate_config(interactive: bool = True, quiet: bool = False) -> Dict[str, A
if v["name"] not in required_names and not v.get("advanced")
]
# Only offer to configure env vars that are NEW since the user's previous version
new_var_names = set()
for ver in range(current_ver + 1, latest_ver + 1):
new_var_names.update(ENV_VARS_BY_VERSION.get(ver, []))
if new_var_names and interactive and not quiet:
new_and_unset = [
(name, OPTIONAL_ENV_VARS[name])
for name in sorted(new_var_names)
if not get_env_value(name) and name in OPTIONAL_ENV_VARS
]
if new_and_unset:
print(f"\n {len(new_and_unset)} new optional key(s) in this update:")
for name, info in new_and_unset:
print(f"{name}{info.get('description', '')}")
if interactive and missing_optional:
print(" Would you like to configure any optional keys now?")
try:
answer = input(" Configure optional keys? [y/N]: ").strip().lower()
except (EOFError, KeyboardInterrupt):
answer = "n"
if answer in ("y", "yes"):
print()
try:
answer = input(" Configure new keys? [y/N]: ").strip().lower()
except (EOFError, KeyboardInterrupt):
answer = "n"
if answer in ("y", "yes"):
for var in missing_optional:
desc = var.get("description", "")
if var.get("url"):
print(f" {desc}")
print(f" Get your key at: {var['url']}")
else:
print(f" {desc}")
if var.get("password"):
import getpass
value = getpass.getpass(f" {var['prompt']} (Enter to skip): ")
else:
value = input(f" {var['prompt']} (Enter to skip): ").strip()
if value:
save_env_value(var["name"], value)
results["env_added"].append(var["name"])
print(f" ✓ Saved {var['name']}")
print()
for name, info in new_and_unset:
if info.get("url"):
print(f" {info.get('description', name)}")
print(f" Get your key at: {info['url']}")
else:
print(f" {info.get('description', name)}")
if info.get("password"):
import getpass
value = getpass.getpass(f" {info.get('prompt', name)} (Enter to skip): ")
else:
value = input(f" {info.get('prompt', name)} (Enter to skip): ").strip()
if value:
save_env_value(name, value)
results["env_added"].append(name)
print(f" ✓ Saved {name}")
print()
else:
print(" Set later with: hermes config set KEY VALUE")
# Check for missing config fields
missing_config = get_missing_config_fields()
@@ -811,23 +601,6 @@ def _deep_merge(base: dict, override: dict) -> dict:
return result
def _normalize_max_turns_config(config: Dict[str, Any]) -> Dict[str, Any]:
"""Normalize legacy root-level max_turns into agent.max_turns."""
config = dict(config)
agent_config = dict(config.get("agent") or {})
if "max_turns" in config and "max_turns" not in agent_config:
agent_config["max_turns"] = config["max_turns"]
if "max_turns" not in agent_config:
agent_config["max_turns"] = DEFAULT_CONFIG["agent"]["max_turns"]
config["agent"] = agent_config
config.pop("max_turns", None)
return config
def load_config() -> Dict[str, Any]:
"""Load configuration from ~/.hermes/config.yaml."""
import copy
@@ -837,77 +610,23 @@ def load_config() -> Dict[str, Any]:
if config_path.exists():
try:
with open(config_path, encoding="utf-8") as f:
with open(config_path) as f:
user_config = yaml.safe_load(f) or {}
if "max_turns" in user_config:
agent_user_config = dict(user_config.get("agent") or {})
if agent_user_config.get("max_turns") is None:
agent_user_config["max_turns"] = user_config["max_turns"]
user_config["agent"] = agent_user_config
user_config.pop("max_turns", None)
config = _deep_merge(config, user_config)
except Exception as e:
print(f"Warning: Failed to load config: {e}")
return _normalize_max_turns_config(config)
_COMMENTED_SECTIONS = """
# ── Security ──────────────────────────────────────────────────────────
# API keys, tokens, and passwords are redacted from tool output by default.
# Set to false to see full values (useful for debugging auth issues).
#
# security:
# redact_secrets: false
# ── Fallback Model ────────────────────────────────────────────────────
# Automatic provider failover when primary is unavailable.
# Uncomment and configure to enable. Triggers on rate limits (429),
# overload (529), service errors (503), or connection failures.
#
# Supported providers:
# openrouter (OPENROUTER_API_KEY) — routes to any model
# openai-codex (OAuth — hermes login) — OpenAI Codex
# nous (OAuth — hermes login) — Nous Portal
# zai (ZAI_API_KEY) — Z.AI / GLM
# kimi-coding (KIMI_API_KEY) — Kimi / Moonshot
# minimax (MINIMAX_API_KEY) — MiniMax
# minimax-cn (MINIMAX_CN_API_KEY) — MiniMax (China)
#
# For custom OpenAI-compatible endpoints, add base_url and api_key_env.
#
# fallback_model:
# provider: openrouter
# model: anthropic/claude-sonnet-4
"""
return config
def save_config(config: Dict[str, Any]):
"""Save configuration to ~/.hermes/config.yaml."""
from utils import atomic_yaml_write
ensure_hermes_home()
config_path = get_config_path()
normalized = _normalize_max_turns_config(config)
# Build optional commented-out sections for features that are off by
# default or only relevant when explicitly configured.
sections = []
sec = normalized.get("security", {})
if not sec or sec.get("redact_secrets") is None:
sections.append("security")
fb = normalized.get("fallback_model", {})
if not fb or not (fb.get("provider") and fb.get("model")):
sections.append("fallback")
atomic_yaml_write(
config_path,
normalized,
extra_content=_COMMENTED_SECTIONS if sections else None,
)
_secure_file(config_path)
with open(config_path, 'w') as f:
yaml.dump(config, f, default_flow_style=False, sort_keys=False)
def load_env() -> Dict[str, str]:
@@ -960,14 +679,6 @@ def save_env_value(key: str, value: str):
with open(env_path, 'w', **write_kw) as f:
f.writelines(lines)
_secure_file(env_path)
# Restrict .env permissions to owner-only (contains API keys)
if not _IS_WINDOWS:
try:
os.chmod(env_path, stat.S_IRUSR | stat.S_IWUSR)
except OSError:
pass
def get_env_value(key: str) -> Optional[str]:
@@ -1032,17 +743,9 @@ def show_config():
print()
print(color("◆ Model", Colors.CYAN, Colors.BOLD))
print(f" Model: {config.get('model', 'not set')}")
print(f" Max turns: {config.get('agent', {}).get('max_turns', DEFAULT_CONFIG['agent']['max_turns'])}")
print(f" Max turns: {config.get('max_turns', 100)}")
print(f" Toolsets: {', '.join(config.get('toolsets', ['all']))}")
# Display
print()
print(color("◆ Display", Colors.CYAN, Colors.BOLD))
display = config.get('display', {})
print(f" Personality: {display.get('personality', 'kawaii')}")
print(f" Reasoning: {'on' if display.get('show_reasoning', False) else 'off'}")
print(f" Bell: {'on' if display.get('bell_on_complete', False) else 'off'}")
# Terminal
print()
print(color("◆ Terminal", Colors.CYAN, Colors.BOLD))
@@ -1069,15 +772,6 @@ def show_config():
print(f" SSH host: {ssh_host or '(not set)'}")
print(f" SSH user: {ssh_user or '(not set)'}")
# Timezone
print()
print(color("◆ Timezone", Colors.CYAN, Colors.BOLD))
tz = config.get('timezone', '')
if tz:
print(f" Timezone: {tz}")
else:
print(f" Timezone: {color('(server-local)', Colors.DIM)}")
# Compression
print()
print(color("◆ Context Compression", Colors.CYAN, Colors.BOLD))
@@ -1087,31 +781,6 @@ def show_config():
if enabled:
print(f" Threshold: {compression.get('threshold', 0.85) * 100:.0f}%")
print(f" Model: {compression.get('summary_model', 'google/gemini-3-flash-preview')}")
comp_provider = compression.get('summary_provider', 'auto')
if comp_provider != 'auto':
print(f" Provider: {comp_provider}")
# Auxiliary models
auxiliary = config.get('auxiliary', {})
aux_tasks = {
"Vision": auxiliary.get('vision', {}),
"Web extract": auxiliary.get('web_extract', {}),
}
has_overrides = any(
t.get('provider', 'auto') != 'auto' or t.get('model', '')
for t in aux_tasks.values()
)
if has_overrides:
print()
print(color("◆ Auxiliary Models (overrides)", Colors.CYAN, Colors.BOLD))
for label, task_cfg in aux_tasks.items():
prov = task_cfg.get('provider', 'auto')
mdl = task_cfg.get('model', '')
if prov != 'auto' or mdl:
parts = [f"provider={prov}"]
if mdl:
parts.append(f"model={mdl}")
print(f" {label:12s} {', '.join(parts)}")
# Messaging
print()
@@ -1169,7 +838,7 @@ def set_config_value(key: str, value: str):
'FAL_KEY', 'TELEGRAM_BOT_TOKEN', 'DISCORD_BOT_TOKEN',
'TERMINAL_SSH_HOST', 'TERMINAL_SSH_USER', 'TERMINAL_SSH_KEY',
'SUDO_PASSWORD', 'SLACK_BOT_TOKEN', 'SLACK_APP_TOKEN',
'GITHUB_TOKEN', 'HONCHO_API_KEY', 'WANDB_API_KEY',
'GITHUB_TOKEN', 'HONCHO_API_KEY', 'NOUS_API_KEY', 'WANDB_API_KEY',
'TINKER_API_KEY',
]
@@ -1185,7 +854,7 @@ def set_config_value(key: str, value: str):
user_config = {}
if config_path.exists():
try:
with open(config_path, encoding="utf-8") as f:
with open(config_path) as f:
user_config = yaml.safe_load(f) or {}
except Exception:
user_config = {}
@@ -1213,7 +882,7 @@ def set_config_value(key: str, value: str):
# Write only user config back (not the full merged defaults)
ensure_hermes_home()
with open(config_path, 'w', encoding="utf-8") as f:
with open(config_path, 'w') as f:
yaml.dump(user_config, f, default_flow_style=False, sort_keys=False)
# Keep .env in sync for keys that terminal_tool reads directly from env vars.
@@ -1226,7 +895,6 @@ def set_config_value(key: str, value: str):
"terminal.daytona_image": "TERMINAL_DAYTONA_IMAGE",
"terminal.cwd": "TERMINAL_CWD",
"terminal.timeout": "TERMINAL_TIMEOUT",
"terminal.sandbox_dir": "TERMINAL_SANDBOX_DIR",
}
if key in _config_to_env_sync:
save_env_value(_config_to_env_sync[key], str(value))

View File

@@ -1,140 +0,0 @@
"""Shared curses-based UI components for Hermes CLI.
Used by `hermes tools` and `hermes skills` for interactive checklists.
Provides a curses multi-select with keyboard navigation, plus a
text-based numbered fallback for terminals without curses support.
"""
from typing import List, Set
from hermes_cli.colors import Colors, color
def curses_checklist(
title: str,
items: List[str],
selected: Set[int],
*,
cancel_returns: Set[int] | None = None,
) -> Set[int]:
"""Curses multi-select checklist. Returns set of selected indices.
Args:
title: Header line displayed above the checklist.
items: Display labels for each row.
selected: Indices that start checked (pre-selected).
cancel_returns: Returned on ESC/q. Defaults to the original *selected*.
"""
if cancel_returns is None:
cancel_returns = set(selected)
try:
import curses
chosen = set(selected)
result_holder: list = [None]
def _draw(stdscr):
curses.curs_set(0)
if curses.has_colors():
curses.start_color()
curses.use_default_colors()
curses.init_pair(1, curses.COLOR_GREEN, -1)
curses.init_pair(2, curses.COLOR_YELLOW, -1)
curses.init_pair(3, 8, -1) # dim gray
cursor = 0
scroll_offset = 0
while True:
stdscr.clear()
max_y, max_x = stdscr.getmaxyx()
# Header
try:
hattr = curses.A_BOLD
if curses.has_colors():
hattr |= curses.color_pair(2)
stdscr.addnstr(0, 0, title, max_x - 1, hattr)
stdscr.addnstr(
1, 0,
" ↑↓ navigate SPACE toggle ENTER confirm ESC cancel",
max_x - 1, curses.A_DIM,
)
except curses.error:
pass
# Scrollable item list
visible_rows = max_y - 3
if cursor < scroll_offset:
scroll_offset = cursor
elif cursor >= scroll_offset + visible_rows:
scroll_offset = cursor - visible_rows + 1
for draw_i, i in enumerate(
range(scroll_offset, min(len(items), scroll_offset + visible_rows))
):
y = draw_i + 3
if y >= max_y - 1:
break
check = "" if i in chosen else " "
arrow = "" if i == cursor else " "
line = f" {arrow} [{check}] {items[i]}"
attr = curses.A_NORMAL
if i == cursor:
attr = curses.A_BOLD
if curses.has_colors():
attr |= curses.color_pair(1)
try:
stdscr.addnstr(y, 0, line, max_x - 1, attr)
except curses.error:
pass
stdscr.refresh()
key = stdscr.getch()
if key in (curses.KEY_UP, ord("k")):
cursor = (cursor - 1) % len(items)
elif key in (curses.KEY_DOWN, ord("j")):
cursor = (cursor + 1) % len(items)
elif key == ord(" "):
chosen.symmetric_difference_update({cursor})
elif key in (curses.KEY_ENTER, 10, 13):
result_holder[0] = set(chosen)
return
elif key in (27, ord("q")):
result_holder[0] = cancel_returns
return
curses.wrapper(_draw)
return result_holder[0] if result_holder[0] is not None else cancel_returns
except Exception:
return _numbered_fallback(title, items, selected, cancel_returns)
def _numbered_fallback(
title: str,
items: List[str],
selected: Set[int],
cancel_returns: Set[int],
) -> Set[int]:
"""Text-based toggle fallback for terminals without curses."""
chosen = set(selected)
print(color(f"\n {title}", Colors.YELLOW))
print(color(" Toggle by number, Enter to confirm.\n", Colors.DIM))
while True:
for i, label in enumerate(items):
marker = color("[✓]", Colors.GREEN) if i in chosen else "[ ]"
print(f" {marker} {i + 1:>2}. {label}")
print()
try:
val = input(color(" Toggle # (or Enter to confirm): ", Colors.DIM)).strip()
if not val:
break
idx = int(val) - 1
if 0 <= idx < len(items):
chosen.symmetric_difference_update({idx})
except (ValueError, KeyboardInterrupt, EOFError):
return cancel_returns
print()
return chosen

View File

@@ -33,26 +33,6 @@ os.environ.setdefault("MSWEA_SILENT_STARTUP", "1")
from hermes_cli.colors import Colors, color
from hermes_constants import OPENROUTER_MODELS_URL
_PROVIDER_ENV_HINTS = (
"OPENROUTER_API_KEY",
"OPENAI_API_KEY",
"ANTHROPIC_API_KEY",
"OPENAI_BASE_URL",
"GLM_API_KEY",
"ZAI_API_KEY",
"Z_AI_API_KEY",
"KIMI_API_KEY",
"MINIMAX_API_KEY",
"MINIMAX_CN_API_KEY",
)
def _has_provider_env_config(content: str) -> bool:
"""Return True when ~/.hermes/.env contains provider auth/base URL settings."""
return any(key in content for key in _PROVIDER_ENV_HINTS)
def check_ok(text: str, detail: str = ""):
print(f" {color('', Colors.GREEN)} {text}" + (f" {color(detail, Colors.DIM)}" if detail else ""))
@@ -152,8 +132,8 @@ def run_doctor(args):
# Check for common issues
content = env_path.read_text()
if _has_provider_env_config(content):
check_ok("API key or custom endpoint configured")
if "OPENROUTER_API_KEY" in content or "ANTHROPIC_API_KEY" in content:
check_ok("API key configured")
else:
check_warn("No API key found in ~/.hermes/.env")
issues.append("Run 'hermes setup' to configure API keys")
@@ -488,48 +468,7 @@ def run_doctor(args):
print(f"\r {color('', Colors.YELLOW)} Anthropic API {color(msg, Colors.DIM)} ")
except Exception as e:
print(f"\r {color('', Colors.YELLOW)} Anthropic API {color(f'({e})', Colors.DIM)} ")
# -- API-key providers (Z.AI/GLM, Kimi, MiniMax, MiniMax-CN) --
_apikey_providers = [
("Z.AI / GLM", ("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY"), "https://api.z.ai/api/paas/v4/models", "GLM_BASE_URL"),
("Kimi / Moonshot", ("KIMI_API_KEY",), "https://api.moonshot.ai/v1/models", "KIMI_BASE_URL"),
("MiniMax", ("MINIMAX_API_KEY",), "https://api.minimax.io/v1/models", "MINIMAX_BASE_URL"),
("MiniMax (China)", ("MINIMAX_CN_API_KEY",), "https://api.minimaxi.com/v1/models", "MINIMAX_CN_BASE_URL"),
]
for _pname, _env_vars, _default_url, _base_env in _apikey_providers:
_key = ""
for _ev in _env_vars:
_key = os.getenv(_ev, "")
if _key:
break
if _key:
_label = _pname.ljust(20)
print(f" Checking {_pname} API...", end="", flush=True)
try:
import httpx
_base = os.getenv(_base_env, "")
# Auto-detect Kimi Code keys (sk-kimi-) → api.kimi.com
if not _base and _key.startswith("sk-kimi-"):
_base = "https://api.kimi.com/coding/v1"
_url = (_base.rstrip("/") + "/models") if _base else _default_url
_headers = {"Authorization": f"Bearer {_key}"}
if "api.kimi.com" in _url.lower():
_headers["User-Agent"] = "KimiCLI/1.0"
_resp = httpx.get(
_url,
headers=_headers,
timeout=10,
)
if _resp.status_code == 200:
print(f"\r {color('', Colors.GREEN)} {_label} ")
elif _resp.status_code == 401:
print(f"\r {color('', Colors.RED)} {_label} {color('(invalid API key)', Colors.DIM)} ")
issues.append(f"Check {_env_vars[0]} in .env")
else:
print(f"\r {color('', Colors.YELLOW)} {_label} {color(f'(HTTP {_resp.status_code})', Colors.DIM)} ")
except Exception as _e:
print(f"\r {color('', Colors.YELLOW)} {_label} {color(f'({_e})', Colors.DIM)} ")
# =========================================================================
# Check: Submodules
# =========================================================================

View File

@@ -154,33 +154,19 @@ def get_hermes_cli_path() -> str:
# =============================================================================
def generate_systemd_unit() -> str:
import shutil
python_path = get_python_path()
working_dir = str(PROJECT_ROOT)
venv_dir = str(PROJECT_ROOT / "venv")
venv_bin = str(PROJECT_ROOT / "venv" / "bin")
node_bin = str(PROJECT_ROOT / "node_modules" / ".bin")
# Build a PATH that includes the venv, node_modules, and standard system dirs
sane_path = f"{venv_bin}:{node_bin}:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
hermes_cli = shutil.which("hermes") or f"{python_path} -m hermes_cli.main"
return f"""[Unit]
Description={SERVICE_DESCRIPTION}
After=network.target
[Service]
Type=simple
ExecStart={python_path} -m hermes_cli.main gateway run --replace
ExecStop={hermes_cli} gateway stop
ExecStart={python_path} -m hermes_cli.main gateway run
WorkingDirectory={working_dir}
Environment="PATH={sane_path}"
Environment="VIRTUAL_ENV={venv_dir}"
Restart=on-failure
RestartSec=10
KillMode=mixed
KillSignal=SIGTERM
TimeoutStopSec=15
StandardOutput=journal
StandardError=journal
@@ -391,15 +377,8 @@ def launchd_status(deep: bool = False):
# Gateway Runner
# =============================================================================
def run_gateway(verbose: bool = False, replace: bool = False):
"""Run the gateway in foreground.
Args:
verbose: Enable verbose logging output.
replace: If True, kill any existing gateway instance before starting.
This prevents systemd restart loops when the old process
hasn't fully exited yet.
"""
def run_gateway(verbose: bool = False):
"""Run the gateway in foreground."""
sys.path.insert(0, str(PROJECT_ROOT))
from gateway.run import start_gateway
@@ -414,7 +393,7 @@ def run_gateway(verbose: bool = False, replace: bool = False):
# Exit with code 1 if gateway fails to connect any platform,
# so systemd Restart=on-failure will retry on transient errors
success = asyncio.run(start_gateway(replace=replace))
success = asyncio.run(start_gateway())
if not success:
sys.exit(1)
@@ -482,19 +461,14 @@ _PLATFORMS = [
"token_var": "SLACK_BOT_TOKEN",
"setup_instructions": [
"1. Go to https://api.slack.com/apps → Create New App → From Scratch",
"2. Enable Socket Mode: Settings → Socket Mode → Enable",
" Create an App-Level Token with scope: connections:write → copy xapp-... token",
"3. Add Bot Token Scopes: Features → OAuth & Permissions → Scopes",
" Required: chat:write, app_mentions:read, channels:history, channels:read,",
" groups:history, im:history, im:read, im:write, users:read, files:write",
"4. Subscribe to Events: Features → Event Subscriptions → Enable",
" Required events: message.im, message.channels, app_mention",
" Optional: message.groups (for private channels)",
" ⚠ Without message.channels the bot will ONLY work in DMs!",
"5. Install to Workspace: Settings → Install App → copy xoxb-... token",
"6. Reinstall the app after any scope or event changes",
"2. Enable Socket Mode: App Settings → Socket Mode → Enable",
"3. Get Bot Token: OAuth & Permissions → Install to Workspace → copy xoxb-... token",
"4. Get App Token: Basic Information → App-Level Tokens → Generate",
" Name it anything, add scope: connections:write → copy xapp-... token",
"5. Add bot scopes: OAuth & Permissions → Scopes → chat:write, im:history,",
" im:read, im:write, channels:history, channels:read",
"6. Reinstall the app to your workspace after adding scopes",
"7. Find your user ID: click your profile → three dots → Copy member ID",
"8. Invite the bot to channels: /invite @YourBot",
],
"vars": [
{"name": "SLACK_BOT_TOKEN", "prompt": "Bot Token (xoxb-...)", "password": True,
@@ -512,38 +486,6 @@ _PLATFORMS = [
"emoji": "📲",
"token_var": "WHATSAPP_ENABLED",
},
{
"key": "signal",
"label": "Signal",
"emoji": "📡",
"token_var": "SIGNAL_HTTP_URL",
},
{
"key": "email",
"label": "Email",
"emoji": "📧",
"token_var": "EMAIL_ADDRESS",
"setup_instructions": [
"1. Use a dedicated email account for your Hermes agent",
"2. For Gmail: enable 2FA, then create an App Password at",
" https://myaccount.google.com/apppasswords",
"3. For other providers: use your email password or app-specific password",
"4. IMAP must be enabled on your email account",
],
"vars": [
{"name": "EMAIL_ADDRESS", "prompt": "Email address", "password": False,
"help": "The email address Hermes will use (e.g., hermes@gmail.com)."},
{"name": "EMAIL_PASSWORD", "prompt": "Email password (or app password)", "password": True,
"help": "For Gmail, use an App Password (not your regular password)."},
{"name": "EMAIL_IMAP_HOST", "prompt": "IMAP host", "password": False,
"help": "e.g., imap.gmail.com for Gmail, outlook.office365.com for Outlook."},
{"name": "EMAIL_SMTP_HOST", "prompt": "SMTP host", "password": False,
"help": "e.g., smtp.gmail.com for Gmail, smtp.office365.com for Outlook."},
{"name": "EMAIL_ALLOWED_USERS", "prompt": "Allowed sender emails (comma-separated)", "password": False,
"is_allowlist": True,
"help": "Only emails from these addresses will be processed."},
],
},
]
@@ -562,22 +504,6 @@ def _platform_status(platform: dict) -> str:
return "configured + paired"
return "enabled, not paired"
return "not configured"
if platform.get("key") == "signal":
account = get_env_value("SIGNAL_ACCOUNT")
if val and account:
return "configured"
if val or account:
return "partially configured"
return "not configured"
if platform.get("key") == "email":
pwd = get_env_value("EMAIL_PASSWORD")
imap = get_env_value("EMAIL_IMAP_HOST")
smtp = get_env_value("EMAIL_SMTP_HOST")
if all([val, pwd, imap, smtp]):
return "configured"
if any([val, pwd, imap, smtp]):
return "partially configured"
return "not configured"
if val:
return "configured"
return "not configured"
@@ -703,121 +629,6 @@ def _is_service_running() -> bool:
return len(find_gateway_pids()) > 0
def _setup_signal():
"""Interactive setup for Signal messenger."""
import shutil
print()
print(color(" ─── 📡 Signal Setup ───", Colors.CYAN))
existing_url = get_env_value("SIGNAL_HTTP_URL")
existing_account = get_env_value("SIGNAL_ACCOUNT")
if existing_url and existing_account:
print()
print_success("Signal is already configured.")
if not prompt_yes_no(" Reconfigure Signal?", False):
return
# Check if signal-cli is available
print()
if shutil.which("signal-cli"):
print_success("signal-cli found on PATH.")
else:
print_warning("signal-cli not found on PATH.")
print_info(" Signal requires signal-cli running as an HTTP daemon.")
print_info(" Install options:")
print_info(" Linux: sudo apt install signal-cli")
print_info(" or download from https://github.com/AsamK/signal-cli")
print_info(" macOS: brew install signal-cli")
print_info(" Docker: bbernhard/signal-cli-rest-api")
print()
print_info(" After installing, link your account and start the daemon:")
print_info(" signal-cli link -n \"HermesAgent\"")
print_info(" signal-cli --account +YOURNUMBER daemon --http 127.0.0.1:8080")
print()
# HTTP URL
print()
print_info(" Enter the URL where signal-cli HTTP daemon is running.")
default_url = existing_url or "http://127.0.0.1:8080"
try:
url = input(f" HTTP URL [{default_url}]: ").strip() or default_url
except (EOFError, KeyboardInterrupt):
print("\n Setup cancelled.")
return
# Test connectivity
print_info(" Testing connection...")
try:
import httpx
resp = httpx.get(f"{url.rstrip('/')}/api/v1/check", timeout=10.0)
if resp.status_code == 200:
print_success(" signal-cli daemon is reachable!")
else:
print_warning(f" signal-cli responded with status {resp.status_code}.")
if not prompt_yes_no(" Continue anyway?", False):
return
except Exception as e:
print_warning(f" Could not reach signal-cli at {url}: {e}")
if not prompt_yes_no(" Save this URL anyway? (you can start signal-cli later)", True):
return
save_env_value("SIGNAL_HTTP_URL", url)
# Account phone number
print()
print_info(" Enter your Signal account phone number in E.164 format.")
print_info(" Example: +15551234567")
default_account = existing_account or ""
try:
account = input(f" Account number{f' [{default_account}]' if default_account else ''}: ").strip()
if not account:
account = default_account
except (EOFError, KeyboardInterrupt):
print("\n Setup cancelled.")
return
if not account:
print_error(" Account number is required.")
return
save_env_value("SIGNAL_ACCOUNT", account)
# Allowed users
print()
print_info(" The gateway DENIES all users by default for security.")
print_info(" Enter phone numbers or UUIDs of allowed users (comma-separated).")
existing_allowed = get_env_value("SIGNAL_ALLOWED_USERS") or ""
default_allowed = existing_allowed or account
try:
allowed = input(f" Allowed users [{default_allowed}]: ").strip() or default_allowed
except (EOFError, KeyboardInterrupt):
print("\n Setup cancelled.")
return
save_env_value("SIGNAL_ALLOWED_USERS", allowed)
# Group messaging
print()
if prompt_yes_no(" Enable group messaging? (disabled by default for security)", False):
print()
print_info(" Enter group IDs to allow, or * for all groups.")
existing_groups = get_env_value("SIGNAL_GROUP_ALLOWED_USERS") or ""
try:
groups = input(f" Group IDs [{existing_groups or '*'}]: ").strip() or existing_groups or "*"
except (EOFError, KeyboardInterrupt):
print("\n Setup cancelled.")
return
save_env_value("SIGNAL_GROUP_ALLOWED_USERS", groups)
print()
print_success("Signal configured!")
print_info(f" URL: {url}")
print_info(f" Account: {account}")
print_info(f" DM auth: via SIGNAL_ALLOWED_USERS + DM pairing")
print_info(f" Groups: {'enabled' if get_env_value('SIGNAL_GROUP_ALLOWED_USERS') else 'disabled'}")
def gateway_setup():
"""Interactive setup for messaging platforms + gateway service."""
@@ -870,8 +681,6 @@ def gateway_setup():
if platform["key"] == "whatsapp":
_setup_whatsapp()
elif platform["key"] == "signal":
_setup_signal()
else:
_setup_standard_platform(platform)
@@ -956,8 +765,7 @@ def gateway_command(args):
# Default to run if no subcommand
if subcmd is None or subcmd == "run":
verbose = getattr(args, 'verbose', False)
replace = getattr(args, 'replace', False)
run_gateway(verbose, replace=replace)
run_gateway(verbose)
return
if subcmd == "setup":

File diff suppressed because it is too large Load Diff

View File

@@ -1,18 +1,10 @@
"""
Canonical model catalogs and lightweight validation helpers.
Canonical list of OpenRouter models offered in CLI and setup wizards.
Add, remove, or reorder entries here — both `hermes setup` and
`hermes` provider-selection will pick up the change automatically.
"""
from __future__ import annotations
import json
import urllib.request
import urllib.error
from difflib import get_close_matches
from typing import Any, Optional
# (model_id, display description shown in menus)
OPENROUTER_MODELS: list[tuple[str, str]] = [
("anthropic/claude-opus-4.6", "recommended"),
@@ -22,64 +14,17 @@ OPENROUTER_MODELS: list[tuple[str, str]] = [
("openai/gpt-5.3-codex", ""),
("google/gemini-3-pro-preview", ""),
("google/gemini-3-flash-preview", ""),
("qwen/qwen3.5-plus-02-15", ""),
("qwen/qwen3.5-35b-a3b", ""),
("qwen/qwen3.5-plus-02-15", ""),
("qwen/qwen3.5-35b-a3b", ""),
("stepfun/step-3.5-flash", ""),
("z-ai/glm-5", ""),
("moonshotai/kimi-k2.5", ""),
("minimax/minimax-m2.5", ""),
]
_PROVIDER_MODELS: dict[str, list[str]] = {
"zai": [
"glm-5",
"glm-4.7",
"glm-4.5",
"glm-4.5-flash",
],
"kimi-coding": [
"kimi-k2.5",
"kimi-k2-thinking",
"kimi-k2-turbo-preview",
"kimi-k2-0905-preview",
],
"minimax": [
"MiniMax-M2.5",
"MiniMax-M2.5-highspeed",
"MiniMax-M2.1",
],
"minimax-cn": [
"MiniMax-M2.5",
"MiniMax-M2.5-highspeed",
"MiniMax-M2.1",
],
}
_PROVIDER_LABELS = {
"openrouter": "OpenRouter",
"openai-codex": "OpenAI Codex",
"nous": "Nous Portal",
"zai": "Z.AI / GLM",
"kimi-coding": "Kimi / Moonshot",
"minimax": "MiniMax",
"minimax-cn": "MiniMax (China)",
"custom": "Custom endpoint",
}
_PROVIDER_ALIASES = {
"glm": "zai",
"z-ai": "zai",
"z.ai": "zai",
"zhipu": "zai",
"kimi": "kimi-coding",
"moonshot": "kimi-coding",
"minimax-china": "minimax-cn",
"minimax_cn": "minimax-cn",
}
def model_ids() -> list[str]:
"""Return just the OpenRouter model-id strings."""
"""Return just the model-id strings (convenience helper)."""
return [mid for mid, _ in OPENROUTER_MODELS]
@@ -89,231 +34,3 @@ def menu_labels() -> list[str]:
for mid, desc in OPENROUTER_MODELS:
labels.append(f"{mid} ({desc})" if desc else mid)
return labels
# All provider IDs and aliases that are valid for the provider:model syntax.
_KNOWN_PROVIDER_NAMES: set[str] = (
set(_PROVIDER_LABELS.keys())
| set(_PROVIDER_ALIASES.keys())
| {"openrouter", "custom"}
)
def list_available_providers() -> list[dict[str, str]]:
"""Return info about all providers the user could use with ``provider:model``.
Each dict has ``id``, ``label``, and ``aliases``.
Checks which providers have valid credentials configured.
"""
# Canonical providers in display order
_PROVIDER_ORDER = [
"openrouter", "nous", "openai-codex",
"zai", "kimi-coding", "minimax", "minimax-cn",
]
# Build reverse alias map
aliases_for: dict[str, list[str]] = {}
for alias, canonical in _PROVIDER_ALIASES.items():
aliases_for.setdefault(canonical, []).append(alias)
result = []
for pid in _PROVIDER_ORDER:
label = _PROVIDER_LABELS.get(pid, pid)
alias_list = aliases_for.get(pid, [])
# Check if this provider has credentials available
has_creds = False
try:
from hermes_cli.runtime_provider import resolve_runtime_provider
runtime = resolve_runtime_provider(requested=pid)
has_creds = bool(runtime.get("api_key"))
except Exception:
pass
result.append({
"id": pid,
"label": label,
"aliases": alias_list,
"authenticated": has_creds,
})
return result
def parse_model_input(raw: str, current_provider: str) -> tuple[str, str]:
"""Parse ``/model`` input into ``(provider, model)``.
Supports ``provider:model`` syntax to switch providers at runtime::
openrouter:anthropic/claude-sonnet-4.5 → ("openrouter", "anthropic/claude-sonnet-4.5")
nous:hermes-3 → ("nous", "hermes-3")
anthropic/claude-sonnet-4.5 → (current_provider, "anthropic/claude-sonnet-4.5")
gpt-5.4 → (current_provider, "gpt-5.4")
The colon is only treated as a provider delimiter if the left side is a
recognized provider name or alias. This avoids misinterpreting model names
that happen to contain colons (e.g. ``anthropic/claude-3.5-sonnet:beta``).
Returns ``(provider, model)`` where *provider* is either the explicit
provider from the input or *current_provider* if none was specified.
"""
stripped = raw.strip()
colon = stripped.find(":")
if colon > 0:
provider_part = stripped[:colon].strip().lower()
model_part = stripped[colon + 1:].strip()
if provider_part and model_part and provider_part in _KNOWN_PROVIDER_NAMES:
return (normalize_provider(provider_part), model_part)
return (current_provider, stripped)
def curated_models_for_provider(provider: Optional[str]) -> list[tuple[str, str]]:
"""Return ``(model_id, description)`` tuples for a provider's curated list."""
normalized = normalize_provider(provider)
if normalized == "openrouter":
return list(OPENROUTER_MODELS)
models = _PROVIDER_MODELS.get(normalized, [])
return [(m, "") for m in models]
def normalize_provider(provider: Optional[str]) -> str:
"""Normalize provider aliases to Hermes' canonical provider ids.
Note: ``"auto"`` passes through unchanged — use
``hermes_cli.auth.resolve_provider()`` to resolve it to a concrete
provider based on credentials and environment.
"""
normalized = (provider or "openrouter").strip().lower()
return _PROVIDER_ALIASES.get(normalized, normalized)
def provider_model_ids(provider: Optional[str]) -> list[str]:
"""Return the best known model catalog for a provider."""
normalized = normalize_provider(provider)
if normalized == "openrouter":
return model_ids()
if normalized == "openai-codex":
from hermes_cli.codex_models import get_codex_model_ids
return get_codex_model_ids()
return list(_PROVIDER_MODELS.get(normalized, []))
def fetch_api_models(
api_key: Optional[str],
base_url: Optional[str],
timeout: float = 5.0,
) -> Optional[list[str]]:
"""Fetch the list of available model IDs from the provider's ``/models`` endpoint.
Returns a list of model ID strings, or ``None`` if the endpoint could not
be reached (network error, timeout, auth failure, etc.).
"""
if not base_url:
return None
url = base_url.rstrip("/") + "/models"
headers: dict[str, str] = {}
if api_key:
headers["Authorization"] = f"Bearer {api_key}"
req = urllib.request.Request(url, headers=headers)
try:
with urllib.request.urlopen(req, timeout=timeout) as resp:
data = json.loads(resp.read().decode())
# Standard OpenAI format: {"data": [{"id": "model-name", ...}, ...]}
return [m.get("id", "") for m in data.get("data", [])]
except Exception:
return None
def validate_requested_model(
model_name: str,
provider: Optional[str],
*,
api_key: Optional[str] = None,
base_url: Optional[str] = None,
) -> dict[str, Any]:
"""
Validate a ``/model`` value for the active provider.
Performs format checks first, then probes the live API to confirm
the model actually exists.
Returns a dict with:
- accepted: whether the CLI should switch to the requested model now
- persist: whether it is safe to save to config
- recognized: whether it matched a known provider catalog
- message: optional warning / guidance for the user
"""
requested = (model_name or "").strip()
normalized = normalize_provider(provider)
if normalized == "openrouter" and base_url and "openrouter.ai" not in base_url:
normalized = "custom"
if not requested:
return {
"accepted": False,
"persist": False,
"recognized": False,
"message": "Model name cannot be empty.",
}
if any(ch.isspace() for ch in requested):
return {
"accepted": False,
"persist": False,
"recognized": False,
"message": "Model names cannot contain spaces.",
}
# Probe the live API to check if the model actually exists
api_models = fetch_api_models(api_key, base_url)
if api_models is not None:
if requested in set(api_models):
# API confirmed the model exists
return {
"accepted": True,
"persist": True,
"recognized": True,
"message": None,
}
else:
# API responded but model is not listed
suggestions = get_close_matches(requested, api_models, n=3, cutoff=0.5)
suggestion_text = ""
if suggestions:
suggestion_text = "\n Did you mean: " + ", ".join(f"`{s}`" for s in suggestions)
return {
"accepted": False,
"persist": False,
"recognized": False,
"message": (
f"Error: `{requested}` is not a valid model for this provider."
f"{suggestion_text}"
),
}
# api_models is None — couldn't reach API, fall back to catalog check
provider_label = _PROVIDER_LABELS.get(normalized, normalized)
known_models = provider_model_ids(normalized)
if requested in known_models:
return {
"accepted": True,
"persist": True,
"recognized": True,
"message": None,
}
# Can't validate — accept for session only
suggestion = get_close_matches(requested, known_models, n=1, cutoff=0.6)
suggestion_text = f" Did you mean `{suggestion[0]}`?" if suggestion else ""
return {
"accepted": True,
"persist": False,
"recognized": False,
"message": (
f"Could not validate `{requested}` against the live {provider_label} API. "
"Using it for this session only; config unchanged."
f"{suggestion_text}"
),
}

View File

@@ -7,12 +7,10 @@ from typing import Any, Dict, Optional
from hermes_cli.auth import (
AuthError,
PROVIDER_REGISTRY,
format_auth_error,
resolve_provider,
resolve_nous_runtime_credentials,
resolve_codex_runtime_credentials,
resolve_api_key_provider_credentials,
)
from hermes_cli.config import load_config
from hermes_constants import OPENROUTER_BASE_URL
@@ -66,39 +64,20 @@ def _resolve_openrouter_runtime(
if not cfg_provider or cfg_provider == "auto":
use_config_base_url = True
# When the user explicitly requested the openrouter provider, skip
# OPENAI_BASE_URL — it typically points to a custom / non-OpenRouter
# endpoint and would prevent switching back to OpenRouter (#874).
skip_openai_base = requested_norm == "openrouter"
base_url = (
(explicit_base_url or "").strip()
or ("" if skip_openai_base else env_openai_base_url)
or env_openai_base_url
or (cfg_base_url.strip() if use_config_base_url else "")
or env_openrouter_base_url
or OPENROUTER_BASE_URL
).rstrip("/")
# Choose API key based on whether the resolved base_url targets OpenRouter.
# When hitting OpenRouter, prefer OPENROUTER_API_KEY (issue #289).
# When hitting a custom endpoint (e.g. Z.ai, local LLM), prefer
# OPENAI_API_KEY so the OpenRouter key doesn't leak to an unrelated
# provider (issues #420, #560).
_is_openrouter_url = "openrouter.ai" in base_url
if _is_openrouter_url:
api_key = (
explicit_api_key
or os.getenv("OPENROUTER_API_KEY")
or os.getenv("OPENAI_API_KEY")
or ""
)
else:
api_key = (
explicit_api_key
or os.getenv("OPENAI_API_KEY")
or os.getenv("OPENROUTER_API_KEY")
or ""
)
api_key = (
explicit_api_key
or os.getenv("OPENROUTER_API_KEY")
or os.getenv("OPENAI_API_KEY")
or ""
)
source = "explicit" if (explicit_api_key or explicit_base_url) else "env/config"
@@ -153,19 +132,6 @@ def resolve_runtime_provider(
"requested_provider": requested_provider,
}
# API-key providers (z.ai/GLM, Kimi, MiniMax, MiniMax-CN)
pconfig = PROVIDER_REGISTRY.get(provider)
if pconfig and pconfig.auth_type == "api_key":
creds = resolve_api_key_provider_credentials(provider)
return {
"provider": provider,
"api_mode": "chat_completions",
"base_url": creds.get("base_url", "").rstrip("/"),
"api_key": creds.get("api_key", ""),
"source": creds.get("source", "env"),
"requested_provider": requested_provider,
}
runtime = _resolve_openrouter_runtime(
requested_provider=requested_provider,
explicit_api_key=explicit_api_key,

File diff suppressed because it is too large Load Diff

View File

@@ -1,181 +0,0 @@
"""
Skills configuration for Hermes Agent.
`hermes skills` enters this module.
Toggle individual skills or categories on/off, globally or per-platform.
Config stored in ~/.hermes/config.yaml under:
skills:
disabled: [skill-a, skill-b] # global disabled list
platform_disabled: # per-platform overrides
telegram: [skill-c]
cli: []
"""
from typing import Dict, List, Optional, Set
from hermes_cli.config import load_config, save_config
from hermes_cli.colors import Colors, color
PLATFORMS = {
"cli": "🖥️ CLI",
"telegram": "📱 Telegram",
"discord": "💬 Discord",
"slack": "💼 Slack",
"whatsapp": "📱 WhatsApp",
"signal": "📡 Signal",
"email": "📧 Email",
}
# ─── Config Helpers ───────────────────────────────────────────────────────────
def get_disabled_skills(config: dict, platform: Optional[str] = None) -> Set[str]:
"""Return disabled skill names. Platform-specific list falls back to global."""
skills_cfg = config.get("skills", {})
global_disabled = set(skills_cfg.get("disabled", []))
if platform is None:
return global_disabled
platform_disabled = skills_cfg.get("platform_disabled", {}).get(platform)
if platform_disabled is None:
return global_disabled
return set(platform_disabled)
def save_disabled_skills(config: dict, disabled: Set[str], platform: Optional[str] = None):
"""Persist disabled skill names to config."""
config.setdefault("skills", {})
if platform is None:
config["skills"]["disabled"] = sorted(disabled)
else:
config["skills"].setdefault("platform_disabled", {})
config["skills"]["platform_disabled"][platform] = sorted(disabled)
save_config(config)
# ─── Skill Discovery ─────────────────────────────────────────────────────────
def _list_all_skills() -> List[dict]:
"""Return all installed skills (ignoring disabled state)."""
try:
from tools.skills_tool import _find_all_skills
return _find_all_skills(skip_disabled=True)
except Exception:
return []
def _get_categories(skills: List[dict]) -> List[str]:
"""Return sorted unique category names (None -> 'uncategorized')."""
return sorted({s["category"] or "uncategorized" for s in skills})
# ─── Platform Selection ──────────────────────────────────────────────────────
def _select_platform() -> Optional[str]:
"""Ask user which platform to configure, or global."""
options = [("global", "All platforms (global default)")] + list(PLATFORMS.items())
print()
print(color(" Configure skills for:", Colors.BOLD))
for i, (key, label) in enumerate(options, 1):
print(f" {i}. {label}")
print()
try:
raw = input(color(" Select [1]: ", Colors.YELLOW)).strip()
except (KeyboardInterrupt, EOFError):
return None
if not raw:
return None # global
try:
idx = int(raw) - 1
if 0 <= idx < len(options):
key = options[idx][0]
return None if key == "global" else key
except ValueError:
pass
return None
# ─── Category Toggle ─────────────────────────────────────────────────────────
def _toggle_by_category(skills: List[dict], disabled: Set[str]) -> Set[str]:
"""Toggle all skills in a category at once."""
from hermes_cli.curses_ui import curses_checklist
categories = _get_categories(skills)
cat_labels = []
# A category is "enabled" (checked) when NOT all its skills are disabled
pre_selected = set()
for i, cat in enumerate(categories):
cat_skills = [s["name"] for s in skills if (s["category"] or "uncategorized") == cat]
cat_labels.append(f"{cat} ({len(cat_skills)} skills)")
if not all(s in disabled for s in cat_skills):
pre_selected.add(i)
chosen = curses_checklist(
"Categories — toggle entire categories",
cat_labels, pre_selected, cancel_returns=pre_selected,
)
new_disabled = set(disabled)
for i, cat in enumerate(categories):
cat_skills = {s["name"] for s in skills if (s["category"] or "uncategorized") == cat}
if i in chosen:
new_disabled -= cat_skills # category enabled → remove from disabled
else:
new_disabled |= cat_skills # category disabled → add to disabled
return new_disabled
# ─── Entry Point ──────────────────────────────────────────────────────────────
def skills_command(args=None):
"""Entry point for `hermes skills`."""
from hermes_cli.curses_ui import curses_checklist
config = load_config()
skills = _list_all_skills()
if not skills:
print(color(" No skills installed.", Colors.DIM))
return
# Step 1: Select platform
platform = _select_platform()
platform_label = PLATFORMS.get(platform, "All platforms") if platform else "All platforms"
# Step 2: Select mode — individual or by category
print()
print(color(f" Configure for: {platform_label}", Colors.DIM))
print()
print(" 1. Toggle individual skills")
print(" 2. Toggle by category")
print()
try:
mode = input(color(" Select [1]: ", Colors.YELLOW)).strip() or "1"
except (KeyboardInterrupt, EOFError):
return
disabled = get_disabled_skills(config, platform)
if mode == "2":
new_disabled = _toggle_by_category(skills, disabled)
else:
# Build labels and map indices → skill names
labels = [
f"{s['name']} ({s['category'] or 'uncategorized'}) — {s['description'][:55]}"
for s in skills
]
# "selected" = enabled (not disabled) — matches the [✓] convention
pre_selected = {i for i, s in enumerate(skills) if s["name"] not in disabled}
chosen = curses_checklist(
f"Skills for {platform_label}",
labels, pre_selected, cancel_returns=pre_selected,
)
# Anything NOT chosen is disabled
new_disabled = {skills[i]["name"] for i in range(len(skills)) if i not in chosen}
if new_disabled == disabled:
print(color(" No changes.", Colors.DIM))
return
save_disabled_skills(config, new_disabled, platform)
enabled_count = len(skills) - len(new_disabled)
print(color(f"✓ Saved: {enabled_count} enabled, {len(new_disabled)} disabled ({platform_label}).", Colors.GREEN))

View File

@@ -408,11 +408,10 @@ def do_inspect(identifier: str, console: Optional[Console] = None) -> None:
def do_list(source_filter: str = "all", console: Optional[Console] = None) -> None:
"""List installed skills, distinguishing builtins from hub-installed."""
from tools.skills_hub import HubLockFile, ensure_hub_dirs
from tools.skills_hub import HubLockFile, SKILLS_DIR
from tools.skills_tool import _find_all_skills
c = console or _console
ensure_hub_dirs()
lock = HubLockFile()
hub_installed = {e["name"]: e for e in lock.list_installed()}

View File

@@ -1,630 +0,0 @@
"""Hermes CLI skin/theme engine.
A data-driven skin system that lets users customize the CLI's visual appearance.
Skins are defined as YAML files in ~/.hermes/skins/ or as built-in presets.
No code changes are needed to add a new skin.
SKIN YAML SCHEMA
================
All fields are optional. Missing values inherit from the ``default`` skin.
.. code-block:: yaml
# Required: skin identity
name: mytheme # Unique skin name (lowercase, hyphens ok)
description: Short description # Shown in /skin listing
# Colors: hex values for Rich markup (banner, UI, response box)
colors:
banner_border: "#CD7F32" # Panel border color
banner_title: "#FFD700" # Panel title text color
banner_accent: "#FFBF00" # Section headers (Available Tools, etc.)
banner_dim: "#B8860B" # Dim/muted text (separators, labels)
banner_text: "#FFF8DC" # Body text (tool names, skill names)
ui_accent: "#FFBF00" # General UI accent
ui_label: "#4dd0e1" # UI labels
ui_ok: "#4caf50" # Success indicators
ui_error: "#ef5350" # Error indicators
ui_warn: "#ffa726" # Warning indicators
prompt: "#FFF8DC" # Prompt text color
input_rule: "#CD7F32" # Input area horizontal rule
response_border: "#FFD700" # Response box border (ANSI)
session_label: "#DAA520" # Session label color
session_border: "#8B8682" # Session ID dim color
# Spinner: customize the animated spinner during API calls
spinner:
waiting_faces: # Faces shown while waiting for API
- "(⚔)"
- "(⛨)"
thinking_faces: # Faces shown during reasoning
- "(⌁)"
- "(<>)"
thinking_verbs: # Verbs for spinner messages
- "forging"
- "plotting"
wings: # Optional left/right spinner decorations
- ["⟪⚔", "⚔⟫"] # Each entry is [left, right] pair
- ["⟪▲", "▲⟫"]
# Branding: text strings used throughout the CLI
branding:
agent_name: "Hermes Agent" # Banner title, status display
welcome: "Welcome message" # Shown at CLI startup
goodbye: "Goodbye! ⚕" # Shown on exit
response_label: " ⚕ Hermes " # Response box header label
prompt_symbol: " " # Input prompt symbol
help_header: "(^_^)? Commands" # /help header text
# Tool prefix: character for tool output lines (default: ┊)
tool_prefix: ""
USAGE
=====
.. code-block:: python
from hermes_cli.skin_engine import get_active_skin, list_skins, set_active_skin
skin = get_active_skin()
print(skin.colors["banner_title"]) # "#FFD700"
print(skin.get_branding("agent_name")) # "Hermes Agent"
set_active_skin("ares") # Switch to built-in ares skin
set_active_skin("mytheme") # Switch to user skin from ~/.hermes/skins/
BUILT-IN SKINS
==============
- ``default`` — Classic Hermes gold/kawaii (the current look)
- ``ares`` — Crimson/bronze war-god theme with custom spinner wings
- ``mono`` — Clean grayscale monochrome
- ``slate`` — Cool blue developer-focused theme
USER SKINS
==========
Drop a YAML file in ``~/.hermes/skins/<name>.yaml`` following the schema above.
Activate with ``/skin <name>`` in the CLI or ``display.skin: <name>`` in config.yaml.
"""
import logging
import os
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
logger = logging.getLogger(__name__)
# =============================================================================
# Skin data structure
# =============================================================================
@dataclass
class SkinConfig:
"""Complete skin configuration."""
name: str
description: str = ""
colors: Dict[str, str] = field(default_factory=dict)
spinner: Dict[str, Any] = field(default_factory=dict)
branding: Dict[str, str] = field(default_factory=dict)
tool_prefix: str = ""
banner_logo: str = "" # Rich-markup ASCII art logo (replaces HERMES_AGENT_LOGO)
banner_hero: str = "" # Rich-markup hero art (replaces HERMES_CADUCEUS)
def get_color(self, key: str, fallback: str = "") -> str:
"""Get a color value with fallback."""
return self.colors.get(key, fallback)
def get_spinner_list(self, key: str) -> List[str]:
"""Get a spinner list (faces, verbs, etc.)."""
return self.spinner.get(key, [])
def get_spinner_wings(self) -> List[Tuple[str, str]]:
"""Get spinner wing pairs, or empty list if none."""
raw = self.spinner.get("wings", [])
result = []
for pair in raw:
if isinstance(pair, (list, tuple)) and len(pair) == 2:
result.append((str(pair[0]), str(pair[1])))
return result
def get_branding(self, key: str, fallback: str = "") -> str:
"""Get a branding value with fallback."""
return self.branding.get(key, fallback)
# =============================================================================
# Built-in skin definitions
# =============================================================================
_BUILTIN_SKINS: Dict[str, Dict[str, Any]] = {
"default": {
"name": "default",
"description": "Classic Hermes — gold and kawaii",
"colors": {
"banner_border": "#CD7F32",
"banner_title": "#FFD700",
"banner_accent": "#FFBF00",
"banner_dim": "#B8860B",
"banner_text": "#FFF8DC",
"ui_accent": "#FFBF00",
"ui_label": "#4dd0e1",
"ui_ok": "#4caf50",
"ui_error": "#ef5350",
"ui_warn": "#ffa726",
"prompt": "#FFF8DC",
"input_rule": "#CD7F32",
"response_border": "#FFD700",
"session_label": "#DAA520",
"session_border": "#8B8682",
},
"spinner": {
# Empty = use hardcoded defaults in display.py
},
"branding": {
"agent_name": "Hermes Agent",
"welcome": "Welcome to Hermes Agent! Type your message or /help for commands.",
"goodbye": "Goodbye! ⚕",
"response_label": " ⚕ Hermes ",
"prompt_symbol": " ",
"help_header": "(^_^)? Available Commands",
},
"tool_prefix": "",
},
"ares": {
"name": "ares",
"description": "War-god theme — crimson and bronze",
"colors": {
"banner_border": "#9F1C1C",
"banner_title": "#C7A96B",
"banner_accent": "#DD4A3A",
"banner_dim": "#6B1717",
"banner_text": "#F1E6CF",
"ui_accent": "#DD4A3A",
"ui_label": "#C7A96B",
"ui_ok": "#4caf50",
"ui_error": "#ef5350",
"ui_warn": "#ffa726",
"prompt": "#F1E6CF",
"input_rule": "#9F1C1C",
"response_border": "#C7A96B",
"session_label": "#C7A96B",
"session_border": "#6E584B",
},
"spinner": {
"waiting_faces": ["(⚔)", "(⛨)", "(▲)", "(<>)", "(/)"],
"thinking_faces": ["(⚔)", "(⛨)", "(▲)", "(⌁)", "(<>)"],
"thinking_verbs": [
"forging", "marching", "sizing the field", "holding the line",
"hammering plans", "tempering steel", "plotting impact", "raising the shield",
],
"wings": [
["⟪⚔", "⚔⟫"],
["⟪▲", "▲⟫"],
["⟪╸", "╺⟫"],
["⟪⛨", "⛨⟫"],
],
},
"branding": {
"agent_name": "Ares Agent",
"welcome": "Welcome to Ares Agent! Type your message or /help for commands.",
"goodbye": "Farewell, warrior! ⚔",
"response_label": " ⚔ Ares ",
"prompt_symbol": " ",
"help_header": "(⚔) Available Commands",
},
"tool_prefix": "",
"banner_logo": """[bold #A3261F] █████╗ ██████╗ ███████╗███████╗ █████╗ ██████╗ ███████╗███╗ ██╗████████╗[/]
[bold #B73122]██╔══██╗██╔══██╗██╔════╝██╔════╝ ██╔══██╗██╔════╝ ██╔════╝████╗ ██║╚══██╔══╝[/]
[#C93C24]███████║██████╔╝█████╗ ███████╗█████╗███████║██║ ███╗█████╗ ██╔██╗ ██║ ██║[/]
[#D84A28]██╔══██║██╔══██╗██╔══╝ ╚════██║╚════╝██╔══██║██║ ██║██╔══╝ ██║╚██╗██║ ██║[/]
[#E15A2D]██║ ██║██║ ██║███████╗███████║ ██║ ██║╚██████╔╝███████╗██║ ╚████║ ██║[/]
[#EB6C32]╚═╝ ╚═╝╚═╝ ╚═╝╚══════╝╚══════╝ ╚═╝ ╚═╝ ╚═════╝ ╚══════╝╚═╝ ╚═══╝ ╚═╝[/]""",
"banner_hero": """[#9F1C1C]⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣤⣤⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
[#9F1C1C]⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⣴⣿⠟⠻⣿⣦⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
[#C7A96B]⠀⠀⠀⠀⠀⠀⠀⣠⣾⡿⠋⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀[/]
[#C7A96B]⠀⠀⠀⠀⠀⢀⣾⡿⠋⠀⠀⢠⡄⠀⠀⠙⢿⣷⡀⠀⠀⠀⠀⠀[/]
[#DD4A3A]⠀⠀⠀⠀⣰⣿⠟⠀⠀⠀⣰⣿⣿⣆⠀⠀⠀⠻⣿⣆⠀⠀⠀⠀[/]
[#DD4A3A]⠀⠀⠀⢰⣿⠏⠀⠀⢀⣾⡿⠉⢿⣷⡀⠀⠀⠹⣿⡆⠀⠀⠀[/]
[#9F1C1C]⠀⠀⠀⣿⡟⠀⠀⣠⣿⠟⠀⠀⠀⠻⣿⣄⠀⠀⢻⣿⠀⠀⠀[/]
[#9F1C1C]⠀⠀⠀⣿⡇⠀⠀⠙⠋⠀⠀⚔⠀⠀⠙⠋⠀⠀⢸⣿⠀⠀⠀[/]
[#6B1717]⠀⠀⠀⢿⣧⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣼⡿⠀⠀⠀[/]
[#6B1717]⠀⠀⠀⠘⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⣠⣾⡿⠃⠀⠀⠀[/]
[#C7A96B]⠀⠀⠀⠀⠈⠻⣿⣷⣦⣤⣀⣀⣤⣤⣶⣿⠿⠋⠀⠀⠀⠀[/]
[#C7A96B]⠀⠀⠀⠀⠀⠀⠀⠉⠛⠿⠿⠿⠿⠛⠉⠀⠀⠀⠀⠀⠀⠀[/]
[#DD4A3A]⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⚔⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
[dim #6B1717]war god online[/]""",
},
"mono": {
"name": "mono",
"description": "Monochrome — clean grayscale",
"colors": {
"banner_border": "#555555",
"banner_title": "#e6edf3",
"banner_accent": "#aaaaaa",
"banner_dim": "#444444",
"banner_text": "#c9d1d9",
"ui_accent": "#aaaaaa",
"ui_label": "#888888",
"ui_ok": "#888888",
"ui_error": "#cccccc",
"ui_warn": "#999999",
"prompt": "#c9d1d9",
"input_rule": "#444444",
"response_border": "#aaaaaa",
"session_label": "#888888",
"session_border": "#555555",
},
"spinner": {},
"branding": {
"agent_name": "Hermes Agent",
"welcome": "Welcome to Hermes Agent! Type your message or /help for commands.",
"goodbye": "Goodbye! ⚕",
"response_label": " ⚕ Hermes ",
"prompt_symbol": " ",
"help_header": "[?] Available Commands",
},
"tool_prefix": "",
},
"slate": {
"name": "slate",
"description": "Cool blue — developer-focused",
"colors": {
"banner_border": "#4169e1",
"banner_title": "#7eb8f6",
"banner_accent": "#8EA8FF",
"banner_dim": "#4b5563",
"banner_text": "#c9d1d9",
"ui_accent": "#7eb8f6",
"ui_label": "#8EA8FF",
"ui_ok": "#63D0A6",
"ui_error": "#F7A072",
"ui_warn": "#e6a855",
"prompt": "#c9d1d9",
"input_rule": "#4169e1",
"response_border": "#7eb8f6",
"session_label": "#7eb8f6",
"session_border": "#4b5563",
},
"spinner": {},
"branding": {
"agent_name": "Hermes Agent",
"welcome": "Welcome to Hermes Agent! Type your message or /help for commands.",
"goodbye": "Goodbye! ⚕",
"response_label": " ⚕ Hermes ",
"prompt_symbol": " ",
"help_header": "(^_^)? Available Commands",
},
"tool_prefix": "",
},
"poseidon": {
"name": "poseidon",
"description": "Ocean-god theme — deep blue and seafoam",
"colors": {
"banner_border": "#2A6FB9",
"banner_title": "#A9DFFF",
"banner_accent": "#5DB8F5",
"banner_dim": "#153C73",
"banner_text": "#EAF7FF",
"ui_accent": "#5DB8F5",
"ui_label": "#A9DFFF",
"ui_ok": "#4caf50",
"ui_error": "#ef5350",
"ui_warn": "#ffa726",
"prompt": "#EAF7FF",
"input_rule": "#2A6FB9",
"response_border": "#5DB8F5",
"session_label": "#A9DFFF",
"session_border": "#496884",
},
"spinner": {
"waiting_faces": ["(≈)", "(Ψ)", "(∿)", "(◌)", "(◠)"],
"thinking_faces": ["(Ψ)", "(∿)", "(≈)", "(⌁)", "(◌)"],
"thinking_verbs": [
"charting currents", "sounding the depth", "reading foam lines",
"steering the trident", "tracking undertow", "plotting sea lanes",
"calling the swell", "measuring pressure",
],
"wings": [
["⟪≈", "≈⟫"],
["⟪Ψ", "Ψ⟫"],
["⟪∿", "∿⟫"],
["⟪◌", "◌⟫"],
],
},
"branding": {
"agent_name": "Poseidon Agent",
"welcome": "Welcome to Poseidon Agent! Type your message or /help for commands.",
"goodbye": "Fair winds! Ψ",
"response_label": " Ψ Poseidon ",
"prompt_symbol": "Ψ ",
"help_header": "(Ψ) Available Commands",
},
"tool_prefix": "",
"banner_logo": """[bold #B8E8FF]██████╗ ██████╗ ███████╗██╗██████╗ ███████╗ ██████╗ ███╗ ██╗ █████╗ ██████╗ ███████╗███╗ ██╗████████╗[/]
[bold #97D6FF]██╔══██╗██╔═══██╗██╔════╝██║██╔══██╗██╔════╝██╔═══██╗████╗ ██║ ██╔══██╗██╔════╝ ██╔════╝████╗ ██║╚══██╔══╝[/]
[#75C1F6]██████╔╝██║ ██║███████╗██║██║ ██║█████╗ ██║ ██║██╔██╗ ██║█████╗███████║██║ ███╗█████╗ ██╔██╗ ██║ ██║[/]
[#4FA2E0]██╔═══╝ ██║ ██║╚════██║██║██║ ██║██╔══╝ ██║ ██║██║╚██╗██║╚════╝██╔══██║██║ ██║██╔══╝ ██║╚██╗██║ ██║[/]
[#2E7CC7]██║ ╚██████╔╝███████║██║██████╔╝███████╗╚██████╔╝██║ ╚████║ ██║ ██║╚██████╔╝███████╗██║ ╚████║ ██║[/]
[#1B4F95]╚═╝ ╚═════╝ ╚══════╝╚═╝╚═════╝ ╚══════╝ ╚═════╝ ╚═╝ ╚═══╝ ╚═╝ ╚═╝ ╚═════╝ ╚══════╝╚═╝ ╚═══╝ ╚═╝[/]""",
"banner_hero": """[#2A6FB9]⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⣀⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
[#5DB8F5]⠀⠀⠀⠀⠀⠀⠀⠀⠀⣠⣾⣿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
[#5DB8F5]⠀⠀⠀⠀⠀⠀⠀⢠⣿⠏⠀Ψ⠀⠹⣿⡄⠀⠀⠀⠀⠀⠀⠀[/]
[#A9DFFF]⠀⠀⠀⠀⠀⠀⠀⣿⡟⠀⠀⠀⠀⠀⢻⣿⠀⠀⠀⠀⠀⠀⠀[/]
[#A9DFFF]⠀⠀⠀≈≈≈≈≈⣿⡇⠀⠀⠀⠀⠀⢸⣿≈≈≈≈≈⠀⠀⠀[/]
[#5DB8F5]⠀⠀⠀⠀⠀⠀⠀⣿⡇⠀⠀⠀⠀⠀⢸⣿⠀⠀⠀⠀⠀⠀⠀[/]
[#2A6FB9]⠀⠀⠀⠀⠀⠀⠀⢿⣧⠀⠀⠀⠀⠀⣼⡿⠀⠀⠀⠀⠀⠀⠀[/]
[#2A6FB9]⠀⠀⠀⠀⠀⠀⠀⠘⢿⣷⣄⣀⣠⣾⡿⠃⠀⠀⠀⠀⠀⠀⠀[/]
[#153C73]⠀⠀⠀⠀⠀⠀⠀⠀⠈⠻⣿⣿⡿⠟⠁⠀⠀⠀⠀⠀⠀⠀⠀[/]
[#153C73]⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
[#5DB8F5]⠀⠀⠀⠀⠀≈≈≈≈≈≈≈≈≈≈≈≈≈≈≈⠀⠀⠀⠀⠀[/]
[#A9DFFF]⠀⠀⠀⠀⠀⠀≈≈≈≈≈≈≈≈≈≈≈≈≈⠀⠀⠀⠀⠀⠀[/]
[dim #153C73]deep waters hold[/]""",
},
"sisyphus": {
"name": "sisyphus",
"description": "Sisyphean theme — austere grayscale with persistence",
"colors": {
"banner_border": "#B7B7B7",
"banner_title": "#F5F5F5",
"banner_accent": "#E7E7E7",
"banner_dim": "#4A4A4A",
"banner_text": "#D3D3D3",
"ui_accent": "#E7E7E7",
"ui_label": "#D3D3D3",
"ui_ok": "#919191",
"ui_error": "#E7E7E7",
"ui_warn": "#B7B7B7",
"prompt": "#F5F5F5",
"input_rule": "#656565",
"response_border": "#B7B7B7",
"session_label": "#919191",
"session_border": "#656565",
},
"spinner": {
"waiting_faces": ["(◉)", "(◌)", "(◬)", "(⬤)", "(::)"],
"thinking_faces": ["(◉)", "(◬)", "(◌)", "(○)", "(●)"],
"thinking_verbs": [
"finding traction", "measuring the grade", "resetting the boulder",
"counting the ascent", "testing leverage", "setting the shoulder",
"pushing uphill", "enduring the loop",
],
"wings": [
["⟪◉", "◉⟫"],
["⟪◬", "◬⟫"],
["⟪◌", "◌⟫"],
["⟪⬤", "⬤⟫"],
],
},
"branding": {
"agent_name": "Sisyphus Agent",
"welcome": "Welcome to Sisyphus Agent! Type your message or /help for commands.",
"goodbye": "The boulder waits. ◉",
"response_label": " ◉ Sisyphus ",
"prompt_symbol": " ",
"help_header": "(◉) Available Commands",
},
"tool_prefix": "",
"banner_logo": """[bold #F5F5F5]███████╗██╗███████╗██╗ ██╗██████╗ ██╗ ██╗██╗ ██╗███████╗ █████╗ ██████╗ ███████╗███╗ ██╗████████╗[/]
[bold #E7E7E7]██╔════╝██║██╔════╝╚██╗ ██╔╝██╔══██╗██║ ██║██║ ██║██╔════╝ ██╔══██╗██╔════╝ ██╔════╝████╗ ██║╚══██╔══╝[/]
[#D7D7D7]███████╗██║███████╗ ╚████╔╝ ██████╔╝███████║██║ ██║███████╗█████╗███████║██║ ███╗█████╗ ██╔██╗ ██║ ██║[/]
[#BFBFBF]╚════██║██║╚════██║ ╚██╔╝ ██╔═══╝ ██╔══██║██║ ██║╚════██║╚════╝██╔══██║██║ ██║██╔══╝ ██║╚██╗██║ ██║[/]
[#8F8F8F]███████║██║███████║ ██║ ██║ ██║ ██║╚██████╔╝███████║ ██║ ██║╚██████╔╝███████╗██║ ╚████║ ██║[/]
[#626262]╚══════╝╚═╝╚══════╝ ╚═╝ ╚═╝ ╚═╝ ╚═╝ ╚═════╝ ╚══════╝ ╚═╝ ╚═╝ ╚═════╝ ╚══════╝╚═╝ ╚═══╝ ╚═╝[/]""",
"banner_hero": """[#B7B7B7]⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⣀⣀⣀⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
[#D3D3D3]⠀⠀⠀⠀⠀⠀⠀⣠⣾⣿⣿⣿⣿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀[/]
[#E7E7E7]⠀⠀⠀⠀⠀⠀⣾⣿⣿⣿⣿⣿⣿⣿⣷⠀⠀⠀⠀⠀⠀⠀[/]
[#F5F5F5]⠀⠀⠀⠀⠀⢸⣿⣿⣿⣿⣿⣿⣿⣿⣿⡇⠀⠀⠀⠀⠀⠀[/]
[#E7E7E7]⠀⠀⠀⠀⠀⠀⣿⣿⣿⣿⣿⣿⣿⣿⣿⠀⠀⠀⠀⠀⠀⠀[/]
[#D3D3D3]⠀⠀⠀⠀⠀⠀⠘⢿⣿⣿⣿⣿⣿⡿⠃⠀⠀⠀⠀⠀⠀⠀[/]
[#B7B7B7]⠀⠀⠀⠀⠀⠀⠀⠀⠙⠿⣿⠿⠋⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
[#919191][/]
[#656565]⠀⠀⠀⠀⠀⠀⠀⠀⠀⣰⡄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
[#656565]⠀⠀⠀⠀⠀⠀⠀⠀⣰⣿⣿⣆⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
[#4A4A4A]⠀⠀⠀⠀⠀⠀⠀⣰⣿⣿⣿⣿⣆⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
[#4A4A4A]⠀⠀⠀⠀⠀⣀⣴⣿⣿⣿⣿⣿⣿⣦⣀⠀⠀⠀⠀⠀⠀[/]
[#656565]⠀⠀⠀━━━━━━━━━━━━━━━━━━━━━━━⠀⠀⠀[/]
[dim #4A4A4A]the boulder[/]""",
},
"charizard": {
"name": "charizard",
"description": "Volcanic theme — burnt orange and ember",
"colors": {
"banner_border": "#C75B1D",
"banner_title": "#FFD39A",
"banner_accent": "#F29C38",
"banner_dim": "#7A3511",
"banner_text": "#FFF0D4",
"ui_accent": "#F29C38",
"ui_label": "#FFD39A",
"ui_ok": "#4caf50",
"ui_error": "#ef5350",
"ui_warn": "#ffa726",
"prompt": "#FFF0D4",
"input_rule": "#C75B1D",
"response_border": "#F29C38",
"session_label": "#FFD39A",
"session_border": "#6C4724",
},
"spinner": {
"waiting_faces": ["(✦)", "(▲)", "(◇)", "(<>)", "(🔥)"],
"thinking_faces": ["(✦)", "(▲)", "(◇)", "(⌁)", "(🔥)"],
"thinking_verbs": [
"banking into the draft", "measuring burn", "reading the updraft",
"tracking ember fall", "setting wing angle", "holding the flame core",
"plotting a hot landing", "coiling for lift",
],
"wings": [
["⟪✦", "✦⟫"],
["⟪▲", "▲⟫"],
["⟪◌", "◌⟫"],
["⟪◇", "◇⟫"],
],
},
"branding": {
"agent_name": "Charizard Agent",
"welcome": "Welcome to Charizard Agent! Type your message or /help for commands.",
"goodbye": "Flame out! ✦",
"response_label": " ✦ Charizard ",
"prompt_symbol": " ",
"help_header": "(✦) Available Commands",
},
"tool_prefix": "",
"banner_logo": """[bold #FFF0D4] ██████╗██╗ ██╗ █████╗ ██████╗ ██╗███████╗ █████╗ ██████╗ ██████╗ █████╗ ██████╗ ███████╗███╗ ██╗████████╗[/]
[bold #FFD39A]██╔════╝██║ ██║██╔══██╗██╔══██╗██║╚══███╔╝██╔══██╗██╔══██╗██╔══██╗ ██╔══██╗██╔════╝ ██╔════╝████╗ ██║╚══██╔══╝[/]
[#F29C38]██║ ███████║███████║██████╔╝██║ ███╔╝ ███████║██████╔╝██║ ██║█████╗███████║██║ ███╗█████╗ ██╔██╗ ██║ ██║[/]
[#E2832B]██║ ██╔══██║██╔══██║██╔══██╗██║ ███╔╝ ██╔══██║██╔══██╗██║ ██║╚════╝██╔══██║██║ ██║██╔══╝ ██║╚██╗██║ ██║[/]
[#C75B1D]╚██████╗██║ ██║██║ ██║██║ ██║██║███████╗██║ ██║██║ ██║██████╔╝ ██║ ██║╚██████╔╝███████╗██║ ╚████║ ██║[/]
[#7A3511] ╚═════╝╚═╝ ╚═╝╚═╝ ╚═╝╚═╝ ╚═╝╚═╝╚══════╝╚═╝ ╚═╝╚═╝ ╚═╝╚═════╝ ╚═╝ ╚═╝ ╚═════╝ ╚══════╝╚═╝ ╚═══╝ ╚═╝[/]""",
"banner_hero": """[#FFD39A]⠀⠀⠀⠀⠀⠀⠀⠀⣀⣤⠶⠶⠶⣤⣀⠀⠀⠀⠀⠀⠀⠀⠀[/]
[#F29C38]⠀⠀⠀⠀⠀⠀⣴⠟⠁⠀⠀⠀⠀⠈⠻⣦⠀⠀⠀⠀⠀⠀[/]
[#F29C38]⠀⠀⠀⠀⠀⣼⠏⠀⠀⠀✦⠀⠀⠀⠀⠹⣧⠀⠀⠀⠀⠀[/]
[#E2832B]⠀⠀⠀⠀⢰⡟⠀⠀⣀⣤⣤⣤⣀⠀⠀⠀⢻⡆⠀⠀⠀⠀[/]
[#E2832B]⠀⠀⣠⡾⠛⠁⣠⣾⠟⠉⠀⠉⠻⣷⣄⠀⠈⠛⢷⣄⠀⠀[/]
[#C75B1D]⠀⣼⠟⠀⢀⣾⠟⠁⠀⠀⠀⠀⠀⠈⠻⣷⡀⠀⠻⣧⠀[/]
[#C75B1D]⢸⡟⠀⠀⣿⡟⠀⠀⠀🔥⠀⠀⠀⠀⢻⣿⠀⠀⢻⡇[/]
[#7A3511]⠀⠻⣦⡀⠘⢿⣧⡀⠀⠀⠀⠀⠀⢀⣼⡿⠃⢀⣴⠟⠀[/]
[#7A3511]⠀⠀⠈⠻⣦⣀⠙⢿⣷⣤⣤⣤⣾⡿⠋⣀⣴⠟⠁⠀⠀[/]
[#C75B1D]⠀⠀⠀⠀⠈⠙⠛⠶⠤⠭⠭⠤⠶⠛⠋⠁⠀⠀⠀⠀[/]
[#F29C38]⠀⠀⠀⠀⠀⠀⠀⠀⣰⡿⢿⣆⠀⠀⠀⠀⠀⠀⠀⠀⠀[/]
[#F29C38]⠀⠀⠀⠀⠀⠀⠀⣼⡟⠀⠀⢻⣧⠀⠀⠀⠀⠀⠀⠀⠀[/]
[dim #7A3511]tail flame lit[/]""",
},
}
# =============================================================================
# Skin loading and management
# =============================================================================
_active_skin: Optional[SkinConfig] = None
_active_skin_name: str = "default"
def _skins_dir() -> Path:
"""User skins directory."""
home = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
return home / "skins"
def _load_skin_from_yaml(path: Path) -> Optional[Dict[str, Any]]:
"""Load a skin definition from a YAML file."""
try:
import yaml
with open(path, "r", encoding="utf-8") as f:
data = yaml.safe_load(f)
if isinstance(data, dict) and "name" in data:
return data
except Exception as e:
logger.debug("Failed to load skin from %s: %s", path, e)
return None
def _build_skin_config(data: Dict[str, Any]) -> SkinConfig:
"""Build a SkinConfig from a raw dict (built-in or loaded from YAML)."""
# Start with default values as base for missing keys
default = _BUILTIN_SKINS["default"]
colors = dict(default.get("colors", {}))
colors.update(data.get("colors", {}))
spinner = dict(default.get("spinner", {}))
spinner.update(data.get("spinner", {}))
branding = dict(default.get("branding", {}))
branding.update(data.get("branding", {}))
return SkinConfig(
name=data.get("name", "unknown"),
description=data.get("description", ""),
colors=colors,
spinner=spinner,
branding=branding,
tool_prefix=data.get("tool_prefix", default.get("tool_prefix", "")),
banner_logo=data.get("banner_logo", ""),
banner_hero=data.get("banner_hero", ""),
)
def list_skins() -> List[Dict[str, str]]:
"""List all available skins (built-in + user-installed).
Returns list of {"name": ..., "description": ..., "source": "builtin"|"user"}.
"""
result = []
for name, data in _BUILTIN_SKINS.items():
result.append({
"name": name,
"description": data.get("description", ""),
"source": "builtin",
})
skins_path = _skins_dir()
if skins_path.is_dir():
for f in sorted(skins_path.glob("*.yaml")):
data = _load_skin_from_yaml(f)
if data:
skin_name = data.get("name", f.stem)
# Skip if it shadows a built-in
if any(s["name"] == skin_name for s in result):
continue
result.append({
"name": skin_name,
"description": data.get("description", ""),
"source": "user",
})
return result
def load_skin(name: str) -> SkinConfig:
"""Load a skin by name. Checks user skins first, then built-in."""
# Check user skins directory
skins_path = _skins_dir()
user_file = skins_path / f"{name}.yaml"
if user_file.is_file():
data = _load_skin_from_yaml(user_file)
if data:
return _build_skin_config(data)
# Check built-in skins
if name in _BUILTIN_SKINS:
return _build_skin_config(_BUILTIN_SKINS[name])
# Fallback to default
logger.warning("Skin '%s' not found, using default", name)
return _build_skin_config(_BUILTIN_SKINS["default"])
def get_active_skin() -> SkinConfig:
"""Get the currently active skin config (cached)."""
global _active_skin
if _active_skin is None:
_active_skin = load_skin(_active_skin_name)
return _active_skin
def set_active_skin(name: str) -> SkinConfig:
"""Switch the active skin. Returns the new SkinConfig."""
global _active_skin, _active_skin_name
_active_skin_name = name
_active_skin = load_skin(name)
return _active_skin
def get_active_skin_name() -> str:
"""Get the name of the currently active skin."""
return _active_skin_name
def init_skin_from_config(config: dict) -> None:
"""Initialize the active skin from CLI config at startup.
Call this once during CLI init with the loaded config dict.
"""
display = config.get("display", {})
skin_name = display.get("skin", "default")
if isinstance(skin_name, str) and skin_name.strip():
set_active_skin(skin_name.strip())
else:
set_active_skin("default")

View File

@@ -79,12 +79,8 @@ def show_status(args):
"OpenRouter": "OPENROUTER_API_KEY",
"Anthropic": "ANTHROPIC_API_KEY",
"OpenAI": "OPENAI_API_KEY",
"Z.AI/GLM": "GLM_API_KEY",
"Kimi": "KIMI_API_KEY",
"MiniMax": "MINIMAX_API_KEY",
"MiniMax-CN": "MINIMAX_CN_API_KEY",
"Firecrawl": "FIRECRAWL_API_KEY",
"Browserbase": "BROWSERBASE_API_KEY", # Optional — local browser works without this
"Browserbase": "BROWSERBASE_API_KEY",
"FAL": "FAL_KEY",
"Tinker": "TINKER_API_KEY",
"WandB": "WANDB_API_KEY",
@@ -141,28 +137,6 @@ def show_status(args):
if codex_status.get("error") and not codex_logged_in:
print(f" Error: {codex_status.get('error')}")
# =========================================================================
# API-Key Providers
# =========================================================================
print()
print(color("◆ API-Key Providers", Colors.CYAN, Colors.BOLD))
apikey_providers = {
"Z.AI / GLM": ("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY"),
"Kimi / Moonshot": ("KIMI_API_KEY",),
"MiniMax": ("MINIMAX_API_KEY",),
"MiniMax (China)": ("MINIMAX_CN_API_KEY",),
}
for pname, env_vars in apikey_providers.items():
key_val = ""
for ev in env_vars:
key_val = get_env_value(ev) or ""
if key_val:
break
configured = bool(key_val)
label = "configured" if configured else "not configured (run: hermes model)"
print(f" {pname:<16} {check_mark(configured)} {label}")
# =========================================================================
# Terminal Configuration
# =========================================================================
@@ -206,9 +180,6 @@ def show_status(args):
"Telegram": ("TELEGRAM_BOT_TOKEN", "TELEGRAM_HOME_CHANNEL"),
"Discord": ("DISCORD_BOT_TOKEN", "DISCORD_HOME_CHANNEL"),
"WhatsApp": ("WHATSAPP_ENABLED", None),
"Signal": ("SIGNAL_HTTP_URL", "SIGNAL_HOME_CHANNEL"),
"Slack": ("SLACK_BOT_TOKEN", None),
"Email": ("EMAIL_ADDRESS", "EMAIL_HOME_ADDRESS"),
}
for name, (token_var, home_var) in platforms.items():
@@ -264,7 +235,7 @@ def show_status(args):
if jobs_file.exists():
import json
try:
with open(jobs_file, encoding="utf-8") as f:
with open(jobs_file) as f:
data = json.load(f)
jobs = data.get("jobs", [])
enabled_jobs = [j for j in jobs if j.get("enabled", True)]
@@ -284,7 +255,7 @@ def show_status(args):
if sessions_file.exists():
import json
try:
with open(sessions_file, encoding="utf-8") as f:
with open(sessions_file) as f:
data = json.load(f)
print(f" Active: {len(data)} session(s)")
except Exception:

File diff suppressed because it is too large Load Diff

View File

@@ -7,6 +7,3 @@ without risk of circular imports.
OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"
OPENROUTER_MODELS_URL = f"{OPENROUTER_BASE_URL}/models"
OPENROUTER_CHAT_URL = f"{OPENROUTER_BASE_URL}/chat/completions"
NOUS_API_BASE_URL = "https://inference-api.nousresearch.com/v1"
NOUS_API_CHAT_URL = f"{NOUS_API_BASE_URL}/chat/completions"

View File

@@ -16,7 +16,6 @@ Key design decisions:
import json
import os
import re
import sqlite3
import time
from pathlib import Path
@@ -25,7 +24,7 @@ from typing import Dict, Any, List, Optional
DEFAULT_DB_PATH = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes")) / "state.db"
SCHEMA_VERSION = 4
SCHEMA_VERSION = 2
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS schema_version (
@@ -47,7 +46,6 @@ CREATE TABLE IF NOT EXISTS sessions (
tool_call_count INTEGER DEFAULT 0,
input_tokens INTEGER DEFAULT 0,
output_tokens INTEGER DEFAULT 0,
title TEXT,
FOREIGN KEY (parent_session_id) REFERENCES sessions(id)
);
@@ -135,33 +133,7 @@ class SessionDB:
except sqlite3.OperationalError:
pass # Column already exists
cursor.execute("UPDATE schema_version SET version = 2")
if current_version < 3:
# v3: add title column to sessions
try:
cursor.execute("ALTER TABLE sessions ADD COLUMN title TEXT")
except sqlite3.OperationalError:
pass # Column already exists
cursor.execute("UPDATE schema_version SET version = 3")
if current_version < 4:
# v4: add unique index on title (NULLs allowed, only non-NULL must be unique)
try:
cursor.execute(
"CREATE UNIQUE INDEX IF NOT EXISTS idx_sessions_title_unique "
"ON sessions(title) WHERE title IS NOT NULL"
)
except sqlite3.OperationalError:
pass # Index already exists
cursor.execute("UPDATE schema_version SET version = 4")
# Unique title index — always ensure it exists (safe to run after migrations
# since the title column is guaranteed to exist at this point)
try:
cursor.execute(
"CREATE UNIQUE INDEX IF NOT EXISTS idx_sessions_title_unique "
"ON sessions(title) WHERE title IS NOT NULL"
)
except sqlite3.OperationalError:
pass # Index already exists
# FTS5 setup (separate because CREATE VIRTUAL TABLE can't be in executescript with IF NOT EXISTS reliably)
try:
@@ -247,210 +219,6 @@ class SessionDB:
row = cursor.fetchone()
return dict(row) if row else None
# Maximum length for session titles
MAX_TITLE_LENGTH = 100
@staticmethod
def sanitize_title(title: Optional[str]) -> Optional[str]:
"""Validate and sanitize a session title.
- Strips leading/trailing whitespace
- Removes ASCII control characters (0x00-0x1F, 0x7F) and problematic
Unicode control chars (zero-width, RTL/LTR overrides, etc.)
- Collapses internal whitespace runs to single spaces
- Normalizes empty/whitespace-only strings to None
- Enforces MAX_TITLE_LENGTH
Returns the cleaned title string or None.
Raises ValueError if the title exceeds MAX_TITLE_LENGTH after cleaning.
"""
if not title:
return None
import re
# Remove ASCII control characters (0x00-0x1F, 0x7F) but keep
# whitespace chars (\t=0x09, \n=0x0A, \r=0x0D) so they can be
# normalized to spaces by the whitespace collapsing step below
cleaned = re.sub(r'[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]', '', title)
# Remove problematic Unicode control characters:
# - Zero-width chars (U+200B-U+200F, U+FEFF)
# - Directional overrides (U+202A-U+202E, U+2066-U+2069)
# - Object replacement (U+FFFC), interlinear annotation (U+FFF9-U+FFFB)
cleaned = re.sub(
r'[\u200b-\u200f\u2028-\u202e\u2060-\u2069\ufeff\ufffc\ufff9-\ufffb]',
'', cleaned,
)
# Collapse internal whitespace runs and strip
cleaned = re.sub(r'\s+', ' ', cleaned).strip()
if not cleaned:
return None
if len(cleaned) > SessionDB.MAX_TITLE_LENGTH:
raise ValueError(
f"Title too long ({len(cleaned)} chars, max {SessionDB.MAX_TITLE_LENGTH})"
)
return cleaned
def set_session_title(self, session_id: str, title: str) -> bool:
"""Set or update a session's title.
Returns True if session was found and title was set.
Raises ValueError if title is already in use by another session,
or if the title fails validation (too long, invalid characters).
Empty/whitespace-only strings are normalized to None (clearing the title).
"""
title = self.sanitize_title(title)
if title:
# Check uniqueness (allow the same session to keep its own title)
cursor = self._conn.execute(
"SELECT id FROM sessions WHERE title = ? AND id != ?",
(title, session_id),
)
conflict = cursor.fetchone()
if conflict:
raise ValueError(
f"Title '{title}' is already in use by session {conflict['id']}"
)
cursor = self._conn.execute(
"UPDATE sessions SET title = ? WHERE id = ?",
(title, session_id),
)
self._conn.commit()
return cursor.rowcount > 0
def get_session_title(self, session_id: str) -> Optional[str]:
"""Get the title for a session, or None."""
cursor = self._conn.execute(
"SELECT title FROM sessions WHERE id = ?", (session_id,)
)
row = cursor.fetchone()
return row["title"] if row else None
def get_session_by_title(self, title: str) -> Optional[Dict[str, Any]]:
"""Look up a session by exact title. Returns session dict or None."""
cursor = self._conn.execute(
"SELECT * FROM sessions WHERE title = ?", (title,)
)
row = cursor.fetchone()
return dict(row) if row else None
def resolve_session_by_title(self, title: str) -> Optional[str]:
"""Resolve a title to a session ID, preferring the latest in a lineage.
If the exact title exists, returns that session's ID.
If not, searches for "title #N" variants and returns the latest one.
If the exact title exists AND numbered variants exist, returns the
latest numbered variant (the most recent continuation).
"""
# First try exact match
exact = self.get_session_by_title(title)
# Also search for numbered variants: "title #2", "title #3", etc.
# Escape SQL LIKE wildcards (%, _) in the title to prevent false matches
escaped = title.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
cursor = self._conn.execute(
"SELECT id, title, started_at FROM sessions "
"WHERE title LIKE ? ESCAPE '\\' ORDER BY started_at DESC",
(f"{escaped} #%",),
)
numbered = cursor.fetchall()
if numbered:
# Return the most recent numbered variant
return numbered[0]["id"]
elif exact:
return exact["id"]
return None
def get_next_title_in_lineage(self, base_title: str) -> str:
"""Generate the next title in a lineage (e.g., "my session""my session #2").
Strips any existing " #N" suffix to find the base name, then finds
the highest existing number and increments.
"""
import re
# Strip existing #N suffix to find the true base
match = re.match(r'^(.*?) #(\d+)$', base_title)
if match:
base = match.group(1)
else:
base = base_title
# Find all existing numbered variants
# Escape SQL LIKE wildcards (%, _) in the base to prevent false matches
escaped = base.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
cursor = self._conn.execute(
"SELECT title FROM sessions WHERE title = ? OR title LIKE ? ESCAPE '\\'",
(base, f"{escaped} #%"),
)
existing = [row["title"] for row in cursor.fetchall()]
if not existing:
return base # No conflict, use the base name as-is
# Find the highest number
max_num = 1 # The unnumbered original counts as #1
for t in existing:
m = re.match(r'^.* #(\d+)$', t)
if m:
max_num = max(max_num, int(m.group(1)))
return f"{base} #{max_num + 1}"
def list_sessions_rich(
self,
source: str = None,
limit: int = 20,
offset: int = 0,
) -> List[Dict[str, Any]]:
"""List sessions with preview (first user message) and last active timestamp.
Returns dicts with keys: id, source, model, title, started_at, ended_at,
message_count, preview (first 60 chars of first user message),
last_active (timestamp of last message).
Uses a single query with correlated subqueries instead of N+2 queries.
"""
source_clause = "WHERE s.source = ?" if source else ""
query = f"""
SELECT s.*,
COALESCE(
(SELECT SUBSTR(REPLACE(REPLACE(m.content, X'0A', ' '), X'0D', ' '), 1, 63)
FROM messages m
WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
ORDER BY m.timestamp, m.id LIMIT 1),
''
) AS _preview_raw,
COALESCE(
(SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
s.started_at
) AS last_active
FROM sessions s
{source_clause}
ORDER BY s.started_at DESC
LIMIT ? OFFSET ?
"""
params = (source, limit, offset) if source else (limit, offset)
cursor = self._conn.execute(query, params)
sessions = []
for row in cursor.fetchall():
s = dict(row)
# Build the preview from the raw substring
raw = s.pop("_preview_raw", "").strip()
if raw:
text = raw[:60]
s["preview"] = text + ("..." if len(raw) > 60 else "")
else:
s["preview"] = ""
sessions.append(s)
return sessions
# =========================================================================
# Message storage
# =========================================================================
@@ -491,16 +259,12 @@ class SessionDB:
msg_id = cursor.lastrowid
# Update counters
# Count actual tool calls from the tool_calls list (not from tool responses).
# A single assistant message can contain multiple parallel tool calls.
num_tool_calls = 0
if tool_calls is not None:
num_tool_calls = len(tool_calls) if isinstance(tool_calls, list) else 1
if num_tool_calls > 0:
is_tool_related = role == "tool" or tool_calls is not None
if is_tool_related:
self._conn.execute(
"""UPDATE sessions SET message_count = message_count + 1,
tool_call_count = tool_call_count + ? WHERE id = ?""",
(num_tool_calls, session_id),
tool_call_count = tool_call_count + 1 WHERE id = ?""",
(session_id,),
)
else:
self._conn.execute(
@@ -558,32 +322,6 @@ class SessionDB:
# Search
# =========================================================================
@staticmethod
def _sanitize_fts5_query(query: str) -> str:
"""Sanitize user input for safe use in FTS5 MATCH queries.
FTS5 has its own query syntax where characters like ``"``, ``(``, ``)``,
``+``, ``*``, ``{``, ``}`` and bare boolean operators (``AND``, ``OR``,
``NOT``) have special meaning. Passing raw user input directly to
MATCH can cause ``sqlite3.OperationalError``.
Strategy: strip characters that are only meaningful as FTS5 operators
and would otherwise cause syntax errors. This preserves normal keyword
search while preventing crashes on inputs like ``C++``, ``"unterminated``,
or ``hello AND``.
"""
# Remove FTS5-special characters that are not useful in keyword search
sanitized = re.sub(r'[+{}()"^]', " ", query)
# Collapse repeated * (e.g. "***") into a single one, and remove
# leading * (prefix-only matching requires at least one char before *)
sanitized = re.sub(r"\*+", "*", sanitized)
sanitized = re.sub(r"(^|\s)\*", r"\1", sanitized)
# Remove dangling boolean operators at start/end that would cause
# syntax errors (e.g. "hello AND" or "OR world")
sanitized = re.sub(r"(?i)^(AND|OR|NOT)\b\s*", "", sanitized.strip())
sanitized = re.sub(r"(?i)\s+(AND|OR|NOT)\s*$", "", sanitized.strip())
return sanitized.strip()
def search_messages(
self,
query: str,
@@ -607,10 +345,6 @@ class SessionDB:
if not query or not query.strip():
return []
query = self._sanitize_fts5_query(query)
if not query:
return []
if source_filter is None:
source_filter = ["cli", "telegram", "discord", "whatsapp", "slack"]
@@ -650,11 +384,7 @@ class SessionDB:
LIMIT ? OFFSET ?
"""
try:
cursor = self._conn.execute(sql, params)
except sqlite3.OperationalError:
# FTS5 query syntax error despite sanitization — return empty
return []
cursor = self._conn.execute(sql, params)
matches = [dict(row) for row in cursor.fetchall()]
# Add surrounding context (1 message before + after each match)

View File

@@ -1,119 +0,0 @@
"""
Timezone-aware clock for Hermes.
Provides a single ``now()`` helper that returns a timezone-aware datetime
based on the user's configured IANA timezone (e.g. ``Asia/Kolkata``).
Resolution order:
1. ``HERMES_TIMEZONE`` environment variable
2. ``timezone`` key in ``~/.hermes/config.yaml``
3. Falls back to the server's local time (``datetime.now().astimezone()``)
Invalid timezone values log a warning and fall back safely — Hermes never
crashes due to a bad timezone string.
"""
import logging
import os
from datetime import datetime, timezone as _tz
from pathlib import Path
from typing import Optional
logger = logging.getLogger(__name__)
try:
from zoneinfo import ZoneInfo
except ImportError:
# Python 3.8 fallback (shouldn't be needed — Hermes requires 3.9+)
from backports.zoneinfo import ZoneInfo # type: ignore[no-redef]
# Cached state — resolved once, reused on every call.
# Call reset_cache() to force re-resolution (e.g. after config changes).
_cached_tz: Optional[ZoneInfo] = None
_cached_tz_name: Optional[str] = None
_cache_resolved: bool = False
def _resolve_timezone_name() -> str:
"""Read the configured IANA timezone string (or empty string).
This does file I/O when falling through to config.yaml, so callers
should cache the result rather than calling on every ``now()``.
"""
# 1. Environment variable (highest priority — set by Supervisor, etc.)
tz_env = os.getenv("HERMES_TIMEZONE", "").strip()
if tz_env:
return tz_env
# 2. config.yaml ``timezone`` key
try:
import yaml
hermes_home = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
config_path = hermes_home / "config.yaml"
if config_path.exists():
with open(config_path) as f:
cfg = yaml.safe_load(f) or {}
tz_cfg = cfg.get("timezone", "")
if isinstance(tz_cfg, str) and tz_cfg.strip():
return tz_cfg.strip()
except Exception:
pass
return ""
def _get_zoneinfo(name: str) -> Optional[ZoneInfo]:
"""Validate and return a ZoneInfo, or None if invalid."""
if not name:
return None
try:
return ZoneInfo(name)
except (KeyError, Exception) as exc:
logger.warning(
"Invalid timezone '%s': %s. Falling back to server local time.",
name, exc,
)
return None
def get_timezone() -> Optional[ZoneInfo]:
"""Return the user's configured ZoneInfo, or None (meaning server-local).
Resolved once and cached. Call ``reset_cache()`` after config changes.
"""
global _cached_tz, _cached_tz_name, _cache_resolved
if not _cache_resolved:
_cached_tz_name = _resolve_timezone_name()
_cached_tz = _get_zoneinfo(_cached_tz_name)
_cache_resolved = True
return _cached_tz
def get_timezone_name() -> str:
"""Return the IANA name of the configured timezone, or empty string."""
global _cached_tz_name, _cache_resolved
if not _cache_resolved:
get_timezone() # populates cache
return _cached_tz_name or ""
def now() -> datetime:
"""
Return the current time as a timezone-aware datetime.
If a valid timezone is configured, returns wall-clock time in that zone.
Otherwise returns the server's local time (via ``astimezone()``).
"""
tz = get_timezone()
if tz is not None:
return datetime.now(tz)
# No timezone configured — use server-local (still tz-aware)
return datetime.now().astimezone()
def reset_cache() -> None:
"""Clear the cached timezone. Used by tests and after config changes."""
global _cached_tz, _cached_tz_name, _cache_resolved
_cached_tz = None
_cached_tz_name = None
_cache_resolved = False

Binary file not shown.

Before

Width:  |  Height:  |  Size: 28 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 870 B

Binary file not shown.

Before

Width:  |  Height:  |  Size: 2.5 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 7.9 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 29 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 134 KiB

View File

@@ -19,10 +19,7 @@
<link href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700&family=JetBrains+Mono:wght@400;500&display=swap" rel="stylesheet">
<link rel="stylesheet" href="style.css">
<link rel="icon" type="image/x-icon" href="favicon.ico">
<link rel="icon" type="image/png" sizes="32x32" href="favicon-32x32.png">
<link rel="icon" type="image/png" sizes="16x16" href="favicon-16x16.png">
<link rel="apple-touch-icon" sizes="180x180" href="apple-touch-icon.png">
<link rel="icon" href="data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 100 100'><text y='.9em' font-size='90'>⚕</text></svg>">
</head>
<body>
<!-- Ambient glow effects -->

View File

@@ -149,7 +149,7 @@ class MiniSWERunner:
def __init__(
self,
model: str = "anthropic/claude-sonnet-4.6",
model: str = "anthropic/claude-sonnet-4-20250514",
base_url: str = None,
api_key: str = None,
env_type: str = "local",
@@ -200,7 +200,13 @@ class MiniSWERunner:
else:
client_kwargs["base_url"] = "https://openrouter.ai/api/v1"
if base_url and "api.anthropic.com" in base_url.strip().lower():
raise ValueError(
"Anthropic's native /v1/messages API is not supported yet (planned for a future release). "
"Hermes currently requires OpenAI-compatible /chat/completions endpoints. "
"To use Claude models now, route through OpenRouter (OPENROUTER_API_KEY) "
"or any OpenAI-compatible proxy that wraps the Anthropic API."
)
# Handle API key - OpenRouter is the primary provider
if api_key:

View File

@@ -225,18 +225,6 @@ def get_tool_definitions(
# Ask the registry for schemas (only returns tools whose check_fn passes)
filtered_tools = registry.get_definitions(tools_to_include, quiet=quiet_mode)
# Rebuild execute_code schema to only list sandbox tools that are actually
# enabled. Without this, the model sees "web_search is available in
# execute_code" even when the user disabled the web toolset (#560-discord).
if "execute_code" in tools_to_include:
from tools.code_execution_tool import SANDBOX_ALLOWED_TOOLS, build_execute_code_schema
sandbox_enabled = SANDBOX_ALLOWED_TOOLS & tools_to_include
dynamic_schema = build_execute_code_schema(sandbox_enabled)
for i, td in enumerate(filtered_tools):
if td.get("function", {}).get("name") == "execute_code":
filtered_tools[i] = {"type": "function", "function": dynamic_schema}
break
if not quiet_mode:
if filtered_tools:
tool_names = [t["function"]["name"] for t in filtered_tools]
@@ -266,7 +254,6 @@ def handle_function_call(
function_args: Dict[str, Any],
task_id: Optional[str] = None,
user_task: Optional[str] = None,
enabled_tools: Optional[List[str]] = None,
) -> str:
"""
Main function call dispatcher that routes calls to the tool registry.
@@ -276,36 +263,19 @@ def handle_function_call(
function_args: Arguments for the function.
task_id: Unique identifier for terminal/browser session isolation.
user_task: The user's original task (for browser_snapshot context).
enabled_tools: Tool names enabled for this session. When provided,
execute_code uses this list to determine which sandbox
tools to generate. Falls back to the process-global
``_last_resolved_tool_names`` for backward compat.
Returns:
Function result as a JSON string.
"""
# Notify the read-loop tracker when a non-read/search tool runs,
# so the *consecutive* counter resets (reads after other work are fine).
_READ_SEARCH_TOOLS = {"read_file", "search_files"}
if function_name not in _READ_SEARCH_TOOLS:
try:
from tools.file_tools import notify_other_tool_call
notify_other_tool_call(task_id or "default")
except Exception:
pass # file_tools may not be loaded yet
try:
if function_name in _AGENT_LOOP_TOOLS:
return json.dumps({"error": f"{function_name} must be handled by the agent loop"})
if function_name == "execute_code":
# Prefer the caller-provided list so subagents can't overwrite
# the parent's tool set via the process-global.
sandbox_enabled = enabled_tools if enabled_tools is not None else _last_resolved_tool_names
return registry.dispatch(
function_name, function_args,
task_id=task_id,
enabled_tools=sandbox_enabled,
enabled_tools=_last_resolved_tool_names,
)
return registry.dispatch(

View File

@@ -1,207 +0,0 @@
---
name: solana
description: Query Solana blockchain data with USD pricing — wallet balances, token portfolios with values, transaction details, NFTs, whale detection, and live network stats. Uses Solana RPC + CoinGecko. No API key required.
version: 0.2.0
author: Deniz Alagoz (gizdusum), enhanced by Hermes Agent
license: MIT
metadata:
hermes:
tags: [Solana, Blockchain, Crypto, Web3, RPC, DeFi, NFT]
related_skills: []
---
# Solana Blockchain Skill
Query Solana on-chain data enriched with USD pricing via CoinGecko.
8 commands: wallet portfolio, token info, transactions, activity, NFTs,
whale detection, network stats, and price lookup.
No API key needed. Uses only Python standard library (urllib, json, argparse).
---
## When to Use
- User asks for a Solana wallet balance, token holdings, or portfolio value
- User wants to inspect a specific transaction by signature
- User wants SPL token metadata, price, supply, or top holders
- User wants recent transaction history for an address
- User wants NFTs owned by a wallet
- User wants to find large SOL transfers (whale detection)
- User wants Solana network health, TPS, epoch, or SOL price
- User asks "what's the price of BONK/JUP/SOL?"
---
## Prerequisites
The helper script uses only Python standard library (urllib, json, argparse).
No external packages required.
Pricing data comes from CoinGecko's free API (no key needed, rate-limited
to ~10-30 requests/minute). For faster lookups, use `--no-prices` flag.
---
## Quick Reference
RPC endpoint (default): https://api.mainnet-beta.solana.com
Override: export SOLANA_RPC_URL=https://your-private-rpc.com
Helper script path: ~/.hermes/skills/blockchain/solana/scripts/solana_client.py
```
python3 solana_client.py wallet <address> [--limit N] [--all] [--no-prices]
python3 solana_client.py tx <signature>
python3 solana_client.py token <mint_address>
python3 solana_client.py activity <address> [--limit N]
python3 solana_client.py nft <address>
python3 solana_client.py whales [--min-sol N]
python3 solana_client.py stats
python3 solana_client.py price <mint_or_symbol>
```
---
## Procedure
### 0. Setup Check
```bash
python3 --version
# Optional: set a private RPC for better rate limits
export SOLANA_RPC_URL="https://api.mainnet-beta.solana.com"
# Confirm connectivity
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py stats
```
### 1. Wallet Portfolio
Get SOL balance, SPL token holdings with USD values, NFT count, and
portfolio total. Tokens sorted by value, dust filtered, known tokens
labeled by name (BONK, JUP, USDC, etc.).
```bash
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py \
wallet 9WzDXwBbmkg8ZTbNMqUxvQRAyrZzDsGYdLVL9zYtAWWM
```
Flags:
- `--limit N` — show top N tokens (default: 20)
- `--all` — show all tokens, no dust filter, no limit
- `--no-prices` — skip CoinGecko price lookups (faster, RPC-only)
Output includes: SOL balance + USD value, token list with prices sorted
by value, dust count, NFT summary, total portfolio value in USD.
### 2. Transaction Details
Inspect a full transaction by its base58 signature. Shows balance changes
in both SOL and USD.
```bash
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py \
tx 5j7s8K...your_signature_here
```
Output: slot, timestamp, fee, status, balance changes (SOL + USD),
program invocations.
### 3. Token Info
Get SPL token metadata, current price, market cap, supply, decimals,
mint/freeze authorities, and top 5 holders.
```bash
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py \
token DezXAZ8z7PnrnRJjz3wXBoRgixCa6xjnB7YaB1pPB263
```
Output: name, symbol, decimals, supply, price, market cap, top 5
holders with percentages.
### 4. Recent Activity
List recent transactions for an address (default: last 10, max: 25).
```bash
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py \
activity 9WzDXwBbmkg8ZTbNMqUxvQRAyrZzDsGYdLVL9zYtAWWM --limit 25
```
### 5. NFT Portfolio
List NFTs owned by a wallet (heuristic: SPL tokens with amount=1, decimals=0).
```bash
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py \
nft 9WzDXwBbmkg8ZTbNMqUxvQRAyrZzDsGYdLVL9zYtAWWM
```
Note: Compressed NFTs (cNFTs) are not detected by this heuristic.
### 6. Whale Detector
Scan the most recent block for large SOL transfers with USD values.
```bash
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py \
whales --min-sol 500
```
Note: scans the latest block only — point-in-time snapshot, not historical.
### 7. Network Stats
Live Solana network health: current slot, epoch, TPS, supply, validator
version, SOL price, and market cap.
```bash
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py stats
```
### 8. Price Lookup
Quick price check for any token by mint address or known symbol.
```bash
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py price BONK
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py price JUP
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py price SOL
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py price DezXAZ8z7PnrnRJjz3wXBoRgixCa6xjnB7YaB1pPB263
```
Known symbols: SOL, USDC, USDT, BONK, JUP, WETH, JTO, mSOL, stSOL,
PYTH, HNT, RNDR, WEN, W, TNSR, DRIFT, bSOL, JLP, WIF, MEW, BOME, PENGU.
---
## Pitfalls
- **CoinGecko rate-limits** — free tier allows ~10-30 requests/minute.
Price lookups use 1 request per token. Wallets with many tokens may
not get prices for all of them. Use `--no-prices` for speed.
- **Public RPC rate-limits** — Solana mainnet public RPC limits requests.
For production use, set SOLANA_RPC_URL to a private endpoint
(Helius, QuickNode, Triton).
- **NFT detection is heuristic** — amount=1 + decimals=0. Compressed
NFTs (cNFTs) and Token-2022 NFTs won't appear.
- **Whale detector scans latest block only** — not historical. Results
vary by the moment you query.
- **Transaction history** — public RPC keeps ~2 days. Older transactions
may not be available.
- **Token names** — ~25 well-known tokens are labeled by name. Others
show abbreviated mint addresses. Use the `token` command for full info.
- **Retry on 429** — both RPC and CoinGecko calls retry up to 2 times
with exponential backoff on rate-limit errors.
---
## Verification
```bash
# Should print current Solana slot, TPS, and SOL price
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py stats
```

View File

@@ -1,698 +0,0 @@
#!/usr/bin/env python3
"""
Solana Blockchain CLI Tool for Hermes Agent
--------------------------------------------
Queries the Solana JSON-RPC API and CoinGecko for enriched on-chain data.
Uses only Python standard library — no external packages required.
Usage:
python3 solana_client.py stats
python3 solana_client.py wallet <address> [--limit N] [--all] [--no-prices]
python3 solana_client.py tx <signature>
python3 solana_client.py token <mint_address>
python3 solana_client.py activity <address> [--limit N]
python3 solana_client.py nft <address>
python3 solana_client.py whales [--min-sol N]
python3 solana_client.py price <mint_address_or_symbol>
Environment:
SOLANA_RPC_URL Override the default RPC endpoint (default: mainnet-beta public)
"""
import argparse
import json
import os
import sys
import time
import urllib.request
import urllib.error
from typing import Any, Dict, List, Optional
RPC_URL = os.environ.get(
"SOLANA_RPC_URL",
"https://api.mainnet-beta.solana.com",
)
LAMPORTS_PER_SOL = 1_000_000_000
# Well-known Solana token names — avoids API calls for common tokens.
# Maps mint address → (symbol, name).
KNOWN_TOKENS: Dict[str, tuple] = {
"So11111111111111111111111111111111111111112": ("SOL", "Solana"),
"EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v": ("USDC", "USD Coin"),
"Es9vMFrzaCERmJfrF4H2FYD4KCoNkY11McCe8BenwNYB": ("USDT", "Tether"),
"DezXAZ8z7PnrnRJjz3wXBoRgixCa6xjnB7YaB1pPB263": ("BONK", "Bonk"),
"JUPyiwrYJFskUPiHa7hkeR8VUtAeFoSYbKedZNsDvCN": ("JUP", "Jupiter"),
"7vfCXTUXx5WJV5JADk17DUJ4ksgau7utNKj4b963voxs": ("WETH", "Wrapped Ether"),
"jtojtomepa8beP8AuQc6eXt5FriJwfFMwQx2v2f9mCL": ("JTO", "Jito"),
"mSoLzYCxHdYgdzU16g5QSh3i5K3z3KZK7ytfqcJm7So": ("mSOL", "Marinade Staked SOL"),
"7dHbWXmci3dT8UFYWYZweBLXgycu7Y3iL6trKn1Y7ARj": ("stSOL", "Lido Staked SOL"),
"HZ1JovNiVvGrGNiiYvEozEVgZ58xaU3RKwX8eACQBCt3": ("PYTH", "Pyth Network"),
"RLBxxFkseAZ4RgJH3Sqn8jXxhmGoz9jWxDNJMh8pL7a": ("RLBB", "Rollbit"),
"hntyVP6YFm1Hg25TN9WGLqM12b8TQmcknKrdu1oxWux": ("HNT", "Helium"),
"rndrizKT3MK1iimdxRdWabcF7Zg7AR5T4nud4EkHBof": ("RNDR", "Render"),
"WENWENvqqNya429ubCdR81ZmD69brwQaaBYY6p91oHQQ": ("WEN", "Wen"),
"85VBFQZC9TZkfaptBWjvUw7YbZjy52A6mjtPGjstQAmQ": ("W", "Wormhole"),
"TNSRxcUxoT9xBG3de7PiJyTDYu7kskLqcpddxnEJAS6": ("TNSR", "Tensor"),
"DriFtupJYLTosbwoN8koMbEYSx54aFAVLddWsbksjwg7": ("DRIFT", "Drift"),
"bSo13r4TkiE4KumL71LsHTPpL2euBYLFx6h9HP3piy1": ("bSOL", "BlazeStake Staked SOL"),
"27G8MtK7VtTcCHkpASjSDdkWWYfoqT6ggEuKidVJidD4": ("JLP", "Jupiter LP"),
"EKpQGSJtjMFqKZ9KQanSqYXRcF8fBopzLHYxdM65zcjm": ("WIF", "dogwifhat"),
"MEW1gQWJ3nEXg2qgERiKu7FAFj79PHvQVREQUzScPP5": ("MEW", "cat in a dogs world"),
"ukHH6c7mMyiWCf1b9pnWe25TSpkDDt3H5pQZgZ74J82": ("BOME", "Book of Meme"),
"A8C3xuqscfmyLrte3VwJvtPHXvcSN3FjDbUaSMAkQrCS": ("PENGU", "Pudgy Penguins"),
}
# Reverse lookup: symbol → mint (for the `price` command).
_SYMBOL_TO_MINT = {v[0].upper(): k for k, v in KNOWN_TOKENS.items()}
# ---------------------------------------------------------------------------
# HTTP / RPC helpers
# ---------------------------------------------------------------------------
def _http_get_json(url: str, timeout: int = 10, retries: int = 2) -> Any:
"""GET JSON from a URL with retry on 429 rate-limit. Returns parsed JSON or None."""
for attempt in range(retries + 1):
req = urllib.request.Request(
url, headers={"Accept": "application/json", "User-Agent": "HermesAgent/1.0"},
)
try:
with urllib.request.urlopen(req, timeout=timeout) as resp:
return json.load(resp)
except urllib.error.HTTPError as exc:
if exc.code == 429 and attempt < retries:
time.sleep(2.0 * (attempt + 1))
continue
return None
except Exception:
return None
return None
def _rpc_call(method: str, params: list = None, retries: int = 2) -> Any:
"""Send a JSON-RPC request with retry on 429 rate-limit."""
payload = json.dumps({
"jsonrpc": "2.0", "id": 1,
"method": method, "params": params or [],
}).encode()
for attempt in range(retries + 1):
req = urllib.request.Request(
RPC_URL, data=payload,
headers={"Content-Type": "application/json"}, method="POST",
)
try:
with urllib.request.urlopen(req, timeout=20) as resp:
body = json.load(resp)
if "error" in body:
err = body["error"]
# Rate-limit: retry after delay
if isinstance(err, dict) and err.get("code") == 429:
if attempt < retries:
time.sleep(1.5 * (attempt + 1))
continue
sys.exit(f"RPC error: {err}")
return body.get("result")
except urllib.error.HTTPError as exc:
if exc.code == 429 and attempt < retries:
time.sleep(1.5 * (attempt + 1))
continue
sys.exit(f"RPC HTTP error: {exc}")
except urllib.error.URLError as exc:
sys.exit(f"RPC connection error: {exc}")
return None
# Keep backward compat — the rest of the code uses `rpc()`.
rpc = _rpc_call
def rpc_batch(calls: list) -> list:
"""Send a batch of JSON-RPC requests (with retry on 429)."""
payload = json.dumps([
{"jsonrpc": "2.0", "id": i, "method": c["method"], "params": c.get("params", [])}
for i, c in enumerate(calls)
]).encode()
for attempt in range(3):
req = urllib.request.Request(
RPC_URL, data=payload,
headers={"Content-Type": "application/json"}, method="POST",
)
try:
with urllib.request.urlopen(req, timeout=20) as resp:
return json.load(resp)
except urllib.error.HTTPError as exc:
if exc.code == 429 and attempt < 2:
time.sleep(1.5 * (attempt + 1))
continue
sys.exit(f"RPC batch HTTP error: {exc}")
except urllib.error.URLError as exc:
sys.exit(f"RPC batch error: {exc}")
return []
def lamports_to_sol(lamports: int) -> float:
return lamports / LAMPORTS_PER_SOL
def print_json(obj: Any) -> None:
print(json.dumps(obj, indent=2))
def _short_mint(mint: str) -> str:
"""Abbreviate a mint address for display: first 4 + last 4."""
if len(mint) <= 12:
return mint
return f"{mint[:4]}...{mint[-4:]}"
# ---------------------------------------------------------------------------
# Price & token name helpers (CoinGecko — free, no API key)
# ---------------------------------------------------------------------------
def fetch_prices(mints: List[str], max_lookups: int = 20) -> Dict[str, float]:
"""Fetch USD prices for mint addresses via CoinGecko (one per request).
CoinGecko free tier doesn't support batch Solana token lookups,
so we do individual calls — capped at *max_lookups* to stay within
rate limits. Returns {mint: usd_price}.
"""
prices: Dict[str, float] = {}
for i, mint in enumerate(mints[:max_lookups]):
url = (
f"https://api.coingecko.com/api/v3/simple/token_price/solana"
f"?contract_addresses={mint}&vs_currencies=usd"
)
data = _http_get_json(url, timeout=10)
if data and isinstance(data, dict):
for addr, info in data.items():
if isinstance(info, dict) and "usd" in info:
prices[mint] = info["usd"]
break
# Pause between calls to respect CoinGecko free-tier rate-limits
if i < len(mints[:max_lookups]) - 1:
time.sleep(1.0)
return prices
def fetch_sol_price() -> Optional[float]:
"""Fetch current SOL price in USD via CoinGecko."""
data = _http_get_json(
"https://api.coingecko.com/api/v3/simple/price?ids=solana&vs_currencies=usd"
)
if data and "solana" in data:
return data["solana"].get("usd")
return None
def resolve_token_name(mint: str) -> Optional[Dict[str, str]]:
"""Look up token name and symbol from CoinGecko by mint address.
Returns {"name": ..., "symbol": ...} or None.
"""
if mint in KNOWN_TOKENS:
sym, name = KNOWN_TOKENS[mint]
return {"symbol": sym, "name": name}
url = f"https://api.coingecko.com/api/v3/coins/solana/contract/{mint}"
data = _http_get_json(url, timeout=10)
if data and "symbol" in data:
return {"symbol": data["symbol"].upper(), "name": data.get("name", "")}
return None
def _token_label(mint: str) -> str:
"""Return a human-readable label for a mint: symbol if known, else abbreviated address."""
if mint in KNOWN_TOKENS:
return KNOWN_TOKENS[mint][0]
return _short_mint(mint)
# ---------------------------------------------------------------------------
# 1. Network Stats
# ---------------------------------------------------------------------------
def cmd_stats(_args):
"""Live Solana network: slot, epoch, TPS, supply, version, SOL price."""
results = rpc_batch([
{"method": "getSlot"},
{"method": "getEpochInfo"},
{"method": "getRecentPerformanceSamples", "params": [1]},
{"method": "getSupply"},
{"method": "getVersion"},
])
by_id = {r["id"]: r.get("result") for r in results}
slot = by_id.get(0)
epoch_info = by_id.get(1)
perf_samples = by_id.get(2)
supply = by_id.get(3)
version = by_id.get(4)
tps = None
if perf_samples:
s = perf_samples[0]
tps = round(s["numTransactions"] / s["samplePeriodSecs"], 1)
total_supply = lamports_to_sol(supply["value"]["total"]) if supply else None
circ_supply = lamports_to_sol(supply["value"]["circulating"]) if supply else None
sol_price = fetch_sol_price()
out = {
"slot": slot,
"epoch": epoch_info.get("epoch") if epoch_info else None,
"slot_in_epoch": epoch_info.get("slotIndex") if epoch_info else None,
"tps": tps,
"total_supply_SOL": round(total_supply, 2) if total_supply else None,
"circulating_supply_SOL": round(circ_supply, 2) if circ_supply else None,
"validator_version": version.get("solana-core") if version else None,
}
if sol_price is not None:
out["sol_price_usd"] = sol_price
if circ_supply:
out["market_cap_usd"] = round(sol_price * circ_supply, 0)
print_json(out)
# ---------------------------------------------------------------------------
# 2. Wallet Info (enhanced with prices, sorting, filtering)
# ---------------------------------------------------------------------------
def cmd_wallet(args):
"""SOL balance + SPL token holdings with USD values."""
address = args.address
show_all = getattr(args, "all", False)
limit = getattr(args, "limit", 20) or 20
skip_prices = getattr(args, "no_prices", False)
# Fetch SOL balance
balance_result = rpc("getBalance", [address])
sol_balance = lamports_to_sol(balance_result["value"])
# Fetch all SPL token accounts
token_result = rpc("getTokenAccountsByOwner", [
address,
{"programId": "TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA"},
{"encoding": "jsonParsed"},
])
raw_tokens = []
for acct in (token_result.get("value") or []):
info = acct["account"]["data"]["parsed"]["info"]
ta = info["tokenAmount"]
amount = float(ta.get("uiAmountString") or 0)
if amount > 0:
raw_tokens.append({
"mint": info["mint"],
"amount": amount,
"decimals": ta["decimals"],
})
# Separate NFTs (amount=1, decimals=0) from fungible tokens
nfts = [t for t in raw_tokens if t["decimals"] == 0 and t["amount"] == 1]
fungible = [t for t in raw_tokens if not (t["decimals"] == 0 and t["amount"] == 1)]
# Fetch prices for fungible tokens (cap lookups to avoid API abuse)
sol_price = None
prices: Dict[str, float] = {}
if not skip_prices and fungible:
sol_price = fetch_sol_price()
# Prioritize known tokens, then a small sample of unknowns.
# CoinGecko free tier = 1 request per mint, so we cap lookups.
known_mints = [t["mint"] for t in fungible if t["mint"] in KNOWN_TOKENS]
other_mints = [t["mint"] for t in fungible if t["mint"] not in KNOWN_TOKENS][:15]
mints_to_price = known_mints + other_mints
if mints_to_price:
prices = fetch_prices(mints_to_price, max_lookups=30)
# Enrich tokens with labels and USD values
enriched = []
dust_count = 0
dust_value = 0.0
for t in fungible:
mint = t["mint"]
label = _token_label(mint)
usd_price = prices.get(mint)
usd_value = round(usd_price * t["amount"], 2) if usd_price else None
# Filter dust (< $0.01) unless --all
if not show_all and usd_value is not None and usd_value < 0.01:
dust_count += 1
dust_value += usd_value
continue
entry = {"token": label, "mint": mint, "amount": t["amount"]}
if usd_price is not None:
entry["price_usd"] = usd_price
entry["value_usd"] = usd_value
enriched.append(entry)
# Sort: tokens with known USD value first (highest→lowest), then unknowns
enriched.sort(key=lambda x: (x.get("value_usd") is not None, x.get("value_usd") or 0), reverse=True)
# Apply limit unless --all
total_tokens = len(enriched)
if not show_all and len(enriched) > limit:
enriched = enriched[:limit]
# Compute portfolio total
total_usd = sum(t.get("value_usd", 0) for t in enriched)
sol_value_usd = round(sol_price * sol_balance, 2) if sol_price else None
if sol_value_usd:
total_usd += sol_value_usd
total_usd += dust_value
output = {
"address": address,
"sol_balance": round(sol_balance, 9),
}
if sol_price:
output["sol_price_usd"] = sol_price
output["sol_value_usd"] = sol_value_usd
output["tokens_shown"] = len(enriched)
if total_tokens > len(enriched):
output["tokens_hidden"] = total_tokens - len(enriched)
output["spl_tokens"] = enriched
if dust_count > 0:
output["dust_filtered"] = {"count": dust_count, "total_value_usd": round(dust_value, 4)}
output["nft_count"] = len(nfts)
if nfts:
output["nfts"] = [_token_label(n["mint"]) + f" ({_short_mint(n['mint'])})" for n in nfts[:10]]
if len(nfts) > 10:
output["nfts"].append(f"... and {len(nfts) - 10} more")
if total_usd > 0:
output["portfolio_total_usd"] = round(total_usd, 2)
print_json(output)
# ---------------------------------------------------------------------------
# 3. Transaction Details
# ---------------------------------------------------------------------------
def cmd_tx(args):
"""Full transaction details by signature."""
result = rpc("getTransaction", [
args.signature,
{"encoding": "jsonParsed", "maxSupportedTransactionVersion": 0},
])
if result is None:
sys.exit("Transaction not found (may be too old for public RPC history).")
meta = result.get("meta", {}) or {}
msg = result.get("transaction", {}).get("message", {})
account_keys = msg.get("accountKeys", [])
pre = meta.get("preBalances", [])
post = meta.get("postBalances", [])
balance_changes = []
for i, key in enumerate(account_keys):
acct_key = key["pubkey"] if isinstance(key, dict) else key
if i < len(pre) and i < len(post):
change = lamports_to_sol(post[i] - pre[i])
if change != 0:
balance_changes.append({"account": acct_key, "change_SOL": round(change, 9)})
programs = []
for ix in msg.get("instructions", []):
prog = ix.get("programId")
if prog is None and "programIdIndex" in ix:
k = account_keys[ix["programIdIndex"]]
prog = k["pubkey"] if isinstance(k, dict) else k
if prog:
programs.append(prog)
# Add USD value for SOL changes
sol_price = fetch_sol_price()
if sol_price and balance_changes:
for bc in balance_changes:
bc["change_USD"] = round(bc["change_SOL"] * sol_price, 2)
print_json({
"signature": args.signature,
"slot": result.get("slot"),
"block_time": result.get("blockTime"),
"fee_SOL": lamports_to_sol(meta.get("fee", 0)),
"status": "success" if meta.get("err") is None else "failed",
"balance_changes": balance_changes,
"programs_invoked": list(dict.fromkeys(programs)),
})
# ---------------------------------------------------------------------------
# 4. Token Info (enhanced with name + price)
# ---------------------------------------------------------------------------
def cmd_token(args):
"""SPL token metadata, supply, decimals, price, top holders."""
mint = args.mint
mint_info = rpc("getAccountInfo", [mint, {"encoding": "jsonParsed"}])
if mint_info is None or mint_info.get("value") is None:
sys.exit("Mint account not found.")
parsed = mint_info["value"]["data"]["parsed"]["info"]
decimals = parsed.get("decimals", 0)
supply_raw = int(parsed.get("supply", 0))
supply_human = supply_raw / (10 ** decimals) if decimals else supply_raw
largest = rpc("getTokenLargestAccounts", [mint])
holders = []
for acct in (largest.get("value") or [])[:5]:
amount = float(acct.get("uiAmountString") or 0)
pct = round((amount / supply_human * 100), 4) if supply_human > 0 else 0
holders.append({
"account": acct["address"],
"amount": amount,
"percent": pct,
})
# Resolve name + price
token_meta = resolve_token_name(mint)
price_data = fetch_prices([mint])
out = {"mint": mint}
if token_meta:
out["name"] = token_meta["name"]
out["symbol"] = token_meta["symbol"]
out["decimals"] = decimals
out["supply"] = round(supply_human, min(decimals, 6))
out["mint_authority"] = parsed.get("mintAuthority")
out["freeze_authority"] = parsed.get("freezeAuthority")
if mint in price_data:
out["price_usd"] = price_data[mint]
out["market_cap_usd"] = round(price_data[mint] * supply_human, 0)
out["top_5_holders"] = holders
print_json(out)
# ---------------------------------------------------------------------------
# 5. Recent Activity
# ---------------------------------------------------------------------------
def cmd_activity(args):
"""Recent transaction signatures for an address."""
limit = min(args.limit, 25)
result = rpc("getSignaturesForAddress", [args.address, {"limit": limit}])
txs = [
{
"signature": item["signature"],
"slot": item.get("slot"),
"block_time": item.get("blockTime"),
"err": item.get("err"),
}
for item in (result or [])
]
print_json({"address": args.address, "transactions": txs})
# ---------------------------------------------------------------------------
# 6. NFT Portfolio
# ---------------------------------------------------------------------------
def cmd_nft(args):
"""NFTs owned by a wallet (amount=1 && decimals=0 heuristic)."""
result = rpc("getTokenAccountsByOwner", [
args.address,
{"programId": "TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA"},
{"encoding": "jsonParsed"},
])
nfts = [
acct["account"]["data"]["parsed"]["info"]["mint"]
for acct in (result.get("value") or [])
if acct["account"]["data"]["parsed"]["info"]["tokenAmount"]["decimals"] == 0
and int(acct["account"]["data"]["parsed"]["info"]["tokenAmount"]["amount"]) == 1
]
print_json({
"address": args.address,
"nft_count": len(nfts),
"nfts": nfts,
"note": "Heuristic only. Compressed NFTs (cNFTs) are not detected.",
})
# ---------------------------------------------------------------------------
# 7. Whale Detector (enhanced with USD values)
# ---------------------------------------------------------------------------
def cmd_whales(args):
"""Scan the latest block for large SOL transfers."""
min_lamports = int(args.min_sol * LAMPORTS_PER_SOL)
slot = rpc("getSlot")
block = rpc("getBlock", [
slot,
{
"encoding": "jsonParsed",
"transactionDetails": "full",
"maxSupportedTransactionVersion": 0,
"rewards": False,
},
])
if block is None:
sys.exit("Could not retrieve latest block.")
sol_price = fetch_sol_price()
whales = []
for tx in (block.get("transactions") or []):
meta = tx.get("meta", {}) or {}
if meta.get("err") is not None:
continue
msg = tx["transaction"].get("message", {})
account_keys = msg.get("accountKeys", [])
pre = meta.get("preBalances", [])
post = meta.get("postBalances", [])
for i in range(len(pre)):
change = post[i] - pre[i]
if change >= min_lamports:
k = account_keys[i]
receiver = k["pubkey"] if isinstance(k, dict) else k
sender = None
for j in range(len(pre)):
if pre[j] - post[j] >= min_lamports:
sk = account_keys[j]
sender = sk["pubkey"] if isinstance(sk, dict) else sk
break
entry = {
"sender": sender,
"receiver": receiver,
"amount_SOL": round(lamports_to_sol(change), 4),
}
if sol_price:
entry["amount_USD"] = round(lamports_to_sol(change) * sol_price, 2)
whales.append(entry)
out = {
"slot": slot,
"min_threshold_SOL": args.min_sol,
"large_transfers": whales,
"note": "Scans latest block only — point-in-time snapshot.",
}
if sol_price:
out["sol_price_usd"] = sol_price
print_json(out)
# ---------------------------------------------------------------------------
# 8. Price Lookup
# ---------------------------------------------------------------------------
def cmd_price(args):
"""Quick price lookup for a token by mint address or known symbol."""
query = args.token
# Check if it's a known symbol
mint = _SYMBOL_TO_MINT.get(query.upper(), query)
# Try to resolve name
token_meta = resolve_token_name(mint)
# Fetch price
prices = fetch_prices([mint])
out = {"query": query, "mint": mint}
if token_meta:
out["name"] = token_meta["name"]
out["symbol"] = token_meta["symbol"]
if mint in prices:
out["price_usd"] = prices[mint]
else:
out["price_usd"] = None
out["note"] = "Price not available — token may not be listed on CoinGecko."
print_json(out)
# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------
def main():
parser = argparse.ArgumentParser(
prog="solana_client.py",
description="Solana blockchain query tool for Hermes Agent",
)
sub = parser.add_subparsers(dest="command", required=True)
sub.add_parser("stats", help="Network stats: slot, epoch, TPS, supply, SOL price")
p_wallet = sub.add_parser("wallet", help="SOL balance + SPL tokens with USD values")
p_wallet.add_argument("address")
p_wallet.add_argument("--limit", type=int, default=20,
help="Max tokens to display (default: 20)")
p_wallet.add_argument("--all", action="store_true",
help="Show all tokens (no limit, no dust filter)")
p_wallet.add_argument("--no-prices", action="store_true",
help="Skip price lookups (faster, RPC-only)")
p_tx = sub.add_parser("tx", help="Transaction details by signature")
p_tx.add_argument("signature")
p_token = sub.add_parser("token", help="SPL token metadata, price, and top holders")
p_token.add_argument("mint")
p_activity = sub.add_parser("activity", help="Recent transactions for an address")
p_activity.add_argument("address")
p_activity.add_argument("--limit", type=int, default=10,
help="Number of transactions (max 25, default 10)")
p_nft = sub.add_parser("nft", help="NFT portfolio for a wallet")
p_nft.add_argument("address")
p_whales = sub.add_parser("whales", help="Large SOL transfers in the latest block")
p_whales.add_argument("--min-sol", type=float, default=1000.0,
help="Minimum SOL transfer size (default: 1000)")
p_price = sub.add_parser("price", help="Quick price lookup by mint or symbol")
p_price.add_argument("token", help="Mint address or known symbol (SOL, BONK, JUP, ...)")
args = parser.parse_args()
dispatch = {
"stats": cmd_stats,
"wallet": cmd_wallet,
"tx": cmd_tx,
"token": cmd_token,
"activity": cmd_activity,
"nft": cmd_nft,
"whales": cmd_whales,
"price": cmd_price,
}
dispatch[args.command](args)
if __name__ == "__main__":
main()

View File

@@ -1,125 +0,0 @@
---
name: agentmail
description: Give the agent its own dedicated email inbox via AgentMail. Send, receive, and manage email autonomously using agent-owned email addresses (e.g. hermes-agent@agentmail.to).
version: 1.0.0
metadata:
hermes:
tags: [email, communication, agentmail, mcp]
category: email
---
# AgentMail — Agent-Owned Email Inboxes
## Requirements
- **AgentMail API key** (required) — sign up at https://console.agentmail.to (free tier: 3 inboxes, 3,000 emails/month; paid plans from $20/mo)
- Node.js 18+ (for the MCP server)
## When to Use
Use this skill when you need to:
- Give the agent its own dedicated email address
- Send emails autonomously on behalf of the agent
- Receive and read incoming emails
- Manage email threads and conversations
- Sign up for services or authenticate via email
- Communicate with other agents or humans via email
This is NOT for reading the user's personal email (use himalaya or Gmail for that).
AgentMail gives the agent its own identity and inbox.
## Setup
### 1. Get an API Key
- Go to https://console.agentmail.to
- Create an account and generate an API key (starts with `am_`)
### 2. Configure MCP Server
Add to `~/.hermes/config.yaml` (paste your actual key — MCP env vars are not expanded from .env):
```yaml
mcp_servers:
agentmail:
command: "npx"
args: ["-y", "agentmail-mcp"]
env:
AGENTMAIL_API_KEY: "am_your_key_here"
```
### 3. Restart Hermes
```bash
hermes
```
All 11 AgentMail tools are now available automatically.
## Available Tools (via MCP)
| Tool | Description |
|------|-------------|
| `list_inboxes` | List all agent inboxes |
| `get_inbox` | Get details of a specific inbox |
| `create_inbox` | Create a new inbox (gets a real email address) |
| `delete_inbox` | Delete an inbox |
| `list_threads` | List email threads in an inbox |
| `get_thread` | Get a specific email thread |
| `send_message` | Send a new email |
| `reply_to_message` | Reply to an existing email |
| `forward_message` | Forward an email |
| `update_message` | Update message labels/status |
| `get_attachment` | Download an email attachment |
## Procedure
### Create an inbox and send an email
1. Create a dedicated inbox:
- Use `create_inbox` with a username (e.g. `hermes-agent`)
- The agent gets address: `hermes-agent@agentmail.to`
2. Send an email:
- Use `send_message` with `inbox_id`, `to`, `subject`, `text`
3. Check for replies:
- Use `list_threads` to see incoming conversations
- Use `get_thread` to read a specific thread
### Check incoming email
1. Use `list_inboxes` to find your inbox ID
2. Use `list_threads` with the inbox ID to see conversations
3. Use `get_thread` to read a thread and its messages
### Reply to an email
1. Get the thread with `get_thread`
2. Use `reply_to_message` with the message ID and your reply text
## Example Workflows
**Sign up for a service:**
```
1. create_inbox (username: "signup-bot")
2. Use the inbox address to register on the service
3. list_threads to check for verification email
4. get_thread to read the verification code
```
**Agent-to-human outreach:**
```
1. create_inbox (username: "hermes-outreach")
2. send_message (to: user@example.com, subject: "Hello", text: "...")
3. list_threads to check for replies
```
## Pitfalls
- Free tier limited to 3 inboxes and 3,000 emails/month
- Emails come from `@agentmail.to` domain on free tier (custom domains on paid plans)
- Node.js (18+) is required for the MCP server (`npx -y agentmail-mcp`)
- The `mcp` Python package must be installed: `pip install mcp`
- Real-time inbound email (webhooks) requires a public server — use `list_threads` polling via cronjob instead for personal use
## Verification
After setup, test with:
```
hermes --toolsets mcp -q "Create an AgentMail inbox called test-agent and tell me its email address"
```
You should see the new inbox address returned.
## References
- AgentMail docs: https://docs.agentmail.to/
- AgentMail console: https://console.agentmail.to
- AgentMail MCP repo: https://github.com/agentmail-to/agentmail-mcp
- Pricing: https://www.agentmail.to/pricing

View File

@@ -1,2 +0,0 @@
Optional migration workflows for importing user state and customizations from
other agent systems into Hermes Agent.

View File

@@ -1,281 +0,0 @@
---
name: openclaw-migration
description: Migrate a user's OpenClaw customization footprint into Hermes Agent. Imports Hermes-compatible memories, SOUL.md, command allowlists, user skills, and selected workspace assets from ~/.openclaw, then reports exactly what could not be migrated and why.
version: 1.0.0
author: Hermes Agent (Nous Research)
license: MIT
metadata:
hermes:
tags: [Migration, OpenClaw, Hermes, Memory, Persona, Import]
related_skills: [hermes-agent]
---
# OpenClaw -> Hermes Migration
Use this skill when a user wants to move their OpenClaw setup into Hermes Agent with minimal manual cleanup.
## What this skill does
It uses `scripts/openclaw_to_hermes.py` to:
- import `SOUL.md` into the Hermes home directory as `SOUL.md`
- transform OpenClaw `MEMORY.md` and `USER.md` into Hermes memory entries
- merge OpenClaw command approval patterns into Hermes `command_allowlist`
- migrate Hermes-compatible messaging settings such as `TELEGRAM_ALLOWED_USERS` and `MESSAGING_CWD`
- copy OpenClaw skills into `~/.hermes/skills/openclaw-imports/`
- optionally copy the OpenClaw workspace instructions file into a chosen Hermes workspace
- mirror compatible workspace assets such as `workspace/tts/` into `~/.hermes/tts/`
- archive non-secret docs that do not have a direct Hermes destination
- produce a structured report listing migrated items, conflicts, skipped items, and reasons
## Path resolution
The helper script lives in this skill directory at:
- `scripts/openclaw_to_hermes.py`
When this skill is installed from the Skills Hub, the normal location is:
- `~/.hermes/skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py`
Do not guess a shorter path like `~/.hermes/skills/openclaw-migration/...`.
Before running the helper:
1. Prefer the installed path under `~/.hermes/skills/migration/openclaw-migration/`.
2. If that path fails, inspect the installed skill directory and resolve the script relative to the installed `SKILL.md`.
3. Only use `find` as a fallback if the installed location is missing or the skill was moved manually.
4. When calling the terminal tool, do not pass `workdir: "~"`. Use an absolute directory such as the user's home directory, or omit `workdir` entirely.
With `--migrate-secrets`, it will also import a small allowlisted set of Hermes-compatible secrets, currently:
- `TELEGRAM_BOT_TOKEN`
## Default workflow
1. Inspect first with a dry run.
2. Present a simple summary of what can be migrated, what cannot be migrated, and what would be archived.
3. If the `clarify` tool is available, use it for user decisions instead of asking for a free-form prose reply.
4. If the dry run finds imported skill directory conflicts, ask how those should be handled before executing.
5. Ask the user to choose between the two supported migration modes before executing.
6. Ask for a target workspace path only if the user wants the workspace instructions file brought over.
7. Execute the migration with the matching preset and flags.
8. Summarize the results, especially:
- what was migrated
- what was archived for manual review
- what was skipped and why
## User interaction protocol
Hermes CLI supports the `clarify` tool for interactive prompts, but it is limited to:
- one choice at a time
- up to 4 predefined choices
- an automatic `Other` free-text option
It does **not** support true multi-select checkboxes in a single prompt.
For every `clarify` call:
- always include a non-empty `question`
- include `choices` only for real selectable prompts
- keep `choices` to 2-4 plain string options
- never emit placeholder or truncated options such as `...`
- never pad or stylize choices with extra whitespace
- never include fake form fields in the question such as `enter directory here`, blank lines to fill in, or underscores like `_____`
- for open-ended path questions, ask only the plain sentence; the user types in the normal CLI prompt below the panel
If a `clarify` call returns an error, inspect the error text, correct the payload, and retry once with a valid `question` and clean choices.
When `clarify` is available and the dry run reveals any required user decision, your **next action must be a `clarify` tool call**.
Do not end the turn with a normal assistant message such as:
- "Let me present the choices"
- "What would you like to do?"
- "Here are the options"
If a user decision is required, collect it via `clarify` before producing more prose.
If multiple unresolved decisions remain, do not insert an explanatory assistant message between them. After one `clarify` response is received, your next action should usually be the next required `clarify` call.
Treat `workspace-agents` as an unresolved decision whenever the dry run reports:
- `kind="workspace-agents"`
- `status="skipped"`
- reason containing `No workspace target was provided`
In that case, you must ask about workspace instructions before execution. Do not silently treat that as a decision to skip.
Because of that limitation, use this simplified decision flow:
1. For `SOUL.md` conflicts, use `clarify` with choices such as:
- `keep existing`
- `overwrite with backup`
- `review first`
2. If the dry run shows one or more `kind="skill"` items with `status="conflict"`, use `clarify` with choices such as:
- `keep existing skills`
- `overwrite conflicting skills with backup`
- `import conflicting skills under renamed folders`
3. For workspace instructions, use `clarify` with choices such as:
- `skip workspace instructions`
- `copy to a workspace path`
- `decide later`
4. If the user chooses to copy workspace instructions, ask a follow-up open-ended `clarify` question requesting an **absolute path**.
5. If the user chooses `skip workspace instructions` or `decide later`, proceed without `--workspace-target`.
5. For migration mode, use `clarify` with these 3 choices:
- `user-data only`
- `full compatible migration`
- `cancel`
6. `user-data only` means: migrate user data and compatible config, but do **not** import allowlisted secrets.
7. `full compatible migration` means: migrate the same compatible user data plus the allowlisted secrets when present.
8. If `clarify` is not available, ask the same question in normal text, but still constrain the answer to `user-data only`, `full compatible migration`, or `cancel`.
Execution gate:
- Do not execute while a `workspace-agents` skip caused by `No workspace target was provided` remains unresolved.
- The only valid ways to resolve it are:
- user explicitly chooses `skip workspace instructions`
- user explicitly chooses `decide later`
- user provides a workspace path after choosing `copy to a workspace path`
- Absence of a workspace target in the dry run is not itself permission to execute.
- Do not execute while any required `clarify` decision remains unresolved.
Use these exact `clarify` payload shapes as the default pattern:
- `{"question":"Your existing SOUL.md conflicts with the imported one. What should I do?","choices":["keep existing","overwrite with backup","review first"]}`
- `{"question":"One or more imported OpenClaw skills already exist in Hermes. How should I handle those skill conflicts?","choices":["keep existing skills","overwrite conflicting skills with backup","import conflicting skills under renamed folders"]}`
- `{"question":"Choose migration mode: migrate only user data, or run the full compatible migration including allowlisted secrets?","choices":["user-data only","full compatible migration","cancel"]}`
- `{"question":"Do you want to copy the OpenClaw workspace instructions file into a Hermes workspace?","choices":["skip workspace instructions","copy to a workspace path","decide later"]}`
- `{"question":"Please provide an absolute path where the workspace instructions should be copied."}`
## Decision-to-command mapping
Map user decisions to command flags exactly:
- If the user chooses `keep existing` for `SOUL.md`, do **not** add `--overwrite`.
- If the user chooses `overwrite with backup`, add `--overwrite`.
- If the user chooses `review first`, stop before execution and review the relevant files.
- If the user chooses `keep existing skills`, add `--skill-conflict skip`.
- If the user chooses `overwrite conflicting skills with backup`, add `--skill-conflict overwrite`.
- If the user chooses `import conflicting skills under renamed folders`, add `--skill-conflict rename`.
- If the user chooses `user-data only`, execute with `--preset user-data` and do **not** add `--migrate-secrets`.
- If the user chooses `full compatible migration`, execute with `--preset full --migrate-secrets`.
- Only add `--workspace-target` if the user explicitly provided an absolute workspace path.
- If the user chooses `skip workspace instructions` or `decide later`, do not add `--workspace-target`.
Before executing, restate the exact command plan in plain language and make sure it matches the user's choices.
## Post-run reporting rules
After execution, treat the script's JSON output as the source of truth.
1. Base all counts on `report.summary`.
2. Only list an item under "Successfully Migrated" if its `status` is exactly `migrated`.
3. Do not claim a conflict was resolved unless the report shows that item as `migrated`.
4. Do not say `SOUL.md` was overwritten unless the report item for `kind="soul"` has `status="migrated"`.
5. If `report.summary.conflict > 0`, include a conflict section instead of silently implying success.
6. If counts and listed items disagree, fix the list to match the report before responding.
7. Include the `output_dir` path from the report when available so the user can inspect `report.json`, `summary.md`, backups, and archived files.
8. For memory or user-profile overflow, do not say the entries were archived unless the report explicitly shows an archive path. If `details.overflow_file` exists, say the full overflow list was exported there.
9. If a skill was imported under a renamed folder, report the final destination and mention `details.renamed_from`.
10. If `report.skill_conflict_mode` is present, use it as the source of truth for the selected imported-skill conflict policy.
11. If an item has `status="skipped"`, do not describe it as overwritten, backed up, migrated, or resolved.
12. If `kind="soul"` has `status="skipped"` with reason `Target already matches source`, say it was left unchanged and do not mention a backup.
13. If a renamed imported skill has an empty `details.backup`, do not imply the existing Hermes skill was renamed or backed up. Say only that the imported copy was placed in the new destination and reference `details.renamed_from` as the pre-existing folder that remained in place.
## Migration presets
Prefer these two presets in normal use:
- `user-data`
- `full`
`user-data` includes:
- `soul`
- `workspace-agents`
- `memory`
- `user-profile`
- `messaging-settings`
- `command-allowlist`
- `skills`
- `tts-assets`
- `archive`
`full` includes everything in `user-data` plus:
- `secret-settings`
The helper script still supports category-level `--include` / `--exclude`, but treat that as an advanced fallback rather than the default UX.
## Commands
Dry run with full discovery:
```bash
python3 ~/.hermes/skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py
```
When using the terminal tool, prefer an absolute invocation pattern such as:
```json
{"command":"python3 /home/USER/.hermes/skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py","workdir":"/home/USER"}
```
Dry run with the user-data preset:
```bash
python3 ~/.hermes/skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py --preset user-data
```
Execute a user-data migration:
```bash
python3 ~/.hermes/skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py --execute --preset user-data --skill-conflict skip
```
Execute a full compatible migration:
```bash
python3 ~/.hermes/skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py --execute --preset full --migrate-secrets --skill-conflict skip
```
Execute with workspace instructions included:
```bash
python3 ~/.hermes/skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py --execute --preset user-data --skill-conflict rename --workspace-target "/absolute/workspace/path"
```
Do not use `$PWD` or the home directory as the workspace target by default. Ask for an explicit workspace path first.
## Important rules
1. Run a dry run before writing unless the user explicitly says to proceed immediately.
2. Do not migrate secrets by default. Tokens, auth blobs, device credentials, and raw gateway config should stay out of Hermes unless the user explicitly asks for secret migration.
3. Do not silently overwrite non-empty Hermes targets unless the user explicitly wants that. The helper script will preserve backups when overwriting is enabled.
4. Always give the user the skipped-items report. That report is part of the migration, not an optional extra.
5. Prefer the primary OpenClaw workspace (`~/.openclaw/workspace/`) over `workspace.default/`. Only use the default workspace as fallback when the primary files are missing.
6. Even in secret-migration mode, only migrate secrets with a clean Hermes destination. Unsupported auth blobs must still be reported as skipped.
7. If the dry run shows a large asset copy, a conflicting `SOUL.md`, or overflowed memory entries, call those out separately before execution.
8. Default to `user-data only` if the user is unsure.
9. Only include `workspace-agents` when the user has explicitly provided a destination workspace path.
10. Treat category-level `--include` / `--exclude` as an advanced escape hatch, not the normal flow.
11. Do not end the dry-run summary with a vague “What would you like to do?” if `clarify` is available. Use structured follow-up prompts instead.
12. Do not use an open-ended `clarify` prompt when a real choice prompt would work. Prefer selectable choices first, then free text only for absolute paths or file review requests.
13. After a dry run, never stop after summarizing if there is still an unresolved decision. Use `clarify` immediately for the highest-priority blocking decision.
14. Priority order for follow-up questions:
- `SOUL.md` conflict
- imported skill conflicts
- migration mode
- workspace instructions destination
15. Do not promise to present choices later in the same message. Present them by actually calling `clarify`.
16. After the migration-mode answer, explicitly check whether `workspace-agents` is still unresolved. If it is, your next action must be the workspace-instructions `clarify` call.
17. After any `clarify` answer, if another required decision remains, do not narrate what was just decided. Ask the next required question immediately.
## Expected result
After a successful run, the user should have:
- Hermes persona state imported
- Hermes memory files populated with converted OpenClaw knowledge
- OpenClaw skills available under `~/.hermes/skills/openclaw-imports/`
- a migration report showing any conflicts, omissions, or unsupported data

View File

@@ -1,441 +0,0 @@
---
name: qmd
description: Search personal knowledge bases, notes, docs, and meeting transcripts locally using qmd — a hybrid retrieval engine with BM25, vector search, and LLM reranking. Supports CLI and MCP integration.
version: 1.0.0
author: Hermes Agent + Teknium
license: MIT
platforms: [macos, linux]
metadata:
hermes:
tags: [Search, Knowledge-Base, RAG, Notes, MCP, Local-AI]
related_skills: [obsidian, native-mcp, arxiv]
---
# QMD — Query Markup Documents
Local, on-device search engine for personal knowledge bases. Indexes markdown
notes, meeting transcripts, documentation, and any text-based files, then
provides hybrid search combining keyword matching, semantic understanding, and
LLM-powered reranking — all running locally with no cloud dependencies.
Created by [Tobi Lütke](https://github.com/tobi/qmd). MIT licensed.
## When to Use
- User asks to search their notes, docs, knowledge base, or meeting transcripts
- User wants to find something across a large collection of markdown/text files
- User wants semantic search ("find notes about X concept") not just keyword grep
- User has already set up qmd collections and wants to query them
- User asks to set up a local knowledge base or document search system
- Keywords: "search my notes", "find in my docs", "knowledge base", "qmd"
## Prerequisites
### Node.js >= 22 (required)
```bash
# Check version
node --version # must be >= 22
# macOS — install or upgrade via Homebrew
brew install node@22
# Linux — use NodeSource or nvm
curl -fsSL https://deb.nodesource.com/setup_22.x | sudo -E bash -
sudo apt-get install -y nodejs
# or with nvm:
nvm install 22 && nvm use 22
```
### SQLite with Extension Support (macOS only)
macOS system SQLite lacks extension loading. Install via Homebrew:
```bash
brew install sqlite
```
### Install qmd
```bash
npm install -g @tobilu/qmd
# or with Bun:
bun install -g @tobilu/qmd
```
First run auto-downloads 3 local GGUF models (~2GB total):
| Model | Purpose | Size |
|-------|---------|------|
| embeddinggemma-300M-Q8_0 | Vector embeddings | ~300MB |
| qwen3-reranker-0.6b-q8_0 | Result reranking | ~640MB |
| qmd-query-expansion-1.7B | Query expansion | ~1.1GB |
### Verify Installation
```bash
qmd --version
qmd status
```
## Quick Reference
| Command | What It Does | Speed |
|---------|-------------|-------|
| `qmd search "query"` | BM25 keyword search (no models) | ~0.2s |
| `qmd vsearch "query"` | Semantic vector search (1 model) | ~3s |
| `qmd query "query"` | Hybrid + reranking (all 3 models) | ~2-3s warm, ~19s cold |
| `qmd get <docid>` | Retrieve full document content | instant |
| `qmd multi-get "glob"` | Retrieve multiple files | instant |
| `qmd collection add <path> --name <n>` | Add a directory as a collection | instant |
| `qmd context add <path> "description"` | Add context metadata to improve retrieval | instant |
| `qmd embed` | Generate/update vector embeddings | varies |
| `qmd status` | Show index health and collection info | instant |
| `qmd mcp` | Start MCP server (stdio) | persistent |
| `qmd mcp --http --daemon` | Start MCP server (HTTP, warm models) | persistent |
## Setup Workflow
### 1. Add Collections
Point qmd at directories containing your documents:
```bash
# Add a notes directory
qmd collection add ~/notes --name notes
# Add project docs
qmd collection add ~/projects/myproject/docs --name project-docs
# Add meeting transcripts
qmd collection add ~/meetings --name meetings
# List all collections
qmd collection list
```
### 2. Add Context Descriptions
Context metadata helps the search engine understand what each collection
contains. This significantly improves retrieval quality:
```bash
qmd context add qmd://notes "Personal notes, ideas, and journal entries"
qmd context add qmd://project-docs "Technical documentation for the main project"
qmd context add qmd://meetings "Meeting transcripts and action items from team syncs"
```
### 3. Generate Embeddings
```bash
qmd embed
```
This processes all documents in all collections and generates vector
embeddings. Re-run after adding new documents or collections.
### 4. Verify
```bash
qmd status # shows index health, collection stats, model info
```
## Search Patterns
### Fast Keyword Search (BM25)
Best for: exact terms, code identifiers, names, known phrases.
No models loaded — near-instant results.
```bash
qmd search "authentication middleware"
qmd search "handleError async"
```
### Semantic Vector Search
Best for: natural language questions, conceptual queries.
Loads embedding model (~3s first query).
```bash
qmd vsearch "how does the rate limiter handle burst traffic"
qmd vsearch "ideas for improving onboarding flow"
```
### Hybrid Search with Reranking (Best Quality)
Best for: important queries where quality matters most.
Uses all 3 models — query expansion, parallel BM25+vector, reranking.
```bash
qmd query "what decisions were made about the database migration"
```
### Structured Multi-Mode Queries
Combine different search types in a single query for precision:
```bash
# BM25 for exact term + vector for concept
qmd query $'lex: rate limiter\nvec: how does throttling work under load'
# With query expansion
qmd query $'expand: database migration plan\nlex: "schema change"'
```
### Query Syntax (lex/BM25 mode)
| Syntax | Effect | Example |
|--------|--------|---------|
| `term` | Prefix match | `perf` matches "performance" |
| `"phrase"` | Exact phrase | `"rate limiter"` |
| `-term` | Exclude term | `performance -sports` |
### HyDE (Hypothetical Document Embeddings)
For complex topics, write what you expect the answer to look like:
```bash
qmd query $'hyde: The migration plan involves three phases. First, we add the new columns without dropping the old ones. Then we backfill data. Finally we cut over and remove legacy columns.'
```
### Scoping to Collections
```bash
qmd search "query" --collection notes
qmd query "query" --collection project-docs
```
### Output Formats
```bash
qmd search "query" --json # JSON output (best for parsing)
qmd search "query" --limit 5 # Limit results
qmd get "#abc123" # Get by document ID
qmd get "path/to/file.md" # Get by file path
qmd get "file.md:50" -l 100 # Get specific line range
qmd multi-get "journals/*.md" --json # Batch retrieve by glob
```
## MCP Integration (Recommended)
qmd exposes an MCP server that provides search tools directly to
Hermes Agent via the native MCP client. This is the preferred
integration — once configured, the agent gets qmd tools automatically
without needing to load this skill.
### Option A: Stdio Mode (Simple)
Add to `~/.hermes/config.yaml`:
```yaml
mcp_servers:
qmd:
command: "qmd"
args: ["mcp"]
timeout: 30
connect_timeout: 45
```
This registers tools: `mcp_qmd_search`, `mcp_qmd_vsearch`,
`mcp_qmd_deep_search`, `mcp_qmd_get`, `mcp_qmd_status`.
**Tradeoff:** Models load on first search call (~19s cold start),
then stay warm for the session. Acceptable for occasional use.
### Option B: HTTP Daemon Mode (Fast, Recommended for Heavy Use)
Start the qmd daemon separately — it keeps models warm in memory:
```bash
# Start daemon (persists across agent restarts)
qmd mcp --http --daemon
# Runs on http://localhost:8181 by default
```
Then configure Hermes Agent to connect via HTTP:
```yaml
mcp_servers:
qmd:
url: "http://localhost:8181/mcp"
timeout: 30
```
**Tradeoff:** Uses ~2GB RAM while running, but every query is fast
(~2-3s). Best for users who search frequently.
### Keeping the Daemon Running
#### macOS (launchd)
```bash
cat > ~/Library/LaunchAgents/com.qmd.daemon.plist << 'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
"http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.qmd.daemon</string>
<key>ProgramArguments</key>
<array>
<string>qmd</string>
<string>mcp</string>
<string>--http</string>
<string>--daemon</string>
</array>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
<key>StandardOutPath</key>
<string>/tmp/qmd-daemon.log</string>
<key>StandardErrorPath</key>
<string>/tmp/qmd-daemon.log</string>
</dict>
</plist>
EOF
launchctl load ~/Library/LaunchAgents/com.qmd.daemon.plist
```
#### Linux (systemd user service)
```bash
mkdir -p ~/.config/systemd/user
cat > ~/.config/systemd/user/qmd-daemon.service << 'EOF'
[Unit]
Description=QMD MCP Daemon
After=network.target
[Service]
ExecStart=qmd mcp --http --daemon
Restart=on-failure
RestartSec=10
Environment=PATH=/usr/local/bin:/usr/bin:/bin
[Install]
WantedBy=default.target
EOF
systemctl --user daemon-reload
systemctl --user enable --now qmd-daemon
systemctl --user status qmd-daemon
```
### MCP Tools Reference
Once connected, these tools are available as `mcp_qmd_*`:
| MCP Tool | Maps To | Description |
|----------|---------|-------------|
| `mcp_qmd_search` | `qmd search` | BM25 keyword search |
| `mcp_qmd_vsearch` | `qmd vsearch` | Semantic vector search |
| `mcp_qmd_deep_search` | `qmd query` | Hybrid search + reranking |
| `mcp_qmd_get` | `qmd get` | Retrieve document by ID or path |
| `mcp_qmd_status` | `qmd status` | Index health and stats |
The MCP tools accept structured JSON queries for multi-mode search:
```json
{
"searches": [
{"type": "lex", "query": "authentication middleware"},
{"type": "vec", "query": "how user login is verified"}
],
"collections": ["project-docs"],
"limit": 10
}
```
## CLI Usage (Without MCP)
When MCP is not configured, use qmd directly via terminal:
```
terminal(command="qmd query 'what was decided about the API redesign' --json", timeout=30)
```
For setup and management tasks, always use terminal:
```
terminal(command="qmd collection add ~/Documents/notes --name notes")
terminal(command="qmd context add qmd://notes 'Personal research notes and ideas'")
terminal(command="qmd embed")
terminal(command="qmd status")
```
## How the Search Pipeline Works
Understanding the internals helps choose the right search mode:
1. **Query Expansion** — A fine-tuned 1.7B model generates 2 alternative
queries. The original gets 2x weight in fusion.
2. **Parallel Retrieval** — BM25 (SQLite FTS5) and vector search run
simultaneously across all query variants.
3. **RRF Fusion** — Reciprocal Rank Fusion (k=60) merges results.
Top-rank bonus: #1 gets +0.05, #2-3 get +0.02.
4. **LLM Reranking** — qwen3-reranker scores top 30 candidates (0.0-1.0).
5. **Position-Aware Blending** — Ranks 1-3: 75% retrieval / 25% reranker.
Ranks 4-10: 60/40. Ranks 11+: 40/60 (trusts reranker more for long tail).
**Smart Chunking:** Documents are split at natural break points (headings,
code blocks, blank lines) targeting ~900 tokens with 15% overlap. Code
blocks are never split mid-block.
## Best Practices
1. **Always add context descriptions**`qmd context add` dramatically
improves retrieval accuracy. Describe what each collection contains.
2. **Re-embed after adding documents**`qmd embed` must be re-run when
new files are added to collections.
3. **Use `qmd search` for speed** — when you need fast keyword lookup
(code identifiers, exact names), BM25 is instant and needs no models.
4. **Use `qmd query` for quality** — when the question is conceptual or
the user needs the best possible results, use hybrid search.
5. **Prefer MCP integration** — once configured, the agent gets native
tools without needing to load this skill each time.
6. **Daemon mode for frequent users** — if the user searches their
knowledge base regularly, recommend the HTTP daemon setup.
7. **First query in structured search gets 2x weight** — put the most
important/certain query first when combining lex and vec.
## Troubleshooting
### "Models downloading on first run"
Normal — qmd auto-downloads ~2GB of GGUF models on first use.
This is a one-time operation.
### Cold start latency (~19s)
This happens when models aren't loaded in memory. Solutions:
- Use HTTP daemon mode (`qmd mcp --http --daemon`) to keep warm
- Use `qmd search` (BM25 only) when models aren't needed
- MCP stdio mode loads models on first search, stays warm for session
### macOS: "unable to load extension"
Install Homebrew SQLite: `brew install sqlite`
Then ensure it's on PATH before system SQLite.
### "No collections found"
Run `qmd collection add <path> --name <name>` to add directories,
then `qmd embed` to index them.
### Embedding model override (CJK/multilingual)
Set `QMD_EMBED_MODEL` environment variable for non-English content:
```bash
export QMD_EMBED_MODEL="your-multilingual-model"
```
## Data Storage
- **Index & vectors:** `~/.cache/qmd/index.sqlite`
- **Models:** Auto-downloaded to local cache on first run
- **No cloud dependencies** — everything runs locally
## References
- [GitHub: tobi/qmd](https://github.com/tobi/qmd)
- [QMD Changelog](https://github.com/tobi/qmd/blob/main/CHANGELOG.md)

View File

@@ -1,218 +0,0 @@
# Checkpoint & Rollback — Implementation Plan
## Goal
Automatic filesystem snapshots before destructive file operations, with user-facing rollback. The agent never sees or interacts with this — it's transparent infrastructure.
## Design Principles
1. **Not a tool** — the LLM never knows about it. Zero prompt tokens, zero tool schema overhead.
2. **Once per turn** — checkpoint at most once per conversation turn (user message → agent response cycle), triggered lazily on the first file-mutating operation. Not on every write.
3. **Opt-in via config** — disabled by default, enabled with `checkpoints: true` in config.yaml.
4. **Works on any directory** — uses a shadow git repo completely separate from the user's project git. Works on git repos, non-git directories, anything.
5. **User-facing rollback**`/rollback` slash command (CLI + gateway) to list and restore checkpoints. Also `hermes rollback` CLI subcommand.
## Architecture
```
~/.hermes/checkpoints/
{sha256(abs_dir)[:16]}/ # Shadow git repo per working directory
HEAD, refs/, objects/... # Standard git internals
HERMES_WORKDIR # Original dir path (for display)
info/exclude # Default excludes (node_modules, .env, etc.)
```
### Core: CheckpointManager (new file: tools/checkpoint_manager.py)
Adapted from PR #559's CheckpointStore. Key changes from the PR:
- **Not a tool** — no schema, no registry entry, no handler
- **Turn-scoped deduplication** — tracks `_checkpointed_dirs: Set[str]` per turn
- **Configurable** — reads `checkpoints` config key
- **Pruning** — keeps last N snapshots per directory (default 50), prunes on take
```python
class CheckpointManager:
def __init__(self, enabled: bool = False, max_snapshots: int = 50):
self.enabled = enabled
self.max_snapshots = max_snapshots
self._checkpointed_dirs: Set[str] = set() # reset each turn
def new_turn(self):
"""Call at start of each conversation turn to reset dedup."""
self._checkpointed_dirs.clear()
def ensure_checkpoint(self, working_dir: str, reason: str = "auto") -> None:
"""Take a checkpoint if enabled and not already done this turn."""
if not self.enabled:
return
abs_dir = str(Path(working_dir).resolve())
if abs_dir in self._checkpointed_dirs:
return
self._checkpointed_dirs.add(abs_dir)
try:
self._take(abs_dir, reason)
except Exception as e:
logger.debug("Checkpoint failed (non-fatal): %s", e)
def list_checkpoints(self, working_dir: str) -> List[dict]:
"""List available checkpoints for a directory."""
...
def restore(self, working_dir: str, commit_hash: str) -> dict:
"""Restore files to a checkpoint state."""
...
def _take(self, working_dir: str, reason: str):
"""Shadow git: add -A + commit. Prune if over max_snapshots."""
...
def _prune(self, shadow_repo: Path):
"""Keep only last max_snapshots commits."""
...
```
### Integration Point: run_agent.py
The AIAgent already owns the conversation loop. Add CheckpointManager as an instance attribute:
```python
class AIAgent:
def __init__(self, ...):
...
# Checkpoint manager — reads config to determine if enabled
self._checkpoint_mgr = CheckpointManager(
enabled=config.get("checkpoints", False),
max_snapshots=config.get("checkpoint_max_snapshots", 50),
)
```
**Turn boundary** — in `run_conversation()`, call `new_turn()` at the start of each agent iteration (before processing tool calls):
```python
# Inside the main loop, before _execute_tool_calls():
self._checkpoint_mgr.new_turn()
```
**Trigger point** — in `_execute_tool_calls()`, before dispatching file-mutating tools:
```python
# Before the handle_function_call dispatch:
if function_name in ("write_file", "patch"):
# Determine working dir from the file path in the args
file_path = function_args.get("path", "") or function_args.get("old_string", "")
if file_path:
work_dir = str(Path(file_path).parent.resolve())
self._checkpoint_mgr.ensure_checkpoint(work_dir, f"before {function_name}")
```
This means:
- First `write_file` in a turn → checkpoint (fast, one `git add -A && git commit`)
- Subsequent writes in the same turn → no-op (already checkpointed)
- Next turn (new user message) → fresh checkpoint eligibility
### Config
Add to `DEFAULT_CONFIG` in `hermes_cli/config.py`:
```python
"checkpoints": False, # Enable filesystem checkpoints before destructive ops
"checkpoint_max_snapshots": 50, # Max snapshots to keep per directory
```
User enables with:
```yaml
# ~/.hermes/config.yaml
checkpoints: true
```
### User-Facing Rollback
**CLI slash command** — add `/rollback` to `process_command()` in `cli.py`:
```
/rollback — List recent checkpoints for the current directory
/rollback <hash> — Restore files to that checkpoint
```
Shows a numbered list:
```
📸 Checkpoints for /home/user/project:
1. abc1234 2026-03-09 21:15 before write_file (3 files changed)
2. def5678 2026-03-09 20:42 before patch (1 file changed)
3. ghi9012 2026-03-09 20:30 before write_file (2 files changed)
Use /rollback <number> to restore, e.g. /rollback 1
```
**Gateway slash command** — add `/rollback` to gateway/run.py with the same behavior.
**CLI subcommand**`hermes rollback` (optional, lower priority).
### What Gets Excluded (not checkpointed)
Same as the PR's defaults — written to the shadow repo's `info/exclude`:
```
node_modules/
dist/
build/
.env
.env.*
__pycache__/
*.pyc
.DS_Store
*.log
.cache/
.venv/
.git/
```
Also respects the project's `.gitignore` if present (shadow repo can read it via `core.excludesFile`).
### Safety
- `ensure_checkpoint()` wraps everything in try/except — a checkpoint failure never blocks the actual file operation
- Shadow repo is completely isolated — GIT_DIR + GIT_WORK_TREE env vars, never touches user's .git
- If git isn't installed, checkpoints silently disable
- Large directories: add a file count check — skip checkpoint if >50K files to avoid slowdowns
## Files to Create/Modify
| File | Change |
|------|--------|
| `tools/checkpoint_manager.py` | **NEW** — CheckpointManager class (adapted from PR #559) |
| `run_agent.py` | Add CheckpointManager init + trigger in `_execute_tool_calls()` |
| `hermes_cli/config.py` | Add `checkpoints` + `checkpoint_max_snapshots` to DEFAULT_CONFIG |
| `cli.py` | Add `/rollback` slash command handler |
| `gateway/run.py` | Add `/rollback` slash command handler |
| `tests/tools/test_checkpoint_manager.py` | **NEW** — tests (adapted from PR #559's tests) |
## What We Take From PR #559
- `_shadow_repo_path()` — deterministic path hashing ✅
- `_git_env()` — GIT_DIR/GIT_WORK_TREE isolation ✅
- `_run_git()` — subprocess wrapper with timeout ✅
- `_init_shadow_repo()` — shadow repo initialization ✅
- `DEFAULT_EXCLUDES` list ✅
- Test structure and patterns ✅
## What We Change From PR #559
- **Remove tool schema/registry** — not a tool
- **Remove injection into file_operations.py and patch_parser.py** — trigger from run_agent.py instead
- **Add turn-scoped deduplication** — one checkpoint per turn, not per operation
- **Add pruning** — keep last N snapshots
- **Add config flag** — opt-in, not mandatory
- **Add /rollback command** — user-facing restore UI
- **Add file count guard** — skip huge directories
## Implementation Order
1. `tools/checkpoint_manager.py` — core class with take/list/restore/prune
2. `tests/tools/test_checkpoint_manager.py` — tests
3. `hermes_cli/config.py` — config keys
4. `run_agent.py` — integration (init + trigger)
5. `cli.py``/rollback` slash command
6. `gateway/run.py``/rollback` slash command
7. Full test suite run + manual smoke test

View File

@@ -40,20 +40,16 @@ dependencies = [
[project.optional-dependencies]
modal = ["swe-rex[modal]>=1.4.0"]
daytona = ["daytona>=0.148.0"]
dev = ["pytest", "pytest-asyncio", "pytest-xdist", "mcp>=1.2.0"]
dev = ["pytest", "pytest-asyncio"]
messaging = ["python-telegram-bot>=20.0", "discord.py>=2.0", "aiohttp>=3.9.0", "slack-bolt>=1.18.0", "slack-sdk>=3.27.0"]
cron = ["croniter"]
slack = ["slack-bolt>=1.18.0", "slack-sdk>=3.27.0"]
cli = ["simple-term-menu"]
tts-premium = ["elevenlabs"]
pty = [
"ptyprocess>=0.7.0; sys_platform != 'win32'",
"pywinpty>=2.0.0; sys_platform == 'win32'",
]
pty = ["ptyprocess>=0.7.0"]
honcho = ["honcho-ai>=2.0.1"]
mcp = ["mcp>=1.2.0"]
homeassistant = ["aiohttp>=3.9.0"]
yc-bench = ["yc-bench @ git+https://github.com/collinear-ai/yc-bench.git"]
all = [
"hermes-agent[modal]",
"hermes-agent[daytona]",
@@ -84,4 +80,4 @@ testpaths = ["tests"]
markers = [
"integration: marks tests requiring external services (API keys, Modal, etc.)",
]
addopts = "-m 'not integration' -n auto"
addopts = "-m 'not integration'"

File diff suppressed because it is too large Load Diff

View File

@@ -492,23 +492,9 @@ install_system_packages() {
return 0
fi
fi
elif [ -e /dev/tty ]; then
# Non-interactive (e.g. curl | bash) but a terminal is available.
# Read the prompt from /dev/tty (same approach the setup wizard uses).
echo ""
log_info "Installing ${description} requires sudo."
read -p "Install? [Y/n] " -n 1 -r < /dev/tty
echo
if [[ $REPLY =~ ^[Yy]$ ]] || [[ -z $REPLY ]]; then
if sudo DEBIAN_FRONTEND=noninteractive NEEDRESTART_MODE=a $install_cmd < /dev/tty; then
[ "$need_ripgrep" = true ] && HAS_RIPGREP=true && log_success "ripgrep installed"
[ "$need_ffmpeg" = true ] && HAS_FFMPEG=true && log_success "ffmpeg installed"
return 0
fi
fi
else
log_warn "Non-interactive mode and no terminal available — cannot install system packages"
log_info "Install manually after setup completes: sudo $install_cmd"
log_warn "Non-interactive mode: cannot prompt for sudo password"
log_info "Install missing packages manually: sudo $install_cmd"
fi
fi
fi
@@ -843,33 +829,6 @@ install_node_deps() {
log_warn "npm install failed (browser tools may not work)"
}
log_success "Node.js dependencies installed"
# Install Playwright browser + system dependencies.
# Playwright's install-deps only supports apt/dnf/zypper natively.
# For Arch/Manjaro we install the system libs via pacman first.
log_info "Installing browser engine (Playwright Chromium)..."
case "$DISTRO" in
arch|manjaro)
if command -v pacman &> /dev/null; then
log_info "Arch/Manjaro detected — installing Chromium system dependencies via pacman..."
if command -v sudo &> /dev/null && sudo -n true 2>/dev/null; then
sudo NEEDRESTART_MODE=a pacman -S --noconfirm --needed \
nss atk at-spi2-core cups libdrm libxkbcommon mesa pango cairo alsa-lib >/dev/null 2>&1 || true
elif [ "$(id -u)" -eq 0 ]; then
pacman -S --noconfirm --needed \
nss atk at-spi2-core cups libdrm libxkbcommon mesa pango cairo alsa-lib >/dev/null 2>&1 || true
else
log_warn "Cannot install browser deps without sudo. Run manually:"
log_warn " sudo pacman -S nss atk at-spi2-core cups libdrm libxkbcommon mesa pango cairo alsa-lib"
fi
fi
cd "$INSTALL_DIR" && npx playwright install chromium 2>/dev/null || true
;;
*)
cd "$INSTALL_DIR" && npx playwright install --with-deps chromium 2>/dev/null || true
;;
esac
log_success "Browser engine installed"
fi
# Install WhatsApp bridge dependencies

View File

@@ -1,3 +0,0 @@
---
description: Apple/macOS-specific skills — iMessage, Reminders, Notes, FindMy, and macOS automation. These skills only load on macOS systems.
---

View File

@@ -1,88 +0,0 @@
---
name: apple-notes
description: Manage Apple Notes via the memo CLI on macOS (create, view, search, edit).
version: 1.0.0
author: Hermes Agent
license: MIT
platforms: [macos]
metadata:
hermes:
tags: [Notes, Apple, macOS, note-taking]
related_skills: [obsidian]
---
# Apple Notes
Use `memo` to manage Apple Notes directly from the terminal. Notes sync across all Apple devices via iCloud.
## Prerequisites
- **macOS** with Notes.app
- Install: `brew tap antoniorodr/memo && brew install antoniorodr/memo/memo`
- Grant Automation access to Notes.app when prompted (System Settings → Privacy → Automation)
## When to Use
- User asks to create, view, or search Apple Notes
- Saving information to Notes.app for cross-device access
- Organizing notes into folders
- Exporting notes to Markdown/HTML
## When NOT to Use
- Obsidian vault management → use the `obsidian` skill
- Bear Notes → separate app (not supported here)
- Quick agent-only notes → use the `memory` tool instead
## Quick Reference
### View Notes
```bash
memo notes # List all notes
memo notes -f "Folder Name" # Filter by folder
memo notes -s "query" # Search notes (fuzzy)
```
### Create Notes
```bash
memo notes -a # Interactive editor
memo notes -a "Note Title" # Quick add with title
```
### Edit Notes
```bash
memo notes -e # Interactive selection to edit
```
### Delete Notes
```bash
memo notes -d # Interactive selection to delete
```
### Move Notes
```bash
memo notes -m # Move note to folder (interactive)
```
### Export Notes
```bash
memo notes -ex # Export to HTML/Markdown
```
## Limitations
- Cannot edit notes containing images or attachments
- Interactive prompts require terminal access (use pty=true if needed)
- macOS only — requires Apple Notes.app
## Rules
1. Prefer Apple Notes when user wants cross-device sync (iPhone/iPad/Mac)
2. Use the `memory` tool for agent-internal notes that don't need to sync
3. Use the `obsidian` skill for Markdown-native knowledge management

Some files were not shown because too many files have changed in this diff Show More