Compare commits


28 Commits

Author SHA1 Message Date
teknium1
214827a594 docs: move Discord behavior guidance to top 2026-03-14 23:18:21 -07:00
teknium1
678e0bd9cc docs: clarify Slack thread reply behavior 2026-03-14 23:15:25 -07:00
Teknium
973aa9b549 fix(update): drop autostash by stash selector
2026-03-14 22:53:50 -07:00
Teknium
2316b8dc98 Merge pull request #1405 from NousResearch/hermes/hermes-7ef7cb6a
docs: stabilize website diagrams
2026-03-14 22:52:56 -07:00
teknium1
47c5c97654 fix(update): drop autostash by stash selector 2026-03-14 22:45:29 -07:00
Teknium
c6fb7f6463 Merge pull request #1399 from NousResearch/hermes/hermes-629f8bde
fix(#1002): expand environment blocklist for terminal isolation
2026-03-14 22:30:05 -07:00
teknium1
672dc1666f test: cover extra provider env blocklist vars 2026-03-14 22:29:35 -07:00
Teknium
5b11570517 Merge pull request #1398 from NousResearch/hermes/hermes-1b6f4583
fix(cron): support per-job runtime overrides
2026-03-14 22:29:30 -07:00
teknium1
ff87a566c4 fix(test): make Nous setup prompt selection robust to optional vision step 2026-03-14 22:28:15 -07:00
Nikita
9e3752df36 fix(#1002): expand environment blocklist for terminal isolation
Expanded the list of blocked environment variables to include Google, Groq, Mistral, and other major LLM providers. This ensures complete isolation and prevents conflicts with external CLI tools.
2026-03-14 22:27:32 -07:00
Teknium
15bf0b4af2 Merge pull request #1365 from mr-emmett-one/fix/deepseek-multi-tool-calls-989
fix: support multiple parallel tool calls in DeepSeek V3 parser (#989)
2026-03-14 22:22:45 -07:00
Synergy
28b3764d1e fix(cron): support per-job runtime overrides
Salvaged from PR #1292 onto current main. Preserve per-job model,
provider, and base_url overrides in cron execution, persist them in
job records, expose them through the cronjob tool create/update paths,
and add regression coverage. Deliberately does not persist per-job
api_key values.
2026-03-14 22:22:31 -07:00
Teknium
62f1c2b622 Merge pull request #1397 from NousResearch/hermes/hermes-629f8bde
fix: escape parens and braces in fork bomb regex pattern
2026-03-14 22:17:16 -07:00
Teknium
71cff92eb7 Merge pull request #1377 from NousResearch/hermes/hermes-aa701810
feat: add native Anthropic auxiliary vision
2026-03-14 22:16:09 -07:00
teknium1
1337c9efd8 test: resolve auxiliary client merge conflict 2026-03-14 22:15:16 -07:00
Teknium
747612fb3e Merge pull request #1396 from NousResearch/hermes/hermes-0fadff1b
fix: persist Google OAuth PKCE state for headless setup
2026-03-14 22:13:37 -07:00
Teknium
84d99f7754 Merge pull request #1394 from NousResearch/hermes/hermes-eca4a640
fix: honor stt.enabled false across gateway transcription
2026-03-14 22:11:47 -07:00
teknium1
4524cddc72 fix: persist google oauth pkce for headless auth
Store the pending OAuth state and code verifier between --auth-url and --auth-code so the manual headless flow can reuse Flow.fetch_token() without disabling PKCE.
2026-03-14 22:11:34 -07:00
teknium1
f4e8772de4 fix: require oauth creds for native Anthropic 2026-03-14 22:11:21 -07:00
Teknium
39fe9e8533 Merge pull request #1395 from NousResearch/hermes/hermes-7ef7cb6a
fix: use description as pattern_key to prevent approval collisions
2026-03-14 22:11:09 -07:00
teknium1
f8ceadbad0 fix: propagate STT disable through shared transcription config
- add stt.enabled to the default user config
- make transcription_tools respect the disabled flag globally
- surface disabled state cleanly in voice mode diagnostics
- add regression coverage for disabled STT provider selection
2026-03-14 22:09:59 -07:00
teyrebaz33
c36136084a fix(gateway): honor stt.enabled false for voice transcription
- bridge stt.enabled from config.yaml into gateway runtime config
- preserve the flag in GatewayConfig serialization
- skip gateway voice transcription when STT is disabled
- add regression tests for config loading and disabled transcription flow
2026-03-14 22:09:53 -07:00
Teknium
f46b35e3d1 Merge pull request #1393 from NousResearch/hermes/hermes-45b79a59-pr1087
fix: normalize Codex dict tool arguments as JSON
2026-03-14 22:07:22 -07:00
0xbyt4
e6417cb7bc fix: escape parens and braces in fork bomb regex pattern
The fork bomb regex used `()` (empty capture group) and unescaped `{}`
instead of literal `\(\)` and `\{\}`. This meant the classic fork bomb
`:(){ :|:& };:` was never detected. Also added `\s*` between `:` and
`&` and between `;` and trailing `:` to catch whitespace variants.
2026-03-14 22:06:44 -07:00
0xbyt4
6f85283553 fix: use json.dumps instead of str() for Codex Responses API arguments
When the Responses API returns tool call arguments as a dict,
str(dict) produces Python repr with single quotes (e.g. {'key': 'val'})
which is invalid JSON. Downstream json.loads() fails silently and the
tool gets called with empty arguments, losing all parameters.

Affects both function_call and custom_tool_call item types in
_normalize_codex_response().
2026-03-14 22:03:53 -07:00
teknium1
db9e512424 fix: fall back from managed Anthropic keys 2026-03-14 21:44:39 -07:00
teknium1
db362dbd4c feat: add native Anthropic auxiliary vision 2026-03-14 21:14:20 -07:00
Emmett
26bedf973b fix: support multiple parallel tool calls in DeepSeek V3 parser (#989)
- Refactored regex pattern to handle varied whitespace and newlines for better robustness.
- Replaced logic to iterate through all tool call blocks using finditer instead of stopping at the first match.
- Ensured full extraction of multiple tool calls for complex agentic workflows.
- Added error logging for failed parsing attempts.
2026-03-15 03:55:24 +01:00
34 changed files with 1322 additions and 111 deletions
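The fork-bomb fix in e6417cb7bc is described in its commit message above, but its hunk is not included in the diffs below. A minimal sketch of a corrected pattern, assuming escaping and whitespace tolerance along the lines the message describes (not the repo's literal regex):

```python
import re

# Hypothetical reconstruction: parens and braces escaped as literals,
# with \s* tolerating whitespace variants of the classic fork bomb.
FORK_BOMB = re.compile(r":\(\)\s*\{\s*:\|:\s*&\s*\}\s*;\s*:")

assert FORK_BOMB.search(":(){ :|:& };:")      # classic form
assert FORK_BOMB.search(":() { :|: & } ; :")  # whitespace variant
```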

View File

@@ -102,30 +102,15 @@ def build_anthropic_client(api_key: str, base_url: str = None):
def read_claude_code_credentials() -> Optional[Dict[str, Any]]:
"""Read credentials from Claude Code's config files.
"""Read refreshable Claude Code OAuth credentials from ~/.claude/.credentials.json.
Checks two locations (in order):
1. ~/.claude.json — top-level primaryApiKey (native binary, v2.x)
2. ~/.claude/.credentials.json — claudeAiOauth block (npm/legacy installs)
This intentionally excludes ~/.claude.json primaryApiKey. Opencode's
subscription flow is OAuth/setup-token based with refreshable credentials,
and native direct Anthropic provider usage should follow that path rather
than auto-detecting Claude's first-party managed key.
Returns dict with {accessToken, refreshToken?, expiresAt?} or None.
"""
# 1. Native binary (v2.x): ~/.claude.json with top-level primaryApiKey
claude_json = Path.home() / ".claude.json"
if claude_json.exists():
try:
data = json.loads(claude_json.read_text(encoding="utf-8"))
primary_key = data.get("primaryApiKey", "")
if primary_key:
return {
"accessToken": primary_key,
"refreshToken": "",
"expiresAt": 0, # Managed keys don't have a user-visible expiry
}
except (json.JSONDecodeError, OSError, IOError) as e:
logger.debug("Failed to read ~/.claude.json: %s", e)
# 2. Legacy/npm installs: ~/.claude/.credentials.json
cred_path = Path.home() / ".claude" / ".credentials.json"
if cred_path.exists():
try:
@@ -138,6 +123,7 @@ def read_claude_code_credentials() -> Optional[Dict[str, Any]]:
"accessToken": access_token,
"refreshToken": oauth_data.get("refreshToken", ""),
"expiresAt": oauth_data.get("expiresAt", 0),
"source": "claude_code_credentials_file",
}
except (json.JSONDecodeError, OSError, IOError) as e:
logger.debug("Failed to read ~/.claude/.credentials.json: %s", e)
@@ -145,6 +131,20 @@ def read_claude_code_credentials() -> Optional[Dict[str, Any]]:
return None
def read_claude_managed_key() -> Optional[str]:
"""Read Claude's native managed key from ~/.claude.json for diagnostics only."""
claude_json = Path.home() / ".claude.json"
if claude_json.exists():
try:
data = json.loads(claude_json.read_text(encoding="utf-8"))
primary_key = data.get("primaryApiKey", "")
if isinstance(primary_key, str) and primary_key.strip():
return primary_key.strip()
except (json.JSONDecodeError, OSError, IOError) as e:
logger.debug("Failed to read ~/.claude.json: %s", e)
return None
def is_claude_code_token_valid(creds: Dict[str, Any]) -> bool:
"""Check if Claude Code credentials have a non-expired access token."""
import time
@@ -273,6 +273,35 @@ def _prefer_refreshable_claude_code_token(env_token: str, creds: Optional[Dict[s
return None
def get_anthropic_token_source(token: Optional[str] = None) -> str:
"""Best-effort source classification for an Anthropic credential token."""
token = (token or "").strip()
if not token:
return "none"
env_token = os.getenv("ANTHROPIC_TOKEN", "").strip()
if env_token and env_token == token:
return "anthropic_token_env"
cc_env_token = os.getenv("CLAUDE_CODE_OAUTH_TOKEN", "").strip()
if cc_env_token and cc_env_token == token:
return "claude_code_oauth_token_env"
creds = read_claude_code_credentials()
if creds and creds.get("accessToken") == token:
return str(creds.get("source") or "claude_code_credentials")
managed_key = read_claude_managed_key()
if managed_key and managed_key == token:
return "claude_json_primary_api_key"
api_key = os.getenv("ANTHROPIC_API_KEY", "").strip()
if api_key and api_key == token:
return "anthropic_api_key_env"
return "unknown"
def resolve_anthropic_token() -> Optional[str]:
"""Resolve an Anthropic token from all available sources.
@@ -391,6 +420,68 @@ def _sanitize_tool_id(tool_id: str) -> str:
return sanitized or "tool_0"
def _convert_openai_image_part_to_anthropic(part: Dict[str, Any]) -> Optional[Dict[str, Any]]:
"""Convert an OpenAI-style image block to Anthropic's image source format."""
image_data = part.get("image_url", {})
url = image_data.get("url", "") if isinstance(image_data, dict) else str(image_data)
if not isinstance(url, str) or not url.strip():
return None
url = url.strip()
if url.startswith("data:"):
header, sep, data = url.partition(",")
if sep and ";base64" in header:
media_type = header[5:].split(";", 1)[0] or "image/png"
return {
"type": "image",
"source": {
"type": "base64",
"media_type": media_type,
"data": data,
},
}
if url.startswith("http://") or url.startswith("https://"):
return {
"type": "image",
"source": {
"type": "url",
"url": url,
},
}
return None
def _convert_user_content_part_to_anthropic(part: Any) -> Optional[Dict[str, Any]]:
if isinstance(part, dict):
ptype = part.get("type")
if ptype == "text":
block = {"type": "text", "text": part.get("text", "")}
if isinstance(part.get("cache_control"), dict):
block["cache_control"] = dict(part["cache_control"])
return block
if ptype == "image_url":
return _convert_openai_image_part_to_anthropic(part)
if ptype == "image" and part.get("source"):
return dict(part)
if ptype == "image" and part.get("data"):
media_type = part.get("mimeType") or part.get("media_type") or "image/png"
return {
"type": "image",
"source": {
"type": "base64",
"media_type": media_type,
"data": part.get("data", ""),
},
}
if ptype == "tool_result":
return dict(part)
elif part is not None:
return {"type": "text", "text": str(part)}
return None
def convert_tools_to_anthropic(tools: List[Dict]) -> List[Dict]:
"""Convert OpenAI tool definitions to Anthropic format."""
if not tools:
@@ -495,7 +586,15 @@ def convert_messages_to_anthropic(
continue
# Regular user message
result.append({"role": "user", "content": content})
if isinstance(content, list):
converted_blocks = []
for part in content:
converted = _convert_user_content_part_to_anthropic(part)
if converted is not None:
converted_blocks.append(converted)
result.append({"role": "user", "content": converted_blocks or [{"type": "text", "text": ""}]})
else:
result.append({"role": "user", "content": content})
# Strip orphaned tool_use blocks (no matching tool_result follows)
tool_result_ids = set()
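As a quick sanity check of the image-conversion branches added above, a standalone copy of the URL-handling logic (illustrative, not an import from the repo):

```python
def to_anthropic_image(url: str):
    """Standalone copy of the conversion branches from the diff above."""
    if url.startswith("data:"):
        header, sep, data = url.partition(",")
        if sep and ";base64" in header:
            media_type = header[5:].split(";", 1)[0] or "image/png"
            return {"type": "image",
                    "source": {"type": "base64", "media_type": media_type, "data": data}}
    if url.startswith(("http://", "https://")):
        return {"type": "image", "source": {"type": "url", "url": url}}
    return None

block = to_anthropic_image("data:image/jpeg;base64,/9j/4AAQ")
assert block["source"]["media_type"] == "image/jpeg"
assert to_anthropic_image("https://example.com/cat.png")["source"]["type"] == "url"
assert to_anthropic_image("ftp://nope") is None
```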

View File

@@ -1,4 +1,4 @@
"""Shared auxiliary OpenAI client for cheap/fast side tasks.
"""Shared auxiliary client router for side tasks.
Provides a single resolution chain so every consumer (context compression,
session search, web extraction, vision analysis, browser vision) picks up
@@ -10,21 +10,21 @@ Resolution order for text tasks (auto mode):
3. Custom endpoint (OPENAI_BASE_URL + OPENAI_API_KEY)
4. Codex OAuth (Responses API via chatgpt.com with gpt-5.3-codex,
wrapped to look like a chat.completions client)
5. Direct API-key providers (z.ai/GLM, Kimi/Moonshot, MiniMax, MiniMax-CN)
— checked via PROVIDER_REGISTRY entries with auth_type='api_key'
6. None
5. Native Anthropic
6. Direct API-key providers (z.ai/GLM, Kimi/Moonshot, MiniMax, MiniMax-CN)
7. None
Resolution order for vision/multimodal tasks (auto mode):
1. OpenRouter
2. Nous Portal
3. Codex OAuth (gpt-5.3-codex supports vision via Responses API)
4. Custom endpoint (for local vision models: Qwen-VL, LLaVA, Pixtral, etc.)
5. None (API-key providers like z.ai/Kimi/MiniMax are skipped —
they may not support multimodal)
1. Selected main provider, if it is one of the supported vision backends below
2. OpenRouter
3. Nous Portal
4. Codex OAuth (gpt-5.3-codex supports vision via Responses API)
5. Native Anthropic
6. Custom endpoint (for local vision models: Qwen-VL, LLaVA, Pixtral, etc.)
7. None
Per-task provider overrides (e.g. AUXILIARY_VISION_PROVIDER,
CONTEXT_COMPRESSION_PROVIDER) can force a specific provider for each task:
"openrouter", "nous", "codex", or "main" (= steps 3-5).
CONTEXT_COMPRESSION_PROVIDER) can force a specific provider for each task.
Default "auto" follows the chains above.
Per-task model overrides (e.g. AUXILIARY_VISION_MODEL,
@@ -78,6 +78,7 @@ auxiliary_is_nous: bool = False
_OPENROUTER_MODEL = "google/gemini-3-flash-preview"
_NOUS_MODEL = "gemini-3-flash"
_NOUS_DEFAULT_BASE_URL = "https://inference-api.nousresearch.com/v1"
_ANTHROPIC_DEFAULT_BASE_URL = "https://api.anthropic.com"
_AUTH_JSON_PATH = get_hermes_home() / "auth.json"
# Codex fallback: uses the Responses API (the only endpoint the Codex
@@ -313,6 +314,114 @@ class AsyncCodexAuxiliaryClient:
self.base_url = sync_wrapper.base_url
class _AnthropicCompletionsAdapter:
"""OpenAI-client-compatible adapter for Anthropic Messages API."""
def __init__(self, real_client: Any, model: str):
self._client = real_client
self._model = model
def create(self, **kwargs) -> Any:
from agent.anthropic_adapter import build_anthropic_kwargs, normalize_anthropic_response
messages = kwargs.get("messages", [])
model = kwargs.get("model", self._model)
tools = kwargs.get("tools")
tool_choice = kwargs.get("tool_choice")
max_tokens = kwargs.get("max_tokens") or kwargs.get("max_completion_tokens") or 2000
temperature = kwargs.get("temperature")
normalized_tool_choice = None
if isinstance(tool_choice, str):
normalized_tool_choice = tool_choice
elif isinstance(tool_choice, dict):
choice_type = str(tool_choice.get("type", "")).lower()
if choice_type == "function":
normalized_tool_choice = tool_choice.get("function", {}).get("name")
elif choice_type in {"auto", "required", "none"}:
normalized_tool_choice = choice_type
anthropic_kwargs = build_anthropic_kwargs(
model=model,
messages=messages,
tools=tools,
max_tokens=max_tokens,
reasoning_config=None,
tool_choice=normalized_tool_choice,
)
if temperature is not None:
anthropic_kwargs["temperature"] = temperature
response = self._client.messages.create(**anthropic_kwargs)
assistant_message, finish_reason = normalize_anthropic_response(response)
usage = None
if hasattr(response, "usage") and response.usage:
prompt_tokens = getattr(response.usage, "input_tokens", 0) or 0
completion_tokens = getattr(response.usage, "output_tokens", 0) or 0
total_tokens = getattr(response.usage, "total_tokens", 0) or (prompt_tokens + completion_tokens)
usage = SimpleNamespace(
prompt_tokens=prompt_tokens,
completion_tokens=completion_tokens,
total_tokens=total_tokens,
)
choice = SimpleNamespace(
index=0,
message=assistant_message,
finish_reason=finish_reason,
)
return SimpleNamespace(
choices=[choice],
model=model,
usage=usage,
)
class _AnthropicChatShim:
def __init__(self, adapter: _AnthropicCompletionsAdapter):
self.completions = adapter
class AnthropicAuxiliaryClient:
"""OpenAI-client-compatible wrapper over a native Anthropic client."""
def __init__(self, real_client: Any, model: str, api_key: str, base_url: str):
self._real_client = real_client
adapter = _AnthropicCompletionsAdapter(real_client, model)
self.chat = _AnthropicChatShim(adapter)
self.api_key = api_key
self.base_url = base_url
def close(self):
close_fn = getattr(self._real_client, "close", None)
if callable(close_fn):
close_fn()
class _AsyncAnthropicCompletionsAdapter:
def __init__(self, sync_adapter: _AnthropicCompletionsAdapter):
self._sync = sync_adapter
async def create(self, **kwargs) -> Any:
import asyncio
return await asyncio.to_thread(self._sync.create, **kwargs)
class _AsyncAnthropicChatShim:
def __init__(self, adapter: _AsyncAnthropicCompletionsAdapter):
self.completions = adapter
class AsyncAnthropicAuxiliaryClient:
def __init__(self, sync_wrapper: "AnthropicAuxiliaryClient"):
sync_adapter = sync_wrapper.chat.completions
async_adapter = _AsyncAnthropicCompletionsAdapter(sync_adapter)
self.chat = _AsyncAnthropicChatShim(async_adapter)
self.api_key = sync_wrapper.api_key
self.base_url = sync_wrapper.base_url
def _read_nous_auth() -> Optional[dict]:
"""Read and validate ~/.hermes/auth.json for an active Nous provider.
@@ -384,6 +493,9 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
break
if not api_key:
continue
if provider_id == "anthropic":
return _try_anthropic()
# Resolve base URL (with optional env-var override)
# Kimi Code keys (sk-kimi-) need api.kimi.com/coding/v1
env_url = ""
@@ -534,6 +646,22 @@ def _try_codex() -> Tuple[Optional[Any], Optional[str]]:
return CodexAuxiliaryClient(real_client, _CODEX_AUX_MODEL), _CODEX_AUX_MODEL
def _try_anthropic() -> Tuple[Optional[Any], Optional[str]]:
try:
from agent.anthropic_adapter import build_anthropic_client, resolve_anthropic_token
except ImportError:
return None, None
token = resolve_anthropic_token()
if not token:
return None, None
model = _API_KEY_PROVIDER_AUX_MODELS.get("anthropic", "claude-haiku-4-5-20251001")
logger.debug("Auxiliary client: Anthropic native (%s)", model)
real_client = build_anthropic_client(token, _ANTHROPIC_DEFAULT_BASE_URL)
return AnthropicAuxiliaryClient(real_client, model, token, _ANTHROPIC_DEFAULT_BASE_URL), model
def _resolve_forced_provider(forced: str) -> Tuple[Optional[OpenAI], Optional[str]]:
"""Resolve a specific forced provider. Returns (None, None) if creds missing."""
if forced == "openrouter":
@@ -596,6 +724,8 @@ def _to_async_client(sync_client, model: str):
if isinstance(sync_client, CodexAuxiliaryClient):
return AsyncCodexAuxiliaryClient(sync_client), model
if isinstance(sync_client, AnthropicAuxiliaryClient):
return AsyncAnthropicAuxiliaryClient(sync_client), model
async_kwargs = {
"api_key": sync_client.api_key,
@@ -756,6 +886,14 @@ def resolve_provider_client(
return None, None
if pconfig.auth_type == "api_key":
if provider == "anthropic":
client, default_model = _try_anthropic()
if client is None:
logger.warning("resolve_provider_client: anthropic requested but no Anthropic credentials found")
return None, None
final_model = model or default_model
return (_to_async_client(client, final_model) if async_mode else (client, final_model))
# Find the first configured API key
api_key = ""
for env_var in pconfig.api_key_env_vars:
@@ -849,6 +987,7 @@ _VISION_AUTO_PROVIDER_ORDER = (
"openrouter",
"nous",
"openai-codex",
"anthropic",
"custom",
)
@@ -870,6 +1009,8 @@ def _resolve_strict_vision_backend(provider: str) -> Tuple[Optional[Any], Option
return _try_nous()
if provider == "openai-codex":
return _try_codex()
if provider == "anthropic":
return _try_anthropic()
if provider == "custom":
return _try_custom_endpoint()
return None, None
@@ -879,19 +1020,36 @@ def _strict_vision_backend_available(provider: str) -> bool:
return _resolve_strict_vision_backend(provider)[0] is not None
def _preferred_main_vision_provider() -> Optional[str]:
"""Return the selected main provider when it is also a supported vision backend."""
try:
from hermes_cli.config import load_config
config = load_config()
model_cfg = config.get("model", {})
if isinstance(model_cfg, dict):
provider = _normalize_vision_provider(model_cfg.get("provider", ""))
if provider in _VISION_AUTO_PROVIDER_ORDER:
return provider
except Exception:
pass
return None
def get_available_vision_backends() -> List[str]:
"""Return the currently available vision backends in auto-selection order.
This is the single source of truth for setup, tool gating, and runtime
auto-routing of vision tasks. Phase 1 keeps the auto list conservative:
OpenRouter, Nous Portal, Codex OAuth, then custom OpenAI-compatible
endpoints. Explicit provider overrides can still route elsewhere.
auto-routing of vision tasks. The selected main provider is preferred when
it is also a known-good vision backend; otherwise Hermes falls back through
the standard conservative order.
"""
return [
provider
for provider in _VISION_AUTO_PROVIDER_ORDER
if _strict_vision_backend_available(provider)
]
ordered = list(_VISION_AUTO_PROVIDER_ORDER)
preferred = _preferred_main_vision_provider()
if preferred in ordered:
ordered.remove(preferred)
ordered.insert(0, preferred)
return [provider for provider in ordered if _strict_vision_backend_available(provider)]
def resolve_vision_provider_client(
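The reordering in get_available_vision_backends() is small but easy to misread; its effect, with names taken from _VISION_AUTO_PROVIDER_ORDER above (availability filtering omitted):

```python
order = ["openrouter", "nous", "openai-codex", "anthropic", "custom"]
preferred = "anthropic"  # e.g. the main provider selected in config.yaml

# Move the preferred backend to the front; the rest keep their order.
if preferred in order:
    order.remove(preferred)
    order.insert(0, preferred)

assert order == ["anthropic", "openrouter", "nous", "openai-codex", "custom"]
```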

View File

@@ -292,6 +292,9 @@ def create_job(
origin: Optional[Dict[str, Any]] = None,
skill: Optional[str] = None,
skills: Optional[List[str]] = None,
model: Optional[str] = None,
provider: Optional[str] = None,
base_url: Optional[str] = None,
) -> Dict[str, Any]:
"""
Create a new cron job.
@@ -305,6 +308,9 @@ def create_job(
origin: Source info where job was created (for "origin" delivery)
skill: Optional legacy single skill name to load before running the prompt
skills: Optional ordered list of skills to load before running the prompt
model: Optional per-job model override
provider: Optional per-job provider override
base_url: Optional per-job base URL override
Returns:
The created job dict
@@ -323,6 +329,13 @@ def create_job(
now = _hermes_now().isoformat()
normalized_skills = _normalize_skill_list(skill, skills)
normalized_model = str(model).strip() if isinstance(model, str) else None
normalized_provider = str(provider).strip() if isinstance(provider, str) else None
normalized_base_url = str(base_url).strip().rstrip("/") if isinstance(base_url, str) else None
normalized_model = normalized_model or None
normalized_provider = normalized_provider or None
normalized_base_url = normalized_base_url or None
label_source = (prompt or (normalized_skills[0] if normalized_skills else None)) or "cron job"
job = {
"id": job_id,
@@ -330,6 +343,9 @@ def create_job(
"prompt": prompt,
"skills": normalized_skills,
"skill": normalized_skills[0] if normalized_skills else None,
"model": normalized_model,
"provider": normalized_provider,
"base_url": normalized_base_url,
"schedule": parsed_schedule,
"schedule_display": parsed_schedule.get("display", schedule),
"repeat": {

View File

@@ -261,7 +261,7 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
if delivery_target.get("thread_id") is not None:
os.environ["HERMES_CRON_AUTO_DELIVER_THREAD_ID"] = str(delivery_target["thread_id"])
model = os.getenv("HERMES_MODEL") or "anthropic/claude-opus-4.6"
model = job.get("model") or os.getenv("HERMES_MODEL") or "anthropic/claude-opus-4.6"
# Load config.yaml for model, reasoning, prefill, toolsets, provider routing
_cfg = {}
@@ -272,10 +272,11 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
with open(_cfg_path) as _f:
_cfg = yaml.safe_load(_f) or {}
_model_cfg = _cfg.get("model", {})
if isinstance(_model_cfg, str):
model = _model_cfg
elif isinstance(_model_cfg, dict):
model = _model_cfg.get("default", model)
if not job.get("model"):
if isinstance(_model_cfg, str):
model = _model_cfg
elif isinstance(_model_cfg, dict):
model = _model_cfg.get("default", model)
except Exception as e:
logger.warning("Job '%s': failed to load config.yaml, using defaults: %s", job_id, e)
@@ -320,9 +321,12 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
format_runtime_provider_error,
)
try:
runtime = resolve_runtime_provider(
requested=os.getenv("HERMES_INFERENCE_PROVIDER"),
)
runtime_kwargs = {
"requested": job.get("provider") or os.getenv("HERMES_INFERENCE_PROVIDER"),
}
if job.get("base_url"):
runtime_kwargs["explicit_base_url"] = job.get("base_url")
runtime = resolve_runtime_provider(**runtime_kwargs)
except Exception as exc:
message = format_runtime_provider_error(exc)
raise RuntimeError(message) from exc
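Putting the scheduler changes together: per-job values win over environment and config defaults. A compact illustration of the precedence as wired above:

```python
import os

job = {"model": "perplexity/sonar-pro", "provider": "custom",
       "base_url": "http://127.0.0.1:4000/v1"}

model = job.get("model") or os.getenv("HERMES_MODEL") or "anthropic/claude-opus-4.6"
runtime_kwargs = {"requested": job.get("provider") or os.getenv("HERMES_INFERENCE_PROVIDER")}
if job.get("base_url"):
    runtime_kwargs["explicit_base_url"] = job["base_url"]

assert model == "perplexity/sonar-pro"
assert runtime_kwargs == {"requested": "custom",
                          "explicit_base_url": "http://127.0.0.1:4000/v1"}
```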

View File

@@ -10,12 +10,13 @@ Format uses special unicode tokens:
<tool▁call▁end>
<tool▁calls▁end>
Based on vLLM's DeepSeekV3ToolParser.extract_tool_calls()
Fixes Issue #989: Support for multiple simultaneous tool calls.
"""
import re
import uuid
from typing import List, Optional
import logging
from typing import List, Optional, Tuple
from openai.types.chat.chat_completion_message_tool_call import (
ChatCompletionMessageToolCall,
@@ -24,6 +25,7 @@ from openai.types.chat.chat_completion_message_tool_call import (
from environments.tool_call_parsers import ParseResult, ToolCallParser, register_parser
logger = logging.getLogger(__name__)
@register_parser("deepseek_v3")
class DeepSeekV3ToolCallParser(ToolCallParser):
@@ -32,45 +34,56 @@ class DeepSeekV3ToolCallParser(ToolCallParser):
Uses special unicode tokens with fullwidth angle brackets and block elements.
Extracts type, function name, and JSON arguments from the structured format.
Ensures all tool calls are captured when the model executes multiple actions.
"""
START_TOKEN = "<tool▁calls▁begin>"
# Regex captures: type, function_name, function_arguments
# Updated PATTERN: Using \s* instead of literal \n for increased robustness
# against variations in model formatting (Issue #989).
PATTERN = re.compile(
r"<tool▁call▁begin>(?P<type>.*?)<tool▁sep>(?P<function_name>.*?)\n```json\n(?P<function_arguments>.*?)\n```<tool▁call▁end>",
r"<tool▁call▁begin>(?P<type>.*?)<tool▁sep>(?P<function_name>.*?)\s*```json\s*(?P<function_arguments>.*?)\s*```\s*<tool▁call▁end>",
re.DOTALL,
)
def parse(self, text: str) -> ParseResult:
"""
Parses the input text and extracts all available tool calls.
"""
if self.START_TOKEN not in text:
return text, None
try:
matches = self.PATTERN.findall(text)
# Using finditer to capture ALL tool calls in the sequence
matches = list(self.PATTERN.finditer(text))
if not matches:
return text, None
tool_calls: List[ChatCompletionMessageToolCall] = []
for match in matches:
tc_type, func_name, func_args = match
func_name = match.group("function_name").strip()
func_args = match.group("function_arguments").strip()
tool_calls.append(
ChatCompletionMessageToolCall(
id=f"call_{uuid.uuid4().hex[:8]}",
type="function",
function=Function(
name=func_name.strip(),
arguments=func_args.strip(),
name=func_name,
arguments=func_args,
),
)
)
if not tool_calls:
return text, None
if tool_calls:
# Content is text before the first tool call block
content_index = text.find(self.START_TOKEN)
content = text[:content_index].strip()
return content if content else None, tool_calls
# Content is everything before the tool calls section
content = text[: text.find(self.START_TOKEN)].strip()
return content if content else None, tool_calls
except Exception:
return text, None
except Exception as e:
logger.error(f"Error parsing DeepSeek V3 tool calls: {e}")
return text, None
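To see why the \s* relaxation plus finditer matters, a standalone check of the updated pattern against two back-to-back calls, one strict-newline and one whitespace-variant (sample payloads invented):

```python
import re

PATTERN = re.compile(
    r"<tool▁call▁begin>(?P<type>.*?)<tool▁sep>(?P<function_name>.*?)"
    r"\s*```json\s*(?P<function_arguments>.*?)\s*```\s*<tool▁call▁end>",
    re.DOTALL,
)

text = (
    "<tool▁calls▁begin>"
    '<tool▁call▁begin>function<tool▁sep>get_weather\n```json\n{"city": "SF"}\n```<tool▁call▁end>'
    '<tool▁call▁begin>function<tool▁sep>get_time ```json {"tz": "PST"} ``` <tool▁call▁end>'
    "<tool▁calls▁end>"
)

names = [m.group("function_name").strip() for m in PATTERN.finditer(text)]
# The old strict-\n pattern would have missed the second, space-formatted call.
assert names == ["get_weather", "get_time"]
```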

View File

@@ -21,6 +21,17 @@ from hermes_cli.config import get_hermes_home
logger = logging.getLogger(__name__)
def _coerce_bool(value: Any, default: bool = True) -> bool:
"""Coerce bool-ish config values, preserving a caller-provided default."""
if value is None:
return default
if isinstance(value, bool):
return value
if isinstance(value, str):
return value.strip().lower() in ("true", "1", "yes", "on")
return bool(value)
class Platform(Enum):
"""Supported messaging platforms."""
LOCAL = "local"
@@ -160,6 +171,9 @@ class GatewayConfig:
# Delivery settings
always_log_local: bool = True # Always save cron outputs to local files
# STT settings
stt_enabled: bool = True # Whether to auto-transcribe inbound voice messages
def get_connected_platforms(self) -> List[Platform]:
"""Return list of platforms that are enabled and configured."""
@@ -224,6 +238,7 @@ class GatewayConfig:
"quick_commands": self.quick_commands,
"sessions_dir": str(self.sessions_dir),
"always_log_local": self.always_log_local,
"stt_enabled": self.stt_enabled,
}
@classmethod
@@ -260,6 +275,10 @@ class GatewayConfig:
if not isinstance(quick_commands, dict):
quick_commands = {}
stt_enabled = data.get("stt_enabled")
if stt_enabled is None:
stt_enabled = data.get("stt", {}).get("enabled") if isinstance(data.get("stt"), dict) else None
return cls(
platforms=platforms,
default_reset_policy=default_policy,
@@ -269,6 +288,7 @@ class GatewayConfig:
quick_commands=quick_commands,
sessions_dir=sessions_dir,
always_log_local=data.get("always_log_local", True),
stt_enabled=_coerce_bool(stt_enabled, True),
)
@@ -318,6 +338,12 @@ def load_gateway_config() -> GatewayConfig:
else:
logger.warning("Ignoring invalid quick_commands in config.yaml (expected mapping, got %s)", type(qc).__name__)
# Bridge STT enable/disable from config.yaml into gateway runtime.
# This keeps the gateway aligned with the user-facing config source.
stt_cfg = yaml_cfg.get("stt")
if isinstance(stt_cfg, dict) and "enabled" in stt_cfg:
config.stt_enabled = _coerce_bool(stt_cfg.get("enabled"), True)
# Bridge discord settings from config.yaml to env vars
# (env vars take precedence — only set if not already defined)
discord_cfg = yaml_cfg.get("discord", {})

View File

@@ -3550,7 +3550,7 @@ class GatewayRunner:
audio_paths: List[str],
) -> str:
"""
Auto-transcribe user voice/audio messages using OpenAI Whisper API
Auto-transcribe user voice/audio messages using the configured STT provider
and prepend the transcript to the message text.
Args:
@@ -3560,6 +3560,12 @@ class GatewayRunner:
Returns:
The enriched message string with transcriptions prepended.
"""
if not getattr(self.config, "stt_enabled", True):
disabled_note = "[The user sent voice message(s), but transcription is disabled in config.]"
if user_text:
return f"{disabled_note}\n\n{user_text}"
return disabled_note
from tools.transcription_tools import transcribe_audio, get_stt_model_from_config
import asyncio

View File

@@ -219,7 +219,8 @@ DEFAULT_CONFIG = {
},
"stt": {
"provider": "local", # "local" (free, faster-whisper) | "openai" (Whisper API)
"enabled": True,
"provider": "local", # "local" (free, faster-whisper) | "groq" | "openai" (Whisper API)
"local": {
"model": "base", # tiny, base, small, medium, large-v3
},
@@ -300,7 +301,7 @@ DEFAULT_CONFIG = {
},
# Config schema version - bump this when adding new required fields
"_config_version": 7,
"_config_version": 8,
}
# =============================================================================

View File

@@ -2016,6 +2016,22 @@ def _stash_local_changes_if_needed(git_cmd: list[str], cwd: Path) -> Optional[st
def _resolve_stash_selector(git_cmd: list[str], cwd: Path, stash_ref: str) -> Optional[str]:
stash_list = subprocess.run(
git_cmd + ["stash", "list", "--format=%gd %H"],
cwd=cwd,
capture_output=True,
text=True,
check=True,
)
for line in stash_list.stdout.splitlines():
selector, _, commit = line.partition(" ")
if commit.strip() == stash_ref:
return selector.strip()
return None
def _restore_stashed_changes(
git_cmd: list[str],
cwd: Path,
@@ -2052,7 +2068,27 @@ def _restore_stashed_changes(
print(f"Resolve manually with: git stash apply {stash_ref}")
sys.exit(1)
subprocess.run(git_cmd + ["stash", "drop", stash_ref], cwd=cwd, check=True)
stash_selector = _resolve_stash_selector(git_cmd, cwd, stash_ref)
if stash_selector is None:
print("⚠ Local changes were restored, but Hermes couldn't find the stash entry to drop.")
print(" The stash was left in place. You can remove it manually after checking the result.")
print(f" Look for commit {stash_ref} in `git stash list --format='%gd %H'` and drop that selector.")
else:
drop = subprocess.run(
git_cmd + ["stash", "drop", stash_selector],
cwd=cwd,
capture_output=True,
text=True,
)
if drop.returncode != 0:
print("⚠ Local changes were restored, but Hermes couldn't drop the saved stash entry.")
if drop.stdout.strip():
print(drop.stdout.strip())
if drop.stderr.strip():
print(drop.stderr.strip())
print(" The stash was left in place. You can remove it manually after checking the result.")
print(f" If needed: git stash drop {stash_selector}")
print("⚠ Local changes were restored on top of the updated codebase.")
print(" Review `git diff` / `git status` if Hermes behaves unexpectedly.")
return True
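The selector resolution above exists because stash@{N} indices shift as entries come and go, while the stash commit hash is stable. A standalone sketch of the lookup:

```python
def resolve_selector(stash_list_output: str, stash_ref: str):
    # Mirrors _resolve_stash_selector(): match on the stable commit hash,
    # return the positional stash@{N} selector git expects for `stash drop`.
    for line in stash_list_output.splitlines():
        selector, _, commit = line.partition(" ")
        if commit.strip() == stash_ref:
            return selector.strip()
    return None

out = "stash@{0} def456\nstash@{1} abc123\n"  # `git stash list --format='%gd %H'`
assert resolve_selector(out, "abc123") == "stash@{1}"
assert resolve_selector(out, "zzz999") is None
```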

View File

@@ -1268,11 +1268,9 @@ def setup_model_provider(config: dict):
_vision_needs_setup = not bool(_vision_backends)
if selected_provider in {"openrouter", "nous", "openai-codex"}:
# If the user just selected one of our known-good vision backends during
# setup, treat vision as covered. Auth/setup failure returns earlier.
_vision_needs_setup = False
elif selected_provider == "custom" and "custom" in _vision_backends:
if selected_provider in _vision_backends:
# If the user just selected a backend Hermes can already use for
# vision, treat it as covered. Auth/setup failure returns earlier.
_vision_needs_setup = False
if _vision_needs_setup:

View File

@@ -2407,7 +2407,7 @@ class AIAgent:
fn_name = getattr(item, "name", "") or ""
arguments = getattr(item, "arguments", "{}")
if not isinstance(arguments, str):
arguments = str(arguments)
arguments = json.dumps(arguments, ensure_ascii=False)
raw_call_id = getattr(item, "call_id", None)
raw_item_id = getattr(item, "id", None)
embedded_call_id, _ = self._split_responses_tool_id(raw_item_id)
@@ -2428,7 +2428,7 @@ class AIAgent:
fn_name = getattr(item, "name", "") or ""
arguments = getattr(item, "input", "{}")
if not isinstance(arguments, str):
arguments = str(arguments)
arguments = json.dumps(arguments, ensure_ascii=False)
raw_call_id = getattr(item, "call_id", None)
raw_item_id = getattr(item, "id", None)
embedded_call_id, _ = self._split_responses_tool_id(raw_item_id)
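The one-line change above is the whole fix from 6f85283553: str() on a dict yields Python repr, which is not JSON, so a tolerant downstream json.loads() would swallow the error and call the tool with empty arguments:

```python
import json

args = {"key": "val"}
print(str(args))         # {'key': 'val'}  — Python repr, single quotes
print(json.dumps(args))  # {"key": "val"}  — valid JSON

try:
    json.loads(str(args))
except json.JSONDecodeError:
    pass  # a parser that falls back to {} here loses every parameter
```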

View File

@@ -102,7 +102,9 @@ This prints a URL. **Send the URL to the user** and tell them:
### Step 4: Exchange the code
The user will paste back either a URL like `http://localhost:1/?code=4/0A...&scope=...`
or just the code string. Either works:
or just the code string. Either works. The `--auth-url` step stores a temporary
pending OAuth session locally so `--auth-code` can complete the PKCE exchange
later, even on headless systems:
```bash
$GSETUP --auth-code "THE_URL_OR_CODE_THE_USER_PASTED"
@@ -119,6 +121,7 @@ Should print `AUTHENTICATED`. Setup is complete — token refreshes automaticall
### Notes
- Token is stored at `~/.hermes/google_token.json` and auto-refreshes.
- Pending OAuth session state/verifier are stored temporarily at `~/.hermes/google_oauth_pending.json` until exchange completes.
- To revoke: `$GSETUP --revoke`
## Usage

View File

@@ -31,6 +31,7 @@ from pathlib import Path
HERMES_HOME = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
TOKEN_PATH = HERMES_HOME / "google_token.json"
CLIENT_SECRET_PATH = HERMES_HOME / "google_client_secret.json"
PENDING_AUTH_PATH = HERMES_HOME / "google_oauth_pending.json"
SCOPES = [
"https://www.googleapis.com/auth/gmail.readonly",
@@ -141,6 +142,58 @@ def store_client_secret(path: str):
print(f"OK: Client secret saved to {CLIENT_SECRET_PATH}")
def _save_pending_auth(*, state: str, code_verifier: str):
"""Persist the OAuth session bits needed for a later token exchange."""
PENDING_AUTH_PATH.write_text(
json.dumps(
{
"state": state,
"code_verifier": code_verifier,
"redirect_uri": REDIRECT_URI,
},
indent=2,
)
)
def _load_pending_auth() -> dict:
"""Load the pending OAuth session created by get_auth_url()."""
if not PENDING_AUTH_PATH.exists():
print("ERROR: No pending OAuth session found. Run --auth-url first.")
sys.exit(1)
try:
data = json.loads(PENDING_AUTH_PATH.read_text())
except Exception as e:
print(f"ERROR: Could not read pending OAuth session: {e}")
print("Run --auth-url again to start a fresh OAuth session.")
sys.exit(1)
if not data.get("state") or not data.get("code_verifier"):
print("ERROR: Pending OAuth session is missing PKCE data.")
print("Run --auth-url again to start a fresh OAuth session.")
sys.exit(1)
return data
def _extract_code_and_state(code_or_url: str) -> tuple[str, str | None]:
"""Accept either a raw auth code or the full redirect URL pasted by the user."""
if not code_or_url.startswith("http"):
return code_or_url, None
from urllib.parse import parse_qs, urlparse
parsed = urlparse(code_or_url)
params = parse_qs(parsed.query)
if "code" not in params:
print("ERROR: No 'code' parameter found in URL.")
sys.exit(1)
state = params.get("state", [None])[0]
return params["code"][0], state
def get_auth_url():
"""Print the OAuth authorization URL. User visits this in a browser."""
if not CLIENT_SECRET_PATH.exists():
@@ -154,11 +207,13 @@ def get_auth_url():
str(CLIENT_SECRET_PATH),
scopes=SCOPES,
redirect_uri=REDIRECT_URI,
autogenerate_code_verifier=True,
)
auth_url, _ = flow.authorization_url(
auth_url, state = flow.authorization_url(
access_type="offline",
prompt="consent",
)
_save_pending_auth(state=state, code_verifier=flow.code_verifier)
# Print just the URL so the agent can extract it cleanly
print(auth_url)
@@ -169,26 +224,23 @@ def exchange_auth_code(code: str):
print("ERROR: No client secret stored. Run --client-secret first.")
sys.exit(1)
pending_auth = _load_pending_auth()
code, returned_state = _extract_code_and_state(code)
if returned_state and returned_state != pending_auth["state"]:
print("ERROR: OAuth state mismatch. Run --auth-url again to start a fresh session.")
sys.exit(1)
_ensure_deps()
from google_auth_oauthlib.flow import Flow
flow = Flow.from_client_secrets_file(
str(CLIENT_SECRET_PATH),
scopes=SCOPES,
redirect_uri=REDIRECT_URI,
redirect_uri=pending_auth.get("redirect_uri", REDIRECT_URI),
state=pending_auth["state"],
code_verifier=pending_auth["code_verifier"],
)
# The code might come as a full redirect URL or just the code itself
if code.startswith("http"):
# Extract code from redirect URL: http://localhost:1/?code=CODE&scope=...
from urllib.parse import urlparse, parse_qs
parsed = urlparse(code)
params = parse_qs(parsed.query)
if "code" not in params:
print("ERROR: No 'code' parameter found in URL.")
sys.exit(1)
code = params["code"][0]
try:
flow.fetch_token(code=code)
except Exception as e:
@@ -198,6 +250,7 @@ def exchange_auth_code(code: str):
creds = flow.credentials
TOKEN_PATH.write_text(creds.to_json())
PENDING_AUTH_PATH.unlink(missing_ok=True)
print(f"OK: Authenticated. Token saved to {TOKEN_PATH}")
@@ -229,6 +282,7 @@ def revoke():
print(f"Remote revocation failed (token may already be invalid): {e}")
TOKEN_PATH.unlink(missing_ok=True)
PENDING_AUTH_PATH.unlink(missing_ok=True)
print(f"Deleted {TOKEN_PATH}")

View File

@@ -10,6 +10,8 @@ import pytest
from agent.auxiliary_client import (
get_text_auxiliary_client,
get_vision_auxiliary_client,
get_available_vision_backends,
resolve_provider_client,
auxiliary_max_tokens_param,
_read_codex_access_token,
_get_auxiliary_provider,
@@ -24,6 +26,7 @@ def _clean_env(monkeypatch):
for key in (
"OPENROUTER_API_KEY", "OPENAI_BASE_URL", "OPENAI_API_KEY",
"OPENAI_MODEL", "LLM_MODEL", "NOUS_INFERENCE_BASE_URL",
"ANTHROPIC_API_KEY", "ANTHROPIC_TOKEN", "CLAUDE_CODE_OAUTH_TOKEN",
# Per-task provider/model/direct-endpoint overrides
"AUXILIARY_VISION_PROVIDER", "AUXILIARY_VISION_MODEL",
"AUXILIARY_VISION_BASE_URL", "AUXILIARY_VISION_API_KEY",
@@ -210,14 +213,74 @@ class TestGetTextAuxiliaryClient:
class TestVisionClientFallback:
"""Vision client auto mode only tries OpenRouter + Nous (multimodal-capable)."""
"""Vision client auto mode resolves known-good multimodal backends."""
def test_vision_returns_none_without_any_credentials(self):
with patch("agent.auxiliary_client._read_nous_auth", return_value=None):
with (
patch("agent.auxiliary_client._read_nous_auth", return_value=None),
patch("agent.auxiliary_client._try_anthropic", return_value=(None, None)),
):
client, model = get_vision_auxiliary_client()
assert client is None
assert model is None
def test_vision_auto_includes_anthropic_when_configured(self, monkeypatch):
monkeypatch.setenv("ANTHROPIC_API_KEY", "sk-ant-api03-key")
with (
patch("agent.auxiliary_client._read_nous_auth", return_value=None),
patch("agent.anthropic_adapter.build_anthropic_client", return_value=MagicMock()),
patch("agent.anthropic_adapter.resolve_anthropic_token", return_value="sk-ant-api03-key"),
):
backends = get_available_vision_backends()
assert "anthropic" in backends
def test_resolve_provider_client_returns_native_anthropic_wrapper(self, monkeypatch):
monkeypatch.setenv("ANTHROPIC_API_KEY", "sk-ant-api03-key")
with (
patch("agent.auxiliary_client._read_nous_auth", return_value=None),
patch("agent.anthropic_adapter.build_anthropic_client", return_value=MagicMock()),
patch("agent.anthropic_adapter.resolve_anthropic_token", return_value="sk-ant-api03-key"),
):
client, model = resolve_provider_client("anthropic")
assert client is not None
assert client.__class__.__name__ == "AnthropicAuxiliaryClient"
assert model == "claude-haiku-4-5-20251001"
def test_vision_auto_uses_anthropic_when_no_higher_priority_backend(self, monkeypatch):
monkeypatch.setenv("ANTHROPIC_API_KEY", "sk-ant-api03-key")
with (
patch("agent.auxiliary_client._read_nous_auth", return_value=None),
patch("agent.anthropic_adapter.build_anthropic_client", return_value=MagicMock()),
patch("agent.anthropic_adapter.resolve_anthropic_token", return_value="sk-ant-api03-key"),
):
client, model = get_vision_auxiliary_client()
assert client is not None
assert client.__class__.__name__ == "AnthropicAuxiliaryClient"
assert model == "claude-haiku-4-5-20251001"
def test_selected_anthropic_provider_is_preferred_for_vision_auto(self, monkeypatch):
monkeypatch.setenv("OPENROUTER_API_KEY", "or-key")
monkeypatch.setenv("ANTHROPIC_API_KEY", "sk-ant-api03-key")
def fake_load_config():
return {"model": {"provider": "anthropic", "default": "claude-sonnet-4-6"}}
with (
patch("agent.auxiliary_client._read_nous_auth", return_value=None),
patch("agent.anthropic_adapter.build_anthropic_client", return_value=MagicMock()),
patch("agent.anthropic_adapter.resolve_anthropic_token", return_value="sk-ant-api03-key"),
patch("agent.auxiliary_client.OpenAI") as mock_openai,
patch("hermes_cli.config.load_config", fake_load_config),
):
client, model = get_vision_auxiliary_client()
assert client is not None
assert client.__class__.__name__ == "AnthropicAuxiliaryClient"
assert model == "claude-haiku-4-5-20251001"
def test_vision_auto_includes_codex(self, codex_auth_dir):
"""Codex supports vision (gpt-5.3-codex), so auto mode should use it."""
with patch("agent.auxiliary_client._read_nous_auth", return_value=None), \

View File

@@ -309,6 +309,57 @@ class TestRunJobConfigLogging:
f"Expected 'failed to parse prefill messages' warning in logs, got: {[r.message for r in caplog.records]}"
class TestRunJobPerJobOverrides:
def test_job_level_model_provider_and_base_url_overrides_are_used(self, tmp_path):
config_yaml = tmp_path / "config.yaml"
config_yaml.write_text(
"model:\n"
" default: gpt-5.4\n"
" provider: openai-codex\n"
" base_url: https://chatgpt.com/backend-api/codex\n"
)
job = {
"id": "briefing-job",
"name": "briefing",
"prompt": "hello",
"model": "perplexity/sonar-pro",
"provider": "custom",
"base_url": "http://127.0.0.1:4000/v1",
}
fake_db = MagicMock()
fake_runtime = {
"provider": "openrouter",
"api_mode": "chat_completions",
"base_url": "http://127.0.0.1:4000/v1",
"api_key": "***",
}
with patch("cron.scheduler._hermes_home", tmp_path), \
patch("cron.scheduler._resolve_origin", return_value=None), \
patch("dotenv.load_dotenv"), \
patch("hermes_state.SessionDB", return_value=fake_db), \
patch("hermes_cli.runtime_provider.resolve_runtime_provider", return_value=fake_runtime) as runtime_mock, \
patch("run_agent.AIAgent") as mock_agent_cls:
mock_agent = MagicMock()
mock_agent.run_conversation.return_value = {"final_response": "ok"}
mock_agent_cls.return_value = mock_agent
success, output, final_response, error = run_job(job)
assert success is True
assert error is None
assert final_response == "ok"
assert "ok" in output
runtime_mock.assert_called_once_with(
requested="custom",
explicit_base_url="http://127.0.0.1:4000/v1",
)
assert mock_agent_cls.call_args.kwargs["model"] == "perplexity/sonar-pro"
fake_db.close.assert_called_once()
class TestRunJobSkillBacked:
def test_run_job_loads_skill_and_disables_recursive_cron_tools(self, tmp_path):
job = {

View File

@@ -0,0 +1,53 @@
"""Gateway STT config tests — honor stt.enabled: false from config.yaml."""
from pathlib import Path
from unittest.mock import AsyncMock, patch
import pytest
import yaml
from gateway.config import GatewayConfig, load_gateway_config
def test_gateway_config_stt_disabled_from_dict_nested():
config = GatewayConfig.from_dict({"stt": {"enabled": False}})
assert config.stt_enabled is False
def test_load_gateway_config_bridges_stt_enabled_from_config_yaml(tmp_path, monkeypatch):
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
(hermes_home / "config.yaml").write_text(
yaml.dump({"stt": {"enabled": False}}),
encoding="utf-8",
)
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
monkeypatch.setattr(Path, "home", lambda: tmp_path)
config = load_gateway_config()
assert config.stt_enabled is False
@pytest.mark.asyncio
async def test_enrich_message_with_transcription_skips_when_stt_disabled():
from gateway.run import GatewayRunner
runner = GatewayRunner.__new__(GatewayRunner)
runner.config = GatewayConfig(stt_enabled=False)
with patch(
"tools.transcription_tools.transcribe_audio",
side_effect=AssertionError("transcribe_audio should not be called when STT is disabled"),
), patch(
"tools.transcription_tools.get_stt_model_from_config",
return_value=None,
):
result = await runner._enrich_message_with_transcription(
"caption",
["/tmp/voice.ogg"],
)
assert "transcription is disabled" in result.lower()
assert "caption" in result

View File

@@ -25,7 +25,11 @@ def test_nous_oauth_setup_keeps_current_model_when_syncing_disk_provider(
config = load_config()
prompt_choices = iter([0, 2])
# Provider selection always comes first. Depending on available vision
# backends, setup may either skip the optional vision step or prompt for
# it before the default-model choice. Provide enough selections for both
# paths while still ending on "keep current model".
prompt_choices = iter([0, 2, 2])
monkeypatch.setattr(
"hermes_cli.setup.prompt_choice",
lambda *args, **kwargs: next(prompt_choices),

View File

@@ -111,6 +111,7 @@ def test_setup_keep_current_config_provider_uses_provider_specific_model_menu(tm
monkeypatch.setattr("hermes_cli.auth.get_active_provider", lambda: None)
monkeypatch.setattr("hermes_cli.auth.detect_external_credentials", lambda: [])
monkeypatch.setattr("hermes_cli.models.provider_model_ids", lambda provider: [])
monkeypatch.setattr("agent.auxiliary_client.get_available_vision_backends", lambda: [])
setup_model_provider(config)
save_config(config)
@@ -149,6 +150,7 @@ def test_setup_keep_current_anthropic_can_configure_openai_vision_default(tmp_pa
monkeypatch.setattr("hermes_cli.auth.get_active_provider", lambda: None)
monkeypatch.setattr("hermes_cli.auth.detect_external_credentials", lambda: [])
monkeypatch.setattr("hermes_cli.models.provider_model_ids", lambda provider: [])
monkeypatch.setattr("agent.auxiliary_client.get_available_vision_backends", lambda: [])
setup_model_provider(config)
env = _read_env(tmp_path)
@@ -224,3 +226,17 @@ def test_setup_summary_marks_codex_auth_as_vision_available(tmp_path, monkeypatc
assert "missing run 'hermes setup' to configure" not in output
assert "Mixture of Agents" in output
assert "missing OPENROUTER_API_KEY" in output
def test_setup_summary_marks_anthropic_auth_as_vision_available(tmp_path, monkeypatch, capsys):
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
_clear_provider_env(monkeypatch)
monkeypatch.setenv("ANTHROPIC_API_KEY", "sk-ant-api03-key")
monkeypatch.setattr("shutil.which", lambda _name: None)
monkeypatch.setattr("agent.auxiliary_client.get_available_vision_backends", lambda: ["anthropic"])
_print_setup_summary(load_config(), tmp_path)
output = capsys.readouterr().out
assert "Vision (image analysis)" in output
assert "missing run 'hermes setup' to configure" not in output

View File

@@ -46,6 +46,20 @@ def test_stash_local_changes_if_needed_returns_specific_stash_commit(monkeypatch
assert calls[2][0][-3:] == ["rev-parse", "--verify", "refs/stash"]
def test_resolve_stash_selector_returns_matching_entry(monkeypatch, tmp_path):
def fake_run(cmd, **kwargs):
assert cmd == ["git", "stash", "list", "--format=%gd %H"]
return SimpleNamespace(
stdout="stash@{0} def456\nstash@{1} abc123\n",
returncode=0,
)
monkeypatch.setattr(hermes_main.subprocess, "run", fake_run)
assert hermes_main._resolve_stash_selector(["git"], tmp_path, "abc123") == "stash@{1}"
def test_restore_stashed_changes_prompts_before_applying(monkeypatch, tmp_path, capsys):
calls = []
@@ -53,6 +67,8 @@ def test_restore_stashed_changes_prompts_before_applying(monkeypatch, tmp_path,
calls.append((cmd, kwargs))
if cmd[1:3] == ["stash", "apply"]:
return SimpleNamespace(stdout="applied\n", stderr="", returncode=0)
if cmd[1:3] == ["stash", "list"]:
return SimpleNamespace(stdout="stash@{1} abc123\n", stderr="", returncode=0)
if cmd[1:3] == ["stash", "drop"]:
return SimpleNamespace(stdout="dropped\n", stderr="", returncode=0)
raise AssertionError(f"unexpected command: {cmd}")
@@ -64,7 +80,8 @@ def test_restore_stashed_changes_prompts_before_applying(monkeypatch, tmp_path,
assert restored is True
assert calls[0][0] == ["git", "stash", "apply", "abc123"]
assert calls[1][0] == ["git", "stash", "drop", "abc123"]
assert calls[1][0] == ["git", "stash", "list", "--format=%gd %H"]
assert calls[2][0] == ["git", "stash", "drop", "stash@{1}"]
out = capsys.readouterr().out
assert "Restore local changes now? [Y/n]" in out
assert "restored on top of the updated codebase" in out
@@ -99,6 +116,8 @@ def test_restore_stashed_changes_applies_without_prompt_when_disabled(monkeypatc
calls.append((cmd, kwargs))
if cmd[1:3] == ["stash", "apply"]:
return SimpleNamespace(stdout="applied\n", stderr="", returncode=0)
if cmd[1:3] == ["stash", "list"]:
return SimpleNamespace(stdout="stash@{0} abc123\n", stderr="", returncode=0)
if cmd[1:3] == ["stash", "drop"]:
return SimpleNamespace(stdout="dropped\n", stderr="", returncode=0)
raise AssertionError(f"unexpected command: {cmd}")
@@ -109,9 +128,64 @@ def test_restore_stashed_changes_applies_without_prompt_when_disabled(monkeypatc
assert restored is True
assert calls[0][0] == ["git", "stash", "apply", "abc123"]
assert calls[1][0] == ["git", "stash", "list", "--format=%gd %H"]
assert calls[2][0] == ["git", "stash", "drop", "stash@{0}"]
assert "Restore local changes now?" not in capsys.readouterr().out
def test_restore_stashed_changes_keeps_going_when_stash_entry_cannot_be_resolved(monkeypatch, tmp_path, capsys):
calls = []
def fake_run(cmd, **kwargs):
calls.append((cmd, kwargs))
if cmd[1:3] == ["stash", "apply"]:
return SimpleNamespace(stdout="applied\n", stderr="", returncode=0)
if cmd[1:3] == ["stash", "list"]:
return SimpleNamespace(stdout="stash@{0} def456\n", stderr="", returncode=0)
raise AssertionError(f"unexpected command: {cmd}")
monkeypatch.setattr(hermes_main.subprocess, "run", fake_run)
restored = hermes_main._restore_stashed_changes(["git"], tmp_path, "abc123", prompt_user=False)
assert restored is True
assert calls == [
(["git", "stash", "apply", "abc123"], {"cwd": tmp_path, "capture_output": True, "text": True}),
(["git", "stash", "list", "--format=%gd %H"], {"cwd": tmp_path, "capture_output": True, "text": True, "check": True}),
]
out = capsys.readouterr().out
assert "couldn't find the stash entry to drop" in out
assert "stash was left in place" in out
assert "Look for commit abc123" in out
def test_restore_stashed_changes_keeps_going_when_drop_fails(monkeypatch, tmp_path, capsys):
calls = []
def fake_run(cmd, **kwargs):
calls.append((cmd, kwargs))
if cmd[1:3] == ["stash", "apply"]:
return SimpleNamespace(stdout="applied\n", stderr="", returncode=0)
if cmd[1:3] == ["stash", "list"]:
return SimpleNamespace(stdout="stash@{0} abc123\n", stderr="", returncode=0)
if cmd[1:3] == ["stash", "drop"]:
return SimpleNamespace(stdout="", stderr="drop failed\n", returncode=1)
raise AssertionError(f"unexpected command: {cmd}")
monkeypatch.setattr(hermes_main.subprocess, "run", fake_run)
restored = hermes_main._restore_stashed_changes(["git"], tmp_path, "abc123", prompt_user=False)
assert restored is True
assert calls[2][0] == ["git", "stash", "drop", "stash@{0}"]
out = capsys.readouterr().out
assert "couldn't drop the saved stash entry" in out
assert "drop failed" in out
assert "git stash drop stash@{0}" in out
def test_restore_stashed_changes_exits_cleanly_when_apply_fails(monkeypatch, tmp_path, capsys):
calls = []

View File

@@ -0,0 +1,203 @@
"""Regression tests for Google Workspace OAuth setup.
These tests cover the headless/manual auth-code flow where the browser step and
code exchange happen in separate process invocations.
"""
import importlib.util
import json
import sys
import types
from pathlib import Path
import pytest
SCRIPT_PATH = (
Path(__file__).resolve().parents[2]
/ "skills/productivity/google-workspace/scripts/setup.py"
)
class FakeCredentials:
def __init__(self, payload=None):
self._payload = payload or {
"token": "access-token",
"refresh_token": "refresh-token",
"token_uri": "https://oauth2.googleapis.com/token",
"client_id": "client-id",
"client_secret": "client-secret",
"scopes": ["scope-a"],
}
def to_json(self):
return json.dumps(self._payload)
class FakeFlow:
created = []
default_state = "generated-state"
default_verifier = "generated-code-verifier"
credentials_payload = None
fetch_error = None
def __init__(
self,
client_secrets_file,
scopes,
*,
redirect_uri=None,
state=None,
code_verifier=None,
autogenerate_code_verifier=False,
):
self.client_secrets_file = client_secrets_file
self.scopes = scopes
self.redirect_uri = redirect_uri
self.state = state
self.code_verifier = code_verifier
self.autogenerate_code_verifier = autogenerate_code_verifier
self.authorization_kwargs = None
self.fetch_token_calls = []
self.credentials = FakeCredentials(self.credentials_payload)
if autogenerate_code_verifier and not self.code_verifier:
self.code_verifier = self.default_verifier
if not self.state:
self.state = self.default_state
@classmethod
def reset(cls):
cls.created = []
cls.default_state = "generated-state"
cls.default_verifier = "generated-code-verifier"
cls.credentials_payload = None
cls.fetch_error = None
@classmethod
def from_client_secrets_file(cls, client_secrets_file, scopes, **kwargs):
inst = cls(client_secrets_file, scopes, **kwargs)
cls.created.append(inst)
return inst
def authorization_url(self, **kwargs):
self.authorization_kwargs = kwargs
return f"https://auth.example/authorize?state={self.state}", self.state
def fetch_token(self, **kwargs):
self.fetch_token_calls.append(kwargs)
if self.fetch_error:
raise self.fetch_error
@pytest.fixture
def setup_module(monkeypatch, tmp_path):
FakeFlow.reset()
google_auth_module = types.ModuleType("google_auth_oauthlib")
flow_module = types.ModuleType("google_auth_oauthlib.flow")
flow_module.Flow = FakeFlow
google_auth_module.flow = flow_module
monkeypatch.setitem(sys.modules, "google_auth_oauthlib", google_auth_module)
monkeypatch.setitem(sys.modules, "google_auth_oauthlib.flow", flow_module)
spec = importlib.util.spec_from_file_location("google_workspace_setup_test", SCRIPT_PATH)
module = importlib.util.module_from_spec(spec)
assert spec.loader is not None
spec.loader.exec_module(module)
monkeypatch.setattr(module, "_ensure_deps", lambda: None)
monkeypatch.setattr(module, "CLIENT_SECRET_PATH", tmp_path / "google_client_secret.json")
monkeypatch.setattr(module, "TOKEN_PATH", tmp_path / "google_token.json")
monkeypatch.setattr(module, "PENDING_AUTH_PATH", tmp_path / "google_oauth_pending.json", raising=False)
client_secret = {
"installed": {
"client_id": "client-id",
"client_secret": "client-secret",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://oauth2.googleapis.com/token",
}
}
module.CLIENT_SECRET_PATH.write_text(json.dumps(client_secret))
return module
class TestGetAuthUrl:
def test_persists_state_and_code_verifier_for_later_exchange(self, setup_module, capsys):
setup_module.get_auth_url()
out = capsys.readouterr().out.strip()
assert out == "https://auth.example/authorize?state=generated-state"
saved = json.loads(setup_module.PENDING_AUTH_PATH.read_text())
assert saved["state"] == "generated-state"
assert saved["code_verifier"] == "generated-code-verifier"
flow = FakeFlow.created[-1]
assert flow.autogenerate_code_verifier is True
assert flow.authorization_kwargs == {"access_type": "offline", "prompt": "consent"}
class TestExchangeAuthCode:
def test_reuses_saved_pkce_material_for_plain_code(self, setup_module):
setup_module.PENDING_AUTH_PATH.write_text(
json.dumps({"state": "saved-state", "code_verifier": "saved-verifier"})
)
setup_module.exchange_auth_code("4/test-auth-code")
flow = FakeFlow.created[-1]
assert flow.state == "saved-state"
assert flow.code_verifier == "saved-verifier"
assert flow.fetch_token_calls == [{"code": "4/test-auth-code"}]
assert json.loads(setup_module.TOKEN_PATH.read_text())["token"] == "access-token"
assert not setup_module.PENDING_AUTH_PATH.exists()
def test_extracts_code_from_redirect_url_and_checks_state(self, setup_module):
setup_module.PENDING_AUTH_PATH.write_text(
json.dumps({"state": "saved-state", "code_verifier": "saved-verifier"})
)
setup_module.exchange_auth_code(
"http://localhost:1/?code=4/extracted-code&state=saved-state&scope=gmail"
)
flow = FakeFlow.created[-1]
assert flow.fetch_token_calls == [{"code": "4/extracted-code"}]
def test_rejects_state_mismatch(self, setup_module, capsys):
setup_module.PENDING_AUTH_PATH.write_text(
json.dumps({"state": "saved-state", "code_verifier": "saved-verifier"})
)
with pytest.raises(SystemExit):
setup_module.exchange_auth_code(
"http://localhost:1/?code=4/extracted-code&state=wrong-state"
)
out = capsys.readouterr().out
assert "state mismatch" in out.lower()
assert not setup_module.TOKEN_PATH.exists()
def test_requires_pending_auth_session(self, setup_module, capsys):
with pytest.raises(SystemExit):
setup_module.exchange_auth_code("4/test-auth-code")
out = capsys.readouterr().out
assert "run --auth-url first" in out.lower()
assert not setup_module.TOKEN_PATH.exists()
def test_keeps_pending_auth_session_when_exchange_fails(self, setup_module, capsys):
setup_module.PENDING_AUTH_PATH.write_text(
json.dumps({"state": "saved-state", "code_verifier": "saved-verifier"})
)
FakeFlow.fetch_error = Exception("invalid_grant: Missing code verifier")
with pytest.raises(SystemExit):
setup_module.exchange_auth_code("4/test-auth-code")
out = capsys.readouterr().out
assert "token exchange failed" in out.lower()
assert setup_module.PENDING_AUTH_PATH.exists()
assert not setup_module.TOKEN_PATH.exists()
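Taken together, these tests pin the headless flow's contract: `--auth-url` persists `{"state", "code_verifier"}` to `PENDING_AUTH_PATH`, and `--auth-code` reloads that material so `Flow.fetch_token()` can finish the PKCE exchange in a separate process. A minimal sketch of the exchange side, inferred from the fixtures above (`SCOPES` and the exact messages are assumptions; the real `setup.py` may differ in detail):
from urllib.parse import parse_qs, urlparse
def exchange_auth_code(code_or_url: str) -> None:
    if not PENDING_AUTH_PATH.exists():
        print("No pending session found. Run --auth-url first.")
        raise SystemExit(1)
    pending = json.loads(PENDING_AUTH_PATH.read_text())
    code = code_or_url
    if "://" in code_or_url:  # the full redirect URL was pasted back
        params = parse_qs(urlparse(code_or_url).query)
        if params.get("state", [None])[0] != pending["state"]:
            print("OAuth state mismatch; refusing to exchange the code.")
            raise SystemExit(1)
        code = params["code"][0]
    # Rebuild the flow with the saved state and code verifier so PKCE
    # validation succeeds without disabling it.
    flow = Flow.from_client_secrets_file(
        str(CLIENT_SECRET_PATH), SCOPES,
        state=pending["state"], code_verifier=pending["code_verifier"],
    )
    try:
        flow.fetch_token(code=code)
    except Exception as exc:
        # Keep PENDING_AUTH_PATH so the user can retry the exchange.
        print(f"Token exchange failed: {exc}")
        raise SystemExit(1)
    TOKEN_PATH.write_text(flow.credentials.to_json())
    PENDING_AUTH_PATH.unlink()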

View File

@@ -16,6 +16,7 @@ from agent.anthropic_adapter import (
build_anthropic_kwargs,
convert_messages_to_anthropic,
convert_tools_to_anthropic,
get_anthropic_token_source,
is_claude_code_token_valid,
normalize_anthropic_response,
normalize_model_name,
@@ -87,16 +88,25 @@ class TestReadClaudeCodeCredentials:
cred_file.parent.mkdir(parents=True)
cred_file.write_text(json.dumps({
"claudeAiOauth": {
"accessToken": "sk-ant-oat01-test-token",
"refreshToken": "sk-ant-ort01-refresh",
"accessToken": "sk-ant-oat01-token",
"refreshToken": "sk-ant-oat01-refresh",
"expiresAt": int(time.time() * 1000) + 3600_000,
}
}))
monkeypatch.setattr("agent.anthropic_adapter.Path.home", lambda: tmp_path)
creds = read_claude_code_credentials()
assert creds is not None
assert creds["accessToken"] == "sk-ant-oat01-test-token"
assert creds["refreshToken"] == "sk-ant-ort01-refresh"
assert creds["accessToken"] == "sk-ant-oat01-token"
assert creds["refreshToken"] == "sk-ant-oat01-refresh"
assert creds["source"] == "claude_code_credentials_file"
def test_ignores_primary_api_key_for_native_anthropic_resolution(self, tmp_path, monkeypatch):
claude_json = tmp_path / ".claude.json"
claude_json.write_text(json.dumps({"primaryApiKey": "sk-ant-api03-primary"}))
monkeypatch.setattr("agent.anthropic_adapter.Path.home", lambda: tmp_path)
creds = read_claude_code_credentials()
assert creds is None
def test_returns_none_for_missing_file(self, tmp_path, monkeypatch):
monkeypatch.setattr("agent.anthropic_adapter.Path.home", lambda: tmp_path)
@@ -139,6 +149,24 @@ class TestResolveAnthropicToken:
monkeypatch.setenv("ANTHROPIC_TOKEN", "sk-ant-oat01-mytoken")
assert resolve_anthropic_token() == "sk-ant-oat01-mytoken"
def test_reports_claude_json_primary_key_source(self, monkeypatch, tmp_path):
monkeypatch.delenv("ANTHROPIC_API_KEY", raising=False)
monkeypatch.delenv("ANTHROPIC_TOKEN", raising=False)
monkeypatch.delenv("CLAUDE_CODE_OAUTH_TOKEN", raising=False)
(tmp_path / ".claude.json").write_text(json.dumps({"primaryApiKey": "sk-ant-api03-primary"}))
monkeypatch.setattr("agent.anthropic_adapter.Path.home", lambda: tmp_path)
assert get_anthropic_token_source("sk-ant-api03-primary") == "claude_json_primary_api_key"
def test_does_not_resolve_primary_api_key_as_native_anthropic_token(self, monkeypatch, tmp_path):
monkeypatch.delenv("ANTHROPIC_API_KEY", raising=False)
monkeypatch.delenv("ANTHROPIC_TOKEN", raising=False)
monkeypatch.delenv("CLAUDE_CODE_OAUTH_TOKEN", raising=False)
(tmp_path / ".claude.json").write_text(json.dumps({"primaryApiKey": "sk-ant-api03-primary"}))
monkeypatch.setattr("agent.anthropic_adapter.Path.home", lambda: tmp_path)
assert resolve_anthropic_token() is None
def test_falls_back_to_api_key_when_no_oauth_sources_exist(self, monkeypatch, tmp_path):
monkeypatch.setenv("ANTHROPIC_API_KEY", "sk-ant-api03-mykey")
monkeypatch.delenv("ANTHROPIC_TOKEN", raising=False)
@@ -567,6 +595,56 @@ class TestConvertMessages:
assert tool_block["content"] == "result"
assert tool_block["cache_control"] == {"type": "ephemeral"}
def test_converts_data_url_image_to_anthropic_image_block(self):
messages = [
{
"role": "user",
"content": [
{"type": "text", "text": "Describe this image"},
{
"type": "image_url",
"image_url": {"url": "data:image/png;base64,ZmFrZQ=="},
},
],
}
]
_, result = convert_messages_to_anthropic(messages)
blocks = result[0]["content"]
assert blocks[0] == {"type": "text", "text": "Describe this image"}
assert blocks[1] == {
"type": "image",
"source": {
"type": "base64",
"media_type": "image/png",
"data": "ZmFrZQ==",
},
}
def test_converts_remote_image_url_to_anthropic_image_block(self):
messages = [
{
"role": "user",
"content": [
{"type": "text", "text": "Describe this image"},
{
"type": "image_url",
"image_url": {"url": "https://example.com/cat.png"},
},
],
}
]
_, result = convert_messages_to_anthropic(messages)
blocks = result[0]["content"]
assert blocks[1] == {
"type": "image",
"source": {
"type": "url",
"url": "https://example.com/cat.png",
},
}
def test_empty_cached_assistant_tool_turn_converts_without_empty_text_block(self):
messages = apply_anthropic_cache_control([
{"role": "system", "content": "System prompt"},

View File

@@ -2533,3 +2533,56 @@ class TestVprintForceOnErrors:
agent._vprint("debug")
agent._vprint("error", force=True)
assert len(printed) == 2
class TestNormalizeCodexDictArguments:
"""_normalize_codex_response must produce valid JSON strings for tool
call arguments, even when the Responses API returns them as dicts."""
def _make_codex_response(self, item_type, arguments, item_status="completed"):
"""Build a minimal Responses API response with a single tool call."""
item = SimpleNamespace(
type=item_type,
status=item_status,
)
if item_type == "function_call":
item.name = "web_search"
item.arguments = arguments
item.call_id = "call_abc123"
item.id = "fc_abc123"
elif item_type == "custom_tool_call":
item.name = "web_search"
item.input = arguments
item.call_id = "call_abc123"
item.id = "fc_abc123"
return SimpleNamespace(
output=[item],
status="completed",
)
def test_function_call_dict_arguments_produce_valid_json(self, agent):
"""dict arguments from function_call must be serialised with
json.dumps, not str(), so downstream json.loads() succeeds."""
args_dict = {"query": "weather in NYC", "units": "celsius"}
response = self._make_codex_response("function_call", args_dict)
msg, _ = agent._normalize_codex_response(response)
tc = msg.tool_calls[0]
parsed = json.loads(tc.function.arguments)
assert parsed == args_dict
def test_custom_tool_call_dict_arguments_produce_valid_json(self, agent):
"""dict arguments from custom_tool_call must also use json.dumps."""
args_dict = {"path": "/tmp/test.txt", "content": "hello"}
response = self._make_codex_response("custom_tool_call", args_dict)
msg, _ = agent._normalize_codex_response(response)
tc = msg.tool_calls[0]
parsed = json.loads(tc.function.arguments)
assert parsed == args_dict
def test_string_arguments_unchanged(self, agent):
"""String arguments must pass through without modification."""
args_str = '{"query": "test"}'
response = self._make_codex_response("function_call", args_str)
msg, _ = agent._normalize_codex_response(response)
tc = msg.tool_calls[0]
assert tc.function.arguments == args_str
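The distinction these tests enforce matters because Python's str() renders dicts with single quotes, which is not valid JSON. A quick illustration:
import json
args = {"query": "test"}
str(args)         # "{'query': 'test'}"  -> json.loads() raises JSONDecodeError
json.dumps(args)  # '{"query": "test"}'  -> round-trips cleanly
assert json.loads(json.dumps(args)) == args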

View File

@@ -456,3 +456,20 @@ class TestViewFullCommand:
# After first 'v', is_truncated becomes False, so second 'v' -> deny
assert result == "deny"
class TestForkBombDetection:
"""The fork bomb regex must match the classic :(){ :|:& };: pattern."""
def test_classic_fork_bomb(self):
dangerous, key, desc = detect_dangerous_command(":(){ :|:& };:")
assert dangerous is True, "classic fork bomb not detected"
assert "fork bomb" in desc.lower()
def test_fork_bomb_with_spaces(self):
dangerous, key, desc = detect_dangerous_command(":() { : | :& } ; :")
assert dangerous is True, "fork bomb with extra spaces not detected"
def test_colon_in_safe_command_not_flagged(self):
dangerous, key, desc = detect_dangerous_command("echo hello:world")
assert dangerous is False

View File

@@ -137,6 +137,22 @@ class TestScheduleCronjob:
))
assert result["repeat"] == "5 times"
def test_schedule_persists_runtime_overrides(self):
result = json.loads(schedule_cronjob(
prompt="Pinned job",
schedule="every 1h",
model="anthropic/claude-sonnet-4",
provider="custom",
base_url="http://127.0.0.1:4000/v1/",
))
assert result["success"] is True
listing = json.loads(list_cronjobs())
job = listing["jobs"][0]
assert job["model"] == "anthropic/claude-sonnet-4"
assert job["provider"] == "custom"
assert job["base_url"] == "http://127.0.0.1:4000/v1"
# =========================================================================
# list_cronjobs
@@ -249,6 +265,33 @@ class TestUnifiedCronjobTool:
assert updated["job"]["name"] == "New Name"
assert updated["job"]["schedule"] == "every 120m"
def test_update_runtime_overrides_can_set_and_clear(self):
created = json.loads(
cronjob(
action="create",
prompt="Check",
schedule="every 1h",
model="anthropic/claude-sonnet-4",
provider="custom",
base_url="http://127.0.0.1:4000/v1",
)
)
job_id = created["job_id"]
updated = json.loads(
cronjob(
action="update",
job_id=job_id,
model="openai/gpt-4.1",
provider="openrouter",
base_url="",
)
)
assert updated["success"] is True
assert updated["job"]["model"] == "openai/gpt-4.1"
assert updated["job"]["provider"] == "openrouter"
assert updated["job"]["base_url"] is None
def test_create_skill_backed_job(self):
result = json.loads(
cronjob(

View File

@@ -91,6 +91,25 @@ class TestProviderEnvBlocklist:
for var in registry_vars:
assert var not in result_env, f"{var} leaked into subprocess env"
def test_non_registry_provider_vars_are_stripped(self):
"""Extra provider vars not in PROVIDER_REGISTRY must also be blocked."""
extra_provider_vars = {
"GOOGLE_API_KEY": "google-key",
"DEEPSEEK_API_KEY": "deepseek-key",
"MISTRAL_API_KEY": "mistral-key",
"GROQ_API_KEY": "groq-key",
"TOGETHER_API_KEY": "together-key",
"PERPLEXITY_API_KEY": "perplexity-key",
"COHERE_API_KEY": "cohere-key",
"FIREWORKS_API_KEY": "fireworks-key",
"XAI_API_KEY": "xai-key",
"HELICONE_API_KEY": "helicone-key",
}
result_env = _run_with_env(extra_os_env=extra_provider_vars)
for var in extra_provider_vars:
assert var not in result_env, f"{var} leaked into subprocess env"
def test_safe_vars_are_preserved(self):
"""Standard env vars (PATH, HOME, USER) must still be passed through."""
result_env = _run_with_env()
@@ -171,3 +190,18 @@ class TestBlocklistCoverage:
must also be in the blocklist."""
extras = {"ANTHROPIC_TOKEN", "CLAUDE_CODE_OAUTH_TOKEN"}
assert extras.issubset(_HERMES_PROVIDER_ENV_BLOCKLIST)
def test_non_registry_provider_vars_are_in_blocklist(self):
extras = {
"GOOGLE_API_KEY",
"DEEPSEEK_API_KEY",
"MISTRAL_API_KEY",
"GROQ_API_KEY",
"TOGETHER_API_KEY",
"PERPLEXITY_API_KEY",
"COHERE_API_KEY",
"FIREWORKS_API_KEY",
"XAI_API_KEY",
"HELICONE_API_KEY",
}
assert extras.issubset(_HERMES_PROVIDER_ENV_BLOCKLIST)

View File

@@ -59,6 +59,10 @@ class TestGetProvider:
from tools.transcription_tools import _get_provider
assert _get_provider({}) == "local"
def test_disabled_config_returns_none(self):
from tools.transcription_tools import _get_provider
assert _get_provider({"enabled": False, "provider": "openai"}) == "none"
# ---------------------------------------------------------------------------
# File validation
@@ -217,6 +221,18 @@ class TestTranscribeAudio:
assert result["success"] is False
assert "No STT provider" in result["error"]
def test_disabled_config_returns_disabled_error(self, tmp_path):
audio_file = tmp_path / "test.ogg"
audio_file.write_bytes(b"fake audio")
with patch("tools.transcription_tools._load_stt_config", return_value={"enabled": False}), \
patch("tools.transcription_tools._get_provider", return_value="none"):
from tools.transcription_tools import transcribe_audio
result = transcribe_audio(str(audio_file))
assert result["success"] is False
assert "disabled" in result["error"].lower()
def test_invalid_file_returns_error(self):
from tools.transcription_tools import transcribe_audio
result = transcribe_audio("/nonexistent/file.ogg")

View File

@@ -38,7 +38,7 @@ DANGEROUS_PATTERNS = [
(r'\bsystemctl\s+(stop|disable|mask)\b', "stop/disable system service"),
(r'\bkill\s+-9\s+-1\b', "kill all processes"),
(r'\bpkill\s+-9\b', "force kill processes"),
(r':()\s*{\s*:\s*\|\s*:&\s*}\s*;:', "fork bomb"),
(r':\(\)\s*\{\s*:\s*\|\s*:\s*&\s*\}\s*;\s*:', "fork bomb"),
(r'\b(bash|sh|zsh)\s+-c\s+', "shell command via -c flag"),
(r'\b(python[23]?|perl|ruby|node)\s+-[ec]\s+', "script execution via -e/-c flag"),
(r'\b(curl|wget)\b.*\|\s*(ba)?sh\b', "pipe remote content to shell"),
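The escaping matters because in a Python regex `()` is an empty capturing group, not literal parentheses, so the old pattern could never match the literal `()` in the payload. A quick check:
import re
old = r':()\s*{\s*:\s*\|\s*:&\s*}\s*;:'            # '()' matches the empty string
new = r':\(\)\s*\{\s*:\s*\|\s*:\s*&\s*\}\s*;\s*:'  # parens and braces escaped
bomb = ":(){ :|:& };:"
assert re.search(old, bomb) is None  # unescaped pattern misses the classic fork bomb
assert re.search(new, bomb)          # escaped pattern matches it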

View File

@@ -103,6 +103,16 @@ def _canonical_skills(skill: Optional[str] = None, skills: Optional[Any] = None)
def _normalize_optional_job_value(value: Optional[Any], *, strip_trailing_slash: bool = False) -> Optional[str]:
if value is None:
return None
text = str(value).strip()
if strip_trailing_slash:
text = text.rstrip("/")
return text or None
def _format_job(job: Dict[str, Any]) -> Dict[str, Any]:
prompt = job.get("prompt", "")
skills = _canonical_skills(job.get("skill"), job.get("skills"))
@@ -112,6 +122,9 @@ def _format_job(job: Dict[str, Any]) -> Dict[str, Any]:
"skill": skills[0] if skills else None,
"skills": skills,
"prompt_preview": prompt[:100] + "..." if len(prompt) > 100 else prompt,
"model": job.get("model"),
"provider": job.get("provider"),
"base_url": job.get("base_url"),
"schedule": job.get("schedule_display"),
"repeat": _repeat_display(job),
"deliver": job.get("deliver", "local"),
@@ -136,6 +149,9 @@ def cronjob(
include_disabled: bool = False,
skill: Optional[str] = None,
skills: Optional[List[str]] = None,
model: Optional[str] = None,
provider: Optional[str] = None,
base_url: Optional[str] = None,
reason: Optional[str] = None,
task_id: str = None,
) -> str:
@@ -164,6 +180,9 @@ def cronjob(
deliver=deliver,
origin=_origin_from_env(),
skills=canonical_skills,
model=_normalize_optional_job_value(model),
provider=_normalize_optional_job_value(provider),
base_url=_normalize_optional_job_value(base_url, strip_trailing_slash=True),
)
return json.dumps(
{
@@ -240,6 +259,12 @@ def cronjob(
canonical_skills = _canonical_skills(skill, skills)
updates["skills"] = canonical_skills
updates["skill"] = canonical_skills[0] if canonical_skills else None
if model is not None:
updates["model"] = _normalize_optional_job_value(model)
if provider is not None:
updates["provider"] = _normalize_optional_job_value(provider)
if base_url is not None:
updates["base_url"] = _normalize_optional_job_value(base_url, strip_trailing_slash=True)
if repeat is not None:
repeat_state = dict(job.get("repeat") or {})
repeat_state["times"] = repeat
@@ -272,6 +297,9 @@ def schedule_cronjob(
name: Optional[str] = None,
repeat: Optional[int] = None,
deliver: Optional[str] = None,
model: Optional[str] = None,
provider: Optional[str] = None,
base_url: Optional[str] = None,
task_id: str = None,
) -> str:
return cronjob(
@@ -281,6 +309,9 @@ def schedule_cronjob(
name=name,
repeat=repeat,
deliver=deliver,
model=model,
provider=provider,
base_url=base_url,
task_id=task_id,
)
@@ -343,6 +374,18 @@ Important safety rule: cron-run sessions should not recursively schedule more cr
"type": "string",
"description": "Delivery target: origin, local, telegram, discord, signal, or platform:chat_id"
},
"model": {
"type": "string",
"description": "Optional per-job model override used when the cron job runs"
},
"provider": {
"type": "string",
"description": "Optional per-job provider override used when resolving runtime credentials"
},
"base_url": {
"type": "string",
"description": "Optional per-job base URL override paired with provider/model routing"
},
"include_disabled": {
"type": "boolean",
"description": "For list: include paused/completed jobs"
@@ -407,6 +450,9 @@ registry.register(
include_disabled=args.get("include_disabled", False),
skill=args.get("skill"),
skills=args.get("skills"),
model=args.get("model"),
provider=args.get("provider"),
base_url=args.get("base_url"),
reason=args.get("reason"),
task_id=kw.get("task_id"),
),
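For reference, the normaliser's behaviour as exercised by the tests above; the results follow directly from the implementation shown:
_normalize_optional_job_value(None)        # None
_normalize_optional_job_value("   ")       # None - blank strings clear an override
_normalize_optional_job_value(" custom ")  # "custom"
_normalize_optional_job_value("http://127.0.0.1:4000/v1/",
                              strip_trailing_slash=True)  # "http://127.0.0.1:4000/v1"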

View File

@@ -56,6 +56,17 @@ def _build_provider_env_blocklist() -> frozenset:
"ANTHROPIC_TOKEN", # OAuth token (not in registry as env var)
"CLAUDE_CODE_OAUTH_TOKEN",
"LLM_MODEL",
# Expanded isolation for other major providers (Issue #1002)
"GOOGLE_API_KEY", # Gemini / Google AI Studio
"DEEPSEEK_API_KEY", # DeepSeek
"MISTRAL_API_KEY", # Mistral AI
"GROQ_API_KEY", # Groq
"TOGETHER_API_KEY", # Together AI
"PERPLEXITY_API_KEY", # Perplexity
"COHERE_API_KEY", # Cohere
"FIREWORKS_API_KEY", # Fireworks AI
"XAI_API_KEY", # xAI (Grok)
"HELICONE_API_KEY", # LLM Observability proxy
})
return frozenset(blocked)
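The consuming side is not shown in this diff; a minimal sketch of how such a blocklist is typically applied when building a subprocess environment (`sanitized_env` is a hypothetical helper name, not part of this change):
import os
def sanitized_env() -> dict:
    # Copy the parent environment, dropping every blocklisted provider
    # variable so spawned terminals cannot see the agent's credentials.
    blocklist = _build_provider_env_blocklist()
    return {k: v for k, v in os.environ.items() if k not in blocklist}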

View File

@@ -93,6 +93,18 @@ def _load_stt_config() -> dict:
return {}
def is_stt_enabled(stt_config: Optional[dict] = None) -> bool:
"""Return whether STT is enabled in config."""
if stt_config is None:
stt_config = _load_stt_config()
enabled = stt_config.get("enabled", True)
if isinstance(enabled, str):
return enabled.strip().lower() in ("true", "1", "yes", "on")
if enabled is None:
return True
return bool(enabled)
def _get_provider(stt_config: dict) -> str:
"""Determine which STT provider to use.
@@ -101,6 +113,9 @@ def _get_provider(stt_config: dict) -> str:
2. Auto-detect: local > groq (free) > openai (paid)
3. Disabled (returns "none")
"""
if not is_stt_enabled(stt_config):
return "none"
provider = stt_config.get("provider", DEFAULT_PROVIDER)
if provider == "local":
@@ -334,6 +349,13 @@ def transcribe_audio(file_path: str, model: Optional[str] = None) -> Dict[str, A
# Load config and determine provider
stt_config = _load_stt_config()
if not is_stt_enabled(stt_config):
return {
"success": False,
"transcript": "",
"error": "STT is disabled in config.yaml (stt.enabled: false).",
}
provider = _get_provider(stt_config)
if provider == "local":

View File

@@ -3,7 +3,8 @@
Vision Tools Module
This module provides vision analysis tools that work with image URLs.
Uses Gemini 3 Flash Preview via OpenRouter API for intelligent image understanding.
Uses the centralized auxiliary vision router, which can select OpenRouter,
Nous, Codex, native Anthropic, or a custom OpenAI-compatible endpoint.
Available tools:
- vision_analyze_tool: Analyze images from URLs with custom prompts
@@ -409,7 +410,7 @@ if __name__ == "__main__":
if not api_available:
print("❌ No auxiliary vision model available")
print("Set OPENROUTER_API_KEY or configure Nous Portal to enable vision tools.")
print("Configure a supported multimodal backend (OpenRouter, Nous, Codex, Anthropic, or a custom OpenAI-compatible endpoint).")
exit(1)
else:
print("✅ Vision model available")

View File

@@ -703,10 +703,11 @@ def check_voice_requirements() -> Dict[str, Any]:
``missing_packages``, and ``details``.
"""
# Determine STT provider availability
from tools.transcription_tools import _get_provider, _load_stt_config, _HAS_FASTER_WHISPER
from tools.transcription_tools import _get_provider, _load_stt_config, is_stt_enabled, _HAS_FASTER_WHISPER
stt_config = _load_stt_config()
stt_enabled = is_stt_enabled(stt_config)
stt_provider = _get_provider(stt_config)
stt_available = stt_provider != "none"
stt_available = stt_enabled and stt_provider != "none"
missing: List[str] = []
has_audio = _audio_available()
@@ -725,7 +726,9 @@ def check_voice_requirements() -> Dict[str, Any]:
else:
details_parts.append("Audio capture: MISSING (pip install sounddevice numpy)")
if stt_provider == "local":
if not stt_enabled:
details_parts.append("STT provider: DISABLED in config (stt.enabled: false)")
elif stt_provider == "local":
details_parts.append("STT provider: OK (local faster-whisper)")
elif stt_provider == "groq":
details_parts.append("STT provider: OK (Groq)")

View File

@@ -8,6 +8,21 @@ description: "Set up Hermes Agent as a Discord bot"
Hermes Agent integrates with Discord as a bot, letting you chat with your AI assistant through direct messages or server channels. The bot receives your messages, processes them through the Hermes Agent pipeline (including tool use, memory, and reasoning), and responds in real time. It supports text, voice messages, file attachments, and slash commands.
Before setup, here's the part most people want to know: how Hermes behaves once it's in your server.
## How Hermes Behaves
| Context | Behavior |
|---------|----------|
| **DMs** | Hermes responds to every message. No `@mention` needed. |
| **Server channels** | By default, Hermes only responds when you `@mention` it. If you post in a channel without mentioning it, Hermes ignores the message. |
| **Free-response channels** | You can make specific channels mention-free with `DISCORD_FREE_RESPONSE_CHANNELS`, or disable mentions globally with `DISCORD_REQUIRE_MENTION=false`. |
| **Threads** | Hermes replies in the same thread. Mention rules still apply unless that thread or its parent channel is configured as free-response. |
:::tip
If you want a normal shared bot channel where people can talk to Hermes without tagging it every time, add that channel to `DISCORD_FREE_RESPONSE_CHANNELS` (see the example after this tip).
:::
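For example, assuming `DISCORD_FREE_RESPONSE_CHANNELS` takes a comma-separated list of channel IDs (the IDs below are placeholders):
DISCORD_FREE_RESPONSE_CHANNELS="123456789012345678,987654321098765432"
Or, to drop the mention requirement everywhere:
DISCORD_REQUIRE_MENTION=false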
This guide walks you through the full setup process — from creating your bot on Discord's Developer Portal to sending your first message.
## Step 1: Create a Discord Application
@@ -200,12 +215,6 @@ DISCORD_HOME_CHANNEL_NAME="#bot-updates"
Replace the ID with the actual channel ID (right-click → Copy Channel ID with Developer Mode on).
## Bot Behavior
- **Server channels**: By default the bot requires an `@mention` before it responds in server channels. You can disable that globally with `DISCORD_REQUIRE_MENTION=false` or allow specific channels to be mention-free via `DISCORD_FREE_RESPONSE_CHANNELS`.
- **Direct messages**: DMs always work, even without the Message Content Intent enabled (Discord exempts DMs from this requirement). However, you should still enable the intent for server channel support.
- **Conversations**: Each channel or DM maintains its own conversation context.
## Voice Messages
Hermes Agent supports Discord voice messages:

View File

@@ -193,8 +193,8 @@ Understanding how Hermes behaves in different contexts:
| Context | Behavior |
|---------|----------|
| **DMs** | Bot responds to every message — no @mention needed |
| **Channels** | Bot **only responds when @mentioned** (e.g., `@Hermes Agent what time is it?`) |
| **Threads** | Bot replies in threads when the triggering message is in a thread |
| **Channels** | Bot **only responds when @mentioned** (e.g., `@Hermes Agent what time is it?`). In channels, Hermes replies in a thread attached to that message. |
| **Threads** | If you @mention Hermes inside an existing thread, it replies in that same thread. |
:::tip
In channels, always @mention the bot; a message that doesn't mention it is ignored.