feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
"""Anthropic Messages API adapter for Hermes Agent.
|
|
|
|
|
|
|
|
|
|
|
|
Translates between Hermes's internal OpenAI-style message format and
|
|
|
|
|
|
Anthropic's Messages API. Follows the same pattern as the codex_responses
|
|
|
|
|
|
adapter — all provider-specific logic is isolated here.
|
|
|
|
|
|
|
|
|
|
|
|
Auth supports:
|
|
|
|
|
|
- Regular API keys (sk-ant-api*) → x-api-key header
|
|
|
|
|
|
- OAuth setup-tokens (sk-ant-oat*) → Bearer auth + beta header
|
2026-03-12 17:23:09 -07:00
|
|
|
|
- Claude Code credentials (~/.claude.json or ~/.claude/.credentials.json) → Bearer auth
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
"""
|
|
|
|
|
|
|
2026-04-02 10:14:20 -07:00
|
|
|
|
import copy
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
import json
|
|
|
|
|
|
import logging
|
|
|
|
|
|
import os
|
|
|
|
|
|
from pathlib import Path
|
refactor: consolidate get_hermes_home() and parse_reasoning_effort() (#3062)
Centralizes two widely-duplicated patterns into hermes_constants.py:
1. get_hermes_home() — Path resolution for ~/.hermes (HERMES_HOME env var)
- Was copy-pasted inline across 30+ files as:
Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
- Now defined once in hermes_constants.py (zero-dependency module)
- hermes_cli/config.py re-exports it for backward compatibility
- Removed local wrapper functions in honcho_integration/client.py,
tools/website_policy.py, tools/tirith_security.py, hermes_cli/uninstall.py
2. parse_reasoning_effort() — Reasoning effort string validation
- Was copy-pasted in cli.py, gateway/run.py, cron/scheduler.py
- Same validation logic: check against (xhigh, high, medium, low, minimal, none)
- Now defined once in hermes_constants.py, called from all 3 locations
- Warning log for unknown values kept at call sites (context-specific)
31 files changed, net +31 lines (125 insertions, 94 deletions)
Full test suite: 6179 passed, 0 failed
2026-03-25 15:54:28 -07:00
|
|
|
|
|
|
|
|
|
|
from hermes_constants import get_hermes_home
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
from types import SimpleNamespace
|
|
|
|
|
|
from typing import Any, Dict, List, Optional, Tuple
|
2026-04-21 17:55:04 +08:00
|
|
|
|
from utils import normalize_proxy_env_vars
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
|
|
|
|
|
|
try:
|
|
|
|
|
|
import anthropic as _anthropic_sdk
|
|
|
|
|
|
except ImportError:
|
|
|
|
|
|
_anthropic_sdk = None # type: ignore[assignment]
|
|
|
|
|
|
|
|
|
|
|
|
logger = logging.getLogger(__name__)
|
|
|
|
|
|
|
|
|
|
|
|
THINKING_BUDGET = {"xhigh": 32000, "high": 16000, "medium": 8000, "low": 4000}
|
fix(agent): complete Claude Opus 4.7 API migration
Claude Opus 4.7 introduced several breaking API changes that the current
codebase partially handled but not completely. This patch finishes the
migration per the official migration guide at
https://platform.claude.com/docs/en/about-claude/models/migration-guide
Fixes NousResearch/hermes-agent#11137
Breaking-change coverage:
1. Adaptive thinking + output_config.effort — 4.7 is now recognized by
_supports_adaptive_thinking() (extends previous 4.6-only gate).
2. Sampling parameter stripping — 4.7 returns 400 for any non-default
temperature / top_p / top_k. build_anthropic_kwargs drops them as a
safety net; the OpenAI-protocol auxiliary path (_build_call_kwargs)
and AnthropicCompletionsAdapter.create() both early-exit before
setting temperature for 4.7+ models. This keeps flush_memories and
structured-JSON aux paths that hardcode temperature from 400ing
when the aux model is flipped to 4.7.
3. thinking.display = "summarized" — 4.7 defaults display to "omitted",
which silently hides reasoning text from Hermes's CLI activity feed
during long tool runs. Restoring "summarized" preserves 4.6 UX.
4. Effort level mapping — xhigh now maps to xhigh (was xhigh→max, which
silently over-efforted every coding/agentic request). max is now a
distinct ceiling per Anthropic's 5-level effort model.
5. New stop_reason values — refusal and model_context_window_exceeded
were silently collapsed to "stop" (end_turn) by the adapter's
stop_reason_map. Now mapped to "content_filter" and "length"
respectively, matching upstream finish-reason handling already in
bedrock_adapter.
6. Model catalogs — claude-opus-4-7 added to the Anthropic provider
list, anthropic/claude-opus-4.7 added at top of OpenRouter fallback
catalog (recommended), claude-opus-4-7 added to model_metadata
DEFAULT_CONTEXT_LENGTHS (1M, matching 4.6 per migration guide).
7. Prefill docstrings — run_agent.AIAgent and BatchRunner now document
that Anthropic Sonnet/Opus 4.6+ reject a trailing assistant-role
prefill (400).
8. Tests — 4 new tests in test_anthropic_adapter covering display
default, xhigh preservation, max on 4.7, refusal / context-overflow
stop_reason mapping, plus the sampling-param predicate. test_model_metadata
accepts 4.7 at 1M context.
Tested on macOS 15.5 (darwin). 119 tests pass in
tests/agent/test_anthropic_adapter.py, 1320 pass in tests/agent/.
2026-04-16 12:35:43 -05:00
|
|
|
|
# Hermes effort → Anthropic adaptive-thinking effort (output_config.effort).
|
|
|
|
|
|
# Anthropic exposes 5 levels on 4.7+: low, medium, high, xhigh, max.
|
fix(agent): downgrade xhigh→max on Anthropic pre-4.7 adaptive models
Regression from #11161 (Claude Opus 4.7 migration, commit 0517ac3e).
The Opus 4.7 migration changed `ADAPTIVE_EFFORT_MAP["xhigh"]` from "max"
(the pre-migration alias) to "xhigh" to preserve the new 4.7 effort level
as distinct from max. This is correct for 4.7, but Opus/Sonnet 4.6 only
expose 4 levels (low/medium/high/max) — sending "xhigh" there now 400s:
BadRequestError [HTTP 400]: This model does not support effort
level 'xhigh'. Supported levels: high, low, max, medium.
Users who set reasoning_effort=xhigh as their default (xhigh is the
recommended default for coding/agentic on 4.7 per the Anthropic migration
guide) now 400 every request the moment they switch back to a 4.6 model
via `/model` or config. Verified live against the Anthropic API on
`anthropic==0.94.0`.
Fix: make the mapping model-aware. Add `_supports_xhigh_effort()`
predicate (matches 4-7/4.7 substrings, mirroring the existing
`_supports_adaptive_thinking` / `_forbids_sampling_params` pattern).
On pre-4.7 adaptive models, downgrade xhigh→max (the strongest effort
those models accept, restoring pre-migration behavior). On 4.7+, keep
xhigh as a distinct level.
Per Anthropic's migration guide, xhigh is 4.7-only:
https://platform.claude.com/docs/en/about-claude/models/migration-guide
> Opus 4.7 effort levels: max, xhigh (new), high, medium, low.
> Opus 4.6 effort levels: max, high, medium, low.
SDK typing confirms: `anthropic.types.OutputConfigParam.effort: Literal[
"low", "medium", "high", "max"]` (v0.94.0 not yet updated for xhigh).
## Test plan
Verified live on macOS 15.5 / anthropic==0.94.0:
claude-opus-4-6 + effort=xhigh → output_config.effort=max → 200 OK
claude-opus-4-7 + effort=xhigh → output_config.effort=xhigh → 200 OK
claude-opus-4-6 + effort=max → output_config.effort=max → 200 OK
claude-opus-4-7 + effort=max → output_config.effort=max → 200 OK
`tests/agent/test_anthropic_adapter.py` — 120 pass (replaced 1 bugged
test that asserted the broken behavior, added 1 for 4.7 preservation).
Full adapter suite: 120 passed in 1.05s.
Broader suite (agent + run_agent + cli/gateway reasoning): 2140 passed
(2 pre-existing failures on clean upstream/main, unrelated).
## Platforms
Tested on macOS 15.5. No platform-specific code paths touched.
2026-04-16 13:51:42 -05:00
|
|
|
|
# Opus/Sonnet 4.6 only expose 4 levels: low, medium, high, max — no xhigh.
|
|
|
|
|
|
# We preserve xhigh as xhigh on 4.7+ (the recommended default for coding/
|
|
|
|
|
|
# agentic work) and downgrade it to max on pre-4.7 adaptive models (which
|
|
|
|
|
|
# is the strongest level they accept). "minimal" is a legacy alias that
|
|
|
|
|
|
# maps to low on every model. See:
|
fix(agent): complete Claude Opus 4.7 API migration
Claude Opus 4.7 introduced several breaking API changes that the current
codebase partially handled but not completely. This patch finishes the
migration per the official migration guide at
https://platform.claude.com/docs/en/about-claude/models/migration-guide
Fixes NousResearch/hermes-agent#11137
Breaking-change coverage:
1. Adaptive thinking + output_config.effort — 4.7 is now recognized by
_supports_adaptive_thinking() (extends previous 4.6-only gate).
2. Sampling parameter stripping — 4.7 returns 400 for any non-default
temperature / top_p / top_k. build_anthropic_kwargs drops them as a
safety net; the OpenAI-protocol auxiliary path (_build_call_kwargs)
and AnthropicCompletionsAdapter.create() both early-exit before
setting temperature for 4.7+ models. This keeps flush_memories and
structured-JSON aux paths that hardcode temperature from 400ing
when the aux model is flipped to 4.7.
3. thinking.display = "summarized" — 4.7 defaults display to "omitted",
which silently hides reasoning text from Hermes's CLI activity feed
during long tool runs. Restoring "summarized" preserves 4.6 UX.
4. Effort level mapping — xhigh now maps to xhigh (was xhigh→max, which
silently over-efforted every coding/agentic request). max is now a
distinct ceiling per Anthropic's 5-level effort model.
5. New stop_reason values — refusal and model_context_window_exceeded
were silently collapsed to "stop" (end_turn) by the adapter's
stop_reason_map. Now mapped to "content_filter" and "length"
respectively, matching upstream finish-reason handling already in
bedrock_adapter.
6. Model catalogs — claude-opus-4-7 added to the Anthropic provider
list, anthropic/claude-opus-4.7 added at top of OpenRouter fallback
catalog (recommended), claude-opus-4-7 added to model_metadata
DEFAULT_CONTEXT_LENGTHS (1M, matching 4.6 per migration guide).
7. Prefill docstrings — run_agent.AIAgent and BatchRunner now document
that Anthropic Sonnet/Opus 4.6+ reject a trailing assistant-role
prefill (400).
8. Tests — 4 new tests in test_anthropic_adapter covering display
default, xhigh preservation, max on 4.7, refusal / context-overflow
stop_reason mapping, plus the sampling-param predicate. test_model_metadata
accepts 4.7 at 1M context.
Tested on macOS 15.5 (darwin). 119 tests pass in
tests/agent/test_anthropic_adapter.py, 1320 pass in tests/agent/.
2026-04-16 12:35:43 -05:00
|
|
|
|
# https://platform.claude.com/docs/en/about-claude/models/migration-guide
|
2026-03-13 03:21:13 +01:00
|
|
|
|
ADAPTIVE_EFFORT_MAP = {
|
fix(agent): complete Claude Opus 4.7 API migration
Claude Opus 4.7 introduced several breaking API changes that the current
codebase partially handled but not completely. This patch finishes the
migration per the official migration guide at
https://platform.claude.com/docs/en/about-claude/models/migration-guide
Fixes NousResearch/hermes-agent#11137
Breaking-change coverage:
1. Adaptive thinking + output_config.effort — 4.7 is now recognized by
_supports_adaptive_thinking() (extends previous 4.6-only gate).
2. Sampling parameter stripping — 4.7 returns 400 for any non-default
temperature / top_p / top_k. build_anthropic_kwargs drops them as a
safety net; the OpenAI-protocol auxiliary path (_build_call_kwargs)
and AnthropicCompletionsAdapter.create() both early-exit before
setting temperature for 4.7+ models. This keeps flush_memories and
structured-JSON aux paths that hardcode temperature from 400ing
when the aux model is flipped to 4.7.
3. thinking.display = "summarized" — 4.7 defaults display to "omitted",
which silently hides reasoning text from Hermes's CLI activity feed
during long tool runs. Restoring "summarized" preserves 4.6 UX.
4. Effort level mapping — xhigh now maps to xhigh (was xhigh→max, which
silently over-efforted every coding/agentic request). max is now a
distinct ceiling per Anthropic's 5-level effort model.
5. New stop_reason values — refusal and model_context_window_exceeded
were silently collapsed to "stop" (end_turn) by the adapter's
stop_reason_map. Now mapped to "content_filter" and "length"
respectively, matching upstream finish-reason handling already in
bedrock_adapter.
6. Model catalogs — claude-opus-4-7 added to the Anthropic provider
list, anthropic/claude-opus-4.7 added at top of OpenRouter fallback
catalog (recommended), claude-opus-4-7 added to model_metadata
DEFAULT_CONTEXT_LENGTHS (1M, matching 4.6 per migration guide).
7. Prefill docstrings — run_agent.AIAgent and BatchRunner now document
that Anthropic Sonnet/Opus 4.6+ reject a trailing assistant-role
prefill (400).
8. Tests — 4 new tests in test_anthropic_adapter covering display
default, xhigh preservation, max on 4.7, refusal / context-overflow
stop_reason mapping, plus the sampling-param predicate. test_model_metadata
accepts 4.7 at 1M context.
Tested on macOS 15.5 (darwin). 119 tests pass in
tests/agent/test_anthropic_adapter.py, 1320 pass in tests/agent/.
2026-04-16 12:35:43 -05:00
|
|
|
|
"max": "max",
|
|
|
|
|
|
"xhigh": "xhigh",
|
|
|
|
|
|
"high": "high",
|
|
|
|
|
|
"medium": "medium",
|
|
|
|
|
|
"low": "low",
|
2026-03-13 03:21:13 +01:00
|
|
|
|
"minimal": "low",
|
|
|
|
|
|
}
|
|
|
|
|
|
|
fix(agent): downgrade xhigh→max on Anthropic pre-4.7 adaptive models
Regression from #11161 (Claude Opus 4.7 migration, commit 0517ac3e).
The Opus 4.7 migration changed `ADAPTIVE_EFFORT_MAP["xhigh"]` from "max"
(the pre-migration alias) to "xhigh" to preserve the new 4.7 effort level
as distinct from max. This is correct for 4.7, but Opus/Sonnet 4.6 only
expose 4 levels (low/medium/high/max) — sending "xhigh" there now 400s:
BadRequestError [HTTP 400]: This model does not support effort
level 'xhigh'. Supported levels: high, low, max, medium.
Users who set reasoning_effort=xhigh as their default (xhigh is the
recommended default for coding/agentic on 4.7 per the Anthropic migration
guide) now 400 every request the moment they switch back to a 4.6 model
via `/model` or config. Verified live against the Anthropic API on
`anthropic==0.94.0`.
Fix: make the mapping model-aware. Add `_supports_xhigh_effort()`
predicate (matches 4-7/4.7 substrings, mirroring the existing
`_supports_adaptive_thinking` / `_forbids_sampling_params` pattern).
On pre-4.7 adaptive models, downgrade xhigh→max (the strongest effort
those models accept, restoring pre-migration behavior). On 4.7+, keep
xhigh as a distinct level.
Per Anthropic's migration guide, xhigh is 4.7-only:
https://platform.claude.com/docs/en/about-claude/models/migration-guide
> Opus 4.7 effort levels: max, xhigh (new), high, medium, low.
> Opus 4.6 effort levels: max, high, medium, low.
SDK typing confirms: `anthropic.types.OutputConfigParam.effort: Literal[
"low", "medium", "high", "max"]` (v0.94.0 not yet updated for xhigh).
## Test plan
Verified live on macOS 15.5 / anthropic==0.94.0:
claude-opus-4-6 + effort=xhigh → output_config.effort=max → 200 OK
claude-opus-4-7 + effort=xhigh → output_config.effort=xhigh → 200 OK
claude-opus-4-6 + effort=max → output_config.effort=max → 200 OK
claude-opus-4-7 + effort=max → output_config.effort=max → 200 OK
`tests/agent/test_anthropic_adapter.py` — 120 pass (replaced 1 bugged
test that asserted the broken behavior, added 1 for 4.7 preservation).
Full adapter suite: 120 passed in 1.05s.
Broader suite (agent + run_agent + cli/gateway reasoning): 2140 passed
(2 pre-existing failures on clean upstream/main, unrelated).
## Platforms
Tested on macOS 15.5. No platform-specific code paths touched.
2026-04-16 13:51:42 -05:00
|
|
|
|
# Models that accept the "xhigh" output_config.effort level. Opus 4.7 added
|
|
|
|
|
|
# xhigh as a distinct level between high and max; older adaptive-thinking
|
|
|
|
|
|
# models (4.6) reject it with a 400. Keep this substring list in sync with
|
|
|
|
|
|
# the Anthropic migration guide as new model families ship.
|
|
|
|
|
|
_XHIGH_EFFORT_SUBSTRINGS = ("4-7", "4.7")
|
|
|
|
|
|
|
fix(agent): complete Claude Opus 4.7 API migration
Claude Opus 4.7 introduced several breaking API changes that the current
codebase partially handled but not completely. This patch finishes the
migration per the official migration guide at
https://platform.claude.com/docs/en/about-claude/models/migration-guide
Fixes NousResearch/hermes-agent#11137
Breaking-change coverage:
1. Adaptive thinking + output_config.effort — 4.7 is now recognized by
_supports_adaptive_thinking() (extends previous 4.6-only gate).
2. Sampling parameter stripping — 4.7 returns 400 for any non-default
temperature / top_p / top_k. build_anthropic_kwargs drops them as a
safety net; the OpenAI-protocol auxiliary path (_build_call_kwargs)
and AnthropicCompletionsAdapter.create() both early-exit before
setting temperature for 4.7+ models. This keeps flush_memories and
structured-JSON aux paths that hardcode temperature from 400ing
when the aux model is flipped to 4.7.
3. thinking.display = "summarized" — 4.7 defaults display to "omitted",
which silently hides reasoning text from Hermes's CLI activity feed
during long tool runs. Restoring "summarized" preserves 4.6 UX.
4. Effort level mapping — xhigh now maps to xhigh (was xhigh→max, which
silently over-efforted every coding/agentic request). max is now a
distinct ceiling per Anthropic's 5-level effort model.
5. New stop_reason values — refusal and model_context_window_exceeded
were silently collapsed to "stop" (end_turn) by the adapter's
stop_reason_map. Now mapped to "content_filter" and "length"
respectively, matching upstream finish-reason handling already in
bedrock_adapter.
6. Model catalogs — claude-opus-4-7 added to the Anthropic provider
list, anthropic/claude-opus-4.7 added at top of OpenRouter fallback
catalog (recommended), claude-opus-4-7 added to model_metadata
DEFAULT_CONTEXT_LENGTHS (1M, matching 4.6 per migration guide).
7. Prefill docstrings — run_agent.AIAgent and BatchRunner now document
that Anthropic Sonnet/Opus 4.6+ reject a trailing assistant-role
prefill (400).
8. Tests — 4 new tests in test_anthropic_adapter covering display
default, xhigh preservation, max on 4.7, refusal / context-overflow
stop_reason mapping, plus the sampling-param predicate. test_model_metadata
accepts 4.7 at 1M context.
Tested on macOS 15.5 (darwin). 119 tests pass in
tests/agent/test_anthropic_adapter.py, 1320 pass in tests/agent/.
2026-04-16 12:35:43 -05:00
|
|
|
|
# Models where extended thinking is deprecated/removed (4.6+ behavior: adaptive
|
|
|
|
|
|
# is the only supported mode; 4.7 additionally forbids manual thinking entirely
|
|
|
|
|
|
# and drops temperature/top_p/top_k).
|
|
|
|
|
|
_ADAPTIVE_THINKING_SUBSTRINGS = ("4-6", "4.6", "4-7", "4.7")
|
|
|
|
|
|
|
|
|
|
|
|
# Models where temperature/top_p/top_k return 400 if set to non-default values.
|
|
|
|
|
|
# This is the Opus 4.7 contract; future 4.x+ models are expected to follow it.
|
|
|
|
|
|
_NO_SAMPLING_PARAMS_SUBSTRINGS = ("4-7", "4.7")
|
|
|
|
|
|
|
2026-03-27 13:02:52 -07:00
|
|
|
|
# ── Max output token limits per Anthropic model ───────────────────────
|
|
|
|
|
|
# Source: Anthropic docs + Cline model catalog. Anthropic's API requires
|
|
|
|
|
|
# max_tokens as a mandatory field. Previously we hardcoded 16384, which
|
|
|
|
|
|
# starves thinking-enabled models (thinking tokens count toward the limit).
|
|
|
|
|
|
_ANTHROPIC_OUTPUT_LIMITS = {
|
fix(agent): complete Claude Opus 4.7 API migration
Claude Opus 4.7 introduced several breaking API changes that the current
codebase partially handled but not completely. This patch finishes the
migration per the official migration guide at
https://platform.claude.com/docs/en/about-claude/models/migration-guide
Fixes NousResearch/hermes-agent#11137
Breaking-change coverage:
1. Adaptive thinking + output_config.effort — 4.7 is now recognized by
_supports_adaptive_thinking() (extends previous 4.6-only gate).
2. Sampling parameter stripping — 4.7 returns 400 for any non-default
temperature / top_p / top_k. build_anthropic_kwargs drops them as a
safety net; the OpenAI-protocol auxiliary path (_build_call_kwargs)
and AnthropicCompletionsAdapter.create() both early-exit before
setting temperature for 4.7+ models. This keeps flush_memories and
structured-JSON aux paths that hardcode temperature from 400ing
when the aux model is flipped to 4.7.
3. thinking.display = "summarized" — 4.7 defaults display to "omitted",
which silently hides reasoning text from Hermes's CLI activity feed
during long tool runs. Restoring "summarized" preserves 4.6 UX.
4. Effort level mapping — xhigh now maps to xhigh (was xhigh→max, which
silently over-efforted every coding/agentic request). max is now a
distinct ceiling per Anthropic's 5-level effort model.
5. New stop_reason values — refusal and model_context_window_exceeded
were silently collapsed to "stop" (end_turn) by the adapter's
stop_reason_map. Now mapped to "content_filter" and "length"
respectively, matching upstream finish-reason handling already in
bedrock_adapter.
6. Model catalogs — claude-opus-4-7 added to the Anthropic provider
list, anthropic/claude-opus-4.7 added at top of OpenRouter fallback
catalog (recommended), claude-opus-4-7 added to model_metadata
DEFAULT_CONTEXT_LENGTHS (1M, matching 4.6 per migration guide).
7. Prefill docstrings — run_agent.AIAgent and BatchRunner now document
that Anthropic Sonnet/Opus 4.6+ reject a trailing assistant-role
prefill (400).
8. Tests — 4 new tests in test_anthropic_adapter covering display
default, xhigh preservation, max on 4.7, refusal / context-overflow
stop_reason mapping, plus the sampling-param predicate. test_model_metadata
accepts 4.7 at 1M context.
Tested on macOS 15.5 (darwin). 119 tests pass in
tests/agent/test_anthropic_adapter.py, 1320 pass in tests/agent/.
2026-04-16 12:35:43 -05:00
|
|
|
|
# Claude 4.7
|
|
|
|
|
|
"claude-opus-4-7": 128_000,
|
2026-03-27 13:02:52 -07:00
|
|
|
|
# Claude 4.6
|
|
|
|
|
|
"claude-opus-4-6": 128_000,
|
|
|
|
|
|
"claude-sonnet-4-6": 64_000,
|
|
|
|
|
|
# Claude 4.5
|
|
|
|
|
|
"claude-opus-4-5": 64_000,
|
|
|
|
|
|
"claude-sonnet-4-5": 64_000,
|
|
|
|
|
|
"claude-haiku-4-5": 64_000,
|
|
|
|
|
|
# Claude 4
|
|
|
|
|
|
"claude-opus-4": 32_000,
|
|
|
|
|
|
"claude-sonnet-4": 64_000,
|
|
|
|
|
|
# Claude 3.7
|
|
|
|
|
|
"claude-3-7-sonnet": 128_000,
|
|
|
|
|
|
# Claude 3.5
|
|
|
|
|
|
"claude-3-5-sonnet": 8_192,
|
|
|
|
|
|
"claude-3-5-haiku": 8_192,
|
|
|
|
|
|
# Claude 3
|
|
|
|
|
|
"claude-3-opus": 4_096,
|
|
|
|
|
|
"claude-3-sonnet": 4_096,
|
|
|
|
|
|
"claude-3-haiku": 4_096,
|
fix: align MiniMax provider with official API docs
Aligns MiniMax provider with official API documentation. Fixes 6 bugs:
transport mismatch (openai_chat -> anthropic_messages), credential leak
in switch_model(), prompt caching sent to non-Anthropic endpoints,
dot-to-hyphen model name corruption, trajectory compressor URL routing,
and stale doctor health check.
Also corrects context window (204,800), thinking support (manual mode),
max output (131,072), and model catalog (M2 family only on /anthropic).
Source: https://platform.minimax.io/docs/api-reference/text-anthropic-api
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
2026-04-10 03:53:18 -07:00
|
|
|
|
# Third-party Anthropic-compatible providers
|
|
|
|
|
|
"minimax": 131_072,
|
2026-03-27 13:02:52 -07:00
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
# For any model not in the table, assume the highest current limit.
|
|
|
|
|
|
# Future Anthropic models are unlikely to have *less* output capacity.
|
|
|
|
|
|
_ANTHROPIC_DEFAULT_OUTPUT_LIMIT = 128_000
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def _get_anthropic_max_output(model: str) -> int:
|
|
|
|
|
|
"""Look up the max output token limit for an Anthropic model.
|
|
|
|
|
|
|
|
|
|
|
|
Uses substring matching against _ANTHROPIC_OUTPUT_LIMITS so date-stamped
|
|
|
|
|
|
model IDs (claude-sonnet-4-5-20250929) and variant suffixes (:1m, :fast)
|
|
|
|
|
|
resolve correctly. Longest-prefix match wins to avoid e.g. "claude-3-5"
|
|
|
|
|
|
matching before "claude-3-5-sonnet".
|
2026-04-10 00:54:36 -04:00
|
|
|
|
|
|
|
|
|
|
Normalizes dots to hyphens so that model names like
|
|
|
|
|
|
``anthropic/claude-opus-4.6`` match the ``claude-opus-4-6`` table key.
|
2026-03-27 13:02:52 -07:00
|
|
|
|
"""
|
2026-04-10 00:54:36 -04:00
|
|
|
|
m = model.lower().replace(".", "-")
|
2026-03-27 13:02:52 -07:00
|
|
|
|
best_key = ""
|
|
|
|
|
|
best_val = _ANTHROPIC_DEFAULT_OUTPUT_LIMIT
|
|
|
|
|
|
for key, val in _ANTHROPIC_OUTPUT_LIMITS.items():
|
|
|
|
|
|
if key in m and len(key) > len(best_key):
|
|
|
|
|
|
best_key = key
|
|
|
|
|
|
best_val = val
|
|
|
|
|
|
return best_val
|
|
|
|
|
|
|
2026-03-13 03:21:13 +01:00
|
|
|
|
|
|
|
|
|
|
def _supports_adaptive_thinking(model: str) -> bool:
|
fix(agent): complete Claude Opus 4.7 API migration
Claude Opus 4.7 introduced several breaking API changes that the current
codebase partially handled but not completely. This patch finishes the
migration per the official migration guide at
https://platform.claude.com/docs/en/about-claude/models/migration-guide
Fixes NousResearch/hermes-agent#11137
Breaking-change coverage:
1. Adaptive thinking + output_config.effort — 4.7 is now recognized by
_supports_adaptive_thinking() (extends previous 4.6-only gate).
2. Sampling parameter stripping — 4.7 returns 400 for any non-default
temperature / top_p / top_k. build_anthropic_kwargs drops them as a
safety net; the OpenAI-protocol auxiliary path (_build_call_kwargs)
and AnthropicCompletionsAdapter.create() both early-exit before
setting temperature for 4.7+ models. This keeps flush_memories and
structured-JSON aux paths that hardcode temperature from 400ing
when the aux model is flipped to 4.7.
3. thinking.display = "summarized" — 4.7 defaults display to "omitted",
which silently hides reasoning text from Hermes's CLI activity feed
during long tool runs. Restoring "summarized" preserves 4.6 UX.
4. Effort level mapping — xhigh now maps to xhigh (was xhigh→max, which
silently over-efforted every coding/agentic request). max is now a
distinct ceiling per Anthropic's 5-level effort model.
5. New stop_reason values — refusal and model_context_window_exceeded
were silently collapsed to "stop" (end_turn) by the adapter's
stop_reason_map. Now mapped to "content_filter" and "length"
respectively, matching upstream finish-reason handling already in
bedrock_adapter.
6. Model catalogs — claude-opus-4-7 added to the Anthropic provider
list, anthropic/claude-opus-4.7 added at top of OpenRouter fallback
catalog (recommended), claude-opus-4-7 added to model_metadata
DEFAULT_CONTEXT_LENGTHS (1M, matching 4.6 per migration guide).
7. Prefill docstrings — run_agent.AIAgent and BatchRunner now document
that Anthropic Sonnet/Opus 4.6+ reject a trailing assistant-role
prefill (400).
8. Tests — 4 new tests in test_anthropic_adapter covering display
default, xhigh preservation, max on 4.7, refusal / context-overflow
stop_reason mapping, plus the sampling-param predicate. test_model_metadata
accepts 4.7 at 1M context.
Tested on macOS 15.5 (darwin). 119 tests pass in
tests/agent/test_anthropic_adapter.py, 1320 pass in tests/agent/.
2026-04-16 12:35:43 -05:00
|
|
|
|
"""Return True for Claude 4.6+ models that support adaptive thinking."""
|
|
|
|
|
|
return any(v in model for v in _ADAPTIVE_THINKING_SUBSTRINGS)
|
|
|
|
|
|
|
2026-03-13 03:21:13 +01:00
|
|
|
|
|
fix(agent): downgrade xhigh→max on Anthropic pre-4.7 adaptive models
Regression from #11161 (Claude Opus 4.7 migration, commit 0517ac3e).
The Opus 4.7 migration changed `ADAPTIVE_EFFORT_MAP["xhigh"]` from "max"
(the pre-migration alias) to "xhigh" to preserve the new 4.7 effort level
as distinct from max. This is correct for 4.7, but Opus/Sonnet 4.6 only
expose 4 levels (low/medium/high/max) — sending "xhigh" there now 400s:
BadRequestError [HTTP 400]: This model does not support effort
level 'xhigh'. Supported levels: high, low, max, medium.
Users who set reasoning_effort=xhigh as their default (xhigh is the
recommended default for coding/agentic on 4.7 per the Anthropic migration
guide) now 400 every request the moment they switch back to a 4.6 model
via `/model` or config. Verified live against the Anthropic API on
`anthropic==0.94.0`.
Fix: make the mapping model-aware. Add `_supports_xhigh_effort()`
predicate (matches 4-7/4.7 substrings, mirroring the existing
`_supports_adaptive_thinking` / `_forbids_sampling_params` pattern).
On pre-4.7 adaptive models, downgrade xhigh→max (the strongest effort
those models accept, restoring pre-migration behavior). On 4.7+, keep
xhigh as a distinct level.
Per Anthropic's migration guide, xhigh is 4.7-only:
https://platform.claude.com/docs/en/about-claude/models/migration-guide
> Opus 4.7 effort levels: max, xhigh (new), high, medium, low.
> Opus 4.6 effort levels: max, high, medium, low.
SDK typing confirms: `anthropic.types.OutputConfigParam.effort: Literal[
"low", "medium", "high", "max"]` (v0.94.0 not yet updated for xhigh).
## Test plan
Verified live on macOS 15.5 / anthropic==0.94.0:
claude-opus-4-6 + effort=xhigh → output_config.effort=max → 200 OK
claude-opus-4-7 + effort=xhigh → output_config.effort=xhigh → 200 OK
claude-opus-4-6 + effort=max → output_config.effort=max → 200 OK
claude-opus-4-7 + effort=max → output_config.effort=max → 200 OK
`tests/agent/test_anthropic_adapter.py` — 120 pass (replaced 1 bugged
test that asserted the broken behavior, added 1 for 4.7 preservation).
Full adapter suite: 120 passed in 1.05s.
Broader suite (agent + run_agent + cli/gateway reasoning): 2140 passed
(2 pre-existing failures on clean upstream/main, unrelated).
## Platforms
Tested on macOS 15.5. No platform-specific code paths touched.
2026-04-16 13:51:42 -05:00
|
|
|
|
def _supports_xhigh_effort(model: str) -> bool:
|
|
|
|
|
|
"""Return True for models that accept the 'xhigh' adaptive effort level.
|
|
|
|
|
|
|
|
|
|
|
|
Opus 4.7 introduced xhigh as a distinct level between high and max.
|
|
|
|
|
|
Pre-4.7 adaptive models (Opus/Sonnet 4.6) only accept low/medium/high/max
|
|
|
|
|
|
and reject xhigh with an HTTP 400. Callers should downgrade xhigh→max
|
|
|
|
|
|
when this returns False.
|
|
|
|
|
|
"""
|
|
|
|
|
|
return any(v in model for v in _XHIGH_EFFORT_SUBSTRINGS)
|
|
|
|
|
|
|
|
|
|
|
|
|
fix(agent): complete Claude Opus 4.7 API migration
Claude Opus 4.7 introduced several breaking API changes that the current
codebase partially handled but not completely. This patch finishes the
migration per the official migration guide at
https://platform.claude.com/docs/en/about-claude/models/migration-guide
Fixes NousResearch/hermes-agent#11137
Breaking-change coverage:
1. Adaptive thinking + output_config.effort — 4.7 is now recognized by
_supports_adaptive_thinking() (extends previous 4.6-only gate).
2. Sampling parameter stripping — 4.7 returns 400 for any non-default
temperature / top_p / top_k. build_anthropic_kwargs drops them as a
safety net; the OpenAI-protocol auxiliary path (_build_call_kwargs)
and AnthropicCompletionsAdapter.create() both early-exit before
setting temperature for 4.7+ models. This keeps flush_memories and
structured-JSON aux paths that hardcode temperature from 400ing
when the aux model is flipped to 4.7.
3. thinking.display = "summarized" — 4.7 defaults display to "omitted",
which silently hides reasoning text from Hermes's CLI activity feed
during long tool runs. Restoring "summarized" preserves 4.6 UX.
4. Effort level mapping — xhigh now maps to xhigh (was xhigh→max, which
silently over-efforted every coding/agentic request). max is now a
distinct ceiling per Anthropic's 5-level effort model.
5. New stop_reason values — refusal and model_context_window_exceeded
were silently collapsed to "stop" (end_turn) by the adapter's
stop_reason_map. Now mapped to "content_filter" and "length"
respectively, matching upstream finish-reason handling already in
bedrock_adapter.
6. Model catalogs — claude-opus-4-7 added to the Anthropic provider
list, anthropic/claude-opus-4.7 added at top of OpenRouter fallback
catalog (recommended), claude-opus-4-7 added to model_metadata
DEFAULT_CONTEXT_LENGTHS (1M, matching 4.6 per migration guide).
7. Prefill docstrings — run_agent.AIAgent and BatchRunner now document
that Anthropic Sonnet/Opus 4.6+ reject a trailing assistant-role
prefill (400).
8. Tests — 4 new tests in test_anthropic_adapter covering display
default, xhigh preservation, max on 4.7, refusal / context-overflow
stop_reason mapping, plus the sampling-param predicate. test_model_metadata
accepts 4.7 at 1M context.
Tested on macOS 15.5 (darwin). 119 tests pass in
tests/agent/test_anthropic_adapter.py, 1320 pass in tests/agent/.
2026-04-16 12:35:43 -05:00
|
|
|
|
def _forbids_sampling_params(model: str) -> bool:
|
|
|
|
|
|
"""Return True for models that 400 on any non-default temperature/top_p/top_k.
|
|
|
|
|
|
|
|
|
|
|
|
Opus 4.7 explicitly rejects sampling parameters; later Claude releases are
|
|
|
|
|
|
expected to follow suit. Callers should omit these fields entirely rather
|
|
|
|
|
|
than passing zero/default values (the API rejects anything non-null).
|
|
|
|
|
|
"""
|
|
|
|
|
|
return any(v in model for v in _NO_SAMPLING_PARAMS_SUBSTRINGS)
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
|
fix(agent): complete Claude Opus 4.7 API migration
Claude Opus 4.7 introduced several breaking API changes that the current
codebase partially handled but not completely. This patch finishes the
migration per the official migration guide at
https://platform.claude.com/docs/en/about-claude/models/migration-guide
Fixes NousResearch/hermes-agent#11137
Breaking-change coverage:
1. Adaptive thinking + output_config.effort — 4.7 is now recognized by
_supports_adaptive_thinking() (extends previous 4.6-only gate).
2. Sampling parameter stripping — 4.7 returns 400 for any non-default
temperature / top_p / top_k. build_anthropic_kwargs drops them as a
safety net; the OpenAI-protocol auxiliary path (_build_call_kwargs)
and AnthropicCompletionsAdapter.create() both early-exit before
setting temperature for 4.7+ models. This keeps flush_memories and
structured-JSON aux paths that hardcode temperature from 400ing
when the aux model is flipped to 4.7.
3. thinking.display = "summarized" — 4.7 defaults display to "omitted",
which silently hides reasoning text from Hermes's CLI activity feed
during long tool runs. Restoring "summarized" preserves 4.6 UX.
4. Effort level mapping — xhigh now maps to xhigh (was xhigh→max, which
silently over-efforted every coding/agentic request). max is now a
distinct ceiling per Anthropic's 5-level effort model.
5. New stop_reason values — refusal and model_context_window_exceeded
were silently collapsed to "stop" (end_turn) by the adapter's
stop_reason_map. Now mapped to "content_filter" and "length"
respectively, matching upstream finish-reason handling already in
bedrock_adapter.
6. Model catalogs — claude-opus-4-7 added to the Anthropic provider
list, anthropic/claude-opus-4.7 added at top of OpenRouter fallback
catalog (recommended), claude-opus-4-7 added to model_metadata
DEFAULT_CONTEXT_LENGTHS (1M, matching 4.6 per migration guide).
7. Prefill docstrings — run_agent.AIAgent and BatchRunner now document
that Anthropic Sonnet/Opus 4.6+ reject a trailing assistant-role
prefill (400).
8. Tests — 4 new tests in test_anthropic_adapter covering display
default, xhigh preservation, max on 4.7, refusal / context-overflow
stop_reason mapping, plus the sampling-param predicate. test_model_metadata
accepts 4.7 at 1M context.
Tested on macOS 15.5 (darwin). 119 tests pass in
tests/agent/test_anthropic_adapter.py, 1320 pass in tests/agent/.
2026-04-16 12:35:43 -05:00
|
|
|
|
|
|
|
|
|
|
# Beta headers for enhanced features (sent with ALL auth types).
|
|
|
|
|
|
# As of Opus 4.7 (2026-04-16), both of these are GA on Claude 4.6+ — the
|
|
|
|
|
|
# beta headers are still accepted (harmless no-op) but not required. Kept
|
|
|
|
|
|
# here so older Claude (4.5, 4.1) + third-party Anthropic-compat endpoints
|
|
|
|
|
|
# that still gate on the headers continue to get the enhanced features.
|
|
|
|
|
|
# Migration guide: remove these if you no longer support ≤4.5 models.
|
fix(anthropic): address gaps found in deep-dive audit
After studying clawdbot (OpenClaw) and OpenCode implementations:
## Beta headers
- Add interleaved-thinking-2025-05-14 and fine-grained-tool-streaming-2025-05-14
as common betas (sent with ALL auth types, not just OAuth)
- OAuth tokens additionally get oauth-2025-04-20
- API keys now also get the common betas (previously got none)
## Vision/image support
- Add _convert_vision_content() to convert OpenAI multimodal format
(image_url blocks) to Anthropic format (image blocks with base64/url source)
- Handles both data: URIs (base64) and regular URLs
## Role alternation enforcement
- Anthropic strictly rejects consecutive same-role messages (400 error)
- Add post-processing step that merges consecutive user/assistant messages
- Handles string, list, and mixed content types during merge
## Tool choice support
- Add tool_choice parameter to build_anthropic_kwargs()
- Maps OpenAI values: auto→auto, required→any, none→omit, name→tool
## Cache metrics tracking
- Anthropic uses cache_read_input_tokens / cache_creation_input_tokens
(different from OpenRouter's prompt_tokens_details.cached_tokens)
- Add api_mode-aware branch in run_agent.py cache stats logging
## Credential refresh on 401
- On 401 error during anthropic_messages mode, re-read credentials
via resolve_anthropic_token() (picks up refreshed Claude Code tokens)
- Rebuild client if new token differs from current one
- Follows same pattern as Codex/Nous 401 refresh handlers
## Tests
- 44 adapter tests (8 new: vision conversion, role alternation, tool choice)
- Updated beta header tests to verify new structure
- Full suite: 3198 passed, 0 regressions
2026-03-12 16:00:46 -07:00
|
|
|
|
_COMMON_BETAS = [
|
|
|
|
|
|
"interleaved-thinking-2025-05-14",
|
|
|
|
|
|
"fine-grained-tool-streaming-2025-05-14",
|
|
|
|
|
|
]
|
2026-04-09 17:09:38 -07:00
|
|
|
|
# MiniMax's Anthropic-compatible endpoints fail tool-use requests when
|
|
|
|
|
|
# the fine-grained tool streaming beta is present. Omit it so tool calls
|
|
|
|
|
|
# fall back to the provider's default response path.
|
|
|
|
|
|
_TOOL_STREAMING_BETA = "fine-grained-tool-streaming-2025-05-14"
|
fix(anthropic): address gaps found in deep-dive audit
After studying clawdbot (OpenClaw) and OpenCode implementations:
## Beta headers
- Add interleaved-thinking-2025-05-14 and fine-grained-tool-streaming-2025-05-14
as common betas (sent with ALL auth types, not just OAuth)
- OAuth tokens additionally get oauth-2025-04-20
- API keys now also get the common betas (previously got none)
## Vision/image support
- Add _convert_vision_content() to convert OpenAI multimodal format
(image_url blocks) to Anthropic format (image blocks with base64/url source)
- Handles both data: URIs (base64) and regular URLs
## Role alternation enforcement
- Anthropic strictly rejects consecutive same-role messages (400 error)
- Add post-processing step that merges consecutive user/assistant messages
- Handles string, list, and mixed content types during merge
## Tool choice support
- Add tool_choice parameter to build_anthropic_kwargs()
- Maps OpenAI values: auto→auto, required→any, none→omit, name→tool
## Cache metrics tracking
- Anthropic uses cache_read_input_tokens / cache_creation_input_tokens
(different from OpenRouter's prompt_tokens_details.cached_tokens)
- Add api_mode-aware branch in run_agent.py cache stats logging
## Credential refresh on 401
- On 401 error during anthropic_messages mode, re-read credentials
via resolve_anthropic_token() (picks up refreshed Claude Code tokens)
- Rebuild client if new token differs from current one
- Follows same pattern as Codex/Nous 401 refresh handlers
## Tests
- 44 adapter tests (8 new: vision conversion, role alternation, tool choice)
- Updated beta header tests to verify new structure
- Full suite: 3198 passed, 0 regressions
2026-03-12 16:00:46 -07:00
|
|
|
|
|
2026-04-10 02:32:15 -07:00
|
|
|
|
# Fast mode beta — enables the ``speed: "fast"`` request parameter for
|
|
|
|
|
|
# significantly higher output token throughput on Opus 4.6 (~2.5x).
|
|
|
|
|
|
# See https://platform.claude.com/docs/en/build-with-claude/fast-mode
|
|
|
|
|
|
_FAST_MODE_BETA = "fast-mode-2026-02-01"
|
|
|
|
|
|
|
2026-03-16 17:08:22 -07:00
|
|
|
|
# Additional beta headers required for OAuth/subscription auth.
|
|
|
|
|
|
# Matches what Claude Code (and pi-ai / OpenCode) send.
|
fix(anthropic): address gaps found in deep-dive audit
After studying clawdbot (OpenClaw) and OpenCode implementations:
## Beta headers
- Add interleaved-thinking-2025-05-14 and fine-grained-tool-streaming-2025-05-14
as common betas (sent with ALL auth types, not just OAuth)
- OAuth tokens additionally get oauth-2025-04-20
- API keys now also get the common betas (previously got none)
## Vision/image support
- Add _convert_vision_content() to convert OpenAI multimodal format
(image_url blocks) to Anthropic format (image blocks with base64/url source)
- Handles both data: URIs (base64) and regular URLs
## Role alternation enforcement
- Anthropic strictly rejects consecutive same-role messages (400 error)
- Add post-processing step that merges consecutive user/assistant messages
- Handles string, list, and mixed content types during merge
## Tool choice support
- Add tool_choice parameter to build_anthropic_kwargs()
- Maps OpenAI values: auto→auto, required→any, none→omit, name→tool
## Cache metrics tracking
- Anthropic uses cache_read_input_tokens / cache_creation_input_tokens
(different from OpenRouter's prompt_tokens_details.cached_tokens)
- Add api_mode-aware branch in run_agent.py cache stats logging
## Credential refresh on 401
- On 401 error during anthropic_messages mode, re-read credentials
via resolve_anthropic_token() (picks up refreshed Claude Code tokens)
- Rebuild client if new token differs from current one
- Follows same pattern as Codex/Nous 401 refresh handlers
## Tests
- 44 adapter tests (8 new: vision conversion, role alternation, tool choice)
- Updated beta header tests to verify new structure
- Full suite: 3198 passed, 0 regressions
2026-03-12 16:00:46 -07:00
|
|
|
|
_OAUTH_ONLY_BETAS = [
|
fix: Anthropic OAuth — beta header, token refresh, config contamination, reauthentication (#1132)
Fixes Anthropic OAuth/subscription authentication end-to-end:
Auth failures (401 errors):
- Add missing 'claude-code-20250219' beta header for OAuth tokens. Both
clawdbot and OpenCode include this alongside 'oauth-2025-04-20' — without
it, Anthropic's API rejects OAuth tokens with 401 authentication errors.
- Fix _fetch_anthropic_models() to use canonical beta headers from
_COMMON_BETAS + _OAUTH_ONLY_BETAS instead of hardcoding.
Token refresh:
- Add _refresh_oauth_token() — when Claude Code credentials from
~/.claude/.credentials.json are expired but have a refresh token,
automatically POST to console.anthropic.com/v1/oauth/token to get
a new access token. Uses the same client_id as Claude Code / OpenCode.
- Add _write_claude_code_credentials() — writes refreshed tokens back
to ~/.claude/.credentials.json, preserving other fields.
- resolve_anthropic_token() now auto-refreshes expired tokens before
returning None.
Config contamination:
- Anthropic's _model_flow_anthropic() no longer saves base_url to config.
Since resolve_runtime_provider() always hardcodes Anthropic's URL, the
stale base_url was contaminating other providers when users switched
without re-running 'hermes model' (e.g., Codex hitting api.anthropic.com).
- _update_config_for_provider() now pops base_url when passed empty string.
- Same fix in setup.py.
Flow/UX (hermes model command):
- CLAUDE_CODE_OAUTH_TOKEN env var now checked in credential detection
- Reauthentication option when existing credentials found
- run_oauth_setup_token() runs 'claude setup-token' as interactive
subprocess, then auto-detects saved credentials
- Clean has_creds/needs_auth flow in both main.py and setup.py
Tests (14 new):
- Beta header assertions for claude-code-20250219
- Token refresh: successful refresh with credential writeback, failed
refresh returns None, no refresh token returns None
- Credential writeback: new file creation, preserving existing fields
- Auto-refresh integration in resolve_anthropic_token()
- CLAUDE_CODE_OAUTH_TOKEN fallback, credential file auto-discovery
- run_oauth_setup_token() (5 scenarios)
2026-03-12 20:45:50 -07:00
|
|
|
|
"claude-code-20250219",
|
fix(anthropic): address gaps found in deep-dive audit
After studying clawdbot (OpenClaw) and OpenCode implementations:
## Beta headers
- Add interleaved-thinking-2025-05-14 and fine-grained-tool-streaming-2025-05-14
as common betas (sent with ALL auth types, not just OAuth)
- OAuth tokens additionally get oauth-2025-04-20
- API keys now also get the common betas (previously got none)
## Vision/image support
- Add _convert_vision_content() to convert OpenAI multimodal format
(image_url blocks) to Anthropic format (image blocks with base64/url source)
- Handles both data: URIs (base64) and regular URLs
## Role alternation enforcement
- Anthropic strictly rejects consecutive same-role messages (400 error)
- Add post-processing step that merges consecutive user/assistant messages
- Handles string, list, and mixed content types during merge
## Tool choice support
- Add tool_choice parameter to build_anthropic_kwargs()
- Maps OpenAI values: auto→auto, required→any, none→omit, name→tool
## Cache metrics tracking
- Anthropic uses cache_read_input_tokens / cache_creation_input_tokens
(different from OpenRouter's prompt_tokens_details.cached_tokens)
- Add api_mode-aware branch in run_agent.py cache stats logging
## Credential refresh on 401
- On 401 error during anthropic_messages mode, re-read credentials
via resolve_anthropic_token() (picks up refreshed Claude Code tokens)
- Rebuild client if new token differs from current one
- Follows same pattern as Codex/Nous 401 refresh handlers
## Tests
- 44 adapter tests (8 new: vision conversion, role alternation, tool choice)
- Updated beta header tests to verify new structure
- Full suite: 3198 passed, 0 regressions
2026-03-12 16:00:46 -07:00
|
|
|
|
"oauth-2025-04-20",
|
|
|
|
|
|
]
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
|
2026-03-16 17:08:22 -07:00
|
|
|
|
# Claude Code identity — required for OAuth requests to be routed correctly.
|
|
|
|
|
|
# Without these, Anthropic's infrastructure intermittently 500s OAuth traffic.
|
fix: detect Claude Code version dynamically for OAuth user-agent
* fix: prevent infinite 400 failure loop on context overflow (#1630)
When a gateway session exceeds the model's context window, Anthropic may
return a generic 400 invalid_request_error with just 'Error' as the
message. This bypassed the phrase-based context-length detection,
causing the agent to treat it as a non-retryable client error. Worse,
the failed user message was still persisted to the transcript, making
the session even larger on each attempt — creating an infinite loop.
Three-layer fix:
1. run_agent.py — Fallback heuristic: when a 400 error has a very short
generic message AND the session is large (>40% of context or >80
messages), treat it as a probable context overflow and trigger
compression instead of aborting.
2. run_agent.py + gateway/run.py — Don't persist failed messages:
when the agent returns failed=True before generating any response,
skip writing the user's message to the transcript/DB. This prevents
the session from growing on each failure.
3. gateway/run.py — Smarter error messages: detect context-overflow
failures and suggest /compact or /reset specifically, instead of a
generic 'try again' that will fail identically.
* fix(skills): detect prompt injection patterns and block cache file reads
Adds two security layers to prevent prompt injection via skills hub
cache files (#1558):
1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory
(index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json
was the original injection vector — untrusted skill descriptions
in the catalog contained adversarial text that the model executed.
2. skill_view: warns when skills are loaded from outside the trusted
~/.hermes/skills/ directory, and detects common injection patterns
in skill content ("ignore previous instructions", "<system>", etc.).
Cherry-picked from PR #1562 by ygd58.
* fix(tools): chunk long messages in send_message_tool before dispatch (#1552)
Long messages sent via send_message tool or cron delivery silently
failed when exceeding platform limits. Gateway adapters handle this
via truncate_message(), but the standalone senders in send_message_tool
bypassed that entirely.
- Apply truncate_message() chunking in _send_to_platform() before
dispatching to individual platform senders
- Remove naive message[i:i+2000] character split in _send_discord()
in favor of centralized smart splitting
- Attach media files to last chunk only for Telegram
- Add regression tests for chunking and media placement
Cherry-picked from PR #1557 by llbn.
* fix(approval): show full command in dangerous command approval (#1553)
Previously the command was truncated to 80 chars in CLI (with a
[v]iew full option), 500 chars in Discord embeds, and missing entirely
in Telegram/Slack approval messages. Now the full command is always
displayed everywhere:
- CLI: removed 80-char truncation and [v]iew full menu option
- Gateway (TG/Slack): approval_required message includes full command
in a code block
- Discord: embed shows full command up to 4096-char limit
- Windows: skip SIGALRM-based test timeout (Unix-only)
- Updated tests: replaced view-flow tests with direct approval tests
Cherry-picked from PR #1566 by crazywriter1.
* fix(cli): flush stdout during agent loop to prevent macOS display freeze (#1624)
The interrupt polling loop in chat() waited on the queue without
invalidating the prompt_toolkit renderer. On macOS, the StdoutProxy
buffer only flushed on input events, causing the CLI to appear frozen
during tool execution until the user typed a key.
Fix: call _invalidate() on each queue timeout (every ~100ms, throttled
to 150ms) to force the renderer to flush buffered agent output.
* fix(claw): warn when API keys are skipped during OpenClaw migration (#1580)
When --migrate-secrets is not passed (the default), API keys like
OPENROUTER_API_KEY are silently skipped with no warning. Users don't
realize their keys weren't migrated until the agent fails to connect.
Add a post-migration warning with actionable instructions: either
re-run with --migrate-secrets or add the key manually via
hermes config set.
Cherry-picked from PR #1593 by ygd58.
* fix(security): block sandbox backend creds from subprocess env (#1264)
Add Modal and Daytona sandbox credentials to the subprocess env
blocklist so they're not leaked to agent terminal sessions via
printenv/env.
Cherry-picked from PR #1571 by ygd58.
* fix(gateway): cap interrupt recursion depth to prevent resource exhaustion (#816)
When a user sends multiple messages while the agent keeps failing,
_run_agent() calls itself recursively with no depth limit. This can
exhaust stack/memory if the agent is in a failure loop.
Add _MAX_INTERRUPT_DEPTH = 3. When exceeded, the pending message is
logged and the current result is returned instead of recursing deeper.
The log handler duplication bug described in #816 was already fixed
separately (AIAgent.__init__ deduplicates handlers).
* fix(gateway): /model shows active fallback model instead of config default (#1615)
When the agent falls back to a different model (e.g. due to rate
limiting), /model still showed the config default. Now tracks the
effective model/provider after each agent run and displays it.
Cleared when the primary model succeeds again or the user explicitly
switches via /model.
Cherry-picked from PR #1616 by MaxKerkula. Added hasattr guard for
test compatibility.
* feat(gateway): inject reply-to message context for out-of-session replies (#1594)
When a user replies to a Telegram message, check if the quoted text
exists in the current session transcript. If missing (from cron jobs,
background tasks, or old sessions), prepend [Replying to: "..."] to
the message so the agent has context about what's being referenced.
- Add reply_to_text field to MessageEvent (base.py)
- Populate from Telegram's reply_to_message (text or caption)
- Inject context in _handle_message when not found in history
Based on PR #1596 by anpicasso (cherry-picked reply-to feature only,
excluded unrelated /server command and background delegation changes).
* fix: recognize Claude Code OAuth credentials in startup gate (#1455)
The _has_any_provider_configured() startup check didn't look for
Claude Code OAuth credentials (~/.claude/.credentials.json). Users
with only Claude Code auth got the setup wizard instead of starting.
Cherry-picked from PR #1455 by kshitijk4poor.
* perf: use ripgrep for file search (200x faster than find)
search_files(target='files') now uses rg --files -g instead of find.
Ripgrep respects .gitignore, excludes hidden dirs by default, and has
parallel directory traversal — ~200x faster on wide trees (0.14s vs 34s
benchmarked on 164-repo tree).
Falls back to find when rg is unavailable, preserving hidden-dir
exclusion and BSD find compatibility.
Salvaged from PR #1464 by @light-merlin-dark (Merlin) — adapted to
preserve hidden-dir exclusion added since the original PR.
* refactor(tts): replace NeuTTS optional skill with built-in provider + setup flow
Remove the optional skill (redundant now that NeuTTS is a built-in TTS
provider). Replace neutts_cli dependency with a standalone synthesis
helper (tools/neutts_synth.py) that calls the neutts Python API directly
in a subprocess.
Add TTS provider selection to hermes setup:
- 'hermes setup' now prompts for TTS provider after model selection
- 'hermes setup tts' available as standalone section
- Selecting NeuTTS checks for deps and offers to install:
espeak-ng (system) + neutts[all] (pip)
- ElevenLabs/OpenAI selections prompt for API keys
- Tool status display shows NeuTTS install state
Changes:
- Remove optional-skills/mlops/models/neutts/ (skill + CLI scaffold)
- Add tools/neutts_synth.py (standalone synthesis subprocess helper)
- Move jo.wav/jo.txt to tools/neutts_samples/ (bundled default voice)
- Refactor _generate_neutts() — uses neutts API via subprocess, no
neutts_cli dependency, config-driven ref_audio/ref_text/model/device
- Add TTS setup to hermes_cli/setup.py (SETUP_SECTIONS, tool status)
- Update config.py defaults (ref_audio, ref_text, model, device)
* fix(docker): add explicit env allowlist for container credentials (#1436)
Docker terminal sessions are secret-dark by default. This adds
terminal.docker_forward_env as an explicit allowlist for env vars
that may be forwarded into Docker containers.
Values resolve from the current shell first, then fall back to
~/.hermes/.env. Only variables the user explicitly lists are
forwarded — nothing is auto-exposed.
Cherry-picked from PR #1449 by @teknium1, conflict-resolved onto
current main.
Fixes #1436
Supersedes #1439
* fix: email send_typing metadata param + ☤ Hermes staff symbol
- email.py: add missing metadata parameter to send_typing() to match
BasePlatformAdapter signature (PR #1431 by @ItsChoudhry)
- README.md: ⚕ → ☤ — the caduceus is Hermes's staff, not the
medical Staff of Asclepius (PR #1420 by @rianczerwinski)
* fix(whatsapp): support LID format in self-chat mode (#1556)
WhatsApp now uses LID (Linked Identity Device) format alongside classic
@s.whatsapp.net. Self-chat detection checked only the classic format,
breaking self-chat mode for users on newer WhatsApp versions.
- Check both sock.user.id and sock.user.lid for self-chat detection
- Accept 'append' message type in addition to 'notify' (self-chat
messages arrive as 'append')
- Track sent message IDs to prevent echo-back loops with media
- Add WHATSAPP_DEBUG env var for troubleshooting
Based on PR #1556 by jcorrego (manually applied due to cherry-pick
conflicts).
* fix: detect Claude Code version dynamically for OAuth user-agent
The _CLAUDE_CODE_VERSION was hardcoded to '2.1.2' but Anthropic
rejects OAuth requests when the spoofed user-agent version is too
far behind the current Claude Code release. The error is a generic
400 with just 'Error' as the message, making it very hard to diagnose.
Fix: detect the installed version via 'claude --version' at import
time, falling back to a bumped static constant (2.1.74) when Claude
Code isn't installed. This means users who keep Claude Code updated
never hit stale-version rejections.
Reported by Jack — changing the version string to match the installed
claude binary fixed persistent OAuth 400 errors immediately.
---------
Co-authored-by: buray <ygd58@users.noreply.github.com>
Co-authored-by: lbn <llbn@users.noreply.github.com>
Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>
Co-authored-by: Max K <MaxKerkula@users.noreply.github.com>
Co-authored-by: Angello Picasso <angello.picasso@devsu.com>
Co-authored-by: kshitij <kshitijk4poor@users.noreply.github.com>
Co-authored-by: jcorrego <jcorrego@users.noreply.github.com>
2026-03-17 02:48:33 -07:00
|
|
|
|
# The version must stay reasonably current — Anthropic rejects OAuth requests
|
|
|
|
|
|
# when the spoofed user-agent version is too far behind the actual release.
|
|
|
|
|
|
_CLAUDE_CODE_VERSION_FALLBACK = "2.1.74"
|
2026-03-27 07:49:44 -07:00
|
|
|
|
_claude_code_version_cache: Optional[str] = None
|
fix: detect Claude Code version dynamically for OAuth user-agent
* fix: prevent infinite 400 failure loop on context overflow (#1630)
When a gateway session exceeds the model's context window, Anthropic may
return a generic 400 invalid_request_error with just 'Error' as the
message. This bypassed the phrase-based context-length detection,
causing the agent to treat it as a non-retryable client error. Worse,
the failed user message was still persisted to the transcript, making
the session even larger on each attempt — creating an infinite loop.
Three-layer fix:
1. run_agent.py — Fallback heuristic: when a 400 error has a very short
generic message AND the session is large (>40% of context or >80
messages), treat it as a probable context overflow and trigger
compression instead of aborting.
2. run_agent.py + gateway/run.py — Don't persist failed messages:
when the agent returns failed=True before generating any response,
skip writing the user's message to the transcript/DB. This prevents
the session from growing on each failure.
3. gateway/run.py — Smarter error messages: detect context-overflow
failures and suggest /compact or /reset specifically, instead of a
generic 'try again' that will fail identically.
* fix(skills): detect prompt injection patterns and block cache file reads
Adds two security layers to prevent prompt injection via skills hub
cache files (#1558):
1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory
(index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json
was the original injection vector — untrusted skill descriptions
in the catalog contained adversarial text that the model executed.
2. skill_view: warns when skills are loaded from outside the trusted
~/.hermes/skills/ directory, and detects common injection patterns
in skill content ("ignore previous instructions", "<system>", etc.).
Cherry-picked from PR #1562 by ygd58.
* fix(tools): chunk long messages in send_message_tool before dispatch (#1552)
Long messages sent via send_message tool or cron delivery silently
failed when exceeding platform limits. Gateway adapters handle this
via truncate_message(), but the standalone senders in send_message_tool
bypassed that entirely.
- Apply truncate_message() chunking in _send_to_platform() before
dispatching to individual platform senders
- Remove naive message[i:i+2000] character split in _send_discord()
in favor of centralized smart splitting
- Attach media files to last chunk only for Telegram
- Add regression tests for chunking and media placement
Cherry-picked from PR #1557 by llbn.
* fix(approval): show full command in dangerous command approval (#1553)
Previously the command was truncated to 80 chars in CLI (with a
[v]iew full option), 500 chars in Discord embeds, and missing entirely
in Telegram/Slack approval messages. Now the full command is always
displayed everywhere:
- CLI: removed 80-char truncation and [v]iew full menu option
- Gateway (TG/Slack): approval_required message includes full command
in a code block
- Discord: embed shows full command up to 4096-char limit
- Windows: skip SIGALRM-based test timeout (Unix-only)
- Updated tests: replaced view-flow tests with direct approval tests
Cherry-picked from PR #1566 by crazywriter1.
* fix(cli): flush stdout during agent loop to prevent macOS display freeze (#1624)
The interrupt polling loop in chat() waited on the queue without
invalidating the prompt_toolkit renderer. On macOS, the StdoutProxy
buffer only flushed on input events, causing the CLI to appear frozen
during tool execution until the user typed a key.
Fix: call _invalidate() on each queue timeout (every ~100ms, throttled
to 150ms) to force the renderer to flush buffered agent output.
* fix(claw): warn when API keys are skipped during OpenClaw migration (#1580)
When --migrate-secrets is not passed (the default), API keys like
OPENROUTER_API_KEY are silently skipped with no warning. Users don't
realize their keys weren't migrated until the agent fails to connect.
Add a post-migration warning with actionable instructions: either
re-run with --migrate-secrets or add the key manually via
hermes config set.
Cherry-picked from PR #1593 by ygd58.
* fix(security): block sandbox backend creds from subprocess env (#1264)
Add Modal and Daytona sandbox credentials to the subprocess env
blocklist so they're not leaked to agent terminal sessions via
printenv/env.
Cherry-picked from PR #1571 by ygd58.
* fix(gateway): cap interrupt recursion depth to prevent resource exhaustion (#816)
When a user sends multiple messages while the agent keeps failing,
_run_agent() calls itself recursively with no depth limit. This can
exhaust stack/memory if the agent is in a failure loop.
Add _MAX_INTERRUPT_DEPTH = 3. When exceeded, the pending message is
logged and the current result is returned instead of recursing deeper.
The log handler duplication bug described in #816 was already fixed
separately (AIAgent.__init__ deduplicates handlers).
* fix(gateway): /model shows active fallback model instead of config default (#1615)
When the agent falls back to a different model (e.g. due to rate
limiting), /model still showed the config default. Now tracks the
effective model/provider after each agent run and displays it.
Cleared when the primary model succeeds again or the user explicitly
switches via /model.
Cherry-picked from PR #1616 by MaxKerkula. Added hasattr guard for
test compatibility.
* feat(gateway): inject reply-to message context for out-of-session replies (#1594)
When a user replies to a Telegram message, check if the quoted text
exists in the current session transcript. If missing (from cron jobs,
background tasks, or old sessions), prepend [Replying to: "..."] to
the message so the agent has context about what's being referenced.
- Add reply_to_text field to MessageEvent (base.py)
- Populate from Telegram's reply_to_message (text or caption)
- Inject context in _handle_message when not found in history
Based on PR #1596 by anpicasso (cherry-picked reply-to feature only,
excluded unrelated /server command and background delegation changes).
* fix: recognize Claude Code OAuth credentials in startup gate (#1455)
The _has_any_provider_configured() startup check didn't look for
Claude Code OAuth credentials (~/.claude/.credentials.json). Users
with only Claude Code auth got the setup wizard instead of starting.
Cherry-picked from PR #1455 by kshitijk4poor.
* perf: use ripgrep for file search (200x faster than find)
search_files(target='files') now uses rg --files -g instead of find.
Ripgrep respects .gitignore, excludes hidden dirs by default, and has
parallel directory traversal — ~200x faster on wide trees (0.14s vs 34s
benchmarked on 164-repo tree).
Falls back to find when rg is unavailable, preserving hidden-dir
exclusion and BSD find compatibility.
Salvaged from PR #1464 by @light-merlin-dark (Merlin) — adapted to
preserve hidden-dir exclusion added since the original PR.
* refactor(tts): replace NeuTTS optional skill with built-in provider + setup flow
Remove the optional skill (redundant now that NeuTTS is a built-in TTS
provider). Replace neutts_cli dependency with a standalone synthesis
helper (tools/neutts_synth.py) that calls the neutts Python API directly
in a subprocess.
Add TTS provider selection to hermes setup:
- 'hermes setup' now prompts for TTS provider after model selection
- 'hermes setup tts' available as standalone section
- Selecting NeuTTS checks for deps and offers to install:
espeak-ng (system) + neutts[all] (pip)
- ElevenLabs/OpenAI selections prompt for API keys
- Tool status display shows NeuTTS install state
Changes:
- Remove optional-skills/mlops/models/neutts/ (skill + CLI scaffold)
- Add tools/neutts_synth.py (standalone synthesis subprocess helper)
- Move jo.wav/jo.txt to tools/neutts_samples/ (bundled default voice)
- Refactor _generate_neutts() — uses neutts API via subprocess, no
neutts_cli dependency, config-driven ref_audio/ref_text/model/device
- Add TTS setup to hermes_cli/setup.py (SETUP_SECTIONS, tool status)
- Update config.py defaults (ref_audio, ref_text, model, device)
* fix(docker): add explicit env allowlist for container credentials (#1436)
Docker terminal sessions are secret-dark by default. This adds
terminal.docker_forward_env as an explicit allowlist for env vars
that may be forwarded into Docker containers.
Values resolve from the current shell first, then fall back to
~/.hermes/.env. Only variables the user explicitly lists are
forwarded — nothing is auto-exposed.
Cherry-picked from PR #1449 by @teknium1, conflict-resolved onto
current main.
Fixes #1436
Supersedes #1439
* fix: email send_typing metadata param + ☤ Hermes staff symbol
- email.py: add missing metadata parameter to send_typing() to match
BasePlatformAdapter signature (PR #1431 by @ItsChoudhry)
- README.md: ⚕ → ☤ — the caduceus is Hermes's staff, not the
medical Staff of Asclepius (PR #1420 by @rianczerwinski)
* fix(whatsapp): support LID format in self-chat mode (#1556)
WhatsApp now uses LID (Linked Identity Device) format alongside classic
@s.whatsapp.net. Self-chat detection checked only the classic format,
breaking self-chat mode for users on newer WhatsApp versions.
- Check both sock.user.id and sock.user.lid for self-chat detection
- Accept 'append' message type in addition to 'notify' (self-chat
messages arrive as 'append')
- Track sent message IDs to prevent echo-back loops with media
- Add WHATSAPP_DEBUG env var for troubleshooting
Based on PR #1556 by jcorrego (manually applied due to cherry-pick
conflicts).
* fix: detect Claude Code version dynamically for OAuth user-agent
The _CLAUDE_CODE_VERSION was hardcoded to '2.1.2' but Anthropic
rejects OAuth requests when the spoofed user-agent version is too
far behind the current Claude Code release. The error is a generic
400 with just 'Error' as the message, making it very hard to diagnose.
Fix: detect the installed version via 'claude --version' at import
time, falling back to a bumped static constant (2.1.74) when Claude
Code isn't installed. This means users who keep Claude Code updated
never hit stale-version rejections.
Reported by Jack — changing the version string to match the installed
claude binary fixed persistent OAuth 400 errors immediately.
---------
Co-authored-by: buray <ygd58@users.noreply.github.com>
Co-authored-by: lbn <llbn@users.noreply.github.com>
Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>
Co-authored-by: Max K <MaxKerkula@users.noreply.github.com>
Co-authored-by: Angello Picasso <angello.picasso@devsu.com>
Co-authored-by: kshitij <kshitijk4poor@users.noreply.github.com>
Co-authored-by: jcorrego <jcorrego@users.noreply.github.com>
2026-03-17 02:48:33 -07:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def _detect_claude_code_version() -> str:
|
|
|
|
|
|
"""Detect the installed Claude Code version, fall back to a static constant.
|
|
|
|
|
|
|
|
|
|
|
|
Anthropic's OAuth infrastructure validates the user-agent version and may
|
|
|
|
|
|
reject requests with a version that's too old. Detecting dynamically means
|
|
|
|
|
|
users who keep Claude Code updated never hit stale-version 400s.
|
|
|
|
|
|
"""
|
|
|
|
|
|
import subprocess as _sp
|
|
|
|
|
|
|
|
|
|
|
|
for cmd in ("claude", "claude-code"):
|
|
|
|
|
|
try:
|
|
|
|
|
|
result = _sp.run(
|
|
|
|
|
|
[cmd, "--version"],
|
|
|
|
|
|
capture_output=True, text=True, timeout=5,
|
|
|
|
|
|
)
|
|
|
|
|
|
if result.returncode == 0 and result.stdout.strip():
|
|
|
|
|
|
# Output is like "2.1.74 (Claude Code)" or just "2.1.74"
|
|
|
|
|
|
version = result.stdout.strip().split()[0]
|
|
|
|
|
|
if version and version[0].isdigit():
|
|
|
|
|
|
return version
|
|
|
|
|
|
except Exception:
|
|
|
|
|
|
pass
|
|
|
|
|
|
return _CLAUDE_CODE_VERSION_FALLBACK
|
|
|
|
|
|
|
|
|
|
|
|
|
2026-03-16 17:08:22 -07:00
|
|
|
|
_CLAUDE_CODE_SYSTEM_PREFIX = "You are Claude Code, Anthropic's official CLI for Claude."
|
|
|
|
|
|
_MCP_TOOL_PREFIX = "mcp_"
|
|
|
|
|
|
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
|
2026-03-27 07:49:44 -07:00
|
|
|
|
def _get_claude_code_version() -> str:
|
|
|
|
|
|
"""Lazily detect the installed Claude Code version when OAuth headers need it."""
|
|
|
|
|
|
global _claude_code_version_cache
|
|
|
|
|
|
if _claude_code_version_cache is None:
|
|
|
|
|
|
_claude_code_version_cache = _detect_claude_code_version()
|
|
|
|
|
|
return _claude_code_version_cache
|
|
|
|
|
|
|
|
|
|
|
|
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
def _is_oauth_token(key: str) -> bool:
|
2026-04-11 00:43:01 -07:00
|
|
|
|
"""Check if the key is an Anthropic OAuth/setup token.
|
2026-03-12 17:14:22 -07:00
|
|
|
|
|
2026-04-11 00:43:01 -07:00
|
|
|
|
Positively identifies Anthropic OAuth tokens by their key format:
|
|
|
|
|
|
- ``sk-ant-`` prefix (but NOT ``sk-ant-api``) → setup tokens, managed keys
|
|
|
|
|
|
- ``eyJ`` prefix → JWTs from the Anthropic OAuth flow
|
|
|
|
|
|
|
|
|
|
|
|
Non-Anthropic keys (MiniMax, Alibaba, etc.) don't match either pattern
|
|
|
|
|
|
and correctly return False.
|
2026-03-12 17:14:22 -07:00
|
|
|
|
"""
|
|
|
|
|
|
if not key:
|
|
|
|
|
|
return False
|
2026-04-11 00:43:01 -07:00
|
|
|
|
# Regular Anthropic Console API keys — x-api-key auth, never OAuth
|
2026-03-12 17:14:22 -07:00
|
|
|
|
if key.startswith("sk-ant-api"):
|
|
|
|
|
|
return False
|
2026-04-11 00:43:01 -07:00
|
|
|
|
# Anthropic-issued tokens (setup-tokens sk-ant-oat-*, managed keys)
|
|
|
|
|
|
if key.startswith("sk-ant-"):
|
|
|
|
|
|
return True
|
|
|
|
|
|
# JWTs from Anthropic OAuth flow
|
|
|
|
|
|
if key.startswith("eyJ"):
|
|
|
|
|
|
return True
|
|
|
|
|
|
return False
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
|
|
|
|
|
|
|
2026-04-08 13:51:41 -07:00
|
|
|
|
def _normalize_base_url_text(base_url) -> str:
|
|
|
|
|
|
"""Normalize SDK/base transport URL values to a plain string for inspection.
|
|
|
|
|
|
|
|
|
|
|
|
Some client objects expose ``base_url`` as an ``httpx.URL`` instead of a raw
|
|
|
|
|
|
string. Provider/auth detection should accept either shape.
|
|
|
|
|
|
"""
|
|
|
|
|
|
if not base_url:
|
|
|
|
|
|
return ""
|
|
|
|
|
|
return str(base_url).strip()
|
|
|
|
|
|
|
|
|
|
|
|
|
2026-03-30 20:36:56 -07:00
|
|
|
|
def _is_third_party_anthropic_endpoint(base_url: str | None) -> bool:
|
|
|
|
|
|
"""Return True for non-Anthropic endpoints using the Anthropic Messages API.
|
|
|
|
|
|
|
|
|
|
|
|
Third-party proxies (Azure AI Foundry, AWS Bedrock, self-hosted) authenticate
|
|
|
|
|
|
with their own API keys via x-api-key, not Anthropic OAuth tokens. OAuth
|
|
|
|
|
|
detection should be skipped for these endpoints.
|
|
|
|
|
|
"""
|
2026-04-08 13:51:41 -07:00
|
|
|
|
normalized = _normalize_base_url_text(base_url)
|
|
|
|
|
|
if not normalized:
|
2026-03-30 20:36:56 -07:00
|
|
|
|
return False # No base_url = direct Anthropic API
|
2026-04-08 13:51:41 -07:00
|
|
|
|
normalized = normalized.rstrip("/").lower()
|
2026-03-30 20:36:56 -07:00
|
|
|
|
if "anthropic.com" in normalized:
|
|
|
|
|
|
return False # Direct Anthropic API — OAuth applies
|
|
|
|
|
|
return True # Any other endpoint is a third-party proxy
|
|
|
|
|
|
|
|
|
|
|
|
|
2026-03-30 13:19:44 -07:00
|
|
|
|
def _requires_bearer_auth(base_url: str | None) -> bool:
|
|
|
|
|
|
"""Return True for Anthropic-compatible providers that require Bearer auth.
|
|
|
|
|
|
|
|
|
|
|
|
Some third-party /anthropic endpoints implement Anthropic's Messages API but
|
2026-04-08 13:51:41 -07:00
|
|
|
|
require Authorization: Bearer *** of Anthropic's native x-api-key header.
|
2026-03-30 13:19:44 -07:00
|
|
|
|
MiniMax's global and China Anthropic-compatible endpoints follow this pattern.
|
|
|
|
|
|
"""
|
2026-04-08 13:51:41 -07:00
|
|
|
|
normalized = _normalize_base_url_text(base_url)
|
|
|
|
|
|
if not normalized:
|
2026-03-30 13:19:44 -07:00
|
|
|
|
return False
|
2026-04-08 13:51:41 -07:00
|
|
|
|
normalized = normalized.rstrip("/").lower()
|
refactor: codebase-wide lint cleanup — unused imports, dead code, and inefficient patterns (#5821)
Comprehensive cleanup across 80 files based on automated (ruff, pyflakes, vulture)
and manual analysis of the entire codebase.
Changes by category:
Unused imports removed (~95 across 55 files):
- Removed genuinely unused imports from all major subsystems
- agent/, hermes_cli/, tools/, gateway/, plugins/, cron/
- Includes imports in try/except blocks that were truly unused
(vs availability checks which were left alone)
Unused variables removed (~25):
- Removed dead variables: connected, inner, channels, last_exc,
source, new_server_names, verify, pconfig, default_terminal,
result, pending_handled, temperature, loop
- Dropped unused argparse subparser assignments in hermes_cli/main.py
(12 instances of add_parser() where result was never used)
Dead code removed:
- run_agent.py: Removed dead ternary (None if False else None) and
surrounding unreachable branch in identity fallback
- run_agent.py: Removed write-only attribute _last_reported_tool
- hermes_cli/providers.py: Removed dead @property decorator on
module-level function (decorator has no effect outside a class)
- gateway/run.py: Removed unused MCP config load before reconnect
- gateway/platforms/slack.py: Removed dead SessionSource construction
Undefined name bugs fixed (would cause NameError at runtime):
- batch_runner.py: Added missing logger = logging.getLogger(__name__)
- tools/environments/daytona.py: Added missing Dict and Path imports
Unnecessary global statements removed (14):
- tools/terminal_tool.py: 5 functions declared global for dicts
they only mutated via .pop()/[key]=value (no rebinding)
- tools/browser_tool.py: cleanup thread loop only reads flag
- tools/rl_training_tool.py: 4 functions only do dict mutations
- tools/mcp_oauth.py: only reads the global
- hermes_time.py: only reads cached values
Inefficient patterns fixed:
- startswith/endswith tuple form: 15 instances of
x.startswith('a') or x.startswith('b') consolidated to
x.startswith(('a', 'b'))
- len(x)==0 / len(x)>0: 13 instances replaced with pythonic
truthiness checks (not x / bool(x))
- in dict.keys(): 5 instances simplified to in dict
- Redefined unused name: removed duplicate _strip_mdv2 import in
send_message_tool.py
Other fixes:
- hermes_cli/doctor.py: Replaced undefined logger.debug() with pass
- hermes_cli/config.py: Consolidated chained .endswith() calls
Test results: 3934 passed, 17 failed (all pre-existing on main),
19 skipped. Zero regressions.
2026-04-07 10:25:31 -07:00
|
|
|
|
return normalized.startswith(("https://api.minimax.io/anthropic", "https://api.minimaxi.com/anthropic"))
|
2026-03-30 13:19:44 -07:00
|
|
|
|
|
|
|
|
|
|
|
2026-04-09 17:09:38 -07:00
|
|
|
|
def _common_betas_for_base_url(base_url: str | None) -> list[str]:
|
|
|
|
|
|
"""Return the beta headers that are safe for the configured endpoint.
|
|
|
|
|
|
|
|
|
|
|
|
MiniMax's Anthropic-compatible endpoints (Bearer-auth) reject requests
|
|
|
|
|
|
that include Anthropic's ``fine-grained-tool-streaming`` beta — every
|
|
|
|
|
|
tool-use message triggers a connection error. Strip that beta for
|
|
|
|
|
|
Bearer-auth endpoints while keeping all other betas intact.
|
|
|
|
|
|
"""
|
|
|
|
|
|
if _requires_bearer_auth(base_url):
|
|
|
|
|
|
return [b for b in _COMMON_BETAS if b != _TOOL_STREAMING_BETA]
|
|
|
|
|
|
return _COMMON_BETAS
|
|
|
|
|
|
|
|
|
|
|
|
|
2026-04-19 05:41:29 -07:00
|
|
|
|
def build_anthropic_client(api_key: str, base_url: str = None, timeout: float = None):
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
"""Create an Anthropic client, auto-detecting setup-tokens vs API keys.
|
|
|
|
|
|
|
2026-04-19 05:41:29 -07:00
|
|
|
|
If *timeout* is provided it overrides the default 900s read timeout. The
|
|
|
|
|
|
connect timeout stays at 10s. Callers pass this from the per-provider /
|
|
|
|
|
|
per-model ``request_timeout_seconds`` config so Anthropic-native and
|
|
|
|
|
|
Anthropic-compatible providers respect the same knob as OpenAI-wire
|
|
|
|
|
|
providers.
|
|
|
|
|
|
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
Returns an anthropic.Anthropic instance.
|
|
|
|
|
|
"""
|
|
|
|
|
|
if _anthropic_sdk is None:
|
|
|
|
|
|
raise ImportError(
|
|
|
|
|
|
"The 'anthropic' package is required for the Anthropic provider. "
|
|
|
|
|
|
"Install it with: pip install 'anthropic>=0.39.0'"
|
|
|
|
|
|
)
|
2026-04-21 17:55:04 +08:00
|
|
|
|
|
|
|
|
|
|
normalize_proxy_env_vars()
|
|
|
|
|
|
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
from httpx import Timeout
|
|
|
|
|
|
|
2026-04-08 13:51:41 -07:00
|
|
|
|
normalized_base_url = _normalize_base_url_text(base_url)
|
2026-04-19 05:41:29 -07:00
|
|
|
|
_read_timeout = timeout if (isinstance(timeout, (int, float)) and timeout > 0) else 900.0
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
kwargs = {
|
2026-04-19 05:41:29 -07:00
|
|
|
|
"timeout": Timeout(timeout=float(_read_timeout), connect=10.0),
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
}
|
2026-04-08 13:51:41 -07:00
|
|
|
|
if normalized_base_url:
|
|
|
|
|
|
kwargs["base_url"] = normalized_base_url
|
2026-04-09 17:09:38 -07:00
|
|
|
|
common_betas = _common_betas_for_base_url(normalized_base_url)
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
|
2026-04-08 13:51:41 -07:00
|
|
|
|
if _requires_bearer_auth(normalized_base_url):
|
2026-03-30 13:19:44 -07:00
|
|
|
|
# Some Anthropic-compatible providers (e.g. MiniMax) expect the API key in
|
|
|
|
|
|
# Authorization: Bearer even for regular API keys. Route those endpoints
|
|
|
|
|
|
# through auth_token so the SDK sends Bearer auth instead of x-api-key.
|
|
|
|
|
|
# Check this before OAuth token shape detection because MiniMax secrets do
|
|
|
|
|
|
# not use Anthropic's sk-ant-api prefix and would otherwise be misread as
|
|
|
|
|
|
# Anthropic OAuth/setup tokens.
|
|
|
|
|
|
kwargs["auth_token"] = api_key
|
2026-04-09 17:09:38 -07:00
|
|
|
|
if common_betas:
|
|
|
|
|
|
kwargs["default_headers"] = {"anthropic-beta": ",".join(common_betas)}
|
2026-03-30 20:36:56 -07:00
|
|
|
|
elif _is_third_party_anthropic_endpoint(base_url):
|
|
|
|
|
|
# Third-party proxies (Azure AI Foundry, AWS Bedrock, etc.) use their
|
|
|
|
|
|
# own API keys with x-api-key auth. Skip OAuth detection — their keys
|
|
|
|
|
|
# don't follow Anthropic's sk-ant-* prefix convention and would be
|
|
|
|
|
|
# misclassified as OAuth tokens.
|
|
|
|
|
|
kwargs["api_key"] = api_key
|
2026-04-09 17:09:38 -07:00
|
|
|
|
if common_betas:
|
|
|
|
|
|
kwargs["default_headers"] = {"anthropic-beta": ",".join(common_betas)}
|
2026-03-30 13:19:44 -07:00
|
|
|
|
elif _is_oauth_token(api_key):
|
2026-03-16 17:08:22 -07:00
|
|
|
|
# OAuth access token / setup-token → Bearer auth + Claude Code identity.
|
|
|
|
|
|
# Anthropic routes OAuth requests based on user-agent and headers;
|
|
|
|
|
|
# without Claude Code's fingerprint, requests get intermittent 500s.
|
2026-04-09 17:09:38 -07:00
|
|
|
|
all_betas = common_betas + _OAUTH_ONLY_BETAS
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
kwargs["auth_token"] = api_key
|
2026-03-16 17:08:22 -07:00
|
|
|
|
kwargs["default_headers"] = {
|
|
|
|
|
|
"anthropic-beta": ",".join(all_betas),
|
2026-03-27 07:49:44 -07:00
|
|
|
|
"user-agent": f"claude-cli/{_get_claude_code_version()} (external, cli)",
|
2026-03-16 17:08:22 -07:00
|
|
|
|
"x-app": "cli",
|
|
|
|
|
|
}
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
else:
|
fix(anthropic): address gaps found in deep-dive audit
After studying clawdbot (OpenClaw) and OpenCode implementations:
## Beta headers
- Add interleaved-thinking-2025-05-14 and fine-grained-tool-streaming-2025-05-14
as common betas (sent with ALL auth types, not just OAuth)
- OAuth tokens additionally get oauth-2025-04-20
- API keys now also get the common betas (previously got none)
## Vision/image support
- Add _convert_vision_content() to convert OpenAI multimodal format
(image_url blocks) to Anthropic format (image blocks with base64/url source)
- Handles both data: URIs (base64) and regular URLs
## Role alternation enforcement
- Anthropic strictly rejects consecutive same-role messages (400 error)
- Add post-processing step that merges consecutive user/assistant messages
- Handles string, list, and mixed content types during merge
## Tool choice support
- Add tool_choice parameter to build_anthropic_kwargs()
- Maps OpenAI values: auto→auto, required→any, none→omit, name→tool
## Cache metrics tracking
- Anthropic uses cache_read_input_tokens / cache_creation_input_tokens
(different from OpenRouter's prompt_tokens_details.cached_tokens)
- Add api_mode-aware branch in run_agent.py cache stats logging
## Credential refresh on 401
- On 401 error during anthropic_messages mode, re-read credentials
via resolve_anthropic_token() (picks up refreshed Claude Code tokens)
- Rebuild client if new token differs from current one
- Follows same pattern as Codex/Nous 401 refresh handlers
## Tests
- 44 adapter tests (8 new: vision conversion, role alternation, tool choice)
- Updated beta header tests to verify new structure
- Full suite: 3198 passed, 0 regressions
2026-03-12 16:00:46 -07:00
|
|
|
|
# Regular API key → x-api-key header + common betas
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
kwargs["api_key"] = api_key
|
2026-04-09 17:09:38 -07:00
|
|
|
|
if common_betas:
|
|
|
|
|
|
kwargs["default_headers"] = {"anthropic-beta": ",".join(common_betas)}
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
|
|
|
|
|
|
return _anthropic_sdk.Anthropic(**kwargs)
|
|
|
|
|
|
|
|
|
|
|
|
|
feat: native AWS Bedrock provider via Converse API
Salvaged from PR #7920 by JiaDe-Wu — cherry-picked Bedrock-specific
additions onto current main, skipping stale-branch reverts (293 commits
behind).
Dual-path architecture:
- Claude models → AnthropicBedrock SDK (prompt caching, thinking budgets)
- Non-Claude models → Converse API via boto3 (Nova, DeepSeek, Llama, Mistral)
Includes:
- Core adapter (agent/bedrock_adapter.py, 1098 lines)
- Full provider registration (auth, models, providers, config, runtime, main)
- IAM credential chain + Bedrock API Key auth modes
- Dynamic model discovery via ListFoundationModels + ListInferenceProfiles
- Streaming with delta callbacks, error classification, guardrails
- hermes doctor + hermes auth integration
- /usage pricing for 7 Bedrock models
- 130 automated tests (79 unit + 28 integration + follow-up fixes)
- Documentation (website/docs/guides/aws-bedrock.md)
- boto3 optional dependency (pip install hermes-agent[bedrock])
Co-authored-by: JiaDe WU <40445668+JiaDe-Wu@users.noreply.github.com>
2026-04-15 15:18:01 -07:00
|
|
|
|
def build_anthropic_bedrock_client(region: str):
|
|
|
|
|
|
"""Create an AnthropicBedrock client for Bedrock Claude models.
|
|
|
|
|
|
|
|
|
|
|
|
Uses the Anthropic SDK's native Bedrock adapter, which provides full
|
|
|
|
|
|
Claude feature parity: prompt caching, thinking budgets, adaptive
|
|
|
|
|
|
thinking, fast mode — features not available via the Converse API.
|
|
|
|
|
|
|
|
|
|
|
|
Auth uses the boto3 default credential chain (IAM roles, SSO, env vars).
|
|
|
|
|
|
"""
|
|
|
|
|
|
if _anthropic_sdk is None:
|
|
|
|
|
|
raise ImportError(
|
|
|
|
|
|
"The 'anthropic' package is required for the Bedrock provider. "
|
|
|
|
|
|
"Install it with: pip install 'anthropic>=0.39.0'"
|
|
|
|
|
|
)
|
|
|
|
|
|
if not hasattr(_anthropic_sdk, "AnthropicBedrock"):
|
|
|
|
|
|
raise ImportError(
|
|
|
|
|
|
"anthropic.AnthropicBedrock not available. "
|
|
|
|
|
|
"Upgrade with: pip install 'anthropic>=0.39.0'"
|
|
|
|
|
|
)
|
|
|
|
|
|
from httpx import Timeout
|
|
|
|
|
|
|
|
|
|
|
|
return _anthropic_sdk.AnthropicBedrock(
|
|
|
|
|
|
aws_region=region,
|
|
|
|
|
|
timeout=Timeout(timeout=900.0, connect=10.0),
|
|
|
|
|
|
)
|
|
|
|
|
|
|
|
|
|
|
|
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
def read_claude_code_credentials() -> Optional[Dict[str, Any]]:
|
2026-03-14 22:11:21 -07:00
|
|
|
|
"""Read refreshable Claude Code OAuth credentials from ~/.claude/.credentials.json.
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
|
2026-03-14 22:11:21 -07:00
|
|
|
|
This intentionally excludes ~/.claude.json primaryApiKey. Opencode's
|
|
|
|
|
|
subscription flow is OAuth/setup-token based with refreshable credentials,
|
|
|
|
|
|
and native direct Anthropic provider usage should follow that path rather
|
|
|
|
|
|
than auto-detecting Claude's first-party managed key.
|
2026-03-12 16:43:31 -07:00
|
|
|
|
|
|
|
|
|
|
Returns dict with {accessToken, refreshToken?, expiresAt?} or None.
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
"""
|
|
|
|
|
|
cred_path = Path.home() / ".claude" / ".credentials.json"
|
2026-03-12 16:43:31 -07:00
|
|
|
|
if cred_path.exists():
|
|
|
|
|
|
try:
|
|
|
|
|
|
data = json.loads(cred_path.read_text(encoding="utf-8"))
|
|
|
|
|
|
oauth_data = data.get("claudeAiOauth")
|
|
|
|
|
|
if oauth_data and isinstance(oauth_data, dict):
|
|
|
|
|
|
access_token = oauth_data.get("accessToken", "")
|
|
|
|
|
|
if access_token:
|
|
|
|
|
|
return {
|
|
|
|
|
|
"accessToken": access_token,
|
|
|
|
|
|
"refreshToken": oauth_data.get("refreshToken", ""),
|
|
|
|
|
|
"expiresAt": oauth_data.get("expiresAt", 0),
|
2026-03-14 21:44:39 -07:00
|
|
|
|
"source": "claude_code_credentials_file",
|
2026-03-12 16:43:31 -07:00
|
|
|
|
}
|
|
|
|
|
|
except (json.JSONDecodeError, OSError, IOError) as e:
|
|
|
|
|
|
logger.debug("Failed to read ~/.claude/.credentials.json: %s", e)
|
|
|
|
|
|
|
|
|
|
|
|
return None
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
|
|
|
|
|
|
|
2026-03-14 22:11:21 -07:00
|
|
|
|
def read_claude_managed_key() -> Optional[str]:
|
|
|
|
|
|
"""Read Claude's native managed key from ~/.claude.json for diagnostics only."""
|
|
|
|
|
|
claude_json = Path.home() / ".claude.json"
|
|
|
|
|
|
if claude_json.exists():
|
|
|
|
|
|
try:
|
|
|
|
|
|
data = json.loads(claude_json.read_text(encoding="utf-8"))
|
|
|
|
|
|
primary_key = data.get("primaryApiKey", "")
|
|
|
|
|
|
if isinstance(primary_key, str) and primary_key.strip():
|
|
|
|
|
|
return primary_key.strip()
|
|
|
|
|
|
except (json.JSONDecodeError, OSError, IOError) as e:
|
|
|
|
|
|
logger.debug("Failed to read ~/.claude.json: %s", e)
|
|
|
|
|
|
return None
|
|
|
|
|
|
|
|
|
|
|
|
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
def is_claude_code_token_valid(creds: Dict[str, Any]) -> bool:
|
|
|
|
|
|
"""Check if Claude Code credentials have a non-expired access token."""
|
|
|
|
|
|
import time
|
|
|
|
|
|
|
|
|
|
|
|
expires_at = creds.get("expiresAt", 0)
|
|
|
|
|
|
if not expires_at:
|
2026-03-12 16:43:31 -07:00
|
|
|
|
# No expiry set (managed keys) — valid if token is present
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
return bool(creds.get("accessToken"))
|
|
|
|
|
|
|
|
|
|
|
|
# expiresAt is in milliseconds since epoch
|
|
|
|
|
|
now_ms = int(time.time() * 1000)
|
|
|
|
|
|
# Allow 60 seconds of buffer
|
|
|
|
|
|
return now_ms < (expires_at - 60_000)
|
|
|
|
|
|
|
|
|
|
|
|
|
feat(auth): same-provider credential pools with rotation, custom endpoint support, and interactive CLI (#2647)
* feat(auth): add same-provider credential pools and rotation UX
Add same-provider credential pooling so Hermes can rotate across
multiple credentials for a single provider, recover from exhausted
credentials without jumping providers immediately, and configure
that behavior directly in hermes setup.
- agent/credential_pool.py: persisted per-provider credential pools
- hermes auth add/list/remove/reset CLI commands
- 429/402/401 recovery with pool rotation in run_agent.py
- Setup wizard integration for pool strategy configuration
- Auto-seeding from env vars and existing OAuth state
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Salvaged from PR #2647
* fix(tests): prevent pool auto-seeding from host env in credential pool tests
Tests for non-pool Anthropic paths and auth remove were failing when
host env vars (ANTHROPIC_API_KEY) or file-backed OAuth credentials
were present. The pool auto-seeding picked these up, causing unexpected
pool entries in tests.
- Mock _select_pool_entry in auxiliary_client OAuth flag tests
- Clear Anthropic env vars and mock _seed_from_singletons in auth remove test
* feat(auth): add thread safety, least_used strategy, and request counting
- Add threading.Lock to CredentialPool for gateway thread safety
(concurrent requests from multiple gateway sessions could race on
pool state mutations without this)
- Add 'least_used' rotation strategy that selects the credential
with the lowest request_count, distributing load more evenly
- Add request_count field to PooledCredential for usage tracking
- Add mark_used() method to increment per-credential request counts
- Wrap select(), mark_exhausted_and_rotate(), and try_refresh_current()
with lock acquisition
- Add tests: least_used selection, mark_used counting, concurrent
thread safety (4 threads × 20 selects with no corruption)
* feat(auth): add interactive mode for bare 'hermes auth' command
When 'hermes auth' is called without a subcommand, it now launches an
interactive wizard that:
1. Shows full credential pool status across all providers
2. Offers a menu: add, remove, reset cooldowns, set strategy
3. For OAuth-capable providers (anthropic, nous, openai-codex), the
add flow explicitly asks 'API key or OAuth login?' — making it
clear that both auth types are supported for the same provider
4. Strategy picker shows all 4 options (fill_first, round_robin,
least_used, random) with the current selection marked
5. Remove flow shows entries with indices for easy selection
The subcommand paths (hermes auth add/list/remove/reset) still work
exactly as before for scripted/non-interactive use.
* fix(tests): update runtime_provider tests for config.yaml source of truth (#4165)
Tests were using OPENAI_BASE_URL env var which is no longer consulted
after #4165. Updated to use model config (provider, base_url, api_key)
which is the new single source of truth for custom endpoint URLs.
* feat(auth): support custom endpoint credential pools keyed by provider name
Custom OpenAI-compatible endpoints all share provider='custom', making
the provider-keyed pool useless. Now pools for custom endpoints are
keyed by 'custom:<normalized_name>' where the name comes from the
custom_providers config list (auto-generated from URL hostname).
- Pool key format: 'custom:together.ai', 'custom:local-(localhost:8080)'
- load_pool('custom:name') seeds from custom_providers api_key AND
model.api_key when base_url matches
- hermes auth add/list now shows custom endpoints alongside registry
providers
- _resolve_openrouter_runtime and _resolve_named_custom_runtime check
pool before falling back to single config key
- 6 new tests covering custom pool keying, seeding, and listing
* docs: add Excalidraw diagram of full credential pool flow
Comprehensive architecture diagram showing:
- Credential sources (env vars, auth.json OAuth, config.yaml, CLI)
- Pool storage and auto-seeding
- Runtime resolution paths (registry, custom, OpenRouter)
- Error recovery (429 retry-then-rotate, 402 immediate, 401 refresh)
- CLI management commands and strategy configuration
Open at: https://excalidraw.com/#json=2Ycqhqpi6f12E_3ITyiwh,c7u9jSt5BwrmiVzHGbm87g
* fix(tests): update setup wizard pool tests for unified select_provider_and_model flow
The setup wizard now delegates to select_provider_and_model() instead
of using its own prompt_choice-based provider picker. Tests needed:
- Mock select_provider_and_model as no-op (provider pre-written to config)
- Call _stub_tts BEFORE custom prompt_choice mock (it overwrites it)
- Pre-write model.provider to config so the pool step is reached
* docs: add comprehensive credential pool documentation
- New page: website/docs/user-guide/features/credential-pools.md
Full guide covering quick start, CLI commands, rotation strategies,
error recovery, custom endpoint pools, auto-discovery, thread safety,
architecture, and storage format.
- Updated fallback-providers.md to reference credential pools as the
first layer of resilience (same-provider rotation before cross-provider)
- Added hermes auth to CLI commands reference with usage examples
- Added credential_pool_strategies to configuration guide
* chore: remove excalidraw diagram from repo (external link only)
* refactor: simplify credential pool code — extract helpers, collapse extras, dedup patterns
- _load_config_safe(): replace 4 identical try/except/import blocks
- _iter_custom_providers(): shared generator for custom provider iteration
- PooledCredential.extra dict: collapse 11 round-trip-only fields
(token_type, scope, client_id, portal_base_url, obtained_at,
expires_in, agent_key_id, agent_key_expires_in, agent_key_reused,
agent_key_obtained_at, tls) into a single extra dict with
__getattr__ for backward-compatible access
- _available_entries(): shared exhaustion-check between select and peek
- Dedup anthropic OAuth seeding (hermes_pkce + claude_code identical)
- SimpleNamespace replaces class _Args boilerplate in auth_commands
- _try_resolve_from_custom_pool(): shared pool-check in runtime_provider
Net -17 lines. All 383 targeted tests pass.
---------
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-31 03:10:01 -07:00
|
|
|
|
def refresh_anthropic_oauth_pure(refresh_token: str, *, use_json: bool = False) -> Dict[str, Any]:
|
|
|
|
|
|
"""Refresh an Anthropic OAuth token without mutating local credential files."""
|
2026-03-26 13:26:56 -07:00
|
|
|
|
import time
|
feat(auth): same-provider credential pools with rotation, custom endpoint support, and interactive CLI (#2647)
* feat(auth): add same-provider credential pools and rotation UX
Add same-provider credential pooling so Hermes can rotate across
multiple credentials for a single provider, recover from exhausted
credentials without jumping providers immediately, and configure
that behavior directly in hermes setup.
- agent/credential_pool.py: persisted per-provider credential pools
- hermes auth add/list/remove/reset CLI commands
- 429/402/401 recovery with pool rotation in run_agent.py
- Setup wizard integration for pool strategy configuration
- Auto-seeding from env vars and existing OAuth state
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Salvaged from PR #2647
* fix(tests): prevent pool auto-seeding from host env in credential pool tests
Tests for non-pool Anthropic paths and auth remove were failing when
host env vars (ANTHROPIC_API_KEY) or file-backed OAuth credentials
were present. The pool auto-seeding picked these up, causing unexpected
pool entries in tests.
- Mock _select_pool_entry in auxiliary_client OAuth flag tests
- Clear Anthropic env vars and mock _seed_from_singletons in auth remove test
* feat(auth): add thread safety, least_used strategy, and request counting
- Add threading.Lock to CredentialPool for gateway thread safety
(concurrent requests from multiple gateway sessions could race on
pool state mutations without this)
- Add 'least_used' rotation strategy that selects the credential
with the lowest request_count, distributing load more evenly
- Add request_count field to PooledCredential for usage tracking
- Add mark_used() method to increment per-credential request counts
- Wrap select(), mark_exhausted_and_rotate(), and try_refresh_current()
with lock acquisition
- Add tests: least_used selection, mark_used counting, concurrent
thread safety (4 threads × 20 selects with no corruption)
* feat(auth): add interactive mode for bare 'hermes auth' command
When 'hermes auth' is called without a subcommand, it now launches an
interactive wizard that:
1. Shows full credential pool status across all providers
2. Offers a menu: add, remove, reset cooldowns, set strategy
3. For OAuth-capable providers (anthropic, nous, openai-codex), the
add flow explicitly asks 'API key or OAuth login?' — making it
clear that both auth types are supported for the same provider
4. Strategy picker shows all 4 options (fill_first, round_robin,
least_used, random) with the current selection marked
5. Remove flow shows entries with indices for easy selection
The subcommand paths (hermes auth add/list/remove/reset) still work
exactly as before for scripted/non-interactive use.
* fix(tests): update runtime_provider tests for config.yaml source of truth (#4165)
Tests were using OPENAI_BASE_URL env var which is no longer consulted
after #4165. Updated to use model config (provider, base_url, api_key)
which is the new single source of truth for custom endpoint URLs.
* feat(auth): support custom endpoint credential pools keyed by provider name
Custom OpenAI-compatible endpoints all share provider='custom', making
the provider-keyed pool useless. Now pools for custom endpoints are
keyed by 'custom:<normalized_name>' where the name comes from the
custom_providers config list (auto-generated from URL hostname).
- Pool key format: 'custom:together.ai', 'custom:local-(localhost:8080)'
- load_pool('custom:name') seeds from custom_providers api_key AND
model.api_key when base_url matches
- hermes auth add/list now shows custom endpoints alongside registry
providers
- _resolve_openrouter_runtime and _resolve_named_custom_runtime check
pool before falling back to single config key
- 6 new tests covering custom pool keying, seeding, and listing
* docs: add Excalidraw diagram of full credential pool flow
Comprehensive architecture diagram showing:
- Credential sources (env vars, auth.json OAuth, config.yaml, CLI)
- Pool storage and auto-seeding
- Runtime resolution paths (registry, custom, OpenRouter)
- Error recovery (429 retry-then-rotate, 402 immediate, 401 refresh)
- CLI management commands and strategy configuration
Open at: https://excalidraw.com/#json=2Ycqhqpi6f12E_3ITyiwh,c7u9jSt5BwrmiVzHGbm87g
* fix(tests): update setup wizard pool tests for unified select_provider_and_model flow
The setup wizard now delegates to select_provider_and_model() instead
of using its own prompt_choice-based provider picker. Tests needed:
- Mock select_provider_and_model as no-op (provider pre-written to config)
- Call _stub_tts BEFORE custom prompt_choice mock (it overwrites it)
- Pre-write model.provider to config so the pool step is reached
* docs: add comprehensive credential pool documentation
- New page: website/docs/user-guide/features/credential-pools.md
Full guide covering quick start, CLI commands, rotation strategies,
error recovery, custom endpoint pools, auto-discovery, thread safety,
architecture, and storage format.
- Updated fallback-providers.md to reference credential pools as the
first layer of resilience (same-provider rotation before cross-provider)
- Added hermes auth to CLI commands reference with usage examples
- Added credential_pool_strategies to configuration guide
* chore: remove excalidraw diagram from repo (external link only)
* refactor: simplify credential pool code — extract helpers, collapse extras, dedup patterns
- _load_config_safe(): replace 4 identical try/except/import blocks
- _iter_custom_providers(): shared generator for custom provider iteration
- PooledCredential.extra dict: collapse 11 round-trip-only fields
(token_type, scope, client_id, portal_base_url, obtained_at,
expires_in, agent_key_id, agent_key_expires_in, agent_key_reused,
agent_key_obtained_at, tls) into a single extra dict with
__getattr__ for backward-compatible access
- _available_entries(): shared exhaustion-check between select and peek
- Dedup anthropic OAuth seeding (hermes_pkce + claude_code identical)
- SimpleNamespace replaces class _Args boilerplate in auth_commands
- _try_resolve_from_custom_pool(): shared pool-check in runtime_provider
Net -17 lines. All 383 targeted tests pass.
---------
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-31 03:10:01 -07:00
|
|
|
|
import urllib.parse
|
fix: Anthropic OAuth — beta header, token refresh, config contamination, reauthentication (#1132)
Fixes Anthropic OAuth/subscription authentication end-to-end:
Auth failures (401 errors):
- Add missing 'claude-code-20250219' beta header for OAuth tokens. Both
clawdbot and OpenCode include this alongside 'oauth-2025-04-20' — without
it, Anthropic's API rejects OAuth tokens with 401 authentication errors.
- Fix _fetch_anthropic_models() to use canonical beta headers from
_COMMON_BETAS + _OAUTH_ONLY_BETAS instead of hardcoding.
Token refresh:
- Add _refresh_oauth_token() — when Claude Code credentials from
~/.claude/.credentials.json are expired but have a refresh token,
automatically POST to console.anthropic.com/v1/oauth/token to get
a new access token. Uses the same client_id as Claude Code / OpenCode.
- Add _write_claude_code_credentials() — writes refreshed tokens back
to ~/.claude/.credentials.json, preserving other fields.
- resolve_anthropic_token() now auto-refreshes expired tokens before
returning None.
Config contamination:
- Anthropic's _model_flow_anthropic() no longer saves base_url to config.
Since resolve_runtime_provider() always hardcodes Anthropic's URL, the
stale base_url was contaminating other providers when users switched
without re-running 'hermes model' (e.g., Codex hitting api.anthropic.com).
- _update_config_for_provider() now pops base_url when passed empty string.
- Same fix in setup.py.
Flow/UX (hermes model command):
- CLAUDE_CODE_OAUTH_TOKEN env var now checked in credential detection
- Reauthentication option when existing credentials found
- run_oauth_setup_token() runs 'claude setup-token' as interactive
subprocess, then auto-detects saved credentials
- Clean has_creds/needs_auth flow in both main.py and setup.py
Tests (14 new):
- Beta header assertions for claude-code-20250219
- Token refresh: successful refresh with credential writeback, failed
refresh returns None, no refresh token returns None
- Credential writeback: new file creation, preserving existing fields
- Auto-refresh integration in resolve_anthropic_token()
- CLAUDE_CODE_OAUTH_TOKEN fallback, credential file auto-discovery
- run_oauth_setup_token() (5 scenarios)
2026-03-12 20:45:50 -07:00
|
|
|
|
import urllib.request
|
|
|
|
|
|
|
|
|
|
|
|
if not refresh_token:
|
feat(auth): same-provider credential pools with rotation, custom endpoint support, and interactive CLI (#2647)
* feat(auth): add same-provider credential pools and rotation UX
Add same-provider credential pooling so Hermes can rotate across
multiple credentials for a single provider, recover from exhausted
credentials without jumping providers immediately, and configure
that behavior directly in hermes setup.
- agent/credential_pool.py: persisted per-provider credential pools
- hermes auth add/list/remove/reset CLI commands
- 429/402/401 recovery with pool rotation in run_agent.py
- Setup wizard integration for pool strategy configuration
- Auto-seeding from env vars and existing OAuth state
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Salvaged from PR #2647
* fix(tests): prevent pool auto-seeding from host env in credential pool tests
Tests for non-pool Anthropic paths and auth remove were failing when
host env vars (ANTHROPIC_API_KEY) or file-backed OAuth credentials
were present. The pool auto-seeding picked these up, causing unexpected
pool entries in tests.
- Mock _select_pool_entry in auxiliary_client OAuth flag tests
- Clear Anthropic env vars and mock _seed_from_singletons in auth remove test
* feat(auth): add thread safety, least_used strategy, and request counting
- Add threading.Lock to CredentialPool for gateway thread safety
(concurrent requests from multiple gateway sessions could race on
pool state mutations without this)
- Add 'least_used' rotation strategy that selects the credential
with the lowest request_count, distributing load more evenly
- Add request_count field to PooledCredential for usage tracking
- Add mark_used() method to increment per-credential request counts
- Wrap select(), mark_exhausted_and_rotate(), and try_refresh_current()
with lock acquisition
- Add tests: least_used selection, mark_used counting, concurrent
thread safety (4 threads × 20 selects with no corruption)
* feat(auth): add interactive mode for bare 'hermes auth' command
When 'hermes auth' is called without a subcommand, it now launches an
interactive wizard that:
1. Shows full credential pool status across all providers
2. Offers a menu: add, remove, reset cooldowns, set strategy
3. For OAuth-capable providers (anthropic, nous, openai-codex), the
add flow explicitly asks 'API key or OAuth login?' — making it
clear that both auth types are supported for the same provider
4. Strategy picker shows all 4 options (fill_first, round_robin,
least_used, random) with the current selection marked
5. Remove flow shows entries with indices for easy selection
The subcommand paths (hermes auth add/list/remove/reset) still work
exactly as before for scripted/non-interactive use.
* fix(tests): update runtime_provider tests for config.yaml source of truth (#4165)
Tests were using OPENAI_BASE_URL env var which is no longer consulted
after #4165. Updated to use model config (provider, base_url, api_key)
which is the new single source of truth for custom endpoint URLs.
* feat(auth): support custom endpoint credential pools keyed by provider name
Custom OpenAI-compatible endpoints all share provider='custom', making
the provider-keyed pool useless. Now pools for custom endpoints are
keyed by 'custom:<normalized_name>' where the name comes from the
custom_providers config list (auto-generated from URL hostname).
- Pool key format: 'custom:together.ai', 'custom:local-(localhost:8080)'
- load_pool('custom:name') seeds from custom_providers api_key AND
model.api_key when base_url matches
- hermes auth add/list now shows custom endpoints alongside registry
providers
- _resolve_openrouter_runtime and _resolve_named_custom_runtime check
pool before falling back to single config key
- 6 new tests covering custom pool keying, seeding, and listing
* docs: add Excalidraw diagram of full credential pool flow
Comprehensive architecture diagram showing:
- Credential sources (env vars, auth.json OAuth, config.yaml, CLI)
- Pool storage and auto-seeding
- Runtime resolution paths (registry, custom, OpenRouter)
- Error recovery (429 retry-then-rotate, 402 immediate, 401 refresh)
- CLI management commands and strategy configuration
Open at: https://excalidraw.com/#json=2Ycqhqpi6f12E_3ITyiwh,c7u9jSt5BwrmiVzHGbm87g
* fix(tests): update setup wizard pool tests for unified select_provider_and_model flow
The setup wizard now delegates to select_provider_and_model() instead
of using its own prompt_choice-based provider picker. Tests needed:
- Mock select_provider_and_model as no-op (provider pre-written to config)
- Call _stub_tts BEFORE custom prompt_choice mock (it overwrites it)
- Pre-write model.provider to config so the pool step is reached
* docs: add comprehensive credential pool documentation
- New page: website/docs/user-guide/features/credential-pools.md
Full guide covering quick start, CLI commands, rotation strategies,
error recovery, custom endpoint pools, auto-discovery, thread safety,
architecture, and storage format.
- Updated fallback-providers.md to reference credential pools as the
first layer of resilience (same-provider rotation before cross-provider)
- Added hermes auth to CLI commands reference with usage examples
- Added credential_pool_strategies to configuration guide
* chore: remove excalidraw diagram from repo (external link only)
* refactor: simplify credential pool code — extract helpers, collapse extras, dedup patterns
- _load_config_safe(): replace 4 identical try/except/import blocks
- _iter_custom_providers(): shared generator for custom provider iteration
- PooledCredential.extra dict: collapse 11 round-trip-only fields
(token_type, scope, client_id, portal_base_url, obtained_at,
expires_in, agent_key_id, agent_key_expires_in, agent_key_reused,
agent_key_obtained_at, tls) into a single extra dict with
__getattr__ for backward-compatible access
- _available_entries(): shared exhaustion-check between select and peek
- Dedup anthropic OAuth seeding (hermes_pkce + claude_code identical)
- SimpleNamespace replaces class _Args boilerplate in auth_commands
- _try_resolve_from_custom_pool(): shared pool-check in runtime_provider
Net -17 lines. All 383 targeted tests pass.
---------
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-31 03:10:01 -07:00
|
|
|
|
raise ValueError("refresh_token is required")
|
|
|
|
|
|
|
|
|
|
|
|
client_id = "9d1c250a-e61b-44d9-88ed-5944d1962f5e"
|
|
|
|
|
|
if use_json:
|
|
|
|
|
|
data = json.dumps({
|
|
|
|
|
|
"grant_type": "refresh_token",
|
|
|
|
|
|
"refresh_token": refresh_token,
|
|
|
|
|
|
"client_id": client_id,
|
|
|
|
|
|
}).encode()
|
|
|
|
|
|
content_type = "application/json"
|
|
|
|
|
|
else:
|
|
|
|
|
|
data = urllib.parse.urlencode({
|
|
|
|
|
|
"grant_type": "refresh_token",
|
|
|
|
|
|
"refresh_token": refresh_token,
|
|
|
|
|
|
"client_id": client_id,
|
|
|
|
|
|
}).encode()
|
|
|
|
|
|
content_type = "application/x-www-form-urlencoded"
|
fix: Anthropic OAuth — beta header, token refresh, config contamination, reauthentication (#1132)
Fixes Anthropic OAuth/subscription authentication end-to-end:
Auth failures (401 errors):
- Add missing 'claude-code-20250219' beta header for OAuth tokens. Both
clawdbot and OpenCode include this alongside 'oauth-2025-04-20' — without
it, Anthropic's API rejects OAuth tokens with 401 authentication errors.
- Fix _fetch_anthropic_models() to use canonical beta headers from
_COMMON_BETAS + _OAUTH_ONLY_BETAS instead of hardcoding.
Token refresh:
- Add _refresh_oauth_token() — when Claude Code credentials from
~/.claude/.credentials.json are expired but have a refresh token,
automatically POST to console.anthropic.com/v1/oauth/token to get
a new access token. Uses the same client_id as Claude Code / OpenCode.
- Add _write_claude_code_credentials() — writes refreshed tokens back
to ~/.claude/.credentials.json, preserving other fields.
- resolve_anthropic_token() now auto-refreshes expired tokens before
returning None.
Config contamination:
- Anthropic's _model_flow_anthropic() no longer saves base_url to config.
Since resolve_runtime_provider() always hardcodes Anthropic's URL, the
stale base_url was contaminating other providers when users switched
without re-running 'hermes model' (e.g., Codex hitting api.anthropic.com).
- _update_config_for_provider() now pops base_url when passed empty string.
- Same fix in setup.py.
Flow/UX (hermes model command):
- CLAUDE_CODE_OAUTH_TOKEN env var now checked in credential detection
- Reauthentication option when existing credentials found
- run_oauth_setup_token() runs 'claude setup-token' as interactive
subprocess, then auto-detects saved credentials
- Clean has_creds/needs_auth flow in both main.py and setup.py
Tests (14 new):
- Beta header assertions for claude-code-20250219
- Token refresh: successful refresh with credential writeback, failed
refresh returns None, no refresh token returns None
- Credential writeback: new file creation, preserving existing fields
- Auto-refresh integration in resolve_anthropic_token()
- CLAUDE_CODE_OAUTH_TOKEN fallback, credential file auto-discovery
- run_oauth_setup_token() (5 scenarios)
2026-03-12 20:45:50 -07:00
|
|
|
|
|
2026-03-26 13:26:56 -07:00
|
|
|
|
token_endpoints = [
|
|
|
|
|
|
"https://platform.claude.com/v1/oauth/token",
|
|
|
|
|
|
"https://console.anthropic.com/v1/oauth/token",
|
|
|
|
|
|
]
|
feat(auth): same-provider credential pools with rotation, custom endpoint support, and interactive CLI (#2647)
* feat(auth): add same-provider credential pools and rotation UX
Add same-provider credential pooling so Hermes can rotate across
multiple credentials for a single provider, recover from exhausted
credentials without jumping providers immediately, and configure
that behavior directly in hermes setup.
- agent/credential_pool.py: persisted per-provider credential pools
- hermes auth add/list/remove/reset CLI commands
- 429/402/401 recovery with pool rotation in run_agent.py
- Setup wizard integration for pool strategy configuration
- Auto-seeding from env vars and existing OAuth state
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Salvaged from PR #2647
* fix(tests): prevent pool auto-seeding from host env in credential pool tests
Tests for non-pool Anthropic paths and auth remove were failing when
host env vars (ANTHROPIC_API_KEY) or file-backed OAuth credentials
were present. The pool auto-seeding picked these up, causing unexpected
pool entries in tests.
- Mock _select_pool_entry in auxiliary_client OAuth flag tests
- Clear Anthropic env vars and mock _seed_from_singletons in auth remove test
* feat(auth): add thread safety, least_used strategy, and request counting
- Add threading.Lock to CredentialPool for gateway thread safety
(concurrent requests from multiple gateway sessions could race on
pool state mutations without this)
- Add 'least_used' rotation strategy that selects the credential
with the lowest request_count, distributing load more evenly
- Add request_count field to PooledCredential for usage tracking
- Add mark_used() method to increment per-credential request counts
- Wrap select(), mark_exhausted_and_rotate(), and try_refresh_current()
with lock acquisition
- Add tests: least_used selection, mark_used counting, concurrent
thread safety (4 threads × 20 selects with no corruption)
* feat(auth): add interactive mode for bare 'hermes auth' command
When 'hermes auth' is called without a subcommand, it now launches an
interactive wizard that:
1. Shows full credential pool status across all providers
2. Offers a menu: add, remove, reset cooldowns, set strategy
3. For OAuth-capable providers (anthropic, nous, openai-codex), the
add flow explicitly asks 'API key or OAuth login?' — making it
clear that both auth types are supported for the same provider
4. Strategy picker shows all 4 options (fill_first, round_robin,
least_used, random) with the current selection marked
5. Remove flow shows entries with indices for easy selection
The subcommand paths (hermes auth add/list/remove/reset) still work
exactly as before for scripted/non-interactive use.
* fix(tests): update runtime_provider tests for config.yaml source of truth (#4165)
Tests were using OPENAI_BASE_URL env var which is no longer consulted
after #4165. Updated to use model config (provider, base_url, api_key)
which is the new single source of truth for custom endpoint URLs.
* feat(auth): support custom endpoint credential pools keyed by provider name
Custom OpenAI-compatible endpoints all share provider='custom', making
the provider-keyed pool useless. Now pools for custom endpoints are
keyed by 'custom:<normalized_name>' where the name comes from the
custom_providers config list (auto-generated from URL hostname).
- Pool key format: 'custom:together.ai', 'custom:local-(localhost:8080)'
- load_pool('custom:name') seeds from custom_providers api_key AND
model.api_key when base_url matches
- hermes auth add/list now shows custom endpoints alongside registry
providers
- _resolve_openrouter_runtime and _resolve_named_custom_runtime check
pool before falling back to single config key
- 6 new tests covering custom pool keying, seeding, and listing
* docs: add Excalidraw diagram of full credential pool flow
Comprehensive architecture diagram showing:
- Credential sources (env vars, auth.json OAuth, config.yaml, CLI)
- Pool storage and auto-seeding
- Runtime resolution paths (registry, custom, OpenRouter)
- Error recovery (429 retry-then-rotate, 402 immediate, 401 refresh)
- CLI management commands and strategy configuration
Open at: https://excalidraw.com/#json=2Ycqhqpi6f12E_3ITyiwh,c7u9jSt5BwrmiVzHGbm87g
* fix(tests): update setup wizard pool tests for unified select_provider_and_model flow
The setup wizard now delegates to select_provider_and_model() instead
of using its own prompt_choice-based provider picker. Tests needed:
- Mock select_provider_and_model as no-op (provider pre-written to config)
- Call _stub_tts BEFORE custom prompt_choice mock (it overwrites it)
- Pre-write model.provider to config so the pool step is reached
* docs: add comprehensive credential pool documentation
- New page: website/docs/user-guide/features/credential-pools.md
Full guide covering quick start, CLI commands, rotation strategies,
error recovery, custom endpoint pools, auto-discovery, thread safety,
architecture, and storage format.
- Updated fallback-providers.md to reference credential pools as the
first layer of resilience (same-provider rotation before cross-provider)
- Added hermes auth to CLI commands reference with usage examples
- Added credential_pool_strategies to configuration guide
* chore: remove excalidraw diagram from repo (external link only)
* refactor: simplify credential pool code — extract helpers, collapse extras, dedup patterns
- _load_config_safe(): replace 4 identical try/except/import blocks
- _iter_custom_providers(): shared generator for custom provider iteration
- PooledCredential.extra dict: collapse 11 round-trip-only fields
(token_type, scope, client_id, portal_base_url, obtained_at,
expires_in, agent_key_id, agent_key_expires_in, agent_key_reused,
agent_key_obtained_at, tls) into a single extra dict with
__getattr__ for backward-compatible access
- _available_entries(): shared exhaustion-check between select and peek
- Dedup anthropic OAuth seeding (hermes_pkce + claude_code identical)
- SimpleNamespace replaces class _Args boilerplate in auth_commands
- _try_resolve_from_custom_pool(): shared pool-check in runtime_provider
Net -17 lines. All 383 targeted tests pass.
---------
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-31 03:10:01 -07:00
|
|
|
|
last_error = None
|
2026-03-26 13:26:56 -07:00
|
|
|
|
for endpoint in token_endpoints:
|
|
|
|
|
|
req = urllib.request.Request(
|
feat(auth): same-provider credential pools with rotation, custom endpoint support, and interactive CLI (#2647)
* feat(auth): add same-provider credential pools and rotation UX
Add same-provider credential pooling so Hermes can rotate across
multiple credentials for a single provider, recover from exhausted
credentials without jumping providers immediately, and configure
that behavior directly in hermes setup.
- agent/credential_pool.py: persisted per-provider credential pools
- hermes auth add/list/remove/reset CLI commands
- 429/402/401 recovery with pool rotation in run_agent.py
- Setup wizard integration for pool strategy configuration
- Auto-seeding from env vars and existing OAuth state
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Salvaged from PR #2647
* fix(tests): prevent pool auto-seeding from host env in credential pool tests
Tests for non-pool Anthropic paths and auth remove were failing when
host env vars (ANTHROPIC_API_KEY) or file-backed OAuth credentials
were present. The pool auto-seeding picked these up, causing unexpected
pool entries in tests.
- Mock _select_pool_entry in auxiliary_client OAuth flag tests
- Clear Anthropic env vars and mock _seed_from_singletons in auth remove test
* feat(auth): add thread safety, least_used strategy, and request counting
- Add threading.Lock to CredentialPool for gateway thread safety
(concurrent requests from multiple gateway sessions could race on
pool state mutations without this)
- Add 'least_used' rotation strategy that selects the credential
with the lowest request_count, distributing load more evenly
- Add request_count field to PooledCredential for usage tracking
- Add mark_used() method to increment per-credential request counts
- Wrap select(), mark_exhausted_and_rotate(), and try_refresh_current()
with lock acquisition
- Add tests: least_used selection, mark_used counting, concurrent
thread safety (4 threads × 20 selects with no corruption)
* feat(auth): add interactive mode for bare 'hermes auth' command
When 'hermes auth' is called without a subcommand, it now launches an
interactive wizard that:
1. Shows full credential pool status across all providers
2. Offers a menu: add, remove, reset cooldowns, set strategy
3. For OAuth-capable providers (anthropic, nous, openai-codex), the
add flow explicitly asks 'API key or OAuth login?' — making it
clear that both auth types are supported for the same provider
4. Strategy picker shows all 4 options (fill_first, round_robin,
least_used, random) with the current selection marked
5. Remove flow shows entries with indices for easy selection
The subcommand paths (hermes auth add/list/remove/reset) still work
exactly as before for scripted/non-interactive use.
* fix(tests): update runtime_provider tests for config.yaml source of truth (#4165)
Tests were using OPENAI_BASE_URL env var which is no longer consulted
after #4165. Updated to use model config (provider, base_url, api_key)
which is the new single source of truth for custom endpoint URLs.
* feat(auth): support custom endpoint credential pools keyed by provider name
Custom OpenAI-compatible endpoints all share provider='custom', making
the provider-keyed pool useless. Now pools for custom endpoints are
keyed by 'custom:<normalized_name>' where the name comes from the
custom_providers config list (auto-generated from URL hostname).
- Pool key format: 'custom:together.ai', 'custom:local-(localhost:8080)'
- load_pool('custom:name') seeds from custom_providers api_key AND
model.api_key when base_url matches
- hermes auth add/list now shows custom endpoints alongside registry
providers
- _resolve_openrouter_runtime and _resolve_named_custom_runtime check
pool before falling back to single config key
- 6 new tests covering custom pool keying, seeding, and listing
* docs: add Excalidraw diagram of full credential pool flow
Comprehensive architecture diagram showing:
- Credential sources (env vars, auth.json OAuth, config.yaml, CLI)
- Pool storage and auto-seeding
- Runtime resolution paths (registry, custom, OpenRouter)
- Error recovery (429 retry-then-rotate, 402 immediate, 401 refresh)
- CLI management commands and strategy configuration
Open at: https://excalidraw.com/#json=2Ycqhqpi6f12E_3ITyiwh,c7u9jSt5BwrmiVzHGbm87g
* fix(tests): update setup wizard pool tests for unified select_provider_and_model flow
The setup wizard now delegates to select_provider_and_model() instead
of using its own prompt_choice-based provider picker. Tests needed:
- Mock select_provider_and_model as no-op (provider pre-written to config)
- Call _stub_tts BEFORE custom prompt_choice mock (it overwrites it)
- Pre-write model.provider to config so the pool step is reached
* docs: add comprehensive credential pool documentation
- New page: website/docs/user-guide/features/credential-pools.md
Full guide covering quick start, CLI commands, rotation strategies,
error recovery, custom endpoint pools, auto-discovery, thread safety,
architecture, and storage format.
- Updated fallback-providers.md to reference credential pools as the
first layer of resilience (same-provider rotation before cross-provider)
- Added hermes auth to CLI commands reference with usage examples
- Added credential_pool_strategies to configuration guide
* chore: remove excalidraw diagram from repo (external link only)
* refactor: simplify credential pool code — extract helpers, collapse extras, dedup patterns
- _load_config_safe(): replace 4 identical try/except/import blocks
- _iter_custom_providers(): shared generator for custom provider iteration
- PooledCredential.extra dict: collapse 11 round-trip-only fields
(token_type, scope, client_id, portal_base_url, obtained_at,
expires_in, agent_key_id, agent_key_expires_in, agent_key_reused,
agent_key_obtained_at, tls) into a single extra dict with
__getattr__ for backward-compatible access
- _available_entries(): shared exhaustion-check between select and peek
- Dedup anthropic OAuth seeding (hermes_pkce + claude_code identical)
- SimpleNamespace replaces class _Args boilerplate in auth_commands
- _try_resolve_from_custom_pool(): shared pool-check in runtime_provider
Net -17 lines. All 383 targeted tests pass.
---------
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-31 03:10:01 -07:00
|
|
|
|
endpoint,
|
|
|
|
|
|
data=data,
|
|
|
|
|
|
headers={
|
|
|
|
|
|
"Content-Type": content_type,
|
|
|
|
|
|
"User-Agent": f"claude-cli/{_get_claude_code_version()} (external, cli)",
|
|
|
|
|
|
},
|
|
|
|
|
|
method="POST",
|
2026-03-26 13:26:56 -07:00
|
|
|
|
)
|
|
|
|
|
|
try:
|
|
|
|
|
|
with urllib.request.urlopen(req, timeout=10) as resp:
|
|
|
|
|
|
result = json.loads(resp.read().decode())
|
feat(auth): same-provider credential pools with rotation, custom endpoint support, and interactive CLI (#2647)
* feat(auth): add same-provider credential pools and rotation UX
Add same-provider credential pooling so Hermes can rotate across
multiple credentials for a single provider, recover from exhausted
credentials without jumping providers immediately, and configure
that behavior directly in hermes setup.
- agent/credential_pool.py: persisted per-provider credential pools
- hermes auth add/list/remove/reset CLI commands
- 429/402/401 recovery with pool rotation in run_agent.py
- Setup wizard integration for pool strategy configuration
- Auto-seeding from env vars and existing OAuth state
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Salvaged from PR #2647
* fix(tests): prevent pool auto-seeding from host env in credential pool tests
Tests for non-pool Anthropic paths and auth remove were failing when
host env vars (ANTHROPIC_API_KEY) or file-backed OAuth credentials
were present. The pool auto-seeding picked these up, causing unexpected
pool entries in tests.
- Mock _select_pool_entry in auxiliary_client OAuth flag tests
- Clear Anthropic env vars and mock _seed_from_singletons in auth remove test
* feat(auth): add thread safety, least_used strategy, and request counting
- Add threading.Lock to CredentialPool for gateway thread safety
(concurrent requests from multiple gateway sessions could race on
pool state mutations without this)
- Add 'least_used' rotation strategy that selects the credential
with the lowest request_count, distributing load more evenly
- Add request_count field to PooledCredential for usage tracking
- Add mark_used() method to increment per-credential request counts
- Wrap select(), mark_exhausted_and_rotate(), and try_refresh_current()
with lock acquisition
- Add tests: least_used selection, mark_used counting, concurrent
thread safety (4 threads × 20 selects with no corruption)
* feat(auth): add interactive mode for bare 'hermes auth' command
When 'hermes auth' is called without a subcommand, it now launches an
interactive wizard that:
1. Shows full credential pool status across all providers
2. Offers a menu: add, remove, reset cooldowns, set strategy
3. For OAuth-capable providers (anthropic, nous, openai-codex), the
add flow explicitly asks 'API key or OAuth login?' — making it
clear that both auth types are supported for the same provider
4. Strategy picker shows all 4 options (fill_first, round_robin,
least_used, random) with the current selection marked
5. Remove flow shows entries with indices for easy selection
The subcommand paths (hermes auth add/list/remove/reset) still work
exactly as before for scripted/non-interactive use.
* fix(tests): update runtime_provider tests for config.yaml source of truth (#4165)
Tests were using OPENAI_BASE_URL env var which is no longer consulted
after #4165. Updated to use model config (provider, base_url, api_key)
which is the new single source of truth for custom endpoint URLs.
* feat(auth): support custom endpoint credential pools keyed by provider name
Custom OpenAI-compatible endpoints all share provider='custom', making
the provider-keyed pool useless. Now pools for custom endpoints are
keyed by 'custom:<normalized_name>' where the name comes from the
custom_providers config list (auto-generated from URL hostname).
- Pool key format: 'custom:together.ai', 'custom:local-(localhost:8080)'
- load_pool('custom:name') seeds from custom_providers api_key AND
model.api_key when base_url matches
- hermes auth add/list now shows custom endpoints alongside registry
providers
- _resolve_openrouter_runtime and _resolve_named_custom_runtime check
pool before falling back to single config key
- 6 new tests covering custom pool keying, seeding, and listing
* docs: add Excalidraw diagram of full credential pool flow
Comprehensive architecture diagram showing:
- Credential sources (env vars, auth.json OAuth, config.yaml, CLI)
- Pool storage and auto-seeding
- Runtime resolution paths (registry, custom, OpenRouter)
- Error recovery (429 retry-then-rotate, 402 immediate, 401 refresh)
- CLI management commands and strategy configuration
Open at: https://excalidraw.com/#json=2Ycqhqpi6f12E_3ITyiwh,c7u9jSt5BwrmiVzHGbm87g
* fix(tests): update setup wizard pool tests for unified select_provider_and_model flow
The setup wizard now delegates to select_provider_and_model() instead
of using its own prompt_choice-based provider picker. Tests needed:
- Mock select_provider_and_model as no-op (provider pre-written to config)
- Call _stub_tts BEFORE custom prompt_choice mock (it overwrites it)
- Pre-write model.provider to config so the pool step is reached
* docs: add comprehensive credential pool documentation
- New page: website/docs/user-guide/features/credential-pools.md
Full guide covering quick start, CLI commands, rotation strategies,
error recovery, custom endpoint pools, auto-discovery, thread safety,
architecture, and storage format.
- Updated fallback-providers.md to reference credential pools as the
first layer of resilience (same-provider rotation before cross-provider)
- Added hermes auth to CLI commands reference with usage examples
- Added credential_pool_strategies to configuration guide
* chore: remove excalidraw diagram from repo (external link only)
* refactor: simplify credential pool code — extract helpers, collapse extras, dedup patterns
- _load_config_safe(): replace 4 identical try/except/import blocks
- _iter_custom_providers(): shared generator for custom provider iteration
- PooledCredential.extra dict: collapse 11 round-trip-only fields
(token_type, scope, client_id, portal_base_url, obtained_at,
expires_in, agent_key_id, agent_key_expires_in, agent_key_reused,
agent_key_obtained_at, tls) into a single extra dict with
__getattr__ for backward-compatible access
- _available_entries(): shared exhaustion-check between select and peek
- Dedup anthropic OAuth seeding (hermes_pkce + claude_code identical)
- SimpleNamespace replaces class _Args boilerplate in auth_commands
- _try_resolve_from_custom_pool(): shared pool-check in runtime_provider
Net -17 lines. All 383 targeted tests pass.
---------
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-31 03:10:01 -07:00
|
|
|
|
except Exception as exc:
|
|
|
|
|
|
last_error = exc
|
|
|
|
|
|
logger.debug("Anthropic token refresh failed at %s: %s", endpoint, exc)
|
|
|
|
|
|
continue
|
|
|
|
|
|
|
|
|
|
|
|
access_token = result.get("access_token", "")
|
|
|
|
|
|
if not access_token:
|
|
|
|
|
|
raise ValueError("Anthropic refresh response was missing access_token")
|
|
|
|
|
|
next_refresh = result.get("refresh_token", refresh_token)
|
|
|
|
|
|
expires_in = result.get("expires_in", 3600)
|
|
|
|
|
|
return {
|
|
|
|
|
|
"access_token": access_token,
|
|
|
|
|
|
"refresh_token": next_refresh,
|
|
|
|
|
|
"expires_at_ms": int(time.time() * 1000) + (expires_in * 1000),
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
if last_error is not None:
|
|
|
|
|
|
raise last_error
|
|
|
|
|
|
raise ValueError("Anthropic token refresh failed")
|
fix: Anthropic OAuth — beta header, token refresh, config contamination, reauthentication (#1132)
Fixes Anthropic OAuth/subscription authentication end-to-end:
Auth failures (401 errors):
- Add missing 'claude-code-20250219' beta header for OAuth tokens. Both
clawdbot and OpenCode include this alongside 'oauth-2025-04-20' — without
it, Anthropic's API rejects OAuth tokens with 401 authentication errors.
- Fix _fetch_anthropic_models() to use canonical beta headers from
_COMMON_BETAS + _OAUTH_ONLY_BETAS instead of hardcoding.
Token refresh:
- Add _refresh_oauth_token() — when Claude Code credentials from
~/.claude/.credentials.json are expired but have a refresh token,
automatically POST to console.anthropic.com/v1/oauth/token to get
a new access token. Uses the same client_id as Claude Code / OpenCode.
- Add _write_claude_code_credentials() — writes refreshed tokens back
to ~/.claude/.credentials.json, preserving other fields.
- resolve_anthropic_token() now auto-refreshes expired tokens before
returning None.
Config contamination:
- Anthropic's _model_flow_anthropic() no longer saves base_url to config.
Since resolve_runtime_provider() always hardcodes Anthropic's URL, the
stale base_url was contaminating other providers when users switched
without re-running 'hermes model' (e.g., Codex hitting api.anthropic.com).
- _update_config_for_provider() now pops base_url when passed empty string.
- Same fix in setup.py.
Flow/UX (hermes model command):
- CLAUDE_CODE_OAUTH_TOKEN env var now checked in credential detection
- Reauthentication option when existing credentials found
- run_oauth_setup_token() runs 'claude setup-token' as interactive
subprocess, then auto-detects saved credentials
- Clean has_creds/needs_auth flow in both main.py and setup.py
Tests (14 new):
- Beta header assertions for claude-code-20250219
- Token refresh: successful refresh with credential writeback, failed
refresh returns None, no refresh token returns None
- Credential writeback: new file creation, preserving existing fields
- Auto-refresh integration in resolve_anthropic_token()
- CLAUDE_CODE_OAUTH_TOKEN fallback, credential file auto-discovery
- run_oauth_setup_token() (5 scenarios)
2026-03-12 20:45:50 -07:00
|
|
|
|
|
feat(auth): same-provider credential pools with rotation, custom endpoint support, and interactive CLI (#2647)
* feat(auth): add same-provider credential pools and rotation UX
Add same-provider credential pooling so Hermes can rotate across
multiple credentials for a single provider, recover from exhausted
credentials without jumping providers immediately, and configure
that behavior directly in hermes setup.
- agent/credential_pool.py: persisted per-provider credential pools
- hermes auth add/list/remove/reset CLI commands
- 429/402/401 recovery with pool rotation in run_agent.py
- Setup wizard integration for pool strategy configuration
- Auto-seeding from env vars and existing OAuth state
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Salvaged from PR #2647
* fix(tests): prevent pool auto-seeding from host env in credential pool tests
Tests for non-pool Anthropic paths and auth remove were failing when
host env vars (ANTHROPIC_API_KEY) or file-backed OAuth credentials
were present. The pool auto-seeding picked these up, causing unexpected
pool entries in tests.
- Mock _select_pool_entry in auxiliary_client OAuth flag tests
- Clear Anthropic env vars and mock _seed_from_singletons in auth remove test
* feat(auth): add thread safety, least_used strategy, and request counting
- Add threading.Lock to CredentialPool for gateway thread safety
(concurrent requests from multiple gateway sessions could race on
pool state mutations without this)
- Add 'least_used' rotation strategy that selects the credential
with the lowest request_count, distributing load more evenly
- Add request_count field to PooledCredential for usage tracking
- Add mark_used() method to increment per-credential request counts
- Wrap select(), mark_exhausted_and_rotate(), and try_refresh_current()
with lock acquisition
- Add tests: least_used selection, mark_used counting, concurrent
thread safety (4 threads × 20 selects with no corruption)
* feat(auth): add interactive mode for bare 'hermes auth' command
When 'hermes auth' is called without a subcommand, it now launches an
interactive wizard that:
1. Shows full credential pool status across all providers
2. Offers a menu: add, remove, reset cooldowns, set strategy
3. For OAuth-capable providers (anthropic, nous, openai-codex), the
add flow explicitly asks 'API key or OAuth login?' — making it
clear that both auth types are supported for the same provider
4. Strategy picker shows all 4 options (fill_first, round_robin,
least_used, random) with the current selection marked
5. Remove flow shows entries with indices for easy selection
The subcommand paths (hermes auth add/list/remove/reset) still work
exactly as before for scripted/non-interactive use.
* fix(tests): update runtime_provider tests for config.yaml source of truth (#4165)
Tests were using OPENAI_BASE_URL env var which is no longer consulted
after #4165. Updated to use model config (provider, base_url, api_key)
which is the new single source of truth for custom endpoint URLs.
* feat(auth): support custom endpoint credential pools keyed by provider name
Custom OpenAI-compatible endpoints all share provider='custom', making
the provider-keyed pool useless. Now pools for custom endpoints are
keyed by 'custom:<normalized_name>' where the name comes from the
custom_providers config list (auto-generated from URL hostname).
- Pool key format: 'custom:together.ai', 'custom:local-(localhost:8080)'
- load_pool('custom:name') seeds from custom_providers api_key AND
model.api_key when base_url matches
- hermes auth add/list now shows custom endpoints alongside registry
providers
- _resolve_openrouter_runtime and _resolve_named_custom_runtime check
pool before falling back to single config key
- 6 new tests covering custom pool keying, seeding, and listing
* docs: add Excalidraw diagram of full credential pool flow
Comprehensive architecture diagram showing:
- Credential sources (env vars, auth.json OAuth, config.yaml, CLI)
- Pool storage and auto-seeding
- Runtime resolution paths (registry, custom, OpenRouter)
- Error recovery (429 retry-then-rotate, 402 immediate, 401 refresh)
- CLI management commands and strategy configuration
Open at: https://excalidraw.com/#json=2Ycqhqpi6f12E_3ITyiwh,c7u9jSt5BwrmiVzHGbm87g
* fix(tests): update setup wizard pool tests for unified select_provider_and_model flow
The setup wizard now delegates to select_provider_and_model() instead
of using its own prompt_choice-based provider picker. Tests needed:
- Mock select_provider_and_model as no-op (provider pre-written to config)
- Call _stub_tts BEFORE custom prompt_choice mock (it overwrites it)
- Pre-write model.provider to config so the pool step is reached
* docs: add comprehensive credential pool documentation
- New page: website/docs/user-guide/features/credential-pools.md
Full guide covering quick start, CLI commands, rotation strategies,
error recovery, custom endpoint pools, auto-discovery, thread safety,
architecture, and storage format.
- Updated fallback-providers.md to reference credential pools as the
first layer of resilience (same-provider rotation before cross-provider)
- Added hermes auth to CLI commands reference with usage examples
- Added credential_pool_strategies to configuration guide
* chore: remove excalidraw diagram from repo (external link only)
* refactor: simplify credential pool code — extract helpers, collapse extras, dedup patterns
- _load_config_safe(): replace 4 identical try/except/import blocks
- _iter_custom_providers(): shared generator for custom provider iteration
- PooledCredential.extra dict: collapse 11 round-trip-only fields
(token_type, scope, client_id, portal_base_url, obtained_at,
expires_in, agent_key_id, agent_key_expires_in, agent_key_reused,
agent_key_obtained_at, tls) into a single extra dict with
__getattr__ for backward-compatible access
- _available_entries(): shared exhaustion-check between select and peek
- Dedup anthropic OAuth seeding (hermes_pkce + claude_code identical)
- SimpleNamespace replaces class _Args boilerplate in auth_commands
- _try_resolve_from_custom_pool(): shared pool-check in runtime_provider
Net -17 lines. All 383 targeted tests pass.
---------
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-31 03:10:01 -07:00
|
|
|
|
|
|
|
|
|
|
def _refresh_oauth_token(creds: Dict[str, Any]) -> Optional[str]:
|
|
|
|
|
|
"""Attempt to refresh an expired Claude Code OAuth token."""
|
|
|
|
|
|
refresh_token = creds.get("refreshToken", "")
|
|
|
|
|
|
if not refresh_token:
|
|
|
|
|
|
logger.debug("No refresh token available — cannot refresh")
|
|
|
|
|
|
return None
|
|
|
|
|
|
|
|
|
|
|
|
try:
|
|
|
|
|
|
refreshed = refresh_anthropic_oauth_pure(refresh_token, use_json=False)
|
|
|
|
|
|
_write_claude_code_credentials(
|
|
|
|
|
|
refreshed["access_token"],
|
|
|
|
|
|
refreshed["refresh_token"],
|
|
|
|
|
|
refreshed["expires_at_ms"],
|
|
|
|
|
|
)
|
|
|
|
|
|
logger.debug("Successfully refreshed Claude Code OAuth token")
|
|
|
|
|
|
return refreshed["access_token"]
|
|
|
|
|
|
except Exception as e:
|
|
|
|
|
|
logger.debug("Failed to refresh Claude Code token: %s", e)
|
|
|
|
|
|
return None
|
fix: Anthropic OAuth — beta header, token refresh, config contamination, reauthentication (#1132)
Fixes Anthropic OAuth/subscription authentication end-to-end:
Auth failures (401 errors):
- Add missing 'claude-code-20250219' beta header for OAuth tokens. Both
clawdbot and OpenCode include this alongside 'oauth-2025-04-20' — without
it, Anthropic's API rejects OAuth tokens with 401 authentication errors.
- Fix _fetch_anthropic_models() to use canonical beta headers from
_COMMON_BETAS + _OAUTH_ONLY_BETAS instead of hardcoding.
Token refresh:
- Add _refresh_oauth_token() — when Claude Code credentials from
~/.claude/.credentials.json are expired but have a refresh token,
automatically POST to console.anthropic.com/v1/oauth/token to get
a new access token. Uses the same client_id as Claude Code / OpenCode.
- Add _write_claude_code_credentials() — writes refreshed tokens back
to ~/.claude/.credentials.json, preserving other fields.
- resolve_anthropic_token() now auto-refreshes expired tokens before
returning None.
Config contamination:
- Anthropic's _model_flow_anthropic() no longer saves base_url to config.
Since resolve_runtime_provider() always hardcodes Anthropic's URL, the
stale base_url was contaminating other providers when users switched
without re-running 'hermes model' (e.g., Codex hitting api.anthropic.com).
- _update_config_for_provider() now pops base_url when passed empty string.
- Same fix in setup.py.
Flow/UX (hermes model command):
- CLAUDE_CODE_OAUTH_TOKEN env var now checked in credential detection
- Reauthentication option when existing credentials found
- run_oauth_setup_token() runs 'claude setup-token' as interactive
subprocess, then auto-detects saved credentials
- Clean has_creds/needs_auth flow in both main.py and setup.py
Tests (14 new):
- Beta header assertions for claude-code-20250219
- Token refresh: successful refresh with credential writeback, failed
refresh returns None, no refresh token returns None
- Credential writeback: new file creation, preserving existing fields
- Auto-refresh integration in resolve_anthropic_token()
- CLAUDE_CODE_OAUTH_TOKEN fallback, credential file auto-discovery
- run_oauth_setup_token() (5 scenarios)
2026-03-12 20:45:50 -07:00
|
|
|
|
|
|
|
|
|
|
|
2026-03-30 18:35:16 -07:00
|
|
|
|
def _write_claude_code_credentials(
|
|
|
|
|
|
access_token: str,
|
|
|
|
|
|
refresh_token: str,
|
|
|
|
|
|
expires_at_ms: int,
|
|
|
|
|
|
*,
|
|
|
|
|
|
scopes: Optional[list] = None,
|
|
|
|
|
|
) -> None:
|
|
|
|
|
|
"""Write refreshed credentials back to ~/.claude/.credentials.json.
|
|
|
|
|
|
|
|
|
|
|
|
The optional *scopes* list (e.g. ``["user:inference", "user:profile", ...]``)
|
|
|
|
|
|
is persisted so that Claude Code's own auth check recognises the credential
|
|
|
|
|
|
as valid. Claude Code >=2.1.81 gates on the presence of ``"user:inference"``
|
|
|
|
|
|
in the stored scopes before it will use the token.
|
|
|
|
|
|
"""
|
fix: Anthropic OAuth — beta header, token refresh, config contamination, reauthentication (#1132)
Fixes Anthropic OAuth/subscription authentication end-to-end:
Auth failures (401 errors):
- Add missing 'claude-code-20250219' beta header for OAuth tokens. Both
clawdbot and OpenCode include this alongside 'oauth-2025-04-20' — without
it, Anthropic's API rejects OAuth tokens with 401 authentication errors.
- Fix _fetch_anthropic_models() to use canonical beta headers from
_COMMON_BETAS + _OAUTH_ONLY_BETAS instead of hardcoding.
Token refresh:
- Add _refresh_oauth_token() — when Claude Code credentials from
~/.claude/.credentials.json are expired but have a refresh token,
automatically POST to console.anthropic.com/v1/oauth/token to get
a new access token. Uses the same client_id as Claude Code / OpenCode.
- Add _write_claude_code_credentials() — writes refreshed tokens back
to ~/.claude/.credentials.json, preserving other fields.
- resolve_anthropic_token() now auto-refreshes expired tokens before
returning None.
Config contamination:
- Anthropic's _model_flow_anthropic() no longer saves base_url to config.
Since resolve_runtime_provider() always hardcodes Anthropic's URL, the
stale base_url was contaminating other providers when users switched
without re-running 'hermes model' (e.g., Codex hitting api.anthropic.com).
- _update_config_for_provider() now pops base_url when passed empty string.
- Same fix in setup.py.
Flow/UX (hermes model command):
- CLAUDE_CODE_OAUTH_TOKEN env var now checked in credential detection
- Reauthentication option when existing credentials found
- run_oauth_setup_token() runs 'claude setup-token' as interactive
subprocess, then auto-detects saved credentials
- Clean has_creds/needs_auth flow in both main.py and setup.py
Tests (14 new):
- Beta header assertions for claude-code-20250219
- Token refresh: successful refresh with credential writeback, failed
refresh returns None, no refresh token returns None
- Credential writeback: new file creation, preserving existing fields
- Auto-refresh integration in resolve_anthropic_token()
- CLAUDE_CODE_OAUTH_TOKEN fallback, credential file auto-discovery
- run_oauth_setup_token() (5 scenarios)
2026-03-12 20:45:50 -07:00
|
|
|
|
cred_path = Path.home() / ".claude" / ".credentials.json"
|
|
|
|
|
|
try:
|
|
|
|
|
|
# Read existing file to preserve other fields
|
|
|
|
|
|
existing = {}
|
|
|
|
|
|
if cred_path.exists():
|
|
|
|
|
|
existing = json.loads(cred_path.read_text(encoding="utf-8"))
|
|
|
|
|
|
|
2026-03-30 18:35:16 -07:00
|
|
|
|
oauth_data: Dict[str, Any] = {
|
fix: Anthropic OAuth — beta header, token refresh, config contamination, reauthentication (#1132)
Fixes Anthropic OAuth/subscription authentication end-to-end:
Auth failures (401 errors):
- Add missing 'claude-code-20250219' beta header for OAuth tokens. Both
clawdbot and OpenCode include this alongside 'oauth-2025-04-20' — without
it, Anthropic's API rejects OAuth tokens with 401 authentication errors.
- Fix _fetch_anthropic_models() to use canonical beta headers from
_COMMON_BETAS + _OAUTH_ONLY_BETAS instead of hardcoding.
Token refresh:
- Add _refresh_oauth_token() — when Claude Code credentials from
~/.claude/.credentials.json are expired but have a refresh token,
automatically POST to console.anthropic.com/v1/oauth/token to get
a new access token. Uses the same client_id as Claude Code / OpenCode.
- Add _write_claude_code_credentials() — writes refreshed tokens back
to ~/.claude/.credentials.json, preserving other fields.
- resolve_anthropic_token() now auto-refreshes expired tokens before
returning None.
Config contamination:
- Anthropic's _model_flow_anthropic() no longer saves base_url to config.
Since resolve_runtime_provider() always hardcodes Anthropic's URL, the
stale base_url was contaminating other providers when users switched
without re-running 'hermes model' (e.g., Codex hitting api.anthropic.com).
- _update_config_for_provider() now pops base_url when passed empty string.
- Same fix in setup.py.
Flow/UX (hermes model command):
- CLAUDE_CODE_OAUTH_TOKEN env var now checked in credential detection
- Reauthentication option when existing credentials found
- run_oauth_setup_token() runs 'claude setup-token' as interactive
subprocess, then auto-detects saved credentials
- Clean has_creds/needs_auth flow in both main.py and setup.py
Tests (14 new):
- Beta header assertions for claude-code-20250219
- Token refresh: successful refresh with credential writeback, failed
refresh returns None, no refresh token returns None
- Credential writeback: new file creation, preserving existing fields
- Auto-refresh integration in resolve_anthropic_token()
- CLAUDE_CODE_OAUTH_TOKEN fallback, credential file auto-discovery
- run_oauth_setup_token() (5 scenarios)
2026-03-12 20:45:50 -07:00
|
|
|
|
"accessToken": access_token,
|
|
|
|
|
|
"refreshToken": refresh_token,
|
|
|
|
|
|
"expiresAt": expires_at_ms,
|
|
|
|
|
|
}
|
2026-03-30 18:35:16 -07:00
|
|
|
|
if scopes is not None:
|
|
|
|
|
|
oauth_data["scopes"] = scopes
|
|
|
|
|
|
elif "claudeAiOauth" in existing and "scopes" in existing["claudeAiOauth"]:
|
|
|
|
|
|
# Preserve previously-stored scopes when the refresh response
|
|
|
|
|
|
# does not include a scope field.
|
|
|
|
|
|
oauth_data["scopes"] = existing["claudeAiOauth"]["scopes"]
|
|
|
|
|
|
|
|
|
|
|
|
existing["claudeAiOauth"] = oauth_data
|
fix: Anthropic OAuth — beta header, token refresh, config contamination, reauthentication (#1132)
Fixes Anthropic OAuth/subscription authentication end-to-end:
Auth failures (401 errors):
- Add missing 'claude-code-20250219' beta header for OAuth tokens. Both
clawdbot and OpenCode include this alongside 'oauth-2025-04-20' — without
it, Anthropic's API rejects OAuth tokens with 401 authentication errors.
- Fix _fetch_anthropic_models() to use canonical beta headers from
_COMMON_BETAS + _OAUTH_ONLY_BETAS instead of hardcoding.
Token refresh:
- Add _refresh_oauth_token() — when Claude Code credentials from
~/.claude/.credentials.json are expired but have a refresh token,
automatically POST to console.anthropic.com/v1/oauth/token to get
a new access token. Uses the same client_id as Claude Code / OpenCode.
- Add _write_claude_code_credentials() — writes refreshed tokens back
to ~/.claude/.credentials.json, preserving other fields.
- resolve_anthropic_token() now auto-refreshes expired tokens before
returning None.
Config contamination:
- Anthropic's _model_flow_anthropic() no longer saves base_url to config.
Since resolve_runtime_provider() always hardcodes Anthropic's URL, the
stale base_url was contaminating other providers when users switched
without re-running 'hermes model' (e.g., Codex hitting api.anthropic.com).
- _update_config_for_provider() now pops base_url when passed empty string.
- Same fix in setup.py.
Flow/UX (hermes model command):
- CLAUDE_CODE_OAUTH_TOKEN env var now checked in credential detection
- Reauthentication option when existing credentials found
- run_oauth_setup_token() runs 'claude setup-token' as interactive
subprocess, then auto-detects saved credentials
- Clean has_creds/needs_auth flow in both main.py and setup.py
Tests (14 new):
- Beta header assertions for claude-code-20250219
- Token refresh: successful refresh with credential writeback, failed
refresh returns None, no refresh token returns None
- Credential writeback: new file creation, preserving existing fields
- Auto-refresh integration in resolve_anthropic_token()
- CLAUDE_CODE_OAUTH_TOKEN fallback, credential file auto-discovery
- run_oauth_setup_token() (5 scenarios)
2026-03-12 20:45:50 -07:00
|
|
|
|
|
|
|
|
|
|
cred_path.parent.mkdir(parents=True, exist_ok=True)
|
|
|
|
|
|
cred_path.write_text(json.dumps(existing, indent=2), encoding="utf-8")
|
|
|
|
|
|
# Restrict permissions (credentials file)
|
|
|
|
|
|
cred_path.chmod(0o600)
|
|
|
|
|
|
except (OSError, IOError) as e:
|
|
|
|
|
|
logger.debug("Failed to write refreshed credentials: %s", e)
|
|
|
|
|
|
|
|
|
|
|
|
|
2026-03-14 19:22:31 -07:00
|
|
|
|
def _resolve_claude_code_token_from_credentials(creds: Optional[Dict[str, Any]] = None) -> Optional[str]:
|
|
|
|
|
|
"""Resolve a token from Claude Code credential files, refreshing if needed."""
|
|
|
|
|
|
creds = creds or read_claude_code_credentials()
|
|
|
|
|
|
if creds and is_claude_code_token_valid(creds):
|
|
|
|
|
|
logger.debug("Using Claude Code credentials (auto-detected)")
|
|
|
|
|
|
return creds["accessToken"]
|
|
|
|
|
|
if creds:
|
|
|
|
|
|
logger.debug("Claude Code credentials expired — attempting refresh")
|
|
|
|
|
|
refreshed = _refresh_oauth_token(creds)
|
|
|
|
|
|
if refreshed:
|
|
|
|
|
|
return refreshed
|
|
|
|
|
|
logger.debug("Token refresh failed — re-run 'claude setup-token' to reauthenticate")
|
|
|
|
|
|
return None
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def _prefer_refreshable_claude_code_token(env_token: str, creds: Optional[Dict[str, Any]]) -> Optional[str]:
|
|
|
|
|
|
"""Prefer Claude Code creds when a persisted env OAuth token would shadow refresh.
|
|
|
|
|
|
|
|
|
|
|
|
Hermes historically persisted setup tokens into ANTHROPIC_TOKEN. That makes
|
|
|
|
|
|
later refresh impossible because the static env token wins before we ever
|
|
|
|
|
|
inspect Claude Code's refreshable credential file. If we have a refreshable
|
|
|
|
|
|
Claude Code credential record, prefer it over the static env OAuth token.
|
|
|
|
|
|
"""
|
|
|
|
|
|
if not env_token or not _is_oauth_token(env_token) or not isinstance(creds, dict):
|
|
|
|
|
|
return None
|
|
|
|
|
|
if not creds.get("refreshToken"):
|
|
|
|
|
|
return None
|
|
|
|
|
|
|
|
|
|
|
|
resolved = _resolve_claude_code_token_from_credentials(creds)
|
|
|
|
|
|
if resolved and resolved != env_token:
|
|
|
|
|
|
logger.debug(
|
|
|
|
|
|
"Preferring Claude Code credential file over static env OAuth token so refresh can proceed"
|
|
|
|
|
|
)
|
|
|
|
|
|
return resolved
|
|
|
|
|
|
return None
|
|
|
|
|
|
|
|
|
|
|
|
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
def resolve_anthropic_token() -> Optional[str]:
|
|
|
|
|
|
"""Resolve an Anthropic token from all available sources.
|
|
|
|
|
|
|
|
|
|
|
|
Priority:
|
2026-03-13 02:09:52 -07:00
|
|
|
|
1. ANTHROPIC_TOKEN env var (OAuth/setup token saved by Hermes)
|
|
|
|
|
|
2. CLAUDE_CODE_OAUTH_TOKEN env var
|
|
|
|
|
|
3. Claude Code credentials (~/.claude.json or ~/.claude/.credentials.json)
|
fix: Anthropic OAuth — beta header, token refresh, config contamination, reauthentication (#1132)
Fixes Anthropic OAuth/subscription authentication end-to-end:
Auth failures (401 errors):
- Add missing 'claude-code-20250219' beta header for OAuth tokens. Both
clawdbot and OpenCode include this alongside 'oauth-2025-04-20' — without
it, Anthropic's API rejects OAuth tokens with 401 authentication errors.
- Fix _fetch_anthropic_models() to use canonical beta headers from
_COMMON_BETAS + _OAUTH_ONLY_BETAS instead of hardcoding.
Token refresh:
- Add _refresh_oauth_token() — when Claude Code credentials from
~/.claude/.credentials.json are expired but have a refresh token,
automatically POST to console.anthropic.com/v1/oauth/token to get
a new access token. Uses the same client_id as Claude Code / OpenCode.
- Add _write_claude_code_credentials() — writes refreshed tokens back
to ~/.claude/.credentials.json, preserving other fields.
- resolve_anthropic_token() now auto-refreshes expired tokens before
returning None.
Config contamination:
- Anthropic's _model_flow_anthropic() no longer saves base_url to config.
Since resolve_runtime_provider() always hardcodes Anthropic's URL, the
stale base_url was contaminating other providers when users switched
without re-running 'hermes model' (e.g., Codex hitting api.anthropic.com).
- _update_config_for_provider() now pops base_url when passed empty string.
- Same fix in setup.py.
Flow/UX (hermes model command):
- CLAUDE_CODE_OAUTH_TOKEN env var now checked in credential detection
- Reauthentication option when existing credentials found
- run_oauth_setup_token() runs 'claude setup-token' as interactive
subprocess, then auto-detects saved credentials
- Clean has_creds/needs_auth flow in both main.py and setup.py
Tests (14 new):
- Beta header assertions for claude-code-20250219
- Token refresh: successful refresh with credential writeback, failed
refresh returns None, no refresh token returns None
- Credential writeback: new file creation, preserving existing fields
- Auto-refresh integration in resolve_anthropic_token()
- CLAUDE_CODE_OAUTH_TOKEN fallback, credential file auto-discovery
- run_oauth_setup_token() (5 scenarios)
2026-03-12 20:45:50 -07:00
|
|
|
|
— with automatic refresh if expired and a refresh token is available
|
2026-03-13 02:09:52 -07:00
|
|
|
|
4. ANTHROPIC_API_KEY env var (regular API key, or legacy fallback)
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
|
|
|
|
|
|
Returns the token string or None.
|
|
|
|
|
|
"""
|
2026-03-14 19:22:31 -07:00
|
|
|
|
creds = read_claude_code_credentials()
|
|
|
|
|
|
|
2026-03-13 02:09:52 -07:00
|
|
|
|
# 1. Hermes-managed OAuth/setup token env var
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
token = os.getenv("ANTHROPIC_TOKEN", "").strip()
|
|
|
|
|
|
if token:
|
2026-03-14 19:22:31 -07:00
|
|
|
|
preferred = _prefer_refreshable_claude_code_token(token, creds)
|
|
|
|
|
|
if preferred:
|
|
|
|
|
|
return preferred
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
return token
|
|
|
|
|
|
|
2026-03-13 02:09:52 -07:00
|
|
|
|
# 2. CLAUDE_CODE_OAUTH_TOKEN (used by Claude Code for setup-tokens)
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
cc_token = os.getenv("CLAUDE_CODE_OAUTH_TOKEN", "").strip()
|
|
|
|
|
|
if cc_token:
|
2026-03-14 19:22:31 -07:00
|
|
|
|
preferred = _prefer_refreshable_claude_code_token(cc_token, creds)
|
|
|
|
|
|
if preferred:
|
|
|
|
|
|
return preferred
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
return cc_token
|
|
|
|
|
|
|
2026-03-25 18:29:47 -07:00
|
|
|
|
# 3. Claude Code credential file
|
2026-03-14 19:22:31 -07:00
|
|
|
|
resolved_claude_token = _resolve_claude_code_token_from_credentials(creds)
|
|
|
|
|
|
if resolved_claude_token:
|
|
|
|
|
|
return resolved_claude_token
|
fix: Anthropic OAuth — beta header, token refresh, config contamination, reauthentication (#1132)
Fixes Anthropic OAuth/subscription authentication end-to-end:
Auth failures (401 errors):
- Add missing 'claude-code-20250219' beta header for OAuth tokens. Both
clawdbot and OpenCode include this alongside 'oauth-2025-04-20' — without
it, Anthropic's API rejects OAuth tokens with 401 authentication errors.
- Fix _fetch_anthropic_models() to use canonical beta headers from
_COMMON_BETAS + _OAUTH_ONLY_BETAS instead of hardcoding.
Token refresh:
- Add _refresh_oauth_token() — when Claude Code credentials from
~/.claude/.credentials.json are expired but have a refresh token,
automatically POST to console.anthropic.com/v1/oauth/token to get
a new access token. Uses the same client_id as Claude Code / OpenCode.
- Add _write_claude_code_credentials() — writes refreshed tokens back
to ~/.claude/.credentials.json, preserving other fields.
- resolve_anthropic_token() now auto-refreshes expired tokens before
returning None.
Config contamination:
- Anthropic's _model_flow_anthropic() no longer saves base_url to config.
Since resolve_runtime_provider() always hardcodes Anthropic's URL, the
stale base_url was contaminating other providers when users switched
without re-running 'hermes model' (e.g., Codex hitting api.anthropic.com).
- _update_config_for_provider() now pops base_url when passed empty string.
- Same fix in setup.py.
Flow/UX (hermes model command):
- CLAUDE_CODE_OAUTH_TOKEN env var now checked in credential detection
- Reauthentication option when existing credentials found
- run_oauth_setup_token() runs 'claude setup-token' as interactive
subprocess, then auto-detects saved credentials
- Clean has_creds/needs_auth flow in both main.py and setup.py
Tests (14 new):
- Beta header assertions for claude-code-20250219
- Token refresh: successful refresh with credential writeback, failed
refresh returns None, no refresh token returns None
- Credential writeback: new file creation, preserving existing fields
- Auto-refresh integration in resolve_anthropic_token()
- CLAUDE_CODE_OAUTH_TOKEN fallback, credential file auto-discovery
- run_oauth_setup_token() (5 scenarios)
2026-03-12 20:45:50 -07:00
|
|
|
|
|
2026-03-25 18:29:47 -07:00
|
|
|
|
# 4. Regular API key, or a legacy OAuth token saved in ANTHROPIC_API_KEY.
|
2026-03-13 02:09:52 -07:00
|
|
|
|
# This remains as a compatibility fallback for pre-migration Hermes configs.
|
|
|
|
|
|
api_key = os.getenv("ANTHROPIC_API_KEY", "").strip()
|
|
|
|
|
|
if api_key:
|
|
|
|
|
|
return api_key
|
|
|
|
|
|
|
fix: Anthropic OAuth — beta header, token refresh, config contamination, reauthentication (#1132)
Fixes Anthropic OAuth/subscription authentication end-to-end:
Auth failures (401 errors):
- Add missing 'claude-code-20250219' beta header for OAuth tokens. Both
clawdbot and OpenCode include this alongside 'oauth-2025-04-20' — without
it, Anthropic's API rejects OAuth tokens with 401 authentication errors.
- Fix _fetch_anthropic_models() to use canonical beta headers from
_COMMON_BETAS + _OAUTH_ONLY_BETAS instead of hardcoding.
Token refresh:
- Add _refresh_oauth_token() — when Claude Code credentials from
~/.claude/.credentials.json are expired but have a refresh token,
automatically POST to console.anthropic.com/v1/oauth/token to get
a new access token. Uses the same client_id as Claude Code / OpenCode.
- Add _write_claude_code_credentials() — writes refreshed tokens back
to ~/.claude/.credentials.json, preserving other fields.
- resolve_anthropic_token() now auto-refreshes expired tokens before
returning None.
Config contamination:
- Anthropic's _model_flow_anthropic() no longer saves base_url to config.
Since resolve_runtime_provider() always hardcodes Anthropic's URL, the
stale base_url was contaminating other providers when users switched
without re-running 'hermes model' (e.g., Codex hitting api.anthropic.com).
- _update_config_for_provider() now pops base_url when passed empty string.
- Same fix in setup.py.
Flow/UX (hermes model command):
- CLAUDE_CODE_OAUTH_TOKEN env var now checked in credential detection
- Reauthentication option when existing credentials found
- run_oauth_setup_token() runs 'claude setup-token' as interactive
subprocess, then auto-detects saved credentials
- Clean has_creds/needs_auth flow in both main.py and setup.py
Tests (14 new):
- Beta header assertions for claude-code-20250219
- Token refresh: successful refresh with credential writeback, failed
refresh returns None, no refresh token returns None
- Credential writeback: new file creation, preserving existing fields
- Auto-refresh integration in resolve_anthropic_token()
- CLAUDE_CODE_OAUTH_TOKEN fallback, credential file auto-discovery
- run_oauth_setup_token() (5 scenarios)
2026-03-12 20:45:50 -07:00
|
|
|
|
return None
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def run_oauth_setup_token() -> Optional[str]:
|
|
|
|
|
|
"""Run 'claude setup-token' interactively and return the resulting token.
|
|
|
|
|
|
|
|
|
|
|
|
Checks multiple sources after the subprocess completes:
|
|
|
|
|
|
1. Claude Code credential files (may be written by the subprocess)
|
|
|
|
|
|
2. CLAUDE_CODE_OAUTH_TOKEN / ANTHROPIC_TOKEN env vars
|
|
|
|
|
|
|
|
|
|
|
|
Returns the token string, or None if no credentials were obtained.
|
|
|
|
|
|
Raises FileNotFoundError if the 'claude' CLI is not installed.
|
|
|
|
|
|
"""
|
|
|
|
|
|
import shutil
|
|
|
|
|
|
import subprocess
|
|
|
|
|
|
|
|
|
|
|
|
claude_path = shutil.which("claude")
|
|
|
|
|
|
if not claude_path:
|
|
|
|
|
|
raise FileNotFoundError(
|
|
|
|
|
|
"The 'claude' CLI is not installed. "
|
|
|
|
|
|
"Install it with: npm install -g @anthropic-ai/claude-code"
|
|
|
|
|
|
)
|
|
|
|
|
|
|
|
|
|
|
|
# Run interactively — stdin/stdout/stderr inherited so user can interact
|
|
|
|
|
|
try:
|
|
|
|
|
|
subprocess.run([claude_path, "setup-token"])
|
|
|
|
|
|
except (KeyboardInterrupt, EOFError):
|
|
|
|
|
|
return None
|
|
|
|
|
|
|
|
|
|
|
|
# Check if credentials were saved to Claude Code's config files
|
|
|
|
|
|
creds = read_claude_code_credentials()
|
|
|
|
|
|
if creds and is_claude_code_token_valid(creds):
|
|
|
|
|
|
return creds["accessToken"]
|
|
|
|
|
|
|
|
|
|
|
|
# Check env vars that may have been set
|
|
|
|
|
|
for env_var in ("CLAUDE_CODE_OAUTH_TOKEN", "ANTHROPIC_TOKEN"):
|
|
|
|
|
|
val = os.getenv(env_var, "").strip()
|
|
|
|
|
|
if val:
|
|
|
|
|
|
return val
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
|
|
|
|
|
|
return None
|
|
|
|
|
|
|
|
|
|
|
|
|
feat(auth): same-provider credential pools with rotation, custom endpoint support, and interactive CLI (#2647)
* feat(auth): add same-provider credential pools and rotation UX
Add same-provider credential pooling so Hermes can rotate across
multiple credentials for a single provider, recover from exhausted
credentials without jumping providers immediately, and configure
that behavior directly in hermes setup.
- agent/credential_pool.py: persisted per-provider credential pools
- hermes auth add/list/remove/reset CLI commands
- 429/402/401 recovery with pool rotation in run_agent.py
- Setup wizard integration for pool strategy configuration
- Auto-seeding from env vars and existing OAuth state
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Salvaged from PR #2647
* fix(tests): prevent pool auto-seeding from host env in credential pool tests
Tests for non-pool Anthropic paths and auth remove were failing when
host env vars (ANTHROPIC_API_KEY) or file-backed OAuth credentials
were present. The pool auto-seeding picked these up, causing unexpected
pool entries in tests.
- Mock _select_pool_entry in auxiliary_client OAuth flag tests
- Clear Anthropic env vars and mock _seed_from_singletons in auth remove test
* feat(auth): add thread safety, least_used strategy, and request counting
- Add threading.Lock to CredentialPool for gateway thread safety
(concurrent requests from multiple gateway sessions could race on
pool state mutations without this)
- Add 'least_used' rotation strategy that selects the credential
with the lowest request_count, distributing load more evenly
- Add request_count field to PooledCredential for usage tracking
- Add mark_used() method to increment per-credential request counts
- Wrap select(), mark_exhausted_and_rotate(), and try_refresh_current()
with lock acquisition
- Add tests: least_used selection, mark_used counting, concurrent
thread safety (4 threads × 20 selects with no corruption)
* feat(auth): add interactive mode for bare 'hermes auth' command
When 'hermes auth' is called without a subcommand, it now launches an
interactive wizard that:
1. Shows full credential pool status across all providers
2. Offers a menu: add, remove, reset cooldowns, set strategy
3. For OAuth-capable providers (anthropic, nous, openai-codex), the
add flow explicitly asks 'API key or OAuth login?' — making it
clear that both auth types are supported for the same provider
4. Strategy picker shows all 4 options (fill_first, round_robin,
least_used, random) with the current selection marked
5. Remove flow shows entries with indices for easy selection
The subcommand paths (hermes auth add/list/remove/reset) still work
exactly as before for scripted/non-interactive use.
* fix(tests): update runtime_provider tests for config.yaml source of truth (#4165)
Tests were using OPENAI_BASE_URL env var which is no longer consulted
after #4165. Updated to use model config (provider, base_url, api_key)
which is the new single source of truth for custom endpoint URLs.
* feat(auth): support custom endpoint credential pools keyed by provider name
Custom OpenAI-compatible endpoints all share provider='custom', making
the provider-keyed pool useless. Now pools for custom endpoints are
keyed by 'custom:<normalized_name>' where the name comes from the
custom_providers config list (auto-generated from URL hostname).
- Pool key format: 'custom:together.ai', 'custom:local-(localhost:8080)'
- load_pool('custom:name') seeds from custom_providers api_key AND
model.api_key when base_url matches
- hermes auth add/list now shows custom endpoints alongside registry
providers
- _resolve_openrouter_runtime and _resolve_named_custom_runtime check
pool before falling back to single config key
- 6 new tests covering custom pool keying, seeding, and listing
* docs: add Excalidraw diagram of full credential pool flow
Comprehensive architecture diagram showing:
- Credential sources (env vars, auth.json OAuth, config.yaml, CLI)
- Pool storage and auto-seeding
- Runtime resolution paths (registry, custom, OpenRouter)
- Error recovery (429 retry-then-rotate, 402 immediate, 401 refresh)
- CLI management commands and strategy configuration
Open at: https://excalidraw.com/#json=2Ycqhqpi6f12E_3ITyiwh,c7u9jSt5BwrmiVzHGbm87g
* fix(tests): update setup wizard pool tests for unified select_provider_and_model flow
The setup wizard now delegates to select_provider_and_model() instead
of using its own prompt_choice-based provider picker. Tests needed:
- Mock select_provider_and_model as no-op (provider pre-written to config)
- Call _stub_tts BEFORE custom prompt_choice mock (it overwrites it)
- Pre-write model.provider to config so the pool step is reached
* docs: add comprehensive credential pool documentation
- New page: website/docs/user-guide/features/credential-pools.md
Full guide covering quick start, CLI commands, rotation strategies,
error recovery, custom endpoint pools, auto-discovery, thread safety,
architecture, and storage format.
- Updated fallback-providers.md to reference credential pools as the
first layer of resilience (same-provider rotation before cross-provider)
- Added hermes auth to CLI commands reference with usage examples
- Added credential_pool_strategies to configuration guide
* chore: remove excalidraw diagram from repo (external link only)
* refactor: simplify credential pool code — extract helpers, collapse extras, dedup patterns
- _load_config_safe(): replace 4 identical try/except/import blocks
- _iter_custom_providers(): shared generator for custom provider iteration
- PooledCredential.extra dict: collapse 11 round-trip-only fields
(token_type, scope, client_id, portal_base_url, obtained_at,
expires_in, agent_key_id, agent_key_expires_in, agent_key_reused,
agent_key_obtained_at, tls) into a single extra dict with
__getattr__ for backward-compatible access
- _available_entries(): shared exhaustion-check between select and peek
- Dedup anthropic OAuth seeding (hermes_pkce + claude_code identical)
- SimpleNamespace replaces class _Args boilerplate in auth_commands
- _try_resolve_from_custom_pool(): shared pool-check in runtime_provider
Net -17 lines. All 383 targeted tests pass.
---------
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-31 03:10:01 -07:00
|
|
|
|
# ── Hermes-native PKCE OAuth flow ────────────────────────────────────────
|
|
|
|
|
|
# Mirrors the flow used by Claude Code, pi-ai, and OpenCode.
|
|
|
|
|
|
# Stores credentials in ~/.hermes/.anthropic_oauth.json (our own file).
|
|
|
|
|
|
|
|
|
|
|
|
_OAUTH_CLIENT_ID = "9d1c250a-e61b-44d9-88ed-5944d1962f5e"
|
|
|
|
|
|
_OAUTH_TOKEN_URL = "https://console.anthropic.com/v1/oauth/token"
|
|
|
|
|
|
_OAUTH_REDIRECT_URI = "https://console.anthropic.com/oauth/code/callback"
|
|
|
|
|
|
_OAUTH_SCOPES = "org:create_api_key user:profile user:inference"
|
|
|
|
|
|
_HERMES_OAUTH_FILE = get_hermes_home() / ".anthropic_oauth.json"
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def _generate_pkce() -> tuple:
|
|
|
|
|
|
"""Generate PKCE code_verifier and code_challenge (S256)."""
|
|
|
|
|
|
import base64
|
|
|
|
|
|
import hashlib
|
|
|
|
|
|
import secrets
|
|
|
|
|
|
|
|
|
|
|
|
verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
|
|
|
|
|
|
challenge = base64.urlsafe_b64encode(
|
|
|
|
|
|
hashlib.sha256(verifier.encode()).digest()
|
|
|
|
|
|
).rstrip(b"=").decode()
|
|
|
|
|
|
return verifier, challenge
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def run_hermes_oauth_login_pure() -> Optional[Dict[str, Any]]:
|
|
|
|
|
|
"""Run Hermes-native OAuth PKCE flow and return credential state."""
|
|
|
|
|
|
import time
|
|
|
|
|
|
import webbrowser
|
|
|
|
|
|
|
|
|
|
|
|
verifier, challenge = _generate_pkce()
|
|
|
|
|
|
|
|
|
|
|
|
params = {
|
|
|
|
|
|
"code": "true",
|
|
|
|
|
|
"client_id": _OAUTH_CLIENT_ID,
|
|
|
|
|
|
"response_type": "code",
|
|
|
|
|
|
"redirect_uri": _OAUTH_REDIRECT_URI,
|
|
|
|
|
|
"scope": _OAUTH_SCOPES,
|
|
|
|
|
|
"code_challenge": challenge,
|
|
|
|
|
|
"code_challenge_method": "S256",
|
|
|
|
|
|
"state": verifier,
|
|
|
|
|
|
}
|
|
|
|
|
|
from urllib.parse import urlencode
|
|
|
|
|
|
|
|
|
|
|
|
auth_url = f"https://claude.ai/oauth/authorize?{urlencode(params)}"
|
|
|
|
|
|
|
|
|
|
|
|
print()
|
|
|
|
|
|
print("Authorize Hermes with your Claude Pro/Max subscription.")
|
|
|
|
|
|
print()
|
|
|
|
|
|
print("╭─ Claude Pro/Max Authorization ────────────────────╮")
|
|
|
|
|
|
print("│ │")
|
|
|
|
|
|
print("│ Open this link in your browser: │")
|
|
|
|
|
|
print("╰───────────────────────────────────────────────────╯")
|
|
|
|
|
|
print()
|
|
|
|
|
|
print(f" {auth_url}")
|
|
|
|
|
|
print()
|
|
|
|
|
|
|
|
|
|
|
|
try:
|
|
|
|
|
|
webbrowser.open(auth_url)
|
|
|
|
|
|
print(" (Browser opened automatically)")
|
|
|
|
|
|
except Exception:
|
|
|
|
|
|
pass
|
|
|
|
|
|
|
|
|
|
|
|
print()
|
|
|
|
|
|
print("After authorizing, you'll see a code. Paste it below.")
|
|
|
|
|
|
print()
|
|
|
|
|
|
try:
|
|
|
|
|
|
auth_code = input("Authorization code: ").strip()
|
|
|
|
|
|
except (KeyboardInterrupt, EOFError):
|
|
|
|
|
|
return None
|
|
|
|
|
|
|
|
|
|
|
|
if not auth_code:
|
|
|
|
|
|
print("No code entered.")
|
|
|
|
|
|
return None
|
|
|
|
|
|
|
|
|
|
|
|
splits = auth_code.split("#")
|
|
|
|
|
|
code = splits[0]
|
|
|
|
|
|
state = splits[1] if len(splits) > 1 else ""
|
|
|
|
|
|
|
|
|
|
|
|
try:
|
|
|
|
|
|
import urllib.request
|
|
|
|
|
|
|
|
|
|
|
|
exchange_data = json.dumps({
|
|
|
|
|
|
"grant_type": "authorization_code",
|
|
|
|
|
|
"client_id": _OAUTH_CLIENT_ID,
|
|
|
|
|
|
"code": code,
|
|
|
|
|
|
"state": state,
|
|
|
|
|
|
"redirect_uri": _OAUTH_REDIRECT_URI,
|
|
|
|
|
|
"code_verifier": verifier,
|
|
|
|
|
|
}).encode()
|
|
|
|
|
|
|
|
|
|
|
|
req = urllib.request.Request(
|
|
|
|
|
|
_OAUTH_TOKEN_URL,
|
|
|
|
|
|
data=exchange_data,
|
|
|
|
|
|
headers={
|
|
|
|
|
|
"Content-Type": "application/json",
|
|
|
|
|
|
"User-Agent": f"claude-cli/{_get_claude_code_version()} (external, cli)",
|
|
|
|
|
|
},
|
|
|
|
|
|
method="POST",
|
|
|
|
|
|
)
|
|
|
|
|
|
|
|
|
|
|
|
with urllib.request.urlopen(req, timeout=15) as resp:
|
|
|
|
|
|
result = json.loads(resp.read().decode())
|
|
|
|
|
|
except Exception as e:
|
|
|
|
|
|
print(f"Token exchange failed: {e}")
|
|
|
|
|
|
return None
|
|
|
|
|
|
|
|
|
|
|
|
access_token = result.get("access_token", "")
|
|
|
|
|
|
refresh_token = result.get("refresh_token", "")
|
|
|
|
|
|
expires_in = result.get("expires_in", 3600)
|
|
|
|
|
|
|
|
|
|
|
|
if not access_token:
|
|
|
|
|
|
print("No access token in response.")
|
|
|
|
|
|
return None
|
|
|
|
|
|
|
|
|
|
|
|
expires_at_ms = int(time.time() * 1000) + (expires_in * 1000)
|
|
|
|
|
|
return {
|
|
|
|
|
|
"access_token": access_token,
|
|
|
|
|
|
"refresh_token": refresh_token,
|
|
|
|
|
|
"expires_at_ms": expires_at_ms,
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def read_hermes_oauth_credentials() -> Optional[Dict[str, Any]]:
|
|
|
|
|
|
"""Read Hermes-managed OAuth credentials from ~/.hermes/.anthropic_oauth.json."""
|
|
|
|
|
|
if _HERMES_OAUTH_FILE.exists():
|
|
|
|
|
|
try:
|
|
|
|
|
|
data = json.loads(_HERMES_OAUTH_FILE.read_text(encoding="utf-8"))
|
|
|
|
|
|
if data.get("accessToken"):
|
|
|
|
|
|
return data
|
|
|
|
|
|
except (json.JSONDecodeError, OSError, IOError) as e:
|
|
|
|
|
|
logger.debug("Failed to read Hermes OAuth credentials: %s", e)
|
|
|
|
|
|
return None
|
|
|
|
|
|
|
2026-03-16 23:15:43 -07:00
|
|
|
|
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
# ---------------------------------------------------------------------------
|
|
|
|
|
|
# Message / tool / response format conversion
|
|
|
|
|
|
# ---------------------------------------------------------------------------
|
|
|
|
|
|
|
|
|
|
|
|
|
2026-03-21 09:38:04 -07:00
|
|
|
|
def normalize_model_name(model: str, preserve_dots: bool = False) -> str:
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
"""Normalize a model name for the Anthropic API.
|
|
|
|
|
|
|
2026-03-12 17:23:09 -07:00
|
|
|
|
- Strips 'anthropic/' prefix (OpenRouter format, case-insensitive)
|
2026-03-13 03:08:14 -07:00
|
|
|
|
- Converts dots to hyphens in version numbers (OpenRouter uses dots,
|
2026-03-21 09:38:04 -07:00
|
|
|
|
Anthropic uses hyphens: claude-opus-4.6 → claude-opus-4-6), unless
|
|
|
|
|
|
preserve_dots is True (e.g. for Alibaba/DashScope: qwen3.5-plus).
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
"""
|
2026-03-12 17:23:09 -07:00
|
|
|
|
lower = model.lower()
|
|
|
|
|
|
if lower.startswith("anthropic/"):
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
model = model[len("anthropic/"):]
|
2026-03-21 09:38:04 -07:00
|
|
|
|
if not preserve_dots:
|
|
|
|
|
|
# OpenRouter uses dots for version separators (claude-opus-4.6),
|
|
|
|
|
|
# Anthropic uses hyphens (claude-opus-4-6). Convert dots to hyphens.
|
|
|
|
|
|
model = model.replace(".", "-")
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
return model
|
|
|
|
|
|
|
|
|
|
|
|
|
2026-03-12 17:23:09 -07:00
|
|
|
|
def _sanitize_tool_id(tool_id: str) -> str:
|
|
|
|
|
|
"""Sanitize a tool call ID for the Anthropic API.
|
|
|
|
|
|
|
|
|
|
|
|
Anthropic requires IDs matching [a-zA-Z0-9_-]. Replace invalid
|
|
|
|
|
|
characters with underscores and ensure non-empty.
|
|
|
|
|
|
"""
|
|
|
|
|
|
import re
|
|
|
|
|
|
if not tool_id:
|
|
|
|
|
|
return "tool_0"
|
|
|
|
|
|
sanitized = re.sub(r"[^a-zA-Z0-9_-]", "_", tool_id)
|
|
|
|
|
|
return sanitized or "tool_0"
|
|
|
|
|
|
|
|
|
|
|
|
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
def convert_tools_to_anthropic(tools: List[Dict]) -> List[Dict]:
|
|
|
|
|
|
"""Convert OpenAI tool definitions to Anthropic format."""
|
|
|
|
|
|
if not tools:
|
|
|
|
|
|
return []
|
|
|
|
|
|
result = []
|
|
|
|
|
|
for t in tools:
|
|
|
|
|
|
fn = t.get("function", {})
|
|
|
|
|
|
result.append({
|
|
|
|
|
|
"name": fn.get("name", ""),
|
|
|
|
|
|
"description": fn.get("description", ""),
|
|
|
|
|
|
"input_schema": fn.get("parameters", {"type": "object", "properties": {}}),
|
|
|
|
|
|
})
|
|
|
|
|
|
return result
|
|
|
|
|
|
|
|
|
|
|
|
|
2026-03-14 23:21:09 -07:00
|
|
|
|
def _image_source_from_openai_url(url: str) -> Dict[str, str]:
|
|
|
|
|
|
"""Convert an OpenAI-style image URL/data URL into Anthropic image source."""
|
|
|
|
|
|
url = str(url or "").strip()
|
|
|
|
|
|
if not url:
|
|
|
|
|
|
return {"type": "url", "url": ""}
|
|
|
|
|
|
|
|
|
|
|
|
if url.startswith("data:"):
|
|
|
|
|
|
header, _, data = url.partition(",")
|
|
|
|
|
|
media_type = "image/jpeg"
|
|
|
|
|
|
if header.startswith("data:"):
|
|
|
|
|
|
mime_part = header[len("data:"):].split(";", 1)[0].strip()
|
|
|
|
|
|
if mime_part.startswith("image/"):
|
|
|
|
|
|
media_type = mime_part
|
|
|
|
|
|
return {
|
|
|
|
|
|
"type": "base64",
|
|
|
|
|
|
"media_type": media_type,
|
|
|
|
|
|
"data": data,
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
return {"type": "url", "url": url}
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def _convert_content_part_to_anthropic(part: Any) -> Optional[Dict[str, Any]]:
|
|
|
|
|
|
"""Convert a single OpenAI-style content part to Anthropic format."""
|
|
|
|
|
|
if part is None:
|
|
|
|
|
|
return None
|
|
|
|
|
|
if isinstance(part, str):
|
|
|
|
|
|
return {"type": "text", "text": part}
|
|
|
|
|
|
if not isinstance(part, dict):
|
|
|
|
|
|
return {"type": "text", "text": str(part)}
|
|
|
|
|
|
|
|
|
|
|
|
ptype = part.get("type")
|
|
|
|
|
|
|
|
|
|
|
|
if ptype == "input_text":
|
|
|
|
|
|
block: Dict[str, Any] = {"type": "text", "text": part.get("text", "")}
|
|
|
|
|
|
elif ptype in {"image_url", "input_image"}:
|
|
|
|
|
|
image_value = part.get("image_url", {})
|
|
|
|
|
|
url = image_value.get("url", "") if isinstance(image_value, dict) else str(image_value or "")
|
|
|
|
|
|
block = {"type": "image", "source": _image_source_from_openai_url(url)}
|
|
|
|
|
|
else:
|
|
|
|
|
|
block = dict(part)
|
|
|
|
|
|
|
|
|
|
|
|
if isinstance(part.get("cache_control"), dict) and "cache_control" not in block:
|
|
|
|
|
|
block["cache_control"] = dict(part["cache_control"])
|
|
|
|
|
|
return block
|
|
|
|
|
|
|
|
|
|
|
|
|
2026-04-02 10:14:20 -07:00
|
|
|
|
def _to_plain_data(value: Any, *, _depth: int = 0, _path: Optional[set] = None) -> Any:
|
|
|
|
|
|
"""Recursively convert SDK objects to plain Python data structures.
|
|
|
|
|
|
|
|
|
|
|
|
Guards against circular references (``_path`` tracks ``id()`` of objects
|
|
|
|
|
|
on the *current* recursion path) and runaway depth (capped at 20 levels).
|
|
|
|
|
|
Uses path-based tracking so shared (but non-cyclic) objects referenced by
|
|
|
|
|
|
multiple siblings are converted correctly rather than being stringified.
|
|
|
|
|
|
"""
|
|
|
|
|
|
_MAX_DEPTH = 20
|
|
|
|
|
|
if _depth > _MAX_DEPTH:
|
|
|
|
|
|
return str(value)
|
|
|
|
|
|
|
|
|
|
|
|
if _path is None:
|
|
|
|
|
|
_path = set()
|
|
|
|
|
|
|
|
|
|
|
|
obj_id = id(value)
|
|
|
|
|
|
if obj_id in _path:
|
|
|
|
|
|
return str(value)
|
|
|
|
|
|
|
|
|
|
|
|
if hasattr(value, "model_dump"):
|
|
|
|
|
|
_path.add(obj_id)
|
|
|
|
|
|
result = _to_plain_data(value.model_dump(), _depth=_depth + 1, _path=_path)
|
|
|
|
|
|
_path.discard(obj_id)
|
|
|
|
|
|
return result
|
|
|
|
|
|
if isinstance(value, dict):
|
|
|
|
|
|
_path.add(obj_id)
|
|
|
|
|
|
result = {k: _to_plain_data(v, _depth=_depth + 1, _path=_path) for k, v in value.items()}
|
|
|
|
|
|
_path.discard(obj_id)
|
|
|
|
|
|
return result
|
|
|
|
|
|
if isinstance(value, (list, tuple)):
|
|
|
|
|
|
_path.add(obj_id)
|
|
|
|
|
|
result = [_to_plain_data(v, _depth=_depth + 1, _path=_path) for v in value]
|
|
|
|
|
|
_path.discard(obj_id)
|
|
|
|
|
|
return result
|
|
|
|
|
|
if hasattr(value, "__dict__"):
|
|
|
|
|
|
_path.add(obj_id)
|
|
|
|
|
|
result = {
|
|
|
|
|
|
k: _to_plain_data(v, _depth=_depth + 1, _path=_path)
|
|
|
|
|
|
for k, v in vars(value).items()
|
|
|
|
|
|
if not k.startswith("_")
|
|
|
|
|
|
}
|
|
|
|
|
|
_path.discard(obj_id)
|
|
|
|
|
|
return result
|
|
|
|
|
|
return value
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def _extract_preserved_thinking_blocks(message: Dict[str, Any]) -> List[Dict[str, Any]]:
|
|
|
|
|
|
"""Return Anthropic thinking blocks previously preserved on the message."""
|
|
|
|
|
|
raw_details = message.get("reasoning_details")
|
|
|
|
|
|
if not isinstance(raw_details, list):
|
|
|
|
|
|
return []
|
|
|
|
|
|
|
|
|
|
|
|
preserved: List[Dict[str, Any]] = []
|
|
|
|
|
|
for detail in raw_details:
|
|
|
|
|
|
if not isinstance(detail, dict):
|
|
|
|
|
|
continue
|
|
|
|
|
|
block_type = str(detail.get("type", "") or "").strip().lower()
|
|
|
|
|
|
if block_type not in {"thinking", "redacted_thinking"}:
|
|
|
|
|
|
continue
|
|
|
|
|
|
preserved.append(copy.deepcopy(detail))
|
|
|
|
|
|
return preserved
|
|
|
|
|
|
|
|
|
|
|
|
|
2026-03-14 23:21:09 -07:00
|
|
|
|
def _convert_content_to_anthropic(content: Any) -> Any:
|
|
|
|
|
|
"""Convert OpenAI-style multimodal content arrays to Anthropic blocks."""
|
|
|
|
|
|
if not isinstance(content, list):
|
|
|
|
|
|
return content
|
|
|
|
|
|
|
|
|
|
|
|
converted = []
|
|
|
|
|
|
for part in content:
|
|
|
|
|
|
block = _convert_content_part_to_anthropic(part)
|
|
|
|
|
|
if block is not None:
|
|
|
|
|
|
converted.append(block)
|
|
|
|
|
|
return converted
|
|
|
|
|
|
|
|
|
|
|
|
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
def convert_messages_to_anthropic(
|
|
|
|
|
|
messages: List[Dict],
|
2026-04-08 13:51:41 -07:00
|
|
|
|
base_url: str | None = None,
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
) -> Tuple[Optional[Any], List[Dict]]:
|
|
|
|
|
|
"""Convert OpenAI-format messages to Anthropic format.
|
|
|
|
|
|
|
|
|
|
|
|
Returns (system_prompt, anthropic_messages).
|
|
|
|
|
|
System messages are extracted since Anthropic takes them as a separate param.
|
|
|
|
|
|
system_prompt is a string or list of content blocks (when cache_control present).
|
2026-04-08 13:51:41 -07:00
|
|
|
|
|
|
|
|
|
|
When *base_url* is provided and points to a third-party Anthropic-compatible
|
|
|
|
|
|
endpoint, all thinking block signatures are stripped. Signatures are
|
|
|
|
|
|
Anthropic-proprietary — third-party endpoints cannot validate them and will
|
|
|
|
|
|
reject them with HTTP 400 "Invalid signature in thinking block".
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
"""
|
|
|
|
|
|
system = None
|
|
|
|
|
|
result = []
|
|
|
|
|
|
|
|
|
|
|
|
for m in messages:
|
|
|
|
|
|
role = m.get("role", "user")
|
|
|
|
|
|
content = m.get("content", "")
|
|
|
|
|
|
|
|
|
|
|
|
if role == "system":
|
|
|
|
|
|
if isinstance(content, list):
|
|
|
|
|
|
# Preserve cache_control markers on content blocks
|
|
|
|
|
|
has_cache = any(
|
|
|
|
|
|
p.get("cache_control") for p in content if isinstance(p, dict)
|
|
|
|
|
|
)
|
|
|
|
|
|
if has_cache:
|
|
|
|
|
|
system = [p for p in content if isinstance(p, dict)]
|
|
|
|
|
|
else:
|
|
|
|
|
|
system = "\n".join(
|
|
|
|
|
|
p["text"] for p in content if p.get("type") == "text"
|
|
|
|
|
|
)
|
|
|
|
|
|
else:
|
|
|
|
|
|
system = content
|
|
|
|
|
|
continue
|
|
|
|
|
|
|
|
|
|
|
|
if role == "assistant":
|
2026-04-02 10:14:20 -07:00
|
|
|
|
blocks = _extract_preserved_thinking_blocks(m)
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
if content:
|
2026-03-13 13:27:03 -07:00
|
|
|
|
if isinstance(content, list):
|
2026-03-14 23:21:09 -07:00
|
|
|
|
converted_content = _convert_content_to_anthropic(content)
|
|
|
|
|
|
if isinstance(converted_content, list):
|
|
|
|
|
|
blocks.extend(converted_content)
|
2026-03-13 13:27:03 -07:00
|
|
|
|
else:
|
|
|
|
|
|
blocks.append({"type": "text", "text": str(content)})
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
for tc in m.get("tool_calls", []):
|
2026-03-20 22:01:42 +03:00
|
|
|
|
if not tc or not isinstance(tc, dict):
|
|
|
|
|
|
continue
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
fn = tc.get("function", {})
|
|
|
|
|
|
args = fn.get("arguments", "{}")
|
2026-03-12 17:14:22 -07:00
|
|
|
|
try:
|
|
|
|
|
|
parsed_args = json.loads(args) if isinstance(args, str) else args
|
|
|
|
|
|
except (json.JSONDecodeError, ValueError):
|
|
|
|
|
|
parsed_args = {}
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
blocks.append({
|
|
|
|
|
|
"type": "tool_use",
|
2026-03-12 17:23:09 -07:00
|
|
|
|
"id": _sanitize_tool_id(tc.get("id", "")),
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
"name": fn.get("name", ""),
|
2026-03-12 17:14:22 -07:00
|
|
|
|
"input": parsed_args,
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
})
|
2026-03-12 17:14:22 -07:00
|
|
|
|
# Anthropic rejects empty assistant content
|
|
|
|
|
|
effective = blocks or content
|
|
|
|
|
|
if not effective or effective == "":
|
|
|
|
|
|
effective = [{"type": "text", "text": "(empty)"}]
|
|
|
|
|
|
result.append({"role": "assistant", "content": effective})
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
continue
|
|
|
|
|
|
|
|
|
|
|
|
if role == "tool":
|
2026-03-12 17:23:09 -07:00
|
|
|
|
# Sanitize tool_use_id and ensure non-empty content
|
|
|
|
|
|
result_content = content if isinstance(content, str) else json.dumps(content)
|
|
|
|
|
|
if not result_content:
|
|
|
|
|
|
result_content = "(no output)"
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
tool_result = {
|
|
|
|
|
|
"type": "tool_result",
|
2026-03-12 17:23:09 -07:00
|
|
|
|
"tool_use_id": _sanitize_tool_id(m.get("tool_call_id", "")),
|
|
|
|
|
|
"content": result_content,
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
}
|
2026-03-13 13:27:03 -07:00
|
|
|
|
if isinstance(m.get("cache_control"), dict):
|
|
|
|
|
|
tool_result["cache_control"] = dict(m["cache_control"])
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
# Merge consecutive tool results into one user message
|
|
|
|
|
|
if (
|
|
|
|
|
|
result
|
|
|
|
|
|
and result[-1]["role"] == "user"
|
|
|
|
|
|
and isinstance(result[-1]["content"], list)
|
|
|
|
|
|
and result[-1]["content"]
|
|
|
|
|
|
and result[-1]["content"][0].get("type") == "tool_result"
|
|
|
|
|
|
):
|
|
|
|
|
|
result[-1]["content"].append(tool_result)
|
|
|
|
|
|
else:
|
|
|
|
|
|
result.append({"role": "user", "content": [tool_result]})
|
|
|
|
|
|
continue
|
|
|
|
|
|
|
2026-03-26 19:24:03 -07:00
|
|
|
|
# Regular user message — validate non-empty content (Anthropic rejects empty)
|
2026-03-14 21:14:20 -07:00
|
|
|
|
if isinstance(content, list):
|
2026-03-14 23:44:47 -07:00
|
|
|
|
converted_blocks = _convert_content_to_anthropic(content)
|
2026-03-26 19:24:03 -07:00
|
|
|
|
# Check if all text blocks are empty
|
|
|
|
|
|
if not converted_blocks or all(
|
|
|
|
|
|
b.get("text", "").strip() == ""
|
|
|
|
|
|
for b in converted_blocks
|
|
|
|
|
|
if isinstance(b, dict) and b.get("type") == "text"
|
|
|
|
|
|
):
|
|
|
|
|
|
converted_blocks = [{"type": "text", "text": "(empty message)"}]
|
|
|
|
|
|
result.append({"role": "user", "content": converted_blocks})
|
2026-03-14 21:14:20 -07:00
|
|
|
|
else:
|
2026-03-26 19:24:03 -07:00
|
|
|
|
# Validate string content is non-empty
|
|
|
|
|
|
if not content or (isinstance(content, str) and not content.strip()):
|
|
|
|
|
|
content = "(empty message)"
|
2026-03-14 21:14:20 -07:00
|
|
|
|
result.append({"role": "user", "content": content})
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
|
|
|
|
|
|
# Strip orphaned tool_use blocks (no matching tool_result follows)
|
|
|
|
|
|
tool_result_ids = set()
|
|
|
|
|
|
for m in result:
|
|
|
|
|
|
if m["role"] == "user" and isinstance(m["content"], list):
|
|
|
|
|
|
for block in m["content"]:
|
|
|
|
|
|
if block.get("type") == "tool_result":
|
|
|
|
|
|
tool_result_ids.add(block.get("tool_use_id"))
|
|
|
|
|
|
for m in result:
|
|
|
|
|
|
if m["role"] == "assistant" and isinstance(m["content"], list):
|
|
|
|
|
|
m["content"] = [
|
|
|
|
|
|
b
|
|
|
|
|
|
for b in m["content"]
|
|
|
|
|
|
if b.get("type") != "tool_use" or b.get("id") in tool_result_ids
|
|
|
|
|
|
]
|
|
|
|
|
|
if not m["content"]:
|
|
|
|
|
|
m["content"] = [{"type": "text", "text": "(tool call removed)"}]
|
|
|
|
|
|
|
2026-03-20 08:39:49 -07:00
|
|
|
|
# Strip orphaned tool_result blocks (no matching tool_use precedes them).
|
|
|
|
|
|
# This is the mirror of the above: context compression or session truncation
|
|
|
|
|
|
# can remove an assistant message containing a tool_use while leaving the
|
|
|
|
|
|
# subsequent tool_result intact. Anthropic rejects these with a 400.
|
|
|
|
|
|
tool_use_ids = set()
|
|
|
|
|
|
for m in result:
|
|
|
|
|
|
if m["role"] == "assistant" and isinstance(m["content"], list):
|
|
|
|
|
|
for block in m["content"]:
|
|
|
|
|
|
if block.get("type") == "tool_use":
|
|
|
|
|
|
tool_use_ids.add(block.get("id"))
|
|
|
|
|
|
for m in result:
|
|
|
|
|
|
if m["role"] == "user" and isinstance(m["content"], list):
|
|
|
|
|
|
m["content"] = [
|
|
|
|
|
|
b
|
|
|
|
|
|
for b in m["content"]
|
|
|
|
|
|
if b.get("type") != "tool_result" or b.get("tool_use_id") in tool_use_ids
|
|
|
|
|
|
]
|
|
|
|
|
|
if not m["content"]:
|
|
|
|
|
|
m["content"] = [{"type": "text", "text": "(tool result removed)"}]
|
|
|
|
|
|
|
fix(anthropic): address gaps found in deep-dive audit
After studying clawdbot (OpenClaw) and OpenCode implementations:
## Beta headers
- Add interleaved-thinking-2025-05-14 and fine-grained-tool-streaming-2025-05-14
as common betas (sent with ALL auth types, not just OAuth)
- OAuth tokens additionally get oauth-2025-04-20
- API keys now also get the common betas (previously got none)
## Vision/image support
- Add _convert_vision_content() to convert OpenAI multimodal format
(image_url blocks) to Anthropic format (image blocks with base64/url source)
- Handles both data: URIs (base64) and regular URLs
## Role alternation enforcement
- Anthropic strictly rejects consecutive same-role messages (400 error)
- Add post-processing step that merges consecutive user/assistant messages
- Handles string, list, and mixed content types during merge
## Tool choice support
- Add tool_choice parameter to build_anthropic_kwargs()
- Maps OpenAI values: auto→auto, required→any, none→omit, name→tool
## Cache metrics tracking
- Anthropic uses cache_read_input_tokens / cache_creation_input_tokens
(different from OpenRouter's prompt_tokens_details.cached_tokens)
- Add api_mode-aware branch in run_agent.py cache stats logging
## Credential refresh on 401
- On 401 error during anthropic_messages mode, re-read credentials
via resolve_anthropic_token() (picks up refreshed Claude Code tokens)
- Rebuild client if new token differs from current one
- Follows same pattern as Codex/Nous 401 refresh handlers
## Tests
- 44 adapter tests (8 new: vision conversion, role alternation, tool choice)
- Updated beta header tests to verify new structure
- Full suite: 3198 passed, 0 regressions
2026-03-12 16:00:46 -07:00
|
|
|
|
# Enforce strict role alternation (Anthropic rejects consecutive same-role messages)
|
|
|
|
|
|
fixed = []
|
|
|
|
|
|
for m in result:
|
|
|
|
|
|
if fixed and fixed[-1]["role"] == m["role"]:
|
|
|
|
|
|
if m["role"] == "user":
|
|
|
|
|
|
# Merge consecutive user messages
|
|
|
|
|
|
prev_content = fixed[-1]["content"]
|
|
|
|
|
|
curr_content = m["content"]
|
|
|
|
|
|
if isinstance(prev_content, str) and isinstance(curr_content, str):
|
|
|
|
|
|
fixed[-1]["content"] = prev_content + "\n" + curr_content
|
|
|
|
|
|
elif isinstance(prev_content, list) and isinstance(curr_content, list):
|
|
|
|
|
|
fixed[-1]["content"] = prev_content + curr_content
|
|
|
|
|
|
else:
|
|
|
|
|
|
# Mixed types — wrap string in list
|
|
|
|
|
|
if isinstance(prev_content, str):
|
|
|
|
|
|
prev_content = [{"type": "text", "text": prev_content}]
|
|
|
|
|
|
if isinstance(curr_content, str):
|
|
|
|
|
|
curr_content = [{"type": "text", "text": curr_content}]
|
|
|
|
|
|
fixed[-1]["content"] = prev_content + curr_content
|
|
|
|
|
|
else:
|
2026-04-08 03:38:08 -07:00
|
|
|
|
# Consecutive assistant messages — merge text content.
|
|
|
|
|
|
# Drop thinking blocks from the *second* message: their
|
|
|
|
|
|
# signature was computed against a different turn boundary
|
|
|
|
|
|
# and becomes invalid once merged.
|
|
|
|
|
|
if isinstance(m["content"], list):
|
|
|
|
|
|
m["content"] = [
|
|
|
|
|
|
b for b in m["content"]
|
|
|
|
|
|
if not (isinstance(b, dict) and b.get("type") in ("thinking", "redacted_thinking"))
|
|
|
|
|
|
]
|
fix(anthropic): address gaps found in deep-dive audit
After studying clawdbot (OpenClaw) and OpenCode implementations:
## Beta headers
- Add interleaved-thinking-2025-05-14 and fine-grained-tool-streaming-2025-05-14
as common betas (sent with ALL auth types, not just OAuth)
- OAuth tokens additionally get oauth-2025-04-20
- API keys now also get the common betas (previously got none)
## Vision/image support
- Add _convert_vision_content() to convert OpenAI multimodal format
(image_url blocks) to Anthropic format (image blocks with base64/url source)
- Handles both data: URIs (base64) and regular URLs
## Role alternation enforcement
- Anthropic strictly rejects consecutive same-role messages (400 error)
- Add post-processing step that merges consecutive user/assistant messages
- Handles string, list, and mixed content types during merge
## Tool choice support
- Add tool_choice parameter to build_anthropic_kwargs()
- Maps OpenAI values: auto→auto, required→any, none→omit, name→tool
## Cache metrics tracking
- Anthropic uses cache_read_input_tokens / cache_creation_input_tokens
(different from OpenRouter's prompt_tokens_details.cached_tokens)
- Add api_mode-aware branch in run_agent.py cache stats logging
## Credential refresh on 401
- On 401 error during anthropic_messages mode, re-read credentials
via resolve_anthropic_token() (picks up refreshed Claude Code tokens)
- Rebuild client if new token differs from current one
- Follows same pattern as Codex/Nous 401 refresh handlers
## Tests
- 44 adapter tests (8 new: vision conversion, role alternation, tool choice)
- Updated beta header tests to verify new structure
- Full suite: 3198 passed, 0 regressions
2026-03-12 16:00:46 -07:00
|
|
|
|
prev_blocks = fixed[-1]["content"]
|
|
|
|
|
|
curr_blocks = m["content"]
|
|
|
|
|
|
if isinstance(prev_blocks, list) and isinstance(curr_blocks, list):
|
|
|
|
|
|
fixed[-1]["content"] = prev_blocks + curr_blocks
|
|
|
|
|
|
elif isinstance(prev_blocks, str) and isinstance(curr_blocks, str):
|
|
|
|
|
|
fixed[-1]["content"] = prev_blocks + "\n" + curr_blocks
|
|
|
|
|
|
else:
|
2026-03-17 03:48:55 -07:00
|
|
|
|
# Mixed types — normalize both to list and merge
|
|
|
|
|
|
if isinstance(prev_blocks, str):
|
|
|
|
|
|
prev_blocks = [{"type": "text", "text": prev_blocks}]
|
|
|
|
|
|
if isinstance(curr_blocks, str):
|
|
|
|
|
|
curr_blocks = [{"type": "text", "text": curr_blocks}]
|
|
|
|
|
|
fixed[-1]["content"] = prev_blocks + curr_blocks
|
fix(anthropic): address gaps found in deep-dive audit
After studying clawdbot (OpenClaw) and OpenCode implementations:
## Beta headers
- Add interleaved-thinking-2025-05-14 and fine-grained-tool-streaming-2025-05-14
as common betas (sent with ALL auth types, not just OAuth)
- OAuth tokens additionally get oauth-2025-04-20
- API keys now also get the common betas (previously got none)
## Vision/image support
- Add _convert_vision_content() to convert OpenAI multimodal format
(image_url blocks) to Anthropic format (image blocks with base64/url source)
- Handles both data: URIs (base64) and regular URLs
## Role alternation enforcement
- Anthropic strictly rejects consecutive same-role messages (400 error)
- Add post-processing step that merges consecutive user/assistant messages
- Handles string, list, and mixed content types during merge
## Tool choice support
- Add tool_choice parameter to build_anthropic_kwargs()
- Maps OpenAI values: auto→auto, required→any, none→omit, name→tool
## Cache metrics tracking
- Anthropic uses cache_read_input_tokens / cache_creation_input_tokens
(different from OpenRouter's prompt_tokens_details.cached_tokens)
- Add api_mode-aware branch in run_agent.py cache stats logging
## Credential refresh on 401
- On 401 error during anthropic_messages mode, re-read credentials
via resolve_anthropic_token() (picks up refreshed Claude Code tokens)
- Rebuild client if new token differs from current one
- Follows same pattern as Codex/Nous 401 refresh handlers
## Tests
- 44 adapter tests (8 new: vision conversion, role alternation, tool choice)
- Updated beta header tests to verify new structure
- Full suite: 3198 passed, 0 regressions
2026-03-12 16:00:46 -07:00
|
|
|
|
else:
|
|
|
|
|
|
fixed.append(m)
|
|
|
|
|
|
result = fixed
|
|
|
|
|
|
|
2026-04-08 03:38:08 -07:00
|
|
|
|
# ── Thinking block signature management ──────────────────────────
|
|
|
|
|
|
# Anthropic signs thinking blocks against the full turn content.
|
|
|
|
|
|
# Any upstream mutation (context compression, session truncation,
|
|
|
|
|
|
# orphan stripping, message merging) invalidates the signature,
|
|
|
|
|
|
# causing HTTP 400 "Invalid signature in thinking block".
|
|
|
|
|
|
#
|
2026-04-08 13:51:41 -07:00
|
|
|
|
# Signatures are Anthropic-proprietary. Third-party endpoints
|
|
|
|
|
|
# (MiniMax, Azure AI Foundry, self-hosted proxies) cannot validate
|
|
|
|
|
|
# them and will reject them outright. When targeting a third-party
|
|
|
|
|
|
# endpoint, strip ALL thinking/redacted_thinking blocks from every
|
|
|
|
|
|
# assistant message — the third-party will generate its own
|
|
|
|
|
|
# thinking blocks if it supports extended thinking.
|
|
|
|
|
|
#
|
|
|
|
|
|
# For direct Anthropic (strategy following clawdbot/OpenClaw):
|
2026-04-08 03:38:08 -07:00
|
|
|
|
# 1. Strip thinking/redacted_thinking from all assistant messages
|
|
|
|
|
|
# EXCEPT the last one — preserves reasoning continuity on the
|
|
|
|
|
|
# current tool-use chain while avoiding stale signature errors.
|
|
|
|
|
|
# 2. Downgrade unsigned thinking blocks (no signature) to text —
|
|
|
|
|
|
# Anthropic can't validate them and will reject them.
|
|
|
|
|
|
# 3. Strip cache_control from thinking/redacted_thinking blocks —
|
|
|
|
|
|
# cache markers can interfere with signature validation.
|
|
|
|
|
|
_THINKING_TYPES = frozenset(("thinking", "redacted_thinking"))
|
2026-04-08 13:51:41 -07:00
|
|
|
|
_is_third_party = _is_third_party_anthropic_endpoint(base_url)
|
2026-04-08 03:38:08 -07:00
|
|
|
|
|
|
|
|
|
|
last_assistant_idx = None
|
|
|
|
|
|
for i in range(len(result) - 1, -1, -1):
|
|
|
|
|
|
if result[i].get("role") == "assistant":
|
|
|
|
|
|
last_assistant_idx = i
|
|
|
|
|
|
break
|
|
|
|
|
|
|
|
|
|
|
|
for idx, m in enumerate(result):
|
|
|
|
|
|
if m.get("role") != "assistant" or not isinstance(m.get("content"), list):
|
|
|
|
|
|
continue
|
|
|
|
|
|
|
2026-04-08 13:51:41 -07:00
|
|
|
|
if _is_third_party or idx != last_assistant_idx:
|
|
|
|
|
|
# Third-party endpoint: strip ALL thinking blocks from every
|
|
|
|
|
|
# assistant message — signatures are Anthropic-proprietary.
|
|
|
|
|
|
# Direct Anthropic: strip from non-latest assistant messages only.
|
2026-04-08 03:38:08 -07:00
|
|
|
|
stripped = [
|
|
|
|
|
|
b for b in m["content"]
|
|
|
|
|
|
if not (isinstance(b, dict) and b.get("type") in _THINKING_TYPES)
|
|
|
|
|
|
]
|
|
|
|
|
|
m["content"] = stripped or [{"type": "text", "text": "(thinking elided)"}]
|
|
|
|
|
|
else:
|
2026-04-08 13:51:41 -07:00
|
|
|
|
# Latest assistant on direct Anthropic: keep signed thinking
|
|
|
|
|
|
# blocks for reasoning continuity; downgrade unsigned ones to
|
|
|
|
|
|
# plain text.
|
2026-04-08 03:38:08 -07:00
|
|
|
|
new_content = []
|
|
|
|
|
|
for b in m["content"]:
|
|
|
|
|
|
if not isinstance(b, dict) or b.get("type") not in _THINKING_TYPES:
|
|
|
|
|
|
new_content.append(b)
|
|
|
|
|
|
continue
|
|
|
|
|
|
if b.get("type") == "redacted_thinking":
|
|
|
|
|
|
# Redacted blocks use 'data' for the signature payload
|
|
|
|
|
|
if b.get("data"):
|
|
|
|
|
|
new_content.append(b)
|
|
|
|
|
|
# else: drop — no data means it can't be validated
|
|
|
|
|
|
elif b.get("signature"):
|
|
|
|
|
|
# Signed thinking block — keep it
|
|
|
|
|
|
new_content.append(b)
|
|
|
|
|
|
else:
|
|
|
|
|
|
# Unsigned thinking — downgrade to text so it's not lost
|
|
|
|
|
|
thinking_text = b.get("thinking", "")
|
|
|
|
|
|
if thinking_text:
|
|
|
|
|
|
new_content.append({"type": "text", "text": thinking_text})
|
|
|
|
|
|
m["content"] = new_content or [{"type": "text", "text": "(empty)"}]
|
|
|
|
|
|
|
|
|
|
|
|
# Strip cache_control from any remaining thinking/redacted_thinking
|
|
|
|
|
|
# blocks — cache markers interfere with signature validation.
|
|
|
|
|
|
for b in m["content"]:
|
|
|
|
|
|
if isinstance(b, dict) and b.get("type") in _THINKING_TYPES:
|
|
|
|
|
|
b.pop("cache_control", None)
|
|
|
|
|
|
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
return system, result
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def build_anthropic_kwargs(
|
|
|
|
|
|
model: str,
|
|
|
|
|
|
messages: List[Dict],
|
|
|
|
|
|
tools: Optional[List[Dict]],
|
|
|
|
|
|
max_tokens: Optional[int],
|
|
|
|
|
|
reasoning_config: Optional[Dict[str, Any]],
|
fix(anthropic): address gaps found in deep-dive audit
After studying clawdbot (OpenClaw) and OpenCode implementations:
## Beta headers
- Add interleaved-thinking-2025-05-14 and fine-grained-tool-streaming-2025-05-14
as common betas (sent with ALL auth types, not just OAuth)
- OAuth tokens additionally get oauth-2025-04-20
- API keys now also get the common betas (previously got none)
## Vision/image support
- Add _convert_vision_content() to convert OpenAI multimodal format
(image_url blocks) to Anthropic format (image blocks with base64/url source)
- Handles both data: URIs (base64) and regular URLs
## Role alternation enforcement
- Anthropic strictly rejects consecutive same-role messages (400 error)
- Add post-processing step that merges consecutive user/assistant messages
- Handles string, list, and mixed content types during merge
## Tool choice support
- Add tool_choice parameter to build_anthropic_kwargs()
- Maps OpenAI values: auto→auto, required→any, none→omit, name→tool
## Cache metrics tracking
- Anthropic uses cache_read_input_tokens / cache_creation_input_tokens
(different from OpenRouter's prompt_tokens_details.cached_tokens)
- Add api_mode-aware branch in run_agent.py cache stats logging
## Credential refresh on 401
- On 401 error during anthropic_messages mode, re-read credentials
via resolve_anthropic_token() (picks up refreshed Claude Code tokens)
- Rebuild client if new token differs from current one
- Follows same pattern as Codex/Nous 401 refresh handlers
## Tests
- 44 adapter tests (8 new: vision conversion, role alternation, tool choice)
- Updated beta header tests to verify new structure
- Full suite: 3198 passed, 0 regressions
2026-03-12 16:00:46 -07:00
|
|
|
|
tool_choice: Optional[str] = None,
|
2026-03-16 17:08:22 -07:00
|
|
|
|
is_oauth: bool = False,
|
2026-03-21 09:38:04 -07:00
|
|
|
|
preserve_dots: bool = False,
|
2026-03-27 13:02:52 -07:00
|
|
|
|
context_length: Optional[int] = None,
|
2026-04-08 13:51:41 -07:00
|
|
|
|
base_url: str | None = None,
|
2026-04-10 02:32:15 -07:00
|
|
|
|
fast_mode: bool = False,
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
) -> Dict[str, Any]:
|
2026-03-16 17:08:22 -07:00
|
|
|
|
"""Build kwargs for anthropic.messages.create().
|
|
|
|
|
|
|
fix(compaction): don't halve context_length on output-cap-too-large errors
When the API returns "max_tokens too large given prompt" (input tokens
are within the context window, but input + requested output > window),
the old code incorrectly routed through the same handler as "prompt too
long" errors, calling get_next_probe_tier() and permanently halving
context_length. This made things worse: the window was fine, only the
requested output size needed trimming for that one call.
Two distinct error classes now handled separately:
Prompt too long — input itself exceeds context window.
Fix: compress history + halve context_length (existing behaviour,
unchanged).
Output cap too large — input OK, but input + max_tokens > window.
Fix: parse available_tokens from the error message, set a one-shot
_ephemeral_max_output_tokens override for the retry, and leave
context_length completely untouched.
Changes:
- agent/model_metadata.py: add parse_available_output_tokens_from_error()
that detects Anthropic's "available_tokens: N" error format and returns
the available output budget, or None for all other error types.
- run_agent.py: call the new parser first in the is_context_length_error
block; if it fires, set _ephemeral_max_output_tokens (with a 64-token
safety margin) and break to retry without touching context_length.
_build_api_kwargs consumes the ephemeral value exactly once then clears
it so subsequent calls use self.max_tokens normally.
- agent/anthropic_adapter.py: expand build_anthropic_kwargs docstring to
clearly document the max_tokens (output cap) vs context_length (total
window) distinction, which is a persistent source of confusion due to
the OpenAI-inherited "max_tokens" name.
- cli-config.yaml.example: add inline comments explaining both keys side
by side where users are most likely to look.
- website/docs/integrations/providers.md: add a callout box at the top
of "Context Length Detection" and clarify the troubleshooting entry.
- tests/test_ctx_halving_fix.py: 24 tests across four classes covering
the parser, build_anthropic_kwargs clamping, ephemeral one-shot
consumption, and the invariant that context_length is never mutated
on output-cap errors.
2026-04-09 16:54:23 +02:00
|
|
|
|
Naming note — two distinct concepts, easily confused:
|
|
|
|
|
|
max_tokens = OUTPUT token cap for a single response.
|
|
|
|
|
|
Anthropic's API calls this "max_tokens" but it only
|
|
|
|
|
|
limits the *output*. Anthropic's own native SDK
|
|
|
|
|
|
renamed it "max_output_tokens" for clarity.
|
|
|
|
|
|
context_length = TOTAL context window (input tokens + output tokens).
|
|
|
|
|
|
The API enforces: input_tokens + max_tokens ≤ context_length.
|
|
|
|
|
|
Stored on the ContextCompressor; reduced on overflow errors.
|
|
|
|
|
|
|
|
|
|
|
|
When *max_tokens* is None the model's native output ceiling is used
|
|
|
|
|
|
(e.g. 128K for Opus 4.6, 64K for Sonnet 4.6).
|
|
|
|
|
|
|
|
|
|
|
|
When *context_length* is provided and the model's native output ceiling
|
|
|
|
|
|
exceeds it (e.g. a local endpoint with an 8K window), the output cap is
|
|
|
|
|
|
clamped to context_length − 1. This only kicks in for unusually small
|
|
|
|
|
|
context windows; for full-size models the native output cap is always
|
|
|
|
|
|
smaller than the context window so no clamping happens.
|
|
|
|
|
|
NOTE: this clamping does not account for prompt size — if the prompt is
|
|
|
|
|
|
large, Anthropic may still reject the request. The caller must detect
|
|
|
|
|
|
"max_tokens too large given prompt" errors and retry with a smaller cap
|
|
|
|
|
|
(see parse_available_output_tokens_from_error + _ephemeral_max_output_tokens).
|
2026-03-27 13:02:52 -07:00
|
|
|
|
|
2026-03-16 17:08:22 -07:00
|
|
|
|
When *is_oauth* is True, applies Claude Code compatibility transforms:
|
|
|
|
|
|
system prompt prefix, tool name prefixing, and prompt sanitization.
|
2026-03-21 09:38:04 -07:00
|
|
|
|
|
|
|
|
|
|
When *preserve_dots* is True, model name dots are not converted to hyphens
|
|
|
|
|
|
(for Alibaba/DashScope anthropic-compatible endpoints: qwen3.5-plus).
|
2026-04-08 13:51:41 -07:00
|
|
|
|
|
|
|
|
|
|
When *base_url* points to a third-party Anthropic-compatible endpoint,
|
|
|
|
|
|
thinking block signatures are stripped (they are Anthropic-proprietary).
|
2026-04-10 02:32:15 -07:00
|
|
|
|
|
2026-04-13 13:37:05 -07:00
|
|
|
|
When *fast_mode* is True, adds ``extra_body["speed"] = "fast"`` and the
|
|
|
|
|
|
fast-mode beta header for ~2.5x faster output throughput on Opus 4.6.
|
|
|
|
|
|
Currently only supported on native Anthropic endpoints (not third-party
|
|
|
|
|
|
compatible ones).
|
2026-03-16 17:08:22 -07:00
|
|
|
|
"""
|
2026-04-08 13:51:41 -07:00
|
|
|
|
system, anthropic_messages = convert_messages_to_anthropic(messages, base_url=base_url)
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
anthropic_tools = convert_tools_to_anthropic(tools) if tools else []
|
|
|
|
|
|
|
2026-03-21 09:38:04 -07:00
|
|
|
|
model = normalize_model_name(model, preserve_dots=preserve_dots)
|
fix(compaction): don't halve context_length on output-cap-too-large errors
When the API returns "max_tokens too large given prompt" (input tokens
are within the context window, but input + requested output > window),
the old code incorrectly routed through the same handler as "prompt too
long" errors, calling get_next_probe_tier() and permanently halving
context_length. This made things worse: the window was fine, only the
requested output size needed trimming for that one call.
Two distinct error classes now handled separately:
Prompt too long — input itself exceeds context window.
Fix: compress history + halve context_length (existing behaviour,
unchanged).
Output cap too large — input OK, but input + max_tokens > window.
Fix: parse available_tokens from the error message, set a one-shot
_ephemeral_max_output_tokens override for the retry, and leave
context_length completely untouched.
Changes:
- agent/model_metadata.py: add parse_available_output_tokens_from_error()
that detects Anthropic's "available_tokens: N" error format and returns
the available output budget, or None for all other error types.
- run_agent.py: call the new parser first in the is_context_length_error
block; if it fires, set _ephemeral_max_output_tokens (with a 64-token
safety margin) and break to retry without touching context_length.
_build_api_kwargs consumes the ephemeral value exactly once then clears
it so subsequent calls use self.max_tokens normally.
- agent/anthropic_adapter.py: expand build_anthropic_kwargs docstring to
clearly document the max_tokens (output cap) vs context_length (total
window) distinction, which is a persistent source of confusion due to
the OpenAI-inherited "max_tokens" name.
- cli-config.yaml.example: add inline comments explaining both keys side
by side where users are most likely to look.
- website/docs/integrations/providers.md: add a callout box at the top
of "Context Length Detection" and clarify the troubleshooting entry.
- tests/test_ctx_halving_fix.py: 24 tests across four classes covering
the parser, build_anthropic_kwargs clamping, ephemeral one-shot
consumption, and the invariant that context_length is never mutated
on output-cap errors.
2026-04-09 16:54:23 +02:00
|
|
|
|
# effective_max_tokens = output cap for this call (≠ total context window)
|
2026-03-27 13:02:52 -07:00
|
|
|
|
effective_max_tokens = max_tokens or _get_anthropic_max_output(model)
|
|
|
|
|
|
|
fix(compaction): don't halve context_length on output-cap-too-large errors
When the API returns "max_tokens too large given prompt" (input tokens
are within the context window, but input + requested output > window),
the old code incorrectly routed through the same handler as "prompt too
long" errors, calling get_next_probe_tier() and permanently halving
context_length. This made things worse: the window was fine, only the
requested output size needed trimming for that one call.
Two distinct error classes now handled separately:
Prompt too long — input itself exceeds context window.
Fix: compress history + halve context_length (existing behaviour,
unchanged).
Output cap too large — input OK, but input + max_tokens > window.
Fix: parse available_tokens from the error message, set a one-shot
_ephemeral_max_output_tokens override for the retry, and leave
context_length completely untouched.
Changes:
- agent/model_metadata.py: add parse_available_output_tokens_from_error()
that detects Anthropic's "available_tokens: N" error format and returns
the available output budget, or None for all other error types.
- run_agent.py: call the new parser first in the is_context_length_error
block; if it fires, set _ephemeral_max_output_tokens (with a 64-token
safety margin) and break to retry without touching context_length.
_build_api_kwargs consumes the ephemeral value exactly once then clears
it so subsequent calls use self.max_tokens normally.
- agent/anthropic_adapter.py: expand build_anthropic_kwargs docstring to
clearly document the max_tokens (output cap) vs context_length (total
window) distinction, which is a persistent source of confusion due to
the OpenAI-inherited "max_tokens" name.
- cli-config.yaml.example: add inline comments explaining both keys side
by side where users are most likely to look.
- website/docs/integrations/providers.md: add a callout box at the top
of "Context Length Detection" and clarify the troubleshooting entry.
- tests/test_ctx_halving_fix.py: 24 tests across four classes covering
the parser, build_anthropic_kwargs clamping, ephemeral one-shot
consumption, and the invariant that context_length is never mutated
on output-cap errors.
2026-04-09 16:54:23 +02:00
|
|
|
|
# Clamp output cap to fit inside the total context window.
|
|
|
|
|
|
# Only matters for small custom endpoints where context_length < native
|
|
|
|
|
|
# output ceiling. For standard Anthropic models context_length (e.g.
|
|
|
|
|
|
# 200K) is always larger than the output ceiling (e.g. 128K), so this
|
|
|
|
|
|
# branch is not taken.
|
2026-03-27 13:02:52 -07:00
|
|
|
|
if context_length and effective_max_tokens > context_length:
|
|
|
|
|
|
effective_max_tokens = max(context_length - 1, 1)
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
|
2026-03-16 17:08:22 -07:00
|
|
|
|
# ── OAuth: Claude Code identity ──────────────────────────────────
|
|
|
|
|
|
if is_oauth:
|
|
|
|
|
|
# 1. Prepend Claude Code system prompt identity
|
|
|
|
|
|
cc_block = {"type": "text", "text": _CLAUDE_CODE_SYSTEM_PREFIX}
|
|
|
|
|
|
if isinstance(system, list):
|
|
|
|
|
|
system = [cc_block] + system
|
|
|
|
|
|
elif isinstance(system, str) and system:
|
|
|
|
|
|
system = [cc_block, {"type": "text", "text": system}]
|
|
|
|
|
|
else:
|
|
|
|
|
|
system = [cc_block]
|
|
|
|
|
|
|
|
|
|
|
|
# 2. Sanitize system prompt — replace product name references
|
|
|
|
|
|
# to avoid Anthropic's server-side content filters.
|
|
|
|
|
|
for block in system:
|
|
|
|
|
|
if isinstance(block, dict) and block.get("type") == "text":
|
|
|
|
|
|
text = block.get("text", "")
|
|
|
|
|
|
text = text.replace("Hermes Agent", "Claude Code")
|
|
|
|
|
|
text = text.replace("Hermes agent", "Claude Code")
|
|
|
|
|
|
text = text.replace("hermes-agent", "claude-code")
|
|
|
|
|
|
text = text.replace("Nous Research", "Anthropic")
|
|
|
|
|
|
block["text"] = text
|
|
|
|
|
|
|
|
|
|
|
|
# 3. Prefix tool names with mcp_ (Claude Code convention)
|
|
|
|
|
|
if anthropic_tools:
|
|
|
|
|
|
for tool in anthropic_tools:
|
|
|
|
|
|
if "name" in tool:
|
|
|
|
|
|
tool["name"] = _MCP_TOOL_PREFIX + tool["name"]
|
|
|
|
|
|
|
|
|
|
|
|
# 4. Prefix tool names in message history (tool_use and tool_result blocks)
|
|
|
|
|
|
for msg in anthropic_messages:
|
|
|
|
|
|
content = msg.get("content")
|
|
|
|
|
|
if isinstance(content, list):
|
|
|
|
|
|
for block in content:
|
|
|
|
|
|
if isinstance(block, dict):
|
|
|
|
|
|
if block.get("type") == "tool_use" and "name" in block:
|
|
|
|
|
|
if not block["name"].startswith(_MCP_TOOL_PREFIX):
|
|
|
|
|
|
block["name"] = _MCP_TOOL_PREFIX + block["name"]
|
|
|
|
|
|
elif block.get("type") == "tool_result" and "tool_use_id" in block:
|
|
|
|
|
|
pass # tool_result uses ID, not name
|
|
|
|
|
|
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
kwargs: Dict[str, Any] = {
|
|
|
|
|
|
"model": model,
|
|
|
|
|
|
"messages": anthropic_messages,
|
|
|
|
|
|
"max_tokens": effective_max_tokens,
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
if system:
|
|
|
|
|
|
kwargs["system"] = system
|
|
|
|
|
|
|
|
|
|
|
|
if anthropic_tools:
|
|
|
|
|
|
kwargs["tools"] = anthropic_tools
|
fix(anthropic): address gaps found in deep-dive audit
After studying clawdbot (OpenClaw) and OpenCode implementations:
## Beta headers
- Add interleaved-thinking-2025-05-14 and fine-grained-tool-streaming-2025-05-14
as common betas (sent with ALL auth types, not just OAuth)
- OAuth tokens additionally get oauth-2025-04-20
- API keys now also get the common betas (previously got none)
## Vision/image support
- Add _convert_vision_content() to convert OpenAI multimodal format
(image_url blocks) to Anthropic format (image blocks with base64/url source)
- Handles both data: URIs (base64) and regular URLs
## Role alternation enforcement
- Anthropic strictly rejects consecutive same-role messages (400 error)
- Add post-processing step that merges consecutive user/assistant messages
- Handles string, list, and mixed content types during merge
## Tool choice support
- Add tool_choice parameter to build_anthropic_kwargs()
- Maps OpenAI values: auto→auto, required→any, none→omit, name→tool
## Cache metrics tracking
- Anthropic uses cache_read_input_tokens / cache_creation_input_tokens
(different from OpenRouter's prompt_tokens_details.cached_tokens)
- Add api_mode-aware branch in run_agent.py cache stats logging
## Credential refresh on 401
- On 401 error during anthropic_messages mode, re-read credentials
via resolve_anthropic_token() (picks up refreshed Claude Code tokens)
- Rebuild client if new token differs from current one
- Follows same pattern as Codex/Nous 401 refresh handlers
## Tests
- 44 adapter tests (8 new: vision conversion, role alternation, tool choice)
- Updated beta header tests to verify new structure
- Full suite: 3198 passed, 0 regressions
2026-03-12 16:00:46 -07:00
|
|
|
|
# Map OpenAI tool_choice to Anthropic format
|
|
|
|
|
|
if tool_choice == "auto" or tool_choice is None:
|
|
|
|
|
|
kwargs["tool_choice"] = {"type": "auto"}
|
|
|
|
|
|
elif tool_choice == "required":
|
|
|
|
|
|
kwargs["tool_choice"] = {"type": "any"}
|
|
|
|
|
|
elif tool_choice == "none":
|
2026-03-17 04:02:49 -07:00
|
|
|
|
# Anthropic has no tool_choice "none" — omit tools entirely to prevent use
|
|
|
|
|
|
kwargs.pop("tools", None)
|
fix(anthropic): address gaps found in deep-dive audit
After studying clawdbot (OpenClaw) and OpenCode implementations:
## Beta headers
- Add interleaved-thinking-2025-05-14 and fine-grained-tool-streaming-2025-05-14
as common betas (sent with ALL auth types, not just OAuth)
- OAuth tokens additionally get oauth-2025-04-20
- API keys now also get the common betas (previously got none)
## Vision/image support
- Add _convert_vision_content() to convert OpenAI multimodal format
(image_url blocks) to Anthropic format (image blocks with base64/url source)
- Handles both data: URIs (base64) and regular URLs
## Role alternation enforcement
- Anthropic strictly rejects consecutive same-role messages (400 error)
- Add post-processing step that merges consecutive user/assistant messages
- Handles string, list, and mixed content types during merge
## Tool choice support
- Add tool_choice parameter to build_anthropic_kwargs()
- Maps OpenAI values: auto→auto, required→any, none→omit, name→tool
## Cache metrics tracking
- Anthropic uses cache_read_input_tokens / cache_creation_input_tokens
(different from OpenRouter's prompt_tokens_details.cached_tokens)
- Add api_mode-aware branch in run_agent.py cache stats logging
## Credential refresh on 401
- On 401 error during anthropic_messages mode, re-read credentials
via resolve_anthropic_token() (picks up refreshed Claude Code tokens)
- Rebuild client if new token differs from current one
- Follows same pattern as Codex/Nous 401 refresh handlers
## Tests
- 44 adapter tests (8 new: vision conversion, role alternation, tool choice)
- Updated beta header tests to verify new structure
- Full suite: 3198 passed, 0 regressions
2026-03-12 16:00:46 -07:00
|
|
|
|
elif isinstance(tool_choice, str):
|
|
|
|
|
|
# Specific tool name
|
|
|
|
|
|
kwargs["tool_choice"] = {"type": "tool", "name": tool_choice}
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
|
2026-03-13 03:21:13 +01:00
|
|
|
|
# Map reasoning_config to Anthropic's thinking parameter.
|
fix(agent): complete Claude Opus 4.7 API migration
Claude Opus 4.7 introduced several breaking API changes that the current
codebase partially handled but not completely. This patch finishes the
migration per the official migration guide at
https://platform.claude.com/docs/en/about-claude/models/migration-guide
Fixes NousResearch/hermes-agent#11137
Breaking-change coverage:
1. Adaptive thinking + output_config.effort — 4.7 is now recognized by
_supports_adaptive_thinking() (extends previous 4.6-only gate).
2. Sampling parameter stripping — 4.7 returns 400 for any non-default
temperature / top_p / top_k. build_anthropic_kwargs drops them as a
safety net; the OpenAI-protocol auxiliary path (_build_call_kwargs)
and AnthropicCompletionsAdapter.create() both early-exit before
setting temperature for 4.7+ models. This keeps flush_memories and
structured-JSON aux paths that hardcode temperature from 400ing
when the aux model is flipped to 4.7.
3. thinking.display = "summarized" — 4.7 defaults display to "omitted",
which silently hides reasoning text from Hermes's CLI activity feed
during long tool runs. Restoring "summarized" preserves 4.6 UX.
4. Effort level mapping — xhigh now maps to xhigh (was xhigh→max, which
silently over-efforted every coding/agentic request). max is now a
distinct ceiling per Anthropic's 5-level effort model.
5. New stop_reason values — refusal and model_context_window_exceeded
were silently collapsed to "stop" (end_turn) by the adapter's
stop_reason_map. Now mapped to "content_filter" and "length"
respectively, matching upstream finish-reason handling already in
bedrock_adapter.
6. Model catalogs — claude-opus-4-7 added to the Anthropic provider
list, anthropic/claude-opus-4.7 added at top of OpenRouter fallback
catalog (recommended), claude-opus-4-7 added to model_metadata
DEFAULT_CONTEXT_LENGTHS (1M, matching 4.6 per migration guide).
7. Prefill docstrings — run_agent.AIAgent and BatchRunner now document
that Anthropic Sonnet/Opus 4.6+ reject a trailing assistant-role
prefill (400).
8. Tests — 4 new tests in test_anthropic_adapter covering display
default, xhigh preservation, max on 4.7, refusal / context-overflow
stop_reason mapping, plus the sampling-param predicate. test_model_metadata
accepts 4.7 at 1M context.
Tested on macOS 15.5 (darwin). 119 tests pass in
tests/agent/test_anthropic_adapter.py, 1320 pass in tests/agent/.
2026-04-16 12:35:43 -05:00
|
|
|
|
# Claude 4.6+ models use adaptive thinking + output_config.effort.
|
2026-03-13 03:21:13 +01:00
|
|
|
|
# Older models use manual thinking with budget_tokens.
|
fix: align MiniMax provider with official API docs
Aligns MiniMax provider with official API documentation. Fixes 6 bugs:
transport mismatch (openai_chat -> anthropic_messages), credential leak
in switch_model(), prompt caching sent to non-Anthropic endpoints,
dot-to-hyphen model name corruption, trajectory compressor URL routing,
and stale doctor health check.
Also corrects context window (204,800), thinking support (manual mode),
max output (131,072), and model catalog (M2 family only on /anthropic).
Source: https://platform.minimax.io/docs/api-reference/text-anthropic-api
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
2026-04-10 03:53:18 -07:00
|
|
|
|
# MiniMax Anthropic-compat endpoints support thinking (manual mode only,
|
|
|
|
|
|
# not adaptive). Haiku does NOT support extended thinking — skip entirely.
|
fix(agent): complete Claude Opus 4.7 API migration
Claude Opus 4.7 introduced several breaking API changes that the current
codebase partially handled but not completely. This patch finishes the
migration per the official migration guide at
https://platform.claude.com/docs/en/about-claude/models/migration-guide
Fixes NousResearch/hermes-agent#11137
Breaking-change coverage:
1. Adaptive thinking + output_config.effort — 4.7 is now recognized by
_supports_adaptive_thinking() (extends previous 4.6-only gate).
2. Sampling parameter stripping — 4.7 returns 400 for any non-default
temperature / top_p / top_k. build_anthropic_kwargs drops them as a
safety net; the OpenAI-protocol auxiliary path (_build_call_kwargs)
and AnthropicCompletionsAdapter.create() both early-exit before
setting temperature for 4.7+ models. This keeps flush_memories and
structured-JSON aux paths that hardcode temperature from 400ing
when the aux model is flipped to 4.7.
3. thinking.display = "summarized" — 4.7 defaults display to "omitted",
which silently hides reasoning text from Hermes's CLI activity feed
during long tool runs. Restoring "summarized" preserves 4.6 UX.
4. Effort level mapping — xhigh now maps to xhigh (was xhigh→max, which
silently over-efforted every coding/agentic request). max is now a
distinct ceiling per Anthropic's 5-level effort model.
5. New stop_reason values — refusal and model_context_window_exceeded
were silently collapsed to "stop" (end_turn) by the adapter's
stop_reason_map. Now mapped to "content_filter" and "length"
respectively, matching upstream finish-reason handling already in
bedrock_adapter.
6. Model catalogs — claude-opus-4-7 added to the Anthropic provider
list, anthropic/claude-opus-4.7 added at top of OpenRouter fallback
catalog (recommended), claude-opus-4-7 added to model_metadata
DEFAULT_CONTEXT_LENGTHS (1M, matching 4.6 per migration guide).
7. Prefill docstrings — run_agent.AIAgent and BatchRunner now document
that Anthropic Sonnet/Opus 4.6+ reject a trailing assistant-role
prefill (400).
8. Tests — 4 new tests in test_anthropic_adapter covering display
default, xhigh preservation, max on 4.7, refusal / context-overflow
stop_reason mapping, plus the sampling-param predicate. test_model_metadata
accepts 4.7 at 1M context.
Tested on macOS 15.5 (darwin). 119 tests pass in
tests/agent/test_anthropic_adapter.py, 1320 pass in tests/agent/.
2026-04-16 12:35:43 -05:00
|
|
|
|
#
|
|
|
|
|
|
# On 4.7+ the `thinking.display` field defaults to "omitted", which
|
|
|
|
|
|
# silently hides reasoning text that Hermes surfaces in its CLI. We
|
|
|
|
|
|
# request "summarized" so the reasoning blocks stay populated — matching
|
|
|
|
|
|
# 4.6 behavior and preserving the activity-feed UX during long tool runs.
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
if reasoning_config and isinstance(reasoning_config, dict):
|
fix: align MiniMax provider with official API docs
Aligns MiniMax provider with official API documentation. Fixes 6 bugs:
transport mismatch (openai_chat -> anthropic_messages), credential leak
in switch_model(), prompt caching sent to non-Anthropic endpoints,
dot-to-hyphen model name corruption, trajectory compressor URL routing,
and stale doctor health check.
Also corrects context window (204,800), thinking support (manual mode),
max output (131,072), and model catalog (M2 family only on /anthropic).
Source: https://platform.minimax.io/docs/api-reference/text-anthropic-api
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
2026-04-10 03:53:18 -07:00
|
|
|
|
if reasoning_config.get("enabled") is not False and "haiku" not in model.lower():
|
2026-03-13 03:21:13 +01:00
|
|
|
|
effort = str(reasoning_config.get("effort", "medium")).lower()
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
budget = THINKING_BUDGET.get(effort, 8000)
|
2026-03-13 03:21:13 +01:00
|
|
|
|
if _supports_adaptive_thinking(model):
|
fix(agent): complete Claude Opus 4.7 API migration
Claude Opus 4.7 introduced several breaking API changes that the current
codebase partially handled but not completely. This patch finishes the
migration per the official migration guide at
https://platform.claude.com/docs/en/about-claude/models/migration-guide
Fixes NousResearch/hermes-agent#11137
Breaking-change coverage:
1. Adaptive thinking + output_config.effort — 4.7 is now recognized by
_supports_adaptive_thinking() (extends previous 4.6-only gate).
2. Sampling parameter stripping — 4.7 returns 400 for any non-default
temperature / top_p / top_k. build_anthropic_kwargs drops them as a
safety net; the OpenAI-protocol auxiliary path (_build_call_kwargs)
and AnthropicCompletionsAdapter.create() both early-exit before
setting temperature for 4.7+ models. This keeps flush_memories and
structured-JSON aux paths that hardcode temperature from 400ing
when the aux model is flipped to 4.7.
3. thinking.display = "summarized" — 4.7 defaults display to "omitted",
which silently hides reasoning text from Hermes's CLI activity feed
during long tool runs. Restoring "summarized" preserves 4.6 UX.
4. Effort level mapping — xhigh now maps to xhigh (was xhigh→max, which
silently over-efforted every coding/agentic request). max is now a
distinct ceiling per Anthropic's 5-level effort model.
5. New stop_reason values — refusal and model_context_window_exceeded
were silently collapsed to "stop" (end_turn) by the adapter's
stop_reason_map. Now mapped to "content_filter" and "length"
respectively, matching upstream finish-reason handling already in
bedrock_adapter.
6. Model catalogs — claude-opus-4-7 added to the Anthropic provider
list, anthropic/claude-opus-4.7 added at top of OpenRouter fallback
catalog (recommended), claude-opus-4-7 added to model_metadata
DEFAULT_CONTEXT_LENGTHS (1M, matching 4.6 per migration guide).
7. Prefill docstrings — run_agent.AIAgent and BatchRunner now document
that Anthropic Sonnet/Opus 4.6+ reject a trailing assistant-role
prefill (400).
8. Tests — 4 new tests in test_anthropic_adapter covering display
default, xhigh preservation, max on 4.7, refusal / context-overflow
stop_reason mapping, plus the sampling-param predicate. test_model_metadata
accepts 4.7 at 1M context.
Tested on macOS 15.5 (darwin). 119 tests pass in
tests/agent/test_anthropic_adapter.py, 1320 pass in tests/agent/.
2026-04-16 12:35:43 -05:00
|
|
|
|
kwargs["thinking"] = {
|
|
|
|
|
|
"type": "adaptive",
|
|
|
|
|
|
"display": "summarized",
|
|
|
|
|
|
}
|
fix(agent): downgrade xhigh→max on Anthropic pre-4.7 adaptive models
Regression from #11161 (Claude Opus 4.7 migration, commit 0517ac3e).
The Opus 4.7 migration changed `ADAPTIVE_EFFORT_MAP["xhigh"]` from "max"
(the pre-migration alias) to "xhigh" to preserve the new 4.7 effort level
as distinct from max. This is correct for 4.7, but Opus/Sonnet 4.6 only
expose 4 levels (low/medium/high/max) — sending "xhigh" there now 400s:
BadRequestError [HTTP 400]: This model does not support effort
level 'xhigh'. Supported levels: high, low, max, medium.
Users who set reasoning_effort=xhigh as their default (xhigh is the
recommended default for coding/agentic on 4.7 per the Anthropic migration
guide) now 400 every request the moment they switch back to a 4.6 model
via `/model` or config. Verified live against the Anthropic API on
`anthropic==0.94.0`.
Fix: make the mapping model-aware. Add `_supports_xhigh_effort()`
predicate (matches 4-7/4.7 substrings, mirroring the existing
`_supports_adaptive_thinking` / `_forbids_sampling_params` pattern).
On pre-4.7 adaptive models, downgrade xhigh→max (the strongest effort
those models accept, restoring pre-migration behavior). On 4.7+, keep
xhigh as a distinct level.
Per Anthropic's migration guide, xhigh is 4.7-only:
https://platform.claude.com/docs/en/about-claude/models/migration-guide
> Opus 4.7 effort levels: max, xhigh (new), high, medium, low.
> Opus 4.6 effort levels: max, high, medium, low.
SDK typing confirms: `anthropic.types.OutputConfigParam.effort: Literal[
"low", "medium", "high", "max"]` (v0.94.0 not yet updated for xhigh).
## Test plan
Verified live on macOS 15.5 / anthropic==0.94.0:
claude-opus-4-6 + effort=xhigh → output_config.effort=max → 200 OK
claude-opus-4-7 + effort=xhigh → output_config.effort=xhigh → 200 OK
claude-opus-4-6 + effort=max → output_config.effort=max → 200 OK
claude-opus-4-7 + effort=max → output_config.effort=max → 200 OK
`tests/agent/test_anthropic_adapter.py` — 120 pass (replaced 1 bugged
test that asserted the broken behavior, added 1 for 4.7 preservation).
Full adapter suite: 120 passed in 1.05s.
Broader suite (agent + run_agent + cli/gateway reasoning): 2140 passed
(2 pre-existing failures on clean upstream/main, unrelated).
## Platforms
Tested on macOS 15.5. No platform-specific code paths touched.
2026-04-16 13:51:42 -05:00
|
|
|
|
adaptive_effort = ADAPTIVE_EFFORT_MAP.get(effort, "medium")
|
|
|
|
|
|
# Downgrade xhigh→max on models that don't list xhigh as a
|
|
|
|
|
|
# supported level (Opus/Sonnet 4.6). Opus 4.7+ keeps xhigh.
|
|
|
|
|
|
if adaptive_effort == "xhigh" and not _supports_xhigh_effort(model):
|
|
|
|
|
|
adaptive_effort = "max"
|
2026-03-13 03:21:13 +01:00
|
|
|
|
kwargs["output_config"] = {
|
fix(agent): downgrade xhigh→max on Anthropic pre-4.7 adaptive models
Regression from #11161 (Claude Opus 4.7 migration, commit 0517ac3e).
The Opus 4.7 migration changed `ADAPTIVE_EFFORT_MAP["xhigh"]` from "max"
(the pre-migration alias) to "xhigh" to preserve the new 4.7 effort level
as distinct from max. This is correct for 4.7, but Opus/Sonnet 4.6 only
expose 4 levels (low/medium/high/max) — sending "xhigh" there now 400s:
BadRequestError [HTTP 400]: This model does not support effort
level 'xhigh'. Supported levels: high, low, max, medium.
Users who set reasoning_effort=xhigh as their default (xhigh is the
recommended default for coding/agentic on 4.7 per the Anthropic migration
guide) now 400 every request the moment they switch back to a 4.6 model
via `/model` or config. Verified live against the Anthropic API on
`anthropic==0.94.0`.
Fix: make the mapping model-aware. Add `_supports_xhigh_effort()`
predicate (matches 4-7/4.7 substrings, mirroring the existing
`_supports_adaptive_thinking` / `_forbids_sampling_params` pattern).
On pre-4.7 adaptive models, downgrade xhigh→max (the strongest effort
those models accept, restoring pre-migration behavior). On 4.7+, keep
xhigh as a distinct level.
Per Anthropic's migration guide, xhigh is 4.7-only:
https://platform.claude.com/docs/en/about-claude/models/migration-guide
> Opus 4.7 effort levels: max, xhigh (new), high, medium, low.
> Opus 4.6 effort levels: max, high, medium, low.
SDK typing confirms: `anthropic.types.OutputConfigParam.effort: Literal[
"low", "medium", "high", "max"]` (v0.94.0 not yet updated for xhigh).
## Test plan
Verified live on macOS 15.5 / anthropic==0.94.0:
claude-opus-4-6 + effort=xhigh → output_config.effort=max → 200 OK
claude-opus-4-7 + effort=xhigh → output_config.effort=xhigh → 200 OK
claude-opus-4-6 + effort=max → output_config.effort=max → 200 OK
claude-opus-4-7 + effort=max → output_config.effort=max → 200 OK
`tests/agent/test_anthropic_adapter.py` — 120 pass (replaced 1 bugged
test that asserted the broken behavior, added 1 for 4.7 preservation).
Full adapter suite: 120 passed in 1.05s.
Broader suite (agent + run_agent + cli/gateway reasoning): 2140 passed
(2 pre-existing failures on clean upstream/main, unrelated).
## Platforms
Tested on macOS 15.5. No platform-specific code paths touched.
2026-04-16 13:51:42 -05:00
|
|
|
|
"effort": adaptive_effort,
|
2026-03-13 03:21:13 +01:00
|
|
|
|
}
|
2026-03-12 17:04:31 -07:00
|
|
|
|
else:
|
|
|
|
|
|
kwargs["thinking"] = {"type": "enabled", "budget_tokens": budget}
|
2026-03-12 17:23:09 -07:00
|
|
|
|
# Anthropic requires temperature=1 when thinking is enabled on older models
|
|
|
|
|
|
kwargs["temperature"] = 1
|
2026-03-13 03:21:13 +01:00
|
|
|
|
kwargs["max_tokens"] = max(effective_max_tokens, budget + 4096)
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
|
fix(agent): complete Claude Opus 4.7 API migration
Claude Opus 4.7 introduced several breaking API changes that the current
codebase partially handled but not completely. This patch finishes the
migration per the official migration guide at
https://platform.claude.com/docs/en/about-claude/models/migration-guide
Fixes NousResearch/hermes-agent#11137
Breaking-change coverage:
1. Adaptive thinking + output_config.effort — 4.7 is now recognized by
_supports_adaptive_thinking() (extends previous 4.6-only gate).
2. Sampling parameter stripping — 4.7 returns 400 for any non-default
temperature / top_p / top_k. build_anthropic_kwargs drops them as a
safety net; the OpenAI-protocol auxiliary path (_build_call_kwargs)
and AnthropicCompletionsAdapter.create() both early-exit before
setting temperature for 4.7+ models. This keeps flush_memories and
structured-JSON aux paths that hardcode temperature from 400ing
when the aux model is flipped to 4.7.
3. thinking.display = "summarized" — 4.7 defaults display to "omitted",
which silently hides reasoning text from Hermes's CLI activity feed
during long tool runs. Restoring "summarized" preserves 4.6 UX.
4. Effort level mapping — xhigh now maps to xhigh (was xhigh→max, which
silently over-efforted every coding/agentic request). max is now a
distinct ceiling per Anthropic's 5-level effort model.
5. New stop_reason values — refusal and model_context_window_exceeded
were silently collapsed to "stop" (end_turn) by the adapter's
stop_reason_map. Now mapped to "content_filter" and "length"
respectively, matching upstream finish-reason handling already in
bedrock_adapter.
6. Model catalogs — claude-opus-4-7 added to the Anthropic provider
list, anthropic/claude-opus-4.7 added at top of OpenRouter fallback
catalog (recommended), claude-opus-4-7 added to model_metadata
DEFAULT_CONTEXT_LENGTHS (1M, matching 4.6 per migration guide).
7. Prefill docstrings — run_agent.AIAgent and BatchRunner now document
that Anthropic Sonnet/Opus 4.6+ reject a trailing assistant-role
prefill (400).
8. Tests — 4 new tests in test_anthropic_adapter covering display
default, xhigh preservation, max on 4.7, refusal / context-overflow
stop_reason mapping, plus the sampling-param predicate. test_model_metadata
accepts 4.7 at 1M context.
Tested on macOS 15.5 (darwin). 119 tests pass in
tests/agent/test_anthropic_adapter.py, 1320 pass in tests/agent/.
2026-04-16 12:35:43 -05:00
|
|
|
|
# ── Strip sampling params on 4.7+ ─────────────────────────────────
|
|
|
|
|
|
# Opus 4.7 rejects any non-default temperature/top_p/top_k with a 400.
|
|
|
|
|
|
# Callers (auxiliary_client, flush_memories, etc.) may set these for
|
|
|
|
|
|
# older models; drop them here as a safety net so upstream 4.6 → 4.7
|
|
|
|
|
|
# migrations don't require coordinated edits everywhere.
|
|
|
|
|
|
if _forbids_sampling_params(model):
|
|
|
|
|
|
for _sampling_key in ("temperature", "top_p", "top_k"):
|
|
|
|
|
|
kwargs.pop(_sampling_key, None)
|
|
|
|
|
|
|
2026-04-10 02:32:15 -07:00
|
|
|
|
# ── Fast mode (Opus 4.6 only) ────────────────────────────────────
|
2026-04-13 13:37:05 -07:00
|
|
|
|
# Adds extra_body.speed="fast" + the fast-mode beta header for ~2.5x
|
|
|
|
|
|
# output speed. Only for native Anthropic endpoints — third-party
|
|
|
|
|
|
# providers would reject the unknown beta header and speed parameter.
|
2026-04-10 02:32:15 -07:00
|
|
|
|
if fast_mode and not _is_third_party_anthropic_endpoint(base_url):
|
2026-04-13 13:37:05 -07:00
|
|
|
|
kwargs.setdefault("extra_body", {})["speed"] = "fast"
|
2026-04-10 02:32:15 -07:00
|
|
|
|
# Build extra_headers with ALL applicable betas (the per-request
|
|
|
|
|
|
# extra_headers override the client-level anthropic-beta header).
|
|
|
|
|
|
betas = list(_common_betas_for_base_url(base_url))
|
|
|
|
|
|
if is_oauth:
|
|
|
|
|
|
betas.extend(_OAUTH_ONLY_BETAS)
|
|
|
|
|
|
betas.append(_FAST_MODE_BETA)
|
|
|
|
|
|
kwargs["extra_headers"] = {"anthropic-beta": ",".join(betas)}
|
|
|
|
|
|
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
return kwargs
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def normalize_anthropic_response(
|
|
|
|
|
|
response,
|
2026-03-16 17:08:22 -07:00
|
|
|
|
strip_tool_prefix: bool = False,
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
) -> Tuple[SimpleNamespace, str]:
|
|
|
|
|
|
"""Normalize Anthropic response to match the shape expected by AIAgent.
|
|
|
|
|
|
|
|
|
|
|
|
Returns (assistant_message, finish_reason) where assistant_message has
|
|
|
|
|
|
.content, .tool_calls, and .reasoning attributes.
|
2026-03-16 17:08:22 -07:00
|
|
|
|
|
|
|
|
|
|
When *strip_tool_prefix* is True, removes the ``mcp_`` prefix that was
|
|
|
|
|
|
added to tool names for OAuth Claude Code compatibility.
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
"""
|
|
|
|
|
|
text_parts = []
|
|
|
|
|
|
reasoning_parts = []
|
2026-04-02 10:14:20 -07:00
|
|
|
|
reasoning_details = []
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
tool_calls = []
|
|
|
|
|
|
|
|
|
|
|
|
for block in response.content:
|
|
|
|
|
|
if block.type == "text":
|
|
|
|
|
|
text_parts.append(block.text)
|
|
|
|
|
|
elif block.type == "thinking":
|
|
|
|
|
|
reasoning_parts.append(block.thinking)
|
2026-04-02 10:14:20 -07:00
|
|
|
|
block_dict = _to_plain_data(block)
|
|
|
|
|
|
if isinstance(block_dict, dict):
|
|
|
|
|
|
reasoning_details.append(block_dict)
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
elif block.type == "tool_use":
|
2026-03-16 17:08:22 -07:00
|
|
|
|
name = block.name
|
|
|
|
|
|
if strip_tool_prefix and name.startswith(_MCP_TOOL_PREFIX):
|
|
|
|
|
|
name = name[len(_MCP_TOOL_PREFIX):]
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
tool_calls.append(
|
|
|
|
|
|
SimpleNamespace(
|
|
|
|
|
|
id=block.id,
|
|
|
|
|
|
type="function",
|
|
|
|
|
|
function=SimpleNamespace(
|
2026-03-16 17:08:22 -07:00
|
|
|
|
name=name,
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
arguments=json.dumps(block.input),
|
|
|
|
|
|
),
|
|
|
|
|
|
)
|
|
|
|
|
|
)
|
|
|
|
|
|
|
fix(agent): complete Claude Opus 4.7 API migration
Claude Opus 4.7 introduced several breaking API changes that the current
codebase partially handled but not completely. This patch finishes the
migration per the official migration guide at
https://platform.claude.com/docs/en/about-claude/models/migration-guide
Fixes NousResearch/hermes-agent#11137
Breaking-change coverage:
1. Adaptive thinking + output_config.effort — 4.7 is now recognized by
_supports_adaptive_thinking() (extends previous 4.6-only gate).
2. Sampling parameter stripping — 4.7 returns 400 for any non-default
temperature / top_p / top_k. build_anthropic_kwargs drops them as a
safety net; the OpenAI-protocol auxiliary path (_build_call_kwargs)
and AnthropicCompletionsAdapter.create() both early-exit before
setting temperature for 4.7+ models. This keeps flush_memories and
structured-JSON aux paths that hardcode temperature from 400ing
when the aux model is flipped to 4.7.
3. thinking.display = "summarized" — 4.7 defaults display to "omitted",
which silently hides reasoning text from Hermes's CLI activity feed
during long tool runs. Restoring "summarized" preserves 4.6 UX.
4. Effort level mapping — xhigh now maps to xhigh (was xhigh→max, which
silently over-efforted every coding/agentic request). max is now a
distinct ceiling per Anthropic's 5-level effort model.
5. New stop_reason values — refusal and model_context_window_exceeded
were silently collapsed to "stop" (end_turn) by the adapter's
stop_reason_map. Now mapped to "content_filter" and "length"
respectively, matching upstream finish-reason handling already in
bedrock_adapter.
6. Model catalogs — claude-opus-4-7 added to the Anthropic provider
list, anthropic/claude-opus-4.7 added at top of OpenRouter fallback
catalog (recommended), claude-opus-4-7 added to model_metadata
DEFAULT_CONTEXT_LENGTHS (1M, matching 4.6 per migration guide).
7. Prefill docstrings — run_agent.AIAgent and BatchRunner now document
that Anthropic Sonnet/Opus 4.6+ reject a trailing assistant-role
prefill (400).
8. Tests — 4 new tests in test_anthropic_adapter covering display
default, xhigh preservation, max on 4.7, refusal / context-overflow
stop_reason mapping, plus the sampling-param predicate. test_model_metadata
accepts 4.7 at 1M context.
Tested on macOS 15.5 (darwin). 119 tests pass in
tests/agent/test_anthropic_adapter.py, 1320 pass in tests/agent/.
2026-04-16 12:35:43 -05:00
|
|
|
|
# Map Anthropic stop_reason to OpenAI finish_reason.
|
|
|
|
|
|
# Newer stop reasons added in Claude 4.5+ / 4.7:
|
|
|
|
|
|
# - refusal: the model declined to answer (cyber safeguards, CSAM, etc.)
|
|
|
|
|
|
# - model_context_window_exceeded: hit context limit (not max_tokens)
|
|
|
|
|
|
# Both need distinct handling upstream — a refusal should surface to the
|
|
|
|
|
|
# user with a clear message, and a context-window overflow should trigger
|
|
|
|
|
|
# compression/truncation rather than be treated as normal end-of-turn.
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
stop_reason_map = {
|
|
|
|
|
|
"end_turn": "stop",
|
|
|
|
|
|
"tool_use": "tool_calls",
|
|
|
|
|
|
"max_tokens": "length",
|
|
|
|
|
|
"stop_sequence": "stop",
|
fix(agent): complete Claude Opus 4.7 API migration
Claude Opus 4.7 introduced several breaking API changes that the current
codebase partially handled but not completely. This patch finishes the
migration per the official migration guide at
https://platform.claude.com/docs/en/about-claude/models/migration-guide
Fixes NousResearch/hermes-agent#11137
Breaking-change coverage:
1. Adaptive thinking + output_config.effort — 4.7 is now recognized by
_supports_adaptive_thinking() (extends previous 4.6-only gate).
2. Sampling parameter stripping — 4.7 returns 400 for any non-default
temperature / top_p / top_k. build_anthropic_kwargs drops them as a
safety net; the OpenAI-protocol auxiliary path (_build_call_kwargs)
and AnthropicCompletionsAdapter.create() both early-exit before
setting temperature for 4.7+ models. This keeps flush_memories and
structured-JSON aux paths that hardcode temperature from 400ing
when the aux model is flipped to 4.7.
3. thinking.display = "summarized" — 4.7 defaults display to "omitted",
which silently hides reasoning text from Hermes's CLI activity feed
during long tool runs. Restoring "summarized" preserves 4.6 UX.
4. Effort level mapping — xhigh now maps to xhigh (was xhigh→max, which
silently over-efforted every coding/agentic request). max is now a
distinct ceiling per Anthropic's 5-level effort model.
5. New stop_reason values — refusal and model_context_window_exceeded
were silently collapsed to "stop" (end_turn) by the adapter's
stop_reason_map. Now mapped to "content_filter" and "length"
respectively, matching upstream finish-reason handling already in
bedrock_adapter.
6. Model catalogs — claude-opus-4-7 added to the Anthropic provider
list, anthropic/claude-opus-4.7 added at top of OpenRouter fallback
catalog (recommended), claude-opus-4-7 added to model_metadata
DEFAULT_CONTEXT_LENGTHS (1M, matching 4.6 per migration guide).
7. Prefill docstrings — run_agent.AIAgent and BatchRunner now document
that Anthropic Sonnet/Opus 4.6+ reject a trailing assistant-role
prefill (400).
8. Tests — 4 new tests in test_anthropic_adapter covering display
default, xhigh preservation, max on 4.7, refusal / context-overflow
stop_reason mapping, plus the sampling-param predicate. test_model_metadata
accepts 4.7 at 1M context.
Tested on macOS 15.5 (darwin). 119 tests pass in
tests/agent/test_anthropic_adapter.py, 1320 pass in tests/agent/.
2026-04-16 12:35:43 -05:00
|
|
|
|
"refusal": "content_filter",
|
|
|
|
|
|
"model_context_window_exceeded": "length",
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
}
|
|
|
|
|
|
finish_reason = stop_reason_map.get(response.stop_reason, "stop")
|
|
|
|
|
|
|
|
|
|
|
|
return (
|
|
|
|
|
|
SimpleNamespace(
|
|
|
|
|
|
content="\n".join(text_parts) if text_parts else None,
|
|
|
|
|
|
tool_calls=tool_calls or None,
|
|
|
|
|
|
reasoning="\n\n".join(reasoning_parts) if reasoning_parts else None,
|
|
|
|
|
|
reasoning_content=None,
|
2026-04-02 10:14:20 -07:00
|
|
|
|
reasoning_details=reasoning_details or None,
|
feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).
## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
- Reads Claude Code's OAuth credentials
- Checks token expiry with 60s buffer
- Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
- Regular API keys use standard x-api-key header
## Changes by file
### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
format conversion, Claude Code credential reader, token resolver.
Handles system prompt extraction, tool_use/tool_result blocks,
thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic
### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
* Client init (Anthropic SDK instead of OpenAI)
* API call dispatch (_anthropic_client.messages.create)
* Response validation (content blocks)
* finish_reason mapping (stop_reason -> finish_reason)
* Token usage (input_tokens/output_tokens)
* Response normalization (normalize_anthropic_response)
* Client interrupt/rebuild
* Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
|
|
|
|
),
|
|
|
|
|
|
finish_reason,
|
2026-04-09 17:09:38 -07:00
|
|
|
|
)
|
feat: add transport types + migrate Anthropic normalize path
Add agent/transports/types.py with three shared dataclasses:
- NormalizedResponse: content, tool_calls, finish_reason, reasoning, usage, provider_data
- ToolCall: id, name, arguments, provider_data (per-tool-call protocol metadata)
- Usage: prompt_tokens, completion_tokens, total_tokens, cached_tokens
Add normalize_anthropic_response_v2() to anthropic_adapter.py — wraps the
existing v1 function and maps its output to NormalizedResponse. One call site
in run_agent.py (the main normalize branch) uses v2 with a back-compat shim
to SimpleNamespace for downstream code.
No ABC, no registry, no streaming, no client lifecycle. Those land in PR 3
with the first concrete transport (AnthropicTransport).
46 new tests:
- test_types.py: dataclass construction, build_tool_call, map_finish_reason
- test_anthropic_normalize_v2.py: v1-vs-v2 regression tests (text, tools,
thinking, mixed, stop reasons, mcp prefix stripping, edge cases)
Part of the provider transport refactor (PR 2 of 9).
2026-04-20 20:13:33 +05:30
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def normalize_anthropic_response_v2(
|
|
|
|
|
|
response,
|
|
|
|
|
|
strip_tool_prefix: bool = False,
|
|
|
|
|
|
) -> "NormalizedResponse":
|
|
|
|
|
|
"""Normalize Anthropic response to NormalizedResponse.
|
|
|
|
|
|
|
|
|
|
|
|
Wraps the existing normalize_anthropic_response() and maps its output
|
|
|
|
|
|
to the shared transport types. This allows incremental migration —
|
|
|
|
|
|
one call site at a time — without changing the original function.
|
|
|
|
|
|
"""
|
|
|
|
|
|
from agent.transports.types import NormalizedResponse, build_tool_call
|
|
|
|
|
|
|
|
|
|
|
|
assistant_msg, finish_reason = normalize_anthropic_response(response, strip_tool_prefix)
|
|
|
|
|
|
|
|
|
|
|
|
tool_calls = None
|
|
|
|
|
|
if assistant_msg.tool_calls:
|
|
|
|
|
|
tool_calls = [
|
|
|
|
|
|
build_tool_call(
|
|
|
|
|
|
id=tc.id,
|
|
|
|
|
|
name=tc.function.name,
|
|
|
|
|
|
arguments=tc.function.arguments,
|
|
|
|
|
|
)
|
|
|
|
|
|
for tc in assistant_msg.tool_calls
|
|
|
|
|
|
]
|
|
|
|
|
|
|
|
|
|
|
|
provider_data = {}
|
|
|
|
|
|
if getattr(assistant_msg, "reasoning_details", None):
|
|
|
|
|
|
provider_data["reasoning_details"] = assistant_msg.reasoning_details
|
|
|
|
|
|
|
|
|
|
|
|
return NormalizedResponse(
|
|
|
|
|
|
content=assistant_msg.content,
|
|
|
|
|
|
tool_calls=tool_calls,
|
|
|
|
|
|
finish_reason=finish_reason,
|
|
|
|
|
|
reasoning=getattr(assistant_msg, "reasoning", None),
|
|
|
|
|
|
usage=None, # Anthropic usage is on the raw response, not the normaliser
|
|
|
|
|
|
provider_data=provider_data or None,
|
|
|
|
|
|
)
|