mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-01 00:11:39 +08:00
feat: add Anthropic Context Editing API support
Integrate Anthropic's server-side context management (beta) for Claude models.
When enabled, the API automatically clears old tool use/result pairs and
thinking blocks AFTER prompt cache lookup but BEFORE token counting — this
preserves prompt cache prefixes while freeing context space, something
impossible with client-side stripping.
Implementation:
- anthropic_adapter: add context-management-2025-06-27 to beta headers;
build context_management edits in build_anthropic_kwargs() via extra_body;
only include clear_thinking edit when reasoning is enabled (API requires it)
- run_agent: pipe context_editing config through AIAgent to the adapter
- cli/gateway: load context_editing config from config.yaml and pass to agent
- config: add context_editing section to DEFAULT_CONFIG with conservative
defaults (disabled, auto-scale triggers to 60%/10% of context window,
keep 5 tool uses and 2 thinking turns, exclude memory/skill_manage/todo)
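The adapter-side construction described above can be sketched roughly as below. This is a minimal sketch, not the actual hermes-agent code: the function name `build_context_management` is illustrative, and the edit type strings (`clear_tool_uses_20250919`, `clear_thinking_20251015`) follow Anthropic's beta docs as I understand them and should be verified against the current API reference.

```python
# Sketch of building the context_management block that gets passed via
# extra_body. Names are illustrative; only the beta header string
# (context-management-2025-06-27) comes from the commit message above.

def build_context_management(cfg: dict, reasoning_enabled: bool) -> dict:
    """Assemble context_management edits from the context_editing config."""
    clear_tool_uses = {
        "type": "clear_tool_uses_20250919",  # assumed edit type string
        "trigger": {"type": "input_tokens", "value": cfg["trigger_tokens"]},
        "keep": {"type": "tool_uses", "value": cfg["keep_tool_uses"]},
        "exclude_tools": cfg["exclude_tools"],
        "clear_tool_inputs": cfg["clear_tool_inputs"],
    }
    if cfg.get("clear_at_least_tokens"):
        clear_tool_uses["clear_at_least"] = {
            "type": "input_tokens",
            "value": cfg["clear_at_least_tokens"],
        }
    edits = [clear_tool_uses]
    # The API rejects the clear_thinking edit unless extended thinking is
    # enabled, hence the conditional noted in the commit message.
    if reasoning_enabled:
        edits.append({
            "type": "clear_thinking_20251015",  # assumed edit type string
            "keep": {"type": "thinking_turns", "value": cfg["keep_thinking_turns"]},
        })
    return {"edits": edits}
```

The resulting dict would be forwarded untouched by the SDK through `extra_body`, alongside the `anthropic-beta: context-management-2025-06-27` header.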
Config (opt-in, add to config.yaml):
context_editing:
enabled: true
trigger_tokens: null # auto: 60% of context window
keep_tool_uses: 5
keep_thinking_turns: 2
exclude_tools: [memory, skill_manage, todo]
clear_tool_inputs: false
clear_at_least_tokens: null # auto: 10% of context window
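The auto-scaling of the two `null` thresholds amounts to the arithmetic below. A minimal sketch, assuming the helper name and `context_window` parameter (the real code resolves these inside the adapter; only the 60%/10% percentages come from the commit):

```python
# Resolve null thresholds against the model's context window:
# trigger at 60% of the window, clear at least 10% per activation.

def resolve_thresholds(trigger_tokens, clear_at_least_tokens, context_window: int):
    if trigger_tokens is None:
        trigger_tokens = int(context_window * 0.60)
    if clear_at_least_tokens is None:
        clear_at_least_tokens = int(context_window * 0.10)
    return trigger_tokens, clear_at_least_tokens
```

For a 200k-token context window this yields a 120,000-token trigger and a 20,000-token minimum clear.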
Live tested with Anthropic API:
- Single turn with context_management: accepted, response normal
- Multi-turn with tool calls + thinking + context_management: works
- clear_thinking correctly omitted when thinking is disabled
- Config plumbing verified through AIAgent._build_api_kwargs()
Refs: #526, supersedes #528
@@ -25,6 +25,29 @@ model:
   # api_key: "your-key-here"  # Uncomment to set here instead of .env
   base_url: "https://openrouter.ai/api/v1"
 
+# =============================================================================
+# Anthropic Context Editing (Claude-only, optional)
+# =============================================================================
+# Server-side context management for Claude models. Automatically clears old
+# tool call/result pairs and thinking blocks at the API level, AFTER prompt
+# cache lookup but BEFORE token counting. This preserves prompt cache prefixes
+# while freeing context space — something impossible with client-side stripping.
+#
+# Only works with direct Anthropic API (provider: anthropic). Disabled by default.
+# Anthropic reports ~29% performance improvement with context editing enabled.
+#
+# context_editing:
+#   enabled: true                # Enable server-side context editing
+#   trigger_tokens: null         # Input token threshold to start clearing (null = auto: 60% of context window)
+#   keep_tool_uses: 5            # How many recent tool_use/result pairs to preserve
+#   keep_thinking_turns: 2       # How many recent thinking turns to preserve
+#   exclude_tools:               # Tool calls that are NEVER cleared
+#     - memory
+#     - skill_manage
+#     - todo
+#   clear_tool_inputs: false     # Also clear tool input params (default: false)
+#   clear_at_least_tokens: null  # Minimum tokens to clear per activation (null = auto: 10% of context window)
+
 # =============================================================================
 # OpenRouter Provider Routing (only applies when using OpenRouter)
 # =============================================================================