Commit Graph

88 Commits

Author SHA1 Message Date
0668001438
83080772f2 fix(delegation): honor provider override for subagents
Clear inherited provider preference filters when delegation.provider is set so delegated children do not route back to the parent provider. Add a regression test for cross-provider delegation with parent OpenRouter filters.

Closes #10653
2026-05-04 05:22:35 -07:00
briandevans
0b5fd40a01 fix(delegate): correct _spawn_child → _build_child_agent in comments
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 05:18:45 -07:00
阿泥豆
0cc63043e0 fix(delegation): increase heartbeat stale thresholds
The heartbeat stale detection was too aggressive:
- idle: 5 * 30s = 150s — LLM inference on slow providers (Zhipu/GLM)
  frequently exceeds 150s, causing heartbeat to stop prematurely
- in-tool: 20 * 30s = 600s — borderline for long tool calls

When heartbeat stops, parent._last_activity_ts freezes, eventually
triggering gateway timeout and killing the entire delegation.

New thresholds:
- idle: 15 * 30s = 450s — accommodates slow LLM inference
- in-tool: 40 * 30s = 1200s — accommodates long-running tool calls

child_timeout_seconds (config: delegation.child_timeout_seconds) remains
the hard cap for total delegation duration.
2026-05-04 05:08:51 -07:00
ideathinklab01-source
d17eff29d5 fix(delegate): guard _load_config() against delegation: null in config.yaml
YAML parses `delegation: null` as Python None. `dict.get(key, {})`
only uses the default when the key is *missing*, not when it exists with
a None value, so `cfg.get("max_concurrent_children")` crashes with
`'NoneType' object has no attribute 'get'`.

Same pattern as fd9b692d (fix(tui): tolerate null top-level sections).
Use `dict.get(key) or {}` to handle both missing and None-valued keys.

Closes: delegation null config crash (same class as #7215, #7346)
2026-05-04 03:09:59 -07:00
Emilien Domenge
83bbe9b458 fix(delegation): pass target_model to resolve_runtime_provider in _resolve_delegation_credentials
When delegation.model differs from model.default and the provider is
opencode-go or opencode-zen, the wrong api_mode is computed because
resolve_runtime_provider falls back to model_cfg.get('default') — the
main model — instead of the configured delegation model.

For example, with model.default=minimax-m2.7 (anthropic_messages) and
delegation.model=glm-5.1 (chat_completions), subagents get
anthropic_messages, which strips /v1 from the base URL and causes a 404.

resolve_runtime_provider already accepts target_model for exactly this
purpose; _resolve_delegation_credentials just wasn't passing it.

Fixes #15319
Related: #13678
2026-05-04 02:30:48 -07:00
nftpoetrist
9faaa292b4 fix(delegate): inherit parent fallback_chain in _build_child_agent
_build_child_agent constructed child AIAgents without passing
fallback_model, leaving _fallback_chain=[] for every subagent.
When a subagent hit a rate-limit or credential exhaustion the
runtime fallback check (run_agent.py:7486 / 12267) found an empty
chain and failed immediately — even though the parent agent was
configured with fallback_providers and would have recovered.

The cron scheduler already propagates fallback_model correctly
(scheduler.py:1038). Fix closes the parity gap by reading the
parent's _fallback_chain (the normalised list form accepted by
AIAgent's fallback_model parameter) and threading it through.

Empty chains coerce to None so AIAgent initialises _fallback_chain=[]
as usual rather than iterating an empty list.
2026-05-04 01:48:56 -07:00
johnncenae
9ae1fa9e39 fix(delegate): honor runtime default model during provider resolution 2026-04-30 19:58:55 -07:00
Teknium
6085d7a93e chore: remove unused imports and dead locals (ruff F401, F841) (#17010)
Mechanical cleanup across 43 files — removes 46 unused imports
(F401) and 14 unused local variables (F841) detected by
`ruff check --select F401,F841`. Net: -49 lines.

Also fixes a latent NameError in rl_cli.py where `get_hermes_home()`
was called at module line 32 before its import at line 65 — the
module never imported successfully on main. The ruff audit surfaced
this because it correctly saw the symbol as imported-but-unused
(the call happened before the import ran); the fix moves the import
to the top of the file alongside other stdlib imports.

One `# noqa: F401` kept in hermes_cli/status.py for `subprocess`:
tests monkeypatch `hermes_cli.status.subprocess` as a regression
guard that systemctl isn't called on Termux, so the name must
exist at module scope even though the module body doesn't reference
it. Docstring explains the reason.

Also fixes an invalid `# noqa:` directive in
gateway/platforms/discord.py:308 that lacked a rule code.

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 06:46:45 -07:00
Teknium
69b8fa65d4 docs(delegate_task): clarify that it is synchronous and not durable (#17022)
delegate_task runs inside the parent turn and is cancelled when the parent is interrupted (new user message, /stop, /new). The child status payload (status=interrupted, exit_reason=interrupted) is already honest, but the tool schema and user-facing docs did not set the expectation, so users reasonably assumed delegated subagents would keep running in the background after interrupting the parent.

Updates:

- tools/delegate_tool.py DELEGATE_TASK_SCHEMA description adds a WHEN NOT TO USE bullet pointing at cronjob / terminal(background=True, notify_on_complete=True) for durable long-running work.

- website/docs/user-guide/features/delegation.md gains a Lifetime and Durability callout above Key Properties.

- website/docs/guides/delegation-patterns.md expands the Use something else list and the Constraints section with the same guidance.

Reported by LizLiz (@lizliz404) via Teknium.

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 06:45:15 -07:00
ygd58
6b6fc28e85 fix(delegate): clear acp_command when override_provider is set
When delegation.provider is configured (e.g. minimax-cn), subagents
inherited the parent's acp_command unconditionally. This caused
run_agent.py to initialize CopilotACPClient, which bypassed the
override credentials entirely and used its own default model
(provider=copilot-acp model=qwen3.5-397b-a17b) instead of the
configured delegation.provider and delegation.model.

Fix: when override_provider is set but override_acp_command is not,
clear effective_acp_command and effective_acp_args so the child agent
uses direct API calls with the configured provider credentials.

The existing override_acp_command path is unchanged — explicit ACP
transport overrides still force provider=copilot-acp as before.

Fixes #16816
2026-04-28 01:14:38 -07:00
Teknium
d7528d43ac fix(web): scope dashboard config Reset button to the current tab (#16813)
* Port from Kilo-Org/kilocode#9448: roll up subagent costs into parent session total

Child subagents built by delegate_task() each track their own
session_estimated_cost_usd, but the parent agent's total never folded
those numbers in.  On runs where the parent mostly delegates and the
children do the expensive work, the footer/UI was reporting a fraction
of the actual spend — sometimes $0.00 when the parent itself made no
billed calls.

Fix:
- Capture each child's session_estimated_cost_usd into _child_cost_usd
  on the result entry (before child.close() drops the counter).
- After the existing subagent_stop hook loop, sum the children's costs
  and add the total to parent.session_estimated_cost_usd.
- Promote session_cost_source from 'none' -> 'subagent' when the parent
  had no direct spend but children did, so the UI doesn't label the
  total as having unknown provenance.  Real sources (openrouter,
  anthropic, etc.) are preserved.

Nested orchestrator -> worker trees roll up naturally: each layer's own
delegate_task() folds its direct children in, and when the orchestrator
itself returns, its parent folds the orchestrator's now-inflated total
on top.

Internal fields (_child_cost_usd, _child_role) are stripped from the
results dict before it's serialised back to the model — same contract
as _child_role already followed.

Tests: TestSubagentCostRollup (5 cases) covers single-child, batch,
zero-cost-children, preserved-source, and legacy-fixture paths.

Source: https://github.com/Kilo-Org/kilocode/pull/9448

* fix(web): scope dashboard config Reset button to the current tab

Reported by @ykmfb001 via X: clicking 'Restore Defaults' (恢复默认值) on
the Auxiliary page wiped the entire config.yaml to defaults, not just
the auxiliary section. The button sits next to the category tabs and
users reasonably assumed 'reset this tab', not 'reset everything'.

Changes:
- handleReset now scopes to the fields in the current view:
  active category's fields (form mode) or search-matched fields
  (search mode). Only those keys are copied from defaults; the rest
  of the config is left alone.
- Added a window.confirm() with the scope name before applying.
- Button is hidden in YAML mode (scoping doesn't apply there).
- Tooltip/aria-label now name the scope, e.g. 'Reset Auxiliary to
  defaults'.
- i18n: new resetScopeTooltip / confirmResetScope / resetScopeToast
  strings in en + zh; resetDefaults key preserved for compat.
2026-04-27 21:09:14 -07:00
Teknium
517f30b043 improve(agent): guidance for plain-text URLs, subagent language/verification, hermes-config routing (#16325)
Four small tool-description / skill-content tweaks addressing recurring
model mistakes seen in @versun's docx feedback (Kimi 2.6, but the patterns
apply to every model):

1. browser_navigate description: call out .md/.txt/.json/.yaml/.csv/.xml,
   raw.githubusercontent.com, and API endpoints as specifically preferring
   curl or web_extract. The generic "prefer web_search or web_extract" was
   too weak; models kept firing up the browser for plain-text URLs.

2. delegate_task description: two additions.
   (a) Pass user language / output-style preferences in 'context' when they
   differ from English — otherwise subagents default to English and their
   summaries contaminate the final reply (caused the bilingual digest bug).
   (b) Subagent summaries are self-reports, not verified facts. For
   operations with external side-effects (HTTP uploads, remote writes,
   file creation at shared paths), require a verifiable handle (URL, ID,
   path) and verify it yourself before claiming success.

3. agent/prompt_builder.py Skills-mandatory block: new explicit line
   "Whenever the user asks to configure / set up / modify / install /
   enable / disable / troubleshoot Hermes Agent itself, load the
   `hermes-agent` skill first." The generic "load what's relevant" didn't
   route Hermes-meta questions (like "how do I turn off redaction?") to
   the one skill that has the answer.

4. skills/autonomous-ai-agents/hermes-agent/SKILL.md: new "Security &
   Privacy Toggles" section covering security.redact_secrets (with the
   import-time-snapshot restart-required caveat), privacy.redact_pii,
   approvals.mode (manual/smart/off) + --yolo + HERMES_YOLO_MODE, shell
   hooks allowlist, and how to disable network/media tools entirely.
   Every command verified against the actual config keys — no invented
   knobs.

Co-authored-by: teknium1 <teknium@noreply.github.com>
2026-04-26 20:57:19 -07:00
Teknium
023b1bff11 fix(delegate): resolve subagent approval prompts without deadlocking parent TUI (#15491)
Subagents run inside a ThreadPoolExecutor. The CLI's interactive approval
callback lives in tools/terminal_tool.py's threading.local(), which worker
threads do not inherit. When a subagent hits a dangerous-command guard,
prompt_dangerous_approval() falls back to input() from the worker thread,
deadlocking against the parent's prompt_toolkit TUI that owns stdin.

Fix: install a non-interactive callback into every subagent worker thread
via ThreadPoolExecutor(initializer=set_approval_callback, initargs=(cb,)).
The callback is config-gated by delegation.subagent_auto_approve:

  false (default) -> _subagent_auto_deny (safe; matches leaf tool blocklist)
  true            -> _subagent_auto_approve (opt-in YOLO for cron/batch)

Both emit a logger.warning audit line. Gateway sessions are unaffected
because they resolve approvals via tools/approval.py's per-session queue,
not through these TLS callbacks. Diagnosis credit: @MorAlekss (#14685).

- hermes_cli/config.py: DEFAULT_CONFIG.delegation.subagent_auto_approve: False
- cli-config.yaml.example: documented, commented (default)
- tools/delegate_tool.py: _subagent_auto_deny, _subagent_auto_approve,
  _get_subagent_approval_callback, wired into the child timeout executor
- tests/tools/test_delegate.py: 7 tests covering defaults, truthy coercion,
  and TLS scoping in the worker thread
2026-04-24 22:37:22 -07:00
MorAlekss
0ed37c0ca4 docs(delegate): document max_concurrent_children and max_spawn_depth + cost warning 2026-04-24 20:38:58 -07:00
Teknium
fcc05284fc fix(delegate): tool-activity-aware heartbeat stale detection (#13041) (#15183)
A child running a legitimately long-running tool (terminal command,
browser fetch, big file read) holds current_tool set and keeps
api_call_count frozen while the tool runs. The previous stale check
treated that as idle after 5 heartbeat cycles (~150s), stopped
touching the parent, and let the gateway kill the session.

Split the threshold in two:
- _HEARTBEAT_STALE_CYCLES_IDLE=5 (~150s)  — applied only when
  current_tool is None (child wedged between turns)
- _HEARTBEAT_STALE_CYCLES_IN_TOOL=20 (~600s) — applied when the child
  is inside a tool call

Stale counter also resets when current_tool changes (new tool =
progress). The hard child_timeout_seconds (default 600s) is still
the final cap, so genuinely stuck tools don't get to block forever.
2026-04-24 07:25:19 -07:00
Teknium
7634c1386f feat(delegate): diagnostic dump when a subagent times out with 0 API calls (#15105)
When a subagent in delegate_task times out before making its first LLM
request, write a structured diagnostic file under
~/.hermes/logs/subagent-timeout-<sid>-<ts>.log capturing enough state
for the user (and us) to debug the hang. The old error message —
'Subagent timed out after Ns with no response. The child may be stuck
on a slow API call or unresponsive network request.' — gave no
observability for the 0-API-call case, which is the hardest to reason
about remotely.

The diagnostic captures:
  - timeout config vs actual duration
  - goal (truncated to 1000 chars)
  - child config: model, provider, api_mode, base_url, max_iterations,
    quiet_mode, platform, _delegate_role, _delegate_depth
  - enabled_toolsets + loaded tool names
  - system prompt byte/char count (catches oversized prompts that
    providers silently choke on)
  - tool schema count + byte size
  - child's get_activity_summary() snapshot
  - Python stack of the worker thread at the moment of timeout
    (reveals whether the hang is in credential resolution, transport,
    prompt construction, etc.)

Wiring:
  - _run_single_child captures the worker thread via a small wrapper
    around child.run_conversation so we can look up its stack at
    timeout.
  - After a FuturesTimeoutError, we pull child.get_activity_summary()
    to read api_call_count. If 0 AND it was a timeout (not a raise),
    _dump_subagent_timeout_diagnostic() is invoked.
  - The returned path is surfaced in the error string so the parent
    agent (and therefore the user / gateway) sees exactly where to look.
  - api_calls > 0 timeouts keep the old 'stuck on slow API call'
    phrasing since that's the correct diagnosis for those.

This does NOT change any behavior for successful subagent runs,
non-timeout errors, or subagents that made at least one API call
before hanging.

Tests: 7 cases (tests/tools/test_delegate_subagent_timeout_diagnostic.py)
  - output format + required sections + field values
  - long-goal truncation with [truncated] marker
  - missing / already-exited worker thread branches
  - unwritable HERMES_HOME/logs/ returns None without raising
  - _run_single_child wiring: 0 API calls → dump + diagnostic_path in error
  - _run_single_child wiring: N>0 API calls → no dump, old message

Refs: #14726
2026-04-24 04:58:32 -07:00
Teknium
50d97edbe1 feat(delegation): bump default child_timeout_seconds to 600s (#14809)
The 300s default was too tight for high-reasoning models on non-trivial
delegated tasks — e.g. gpt-5.5 xhigh reviewing 12 files would burn >5min
on reasoning tokens before issuing its first tool call, tripping the
hard wall-clock timeout with 0 api_calls logged.

- tools/delegate_tool.py: DEFAULT_CHILD_TIMEOUT 300 -> 600
- hermes_cli/config.py: surface delegation.child_timeout_seconds in
  DEFAULT_CONFIG so it's discoverable (previously the key was read by
  _get_child_timeout() but absent from the default config schema)

Users can still override via config.yaml delegation.child_timeout_seconds
or DELEGATION_CHILD_TIMEOUT_SECONDS env var (floor 30s, no ceiling).
2026-04-23 16:14:55 -07:00
Teknium
64e6165686 fix(delegate): remove model-facing max_iterations override; config is authoritative (#14732)
Previously delegate_task exposed 'max_iterations' in its JSON schema and used
`max_iterations or default_max_iter` — so a model guessing conservatively (or
copy-pasting a docstring hint like 'Only set lower for simple tasks') could
silently shrink a subagent's budget below the user's configured
delegation.max_iterations. One such call this session capped a deep forensic
audit at 40 iterations while the user's config was set to 250.

Changes:
- Drop 'max_iterations' from DELEGATE_TASK_SCHEMA['parameters']['properties'].
  Models can no longer emit it.
- In delegate_task(): ignore any caller-supplied max_iterations, always use
  delegation.max_iterations from config. Log at debug if a stale schema or
  internal caller still passes one through.
- Keep the Python kwarg on the function signature for internal callers
  (_build_child_agent tests pass it through the plumbing layer).
- Update test_schema_valid to assert the param is now absent (intentional
  contract change, not a change-detector).
2026-04-23 13:56:26 -07:00
TaroballzChen
5d09474348 fix(tools): enforce ACP transport overrides in delegate_task child agents
When override_acp_command was passed to _build_child_agent, it failed to
override effective_provider to 'copilot-acp' and effective_api_mode to
'chat_completions'. This caused the child AIAgent to inherit the parent's
native API configuration (e.g. Anthropic) and attempt real HTTP requests
using the parent's API key, leading to HTTP 401 errors and completely
bypassing the ACP subprocess.

Ensure that if an ACP command override is provided, the child agent
correctly routes through CopilotACPClient.

Refs #2653
2026-04-23 02:37:15 -07:00
Teknium
7d8b2eee63 fix(delegate): default inherit_mcp_toolsets=true, drop version bump
Follow-up on helix4u's PR #14211:
- Flip default to true: narrowing toolsets=['web','browser'] expresses
  'I want these extras', not 'silently strip MCP'. Parent MCP tools
  (registered at runtime) should survive narrowing by default.
- Drop _config_version bump (22->23); additive nested key under
  delegation.* is handled by _deep_merge, no migration needed.
- Update tests to reflect new default behavior.
2026-04-22 17:45:48 -07:00
helix4u
3e96c87f37 fix(delegate): make MCP toolset inheritance configurable 2026-04-22 17:45:48 -07:00
Brooklyn Nicholson
7eae504d15 fix(tui): address Copilot round-2 on #14045
- delegate_task: use shared tool_error() for the paused-spawn early return
  so the error envelope matches the rest of the tool.
- Disk snapshot label: treat orphaned nodes (parentId missing from the
  snapshot) as top-level, matching buildSubagentTree / summarizeLabel.
2026-04-22 11:54:19 -05:00
Brooklyn Nicholson
dee51c1607 fix(tui): address Copilot review on #14045
Four real issues Copilot flagged:

1. delegate_tool: `_build_child_agent` never passed `toolsets` to the
   progress callback, so the event payload's `toolsets` field (wired
   through every layer) was always empty and the overlay's toolsets
   row never populated.  Thread `child_toolsets` through.

2. event handler: the race-protection on subagent.spawn_requested /
   subagent.start only preserved `completed`, so a late-arriving queued
   event could clobber `failed` / `interrupted` too.  Preserve any
   terminal status (`completed | failed | interrupted`).

3. SpawnHud: comment claimed concurrency was approximated by "widest
   level in the tree" but code used `totals.activeCount` (total across
   all parents).  `max_concurrent_children` is a per-parent cap, so
   activeCount over-warns for multi-orchestrator runs.  Switch to
   `max(widthByDepth(tree))`; the label now reads `W/cap+extra` where
   W is the widest level (drives the ratio) and `+extra` is the rest.

4. spawn_tree.list: comment said "peek header without parsing full list"
   but the code json.loads()'d every snapshot.  Adds a per-session
   `_index.jsonl` sidecar written on save; list() reads only the index
   (with a full-scan fallback for pre-index sessions).  O(1) per
   snapshot now vs O(file-size).
2026-04-22 10:56:32 -05:00
Brooklyn Nicholson
7785654ad5 feat(tui): subagent spawn observability overlay
Adds a live + post-hoc audit surface for recursive delegate_task fan-out.
None of cc/oc/oclaw tackle nested subagent trees inside an Ink overlay;
this ships a view-switched dashboard that handles arbitrary depth + width.

Python
- delegate_tool: every subagent event now carries subagent_id, parent_id,
  depth, model, tool_count; subagent.complete also ships input/output/
  reasoning tokens, cost, api_calls, files_read/files_written, and a
  tail of tool-call outputs
- delegate_tool: new subagent.spawn_requested event + _active_subagents
  registry so the overlay can kill a branch by id and pause new spawns
- tui_gateway: new RPCs delegation.status, delegation.pause,
  subagent.interrupt, spawn_tree.save/list/load (disk under
  \$HERMES_HOME/spawn-trees/<session>/<ts>.json)

TUI
- /agents overlay: full-width list mode (gantt strip + row picker) and
  Enter-to-drill full-width scrollable detail mode; inverse+amber
  selection, heat-coloured branch markers, wall-clock gantt with tick
  ruler, per-branch rollups
- Detail pane: collapsible accordions (Budget, Files, Tool calls, Output,
  Progress, Summary); open-state persists across agents + mode switches
  via a shared atom
- /replay [N|last|list|load <path>] for in-memory + disk history;
  /replay-diff <a> <b> for side-by-side tree comparison
- Status-bar SpawnHud warns as depth/concurrency approaches caps;
  overlay auto-follows the just-finished turn onto history[1]
- Theme: bump DARK dim #B8860B → #CC9B1F for readable secondary text
  globally; keep LIGHT untouched

Tests: +29 new subagentTree unit tests; 215/215 passing.
2026-04-22 10:38:17 -05:00
Kongxi
dd8ab40556 fix(delegation): add hard timeout and stale detection for subagent execution (#13770)
- Wrap child.run_conversation() in a ThreadPoolExecutor with configurable
  timeout (delegation.child_timeout_seconds, default 300s) to prevent
  indefinite blocking when a subagent's API call or tool HTTP request hangs.

- Add heartbeat stale detection: if a child's api_call_count doesn't
  advance for 5 consecutive heartbeat cycles (~2.5 min), stop touching
  the parent's activity timestamp so the gateway inactivity timeout
  can fire as a last resort.

- Add 'timeout' as a new exit_reason/status alongside the existing
  completed/max_iterations/interrupted states.

- Use shutdown(wait=False) on the timeout executor to avoid the
  ThreadPoolExecutor.__exit__ deadlock when a child is stuck on
  blocking I/O.

Closes #13768
2026-04-21 20:20:16 -07:00
王强
46d680125e fix(kimi-coding): set anthropic_messages api_mode for /coding endpoint 2026-04-21 19:48:39 -07:00
Teknium
9c9d9b7ddf feat(delegate): cross-agent file state coordination for concurrent subagents (#13718)
* feat(models): hide OpenRouter models that don't advertise tool support

Port from Kilo-Org/kilocode#9068.

hermes-agent is tool-calling-first — every provider path assumes the
model can invoke tools. Models whose OpenRouter supported_parameters
doesn't include 'tools' (e.g. image-only or completion-only models)
cannot be driven by the agent loop and fail at the first tool call.

Filter them out of fetch_openrouter_models() so they never appear in
the model picker (`hermes model`, setup wizard, /model slash command).

Permissive when the field is missing — OpenRouter-compatible gateways
(Nous Portal, private mirrors, older snapshots) don't always populate
supported_parameters. Treat missing as 'unknown → allow' rather than
silently emptying the picker on those gateways. Only hide models
whose supported_parameters is an explicit list that omits tools.

Tests cover: tools present → kept, tools absent → dropped, field
missing → kept, malformed non-list → kept, non-dict item → kept,
empty list → dropped.

* feat(delegate): cross-agent file state coordination for concurrent subagents

Prevents mangled edits when concurrent subagents touch the same file
(same process, same filesystem — the mangle scenario from #11215).

Three layers, all opt-out via HERMES_DISABLE_FILE_STATE_GUARD=1:

1. FileStateRegistry (tools/file_state.py) — process-wide singleton
   tracking per-agent read stamps and the last writer globally.
   check_stale() names the sibling subagent in the warning when a
   non-owning agent wrote after this agent's last read.

2. Per-path threading.Lock wrapped around the read-modify-write
   region in write_file_tool and patch_tool. Concurrent siblings on
   the same path serialize; different paths stay fully parallel.
   V4A multi-file patches lock in sorted path order (deadlock-free).

3. Delegate-completion reminder in tools/delegate_tool.py: after a
   subagent returns, writes_since(parent, child_start, parent_reads)
   appends '[NOTE: subagent modified files the parent previously
   read — re-read before editing: ...]' to entry.summary when the
   child touched anything the parent had already seen.

Complements (does not replace) the existing path-overlap check in
run_agent._should_parallelize_tool_batch — batch check prevents
same-file parallel dispatch within one agent's turn (cheap prevention,
zero API cost), registry catches cross-subagent and cross-turn
staleness at write time (detection).

Behavior is warning-only, not hard-failing — matches existing project
style. Errors surface naturally: sibling writes often invalidate the
old_string in patch operations, which already errors cleanly.

Tests: tests/tools/test_file_state_registry.py — 16 tests covering
registry state transitions, per-path locking, per-path-not-global
locking, writes_since filtering, kill switch, and end-to-end
integration through the real read_file/write_file/patch handlers.
2026-04-21 16:41:26 -07:00
pefontana
48ecb98f8a feat(delegate): orchestrator role and configurable spawn depth (default flat)
Adds role='leaf'|'orchestrator' to delegate_task. With max_spawn_depth>=2,
an orchestrator child retains the 'delegation' toolset and can spawn its
own workers; leaf children cannot delegate further (identical to today).

Default posture is flat — max_spawn_depth=1 means a depth-0 parent's
children land at the depth-1 floor and orchestrator role silently
degrades to leaf. Users opt into nested delegation by raising
max_spawn_depth to 2 or 3 in config.yaml.

Also threads acp_command/acp_args through the main agent loop's delegate
dispatch (previously silently dropped in the schema) via a new
_dispatch_delegate_task helper, and adds a DelegateEvent enum with
legacy-string back-compat for gateway/ACP/CLI progress consumers.

Config (hermes_cli/config.py defaults):
  delegation.max_concurrent_children: 3   # floor-only, no upper cap
  delegation.max_spawn_depth: 1           # 1=flat (default), 2-3 unlock nested
  delegation.orchestrator_enabled: true   # global kill switch

Salvaged from @pefontana's PR #11215. Overrides vs. the original PR:
concurrency stays at 3 (PR bumped to 5 + cap 8 — we keep the floor only,
no hard ceiling); max_spawn_depth defaults to 1 (PR defaulted to 2 which
silently enabled one level of orchestration for every user).

Co-authored-by: pefontana <fontana.pedro93@gmail.com>
2026-04-21 14:23:45 -07:00
Teknium
dbb7e00e7e fix: sweep remaining provider-URL substring checks across codebase
Completes the hostname-hardening sweep — every substring check against a
provider host in live-routing code is now hostname-based. This closes the
same false-positive class for OpenRouter, GitHub Copilot, Kimi, Qwen,
ChatGPT/Codex, Bedrock, GitHub Models, Vercel AI Gateway, Nous, Z.AI,
Moonshot, Arcee, and MiniMax that the original PR closed for OpenAI, xAI,
and Anthropic.

New helper:
- utils.base_url_host_matches(base_url, domain) — safe counterpart to
  'domain in base_url'. Accepts hostname equality and subdomain matches;
  rejects path segments, host suffixes, and prefix collisions.

Call sites converted (real-code only; tests, optional-skills, red-teaming
scripts untouched):

run_agent.py (10 sites):
- AIAgent.__init__ Bedrock branch, ChatGPT/Codex branch (also path check)
- header cascade for openrouter / copilot / kimi / qwen / chatgpt
- interleaved-thinking trigger (openrouter + claude)
- _is_openrouter_url(), _is_qwen_portal()
- is_native_anthropic check
- github-models-vs-copilot detection (3 sites)
- reasoning-capable route gate (nousresearch, vercel, github)
- codex-backend detection in API kwargs build
- fallback api_mode Bedrock detection

agent/auxiliary_client.py (7 sites):
- extra-headers cascades in 4 distinct client-construction paths
  (resolve custom, resolve auto, OpenRouter-fallback-to-custom,
  _async_client_from_sync, resolve_provider_client explicit-custom,
  resolve_auto_with_codex)
- _is_openrouter_client() base_url sniff

agent/usage_pricing.py:
- resolve_billing_route openrouter branch

agent/model_metadata.py:
- _is_openrouter_base_url(), Bedrock context-length lookup

hermes_cli/providers.py:
- determine_api_mode Bedrock heuristic

hermes_cli/runtime_provider.py:
- _is_openrouter_url flag for API-key preference (issues #420, #560)

hermes_cli/doctor.py:
- Kimi User-Agent header for /models probes

tools/delegate_tool.py:
- subagent Codex endpoint detection

trajectory_compressor.py:
- _detect_provider() cascade (8 providers: openrouter, nous, codex, zai,
  kimi-coding, arcee, minimax-cn, minimax)

cli.py, gateway/run.py:
- /model-switch cache-enabled hint (openrouter + claude)

Bedrock detection tightened from 'bedrock-runtime in url' to
'hostname starts with bedrock-runtime. AND host is under amazonaws.com'.
ChatGPT/Codex detection tightened from 'chatgpt.com/backend-api/codex in
url' to 'hostname is chatgpt.com AND path contains /backend-api/codex'.

Tests:
- tests/test_base_url_hostname.py extended with a base_url_host_matches
  suite (exact match, subdomain, path-segment rejection, host-suffix
  rejection, host-prefix rejection, empty-input, case-insensitivity,
  trailing dot).

Validation: 651 targeted tests pass (runtime_provider, minimax, bedrock,
gemini, auxiliary, codex_cloudflare, usage_pricing, compressor_fallback,
fallback_model, openai_client_lifecycle, provider_parity, cli_provider_resolution,
delegate, credential_pool, context_compressor, plus the 4 hostname test
modules). 26-assertion E2E call-site verification across 6 modules passes.
2026-04-20 22:14:29 -07:00
Teknium
cecf84daf7 fix: extend hostname-match provider detection across remaining call sites
Aslaaen's fix in the original PR covered _detect_api_mode_for_url and the
two openai/xai sites in run_agent.py. This finishes the sweep: the same
substring-match false-positive class (e.g. https://api.openai.com.evil/v1,
https://proxy/api.openai.com/v1, https://api.anthropic.com.example/v1)
existed in eight more call sites, and the hostname helper was duplicated
in two modules.

- utils: add shared base_url_hostname() (single source of truth).
- hermes_cli/runtime_provider, run_agent: drop local duplicates, import
  from utils. Reuse the cached AIAgent._base_url_hostname attribute
  everywhere it's already populated.
- agent/auxiliary_client: switch codex-wrap auto-detect, max_completion_tokens
  gate (auxiliary_max_tokens_param), and custom-endpoint max_tokens kwarg
  selection to hostname equality.
- run_agent: native-anthropic check in the Claude-style model branch
  and in the AIAgent init provider-auto-detect branch.
- agent/model_metadata: Anthropic /v1/models context-length lookup.
- hermes_cli/providers.determine_api_mode: anthropic / openai URL
  heuristics for custom/unknown providers (the /anthropic path-suffix
  convention for third-party gateways is preserved).
- tools/delegate_tool: anthropic detection for delegated subagent
  runtimes.
- hermes_cli/setup, hermes_cli/tools_config: setup-wizard vision-endpoint
  native-OpenAI detection (paired with deduping the repeated check into
  a single is_native_openai boolean per branch).

Tests:
- tests/test_base_url_hostname.py covers the helper directly
  (path-containing-host, host-suffix, trailing dot, port, case).
- tests/hermes_cli/test_determine_api_mode_hostname.py adds the same
  regression class for determine_api_mode, plus a test that the
  /anthropic third-party gateway convention still wins.

Also: add asslaenn5@gmail.com → Aslaaen to scripts/release.py AUTHOR_MAP.
2026-04-20 22:14:29 -07:00
Peter Fontana
3988c3c245 feat: shell hooks — wire shell scripts as Hermes hook callbacks
Users can declare shell scripts in config.yaml under a hooks: block that
fire on plugin-hook events (pre_tool_call, post_tool_call, pre_llm_call,
subagent_stop, etc). Scripts receive JSON on stdin, can return JSON on
stdout to block tool calls or inject context pre-LLM.

Key design:
- Registers closures on existing PluginManager._hooks dict — zero changes
  to invoke_hook() call sites
- subprocess.run(shell=False) via shlex.split — no shell injection
- First-use consent per (event, command) pair, persisted to allowlist JSON
- Bypass via --accept-hooks, HERMES_ACCEPT_HOOKS=1, or hooks_auto_accept
- hermes hooks list/test/revoke/doctor CLI subcommands
- Adds subagent_stop hook event fired after delegate_task children exit
- Claude Code compatible response shapes accepted

Cherry-picked from PR #13143 by @pefontana.
2026-04-20 20:53:51 -07:00
Junass1
735996d2ad fix(tools/delegate): propagate resolved ACP runtime settings to child agents 2026-04-20 20:47:01 -07:00
Brooklyn Nicholson
9c71f3a6ea Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 10:47:41 -05:00
Teknium
45fc0bd83a fix: UnboundLocalError on 'entry' in parallel subagent polling loop (#11050)
The completion-line printing block (idx = entry['task_index'] etc.)
was outside the 'for future in done:' loop but referenced 'entry'
which is only assigned inside that loop. When concurrent.futures.wait()
returns with an empty 'done' set (timeout expired, no futures finished),
the loop body never executes and 'entry' is unbound.

Moved the completion-line printing and spinner-update code inside
the for loop so each completed future gets its own status line,
and empty poll cycles simply loop back without accessing 'entry'.
2026-04-16 06:53:44 -07:00
Brooklyn Nicholson
f81dba0da2 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 08:23:20 -05:00
Teknium
e66b373351 fix: word-wrap spinner, interruptable agent join, and delegate_task interrupt (#10940)
* fix: stop /model from silently rerouting direct providers to OpenRouter (#10300)

detect_provider_for_model() silently remapped models to OpenRouter when
the direct provider's credentials weren't found via env vars. Three bugs:

1. Credential check only looked at env vars from PROVIDER_REGISTRY,
   missing credential pool entries, auth store, and OAuth tokens
2. When env var check failed, silently returned ('openrouter', slug)
   instead of the direct provider the model actually belongs to
3. Users with valid credentials via non-env-var mechanisms (pool,
   OAuth, Claude Code tokens) got silently rerouted

Fix:
- Expand credential check to also query credential pool and auth store
- Always return the direct provider match regardless of credential
  status -- let client init handle missing creds with a clear error
  rather than silently routing through the wrong provider

Same philosophy as the provider-required fix: don't guess, don't
silently reroute, error clearly when something is missing.

Closes #10300

* fix: word-wrap spinner, interruptable agent join, and delegate_task interrupt

Three fixes:

1. Spinner widget clips long tool commands — prompt_toolkit Window had
   height=1 and wrap_lines=False. Now uses wrap_lines=True with dynamic
   height from text length / terminal width. Long commands wrap naturally.

2. agent_thread.join() blocked forever after interrupt — if the agent
   thread took time to clean up, the process_loop thread froze. Now polls
   with 0.2s timeout on the interrupt path, checking _should_exit so
   double Ctrl+C breaks out immediately.

3. Root cause of 5-hour CLI hang: delegate_task() used as_completed()
   with no interrupt check. When subagent children got stuck, the parent
   blocked forever inside the ThreadPoolExecutor. Now polls with
   wait(timeout=0.5) and checks parent_agent._interrupt_requested each
   iteration. Stuck children are reported as interrupted, and the parent
   returns immediately.
2026-04-16 03:50:49 -07:00
Brooklyn Nicholson
4b4b4d47bc feat: just more cleaning 2026-04-15 14:14:01 -05:00
Teknium
c52f6348b6 fix: list all available toolsets in delegate_task schema description (#8231)
* fix: list all available toolsets in delegate_task schema description

The delegate_task tool's toolsets parameter description only mentioned
'terminal', 'file', and 'web' as examples. Models (especially smaller
ones like Gemma) would substitute 'web' for 'browser' because they
didn't know 'browser' was a valid option.

Now dynamically builds the toolset list from the TOOLSETS dict at import
time, excluding blocked, composite, and platform-specific toolsets.
Auto-updates when new toolsets are added.

Reported by jeffutter on Discord.

* chore: exclude moa and rl from delegate_task toolset list
2026-04-12 00:54:35 -07:00
hermes-agent-dhabibi
718e8ad6fa feat(delegation): add configurable reasoning_effort for subagents
Add delegation.reasoning_effort config key so subagents can run at a
different thinking level than the parent agent. When set, overrides
the parent's reasoning_config; when empty, inherits as before.

Valid values: xhigh, high, medium, low, minimal, none (disables thinking).

Config path: delegation.reasoning_effort in config.yaml

Files changed:
- tools/delegate_tool.py: resolve override in _build_child_agent
- hermes_cli/config.py: add reasoning_effort to DEFAULT_CONFIG
- tests/tools/test_delegate.py: 4 new tests covering all cases
2026-04-10 21:16:53 -07:00
pefontana
672cc80915 fix(delegate): close child agent after delegation completes
Call child.close() in the _run_single_child finally block after
unregistering the child from the parent's active children list.

Previously child AIAgent instances were only removed from the tracking
list but never had their resources released — the OpenAI/httpx client
and any tool subprocesses relied entirely on garbage collection.

Ref: #7131
2026-04-10 16:51:44 -07:00
angelos
7ccdb74364 fix(delegate): make max_concurrent_children configurable + error on excess
`delegate_task` silently truncated batch tasks to 3 — the model sends
5 tasks, gets results for 3, never told 2 were dropped. Now returns a
clear tool_error explaining the limit and how to fix it.

The limit is configurable via:
  - delegation.max_concurrent_children in config.yaml (priority 1)
  - DELEGATION_MAX_CONCURRENT_CHILDREN env var (priority 2)
  - default: 3

Uses the same _load_config() path as the rest of delegate_task for
consistent config priority. Clamps to min 1, warns on non-integer
config values.

Also removes the hardcoded maxItems: 3 from the JSON schema — the
schema was blocking the model from even attempting >3 tasks before
the runtime check could fire. The runtime check gives a much more
actionable error message.

Backwards compatible: default remains 3, existing configs unchanged.
2026-04-10 13:38:14 -07:00
Teknium
a093eb47f7 fix: propagate child activity to parent during delegate_task (#7295)
When delegate_task runs, the parent agent's activity tracker freezes
because child.run_conversation() blocks and the child's own
_touch_activity() never propagates back to the parent. The gateway
inactivity timeout then fires a spurious 'No activity' warning and
eventually kills the agent, even though the subagent is actively working.

Fix: add a heartbeat thread in _run_single_child that calls
parent._touch_activity() every 30 seconds with detail from the child's
activity summary (current tool, iteration count). The thread is a daemon
that starts before child.run_conversation() and is cleaned up in the
finally block.

This also improves the gateway 'Still working...' status messages —
instead of just 'running: delegate_task', users now see what the
subagent is actually doing (e.g., 'delegate_task: subagent running
terminal (iteration 5/50)').
2026-04-10 12:51:30 -07:00
Teknium
678a87c477 refactor: add tool_error/tool_result helpers + read_raw_config, migrate 129 callsites
Add three reusable helpers to eliminate pervasive boilerplate:

tools/registry.py — tool_error() and tool_result():
  Every tool handler returns JSON strings. The pattern
  json.dumps({"error": msg}, ensure_ascii=False) appeared 106 times,
  and json.dumps({"success": False, "error": msg}, ...) another 23.
  Now: tool_error(msg) or tool_error(msg, success=False).

  tool_result() handles arbitrary result dicts:
  tool_result(success=True, data=payload) or tool_result(some_dict).

hermes_cli/config.py — read_raw_config():
  Lightweight YAML reader that returns the raw config dict without
  load_config()'s deep-merge + migration overhead. Available for
  callsites that just need a single config value.

Migration (129 callsites across 32 files):
- tools/: browser_camofox (18), file_tools (10), homeassistant (8),
  web_tools (7), skill_manager (7), cronjob (11), code_execution (4),
  delegate (5), send_message (4), tts (4), memory (7), session_search (3),
  mcp (2), clarify (2), skills_tool (3), todo (1), vision (1),
  browser (1), process_registry (2), image_gen (1)
- plugins/memory/: honcho (9), supermemory (9), hindsight (8),
  holographic (7), openviking (7), mem0 (7), byterover (6), retaindb (2)
- agent/: memory_manager (2), builtin_memory_provider (1)
2026-04-07 13:36:38 -07:00
Mateus Scheuer Macedo
c706568993 fix(delegate): pass workspace path hints to child agents
Selectively cherry-picked from PR #5501 by MestreY0d4-Uninter.

- Add _resolve_workspace_hint() to detect parent's working directory
- Inject WORKSPACE PATH into child system prompts
- Add rule: never assume /workspace/ container paths
- Excludes the cli.py queue-busy-input changes from the original PR
2026-04-06 23:01:11 -07:00
Mateus Scheuer Macedo
f2c11ff30c fix(delegate): share credential pools with subagents + per-task leasing
Cherry-picked from PR #5580 by MestreY0d4-Uninter.

- Share parent's credential pool with child agents for key rotation
- Leasing layer spreads parallel children across keys (least-loaded)
- Thread-safe acquire_lease/release_lease in CredentialPool
- Reverted sneaked-in tool-name restoration change (kept original
  getattr + isinstance guard pattern)
2026-04-06 23:01:11 -07:00
Teknium
8cf013ecd9 fix: replace stale 'hermes login' refs with 'hermes auth' + fix credential removal re-seeding (#5670)
Two fixes:

1. Replace all stale 'hermes login' references with 'hermes auth' across
   auth.py, auxiliary_client.py, delegate_tool.py, config.py, run_agent.py,
   and documentation. The 'hermes login' command was deprecated; 'hermes auth'
   now handles OAuth credential management.

2. Fix credential removal not persisting for singleton-sourced credentials
   (device_code for openai-codex/nous, hermes_pkce for anthropic).
   auth_remove_command already cleared env vars for env-sourced credentials,
   but singleton credentials stored in the auth store were re-seeded by
   _seed_from_singletons() on the next load_pool() call. Now clears the
   underlying auth store entry when removing singleton-sourced credentials.
2026-04-06 17:17:57 -07:00
BongSuCHOI
ad567c9a8f fix: subagent toolset inheritance when parent enabled_toolsets is None
When parent_agent.enabled_toolsets is None (the default, meaning all tools
are enabled), subagents incorrectly fell back to DEFAULT_TOOLSETS
(['terminal', 'file', 'web']) instead of inheriting the parent's full
toolset.

Root cause:
- Line 188 used 'or' fallback: None or DEFAULT_TOOLSETS evaluates to
  DEFAULT_TOOLSETS
- Line 192 checked truthiness: None is falsy, falling through to else

Fix:
- Use 'is not None' checks instead of truthiness
- When enabled_toolsets is None, derive effective toolsets from
  parent_agent.valid_tool_names via the tool registry

Fixes the bug introduced in f75b1d21b and repeated in e5d14445e (PR #3269).
2026-04-06 13:20:01 -07:00
donrhmexe
7409715947 fix: link subagent sessions to parent and hide from session list
Subagent sessions spawned by delegate_task were created with
parent_session_id=NULL and source=cli, making them indistinguishable
from user sessions in hermes sessions list and /resume.

Changes:
- delegate_tool.py: pass parent_agent.session_id to child agent
- run_agent.py: accept parent_session_id param, pass to create_session
- hermes_state.py list_sessions_rich: filter parent_session_id IS NULL
  by default (opt-in include_children=True for callers that need them)
- hermes_state.py delete_session: delete child sessions first (FK)
- hermes_state.py prune_sessions: delete children before parents (FK)

session_search already handles parent_session_id correctly — child
sessions are filtered from recent list and resolved to parent root
in full-text search results.

Fixes #5122
2026-04-05 12:48:50 -07:00
Mibayy
cc2b56b26a feat(api): structured run events via /v1/runs SSE endpoint
Add POST /v1/runs to start async agent runs and GET /v1/runs/{run_id}/events
for SSE streaming of typed lifecycle events (tool.started, tool.completed,
message.delta, reasoning.available, run.completed, run.failed).

Changes the internal tool_progress_callback signature from positional
(tool_name, preview, args) to event-type-first
(event_type, tool_name, preview, args, **kwargs). Existing consumers
filter on event_type and remain backward-compatible.

Adds concurrency limit (_MAX_CONCURRENT_RUNS=10) and orphaned run sweep.

Fixes logic inversion in cli.py _on_tool_progress where the original PR
would have displayed internal tools instead of non-internal ones.

Co-authored-by: Mibayy <mibayy@users.noreply.github.com>
2026-04-05 12:05:13 -07:00
Mibayy
e167ad8f61 feat(delegate): add acp_command/acp_args override to delegate_task
Allow delegate_task to specify custom ACP transport per-task, so a parent
running via CLI/Discord/Telegram can spawn child agents over ACP
(e.g. claude --acp --stdio). Follows the existing override_provider pattern.
Supports per-task granularity in batch mode.

Co-authored-by: Mibayy <mibayy@users.noreply.github.com>
2026-04-05 12:05:13 -07:00