#!/usr/bin/env python3
"""File Tools Module - LLM agent file manipulation tools."""

import errno
import json
import logging
import os
import threading
from pathlib import Path
from typing import Optional

from agent.file_safety import get_read_block_error
from tools.binary_extensions import has_binary_extension
from tools.file_operations import ShellFileOperations
from tools import file_state
from agent.redact import redact_sensitive_text

logger = logging.getLogger(__name__)

_EXPECTED_WRITE_ERRNOS = {errno.EACCES, errno.EPERM, errno.EROFS}
# ---------------------------------------------------------------------------
# Read-size guard: cap the character count returned to the model.
# We're model-agnostic so we can't count tokens; characters are a safe proxy.
# 100K chars ≈ 25–35K tokens across typical tokenisers. Files larger than
# this in a single read are a context-window hazard — the model should use
# offset+limit to read the relevant section.
#
# Configurable via config.yaml: file_read_max_chars: 200000
# ---------------------------------------------------------------------------
_DEFAULT_MAX_READ_CHARS = 100_000
_max_read_chars_cached: int | None = None


def _get_max_read_chars() -> int:
    """Return the configured max characters per file read.

    Reads ``file_read_max_chars`` from config.yaml on first call, caches
    the result for the lifetime of the process. Falls back to the
    built-in default if the config is missing or invalid.
    """
    global _max_read_chars_cached
    if _max_read_chars_cached is not None:
        return _max_read_chars_cached
    try:
        from hermes_cli.config import load_config

        cfg = load_config()
        val = cfg.get("file_read_max_chars")
        if isinstance(val, (int, float)) and val > 0:
            _max_read_chars_cached = int(val)
            return _max_read_chars_cached
    except Exception:
        pass
    _max_read_chars_cached = _DEFAULT_MAX_READ_CHARS
    return _max_read_chars_cached
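
# Resolution order, illustrated (hypothetical config values):
#   config.yaml has file_read_max_chars: 200000  -> returns 200_000
#   value is non-numeric or <= 0                 -> falls back to 100_000
#   key missing or config unreadable             -> falls back to 100_000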
# If the total file size exceeds this AND the caller didn't specify a narrow
# range (limit <= 200), we include a hint encouraging targeted reads.
_LARGE_FILE_HINT_BYTES = 512_000 # 512 KB
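
# e.g. a 600 KB file read with the default limit=500 whose output is
# truncated gets the hint; an explicit limit=120 read of the same file
# does not (the caller already asked for a narrow range).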
# ---------------------------------------------------------------------------
# Device path blocklist — reading these hangs the process (infinite output
# or blocking on input). Checked by path only (no I/O).
# ---------------------------------------------------------------------------
_BLOCKED_DEVICE_PATHS = frozenset({
    # Infinite output — never reach EOF
    "/dev/zero", "/dev/random", "/dev/urandom", "/dev/full",
    # Blocks waiting for input
    "/dev/stdin", "/dev/tty", "/dev/console",
    # Nonsensical to read
    "/dev/stdout", "/dev/stderr",
    # fd aliases
    "/dev/fd/0", "/dev/fd/1", "/dev/fd/2",
})


def _resolve_path(filepath: str) -> Path:
    """Resolve a path relative to TERMINAL_CWD (the worktree base directory)
    instead of the main repository root.
    """
    p = Path(filepath).expanduser()
    if not p.is_absolute():
        base = os.environ.get("TERMINAL_CWD", os.getcwd())
        p = Path(base) / p
    return p.resolve()
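
# Resolution examples (hypothetical paths), with TERMINAL_CWD=/work/wt1:
#   _resolve_path("src/app.py")  -> /work/wt1/src/app.py
#   _resolve_path("/tmp/x.txt")  -> /tmp/x.txt (absolute paths pass through)
#   _resolve_path("~/notes.md")  -> <home>/notes.md, then symlink-resolved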
def _is_blocked_device(filepath: str) -> bool:
    """Return True if the path would hang the process (infinite output or blocking input).

    Uses the *literal* path — no symlink resolution — because the model
    specifies paths directly and realpath follows symlinks all the way
    through (e.g. /dev/stdin → /proc/self/fd/0 → /dev/pts/0), defeating
    the check.
    """
    normalized = os.path.expanduser(filepath)
    if normalized in _BLOCKED_DEVICE_PATHS:
        return True
    # /proc/self/fd/0-2 and /proc/<pid>/fd/0-2 are Linux aliases for stdio
    if normalized.startswith("/proc/") and normalized.endswith(
        ("/fd/0", "/fd/1", "/fd/2")
    ):
        return True
    return False
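
# Illustrative checks against the blocklist above:
#   _is_blocked_device("/dev/zero")        -> True  (infinite output)
#   _is_blocked_device("/proc/self/fd/0")  -> True  (stdin alias)
#   _is_blocked_device("/tmp/data.txt")    -> False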
# Paths that file tools should refuse to write to without going through the
# terminal tool's approval system. Prefixes are matched against both the
# resolved path and the normalised literal path.
_SENSITIVE_PATH_PREFIXES = (
    "/etc/", "/boot/", "/usr/lib/systemd/",
    "/private/etc/", "/private/var/",
)

_SENSITIVE_EXACT_PATHS = {"/var/run/docker.sock", "/run/docker.sock"}


def _check_sensitive_path(filepath: str) -> str | None:
    """Return an error message if the path targets a sensitive system location."""
    try:
        resolved = str(_resolve_path(filepath))
    except (OSError, ValueError):
        resolved = filepath
    normalized = os.path.normpath(os.path.expanduser(filepath))
    _err = (
        f"Refusing to write to sensitive system path: {filepath}\n"
        "Use the terminal tool with sudo if you need to modify system files."
    )
    for prefix in _SENSITIVE_PATH_PREFIXES:
        if resolved.startswith(prefix) or normalized.startswith(prefix):
            return _err
    if resolved in _SENSITIVE_EXACT_PATHS or normalized in _SENSITIVE_EXACT_PATHS:
        return _err
    return None
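
# Illustrative outcomes (hypothetical paths):
#   _check_sensitive_path("/etc/hosts")        -> error string (prefix match)
#   _check_sensitive_path("/run/docker.sock")  -> error string (exact match)
#   _check_sensitive_path("notes/todo.md")     -> None; the write proceeds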

def _is_expected_write_exception(exc: Exception) -> bool:
    """Return True for expected write denials that should not hit error logs."""
    if isinstance(exc, PermissionError):
        return True
    if isinstance(exc, OSError) and exc.errno in _EXPECTED_WRITE_ERRNOS:
        return True
    return False
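
# e.g. PermissionError("denied")          -> True  (expected; kept quiet)
#      OSError(errno.EROFS, "read-only")  -> True  (errno in the set above)
#      OSError(errno.ENOSPC, "disk full") -> False (unexpected; gets logged)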

_file_ops_lock = threading.Lock()
_file_ops_cache: dict = {}
# Track files read per task to detect re-read loops and deduplicate reads.
# Per task_id we store:
# "last_key": the key of the most recent read/search call (or None)
# "consecutive": how many times that exact call has been repeated in a row
# "read_history": set of (path, offset, limit) tuples for get_read_files_summary
# "dedup": dict mapping (resolved_path, offset, limit) → mtime float
# Used to skip re-reads of unchanged files. Reset on
# context compression (the original content is summarised
# away so the model needs the full content again).
# "read_timestamps": dict mapping resolved_path → modification-time float
# recorded when the file was last read (or written) by
# this task. Used by write_file and patch to detect
# external changes between the agent's read and write.
# Updated after successful writes so consecutive edits
# by the same task don't trigger false warnings.
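# Example shape of one tracker entry (illustrative values only):
#   _read_tracker["task-abc"] = {
#       "last_key": ("read", "src/app.py", 1, 500),
#       "consecutive": 2,
#       "read_history": {("src/app.py", 1, 500)},
#       "dedup": {("/work/src/app.py", 1, 500): 1712345678.9},
#       "read_timestamps": {"/work/src/app.py": 1712345678.9},
#   }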
_read_tracker_lock = threading.Lock()
_read_tracker: dict = {}
# Per-task bounds for the containers inside each _read_tracker[task_id].
# A CLI session uses one stable task_id for its lifetime; without these
# caps, a 10k-read session would accumulate ~1.5MB of dict/set state that
# is never referenced again (only the most recent reads matter for dedup,
# loop detection, and external-edit warnings). Hard caps bound the
# accretion to a few hundred KB regardless of session length.
_READ_HISTORY_CAP = 500 # set; used only by get_read_files_summary
_DEDUP_CAP = 1000 # dict; skip-identical-reread guard
_READ_TIMESTAMPS_CAP = 1000 # dict; external-edit detection for write/patch

def _cap_read_tracker_data(task_data: dict) -> None:
    """Enforce size caps on the per-task read-tracker sub-containers.

    Must be called with ``_read_tracker_lock`` held. Eviction policy:

    * ``read_history`` (set): pop arbitrary entries on overflow. This
      is fine because the set only feeds diagnostic summaries; losing
      old entries just trims the summary's tail.
    * ``dedup`` / ``read_timestamps`` (dict): pop oldest by insertion
      order (Python 3.7+ dicts). Evicted entries lose their dedup
      skip on a future re-read (the file gets re-sent once) and
      external-edit mtime comparison (the write/patch falls back to
      a non-mtime check). Both are graceful degradations, not bugs.
    """
    rh = task_data.get("read_history")
    if rh is not None and len(rh) > _READ_HISTORY_CAP:
        excess = len(rh) - _READ_HISTORY_CAP
        for _ in range(excess):
            try:
                rh.pop()
            except KeyError:
                break
    dedup = task_data.get("dedup")
    if dedup is not None and len(dedup) > _DEDUP_CAP:
        excess = len(dedup) - _DEDUP_CAP
        for _ in range(excess):
            try:
                dedup.pop(next(iter(dedup)))
            except (StopIteration, KeyError):
                break
    ts = task_data.get("read_timestamps")
    if ts is not None and len(ts) > _READ_TIMESTAMPS_CAP:
        excess = len(ts) - _READ_TIMESTAMPS_CAP
        for _ in range(excess):
            try:
                ts.pop(next(iter(ts)))
            except (StopIteration, KeyError):
                break
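
# Worked example (hypothetical sizes): with _DEDUP_CAP = 1000, a "dedup"
# dict holding 1003 entries has its 3 oldest-inserted keys popped. A later
# re-read of an evicted (path, offset, limit) simply returns full content
# once and re-enters the cache; graceful degradation, not breakage.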

def _get_file_ops(task_id: str = "default") -> ShellFileOperations:
    """Get or create ShellFileOperations for a terminal environment.

    Respects the TERMINAL_ENV setting -- if the task_id doesn't have an
    environment yet, creates one using the configured backend (local, docker,
    modal, etc.) rather than always defaulting to local.

    Thread-safe: uses the same per-task creation locks as terminal_tool to
    prevent duplicate sandbox creation from concurrent tool calls.
    """
    from tools.terminal_tool import (
        _active_environments, _env_lock, _create_environment,
        _get_env_config, _last_activity, _start_cleanup_thread,
        _creation_locks,
        _creation_locks_lock,
    )
    import time
    # Fast path: check cache -- but also verify the underlying environment
    # is still alive (it may have been killed by the cleanup thread).
    with _file_ops_lock:
        cached = _file_ops_cache.get(task_id)
    if cached is not None:
        with _env_lock:
            if task_id in _active_environments:
                _last_activity[task_id] = time.time()
                return cached
            else:
                # Environment was cleaned up -- invalidate stale cache entry
                with _file_ops_lock:
                    _file_ops_cache.pop(task_id, None)

    # Need to ensure the environment exists before building file_ops.
    # Acquire per-task lock so only one thread creates the sandbox.
    with _creation_locks_lock:
        if task_id not in _creation_locks:
            _creation_locks[task_id] = threading.Lock()
        task_lock = _creation_locks[task_id]

    with task_lock:
        # Double-check: another thread may have created it while we waited
        with _env_lock:
            if task_id in _active_environments:
                _last_activity[task_id] = time.time()
                terminal_env = _active_environments[task_id]
            else:
                terminal_env = None

        if terminal_env is None:
            from tools.terminal_tool import _task_env_overrides

            config = _get_env_config()
            env_type = config["env_type"]
            overrides = _task_env_overrides.get(task_id, {})
            if env_type == "docker":
                image = overrides.get("docker_image") or config["docker_image"]
            elif env_type == "singularity":
                image = overrides.get("singularity_image") or config["singularity_image"]
            elif env_type == "modal":
                image = overrides.get("modal_image") or config["modal_image"]
            elif env_type == "daytona":
                image = overrides.get("daytona_image") or config["daytona_image"]
            else:
                image = ""
            cwd = overrides.get("cwd") or config["cwd"]
            logger.info("Creating new %s environment for task %s...", env_type, task_id[:8])

            container_config = None
            if env_type in ("docker", "singularity", "modal", "daytona"):
                container_config = {
                    "container_cpu": config.get("container_cpu", 1),
                    "container_memory": config.get("container_memory", 5120),
                    "container_disk": config.get("container_disk", 51200),
                    "container_persistent": config.get("container_persistent", True),
                    "docker_volumes": config.get("docker_volumes", []),
                    "docker_mount_cwd_to_workspace": config.get("docker_mount_cwd_to_workspace", False),
                    "docker_forward_env": config.get("docker_forward_env", []),
                }

            ssh_config = None
            if env_type == "ssh":
                ssh_config = {
                    "host": config.get("ssh_host", ""),
                    "user": config.get("ssh_user", ""),
                    "port": config.get("ssh_port", 22),
                    "key": config.get("ssh_key", ""),
                    "persistent": config.get("ssh_persistent", False),
                }

            local_config = None
            if env_type == "local":
                local_config = {
                    "persistent": config.get("local_persistent", False),
                }

            terminal_env = _create_environment(
                env_type=env_type,
                image=image,
                cwd=cwd,
                timeout=config["timeout"],
                ssh_config=ssh_config,
                container_config=container_config,
                local_config=local_config,
                task_id=task_id,
                host_cwd=config.get("host_cwd"),
            )
            with _env_lock:
                _active_environments[task_id] = terminal_env
                _last_activity[task_id] = time.time()
            _start_cleanup_thread()
            logger.info("%s environment ready for task %s", env_type, task_id[:8])

        # Build file_ops from the (guaranteed live) environment and cache it
        file_ops = ShellFileOperations(terminal_env)
        with _file_ops_lock:
            _file_ops_cache[task_id] = file_ops
        return file_ops
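
# Usage sketch (hypothetical task id): repeated calls are cheap cache hits
# as long as the cleanup thread hasn't reaped the environment in between.
#
#   ops = _get_file_ops("task-abc")  # first call: creates the backend env
#   ops = _get_file_ops("task-abc")  # later calls: cache hit, refreshes
#                                    # _last_activity for the reaper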

def clear_file_ops_cache(task_id: str | None = None):
    """Clear the file operations cache."""
    with _file_ops_lock:
        if task_id:
            _file_ops_cache.pop(task_id, None)
        else:
            _file_ops_cache.clear()
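
# e.g. clear_file_ops_cache("task-abc")  # drop one task's cached ops
#      clear_file_ops_cache()            # drop everything (tests, resets)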

def read_file_tool(path: str, offset: int = 1, limit: int = 500, task_id: str = "default") -> str:
    """Read a file with pagination and line numbers."""
    try:
        # ── Device path guard ─────────────────────────────────────────
        # Block paths that would hang the process (infinite output,
        # blocking on input). Pure path check — no I/O.
        if _is_blocked_device(path):
            return json.dumps({
                "error": (
                    f"Cannot read '{path}': this is a device file that would "
                    "block or produce infinite output."
                ),
            })

        _resolved = _resolve_path(path)
        # ── Binary file guard ─────────────────────────────────────────
        # Block binary files by extension (no I/O).
        if has_binary_extension(str(_resolved)):
            _ext = _resolved.suffix.lower()
            return json.dumps({
                "error": (
                    f"Cannot read binary file '{path}' ({_ext}). "
                    "Use vision_analyze for images, or terminal to inspect binary files."
                ),
            })
        # ── Hermes internal path guard ────────────────────────────────
        # Prevent prompt injection via catalog or hub metadata files.
        block_error = get_read_block_error(path)
        if block_error:
            return json.dumps({"error": block_error})
        # ── Dedup check ───────────────────────────────────────────────
        # If we already read this exact (path, offset, limit) and the
        # file hasn't been modified since, return a lightweight stub
        # instead of re-sending the same content. Saves context tokens.
        resolved_str = str(_resolved)
        dedup_key = (resolved_str, offset, limit)
        with _read_tracker_lock:
            task_data = _read_tracker.setdefault(task_id, {
                "last_key": None, "consecutive": 0,
                "read_history": set(), "dedup": {},
            })
            cached_mtime = task_data.get("dedup", {}).get(dedup_key)
        if cached_mtime is not None:
            try:
                current_mtime = os.path.getmtime(resolved_str)
                if current_mtime == cached_mtime:
                    return json.dumps({
                        "content": (
                            "File unchanged since last read. The content from "
                            "the earlier read_file result in this conversation is "
                            "still current — refer to that instead of re-reading."
                        ),
                        "path": path,
                        "dedup": True,
                    }, ensure_ascii=False)
            except OSError:
                pass  # stat failed — fall through to full read
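
        # Dedup behaviour, illustrated (hypothetical calls): with an
        # unchanged mtime,
        #   read_file_tool("src/app.py", 1, 500)  # full content; mtime cached
        #   read_file_tool("src/app.py", 1, 500)  # lightweight stub, dedup=True
        # Any write or patch bumps the mtime, so the next read is fresh again.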
        # ── Perform the read ──────────────────────────────────────────
        file_ops = _get_file_ops(task_id)
        result = file_ops.read_file(path, offset, limit)
        result_dict = result.to_dict()
        # ── Character-count guard ─────────────────────────────────────
        # We're model-agnostic so we can't count tokens; characters are
        # the best proxy we have. If the read produced an unreasonable
        # amount of content, reject it and tell the model to narrow down.
        # Note: we check the formatted content (with line-number prefixes),
        # not the raw file size, because that's what actually enters context.
        # Check BEFORE redaction to avoid expensive regex on huge content.
        content_len = len(result.content or "")
        file_size = result_dict.get("file_size", 0)
        max_chars = _get_max_read_chars()
        if content_len > max_chars:
            total_lines = result_dict.get("total_lines", "unknown")
            return json.dumps({
                "error": (
                    f"Read produced {content_len:,} characters which exceeds "
                    f"the safety limit ({max_chars:,} chars). "
                    "Use offset and limit to read a smaller range. "
                    f"The file has {total_lines} lines total."
                ),
                "path": path,
                "total_lines": total_lines,
                "file_size": file_size,
            }, ensure_ascii=False)
        # ── Redact secrets (after guard check to skip oversized content) ──
        if result.content:
            result.content = redact_sensitive_text(result.content)
            result_dict["content"] = result.content
        # Large-file hint: if the file is big and the caller didn't ask
        # for a narrow window, nudge toward targeted reads.
        if (file_size and file_size > _LARGE_FILE_HINT_BYTES
                and limit > 200
                and result_dict.get("truncated")):
            result_dict.setdefault("_hint", (
                f"This file is large ({file_size:,} bytes). "
                "Consider reading only the section you need with offset and limit "
                "to keep context usage efficient."
            ))

        # ── Track for consecutive-loop detection ──────────────────────
        read_key = ("read", path, offset, limit)
        with _read_tracker_lock:
            # Ensure "dedup" key exists (backward compat with old tracker state)
            if "dedup" not in task_data:
                task_data["dedup"] = {}
            task_data["read_history"].add((path, offset, limit))
            if task_data["last_key"] == read_key:
                task_data["consecutive"] += 1
            else:
                task_data["last_key"] = read_key
                task_data["consecutive"] = 1
            count = task_data["consecutive"]
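
            # Consecutive-count example (illustrative): three identical
            # ("read", path, 1, 500) calls in a row give count == 3; any
            # different read resets count to 1, and other tool calls in
            # between reset the tracker via notify_other_tool_call().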
            # Store mtime at read time for two purposes:
            # 1. Dedup: skip identical re-reads of unchanged files.
            # 2. Staleness: warn on write/patch if the file changed since
            #    the agent last read it (external edit, concurrent agent, etc.).
            try:
                _mtime_now = os.path.getmtime(resolved_str)
                task_data["dedup"][dedup_key] = _mtime_now
                task_data.setdefault("read_timestamps", {})[resolved_str] = _mtime_now
            except OSError:
                pass  # Can't stat — skip tracking for this entry
            # Bound the per-task containers so a long CLI session doesn't
            # accumulate megabytes of dict/set state. See _cap_read_tracker_data.
            _cap_read_tracker_data(task_data)
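            # (Default caps from the accretion fix: read_history=500, dedup=1000,
            # read_timestamps=1000. Dict eviction follows insertion order per the
            # Python 3.7+ guarantee; set eviction is arbitrary.)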
        # Cross-agent file-state registry (separate from the per-task read
        # tracker above): records that THIS agent has read this path so
        # write/patch can detect sibling-subagent writes that happened
        # after our read. The read counts as partial when offset > 1 or it
        # was truncated (a large file with more content than limit covered).
        # Kept outside the _read_tracker_lock so the registry's own locking
        # isn't nested under ours.
        try:
            _partial = (offset > 1) or bool(result_dict.get("truncated"))
            file_state.record_read(task_id, resolved_str, partial=_partial)
        except Exception:
            logger.debug("file_state.record_read failed", exc_info=True)
        if count >= 4:
            # Hard block: stop returning content to break the loop
            return json.dumps({
                "error": (
f " BLOCKED: You have read this exact file region { count } times in a row. "
2026-03-08 23:01:21 +03:00
" The content has NOT changed. You already have this information. "
" STOP re-reading and proceed with your task. "
) ,
" path " : path ,
" already_read " : count ,
} , ensure_ascii = False )
        elif count >= 3:
            result_dict["_warning"] = (
f " You have read this exact file region { count } times consecutively. "
" The content has not changed since your last read. Use the information you already have. "
2026-03-08 20:44:42 +03:00
" If you are stuck in a loop, stop reading and proceed with writing or responding. "
)
return json . dumps ( result_dict , ensure_ascii = False )
2026-02-05 03:49:46 -08:00
    except Exception as e:
        return tool_error(str(e))
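

# Consecutive-read contract (illustrative summary; the counter lives in
# _read_tracker and is reset by notify_other_tool_call below):
#
#     read #1-2  -> full content, no warning
#     read #3    -> content plus a "_warning" nudge to stop re-reading
#     read #4+   -> hard-blocked: an "error" payload, no file content
#
# Any non-read/search tool call in between resets the count to zero.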


def reset_file_dedup(task_id: str = None):
    """Clear the deduplication cache for file reads.

    Called after context compression — the original read content has been
    summarised away, so the model needs the full content if it reads the
    same file again. Without this, reads after compression would return
    a "file unchanged" stub pointing at content that no longer exists in
    context.

    Call with a task_id to clear just that task, or without to clear all.
    """
    with _read_tracker_lock:
        if task_id:
            task_data = _read_tracker.get(task_id)
            if task_data and "dedup" in task_data:
                task_data["dedup"].clear()
        else:
            for task_data in _read_tracker.values():
                if "dedup" in task_data:
                    task_data["dedup"].clear()


def notify_other_tool_call(task_id: str = "default"):
    """Reset consecutive read/search counter for a task.

    Called by the tool dispatcher (model_tools.py) whenever a tool OTHER
    than read_file/search_files is executed. This ensures we only warn
    or block on *truly consecutive* repeated reads — if the agent does
    anything else in between (write, patch, terminal, etc.) the counter
    resets and the next read is treated as fresh.
    """
    with _read_tracker_lock:
        task_data = _read_tracker.get(task_id)
        if task_data:
            task_data["last_key"] = None
            task_data["consecutive"] = 0


def _update_read_timestamp(filepath: str, task_id: str) -> None:
    """Record the file's current modification time after a successful write.

    Called after write_file and patch so that consecutive edits by the
    same task don't trigger false staleness warnings — each write
    refreshes the stored timestamp to match the file's new state.
    """
    try:
        resolved = str(_resolve_path(filepath))
        current_mtime = os.path.getmtime(resolved)
    except (OSError, ValueError):
        return
    with _read_tracker_lock:
        task_data = _read_tracker.get(task_id)
        if task_data is not None:
            task_data.setdefault("read_timestamps", {})[resolved] = current_mtime
            _cap_read_tracker_data(task_data)


def _check_file_staleness(filepath: str, task_id: str) -> str | None:
    """Check whether a file was modified since the agent last read it.

    Returns a warning string if the file is stale (mtime changed since
    the last read_file call for this task), or None if the file is fresh
    or was never read. Does not block — the write still proceeds.
    """
    try:
        resolved = str(_resolve_path(filepath))
    except (OSError, ValueError):
        return None
    with _read_tracker_lock:
        task_data = _read_tracker.get(task_id)
        if not task_data:
            return None
        read_mtime = task_data.get("read_timestamps", {}).get(resolved)
    if read_mtime is None:
        return None  # File was never read — nothing to compare against
    try:
        current_mtime = os.path.getmtime(resolved)
    except OSError:
        return None  # Can't stat — file may have been deleted, let write handle it
    if current_mtime != read_mtime:
        return (
            f"Warning: {filepath} was modified since you last read it "
            "(external edit or concurrent agent). The content you read may be "
            "stale. Consider re-reading the file to verify before writing."
        )
    return None
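

# Staleness round-trip sketch (illustrative only, not part of the tool
# surface; read_file_tool's full signature is abbreviated here):
#
#     import tempfile, time
#
#     with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
#         f.write("v1")
#     read_file_tool(f.name, task_id="demo")        # stamps the mtime
#     time.sleep(0.01)                              # let the mtime differ
#     Path(f.name).write_text("v2, external edit")  # bumps the mtime
#     _check_file_staleness(f.name, "demo")         # -> warning string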


def write_file_tool(path: str, content: str, task_id: str = "default") -> str:
    """Write content to a file."""
    sensitive_err = _check_sensitive_path(path)
    if sensitive_err:
        return tool_error(sensitive_err)
    try:
        # Resolve once for the registry lock + stale check. Failures here
        # fall back to the legacy path — the write proceeds, and the
        # per-task staleness check below still runs.
        try:
            _resolved = str(_resolve_path(path))
        except Exception:
            _resolved = None
        if _resolved is None:
            stale_warning = _check_file_staleness(path, task_id)
            file_ops = _get_file_ops(task_id)
            result = file_ops.write_file(path, content)
            result_dict = result.to_dict()
            if stale_warning:
                result_dict["_warning"] = stale_warning
            _update_read_timestamp(path, task_id)
            return json.dumps(result_dict, ensure_ascii=False)
        # Serialize the read→modify→write region per-path so concurrent
        # subagents can't interleave on the same file. Different paths
        # remain fully parallel.
        with file_state.lock_path(_resolved):
            # Cross-agent staleness wins over the per-task warning when
            # both fire — its message names the sibling subagent.
            cross_warning = file_state.check_stale(task_id, _resolved)
            stale_warning = _check_file_staleness(path, task_id)
            file_ops = _get_file_ops(task_id)
            result = file_ops.write_file(path, content)
            result_dict = result.to_dict()
            effective_warning = cross_warning or stale_warning
            if effective_warning:
                result_dict["_warning"] = effective_warning
            # Refresh stamps after the successful write so consecutive
            # writes by this task don't trigger false staleness warnings.
            _update_read_timestamp(path, task_id)
            if not result_dict.get("error"):
                file_state.note_write(task_id, _resolved)
            return json.dumps(result_dict, ensure_ascii=False)
    except Exception as e:
        if _is_expected_write_exception(e):
            logger.debug("write_file expected denial: %s: %s", type(e).__name__, e)
        else:
            logger.error("write_file error: %s: %s", type(e).__name__, e, exc_info=True)
        return tool_error(str(e))
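

# Cross-agent interplay sketch (a minimal trace using the registry API from
# tools/file_state.py; the task ids and path are made up):
#
#     file_state.record_read("agent-A", "/tmp/x.py")   # A reads the file
#     file_state.note_write("agent-B", "/tmp/x.py")    # sibling B writes it
#     file_state.check_stale("agent-A", "/tmp/x.py")   # -> warning naming agent-B
#     file_state.check_stale("agent-B", "/tmp/x.py")   # -> None; B owns the write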


def patch_tool(mode: str = "replace", path: str = None, old_string: str = None,
               new_string: str = None, replace_all: bool = False, patch: str = None,
               task_id: str = "default") -> str:
    """Patch a file using replace mode or V4A patch format."""
    # Check sensitive paths for both replace (explicit path) and V4A patch
    # (paths extracted from the patch headers).
    _paths_to_check = []
    if path:
        _paths_to_check.append(path)
    if mode == "patch" and patch:
        import re as _re
        for _m in _re.finditer(r'^\*\*\*\s+(?:Update|Add|Delete)\s+File:\s*(.+)$', patch, _re.MULTILINE):
            _paths_to_check.append(_m.group(1).strip())
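    # (V4A file headers look like '*** Update File: src/app.py'; the example
    # path is made up.)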
    for _p in _paths_to_check:
        sensitive_err = _check_sensitive_path(_p)
        if sensitive_err:
            return tool_error(sensitive_err)
    try:
        # Resolve paths for locking. Ordered + deduplicated so concurrent
        # callers lock in the same order — prevents deadlock on overlapping
        # multi-file V4A patches.
        _resolved_paths: list[str] = []
        _seen: set[str] = set()
        for _p in _paths_to_check:
            try:
                _r = str(_resolve_path(_p))
            except Exception:
                _r = None
            if _r and _r not in _seen:
                _resolved_paths.append(_r)
                _seen.add(_r)
        _resolved_paths.sort()
        # Acquire the per-path locks in sorted order via ExitStack. A single
        # path degenerates to one lock; an empty list (nothing resolvable)
        # is a no-op and execution falls through unchanged.
        from contextlib import ExitStack
        with ExitStack() as _locks:
            for _r in _resolved_paths:
                _locks.enter_context(file_state.lock_path(_r))
            # Collect warnings — cross-agent registry first (it names the
            # sibling), then the per-task tracker as a fallback.
            stale_warnings: list[str] = []
            _path_to_resolved: dict[str, str] = {}
            for _p in _paths_to_check:
                try:
                    _r = str(_resolve_path(_p))
                except Exception:
                    _r = None
                _path_to_resolved[_p] = _r
                _cross = file_state.check_stale(task_id, _r) if _r else None
                _sw = _cross or _check_file_staleness(_p, task_id)
                if _sw:
                    stale_warnings.append(_sw)
            file_ops = _get_file_ops(task_id)
            if mode == "replace":
                if not path:
                    return tool_error("path required")
                if old_string is None or new_string is None:
                    return tool_error("old_string and new_string required")
                result = file_ops.patch_replace(path, old_string, new_string, replace_all)
            elif mode == "patch":
                if not patch:
                    return tool_error("patch content required")
                result = file_ops.patch_v4a(patch)
            else:
                return tool_error(f"Unknown mode: {mode}")
            result_dict = result.to_dict()
            if stale_warnings:
                result_dict["_warning"] = stale_warnings[0] if len(stale_warnings) == 1 else " | ".join(stale_warnings)
            # Refresh stored timestamps for all successfully-patched paths so
            # consecutive edits by this task don't trigger false warnings.
            if not result_dict.get("error"):
                for _p in _paths_to_check:
                    _update_read_timestamp(_p, task_id)
                    _r = _path_to_resolved.get(_p)
                    if _r:
                        file_state.note_write(task_id, _r)
            result_json = json.dumps(result_dict, ensure_ascii=False)
            # Hint when old_string is not found — saves iterations where the
            # agent retries with stale content instead of re-reading the file.
            # Suppressed when patch_replace already attached a rich "Did you
            # mean?" snippet (which is strictly more useful than the generic hint).
            if result_dict.get("error") and "Could not find" in str(result_dict["error"]):
if " Did you mean one of these sections? " not in str ( result_dict [ " error " ] ) :
result_json + = " \n \n [Hint: old_string not found. Use read_file to verify the current content, or search_files to locate the text.] "
2026-03-08 17:46:28 -07:00
return result_json
    except Exception as e:
        return tool_error(str(e))
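

# Lock-ordering sketch: two concurrent multi-file patches touching the same
# pair of files both sort their resolved paths, so they acquire locks in the
# same order and cannot deadlock (the file names are made up):
#
#     patch A locks: /repo/a.py -> /repo/b.py
#     patch B locks: /repo/a.py -> /repo/b.py   (blocks on a.py until A exits)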


def search_tool(pattern: str, target: str = "content", path: str = ".",
                file_glob: str = None, limit: int = 50, offset: int = 0,
                output_mode: str = "content", context: int = 0,
                task_id: str = "default") -> str:
    """Search for content or files."""
    try:
        # Track searches to detect *consecutive* repeated search loops.
        # Include pagination args so users can page through truncated
        # results without tripping the repeated-search guard.
        search_key = (
            "search",
            pattern,
            target,
            str(path),
            file_glob or "",
            limit,
            offset,
        )
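        # For example, ("search", "TODO", "content", ".", "", 50, 0) and the
        # same query at offset=50 form distinct keys, so paging never trips
        # the guard (values illustrative).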
        with _read_tracker_lock:
            task_data = _read_tracker.setdefault(task_id, {
                "last_key": None, "consecutive": 0, "read_history": set(),
            })
            if task_data["last_key"] == search_key:
                task_data["consecutive"] += 1
            else:
                task_data["last_key"] = search_key
                task_data["consecutive"] = 1
            count = task_data["consecutive"]
        if count >= 4:
            return json.dumps({
                "error": (
f " BLOCKED: You have run this exact search { count } times in a row. "
2026-03-08 23:01:21 +03:00
" The results have NOT changed. You already have this information. "
" STOP re-searching and proceed with your task. "
) ,
" pattern " : pattern ,
" already_searched " : count ,
} , ensure_ascii = False )
        file_ops = _get_file_ops(task_id)
        result = file_ops.search(
            pattern=pattern, path=path, target=target, file_glob=file_glob,
            limit=limit, offset=offset, output_mode=output_mode, context=context
        )

        if hasattr(result, 'matches'):
            for m in result.matches:
                if hasattr(m, 'content') and m.content:
                    m.content = redact_sensitive_text(m.content)

        result_dict = result.to_dict()
        if count >= 3:
            result_dict["_warning"] = (
f " You have run this exact search { count } times consecutively. "
2026-03-08 23:01:21 +03:00
" The results have not changed. Use the information you already have. "
)
2026-03-08 17:46:28 -07:00
result_json = json . dumps ( result_dict , ensure_ascii = False )
# Hint when results were truncated — explicit next offset is clearer
# than relying on the model to infer it from total_count vs match count.
if result_dict . get ( " truncated " ) :
next_offset = offset + limit
result_json + = f " \n \n [Hint: Results truncated. Use offset= { next_offset } to see more, or narrow with a more specific pattern or file_glob.] "
return result_json
2026-02-05 03:49:46 -08:00
    except Exception as e:
        return tool_error(str(e))

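# Hedged sketch of the reset hook described in the read-loop fix above. The
# real notify_other_tool_call() is invoked from handle_function_call() in
# model_tools.py; only the increment side appears in search_tool, so this
# reconstructs the intended semantics: any non-read/search tool call breaks
# the consecutive streak, which is why read->edit->verify flows are never
# falsely blocked. Sketch only; the actual implementation may differ.
def _sketch_notify_other_tool_call(task_id: str = "default") -> None:
    with _read_tracker_lock:
        task_data = _read_tracker.get(task_id)
        if task_data is not None:
            task_data["last_key"] = None  # next identical call counts as 1, not N+1
            task_data["consecutive"] = 0
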
# ---------------------------------------------------------------------------
# Schemas + Registry
# ---------------------------------------------------------------------------
from tools.registry import registry, tool_error


def _check_file_reqs():
    """Lazy wrapper to avoid circular import with tools/__init__.py."""
    from tools import check_file_requirements
    return check_file_requirements()

READ_FILE_SCHEMA = {
    "name": "read_file",
    "description": "Read a text file with line numbers and pagination. Use this instead of cat/head/tail in terminal. Output format: 'LINE_NUM|CONTENT'. Suggests similar filenames if not found. Reads exceeding ~100K characters are rejected, so use offset and limit to page through large files in sections. NOTE: Cannot read images or binary files — use vision_analyze for images.",
    "parameters": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Path to the file to read (absolute, relative, or ~/path)"},
            "offset": {"type": "integer", "description": "Line number to start reading from (1-indexed, default: 1)", "default": 1, "minimum": 1},
            "limit": {"type": "integer", "description": "Maximum number of lines to read (default: 500, max: 2000)", "default": 500, "maximum": 2000}
        },
        "required": ["path"]
    }
}
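# Usage sketch for read_file (the path is hypothetical). Per the schema
# above, each returned line is formatted as "LINE_NUM|CONTENT", and large
# files are paged with offset/limit rather than read in one call:
#   read_file_tool(path="src/app.py")                         # lines 1-500
#   read_file_tool(path="src/app.py", offset=501, limit=500)  # next page
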
WRITE_FILE_SCHEMA = {
    "name": "write_file",
    "description": "Write content to a file, completely replacing existing content. Use this instead of echo/cat heredoc in terminal. Creates parent directories automatically. OVERWRITES the entire file — use 'patch' for targeted edits.",
    "parameters": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Path to the file to write (will be created if it doesn't exist, overwritten if it does)"},
            "content": {"type": "string", "description": "Complete content to write to the file"}
        },
        "required": ["path", "content"]
    }
}
PATCH_SCHEMA = {
    "name": "patch",
    "description": "Targeted find-and-replace edits in files. Use this instead of sed/awk in terminal. Uses fuzzy matching (9 strategies) so minor whitespace/indentation differences won't break it. Returns a unified diff. Auto-runs syntax checks after editing.\n\nReplace mode (default): find a unique string and replace it.\nPatch mode: apply V4A multi-file patches for bulk changes.",
    "parameters": {
        "type": "object",
        "properties": {
            "mode": {"type": "string", "enum": ["replace", "patch"], "description": "Edit mode: 'replace' for targeted find-and-replace, 'patch' for V4A multi-file patches", "default": "replace"},
            "path": {"type": "string", "description": "File path to edit (required for 'replace' mode)"},
            "old_string": {"type": "string", "description": "Text to find in the file (required for 'replace' mode). Must be unique in the file unless replace_all=true. Include enough surrounding context to ensure uniqueness."},
            "new_string": {"type": "string", "description": "Replacement text (required for 'replace' mode). Can be an empty string to delete the matched text."},
            "replace_all": {"type": "boolean", "description": "Replace all occurrences instead of requiring a unique match (default: false)", "default": False},
            "patch": {"type": "string", "description": "V4A format patch content (required for 'patch' mode). Format:\n*** Begin Patch\n*** Update File: path/to/file\n@@ context hint @@\n context line\n-removed line\n+added line\n*** End Patch"}
        },
        "required": ["mode"]
    }
}
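# Usage sketch for patch (file path and strings are hypothetical). Replace
# mode needs a unique old_string; patch mode takes a V4A patch as a single
# string in the format quoted in the schema above:
#   patch_tool(mode="replace", path="src/app.py",
#              old_string="timeout = 30", new_string="timeout = 60")
#   patch_tool(mode="patch", patch=(
#       "*** Begin Patch\n"
#       "*** Update File: src/app.py\n"
#       "@@ def connect @@\n"
#       " retries = 3\n"
#       "-timeout = 30\n"
#       "+timeout = 60\n"
#       "*** End Patch"
#   ))
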
SEARCH_FILES_SCHEMA = {
    "name": "search_files",
    "description": "Search file contents or find files by name. Use this instead of grep/rg/find/ls in terminal. Ripgrep-backed, faster than shell equivalents.\n\nContent search (target='content'): Regex search inside files. Output modes: full matches with line numbers, file paths only, or match counts.\n\nFile search (target='files'): Find files by glob pattern (e.g., '*.py', '*config*'). Also use this instead of ls — results sorted by modification time.",
    "parameters": {
        "type": "object",
        "properties": {
            "pattern": {"type": "string", "description": "Regex pattern for content search, or glob pattern (e.g., '*.py') for file search"},
            "target": {"type": "string", "enum": ["content", "files"], "description": "'content' searches inside file contents, 'files' searches for files by name", "default": "content"},
            "path": {"type": "string", "description": "Directory or file to search in (default: current working directory)", "default": "."},
            "file_glob": {"type": "string", "description": "Filter files by pattern in content mode (e.g., '*.py' to only search Python files)"},
            "limit": {"type": "integer", "description": "Maximum number of results to return (default: 50)", "default": 50},
            "offset": {"type": "integer", "description": "Skip first N results for pagination (default: 0)", "default": 0},
            "output_mode": {"type": "string", "enum": ["content", "files_only", "count"], "description": "Output format for content mode: 'content' shows matching lines with line numbers, 'files_only' lists file paths, 'count' shows match counts per file", "default": "content"},
            "context": {"type": "integer", "description": "Number of context lines before and after each match (content mode only)", "default": 0}
        },
        "required": ["pattern"]
    }
}
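# Usage sketch for search_files (patterns and paths are hypothetical). The
# consecutive-search key in search_tool includes limit and offset, so paging
# through truncated results like this never trips the repeated-search guard:
#   search_tool(pattern=r"def \w+_tool", path="tools/", file_glob="*.py")
#   search_tool(pattern=r"def \w+_tool", path="tools/", file_glob="*.py", offset=50)
#   search_tool(pattern="*.py", target="files", path=".")  # file search, newest first
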
def _handle_read_file(args, **kw):
    tid = kw.get("task_id") or "default"
    return read_file_tool(path=args.get("path", ""), offset=args.get("offset", 1), limit=args.get("limit", 500), task_id=tid)

def _handle_write_file(args, **kw):
    tid = kw.get("task_id") or "default"
    return write_file_tool(path=args.get("path", ""), content=args.get("content", ""), task_id=tid)


def _handle_patch(args, **kw):
    tid = kw.get("task_id") or "default"
    return patch_tool(
        mode=args.get("mode", "replace"), path=args.get("path"),
        old_string=args.get("old_string"), new_string=args.get("new_string"),
        replace_all=args.get("replace_all", False), patch=args.get("patch"), task_id=tid)


def _handle_search_files(args, **kw):
    tid = kw.get("task_id") or "default"
    # Accept "grep"/"find" as aliases and map them onto the current enum values.
    target_map = {"grep": "content", "find": "files"}
    raw_target = args.get("target", "content")
    target = target_map.get(raw_target, raw_target)
    return search_tool(
        pattern=args.get("pattern", ""), target=target, path=args.get("path", "."),
        file_glob=args.get("file_glob"), limit=args.get("limit", 50), offset=args.get("offset", 0),
        output_mode=args.get("output_mode", "content"), context=args.get("context", 0), task_id=tid)

registry.register(name="read_file", toolset="file", schema=READ_FILE_SCHEMA, handler=_handle_read_file, check_fn=_check_file_reqs, emoji="📖", max_result_size_chars=float('inf'))
registry.register(name="write_file", toolset="file", schema=WRITE_FILE_SCHEMA, handler=_handle_write_file, check_fn=_check_file_reqs, emoji="✍️", max_result_size_chars=100_000)
registry.register(name="patch", toolset="file", schema=PATCH_SCHEMA, handler=_handle_patch, check_fn=_check_file_reqs, emoji="🔧", max_result_size_chars=100_000)
registry.register(name="search_files", toolset="file", schema=SEARCH_FILES_SCHEMA, handler=_handle_search_files, check_fn=_check_file_reqs, emoji="🔎", max_result_size_chars=100_000)
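# The same pattern extends to additional tools. A hedged sketch; "word_count"
# and its handler are hypothetical, but the register() call mirrors the real
# ones above:
#   registry.register(name="word_count", toolset="file",
#                     schema=WORD_COUNT_SCHEMA, handler=_handle_word_count,
#                     check_fn=_check_file_reqs, emoji="🔢",
#                     max_result_size_chars=100_000)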