Each AIAgent.__init__() was unconditionally starting a daemon thread to
pre-warm the OpenRouter model metadata cache. In gateway mode a new
AIAgent is created for every incoming message, so one OS thread leaked
per request. After ~1 000 messages the process hit the Linux thread
limit and raised RuntimeError: can't start new thread for all subsequent
requests.
Add a module-level threading.Event (_openrouter_prewarm_done) that is
set before the thread is started. Subsequent AIAgent instantiations
skip the spawn entirely; fetch_model_metadata() is cached for 1 hour so
the single background call is sufficient.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
PR #15027 (5 days ago) shipped TELEGRAM_GROUP_ALLOWED_USERS as a chat-ID
allowlist. #17686 correctly renames that to sender user IDs and moves
chat IDs to TELEGRAM_GROUP_ALLOWED_CHATS. Without a shim, any user on
PR #15027's guidance would silently start rejecting group traffic on
upgrade.
- gateway/run.py: in _is_user_authorized, if TELEGRAM_GROUP_ALLOWED_USERS
contains values starting with '-' (chat-ID-shaped), honor them as chat
IDs and log a one-shot deprecation warning pointing users at the new
TELEGRAM_GROUP_ALLOWED_CHATS var.
- tests/gateway/test_unauthorized_dm_behavior.py: three new tests cover
legacy chat-ID values authorizing the listed chat, not crossing to
other chats, and mixed sender/chat values in the same var.
- website/docs/user-guide/messaging/telegram.md: rewrite the Group
Allowlisting section to document the new user/chat split + migration
note. Remove stale '/thread_id' suffix claim (code never parsed it).
- website/docs/reference/environment-variables.md: document all three
Telegram allowlist env vars.
Salvage-follow-up to @shannonsands's /reload-skills PR. Trims the feature to
match the design: user-initiated rescan, no prompt-cache reset, no new
schema surface, no phantom user turn, and the next-turn note carries each
added/removed skill's 60-char description (not just its name).
Changes vs the original PR:
* Drop the in-process skills prompt-cache clear in reload_skills(). Skills
are invoked at runtime via /skill-name, skills_list, or skill_view —
they don't need to live in the system prompt for the model to use them.
Keeping the cache intact preserves prefix caching across the reload so
/reload-skills pays no cache-reset cost. (MCP has to break the cache
because tool schemas must be known at conversation start; skills do not.)
* Drop the skills_reload agent tool and SKILLS_RELOAD_SCHEMA from
tools/skills_tool.py, plus the four skills_reload enumerations in
toolsets.py. No new schema surface — agents can already see a freshly-
installed skill via skill_view / skills_list the moment it's on disk.
* Replace the phantom 'role: user' turn injection with a one-shot queued
note. CLI uses self._pending_skills_reload_note (same pattern as
_pending_model_switch_note, prepended to the next API call and cleared).
Gateway uses self._pending_skills_reload_notes[session_key]. The note
is prepended to the NEXT real user message in this session, so message
alternation stays intact and nothing out-of-band is persisted to the
transcript.
* reload_skills() now returns added/removed as
[{'name': str, 'description': str}, ...] (description truncated to 60
chars — matches the curator / gateway adapter budget). The injected
next-turn note formats each entry as 'name — description' so the model
can actually reason about which new skills to call without running
skills_list first.
* Only emit the note when the diff is non-empty. On empty diff, print
'No new skills detected' and do nothing else.
* Tests rewritten to cover the queue semantics, the description payload,
and a regression guard that the prompt-cache snapshot is preserved.
Adds a public reload path for the in-process skill caches so newly
installed (or removed) skills become visible mid-session without a
gateway restart. Mirrors the shape of /reload-mcp.
Three surfaces:
* /reload-skills slash command — CLI (cli.py) and gateway (gateway/run.py),
with /reload_skills alias for Telegram autocomplete and an explicit
Discord registration.
* skills_reload agent tool (tools/skills_tool.py) — lets agents/subagents
pick up freshly-installed skills via tool call.
* agent.skill_commands.reload_skills() — shared helper that clears
_skill_commands, _SKILLS_PROMPT_CACHE (in-process LRU), and the
on-disk .skills_prompt_snapshot.json, then returns an added/removed
diff plus the new total count.
Tested:
* tests/agent/test_skill_commands_reload.py (9 cases)
* tests/cli/test_cli_reload_skills.py (3 cases)
* tests/gateway/test_reload_skills_command.py (4 cases)
Use case: NemoClaw / OpenShell-style sandboxed orchestrators that drop
skills into ~/.hermes/skills mid-session, plus agentic flows where the
agent itself installs a skill via the shell tool and needs it bound
without a gateway restart. The Python helper
clear_skills_system_prompt_cache(clear_snapshot=True) already exists
internally — this PR just exposes it via slash command and tool.
- SQL: add `model != ''` to both queries in /api/analytics/models so
sessions with empty-string model (pre-existing data integrity,
confirmed in production DB: ~107 sessions) no longer render as
blank-header cards.
- ModelsPage: drop the arbitrary slashIdx < 20 length gate in
shortModelName / modelProvider. The gate was fragile for longer
vendor prefixes (e.g. `deepseek-ai/...`). Strip on the first /
unconditionally. Rename modelProvider -> modelVendor to avoid
confusion with the billing provider column.
- scripts/release.py: add AUTHOR_MAP entry for yatesjalex.
- New /models page in left nav (after Analytics)
- New /api/analytics/models endpoint with per-model token/cost/session
breakdown, cache read/reasoning tokens, tool calls, avg tokens/session,
and capabilities from models.dev (vision/tools/reasoning/context window)
- Model cards with stacked token distribution bar, capability badges,
provider badges, cost info, and relative time
- Summary stats bar (models used, total tokens, est. cost, sessions)
- Period selector (7d/30d/90d) with refresh
- i18n support (en + zh)
Broad drift audit against origin/main (b52b63396).
Reference pages (most user-visible drift):
- slash-commands: add /busy, /curator, /footer, /indicator, /redraw, /steer
that were missing; drop non-existent /terminal-setup; fix /q footnote
(resolves to /queue, not /quit); extend CLI-only list with all 24
CLI-only commands in the registry
- cli-commands: add dedicated sections for hermes curator / fallback /
hooks (new subcommands not previously documented); remove stale
hermes honcho standalone section (the plugin registers dynamically
via hermes memory); list curator/fallback/hooks in top-level table;
fix completion to include fish
- toolsets-reference: document the real 52-toolset count; split browser
vs browser-cdp; add discord / discord_admin / spotify / yuanbao;
correct hermes-cli tool count from 36 to 38; fix misleading claim
that hermes-homeassistant adds tools (it's identical to hermes-cli)
- tools-reference: bump tool count 55 -> 68; add 7 Spotify, 5 Yuanbao,
2 Discord toolsets; move browser_cdp/browser_dialog to their own
browser-cdp toolset section
- environment-variables: add 40+ user-facing HERMES_* vars that were
undocumented (--yolo, --accept-hooks, --ignore-*, inference model
override, agent/stream/checkpoint timeouts, OAuth trace, per-platform
batch tuning for Telegram/Discord/Matrix/Feishu/WeCom, cron knobs,
gateway restart/connect timeouts); dedupe the Cron Scheduler section;
replace stale QQ_SANDBOX with QQ_PORTAL_HOST
User-guide (top level):
- cli.md: compression preserves last 20 turns, not 4 (protect_last_n: 20)
- configuration.md: display.platforms is the canonical per-platform
override key; tool_progress_overrides is deprecated and auto-migrated
- profiles.md: model.default is the config key, not model.model
- sessions.md: CLI/TUI session IDs use 6-char hex, gateway uses 8
- checkpoints-and-rollback.md: destructive-command list now matches
_DESTRUCTIVE_PATTERNS (adds rmdir, cp, install, dd)
- docker.md: the container runs as non-root hermes (UID 10000) via
gosu; fix install command (uv pip); add missing --insecure on the
dashboard compose example (required for non-loopback bind)
- security.md: systemctl danger pattern also matches 'restart'
- index.md: built-in tool count 47 -> 68
- integrations/index.md: 6 STT providers, 8 memory providers
- integrations/providers.md: drop fictional dashscope/qwen aliases
Features:
- overview.md: 9 image models (not 8), 9 TTS providers (not 5),
8 memory providers (Supermemory was missing)
- tool-gateway.md: 9 image models
- tools.md: extend common-toolsets list with search / messaging /
spotify / discord / debugging / safe
- fallback-providers.md: add 6 real providers from PROVIDER_REGISTRY
(lmstudio, kimi-coding-cn, stepfun, alibaba-coding-plan,
tencent-tokenhub, azure-foundry)
- plugins.md: Available Hooks table now includes on_session_finalize,
on_session_reset, subagent_stop
- built-in-plugins.md: add the 7 bundled plugins the page didn't
mention (spotify, google_meet, three image_gen providers, two
dashboard examples)
- web-dashboard.md: add --insecure and --tui flags
- cron.md: hermes cron create takes positional schedule/prompt, not
flags
Messaging:
- telegram.md: TELEGRAM_WEBHOOK_SECRET is now REQUIRED when
TELEGRAM_WEBHOOK_URL is set (gateway refuses to start without it
per GHSA-3vpc-7q5r-276h). Biggest user-visible drift in the batch.
- discord.md: HERMES_DISCORD_TEXT_BATCH_SPLIT_DELAY_SECONDS default
is 2.0, not 0.1
- dingtalk.md: document DINGTALK_REQUIRE_MENTION /
FREE_RESPONSE_CHATS / MENTION_PATTERNS / HOME_CHANNEL /
ALLOW_ALL_USERS that the adapter supports
- bluebubbles.md: drop fictional BLUEBUBBLES_SEND_READ_RECEIPTS env
var; the setting lives in platforms.bluebubbles.extra only
- qqbot.md: drop dead QQ_SANDBOX; add real QQ_PORTAL_HOST and
QQ_GROUP_ALLOWED_USERS
- wecom-callback.md: replace 'hermes gateway start' (service-only)
with 'hermes gateway' for first-time setup
Developer-guide:
- architecture.md: refresh tool/toolset counts (61/52), terminal
backend count (7), line counts for run_agent.py (~13.7k), cli.py
(~11.5k), main.py (~10.4k), setup.py (~3.5k), gateway/run.py
(~12.2k), mcp_tool.py (~3.1k); add yuanbao adapter, bump platform
adapter count 18 -> 20
- agent-loop.md: run_agent.py line count 10.7k -> 13.7k
- tools-runtime.md: add vercel_sandbox backend
- adding-tools.md: remove stale 'Discovery import added to
model_tools.py' checklist item (registry auto-discovery)
- adding-platform-adapters.md: mark send_typing / get_chat_info as
concrete base methods; only connect/disconnect/send are abstract
- acp-internals.md: ACP sessions now persist to SessionDB
(~/.hermes/state.db); acp.run_agent call uses
use_unstable_protocol=True
- cron-internals.md: gateway runs scheduler in a dedicated background
thread via _start_cron_ticker, not on a maintenance cycle; locking
is cross-process via fcntl.flock (Unix) / msvcrt.locking (Windows)
- gateway-internals.md: gateway/run.py ~12k lines
- provider-runtime.md: cron DOES support fallback (run_job reads
fallback_providers from config)
- session-storage.md: SCHEMA_VERSION = 11 (not 9); add migrations
10 and 11 (trigram FTS, inline-mode FTS5 re-index); add
api_call_count column to Sessions DDL; document messages_fts_trigram
and state_meta in the architecture tree
- context-compression-and-caching.md: remove the obsolete 'context
pressure warnings' section (warnings were removed for causing
models to give up early)
- context-engine-plugin.md: compress() signature now includes
focus_topic param
- extending-the-cli.md: _build_tui_layout_children signature now
includes model_picker_widget; add to default layout
Also fixed three pre-existing broken links/anchors the build warned
about (docker.md -> api-server.md, yuanbao.md -> cron-jobs.md and
tips#background-tasks, nix-setup.md -> #container-aware-cli).
Regenerated per-skill pages via website/scripts/generate-skill-docs.py
so catalog tables and sidebar are consistent with current SKILL.md
frontmatter.
docusaurus build: clean, no broken links or anchors.
Self-review caught several errors in the previous commit:
Frontmatter
- Replace non-standard `requires_runtime` / `requires_tooling` fields with
the documented `compatibility:` field (parsed by tools/skills_tool.py).
- Drop the `audit-v5` author tag I added unnecessarily.
MODEL_LOADERS catalog
- Remove `IPAdapterUnifiedLoader` (input `preset` is an enum, not a file).
- Remove `IPAdapterInsightFaceLoader` and `InsightFaceLoader` (input
`provider` is a GPU backend selector, not a model file). These would have
flagged enum values like "STANDARD" or "CUDA" as missing model files.
- Add "NB:" comment explaining `BasicGuider` has no `cfg` input
(the original PARAM_PATTERNS entry would never have matched).
- Remove `SamplerCustomAdvanced.noise_seed` from PARAM_PATTERNS — that
node takes a NOISE input from RandomNoise, not a seed field directly.
NODE_TO_PACKAGE registry slugs
- Verified all 18 packages against api.comfy.org and fixed:
- `comfyui-essentials` → `comfyui_essentials` (underscore, not hyphen)
- `comfyui-gguf` → `ComfyUI-GGUF` (case-sensitive)
- `comfyui-photomaker-plus` → `ComfyUI-PhotoMaker-Plus`
- `comfyui-wanvideowrapper` → `ComfyUI-WanVideoWrapper`
- ComfyUI-HunyuanVideoWrapper isn't on the registry; surface a git-URL
install hint via new NODE_TO_GIT_URL fallback so the user can install
via ComfyUI-Manager's /manager/queue/install endpoint.
Wrong class names
- `Canny` → `CannyEdgePreprocessor` (controlnet-aux registers the latter,
the former never appears in /object_info).
- Add `Zoe_DepthAnythingPreprocessor` and `AnimalPosePreprocessor` while
fixing controlnet-aux.
- Remove `Reroute (rgthree)` (rgthree's Reroute is JS-only — no Python
class, never appears in /object_info).
- Add `Display Int (rgthree)` (sibling of Display Any).
- Move `UltralyticsDetectorProvider` from `comfyui-impact-pack` to
`comfyui-impact-subpack` (separate package, registered there).
Tests
- Update test_packages_are_safe_for_shell to accept case-mixed slugs (the
registry uses both ComfyUI- and comfyui_ prefixes inconsistently). Replaced
the lowercase-only assertion with a shell-safe regex check.
- 117 tests still pass (105 unit + 8 cloud + 4 cross-host).
Attribution
- Add `SHL0MS@users.noreply.github.com` mapping to scripts/release.py
AUTHOR_MAP so check-attribution CI passes.
The audit of v4.1 surfaced ~70 issues across the five scripts and three
reference docs — most user-visible (silent file overwrites, status-error
misclassified as success, X-API-Key leaked to S3 on /api/view redirect,
Cloud endpoints that 404 because they were renamed). v5.0.0 fixes those
and fills the gaps that previously forced users to write their own glue
(WebSocket monitoring, batch/sweep, img2img upload helper, dep auto-fix,
log fetch, health check, example workflows).
Critical fixes
- run_workflow.py: poll_status now checks status_str==error BEFORE
completed:true, so a failed run no longer reports success
- run_workflow.py: download_output streams to disk via safe_path_join,
preserves server subfolder structure (no silent overwrites), and
retries with exponential backoff
- run_workflow.py: refuses to overwrite a link with a literal in
inject_params (would silently break wiring)
- _common.py: _StripSensitiveOnRedirectSession (subclasses
requests.Session.rebuild_auth) drops X-API-Key/Cookie on cross-host
redirects — fixes a real key-leak path through Cloud's signed-URL
download flow. Tested
- Cloud routing (verified live): /history → /history_v2,
/models/<f> → /experiment/models/<f>, plus folder aliases for the
unet ↔ diffusion_models and clip ↔ text_encoders rename
- check_deps.py: distinguishes 200/empty vs 404 folder_not_found vs
403 free-tier; emits concrete fix_command per missing dep
- extract_schema.py: prompt vs negative_prompt determined by tracing
KSampler.{positive,negative} connections (incl. through Reroute /
Primitive nodes) instead of meta-title heuristic; symmetric
duplicate-name resolution; cycle-safe trace_to_node
- hardware_check.py: multi-GPU pick-best, Apple variant detection,
Rosetta detection, WSL2, ROCm --json, disk-space check, optional
PyTorch probe; powershell preferred over deprecated wmic
- comfyui_setup.sh: prefers pipx → uvx → pip --user (with PEP-668
fallback); idempotent — skips relaunch if server already up;
configurable port/workspace; persistent log; SIGINT trap
New scripts
- run_batch.py — count or sweep (cartesian product), parallel up to
cloud tier limit
- ws_monitor.py — real-time WebSocket viewer; saves preview frames
- auto_fix_deps.py — runs comfy node install / model download for
whatever check_deps reports missing (with --dry-run)
- health_check.py — single command that runs the verification checklist
(comfy-cli + server + checkpoints + optional smoke test that cancels
itself to avoid burning compute)
- fetch_logs.py — pull traceback / status messages for a prompt_id
Coverage expansion
- Param patterns now cover Flux (BasicScheduler, BasicGuider,
RandomNoise, ModelSamplingFlux), SD3, Wan/Hunyuan/LTX video,
IPAdapter, rgthree, easy-use, AnimateDiff
- Embedding refs in CLIPTextEncode strings extracted as model deps
- ckpt_name / vae_name / lora_name / unet_name now controllable so
workflows can be retargeted per run
Examples
- workflows/{sd15,sdxl,flux_dev}_txt2img.json
- workflows/sdxl_{img2img,inpaint}.json
- workflows/upscale_4x.json
- workflows/{animatediff_video,wan_video_t2v}.json + README
Tests
- 117 tests (105 unit + 8 cloud integration + 4 cross-host security)
- Cloud tests auto-skip without COMFY_CLOUD_API_KEY; verified end-to-end
against live cloud API
Backwards compatibility
- All existing CLI flags continue to work; new behavior is opt-in
(--ws, --input-image, --randomize-seed, --flat-output, etc.)
Pull the top-level + chat parser construction out of main() into
hermes_cli/_parser.py so relaunch.py can introspect parser._actions to
discover which flags exist and whether they take values, instead of
maintaining a parallel hand-rolled (flag, takes_value) tuple list.
- _parser.py: build_top_level_parser() returns (parser, subparsers,
chat_parser); side-effect-free import.
- main.py: ~290 lines of inline parser construction collapsed to a
helper call. Other subparsers stay inline (dispatch is bound to
module-level cmd_* functions).
- _parser._inherited_flag(parser, ...): wraps parser.add_argument and
sets action.inherit_on_relaunch = True. Used in place of
parser.add_argument for the 25 flags (top-level + chat) that need to
carry over.
- _parser.PRE_ARGPARSE_INHERITED_FLAGS: holds --profile/-p, which
isn't on argparse (consumed earlier by main._apply_profile_override).
- relaunch.py: drops _CRITICAL_DESTS and _PRE_ARGPARSE_FLAGS; the table
builder now filters by getattr(action, 'inherit_on_relaunch', False).
- test_ignore_user_config_flags.py: brittle inspect.getsource grep
replaced with proper parser introspection.
- test_relaunch.py: introspection sanity tests added.
Salvaged from PR #17549; added top-level -t/--toolsets flag to
_parser.py so #17623 (fix(tui): honor launch toolsets) behavior is
preserved on current main.
Co-authored-by: ethernet <arilotter@gmail.com>
Extract all os.execvp('hermes', ...) calls into a utility so flags like
--tui, --dev, --profile, --model, --provider, et al. survive session
resume and post-setup relaunch.
- resolve_hermes_bin: prefers sys.argv[0] when callable, then PATH,
then falls back to '${sys.executable} -m hermes_cli.main' (fixes nix
run relaunches)
- build_relaunch_argv: allowlists critical flags so they carry over
- cmd_sessions browse now calls relaunch(['--resume', <id>])
- _apply_profile_override skips redundant work when HERMES_HOME is
already set (child inherits parent profile)
- setup.py replaces _resolve_hermes_chat_argv with relaunch_chat()
- added comprehensive tests for flag extraction and binary resolution
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
If a concurrent RPC mutates _sessions while session.delete is iterating
it (e.g. a parallel session.create on the thread pool), the bare except
swallowed the RuntimeError and let the delete proceed against a row
that may still be live. Snapshot via list(_sessions.values()) and
return an error when even that raises, instead of treating "couldn't
check" as "no active sessions."
Single-key confirm matches how the picker already accepts 1-9 to
resume — no separate y/n keymap to learn — and "press d again" is
self-documenting next to the cursor.
Pressing `d` on the highlighted row in the resume picker prompts
`delete? y/n`; `y` deletes the session (DB row + on-disk transcript
files), anything else cancels. The active session is excluded from
deletion server-side.
Adds a new `session.delete` JSON-RPC handler that wraps
`SessionDB.delete_session`, forwarding the per-profile `sessions/`
directory so transcripts get cleaned up alongside the row.
vision_analyze used Path('./temp_vision_images') — a relative path that
resolved against cwd. Under Docker the image's WORKDIR is /opt/hermes,
which is root-owned and only chmoded a+rX (read + traversal). Since
#5811 landed (run as non-root hermes UID 10000, Apr 12), remote-URL
vision calls fail with PermissionError on mkdir.
Switch to get_hermes_dir('cache/vision', 'temp_vision_images'): resolves
to $HERMES_HOME/cache/vision/ (= /opt/data/cache/vision/ in Docker —
the user-owned volume mount). Existing installs with the old dir keep
using it via the get_hermes_dir back-compat path; no migration needed.
Only site in the codebase that stored runtime files via Path('./...').
Reported via Discord: https://juick.com/i/p/3089079.jpg → Telegram →
gateway → [Errno 13] Permission denied: 'temp_vision_images'.
CI Tests workflow has been red on main for 40+ consecutive runs. This
commit recovers every failure visible in run 25130722163 (most recent
completed run prior to this PR).
Root causes, by group:
Test-mock drift after product landed (fix: update mocks)
- test_mcp_structured_content / test_mcp_dynamic_discovery (6 tests):
product added _rpc_lock (#02ae15222) and _schedule_tools_refresh
(#1350d12b0) without updating sibling test files. Install a real
asyncio.Lock inside the fake run-loop and patch at _schedule_tools_refresh.
- test_session.py: renamed normalize_whatsapp_identifier → canonical_
whatsapp_identifier upstream; keep a local alias so the legacy tests
keep working.
- test_run_progress_topics Slack DM test: PR #8006 made Slack default
tool_progress=off; explicitly set it to 'all' in the test fixture so
the progress-callback path still runs. Also read tool_progress_callback
at call time rather than freezing it in FakeAgent.__init__ — production
assigns it AFTER construction.
- test_tui_gateway_server session-create/close race: session.create now
defers _start_agent_build behind a 50ms timer — wait for the build
thread to enter _make_agent before closing, otherwise the orphan-
cleanup path never runs.
- test_protocol session.resume: product get_messages_as_conversation now
takes include_ancestors kwarg; accept **_kwargs in the test stub.
- test_copilot_acp_client redaction: redactor is OFF by default (snapshots
HERMES_REDACT_SECRETS at import); patch agent.redact._REDACT_ENABLED=True
for the duration of the test.
- test_minimax_provider: after #17171, dots in non-Anthropic model names
stay dots even with preserve_dots=False. Assert the new invariant
rather than the old 'broken for MiniMax' behavior.
- test_update_autostash: updater now scans `ps -A` for dashboard PIDs;
the test's catch-all subprocess.run stub needed stdout/stderr fields.
- test_accretion_caps: read_timestamps dict is populated lazily when
os.path.getmtime succeeds. Use .get("read_timestamps", {}) to tolerate
CI filesystems where the stat races file creation.
Change-detector tests (fix: rewrite as structural invariants)
- test_credential_sources_registry_has_expected_steps: was a frozen set
comparison that broke when minimax-oauth was added. Rewrite as an
invariant check (every step has description, no dupes, core steps
present) per AGENTS.md 'don't write change-detector tests'.
xdist ordering / test pollution (fix: reset state, use module-local patches)
- test_setup vercel: sibling test saved VERCEL_PROJECT_ID='project' to
os.environ via save_env_value() and never cleared it. monkeypatch.delenv
the VERCEL_* vars in the link-file test.
- test_clipboard TestIsWsl: GitHub Actions is on Azure VMs whose real
/proc/version often contains 'microsoft'. Patching builtins.open with
mock_open didn't reliably intercept hermes_constants.is_wsl's call in
xdist workers that had already cached _wsl_detected=True from an
earlier test. Patch hermes_constants.open directly and add
teardown_method to reset the cache after each test.
Pytest-asyncio cancellation hangs (fix: bound product await with timeout)
- test_session_split_brain_11016 (3 params) + test_gateway_shutdown
cancel-inflight: under pytest-asyncio 1.3.0, 'await task' and
'asyncio.gather(cancelled_tasks)' can stall for 30s when the cancelled
task's finally block awaits typing-task cleanup. Bound both with
asyncio.wait_for(..., timeout=5.0) and asyncio.shield — the stragglers
are released from adapter tracking and allowed to finish unwinding in
the background. This is also a legitimate hardening: a wedged finally
shouldn't stall the caller's dispatch or a gateway shutdown.
Orphan UI config (fix: merge tiny tab into messaging category)
- test_web_server test_no_single_field_categories: the telegram.reactions
config field lived in its own 'telegram' schema category with no
siblings. Fold it under 'discord' via _CATEGORY_MERGE so the dashboard
doesn't render an orphan single-field tab.
Local verification: 38/38 originally-failing tests pass; 4044/4044
gateway tests pass; 684/684 targeted subset (all 16 touched test files)
passes.
Route Option/Alt or Ctrl wheel input through a gated precision path that scrolls at most one row per short interval, while preserving the existing accelerated behavior for plain wheel input. Keep precision active briefly after modifier release so queued wheel events from the same gesture do not jump into acceleration mid-stream.
Decode Shift, Meta, and Ctrl bits from SGR and legacy X10 wheel event button bytes so TUI input handlers can distinguish modified wheel gestures from plain scrolling.
check_for_updates() looked at __file__.parent.parent for a .git dir to
diff against origin/main. A nix-built hermes lives in /nix/store with
no .git there, so the check fell through to whatever editable-install
dev checkout last populated ~/.hermes/.update_check, producing stale
"X commits behind" warnings right after a fresh `nix run --refresh`.
Embed the locked flake rev into the wrapper as HERMES_REVISION (only
on
clean builds — dirty refs don't represent any upstream commit). When
set, banner.py compares it to upstream main via `git ls-remote`
instead
of inspecting a local checkout, and the cache key includes the rev so
nix updates invalidate immediately. Without local history we can't
count commits, so the message is a plain "update available" with no
suggested command — nix users may install via `nix run`, profile,
system flake, or home-manager, and we don't know which.
Also bump web/package-lock.json npmDepsHash via `nix run
.#fix-lockfiles`.
* fix(tui): offload manual compaction RPC
Route TUI session compression through the existing long-handler pool so slow compaction does not block other gateway RPCs.
* fix(tui): show compaction progress immediately
Print a local status line before the compress RPC starts so slow manual compaction does not look like a no-op.
* feat(tui): rich /compress feedback parity with CLI
Show pre-compaction message count and rough token estimate immediately, emit a status update so the bottom bar reflects ongoing compaction, and report a multi-line summary (headline + token delta + optional note) using the shared summarize_manual_compression helper.
* fix(tui): show live compaction estimate in transcript
Mirror compression progress status into the transcript so users see the backend message count and token estimate while /compress is still running.
* fix(tui): single live compaction line with spinner glyph
Drop the redundant local "compressing context..." placeholder and prefix the live backend status line with a braille spinner glyph so /compress reads as a single in-progress row.
* fix(tui): address review nits on /compress feedback
Reuse the precomputed token estimate inside _compress_session_history so the gateway does not redo the O(n) work while holding history_lock, keep the status bar pinned during long manual compactions instead of auto-restoring after 4s, and drop the redundant noop bullet that doubled with the system role glyph.
* fix(tui): release history_lock during compaction LLM call
Move the snapshot/commit pattern into _compress_session_history so the lock is held only across the in-memory bookkeeping, not during agent._compress_context. Also emit a final neutral status update from session.compress so the pinned compressing indicator clears even on errors.
* fix(tui): rebuild prompt cleanly + sync session_key after compress
Pass system_message=None so AIAgent._compress_context rebuilds the system prompt without nesting the cached identity block. Reuse the handler's pre-snapshotted history inside _compress_session_history to avoid a second O(n) copy under the lock. After compaction, when AIAgent._compress_context rotates session_id, sync the gateway session_key, migrate approval notify + yolo state, restart the slash worker, and clear the stale pending title. Mirrors HermesCLI._manual_compress.
* Avoid /compress lock re-entry in slash side effects.
Stop pre-locking history before _compress_session_history in slash command mirroring, keep session-key sync parity with manual compression, and add a regression test that asserts /compress is invoked without holding history_lock.
* fix(tui): word-wrap composer input
Wrap composer input at word boundaries and anchor the good-vibes heart to the full composer row.
* test(tui): cover composer word wrap edge
Add regression coverage for moving the next word instead of splitting it at the composer edge.
* fix(tui): honor launch toolsets
Carry chat --toolsets through the TUI launcher so TUI sessions use the same per-session tool scope as the classic CLI.
* fix(tui): parse top-level toolsets flag
Allow top-level hermes --tui --toolsets to reach the implicit chat session, matching chat subcommand behavior.
* fix(tui): validate launch toolsets
Filter invalid HERMES_TUI_TOOLSETS entries and fall back to configured CLI toolsets when the override contains no valid toolsets.
* fix(tui): avoid config load for builtin toolsets
Honor built-in HERMES_TUI_TOOLSETS values before loading config and treat all/* as the all-toolsets sentinel.
* fix(cli): honor toolsets in oneshot mode
Forward top-level --toolsets into oneshot agent construction so the flag is not silently ignored outside the TUI path.
* fix(cli): validate oneshot toolsets
Reject invalid-only oneshot toolset overrides before output redirection and clarify TUI fallback warnings.
* fix(cli): preserve all-toolsets sentinel
Map explicit all/* oneshot toolset overrides to the all-toolsets sentinel and replace locals() checks in TUI toolset loading.
* fix(cli): warn on extra all-toolset entries
Warn when all/* toolset overrides include additional ignored entries so typos are still visible.
* fix(tui): honor plugin toolset overrides
Discover plugin toolsets before rejecting unresolved explicit toolset overrides and read raw config for MCP name validation.
* fix(tui): reuse toolset argument normalizer
Share top-level TUI toolset argument parsing with the oneshot path to avoid duplicate normalization logic.
* fix(cli): reject disabled mcp toolsets
Validate explicit toolset overrides against enabled MCP servers only and clarify top-level toolset flag help.
* fix(cli): distinguish disabled mcp from unknown toolsets
Report disabled MCP servers separately from unknown toolset entries and stub plugin discovery in invalid-name tests for determinism.
shutil.copytree from default ~/.hermes duplicated ~/.hermes/profiles into
the new profile, causing nested profiles/.../profiles/... and huge disk use.
Match export behavior (_DEFAULT_EXPORT_EXCLUDE_ROOT) by ignoring the sibling
profiles tree at the source root.
Made-with: Cursor
Intended placement per PR #17610 discussion — comfyui belongs in
skills/creative/ alongside other creative built-ins (touchdesigner-mcp,
pretext, sketch), not in optional-skills/.
Pure directory rename, no content changes. History preserved via git mv.
The skip_pre_tool_call_hook flag was added to prevent double-firing of
pre_tool_call when run_agent._invoke_tool pre-checks for a block
directive and then dispatches via handle_function_call. But the
implementation added an else: branch that fired invoke_hook again for
'observers', without noticing that get_pre_tool_call_block_message() in
hermes_cli.plugins already fires invoke_hook('pre_tool_call', ...) as
part of its block-directive poll.
Result: every tool call ran through the run_agent loop fired the hook
twice — reported by community users whose observer / audit plugins
logged each tool invocation twice with identical timestamps.
Fix: delete the else: branch. The single-fire contract is now:
- skip=False (direct handle_function_call): hook fires once inside
get_pre_tool_call_block_message().
- skip=True (run_agent._invoke_tool path): caller fires the hook
once via get_pre_tool_call_block_message(); handle_function_call
must not fire it again.
Tightened the existing skip-flag test (renamed to
test_skip_flag_prevents_double_fire) to assert pre_tool_call fires
zero times when skip=True, and added
test_run_agent_pattern_fires_pre_tool_call_exactly_once to lock in
end-to-end that the full block-check + dispatch sequence fires the
hook exactly once.
Adds Step 0 'Ask Local vs Cloud' as the very first onboarding step, with a
scripted question that spells out the hardware requirements for local
(6 GB VRAM NVIDIA, ROCm AMD on Linux, or M1+ Mac with 16 GB unified)
and routes Cloud users straight to Path A without a hardware check.
Hardware check becomes Step 1, run only when the user picked local.
Layers a programmatic hardware-feasibility check on top of the v4 skill
so the agent doesn't silently push users toward a local install they
can't actually run. The official comfy-cli supports --nvidia / --amd /
--m-series / --cpu, but has no guard against "4 GB laptop GPU on SDXL"
or "Intel Mac falling back to CPU" — both route to comfy-cli paths in
the original table and then fail on first workflow.
- scripts/hardware_check.py: detect OS/arch/GPU (NVIDIA nvidia-smi,
AMD rocm-smi, Apple M1+ via arm64+sysctl, Intel Arc via clinfo),
VRAM, system/unified RAM. Emits JSON
{verdict: ok|marginal|cloud, recommended_install_path, comfy_cli_flag}
with practical thresholds: discrete GPU >=6 GB VRAM minimum,
Apple Silicon >=16 GB unified memory minimum, Intel Mac -> cloud,
no accelerator -> cloud. comfy_cli_flag maps directly to
`comfy install` so the agent can stitch the whole flow together.
- scripts/comfyui_setup.sh: runs hardware_check.py first when no
explicit flag is passed. If verdict=cloud, refuses to install
locally, prints Comfy Cloud URL + an override command, exits 2.
Otherwise auto-selects the right --nvidia/--amd/--m-series flag
for `comfy install`. Surfaces marginal-verdict notes to the user.
- SKILL.md Setup & Onboarding: adds mandatory Step 0 "Check If This
Machine Can Run ComfyUI Locally" ahead of the Path A-E selection.
Documents the verdict thresholds inline, ties verdict + comfy_cli_flag
to the install paths, and updates the path-choice table so
"verdict: cloud" is the first row. Quick-Start "Detect Environment"
block extended to include the hardware check. Verification
checklist gains a hardware-check gate.
- Frontmatter setup.help rewritten to point at hardware_check.py
first. Version bumped 4.0.0 -> 4.1.0.