When a provider returns a 429 rate-limit error (not billing-related),
the auxiliary client's call_llm/async_call_llm previously did NOT trigger
the fallback chain. This caused auxiliary tasks like session_search to
exhaust all 3 retries against the same rate-limited endpoint, losing
session metadata that depended on the summarization completing.
Root cause: `_is_payment_error()` only matched 429s containing billing
keywords ("credits", "insufficient funds", etc.). Provider-specific
rate-limit messages like Nous's "Hold up for a bit, you've exceeded the
rate limit on your API key" didn't match, so `_is_payment_error` returned
False, `_is_connection_error` returned False, and `should_fallback` was
False — all retries hit the same rate-limited provider.
Fix:
- New `_is_rate_limit_error()` function that detects 429 + rate-limit
keywords, generic 429 without billing keywords, and OpenAI SDK
`RateLimitError` class instances (which may omit .status_code).
- Updated `should_fallback` in both `call_llm` and `async_call_llm` to
include `_is_rate_limit_error`.
- Updated the max_tokens retry path to also check for rate-limit errors.
- Updated the reason string to include "rate limit".
This complements the Nous rate guard (PR #10568) which prevents new calls
to Nous when already rate-limited — this fix handles the case where a
request is already in flight when the 429 arrives.
Related: #8023, #12554, #11034
Co-authored-by: Zeejay <zjtan1@gmail.com>
Salvages @Es1la's PR #13632 — a non-numeric timestamp in the persisted
feishu dedup state crashed adapter startup with ValueError/TypeError
from the unguarded float() call. Wrap the float() conversion in
try/except; skip the bad key and keep loading the rest.
The original PR also restructured existing TestDedupTTL tests to use
tempfile.TemporaryDirectory + HERMES_HOME patching — that was
test-hygiene scope creep unrelated to the bug. Kept only the
malformed-timestamp fix and added a focused regression test.
OpenRouter's dashboard attributes usage via the `X-Title` header.
Hermes was sending `X-OpenRouter-Title`, which OpenRouter does not
recognize, so Hermes usage showed up unlabeled. Rename to `X-Title`
to match the canonical header (already used elsewhere in the same
file via _AI_GATEWAY_HEADERS).
Salvages the core fix from @JTroyerOvermatch's PR #13649. Dropped the
PR's `HERMES_OPENROUTER_TITLE` / `HERMES_OPENROUTER_REFERER` env-var
override plumbing per the '.env is for secrets only' policy — if
per-deployment attribution is needed later it should go under
`openrouter.title` / `openrouter.referer` in config.yaml instead.
WhatsApp bridge (bridge.js) only sets ptt:true when file extension is .ogg
or .opus, causing mp3/wav files (from Edge TTS, NeuTTS, etc.) to arrive
as file attachments instead of voice bubbles — silently, with no error.
Fix: when audio type is sent with a non-ogg/opus format, run ffmpeg
conversion to ogg/opus in a temp file before sending. This makes
send_voice() self-sufficient regardless of what format the caller provides.
Fallback: if ffmpeg is unavailable, original buffer is sent (previous
behaviour) with a console.warn — no crash.
Addresses veloguardian's review comment on PR #4992.
ACP's save_session() did a non-atomic clear_messages() + append_message()
loop. If any message hit an exception mid-loop (bad tool_call shape, etc.),
the DELETE had already committed and the persisted conversation was lost.
SessionDB.replace_messages() wraps DELETE + bulk INSERT in a single
BEGIN IMMEDIATE transaction that rolls back on any exception, so a bad
message can no longer clobber previously-persisted history.
Salvages @Awsh1's PR #13675 — uses the existing replace_messages()
helper (which covers more message fields than the PR's own copy)
instead of adding a duplicate.
Feishu post-type 'md' elements do not render markdown tables.
When table content is sent as post (triggered by **bold** matching
_MARKDOWN_HINT_RE), the message appears blank on the client.
Add _MARKDOWN_TABLE_RE to detect markdown table syntax and force
text mode for table content, ensuring it is visible as plain text.
After PR #13725 replaced the module-level _LOCK_DIR/_LOCK_FILE constants
with a dynamic _get_lock_paths() helper, the xdist-isolation fixture
needs to patch the function instead of the removed constants.
- scheduler.py: Replace static _hermes_home with dynamic _get_hermes_home() function
to support profile switching at runtime (HERMES_HOME override)
- scheduler.py: Replace static _LOCK_DIR/_LOCK_FILE with _get_lock_paths() function
for profile-aware lock path resolution
- feishu.py: Add receive_id_type detection (oc_/ou_ -> open_id, else chat_id)
to fix Feishu API '[230001] ext=invalid receive_id' error for user DMs
Workers completing a kanban task can now claim the ids of cards they
created via an optional ``created_cards`` field on ``kanban_complete``.
The kernel verifies each id exists and was created by the completing
worker's profile; any phantom id blocks the completion with a
``HallucinatedCardsError`` and records a
``completion_blocked_hallucination`` event on the task so the rejected
attempt is auditable. Successful completions also get a non-blocking
prose-scan pass over their ``summary`` + ``result`` that emits a
``suspected_hallucinated_references`` event for any ``t_<hex>``
reference that doesn't resolve.
Closes#20017.
Recovery UX (kernel + CLI + dashboard)
--------------------------------------
A structural gate alone isn't enough — operators also need to see and
act on stuck workers, especially when a profile's model is the root
cause. This PR ships the full loop:
* ``kanban_db.reclaim_task(task_id)`` — operator-driven reclaim that
releases an active worker claim immediately (unlike
``release_stale_claims`` which only acts after claim_expires has
passed). Emits a ``reclaimed`` event with ``manual: True`` payload.
* ``kanban_db.reassign_task(task_id, profile, reclaim_first=...)`` —
switch a task to a different profile, optionally reclaiming a stuck
running worker in the same call.
* ``hermes kanban reclaim <id> [--reason ...]`` and
``hermes kanban reassign <id> <profile> [--reclaim] [--reason ...]``
CLI subcommands wired through to the same helpers.
* ``POST /api/plugins/kanban/tasks/{id}/reclaim`` and
``POST /api/plugins/kanban/tasks/{id}/reassign`` endpoints on the
dashboard plugin.
Dashboard surfacing
-------------------
* ⚠ **warning badge** on cards with active hallucination events.
* **attention strip** at the top of the board listing all flagged
tasks; dismissible per session.
* **events callout** in the task drawer — hallucination events render
with a red left border, amber icon, and phantom ids as styled chips.
* **recovery section** in the task drawer with three actions: Reclaim,
Reassign (with profile picker + reclaim-first checkbox), and a
copy-to-clipboard hint for ``hermes -p <profile> model`` since
profile config lives on disk and can't be edited from the browser.
Auto-opens when the task has warnings, collapsed otherwise.
Keyed by task id so state doesn't leak between drawers.
Active-vs-stale rule: warnings clear when a clean ``completed`` or
``edited`` event supersedes the hallucination, so recovery is never
permanently stigmatising — the audit events persist for debugging but
the badge goes away once the worker succeeds.
Skill updates
-------------
* ``skills/devops/kanban-worker/SKILL.md`` documents the
``created_cards`` contract with good/bad examples.
* ``skills/devops/kanban-orchestrator/SKILL.md`` gains a "Recovering
stuck workers" section with the three actions and when to use each.
Tests
-----
* Kernel gate: verified-cards manifest, phantom rejection + audit
event, cross-worker rejection, prose scan positive + negative.
* Recovery helpers: reclaim on running task, reclaim on non-running
returns False, reassign refuses running without reclaim_first,
reassign with reclaim_first succeeds on running.
* API endpoints: warnings field present on /board and /tasks/:id,
warnings cleared after clean completion, reclaim 200 + 409 paths,
reassign 200 + 409 + reclaim_first paths.
* CLI smoke: reclaim + reassign subcommands.
Live-verified end-to-end on a dashboard with seeded scenarios:
attention strip renders, badges land on the right cards, drawer
callout shows phantom chips, Reclaim on a running task flips status to
ready + emits manual reclaimed event + refreshes the drawer,
Reassign swaps the assignee and triggers board refresh.
359/359 kanban-suite tests pass
(test_kanban_{db,cli,boards,core_functionality} + dashboard + tools).
* revert(gateway): remove stale-code self-check and auto-restart
Removes the _detect_stale_code / _trigger_stale_code_restart mechanism
introduced in #17648 and iterated in #19740. On every incoming message
the gateway compared the boot-time git HEAD SHA to the current SHA on
disk, and if they differed it would reply with
Gateway code was updated in the background --
restarting this gateway so your next message runs
on the new code. Please retry in a moment.
and then kick off a graceful restart. This is unwanted behaviour:
users who run a long-lived gateway and do their own ad-hoc git
operations on the checkout end up with their chat interrupted and
the current message dropped every time HEAD moves, with no way to
opt out.
If an operator really needs the old protection against stale
sys.modules after "hermes update", the SIGKILL-survivor sweep in
hermes update (hermes_cli/main.py, also tagged #17648) already
handles the supervisor-respawn case on its own.
Removed:
gateway/run.py:
- _STALE_CODE_SENTINELS, _GIT_SHA_CACHE_TTL_SECS
- _read_git_head_sha(), _compute_repo_mtime() module helpers
- class-level _boot_wall_time / _boot_repo_mtime / _boot_git_sha /
_stale_code_restart_triggered defaults
- __init__ boot-snapshot block (_boot_*, _cached_current_sha*,
_repo_root_for_staleness, _stale_code_notified)
- _current_git_sha_cached(), _detect_stale_code(),
_trigger_stale_code_restart() methods
- stale-code check + user-facing restart notice at the top of
_handle_message()
tests/gateway/test_stale_code_self_check.py (deleted, 412 lines)
No new logic added. Zero remaining references to any removed
symbol. Gateway test suite passes the same 4589 tests it passed
before; the 3 pre-existing unrelated failures (discord free-channel,
feishu bot admission, teams typing) are unchanged by this commit.
* feat(i18n): add display.language for static message translation (zh/ja/de/es)
Adds a thin-slice i18n layer covering the highest-impact static user-facing
messages: the CLI dangerous-command approval prompt and a handful of gateway
slash-command replies (restart-drain, goal cleared, approval expired, config
read/save errors).
Out of scope (stays English): agent responses, log lines, tool outputs,
slash-command descriptions, error tracebacks.
Infrastructure:
- agent/i18n.py: catalog loader, t() helper, language resolution
(HERMES_LANGUAGE env var > display.language config > en)
- locales/{en,zh,ja,de,es}.yaml: ~19 translated strings per language
- display.language in DEFAULT_CONFIG (hermes_cli/config.py)
Tests:
- tests/agent/test_i18n.py: 21 tests covering catalog parity, placeholder
parity across locales, fallback behavior, env-var override, alias
normalization, missing-key graceful degradation.
Docs:
- website/docs/user-guide/configuration.md: display.language entry plus a
short section explaining scope so users don't expect agent responses to
translate via this knob.
* docs(AGENTS.md): add curator/cron/delegation/toolsets, fix plugin tree, frontmatter, auto-discovery caveat
Closes#19101 and #19107 (@pty819).
Verified 16 claims from those two issues against current main. 12 were
real gaps; 2 were generated/hallucinated (#10 unverified --now flag is
actually real and already cited in AGENTS.md; #11 stale PR refs #5587
and #4950 do not appear in AGENTS.md at all); 2 were low-prio nits
(memory provider hierarchy, --now scope enumeration) deferred.
Changes:
- Project tree: add yuanbao to platforms comment; expand plugins/
subtree with real directory names (kanban, hermes-achievements,
observability, image_gen) instead of vague '<others>'.
- Test-count blurb: 15k/700 Apr → 17k/900 May (verified: 17,375 test
defs, 915 files).
- Adding New Tools: clarify that auto-discovery wires up schemas but
the tool only reaches an agent if its name is added to a toolset in
toolsets.py. _HERMES_CORE_TOOLS is not dead code.
- Adding Configuration: enumerate top-level config.yaml sections
including auxiliary and curator; note auxiliary is per-task
overrides for side-LLM work.
- SKILL.md frontmatter: add author, license, related_skills. Note
top-level tags/category are mirrored from metadata.hermes.*.
- New section 'Toolsets' — enumerates the 30 current TOOLSETS keys
(including yuanbao, kanban, moa, spotify, safe, debugging).
- New section 'Delegation (delegate_task)' — sync semantics, batch
mode, leaf vs orchestrator roles, config knobs, durability caveat.
- New section 'Curator (skill lifecycle)' — core files, 11 CLI verbs,
telemetry sidecar, invariants (pin/delete split after PR #20220,
bundled/hub off-limits), curator.* config section.
- New section 'Cron (scheduled jobs)' — 4 schedule formats, 7 CLI
verbs, per-job fields, 3-min hard interrupt, catchup/grace windows,
tick.lock, cron→session isolation.
Skipped (invalid claims):
- #19107 item 10: --now is real (hermes_cli/skills_hub.py:624/966/1013/1470)
- #19107 item 11: no '#5587' or '#4950' or 'async_delegation' in AGENTS.md
* docs(AGENTS.md): add Kanban section
Adds a Kanban entry alongside Curator / Cron / Delegation so the major
durable background systems are all represented. Covers the CLI verbs,
the HERMES_KANBAN_TASK-gated worker toolset, the in-gateway dispatcher,
plugin assets, and the board/tenant isolation model. Points at the full
742-line user docs for detail.
Strip bracketed-paste control sequences from setup prompt input so pasted API keys work on Linux and WSL terminals, and add regression tests for normal/password prompts.
Closes#16491
Each auxiliary model must be resolved with its own provider so that
provider-specific paths (e.g. Bedrock static table, OpenRouter API)
are invoked for the correct client, not inherited from the main model.
When the main model is Bedrock, passing self.provider unconditionally
to get_model_context_length() for the aux model caused the Bedrock
static table hard-intercept (step 1b) to fire for non-Bedrock models,
returning BEDROCK_DEFAULT_CONTEXT_LENGTH=128K instead of the model's
real context window — triggering a false compression warning every session.
Fix: pass _aux_cfg_provider when explicitly set, falling back to
self.provider only when the aux provider is unset or "auto".
Closes#12977
Related: #13807, #17460
Widens @Krionex's PR #16933 fix to cover the second bug class at the sibling
site. natural mode used to pass env values through int() before the PR
caught mis-typed values crashing the gateway; custom mode had the exact
same bug one branch away (HERMES_HUMAN_DELAY_MIN_MS=oops in custom mode
still crashed). Same try/except/fallback pattern, scoped to the two
int() calls that feed random.uniform().
When auxiliary.<task> config has base_url set but api_key is empty
(common when user expects env var fallback), _resolve_task_provider_model()
returned provider="custom" with api_key=None. This caused downstream
client construction to make API calls without an Authorization header,
resulting in HTTP 401 errors.
Fix: only return "custom" when BOTH cfg_base_url AND cfg_api_key are
non-empty. When base_url is set without api_key but with a known
provider (e.g. "openrouter"), pass through to that provider so it can
resolve credentials from environment variables.
Fixes#16829
When context compression rotates the agent's session_id to a new
child session, the API server was still returning the stale parent
session_id in the X-Hermes-Session-Id response header.
This caused external clients to keep sending the old session_id,
loading uncompressed parent history instead of the compressed
continuation.
Fix: _run_agent() now includes the effective session_id in its
result dict, and the response header uses it instead of the
original provided session_id.
hermes config set model.aliases.xxx commands write to the model.aliases
nested key, but _load_direct_aliases() only read from the top-level
model_aliases key. This meant aliases set via hermes config set were
invisible to the /model command, and unrecognised inputs fell through
to the DeepSeek normaliser which mapped everything to deepseek-chat.
Add a second pass in _load_direct_aliases() that reads model.aliases
and converts string-value entries (provider/model format) into
DirectAlias objects. The provider is parsed from the slash prefix;
if no slash, the current default provider from config is used.
Also prevent simple aliases from overriding explicit model_aliases
dict entries when both exist.
Copilot review on PR #17012 noted the docstring/comment lists `0`
among the falsy effort values that fall back to `medium`, but the
existing regression tests only cover `None` and `""`. Add the third
case to lock in the full contract.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
auxiliary.<task>.extra_body.reasoning, but the new translation path in
_CodexCompletionsAdapter.create() reads the effort with
``reasoning_cfg.get("effort", "medium")``. That returns the configured
value verbatim when the key is present, so ``effort: null`` /
``effort: ""`` (both common YAML shapes) flow through as
``{"effort": null, "summary": "auto"}`` and Codex rejects the request
with "Invalid value for parameter ``reasoning.effort``".
agent/transports/codex.py::build_kwargs() — which the new adapter is
documented to mirror — uses a truthy check (``elif
reasoning_config.get("effort"):``) so the same falsy values keep the
"medium" default. Switch the auxiliary adapter to the same
``or "medium"`` truthy form so identical config produces identical
requests on both paths.
- [x] Two new regression tests cover ``effort: None`` and
``effort: ""`` and assert the request goes out as
``{"effort": "medium", "summary": "auto"}``.
- [x] Old behaviour fails the new tests (``{'effort': None} !=
{'effort': 'medium'}``); fixed behaviour passes all 11 tests in the
``TestCodexAdapterReasoningTranslation`` class.
- [x] Adjacent suites green: ``tests/agent/test_auxiliary_client.py``
(108 passed) and ``tests/agent/transports/test_codex_transport.py +
test_chat_completions.py`` (73 passed).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>