hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-28 06:51:16 +08:00

Author	SHA1	Message	Date
Siddharth Balyan	ef41d3bd45	feat(nix): declarative plugin installation for NixOS module (#15953 ) * feat(nix): parameterize dependency-groups in python.nix * refactor(nix): extract package to callPackage-able hermes-agent.nix Makes the package overridable via .override{} and adds extraPythonPackages parameter for PYTHONPATH injection. Includes build-time collision check using PEP 503 name canonicalization. * feat(nix): add overlay for external NixOS consumption External flakes can now add overlays = [ inputs.hermes-agent.overlays.default ] to get pkgs.hermes-agent with full .override support. * test(nix): add check for extraPythonPackages PYTHONPATH injection Verifies wrapper has PYTHONPATH when extras provided, and base package has no PYTHONPATH without extras. * feat(nix): add extraPlugins option for directory-based plugins Symlinks plugin packages into HERMES_HOME/plugins/ at activation time. Validates plugin.yaml presence. Asserts unique plugin names at eval time. Hermes discovers them automatically via its directory scan. * feat(nix): add extraPythonPackages option for entry-point plugins Overrides the hermes package with PYTHONPATH injection when extraPythonPackages is non-empty. Plugin .dist-info directories become visible to importlib.metadata for entry-point discovery. Works in both native systemd and container modes. * docs: add NixOS declarative plugin installation to nix-setup, plugins, and build-a-plugin guides - nix-setup.md: new Plugins section with extraPlugins/extraPythonPackages examples, overlay usage, collision checking note, options reference rows - plugins.md: Nix row in discovery table, NixOS declarative plugins section - build-a-hermes-plugin.md: Distribute for NixOS section after pip section * fix: address review feedback — remove unrelated umask, fix fetchFromGitHub naming, simplify checks - Remove accidentally introduced umask/migration changes (unrelated to plugins) - Add pluginName helper, fix fetchFromGitHub producing name='source' - Show name= in extraPlugins example docs - Simplify checks.nix: use hermes-agent.override instead of re-callPackage - Fix fragile grep shell logic in checks * refactor: address simplify feedback — lib.getName, drop unused inputs', Python list for extras - Use lib.getName instead of custom pluginName helper - Drop unused inputs' from checks.nix perSystem args - Pass extraPythonPackages as Python list literal instead of colon-split string * fix: walk propagatedBuildInputs for plugin PYTHONPATH and collision check Uses python312.pkgs.requiredPythonModules to resolve the full transitive closure of extraPythonPackages. Without this, a plugin with third-party deps (e.g. requests) would fail at runtime if those deps weren't already in the sealed uv2nix venv. The collision check now also scans the full closure, catching transitive conflicts. * cleanup: fold plugins into subdir loop, use find for symlink cleanup, inline lib.getName - Add 'plugins' to the existing cron/sessions/logs/memories subdir loop instead of a separate mkdir/chown/chmod block - Replace fragile for-glob with find -delete for stale symlink cleanup - Inline lib.getName at both call sites, remove pluginName wrapper	2026-04-28 00:18:32 +05:30
kshitijk4poor	56724147ef	fix(providers/gmi): post-salvage review fixes - config.py: remove dead ENV_VARS_BY_VERSION[17] entry (current _config_version is 22, so all users are past version 17 and would never be prompted for GMI_API_KEY on upgrade — consistent with how arcee was added) - auxiliary_client.py: use google/gemini-3.1-flash-lite-preview as GMI aux model instead of anthropic/claude-opus-4.6 (matches cheap fast-model pattern used by all other providers: zai→glm-4.5-flash, kimi→kimi-k2-turbo-preview, stepfun→step-3.5-flash, kilocode→google/gemini-3-flash-preview) - test_gmi_provider.py: fix malformed write_text() call in doctor test (was: write_text("GMI_API_KEY=* encoding="utf-8") → missing closing quote, wrote literal string 'GMI_API_KEY=* encoding=' to .env file) - test_gmi_provider.py + test_auxiliary_client.py: update aux model assertions to match new cheaper default - docs/integrations/providers.md: add 'gmi' to inline 'Supported providers' fallback list (was only in the table, not the inline list at line ~1181) - docs/reference/cli-commands.md: add 'gmi' to --provider choices list	2026-04-27 11:17:59 -07:00
Isaac Huang	c53fcb0173	feat(providers): add GMI Cloud as a first-class API-key provider (#11955 ) Add GMI Cloud (api.gmi-serving.com) as a full first-class API-key provider with built-in auth, aliases, model catalog, CLI entry points, auxiliary client routing, context length resolution, doctor checks, env var tracking, and docs. - auth.py: ProviderConfig for 'gmi' (api_key, GMI_API_KEY / GMI_BASE_URL) - providers.py: HermesOverlay with extra_env_vars for models.dev detection - models.py: curated slash-form model catalog; live /v1/models fetch - main.py: 'gmi' in _named_custom_provider_map and --provider choices - model_metadata.py: _URL_TO_PROVIDER, _PROVIDER_PREFIXES, dedicated context-length probe block (GMI's /models has authoritative data) - auxiliary_client.py: alias entries; _compat_model fix for slash-form models on cached aggregator-style clients; gmi aux default model - doctor.py: GMI in provider connectivity checks - config.py: GMI_API_KEY / GMI_BASE_URL in OPTIONAL_ENV_VARS - conftest.py: explicit GMI_BASE_URL clearing (not caught by _API_KEY suffix) - docs: providers.md, environment-variables.md, fallback-providers.md, configuration.md, quickstart.md (expands provider table) Co-authored-by: Isaac Huang <isaachuang@Isaacs-MacBook-Pro.local>	2026-04-27 11:17:59 -07:00
Teknium	65f648ee84	fix(website): auto-wrap ASCII-art code blocks in generated skill pages (#16497 ) Defensive: when the generator encounters a fenced code block containing Unicode box-drawing characters, wrap it in `<!-- ascii-guard-ignore -->` markers so the docs-site-checks lint (which scans inside code fences) can't reject the page for a skill's own diagram. Plain bash/python code blocks stay uncluttered — only blocks with box chars get wrapped. Skill authors no longer have to remember to add the ignore markers in every SKILL.md with ASCII art. Fixes #15305.	2026-04-27 03:38:39 -07:00
Teknium	c5781d50c7	fix(azure-foundry): auto-route gpt-5.x / codex / o-series to Responses API (#16361 ) Azure Foundry deploys GPT-5.x, codex-*, and o1/o3/o4 reasoning models as Responses-API-only. Calling /chat/completions against these deployments returns 400 'The requested operation is unsupported.', which broke any user who ran 'hermes model' on Azure, picked a gpt-5/codex deployment, and kept the default api_mode: chat_completions. Verified in a user debug bundle on 2026-04-26: gpt-5.3-codex failed on synopsisse.openai.azure.com with that exact payload while gpt-4o-pure on the same endpoint worked. Adds azure_foundry_model_api_mode(model_name) that returns codex_responses when the model name starts with gpt-5, codex, o1, o3, or o4 — otherwise None so chat_completions / anthropic_messages stay untouched for gpt-4o, Llama, Claude-via-Anthropic, etc. Resolver (both the direct Azure Foundry path and the pool-entry path) consults it and upgrades api_mode unless the user explicitly picked anthropic_messages. target_model (from /model mid-session switch) takes precedence over the persisted default so switching from gpt-4o to gpt-5.3-codex routes correctly before the next request. Docs: correct the azure-foundry guide which previously claimed Azure keeps gpt-5.x on chat completions — that was only true for early Azure OpenAI, not Azure Foundry codex/o-series deployments. Tests: 14 unit tests for azure_foundry_model_api_mode + 6 integration tests in TestAzureFoundryResolution covering Bob's exact scenario, target_model override, anthropic_messages guard, and o3-mini.	2026-04-26 21:33:31 -07:00
Teknium	235bfb192b	docs(skills): document URL install across features, reference, guide, and hermes-agent skill (#16355 ) Follow-up to #16323 — the UrlSource adapter is shipped but four user-facing docs surfaces still only listed the hub-identifier forms. - user-guide/features/skills.md: add ``url`` to the Supported-hub-sources table; add a new "#### 8. Direct URL (`url`)" section explaining scope (single-file SKILL.md only), name-resolution order (frontmatter → URL slug → interactive prompt → --name flag), and both TTY and non-interactive usage. Add two URL examples to the install-examples block near the top of the page. - reference/cli-commands.md: two URL install examples + one note explaining the name-resolution fallback chain. - guides/work-with-skills.md: one URL-install example alongside the existing hub-identifier examples. - skills/autonomous-ai-agents/hermes-agent/SKILL.md: Quick Reference block's ``hermes skills install`` line now spells out that ID can be a hub identifier OR a direct SKILL.md URL, and mentions --name for frontmatter-less skills. No code changes. No new dependencies. Website builds via the usual Docusaurus pipeline. Co-authored-by: teknium1 <teknium@noreply.github.com>	2026-04-26 21:27:59 -07:00
Teknium	478444c262	feat(checkpoints): auto-prune orphan and stale shadow repos at startup (#16303 ) Every working dir hermes ever touches gets its own shadow git repo under ~/.hermes/checkpoints/{sha256(abs_dir)[:16]}/. The per-repo _prune is a no-op (comment in CheckpointManager._prune says so), so abandoned repos from deleted/moved projects or one-off tmp dirs pile up forever. Field reports put the typical offender at 1000+ repos / ~12 GB on active contributor machines. Adds an opt-in startup sweep that mirrors the sessions.auto_prune pattern from #13861 / #16286: - tools/checkpoint_manager.py: new prune_checkpoints() and maybe_auto_prune_checkpoints() helpers. Deletes shadow repos that are orphan (HERMES_WORKDIR marker points to a path that no longer exists) or stale (newest in-repo mtime older than retention_days). Idempotent via a CHECKPOINT_BASE/.last_prune marker file so it only runs once per min_interval_hours regardless of how many hermes processes start up. - hermes_cli/config.py: new checkpoints.auto_prune / retention_days / delete_orphans / min_interval_hours knobs. Default auto_prune: false so users who rely on /rollback against long-ago sessions never lose data silently. - cli.py / gateway/run.py: startup hooks gated on checkpoints.auto_prune, called right next to the existing state.db maintenance block. - Docs updated with the new config knobs. - 11 regression tests: orphan/stale deletion, precedence, byte-freed tracking, non-shadow dir skip, interval gating, corrupt marker recovery. Refs #3015 (session-file disk growth was fixed in #16286; this covers the checkpoint side noted out-of-scope there).	2026-04-26 19:05:52 -07:00
Teknium	ab6879634e	yuanbao platform (#16298 ) Co-authored-by: loongzhao <loongzhao@tencent.com>	2026-04-26 18:50:49 -07:00
Teknium	7e3c8a31f0	feat(skills/airtable): tailor skill to Hermes idioms + expand cookbook Expand the airtable skill from bare CRUD to a full Hermes-shaped cookbook matching the linear/notion neighbors, and trim the description to fit the 60-char system-prompt cutoff. Hermes-specific additions: - Explicit 'use the terminal tool with curl — not web_extract or browser_navigate' guidance, matching the same note in linear. - Note that AIRTABLE_API_KEY flows from ~/.hermes/.env into the subprocess automatically via env_passthrough, so curl calls don't need to re-export it. - Prefer 'python3 -m json.tool' (always present) over jq (optional) for pretty-printing, with -s on every curl to keep output clean. - Read-before-write workflow that resolves record IDs via filterByFormula instead of guessing. Cookbook expansion (new vs original): - Field-type reference table (text, select, multi-select, attachment, linked record, user) with the exact write-shape Airtable expects. - typecast flag for auto-coercing values / auto-creating select options. - performUpsert PATCH for idempotent sync by merge field. - Batch create/delete endpoints (10-record cap per call). - Sort + fields query params with URL-encoding (%5B / %5D). - Named-view query that applies saved filter/sort server-side. - Full pagination loop template (while loop with offset). - Common filterByFormula patterns (exact match, contains, AND/OR, date comparison, NOT empty). - Rate-limit backoff guidance (Retry-After header, per-base budget). - Airtable error-code reference (AUTHENTICATION_REQUIRED, INVALID_PERMISSIONS, MODEL_ID_NOT_FOUND, INVALID_MULTIPLE_CHOICE_OPTIONS) so the agent can map failures to user-actionable fixes instead of just retrying. Also: description trimmed from 183 chars (truncated to 60 in system prompt, losing 'filter/upsert/delete' trigger terms) down to 59 chars that render whole: 'Airtable REST API via curl. Records CRUD, filters, upserts.' Catalog row updated to match. SKILL.md grew from 115 to 228 lines — still under the 500-line soft cap and below the linear skill (297 lines) which serves the same role for GraphQL.	2026-04-26 18:45:15 -07:00
Teknium	0bef0b9416	chore: docs + attribution for airtable skill - scripts/release.py: map sonoyuncudmr@gmail.com -> Sonoyunchu so the check-attribution CI job and release notes credit Soynchu correctly. - website/docs/reference/skills-catalog.md: add the airtable row to the productivity bundled-skills table.	2026-04-26 18:45:15 -07:00
mewwts	8fb861ea6e	feat(gateway/slack): support channel_skill_bindings Extends the existing channel_skill_bindings mechanism (previously Discord-only) to Slack, so a channel or DM can auto-load one or more skills at session start without relying on the model's skill selector for every short reply. Motivation: Mats's German flashcards DM pushes a cron-driven card 5x/day; he responds with one-word guesses like 'work'. Previously each reply required the main agent to decide whether to load german-flashcards (full opus turn just to pick a skill). With the binding configured per Slack channel, the skill is injected at session start and grading runs directly. Changes: - Extract resolve_channel_skills() from DiscordAdapter._resolve_channel_skills into gateway.platforms.base (now shared across adapters). - DiscordAdapter._resolve_channel_skills delegates to the shared helper (behavior preserved — existing test suite still passes unchanged). - SlackAdapter: resolve channel_skill_bindings on each message and attach auto_skill to MessageEvent. gateway/run.py already handles auto-skill injection on new sessions; this just wires Slack through it. - gateway/config.py: accept channel_skill_bindings in slack: block of config.yaml (was Discord-only). - Tests: new tests/gateway/test_slack_channel_skills.py with 11 cases covering DM/thread/parent resolution, single-vs-list skills, dedup, malformed entries. Discord suite unchanged. - Docs: add 'Per-Channel Skill Bindings' section to Slack user guide. Config example: slack: channel_skill_bindings: - id: "D0ATH9TQ0G6" skills: ["german-flashcards"]	2026-04-26 18:25:41 -07:00
Teknium	635253b918	feat(busy): add 'steer' as a third display.busy_input_mode option (#16279 ) Enter while the agent is busy can now inject the typed text via /steer — arriving at the agent after the next tool call — instead of interrupting (current default) or queueing for the next turn. Changes: - cli.py: keybinding honors busy_input_mode='steer' by calling agent.steer(text) on the UI thread (thread-safe), with automatic fallback to 'queue' when the agent is missing, steer() is unavailable, images are attached, or steer() rejects the payload. /busy accepts 'steer' as a fourth argument alongside queue/interrupt/status. - gateway/run.py: busy-message handler and the PRIORITY running-agent path both route through running_agent.steer() when the mode is 'steer', with the same fallback-to-queue safety net. Ack wording tells users their message was steered into the current run. Restart-drain queueing now also activates for 'steer' so messages aren't lost across restarts. - agent/onboarding.py: first-touch hint has a steer branch for both CLI and gateway. - hermes_cli/commands.py: /busy args_hint updated to include steer, and 'steer' is registered as a subcommand (completions). - hermes_cli/web_server.py: dashboard select widget offers steer. - hermes_cli/config.py, cli-config.yaml.example, hermes_cli/tips.py: inline docs updated. - website/docs/user-guide/cli.md + messaging/index.md: documented. - Tests: steer set/status path for /busy; onboarding hints; _load_busy_input_mode accepts steer; busy-session ack exercises steer success + two fallback-to-queue branches. Requested on X by @CodingAcct. Default is unchanged (interrupt).	2026-04-26 18:21:29 -07:00
Teknium	b16f9d438b	feat(telegram): send fresh finals for stale preview streams (port openclaw#72038) (#16261 ) Ports openclaw/openclaw#72038 to hermes-agent. Telegram's `editMessageText` preserves the original message timestamp, so a long-running streamed reply (reasoning models that take 60+ seconds to finish) would keep the first-token timestamp even after completion. Users can't tell how long a task actually took. When a preview message has been visible for >= 60s (configurable via `streaming.fresh_final_after_seconds`), finalize by sending a fresh message instead of editing in place, then best-effort delete the stale preview. Short previews still edit in place (the existing fast path). Implementation notes adapted from OpenClaw's TypeScript original: - `StreamConsumerConfig` gains `fresh_final_after_seconds` (default 0 = legacy edit-in-place). Gateway-level `StreamingConfig` defaults to 60. - `GatewayStreamConsumer` tracks `_message_created_ts` at first-send and checks it in `_send_or_edit` on `finalize=True`. New helpers `_should_send_fresh_final` + `_try_fresh_final`. - `BasePlatformAdapter` gains optional `delete_message(chat_id, message_id)` returning False by default. `TelegramAdapter` implements it via `_bot.delete_message`. - `gateway/run.py` only enables fresh-final for `Platform.TELEGRAM`; other platforms ignore the setting (they don't have the stale-edit timestamp problem or edit-then-read works cheaply). - Fallback to normal edit on any fresh-send failure — no user-visible regression if Telegram rate-limits a send or the message is gone. Tests: 15 new cases in tests/gateway/test_stream_consumer_fresh_final.py covering short/long previews, config plumbing, delete-support absent, send-failure fallback, __no_edit__ sentinel safety, and StreamingConfig round-trip. Co-authored-by: Hermes Agent <agent@nousresearch.com>	2026-04-26 17:26:37 -07:00
Zainan Victor Zhou	778fd1898e	fix(slack): surface attachment access diagnostics Translate Slack attachment failures into actionable user-facing notices instead of generic download errors. When a scope/auth/permission issue breaks attachment processing, the user sees: [Slack attachment notice] - Slack attachment access failed for photo.jpg. Missing scope: files:read. Update the Slack app scopes/settings and reinstall the app to the workspace. Two helpers do the translation: _describe_slack_api_error — handles SlackApiError responses (missing_scope, invalid_auth, file_not_found, access_denied, etc.) _describe_slack_download_failure — handles httpx.HTTPStatusError (401/403/404) and Slack-returns-HTML-sign-in fallbacks Wired into three existing call sites: - the Slack Connect files.info path (PR #11111) so scope errors surface instead of being logged as generic "files.info failed" - the image, audio, and document download paths so 401/403 and HTML-body responses translate into actionable notices Adjustment from original PR: dropped _probe_slack_file_access_issue, the proactive pre-download files.info probe. It added one extra Slack API call per attachment even on healthy ones, and overlapped with the existing files.info call from PR #11111. The post-failure translation path covers the same user-facing diagnostic value without the per-message tax. Also documents files:read scope more prominently in the Slack setup guide and troubleshooting table. Contributed back from https://github.com/xinbenlv/zn-hermes-agent. Closes #7015. Co-authored-by: xinbenlv <zzn+pa@zzn.im>	2026-04-26 12:47:43 -07:00
Teknium	541cd732e8	chore(models): drop deepseek from OpenRouter and Nous Portal curated picker lists (#16197 ) Removes deepseek/deepseek-v4-pro and deepseek/deepseek-v4-flash from OPENROUTER_MODELS and _PROVIDER_MODELS['nous'], then regenerates website/static/api/model-catalog.json so the hosted picker JSON drops them too. Direct-API deepseek provider support is unchanged.	2026-04-26 12:28:17 -07:00
Teknium	5b2c59559a	feat(terminal): collapse subagent task_ids to shared container (#16177 ) Before: delegate_task children each allocated their own terminal sandbox keyed by child task_id. Starting extra containers (or Modal sandboxes / Daytona workspaces) is expensive, and the subagent's work is invisible to the parent — files written by the child in its container don't exist in the parent's when the subagent returns. After: a single `_resolve_container_task_id` helper maps any tool-call task_id to "default" UNLESS an env override is registered for it. The parent agent and all delegate_task children therefore share one long-lived sandbox — installed packages, cwd, /workspace files, and /tmp scratch carry over freely between them. RL and benchmark environments (TerminalBench2, HermesSweEnv, ...) opt in to isolation via `register_task_env_overrides(task_id, {...})`; those task_ids survive the collapse and get their own sandbox, preserving the per-task Docker image behavior these benchmarks rely on. file_state / active-subagents registry / TUI events still key off the original child task_id, so the 'subagent wrote a file the parent read' warning and UI per-subagent panels keep working. Tradeoff: parallel delegate_task children (tasks=[...]) now share one bash/container. Concurrent cd, env-var mutations, and writes to the same path will collide. If that bites a specific workflow, the subagent can opt back into isolation via register_task_env_overrides. Applied at four lookup sites: - tools/terminal_tool.py terminal_tool() and get_active_env() - tools/file_tools.py _get_file_ops() and _get_live_tracking_cwd() - tools/code_execution_tool.py _get_or_create_environment() Docs: website/docs/user-guide/configuration.md updated to reflect the shared-container reality and document the RL/benchmark carve-out. Tests: tests/tools/test_shared_container_task_id.py (9 cases).	2026-04-26 11:55:02 -07:00
Teknium	087e74d4d7	feat(slack): register every gateway command as a native slash (Discord/Telegram parity) (#16164 ) Every command in COMMAND_REGISTRY (/btw, /stop, /model, /help, /new, /bg, /reset, ...) is now a first-class Slack slash command instead of a /hermes <subcommand>. Users get the same autocomplete-driven slash picker experience Slack users expect and that Discord and Telegram already provide. Previously Slack registered ONE native slash (/hermes) and split on the first word, so typing /btw in Slack's composer got 'couldn't find an app for /btw' because the workspace manifest never declared it. Changes - hermes_cli/commands.py: slack_native_slashes() + slack_app_manifest() generate a Slack manifest from the registry (canonical names + aliases + plugin commands), clamped to Slack's 50-slash cap with /hermes reserved as the catch-all. - gateway/platforms/slack.py: single regex matcher dispatches every registered slash to _handle_slash_command, which dispatches on command['command']. Legacy /hermes <subcommand> keeps working for backward compat with older workspace manifests. - hermes_cli/slack_cli.py + hermes_cli/main.py: new 'hermes slack manifest' command prints/writes a full manifest (display info, OAuth scopes, event subs, socket mode, slash commands) ready to paste into 'Create from manifest' or Features → App Manifest. - hermes_cli/setup.py: _setup_slack() now writes the manifest up-front and points users at the 'From an app manifest' flow; also offers to refresh the manifest on reconfigure for picking up new commands. - Tests: 14 new tests covering native-slash dispatch (/btw, /stop, /model), legacy /hermes <sub> compat, manifest structure, and telegram<->slack parity (every Telegram command must also register as a Slack slash). Existing /hermes-registration test updated to assert the new regex matches /hermes, /btw, /stop, /model, /help. - Docs: slack.md gains a 'Slash Commands' section + Option A manifest flow in Step 1; cli-commands.md documents 'hermes slack manifest'. Users pick up the new slashes by running 'hermes slack manifest --write' and pasting into Features → App Manifest → Edit in their Slack app config, then Save (Slack prompts for reinstall if scopes changed).	2026-04-26 11:38:32 -07:00
Teknium	9be83728a6	docs(docker-backend): clarify container is shared across sessions, not per-session (#16158 ) The Docker terminal-backend docs said 'each session starts a long-lived container', implying a fresh container per chat session. That hasn't been true for a while: for the top-level agent, task_id defaults to 'default' and the container is cached in _active_environments for the lifetime of the Hermes process. /new, /reset, and switching sessions all reuse the same container. Only delegate_task subagents and RL rollouts get isolated containers keyed by their own task_id.	2026-04-26 10:46:08 -07:00
Teknium	9397767513	chore(skills): remove empty feeds category (#16153 ) skills/feeds/ only contained a category-marker DESCRIPTION.md with no actual skills in it. Removing the directory and the 'feeds' -> 'Feeds' display-label mapping in website/scripts/extract-skills.py (the only other reference in the repo).	2026-04-26 10:44:56 -07:00
Teknium	42c076d349	feat(browser): auto-spawn local Chromium for LAN/localhost URLs in cloud mode (#16136 ) When a cloud browser provider (Browserbase / Browser-Use / Firecrawl) is configured, browser_navigate now transparently spawns a local Chromium sidecar for URLs whose host resolves to a private/loopback/LAN address (localhost, 127.0.0.1, 192.168.x.x, 10.x.x.x, .local, .lan, *.internal, ::1, 169.254.x.x). Public URLs continue to use the cloud provider in the same conversation. Previously, setting BROWSERBASE_API_KEY / cloud_provider: browserbase pinned the whole tool to cloud for the process — localhost URLs were either SSRF-blocked (default) or sent to Browserbase (where they 404'd because the cloud can't reach your LAN). Users who wanted 'cloud for public, local for localhost' had no way to express it short of toggling providers mid-session. Implementation uses a composite session key scheme: the bare task_id serves the cloud session, and a '{task_id}::local' sidecar serves the local Chromium. _last_active_session_key[task_id] tracks which of the two served the most recent nav so snapshot/click/fill/etc. hit the correct one. cleanup_browser(bare_task_id) reaps both. Feature is on by default. Opt out via: browser: auto_local_for_private_urls: false The cloud provider never sees private URLs. Post-redirect SSRF guard is preserved: redirects from public onto private addresses still block.	2026-04-26 09:57:58 -07:00
Teknium	06f81752ed	Revert "feat(kanban): durable multi-profile collaboration board (#16081 )" (#16098 ) This reverts commit `15937a6b46`.	2026-04-26 08:29:37 -07:00
Teknium	15937a6b46	feat(kanban): durable multi-profile collaboration board (#16081 ) New `hermes kanban` CLI subcommand + `/kanban` slash command + skills for worker and orchestrator profiles. SQLite-backed task board (~/.hermes/kanban.db) shared across all profiles on the host. Zero changes to run_agent.py, no new core tools, no tool-schema bloat. Motivation: delegate_task is a function call — sync fork/join, anonymous subagent, no resumability, no human-in-the-loop. Kanban is the durable shape needed for research triage, scheduled ops, digital twins, engineering pipelines, and fleet work. They coexist (workers may call delegate_task internally). What this adds - hermes_cli/kanban_db.py — schema, CAS claim, dependency resolution, dispatcher, workspace resolution, worker-context builder. - hermes_cli/kanban.py — 15-verb CLI surface and shared run_slash() entry point used by both CLI and gateway. - skills/devops/kanban-worker — how a profile should work a claimed task. - skills/devops/kanban-orchestrator — "you are a dispatcher, not a worker" template with anti-temptation rules. - /kanban slash command wired into cli.py and gateway/run.py. Bypasses the running-agent guard (board writes don't touch agent state), so /kanban unblock can free a stuck worker mid-conversation. - Design spec at docs/hermes-kanban-v1-spec.pdf — comparative analysis vs Cline Kanban, Paperclip, NanoClaw, Gemini Enterprise; 8 patterns; 4 user stories; implementation plan; concurrency correctness. - Docs: website/docs/user-guide/features/kanban.md, CLI reference updated, sidebar entry added. Architecture highlights - Three planes: control (user + gateway), state (board + dispatcher), execution (pool of profile processes). - Every worker is a full OS process, spawned as `hermes -p <profile>`. No in-process subagent swarms — solves NanoClaw's SDK-lifecycle failure class. - Atomic claim via SQLite CAS in a BEGIN IMMEDIATE transaction; stale claims reclaimed 15 min after their TTL expires. - Tenant namespacing via one nullable column — one specialist fleet can serve many businesses with data isolation by workspace path. Tests: 60 targeted tests (schema, CAS atomicity, dependency resolution, dispatcher, workspace kinds, tenancy, CLI + slash surface). All pass hermetic via scripts/run_tests.sh.	2026-04-26 08:24:26 -07:00
Teknium	7fa70b6c87	refactor: /btw is now an alias for /background (#16053 ) The ephemeral no-tools side-question variant of /btw confused users who expected 'by-the-way' to mean 'run this off to the side with tools' — they'd type /btw and get a toolless agent that couldn't do the work. /bg worked because it was /background with full tools. Collapse the two: /btw and /bg both alias to /background. One command, one behavior, no more gotchas about which variant has tools. Removed: - _handle_btw_command in cli.py and gateway/run.py - _run_btw_task + _active_btw_tasks state in gateway/run.py - prompt.btw JSON-RPC method + btw.complete event in tui_gateway - BtwStartResponse type + btw.complete case in ui-tui - Standalone /btw slash tree registration in Discord - Standalone btw CommandDef in hermes_cli/commands.py Updated: - background CommandDef aliases: (bg,) -> (bg, btw) - TUI session.ts: local btw handler merged into background - Docs and tips updated to describe /btw as a /background alias	2026-04-26 07:11:08 -07:00
Teknium	9a70260490	Revert "feat(onboarding): port first-touch hints to the TUI (#16054 )" (#16062 ) This reverts commit `ffd2621039`.	2026-04-26 06:31:37 -07:00
Teknium	ffd2621039	feat(onboarding): port first-touch hints to the TUI (#16054 ) PR #16046 added /busy and /verbose hints to the classic CLI and the gateway runner but skipped the Ink TUI (and therefore the dashboard /chat page, which embeds the TUI via PTY). This extends the same latch to the TUI with TUI-native wording. The TUI's busy-input model is not the /busy knob from the CLI — single Enter while busy auto-queues, double Enter on an empty line interrupts. The new busy-input hint teaches THAT gesture instead of telling the user to flip a config that does not apply. Changes: - agent/onboarding.py — add busy_input_hint_tui() + tool_progress_hint_tui() - tui_gateway/server.py — onboarding.claim JSON-RPC (Ink triggers busy hint on enqueue) + _maybe_emit_onboarding_hint helper hooked into _on_tool_complete for the 30s/tool_progress=all path. Same config.yaml latch so each hint fires at most once per install across CLI, gateway, and TUI combined. - ui-tui/src/gatewayTypes.ts — OnboardingClaimResponse + onboarding.hint event - ui-tui/src/app/createGatewayEventHandler.ts — render the hint event as sys() - ui-tui/src/app/useSubmission.ts — claim busy_input_prompt on first busy enqueue - tests/agent/test_onboarding.py — +3 cases for TUI hint shape - tests/tui_gateway/test_protocol.py — +4 cases for onboarding.claim - website/docs/user-guide/tui.md — new 'Interrupting and queueing' section explaining the TUI's double-Enter model and the hints Validation: scripts/run_tests.sh tests/agent/test_onboarding.py \ tests/tui_gateway/test_protocol.py \ tests/gateway/test_busy_session_ack.py -> 66 passed npm --prefix ui-tui run type-check -> clean npm --prefix ui-tui run lint -> clean npm --prefix ui-tui run build -> clean	2026-04-26 06:24:19 -07:00
Teknium	83c1c201f6	feat(onboarding): contextual first-touch hints for /busy and /verbose (#16046 ) Instead of a blocking first-run questionnaire, show a one-time hint the first time the user hits each behavior fork: 1. First message while the agent is working — appends a hint to the busy-ack explaining the /busy queue vs /busy interrupt knob, phrased to match the mode that was just applied (don't tell a queue-mode user to switch to queue). 2. First tool that runs for >= 30s in the noisiest progress mode (tool_progress: all) — prints a hint about /verbose to cycle display modes (all -> new -> off -> verbose). Gated on /verbose actually being usable on the surface: always shown on CLI; on gateway only shown when display.tool_progress_command is enabled. Each hint is latched in config.yaml under onboarding.seen.<flag>, so it fires exactly once per install across CLI, gateway, and cron, then never again. Users can wipe the section to re-see hints. New: - agent/onboarding.py — is_seen / mark_seen / hint strings, shared by both CLI and gateway. - onboarding.seen in DEFAULT_CONFIG (hermes_cli/config.py) and in load_cli_config defaults (cli.py). No _config_version bump — deep merge handles new keys. Wired: - gateway/run.py: _handle_active_session_busy_message appends the hint after building the ack. progress_callback tracks tool.completed duration and queues the tool-progress hint into the progress bubble. - cli.py: CLI input loop appends the busy-input hint on the first busy Enter; _on_tool_progress appends the tool-progress hint on the first >=30s tool completion. In-memory CLI_CONFIG is also updated so subsequent fires in the same process are suppressed immediately. All writes go through atomic_yaml_write and are wrapped in try/except so onboarding can never break the input/busy-ack paths.	2026-04-26 06:06:27 -07:00
Teknium	855366909f	feat(models): remote model catalog manifest for OpenRouter + Nous Portal (#16033 ) OpenRouter and Nous Portal curated picker lists now resolve via a JSON manifest served by the docs site, falling back to the in-repo snapshot when unreachable. Lets us update model lists without shipping a release. Live URL: https://hermes-agent.nousresearch.com/docs/api/model-catalog.json (source at website/static/api/model-catalog.json; auto-deploys via the existing deploy-site.yml GitHub Pages pipeline on every merge to main). Schema (v1) carries id + optional description + free-form metadata at manifest, provider, and model levels. Pricing and context length stay live-fetched via existing machinery (/v1/models endpoints, models.dev). Config (new model_catalog section, default enabled): model_catalog.url master manifest URL model_catalog.ttl_hours disk cache TTL (default 24h) model_catalog.providers.<name>.url optional per-provider override Fetch pipeline: in-process cache -> disk cache (fresh < TTL) -> HTTP fetch -> disk-cache-on-failure fallback -> in-repo snapshot as last resort. Never raises to callers; at worst returns the bundled list. Changes: - website/static/api/model-catalog.json initial manifest (35 OR + 31 Nous) - scripts/build_model_catalog.py regenerator from in-repo lists - hermes_cli/model_catalog.py fetch + validate + cache module - hermes_cli/models.py fetch_openrouter_models() + new get_curated_nous_model_ids() - hermes_cli/main.py, hermes_cli/auth.py Nous flows use the helper - hermes_cli/config.py model_catalog defaults - website/docs/reference/model-catalog.md + sidebars.ts - tests/hermes_cli/test_model_catalog.py 21 tests (validation, fetch success/failure, accessors, disabled, overrides, integration)	2026-04-26 05:46:43 -07:00
Teknium	59b56d445c	feat(hooks): add duration_ms to post_tool_call + transform_tool_result (#15429 ) Plugin hooks fired after a tool dispatch now receive an integer duration_ms kwarg measuring how long the tool's registry.dispatch() call took (time.monotonic() before/after). Inspired by Claude Code 2.1.119 which added the same field to PostToolUse hook inputs. Wire points: - model_tools.py: measure dispatch latency, pass duration_ms to invoke_hook("post_tool_call", ...) and invoke_hook("transform_tool_result", ...) - hermes_cli/hooks.py: include duration_ms in the synthetic payload used by 'hermes hooks test' and 'hermes hooks doctor' so shell-hook authors see the same shape at development time as runtime - shell hooks (agent/shell_hooks.py): no code change needed; _serialize_payload already surfaces non-top-level kwargs under payload['extra'], so duration_ms lands at extra.duration_ms for shell-hook scripts Plugin authors can now build latency dashboards, per-tool SLO alerts, and regression canaries without having to wrap every tool manually. Test: tests/test_model_tools.py::test_post_tool_call_receives_non_negative_integer_duration_ms E2E: real PluginManager + dispatch monkey-patched with a 50ms sleep, hook callback observes duration_ms=50 (int). Refs: https://code.claude.com/docs/en/changelog (2.1.119, Apr 23 2026)	2026-04-25 22:13:12 -07:00
Teknium	a55de5bcd0	feat(setup): auto-reconfigure on existing installs (#15879 ) Bare `hermes setup` on a returning user now drops straight into the full reconfigure wizard — every prompt shows the current value as its default, press Enter to keep or type a new value to change it. The returning-user menu is gone. Behavior: - First-time user: first-time wizard (unchanged) - Returning user, bare command: full reconfigure wizard (new default) - Returning user, `--quick`: only prompt for missing/unset items - Returning user, one section: `hermes setup model\|terminal\|gateway\|tools\|agent` - `--reconfigure`: preserved as backwards-compat alias (no-op since it's now default) The section functions already used current values as prompt defaults — this change just removes the extra click to get to them. The 'Quick Setup - configure missing items only' menu option is now exposed as the explicit `--quick` flag; it's the narrow case of filling in missing config (e.g. after a partial OpenClaw migration or when a required API key got cleared). Inspired by Mercury Agent's `mercury doctor` UX. Also removes: - RETURNING_USER_MENU_SECTION_KEYS (orphaned constant) - Two returning-user menu tests in test_setup_noninteractive.py (guarding behavior that no longer exists — covered by test_setup_reconfigure.py instead)	2026-04-25 22:02:02 -07:00
Teknium	7c50ed707c	docs(azure-foundry): add provider guide, env vars, release AUTHOR_MAP - New website/docs/guides/azure-foundry.md covering both OpenAI-style and Anthropic-style endpoints, auto-detection behaviour, gpt-5.x routing, /v1 stripping, api-version query forwarding, and the provider: anthropic + Azure URL alternative setup. - environment-variables.md picks up AZURE_FOUNDRY_API_KEY, AZURE_FOUNDRY_BASE_URL, AZURE_ANTHROPIC_KEY. - cli-commands.md includes azure-foundry in the provider choices list. - configuration.md lists azure-foundry among auxiliary-task providers. - sidebars.ts wires the new guide into the Guides section. - scripts/release.py AUTHOR_MAP entries for TechPrototyper, HangGlidersRule (noreply), and pein892 so the contributor-attribution CI check does not reject the salvage.	2026-04-25 18:48:43 -07:00
nerijusas	81e01f6ee9	fix(agent): preserve Codex message items for replay	2026-04-25 18:22:06 -07:00
Teknium	dc4d92f131	docs: embed tutorial videos on webhooks + auxiliary models pages (#15809 ) - webhooks.md: adds a Video Tutorial section under the intro with a responsive YouTube iframe (WNYe5mD4fY8). - configuration.md: adds a Video Tutorial subsection under Auxiliary Models with a responsive YouTube iframe (NoF-YajElIM). Both use a 16:9 aspect-ratio wrapper so the embeds scale cleanly on mobile. Verified with `npm run build` — MDX parses clean, no new warnings or broken links introduced.	2026-04-25 16:44:53 -07:00
Teknium	ea01bdcebe	refactor(memory): remove flush_memories entirely (#15696 ) The AIAgent.flush_memories pre-compression save, the gateway _flush_memories_for_session, and everything feeding them are obsolete now that the background memory/skill review handles persistent memory extraction. Problems with flush_memories: - Pre-dates the background review loop. It was the only memory-save path when introduced; the background review now fires every 10 user turns on CLI and gateway alike, which is far more frequent than compression or session reset ever triggered flush. - Blocking and synchronous. Pre-compression flush ran on the live agent before compression, blocking the user-visible response. - Cache-breaking. Flush built a temporary conversation prefix (system prompt + memory-only tool list) that diverged from the live conversation's cached prefix, invalidating prompt caching. The gateway variant spawned a fresh AIAgent with its own clean prompt for each finalized session — still cache-breaking, just in a different process. - Redundant. Background review runs in the live conversation's session context, gets the same content, writes to the same memory store, and doesn't break the cache. Everything flush_memories claimed to preserve is already covered. What this removes: - AIAgent.flush_memories() method (~248 LOC in run_agent.py) - Pre-compression flush call in _compress_context - flush_memories call sites in cli.py (/new + exit) - GatewayRunner._flush_memories_for_session + _async_flush_memories (and the 3 call sites: session expiry watcher, /new, /resume) - 'flush_memories' entry from DEFAULT_CONFIG auxiliary tasks, hermes tools UI task list, auxiliary_client docstrings - _memory_flush_min_turns config + init - #15631's headroom-deduction math in _check_compression_model_feasibility (headroom was only needed because flush dragged the full main-agent system prompt along; the compression summariser sends a single user-role prompt so new_threshold = aux_context is safe again) - The dedicated test files and assertions that exercised flush-specific paths What this renames (with read-time backcompat on sessions.json): - SessionEntry.memory_flushed -> SessionEntry.expiry_finalized. The session-expiry watcher still uses the flag to avoid re-running finalize/eviction on the same expired session; the new name reflects what it now actually gates. from_dict() reads 'expiry_finalized' first, falls back to the legacy 'memory_flushed' key so existing sessions.json files upgrade seamlessly. Supersedes #15631 and #15638. Tested: 383 targeted tests pass across run_agent/, agent/, cli/, and gateway/ session-boundary suites. No behavior regressions — background memory review continues to handle persistent memory extraction on both CLI and gateway.	2026-04-25 08:21:14 -07:00
Teknium	cf2fabc40f	docs(dashboard): document page-scoped plugin slots (#15662 ) Follow-up to PR #15658. The feature PR introduced page-scoped slots (<page>:top / <page>:bottom inside every built-in page) but only touched the Shell slots catalogue. Adds proper narrative coverage so plugin authors find the feature. Changes - extending-the-dashboard.md: - Frontmatter description + intro bullet now mention page-scoped slots - New TOC entry "Augmenting built-in pages (page-scoped slots)" - New dedicated subsection after "Replacing built-in pages" explaining the heavy-vs-light tradeoff, listing the pages that expose slots, and showing a worked manifest + IIFE example with tab.hidden: true - Cross-link from the tab.override section pointing readers to the lighter augmentation option - web-dashboard.md: - Bullet mentioning "page-scoped slots (inject widgets into built-in pages without overriding them)" Validation - TOC anchor "#augmenting-built-in-pages-page-scoped-slots" matches the generated heading slug - Code fences balanced (64, even) - Pre-existing docusaurus build errors (skills.json, api-server.md link) reproduce on bare main -- not introduced here	2026-04-25 06:59:24 -07:00
Teknium	af22421e87	feat(dashboard): page-scoped plugin slots for built-in pages (#15658 ) * fix(terminal): three-layer defense against watch_patterns notification spam Background processes that stack notify_on_complete=True with watch_patterns can flood the user with duplicate, delayed notifications — matches deliver asynchronously via the completion queue and continue arriving minutes after the process has exited. The docstring warning against this (PR #12113) has proven insufficient; agents still misuse the combination. Three layered defenses, each sufficient on its own: 1. Mutual exclusion (terminal_tool.py): When both flags are set on a background process, drop watch_patterns with a warning. notify_on_complete wins because 'let me know when it's done' is the more useful signal and fires exactly once. Extracted as _resolve_notification_flag_conflict() so the rule is testable in isolation. 2. Suppress-after-exit (process_registry.py): _check_watch_patterns() now bails the moment session.exited is True. Post-exit chunks (buffered reads draining after the process is gone) no longer produce notifications. This is the fix flagged as future work in session 20260418_020302_79881c. 3. Global circuit breaker (process_registry.py): Per-session rate limits don't catch the sibling-flood case — N concurrent processes can each stay under 8/10s and still collectively spam. New WATCH_GLOBAL_MAX_PER_WINDOW=15 cap trips a 30-second cooldown across ALL sessions, emits a single watch_overflow_tripped event, silently counts dropped events, and emits a watch_overflow_released summary when the cooldown ends. Also updates the tool schema + docstring to document the new behavior. Tests: 8 new tests covering all three fixes (suppress-after-exit x2, mutual-exclusion resolver x4, global breaker trip/cooldown/release x2). All 60 tests across test_watch_patterns.py, test_notify_on_complete.py, test_terminal_tool.py pass. Real-world trigger: self-inflicted in session 20260425_051924 — three concurrent hermes-sweeper review subprocesses each set watch_patterns= ['failed validation', 'errored'] AND notify_on_complete=True, then iterated over multiple items, producing enough matches per process to defeat the per-session cap while staying under the global cap that didn't yet exist. * fix(terminal): aggressive 1-per-15s watch_patterns rate limit + strike-3 promotion Per Teknium's direction, the watch_patterns rate limit is now much more aggressive and self-healing. ## New rule — per session - HARD cap: 1 watch-match notification per 15 seconds per process. - Any match arriving inside the cooldown window is dropped and counts as ONE strike for that window (many drops in the same window still = 1 strike). - After 3 consecutive strike windows, watch_patterns is permanently disabled for the session and the session is auto-promoted to notify_on_complete semantics — exactly one notification when the process actually exits. - A cooldown window that expires with zero drops resets the consecutive strike counter — healthy cadence is forgiven. ## Schema + docstring rewritten The tool schema description now gives the model explicit guidance: - notify_on_complete is 'the right choice for almost every long-running task' - watch_patterns is for RARE one-shot signals on LONG-LIVED processes - Do NOT use watch_patterns with loops/batch jobs — error patterns fire every iteration and will hit the strike limit fast - Mutual exclusion is stated on both parameter descriptions - 1/15s cooldown and 3-strike promotion are stated in the watch_patterns description so the model sees the contract every turn ## Removed - WATCH_MAX_PER_WINDOW (8/10s) and WATCH_OVERLOAD_KILL_SECONDS (45) — the new 1/15s limit subsumes both; keeping them would double-count. - _watch_window_hits / _watch_window_start / _watch_overload_since fields on ProcessSession. Replaced by _watch_last_emit_at / _watch_cooldown_until / _watch_strike_candidate / _watch_consecutive_strikes. ## Kept - Global circuit breaker across all sessions (15/10s → 30s cooldown) as a secondary safety net for concurrent siblings. Still valuable when 20 short-lived processes each fire once — none individually violates the per-session limit. - Suppress-after-exit guard. - Mutual exclusion resolver at the tool entry point. ## Tests - 6 new tests in TestPerSessionRateLimit covering: first match delivers, second in cooldown suppressed, multi-drop = single strike, 3 strikes disables + promotes, clean window resets counter, suppressed count carried to next emit. - Global circuit breaker tests rewritten to use fresh sessions instead of hacking removed per-window fields. - 50/50 watch_patterns + notify_on_complete tests pass. - 60/60 including test_terminal_tool.py pass. * feat(dashboard): page-scoped plugin slots for built-in pages Dashboard plugins can now inject components into specific built-in pages (Sessions, Analytics, Logs, Cron, Skills, Config, Env, Docs, Chat) without overriding the whole route. Previously, plugins could only: 1. Add new tabs (tab.path) 2. Replace whole built-in pages (tab.override) 3. Inject into global shell slots (header-, footer-, pre-main, ...) None of those let a plugin add a banner, card, or widget to an existing page. The new <page>:top / <page>:bottom slots close that gap, reusing the existing registerSlot() API. Changes - web/src/plugins/slots.ts: 18 new KNOWN_SLOT_NAMES entries (sessions:top, sessions:bottom, analytics:top, ..., chat:bottom), grouped under "Shell-wide" vs "Page-scoped" in the docblock - web/src/pages/*: each built-in page now renders <PluginSlot name="<page>:top" /> as the first child of its outer wrapper and <PluginSlot name="<page>:bottom" /> as the last child -- zero visual cost when no plugin registers - plugins/example-dashboard: registers a demo banner into sessions:top via registerSlot(), with matching slots entry in the manifest -- so freshly-setup users can see what page-scoped slots look like without writing any plugin code - website/docs: new "Page-scoped slots" table in the plugin authoring guide, with a worked example - tests/hermes_cli/test_web_server.py: round-trip test for colon-bearing slot names (sessions:top, analytics:bottom, ...) Validation - npm run build: clean (tsc -b + vite build, 2761 modules) - scripts/run_tests.sh tests/hermes_cli/test_web_server.py::TestDashboardPluginManifestExtensions: 5/5 pass	2026-04-25 06:55:35 -07:00
Teknium	e5647d7863	docs: consolidate dashboard themes and plugins into Extending the Dashboard (#15530 ) The web-dashboard.md and dashboard-plugins.md pages had overlapping, partial coverage of the theme and plugin systems. Themes were split across two pages; the plugin docs had a minimal manifest reference but no step-by-step guide, no slot catalog, and no theme+plugin demo. New: user-guide/features/extending-the-dashboard.md — single navigable reference for all three extension layers (themes, UI plugins, backend plugins). Includes: - Theme quick-start + full schema (palette, typography, layout, layout variants, assets, componentStyles, colorOverrides, customCSS) - Plugin quick-start + full schema (manifest, SDK, slots, tab.override, tab.hidden, backend routes, custom CSS) - 10-slot shell catalog with locations - Plugin discovery + load lifecycle - Combined theme+plugin walkthrough (Strike Freedom cockpit demo) - API reference + troubleshooting web-dashboard.md: trimmed to core tool docs (pages, REST API, CORS, development). Theme/plugin content now points to the new page with a built-in themes summary table. dashboard-plugins.md: deleted (merged into extending-the-dashboard.md). sidebars.ts: swap 'dashboard-plugins' → 'extending-the-dashboard' under the Management group. No user-facing behavior change; docs-only.	2026-04-24 23:26:51 -07:00
MorAlekss	0ed37c0ca4	docs(delegate): document max_concurrent_children and max_spawn_depth + cost warning	2026-04-24 20:38:58 -07:00
Julia Bennet	1dcf79a864	feat: add slash command for busy input mode	2026-04-24 15:15:26 -07:00
Allard	0bcbc9e316	docs(faq): Update docs on backups - update faq answer with new `backup` command in release 0.9.0 - move profile export section together with backup section so related information can be read more easily - add table comparison between `profile export` and `backup` to assist users if understanding the nuances between both	2026-04-24 15:14:08 -07:00
Austin Pickett	c61547c067	Merge pull request #14890 from NousResearch/bb/tui-web-chat-unified feat(web): dashboard Chat tab — xterm.js + JSON-RPC sidecar (supersedes #12710 + #13379)	2026-04-24 10:35:43 -07:00
Austin Pickett	850fac14e3	chore: address copilot comments	2026-04-24 12:51:04 -04:00
Austin Pickett	5500b51800	chore: fix lint	2026-04-24 12:32:10 -04:00
Keira Voss	10deb1b87d	fix(gateway): canonicalize WhatsApp identity in session keys Hermes' WhatsApp bridge routinely surfaces the same person under either a phone-format JID (60123456789@s.whatsapp.net) or a LID (…@lid), and may flip between the two for a single human within the same conversation. Before this change, build_session_key used the raw identifier verbatim, so the bridge reshuffling an alias form produced two distinct session keys for the same person — in two places: 1. DM chat_id — a user's DM sessions split in half, transcripts and per-sender state diverge. 2. Group participant_id (with group_sessions_per_user enabled) — a member's per-user session inside a group splits in half for the same reason. Add a canonicalizer that walks the bridge's lid-mapping-*.json files and picks the shortest/numeric-preferred alias as the stable identity. build_session_key now routes both the DM chat_id and the group participant_id through this helper when the platform is WhatsApp. All other platforms and chat types are untouched. Expose canonical_whatsapp_identifier and normalize_whatsapp_identifier as public helpers. Plugins that need per-sender behaviour (role-based routing, per-contact authorization, policy gating) need the same identity resolution Hermes uses internally; without a public helper, each plugin would have to re-implement the walker against the bridge's internal on-disk format. Keeping this alongside build_session_key makes it authoritative and one refactor away if the bridge ever changes shape. _expand_whatsapp_aliases stays private — it's an implementation detail of how the mapping files are walked, not a contract callers should depend on.	2026-04-24 07:55:55 -07:00
emozilla	f49afd3122	feat(web): add /api/pty WebSocket bridge to embed TUI in dashboard Exposes hermes --tui over a PTY-backed WebSocket so the dashboard can embed the real TUI rather than reimplement its surface. The browser attaches xterm.js to the socket; keystrokes flow in, PTY output bytes flow out. Architecture: browser <Terminal> (xterm.js) │ onData ───► ws.send(keystrokes) │ onResize ► ws.send('\x1b[RESIZE:cols;rows]') │ write ◄── ws.onmessage (PTY bytes) ▼ FastAPI /api/pty (token-gated, loopback-only) ▼ PtyBridge (ptyprocess) ── spawns node ui-tui/dist/entry.js ──► tui_gateway + AIAgent Components ---------- hermes_cli/pty_bridge.py Thin wrapper around ptyprocess.PtyProcess: byte-safe read/write on the master fd via os.read/os.write (not PtyProcessUnicode — ANSI is inherently byte-oriented and UTF-8 boundaries may land mid-read), non-blocking select-based reads, TIOCSWINSZ resize, idempotent SIGHUP→SIGTERM→SIGKILL teardown, platform guard (POSIX-only; Windows is WSL-supported only). hermes_cli/web_server.py @app.websocket("/api/pty") endpoint gated by the existing _SESSION_TOKEN (via ?token= query param since browsers can't set Authorization on WS upgrades). Loopback-only enforcement. Reader task uses run_in_executor to pump PTY bytes without blocking the event loop. Writer loop intercepts a custom \x1b[RESIZE:cols;rows] escape before forwarding to the PTY. The endpoint resolves the TUI argv through a _resolve_chat_argv hook so tests can inject fake commands without building the real TUI. Tests ----- tests/hermes_cli/test_pty_bridge.py — 12 unit tests: spawn, stdout, stdin round-trip, EOF, resize (via TIOCSWINSZ + tput readback), close idempotency, cwd, env forwarding, unavailable-platform error. tests/hermes_cli/test_web_server.py — TestPtyWebSocket adds 7 tests: missing/bad token rejection (close code 4401), stdout streaming, stdin round-trip, resize escape forwarding, unavailable-platform ANSI error frame + 1011 close, resume parameter forwarding to argv. 96 tests pass under scripts/run_tests.sh. (cherry picked from commit `29b337bca7`) feat(web): add Chat tab with xterm.js terminal + Sessions resume button (cherry picked from commit `3d21aee8` by emozilla, conflicts resolved against current main: BUILTIN_ROUTES table + plugin slot layout) fix(tui): replace OSC 52 jargon in /copy confirmation When the user ran /copy successfully, Ink confirmed with: sent OSC52 copy sequence (terminal support required) That reads like a protocol spec to everyone who isn't a terminal implementer. The caveat was a historical artifact — OSC 52 wasn't universally supported when this message was written, so the TUI honestly couldn't guarantee the copy had landed anywhere. Today every modern terminal (including the dashboard's embedded xterm.js) handles OSC 52 reliably. Say what the user actually wants to know — that it copied, and how much — matching the message the TUI already uses for selection copy: copied 1482 chars (cherry picked from commit `a0701b1d5a`) docs: document the dashboard Chat tab AGENTS.md — new subsection under TUI Architecture explaining that the dashboard embeds the real hermes --tui rather than rewriting it, with pointers to the pty_bridge + WebSocket endpoint and the rule 'never add a parallel chat surface in React.' website/docs/user-guide/features/web-dashboard.md — user-facing Chat section inside the existing Web Dashboard page, covering how it works (WebSocket + PTY + xterm.js), the Sessions-page resume flow, and prerequisites (Node.js, ptyprocess, POSIX kernel / WSL on Windows). (cherry picked from commit `2c2e32cc45`) feat(tui-gateway): transport-aware dispatch + WebSocket sidecar Decouples the JSON-RPC dispatcher from its I/O sink so the same handler surface can drive multiple transports concurrently. The PTY chat tab already speaks to the TUI binary as bytes — this adds a structured event channel alongside it for dashboard-side React widgets that need typed events (tool.start/complete, model picker state, slash catalog) that PTY can't surface. - `tui_gateway/transport.py` — `Transport` protocol + `contextvars` binding + module-level `StdioTransport` fallback. The stdio stream resolves through a lambda so existing tests that monkey-patch `_real_stdout` keep passing without modification. - `tui_gateway/ws.py` — WebSocket transport implementation; FastAPI endpoint mounting lives in hermes_cli/web_server.py. - `tui_gateway/server.py`: - `write_json` routes via session transport (for async events) → contextvar transport (for in-request writes) → stdio fallback. - `dispatch(req, transport=None)` binds the transport for the request lifetime and propagates it to pool workers via `contextvars.copy_context` so async handlers don't lose their sink. - `_init_session` and the manual-session create path stash the request's transport so out-of-band events (subagent.complete, etc.) fan out to the right peer. `tui_gateway.entry` (Ink's stdio handshake) is unchanged externally — it falls through every precedence step into the stdio fallback, byte- identical to the previous behaviour. feat(web): ChatSidebar — JSON-RPC sidecar next to xterm.js terminal Composes the two transports into a single Chat tab: ┌─────────────────────────────────────────┬──────────────┐ │ xterm.js / PTY (emozilla #13379) │ ChatSidebar │ │ the literal hermes --tui process │ /api/ws │ └─────────────────────────────────────────┴──────────────┘ terminal bytes structured events The terminal pane stays the canonical chat surface — full TUI fidelity, slash commands, model picker, mouse, skin engine, wide chars all paint inside the terminal. The sidebar opens a parallel JSON-RPC WebSocket to the same gateway and renders metadata that PTY can't surface to React chrome: • model + provider badge with connection state (click → switch) • running tool-call list (driven by tool.start / tool.progress / tool.complete events) • model picker dialog (gateway-driven, reuses ModelPickerDialog) The sidecar is best-effort. If the WS can't connect (older gateway, network hiccup, missing token) the terminal pane keeps working unimpaired — sidebar just shows the connection-state badge in the appropriate tone. - `web/src/components/ChatSidebar.tsx` — new component (~270 lines). Owns its GatewayClient, drives the model picker through `slash.exec`, fans tool events into a capped tool list. - `web/src/pages/ChatPage.tsx` — split layout: terminal pane (`flex-1`) + sidebar (`w-80`, `lg+` only). - `hermes_cli/web_server.py` — mount `/api/ws` (token + loopback guards mirror /api/pty), delegate to `tui_gateway.ws.handle_ws`. Co-authored-by: emozilla <emozilla@nousresearch.com> refactor(web): /clean pass on ChatSidebar + ChatPage lint debt - ChatSidebar: lift gw out of useRef into a useMemo derived from a reconnect counter. React 19's react-hooks/refs and react-hooks/ set-state-in-effect rules both fire when you touch a ref during render or call setState from inside a useEffect body. The counter-derived gw is the canonical pattern for "external resource that needs to be replaceable on user action" — re-creating the client comes from bumping `version`, the effect just wires + tears down. Drops the imperative `gwRef.current = …` reassign in reconnect, drops the truthy ref guard in JSX. modelLabel + banner inlined as derived locals (one-off useMemo was overkill). - ChatPage: lazy-init the banner state from the missing-token check so the effect body doesn't have to setState on first run. Drops the unused react-hooks/exhaustive-deps eslint-disable. Adds a scoped no-control-regex disable on the SGR mouse parser regex (the \\x1b is intentional for xterm escape sequences). All my-touched files now lint clean. Remaining warnings on web/ belong to pre-existing files this PR doesn't touch. Verified: vitest 249/249, ui-tui eslint clean, web tsc clean, python imports clean. chore: uptick fix(web): drop ChatSidebar tool list — events can't cross PTY/WS boundary The /api/pty endpoint spawns `hermes --tui` as a child process with its own tui_gateway and _sessions dict; /api/ws runs handle_ws in-process in the dashboard server with a separate _sessions dict. Tool events fire on the child's gateway and never reach the WS sidecar, so the sidebar's tool.start/progress/complete listeners always observed an empty list. Drop the misleading list (and the now-orphaned ToolCall primitive), keep model badge + connection state + model picker + error banner — those work because they're sidecar-local concerns. Surfacing tool calls in the sidebar requires cross-process forwarding (PTY child opens a back-WS to the dashboard, gateway tees emits onto stdio + sidecar transport) — proper feature for a follow-up. feat(web): wire ChatSidebar tool list to PTY child via /api/pub broadcast The dashboard's /api/pty spawns hermes --tui as a child process; tool events fire in the python tui_gateway grandchild and never crossed the process boundary into the in-process WS sidecar — so the sidebar tool list was always empty. Cross-process forwarding: - tui_gateway: TeeTransport (transport.py) + WsPublisherTransport (event_publisher.py, sync websockets client). entry.py installs the tee on _stdio_transport when HERMES_TUI_SIDECAR_URL is set, mirroring every dispatcher emit to a back-WS without disturbing Ink's stdio handshake. - hermes_cli/web_server.py: new /api/pub (publisher) + /api/events (subscriber) endpoints with a per-channel registry. /api/pty now accepts ?channel= and propagates the sidecar URL via env. start_server also stashes app.state.bound_port so the URL is constructable. - web/src/pages/ChatPage.tsx: generates a channel UUID per mount, passes it to /api/pty and as a prop to ChatSidebar. - web/src/components/ChatSidebar.tsx: opens /api/events?channel=, fans tool.start/progress/complete back into the ToolCall list. Restores the ToolCall primitive. Tests: 4 new TestPtyWebSocket cases cover channel propagation, broadcast fan-out, and missing-channel rejection (10 PTY tests pass, 120 web_server tests overall). fix(web): address Copilot review on #14890 Five threads, all real: - gatewayClient.ts: register `message`/`close` listeners BEFORE awaiting the open handshake. Server emits `gateway.ready` immediately after accept, so a listener attached after the open promise could race past the initial skin payload and lose it. - ChatSidebar.tsx: wire `error`/`close` on the /api/events subscriber WS into the existing error banner. 4401/4403 (auth/loopback reject) surface as a "reload the page" message; mid-stream drops surface as "events feed disconnected" with the existing reconnect button. Clean unmount closes (1000/1001) stay silent. - web-dashboard.md: install hint was `pip install hermes-agent[web]` but ptyprocess lives in the `pty` extra, not `web`. Switch to `hermes-agent[web,pty]` in both prerequisite blocks. - AGENTS.md: previous "never add a parallel React chat surface" guidance was overbroad and contradicted this PR's sidebar. Tightened to forbid re-implementing the transcript/composer/PTY terminal while explicitly allowing structured supporting widgets (sidebar / model picker / inspectors), matching the actual architecture. - web/package-lock.json: regenerated cleanly so the wterm sibling workspace paths (extraneous machine-local entries) stop polluting CI. Tests: 249/249 vitest, 10/10 PTY/events, web tsc clean. refactor(web): /clean pass on ChatSidebar events handler Spotted in the round-2 review: - Banner flashed on clean unmount: `ws.close()` from the effect cleanup fires `close` with code 1005, opened=true, neither 1000 nor 1001 — hit the "unexpected drop" branch. Track `unmounting` in the effect scope and gate the banner through a `surface()` helper so cleanup closes stay silent. - DRY the duplicated "events feed disconnected" string into a local const used by both the error and close handlers. - Drop the `opened` flag (no longer needed once the unmount guard is the source of truth for "is this an expected close?").	2026-04-24 10:51:49 -04:00
Teknium	1840c6a57d	feat(spotify): wire setup wizard into 'hermes tools' + document cron usage (#15180 ) A — 'hermes tools' activation now runs the full Spotify wizard. Previously a user had to (1) toggle the Spotify toolset on in 'hermes tools' AND (2) separately run 'hermes auth spotify' to actually use it. The second step was a discovery gap — the docs mentioned it but nothing in the TUI pointed users there. Now toggling Spotify on calls login_spotify_command as a post_setup hook. If the user has no client_id yet, the interactive wizard walks them through Spotify app creation; if they do, it skips straight to PKCE. Either way, one 'hermes tools' pass leaves Spotify toggled on AND authenticated. SystemExit from the wizard (user abort) leaves the toolset enabled and prints a 'run: hermes auth spotify' hint — it does NOT fail the toolset toggle. Dropped the TOOL_CATEGORIES env_vars list for Spotify. The wizard handles HERMES_SPOTIFY_CLIENT_ID persistence itself, and asking users to type env var names before the wizard fires was UX-backwards — the point of the wizard is that they don't HAVE a client_id yet. B — Docs page now covers cron + Spotify. New 'Scheduling: Spotify + cron' section with two working examples (morning playlist, wind-down pause) using the real 'hermes cron add' CLI surface (verified via 'cron add --help'). Covers the active-device gotcha, Premium gating, memory isolation, and links to the cron docs. Also fixed a stale '9 Spotify tools' reference in the setup copy — we consolidated to 7 tools in #15154. Validation: - scripts/run_tests.sh tests/hermes_cli/test_tools_config.py tests/hermes_cli/test_spotify_auth.py tests/tools/test_spotify_client.py → 54 passed - website: node scripts/prebuild.mjs && npx docusaurus build → SUCCESS, no new warnings	2026-04-24 07:24:28 -07:00
Teknium	e5d41f05d4	feat(spotify): consolidate tools (9→7), add spotify skill, surface in hermes setup (#15154 ) Three quality improvements on top of #15121 / #15130 / #15135: 1. Tool consolidation (9 → 7) - spotify_saved_tracks + spotify_saved_albums → spotify_library with kind='tracks'\|'albums'. Handler code was ~90 percent identical across the two old tools; the merge is a behavioral no-op. - spotify_activity dropped. Its 'now_playing' action was a duplicate of spotify_playback.get_currently_playing (both return identical 204/empty payloads). Its 'recently_played' action moves onto spotify_playback as a new action — history belongs adjacent to live state. - Net: each API call ships 2 fewer tool schemas when the Spotify toolset is enabled, and the action surface is more discoverable (everything playback-related is on one tool). 2. Spotify skill (skills/media/spotify/SKILL.md) Teaches the agent canonical usage patterns so common requests don't balloon into 4+ tool calls: - 'play X' = one search, then play by URI (not search + scan + describe + play) - 'what's playing' = single get_currently_playing (no preflight get_state chain) - Don't retry on '403 Premium required' or '403 No active device' — both require user action - URI/URL/bare-ID format normalization - Full failure-mode reference for 204/401/403/429 3. Surfaced in 'hermes setup' tool status Adds 'Spotify (PKCE OAuth)' to the tool status list when auth.json has a Spotify access/refresh token. Matches the homeassistant pattern but reads from auth.json (OAuth-based) rather than env vars. Docs updated to reflect the new 7-tool surface, and mention the companion skill in the 'Using it' section. Tests: 54 passing (client 22, auth 15, tools_config 35 — 18 = 54 after renaming/replacing the spotify_activity tests with library + recently_played coverage). Docusaurus build clean.	2026-04-24 06:14:51 -07:00
Teknium	9be17bb84f	docs(spotify): expand feature page with tool reference, Free/Premium matrix, troubleshooting (#15135 ) The initial Spotify docs page shipped in #15130 was a setup guide. This expands it into a full feature reference: - Per-tool parameter table for all 9 tools, extracted from the real schemas in tools/spotify_tool.py (actions, required/optional args, premium gating). - Free vs Premium feature matrix — which actions work on which tier, so Free users don't assume Spotify tools are useless to them. - Active-device prerequisite called out at the top; this is the #1 cause of '403 no active device' reports for every Spotify integration. - SSH / headless section explaining that browser auto-open is skipped when SSH_CLIENT/SSH_TTY is set, and how to tunnel the callback port. - Token lifecycle: refresh on 401, persistence across restarts, how to revoke server-side via spotify.com/account/apps. - Example prompt list so users know what to ask the agent. - Troubleshooting expanded: no-active-device, Premium-required, 204 now_playing, INVALID_CLIENT, 429, 401 refresh-revoked, wizard not opening browser. - 'Where things live' table mapping auth.json / .env / Spotify app. Verified with 'node scripts/prebuild.mjs && npx docusaurus build' — page compiles, no new warnings.	2026-04-24 05:38:02 -07:00
Teknium	05394f2f28	feat(spotify): interactive setup wizard + docs page (#15130 ) Previously 'hermes auth spotify' crashed with 'HERMES_SPOTIFY_CLIENT_ID is required' if the user hadn't manually created a Spotify developer app and set env vars. Now the command detects a missing client_id and walks the user through the one-time app registration inline: - Opens https://developer.spotify.com/dashboard in the browser - Tells the user exactly what to paste into the Spotify form (including the correct default redirect URI, 127.0.0.1:43827) - Prompts for the Client ID - Persists HERMES_SPOTIFY_CLIENT_ID to ~/.hermes/.env so subsequent runs skip the wizard - Continues straight into the PKCE OAuth flow Also prints the docs URL at both the start of the wizard and the end of a successful login so users can find the full guide. Adds website/docs/user-guide/features/spotify.md with the complete setup walkthrough, tool reference, and troubleshooting, and wires it into the sidebar under User Guide > Features > Advanced. Fixes a stale redirect URI default in the hermes_cli/tools_config.py TOOL_CATEGORIES entry (was 8888/callback from the PR description instead of the actual DEFAULT_SPOTIFY_REDIRECT_URI value 43827/spotify/callback defined in auth.py).	2026-04-24 05:30:05 -07:00
l0hde	2cab8129d1	feat(copilot): add 401 auth recovery with automatic token refresh and client rebuild When using GitHub Copilot as provider, HTTP 401 errors could cause Hermes to silently fall back to the next model in the chain instead of recovering. This adds a one-shot retry mechanism that: 1. Re-resolves the Copilot token via the standard priority chain (COPILOT_GITHUB_TOKEN -> GH_TOKEN -> GITHUB_TOKEN -> gh auth token) 2. Rebuilds the OpenAI client with fresh credentials and Copilot headers 3. Retries the failed request before falling back The fix handles the common case where the gho_* OAuth token remains valid but the httpx client state becomes stale (e.g. after startup race conditions or long-lived sessions). Key design decisions: - Always rebuild client even if token string unchanged (recovers stale state) - Uses _apply_client_headers_for_base_url() for canonical header management - One-shot flag guard prevents infinite 401 loops (matches existing pattern used by Codex/Nous/Anthropic providers) - No token exchange via /copilot_internal/v2/token (returns 404 for some account types; direct gho_* auth works reliably) Tests: 3 new test cases covering end-to-end 401->refresh->retry, client rebuild verification, and same-token rebuild scenarios. Docs: Updated providers.md with Copilot auth behavior section.	2026-04-24 05:09:08 -07:00
Teknium	852c7f3be3	feat(cron): per-job workdir for project-aware cron runs (#15110 ) Cron jobs can now specify a per-job working directory. When set, the job runs as if launched from that directory: AGENTS.md / CLAUDE.md / .cursorrules from that dir are injected into the system prompt, and the terminal / file / code-exec tools use it as their cwd (via TERMINAL_CWD). When unset, old behaviour is preserved (no project context files, tools use the scheduler's cwd). Requested by @bluthcy. ## Mechanism - cron/jobs.py: create_job / update_job accept 'workdir'; validated to be an absolute existing directory at create/update time. - cron/scheduler.py run_job: if job.workdir is set, point TERMINAL_CWD at it and flip skip_context_files to False before building the agent. Restored in finally on every exit path. - cron/scheduler.py tick: workdir jobs run sequentially (outside the thread pool) because TERMINAL_CWD is process-global. Workdir-less jobs still run in the parallel pool unchanged. - tools/cronjob_tools.py + hermes_cli/cron.py + hermes_cli/main.py: expose 'workdir' via the cronjob tool and 'hermes cron create/edit --workdir ...'. Empty string on edit clears the field. ## Validation - tests/cron/test_cron_workdir.py (21 tests): normalize, create, update, JSON round-trip via cronjob tool, tick partition (workdir jobs run on the main thread, not the pool), run_job env toggle + restore in finally. - Full targeted suite (tests/cron/, test_cronjob_tools.py, test_cron.py, test_config_cwd_bridge.py, test_worktree.py): 314/314 passed. - Live smoke: hermes cron create --workdir $(pwd) works; relative path rejected; list shows 'Workdir:'; edit --workdir '' clears.	2026-04-24 05:07:01 -07:00

1 2 3 4 5 ...

532 Commits