Compare commits

12 Commits

Author SHA1 Message Date
teknium1
1ac64adaf9 fix(docker): don't require >0 TTY width in passthrough test
test_tty_passthrough_to_container asserted tput cols > 0, but a
script(1)-allocated PTY on a headless CI runner has a 0x0 window, so
tput cols legitimately prints 0 while the container still sees a real
TTY. The passthrough contract is already proven by the NO_TTY guard and
the numeric-output assert; the strict >0 check just made build-amd64
flaky-to-consistently-red on current runners. Loosen to >= 0.
2026-06-03 19:38:34 -07:00
Dusk1e
2059707fce fix(gateway-windows): anchor detached/startup cwd at HERMES_HOME 2026-06-03 19:37:29 -07:00
LeonSGP43
40fbb0f3c6 fix(constants): use windows native default hermes home 2026-06-03 19:37:29 -07:00
Teknium
e3313c50a7 feat(dashboard): add Debug Share to the System page (#38600)
* Port from google-gemini/gemini-cli#21541: back up corrupted config.yaml

When config.yaml fails to parse, load_config() silently falls back to
DEFAULT_CONFIG and leaves the broken file on disk. If the user then re-runs
the setup wizard or hermes config set (both rewrite config.yaml), their
broken-but-recoverable overrides are lost for good.

Adapts the policy-file recovery from gemini-cli#21541: on the first parse
warning for a given broken file, snapshot it to config.yaml.corrupt.<ts>.bak
(best-effort, symlink-guarded, size-deduped) and tell the user where it
landed. Unlike Gemini's version we deliberately do NOT reset config.yaml to a
clean state — hermes never silently mutates user config, and leaving it means
a hand-fixed file is re-read on the next load.

Tests: 3 new cases (backup created + content preserved + original untouched;
same-size backup dedup; symlink not copied). E2E verified with isolated
HERMES_HOME and a real tab-indented broken config.
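
The backup policy above (symlink-guarded, size-deduped, original left untouched) can be sketched as a pure decision function. This is only a TypeScript illustration: the change itself lives in the Python load_config() path, and `planCorruptBackup`, `FileInfo`, `BackupPlan`, and the timestamp string passed in are hypothetical names and assumptions, not taken from the commit.

```typescript
// Hypothetical shapes for illustration; the real implementation is Python.
interface FileInfo {
  isSymlink: boolean
  sizeBytes: number
}

interface BackupPlan {
  action: 'copy' | 'skip'
  reason: string
  target?: string
}

// Decide whether a broken config.yaml should be snapshotted.
// `existingBackupSizes` holds the sizes of config.yaml.corrupt.*.bak files
// that are already on disk (assumed to be how the size dedup is keyed).
function planCorruptBackup(file: FileInfo, existingBackupSizes: number[], timestamp: string): BackupPlan {
  if (file.isSymlink) {
    // Symlink guard: never copy through a link.
    return { action: 'skip', reason: 'symlink is not copied' }
  }
  if (existingBackupSizes.some(size => size === file.sizeBytes)) {
    // Size dedup: an earlier snapshot of the same size is assumed identical.
    return { action: 'skip', reason: 'a same-size backup already exists' }
  }
  return {
    action: 'copy',
    reason: 'first snapshot of this broken file',
    target: `config.yaml.corrupt.${timestamp}.bak`
  }
}
```

Note that the plan never includes a "reset config.yaml" action, matching the stated choice to leave the user's file in place.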

* feat(dashboard): add Debug Share to the System page

Surface `hermes debug share` in the dashboard. The System > Operations
section gets a dedicated card that uploads a redacted report + full logs
and returns the paste URLs as real, copyable links instead of a log tail.

- debug.py: factor a pure build_debug_share() returning structured
  {urls, failures, redacted, auto_delete_seconds}; run_debug_share now
  calls it (CLI output unchanged).
- web_server.py: POST /api/ops/debug-share runs the share core in a
  worker thread and returns the structured payload synchronously (the
  URLs are the whole point — not a backgrounded action).
- api.ts: runDebugShare() + DebugShareResponse.
- SystemPage.tsx: share card with a redaction toggle (on by default),
  per-link + copy-all buttons, and the 6h auto-delete countdown.
- tests: build_debug_share core + endpoint (redact toggle, failure 502,
  token gate).
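
A rough sketch of how the structured payload might be consumed on the dashboard side. The field names (`urls`, `failures`, `redacted`, `auto_delete_seconds`) and the `DebugShareResponse` name come from the commit message; the field types and the two helpers (`copyAllText`, `formatCountdown`) are assumptions made for illustration.

```typescript
// Field names are from the commit message; the field TYPES are assumptions.
interface DebugShareResponse {
  urls: Record<string, string> // assumed: label -> paste URL
  failures: Record<string, string> // assumed: label -> error text
  redacted: boolean
  auto_delete_seconds: number
}

// Hypothetical helper: text for a "copy all" button, one "label: url" per line.
function copyAllText(response: DebugShareResponse): string {
  return Object.keys(response.urls)
    .map(label => `${label}: ${response.urls[label]}`)
    .join('\n')
}

// Hypothetical helper: render the remaining auto-delete time as "Xh Ym".
function formatCountdown(secondsLeft: number): string {
  const clamped = Math.max(0, Math.floor(secondsLeft))
  const hours = Math.floor(clamped / 3600)
  const minutes = Math.floor((clamped % 3600) / 60)
  return `${hours}h ${minutes}m`
}
```

With the 6h auto-delete window mentioned above, `formatCountdown(21600)` would render as `6h 0m`.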
2026-06-03 19:37:04 -07:00
Teknium
f66a929a6b fix(desktop): render approval/sudo/secret prompts so tools stop silently timing out (#38578)
* fix(desktop): render approval/sudo/secret prompts so tools stop silently timing out

The desktop app's gateway event handler (use-message-stream.ts) handled
clarify.request but had no case for approval.request, sudo.request, or
secret.request. When a tool needed approval, the gateway emitted
approval.request and blocked the agent thread in _await_gateway_decision()
for up to 5 min (approvals.gateway_timeout); the desktop dropped the unknown
event, never showed a dialog, then the agent returned BLOCKED. No prompt,
just a stall then a block.

The Ink TUI already handles all three (createGatewayEventHandler.ts); this
brings the Electron app to parity.

- store/prompts.ts: approval/sudo/secret atoms (+ request-id-guarded clears)
- components/prompt-overlays.tsx: Radix dialogs; close/Esc maps to refusal so
  silence is never mistaken for consent (parity with TUI Esc->deny)
- use-message-stream.ts: wire the three *.request cases; clearAllPrompts on
  message.complete so an overlay can't outlive its turn
- chat-messages.ts: GatewayEventPayload gains command/description/env_var/prompt
- mount PromptOverlays in the chat shell

* feat(desktop): inline tool-call approval bar (Cursor-style "Run")

Render dangerous-command / execute_code approval inline on the pending
tool row instead of as a modal. Binding is positional: the desktop
tool.start payload carries no structured args, but approval.request only
fires from the terminal/execute_code guards and the agent blocks on one
approval at a time, so the single pending row of those tools is the one
that raised it. Command/description text comes from $approvalRequest.
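
The positional binding can be sketched as a small selection function. `ToolRow` and `findApprovalRow` are hypothetical names; returning `null` when more than one candidate row is pending is an extra safety assumption of this sketch, not something the commit states.

```typescript
// Hypothetical row shape; the real tool.start payload carries no structured args.
interface ToolRow {
  id: string
  toolName: string
  pending: boolean
}

// Only these guards raise approval.request, per the commit message.
const APPROVAL_TOOL_NAMES = new Set(['terminal', 'execute_code'])

// Pick the row that should host the inline approval controls.
function findApprovalRow(rows: ToolRow[], hasApprovalRequest: boolean): ToolRow | null {
  if (!hasApprovalRequest) {
    return null
  }
  const candidates = rows.filter(row => row.pending && APPROVAL_TOOL_NAMES.has(row.toolName))
  // The agent blocks on one approval at a time, so exactly one candidate is expected.
  return candidates.length === 1 ? (candidates[0] ?? null) : null
}
```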

Drops ApprovalDialog from PromptOverlays (sudo/secret stay modal).

* style(desktop): make inline approval bar match Cursor's command card

Drop the amber alert styling for a neutral elevated card: command on a
terminal-prefixed row up top, a divided footer with the muted description
on the left and right-aligned controls — a ghost "Reject" (Esc) plus a
split primary "Run" (⌘⏎) whose chevron opens "Allow this session" /
"Always allow" / "Reject". Wire ⌘/Ctrl+Enter → Run and Esc → Reject to
match Cursor's accept/skip bindings, guarded against double-send via the
$approvalRequest atom.

* style(desktop): shrink inline approval to a tiny Cursor-style button strip

The running tool row already shows the command, so drop the whole card +
command echo + description band. What's left is a compact strip under the
row: a small split "Run ⌘⏎" button (chevron → Allow this session / Always
allow / Reject) and a ghost "Reject Esc", indented to sit under the row's
title text.

* style(desktop): drop the loud blue Run button for a quiet outlined control

Swap the primary (blue) Run for a subtle outlined split control — neutral
border, transparent fill, hover-accent — so the approval strip reads as
quiet inline affordance rather than a big CTA. Reject stays ghost.

* style(desktop): make Run a soft primary badge

Tint the Run split control with the primary color as a badge (bg-primary/10,
primary text, primary/25 border, rounded-md, hover primary/15) instead of a
solid CTA or a neutral outline.

* style(desktop): slim the approval chevron and space out Reject

The chevron button had ballooned because dropping the size prop fell back
to the big default size (h-9 + has-svg px-3). Pin size=xs everywhere and
give the chevron a tight w-5/px-0. Bump the gap between the Run badge and
Reject (gap-2.5) and loosen Reject's internal spacing.

* feat(desktop): confirm before "Always allow" persists an approval

"Always allow" writes the matched pattern to ~/.hermes/config.yaml and
suppresses the prompt in every future session — too consequential to fire
straight from a menu click. Route it through a confirm dialog that names
the pattern + command and the file it touches. The dialog owns the
keyboard while open so Esc closes it instead of denying the approval.

* fix(gateway): make sudo + secret prompts actually fire in the desktop

Tek's PR added the sudo/secret overlays and callback wiring, but neither
reached the live path:

- Sudo: the sudo password callback is thread-local (terminal_tool
  _callback_tls), and _wire_callbacks runs on the agent-build thread, not
  the turn thread that executes tools. At command time the callback was
  missing, so terminal sudo fell through to /dev/tty and hung the headless
  gateway. Re-wire callbacks at the top of the prompt-submit turn thread.

- Secret: skills_tool short-circuited to the "secret entry unsupported"
  hint for any gateway surface, before invoking the callback. Interactive
  surfaces (desktop/TUI) register a secret-capture callback that routes to
  the secret.request overlay; only short-circuit when no callback exists,
  so messaging still gets the hint but the desktop prompts.

* docs(desktop): drop Cursor references from approval comments

* docs(desktop): drop Cursor reference from prompt-overlays comment

* fix(skills): gate in-band secret capture on HERMES_INTERACTIVE, not callback presence

The desktop/sudo PR switched the gateway secret-capture short-circuit from
"any gateway surface" to "gateway surface with no callback registered". That
made a messaging gateway (telegram/discord/...) attempt interactive in-band
secret capture whenever any callback happened to be registered, instead of
returning the safe "setup unsupported" hint — and broke
test_gateway_still_loads_skill_but_returns_setup_guidance.

Discriminate on HERMES_INTERACTIVE instead: the desktop app / TUI set it in
_enable_gateway_prompts (alongside registering the secret.request callback),
while messaging platforms never do. This is the same flag tools/approval.py
uses to tell an interactive surface from a messaging one, so messaging keeps
the hint and desktop/TUI still prompt.
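
The discrimination described above can be sketched as a pure function. This is a TypeScript illustration of logic that lives in Python; `Surface`, `decideSecretCapture`, the truthiness test on the flag, and the behavior for non-gateway surfaces are all assumptions.

```typescript
type SecretCaptureDecision = 'prompt' | 'setup-hint'

// Hypothetical inputs; the real check lives in the Python skills tool.
interface Surface {
  isGateway: boolean
  env: Record<string, string | undefined>
  hasSecretCallback: boolean
}

function decideSecretCapture(surface: Surface): SecretCaptureDecision {
  // Assumed: any non-empty value of HERMES_INTERACTIVE marks an interactive surface.
  const interactive = Boolean(surface.env['HERMES_INTERACTIVE'])
  if (surface.isGateway && !interactive) {
    // Messaging gateways always get the safe hint, even if a callback is registered.
    return 'setup-hint'
  }
  // Interactive surfaces prompt only when there is a callback to route the prompt to.
  return surface.hasSecretCallback ? 'prompt' : 'setup-hint'
}
```

The key difference from the earlier callback-presence check is the first branch: a registered callback alone no longer turns a messaging surface into a prompting one.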

---------

Co-authored-by: Brooklyn Nicholson <brooklyn.bb.nicholson@gmail.com>
2026-06-04 01:53:51 +00:00
Ben Barclay
04d620d91f fix(docker): run config migrations during container boot (salvage #35508) (#36627)
Salvage of #35508 (@dchenk), rebased onto current main. Resolved the
tests/tools/test_stage2_hook_puid_pgid.py conflict (kept both the
envdir-creation regression test on main and the new config-migration
tests).

Docker image upgrades replace code under $INSTALL_DIR but preserve
$HERMES_HOME on the mounted volume, so the persisted config.yaml never
received the schema migrations that non-Docker `hermes update` runs
(#35406). This adds scripts/docker_config_migrate.py, invoked from
stage2-hook after first-boot seeding and before gateway services start:
it backs up config.yaml + .env, runs migrate_config(interactive=False),
and honors HERMES_SKIP_CONFIG_MIGRATION=1 for manual control.

Also fixes a latent bug in check_config_version(): it called load_config()
which deep-merges DEFAULT_CONFIG, so a legacy config with no raw
_config_version falsely reported as already-current. It now reads the raw
on-disk file so legacy configs are correctly detected for migration.
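
A minimal sketch of why merging defaults hides a legacy config. It is written in TypeScript for illustration even though the real code is Python; `LATEST_VERSION`, the `model` key, and the helper names are hypothetical, and the merge here is shallow where the real one is deep.

```typescript
type Config = Record<string, unknown>

// Hypothetical values for illustration only.
const LATEST_VERSION = 5
const DEFAULT_CONFIG: Config = { _config_version: LATEST_VERSION, model: 'default' }

// Stand-in for load_config(): defaults are merged underneath the user's file.
function loadConfig(raw: Config): Config {
  return { ...DEFAULT_CONFIG, ...raw }
}

// Buggy check: reads the merged view, so a missing version is filled from defaults.
function needsMigrationBuggy(raw: Config): boolean {
  const version = loadConfig(raw)['_config_version']
  return typeof version !== 'number' || version < LATEST_VERSION
}

// Fixed check: reads only what is actually on disk.
function needsMigrationFixed(raw: Config): boolean {
  const version = raw['_config_version']
  return typeof version !== 'number' || version < LATEST_VERSION
}
```

For a legacy file such as `{ model: 'x' }`, the buggy check reports "already current" while the fixed one flags it for migration.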

Differs from #35508 as submitted (Option B cleanup): dropped the
`_config_version` line added to cli-config.yaml.example and removed the
accompanying test_cli_config_example_declares_latest_version change-detector
test. The example is a copy-template and has no business asserting a schema
version; check_config_version() reads the user's real config.yaml, not the
example. This removes a second sync point that drifts on every version bump.

Closes #35508. Fixes #35406.

Co-authored-by: Dmitriy Cherchenko <17372886+dchenk@users.noreply.github.com>
2026-06-04 11:11:27 +10:00
brooklyn!
92be989291 Merge pull request #38564 from NousResearch/bb/tui-sgr-mouse-fragment-leak
fix(hermes-ink): reassemble split SGR mouse sequences at the tokenizer (supersedes #29337)
2026-06-03 20:10:48 -05:00
Brooklyn Nicholson
725290db63 test(hermes-ink): fuzz the tokenizer flush valve against fragment leaks
Hammer createTokenizer with the worst stalls a terminal can produce —
split + flush at every interior byte, and a 200-report byte-by-byte feed
that flushes after every single byte — and assert the two invariants that
make the SGR-leak class structurally impossible: nothing ever leaks as a
text token, and every complete report reassembles whole. A mixed
mouse+keystroke variant proves real input survives the same storm.
2026-06-03 19:38:08 -05:00
Brooklyn Nicholson
6efc7eda57 refactor(hermes-ink): delete now-dead SGR mouse fragment recovery
With the tokenizer reassembling split CSI sequences across a flush (prior
commit), no SGR mouse fragment can reach a text token anymore — terminals
write a mouse report as one atomic sequence, and any read/flush split now
re-joins in the tokenizer buffer instead of leaking. That makes the whole
downstream recovery layer dead code:

- SGR_MOUSE_FRAGMENT_RE, MOUSE_BURST_NOISE_RE, MOUSE_BURST_RESIDUE_RE
- parseTextWithSgrMouseFragments / parseSgrMouseFragment /
  normalizeSgrMouseFragment
- the whole-text mouse-burst noise fast path in parseMultipleKeypresses

Remove all of it (~185 lines) and the tests that only exercised it. The
narrow legacy X10 wheel-tail resynth stays (distinct mechanism, kept with
its own test). This retires the #17701, #18113, #26781, #28463, #35512
regex hardening chain in favor of the one correct parser fix.
2026-06-03 19:29:42 -05:00
Brooklyn Nicholson
de124800a2 test(hermes-ink): drop input-event SGR guard test
The guard it covered was removed in the previous commit (fragments no
longer reach input-event — they reassemble at the tokenizer). Reassembly
is now covered by termio/tokenize.test.ts and the flush-boundary cases in
parse-keypress.test.ts.
2026-06-03 19:24:51 -05:00
Brooklyn Nicholson
f354323547 fix(hermes-ink): reassemble split mouse sequences at the tokenizer; drop the regex sink
Root-cause fix for the SGR mouse fragment leak (`46M35;40M...` typed into
the prompt). The leak was never really about the fragments — it was the
flush emitting them. When App's 50ms watchdog fires mid-CSI during a render
stall, the tokenizer was force-emitting the buffered partial as a token and
resetting to ground, so both the prefix and the ESC-less remainder surfaced
as unparseable input.

Make the flush state-aware (xterm.js discipline): a bare ESC still flushes
to the Escape key (the legitimate ESCDELAY case), but a buffer still inside
a multi-byte control sequence (csi/osc/dcs/apc/ss3/intermediate) is NOT
emitted — it's kept so the continuation reassembles on the next feed. A
one-tick truncation valve in createTokenizer.flush() drops a partial that
survives a second flush with no progress, so a genuinely truncated write
can't fuse into the next keypress.
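
The state-aware flush can be illustrated with a deliberately simplified tokenizer. This is a sketch under stated assumptions, not the hermes-ink implementation: `createSketchTokenizer` and its `Token` shape are hypothetical, and it only models plain text, a bare ESC, two-byte ESC sequences, and CSI, whereas the real tokenizer also tracks osc/dcs/apc/ss3/intermediate states.

```typescript
type Token = { kind: 'text' | 'sequence'; value: string }

const ESC = '\x1b'

function createSketchTokenizer() {
  let buffer = '' // bytes of an escape sequence that has not completed yet
  let sawStalePartial = false // true once a flush has already spared this partial

  // CSI final bytes are in the range 0x40..0x7e.
  const isCsiFinal = (ch: string): boolean => {
    const code = ch.charCodeAt(0)
    return code >= 0x40 && code <= 0x7e
  }

  function feed(input: string): Token[] {
    const out: Token[] = []
    let text = ''
    if (input.length > 0) {
      sawStalePartial = false // new bytes count as progress
    }
    for (let i = 0; i < input.length; i++) {
      const ch = input.charAt(i)
      if (buffer === '') {
        if (ch === ESC) {
          if (text) {
            out.push({ kind: 'text', value: text })
            text = ''
          }
          buffer = ESC
        } else {
          text += ch
        }
      } else if (buffer === ESC) {
        if (ch === '[') {
          buffer += ch // entered CSI
        } else {
          out.push({ kind: 'sequence', value: buffer + ch })
          buffer = ''
        }
      } else {
        buffer += ch
        if (isCsiFinal(ch)) {
          out.push({ kind: 'sequence', value: buffer })
          buffer = ''
        }
      }
    }
    if (text) {
      out.push({ kind: 'text', value: text })
    }
    return out
  }

  function flush(): Token[] {
    if (buffer === '') {
      return []
    }
    if (buffer === ESC) {
      // Bare ESC: the legitimate Escape-key case, emit it.
      buffer = ''
      sawStalePartial = false
      return [{ kind: 'sequence', value: ESC }]
    }
    if (sawStalePartial) {
      // Truncation valve: second flush with no progress drops the partial.
      buffer = ''
      sawStalePartial = false
      return []
    }
    // Mid-CSI: keep the partial so the continuation can reassemble.
    sawStalePartial = true
    return []
  }

  return { feed, flush }
}
```

Feeding `ESC[<0;35;`, flushing, then feeding `46M` yields one whole sequence token and no text token, which is the property the commit relies on.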

With partials never entering the input stream, the downstream scrubber is
dead code: remove the SGR fragment guard from input-event.ts (both the
original `/^\[<\d+;\d+;\d+[Mm]/` and the consolidated form added earlier in
this PR). The parse-keypress burst-recovery regexes (MOUSE_BURST_*) are now
also redundant but left in place as a safety net for one release; they can
be removed in a follow-up once this soaks.

Tests: tokenize.test.ts proves a mid-CSI flush keeps/reassembles and that a
stale partial is dropped after a second flush and a bare ESC still emits;
parse-keypress.test.ts adds the end-to-end split-then-reassemble case
yielding a single clean mouse event with no leaked key.

Supersedes #29337.
2026-06-03 19:24:28 -05:00
Brooklyn Nicholson
01c010e233 fix(hermes-ink): collapse SGR mouse fragment guards into one flush-aware rule
When App's 50ms flush watchdog fires mid-CSI during a render stall, an
SGR mouse report (ESC[<btn;col;row M/m) is split across stdin chunks: the
tokenizer force-emits the buffered prefix and resets to ground, so both
the prefix and the ESC-less remainder reach InputEvent as nameless tokens.
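
For reference, a complete SGR mouse report has the shape `ESC[<btn;col;row` followed by `M` (press) or `m` (release). The parser below is a minimal sketch; `parseSgrMouse`, `SgrMouseReport`, and `SGR_MOUSE_RE` are hypothetical names, not the ones used in hermes-ink. It shows why a split report is a problem: neither half matches on its own.

```typescript
interface SgrMouseReport {
  button: number
  col: number
  row: number
  pressed: boolean
}

// Matches only a whole report: ESC [ < btn ; col ; row (M | m)
const SGR_MOUSE_RE = /^\x1b\[<(\d+);(\d+);(\d+)([Mm])$/

function parseSgrMouse(sequence: string): SgrMouseReport | null {
  const match = SGR_MOUSE_RE.exec(sequence)
  if (!match) {
    return null
  }
  return {
    button: Number(match[1]),
    col: Number(match[2]),
    row: Number(match[3]),
    pressed: match[4] === 'M' // 'M' is press, 'm' is release
  }
}
```

The flushed prefix `ESC[<0;35;` and the tail `46M` both return `null` here, so without reassembly they fall through as unrecognized input.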

The previous guard only matched a full `[<\d+;\d+;\d+[Mm]` fragment, so
the flushed prefixes (`ESC[<0;35;`) and the 1-/2-field and leading-`;`
tails (`46M`, `35;46M`, `;46M`) still leaked into the composer as
`46M35;40M...` during long sessions.

Replace the three would-be narrow regexes with one consolidated rule that
covers every split position. A `(?=...\d)` lookahead keeps typed `<`, `[`,
`;`, and `M` safe (no coordinate digit), and the embedded M/m terminator
in the param class leaves stuck-together fragments / prose intact. The
existing `!keypress.name` gate continues to protect real keystrokes, which
arrive one char per chunk with a name set.

Supersedes #29337 (covers the prefix-leak and leading-`;`/1-/2-field tail
cases that PR's two added guards missed).
2026-06-03 19:05:26 -05:00
34 changed files with 2147 additions and 304 deletions


@@ -13,6 +13,7 @@ import { useLocation } from 'react-router-dom'
import { Thread } from '@/components/assistant-ui/thread'
import { Backdrop } from '@/components/Backdrop'
import { NotificationStack } from '@/components/notifications'
import { PromptOverlays } from '@/components/prompt-overlays'
import { Button } from '@/components/ui/button'
import { Codicon } from '@/components/ui/codicon'
import { getGlobalModelOptions, type HermesGateway } from '@/hermes'
@@ -315,6 +316,7 @@ export function ChatView({
/>
<NotificationStack />
<PromptOverlays />
<div
className="relative min-h-0 max-w-full flex-1 overflow-hidden bg-(--ui-chat-surface-background) contain-[layout_paint]"


@@ -19,6 +19,7 @@ import { isProviderSetupErrorMessage } from '@/lib/provider-setup-errors'
import { setClarifyRequest } from '@/store/clarify'
import { notify } from '@/store/notifications'
import { requestDesktopOnboarding } from '@/store/onboarding'
import { clearAllPrompts, setApprovalRequest, setSecretRequest, setSudoRequest } from '@/store/prompts'
import {
setCurrentBranch,
setCurrentCwd,
@@ -751,6 +752,13 @@ export function useMessageStream({
return
}
// Turn ended — drop any blocking prompt that's still open (e.g. the
// agent was interrupted, or the approval already resolved). Prevents a
// stale overlay from outliving the turn that raised it.
if (isActiveEvent) {
clearAllPrompts()
}
flushQueuedDeltas(sessionId)
if (isActiveEvent) {
@@ -816,10 +824,60 @@ export function useMessageStream({
sessionId: sessionId ?? null
})
}
} else if (event.type === 'approval.request') {
if (!isActiveEvent) {
return
}
// Dangerous-command / execute_code approval. The Python side is
// blocked in _await_gateway_decision() until approval.respond lands;
// without this the agent stalls until its 5-min timeout and the tool
// is BLOCKED. Approval is session-keyed (no request_id) — the overlay
// sends back {choice, session_id}.
setApprovalRequest({
command: typeof payload?.command === 'string' ? payload.command : '',
description: typeof payload?.description === 'string' ? payload.description : 'dangerous command',
sessionId: sessionId ?? null
})
} else if (event.type === 'sudo.request') {
if (!isActiveEvent) {
return
}
// Sudo password capture (tools/terminal_tool.py). Blocked on
// sudo.respond {request_id, password}.
const requestId = typeof payload?.request_id === 'string' ? payload.request_id : ''
if (requestId) {
setSudoRequest({ requestId })
}
} else if (event.type === 'secret.request') {
if (!isActiveEvent) {
return
}
// Skill credential capture (tools/skills_tool.py). Blocked on
// secret.respond {request_id, value}.
const requestId = typeof payload?.request_id === 'string' ? payload.request_id : ''
if (requestId) {
setSecretRequest({
requestId,
envVar: typeof payload?.env_var === 'string' ? payload.env_var : '',
prompt: typeof payload?.prompt === 'string' ? payload.prompt : ''
})
}
} else if (event.type === 'error') {
const errorMessage = payload?.message || 'Hermes reported an error'
const looksLikeProviderSetup = isProviderSetupErrorMessage(errorMessage)
// A turn that errors out has also ended — drop any open blocking
// prompt so an approval/sudo/secret overlay can't linger past the
// failed turn (same intent as the message.complete clear).
if (isActiveEvent) {
clearAllPrompts()
}
if (looksLikeProviderSetup) {
requestDesktopOnboarding(errorMessage)
} else if (isActiveEvent) {


@@ -0,0 +1,78 @@
import { cleanup, fireEvent, render, screen, waitFor } from '@testing-library/react'
import { afterEach, describe, expect, it, vi } from 'vitest'
import type { HermesGateway } from '@/hermes'
import { $gateway } from '@/store/gateway'
import { $approvalRequest } from '@/store/prompts'
import { PendingToolApproval } from './tool-approval'
import type { ToolPart } from './tool-fallback-model'
function part(toolName: string): ToolPart {
return { toolName, type: `tool-${toolName}` } as unknown as ToolPart
}
function setRequest(command = 'rm -rf /tmp/x') {
$approvalRequest.set({ command, description: 'dangerous command', sessionId: 'sess-1' })
}
function mockGateway() {
const request = vi.fn().mockResolvedValue({ resolved: true })
$gateway.set({ request } as unknown as HermesGateway)
return request
}
afterEach(() => {
cleanup()
$approvalRequest.set(null)
$gateway.set(null)
})
describe('PendingToolApproval', () => {
it('renders nothing when there is no pending approval', () => {
const { container } = render(<PendingToolApproval part={part('terminal')} />)
expect(container.innerHTML).toBe('')
})
it('renders nothing for tools that never raise approval', () => {
setRequest()
const { container } = render(<PendingToolApproval part={part('read_file')} />)
expect(container.innerHTML).toBe('')
})
it('renders the inline run/reject controls on the pending terminal row', () => {
setRequest('chmod -R 777 /tmp/x')
render(<PendingToolApproval part={part('terminal')} />)
expect(screen.getByRole('button', { name: /Run/ })).toBeTruthy()
expect(screen.getByRole('button', { name: /Reject/ })).toBeTruthy()
})
it('sends approval.respond {choice: "once"} and clears the request on Run', async () => {
const request = mockGateway()
setRequest()
render(<PendingToolApproval part={part('terminal')} />)
fireEvent.click(screen.getByRole('button', { name: /Run/ }))
await waitFor(() => {
expect(request).toHaveBeenCalledWith('approval.respond', { choice: 'once', session_id: 'sess-1' })
})
expect($approvalRequest.get()).toBeNull()
})
it('sends choice "deny" on Reject', async () => {
const request = mockGateway()
setRequest()
render(<PendingToolApproval part={part('terminal')} />)
fireEvent.click(screen.getByRole('button', { name: /Reject/ }))
await waitFor(() => {
expect(request).toHaveBeenCalledWith('approval.respond', { choice: 'deny', session_id: 'sess-1' })
})
})
})


@@ -0,0 +1,213 @@
'use client'
import { useStore } from '@nanostores/react'
import { type FC, useCallback, useEffect, useState } from 'react'
import { Button } from '@/components/ui/button'
import {
Dialog,
DialogContent,
DialogDescription,
DialogFooter,
DialogHeader,
DialogTitle
} from '@/components/ui/dialog'
import {
DropdownMenu,
DropdownMenuContent,
DropdownMenuItem,
DropdownMenuTrigger
} from '@/components/ui/dropdown-menu'
import { triggerHaptic } from '@/lib/haptics'
import { ChevronDown, Loader2 } from '@/lib/icons'
import { $gateway } from '@/store/gateway'
import { notifyError } from '@/store/notifications'
import { $approvalRequest, type ApprovalRequest, clearApprovalRequest } from '@/store/prompts'
import type { ToolPart } from './tool-fallback-model'
// Inline approval control. Rendered as a compact button strip
// under the pending tool row that raised the approval (the row already shows
// the command, so the strip deliberately doesn't repeat it) instead of as a
// modal overlay.
//
// Binding is POSITIONAL, not command-matched: the desktop `tool.start` payload
// carries no structured args (only tool_id/name/context — see
// tui_gateway/server.py::_on_tool_start), so we cannot join the approval to the
// row by command string. But `approval.request` only ever fires from the
// `terminal` / `execute_code` guards and the agent thread blocks on exactly one
// approval at a time, so the single pending row of those tools IS the row that
// raised it. The command/description text comes from `$approvalRequest` (the
// event payload), which is the only place that data reliably exists.
const APPROVAL_TOOLS = new Set(['terminal', 'execute_code'])
// Canonical gateway choices (ui-tui/src/components/prompts.tsx).
type ApprovalChoice = 'once' | 'session' | 'always' | 'deny'
export const PendingToolApproval: FC<{ part: ToolPart }> = ({ part }) => {
const request = useStore($approvalRequest)
if (!request || !APPROVAL_TOOLS.has(part.toolName)) {
return null
}
return <ApprovalBar request={request} />
}
const isMac = typeof navigator !== 'undefined' && /Mac|iP(hone|ad|od)/.test(navigator.platform)
const ApprovalBar: FC<{ request: ApprovalRequest }> = ({ request }) => {
const gateway = useStore($gateway)
const [submitting, setSubmitting] = useState<ApprovalChoice | null>(null)
// "Always allow" persists the pattern to ~/.hermes/config.yaml permanently, so
// it goes through a confirm step rather than firing straight from the menu.
const [confirmAlways, setConfirmAlways] = useState(false)
const busy = submitting !== null
const respond = useCallback(
async (choice: ApprovalChoice) => {
// Another bar (or the keyboard path) may have already resolved this
// approval; the atom is the single source of truth, so bail if it's gone.
if (busy || !$approvalRequest.get()) {
return
}
if (!gateway) {
notifyError(new Error('Hermes gateway is not connected'), 'Could not send approval response')
return
}
setSubmitting(choice)
try {
await gateway.request<{ resolved?: boolean }>('approval.respond', {
choice,
session_id: request.sessionId ?? undefined
})
triggerHaptic(choice === 'deny' ? 'cancel' : 'submit')
clearApprovalRequest()
} catch (error) {
notifyError(error, 'Could not send approval response')
setSubmitting(null)
}
},
[busy, gateway, request.sessionId]
)
// ⌘/Ctrl+Enter → Run, Esc → Reject.
// While the confirm dialog is open it owns the keyboard (Esc closes it), so
// the strip-level shortcuts stand down to avoid denying the whole approval.
useEffect(() => {
if (confirmAlways) {
return
}
const onKeyDown = (event: KeyboardEvent) => {
if (event.key === 'Enter' && (event.metaKey || event.ctrlKey)) {
event.preventDefault()
void respond('once')
} else if (event.key === 'Escape') {
event.preventDefault()
void respond('deny')
}
}
window.addEventListener('keydown', onKeyDown, true)
return () => window.removeEventListener('keydown', onKeyDown, true)
}, [confirmAlways, respond])
return (
<div className="mt-1 flex items-center gap-2.5 ps-5" data-slot="tool-approval-inline">
<div className="inline-flex h-6 items-stretch overflow-hidden rounded-md border border-primary/25 bg-primary/10 text-primary">
<Button
className="h-full gap-1 rounded-none px-2 text-xs font-medium text-primary hover:bg-primary/15 hover:text-primary"
disabled={busy}
onClick={() => void respond('once')}
size="xs"
variant="ghost"
>
{submitting === 'once' ? <Loader2 className="size-3 animate-spin" /> : 'Run'}
{submitting !== 'once' && <span className="text-[0.625rem] text-primary/60">{isMac ? '⌘⏎' : 'Ctrl⏎'}</span>}
</Button>
<span aria-hidden className="w-px self-stretch bg-primary/20" />
<DropdownMenu>
<DropdownMenuTrigger asChild>
<Button
aria-label="More approval options"
className="h-full w-5 rounded-none px-0 text-primary hover:bg-primary/15 hover:text-primary"
disabled={busy}
size="xs"
variant="ghost"
>
<ChevronDown className="size-3" />
</Button>
</DropdownMenuTrigger>
<DropdownMenuContent align="start" className="min-w-44">
<DropdownMenuItem onSelect={() => void respond('session')}>Allow this session</DropdownMenuItem>
<DropdownMenuItem
onSelect={() => {
// Defer one tick so the menu fully unmounts before the dialog
// mounts — otherwise Radix's focus-return races the dialog and
// dismisses it via onInteractOutside.
setTimeout(() => setConfirmAlways(true), 0)
}}
>
Always allow
</DropdownMenuItem>
<DropdownMenuItem onSelect={() => void respond('deny')} variant="destructive">
Reject
</DropdownMenuItem>
</DropdownMenuContent>
</DropdownMenu>
</div>
<Button
className="h-6 gap-1.5 rounded-md px-1.5 text-xs font-normal text-(--ui-text-tertiary) hover:text-foreground"
disabled={busy}
onClick={() => void respond('deny')}
size="xs"
variant="ghost"
>
{submitting === 'deny' ? <Loader2 className="size-3 animate-spin" /> : 'Reject'}
{submitting !== 'deny' && <span className="text-[0.625rem] opacity-55">Esc</span>}
</Button>
<Dialog onOpenChange={setConfirmAlways} open={confirmAlways}>
<DialogContent className="max-w-md">
<DialogHeader>
<DialogTitle>Always allow this command?</DialogTitle>
<DialogDescription>
This adds the {request.description} pattern to your permanent allowlist (
<code className="font-mono text-xs">~/.hermes/config.yaml</code>). Hermes won't ask again for commands
like this in this session or any future one.
</DialogDescription>
</DialogHeader>
{request.command.trim() && (
<pre className="max-h-32 overflow-auto whitespace-pre-wrap break-words rounded-md border border-(--ui-stroke-tertiary) bg-(--ui-chat-surface-background) px-2.5 py-1.5 font-mono text-xs leading-snug text-foreground">
{request.command.trim()}
</pre>
)}
<DialogFooter>
<Button onClick={() => setConfirmAlways(false)} size="sm" variant="ghost">
Cancel
</Button>
<Button
onClick={() => {
setConfirmAlways(false)
void respond('always')
}}
size="sm"
variant="destructive"
>
Always allow
</Button>
</DialogFooter>
</DialogContent>
</Dialog>
</div>
)
}


@@ -24,6 +24,7 @@ import { cn } from '@/lib/utils'
import { $toolInlineDiffs } from '@/store/tool-diffs'
import { $toolDisclosureOpen, $toolViewMode, setToolDisclosureOpen } from '@/store/tool-view'
import { PendingToolApproval } from './tool-approval'
import {
groupCopyText as buildGroupCopyText,
buildToolView,
@@ -309,6 +310,7 @@ function ToolEntry({ part }: ToolEntryProps) {
</span>
</DisclosureRow>
</div>
{isPending && <PendingToolApproval part={part} />}
{open && (
<div className="grid w-full min-w-0 max-w-full gap-1.5 overflow-hidden p-1.5">
{!embedded && view.previewTarget && isPreviewableTarget(view.previewTarget) && (


@@ -0,0 +1,230 @@
'use client'
import { useStore } from '@nanostores/react'
import { type FormEvent, useCallback, useEffect, useState } from 'react'
import { Button } from '@/components/ui/button'
import { Dialog, DialogContent, DialogDescription, DialogFooter, DialogHeader, DialogTitle } from '@/components/ui/dialog'
import { Input } from '@/components/ui/input'
import { triggerHaptic } from '@/lib/haptics'
import { KeyRound, Loader2, Lock } from '@/lib/icons'
import { $gateway } from '@/store/gateway'
import { notifyError } from '@/store/notifications'
import { $secretRequest, $sudoRequest, clearSecretRequest, clearSudoRequest } from '@/store/prompts'
// Renders the modal mid-turn prompts the gateway raises and waits on: sudo
// password and skill secret capture. (Dangerous-command / execute_code approval
// is rendered INLINE on the pending tool row instead — see
// components/assistant-ui/tool-approval.tsx — so it reads like an inline "Run"
// affordance rather than a blocking modal.) Each Python-side caller blocks the
// agent thread until the matching `*.respond` RPC lands; without a renderer the
// agent stalls until its timeout and the tool is BLOCKED (the bug this fixes —
// desktop handled clarify.request but not these). Any close path (Esc, backdrop
// click) funnels through Radix's single `onOpenChange(false)` and maps to a
// refusal, so silence is never mistaken for consent, matching the TUI. We
// deliberately do NOT add onEscapeKeyDown / onInteractOutside handlers — they'd
// fire a second `*.respond` alongside onOpenChange (double-send) or block the
// backdrop-dismiss path.
function SudoDialog() {
const request = useStore($sudoRequest)
const gateway = useStore($gateway)
const [password, setPassword] = useState('')
const [submitting, setSubmitting] = useState(false)
useEffect(() => {
setPassword('')
setSubmitting(false)
}, [request?.requestId])
const send = useCallback(
async (value: string) => {
if (!request) {
return
}
if (!gateway) {
notifyError(new Error('Hermes gateway is not connected'), 'Could not send sudo password')
return
}
setSubmitting(true)
try {
await gateway.request<{ status?: string }>('sudo.respond', {
password: value,
request_id: request.requestId
})
triggerHaptic('submit')
clearSudoRequest(request.requestId)
} catch (error) {
notifyError(error, 'Could not send sudo password')
setSubmitting(false)
}
},
[gateway, request]
)
// Cancel → empty password. The backend treats an empty sudo response as a
// failed sudo (no command runs), so closing the dialog is a safe refusal.
const onOpenChange = useCallback(
(open: boolean) => {
if (!open && !submitting && request) {
void send('')
}
},
[request, send, submitting]
)
const onSubmit = useCallback(
(event: FormEvent<HTMLFormElement>) => {
event.preventDefault()
void send(password)
},
[password, send]
)
if (!request) {
return null
}
return (
<Dialog onOpenChange={onOpenChange} open>
<DialogContent showCloseButton={false}>
<DialogHeader>
<DialogTitle className="flex items-center gap-2">
<Lock className="size-4 text-primary" />
Administrator password
</DialogTitle>
<DialogDescription>
Hermes needs your sudo password to run a privileged command. It is sent only to your local agent.
</DialogDescription>
</DialogHeader>
<form className="grid gap-3" onSubmit={onSubmit}>
<Input
autoFocus
disabled={submitting}
onChange={event => setPassword(event.target.value)}
placeholder="sudo password"
type="password"
value={password}
/>
<DialogFooter>
<Button disabled={submitting} onClick={() => void send('')} type="button" variant="ghost">
Cancel
</Button>
<Button disabled={submitting} type="submit">
{submitting ? <Loader2 className="size-3.5 animate-spin" /> : 'Send'}
</Button>
</DialogFooter>
</form>
</DialogContent>
</Dialog>
)
}
function SecretDialog() {
const request = useStore($secretRequest)
const gateway = useStore($gateway)
const [value, setValue] = useState('')
const [submitting, setSubmitting] = useState(false)
useEffect(() => {
setValue('')
setSubmitting(false)
}, [request?.requestId])
const send = useCallback(
async (secret: string) => {
if (!request) {
return
}
if (!gateway) {
notifyError(new Error('Hermes gateway is not connected'), 'Could not send secret')
return
}
setSubmitting(true)
try {
await gateway.request<{ status?: string }>('secret.respond', {
request_id: request.requestId,
value: secret
})
triggerHaptic('submit')
clearSecretRequest(request.requestId)
} catch (error) {
notifyError(error, 'Could not send secret')
setSubmitting(false)
}
},
[gateway, request]
)
const onOpenChange = useCallback(
(open: boolean) => {
if (!open && !submitting && request) {
void send('')
}
},
[request, send, submitting]
)
const onSubmit = useCallback(
(event: FormEvent<HTMLFormElement>) => {
event.preventDefault()
void send(value)
},
[send, value]
)
if (!request) {
return null
}
return (
<Dialog onOpenChange={onOpenChange} open>
<DialogContent showCloseButton={false}>
<DialogHeader>
<DialogTitle className="flex items-center gap-2">
<KeyRound className="size-4 text-primary" />
{request.envVar || 'Secret required'}
</DialogTitle>
<DialogDescription>{request.prompt || 'Hermes needs a credential to continue.'}</DialogDescription>
</DialogHeader>
<form className="grid gap-3" onSubmit={onSubmit}>
<Input
autoFocus
disabled={submitting}
onChange={event => setValue(event.target.value)}
placeholder={request.envVar || 'secret value'}
type="password"
value={value}
/>
<DialogFooter>
<Button disabled={submitting} onClick={() => void send('')} type="button" variant="ghost">
Cancel
</Button>
<Button disabled={submitting || !value} type="submit">
{submitting ? <Loader2 className="size-3.5 animate-spin" /> : 'Send'}
</Button>
</DialogFooter>
</form>
</DialogContent>
</Dialog>
)
}
export function PromptOverlays() {
return (
<>
<SudoDialog />
<SecretDialog />
</>
)
}

View File

@@ -55,6 +55,12 @@ export type GatewayEventPayload = {
request_id?: string
question?: string
choices?: string[] | null
// approval.request (dangerous command / execute_code) — session-keyed
command?: string
description?: string
// secret.request (skill credential capture)
env_var?: string
prompt?: string
}
export function textPart(text: string): ChatMessagePart {

View File

@@ -0,0 +1,91 @@
import { afterEach, describe, expect, it } from 'vitest'
import {
$approvalRequest,
$secretRequest,
$sudoRequest,
clearAllPrompts,
clearApprovalRequest,
clearSecretRequest,
clearSudoRequest,
setApprovalRequest,
setSecretRequest,
setSudoRequest
} from './prompts'
afterEach(() => {
clearAllPrompts()
})
describe('approval prompt store', () => {
it('holds the most recent session-keyed approval request', () => {
setApprovalRequest({ command: 'rm -rf /tmp/x', description: 'recursive delete', sessionId: 's1' })
expect($approvalRequest.get()).toEqual({
command: 'rm -rf /tmp/x',
description: 'recursive delete',
sessionId: 's1'
})
})
it('clears unconditionally (approval is session-keyed, no request id)', () => {
setApprovalRequest({ command: 'x', description: 'd', sessionId: 's1' })
clearApprovalRequest()
expect($approvalRequest.get()).toBeNull()
})
})
describe('sudo prompt store', () => {
it('clears only when the request id matches the in-flight prompt', () => {
setSudoRequest({ requestId: 'abc' })
// A stale clear for a different request must NOT drop the live prompt —
// otherwise a late response to a prior sudo ask would dismiss the current
// one and leave the agent blocked.
clearSudoRequest('stale')
expect($sudoRequest.get()).toEqual({ requestId: 'abc' })
clearSudoRequest('abc')
expect($sudoRequest.get()).toBeNull()
})
it('clears unconditionally when no request id is given', () => {
setSudoRequest({ requestId: 'abc' })
clearSudoRequest()
expect($sudoRequest.get()).toBeNull()
})
})
describe('secret prompt store', () => {
it('carries env var and prompt, and clears on id match', () => {
setSecretRequest({ requestId: 'r1', envVar: 'OPENAI_API_KEY', prompt: 'Paste your key' })
expect($secretRequest.get()).toEqual({
requestId: 'r1',
envVar: 'OPENAI_API_KEY',
prompt: 'Paste your key'
})
clearSecretRequest('mismatch')
expect($secretRequest.get()).not.toBeNull()
clearSecretRequest('r1')
expect($secretRequest.get()).toBeNull()
})
})
describe('clearAllPrompts', () => {
it('drops every in-flight prompt at once (turn end / interrupt)', () => {
setApprovalRequest({ command: 'x', description: 'd', sessionId: 's1' })
setSudoRequest({ requestId: 'abc' })
setSecretRequest({ requestId: 'r1', envVar: 'E', prompt: 'p' })
clearAllPrompts()
expect($approvalRequest.get()).toBeNull()
expect($sudoRequest.get()).toBeNull()
expect($secretRequest.get()).toBeNull()
})
})

View File

@@ -0,0 +1,86 @@
import { atom } from 'nanostores'
// Blocking interactive prompts the gateway raises mid-turn. Each maps to a
// `*.request` event the Python side emits while it blocks the agent thread
// waiting for a `*.respond` RPC. Without a renderer for these, the agent
// silently stalls until its timeout (default 5 min) and the tool is BLOCKED
// — the desktop app previously handled clarify.request but not these three,
// so dangerous-command approval, sudo, and secret prompts never surfaced.
export interface ApprovalRequest {
command: string
description: string
sessionId: string | null
}
// Approval is session-keyed on the backend (one in-flight approval per
// session, resolved via approval.respond {choice, session_id}). It carries
// no request_id, unlike sudo/secret which are _block()-style request/response.
export const $approvalRequest = atom<ApprovalRequest | null>(null)
export function setApprovalRequest(request: ApprovalRequest): void {
$approvalRequest.set(request)
}
export function clearApprovalRequest(): void {
$approvalRequest.set(null)
}
export interface SudoRequest {
requestId: string
}
export const $sudoRequest = atom<SudoRequest | null>(null)
export function setSudoRequest(request: SudoRequest): void {
$sudoRequest.set(request)
}
export function clearSudoRequest(requestId?: string): void {
const current = $sudoRequest.get()
if (!current) {
return
}
if (requestId && current.requestId !== requestId) {
return
}
$sudoRequest.set(null)
}
export interface SecretRequest {
requestId: string
envVar: string
prompt: string
}
export const $secretRequest = atom<SecretRequest | null>(null)
export function setSecretRequest(request: SecretRequest): void {
$secretRequest.set(request)
}
export function clearSecretRequest(requestId?: string): void {
const current = $secretRequest.get()
if (!current) {
return
}
if (requestId && current.requestId !== requestId) {
return
}
$secretRequest.set(null)
}
// Drop every in-flight prompt. Called when a turn ends (message.complete /
// error) so a stale overlay can't linger past the turn that raised it — e.g.
// if the agent was interrupted while a prompt was open.
export function clearAllPrompts(): void {
$approvalRequest.set(null)
$sudoRequest.set(null)
$secretRequest.set(null)
}
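
The id-guarded clear that the sudo and secret stores implement (and that the tests above exercise) can be restated as a minimal Python sketch. The real store is TypeScript on nanostores; `PromptSlot` and its method names are hypothetical and exist only to show the guard logic: a clear carrying a stale request id is ignored, while a matching id or no id at all drops the prompt.

```python
from typing import Optional


class PromptSlot:
    """Holds at most one in-flight prompt, identified by its request id."""

    def __init__(self) -> None:
        self.current: Optional[str] = None

    def set(self, request_id: str) -> None:
        self.current = request_id

    def clear(self, request_id: Optional[str] = None) -> None:
        if self.current is None:
            return
        # A clear for a different request must not dismiss the live prompt.
        if request_id and self.current != request_id:
            return
        self.current = None


slot = PromptSlot()
slot.set("abc")
slot.clear("stale")        # late response to an earlier prompt: ignored
after_stale = slot.current
slot.clear("abc")          # matching id: prompt is dropped
after_match = slot.current
slot.set("def")
slot.clear()               # no id: unconditional clear (turn end / interrupt)
after_unconditional = slot.current
```

Approval has no request id on the backend, which is why its clear function takes no argument and always resets the atom.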

View File

@@ -338,6 +338,17 @@ if [ -f "$HERMES_HOME/.env" ]; then
chmod 600 "$HERMES_HOME/.env" 2>/dev/null || true
fi
# --- Migrate persisted config schema ---
# Docker image upgrades replace the code under $INSTALL_DIR but preserve
# $HERMES_HOME on the mounted volume. Run the same safe, non-interactive
# config-schema migrations that `hermes update` runs for non-Docker installs,
# after first-boot seeding and before supervised gateway services start.
# Set HERMES_SKIP_CONFIG_MIGRATION=1 for controlled/manual migrations.
if [ -f "$HERMES_HOME/config.yaml" ]; then
s6-setuidgid hermes "$INSTALL_DIR/.venv/bin/python" "$INSTALL_DIR/scripts/docker_config_migrate.py" \
|| echo "[stage2] Warning: docker_config_migrate.py failed; continuing"
fi
# auth.json: bootstrap from env on first boot only. Same semantics as the
# pre-s6 entrypoint — the [ ! -f ] guard is critical to avoid clobbering
# rotated refresh tokens on container restart.

View File

@@ -17,11 +17,13 @@ import logging
import os
import platform
import re
import shutil
import stat
import subprocess
import sys
import tempfile
import threading
import time
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, Any, Optional, List, Tuple
@@ -36,6 +38,60 @@ logger = logging.getLogger(__name__)
_CONFIG_PARSE_WARNED: set = set()
def _backup_corrupt_config(config_path: Path) -> Optional[Path]:
    """Preserve a corrupted ``config.yaml`` by copying it to a timestamped ``.bak``.

    When the YAML can't be parsed, ``load_config()`` silently falls back to
    ``DEFAULT_CONFIG`` and the user's broken file stays on disk untouched.
    That file is still the user's only copy of their intended overrides: if
    they re-run the setup wizard or ``hermes config set`` (which rewrites
    ``config.yaml``), the broken-but-recoverable content is gone for good.

    This snapshots the corrupted file to ``config.yaml.corrupt.<ts>.bak`` so
    the user can diff/repair it. Unlike Gemini CLI's policy-file recovery
    (which resets the live file to a clean state), we deliberately leave
    ``config.yaml`` in place: hermes never silently mutates the user's config,
    and leaving it means a hand-fixed file is re-read on the next load. The
    backup is best-effort: any failure (permissions, symlink, disk full) is
    swallowed so config loading is never blocked by backup problems.

    Returns the backup path on success, else ``None``. Symlinks are not
    followed/copied (mirrors the Gemini #21541 lstat guard) to avoid
    clobbering whatever a malicious/misconfigured symlink points at.
    """
    try:
        if config_path.is_symlink():
            return None
        st = config_path.stat()
        if st.st_size == 0:
            # Empty file isn't worth preserving. yaml.safe_load returns None
            # for it without raising (so it wouldn't reach here), but guard
            # regardless.
            return None
        ts = time.strftime("%Y%m%d-%H%M%S")
        backup_path = config_path.with_name(f"{config_path.name}.corrupt.{ts}.bak")
        # Don't clobber an existing backup from the same second. If a corrupt
        # backup of the same size already exists, assume we've snapshotted
        # this corruption already and skip (the dedup cache normally prevents
        # a second call, but a process restart can clear it).
        sibling_baks = list(
            config_path.parent.glob(f"{config_path.name}.corrupt.*.bak")
        )
        for existing in sibling_baks:
            try:
                if existing.stat().st_size == st.st_size:
                    # Same size as the current broken file, so likely the
                    # same corruption already preserved. Avoid backup churn.
                    return None
            except OSError:
                continue
        if backup_path.exists():
            return None
        shutil.copy2(config_path, backup_path)
        return backup_path
    except Exception:
        return None
def _warn_config_parse_failure(config_path: Path, exc: Exception) -> None:
"""Surface a config.yaml parse failure to user, log, and stderr.
@@ -48,7 +104,11 @@ def _warn_config_parse_failure(config_path: Path, exc: Exception) -> None:
Now: warn once per (path, mtime_ns, size) on stderr **and** in
``agent.log`` / ``errors.log`` at WARNING level so ``hermes logs``
surfaces it. Re-warns automatically if the file changes (different
mtime/size), so users editing the config see the next failure.
mtime/size), so users editing the config see the next failure. On the
first warning for a given broken file we also snapshot it to a
timestamped ``.bak`` (best-effort) so the user's recoverable content
survives any later rewrite of ``config.yaml`` by the setup wizard or
``hermes config set``.
"""
try:
st = config_path.stat()
@@ -59,12 +119,16 @@ def _warn_config_parse_failure(config_path: Path, exc: Exception) -> None:
return
_CONFIG_PARSE_WARNED.add(key)
backup_path = _backup_corrupt_config(config_path)
msg = (
f"Failed to parse {config_path}: {exc}. "
f"Falling back to default config — every user override "
f"(auxiliary providers, fallback chain, model settings) is being IGNORED. "
f"Fix the YAML and restart."
)
if backup_path is not None:
msg += f" A copy of the corrupted file was saved to {backup_path}."
logger.warning(msg)
try:
sys.stderr.write(f"⚠️ hermes config: {msg}\n")
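
The warn-once behaviour this hunk's docstring describes (one warning per `(path, mtime_ns, size)`, with an automatic re-warn when the file changes) can be sketched in isolation. This is a minimal illustration, not the project's code: `should_warn` and the module-level `_warned` set are hypothetical names, and the fallback key used when `stat()` fails is an assumption.

```python
import tempfile
from pathlib import Path

_warned: set = set()  # hypothetical stand-in for the dedup cache


def should_warn(path: Path) -> bool:
    """Return True only the first time a given file state is seen."""
    try:
        st = path.stat()
        key = (str(path), st.st_mtime_ns, st.st_size)
    except OSError:
        # Assumption: an unreadable file still gets a single warning.
        key = (str(path), 0, 0)
    if key in _warned:
        return False
    _warned.add(key)
    return True


with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "config.yaml"
    cfg.write_text("\tbroken:\n")
    first = should_warn(cfg)       # new file state: warn
    second = should_warn(cfg)      # same state: stay quiet
    cfg.write_text("\tstill broken, but edited:\n")
    after_edit = should_warn(cfg)  # size changed: warn again
```

Keying on file state rather than on the path alone is what lets a user who edits the file, and breaks it again, see a fresh warning. A cache like this lives only in process memory, which is why the backup helper also dedups against existing `.bak` files on disk.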
@@ -3792,15 +3856,46 @@ def get_custom_provider_context_length(
return None
def _coerce_config_version(value: Any) -> int:
    """Return a safe integer config version, treating invalid values as legacy."""
    if isinstance(value, bool):
        return 0
    try:
        version = int(value)
    except (TypeError, ValueError, OverflowError):
        # OverflowError covers float infinity (YAML ``.inf``); NaN raises ValueError.
        return 0
    return max(version, 0)
def check_config_version() -> Tuple[int, int]:
"""
Check config version.
Check the raw on-disk config schema version.
``load_config()`` deliberately starts from ``DEFAULT_CONFIG`` and deep-merges
the user's file, which is correct for runtime reads but wrong for deciding
whether the user's persisted schema has been migrated. A config file with no
raw ``_config_version`` must remain visible as legacy instead of inheriting
the latest default version in memory.
Returns (current_version, latest_version).
"""
config = load_config()
current = config.get("_config_version", 0)
latest = DEFAULT_CONFIG.get("_config_version", 1)
latest = _coerce_config_version(DEFAULT_CONFIG.get("_config_version", 1)) or 1
config_path = get_config_path()
if not config_path.exists():
return latest, latest
try:
with open(config_path, encoding="utf-8") as f:
config = yaml.safe_load(f) or {}
except Exception as e:
# Invalid YAML needs a parse warning, not an automatic schema rewrite
# that could replace the user's broken file with defaults.
_warn_config_parse_failure(config_path, e)
return latest, latest
if not isinstance(config, dict):
config = {}
current = _coerce_config_version(config.get("_config_version"))
return current, latest
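
Why the version check must read the raw file rather than the merged config can be shown with plain dictionaries. This is a sketch under stated assumptions: `DEFAULTS`, `merged_version`, and `raw_version` are hypothetical, the version number 5 is made up, and a shallow merge stands in for the real deep merge. The coercion helper also catches `OverflowError` (raised for float infinity), which is a hardening choice of this sketch.

```python
from typing import Any

DEFAULTS = {"_config_version": 5, "model": "default"}  # hypothetical values


def coerce_version(value: Any) -> int:
    """Map anything that is not a usable non-negative int to 0 (legacy)."""
    if isinstance(value, bool):  # bool is an int subclass; reject it explicitly
        return 0
    try:
        version = int(value)
    except (TypeError, ValueError, OverflowError):
        return 0
    return max(version, 0)


def merged_version(raw: dict) -> int:
    # Runtime view: defaults first, user file on top.
    merged = {**DEFAULTS, **raw}
    return coerce_version(merged.get("_config_version"))


def raw_version(raw: dict) -> int:
    # Migration view: only what is actually persisted on disk.
    return coerce_version(raw.get("_config_version"))


legacy_file = {"model": "test/custom"}  # no _config_version key at all
seen_by_runtime = merged_version(legacy_file)
seen_by_migration = raw_version(legacy_file)
```

The merged view reports the latest version for a file that was never migrated, so a migration gate built on it would never fire. The raw view reports 0, which keeps the legacy file visible.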

View File

@@ -585,20 +585,41 @@ def collect_debug_report(
# CLI entry points
# ---------------------------------------------------------------------------
def run_debug_share(args):
"""Collect debug report + full logs, upload each, print URLs."""
@dataclass
class DebugShareResult:
"""Structured outcome of a ``debug share`` upload.
Returned by :func:`build_debug_share` so non-CLI callers (the dashboard
web server, gateway) can render the uploaded paste URLs as real links
instead of scraping printed text.
"""
urls: dict # label -> paste URL (e.g. {"Report": "...", "agent.log": "..."})
failures: list # human-readable "label: error" strings for optional uploads
redacted: bool # whether force-mode redaction was applied before upload
auto_delete_seconds: int # how long until the pastes auto-delete
report: str = "" # the summary report text (kept for local fallback)
def build_debug_share(
*,
log_lines: int = 200,
expiry: int = 7,
redact: bool = True,
) -> DebugShareResult:
"""Collect the debug report + full logs, upload each, return the URLs.
This is the shared core behind ``hermes debug share`` (CLI) and the
dashboard ``POST /api/ops/debug-share`` endpoint. It performs blocking
network I/O (paste uploads) — callers inside an event loop must run it in
a worker thread.
The summary report upload is required: on failure this raises
``RuntimeError``. Full-log uploads are best-effort; their errors are
collected into ``failures`` rather than raised.
"""
_best_effort_sweep_expired_pastes()
log_lines = getattr(args, "lines", 200)
expiry = getattr(args, "expire", 7)
local_only = getattr(args, "local", False)
redact = not getattr(args, "no_redact", False)
if not local_only:
print(_PRIVACY_NOTICE)
print("Collecting debug report...")
# Capture dump once — prepended to every paste for context.
# The dump is already redacted at extract time via dump.py:_redact;
# log_snapshots are redacted by _capture_default_log_snapshots when
@@ -639,71 +660,112 @@ def run_debug_share(args):
if desktop_log:
desktop_log = _REDACTION_BANNER + desktop_log
if local_only:
print(report)
if agent_log:
print(f"\n\n{'=' * 60}")
print("FULL agent.log")
print(f"{'=' * 60}\n")
print(agent_log)
if gateway_log:
print(f"\n\n{'=' * 60}")
print("FULL gateway.log")
print(f"{'=' * 60}\n")
print(gateway_log)
if desktop_log:
print(f"\n\n{'=' * 60}")
print("FULL desktop.log")
print(f"{'=' * 60}\n")
print(desktop_log)
return
print("Uploading...")
urls: dict[str, str] = {}
failures: list[str] = []
# 1. Summary report (required)
# 1. Summary report (required — raises on failure so callers can fall back)
urls["Report"] = upload_to_pastebin(report, expiry_days=expiry)
# 2-4. Full logs (optional — failures are collected, not raised)
for label, content in (
("agent.log", agent_log),
("gateway.log", gateway_log),
("desktop.log", desktop_log),
):
if not content:
continue
try:
urls[label] = upload_to_pastebin(content, expiry_days=expiry)
except Exception as exc:
failures.append(f"{label}: {exc}")
# Schedule auto-deletion after 6 hours.
_schedule_auto_delete(list(urls.values()))
return DebugShareResult(
urls=urls,
failures=failures,
redacted=redact,
auto_delete_seconds=_AUTO_DELETE_SECONDS,
report=report,
)
def run_debug_share(args):
"""Collect debug report + full logs, upload each, print URLs."""
log_lines = getattr(args, "lines", 200)
expiry = getattr(args, "expire", 7)
local_only = getattr(args, "local", False)
redact = not getattr(args, "no_redact", False)
if local_only:
# Local-only path never uploads — render the report to stdout and bail
# before any network I/O. Mirrors the upload path's collection logic.
_best_effort_sweep_expired_pastes()
print("Collecting debug report...")
dump_text = _capture_dump()
log_snapshots = _capture_default_log_snapshots(log_lines, redact=redact)
report = collect_debug_report(
log_lines=log_lines,
dump_text=dump_text,
log_snapshots=log_snapshots,
)
agent_log = log_snapshots["agent"].full_text
gateway_log = log_snapshots["gateway"].full_text
desktop_log = log_snapshots["desktop"].full_text
if agent_log:
agent_log = dump_text + "\n\n--- full agent.log ---\n" + agent_log
if gateway_log:
gateway_log = dump_text + "\n\n--- full gateway.log ---\n" + gateway_log
if desktop_log:
desktop_log = dump_text + "\n\n--- full desktop.log ---\n" + desktop_log
if redact:
report = _REDACTION_BANNER + report
if agent_log:
agent_log = _REDACTION_BANNER + agent_log
if gateway_log:
gateway_log = _REDACTION_BANNER + gateway_log
if desktop_log:
desktop_log = _REDACTION_BANNER + desktop_log
print(report)
for title, body in (
("FULL agent.log", agent_log),
("FULL gateway.log", gateway_log),
("FULL desktop.log", desktop_log),
):
if body:
print(f"\n\n{'=' * 60}")
print(title)
print(f"{'=' * 60}\n")
print(body)
return
print(_PRIVACY_NOTICE)
print("Collecting debug report...")
print("Uploading...")
try:
urls["Report"] = upload_to_pastebin(report, expiry_days=expiry)
result = build_debug_share(
log_lines=log_lines,
expiry=expiry,
redact=redact,
)
except RuntimeError as exc:
print(f"\nUpload failed: {exc}", file=sys.stderr)
print("\nFull report printed below — copy-paste it manually:\n")
print(report)
print("\nRun `hermes debug share --local` to print the report instead.\n")
sys.exit(1)
# 2. Full agent.log (optional)
if agent_log:
try:
urls["agent.log"] = upload_to_pastebin(agent_log, expiry_days=expiry)
except Exception as exc:
failures.append(f"agent.log: {exc}")
# 3. Full gateway.log (optional)
if gateway_log:
try:
urls["gateway.log"] = upload_to_pastebin(gateway_log, expiry_days=expiry)
except Exception as exc:
failures.append(f"gateway.log: {exc}")
# 4. Full desktop.log (optional — Electron app boot + backend output)
if desktop_log:
try:
urls["desktop.log"] = upload_to_pastebin(desktop_log, expiry_days=expiry)
except Exception as exc:
failures.append(f"desktop.log: {exc}")
# Print results
label_width = max(len(k) for k in urls)
label_width = max(len(k) for k in result.urls)
print(f"\nDebug report uploaded:")
for label, url in urls.items():
for label, url in result.urls.items():
print(f" {label:<{label_width}} {url}")
if failures:
print(f"\n (failed to upload: {', '.join(failures)})")
if result.failures:
print(f"\n (failed to upload: {', '.join(result.failures)})")
# Schedule auto-deletion after 6 hours
_schedule_auto_delete(list(urls.values()))
print(f"\n⏱ Pastes will auto-delete in 6 hours.")
hours = result.auto_delete_seconds // 3600
print(f"\n⏱ Pastes will auto-delete in {hours} hours.")
# Manual delete fallback
print(f"To delete now: hermes debug delete <url>")
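
The split between a required summary upload and best-effort log uploads can be sketched without any network access. Everything here is illustrative: `ShareResult`, `share`, and `fake_upload` are hypothetical stand-ins, and the `paste.invalid` host is a placeholder rather than a real service.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ShareResult:
    urls: Dict[str, str] = field(default_factory=dict)
    failures: List[str] = field(default_factory=list)


def share(report: str, logs: Dict[str, str], upload: Callable[[str], str]) -> ShareResult:
    result = ShareResult()
    # Required upload: no try/except, so the caller sees the failure.
    result.urls["Report"] = upload(report)
    # Optional uploads: collect errors instead of raising.
    for label, content in logs.items():
        if not content:
            continue
        try:
            result.urls[label] = upload(content)
        except Exception as exc:
            result.failures.append(f"{label}: {exc}")
    return result


def fake_upload(content: str) -> str:
    """Stand-in uploader: fails for content containing 'boom'."""
    if "boom" in content:
        raise RuntimeError("paste service unavailable")
    return f"https://paste.invalid/{len(content)}"


ok = share(
    "summary",
    {"agent.log": "lines", "gateway.log": "boom", "desktop.log": ""},
    fake_upload,
)

try:
    share("boom", {}, fake_upload)
    required_failure_raised = False
except RuntimeError:
    required_failure_raised = True
```

Returning a structured result rather than printing lets the CLI format the URLs as text while a web endpoint returns the same data as JSON.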

View File

@@ -308,6 +308,29 @@ def get_startup_entry_path() -> Path:
return _startup_dir() / f"{_sanitize_filename(get_task_name())}.cmd"
# ---------------------------------------------------------------------------
# Stable working directory
# ---------------------------------------------------------------------------
def _stable_gateway_working_dir(project_root: Path) -> str:
    """Return a stable cwd for detached/startup gateway runs.

    Mirror the POSIX service invariant: anchor at ``HERMES_HOME`` whenever it
    exists so Scheduled Task / Startup launches do not fail at the ``cd`` step
    after a transient checkout or worktree is moved away. Fall back to the
    source checkout only if ``HERMES_HOME`` cannot be resolved yet.
    """
    from hermes_cli.config import get_hermes_home

    try:
        home = get_hermes_home()
        if home and Path(home).is_dir():
            return str(Path(home).resolve())
    except Exception:
        pass
    return str(project_root)
# ---------------------------------------------------------------------------
# Script rendering
# ---------------------------------------------------------------------------
@@ -321,7 +344,7 @@ def _build_gateway_cmd_script(
"""Build the ``gateway.cmd`` wrapper content (CRLF-terminated).
The script:
- cd's into the project directory
- cd's into a stable working directory
- exports HERMES_HOME, PYTHONIOENCODING, VIRTUAL_ENV
- invokes ``pythonw -m hermes_cli.main [--profile X] gateway run``
directly so the wrapper cmd.exe exits without a visible gateway console
@@ -380,7 +403,7 @@ def _write_task_script() -> Path:
)
python_path = get_python_path()
working_dir = str(PROJECT_ROOT)
working_dir = _stable_gateway_working_dir(PROJECT_ROOT)
hermes_home = str(Path(get_hermes_home()).resolve())
profile_arg = _profile_arg(hermes_home)
@@ -547,7 +570,8 @@ def _build_gateway_argv() -> tuple[list[str], str, dict[str, str]]:
)
python_exe, venv_dir, extra_pythonpath = _resolve_detached_python(get_python_path())
working_dir = str(PROJECT_ROOT)
project_root = str(PROJECT_ROOT)
working_dir = _stable_gateway_working_dir(PROJECT_ROOT)
hermes_home = str(Path(get_hermes_home()).resolve())
profile_arg = _profile_arg(hermes_home)
@@ -562,7 +586,7 @@ def _build_gateway_argv() -> tuple[list[str], str, dict[str, str]]:
"HERMES_GATEWAY_DETACHED": "1",
"VIRTUAL_ENV": str(venv_dir),
}
_prepend_pythonpath(env_overlay, [working_dir, *extra_pythonpath] if extra_pythonpath else [])
_prepend_pythonpath(env_overlay, [project_root, *extra_pythonpath] if extra_pythonpath else [project_root])
return argv, working_dir, env_overlay
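
The two decisions in this file's changes, where to `cd` and what to put on `PYTHONPATH`, are independent, and the sketch below keeps them apart. It is a simplified illustration: `stable_working_dir` takes the home directory as a parameter instead of calling `get_hermes_home()`, and `pythonpath_entries` is a hypothetical helper.

```python
import tempfile
from pathlib import Path
from typing import List, Optional


def stable_working_dir(home: Optional[Path], project_root: Path) -> str:
    """Prefer an existing home directory; otherwise use the checkout."""
    try:
        if home and home.is_dir():
            return str(home.resolve())
    except OSError:
        pass
    return str(project_root)


def pythonpath_entries(project_root: Path, extra: List[str]) -> List[str]:
    # The import root stays the checkout even when the cwd is the home dir.
    return [str(project_root), *extra]


with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp) / "hermes_home"
    home.mkdir()
    checkout = Path(tmp) / "moved-away-worktree"  # intentionally never created
    anchored = stable_working_dir(home, checkout)
    expected_anchor = str(home.resolve())
    fallback = stable_working_dir(Path(tmp) / "missing", checkout)
    expected_fallback = str(checkout)
    entries = pythonpath_entries(checkout, ["site-extra"])
    expected_entries = [str(checkout), "site-extra"]
```

Once the working directory no longer equals the project root, the project root has to be added to `PYTHONPATH` explicitly, which is what the changed `_prepend_pythonpath` call does.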

View File

@@ -1016,6 +1016,51 @@ async def run_config_migrate():
return {"ok": True, "pid": proc.pid, "name": "config-migrate"}
class DebugShareRequest(BaseModel):
# Redaction is ON by default — force-mode scrubs credential-shaped tokens
# out of log content before it leaves the machine. The toggle exists so an
# operator who knows the logs are clean can opt out for fuller fidelity.
redact: bool = True
# Recent log lines included in the summary tail (full logs are separate).
lines: int = 200
@app.post("/api/ops/debug-share")
async def run_debug_share_endpoint(body: DebugShareRequest | None = None):
"""Upload a redacted debug report + full logs and return the paste URLs.
Unlike the other diagnostics actions (doctor, dump, prompt-size) this is
*synchronous*: the whole point of ``debug share`` is the set of shareable
URLs it produces, so we run the upload in a worker thread and return the
structured ``{urls, failures, redacted, ...}`` payload directly. The
dashboard renders those as real, copyable links instead of scraping a log
tail. Pastes auto-delete after 6 hours (handled inside the share core).
"""
from hermes_cli.debug import build_debug_share
req = body or DebugShareRequest()
try:
result = await asyncio.to_thread(
build_debug_share,
log_lines=max(1, min(int(req.lines), 5000)),
redact=bool(req.redact),
)
except RuntimeError as exc:
# Required summary-report upload failed (offline / paste service down).
raise HTTPException(status_code=502, detail=f"Upload failed: {exc}")
except Exception as exc:
_log.exception("debug share failed")
raise HTTPException(status_code=500, detail=f"Failed: {exc}")
return {
"ok": True,
"urls": result.urls,
"failures": result.failures,
"redacted": result.redacted,
"auto_delete_seconds": result.auto_delete_seconds,
}
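
The endpoint's pattern of clamping user input and then running a blocking call off the event loop can be sketched with the standard library alone. `clamp_lines`, `blocking_share`, and `handler` are hypothetical names; the 1 to 5000 bounds are taken from the code above, and `asyncio.to_thread` requires Python 3.9 or newer.

```python
import asyncio
import time


def clamp_lines(requested: int, low: int = 1, high: int = 5000) -> int:
    """Keep the requested line count inside [low, high]."""
    return max(low, min(int(requested), high))


def blocking_share(log_lines: int) -> dict:
    """Stand-in for blocking network I/O."""
    time.sleep(0.01)
    return {"ok": True, "lines": log_lines}


async def handler(requested_lines: int) -> dict:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(
        blocking_share, log_lines=clamp_lines(requested_lines)
    )


result = asyncio.run(handler(999_999))
```

The handler still awaits the final result before responding, so the caller gets the URLs in the response body, while other requests continue to be served during the upload.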
# ---------------------------------------------------------------------------
# Gateway + update actions (invoked from the Status page).
#

View File

@@ -5,6 +5,7 @@ without risk of circular imports.
"""
import os
import sys
import sysconfig
from contextvars import ContextVar, Token
from pathlib import Path
@@ -40,17 +41,26 @@ def get_hermes_home_override() -> str | None:
return str(override)
def get_hermes_home() -> Path:
"""Return the Hermes home directory (default: ~/.hermes).
def _get_platform_default_hermes_home() -> Path:
"""Return the platform-native default Hermes home path."""
if sys.platform == "win32":
local_appdata = os.environ.get("LOCALAPPDATA", "").strip()
base = Path(local_appdata) if local_appdata else Path.home() / "AppData" / "Local"
return base / "hermes"
return Path.home() / ".hermes"
Reads HERMES_HOME env var, falls back to ~/.hermes.
def get_hermes_home() -> Path:
"""Return the Hermes home directory (default: platform-native path).
Reads HERMES_HOME env var, falls back to the platform-native default.
This is the single source of truth — all other copies should import this.
When ``HERMES_HOME`` is unset but an ``active_profile`` file indicates
a non-default profile is active, logs a loud one-shot warning to
``errors.log`` so cross-profile data corruption is diagnosable instead
of silent. Behavior is unchanged otherwise — we still return
``~/.hermes`` — because raising here would brick 30+ module-level
the platform-native default — because raising here would brick 30+ module-level
callers that import this at load time. Subprocess spawners are
expected to propagate ``HERMES_HOME`` explicitly (see the systemd
template in ``hermes_cli/gateway.py`` and the kanban dispatcher in
@@ -69,10 +79,8 @@ def get_hermes_home() -> Path:
global _profile_fallback_warned
if not _profile_fallback_warned:
try:
# Inline the default-root resolution from get_default_hermes_root()
# to stay import-safe (this function is called from module scope
# in 30+ files; we cannot afford to trigger logging setup here).
active_path = (Path.home() / ".hermes" / "active_profile")
fallback_home = _get_platform_default_hermes_home()
active_path = fallback_home / "active_profile"
active = active_path.read_text().strip() if active_path.exists() else ""
except (UnicodeDecodeError, OSError):
active = ""
@@ -83,10 +91,9 @@ def get_hermes_home() -> Path:
# module-import time from 30+ sites, often before logging is
# configured, and (b) root-logger propagation would double-emit
# on consoles where a StreamHandler is already attached.
import sys
msg = (
f"[HERMES_HOME fallback] HERMES_HOME is unset but active "
f"profile is {active!r}. Falling back to ~/.hermes, which "
f"profile is {active!r}. Falling back to {fallback_home}, which "
f"is the DEFAULT profile — not {active!r}. Any data this "
f"process writes will land in the wrong profile. The "
f"subprocess spawner should pass HERMES_HOME explicitly "
@@ -98,13 +105,14 @@ def get_hermes_home() -> Path:
except Exception:
pass
return Path.home() / ".hermes"
return _get_platform_default_hermes_home()
def get_default_hermes_root() -> Path:
"""Return the root Hermes directory for profile-level operations.
In standard deployments this is ``~/.hermes``.
In standard deployments this is the platform-native Hermes home
(``~/.hermes`` on POSIX, ``%LOCALAPPDATA%\\hermes`` on native Windows).
In Docker or custom deployments where ``HERMES_HOME`` points outside
``~/.hermes`` (e.g. ``/opt/data``), returns ``HERMES_HOME`` directly
@@ -117,7 +125,7 @@ def get_default_hermes_root() -> Path:
Import-safe — no dependencies beyond stdlib.
"""
native_home = Path.home() / ".hermes"
native_home = _get_platform_default_hermes_home()
env_home = os.environ.get("HERMES_HOME", "")
if not env_home:
return native_home
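
The platform-native default can be made testable on any operating system by passing the platform, environment, and user home in as arguments and using pure paths. This is a sketch, not the module's code: `default_home` and its parameters are hypothetical, and the real helper reads `sys.platform`, `os.environ`, and `Path.home()` directly.

```python
from pathlib import PurePath, PurePosixPath, PureWindowsPath
from typing import Mapping


def default_home(platform: str, env: Mapping[str, str], user_home: str) -> PurePath:
    """Return the default Hermes home for the given platform."""
    if platform == "win32":
        local_appdata = env.get("LOCALAPPDATA", "").strip()
        if local_appdata:
            base = PureWindowsPath(local_appdata)
        else:
            # LOCALAPPDATA unset or blank: derive it from the user profile.
            base = PureWindowsPath(user_home) / "AppData" / "Local"
        return base / "hermes"
    return PurePosixPath(user_home) / ".hermes"


posix = default_home("linux", {}, "/home/ada")
win_env = default_home(
    "win32", {"LOCALAPPDATA": r"C:\Users\ada\AppData\Local"}, r"C:\Users\ada"
)
win_blank = default_home("win32", {"LOCALAPPDATA": "  "}, r"C:\Users\ada")
```

Treating a whitespace-only `LOCALAPPDATA` as unset avoids building a path relative to the current directory.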

View File

@@ -0,0 +1,67 @@
#!/usr/bin/env python3
"""Run Docker boot-time config migrations safely."""
from __future__ import annotations
import shutil
import sys
from datetime import datetime, timezone
from pathlib import Path
from typing import Iterable
from hermes_cli.config import (
check_config_version,
get_config_path,
get_env_path,
migrate_config,
)
from utils import env_var_enabled
def _backup_path(path: Path, stamp: str) -> Path:
base = path.with_name(f"{path.name}.bak-{stamp}")
if not base.exists():
return base
for index in range(1, 1000):
candidate = path.with_name(f"{path.name}.bak-{stamp}.{index}")
if not candidate.exists():
return candidate
raise RuntimeError(f"could not choose a backup path for {path}")
def _backup_existing(paths: Iterable[Path]) -> list[Path]:
stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
backups: list[Path] = []
for path in paths:
if not path.is_file():
continue
dest = _backup_path(path, stamp)
shutil.copy2(path, dest)
backups.append(dest)
return backups
def main() -> int:
if env_var_enabled("HERMES_SKIP_CONFIG_MIGRATION"):
print("[config-migrate] HERMES_SKIP_CONFIG_MIGRATION is set; skipping config migration")
return 0
current_ver, latest_ver = check_config_version()
if current_ver >= latest_ver:
return 0
backups = _backup_existing((get_config_path(), get_env_path()))
backup_text = ", ".join(str(path) for path in backups) if backups else "none"
print(
f"[config-migrate] Migrating config schema {current_ver} -> {latest_ver}; "
f"backups: {backup_text}"
)
migrate_config(interactive=False, quiet=False)
return 0
if __name__ == "__main__":
try:
raise SystemExit(main())
except Exception as exc:
print(f"[config-migrate] ERROR: {exc}", file=sys.stderr)
raise SystemExit(1)
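
The naming scheme used for pre-migration backups can be seen by running the path-selection logic against a temporary directory. The sketch below mirrors `_backup_path` under a hypothetical name, `pick_backup_path`, with a fixed timestamp so that the collision case is reproducible.

```python
import tempfile
from pathlib import Path


def pick_backup_path(path: Path, stamp: str, max_tries: int = 1000) -> Path:
    """Return a backup path that does not exist yet."""
    base = path.with_name(f"{path.name}.bak-{stamp}")
    if not base.exists():
        return base
    # Same-second collision: append .1, .2, ... until a free name is found.
    for index in range(1, max_tries):
        candidate = path.with_name(f"{path.name}.bak-{stamp}.{index}")
        if not candidate.exists():
            return candidate
    raise RuntimeError(f"could not choose a backup path for {path}")


with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "config.yaml"
    cfg.write_text("model: x\n")
    stamp = "20260101T000000Z"
    first = pick_backup_path(cfg, stamp)
    first.write_text(cfg.read_text())  # simulate the first backup landing
    second = pick_backup_path(cfg, stamp)
    first_name, second_name = first.name, second.name
```

Two container restarts within the same second therefore produce distinct backup files instead of overwriting each other.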

View File

@@ -38,7 +38,13 @@ def test_tty_passthrough_to_container(built_image: str) -> None:
assert "NO_TTY" not in output, f"TTY passthrough failed: {output!r}"
numeric_lines = [s for s in output.split() if s.strip().isdigit()]
assert numeric_lines, f"No numeric width in output: {output!r}"
assert int(numeric_lines[0]) > 0
# The container saw a real TTY (``[ -t 1 ]`` was true, so we got a number
# rather than ``NO_TTY``) and terminfo resolved ``tput cols``. That is the
# passthrough contract this test guards. The reported width itself depends
# on the host PTY's window-size ioctl: a ``script(1)``-allocated PTY on a
# headless CI runner has a 0x0 window, so ``tput cols`` legitimately prints
# ``0`` while still being a TTY. Assert non-negative, not strictly positive.
assert int(numeric_lines[0]) >= 0
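
The parsing step behind the loosened assertion can be isolated as a small function. This is a sketch with a hypothetical name, `parse_tty_width`; it applies the same `split()` and `isdigit()` filter as the test above.

```python
def parse_tty_width(output: str) -> int:
    """Extract the first numeric token from captured container output."""
    if "NO_TTY" in output:
        raise AssertionError(f"TTY passthrough failed: {output!r}")
    numeric = [s for s in output.split() if s.strip().isdigit()]
    if not numeric:
        raise AssertionError(f"No numeric width in output: {output!r}")
    return int(numeric[0])


headless_width = parse_tty_width("0\r\n")               # 0x0 PTY on a CI runner
desktop_width = parse_tty_width("Script started\n80\n")  # ordinary terminal

try:
    parse_tty_width("NO_TTY\n")
    no_tty_rejected = False
except AssertionError:
    no_tty_rejected = True
```

Because `str.isdigit()` rejects a leading minus sign, any token that passes the filter is already non-negative. The `>= 0` check therefore documents intent, and the real guarantees come from the `NO_TTY` guard and the presence of a numeric token.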
def test_tui_flag_recognized(built_image: str) -> None:

View File

@@ -9,6 +9,7 @@ import yaml
from hermes_cli.config import (
DEFAULT_CONFIG,
check_config_version,
get_hermes_home,
ensure_hermes_home,
get_compatible_custom_providers,
@@ -156,6 +157,70 @@ class TestLoadConfigParseFailure:
after_edit = capsys.readouterr().err
assert "hermes config:" in after_edit, "edited file should re-warn"
def test_corrupt_config_is_backed_up(self, tmp_path, capsys):
"""A broken config.yaml is snapshotted to a timestamped .bak so the
user's recoverable overrides survive a later wizard/config-set rewrite.
Ported from google-gemini/gemini-cli#21541 (policy-file TOML recovery),
adapted: we back up but deliberately do NOT reset config.yaml.
"""
from hermes_cli import config as cfg_mod
cfg_mod._CONFIG_PARSE_WARNED.clear()
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
broken = "\tmodel: test/custom\nbroken indent:\n"
(tmp_path / "config.yaml").write_text(broken)
load_config()
err = capsys.readouterr().err
baks = list(tmp_path.glob("config.yaml.corrupt.*.bak"))
assert len(baks) == 1, f"expected one backup, got {baks}"
# Backup preserves the original broken content verbatim
assert baks[0].read_text() == broken
# Original config.yaml is left untouched (not reset to clean state)
assert (tmp_path / "config.yaml").read_text() == broken
# User is told where the backup landed
assert str(baks[0]) in err
def test_backup_skips_when_same_size_bak_exists(self, tmp_path, capsys):
"""Don't churn backups: if a corrupt backup of the same size already
exists (same corruption already preserved), skip making another."""
from hermes_cli import config as cfg_mod
cfg_mod._CONFIG_PARSE_WARNED.clear()
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
broken = "\tbroken:\n"
cfg = tmp_path / "config.yaml"
cfg.write_text(broken)
# Pre-existing backup of identical size simulates an earlier snapshot.
(tmp_path / "config.yaml.corrupt.20260101-000000.bak").write_text(broken)
load_config()
baks = list(tmp_path.glob("config.yaml.corrupt.*.bak"))
assert len(baks) == 1, f"should not add a second same-size backup, got {baks}"
def test_corrupt_symlink_config_not_backed_up(self, tmp_path):
"""Symlinked config.yaml is not copied (mirrors Gemini #21541 lstat
guard) — avoids clobbering whatever the symlink points at."""
import sys as _sys
if _sys.platform == "win32":
pytest.skip("symlink creation requires privileges on Windows")
from hermes_cli import config as cfg_mod
cfg_mod._CONFIG_PARSE_WARNED.clear()
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
real = tmp_path / "real_config.yaml"
real.write_text("\tbroken:\n")
link = tmp_path / "config.yaml"
link.symlink_to(real)
load_config()
assert not list(tmp_path.glob("config.yaml.corrupt.*.bak"))
class TestSaveAndLoadRoundtrip:
def test_roundtrip(self, tmp_path):
@@ -542,6 +607,28 @@ class TestConfigMigrationSecretPrompts:
assert results["env_added"] == ["TEST_API_KEY"]
class TestConfigVersionDetection:
def test_check_config_version_uses_raw_on_disk_version(self, tmp_path):
config_path = tmp_path / "config.yaml"
config_path.write_text("model: {}\n", encoding="utf-8")
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
assert load_config()["_config_version"] == DEFAULT_CONFIG["_config_version"]
assert check_config_version() == (0, DEFAULT_CONFIG["_config_version"])
def test_check_config_version_treats_missing_file_as_current(self, tmp_path):
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
latest = DEFAULT_CONFIG["_config_version"]
assert check_config_version() == (latest, latest)
def test_check_config_version_does_not_migrate_invalid_yaml(self, tmp_path):
(tmp_path / "config.yaml").write_text("model: [unterminated\n", encoding="utf-8")
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
latest = DEFAULT_CONFIG["_config_version"]
assert check_config_version() == (latest, latest)
class TestAnthropicTokenMigration:
"""Test that config version 8→9 clears ANTHROPIC_TOKEN."""
@@ -904,4 +991,3 @@ class TestEnvWriteDenylist:
# But the write path still refuses to update it
with pytest.raises(ValueError, match="denylist"):
save_env_value("LD_PRELOAD", "/tmp/evil.so")
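
Reviewer sketch of the backup policy the three `TestLoadConfigParseFailure` cases describe: skip symlinks, skip when a same-size corrupt backup already exists, otherwise copy to a timestamped `.bak` and leave the original alone. `backup_corrupt_config` is a hypothetical name; the timestamp format is an assumption modeled on the `20260101-000000` name used in the dedup test.

```python
import shutil
import time
from pathlib import Path


def backup_corrupt_config(config_path: Path):
    """Snapshot a broken config next to itself; return the backup path or None."""
    # Symlink guard: never copy through a link.
    if config_path.is_symlink() or not config_path.is_file():
        return None
    size = config_path.stat().st_size
    # Size dedup: an existing backup of equal size is assumed to hold the same corruption.
    for existing in config_path.parent.glob(f"{config_path.name}.corrupt.*.bak"):
        if existing.stat().st_size == size:
            return None
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = config_path.with_name(f"{config_path.name}.corrupt.{stamp}.bak")
    shutil.copy2(config_path, target)  # the original file is not modified
    return target
```

Comparing sizes rather than hashing is cheap but approximate: two different corruptions of equal length would be treated as one.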


@@ -498,3 +498,96 @@ class TestUpdateCheckEndpoint:
assert body["update_available"] is False
assert body["message"]
class TestDebugShareEndpoint:
"""POST /api/ops/debug-share returns the paste URLs synchronously so the
dashboard can render them as copyable links (not a backgrounded log tail)."""
@pytest.fixture(autouse=True)
def _setup(self, _isolate_hermes_home):
self.client, self.header = _client()
from hermes_constants import get_hermes_home
logs = get_hermes_home() / "logs"
logs.mkdir(parents=True, exist_ok=True)
(logs / "agent.log").write_text("agent line\n")
(logs / "errors.log").write_text("err line\n")
(logs / "gateway.log").write_text("gw line\n")
def test_returns_structured_urls(self, monkeypatch):
import hermes_cli.debug as dbg
count = [0]
def _upload(content, expiry_days=7):
count[0] += 1
return f"https://paste.rs/p{count[0]}"
monkeypatch.setattr(dbg, "upload_to_pastebin", _upload)
monkeypatch.setattr(dbg, "_schedule_auto_delete", lambda *a, **k: None)
monkeypatch.setattr(dbg, "_best_effort_sweep_expired_pastes", lambda: None)
monkeypatch.setattr("hermes_cli.dump.run_dump", lambda a: None)
r = self.client.post("/api/ops/debug-share", json={"redact": True})
assert r.status_code == 200
body = r.json()
assert body["ok"] is True
assert "Report" in body["urls"]
assert body["redacted"] is True
assert body["auto_delete_seconds"] == 21600
assert isinstance(body["failures"], list)
def test_redact_false_is_honored(self, monkeypatch):
import hermes_cli.debug as dbg
monkeypatch.setattr(
dbg, "upload_to_pastebin", lambda c, expiry_days=7: "https://paste.rs/x"
)
monkeypatch.setattr(dbg, "_schedule_auto_delete", lambda *a, **k: None)
monkeypatch.setattr(dbg, "_best_effort_sweep_expired_pastes", lambda: None)
monkeypatch.setattr("hermes_cli.dump.run_dump", lambda a: None)
r = self.client.post("/api/ops/debug-share", json={"redact": False})
assert r.status_code == 200
assert r.json()["redacted"] is False
def test_default_body_redacts(self, monkeypatch):
import hermes_cli.debug as dbg
monkeypatch.setattr(
dbg, "upload_to_pastebin", lambda c, expiry_days=7: "https://paste.rs/x"
)
monkeypatch.setattr(dbg, "_schedule_auto_delete", lambda *a, **k: None)
monkeypatch.setattr(dbg, "_best_effort_sweep_expired_pastes", lambda: None)
monkeypatch.setattr("hermes_cli.dump.run_dump", lambda a: None)
# No JSON body at all — should default redact=True.
r = self.client.post("/api/ops/debug-share")
assert r.status_code == 200
assert r.json()["redacted"] is True
def test_upload_failure_returns_502(self, monkeypatch):
import hermes_cli.debug as dbg
monkeypatch.setattr(
dbg,
"upload_to_pastebin",
lambda c, expiry_days=7: (_ for _ in ()).throw(RuntimeError("down")),
)
monkeypatch.setattr(dbg, "_schedule_auto_delete", lambda *a, **k: None)
monkeypatch.setattr(dbg, "_best_effort_sweep_expired_pastes", lambda: None)
monkeypatch.setattr("hermes_cli.dump.run_dump", lambda a: None)
r = self.client.post("/api/ops/debug-share", json={"redact": True})
assert r.status_code == 502
def test_requires_session_token(self):
# Send a wrong token and confirm the global auth gate rejects it.
bare = self.client
r = bare.post(
"/api/ops/debug-share",
json={"redact": True},
headers={self.header: "wrong-token"},
)
assert r.status_code == 401
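
Reviewer note on the `(_ for _ in ()).throw(...)` idiom used in `test_upload_failure_returns_502`: a lambda body must be an expression, so it cannot contain a `raise` statement, and calling `throw()` on a fresh generator is one way to raise from an expression. The sketch below shows that idiom next to a plain closure; both helper names are hypothetical.

```python
def raise_via_generator(exc):
    # throw() on a generator that has not started raises exc at the call site.
    return (_ for _ in ()).throw(exc)


def make_raiser(exc):
    """Return a callable that raises exc regardless of its arguments."""
    def _raise(*args, **kwargs):
        raise exc
    return _raise
```

A named raiser such as `make_raiser(RuntimeError("down"))` could be passed to `monkeypatch.setattr` in place of the lambda and may read more plainly.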


@@ -1273,3 +1273,110 @@ class TestShareIncludesAutoDelete:
out = capsys.readouterr().out
assert "public paste service" not in out
# ---------------------------------------------------------------------------
# build_debug_share — structured core used by the dashboard endpoint
# ---------------------------------------------------------------------------
class TestBuildDebugShare:
"""The shared core that returns structured paste URLs (not printed text).
Backs both ``hermes debug share`` (CLI) and ``POST /api/ops/debug-share``
(dashboard). The dashboard renders ``urls`` as real, copyable links, so the
contract here is the return value, not stdout.
"""
def test_returns_structured_urls(self, hermes_home):
from hermes_cli.debug import build_debug_share, DebugShareResult
count = [0]
def _upload(content, expiry_days=7):
count[0] += 1
return f"https://paste.rs/p{count[0]}"
with patch("hermes_cli.dump.run_dump"), patch(
"hermes_cli.debug.upload_to_pastebin", side_effect=_upload
), patch("hermes_cli.debug._schedule_auto_delete"):
result = build_debug_share(log_lines=50, redact=True)
assert isinstance(result, DebugShareResult)
# The seeded logs asserted below (agent/gateway/desktop) + the summary report.
assert "Report" in result.urls
assert "agent.log" in result.urls
assert "gateway.log" in result.urls
assert "desktop.log" in result.urls
assert result.failures == []
assert result.redacted is True
assert result.auto_delete_seconds == 21600
def test_skips_missing_logs_without_failure(self, hermes_home):
from hermes_cli.debug import build_debug_share
# Remove desktop.log so it should be neither uploaded nor reported failed.
(hermes_home / "logs" / "desktop.log").unlink()
with patch("hermes_cli.dump.run_dump"), patch(
"hermes_cli.debug.upload_to_pastebin",
side_effect=lambda c, expiry_days=7: "https://paste.rs/x",
), patch("hermes_cli.debug._schedule_auto_delete"):
result = build_debug_share(log_lines=50, redact=True)
assert "desktop.log" not in result.urls
assert result.failures == []
def test_redaction_keeps_secrets_out_of_payload(self, hermes_home):
from hermes_cli.debug import build_debug_share
secret = "sk-proj-SUPERSECRETtoken1234567890"
(hermes_home / "logs" / "agent.log").write_text(
f"line one\nauthorization token={secret}\nline three\n"
)
uploaded = []
def _upload(content, expiry_days=7):
uploaded.append(content)
return "https://paste.rs/x"
with patch("hermes_cli.dump.run_dump"), patch(
"hermes_cli.debug.upload_to_pastebin", side_effect=_upload
), patch("hermes_cli.debug._schedule_auto_delete"):
result = build_debug_share(log_lines=50, redact=True)
assert result.redacted is True
joined = "\n".join(uploaded)
assert secret not in joined, "secret leaked into upload payload"
def test_optional_log_failure_is_collected_not_raised(self, hermes_home):
from hermes_cli.debug import build_debug_share
count = [0]
def _upload(content, expiry_days=7):
count[0] += 1
# First call (the required Report) succeeds; a later one fails.
if count[0] == 2:
raise RuntimeError("paste service hiccup")
return f"https://paste.rs/p{count[0]}"
with patch("hermes_cli.dump.run_dump"), patch(
"hermes_cli.debug.upload_to_pastebin", side_effect=_upload
), patch("hermes_cli.debug._schedule_auto_delete"):
result = build_debug_share(log_lines=50, redact=True)
assert "Report" in result.urls
assert len(result.failures) == 1
assert "paste service hiccup" in result.failures[0]
def test_required_report_failure_raises(self, hermes_home):
from hermes_cli.debug import build_debug_share
with patch("hermes_cli.dump.run_dump"), patch(
"hermes_cli.debug.upload_to_pastebin",
side_effect=RuntimeError("all paste services down"),
), patch("hermes_cli.debug._schedule_auto_delete"):
with pytest.raises(RuntimeError, match="all paste services down"):
build_debug_share(log_lines=50, redact=True)
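
Reviewer sketch of the contract `TestBuildDebugShare` pins down: the report upload is required and its failure propagates, while per-log failures are collected into `failures`. `ShareResult` and `share_report_and_logs` are hypothetical stand-ins, not the real `DebugShareResult` or `build_debug_share`, and redaction and auto-delete scheduling are left out.

```python
from dataclasses import dataclass, field


@dataclass
class ShareResult:
    urls: dict = field(default_factory=dict)
    failures: list = field(default_factory=list)


def share_report_and_logs(report, logs, upload):
    """Upload the report (required) and each log (optional)."""
    result = ShareResult()
    # Required: any exception here propagates to the caller.
    result.urls["Report"] = upload(report)
    for name, content in logs.items():
        try:
            result.urls[name] = upload(content)
        except Exception as exc:
            # Optional: record the failure and keep going.
            result.failures.append(f"{name}: {exc}")
    return result
```

Returning a structured result instead of printing is what lets a dashboard endpoint render the URLs as links and map a required-upload failure to an error status.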


@@ -78,9 +78,11 @@ def test_build_gateway_argv_uses_base_pythonw_for_uv_venv_launcher(monkeypatch,
project = tmp_path / "project"
scripts = project / "venv" / "Scripts"
site_packages = project / "venv" / "Lib" / "site-packages"
hermes_home = tmp_path / "hermes-home"
base = tmp_path / "uv" / "python" / "cpython-3.11-windows-x86_64-none"
scripts.mkdir(parents=True)
site_packages.mkdir(parents=True)
hermes_home.mkdir()
base.mkdir(parents=True)
venv_python = scripts / "python.exe"
@@ -99,17 +101,55 @@ def test_build_gateway_argv_uses_base_pythonw_for_uv_venv_launcher(monkeypatch,
monkeypatch.setattr(gateway, "PROJECT_ROOT", project)
monkeypatch.setattr(gateway, "get_python_path", lambda: str(venv_python))
monkeypatch.setattr(gateway, "_profile_arg", lambda hermes_home: "")
monkeypatch.setattr("hermes_cli.config.get_hermes_home", lambda: str(tmp_path / "hermes-home"))
monkeypatch.setattr("hermes_cli.config.get_hermes_home", lambda: str(hermes_home))
argv, cwd, env_overlay = gateway_windows._build_gateway_argv()
assert argv[:3] == [str(base_pythonw), "-m", "hermes_cli.main"]
assert cwd == str(project)
assert cwd == str(hermes_home.resolve())
assert env_overlay["VIRTUAL_ENV"] == str(project / "venv")
assert str(project) in env_overlay["PYTHONPATH"].split(gateway_windows.os.pathsep)
assert str(site_packages) in env_overlay["PYTHONPATH"].split(gateway_windows.os.pathsep)
class TestStableWindowsGatewayWorkingDir:
def test_stable_gateway_working_dir_uses_hermes_home(self, tmp_path, monkeypatch):
home = tmp_path / ".hermes"
home.mkdir()
monkeypatch.setattr("hermes_cli.config.get_hermes_home", lambda: home)
assert gateway_windows._stable_gateway_working_dir(tmp_path / "checkout") == str(home.resolve())
def test_stable_gateway_working_dir_falls_back_to_project_root(self, tmp_path, monkeypatch):
missing = tmp_path / "missing" / ".hermes"
project = tmp_path / "checkout"
monkeypatch.setattr("hermes_cli.config.get_hermes_home", lambda: missing)
assert gateway_windows._stable_gateway_working_dir(project) == str(project)
def test_write_task_script_anchors_cmd_cd_at_hermes_home(monkeypatch, tmp_path):
project = tmp_path / "project"
hermes_home = tmp_path / "hermes-home"
hermes_home.mkdir()
python_exe = project / "venv" / "Scripts" / "python.exe"
python_exe.parent.mkdir(parents=True)
python_exe.write_text("", encoding="utf-8")
script_path = tmp_path / "gateway.cmd"
monkeypatch.setattr(gateway_windows, "_assert_windows", lambda: None)
monkeypatch.setattr(gateway, "PROJECT_ROOT", project)
monkeypatch.setattr(gateway, "get_python_path", lambda: str(python_exe))
monkeypatch.setattr(gateway, "_profile_arg", lambda hermes_home: "")
monkeypatch.setattr("hermes_cli.config.get_hermes_home", lambda: str(hermes_home))
monkeypatch.setattr(gateway_windows, "get_task_script_path", lambda: script_path)
written = gateway_windows._write_task_script()
content = script_path.read_text(encoding="utf-8")
assert written == script_path
assert f"cd /d {gateway_windows._quote_cmd_script_arg(str(hermes_home.resolve()))}" in content
assert f"cd /d {gateway_windows._quote_cmd_script_arg(str(project))}" not in content
def _arrange_startup_fallback(monkeypatch, tmp_path, running_pids):
script_path = tmp_path / "Hermes_Gateway_alice.cmd"
startup_entry = tmp_path / "Startup" / "Hermes_Gateway_alice.cmd"
@@ -741,4 +781,4 @@ def test_drain_helper_still_waits_if_marker_write_fails(monkeypatch):
monkeypatch.setattr(status_mod, "_pid_exists", lambda check_pid: False)
# Returns True because _pid_exists immediately says "gone".
assert gateway_windows._drain_gateway_pid(pid, drain_timeout=5.0) is True
assert gateway_windows._drain_gateway_pid(pid, drain_timeout=5.0) is True
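
Reviewer sketch of the behavior `TestStableWindowsGatewayWorkingDir` asserts: anchor the working directory at the resolved Hermes home when it exists, otherwise fall back to the project root unchanged. `stable_working_dir` is a hypothetical stand-in for `_stable_gateway_working_dir`, which in the real code obtains the home from `get_hermes_home()` rather than a parameter.

```python
from pathlib import Path


def stable_working_dir(hermes_home, project_root):
    """Prefer an existing Hermes home as cwd; otherwise use the project root."""
    home = Path(hermes_home)
    if home.is_dir():
        # Resolve so the cwd does not depend on how the path was spelled.
        return str(home.resolve())
    return str(project_root)
```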


@@ -9,6 +9,7 @@ import hermes_constants
from hermes_constants import (
VALID_REASONING_EFFORTS,
get_default_hermes_root,
get_hermes_home,
is_container,
parse_reasoning_effort,
secure_parent_dir,
@@ -68,6 +69,41 @@ class TestGetDefaultHermesRoot:
monkeypatch.setenv("HERMES_HOME", str(profile))
assert get_default_hermes_root() == docker_root
def test_no_hermes_home_returns_localappdata_root_on_windows(self, tmp_path, monkeypatch):
"""Native Windows falls back to %LOCALAPPDATA%\\hermes, not ~/.hermes."""
local_appdata = tmp_path / "LocalAppData"
monkeypatch.delenv("HERMES_HOME", raising=False)
monkeypatch.setenv("LOCALAPPDATA", str(local_appdata))
monkeypatch.setattr(Path, "home", lambda: tmp_path / "Home")
monkeypatch.setattr(hermes_constants.sys, "platform", "win32")
assert get_default_hermes_root() == local_appdata / "hermes"
def test_no_hermes_home_uses_windows_path_when_localappdata_missing(self, tmp_path, monkeypatch):
"""Windows fallback still uses AppData/Local/hermes without LOCALAPPDATA."""
home = tmp_path / "Home"
monkeypatch.delenv("HERMES_HOME", raising=False)
monkeypatch.delenv("LOCALAPPDATA", raising=False)
monkeypatch.setattr(Path, "home", lambda: home)
monkeypatch.setattr(hermes_constants.sys, "platform", "win32")
assert get_default_hermes_root() == home / "AppData" / "Local" / "hermes"
class TestGetHermesHome:
"""Tests for get_hermes_home() platform-aware fallback."""
def test_windows_fallback_uses_localappdata(self, tmp_path, monkeypatch):
"""When HERMES_HOME is unset on Windows, use %LOCALAPPDATA%\\hermes."""
local_appdata = tmp_path / "LocalAppData"
monkeypatch.delenv("HERMES_HOME", raising=False)
monkeypatch.setenv("LOCALAPPDATA", str(local_appdata))
monkeypatch.setattr(Path, "home", lambda: tmp_path / "Home")
monkeypatch.setattr(hermes_constants.sys, "platform", "win32")
monkeypatch.setattr(hermes_constants, "_profile_fallback_warned", False)
assert get_hermes_home() == local_appdata / "hermes"
class TestIsContainer:
"""Tests for is_container() — Docker/Podman detection."""
@@ -262,4 +298,3 @@ class TestSecureParentDir:
assert len(called_with) == 1
assert called_with[0] == (str(real_dir), 0o700)
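
Reviewer sketch of the Windows fallback the new `TestGetDefaultHermesRoot` cases cover. `default_hermes_root` is a hypothetical pure function taking the platform, environment, and home directory as arguments; the real `get_default_hermes_root()` also honors `HERMES_HOME` and container layouts, which are omitted here. The `~/.hermes` branch follows the test docstring.

```python
from pathlib import Path


def default_hermes_root(platform, environ, home):
    """Pick the default root when HERMES_HOME is unset (simplified)."""
    if platform == "win32":
        local = environ.get("LOCALAPPDATA")
        # Without LOCALAPPDATA, rebuild the conventional path under the home dir.
        base = Path(local) if local else Path(home) / "AppData" / "Local"
        return base / "hermes"
    return Path(home) / ".hermes"
```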


@@ -0,0 +1,119 @@
from __future__ import annotations
import os
import subprocess
import sys
from pathlib import Path
import yaml
from hermes_cli.config import DEFAULT_CONFIG
REPO_ROOT = Path(__file__).resolve().parents[2]
SCRIPT = REPO_ROOT / "scripts" / "docker_config_migrate.py"
def _run_migration(hermes_home: Path, **env_overrides: str) -> subprocess.CompletedProcess[str]:
env = os.environ.copy()
env.update(
{
"HERMES_HOME": str(hermes_home),
"HERMES_SKIP_CHMOD": "1",
"PYTHONPATH": str(REPO_ROOT),
}
)
env.update(env_overrides)
return subprocess.run(
[sys.executable, str(SCRIPT)],
cwd=str(REPO_ROOT),
env=env,
capture_output=True,
text=True,
)
def test_docker_config_migrate_backs_up_and_migrates_legacy_config(tmp_path: Path) -> None:
config_path = tmp_path / "config.yaml"
env_path = tmp_path / ".env"
config_path.write_text(
yaml.safe_dump(
{
"_config_version": 11,
"custom_providers": [
{
"name": "Local API",
"base_url": "http://localhost:8080/v1",
"api_key": "test-key",
}
],
}
),
encoding="utf-8",
)
env_path.write_text("OPENROUTER_API_KEY=test\n", encoding="utf-8")
proc = _run_migration(tmp_path)
assert proc.returncode == 0, proc.stderr
assert "Migrating config schema 11 ->" in proc.stdout
raw = yaml.safe_load(config_path.read_text(encoding="utf-8"))
assert raw["_config_version"] == DEFAULT_CONFIG["_config_version"]
assert "custom_providers" not in raw
assert raw["providers"]["local-api"]["api"] == "http://localhost:8080/v1"
assert list(tmp_path.glob("config.yaml.bak-*"))
assert list(tmp_path.glob(".env.bak-*"))
def test_docker_config_migrate_backs_up_and_migrates_unversioned_config(tmp_path: Path) -> None:
config_path = tmp_path / "config.yaml"
config_path.write_text(
yaml.safe_dump(
{
"custom_providers": [
{
"name": "Local API",
"base_url": "http://localhost:8080/v1",
"api_key": "test-key",
}
],
}
),
encoding="utf-8",
)
proc = _run_migration(tmp_path)
assert proc.returncode == 0, proc.stderr
assert "Migrating config schema 0 ->" in proc.stdout
raw = yaml.safe_load(config_path.read_text(encoding="utf-8"))
assert raw["_config_version"] == DEFAULT_CONFIG["_config_version"]
assert "custom_providers" not in raw
assert raw["providers"]["local-api"]["api"] == "http://localhost:8080/v1"
assert list(tmp_path.glob("config.yaml.bak-*"))
def test_docker_config_migrate_does_not_rewrite_invalid_yaml(tmp_path: Path) -> None:
config_path = tmp_path / "config.yaml"
original = "model: [unterminated\n"
config_path.write_text(original, encoding="utf-8")
proc = _run_migration(tmp_path)
assert proc.returncode == 0, proc.stderr
assert "Migrating config schema" not in proc.stdout
assert "hermes config:" in proc.stderr
assert config_path.read_text(encoding="utf-8") == original
assert not list(tmp_path.glob("*.bak-*"))
def test_docker_config_migrate_skip_env_leaves_config_unchanged(tmp_path: Path) -> None:
config_path = tmp_path / "config.yaml"
original = yaml.safe_dump({"_config_version": 11})
config_path.write_text(original, encoding="utf-8")
proc = _run_migration(tmp_path, HERMES_SKIP_CONFIG_MIGRATION="1")
assert proc.returncode == 0, proc.stderr
assert "skipping config migration" in proc.stdout
assert config_path.read_text(encoding="utf-8") == original
assert not list(tmp_path.glob("*.bak-*"))


@@ -99,3 +99,12 @@ def test_stage2_hook_creates_s6_envdir_before_writing_browser_path(stage2_text:
assert mkdir_line in stage2_text
assert write_line in stage2_text
assert stage2_text.index(mkdir_line) < stage2_text.index(write_line)
def test_stage2_hook_runs_config_migration_as_hermes(stage2_text: str) -> None:
assert "scripts/docker_config_migrate.py" in stage2_text
assert 's6-setuidgid hermes "$INSTALL_DIR/.venv/bin/python"' in stage2_text
def test_stage2_hook_documents_config_migration_opt_out(stage2_text: str) -> None:
assert "HERMES_SKIP_CONFIG_MIGRATION" in stage2_text


@@ -305,7 +305,13 @@ def _capture_required_environment_variables(
}
missing_names = [entry["name"] for entry in missing_entries]
if _is_gateway_surface():
# Most gateway surfaces (messaging platforms) can't prompt for a secret, so
# they short-circuit to the "unsupported" hint. Interactive gateway surfaces
# — the desktop app / TUI — set HERMES_INTERACTIVE and register a
# secret-capture callback that routes to a secure secret.request overlay, so
# they fall through and actually prompt. (HERMES_INTERACTIVE is the same flag
# tools/approval.py uses to tell an interactive surface from a messaging one.)
if _is_gateway_surface() and not env_var_enabled("HERMES_INTERACTIVE"):
return {
"missing_names": missing_names,
"setup_skipped": False,


@@ -4154,6 +4154,13 @@ def _run_prompt_submit(rid, sid: str, session: dict, text: Any) -> None:
approval_token = set_current_session_key(session["session_key"])
session_tokens = _set_session_context(session["session_key"])
# The sudo password callback is thread-local (tools.terminal_tool
# _callback_tls), so wiring it on the build thread doesn't reach this
# turn thread — terminal sudo prompts would fall through to /dev/tty
# and hang the headless gateway. Re-wire here so the prompt routes to
# the sudo.request overlay. (secret capture is a module global, so
# re-running is a harmless no-op.)
_wire_callbacks(sid)
cwd = _session_cwd(session)
_register_session_cwd(session)
cols = session.get("cols", 80)
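
Reviewer sketch of why the extra `_wire_callbacks(sid)` call is needed: a value stored on a `threading.local` object is visible only on the thread that stored it. The names below are hypothetical stand-ins for the thread-local sudo callback in `tools.terminal_tool`.

```python
import threading

_callback_tls = threading.local()  # hypothetical stand-in for the real TLS object


def set_sudo_callback(cb):
    _callback_tls.sudo = cb


def get_sudo_callback():
    # Threads that never called set_sudo_callback see no attribute at all.
    return getattr(_callback_tls, "sudo", None)


def call_on_new_thread(fn):
    """Run fn on a fresh thread and return its result."""
    box = []
    worker = threading.Thread(target=lambda: box.append(fn()))
    worker.start()
    worker.join()
    return box[0]
```

Setting the callback on the calling thread and then reading it via `call_on_new_thread(get_sudo_callback)` yields `None`, which corresponds to the prompt falling through to `/dev/tty`; setting it again inside the worker makes it visible there.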


@@ -83,17 +83,10 @@ function parseKey(keypress: ParsedKey): [Key, string] {
input = ''
}
// Suppress ESC-less SGR mouse fragments. When a heavy React commit blocks
// the event loop past App's 50ms NORMAL_TIMEOUT flush, a CSI split across
// stdin chunks gets its buffered ESC flushed as a lone Escape key, and the
// continuation arrives as a text token with name='' — which falls through
// all of parseKeypress's ESC-anchored regexes and the nonAlphanumericKeys
// clear below (name is falsy). The fragment then leaks into the prompt as
// literal `[<64;74;16M`. This is the same defensive sink as the F13 guard
// above; the underlying tokenizer-flush race is upstream of this layer.
if (!keypress.name && /^\[<\d+;\d+;\d+[Mm]/.test(input)) {
input = ''
}
// (SGR mouse-report fragments used to be scrubbed here. They no longer reach
// this layer: the tokenizer keeps an incomplete CSI buffered across a
// watchdog flush and reassembles it on the next feed instead of force-
// emitting the partial as input. See termio/tokenize.ts.)
// Strip meta if it's still remaining after `parseKeypress`
// TODO(vadimdemedes): remove this in the next major version.


@@ -97,71 +97,37 @@ describe('mouse wheel modifier decoding', () => {
})
})
describe('fragmented SGR mouse recovery', () => {
it('re-synthesizes bracket-only SGR mouse tails as mouse events', () => {
const [[mouse]] = parseMultipleKeypresses(INITIAL_STATE, '[<35;159;11M')
describe('flush-boundary SGR mouse reassembly', () => {
it('reassembles a report split by a mid-sequence watchdog flush into one mouse event', () => {
// chunk 1: heavy render stalls the loop, only the prefix is read
let [keys, state] = parseMultipleKeypresses(INITIAL_STATE, '\x1b[<0;35;')
expect(keys).toEqual([])
expect(mouse).toMatchObject({ kind: 'mouse', button: 35, col: 159, row: 11, action: 'press' })
// App's 50ms watchdog flushes (input=null) — must NOT emit the partial
;[keys, state] = parseMultipleKeypresses(state, null)
expect(keys).toEqual([])
// continuation arrives; the whole report reassembles, nothing leaks
;[keys, state] = parseMultipleKeypresses(state, '46M')
expect(keys).toEqual([expect.objectContaining({ kind: 'mouse', button: 0, col: 35, row: 46, action: 'press' })])
})
it('re-synthesizes angle-only SGR mouse tails as mouse events', () => {
const [[mouse]] = parseMultipleKeypresses(INITIAL_STATE, '<35;159;11M')
it('drops a truncated mouse prefix after a second flush instead of leaking it', () => {
let [keys, state] = parseMultipleKeypresses(INITIAL_STATE, '\x1b[<0;35;')
expect(mouse).toMatchObject({ kind: 'mouse', button: 35, col: 159, row: 11, action: 'press' })
;[keys, state] = parseMultipleKeypresses(state, null) // first flush keeps it
;[keys, state] = parseMultipleKeypresses(state, null) // second flush drops it
expect(keys).toEqual([])
expect(state.incomplete).toBe('')
})
it('re-synthesizes degraded SGR mouse bursts without leaking prompt text', () => {
const [events] = parseMultipleKeypresses(INITIAL_STATE, '5;142;11M<35;159;11M35;124;26M35;119;26Mtyped')
it('re-synthesizes an orphaned X10 wheel tail (legacy mouse) into a scroll key', () => {
// X10 wheel-up = ESC[M + (0x40+32) + col + row. If the ESC was flushed as a
// lone Escape and the `[M…` payload arrives as text, resynthesize it.
const tail = '[M' + String.fromCharCode(0x60) + '!!'
const [[key]] = parseMultipleKeypresses(INITIAL_STATE, tail)
expect(events.slice(0, 4)).toEqual([
expect.objectContaining({ kind: 'mouse', button: 5, col: 142, row: 11 }),
expect.objectContaining({ kind: 'mouse', button: 35, col: 159, row: 11 }),
expect.objectContaining({ kind: 'mouse', button: 35, col: 124, row: 26 }),
expect.objectContaining({ kind: 'mouse', button: 35, col: 119, row: 26 })
])
expect(events[4]).toMatchObject({ kind: 'key', sequence: 'typed' })
})
it('keeps isolated semicolon text that only resembles a prefixless mouse report', () => {
const [[key]] = parseMultipleKeypresses(INITIAL_STATE, 'see 1;2;3M for details')
expect(key).toMatchObject({ kind: 'key', sequence: 'see 1;2;3M for details' })
})
it('does not match prefixless fragments inside longer digit runs', () => {
const [[key]] = parseMultipleKeypresses(INITIAL_STATE, '1234;56;78M9;10;11M')
expect(key).toMatchObject({ kind: 'key', sequence: '1234;56;78M9;10;11M' })
})
it('swallows a fully degraded mouse-burst noise blob without leaking prompt text', () => {
// Captured from Windows Terminal during a heavy tool-call render: the event
// loop blocked past App's 50ms flush timer, so a long burst of SGR mouse
// reports (mode 1003 any-motion) arrived as text with prefixes AND
// coordinate digits chewed off. The surviving shards are
// too degraded for SGR_MOUSE_FRAGMENT_RE (1- and 2-param remnants, a
// stray focus-in `[I`), so without the whole-text noise fast path the entire
// blob types into the composer and locks the user out.
const blob =
'M6M35;220;56M6M35;218;56M169;48M;157;47M;44M20;43M79;40M78;40M0M7M35;49;41M48;41M;47;40M9;15;32M[I;31M5;211;26M35;211;25M7M;220;1MM0M09;25M24M23M3;22MM18M99;26M32MM38M63;44M47MM1;51M M4M54M'
const [events] = parseMultipleKeypresses(INITIAL_STATE, blob)
expect(events).toEqual([])
})
it('keeps plain prose that only contains scattered M and m letters', () => {
const [[key]] = parseMultipleKeypresses(INITIAL_STATE, 'Mmm MMM mmm yummy')
expect(key).toMatchObject({ kind: 'key', sequence: 'Mmm MMM mmm yummy' })
})
it('swallows noise wholesale even when it contains intact recoverable fragments', () => {
// A noise blob can carry a few intact `<b;c;r M` fragments amid the chewed
// shards. The whole-text noise check must run BEFORE fragment recovery —
// otherwise parseTextWithSgrMouseFragments returns non-null and emits a
// pile of recovered mouse events instead of dropping the blob wholesale.
const blob = '<35;159;11M;44M20;43M0M7M<35;124;26M;47;40M9;15;32M5M2M'
const [events] = parseMultipleKeypresses(INITIAL_STATE, blob)
expect(events).toEqual([])
expect(key).toMatchObject({ name: 'wheelup' })
})
})
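
Reviewer sketch, in Python, of the buffering rule the `flush-boundary SGR mouse reassembly` tests describe: keep an incomplete SGR mouse prefix across one watchdog flush, reassemble it if the continuation arrives, and drop it on a second flush. `MouseBuffer` is hypothetical and handles only SGR mouse reports; the real logic lives in `termio/tokenize.ts`, which is not shown in this diff.

```python
import re

SGR_MOUSE = re.compile(r"\x1b\[<(\d+);(\d+);(\d+)([Mm])")
# Any proper prefix of an SGR mouse report: ESC, ESC[, ESC[<, ESC[<0;35; ...
SGR_PREFIX = re.compile(r"\x1b(?:\[(?:<[\d;]*)?)?")


class MouseBuffer:
    def __init__(self):
        self.incomplete = ""
        self.survived_flush = False

    def feed(self, chunk):
        """Feed a stdin chunk, or None for a watchdog flush; return mouse events."""
        if chunk is None:
            if self.incomplete and self.survived_flush:
                self.incomplete = ""  # second flush: give up on the partial
            self.survived_flush = bool(self.incomplete)
            return []
        data = self.incomplete + chunk
        self.incomplete, self.survived_flush = "", False
        events, end = [], 0
        for m in SGR_MOUSE.finditer(data):
            button, col, row = (int(g) for g in m.group(1, 2, 3))
            action = "press" if m.group(4) == "M" else "release"
            events.append({"button": button, "col": col, "row": row, "action": action})
            end = m.end()
        tail = data[end:]
        if SGR_PREFIX.fullmatch(tail):
            self.incomplete = tail  # hold the partial instead of emitting it
        return events
```

Holding the partial at the tokenizer level is what makes the downstream fragment and noise regexes unnecessary: a split report is never emitted as text in the first place.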


@@ -63,35 +63,6 @@ const XTVERSION_RE = /^\x1bP>\|(.*?)(?:\x07|\x1b\\)$/s
// Button 32=left-drag (0x20 | motion-bit). Plain 0/1/2 = left/mid/right click.
// eslint-disable-next-line no-control-regex
const SGR_MOUSE_RE = /^\x1b\[<(\d+);(\d+);(\d+)([Mm])$/
const SGR_MOUSE_FRAGMENT_RE = /(?<!\d)(?:\[<|<)?(?:[0-9]|[1-9][0-9]|1\d{2}|2[0-4]\d|25[0-5]);\d+;\d+[Mm]/g
// Whole-text mouse-burst noise fast path. When a heavy render blocks the event
// loop past App's 50ms flush watchdog, a long burst of SGR mouse reports (mode
// 1003 any-motion / 1006 SGR) can arrive as a single text token with prefixes
// AND coordinate digits chewed off across many partial reads. The surviving
// shards (1- and 2-param remnants, stray focus-in `[I`, lone `M`/`m`
// terminators) are too degraded for SGR_MOUSE_FRAGMENT_RE, so the leftover
// tail leaks into the composer and locks the user out (they can't type or exit).
//
// If the ENTIRE text token is drawn only from the mouse-leak alphabet
// (`[ ] < ; I M m`, digits, and the stray spaces a burst can carry) AND it
// carries the structural signature of mouse coordinates — ≥3 `M`/`m`
// terminators, at least one digit, and at least one `;` separator — swallow it
// wholesale. All three constraints together preserve real prose: `Mmm MMM mmm`
// has no digit and no `;`, `see 1;2;3M for details` contains disqualifying
// letters, and `1234;56;78M9;10;11M` has only two terminators.
// eslint-disable-next-line no-control-regex
const MOUSE_BURST_NOISE_RE = /^(?=[\s\S]*\d)(?=[\s\S]*;)(?=(?:[^Mm]*[Mm]){3})[\d;<\[\]IMm \x1b]+$/
// Residual-shard variant for the gaps BETWEEN / AFTER recovered fragments
// inside parseTextWithSgrMouseFragments. A real recovery run leaves degraded
// remnants (e.g. `M6M`, `7M;220;1MM0M`, lone `;157;47M`) that are pure
// mouse-leak alphabet but too short to satisfy the ≥3-terminator whole-text
// rule. Swallow such a residue only when it is pure alphabet AND carries a
// digit AND at least one `M`/`m` — a prose gap like ` for details ` contains
// disqualifying letters and never matches.
// eslint-disable-next-line no-control-regex
const MOUSE_BURST_RESIDUE_RE = /^(?=[^\d]*\d)(?=[^Mm]*[Mm])[\d;<\[\]IMm \x1b]+$/
function createPasteKey(content: string): ParsedKey {
return {
@@ -296,32 +267,18 @@ export function parseMultipleKeypresses(
} else if (token.type === 'text') {
if (inPaste) {
pasteBuffer += token.value
} else if (MOUSE_BURST_NOISE_RE.test(token.value)) {
// Fully degraded mouse-burst noise — a heavy render (e.g. a sudo /
// secret prompt repaint) blocked the event loop past App's 50ms flush
// watchdog, so a long burst of SGR mouse reports arrived as text with
// prefixes AND coordinate digits chewed off. Checked BEFORE fragment
// recovery: a noise blob can still contain a few intact `<b;c;r M`
// fragments, and parseTextWithSgrMouseFragments would then return
// non-null and emit a pile of recovered mouse events instead of
// dropping the blob wholesale. Swallow it here so it never leaks into
// the composer (and we skip the extra fragment-recovery work mid-stall).
} else if (/^\[M[\x60-\x7f][\x20-\uffff]{2}$/.test(token.value)) {
// Orphaned X10 wheel tail (legacy 1000/1002 terminals, fullscreen
// only). If the buffered ESC was flushed as a lone Escape and the X10
// payload (`[M` + 3 bytes) arrived as the next text token, re-synthesize
// with ESC so the scroll event still fires instead of leaking. SGR mouse
// reports no longer reach this branch — the tokenizer keeps an
// incomplete CSI buffered across a flush and reassembles it (see
// termio/tokenize.ts), so the old fragment/burst recovery is gone.
const resynthesized = '\x1b' + token.value
keys.push(parseKeypress(resynthesized))
} else {
const mouseFragments = parseTextWithSgrMouseFragments(token.value)
if (mouseFragments) {
keys.push(...mouseFragments)
} else if (/^\[M[\x60-\x7f][\x20-\uffff]{2}$/.test(token.value)) {
// Orphaned X10 wheel tail (fullscreen only — mouse tracking is off
// otherwise). A heavy render blocked the event loop past App's 50ms
// flush timer, so the buffered ESC was flushed as a lone Escape and
// the continuation arrived as text. Re-synthesize with ESC so the
// scroll event still fires instead of leaking into the prompt.
const resynthesized = '\x1b' + token.value
keys.push(parseKeypress(resynthesized))
} else {
keys.push(parseKeypress(token.value))
}
keys.push(parseKeypress(token.value))
}
}
}
@@ -663,87 +620,6 @@ function parseMouseEvent(s: string): ParsedMouse | null {
}
}
function normalizeSgrMouseFragment(fragment: string): string {
if (fragment.startsWith('[<')) {
return `\x1b${fragment}`
}
if (fragment.startsWith('<')) {
return `\x1b[${fragment}`
}
return `\x1b[<${fragment}`
}
function parseSgrMouseFragment(fragment: string): ParsedInput {
const sequence = normalizeSgrMouseFragment(fragment)
return parseMouseEvent(sequence) ?? parseKeypress(sequence)
}
function parseTextWithSgrMouseFragments(text: string): ParsedInput[] | null {
SGR_MOUSE_FRAGMENT_RE.lastIndex = 0
const matches = [...text.matchAll(SGR_MOUSE_FRAGMENT_RE)]
if (matches.length === 0) {
return null
}
const parsed: ParsedInput[] = []
let cursor = 0
let consumedAny = false
for (let i = 0; i < matches.length;) {
const first = matches[i]!
const run: RegExpMatchArray[] = [first]
let runEnd = first.index! + first[0].length
i++
while (i < matches.length && matches[i]!.index === runEnd) {
run.push(matches[i]!)
runEnd = matches[i]!.index! + matches[i]![0].length
i++
}
const hasExplicitMousePrefix = run.some(match => match[0].startsWith('[<') || match[0].startsWith('<'))
const isFragmentBurst = run.length > 1
if (!hasExplicitMousePrefix && !isFragmentBurst) {
continue
}
if (first.index! > cursor) {
const gap = text.slice(cursor, first.index!)
// Skip pure mouse-leak residue between recovered fragments; only emit
// real text gaps as keypresses.
if (!MOUSE_BURST_RESIDUE_RE.test(gap)) {
parsed.push(parseKeypress(gap))
}
}
for (const match of run) {
parsed.push(parseSgrMouseFragment(match[0]))
}
cursor = runEnd
consumedAny = true
}
if (!consumedAny) {
return null
}
if (cursor < text.length) {
const tail = text.slice(cursor)
// Swallow a pure mouse-leak residue tail (the head fragments recovered, but
// the burst trailed off into chewed-up shards). Emit only real trailing text.
if (!MOUSE_BURST_RESIDUE_RE.test(tail)) {
parsed.push(parseKeypress(tail))
}
}
return parsed
}
function parseKeypress(s: string = ''): ParsedKey {
let parts

@@ -0,0 +1,185 @@
import { describe, expect, it } from 'vitest'
import { createTokenizer, type Token } from './tokenize.js'
describe('tokenizer escape-sequence boundaries', () => {
it('reassembles a CSI mouse sequence split across two feeds', () => {
const t = createTokenizer({ x10Mouse: true })
expect(t.feed('\x1b[<0;35;')).toEqual([])
expect(t.feed('46M')).toEqual([{ type: 'sequence', value: '\x1b[<0;35;46M' }])
expect(t.buffer()).toBe('')
})
})
describe('tokenizer state-aware flush', () => {
it('does not emit an incomplete CSI on flush — it keeps it for reassembly', () => {
const t = createTokenizer({ x10Mouse: true })
// A render stall lets App's watchdog flush mid-sequence. The buffered CSI
// prefix must NOT be emitted (that is the `46M…` leak); it stays buffered.
expect(t.feed('\x1b[<0;35;')).toEqual([])
expect(t.flush()).toEqual([])
expect(t.buffer()).toBe('\x1b[<0;35;')
// The continuation arrives on the next feed and the whole report
// reassembles into a single clean sequence token — nothing leaked.
expect(t.feed('46M')).toEqual([{ type: 'sequence', value: '\x1b[<0;35;46M' }])
expect(t.buffer()).toBe('')
})
it('drops a partial control sequence that survives a second flush (truncation)', () => {
const t = createTokenizer({ x10Mouse: true })
expect(t.feed('\x1b[<0;35;')).toEqual([])
expect(t.flush()).toEqual([]) // first flush keeps the buffer
expect(t.buffer()).toBe('\x1b[<0;35;')
// Continuation never arrived: the next flush sees the same buffer and
// drops it so it can't fuse with the next keypress's bytes.
expect(t.flush()).toEqual([])
expect(t.buffer()).toBe('')
})
it('still emits a bare ESC on flush so the Escape key works', () => {
const t = createTokenizer({ x10Mouse: true })
expect(t.feed('\x1b')).toEqual([])
expect(t.flush()).toEqual([{ type: 'sequence', value: '\x1b' }])
expect(t.buffer()).toBe('')
})
it('reassembles even when a flush fires between every byte of the report', () => {
const t = createTokenizer({ x10Mouse: true })
// Pathological stall: a flush between each chunk. As long as the
// continuation eventually arrives, no fragment is ever emitted as input.
for (const chunk of ['\x1b[', '<', '0;', '35;', '46']) {
expect(t.feed(chunk)).toEqual([])
expect(t.flush()).toEqual([])
}
expect(t.feed('M')).toEqual([{ type: 'sequence', value: '\x1b[<0;35;46M' }])
expect(t.buffer()).toBe('')
})
})
// Battle-test: prove the leak class is structurally impossible, not just that
// the known cases are patched. We hammer the tokenizer with the worst stalls a
// terminal can produce (split + flush at every byte) and assert the two hard
// invariants: nothing leaks as text, and every complete report reassembles.
describe('tokenizer fuzz: fragments never leak under a flush storm', () => {
const sgr = (btn: number, col: number, row: number, press: boolean): string =>
`\x1b[<${btn};${col};${row}${press ? 'M' : 'm'}`
it('reassembles a report split + flushed at every interior byte', () => {
const seq = sgr(0, 35, 46, true)
// Start at 2: an earlier split is the lone-ESC ESCDELAY boundary, which
// intentionally flushes to the Escape key. Terminals never split a mouse
// report there — a report is one atomic write — so it's not a real case.
for (let i = 2; i < seq.length; i++) {
const t = createTokenizer({ x10Mouse: true })
const tokens: Token[] = [...t.feed(seq.slice(0, i)), ...t.flush(), ...t.feed(seq.slice(i))]
expect(tokens).toEqual([{ type: 'sequence', value: seq }])
expect(t.buffer()).toBe('')
}
})
it('feeds 200 random reports one byte at a time, flushing after every byte', () => {
// Deterministic PRNG so a failure is reproducible.
let s = 0x1234567
const rnd = (n: number): number => {
s = (s * 1103515245 + 12345) & 0x7fffffff
return s % n
}
const reports = Array.from({ length: 200 }, () => sgr(rnd(120), 1 + rnd(300), 1 + rnd(200), rnd(2) === 0))
const stream = reports.join('')
const t = createTokenizer({ x10Mouse: true })
const seqTokens: string[] = []
let textLeak = ''
const drain = (tokens: Token[]): void => {
for (const tok of tokens) {
if (tok.type === 'sequence') {
seqTokens.push(tok.value)
} else {
textLeak += tok.value
}
}
}
for (const ch of stream) {
drain(t.feed(ch))
// Flush storm — but not at a lone-ESC boundary (the real watchdog
// re-arms while bytes are pending; a single flush between feeds never
// hits the truncation valve).
if (t.buffer() !== '\x1b') {
drain(t.flush())
}
}
expect(textLeak).toBe('')
expect(seqTokens.join('')).toBe(stream)
})
it('keeps real keystrokes intact while mouse reports reassemble around them', () => {
let s = 0x0badf00d
const rnd = (n: number): number => {
s = (s * 1103515245 + 12345) & 0x7fffffff
return s % n
}
const typed = 'abc 123 xyz'
const expectedKeys: string[] = []
const expectedSeqs: string[] = []
const parts: string[] = []
for (let k = 0; k < 120; k++) {
if (rnd(3) === 0) {
const ch = typed[rnd(typed.length)]!
expectedKeys.push(ch)
parts.push(ch)
} else {
const seq = sgr(rnd(64), 1 + rnd(200), 1 + rnd(100), rnd(2) === 0)
expectedSeqs.push(seq)
parts.push(seq)
}
}
const stream = parts.join('')
const t = createTokenizer({ x10Mouse: true })
const seqTokens: string[] = []
let text = ''
const drain = (tokens: Token[]): void => {
for (const tok of tokens) {
if (tok.type === 'sequence') {
seqTokens.push(tok.value)
} else {
text += tok.value
}
}
}
for (const ch of stream) {
drain(t.feed(ch))
if (t.buffer() !== '\x1b') {
drain(t.flush())
}
}
// Every typed character survives, in order; every report reassembles whole.
expect(text).toBe(expectedKeys.join(''))
expect(seqTokens).toEqual(expectedSeqs)
})
})

@@ -47,10 +47,18 @@ type TokenizerOptions = {
export function createTokenizer(options?: TokenizerOptions): Tokenizer {
let currentState: State = 'ground'
let currentBuffer = ''
// The control-sequence buffer kept across the previous flush, if any. Used
// as a one-tick truncation valve: a partial CSI mouse report normally
// reassembles on the very next feed, so if a flush sees the exact same
// buffer it kept last time (the continuation never arrived), we drop it.
let lastFlushedBuffer = ''
const x10Mouse = options?.x10Mouse ?? false
return {
feed(input: string): Token[] {
// Real bytes arrived — any kept partial is no longer stale.
lastFlushedBuffer = ''
const result = tokenize(input, currentState, currentBuffer, false, x10Mouse)
currentState = result.state.state
@@ -64,12 +72,25 @@ export function createTokenizer(options?: TokenizerOptions): Tokenizer {
currentState = result.state.state
currentBuffer = result.state.buffer
// tokenize() keeps (doesn't emit) an incomplete control sequence on
// flush. If two consecutive flushes see the same buffer with no feed in
// between, the continuation is never coming (truncated write / killed
// process) — drop it so it can't fuse with the next keypress's bytes.
if (currentBuffer && currentBuffer === lastFlushedBuffer) {
currentState = 'ground'
currentBuffer = ''
lastFlushedBuffer = ''
} else {
lastFlushedBuffer = currentBuffer
}
return result.tokens
},
reset(): void {
currentState = 'ground'
currentBuffer = ''
lastFlushedBuffer = ''
},
buffer(): string {
@@ -298,8 +319,10 @@ function tokenize(
// Handle end of input
if (result.state === 'ground') {
flushText()
} else if (flush) {
// Force output incomplete sequence
} else if (flush && result.state === 'escape') {
// A bare ESC with nothing after it is the Escape key — the one incomplete
// state a flush should turn into input (the classic ESCDELAY lone-ESC
// disambiguation: ESC alone vs. ESC as a sequence/meta prefix).
const remaining = data.slice(seqStart)
if (remaining) {
@@ -308,7 +331,18 @@ function tokenize(
result.state = 'ground'
} else {
// Buffer incomplete sequence for next call
// Buffer the incomplete sequence. Two paths land here:
// - streaming (flush=false): normal carry-over to the next feed.
// - flush=true while still inside a multi-byte control sequence
// (csi/osc/dcs/apc/ss3/escapeIntermediate): we deliberately do NOT
// emit it. A half-arrived CSI mouse report (ESC[<btn;col;row M) is an
// unfinished sequence, not user input — force-emitting it is what
// injects `46M`/`35;46M` shards into the prompt during a render stall.
// Keeping it buffered lets the continuation reassemble on the next
// feed (the xterm.js state-machine discipline — partial sequences
// never become text). createTokenizer.flush() drops the buffer if it
// survives a second flush with no progress (a genuine truncation), so
// a stuck partial can never merge into the next keypress's bytes.
result.buffer = data.slice(seqStart)
}
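
The keep-then-drop flush behaviour described in the comments above can be sketched in isolation. This is a minimal model, not the real `tokenize()`: `createMiniTokenizer` and `scan` are hypothetical names, and the sketch assumes only plain text, a lone ESC, ESC plus one character, and CSI sequences (final byte in 0x40-0x7E) exist.

```typescript
type Token = { type: 'text' | 'sequence'; value: string }

// Hypothetical, heavily simplified tokenizer (assumption: not the real one).
function createMiniTokenizer() {
  let buffer = ''
  // Partial sequence kept by the previous flush; '' when nothing is stale.
  let lastFlushedBuffer = ''

  const isCsiFinal = (ch: string): boolean => ch >= '@' && ch <= '~'

  function scan(input: string, flush: boolean): Token[] {
    const data = buffer + input
    buffer = ''
    const tokens: Token[] = []
    let i = 0
    while (i < data.length) {
      if (data[i] !== '\x1b') {
        // Plain text run up to the next ESC.
        let end = data.indexOf('\x1b', i)
        if (end === -1) end = data.length
        tokens.push({ type: 'text', value: data.slice(i, end) })
        i = end
        continue
      }
      if (i + 1 >= data.length) {
        // Lone ESC at end of input: Escape key on flush, otherwise keep waiting.
        if (flush) tokens.push({ type: 'sequence', value: '\x1b' })
        else buffer = '\x1b'
        return tokens
      }
      if (data[i + 1] !== '[') {
        // ESC + one char, treated as a two-byte sequence in this sketch.
        tokens.push({ type: 'sequence', value: data.slice(i, i + 2) })
        i += 2
        continue
      }
      let j = i + 2
      while (j < data.length && !isCsiFinal(data[j]!)) j++
      if (j >= data.length) {
        // Incomplete CSI: keep it buffered even when flush is true.
        buffer = data.slice(i)
        return tokens
      }
      tokens.push({ type: 'sequence', value: data.slice(i, j + 1) })
      i = j + 1
    }
    return tokens
  }

  return {
    feed(input: string): Token[] {
      // Real bytes arrived, so any kept partial is no longer stale.
      lastFlushedBuffer = ''
      return scan(input, false)
    },
    flush(): Token[] {
      const tokens = scan('', true)
      if (buffer && buffer === lastFlushedBuffer) {
        // Second flush with no progress: treat as truncation and drop it.
        buffer = ''
        lastFlushedBuffer = ''
      } else {
        lastFlushedBuffer = buffer
      }
      return tokens
    },
    buffer: (): string => buffer,
  }
}
```

In this sketch, feeding `'\x1b[<0;35;'`, flushing once, and then feeding `'46M'` yields a single sequence token, while two flushes in a row with no feed in between empty the buffer. That mirrors the cases exercised by the tests in the diff, under the simplifying assumptions stated above.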

@@ -853,6 +853,15 @@ export const api = {
runDump: () => fetchJSON<ActionResponse>("/api/ops/dump", { method: "POST" }),
runConfigMigrate: () =>
fetchJSON<ActionResponse>("/api/ops/config-migrate", { method: "POST" }),
runDebugShare: (opts?: { redact?: boolean; lines?: number }) =>
fetchJSON<DebugShareResponse>("/api/ops/debug-share", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
redact: opts?.redact ?? true,
lines: opts?.lines ?? 200,
}),
}),
getCheckpoints: () => fetchJSON<CheckpointsResponse>("/api/ops/checkpoints"),
@@ -906,6 +915,16 @@ export interface ActionResponse {
update_command?: string;
}
export interface DebugShareResponse {
ok: boolean;
// label -> paste URL, e.g. { Report: "https://paste.rs/abc", "agent.log": "..." }
urls: Record<string, string>;
// "label: error" strings for optional full-log uploads that failed.
failures: string[];
redacted: boolean;
auto_delete_seconds: number;
}
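
As an illustration of how a caller might consume this response shape, here is a small sketch that turns it into a plain-text summary. `formatShareSummary` is a hypothetical helper that is not part of this change; the interface is repeated only so the snippet is self-contained.

```typescript
// Shape copied from the DebugShareResponse interface in this diff.
interface DebugShareResponse {
  ok: boolean
  urls: Record<string, string>
  failures: string[]
  redacted: boolean
  auto_delete_seconds: number
}

// Hypothetical helper (assumption, not in the diff): one "label: url" line
// per paste, then a status line, then any upload failures.
function formatShareSummary(res: DebugShareResponse): string {
  const lines = Object.entries(res.urls).map(([label, url]) => `${label}: ${url}`)
  const hours = Math.round(res.auto_delete_seconds / 3600)
  lines.push(`(${res.redacted ? 'redacted' : 'not redacted'}, auto-deletes in ${hours}h)`)
  if (res.failures.length > 0) {
    lines.push(`Failed uploads: ${res.failures.join('; ')}`)
  }
  return lines.join('\n')
}
```

The `label: url` line format matches what the "Copy all" button in the System page hunk writes to the clipboard; the status and failure lines are additions made for this sketch.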
export interface SessionStoreStats {
total: number;
active_store: number;

@@ -3,17 +3,22 @@ import { Link } from "react-router-dom";
import {
Activity,
Brain,
Check,
Clock,
Copy,
Cpu,
Database,
Download,
Globe,
HardDrive,
KeyRound,
Link2,
Play,
Plus,
Power,
RotateCw,
Server,
Share2,
ShieldCheck,
Sparkles,
Stethoscope,
@@ -48,6 +53,7 @@ import type {
UpdateCheckResponse,
CuratorStatus,
PortalStatus,
DebugShareResponse,
} from "@/lib/api";
function formatBytes(n: number): string {
@@ -324,6 +330,54 @@ export default function SystemPage() {
}
};
// ── Debug share ────────────────────────────────────────────────────
// Unlike the fire-and-forget ops above, `debug share` produces shareable
// paste URLs that are the whole point — so we surface them as real,
// copyable links rather than a log tail.
const [shareRedact, setShareRedact] = useState(true);
const [sharing, setSharing] = useState(false);
const [shareResult, setShareResult] = useState<DebugShareResponse | null>(
null,
);
const [copiedLabel, setCopiedLabel] = useState<string | null>(null);
const copyToClipboard = useCallback(
async (text: string, label: string) => {
try {
await navigator.clipboard.writeText(text);
setCopiedLabel(label);
setTimeout(
() => setCopiedLabel((cur) => (cur === label ? null : cur)),
1500,
);
} catch {
showToast("Couldn't copy to clipboard", "error");
}
},
[showToast],
);
const runDebugShare = useCallback(async () => {
setSharing(true);
setShareResult(null);
try {
const res = await api.runDebugShare({ redact: shareRedact });
setShareResult(res);
const n = Object.keys(res.urls).length;
showToast(
`Uploaded ${n} paste${n === 1 ? "" : "s"}${
res.redacted ? " (redacted)" : ""
}`,
"success",
);
} catch (e) {
showToast(`Debug share failed: ${e}`, "error");
} finally {
setSharing(false);
}
}, [shareRedact, showToast]);
// ── Update check / apply ───────────────────────────────────────────
const checkForUpdate = useCallback(
async (force = false) => {
@@ -992,6 +1046,129 @@ export default function SystemPage() {
</Button>
</CardContent>
</Card>
{/* Debug share — uploads a redacted report + logs, returns shareable
links. Separated from the buttons above because its output is
persistent, copyable URLs, not a fire-and-forget log tail. */}
<Card>
<CardContent className="flex flex-col gap-3 py-4">
<div className="flex flex-wrap items-center justify-between gap-3">
<div className="flex items-start gap-2">
<Share2 className="h-4 w-4 mt-0.5 text-muted-foreground" />
<div className="flex flex-col">
<span className="text-sm font-medium">Share debug report</span>
<span className="text-xs text-muted-foreground max-w-prose">
Uploads system info + logs to a public paste service and
returns links to send the Hermes team. Pastes auto-delete
after 6 hours.
</span>
</div>
</div>
<Button
size="sm"
disabled={sharing}
prefix={
sharing ? (
<Spinner className="h-3.5 w-3.5" />
) : (
<Share2 className="h-3.5 w-3.5" />
)
}
onClick={() => void runDebugShare()}
>
{sharing ? "Uploading…" : "Generate share link"}
</Button>
</div>
<label className="flex items-center gap-2 text-xs text-muted-foreground select-none">
<input
type="checkbox"
className="accent-current"
checked={shareRedact}
disabled={sharing}
onChange={(e) => setShareRedact(e.target.checked)}
/>
Redact credential-shaped tokens before upload (recommended)
</label>
{shareResult && (
<div className="flex flex-col gap-2 border-t border-border pt-3">
<div className="flex items-center justify-between">
<div className="flex items-center gap-2">
<Badge tone="success">uploaded</Badge>
{shareResult.redacted ? (
<Badge tone="outline">redacted</Badge>
) : (
<Badge tone="warning">not redacted</Badge>
)}
<span className="flex items-center gap-1 text-xs text-muted-foreground">
<Clock className="h-3 w-3" />
auto-deletes in{" "}
{Math.round(shareResult.auto_delete_seconds / 3600)}h
</span>
</div>
{Object.keys(shareResult.urls).length > 1 && (
<Button
size="sm"
ghost
prefix={
copiedLabel === "__all__" ? (
<Check className="h-3.5 w-3.5" />
) : (
<Copy className="h-3.5 w-3.5" />
)
}
onClick={() =>
void copyToClipboard(
Object.entries(shareResult.urls)
.map(([label, url]) => `${label}: ${url}`)
.join("\n"),
"__all__",
)
}
>
Copy all
</Button>
)}
</div>
{Object.entries(shareResult.urls).map(([label, url]) => (
<div
key={label}
className="flex items-center gap-2 bg-background/50 border border-border px-3 py-2"
>
<Link2 className="h-3.5 w-3.5 shrink-0 text-muted-foreground" />
<span className="font-mono text-xs shrink-0 w-24 truncate text-muted-foreground">
{label}
</span>
<a
href={url}
target="_blank"
rel="noreferrer"
className="font-mono text-xs truncate flex-1 text-primary hover:underline"
>
{url}
</a>
<Button
ghost
size="icon"
aria-label={`Copy ${label} link`}
onClick={() => void copyToClipboard(url, label)}
>
{copiedLabel === label ? <Check /> : <Copy />}
</Button>
</div>
))}
{shareResult.failures.length > 0 && (
<span className="text-xs text-destructive">
Some logs failed to upload: {shareResult.failures.join("; ")}
</span>
)}
</div>
)}
</CardContent>
</Card>
<Card>
<CardContent className="flex flex-col gap-3 py-4 sm:flex-row sm:items-end">
<div className="grid gap-2 flex-1">

@@ -418,7 +418,7 @@ The official image is based on `debian:13.4` and includes:
- **[`s6-overlay`](https://github.com/just-containers/s6-overlay) v3** as PID 1 (replaces the older `tini`) — supervises the dashboard and per-profile gateways with auto-restart on crash, reaps zombie subprocesses, and forwards signals.
The container's `ENTRYPOINT` is s6-overlay's `/init`. On boot it:
1. Runs `/etc/cont-init.d/01-hermes-setup` (= `docker/stage2-hook.sh`) as root: optional UID/GID remap, fixes volume ownership, seeds `.env` / `config.yaml` / `SOUL.md` on first boot, syncs bundled skills.
1. Runs `/etc/cont-init.d/01-hermes-setup` (= `docker/stage2-hook.sh`) as root: optional UID/GID remap, fixes volume ownership, seeds `.env` / `config.yaml` / `SOUL.md` on first boot, runs non-interactive config-schema migrations unless `HERMES_SKIP_CONFIG_MIGRATION=1`, syncs bundled skills.
2. Runs `/etc/cont-init.d/02-reconcile-profiles` (= `hermes_cli.container_boot`): walks `$HERMES_HOME/profiles/<name>/`, recreates the per-profile gateway s6 service slot under `/run/service/gateway-<profile>/`, and auto-starts only those whose last recorded state was `running` (see [Per-profile gateway supervision](#per-profile-gateway-supervision)).
3. Starts the static `main-hermes` and `dashboard` s6-rc services.
4. Exec's the container's CMD as the main program (`/opt/hermes/docker/main-wrapper.sh`), which routes the arguments the user passed to `docker run`:
@@ -462,7 +462,11 @@ Each profile created with `hermes profile create <name>` automatically gets an s
## Upgrading
Pull the latest image and recreate the container. Your data directory is untouched.
Pull the latest image and recreate the container. Your data directory is
preserved, and the container runs non-interactive config-schema migrations
against the mounted `$HERMES_HOME/config.yaml` before starting the gateway.
When a migration is needed, Hermes writes timestamped backups next to
`config.yaml` and `.env` first.
```sh
docker pull nousresearch/hermes-agent:latest
@@ -481,6 +485,9 @@ docker compose pull
docker compose up -d
```
Set `HERMES_SKIP_CONFIG_MIGRATION=1` only if you need to inspect or migrate the
persisted config manually before letting the new image rewrite it.
## Skills and credential files
When using Docker as the execution environment (not the methods above, but when the agent runs commands inside a Docker sandbox — see [Configuration → Docker Backend](./configuration.md#docker-backend)), Hermes reuses a single long-lived container for all tool calls and automatically bind-mounts the skills directory (`~/.hermes/skills/`) and any credential files declared by skills into that container as read-only volumes. Skill scripts, templates, and references are available inside the sandbox without manual configuration, and because the container persists for the life of the Hermes process, any dependencies you install or files you write stay around for the next tool call.