hermes-agent/website/docs/user-guide/checkpoints-and-rollback.md
Teknium 289cc47631 docs: resync reference, user-guide, developer-guide, and messaging pages against code (#17738)
Broad drift audit against origin/main (b52b63396).

Reference pages (most user-visible drift):
- slash-commands: add /busy, /curator, /footer, /indicator, /redraw, /steer
  that were missing; drop non-existent /terminal-setup; fix /q footnote
  (resolves to /queue, not /quit); extend CLI-only list with all 24
  CLI-only commands in the registry
- cli-commands: add dedicated sections for hermes curator / fallback /
  hooks (new subcommands not previously documented); remove stale
  hermes honcho standalone section (the plugin registers dynamically
  via hermes memory); list curator/fallback/hooks in top-level table;
  fix completion to include fish
- toolsets-reference: document the real 52-toolset count; split browser
  vs browser-cdp; add discord / discord_admin / spotify / yuanbao;
  correct hermes-cli tool count from 36 to 38; fix misleading claim
  that hermes-homeassistant adds tools (it's identical to hermes-cli)
- tools-reference: bump tool count 55 -> 68; add 7 Spotify, 5 Yuanbao,
  2 Discord toolsets; move browser_cdp/browser_dialog to their own
  browser-cdp toolset section
- environment-variables: add 40+ user-facing HERMES_* vars that were
  undocumented (--yolo, --accept-hooks, --ignore-*, inference model
  override, agent/stream/checkpoint timeouts, OAuth trace, per-platform
  batch tuning for Telegram/Discord/Matrix/Feishu/WeCom, cron knobs,
  gateway restart/connect timeouts); dedupe the Cron Scheduler section;
  replace stale QQ_SANDBOX with QQ_PORTAL_HOST

User-guide (top level):
- cli.md: compression preserves last 20 turns, not 4 (protect_last_n: 20)
- configuration.md: display.platforms is the canonical per-platform
  override key; tool_progress_overrides is deprecated and auto-migrated
- profiles.md: model.default is the config key, not model.model
- sessions.md: CLI/TUI session IDs use 6-char hex, gateway uses 8
- checkpoints-and-rollback.md: destructive-command list now matches
  _DESTRUCTIVE_PATTERNS (adds rmdir, cp, install, dd)
- docker.md: the container runs as non-root hermes (UID 10000) via
  gosu; fix install command (uv pip); add missing --insecure on the
  dashboard compose example (required for non-loopback bind)
- security.md: systemctl danger pattern also matches 'restart'
- index.md: built-in tool count 47 -> 68
- integrations/index.md: 6 STT providers, 8 memory providers
- integrations/providers.md: drop fictional dashscope/qwen aliases

Features:
- overview.md: 9 image models (not 8), 9 TTS providers (not 5),
  8 memory providers (Supermemory was missing)
- tool-gateway.md: 9 image models
- tools.md: extend common-toolsets list with search / messaging /
  spotify / discord / debugging / safe
- fallback-providers.md: add 6 real providers from PROVIDER_REGISTRY
  (lmstudio, kimi-coding-cn, stepfun, alibaba-coding-plan,
  tencent-tokenhub, azure-foundry)
- plugins.md: Available Hooks table now includes on_session_finalize,
  on_session_reset, subagent_stop
- built-in-plugins.md: add the 7 bundled plugins the page didn't
  mention (spotify, google_meet, three image_gen providers, two
  dashboard examples)
- web-dashboard.md: add --insecure and --tui flags
- cron.md: hermes cron create takes positional schedule/prompt, not
  flags

Messaging:
- telegram.md: TELEGRAM_WEBHOOK_SECRET is now REQUIRED when
  TELEGRAM_WEBHOOK_URL is set (gateway refuses to start without it
  per GHSA-3vpc-7q5r-276h). Biggest user-visible drift in the batch.
- discord.md: HERMES_DISCORD_TEXT_BATCH_SPLIT_DELAY_SECONDS default
  is 2.0, not 0.1
- dingtalk.md: document DINGTALK_REQUIRE_MENTION /
  FREE_RESPONSE_CHATS / MENTION_PATTERNS / HOME_CHANNEL /
  ALLOW_ALL_USERS that the adapter supports
- bluebubbles.md: drop fictional BLUEBUBBLES_SEND_READ_RECEIPTS env
  var; the setting lives in platforms.bluebubbles.extra only
- qqbot.md: drop dead QQ_SANDBOX; add real QQ_PORTAL_HOST and
  QQ_GROUP_ALLOWED_USERS
- wecom-callback.md: replace 'hermes gateway start' (service-only)
  with 'hermes gateway' for first-time setup

Developer-guide:
- architecture.md: refresh tool/toolset counts (61/52), terminal
  backend count (7), line counts for run_agent.py (~13.7k), cli.py
  (~11.5k), main.py (~10.4k), setup.py (~3.5k), gateway/run.py
  (~12.2k), mcp_tool.py (~3.1k); add yuanbao adapter, bump platform
  adapter count 18 -> 20
- agent-loop.md: run_agent.py line count 10.7k -> 13.7k
- tools-runtime.md: add vercel_sandbox backend
- adding-tools.md: remove stale 'Discovery import added to
  model_tools.py' checklist item (registry auto-discovery)
- adding-platform-adapters.md: mark send_typing / get_chat_info as
  concrete base methods; only connect/disconnect/send are abstract
- acp-internals.md: ACP sessions now persist to SessionDB
  (~/.hermes/state.db); acp.run_agent call uses
  use_unstable_protocol=True
- cron-internals.md: gateway runs scheduler in a dedicated background
  thread via _start_cron_ticker, not on a maintenance cycle; locking
  is cross-process via fcntl.flock (Unix) / msvcrt.locking (Windows)
- gateway-internals.md: gateway/run.py ~12k lines
- provider-runtime.md: cron DOES support fallback (run_job reads
  fallback_providers from config)
- session-storage.md: SCHEMA_VERSION = 11 (not 9); add migrations
  10 and 11 (trigram FTS, inline-mode FTS5 re-index); add
  api_call_count column to Sessions DDL; document messages_fts_trigram
  and state_meta in the architecture tree
- context-compression-and-caching.md: remove the obsolete 'context
  pressure warnings' section (warnings were removed for causing
  models to give up early)
- context-engine-plugin.md: compress() signature now includes
  focus_topic param
- extending-the-cli.md: _build_tui_layout_children signature now
  includes model_picker_widget; add to default layout

Also fixed three pre-existing broken links/anchors the build warned
about (docker.md -> api-server.md, yuanbao.md -> cron-jobs.md and
tips#background-tasks, nix-setup.md -> #container-aware-cli).

Regenerated per-skill pages via website/scripts/generate-skill-docs.py
so catalog tables and sidebar are consistent with current SKILL.md
frontmatter.

docusaurus build: clean, no broken links or anchors.
2026-04-29 20:55:59 -07:00


sidebar_position: 8
sidebar_label: Checkpoints & Rollback
title: Checkpoints and /rollback
description: Filesystem safety nets for destructive operations using shadow git repos and automatic snapshots

Checkpoints and /rollback

Hermes Agent automatically snapshots your project before destructive operations and lets you restore it with a single command. Checkpoints are enabled by default — there's zero cost when no file-mutating tools fire.

This safety net is powered by an internal Checkpoint Manager that keeps a separate shadow git repository under ~/.hermes/checkpoints/ — your real project .git is never touched.

What Triggers a Checkpoint

Checkpoints are taken automatically before:

  • File tools: write_file and patch
  • Destructive terminal commands: rm, rmdir, cp, install, mv, sed -i, truncate, dd, shred, output redirects (>), and git reset/clean/checkout

The agent creates at most one checkpoint per directory per turn, so long-running sessions don't spam snapshots.
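The trigger rules above can be sketched roughly as follows. This is a minimal illustration, not the Hermes implementation: the regular expressions are hypothetical stand-ins for the real _DESTRUCTIVE_PATTERNS list, and is_destructive and TurnGate are names invented for this example.

```python
import re

# Hypothetical patterns modelled on the command list above; the real
# _DESTRUCTIVE_PATTERNS in Hermes may differ in detail.
_CMD = r"(?:^|[;&|])\s*(?:sudo\s+)?"  # command position: start, or after ; & |
_PATTERNS = [
    re.compile(_CMD + r"(?:rm|rmdir|cp|install|mv|truncate|dd|shred)\b"),
    re.compile(_CMD + r"sed\b.*\s-i\b"),                    # in-place sed
    re.compile(_CMD + r"git\s+(?:reset|clean|checkout)\b"),
    re.compile(r">"),  # any output redirect (deliberately over-broad)
]


def is_destructive(command: str) -> bool:
    """Return True if the shell command looks like it may modify files."""
    return any(p.search(command) for p in _PATTERNS)


class TurnGate:
    """Allow at most one checkpoint per directory per conversation turn."""

    def __init__(self) -> None:
        self._seen = set()

    def new_turn(self) -> None:
        # Forget which directories were snapshotted in the previous turn.
        self._seen.clear()

    def should_checkpoint(self, directory: str) -> bool:
        if directory in self._seen:
            return False
        self._seen.add(directory)
        return True
```

A caller would check is_destructive(command) before running a terminal command, and only take a snapshot when gate.should_checkpoint(directory) also returns True.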

Quick Reference

| Command | Description |
|---|---|
| `/rollback` | List all checkpoints with change stats |
| `/rollback <N>` | Restore to checkpoint N (also undoes last chat turn) |
| `/rollback diff <N>` | Preview diff between checkpoint N and current state |
| `/rollback <N> <file>` | Restore a single file from checkpoint N |

How Checkpoints Work

At a high level:

  • Hermes detects when tools are about to modify files in your working tree.
  • Once per conversation turn (per directory), it:
    • Resolves a reasonable project root for the file.
    • Initialises or reuses a shadow git repo tied to that directory.
    • Stages and commits the current state with a short, human-readable reason.
  • These commits form a checkpoint history that you can inspect and restore via /rollback.
flowchart LR
  user["User command\n(hermes, gateway)"]
  agent["AIAgent\n(run_agent.py)"]
  tools["File & terminal tools"]
  cpMgr["CheckpointManager"]
  shadowRepo["Shadow git repo\n~/.hermes/checkpoints/<hash>"]

  user --> agent
  agent -->|"tool call"| tools
  tools -->|"before mutate\nensure_checkpoint()"| cpMgr
  cpMgr -->|"git add/commit"| shadowRepo
  cpMgr -->|"OK / skipped"| tools
  tools -->|"apply changes"| agent
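One common way to commit a working tree into a separate repository without touching the project's own .git is to point git at another location through the GIT_DIR and GIT_WORK_TREE environment variables. The sketch below assumes that approach. Whether CheckpointManager does exactly this is an assumption, and shadow_env, snapshot, and the committer identity are names made up for illustration.

```python
import os
import subprocess
from pathlib import Path


def shadow_env(shadow_dir: Path, workdir: Path) -> dict:
    """Environment that points git at the shadow repo, not the project's .git."""
    env = dict(os.environ)
    env["GIT_DIR"] = str(shadow_dir)
    env["GIT_WORK_TREE"] = str(workdir)
    return env


def snapshot(shadow_dir: Path, workdir: Path, reason: str) -> bool:
    """Commit the current state of workdir into the shadow repo.

    Returns True if a commit was created, False if nothing had changed.
    """
    env = shadow_env(shadow_dir, workdir)

    def git(*args: str) -> subprocess.CompletedProcess:
        # A fixed identity (hypothetical) so commits work without user config.
        return subprocess.run(
            ["git", "-c", "user.name=checkpoint",
             "-c", "user.email=checkpoint@localhost", *args],
            cwd=workdir, env=env, capture_output=True, text=True,
        )

    if not (shadow_dir / "HEAD").exists():
        shadow_dir.mkdir(parents=True, exist_ok=True)
        git("init").check_returncode()

    git("add", "-A").check_returncode()
    status = git("status", "--porcelain")
    status.check_returncode()
    if not status.stdout.strip():
        return False  # nothing staged: skip the no-change snapshot

    git("commit", "-m", reason).check_returncode()
    return True
```

For example, snapshot(Path("/tmp/shadow"), Path("/tmp/project"), "before write_file") would create a commit on the first call and return False on an immediate second call, since nothing has changed in between.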

Configuration

Checkpoints are enabled by default. Configure in ~/.hermes/config.yaml:

checkpoints:
  enabled: true          # master switch (default: true)
  max_snapshots: 50      # max checkpoints per directory

  # Auto-maintenance (opt-in): sweep ~/.hermes/checkpoints/ at startup
  # and delete shadow repos whose working directory no longer exists
  # (orphans) or whose newest commit is older than retention_days.
  # Runs at most once per min_interval_hours, tracked via a
  # .last_prune marker inside ~/.hermes/checkpoints/.
  auto_prune: false           # default off — enable to reclaim disk
  retention_days: 7
  delete_orphans: true        # delete repos whose workdir is gone
  min_interval_hours: 24

To disable:

checkpoints:
  enabled: false

When disabled, the Checkpoint Manager is a no-op and never attempts git operations.
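The auto-maintenance options described in the config comments (auto_prune, retention_days, delete_orphans, min_interval_hours) amount to two decisions: whether a sweep is due, and whether a given shadow repo should be deleted. The sketch below is one reading of those comments. The function names and the use of Unix timestamps are assumptions, not the actual Hermes code.

```python
from typing import Optional


def prune_due(last_prune_ts: Optional[float], now: float,
              min_interval_hours: float = 24) -> bool:
    """True if no sweep has run yet, or the last one is older than the interval.

    last_prune_ts stands in for the time recorded by the .last_prune marker.
    """
    if last_prune_ts is None:
        return True
    return now - last_prune_ts >= min_interval_hours * 3600


def should_delete(workdir_exists: bool, newest_commit_ts: float, now: float,
                  retention_days: float = 7,
                  delete_orphans: bool = True) -> bool:
    """Decide whether one shadow repo should be removed during a sweep."""
    if delete_orphans and not workdir_exists:
        return True  # orphan: the original working directory is gone
    # Stale: the newest checkpoint is older than the retention window.
    return now - newest_commit_ts > retention_days * 86400
```

With the defaults shown in the config, a repo whose newest commit is eight days old would be deleted, while one last committed yesterday would be kept as long as its working directory still exists.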

Listing Checkpoints

From a CLI session:

/rollback

Hermes responds with a formatted list showing change statistics:

📸 Checkpoints for /path/to/project:

  1. 4270a8c  2026-03-16 04:36  before patch  (1 file, +1/-0)
  2. eaf4c1f  2026-03-16 04:35  before write_file
  3. b3f9d2e  2026-03-16 04:34  before terminal: sed -i s/old/new/ config.py  (1 file, +1/-1)

  /rollback <N>             restore to checkpoint N
  /rollback diff <N>        preview changes since checkpoint N
  /rollback <N> <file>      restore a single file from checkpoint N

Each entry shows:

  • Short hash
  • Timestamp
  • Reason (what triggered the snapshot)
  • Change summary (files changed, insertions/deletions)

Previewing Changes with /rollback diff

Before committing to a restore, preview what has changed since a checkpoint:

/rollback diff 1

This shows a git diff stat summary followed by the actual diff:

test.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/test.py b/test.py
--- a/test.py
+++ b/test.py
@@ -1 +1 @@
-print('original content')
+print('modified content')

Long diffs are capped at 80 lines to avoid flooding the terminal.
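A cap like this can be sketched in a few lines. The helper name cap_diff and the wording of the truncation notice are hypothetical; only the 80-line limit comes from the text above.

```python
def cap_diff(diff_text: str, max_lines: int = 80) -> str:
    """Return diff_text unchanged if short, else its first max_lines lines
    followed by a one-line notice saying how much was omitted."""
    lines = diff_text.splitlines()
    if len(lines) <= max_lines:
        return diff_text
    hidden = len(lines) - max_lines
    notice = f"... ({hidden} more lines omitted)"
    return "\n".join(lines[:max_lines] + [notice])
```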

Restoring with /rollback

Restore to a checkpoint by number:

/rollback 1

Behind the scenes, Hermes:

  1. Verifies the target commit exists in the shadow repo.
  2. Takes a pre-rollback snapshot of the current state so you can "undo the undo" later.
  3. Restores tracked files in your working directory.
  4. Undoes the last conversation turn so the agent's context matches the restored filesystem state.

On success:

✅ Restored to checkpoint 4270a8c5: before patch
A pre-rollback snapshot was saved automatically.
(^_^)b Undid 4 message(s). Removed: "Now update test.py to ..."
  4 message(s) remaining in history.
  Chat turn undone to match restored file state.

The conversation undo ensures the agent doesn't "remember" changes that have been rolled back, avoiding confusion on the next turn.
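The conversation half of a rollback can be pictured as trimming the history back to just before the most recent user message. The sketch below assumes messages are dictionaries with a "role" key, which is a common chat format but not confirmed by this page. undo_last_turn is a name invented for the example.

```python
from typing import Dict, List, Tuple

Message = Dict[str, str]


def undo_last_turn(history: List[Message]) -> Tuple[List[Message], List[Message]]:
    """Split history at the most recent user message.

    Returns (kept, removed). The removed part is the last user message plus
    every assistant or tool message that followed it.
    """
    for i in range(len(history) - 1, -1, -1):
        if history[i]["role"] == "user":
            return history[:i], history[i:]
    return history, []  # no user message found: nothing to undo
```

In a history of eight messages where the last user message is the fifth, this removes four messages and leaves four, matching the shape of the sample output above.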

Single-File Restore

Restore just one file from a checkpoint without affecting the rest of the directory:

/rollback 1 src/broken_file.py

This is useful when the agent made changes to multiple files but only one needs to be reverted.

Safety and Performance Guards

To keep checkpointing safe and fast, Hermes applies several guardrails:

  • Git availability: if git is not found on PATH, checkpoints are transparently disabled.
  • Directory scope: Hermes skips overly broad directories (root /, home $HOME).
  • Repository size: directories with more than 50,000 files are skipped to avoid slow git operations.
  • No-change snapshots: if there are no changes since the last snapshot, the checkpoint is skipped.
  • Non-fatal errors: all errors inside the Checkpoint Manager are logged at debug level; your tools continue to run.
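The first three guards can be expressed as a single pre-flight check, sketched below. The function name skip_reason, its return strings, and the have_git override (added so the check can be exercised without git installed) are assumptions for illustration. Only the 50,000-file limit and the root/home rules come from the list above.

```python
import os
import shutil
from pathlib import Path
from typing import Optional

MAX_FILES = 50_000


def skip_reason(workdir: Path, max_files: int = MAX_FILES,
                have_git: Optional[bool] = None) -> Optional[str]:
    """Return why checkpointing should be skipped, or None if it may proceed."""
    if have_git is None:
        have_git = shutil.which("git") is not None
    if not have_git:
        return "git not found on PATH"

    resolved = workdir.resolve()
    # Filesystem root and the home directory are too broad to snapshot.
    if resolved == Path(resolved.anchor) or resolved == Path.home().resolve():
        return "directory too broad"

    count = 0
    for _root, _dirs, files in os.walk(resolved):
        count += len(files)
        if count > max_files:  # stop walking as soon as the limit is passed
            return "too many files"
    return None
```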

Where Checkpoints Live

All shadow repos live under:

~/.hermes/checkpoints/
  ├── <hash1>/   # shadow git repo for one working directory
  ├── <hash2>/
  └── ...

Each <hash> is derived from the absolute path of the working directory. Inside each shadow repo you'll find:

  • Standard git internals (HEAD, refs/, objects/)
  • An info/exclude file containing a curated ignore list
  • A HERMES_WORKDIR file pointing back to the original project root

You normally never need to touch these manually.
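If you ever need to find the shadow repo for a given project, the mapping from working directory to <hash> can be pictured as below. The page does not say which hash function or length Hermes uses, so SHA-256 truncated to 16 hex characters is purely an assumption here, and shadow_repo_path is a hypothetical helper.

```python
import hashlib
from pathlib import Path

CHECKPOINT_BASE = Path.home() / ".hermes" / "checkpoints"


def shadow_repo_path(workdir: str, base: Path = CHECKPOINT_BASE) -> Path:
    """Map a working directory to a stable shadow-repo location.

    ASSUMPTION: SHA-256 of the absolute path, truncated to 16 hex chars.
    The real derivation in Hermes may use a different hash or length.
    """
    abs_path = str(Path(workdir).resolve())
    digest = hashlib.sha256(abs_path.encode("utf-8")).hexdigest()[:16]
    return base / digest
```

Because the input is the resolved absolute path, the same project always maps to the same directory regardless of where the command is run from. In practice, the HERMES_WORKDIR file inside each shadow repo is the reliable way to see which project it belongs to.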

Best Practices

  • Leave checkpoints enabled — they're on by default and have zero cost when no files are modified.
  • Use /rollback diff before restoring — preview what will change to pick the right checkpoint.
  • Use /rollback instead of git reset when you want to undo agent-driven changes only.
  • Combine with Git worktrees for maximum safety — keep each Hermes session in its own worktree/branch, with checkpoints as an extra layer.

For running multiple agents in parallel on the same repo, see the guide on Git worktrees.