Compare commits

24 Commits

Author SHA1 Message Date
teknium1
daa0a938e4 fix(agent): route structured-reasoning empties to prefill, not nudge
Post-tool empty-response nudge fired before the prefill branch for thinking
models that emit reasoning via structured API fields (OpenRouter reasoning /
reasoning_details, e.g. qwen3-vl-8b-thinking). The nudge guard only checked
_has_inline_thinking (<think> tags in content), so every tool-using turn on
these models hit the nudge path — one wasted LLM round-trip (~3-5s, ~400
tokens) and a spurious warning, before self-recovering.

Hoist the _has_structured computation above the nudge guard and widen the
guard from 'not _has_inline_thinking' to 'not _has_structured'. Nudge and
prefill are now disjoint on _has_structured; the empty-retry branch's
existing _prefill_exhausted guard already handles always-reasoning models
falling through after prefill.

Closes #34655. Reported by @sawtdakhili.
2026-05-29 12:23:21 -07:00
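The routing change in daa0a938e4 can be sketched as a small decision function. This is a simplified illustration rather than the code in run_conversation: `Msg`, `route_empty_response`, and the returned labels are hypothetical names. Only the three reasoning attributes, the inline-thinking flag, and the prefill retry limit of 2 are taken from the diff.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Msg:
    """Hypothetical stand-in for an assistant message whose content is empty."""
    reasoning: Optional[str] = None
    reasoning_content: Optional[str] = None
    reasoning_details: Optional[Any] = None


def has_structured(msg: Any, has_inline_thinking: bool) -> bool:
    # Same definition the diff hoists above the nudge guard.
    return bool(
        getattr(msg, "reasoning", None)
        or getattr(msg, "reasoning_content", None)
        or getattr(msg, "reasoning_details", None)
        or has_inline_thinking
    )


def route_empty_response(
    msg: Any,
    *,
    prior_was_tool: bool,
    already_nudged: bool,
    has_inline_thinking: bool,
    prefill_retries: int,
) -> str:
    """Return which recovery branch an empty response would take (labels are illustrative)."""
    structured = has_structured(msg, has_inline_thinking)
    # Nudge only when the model produced no reasoning at all.
    if prior_was_tool and not already_nudged and not structured:
        return "nudge"
    # Reasoning present: treat the model as still working and prefill.
    if structured and prefill_retries < 2:
        return "prefill"
    return "empty-retry"
```

With this shape, a message carrying only `reasoning_details` after a tool call returns "prefill" instead of "nudge", which is the behavior the commit describes.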
kshitij
7379f17556 fix(gateway): only fire planned-stop watcher for self-targeting markers + fix Windows consume (#34749)
* fix(gateway): only fire planned-stop watcher for markers targeting self

Salvaged from #34599 — rebased onto current main.

The planned-stop watcher now only fires shutdown for a marker that targets
the current process, instead of any marker that exists on disk. Fixes the
Windows crash loop (#34597) where a stale marker from a previous Gateway
instance kills a freshly booted Gateway ~400ms after start with a false
"Received UNKNOWN — initiating shutdown".

Co-authored-by: Bartok9 <danielrpike9@gmail.com>

* fix(gateway): match planned-stop/takeover markers by PID alone when start_time is unavailable

Follow-up to the #34599 salvage. The watcher's non-destructive probe
(planned_stop_marker_targets_self) already falls back to PID equality when
a process start_time is unavailable, but the authoritative consume it gates
(_consume_pid_marker_for_self) still required a non-None start_time match.

_get_process_start_time reads /proc/<pid>/stat and returns None on macOS and on
native Windows, which is the only platform the planned-stop watcher exists for. So on
Windows the probe would fire the shutdown handler (PID matches) but the
handler's consume_planned_stop_marker_for_self() would return False, and a
legitimate 'hermes gateway stop' was still misclassified as an unexpected
UNKNOWN exit (exit 1) and revived by the service manager — a residual half of
the #34597 crash loop on the legitimate-stop path.

Align the consume with the probe: when both start_times are known they must
match (PID-reuse guard preserved on Linux); when either is unavailable, fall
back to PID equality alone, bounded by the existing short marker TTL. This
also fixes the parallel --replace takeover consume on Windows, which shares
the same helper.

Adds regression tests for the Windows (None start_time) path, the foreign-PID
rejection under that fallback, and confirmation the start_time-mismatch guard
still rejects when both are known.

---------

Co-authored-by: Bartok9 <danielrpike9@gmail.com>
2026-05-29 17:36:58 +00:00
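The matching rule that 7379f17556 applies to both the probe and the consume can be written as one pure function. This is a minimal sketch: `marker_targets_process` is a hypothetical name, and the real helpers also handle marker TTL, file reads, and unlinking, which are omitted here.

```python
from typing import Optional


def marker_targets_process(
    target_pid: int,
    target_start_time: Optional[int],
    our_pid: int,
    our_start_time: Optional[int],
) -> bool:
    """Decide whether a marker names the current process."""
    # A different PID never matches.
    if target_pid != our_pid:
        return False
    # When both start times are known they act as a PID-reuse guard.
    if target_start_time is not None and our_start_time is not None:
        return target_start_time == our_start_time
    # Either side unknown (e.g. no /proc): fall back to PID equality alone.
    return True
```

The three branches correspond to the three regression tests the commit message lists: foreign-PID rejection, start_time mismatch when both are known, and the None start_time path.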
alt-glitch
0563ab0652 fix(test): add fal_client.submit stub to surface matrix test
The plugin switched from fal_client.subscribe() to submit()+handle.get().
The test mock only had subscribe, causing CI failures.
2026-05-29 22:26:24 +05:30
alt-glitch
e46e4bcf47 fix(video_gen): parse duration suffix in success_response
int(payload["duration"]) raises ValueError on "4s" (veo3.1 format).
Strip non-digit chars before int conversion in the response builder.
2026-05-29 22:26:24 +05:30
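The fix in e46e4bcf47 amounts to stripping non-digit characters before calling int(). A minimal sketch, assuming the duration is a whole number of seconds; `parse_duration_seconds` is a hypothetical helper name, not the function in the plugin.

```python
import re


def parse_duration_seconds(value) -> int:
    """Convert a duration such as "4s", "6", or 8 into an integer."""
    # Drop everything that is not a digit, so "4s" becomes "4".
    digits = re.sub(r"\D", "", str(value))
    if not digits:
        raise ValueError(f"no digits in duration: {value!r}")
    return int(digits)
```

Because every non-digit is removed, a value like "4.5s" would be read as 45, so this approach only suits integer durations.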
alt-glitch
3183b2e28c fix(video_gen): veo3.1 duration format and 4k resolution
FAL veo3.1 API expects duration as "4s"/"6s"/"8s" (with unit suffix),
not bare "4"/"6"/"8" like other families. Add per-family duration_suffix
field and apply it in _build_payload. Also add "4k" to veo3.1 resolutions
per FAL API docs.

Note: the managed gateway currently rejects the "4s" format (expects
integer duration). Gateway-side fix needed for veo3.1 to work through
the Nous subscription path.
2026-05-29 22:26:24 +05:30
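The per-family suffix from 3183b2e28c could look roughly like the sketch below. The `Family` dataclass, the `build_payload` function, the "other" family, and every resolution except "4k" are hypothetical; only the `duration_suffix` field name and the "4s" versus "4" formats come from the commit message.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Family:
    """Hypothetical description of one model family."""
    name: str
    duration_suffix: str = ""  # appended to the duration in the payload
    resolutions: Tuple[str, ...] = ("720p", "1080p")  # placeholder values


VEO31 = Family("veo3.1", duration_suffix="s", resolutions=("720p", "1080p", "4k"))
OTHER = Family("other")  # stands in for families that expect a bare number


def build_payload(family: Family, prompt: str, duration: int, resolution: str) -> dict:
    if resolution not in family.resolutions:
        raise ValueError(f"{family.name} does not support resolution {resolution!r}")
    return {
        "prompt": prompt,
        # "4s" for veo3.1, "4" for families with an empty suffix.
        "duration": f"{duration}{family.duration_suffix}",
        "resolution": resolution,
    }
```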
alt-glitch
a4c18f65d4 feat(video_gen): wire Nous subscription override into hermes tools UX
Add the same managed-gateway UX that image_gen already has:

- TOOL_CATEGORIES['video_gen'] gets a 'Nous Subscription' provider row
  with managed_nous_feature='video_gen' + video_gen_plugin_name='fal'
- NousSubscriptionFeatures gains a video_gen property + feature state
  computation (managed/active/available using the fal-queue gateway)
- _GATEWAY_TOOL_LABELS, _GATEWAY_DIRECT_LABELS, _ALL_GATEWAY_KEYS,
  _get_gateway_direct_credentials, opted_in all include video_gen
- apply_nous_managed_defaults and apply_gateway_defaults handle video_gen
- _is_toolset_satisfied checks Nous features for video_gen
- _is_provider_active detects managed video_gen (use_gateway + fal provider)
- _select_plugin_video_gen_provider accepts use_gateway kwarg, propagated
  from all 4 call sites in _configure_provider when managed_feature is set
- hermes setup status shows 'Video Generation (FAL via Nous subscription)'

Users on a Nous subscription can now pick 'Nous Subscription' under
hermes tools → Video Generation, which sets video_gen.provider=fal +
video_gen.use_gateway=true. The FAL plugin's _resolve_managed_fal_video_gateway
then routes through the managed queue gateway — no FAL_KEY needed.
2026-05-29 22:26:24 +05:30
alt-glitch
b6294ea9f1 test(video_gen): cover gateway decision matrix gaps and 4xx error path
- Add test for 4xx ValueError with actionable remediation message
- Add test for is_available() returning True via managed gateway
- Add test for prefers_gateway overriding direct FAL_KEY
- Add test for is_available() via gateway in plugin test file
2026-05-29 22:26:24 +05:30
alt-glitch
d04b3c193e feat(video_gen): route FAL video gen through managed Nous gateway
Wire plugins/video_gen/fal/__init__.py to use the same
_ManagedFalSyncClient pattern that image gen already uses.

Changes:
- Add managed gateway resolution, client caching, and
  _submit_fal_video_request() that routes between direct FAL_KEY
  and Nous gateway modes
- Update is_available() to return True when either FAL_KEY or the
  managed gateway is reachable
- Update generate() to use submit+get handle pattern instead of
  fal_client.subscribe() directly
- Fix happy-horse endpoint namespace: fal-ai/ → alibaba/ (matches
  the tool-gateway allowlist from fal-video-gen branch)
- Surface actionable error on 4xx gateway rejections

Tests:
- 4 new tests in test_managed_media_gateways.py (gateway routing,
  client reuse, direct mode fallback, alibaba namespace)
- Updated existing test_fal_plugin.py fixture to use submit/handle
  pattern and patch _resolve_managed_fal_video_gateway for isolation
2026-05-29 22:26:24 +05:30
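The routing between direct FAL_KEY and managed gateway modes in d04b3c193e might be modeled as below. This is a sketch under stated assumptions: `resolve_video_mode` and its return labels are hypothetical, and the fallback to direct mode when the gateway is preferred but unreachable is a guess, since the commit messages do not specify that case.

```python
from typing import Optional


def resolve_video_mode(
    fal_key: Optional[str],
    gateway_ready: bool,
    prefers_gateway: bool,
) -> Optional[str]:
    """Pick "gateway", "direct", or None when nothing is usable."""
    # An explicit gateway preference overrides a configured FAL_KEY.
    if prefers_gateway and gateway_ready:
        return "gateway"
    # Assumption: a direct key wins when no preference applies.
    if fal_key:
        return "direct"
    if gateway_ready:
        return "gateway"
    return None


def is_available(fal_key: Optional[str], gateway_ready: bool, prefers_gateway: bool = False) -> bool:
    # Available when either the key or the managed gateway is usable.
    return resolve_video_mode(fal_key, gateway_ready, prefers_gateway) is not None
```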
kshitijk4poor
5cd0673217 ci: harden supply-chain gate jobs against changes-job failure
The scan-gate / dep-bounds-gate jobs use needs.changes; if the changes
job itself fails, its dependents would be skipped via a failed dependency
(not a conditional skip), leaving the required check unreported — the same
"pending forever" failure this PR fixes. Add always() and switch the gate
condition from == 'false' to != 'true' so the gate still fires (and reports
SUCCESS) when changes fails and its output is empty.
2026-05-29 09:17:01 -07:00
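The difference between the two gate conditions in 5cd0673217 is easiest to see as a truth table. The sketch below models the GitHub Actions behavior in Python, assuming that a job without a status-check function is skipped when a dependency fails and that a failed job's output reads as an empty string. The function names are hypothetical.

```python
def old_gate_runs(changes_ok: bool, output: str) -> bool:
    """Gate before the fix: implicit success() plus `== 'false'`."""
    return changes_ok and output == "false"


def new_gate_runs(changes_ok: bool, output: str) -> bool:
    """Gate after the fix: always() plus `!= 'true'`."""
    # changes_ok is deliberately ignored, mirroring always().
    return output != "true"


if __name__ == "__main__":
    for changes_ok, output in [(True, "true"), (True, "false"), (False, "")]:
        print(changes_ok, repr(output), old_gate_runs(changes_ok, output), new_gate_runs(changes_ok, output))
```

Only the last row differs: when the changes job fails and its output is empty, the old gate never runs and the required check stays unreported, while the new gate runs and reports a status.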
ethernet
6bc309baf2 ci: ensure required checks always report status
Remove paths filters from contributor-check and supply-chain-audit
workflows. When no matching files changed, the workflows never ran and
the required checks (check-attribution, supply chain scan, dep bounds)
stayed "pending" forever, blocking merge.

Now both workflows always trigger. A path-check step/job determines
whether the real work should run; gate jobs with matching names report
success when the real job was skipped, so branch protection always
gets a check status.

Also fixes dep-bounds: the old condition
  if: contains(github.event.pull_request.changed_files_url, 'pyproject.toml') || true
was always true (the || true made it unconditional). Now uses the
proper changes.deps output from the shared filter job.
2026-05-29 09:17:01 -07:00
ethernet
6928692cec Merge pull request #33773 from dvir-pashut/fix/nix-full-drop-stale-vercel-group
fix(nix): drop stale "vercel" group from #full variant
2026-05-29 11:16:25 -04:00
teknium1
75cd420b3b docs(skills): move antigravity-cli to autonomous-ai-agents in catalog + sidebar
2026-05-29 05:21:48 -07:00
teknium1
78d7fa1b5c refactor(skills/antigravity-cli): move to autonomous-ai-agents (it's an AI agent CLI)
2026-05-29 05:21:48 -07:00
teknium1
904c0b479b refactor(state): return FTS index count from vacuum()
Have vacuum() return optimize_fts()'s count so the CLI 'sessions optimize'
summary uses the real merged-index count instead of probing the private
_FTS_TABLES / _fts_table_exists() members.
2026-05-29 05:09:56 -07:00
kshitijk4poor
38695254f8 perf(state): merge FTS5 segments on VACUUM + add 'hermes sessions optimize'
The FTS5 indexes (messages_fts, messages_fts_trigram) grow as a series of
incremental b-tree segments — one per trigger-driven insert batch. SQLite's
automerge caps at ~16 segments, so a long-lived store keeps scanning many
segments per MATCH and never collapses them unless the special 'optimize'
command runs. Nothing in the codebase ever ran it: vacuum() only fired after
a prune that deleted rows, and even then never merged FTS segments.

Changes:
- SessionDB.optimize_fts(): merges each FTS5 index to a single segment,
  probing for the (optional/lazy) trigram table first so it is safe to call
  unconditionally. Layout-only — search results and snippet() are unchanged.
- vacuum() now calls optimize_fts() before VACUUM so freed index pages are
  returned to the OS in the same pass.
- 'hermes sessions optimize' CLI subcommand for on-demand reclamation +
  segment compaction (previously there was no way to compact the store
  without a prune deleting rows), with before/after size reporting.

Benchmark (8000 msgs, fragmented to 8 segments/index):
- segments 8 -> 1 on both indexes
- porter MATCH 5.5x faster (0.449 -> 0.081 ms/q)
- trigram MATCH 3.0x faster (0.632 -> 0.207 ms/q)
- 8000 matches before == 8000 after, identical row ids (no functional change)

Orthogonal to the structural FTS-size PRs (#20239 external-content,
#27770 optional trigram) — segment merge helps regardless of those.

Tests: TestOptimizeFts covers index count, search+snippet preservation,
missing-trigram path, and idempotency. Full test_hermes_state.py green (227).
2026-05-29 05:09:56 -07:00
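The segment merge in 38695254f8 relies on the FTS5 special 'optimize' command, which is issued as an INSERT into the index itself. Below is a minimal sketch using Python's sqlite3 module, assuming a SQLite build with FTS5 enabled. The table names come from the commit message; the function bodies are illustrative and are not SessionDB's code.

```python
import sqlite3

# Fixed table names from the commit message; never taken from user input.
FTS_TABLES = ("messages_fts", "messages_fts_trigram")


def optimize_fts(conn: sqlite3.Connection) -> int:
    """Merge each existing FTS5 index into one segment; return how many were merged."""
    merged = 0
    for table in FTS_TABLES:
        exists = conn.execute(
            "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
            (table,),
        ).fetchone()
        if not exists:
            continue  # e.g. the optional trigram table was never created
        # FTS5 special command: rewrites the index as a single b-tree segment.
        conn.execute(f"INSERT INTO {table}({table}) VALUES('optimize')")
        merged += 1
    conn.commit()  # VACUUM cannot run inside an open transaction
    return merged


def vacuum(conn: sqlite3.Connection) -> int:
    """Merge FTS segments first, then VACUUM so freed pages are reclaimed in one pass."""
    merged = optimize_fts(conn)
    conn.execute("VACUUM")
    return merged
```

Returning the merged count from `vacuum` mirrors the follow-up refactor in 904c0b479b, which lets the CLI report the number without probing private members.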
Teknium
2159d2a729 docs(credential-pools): document immediate rotation on usage-limit 429 (#34580)
The rotation flowchart only described the generic 'retry once, rotate on
second 429' path. ChatGPT/Codex plan-limit 429s carry a usage_limit_reached
reason and rotate to the next pool key immediately (no retry, since the cap
won't clear on retry). Document that case so the docs match the code.
2026-05-29 04:50:14 -07:00
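The two rotation paths documented in 2159d2a729 can be summarized as a single decision. This is a sketch of the documented behavior, not the project's code: `next_action_on_429` and its parameters are hypothetical, and only the `usage_limit_reached` reason string comes from the commit message.

```python
from typing import Optional


def next_action_on_429(reason: Optional[str], prior_429s_on_key: int) -> str:
    """Return "retry" or "rotate" for a 429 response on the current pool key."""
    # Plan-limit 429s will not clear on retry, so rotate immediately.
    if reason == "usage_limit_reached":
        return "rotate"
    # Generic path: retry once, rotate on the second 429.
    return "retry" if prior_429s_on_key == 0 else "rotate"
```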
teknium1
0dba60f73b docs(skills): regen catalog + sidebar for optional antigravity-cli skill
2026-05-29 04:49:42 -07:00
teknium1
632a7088a3 chore(skills/antigravity-cli): make optional, frame through Hermes tools, tighten frontmatter
2026-05-29 04:49:42 -07:00
Tony Simons
1bba5f27ab feat(skills): add antigravity-cli operator skill
2026-05-29 04:49:42 -07:00
teknium1
d6f2bdabda docs(skills): regen catalog + sidebar for optional grok skill
2026-05-29 04:49:38 -07:00
teknium1
99ddba94ed chore(skills/grok): make optional + tighten SKILL.md to modern format
2026-05-29 04:49:38 -07:00
Matt Maximo
10cd4138cc feat(skills): add grok skill for xAI Grok Build CLI
Adds a `grok` skill under `skills/autonomous-ai-agents/`, a third coding-agent orchestration guide alongside `codex` and `claude-code`. It teaches Hermes to delegate coding tasks to Grok Build (xAI's `grok` CLI).

- Headless `-p` one-shots (preferred)
- Interactive TUI via pty + tmux
- Session resume, background tasks, structured JSON output
- PR review and parallel worktree patterns
- Auth via SuperGrok / X Premium+ (`grok login`)
- Full pitfalls and config notes
2026-05-29 04:49:38 -07:00
Teknium
5e7c2ffa9f chore(models): gemini-3.5-flash replaces gemini-3-flash-preview in OpenRouter + Nous lists (#34581)
* chore(models): swap gemini-3-flash-preview for gemini-3.5-flash in OpenRouter + Nous lists

* chore(models): regenerate model-catalog.json for gemini-3.5-flash swap
2026-05-29 04:27:58 -07:00
dvir pashut
66265a0571 fix(nix): drop stale "vercel" group from #full variant
The `vercel` optional-dependency was removed from pyproject.toml in
#33067, but `nix/packages.nix` (added a few hours later in #33108)
still references `"vercel"` in the `#full` variant's
`extraDependencyGroups`. uv2nix fails evaluation with:

  error: Extra/group name 'vercel' does not match either extra or
  dependency group

Because `nix/devShell.nix` does
`inputsFrom = builtins.attrValues self'.packages`, the broken `#full`
derivation is pulled into the dev shell too, so `nix develop` /
direnv breaks on a fresh clone — not just `nix build .#full`.
2026-05-28 11:52:31 +03:00
33 changed files with 2146 additions and 109 deletions

View File

@@ -3,11 +3,9 @@ name: Contributor Attribution Check
on:
pull_request:
branches: [main]
paths:
# Only run when code files change (not docs-only PRs)
- '*.py'
- '**/*.py'
- '.github/workflows/contributor-check.yml'
# No paths filter — the job must always run so the required check
# reports a status (path-gated workflows leave checks "pending" forever
# when no matching files change, which blocks merge).
permissions:
contents: read
@@ -20,7 +18,21 @@ jobs:
with:
fetch-depth: 0 # Full history needed for git log
- name: Check if relevant files changed
id: filter
run: |
BASE="${{ github.event.pull_request.base.sha }}"
HEAD="${{ github.event.pull_request.head.sha }}"
CHANGED=$(git diff --name-only "$BASE"..."$HEAD" -- '*.py' '**/*.py' '.github/workflows/contributor-check.yml' || true)
if [ -n "$CHANGED" ]; then
echo "run=true" >> "$GITHUB_OUTPUT"
else
echo "run=false" >> "$GITHUB_OUTPUT"
echo "No Python files changed, skipping attribution check."
fi
- name: Check for unmapped contributor emails
if: steps.filter.outputs.run == 'true'
run: |
# Get the merge base between this PR and main
MERGE_BASE=$(git merge-base origin/main HEAD)

View File

@@ -3,15 +3,9 @@ name: Supply Chain Audit
on:
pull_request:
types: [opened, synchronize, reopened]
paths:
- '**/*.py'
- '**/*.pth'
- '**/setup.py'
- '**/setup.cfg'
- '**/sitecustomize.py'
- '**/usercustomize.py'
- '**/__init__.pth'
- 'pyproject.toml'
# No paths filter — the jobs must always run so required checks
# report a status (path-gated workflows leave checks "pending" forever
# when no matching files change, which blocks merge).
permissions:
pull-requests: write
@@ -27,8 +21,44 @@ permissions:
# advisory-only workflow instead.
jobs:
# ── Path filter (shared by both scan and dep-bounds) ───────────────
changes:
runs-on: ubuntu-latest
outputs:
# True when any file the scanner cares about changed in this PR
scan: ${{ steps.filter.outputs.scan }}
# True when pyproject.toml changed in this PR
deps: ${{ steps.filter.outputs.deps }}
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 0
- name: Check for relevant file changes
id: filter
run: |
BASE="${{ github.event.pull_request.base.sha }}"
HEAD="${{ github.event.pull_request.head.sha }}"
SCAN_FILES=$(git diff --name-only "$BASE"..."$HEAD" -- \
'*.py' '**/*.py' '*.pth' '**/*.pth' \
'setup.py' 'setup.cfg' \
'sitecustomize.py' 'usercustomize.py' '__init__.pth' \
'pyproject.toml' || true)
if [ -n "$SCAN_FILES" ]; then
echo "scan=true" >> "$GITHUB_OUTPUT"
else
echo "scan=false" >> "$GITHUB_OUTPUT"
fi
DEPS_FILES=$(git diff --name-only "$BASE"..."$HEAD" -- 'pyproject.toml' || true)
if [ -n "$DEPS_FILES" ]; then
echo "deps=true" >> "$GITHUB_OUTPUT"
else
echo "deps=false" >> "$GITHUB_OUTPUT"
fi
scan:
name: Scan PR for critical supply chain risks
needs: changes
if: needs.changes.outputs.scan == 'true'
runs-on: ubuntu-latest
steps:
- name: Checkout
@@ -147,10 +177,24 @@ jobs:
echo "::error::CRITICAL supply chain risk patterns detected in this PR. See the PR comment for details."
exit 1
# Gate: reports success when scan was skipped (no relevant files changed).
# This ensures the required check always gets a status.
scan-gate:
name: Scan PR for critical supply chain risks
needs: changes
# always() so the gate still reports SUCCESS even if `changes` fails/is
# skipped — without it, a failed dependency would leave the required
# check unreported (i.e. "pending"), the exact failure mode this fixes.
if: always() && needs.changes.outputs.scan != 'true'
runs-on: ubuntu-latest
steps:
- run: echo "No supply-chain-relevant files changed, skipping scan."
dep-bounds:
name: Check PyPI dependency upper bounds
needs: changes
if: needs.changes.outputs.deps == 'true'
runs-on: ubuntu-latest
if: contains(github.event.pull_request.changed_files_url, 'pyproject.toml') || true
steps:
- name: Checkout
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
@@ -211,3 +255,16 @@ jobs:
run: |
echo "::error::PyPI dependencies without upper bounds detected. Add <next_major ceiling per CONTRIBUTING.md policy."
exit 1
# Gate: reports success when dep-bounds was skipped (no pyproject.toml changed).
# This ensures the required check always gets a status.
dep-bounds-gate:
name: Check PyPI dependency upper bounds
needs: changes
# always() so the gate still reports SUCCESS even if `changes` fails/is
# skipped — without it, a failed dependency would leave the required
# check unreported (i.e. "pending"), the exact failure mode this fixes.
if: always() && needs.changes.outputs.deps != 'true'
runs-on: ubuntu-latest
steps:
- run: echo "No pyproject.toml changes, skipping dependency bounds check."

View File

@@ -3981,10 +3981,25 @@ def run_conversation(
re.IGNORECASE,
)
)
# Detect structured reasoning emitted via API fields
# (OpenRouter `reasoning` / `reasoning_details`, or the
# streaming-accumulated `reasoning_content`). Thinking
# models like qwen3-vl-8b-thinking return reasoning here
# with empty content after tool calls — that's the model
# still working, not a genuine empty response. Compute
# this BEFORE the nudge guard so those turns route to the
# prefill branch below instead of wasting an LLM round-trip
# on a nudge.
_has_structured = bool(
getattr(assistant_message, "reasoning", None)
or getattr(assistant_message, "reasoning_content", None)
or getattr(assistant_message, "reasoning_details", None)
or _has_inline_thinking
)
if (
_prior_was_tool
and not getattr(agent, "_post_tool_empty_retried", False)
and not _has_inline_thinking # thinking model still working — let prefill handle
and not _has_structured # thinking model still working — let prefill handle
):
agent._post_tool_empty_retried = True
# Clear stale narration so it doesn't resurface
@@ -4028,12 +4043,8 @@ def run_conversation(
# Inspired by clawdbot's "incomplete-text" recovery.
# Also covers Qwen3/Ollama in-content <think> blocks
# (detected above as _has_inline_thinking).
_has_structured = bool(
getattr(assistant_message, "reasoning", None)
or getattr(assistant_message, "reasoning_content", None)
or getattr(assistant_message, "reasoning_details", None)
or _has_inline_thinking
)
# _has_structured was computed above the nudge guard so
# both branches share the same definition.
if _has_structured and agent._thinking_prefill_retries < 2:
agent._thinking_prefill_retries += 1
logger.info(

View File

@@ -18442,7 +18442,10 @@ def _run_planned_stop_watcher(
poll_interval: seconds between marker checks. 0.5s gives a
responsive shutdown without burning CPU.
"""
from gateway.status import _get_planned_stop_marker_path
from gateway.status import (
_get_planned_stop_marker_path,
planned_stop_marker_targets_self,
)
marker_path = _get_planned_stop_marker_path()
while not stop_event.is_set():
try:
@@ -18451,6 +18454,26 @@ def _run_planned_stop_watcher(
and not getattr(runner, "_draining", False)
and getattr(runner, "_running", False)
):
# A marker existing is NOT sufficient — it may have been
# written for a PREVIOUS gateway instance (different PID)
# and left behind because that process exited before the
# CLI's stop() could clean it up. Firing the handler on a
# stale/foreign marker drives the gateway into shutdown,
# then consume_planned_stop_marker_for_self() correctly
# reports a PID mismatch — but by then we're already
# stopping, so it's logged as an unexpected "UNKNOWN" exit
# and the watchdog crash-loops the gateway (issue #34597,
# a regression from PR #33798 which added this watcher
# without the PID check).
#
# Only fire when the marker actually targets us. The probe
# is non-destructive on a match (the handler does the
# authoritative consume on the loop thread) and self-heals
# by unlinking stale/malformed markers so they cannot wedge
# a freshly booted gateway.
if not planned_stop_marker_targets_self():
stop_event.wait(poll_interval)
continue
# Drive the same path as a real signal handler.
# Pass signal=None — the handler tolerates that and consumes
# the marker via consume_planned_stop_marker_for_self,

View File

@@ -816,12 +816,24 @@ def _consume_pid_marker_for_self(
our_pid = os.getpid()
our_start_time = _get_process_start_time(our_pid)
matches = (
target_pid == our_pid
and target_start_time is not None
and our_start_time is not None
and target_start_time == our_start_time
)
# Start-time is a PID-reuse guard. It is only meaningful when both
# sides actually have it: ``_get_process_start_time`` returns None on
# platforms without ``/proc`` (macOS, native Windows — the very
# platform the planned-stop watcher exists for). Requiring a non-None
# match there would make every consume return False, so a legitimate
# ``hermes gateway stop`` on Windows would be misclassified as an
# unexpected ``UNKNOWN`` exit (exit 1) and revived by the service
# manager. So: when both start_times are known they must match; when
# either is unknown, fall back to PID equality alone (bounded by the
# marker's short TTL). This mirrors ``planned_stop_marker_targets_self``
# so the watcher's non-destructive probe and this authoritative
# consume agree on every platform (issue #34597).
if target_pid != our_pid:
matches = False
elif target_start_time is not None and our_start_time is not None:
matches = target_start_time == our_start_time
else:
matches = True
try:
path.unlink(missing_ok=True)
@@ -914,6 +926,68 @@ def consume_planned_stop_marker_for_self() -> bool:
)
def planned_stop_marker_targets_self() -> bool:
"""Return True only when a live planned-stop marker names the current process.
This is a **non-destructive** probe used by the watcher thread
(``gateway/run.py:_run_planned_stop_watcher``) to decide whether to
trigger shutdown. Unlike :func:`consume_planned_stop_marker_for_self`,
it never unlinks a marker that matches us — the shutdown handler does
the authoritative consume on its own thread.
It *does* clean up markers that can never apply to this process:
malformed markers and markers older than the TTL are unlinked so a
stale file left behind by a previous gateway instance cannot wedge
the new one. Markers naming a different PID/start_time are left in
place (they may still be consumed legitimately by the process they
name) but report False here.
Returns False (without raising) on any read/parse error.
"""
path = _get_planned_stop_marker_path()
record = _read_json_file(path)
if not record:
return False
try:
target_pid = int(record["target_pid"])
target_start_time = record.get("target_start_time")
written_at = record.get("written_at") or ""
except (KeyError, TypeError, ValueError):
# Malformed marker can never match anyone — drop it.
try:
path.unlink(missing_ok=True)
except OSError:
pass
return False
if _marker_is_stale(written_at, _PLANNED_STOP_MARKER_TTL_S):
# A marker this old is past its useful life regardless of target —
# clean it up so it cannot crash-loop a freshly booted gateway.
try:
path.unlink(missing_ok=True)
except OSError:
pass
return False
our_pid = os.getpid()
if target_pid != our_pid:
return False
# Start-time is a PID-reuse guard. It is only meaningful when both
# sides actually have it: ``_get_process_start_time`` returns None on
# platforms without ``/proc`` (macOS, native Windows — the very
# platform this watcher exists for). Requiring a non-None match there
# would make the watcher never fire and re-break the #33778 Windows
# session-resume path. So: when both start_times are known they must
# match; when either is unknown, fall back to PID equality alone
# (the marker is short-lived under a 60s TTL, bounding reuse risk).
our_start_time = _get_process_start_time(our_pid)
if target_start_time is not None and our_start_time is not None:
return target_start_time == our_start_time
return True
def clear_planned_stop_marker() -> None:
"""Remove the planned-stop marker unconditionally."""
try:

View File

@@ -13390,6 +13390,11 @@ Examples:
"--yes", "-y", action="store_true", help="Skip confirmation"
)
sessions_subparsers.add_parser(
"optimize",
help="Reclaim disk space: merge FTS5 segments + VACUUM (no data change)",
)
sessions_subparsers.add_parser("stats", help="Show session store statistics")
sessions_rename = sessions_subparsers.add_parser(
@@ -13562,6 +13567,34 @@ Examples:
relaunch(["--resume", selected_id])
return # won't reach here after execvp
elif action == "optimize":
db_path = db.db_path
before_mb = (
os.path.getsize(db_path) / (1024 * 1024)
if db_path.exists()
else 0.0
)
print("Optimizing session store (FTS merge + VACUUM)…")
try:
# vacuum() merges FTS5 segments (optimize_fts) then VACUUMs,
# and returns the number of indexes it merged.
n = db.vacuum()
except Exception as e:
print(f"Error: optimization failed: {e}")
db.close()
return
after_mb = (
os.path.getsize(db_path) / (1024 * 1024)
if db_path.exists()
else 0.0
)
saved = before_mb - after_mb
print(f"Optimized {n} FTS index(es).")
print(
f"Database size: {before_mb:.1f} MB -> {after_mb:.1f} MB "
f"(reclaimed {saved:.1f} MB)"
)
elif action == "stats":
total = db.session_count()
msgs = db.message_count()

View File

@@ -49,7 +49,7 @@ OPENROUTER_MODELS: list[tuple[str, str]] = [
("xiaomi/mimo-v2.5-pro", ""),
("tencent/hy3-preview", ""),
("google/gemini-3-pro-image-preview", ""),
("google/gemini-3-flash-preview", ""),
("google/gemini-3.5-flash", ""),
("google/gemini-3.1-pro-preview", ""),
("google/gemini-3.1-flash-lite-preview", ""),
("qwen/qwen3.6-35b-a3b", ""),
@@ -156,7 +156,7 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"xiaomi/mimo-v2.5-pro",
"tencent/hy3-preview",
"google/gemini-3-pro-preview",
"google/gemini-3-flash-preview",
"google/gemini-3.5-flash",
"google/gemini-3.1-pro-preview",
"google/gemini-3.1-flash-lite-preview",
"qwen/qwen3.6-35b-a3b",

View File

@@ -71,12 +71,16 @@ class NousSubscriptionFeatures:
def browser(self) -> NousFeatureState:
return self.features["browser"]
@property
def video_gen(self) -> NousFeatureState:
return self.features["video_gen"]
@property
def modal(self) -> NousFeatureState:
return self.features["modal"]
def items(self) -> Iterable[NousFeatureState]:
ordered = ("web", "image_gen", "tts", "browser", "modal")
ordered = ("web", "image_gen", "video_gen", "tts", "browser", "modal")
for key in ordered:
yield self.features[key]
@@ -255,6 +259,7 @@ def get_nous_subscription_features(
web_tool_enabled = _toolset_enabled(config, "web")
image_tool_enabled = _toolset_enabled(config, "image_gen")
video_tool_enabled = _toolset_enabled(config, "video_gen")
tts_tool_enabled = _toolset_enabled(config, "tts")
browser_tool_enabled = _toolset_enabled(config, "browser")
modal_tool_enabled = _toolset_enabled(config, "terminal")
@@ -289,6 +294,8 @@ def get_nous_subscription_features(
browser_use_gateway = _uses_gateway(browser_cfg)
image_gen_cfg = config.get("image_gen") if isinstance(config.get("image_gen"), dict) else {}
image_use_gateway = _uses_gateway(image_gen_cfg)
video_gen_cfg = config.get("video_gen") if isinstance(config.get("video_gen"), dict) else {}
video_use_gateway = _uses_gateway(video_gen_cfg)
direct_exa = bool(get_env_value("EXA_API_KEY"))
direct_firecrawl = bool(get_env_value("FIRECRAWL_API_KEY") or get_env_value("FIRECRAWL_API_URL"))
@@ -296,6 +303,7 @@ def get_nous_subscription_features(
direct_tavily = bool(get_env_value("TAVILY_API_KEY"))
direct_searxng = bool(get_env_value("SEARXNG_URL"))
direct_fal = fal_key_is_configured()
direct_fal_video = direct_fal # same FAL_KEY; separate var so use_gateway is independent
direct_openai_tts = bool(resolve_openai_audio_api_key())
direct_elevenlabs = bool(get_env_value("ELEVENLABS_API_KEY"))
direct_camofox = bool(get_env_value("CAMOFOX_URL"))
@@ -311,6 +319,8 @@ def get_nous_subscription_features(
direct_tavily = False
if image_use_gateway:
direct_fal = False
if video_use_gateway:
direct_fal_video = False
if tts_use_gateway:
direct_openai_tts = False
direct_elevenlabs = False
@@ -320,6 +330,8 @@ def get_nous_subscription_features(
managed_web_available = managed_tools_flag and nous_auth_present and is_managed_tool_gateway_ready("firecrawl")
managed_image_available = managed_tools_flag and nous_auth_present and is_managed_tool_gateway_ready("fal-queue")
# Video gen uses the same fal-queue gateway as image gen.
managed_video_available = managed_image_available
managed_tts_available = managed_tools_flag and nous_auth_present and is_managed_tool_gateway_ready("openai-audio")
managed_browser_available = managed_tools_flag and nous_auth_present and is_managed_tool_gateway_ready("browser-use")
managed_modal_available = managed_tools_flag and nous_auth_present and is_managed_tool_gateway_ready("modal")
@@ -357,6 +369,10 @@ def get_nous_subscription_features(
image_active = bool(image_tool_enabled and (image_managed or direct_fal))
image_available = bool(managed_image_available or direct_fal)
video_managed = video_tool_enabled and managed_video_available and not direct_fal_video
video_active = bool(video_tool_enabled and (video_managed or direct_fal_video))
video_available = bool(managed_video_available or direct_fal_video)
tts_current_provider = tts_provider or "edge"
tts_managed = (
tts_tool_enabled
@@ -451,6 +467,18 @@ def get_nous_subscription_features(
current_provider="FAL" if direct_fal else ("Nous Subscription" if image_managed else ""),
explicit_configured=direct_fal,
),
"video_gen": NousFeatureState(
key="video_gen",
label="Video generation",
included_by_default=False,
available=video_available,
active=video_active,
managed_by_nous=video_managed,
direct_override=video_active and not video_managed,
toolset_enabled=video_tool_enabled,
current_provider="FAL" if direct_fal_video else ("Nous Subscription" if video_managed else ""),
explicit_configured=direct_fal_video,
),
"tts": NousFeatureState(
key="tts",
label="OpenAI TTS",
@@ -561,6 +589,9 @@ def apply_nous_managed_defaults(
if "image_gen" in selected_toolsets and not fal_key_is_configured():
changed.add("image_gen")
if "video_gen" in selected_toolsets and not fal_key_is_configured():
changed.add("video_gen")
return changed
@@ -571,6 +602,7 @@ def apply_nous_managed_defaults(
_GATEWAY_TOOL_LABELS = {
"web": "Web search & extract (Firecrawl)",
"image_gen": "Image generation (FAL)",
"video_gen": "Video generation (FAL)",
"tts": "Text-to-speech (OpenAI TTS)",
"browser": "Browser automation (Browser Use)",
}
@@ -578,6 +610,7 @@ _GATEWAY_TOOL_LABELS = {
def _get_gateway_direct_credentials() -> Dict[str, bool]:
"""Return a dict of tool_key -> has_direct_credentials."""
fal_direct = fal_key_is_configured()
return {
"web": bool(
get_env_value("FIRECRAWL_API_KEY")
@@ -586,7 +619,8 @@ def _get_gateway_direct_credentials() -> Dict[str, bool]:
or get_env_value("TAVILY_API_KEY")
or get_env_value("EXA_API_KEY")
),
"image_gen": fal_key_is_configured(),
"image_gen": fal_direct,
"video_gen": fal_direct,
"tts": bool(
resolve_openai_audio_api_key()
or get_env_value("ELEVENLABS_API_KEY")
@@ -601,11 +635,12 @@ def _get_gateway_direct_credentials() -> Dict[str, bool]:
_GATEWAY_DIRECT_LABELS = {
"web": "Firecrawl/Exa/Parallel/Tavily key",
"image_gen": "FAL key",
"video_gen": "FAL key",
"tts": "OpenAI/ElevenLabs key",
"browser": "Browser Use/Browserbase key",
}
_ALL_GATEWAY_KEYS = ("web", "image_gen", "tts", "browser")
_ALL_GATEWAY_KEYS = ("web", "image_gen", "video_gen", "tts", "browser")
def get_gateway_eligible_tools(
@@ -646,6 +681,7 @@ def get_gateway_eligible_tools(
opted_in = {
"web": _uses_gateway(config.get("web")),
"image_gen": _uses_gateway(config.get("image_gen")),
"video_gen": _uses_gateway(config.get("video_gen")),
"tts": _uses_gateway(config.get("tts")),
"browser": _uses_gateway(config.get("browser")),
}
@@ -714,6 +750,15 @@ def apply_gateway_defaults(
image_cfg["use_gateway"] = True
changed.add("image_gen")
if "video_gen" in tool_keys:
video_cfg = config.get("video_gen")
if not isinstance(video_cfg, dict):
video_cfg = {}
config["video_gen"] = video_cfg
video_cfg["provider"] = "fal"
video_cfg["use_gateway"] = True
changed.add("video_gen")
return changed

@@ -454,22 +454,25 @@ def _print_setup_summary(config: dict, hermes_home):
# Video generation — opt-in via `hermes tools` → Video Generation.
# Only show the row when a plugin reports available so we don't badger
# users who don't care about video gen with a "missing" status line.
try:
from agent.video_gen_registry import list_providers as _list_video_providers
from hermes_cli.plugins import _ensure_plugins_discovered as _ensure_plugins
_ensure_plugins()
_video_backend = None
for _vp in _list_video_providers():
try:
if _vp.is_available():
_video_backend = _vp.display_name
break
except Exception:
continue
except Exception:
_video_backend = None
if _video_backend:
tool_status.append((f"Video Generation ({_video_backend})", True, None))
if subscription_features.video_gen.managed_by_nous:
tool_status.append(("Video Generation (FAL via Nous subscription)", True, None))
else:
try:
from agent.video_gen_registry import list_providers as _list_video_providers
from hermes_cli.plugins import _ensure_plugins_discovered as _ensure_plugins
_ensure_plugins()
_video_backend = None
for _vp in _list_video_providers():
try:
if _vp.is_available():
_video_backend = _vp.display_name
break
except Exception:
continue
except Exception:
_video_backend = None
if _video_backend:
tool_status.append((f"Video Generation ({_video_backend})", True, None))
# TTS — show configured provider
tts_provider = cfg_get(config, "tts", "provider", default="edge")

@@ -339,11 +339,26 @@ TOOL_CATEGORIES = {
"video_gen": {
"name": "Video Generation",
"icon": "🎬",
# Providers list is intentionally empty — every video gen backend
# is a plugin, surfaced by ``_plugin_video_gen_providers()`` and
# injected by ``_visible_providers``. Mirrors the design we'll
# converge image_gen toward.
"providers": [],
# "Nous Subscription" row mirrors the image_gen pattern — managed
# FAL video generation billed via the Nous Portal. Plugin-backed
# provider rows (FAL BYOK, xAI, …) are injected at runtime by
# ``_plugin_video_gen_providers()`` in ``_visible_providers``.
"providers": [
{
"name": "Nous Subscription",
"badge": "subscription",
"tag": "Managed FAL video generation billed to your subscription",
"env_vars": [],
"requires_nous_auth": True,
"managed_nous_feature": "video_gen",
"override_env_vars": ["FAL_KEY"],
# The underlying plugin backend — when the user picks
# "Nous Subscription" we set video_gen.provider = "fal"
# and video_gen.use_gateway = True so the FAL plugin
# routes through the managed queue gateway.
"video_gen_plugin_name": "fal",
},
],
},
"x_search": {
"name": "X (Twitter) Search",
@@ -1438,7 +1453,7 @@ def _toolset_has_keys(
except Exception:
return False
if ts_key in {"web", "image_gen", "tts", "browser"}:
if ts_key in {"web", "image_gen", "video_gen", "tts", "browser"}:
features = get_nous_subscription_features(config, force_fresh=force_fresh)
feature = features.features.get(ts_key)
if feature and (feature.available or feature.managed_by_nous):
@@ -2153,7 +2168,7 @@ def _is_provider_active(
return isinstance(image_cfg, dict) and image_cfg.get("provider") == plugin_name
video_plugin_name = provider.get("video_gen_plugin_name")
if video_plugin_name:
if video_plugin_name and not provider.get("managed_nous_feature"):
video_cfg = config.get("video_gen", {})
return isinstance(video_cfg, dict) and video_cfg.get("provider") == video_plugin_name
@@ -2172,6 +2187,15 @@ def _is_provider_active(
if image_cfg.get("use_gateway") is not None and not is_truthy_value(image_cfg.get("use_gateway"), default=False):
return False
return feature.managed_by_nous
if managed_feature == "video_gen":
video_cfg = config.get("video_gen", {})
if isinstance(video_cfg, dict):
configured_provider = video_cfg.get("provider")
if configured_provider not in {None, "", "fal"}:
return False
if video_cfg.get("use_gateway") is not None and not is_truthy_value(video_cfg.get("use_gateway"), default=False):
return False
return feature.managed_by_nous
if provider.get("tts_provider"):
return (
feature.managed_by_nous
@@ -2505,14 +2529,14 @@ def _configure_videogen_model_for_plugin(plugin_name: str, config: dict) -> None
_print_success(f" Model set to: {chosen}")
def _select_plugin_video_gen_provider(plugin_name: str, config: dict) -> None:
def _select_plugin_video_gen_provider(plugin_name: str, config: dict, *, use_gateway: bool = False) -> None:
"""Persist a plugin-backed video generation provider selection."""
vid_cfg = config.setdefault("video_gen", {})
if not isinstance(vid_cfg, dict):
vid_cfg = {}
config["video_gen"] = vid_cfg
vid_cfg["provider"] = plugin_name
vid_cfg["use_gateway"] = False
vid_cfg["use_gateway"] = use_gateway
_print_success(f" video_gen.provider set to: {plugin_name}")
_configure_videogen_model_for_plugin(plugin_name, config)
@@ -2597,7 +2621,7 @@ def _configure_provider(
# registry.
video_plugin = provider.get("video_gen_plugin_name")
if video_plugin:
_select_plugin_video_gen_provider(video_plugin, config)
_select_plugin_video_gen_provider(video_plugin, config, use_gateway=bool(managed_feature))
return
# Imagegen backends prompt for model selection after backend pick.
backend = provider.get("imagegen_backend")
@@ -2676,7 +2700,7 @@ def _configure_provider(
return
video_plugin = provider.get("video_gen_plugin_name")
if video_plugin:
_select_plugin_video_gen_provider(video_plugin, config)
_select_plugin_video_gen_provider(video_plugin, config, use_gateway=bool(managed_feature))
return
# Imagegen backends prompt for model selection after env vars are in.
backend = provider.get("imagegen_backend")
@@ -2957,7 +2981,7 @@ def _reconfigure_provider(
# Plugin-registered video_gen provider — same flow, different registry.
video_plugin = provider.get("video_gen_plugin_name")
if video_plugin:
_select_plugin_video_gen_provider(video_plugin, config)
_select_plugin_video_gen_provider(video_plugin, config, use_gateway=bool(managed_feature))
return
# Imagegen backends prompt for model selection on reconfig too.
backend = provider.get("imagegen_backend")
@@ -2997,7 +3021,7 @@ def _reconfigure_provider(
# Plugin-registered video_gen provider — same flow, different registry.
video_plugin = provider.get("video_gen_plugin_name")
if video_plugin:
_select_plugin_video_gen_provider(video_plugin, config)
_select_plugin_video_gen_provider(video_plugin, config, use_gateway=bool(managed_feature))
return
backend = provider.get("imagegen_backend")

@@ -3251,7 +3251,59 @@ class SessionDB:
# ── Space reclamation ──
def vacuum(self) -> None:
# FTS5 virtual tables whose b-tree segments we merge on optimize. The
# trigram table is created lazily / may be disabled, so we probe before
# touching it (see optimize_fts).
_FTS_TABLES = ("messages_fts", "messages_fts_trigram")
def _fts_table_exists(self, name: str) -> bool:
"""True if an FTS5 virtual table is queryable in this DB."""
try:
self._conn.execute(f"SELECT 1 FROM {name} LIMIT 0")
return True
except sqlite3.OperationalError:
return False
def optimize_fts(self) -> int:
"""Merge fragmented FTS5 b-tree segments into one per index.
FTS5 indexes grow as a series of incremental segments — one per
``INSERT`` batch driven by the message triggers. Over tens of
thousands of messages these segments accumulate, which both bloats
the ``*_data`` shadow tables and slows ``MATCH`` queries that must
scan every segment. The special ``'optimize'`` command rewrites each
index as a single merged segment.
This is purely a maintenance operation — it changes neither search
results nor ``snippet()`` output, only on-disk layout and query
speed. It is complementary to VACUUM: ``optimize`` compacts the FTS
index internally, then VACUUM returns the freed pages to the OS.
Skips any FTS table that does not exist (e.g. the trigram index when
disabled via ``HERMES_DISABLE_FTS_TRIGRAM`` or not yet created), so
it is safe to call unconditionally.
Returns the number of FTS indexes that were optimized.
"""
optimized = 0
with self._lock:
for tbl in self._FTS_TABLES:
if not self._fts_table_exists(tbl):
continue
try:
# The column name in the INSERT must match the table name
# for FTS5 special commands.
self._conn.execute(
f"INSERT INTO {tbl}({tbl}) VALUES('optimize')"
)
optimized += 1
except sqlite3.OperationalError as exc:
logger.warning(
"FTS optimize failed for %s: %s", tbl, exc
)
return optimized
def vacuum(self) -> int:
"""Run VACUUM to reclaim disk space after large deletes.
SQLite does not shrink the database file when rows are deleted —
@@ -3264,7 +3316,21 @@ class SessionDB:
exclusive lock, so callers must ensure no other writers are
active. Safe to call at startup before the gateway/CLI starts
serving traffic.
FTS5 segments are merged first via :meth:`optimize_fts` so the
subsequent VACUUM reclaims the pages freed by the merge. This is a
layout-only optimization — search results are unchanged.
Returns the number of FTS indexes that were optimized (0 if the
merge step failed or no FTS tables exist).
"""
# Merge FTS5 segments before VACUUM so the freed pages are returned
# to the OS in the same pass. optimize_fts() manages its own lock.
optimized = 0
try:
optimized = self.optimize_fts()
except Exception as exc:
logger.warning("FTS optimize before VACUUM failed: %s", exc)
# VACUUM cannot be executed inside a transaction.
with self._lock:
# Best-effort WAL checkpoint first, then VACUUM.
@@ -3273,6 +3339,7 @@ class SessionDB:
except Exception:
pass
self._conn.execute("VACUUM")
return optimized
def maybe_auto_prune_and_vacuum(
self,
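The optimize-then-VACUUM order introduced in the hunks above can be tried in isolation. The following is a minimal sketch, not the `SessionDB` implementation: it assumes an in-memory database, an SQLite build with FTS5 enabled, and an illustrative table name `demo_fts` that does not appear in the change.

```python
import sqlite3

# Hypothetical standalone demo; `demo_fts` is an illustrative table name,
# not one of the tables SessionDB manages.
# isolation_level=None puts the connection in autocommit mode, so VACUUM
# is never issued inside an open transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE VIRTUAL TABLE demo_fts USING fts5(body)")

# Every committed insert writes to the index incrementally
# (FTS5 may merge some segments on its own along the way).
for i in range(200):
    conn.execute("INSERT INTO demo_fts(body) VALUES (?)", (f"message number {i}",))

hits_before = conn.execute(
    "SELECT count(*) FROM demo_fts WHERE demo_fts MATCH 'message'"
).fetchone()[0]

# FTS5 special command: the column named in the INSERT is the table itself.
conn.execute("INSERT INTO demo_fts(demo_fts) VALUES('optimize')")
# Reclaim the pages freed by the merge.
conn.execute("VACUUM")

hits_after = conn.execute(
    "SELECT count(*) FROM demo_fts WHERE demo_fts MATCH 'message'"
).fetchone()[0]
```

Comparing `hits_before` and `hits_after` is one way to check the claim in the docstring that the merge changes layout only, not search results.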

@@ -43,7 +43,6 @@
"modal"
"parallel-web"
"tts-premium"
"vercel"
"voice"
] ++ lib.optionals pkgs.stdenv.isLinux [ "matrix" ];
};

@@ -0,0 +1,177 @@
---
name: antigravity-cli
description: "Operate the Antigravity CLI (agy): plugins, auth, sandbox."
version: 0.1.0
author: Tony Simons (asimons81), Hermes Agent
license: MIT
platforms: [linux, macos, windows]
metadata:
hermes:
tags: [Coding-Agent, Antigravity, CLI, Auth, Plugins, Sandbox]
related_skills: [grok, codex, claude-code, hermes-agent]
---
# Antigravity CLI (`agy`)
Operator guide for the Antigravity CLI, invoked as `agy`. Run all `agy`
commands through the Hermes `terminal` tool; inspect its config and logs with
`read_file`. This skill is reference + procedure — it does not wrap a network
API, so there is nothing to authenticate from Hermes itself.
## When to Use
- Installing, updating, or smoke-testing the `agy` binary
- Driving non-interactive `agy --print` / `agy -p` one-shots
- Debugging Antigravity auth, sandbox, permissions, or plugin state
- Reading Antigravity settings, keybindings, conversations, or logs
## Mental model
Antigravity has two layers — keep them distinct or the guidance will be wrong:
1. **Shell wrapper commands**: `agy help`, `agy install`, `agy plugin`,
`agy update`, `agy changelog`. Run these through the `terminal` tool.
2. **Interactive in-session slash commands**: `/config`, `/permissions`,
`/skills`, `/agents`, etc. These only exist inside a running `agy` TUI
session, not on the shell wrapper.
`agy help` shows the shell wrapper surface, NOT the in-session slash commands.
## Prerequisites
- The `agy` binary on PATH. Verify through the `terminal` tool:
`command -v agy && agy --version`.
- No env vars or API keys required by this skill — Antigravity manages its own
auth via the OS keyring / browser sign-in (see Authentication below).
## How to Run
Invoke every `agy` command through the `terminal` tool. Examples:
```
terminal(command="agy --version")
terminal(command="agy help")
terminal(command="agy plugin list")
terminal(command="agy --print 'Summarize the repo in 3 bullets'", workdir="/path/to/project")
```
For an interactive multi-turn TUI session, launch `agy` with `pty=true` (and
tmux for capture/monitoring), the same pattern the `codex` / `claude-code`
skills use. For one-shot smoke tests and scripted prompts, prefer
`agy --print` (non-interactive).
To inspect Antigravity's own files, use `read_file` on the paths under Core
paths below — do not `cat` them through the terminal.
## Core paths
- Binary / entrypoint: `agy`
- App data dir: `~/.gemini/antigravity-cli/`
- Settings file: `~/.gemini/antigravity-cli/settings.json`
- Keybindings file: `~/.gemini/antigravity-cli/keybindings.json`
- Logs: `~/.gemini/antigravity-cli/log/cli-*.log`
- Conversations: `~/.gemini/antigravity-cli/conversations/`
- Brain artifacts: `~/.gemini/antigravity-cli/brain/`
- History: `~/.gemini/antigravity-cli/history.jsonl`
- Plugin staging: `~/.gemini/antigravity-cli/plugins/<plugin_name>/`
## Quick Reference
### Wrapper commands
- `agy changelog`
- `agy help`
- `agy install`
- `agy plugin` / `agy plugins`
- `agy update`
### Useful flags
- `--add-dir`
- `--continue` / `-c`
- `--conversation`
- `--dangerously-skip-permissions`
- `--print` / `-p`
- `--print-timeout`
- `--prompt`
- `--prompt-interactive` / `-i`
- `--sandbox`
- `--log-file`
- `--version`
### Plugin subcommands (`agy plugin --help`)
- `list`, `import [source]`, `install <target>`, `uninstall <name>`,
`enable <name>`, `disable <name>`, `validate [path]`, `link <mp> <target>`,
`help`
### Install flags (`agy install --help`)
- `--dir`, `--skip-aliases`, `--skip-path`
### In-session slash commands
- **Conversation control:** `/resume` (`/switch`), `/rewind` (`/undo`),
`/rename <name>`, `/clear`, `/fork`, `/reset`, `/new`
- **Settings & tools:** `/config`, `/settings`, `/permissions`, `/model`,
`/keybindings`, `/statusline`, `/tasks`, `/skills`, `/mcp`, `/open <path>`,
`/usage`, `/logout`, `/agents`
- **Prompt helpers:** `@` path autocomplete, `esc esc` clears the prompt (when
not streaming), `!` runs a terminal command directly, `?` opens help
## Settings and permissions
### Common settings keys (`settings.json`)
- `allowNonWorkspaceAccess`
- `colorScheme`
- `permissions.allow`
- `trustedWorkspaces`
### Permission modes
`request-review`, `always-proceed`, `strict`, `proceed-in-sandbox`.
### Sandbox behavior
- `enableTerminalSandbox` is a boolean in `settings.json`; default `false`.
- Launch-time overrides (`--sandbox`, `--dangerously-skip-permissions`) can
supersede persistent settings for the current session.
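Once `read_file` has returned the text of `settings.json`, the persistent sandbox flag can be interpreted as in the sketch below. The helper name is hypothetical, and treating unparsable input as the documented default (`false`) is an assumption. Launch-time overrides such as `--sandbox` are not visible in this file.

```python
import json


def terminal_sandbox_enabled(settings_text: str) -> bool:
    """Return the persistent enableTerminalSandbox value from settings.json text.

    Hypothetical helper: falls back to False (the documented default) when
    the key is absent or the text is not a JSON object.
    """
    try:
        settings = json.loads(settings_text)
    except json.JSONDecodeError:
        return False  # assumption: unreadable settings behave like the default
    if not isinstance(settings, dict):
        return False
    return bool(settings.get("enableTerminalSandbox", False))
```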
## Authentication behavior
- The CLI tries the OS secure keyring first.
- With no saved session, it falls back to browser-based Google sign-in.
- Locally it opens the default browser; over SSH it prints an authorization URL
and expects the auth code pasted back.
- `/logout` removes saved credentials.
## Plugins
- Plugins stage under `~/.gemini/antigravity-cli/plugins/<plugin_name>/`.
- They can bundle skills, agents, rules, MCP servers, and hooks.
- `agy plugin list` returning no imported plugins is a valid empty state.
## Pitfalls
- `agy help` shows wrapper commands, not interactive slash commands.
- `agy --version` is the safe non-interactive version check; `agy version` is
interactive and can fail without a real TTY.
- First place to look for failures: `~/.gemini/antigravity-cli/log/cli-*.log`
(read with `read_file`).
- Don't confuse persistent JSON settings with launch-time overrides.
- `~/.gemini/antigravity-cli/bin/agentapi` is a thin wrapper around `agy agentapi`.
- On WSL, token storage is file-based, so auth issues are usually local-file /
session-state problems, not browser-only problems.
- Workspace identity can depend on launch directory and the `.antigravitycli`
project marker.
## Verification
Confirm the install is real and usable, all through the `terminal` tool (read
files with `read_file`):
1. `terminal(command="command -v agy")`
2. `terminal(command="agy --version")`
3. `terminal(command="agy help")`
4. `terminal(command="agy plugin list")`
5. `read_file` on `~/.gemini/antigravity-cli/settings.json`
6. `read_file` on the latest `~/.gemini/antigravity-cli/log/cli-*.log`
7. If needed, `read_file` on `~/.gemini/antigravity-cli/keybindings.json`
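Step 6 needs the "latest" log file. A minimal sketch of one way to pick it, assuming "latest" means most recently modified; the helper name is hypothetical and only the `cli-*.log` pattern and log directory come from this guide:

```python
from pathlib import Path
from typing import Optional

# Log directory as listed under Core paths.
DEFAULT_LOG_DIR = Path.home() / ".gemini" / "antigravity-cli" / "log"


def latest_cli_log(log_dir: Path = DEFAULT_LOG_DIR) -> Optional[Path]:
    """Return the most recently modified cli-*.log in log_dir, or None."""
    candidates = [p for p in log_dir.glob("cli-*.log") if p.is_file()]
    if not candidates:
        return None
    # Assumption: modification time is a good proxy for "latest".
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

The returned path can then be handed to `read_file`.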
## Support files
- `references/cli-docs.md` — condensed notes from the getting-started, usage,
and features docs.

@@ -0,0 +1,64 @@
# Antigravity CLI docs, condensed
Source pages reviewed:
- `/docs/cli-getting-started`
- `/docs/cli-using`
- `/docs/cli-features`
## Install
- macOS/Linux: `curl -fsSL https://antigravity.google/cli/install.sh | bash`
- Windows PowerShell: `irm https://antigravity.google/cli/install.ps1 | iex`
- Windows CMD: `curl -fsSL https://antigravity.google/cli/install.cmd -o install.cmd && install.cmd && del install.cmd`
## Authentication
- Tries secure keyring first.
- If no saved session exists, falls back to browser-based Google sign-in.
- Local machine: opens the default browser.
- SSH/remote: prints a secure authorization URL, then expects the auth code to be pasted back.
- `/logout` removes saved credentials.
## Config and files
- Settings: `~/.gemini/antigravity-cli/settings.json`
- Keybindings: `~/.gemini/antigravity-cli/keybindings.json`
- Plugins: `~/.gemini/antigravity-cli/plugins/<plugin_name>/`
## Useful slash commands
- `/config`, `/settings`
- `/permissions`
- `/resume` / `/switch`
- `/rewind` / `/undo`
- `/rename <name>`
- `/model`
- `/keybindings`
- `/statusline`
- `/tasks`
- `/skills`
- `/mcp`
- `/open <path>`
- `/usage`
- `/logout`
- `/agents`
## Prompt helpers
- `@` path autocomplete
- `esc esc` clears prompt when not streaming
- `!` runs a terminal command
- `?` opens help / slash command list
## Permissions and sandbox
- Permission modes: `request-review`, `always-proceed`, `strict`, `proceed-in-sandbox`
- Launch overrides: `--sandbox`, `--dangerously-skip-permissions`
- Sandbox setting: `enableTerminalSandbox` in `settings.json` (default `false`)
## Plugins
- Plugins can bundle skills, agents, rules, MCP servers, and hooks.
- They are staged locally and auto-discovered once installed.
## Subagents
- `/agents` opens the panel for active/completed subagents.
- Subagents can run in parallel and request approvals.
## Keybindings
- `~/.gemini/antigravity-cli/keybindings.json`
- Malformed JSON falls back to defaults for broken actions.
- Docs list default bindings for clear, submit, cancel, exit, suspend, editor, approval yes/no, navigation, clipboard, undo/redo, and newline insertion.

@@ -0,0 +1,301 @@
---
name: grok
description: "Delegate coding to xAI Grok Build CLI (features, PRs)."
version: 0.1.0
author: Matt Maximo (MattMaximo), Hermes Agent
license: MIT
platforms: [linux, macos, windows]
metadata:
hermes:
tags: [Coding-Agent, Grok, xAI, Code-Review, Refactoring, Automation]
related_skills: [codex, claude-code, hermes-agent]
---
# Grok Build CLI — Hermes Orchestration Guide
Delegate coding tasks to [Grok Build](https://docs.x.ai/build/overview) (xAI's
autonomous coding agent CLI, the `grok` command) via the Hermes terminal. Grok
can read files, write code, run shell commands, spawn subagents, and manage git
workflows. It runs three ways: an interactive TUI, **headless** (`-p`), and as
an **ACP agent** over JSON-RPC.
This is the third sibling to `codex` and `claude-code`. The orchestration
pattern is nearly identical — **prefer headless `-p` for one-shots**, use a PTY
for interactive sessions.
## When to use
- Building features
- Refactoring
- PR reviews
- Batch issue fixing
- Any task where you'd otherwise reach for Codex / Claude Code but want Grok
## Prerequisites
- **Install (preferred):** `npm install -g @xai-official/grok`
- The official installer `curl -fsSL https://x.ai/cli/install.sh | bash` also
works, but the `x.ai` host is Cloudflare-walled in some environments. The
npm path avoids that dependency entirely.
- **Auth — SuperGrok / X Premium+ subscription (primary path):**
- Run `grok login` once → opens a browser for OAuth → token cached in
`~/.grok/auth.json`. This uses your **SuperGrok or X Premium+** subscription
(no per-token API billing).
- Check sign-in state by looking for `~/.grok/auth.json`, or run a cheap
headless smoke test: `grok --no-auto-update -p "Say ok."`
- In the TUI, `/logout` signs out and `/login` (or relaunching) signs back in.
- **No git repo required** — unlike Codex, Grok runs fine outside a git
directory (good for scratch/throwaway tasks).
- **Claude Code / AGENTS.md compatible with zero config** — Grok auto-reads
`CLAUDE.md`, `.claude/` (skills, agents, MCPs, hooks, rules), and the
`AGENTS.md` family. Existing project context just works.
> **API-key fallback (not the default for this user):** Grok also supports
> pay-as-you-go billing via `api.x.ai` when the `XAI_API_KEY` environment
> variable is set. Only use this if `grok login` / SuperGrok auth is
> unavailable. The subscription path (`grok login`) is the intended setup here.
## Two Orchestration Modes
### Mode 1: Headless (`-p`) — Non-Interactive (PREFERRED)
Runs a one-shot task, prints the result, and exits. No PTY, no interactive
dialogs to navigate. This is the cleanest integration path — the analog of
`claude -p` and `codex exec`.
```
terminal(command="grok --no-auto-update -p 'Add a dark mode toggle to settings'", workdir="/path/to/project", timeout=180)
```
Always pass `--no-auto-update` in automation to skip background update checks.
**When to use headless:**
- One-shot coding tasks (fix a bug, add a feature, refactor)
- CI/CD automation and scripting
- Structured output parsing with `--output-format json`
- Any task that doesn't need multi-turn conversation
### Mode 2: Interactive PTY — Multi-Turn TUI Sessions
The TUI is a fullscreen, mouse-interactive app. Drive it with `pty=true`. For
robust monitoring/input use tmux (same pattern as the `claude-code` skill).
```
# Launch in a tmux session for capture-pane monitoring
terminal(command="tmux new-session -d -s grok-work -x 140 -y 40")
terminal(command="tmux send-keys -t grok-work 'cd /path/to/project && grok' Enter")
# Wait for startup, then send a task
terminal(command="sleep 5 && tmux send-keys -t grok-work 'Refactor the auth module to use JWT' Enter")
# Monitor progress
terminal(command="sleep 15 && tmux capture-pane -t grok-work -p -S -50")
# Exit when done
terminal(command="tmux send-keys -t grok-work '/quit' Enter && sleep 1 && tmux kill-session -t grok-work")
```
**Tip for headless-but-inline output:** if you want TUI-style output without the
fullscreen alt-screen takeover (e.g. for cleaner logs), add `--no-alt-screen`.
For pure automation, headless `-p` is still cleaner than the TUI.
## Headless Deep Dive
### Common Flags
| Flag | Effect |
|------|--------|
| `-p, --single <PROMPT>` | Send one prompt, run headless, exit |
| `-m, --model <MODEL>` | Choose a model |
| `-s, --session-id <ID>` | Create or resume a named headless session |
| `-r, --resume <ID>` | Resume an existing session |
| `-c, --continue` | Continue the most recent session in the current directory |
| `--cwd <PATH>` | Set the working directory |
| `--output-format <FMT>` | `plain` (default), `json`, or `streaming-json` |
| `--always-approve` | Auto-approve all tool executions (the `--full-auto` / `--yolo` equivalent) |
| `--no-alt-screen` | Run inline, no fullscreen TUI takeover |
| `--no-auto-update` | Skip background update checks (use in all automation) |
### Output Formats
- `plain` — human-readable text (default)
- `json` — one JSON object at the end of the run (parse the result cleanly)
- `streaming-json` — newline-delimited JSON events as they arrive
```
# Structured result for parsing
terminal(command="grok --no-auto-update -p 'List all TODO comments in src/' --output-format json", workdir="/project", timeout=120)
# Auto-approve for autonomous building
terminal(command="grok --no-auto-update --always-approve -p 'Refactor the database layer and run the tests'", workdir="/project", timeout=300)
```
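When `--output-format streaming-json` is used, the captured stdout is newline-delimited JSON. A minimal sketch of splitting it into events follows. The helper name is hypothetical, and this guide does not specify the event schema, so the sketch makes no assumption about field names:

```python
import json
from typing import Any, Dict, Iterator


def iter_json_events(stdout_text: str) -> Iterator[Dict[str, Any]]:
    """Yield one dict per newline-delimited JSON line, skipping blank lines."""
    for line in stdout_text.splitlines():
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # Only pass through JSON objects; other JSON values are ignored.
        if isinstance(event, dict):
            yield event
```

For the plain `json` format, a single `json.loads` on the whole output should be enough, since the result arrives as one object at the end of the run.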
### Background Mode (Long Tasks)
```
# Start headless in background
terminal(command="grok --no-auto-update --always-approve -p 'Refactor the auth module'", workdir="/project", background=true, notify_on_complete=true)
# Returns session_id
# Monitor
process(action="poll", session_id="<id>")
process(action="log", session_id="<id>")
# Kill if needed
process(action="kill", session_id="<id>")
```
For an interactive (TUI) background session, use `pty=true` + tmux and monitor
with `tmux capture-pane`, exactly like the `claude-code` / `codex` skills.
### Session Continuation
```
# Start a named session
terminal(command="grok --no-auto-update -s refactor-db -p 'Start refactoring the database layer' --always-approve", workdir="/project", timeout=240)
# Resume it later
terminal(command="grok --no-auto-update -r refactor-db -p 'Now add connection pooling' --always-approve", workdir="/project", timeout=180)
# Or continue the most recent session in this directory
terminal(command="grok --no-auto-update -c -p 'What did you change last time?'", workdir="/project", timeout=60)
```
## Read-Only Audit → Markdown Note Pattern
To have Grok review local artifacts and return a clean markdown note (for
Obsidian or a repo) without mutating anything:
1. Prepare stable input files first with Hermes tools (`read_file`,
`write_file`). Snapshot only the relevant context into a temp file rather
than dumping raw paths.
2. Run Grok headless **without** `--always-approve` so it cannot auto-write, and
demand `markdown only, no preamble`.
3. Save Grok's stdout straight into the destination note with `write_file()`.
```
grok --no-auto-update -p "Read /tmp/current.md and /tmp/inventory.md. Produce markdown only, no preamble. Output a clean note titled 'Cleanup Review'." --output-format plain
```
**Pitfall (same as Claude Code):** for document rewrites, a loose "rewrite this"
prompt may return a change summary instead of the full file. Instead: pipe the
file in, and demand `Return ONLY the full revised markdown document. No intro,
no explanation, no code fences. Start immediately with '# Title'.` Verify the
first lines with `read_file()` before overwriting the destination.
## PR Review Patterns
### Quick Review (Headless)
```
terminal(command="cd /path/to/repo && git diff main...feature-branch | grok --no-auto-update -p 'Review this diff for bugs, security issues, and style problems. Be thorough.'", timeout=120)
```
### Clone-to-temp Review (safe, no repo mutation)
```
terminal(command="REVIEW=$(mktemp -d) && git clone https://github.com/user/repo.git $REVIEW && cd $REVIEW && gh pr checkout 42 && grok --no-auto-update -p 'Review the changes vs origin/main. Check bugs, security, race conditions, missing tests.'", pty=true, timeout=300)
```
### Post the review
```
terminal(command="gh pr comment 42 --body '<review text>'", workdir="/path/to/repo")
```
## Parallel Issue Fixing with Worktrees
```
# Create worktrees
terminal(command="git worktree add -b fix/issue-78 /tmp/issue-78 main", workdir="~/project")
terminal(command="git worktree add -b fix/issue-99 /tmp/issue-99 main", workdir="~/project")
# Launch Grok headless in each (background)
terminal(command="grok --no-auto-update --always-approve -p 'Fix issue #78: <description>. Commit when done.'", workdir="/tmp/issue-78", background=true, notify_on_complete=true)
terminal(command="grok --no-auto-update --always-approve -p 'Fix issue #99: <description>. Commit when done.'", workdir="/tmp/issue-99", background=true, notify_on_complete=true)
# Monitor
process(action="list")
# After completion: push and open PRs
terminal(command="cd /tmp/issue-78 && git push -u origin fix/issue-78")
terminal(command="gh pr create --repo user/repo --head fix/issue-78 --title 'fix: ...' --body '...'")
# Cleanup
terminal(command="git worktree remove /tmp/issue-78", workdir="~/project")
```
## Useful Subcommands & TUI Commands
| Command | Purpose |
|---------|---------|
| `grok` | Start the interactive TUI |
| `grok -p "query"` | Headless one-shot |
| `grok login` / `grok logout` | Sign in / out (SuperGrok / X Premium+ OAuth) |
| `grok inspect` | Show what Grok discovered in cwd: config sources, instructions, skills, plugins, hooks, MCP servers |
| `grok agent stdio` | Run as an ACP agent over JSON-RPC (for IDE/tool integration) |
| `grok update` | Update the CLI (needs the `x.ai` host; skip in automation) |
TUI slash commands (interactive only): `/model <name>`, `/always-approve`,
`/plan`, `/context`, `/compact`, `/resume`, `/sessions`, `/fork`, `/usage`,
`/quit`. `Shift+Tab` cycles session modes (including Plan mode, which blocks
write tools except the session plan file).
## Config (`~/.grok/config.toml`)
```toml
[cli]
auto_update = false # skip background update checks persistently
[ui]
permission_mode = "ask" # or "always-approve" to skip tool prompts by default
[models]
default = "grok-build-0.1"
```
Put global preferences in `~/.grok/config.toml` (not project-scoped
`.grok/config.toml`). `permission_mode` supersedes the legacy `approval_mode` /
`yolo = true` keys.
## Pitfalls & Gotchas
1. **Auth is subscription-gated.** `grok login` requires a SuperGrok or X
Premium+ subscription. If login fails or there's no `~/.grok/auth.json`,
confirm the subscription is active before falling back to `XAI_API_KEY`.
2. **Don't conflate Hermes' xAI auth with the `grok` CLI's auth.** Hermes'
`x_search` runs on its own xAI OAuth; the standalone `grok` CLI has a
separate token in `~/.grok/auth.json`. A working `x_search` does NOT mean
`grok` is logged in.
3. **Always pass `--no-auto-update` in automation** — otherwise Grok phones home
for update checks (and `x.ai`/`storage.googleapis.com` may be unreachable).
4. **Prefer npm install over the curl installer** — `npm install -g
@xai-official/grok` avoids the Cloudflare-walled `x.ai` host.
5. **`--always-approve` is the autonomous-build switch.** Without it, headless
runs may stall waiting on tool-approval prompts. Omit it deliberately for
read-only review/audit work so Grok can't mutate files.
6. **Headless `-p` skips TUI dialogs**; the TUI needs `pty=true` (+ tmux for
monitoring), just like Claude Code.
7. **Use `--no-alt-screen`** if you run the TUI inline and the fullscreen
alt-screen takeover garbles captured output.
8. **No git repo needed**, but for PR/commit workflows you still want one — use
`cd "$(mktemp -d)" && git init` for scratch commit tasks.
9. **Clean up tmux sessions** with `tmux kill-session -t <name>` when done.
## Rules for Hermes Agents
1. **Prefer headless `-p`** for single tasks — cleanest integration, structured
output via `--output-format json`.
2. **Always set `workdir`** (or `--cwd`) so Grok targets the right project.
3. **Pass `--no-auto-update`** in every automated invocation.
4. **Use `--always-approve` only when Grok should write autonomously**; omit it
for read-only reviews and audits.
5. **Background long tasks** with `background=true, notify_on_complete=true` and
monitor via the `process` tool.
6. **Use tmux for multi-turn interactive work** and monitor with
`tmux capture-pane -t <session> -p -S -50`.
7. **Verify auth before relying on it** — check `~/.grok/auth.json` or run a
cheap `grok --no-auto-update -p "Say ok."` smoke test; don't assume Hermes' xAI auth carries
over.
8. **Report results to the user** — summarize what Grok changed and what's left.
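Rules 1, 3, and 4 can be folded into a small command builder. This is a sketch under stated assumptions: the function name and its parameters are hypothetical, and only the flags themselves (`--no-auto-update`, `--always-approve`, `-p`, `--output-format`) come from this guide.

```python
import shlex
from typing import List, Optional

_OUTPUT_FORMATS = ("plain", "json", "streaming-json")


def build_grok_command(
    prompt: str,
    *,
    allow_writes: bool = False,
    output_format: Optional[str] = None,
) -> str:
    """Build a shell-quoted headless grok invocation (hypothetical helper)."""
    argv: List[str] = ["grok", "--no-auto-update"]  # rule 3: always skip update checks
    if allow_writes:
        argv.append("--always-approve")  # rule 4: only when Grok should write
    argv += ["-p", prompt]  # rule 1: headless one-shot
    if output_format is not None:
        if output_format not in _OUTPUT_FORMATS:
            raise ValueError(f"unknown output format: {output_format!r}")
        argv += ["--output-format", output_format]
    return shlex.join(argv)
```

The resulting string is meant to be passed as the `command` argument of the `terminal` tool, with `workdir` set separately per rule 2.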

@@ -17,7 +17,7 @@ Model families (each with t2v + i2v endpoints):
veo3.1 fal-ai/veo3.1 / fal-ai/veo3.1/image-to-video
seedance-2.0 bytedance/seedance-2.0/text-to-video / bytedance/seedance-2.0/image-to-video
kling-v3-4k fal-ai/kling-video/v3/4k/text-to-video / fal-ai/kling-video/v3/4k/image-to-video
happy-horse fal-ai/happy-horse/text-to-video / fal-ai/happy-horse/image-to-video
happy-horse alibaba/happy-horse/text-to-video / alibaba/happy-horse/image-to-video
Selection precedence for the active family:
1. ``model=`` arg from the tool call
@@ -26,14 +26,16 @@ Selection precedence for the active family:
4. ``video_gen.model`` in ``config.yaml`` (when it's one of our family IDs)
5. ``DEFAULT_MODEL``
Authentication via ``FAL_KEY``. Output is an HTTPS URL from FAL's CDN; the
gateway downloads and delivers it.
Authentication via ``FAL_KEY`` or the managed Nous gateway. Output is an
HTTPS URL from FAL's CDN; the gateway downloads and delivers it.
"""
from __future__ import annotations
import logging
import os
import threading
import uuid
from typing import Any, Dict, List, Optional, Tuple
from agent.video_gen_provider import (
@@ -104,8 +106,9 @@ FAL_FAMILIES: Dict[str, Dict[str, Any]] = {
"text_endpoint": "fal-ai/veo3.1",
"image_endpoint": "fal-ai/veo3.1/image-to-video",
"aspect_ratios": ("16:9", "9:16"),
"resolutions": ("720p", "1080p"),
"resolutions": ("720p", "1080p", "4k"),
"durations": (4, 6, 8),
"duration_suffix": "s", # FAL veo3.1 wants "4s" not "4"
"audio": True,
"negative": True,
},
@@ -148,8 +151,8 @@ FAL_FAMILIES: Dict[str, Dict[str, Any]] = {
"price": "premium",
"strengths": "Alibaba. New model, sparse public docs — conservative defaults.",
"tier": "premium",
"text_endpoint": "fal-ai/happy-horse/text-to-video",
"image_endpoint": "fal-ai/happy-horse/image-to-video",
"text_endpoint": "alibaba/happy-horse/text-to-video",
"image_endpoint": "alibaba/happy-horse/image-to-video",
# Docs don't expose duration/aspect/resolution — let the endpoint
# apply its own defaults.
"aspect_ratios": None,
@@ -270,7 +273,9 @@ def _build_payload(
clamped = _clamp_duration(family, duration)
if clamped is not None and family.get("durations"):
# FAL exposes duration as a string in the queue API ("8" not 8).
payload["duration"] = str(clamped)
# Some families (e.g. veo3.1) require a unit suffix ("4s" not "4").
suffix = family.get("duration_suffix", "")
payload["duration"] = f"{clamped}{suffix}"
if family.get("audio") and audio is not None:
payload["generate_audio"] = bool(audio)
@@ -302,6 +307,92 @@ def _load_fal_client() -> Any:
return _fal_client
# ---------------------------------------------------------------------------
# Managed FAL gateway (Nous Subscription)
# ---------------------------------------------------------------------------
_managed_fal_video_client: Any = None
_managed_fal_video_client_config: Any = None
_managed_fal_video_client_lock = threading.Lock()
def _resolve_managed_fal_video_gateway():
"""Return managed fal-queue gateway config when the user prefers the gateway
or direct FAL credentials are absent."""
from tools.tool_backend_helpers import fal_key_is_configured, prefers_gateway
if fal_key_is_configured() and not prefers_gateway("video_gen"):
return None
from tools.managed_tool_gateway import resolve_managed_tool_gateway
return resolve_managed_tool_gateway("fal-queue")
def _get_managed_fal_video_client(managed_gateway):
"""Reuse the managed FAL client so its internal httpx.Client is not leaked per call."""
global _managed_fal_video_client, _managed_fal_video_client_config
from tools.fal_common import _ManagedFalSyncClient
client_config = (
managed_gateway.gateway_origin.rstrip("/"),
managed_gateway.nous_user_token,
)
with _managed_fal_video_client_lock:
if _managed_fal_video_client is not None and _managed_fal_video_client_config == client_config:
return _managed_fal_video_client
_load_fal_client()
_managed_fal_video_client = _ManagedFalSyncClient(
_fal_client,
key=managed_gateway.nous_user_token,
queue_run_origin=managed_gateway.gateway_origin,
)
_managed_fal_video_client_config = client_config
return _managed_fal_video_client
def _submit_fal_video_request(endpoint: str, arguments: Dict[str, Any]):
"""Submit a FAL video request using direct credentials or the managed queue gateway.
Returns a request handle whose ``.get()`` blocks until the result is ready.
"""
_load_fal_client()
request_headers = {"x-idempotency-key": str(uuid.uuid4())}
managed_gateway = _resolve_managed_fal_video_gateway()
if managed_gateway is None:
return _fal_client.submit(endpoint, arguments=arguments, headers=request_headers)
managed_client = _get_managed_fal_video_client(managed_gateway)
try:
return managed_client.submit(
endpoint,
arguments=arguments,
headers=request_headers,
)
except Exception as exc:
from tools.fal_common import _extract_http_status
status = _extract_http_status(exc)
if status is not None and 400 <= status < 500:
raise ValueError(
f"Nous Subscription gateway rejected endpoint '{endpoint}' "
f"(HTTP {status}). This model may not yet be enabled on "
f"the Nous Portal's FAL proxy. Either:\n"
f" • Set FAL_KEY in your environment to use FAL.ai directly, or\n"
f" • Pick a different model via `hermes tools` → Video Generation."
) from exc
raise
def _check_fal_video_available() -> bool:
"""True if the FAL.ai video backend is reachable (direct key or managed gateway)."""
from tools.tool_backend_helpers import fal_key_is_configured
if fal_key_is_configured():
return True
return _resolve_managed_fal_video_gateway() is not None
# ---------------------------------------------------------------------------
# Provider
# ---------------------------------------------------------------------------
@@ -323,13 +414,10 @@ class FALVideoGenProvider(VideoGenProvider):
return "FAL"
def is_available(self) -> bool:
if not os.environ.get("FAL_KEY", "").strip():
return False
try:
import fal_client # noqa: F401
except ImportError:
return _check_fal_video_available()
except Exception: # noqa: BLE001 — never break the picker
return False
return True
def list_models(self) -> List[Dict[str, Any]]:
out: List[Dict[str, Any]] = []
@@ -394,11 +482,12 @@ class FALVideoGenProvider(VideoGenProvider):
seed: Optional[int] = None,
**kwargs: Any,
) -> Dict[str, Any]:
if not os.environ.get("FAL_KEY", "").strip():
if not _check_fal_video_available():
return error_response(
error=(
"FAL_KEY not set. Run `hermes tools` → Video Generation "
"→ FAL to configure."
"No FAL backend available. Either set FAL_KEY "
"(run `hermes tools` → Video Generation → FAL to configure) "
"or sign in to Nous (`hermes setup`) for managed gateway access."
),
error_type="auth_required",
provider="fal",
@@ -406,7 +495,7 @@ class FALVideoGenProvider(VideoGenProvider):
)
try:
fal_client = _load_fal_client()
_load_fal_client()
except ImportError:
return error_response(
error="fal_client Python package not installed (pip install fal-client)",
@@ -467,11 +556,8 @@ class FALVideoGenProvider(VideoGenProvider):
)
try:
result = fal_client.subscribe(
endpoint,
arguments=payload,
with_logs=False,
)
handle = _submit_fal_video_request(endpoint, payload)
result = handle.get()
except Exception as exc:
logger.warning(
"FAL video gen failed (family=%s, endpoint=%s): %s",
@@ -511,7 +597,7 @@ class FALVideoGenProvider(VideoGenProvider):
prompt=prompt,
modality=modality_used,
aspect_ratio=aspect_ratio if "aspect_ratio" in payload else "",
duration=int(payload["duration"]) if "duration" in payload else 0,
duration=int("".join(c for c in payload["duration"] if c.isdigit()) or "0") if "duration" in payload else 0,
provider="fal",
extra=extra,
)
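
The duration handling changed in this file can be sketched as a round trip: the payload carries a string such as `"8"` or `"8s"`, and the result metadata recovers the integer. This is a minimal sketch mirroring the two changed expressions; `format_duration` and `parse_duration` are hypothetical names, not functions in the plugin.

```python
from typing import Any, Dict


def format_duration(family: Dict[str, Any], seconds: int) -> str:
    """Render the duration as a string, with the family's unit suffix if any."""
    suffix = family.get("duration_suffix", "")
    return f"{seconds}{suffix}"


def parse_duration(value: str) -> int:
    """Recover the integer seconds from "8" or "8s"; 0 when no digits exist."""
    digits = "".join(c for c in value if c.isdigit())
    return int(digits or "0")
```

A plain `int(payload["duration"])` would raise `ValueError` on `"8s"`, which is why the metadata path strips non-digits first.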


@@ -440,6 +440,7 @@ class TestBuildNousSubscriptionPrompt:
features={
"web": NousFeatureState("web", "Web tools", True, True, True, True, False, True, "firecrawl"),
"image_gen": NousFeatureState("image_gen", "Image generation", True, True, True, True, False, True, "Nous Subscription"),
"video_gen": NousFeatureState("video_gen", "Video generation", False, False, False, False, False, False, ""),
"tts": NousFeatureState("tts", "OpenAI TTS", True, True, True, True, False, True, "OpenAI TTS"),
"browser": NousFeatureState("browser", "Browser automation", True, True, True, True, False, True, "Browser Use"),
"modal": NousFeatureState("modal", "Modal execution", False, True, False, False, False, True, "local"),
@@ -464,6 +465,7 @@ class TestBuildNousSubscriptionPrompt:
features={
"web": NousFeatureState("web", "Web tools", True, False, False, False, False, True, ""),
"image_gen": NousFeatureState("image_gen", "Image generation", True, False, False, False, False, True, ""),
"video_gen": NousFeatureState("video_gen", "Video generation", False, False, False, False, False, False, ""),
"tts": NousFeatureState("tts", "OpenAI TTS", True, False, False, False, False, True, ""),
"browser": NousFeatureState("browser", "Browser automation", True, False, False, False, False, True, ""),
"modal": NousFeatureState("modal", "Modal execution", False, False, False, False, False, True, ""),


@@ -12,12 +12,33 @@ See issue #33778 for the original Windows session-loss bug report.
"""
import asyncio
import json
import os
import threading
import time
from unittest.mock import MagicMock
from gateway.run import _run_planned_stop_watcher
from gateway import status as status_mod
def _write_self_marker(marker, *, stale: bool = False):
"""Write a planned-stop marker that targets the CURRENT process.
The watcher only fires for markers naming our PID + start_time (the
fix for issue #34597), so tests that expect a fire must write a
self-targeting marker. Pass ``stale=True`` to backdate ``written_at``
past the TTL.
"""
written_at = "2000-01-01T00:00:00+00:00" if stale else status_mod._utc_now_iso()
record = {
"target_pid": os.getpid(),
"target_start_time": status_mod._get_process_start_time(os.getpid()),
"stopper_pid": os.getpid(),
"written_at": written_at,
}
marker.write_text(json.dumps(record), encoding="utf-8")
class _FakeRunner:
@@ -41,11 +62,10 @@ def _make_loop_capturing_calls():
def test_watcher_fires_shutdown_when_marker_appears(tmp_path, monkeypatch):
"""When the marker file exists, the watcher must call the shutdown handler."""
"""When a marker targeting THIS process exists, fire the shutdown handler."""
marker = tmp_path / ".gateway-planned-stop.json"
# Patch the marker-path resolver so the watcher polls our temp location.
from gateway import status as status_mod
monkeypatch.setattr(status_mod, "_get_planned_stop_marker_path", lambda: marker)
runner = _FakeRunner(running=True, draining=False)
@@ -53,8 +73,8 @@ def test_watcher_fires_shutdown_when_marker_appears(tmp_path, monkeypatch):
shutdown_handler = MagicMock(name="shutdown_signal_handler")
stop_event = threading.Event()
# Drop the marker before the thread starts.
marker.write_text('{"target_pid": 1234}', encoding="utf-8")
# Drop a self-targeting marker before the thread starts.
_write_self_marker(marker)
watcher = threading.Thread(
target=_run_planned_stop_watcher,
@@ -114,9 +134,8 @@ def test_watcher_skips_when_runner_already_draining(tmp_path, monkeypatch):
so the watcher backs off once any shutdown is in flight.
"""
marker = tmp_path / ".gateway-planned-stop.json"
marker.write_text('{"target_pid": 1234}', encoding="utf-8")
_write_self_marker(marker)
from gateway import status as status_mod
monkeypatch.setattr(status_mod, "_get_planned_stop_marker_path", lambda: marker)
# Already draining — watcher should be a no-op.
@@ -204,9 +223,8 @@ def test_watcher_fires_only_once_when_marker_persists(tmp_path, monkeypatch):
times before the gateway actually shuts down.
"""
marker = tmp_path / ".gateway-planned-stop.json"
marker.write_text('{"target_pid": 1234}', encoding="utf-8")
_write_self_marker(marker)
from gateway import status as status_mod
monkeypatch.setattr(status_mod, "_get_planned_stop_marker_path", lambda: marker)
runner = _FakeRunner(running=True, draining=False)
@@ -263,3 +281,113 @@ def test_watcher_tolerates_marker_path_resolution_errors(tmp_path, monkeypatch,
assert not watcher.is_alive(), "Watcher should still honour stop_event after errors"
# No shutdown fired because the marker never reported existence.
assert loop._captured == []
# ---------------------------------------------------------------------------
# Regression coverage for issue #34597:
# A marker left behind by a PREVIOUS gateway instance (different PID, or
# past its TTL) must NOT crash the freshly booted gateway. The watcher
# only fires when the marker targets the current process, and self-heals
# by cleaning up stale/malformed markers.
# ---------------------------------------------------------------------------
def test_watcher_does_not_fire_for_foreign_pid_marker(tmp_path, monkeypatch):
"""A marker naming a DIFFERENT process must not trigger our shutdown.
This is the core #34597 regression: a stale marker from a prior
gateway instance was firing the handler, driving the new gateway into
a false "Received UNKNOWN" shutdown and a watchdog crash loop.
"""
marker = tmp_path / ".gateway-planned-stop.json"
# Foreign PID + a start_time that cannot match ours, freshly written
# so the TTL does NOT remove it — the watcher must still decline.
record = {
"target_pid": os.getpid() + 1,
"target_start_time": -1,
"stopper_pid": os.getpid() + 1,
"written_at": status_mod._utc_now_iso(),
}
marker.write_text(json.dumps(record), encoding="utf-8")
monkeypatch.setattr(status_mod, "_get_planned_stop_marker_path", lambda: marker)
runner = _FakeRunner(running=True, draining=False)
loop = _make_loop_capturing_calls()
shutdown_handler = MagicMock(name="shutdown_signal_handler")
stop_event = threading.Event()
watcher = threading.Thread(
target=_run_planned_stop_watcher,
args=(stop_event, runner, loop, shutdown_handler),
kwargs={"poll_interval": 0.05},
daemon=True,
)
watcher.start()
time.sleep(0.3) # several poll cycles
stop_event.set()
watcher.join(timeout=2.0)
assert not watcher.is_alive()
assert loop._captured == [], (
f"Watcher fired on a foreign-PID marker (#34597 regression): {loop._captured}"
)
shutdown_handler.assert_not_called()
# Foreign (but live) marker is left in place — it may still belong to
# the process it names.
assert marker.exists()
def test_watcher_cleans_up_stale_marker_and_keeps_running(tmp_path, monkeypatch):
"""A marker older than the TTL is unlinked and never fires shutdown."""
marker = tmp_path / ".gateway-planned-stop.json"
# Self-targeting but backdated past the TTL: must be treated as dead.
_write_self_marker(marker, stale=True)
monkeypatch.setattr(status_mod, "_get_planned_stop_marker_path", lambda: marker)
runner = _FakeRunner(running=True, draining=False)
loop = _make_loop_capturing_calls()
shutdown_handler = MagicMock(name="shutdown_signal_handler")
stop_event = threading.Event()
watcher = threading.Thread(
target=_run_planned_stop_watcher,
args=(stop_event, runner, loop, shutdown_handler),
kwargs={"poll_interval": 0.05},
daemon=True,
)
watcher.start()
time.sleep(0.3)
stop_event.set()
watcher.join(timeout=2.0)
assert not watcher.is_alive()
assert loop._captured == [], "Stale marker must not fire shutdown"
shutdown_handler.assert_not_called()
assert not marker.exists(), "Stale marker should have been cleaned up"
def test_planned_stop_marker_targets_self_probe_is_non_destructive(tmp_path, monkeypatch):
"""The probe returns True for a self-marker WITHOUT unlinking it.
The shutdown handler performs the authoritative consume on its own
thread, so the watcher's probe must leave a matching marker intact.
"""
marker = tmp_path / ".gateway-planned-stop.json"
_write_self_marker(marker)
monkeypatch.setattr(status_mod, "_get_planned_stop_marker_path", lambda: marker)
assert status_mod.planned_stop_marker_targets_self() is True
assert marker.exists(), "Probe must not consume a matching marker"
# Idempotent: still True on a second call.
assert status_mod.planned_stop_marker_targets_self() is True
def test_planned_stop_marker_targets_self_drops_malformed(tmp_path, monkeypatch):
"""A malformed marker reports False and is cleaned up."""
marker = tmp_path / ".gateway-planned-stop.json"
marker.write_text("{not valid json", encoding="utf-8")
monkeypatch.setattr(status_mod, "_get_planned_stop_marker_path", lambda: marker)
assert status_mod.planned_stop_marker_targets_self() is False


@@ -707,6 +707,33 @@ class TestTakeoverMarker:
assert result is False
def test_consume_returns_true_on_windows_when_start_time_unavailable(
self, tmp_path, monkeypatch
):
"""Takeover consume must also recognise a self-marker on platforms
without ``/proc`` (macOS / native Windows).
``consume_takeover_marker_for_self`` shares ``_consume_pid_marker_for_self``
with the planned-stop path, so the same start_time fallback applies:
a ``--replace`` SIGTERM on Windows (where start_time is None on both
sides) must be recognised as a planned takeover and exit 0, not be
misclassified as an unexpected UNKNOWN exit. With start_time
unavailable we fall back to PID equality alone, bounded by the TTL.
"""
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
# Simulate Windows: no start_time available for any PID.
monkeypatch.setattr(status, "_get_process_start_time", lambda pid: None)
ok = status.write_takeover_marker(target_pid=os.getpid())
assert ok is True
payload = json.loads((tmp_path / ".gateway-takeover.json").read_text())
assert payload["target_start_time"] is None
result = status.consume_takeover_marker_for_self()
assert result is True
assert not (tmp_path / ".gateway-takeover.json").exists()
def test_consume_returns_false_when_marker_missing(self, tmp_path, monkeypatch):
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
@@ -899,6 +926,74 @@ class TestPlannedStopMarker:
assert ok is False
def test_consume_returns_true_on_windows_when_start_time_unavailable(
self, tmp_path, monkeypatch
):
"""Regression for #34597: a legitimate stop must be recognised on
platforms without ``/proc``.
``_get_process_start_time`` returns None on macOS / native Windows
(no ``/proc/<pid>/stat``). The planned-stop watcher only runs there,
so if the authoritative consume required a non-None start_time match
it would always return False — and ``hermes gateway stop`` would be
misclassified as an unexpected ``UNKNOWN`` exit, exit 1, and revived
by the service manager (the very crash loop #34597 set out to fix).
With start_time unavailable on BOTH sides we fall back to PID
equality alone, bounded by the marker TTL.
"""
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
# Simulate Windows: no start_time available for any PID.
monkeypatch.setattr(status, "_get_process_start_time", lambda pid: None)
ok = status.write_planned_stop_marker(target_pid=os.getpid())
assert ok is True
# Marker carries a null start_time, exactly as written on Windows.
payload = json.loads((tmp_path / ".gateway-planned-stop.json").read_text())
assert payload["target_start_time"] is None
result = status.consume_planned_stop_marker_for_self()
assert result is True
assert not (tmp_path / ".gateway-planned-stop.json").exists()
def test_consume_still_rejects_foreign_pid_when_start_time_unavailable(
self, tmp_path, monkeypatch
):
"""The PID-only fallback must NOT match a marker naming another PID.
Falling back to PID equality when start_time is unknown must remain
a PID check — a marker for a different process is never ours.
"""
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
monkeypatch.setattr(status, "_get_process_start_time", lambda pid: None)
ok = status.write_planned_stop_marker(target_pid=os.getpid() + 9999)
assert ok is True
result = status.consume_planned_stop_marker_for_self()
assert result is False
def test_consume_still_rejects_start_time_mismatch_when_both_known(
self, tmp_path, monkeypatch
):
"""PID-reuse defence is preserved when BOTH start_times are present.
The Windows fallback only relaxes matching when a start_time is
unavailable. When both sides report one (Linux), a mismatch must
still reject — otherwise PID reuse could resurrect a stale marker.
"""
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
monkeypatch.setattr(status, "_get_process_start_time", lambda pid: 100)
status.write_planned_stop_marker(target_pid=os.getpid())
# Simulate PID reuse: same PID, different start_time.
monkeypatch.setattr(status, "_get_process_start_time", lambda pid: 9999)
result = status.consume_planned_stop_marker_for_self()
assert result is False
class TestReadProcessCmdlinePsFallback:
"""Tests for _read_process_cmdline falling back to ps on non-Linux."""
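
The matching rule these tests pin down can be sketched as follows. This is a sketch under stated assumptions, not the code in `gateway/status.py`: `marker_targets_process` and `MARKER_TTL` are hypothetical names, the 60-second TTL is an invented placeholder, and treating a missing start_time on either side as "unavailable" is an assumption drawn from the docstrings above.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional

MARKER_TTL = timedelta(seconds=60)  # placeholder value, not taken from the source


def marker_targets_process(
    record: Dict[str, Any],
    pid: int,
    start_time: Optional[int],
    now: Optional[datetime] = None,
) -> bool:
    """Return True when a marker record names the given process."""
    now = now or datetime.now(timezone.utc)
    try:
        written_at = datetime.fromisoformat(record["written_at"])
        target_pid = int(record["target_pid"])
        age = now - written_at
    except (KeyError, TypeError, ValueError):
        return False  # malformed marker never matches
    if age > MARKER_TTL:
        return False  # stale marker from an earlier instance
    if target_pid != pid:
        return False  # a marker for another process is never ours
    marker_start = record.get("target_start_time")
    if marker_start is None or start_time is None:
        return True  # start_time unavailable: PID equality, bounded by the TTL
    return marker_start == start_time  # both known: guards against PID reuse
```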


@@ -218,7 +218,7 @@ def test_get_gateway_eligible_tools_ignores_quoted_false_opt_in(monkeypatch):
monkeypatch.setattr(
ns,
"_get_gateway_direct_credentials",
lambda: {"web": True, "image_gen": False, "tts": False, "browser": False},
lambda: {"web": True, "image_gen": False, "video_gen": False, "tts": False, "browser": False},
)
unconfigured, has_direct, already_managed = ns.get_gateway_eligible_tools(
@@ -230,4 +230,4 @@ def test_get_gateway_eligible_tools_ignores_quoted_false_opt_in(monkeypatch):
assert "web" in has_direct
assert "web" not in already_managed
assert set(unconfigured) == {"image_gen", "tts", "browser"}
assert set(unconfigured) == {"image_gen", "video_gen", "tts", "browser"}


@@ -498,6 +498,7 @@ def test_setup_summary_shows_camofox_when_browser_feature_is_camofox(tmp_path, m
features={
"web": NousFeatureState("web", "Web tools", True, False, False, False, False, True, ""),
"image_gen": NousFeatureState("image_gen", "Image generation", True, False, False, False, False, True, ""),
"video_gen": NousFeatureState("video_gen", "Video generation", False, False, False, False, False, False, ""),
"tts": NousFeatureState("tts", "OpenAI TTS", True, False, False, False, False, True, ""),
"browser": NousFeatureState("browser", "Browser automation", True, True, True, False, True, True, "Camofox"),
"modal": NousFeatureState("modal", "Modal execution", False, False, False, False, False, True, "local"),
@@ -525,6 +526,7 @@ def test_setup_summary_does_not_mark_incomplete_browserbase_as_available(tmp_pat
features={
"web": NousFeatureState("web", "Web tools", True, False, False, False, False, True, ""),
"image_gen": NousFeatureState("image_gen", "Image generation", True, False, False, False, False, True, ""),
"video_gen": NousFeatureState("video_gen", "Video generation", False, False, False, False, False, False, ""),
"tts": NousFeatureState("tts", "OpenAI TTS", True, False, False, False, False, True, ""),
"browser": NousFeatureState("browser", "Browser automation", True, False, False, False, False, True, "Browserbase"),
"modal": NousFeatureState("modal", "Modal execution", False, False, False, False, False, True, "local"),


@@ -88,6 +88,7 @@ def test_show_status_reports_managed_nous_features(monkeypatch, capsys, tmp_path
features={
"web": NousFeatureState("web", "Web tools", True, True, True, True, False, True, "firecrawl"),
"image_gen": NousFeatureState("image_gen", "Image generation", True, True, True, True, False, True, "Nous Subscription"),
"video_gen": NousFeatureState("video_gen", "Video generation", False, False, False, False, False, False, ""),
"tts": NousFeatureState("tts", "OpenAI TTS", True, True, True, True, False, True, "OpenAI TTS"),
"browser": NousFeatureState("browser", "Browser automation", True, True, True, True, False, True, "Browser Use"),
"modal": NousFeatureState("modal", "Modal execution", False, True, False, False, False, True, "local"),


@@ -85,44 +85,72 @@ def test_fal_list_models_advertises_both_modalities():
def test_fal_unavailable_without_key(monkeypatch):
from plugins.video_gen.fal import FALVideoGenProvider
from plugins.video_gen import fal as fal_plugin
monkeypatch.delenv("FAL_KEY", raising=False)
# Also ensure managed gateway is unavailable
monkeypatch.setattr(fal_plugin, "_resolve_managed_fal_video_gateway", lambda: None)
assert FALVideoGenProvider().is_available() is False
def test_fal_generate_requires_fal_key(monkeypatch):
from plugins.video_gen.fal import FALVideoGenProvider
from plugins.video_gen import fal as fal_plugin
monkeypatch.delenv("FAL_KEY", raising=False)
# Also ensure managed gateway is unavailable
monkeypatch.setattr(fal_plugin, "_resolve_managed_fal_video_gateway", lambda: None)
result = FALVideoGenProvider().generate("a happy dog")
assert result["success"] is False
assert result["error_type"] == "auth_required"
def test_fal_available_via_gateway(monkeypatch):
from plugins.video_gen.fal import FALVideoGenProvider
from plugins.video_gen import fal as fal_plugin
monkeypatch.delenv("FAL_KEY", raising=False)
monkeypatch.setattr(
fal_plugin,
"_resolve_managed_fal_video_gateway",
lambda: object(), # truthy sentinel — gateway is available
)
assert FALVideoGenProvider().is_available() is True
class TestFamilyRouting:
"""The headline behavior: image_url presence picks the endpoint."""
@pytest.fixture
def with_fake_fal(self, monkeypatch):
"""Stub fal_client.subscribe to capture which endpoint we hit."""
"""Stub fal_client.submit to capture which endpoint we hit."""
import sys
import types
captured = {"endpoint": None, "arguments": None}
class FakeHandle:
def get(self):
return {"video": {"url": "https://fake/out.mp4"}}
fake = types.ModuleType("fal_client")
def _subscribe(endpoint, arguments=None, with_logs=False):
def _submit(endpoint, arguments=None, headers=None):
captured["endpoint"] = endpoint
captured["arguments"] = arguments
return {"video": {"url": "https://fake/out.mp4"}}
fake.subscribe = _subscribe # type: ignore
return FakeHandle()
fake.submit = _submit # type: ignore
monkeypatch.setitem(sys.modules, "fal_client", fake)
# Reset the lazy global so it picks up our stub
from plugins.video_gen import fal as fal_plugin
fal_plugin._fal_client = None
# Also reset the managed client cache
fal_plugin._managed_fal_video_client = None
fal_plugin._managed_fal_video_client_config = None
monkeypatch.setenv("FAL_KEY", "test")
# Force direct mode — no managed gateway
monkeypatch.setattr(fal_plugin, "_resolve_managed_fal_video_gateway", lambda: None)
return captured
def test_text_to_video_routes_to_text_endpoint(self, with_fake_fal):
@@ -229,7 +257,7 @@ class TestPayloadBuilder:
seed=42,
)
assert p["prompt"] == "x"
assert p["duration"] == "8" # FAL queue API uses strings
assert p["duration"] == "8s" # veo3.1 uses "Ns" format per FAL API
assert p["aspect_ratio"] == "16:9"
assert p["resolution"] == "720p"
assert p["generate_audio"] is True


@@ -2676,6 +2676,64 @@ class TestVacuum:
db.vacuum()
class TestOptimizeFts:
def test_optimize_returns_index_count(self, db):
"""A fresh DB has both FTS indexes; optimize merges both."""
db.create_session(session_id="s1", source="cli")
db.append_message(session_id="s1", role="user", content="hello world")
assert db.optimize_fts() == 2
def test_optimize_preserves_search_and_snippet(self, db):
"""Optimize is layout-only: MATCH results + snippets are unchanged."""
db.create_session(session_id="s1", source="cli")
for i in range(50):
db.append_message(
session_id="s1",
role="user",
content=f"needle alpha bravo charlie message {i}",
)
before = db.search_messages("needle")
n = db.optimize_fts()
assert n == 2
after = db.search_messages("needle")
assert len(after) == len(before)
assert len(after) > 0
# Snippet must still be populated (would be empty/None if the FTS
# content shadow were lost during optimize).
assert all(row.get("snippet") for row in after)
# IDs and snippets are identical before/after — pure layout change.
assert [r["id"] for r in after] == [r["id"] for r in before]
assert [r["snippet"] for r in after] == [r["snippet"] for r in before]
def test_optimize_skips_missing_trigram_table(self, db):
"""When the trigram index is absent, optimize handles only the porter
index and does not raise."""
db.create_session(session_id="s1", source="cli")
db.append_message(session_id="s1", role="user", content="hello")
# Drop the trigram table + triggers to simulate a disabled/absent index.
with db._lock:
for trig in (
"messages_fts_trigram_insert",
"messages_fts_trigram_delete",
"messages_fts_trigram_update",
):
db._conn.execute(f"DROP TRIGGER IF EXISTS {trig}")
db._conn.execute("DROP TABLE IF EXISTS messages_fts_trigram")
assert db._fts_table_exists("messages_fts_trigram") is False
assert db._fts_table_exists("messages_fts") is True
# Only the porter index remains -> 1 optimized, no error.
assert db.optimize_fts() == 1
def test_optimize_idempotent(self, db):
"""Running optimize twice is safe (second pass is a no-op merge)."""
db.create_session(session_id="s1", source="cli")
db.append_message(session_id="s1", role="user", content="repeat me")
assert db.optimize_fts() == 2
assert db.optimize_fts() == 2
# Search still works after repeated optimization.
assert len(db.search_messages("repeat")) == 1
class TestAutoMaintenance:
def _make_old_ended(self, db, sid: str, days_old: int = 100):
"""Create a session that is ended and was started `days_old` days ago."""
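
The behaviour exercised by `TestOptimizeFts` can be sketched with plain `sqlite3`. This is a minimal sketch, not the project's `optimize_fts` method: it assumes the indexes are SQLite FTS5 tables (the diff does not state the FTS version), uses FTS5's documented `'optimize'` command, and reuses the table names from the tests only as defaults.

```python
import sqlite3
from typing import Iterable


def optimize_fts(
    conn: sqlite3.Connection,
    tables: Iterable[str] = ("messages_fts", "messages_fts_trigram"),
) -> int:
    """Merge each existing FTS index and return how many were optimized."""
    optimized = 0
    for name in tables:
        exists = conn.execute(
            "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
            (name,),
        ).fetchone()
        if exists is None:
            continue  # index absent or disabled: skip without raising
        # FTS5 special command: rewrites the index into a single segment.
        conn.execute(f"INSERT INTO {name}({name}) VALUES('optimize')")
        optimized += 1
    conn.commit()
    return optimized
```

Because `'optimize'` only changes the on-disk layout of the index, `MATCH` results should be identical before and after, which is what the tests above assert.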


@@ -305,3 +305,214 @@ def test_transcription_uses_model_specific_response_formats(monkeypatch, tmp_pat
assert json_result["transcript"] == "hello from gpt-4o"
assert json_capture["transcription_kwargs"]["response_format"] == "json"
assert json_capture["close_calls"] == 1
PLUGINS_DIR = Path(__file__).resolve().parents[2] / "plugins"
def _load_video_gen_plugin(monkeypatch):
"""Load the FAL video gen plugin in isolation."""
_install_fake_tools_package()
# Also need the agent.video_gen_provider ABC
agent_dir = Path(__file__).resolve().parents[2] / "agent"
spec = spec_from_file_location(
"agent.video_gen_provider",
agent_dir / "video_gen_provider.py",
)
assert spec and spec.loader
mod = module_from_spec(spec)
sys.modules["agent.video_gen_provider"] = mod
spec.loader.exec_module(mod)
# Load the plugin
plugin_init = PLUGINS_DIR / "video_gen" / "fal" / "__init__.py"
spec = spec_from_file_location("plugins.video_gen.fal", plugin_init)
assert spec and spec.loader
plugin_mod = module_from_spec(spec)
sys.modules["plugins.video_gen.fal"] = plugin_mod
spec.loader.exec_module(plugin_mod)
return plugin_mod
def test_video_gen_managed_fal_submit_uses_gateway(monkeypatch):
"""Video gen routes through the managed gateway when FAL_KEY is absent."""
captured = {}
fake_fal = _install_fake_fal_client(captured)
monkeypatch.delenv("FAL_KEY", raising=False)
monkeypatch.setenv("FAL_QUEUE_GATEWAY_URL", "http://127.0.0.1:3009")
monkeypatch.setenv("TOOL_GATEWAY_USER_TOKEN", "nous-video-token")
plugin = _load_video_gen_plugin(monkeypatch)
# Patch uuid for deterministic idempotency key
monkeypatch.setattr(plugin.uuid, "uuid4", lambda: "video-submit-456")
plugin._submit_fal_video_request(
"fal-ai/pixverse/v6/text-to-video",
{"prompt": "a cat riding a bicycle", "duration": "5"},
)
assert captured["submit_via"] == "managed_client"
assert captured["client_key"] == "nous-video-token"
assert captured["submit_url"] == "http://127.0.0.1:3009/fal-ai/pixverse/v6/text-to-video"
assert captured["method"] == "POST"
assert captured["arguments"] == {"prompt": "a cat riding a bicycle", "duration": "5"}
assert captured["headers"] == {"x-idempotency-key": "video-submit-456"}
assert captured["sync_client_inits"] == 1
def test_video_gen_managed_client_reused_across_calls(monkeypatch):
"""The managed video client is cached and reused across requests."""
captured = {}
_install_fake_fal_client(captured)
monkeypatch.delenv("FAL_KEY", raising=False)
monkeypatch.setenv("FAL_QUEUE_GATEWAY_URL", "http://127.0.0.1:3009")
monkeypatch.setenv("TOOL_GATEWAY_USER_TOKEN", "nous-video-token")
plugin = _load_video_gen_plugin(monkeypatch)
plugin._submit_fal_video_request("fal-ai/pixverse/v6/text-to-video", {"prompt": "first"})
first_client = captured["http_client"]
plugin._submit_fal_video_request("fal-ai/pixverse/v6/text-to-video", {"prompt": "second"})
assert captured["sync_client_inits"] == 1
assert captured["http_client"] is first_client


def test_video_gen_direct_mode_when_fal_key_set(monkeypatch):
    """When FAL_KEY is set and gateway not preferred, uses direct fal_client.submit."""
    captured = {}
    _install_fake_fal_client(captured)
    monkeypatch.setenv("FAL_KEY", "direct-fal-key-123")
    monkeypatch.delenv("FAL_QUEUE_GATEWAY_URL", raising=False)
    monkeypatch.delenv("TOOL_GATEWAY_USER_TOKEN", raising=False)
    plugin = _load_video_gen_plugin(monkeypatch)
    monkeypatch.setattr(plugin.uuid, "uuid4", lambda: "direct-456")
    # Trigger the lazy load so _fal_client is populated from our fake
    plugin._load_fal_client()
    # In direct mode, fal_client.submit is the module-level function.
    # Our fake raises AssertionError from the managed path, so we need
    # to patch it to actually capture the call.
    direct_captured = {}

    def direct_submit(endpoint, arguments=None, headers=None):
        direct_captured["endpoint"] = endpoint
        direct_captured["arguments"] = arguments
        direct_captured["headers"] = headers

        # Return a mock handle
        class FakeHandle:
            def get(self):
                return {"video": {"url": "https://fal.media/result.mp4"}}

        return FakeHandle()

    plugin._fal_client.submit = direct_submit
    plugin._submit_fal_video_request(
        "fal-ai/pixverse/v6/text-to-video",
        {"prompt": "test direct"},
    )
    assert direct_captured["endpoint"] == "fal-ai/pixverse/v6/text-to-video"
    assert direct_captured["arguments"] == {"prompt": "test direct"}
    assert direct_captured["headers"] == {"x-idempotency-key": "direct-456"}
    # Managed client should NOT have been initialized
    assert "submit_via" not in captured


def test_video_gen_gateway_4xx_raises_actionable_valueerror(monkeypatch):
    """A 4xx from the managed gateway surfaces a clear ValueError with remediation hints."""
    captured = {}
    _install_fake_fal_client(captured)
    monkeypatch.delenv("FAL_KEY", raising=False)
    monkeypatch.setenv("FAL_QUEUE_GATEWAY_URL", "http://127.0.0.1:3009")
    monkeypatch.setenv("TOOL_GATEWAY_USER_TOKEN", "nous-video-token")
    plugin = _load_video_gen_plugin(monkeypatch)

    # Make _maybe_retry_request raise an exception with a 403 status
    class FakeResponse:
        status_code = 403

    class GatewayRejectError(Exception):
        def __init__(self):
            super().__init__("forbidden")
            self.response = FakeResponse()

    original_retry = sys.modules["fal_client"].client._maybe_retry_request

    def raising_retry(client, method, url, json=None, timeout=None, headers=None):
        raise GatewayRejectError()

    sys.modules["fal_client"].client._maybe_retry_request = raising_retry
    with pytest.raises(ValueError, match=r"gateway rejected endpoint.*HTTP 403"):
        plugin._submit_fal_video_request(
            "fal-ai/pixverse/v6/text-to-video",
            {"prompt": "test 4xx"},
        )


def test_video_gen_is_available_true_via_gateway(monkeypatch):
    """is_available() returns True when FAL_KEY is absent but managed gateway is configured."""
    _install_fake_fal_client({})
    monkeypatch.delenv("FAL_KEY", raising=False)
    monkeypatch.setenv("FAL_QUEUE_GATEWAY_URL", "http://127.0.0.1:3009")
    monkeypatch.setenv("TOOL_GATEWAY_USER_TOKEN", "nous-video-token")
    plugin = _load_video_gen_plugin(monkeypatch)
    provider = plugin.FALVideoGenProvider()
    assert provider.is_available() is True


def test_video_gen_prefers_gateway_overrides_direct_key(monkeypatch):
    """When FAL_KEY is set but prefers_gateway('video_gen') is True, routes through gateway."""
    captured = {}
    _install_fake_fal_client(captured)
    monkeypatch.setenv("FAL_KEY", "direct-key-present")
    monkeypatch.setenv("FAL_QUEUE_GATEWAY_URL", "http://127.0.0.1:3009")
    monkeypatch.setenv("TOOL_GATEWAY_USER_TOKEN", "nous-video-token")
    plugin = _load_video_gen_plugin(monkeypatch)
    # Patch prefers_gateway to return True for video_gen
    tb_helpers = sys.modules["tools.tool_backend_helpers"]
    original_pg = tb_helpers.prefers_gateway
    monkeypatch.setattr(tb_helpers, "prefers_gateway", lambda section: section == "video_gen")
    plugin._submit_fal_video_request(
        "fal-ai/pixverse/v6/text-to-video",
        {"prompt": "gateway preferred"},
    )
    assert captured["submit_via"] == "managed_client"
    assert captured["client_key"] == "nous-video-token"


def test_video_gen_happy_horse_uses_alibaba_namespace():
    """Verify the happy-horse family uses alibaba/ not fal-ai/ endpoints."""
    _install_fake_tools_package()
    # Load just the plugin module to check the catalog
    plugin_init = PLUGINS_DIR / "video_gen" / "fal" / "__init__.py"
    agent_dir = Path(__file__).resolve().parents[2] / "agent"
    spec = spec_from_file_location(
        "agent.video_gen_provider",
        agent_dir / "video_gen_provider.py",
    )
    mod = module_from_spec(spec)
    sys.modules["agent.video_gen_provider"] = mod
    spec.loader.exec_module(mod)
    spec = spec_from_file_location("plugins.video_gen.fal", plugin_init)
    plugin_mod = module_from_spec(spec)
    sys.modules["plugins.video_gen.fal"] = plugin_mod
    spec.loader.exec_module(plugin_mod)
    hh = plugin_mod.FAL_FAMILIES["happy-horse"]
    assert hh["text_endpoint"] == "alibaba/happy-horse/text-to-video"
    assert hh["image_endpoint"] == "alibaba/happy-horse/image-to-video"
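
The routing behaviour these tests pin down can be summarised in a small sketch. The function name `choose_fal_route` and its return labels are hypothetical and not part of the plugin; only the environment variable names come from the tests above. The precedence when both a direct key and a gateway are configured without a gateway preference is an assumption, since no test shown here covers that case.

```python
import os


def choose_fal_route(prefers_gateway: bool = False, env=None) -> str:
    """Return "gateway", "direct", or "unavailable" (hypothetical labels)."""
    env = os.environ if env is None else env
    has_direct_key = bool(env.get("FAL_KEY"))
    # Assumption: the managed gateway needs both its URL and a user token.
    has_gateway = bool(env.get("FAL_QUEUE_GATEWAY_URL")) and bool(
        env.get("TOOL_GATEWAY_USER_TOKEN")
    )
    # Gateway wins when it is explicitly preferred or when no direct key exists.
    if has_gateway and (prefers_gateway or not has_direct_key):
        return "gateway"
    if has_direct_key:
        return "direct"
    return "unavailable"
```

Passing `env` explicitly keeps the sketch testable without touching the real process environment.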


@@ -46,6 +46,18 @@ def matrix_env(tmp_path, monkeypatch):
        fal_calls.append({"endpoint": endpoint, "arguments": arguments})
        return {"video": {"url": f"https://fake-fal/{endpoint.replace('/','_')}.mp4"}}

    fake_fal.subscribe = _subscribe  # type: ignore

    class _FalHandle:
        def __init__(self, result):
            self._result = result
        def get(self):
            return self._result

    def _submit(endpoint, arguments=None, headers=None):
        fal_calls.append({"endpoint": endpoint, "arguments": arguments})
        return _FalHandle({"video": {"url": f"https://fake-fal/{endpoint.replace('/','_')}.mp4"}})

    fake_fal.submit = _submit  # type: ignore
    monkeypatch.setitem(__import__("sys").modules, "fal_client", fake_fal)
    # httpx stub for xAI


@@ -31,7 +31,9 @@ hermes skills uninstall <skill-name>
| Skill | Description |
|-------|-------------|
| [**antigravity-cli**](/docs/user-guide/skills/optional/autonomous-ai-agents/autonomous-ai-agents-antigravity-cli) | Operate the Antigravity CLI (agy): plugins, auth, sandbox. |
| [**blackbox**](/docs/user-guide/skills/optional/autonomous-ai-agents/autonomous-ai-agents-blackbox) | Delegate coding tasks to Blackbox AI CLI agent. Multi-model agent with built-in judge that runs tasks through multiple LLMs and picks the best result. Requires the blackbox CLI and a Blackbox AI API key. |
| [**grok**](/docs/user-guide/skills/optional/autonomous-ai-agents/autonomous-ai-agents-grok) | Delegate coding to xAI Grok Build CLI (features, PRs). |
| [**honcho**](/docs/user-guide/skills/optional/autonomous-ai-agents/autonomous-ai-agents-honcho) | Configure and use Honcho memory with Hermes -- cross-session user modeling, multi-profile peer isolation, observation config, dialectic reasoning, session summaries, and context budget enforcement. Use when setting up Honcho, troubleshoo... |
| [**openhands**](/docs/user-guide/skills/optional/autonomous-ai-agents/autonomous-ai-agents-openhands) | Delegate coding to OpenHands CLI (model-agnostic, LiteLLM). |


@@ -22,8 +22,11 @@ Your request
→ Pick key from pool (round_robin / least_used / fill_first / random)
→ Send to provider
→ 429 rate limit?
- → Retry same key once (transient blip)
- → Second 429 → rotate to next pool key
+ → Plan/usage limit reached (e.g. ChatGPT/Codex "usage limit reached")?
+ → Rotate to next pool key immediately (no retry; the cap won't clear on retry)
+ → Generic / transient 429?
+ → Retry same key once (transient blip)
+ → Second 429 → rotate to next pool key
→ All keys exhausted → fallback_model (different provider)
→ 402 billing error?
→ Immediately rotate to next pool key (24h cooldown)
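
A rough Python sketch of the decision flow in this diagram, under stated assumptions: `next_action`, its arguments, and the string labels are hypothetical, and detecting a plan/usage cap by matching the phrase "usage limit reached" in the error body is only one possible heuristic, not the project's actual classifier.

```python
def next_action(status: int, body: str, retried_same_key: bool, keys_left: int) -> str:
    """Map a provider error to the next step in the diagram (labels are illustrative)."""
    if status == 402:
        # Billing error: rotate at once (the diagram notes a 24h cooldown).
        return "rotate"
    if status == 429:
        # Assumed heuristic for a plan/usage cap; the real check may differ.
        usage_cap = "usage limit reached" in body.lower()
        if not usage_cap and not retried_same_key:
            # Generic 429: give the same key one more chance.
            return "retry_same_key"
        # Usage cap, or a second 429: move on, or fall back when the pool is empty.
        return "rotate" if keys_left > 0 else "fallback_model"
    return "raise"
```

The point of the split is that a usage cap skips the same-key retry entirely, while a generic 429 gets exactly one.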


@@ -0,0 +1,195 @@
---
title: "Antigravity Cli — Operate the Antigravity CLI (agy): plugins, auth, sandbox"
sidebar_label: "Antigravity Cli"
description: "Operate the Antigravity CLI (agy): plugins, auth, sandbox"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Antigravity Cli
Operate the Antigravity CLI (agy): plugins, auth, sandbox.
## Skill metadata
| | |
|---|---|
| Source | Optional — install with `hermes skills install official/autonomous-ai-agents/antigravity-cli` |
| Path | `optional-skills/autonomous-ai-agents/antigravity-cli` |
| Version | `0.1.0` |
| Author | Tony Simons (asimons81), Hermes Agent |
| License | MIT |
| Platforms | linux, macos, windows |
| Tags | `Coding-Agent`, `Antigravity`, `CLI`, `Auth`, `Plugins`, `Sandbox` |
| Related skills | [`grok`](/docs/user-guide/skills/optional/autonomous-ai-agents/autonomous-ai-agents-grok), [`codex`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-codex), [`claude-code`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-claude-code), [`hermes-agent`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-hermes-agent) |
## Reference: full SKILL.md
:::info
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
:::
# Antigravity CLI (`agy`)
Operator guide for the Antigravity CLI, invoked as `agy`. Run all `agy`
commands through the Hermes `terminal` tool; inspect its config and logs with
`read_file`. This skill is reference + procedure — it does not wrap a network
API, so there is nothing to authenticate from Hermes itself.
## When to Use
- Installing, updating, or smoke-testing the `agy` binary
- Driving non-interactive `agy --print` / `agy -p` one-shots
- Debugging Antigravity auth, sandbox, permissions, or plugin state
- Reading Antigravity settings, keybindings, conversations, or logs
## Mental model
Antigravity has two layers — keep them distinct or the guidance will be wrong:
1. **Shell wrapper commands**: `agy help`, `agy install`, `agy plugin`,
   `agy update`, `agy changelog`. Run these through the `terminal` tool.
2. **Interactive in-session slash commands**: `/config`, `/permissions`,
   `/skills`, `/agents`, etc. These only exist inside a running `agy` TUI
   session, not on the shell wrapper.
`agy help` shows the shell wrapper surface, NOT the in-session slash commands.
## Prerequisites
- The `agy` binary on PATH. Verify through the `terminal` tool:
`command -v agy && agy --version`.
- No env vars or API keys required by this skill — Antigravity manages its own
auth via the OS keyring / browser sign-in (see Authentication below).
## How to Run
Invoke every `agy` command through the `terminal` tool. Examples:
```
terminal(command="agy --version")
terminal(command="agy help")
terminal(command="agy plugin list")
terminal(command="agy --print 'Summarize the repo in 3 bullets'", workdir="/path/to/project")
```
For an interactive multi-turn TUI session, launch `agy` with `pty=true` (and
tmux for capture/monitoring), the same pattern the `codex` / `claude-code`
skills use. For one-shot smoke tests and scripted prompts, prefer
`agy --print` (non-interactive).
To inspect Antigravity's own files, use `read_file` on the paths under Core
paths below — do not `cat` them through the terminal.
## Core paths
- Binary / entrypoint: `agy`
- App data dir: `~/.gemini/antigravity-cli/`
- Settings file: `~/.gemini/antigravity-cli/settings.json`
- Keybindings file: `~/.gemini/antigravity-cli/keybindings.json`
- Logs: `~/.gemini/antigravity-cli/log/cli-*.log`
- Conversations: `~/.gemini/antigravity-cli/conversations/`
- Brain artifacts: `~/.gemini/antigravity-cli/brain/`
- History: `~/.gemini/antigravity-cli/history.jsonl`
- Plugin staging: `~/.gemini/antigravity-cli/plugins/<plugin_name>/`
## Quick Reference
### Wrapper commands
- `agy changelog`
- `agy help`
- `agy install`
- `agy plugin` / `agy plugins`
- `agy update`
### Useful flags
- `--add-dir`
- `--continue` / `-c`
- `--conversation`
- `--dangerously-skip-permissions`
- `--print` / `-p`
- `--print-timeout`
- `--prompt`
- `--prompt-interactive` / `-i`
- `--sandbox`
- `--log-file`
- `--version`
### Plugin subcommands (`agy plugin --help`)
- `list`, `import [source]`, `install <target>`, `uninstall <name>`,
`enable <name>`, `disable <name>`, `validate [path]`, `link <mp> <target>`,
`help`
### Install flags (`agy install --help`)
- `--dir`, `--skip-aliases`, `--skip-path`
### In-session slash commands
- **Conversation control:** `/resume` (`/switch`), `/rewind` (`/undo`),
`/rename <name>`, `/clear`, `/fork`, `/reset`, `/new`
- **Settings & tools:** `/config`, `/settings`, `/permissions`, `/model`,
`/keybindings`, `/statusline`, `/tasks`, `/skills`, `/mcp`, `/open <path>`,
`/usage`, `/logout`, `/agents`
- **Prompt helpers:** `@` path autocomplete, `esc esc` clears the prompt (when
not streaming), `!` runs a terminal command directly, `?` opens help
## Settings and permissions
### Common settings keys (`settings.json`)
- `allowNonWorkspaceAccess`
- `colorScheme`
- `permissions.allow`
- `trustedWorkspaces`
### Permission modes
`request-review`, `always-proceed`, `strict`, `proceed-in-sandbox`.
### Sandbox behavior
- `enableTerminalSandbox` is a boolean in `settings.json`; default `false`.
- Launch-time overrides (`--sandbox`, `--dangerously-skip-permissions`) can
supersede persistent settings for the current session.
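
As a minimal sketch of reading the persisted sandbox value outside a Hermes session (inside Hermes, `read_file` remains the recommended route), assuming `settings.json` is a plain JSON object: the helper name `persistent_sandbox_enabled` is hypothetical, and launch-time flags such as `--sandbox` are not visible from the file.

```python
import json
from pathlib import Path

SETTINGS_PATH = Path.home() / ".gemini" / "antigravity-cli" / "settings.json"


def persistent_sandbox_enabled(path: Path = SETTINGS_PATH) -> bool:
    """Return the persisted enableTerminalSandbox value, defaulting to False."""
    try:
        settings = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        # Missing or unreadable file: fall back to the documented default.
        return False
    if not isinstance(settings, dict):
        return False
    return bool(settings.get("enableTerminalSandbox", False))
```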
## Authentication behavior
- The CLI tries the OS secure keyring first.
- With no saved session, it falls back to browser-based Google sign-in.
- Locally it opens the default browser; over SSH it prints an authorization URL
and expects the auth code pasted back.
- `/logout` removes saved credentials.
## Plugins
- Plugins stage under `~/.gemini/antigravity-cli/plugins/<plugin_name>/`.
- They can bundle skills, agents, rules, MCP servers, and hooks.
- `agy plugin list` returning no imported plugins is a valid empty state.
## Pitfalls
- `agy help` shows wrapper commands, not interactive slash commands.
- `agy --version` is the safe non-interactive version check; `agy version` is
interactive and can fail without a real TTY.
- First place to look for failures: `~/.gemini/antigravity-cli/log/cli-*.log`
(read with `read_file`).
- Don't confuse persistent JSON settings with launch-time overrides.
- `~/.gemini/antigravity-cli/bin/agentapi` is a thin wrapper around `agy agentapi`.
- On WSL, token storage is file-based, so auth issues are usually local-file /
session-state problems, not browser-only problems.
- Workspace identity can depend on launch directory and the `.antigravitycli`
project marker.
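
Since the log directory is the first place to look, a small helper for picking the newest log can save a step. This is a sketch only: `latest_cli_log` is a hypothetical name, and treating "latest" as "most recently modified" is an assumption about how the `cli-*.log` files rotate.

```python
from pathlib import Path
from typing import Optional

LOG_DIR = Path.home() / ".gemini" / "antigravity-cli" / "log"


def latest_cli_log(log_dir: Path = LOG_DIR) -> Optional[Path]:
    """Return the most recently modified cli-*.log, or None if there is none."""
    if not log_dir.is_dir():
        return None
    logs = list(log_dir.glob("cli-*.log"))
    if not logs:
        return None
    # Assumption: modification time is a good proxy for "latest".
    return max(logs, key=lambda p: p.stat().st_mtime)
```

The returned path can then be passed to `read_file`.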
## Verification
Confirm the install is real and usable, all through the `terminal` tool (read
files with `read_file`):
1. `terminal(command="command -v agy")`
2. `terminal(command="agy --version")`
3. `terminal(command="agy help")`
4. `terminal(command="agy plugin list")`
5. `read_file` on `~/.gemini/antigravity-cli/settings.json`
6. `read_file` on the latest `~/.gemini/antigravity-cli/log/cli-*.log`
7. If needed, `read_file` on `~/.gemini/antigravity-cli/keybindings.json`
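
Steps 1 and 2 can also be scripted for use outside a Hermes session, for example in a CI check. The sketch below is illustrative: `check_cli` and its result keys are hypothetical names, and it relies only on the `--version` flag listed above.

```python
import shutil
import subprocess


def check_cli(binary: str = "agy") -> dict:
    """Report whether the binary is on PATH and what --version prints."""
    path = shutil.which(binary)
    if path is None:
        return {"found": False, "path": None, "version": None}
    try:
        proc = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=15
        )
    except (OSError, subprocess.TimeoutExpired):
        # The binary exists but could not be run to completion.
        return {"found": True, "path": path, "version": None}
    return {"found": True, "path": path, "version": proc.stdout.strip() or None}
```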
## Support files
- `references/cli-docs.md` — condensed notes from the getting-started, usage,
and features docs.


@@ -0,0 +1,319 @@
---
title: "Grok — Delegate coding to xAI Grok Build CLI (features, PRs)"
sidebar_label: "Grok"
description: "Delegate coding to xAI Grok Build CLI (features, PRs)"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Grok
Delegate coding to xAI Grok Build CLI (features, PRs).
## Skill metadata
| | |
|---|---|
| Source | Optional — install with `hermes skills install official/autonomous-ai-agents/grok` |
| Path | `optional-skills/autonomous-ai-agents/grok` |
| Version | `0.1.0` |
| Author | Matt Maximo (MattMaximo), Hermes Agent |
| License | MIT |
| Platforms | linux, macos, windows |
| Tags | `Coding-Agent`, `Grok`, `xAI`, `Code-Review`, `Refactoring`, `Automation` |
| Related skills | [`codex`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-codex), [`claude-code`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-claude-code), [`hermes-agent`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-hermes-agent) |
## Reference: full SKILL.md
:::info
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
:::
# Grok Build CLI — Hermes Orchestration Guide
Delegate coding tasks to [Grok Build](https://docs.x.ai/build/overview) (xAI's
autonomous coding agent CLI, the `grok` command) via the Hermes terminal. Grok
can read files, write code, run shell commands, spawn subagents, and manage git
workflows. It runs three ways: an interactive TUI, **headless** (`-p`), and as
an **ACP agent** over JSON-RPC.
This is the third sibling to `codex` and `claude-code`. The orchestration
pattern is nearly identical — **prefer headless `-p` for one-shots**, use a PTY
for interactive sessions.
## When to use
- Building features
- Refactoring
- PR reviews
- Batch issue fixing
- Any task where you'd otherwise reach for Codex / Claude Code but want Grok
## Prerequisites
- **Install (preferred):** `npm install -g @xai-official/grok`
- The official installer `curl -fsSL https://x.ai/cli/install.sh | bash` also
works, but the `x.ai` host is Cloudflare-walled in some environments. The
npm path avoids that dependency entirely.
- **Auth — SuperGrok / X Premium+ subscription (primary path):**
- Run `grok login` once → opens a browser for OAuth → token cached in
`~/.grok/auth.json`. This uses your **SuperGrok or X Premium+** subscription
(no per-token API billing).
- Check sign-in state by looking for `~/.grok/auth.json`, or run a cheap
headless smoke test: `grok --no-auto-update -p "Say ok."`
- In the TUI, `/logout` signs out and `/login` (or relaunching) signs back in.
- **No git repo required** — unlike Codex, Grok runs fine outside a git
directory (good for scratch/throwaway tasks).
- **Claude Code / AGENTS.md compatible with zero config** — Grok auto-reads
`CLAUDE.md`, `.claude/` (skills, agents, MCPs, hooks, rules), and the
`AGENTS.md` family. Existing project context just works.
> **API-key fallback (not the default for this user):** Grok also supports
> setting the `XAI_API_KEY` environment variable for pay-as-you-go billing
> via `api.x.ai`. Only use
> this if `grok login` / SuperGrok auth is unavailable. The subscription path
> (`grok login`) is the intended setup here.
## Two Orchestration Modes
### Mode 1: Headless (`-p`) — Non-Interactive (PREFERRED)
Runs a one-shot task, prints the result, and exits. No PTY, no interactive
dialogs to navigate. This is the cleanest integration path — the analog of
`claude -p` and `codex exec`.
```
terminal(command="grok --no-auto-update -p 'Add a dark mode toggle to settings'", workdir="/path/to/project", timeout=180)
```
Always pass `--no-auto-update` in automation to skip background update checks.
**When to use headless:**
- One-shot coding tasks (fix a bug, add a feature, refactor)
- CI/CD automation and scripting
- Structured output parsing with `--output-format json`
- Any task that doesn't need multi-turn conversation
### Mode 2: Interactive PTY — Multi-Turn TUI Sessions
The TUI is a fullscreen, mouse-interactive app. Drive it with `pty=true`. For
robust monitoring/input use tmux (same pattern as the `claude-code` skill).
```
# Launch in a tmux session for capture-pane monitoring
terminal(command="tmux new-session -d -s grok-work -x 140 -y 40")
terminal(command="tmux send-keys -t grok-work 'cd /path/to/project && grok' Enter")
# Wait for startup, then send a task
terminal(command="sleep 5 && tmux send-keys -t grok-work 'Refactor the auth module to use JWT' Enter")
# Monitor progress
terminal(command="sleep 15 && tmux capture-pane -t grok-work -p -S -50")
# Exit when done
terminal(command="tmux send-keys -t grok-work '/quit' Enter && sleep 1 && tmux kill-session -t grok-work")
```
**Tip for headless-but-inline output:** if you want TUI-style output without the
fullscreen alt-screen takeover (e.g. for cleaner logs), add `--no-alt-screen`.
For pure automation, headless `-p` is still cleaner than the TUI.
## Headless Deep Dive
### Common Flags
| Flag | Effect |
|------|--------|
| `-p, --single <PROMPT>` | Send one prompt, run headless, exit |
| `-m, --model <MODEL>` | Choose a model |
| `-s, --session-id <ID>` | Create or resume a named headless session |
| `-r, --resume <ID>` | Resume an existing session |
| `-c, --continue` | Continue the most recent session in the current directory |
| `--cwd <PATH>` | Set the working directory |
| `--output-format <FMT>` | `plain` (default), `json`, or `streaming-json` |
| `--always-approve` | Auto-approve all tool executions (the `--full-auto` / `--yolo` equivalent) |
| `--no-alt-screen` | Run inline, no fullscreen TUI takeover |
| `--no-auto-update` | Skip background update checks (use in all automation) |
### Output Formats
- `plain` — human-readable text (default)
- `json` — one JSON object at the end of the run (parse the result cleanly)
- `streaming-json` — newline-delimited JSON events as they arrive
```
# Structured result for parsing
terminal(command="grok --no-auto-update -p 'List all TODO comments in src/' --output-format json", workdir="/project", timeout=120)
# Auto-approve for autonomous building
terminal(command="grok --no-auto-update --always-approve -p 'Refactor the database layer and run the tests'", workdir="/project", timeout=300)
```
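
A hedged sketch of consuming these formats from captured stdout. It assumes stdout contains nothing but the JSON payload; the helper name `parse_grok_output` is hypothetical, and the shape of the objects and events is not specified here, so the code does not inspect any fields.

```python
import json


def parse_grok_output(stdout: str, output_format: str = "json"):
    """Parse captured stdout according to the --output-format that was used."""
    if output_format == "json":
        # One JSON object printed at the end of the run.
        return json.loads(stdout)
    if output_format == "streaming-json":
        # Newline-delimited JSON: one event per non-empty line.
        return [json.loads(line) for line in stdout.splitlines() if line.strip()]
    # "plain": leave the text untouched.
    return stdout
```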
### Background Mode (Long Tasks)
```
# Start headless in background
terminal(command="grok --no-auto-update --always-approve -p 'Refactor the auth module'", workdir="/project", background=true, notify_on_complete=true)
# Returns session_id
# Monitor
process(action="poll", session_id="<id>")
process(action="log", session_id="<id>")
# Kill if needed
process(action="kill", session_id="<id>")
```
For an interactive (TUI) background session, use `pty=true` + tmux and monitor
with `tmux capture-pane`, exactly like the `claude-code` / `codex` skills.
### Session Continuation
```
# Start a named session
terminal(command="grok --no-auto-update -s refactor-db -p 'Start refactoring the database layer' --always-approve", workdir="/project", timeout=240)
# Resume it later
terminal(command="grok --no-auto-update -r refactor-db -p 'Now add connection pooling' --always-approve", workdir="/project", timeout=180)
# Or continue the most recent session in this directory
terminal(command="grok --no-auto-update -c -p 'What did you change last time?'", workdir="/project", timeout=60)
```
## Read-Only Audit → Markdown Note Pattern
To have Grok review local artifacts and return a clean markdown note (for
Obsidian or a repo) without mutating anything:
1. Prepare stable input files first with Hermes tools (`read_file`,
`write_file`). Snapshot only the relevant context into a temp file rather
than dumping raw paths.
2. Run Grok headless **without** `--always-approve` so it cannot auto-write, and
demand `markdown only, no preamble`.
3. Save Grok's stdout straight into the destination note with `write_file()`.
```
grok --no-auto-update -p "Read /tmp/current.md and /tmp/inventory.md. Produce markdown only, no preamble. Output a clean note titled 'Cleanup Review'." --output-format plain
```
**Pitfall (same as Claude Code):** for document rewrites, a loose "rewrite this"
prompt may return a change summary instead of the full file. Instead: pipe the
file in, and demand `Return ONLY the full revised markdown document. No intro,
no explanation, no code fences. Start immediately with '# Title'.` Verify the
first lines with `read_file()` before overwriting the destination.
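
The "verify the first lines" step can be reduced to a simple check. A minimal sketch, assuming the expected document opens with a level-1 Markdown heading; `starts_with_title` is a hypothetical helper, not part of Hermes or Grok.

```python
def starts_with_title(text: str) -> bool:
    """True if the first non-blank line is a level-1 Markdown heading."""
    for line in text.splitlines():
        if line.strip():
            return line.startswith("# ")
    return False
```

A reply that opens with a preamble or a code fence fails the check, which is the signal to re-prompt rather than overwrite the destination.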
## PR Review Patterns
### Quick Review (Headless)
```
terminal(command="cd /path/to/repo && git diff main...feature-branch | grok --no-auto-update -p 'Review this diff for bugs, security issues, and style problems. Be thorough.'", timeout=120)
```
### Clone-to-temp Review (safe, no repo mutation)
```
terminal(command="REVIEW=$(mktemp -d) && git clone https://github.com/user/repo.git $REVIEW && cd $REVIEW && gh pr checkout 42 && grok --no-auto-update -p 'Review the changes vs origin/main. Check bugs, security, race conditions, missing tests.'", pty=true, timeout=300)
```
### Post the review
```
terminal(command="gh pr comment 42 --body '<review text>'", workdir="/path/to/repo")
```
## Parallel Issue Fixing with Worktrees
```
# Create worktrees
terminal(command="git worktree add -b fix/issue-78 /tmp/issue-78 main", workdir="~/project")
terminal(command="git worktree add -b fix/issue-99 /tmp/issue-99 main", workdir="~/project")
# Launch Grok headless in each (background)
terminal(command="grok --no-auto-update --always-approve -p 'Fix issue #78: <description>. Commit when done.'", workdir="/tmp/issue-78", background=true, notify_on_complete=true)
terminal(command="grok --no-auto-update --always-approve -p 'Fix issue #99: <description>. Commit when done.'", workdir="/tmp/issue-99", background=true, notify_on_complete=true)
# Monitor
process(action="list")
# After completion: push and open PRs
terminal(command="cd /tmp/issue-78 && git push -u origin fix/issue-78")
terminal(command="gh pr create --repo user/repo --head fix/issue-78 --title 'fix: ...' --body '...'")
# Cleanup
terminal(command="git worktree remove /tmp/issue-78", workdir="~/project")
```
## Useful Subcommands & TUI Commands
| Command | Purpose |
|---------|---------|
| `grok` | Start the interactive TUI |
| `grok -p "query"` | Headless one-shot |
| `grok login` / `grok logout` | Sign in / out (SuperGrok / X Premium+ OAuth) |
| `grok inspect` | Show what Grok discovered in cwd: config sources, instructions, skills, plugins, hooks, MCP servers |
| `grok agent stdio` | Run as an ACP agent over JSON-RPC (for IDE/tool integration) |
| `grok update` | Update the CLI (needs the `x.ai` host; skip in automation) |
TUI slash commands (interactive only): `/model <name>`, `/always-approve`,
`/plan`, `/context`, `/compact`, `/resume`, `/sessions`, `/fork`, `/usage`,
`/quit`. `Shift+Tab` cycles session modes (including Plan mode, which blocks
write tools except the session plan file).
## Config (`~/.grok/config.toml`)
```toml
[cli]
auto_update = false # skip background update checks persistently
[ui]
permission_mode = "ask" # or "always-approve" to skip tool prompts by default
[models]
default = "grok-build-0.1"
```
Put global preferences in `~/.grok/config.toml` (not project-scoped
`.grok/config.toml`). `permission_mode` supersedes the legacy `approval_mode` /
`yolo = true` keys.
## Pitfalls & Gotchas
1. **Auth is subscription-gated.** `grok login` requires a SuperGrok or X
Premium+ subscription. If login fails or there's no `~/.grok/auth.json`,
confirm the subscription is active before falling back to `XAI_API_KEY`.
2. **Don't conflate Hermes' xAI auth with the `grok` CLI's auth.** Hermes'
`x_search` runs on its own xAI OAuth; the standalone `grok` CLI has a
separate token in `~/.grok/auth.json`. A working `x_search` does NOT mean
`grok` is logged in.
3. **Always pass `--no-auto-update` in automation** — otherwise Grok phones home
for update checks (and `x.ai`/`storage.googleapis.com` may be unreachable).
4. **Prefer npm install over the curl installer** — `npm install -g
@xai-official/grok` avoids the Cloudflare-walled `x.ai` host.
5. **`--always-approve` is the autonomous-build switch.** Without it, headless
runs may stall waiting on tool-approval prompts. Omit it deliberately for
read-only review/audit work so Grok can't mutate files.
6. **Headless `-p` skips TUI dialogs**; the TUI needs `pty=true` (+ tmux for
monitoring), just like Claude Code.
7. **Use `--no-alt-screen`** if you run the TUI inline and the fullscreen
alt-screen takeover garbles captured output.
8. **No git repo needed**, but for PR/commit workflows you still want one: use
`cd "$(mktemp -d)" && git init` for scratch commit tasks.
9. **Clean up tmux sessions** with `tmux kill-session -t <name>` when done.
## Rules for Hermes Agents
1. **Prefer headless `-p`** for single tasks — cleanest integration, structured
output via `--output-format json`.
2. **Always set `workdir`** (or `--cwd`) so Grok targets the right project.
3. **Pass `--no-auto-update`** in every automated invocation.
4. **Use `--always-approve` only when Grok should write autonomously**; omit it
for read-only reviews and audits.
5. **Background long tasks** with `background=true, notify_on_complete=true` and
monitor via the `process` tool.
6. **Use tmux for multi-turn interactive work** and monitor with
`tmux capture-pane -t <session> -p -S -50`.
7. **Verify auth before relying on it** — check `~/.grok/auth.json` or run a
cheap `grok -p "Say ok."` smoke test; don't assume Hermes' xAI auth carries
over.
8. **Report results to the user** — summarize what Grok changed and what's left.


@@ -18,8 +18,11 @@ Your request
→ Pick key from pool (round_robin / least_used / fill_first / random)
→ Send to provider
→ 429 rate limit?
- → Retry same key once (transient blip)
- → Second 429 → rotate to next pool key
+ → Plan/usage limit reached (e.g. ChatGPT/Codex "usage limit reached")?
+ → Rotate to next pool key immediately (no retry; the cap won't clear on retry)
+ → Generic / transient 429?
+ → Retry same key once (transient blip)
+ → Second 429 → rotate to next pool key
→ All keys exhausted → fallback_model (different provider)
→ 402 billing error?
→ Immediately rotate to next pool key (24h cooldown)


@@ -389,7 +389,9 @@ const sidebars: SidebarsConfig = {
key: 'skills-optional-autonomous-ai-agents',
collapsed: true,
items: [
'user-guide/skills/optional/autonomous-ai-agents/autonomous-ai-agents-antigravity-cli',
'user-guide/skills/optional/autonomous-ai-agents/autonomous-ai-agents-blackbox',
'user-guide/skills/optional/autonomous-ai-agents/autonomous-ai-agents-grok',
'user-guide/skills/optional/autonomous-ai-agents/autonomous-ai-agents-honcho',
'user-guide/skills/optional/autonomous-ai-agents/autonomous-ai-agents-openhands',
],


@@ -1,6 +1,6 @@
{
"version": 1,
"updated_at": "2026-05-29T06:55:44Z",
"updated_at": "2026-05-29T11:20:16Z",
"metadata": {
"source": "hermes-agent repo",
"docs": "https://hermes-agent.nousresearch.com/docs/reference/model-catalog"
@@ -81,7 +81,7 @@
"description": ""
},
{
"id": "google/gemini-3-flash-preview",
"id": "google/gemini-3.5-flash",
"description": ""
},
{
@@ -198,7 +198,7 @@
"id": "google/gemini-3-pro-preview"
},
{
"id": "google/gemini-3-flash-preview"
"id": "google/gemini-3.5-flash"
},
{
"id": "google/gemini-3.1-pro-preview"