merge: resolve conflict with main (i18n refactor)

Main moved StatusPage constants/functions inside the component and added i18n support. Resolved by keeping the i18n structure and adding the runningRemote key to en.ts, zh.ts, and types.ts for remote gateway display.
docs(docker): add dashboard section, expose API port, update Compose example
2026-05-06 02:37:05 +08:00 · 2026-04-14 22:30:50 +00:00 · 2026-04-15 08:11:52 +10:00 · 2026-04-14 22:01:02 +00:00 · 2026-04-14 06:29:59 +00:00 · 2026-04-14 05:17:17 +00:00
7 changed files with 163 additions and 5 deletions
--- a/gateway/platforms/api_server.py
+++ b/gateway/platforms/api_server.py
@@ -10,6 +10,7 @@ Exposes an HTTP server with endpoints:
 - POST /v1/runs                    — start a run, returns run_id immediately (202)
 - GET  /v1/runs/{run_id}/events    — SSE stream of structured lifecycle events
 - GET  /health                     — health check
 - GET  /health/detailed            — rich status for cross-container dashboard probing
 Any OpenAI-compatible frontend (Open WebUI, LobeChat, LibreChat,
 AnythingLLM, NextChat, ChatBox, etc.) can connect to hermes-agent
@@ -565,6 +566,27 @@ class APIServerAdapter(BasePlatformAdapter):
        """GET /health — simple health check."""
        return web.json_response({"status": "ok", "platform": "hermes-agent"})
    async def _handle_health_detailed(self, request: "web.Request") -> "web.Response":
        """GET /health/detailed — rich status for cross-container dashboard probing.
        Returns gateway state, connected platforms, PID, and uptime so the
        dashboard can display full status without needing a shared PID file or
        /proc access.  No authentication required.
        """
        from gateway.status import read_runtime_status
        runtime = read_runtime_status() or {}
        return web.json_response({
            "status": "ok",
            "platform": "hermes-agent",
            "gateway_state": runtime.get("gateway_state"),
            "platforms": runtime.get("platforms", {}),
            "active_agents": runtime.get("active_agents", 0),
            "exit_reason": runtime.get("exit_reason"),
            "updated_at": runtime.get("updated_at"),
            "pid": os.getpid(),
        })
    async def _handle_models(self, request: "web.Request") -> "web.Response":
        """GET /v1/models — return hermes-agent as an available model."""
        auth_err = self._check_auth(request)
@@ -1783,6 +1805,7 @@ class APIServerAdapter(BasePlatformAdapter):
            self._app = web.Application(middlewares=mws)
            self._app["api_server_adapter"] = self
            self._app.router.add_get("/health", self._handle_health)
            self._app.router.add_get("/health/detailed", self._handle_health_detailed)
            self._app.router.add_get("/v1/health", self._handle_health)
            self._app.router.add_get("/v1/models", self._handle_models)
            self._app.router.add_post("/v1/chat/completions", self._handle_chat_completions)
--- a/hermes_cli/web_server.py
+++ b/hermes_cli/web_server.py
@@ -13,6 +13,7 @@ import asyncio
 import hmac
 import json
 import logging
 import os
 import secrets
 import sys
 import threading
@@ -319,12 +320,68 @@ class EnvVarReveal(BaseModel):
    key: str
 _GATEWAY_HEALTH_URL = os.getenv("GATEWAY_HEALTH_URL")
 _GATEWAY_HEALTH_TIMEOUT = float(os.getenv("GATEWAY_HEALTH_TIMEOUT", "3"))
 def _probe_gateway_health() -> tuple[bool, dict | None]:
    """Probe the gateway via its HTTP health endpoint (cross-container).
    Uses ``/health/detailed`` first (returns full state), falling back to
    the simpler ``/health`` endpoint.  Returns ``(is_alive, body_dict)``.
    Accepts any of these as ``GATEWAY_HEALTH_URL``:
    - ``http://gateway:8642``                (base URL — recommended)
    - ``http://gateway:8642/health``         (explicit health path)
    - ``http://gateway:8642/health/detailed`` (explicit detailed path)
    This is a **blocking** call — run via ``run_in_executor`` from async code.
    """
    if not _GATEWAY_HEALTH_URL:
        return False, None
    # Normalise to base URL so we always probe the right paths regardless of
    # whether the user included /health or /health/detailed in the env var.
    base = _GATEWAY_HEALTH_URL.rstrip("/")
    if base.endswith("/health/detailed"):
        base = base[: -len("/health/detailed")]
    elif base.endswith("/health"):
        base = base[: -len("/health")]
    for path in (f"{base}/health/detailed", f"{base}/health"):
        try:
            req = urllib.request.Request(path, method="GET")
            with urllib.request.urlopen(req, timeout=_GATEWAY_HEALTH_TIMEOUT) as resp:
                if resp.status == 200:
                    body = json.loads(resp.read())
                    return True, body
        except Exception:
            continue
    return False, None
@app.get("/api/status")
 async def get_status():
    current_ver, latest_ver = check_config_version()
    # --- Gateway liveness detection ---
    # Try local PID check first (same-host).  If that fails and a remote
    # GATEWAY_HEALTH_URL is configured, probe the gateway over HTTP so the
    # dashboard works when the gateway runs in a separate container.
    gateway_pid = get_running_pid()
    gateway_running = gateway_pid is not None
    remote_health_body: dict | None = None
    if not gateway_running and _GATEWAY_HEALTH_URL:
        loop = asyncio.get_event_loop()
        alive, remote_health_body = await loop.run_in_executor(
            None, _probe_gateway_health
        )
        if alive:
            gateway_running = True
            # PID from the remote container (display only — not locally valid)
            if remote_health_body:
                gateway_pid = remote_health_body.get("pid")
    gateway_state = None
    gateway_platforms: dict = {}
@@ -341,7 +398,12 @@ async def get_status():
    except Exception:
        configured_gateway_platforms = None
    # Prefer the detailed health endpoint response (has full state) when the
    # local runtime status file is absent or stale (cross-container).
    runtime = read_runtime_status()
    if runtime is None and remote_health_body and remote_health_body.get("gateway_state"):
        runtime = remote_health_body
    if runtime:
        gateway_state = runtime.get("gateway_state")
        gateway_platforms = runtime.get("platforms") or {}
@@ -356,6 +418,17 @@ async def get_status():
        if not gateway_running:
            gateway_state = gateway_state if gateway_state in ("stopped", "startup_failed") else "stopped"
            gateway_platforms = {}
        elif gateway_running and remote_health_body is not None:
            # The health probe confirmed the gateway is alive, but the local
            # runtime status file may be stale (cross-container).  Override
            # stopped/None state so the dashboard shows the correct badge.
            if gateway_state in (None, "stopped"):
                gateway_state = "running"
    # If there was no runtime info at all but the health probe confirmed alive,
    # ensure we still report the gateway as running (no shared volume scenario).
    if gateway_running and gateway_state is None and remote_health_body is not None:
        gateway_state = "running"
    active_sessions = 0
    try:
--- a/web/src/i18n/en.ts
+++ b/web/src/i18n/en.ts
@@ -78,6 +78,7 @@ export const en: Translations = {
    disconnected: "Disconnected",
    error: "Error",
    notRunning: "Not running",
    runningRemote: "Running (remote)",
    startFailed: "Start failed",
    pid: "PID",
    noneRunning: "None",
--- a/web/src/i18n/types.ts
+++ b/web/src/i18n/types.ts
@@ -81,6 +81,7 @@ export interface Translations {
    disconnected: string;
    error: string;
    notRunning: string;
    runningRemote: string;
    startFailed: string;
    pid: string;
    noneRunning: string;
--- a/web/src/i18n/zh.ts
+++ b/web/src/i18n/zh.ts
@@ -78,6 +78,7 @@ export const zh: Translations = {
    disconnected: "已断开",
    error: "错误",
    notRunning: "未运行",
    runningRemote: "运行中（远程）",
    startFailed: "启动失败",
    pid: "进程",
    noneRunning: "无",
--- a/web/src/pages/StatusPage.tsx
+++ b/web/src/pages/StatusPage.tsx
@@ -53,7 +53,8 @@ export default function StatusPage() {
  };
  function gatewayValue(): string {
-    if (status!.gateway_running) return `${t.status.pid} ${status!.gateway_pid}`;
+    if (status!.gateway_running && status!.gateway_pid) return `${t.status.pid} ${status!.gateway_pid}`;
    if (status!.gateway_running) return t.status.runningRemote;
    if (status!.gateway_state === "startup_failed") return t.status.startFailed;
    return t.status.notRunning;
  }
--- a/website/docs/user-guide/docker.md
+++ b/website/docs/user-guide/docker.md
@@ -35,9 +35,39 @@ docker run -d \
  --name hermes \
  --restart unless-stopped \
  -v ~/.hermes:/opt/data \
  -p 8642:8642 \
  nousresearch/hermes-agent gateway run
 ```
 Port 8642 exposes the gateway's [OpenAI-compatible API server](./api-server.md) and health endpoint. It's optional if you only use chat platforms (Telegram, Discord, etc.), but required if you want the dashboard or external tools to reach the gateway.
 Opening any port on an internet-facing machine is a security risk. Do not do it unless you understand the risks.
 ## Running the dashboard
 The built-in web dashboard can run alongside the gateway as a separate container.
 To do so, point the dashboard at the gateway's health endpoint so it can detect gateway status across containers:
 ```sh
 docker run -d \
  --name hermes-dashboard \
  --restart unless-stopped \
  -v ~/.hermes:/opt/data \
  -p 9119:9119 \
  -e GATEWAY_HEALTH_URL=http://$HOST_IP:8642 \
  nousresearch/hermes-agent dashboard
 ```
 Replace `$HOST_IP` with the IP address of the machine running the gateway container (e.g. `192.168.1.100`), or use a Docker network hostname if both containers share a network (see the [Compose example](#docker-compose-example) below).
 | Environment variable | Description | Default |
 |---------------------|-------------|---------|
 | `GATEWAY_HEALTH_URL` | Base URL of the gateway's API server, e.g. `http://gateway:8642` | *(unset — local PID check only)* |
 | `GATEWAY_HEALTH_TIMEOUT` | Health probe timeout in seconds | `3` |
 Without `GATEWAY_HEALTH_URL`, the dashboard falls back to local process detection — which only works when the gateway runs in the same container or on the same host.
 ## Running interactively (CLI chat)
 To open an interactive chat session against a running data directory:
@@ -66,7 +96,7 @@ The `/opt/data` volume is the single source of truth for all Hermes state. It ma
 | `skins/` | Custom CLI skins |
 :::warning
-Never run two Hermes containers against the same data directory simultaneously — session files and memory stores are not designed for concurrent access.
+Never run two Hermes **gateway** containers against the same data directory simultaneously — session files and memory stores are not designed for concurrent write access. Running a dashboard container alongside the gateway is safe since the dashboard only reads data.
 :::
 ## Environment variable forwarding
@@ -85,18 +115,21 @@ Direct `-e` flags override values from `.env`. This is useful for CI/CD or secre
 ## Docker Compose example
-For persistent gateway deployment, a `docker-compose.yaml` is convenient:
+For persistent deployment with both the gateway and dashboard, a `docker-compose.yaml` is convenient:
 ```yaml
 version: "3.8"
 services:
  hermes:
    image: nousresearch/hermes-agent:latest
    container_name: hermes
    restart: unless-stopped
    command: gateway run
    ports:
      - "8642:8642"
    volumes:
      - ~/.hermes:/opt/data
    networks:
      - hermes-net
    # Uncomment to forward specific env vars instead of using .env file:
    # environment:
    #   - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
@@ -107,9 +140,34 @@ services:
        limits:
          memory: 4G
          cpus: "2.0"
  dashboard:
    image: nousresearch/hermes-agent:latest
    container_name: hermes-dashboard
    restart: unless-stopped
    command: dashboard --host 0.0.0.0
    ports:
      - "9119:9119"
    volumes:
      - ~/.hermes:/opt/data
    environment:
      - GATEWAY_HEALTH_URL=http://hermes:8642
    networks:
      - hermes-net
    depends_on:
      - hermes
    deploy:
      resources:
        limits:
          memory: 512M
          cpus: "0.5"
 networks:
  hermes-net:
    driver: bridge
 ```
-Start with `docker compose up -d` and view logs with `docker compose logs -f hermes`.
+Start with `docker compose up -d` and view logs with `docker compose logs -f`.
 ## Resource limits
Author	SHA1	Message	Date
Hermes Agent	5ad28a2dbe	merge: resolve conflict with main (i18n refactor) Main moved StatusPage constants/functions inside the component and added i18n support. Resolved by keeping the i18n structure and adding the runningRemote key to en.ts, zh.ts, and types.ts for remote gateway display.	2026-04-14 22:30:50 +00:00
Hermes Agent	d5949d0d16	docs(docker): add dashboard section, expose API port, update Compose example - Running in gateway mode: expose port 8642 for the API server and health endpoint, with a note on when it's needed. - New 'Running the dashboard' section: docker run command with GATEWAY_HEALTH_URL and env var reference table. - Docker Compose example: updated to include both gateway and dashboard services with internal network connectivity (hermes-net), so the dashboard probes the gateway via http://hermes:8642. - Concurrent access warning: clarified that running a read-only dashboard alongside the gateway is safe.	2026-04-15 08:11:52 +10:00
Hermes Agent	88d590ce5e	fix: override stale 'stopped' state when health probe confirms gateway alive When the gateway responds to the health probe but the local gateway_state.json has a stale 'stopped' state (common in cross-container setups where the file was written before the gateway restarted), the dashboard would show 'Running (remote)' but with a 'Stopped' badge. Now if the HTTP probe succeeded (remote_health_body is not None) and gateway_state is 'stopped' or None, override it to 'running'. Also handles the no-shared-volume case where runtime is None entirely.	2026-04-14 22:01:02 +00:00
Hermes Agent	30c089a7e9	fix: normalise GATEWAY_HEALTH_URL to base URL before probing The probe was appending '/detailed' to whatever URL was provided, so GATEWAY_HEALTH_URL=http://host:8642 would try /8642/detailed and /8642 — neither of which are valid routes. Now strips any trailing /health or /health/detailed from the env var and always probes {base}/health/detailed then {base}/health. Accepts bare base URL, /health, or /health/detailed forms.	2026-04-14 06:29:59 +00:00
Hermes Agent	28c39fda3d	feat(dashboard): add HTTP health probe for cross-container gateway detection The dashboard's gateway status detection relied solely on local PID checks (os.kill + /proc), which fails when the gateway runs in a separate container. Changes: - web_server.py: Add _probe_gateway_health() that queries the gateway's HTTP /health/detailed endpoint when the local PID check fails. Activated by setting the GATEWAY_HEALTH_URL env var (e.g. http://gateway:8642/health). Falls back to standard PID check when the env var is not set. - api_server.py: Add GET /health/detailed endpoint that returns full gateway state (platforms, gateway_state, active_agents, pid, etc.) without auth. The existing GET /health remains unchanged for backwards compatibility. - StatusPage.tsx: Handle the case where gateway_pid is null but the gateway is running remotely, displaying 'Running (remote)' instead of 'PID null'. Environment variables: - GATEWAY_HEALTH_URL: URL of the gateway health endpoint (e.g. http://gateway-container:8642/health). Unset = local PID check only. - GATEWAY_HEALTH_TIMEOUT: Probe timeout in seconds (default: 3).	2026-04-14 05:17:17 +00:00
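
For reviewers, the URL handling described in 30c089a7e9 and the stale-state override described in 88d590ce5e can be sketched in isolation as below. This is a minimal illustration and not the code in `web_server.py`: the helper names `normalise_health_base`, `probe_paths`, and `override_stale_state` are hypothetical, and the override helper assumes the gateway has already been determined to be running.

```python
from __future__ import annotations


def normalise_health_base(url: str) -> str:
    """Reduce any accepted GATEWAY_HEALTH_URL form to the bare base URL.

    Hypothetical helper mirroring the normalisation shown in the diff:
    trailing slashes are dropped, then a trailing /health/detailed or
    /health suffix is removed.
    """
    base = url.rstrip("/")
    for suffix in ("/health/detailed", "/health"):
        if base.endswith(suffix):
            return base[: -len(suffix)]
    return base


def probe_paths(url: str) -> list[str]:
    """Return the URLs to try, detailed endpoint first, simple one second."""
    base = normalise_health_base(url)
    return [f"{base}/health/detailed", f"{base}/health"]


def override_stale_state(state: str | None, probe_succeeded: bool) -> str | None:
    """Override a stale or missing state once the HTTP probe has succeeded.

    Assumes the caller already treats the gateway as running. Only None and
    "stopped" are overridden; any other state is passed through unchanged.
    """
    if probe_succeeded and state in (None, "stopped"):
        return "running"
    return state


if __name__ == "__main__":
    for form in (
        "http://gateway:8642",
        "http://gateway:8642/health",
        "http://gateway:8642/health/detailed",
    ):
        # All three accepted forms should yield the same pair of probe URLs.
        print(form, "->", probe_paths(form))
    print(override_stale_state("stopped", True))
```

Keeping the normalisation separate from the network call makes the three accepted URL forms easy to unit test without a running gateway.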