feat(desktop): agents waterfall — live + historical execution traces

Replace the flat agents list with a zoomable d3 waterfall: a turn strip, collapsible label tree, time-compressed track (idle gaps collapse), and a span inspector. Live turns are stitched from the message/tool/subagent streams into the same TraceDoc shape and fold into the server-exact DB trace on settle, so following a turn never reframes. Clears the overlay's traffic-light/titlebar inset at the OverlayView level for every overlay.
feat(gateway): expose trace.get and trace.turns RPCs
2026-06-29 05:06:48 +08:00 · 2026-06-26 04:48:31 -05:00 · 2026-06-26 04:36:26 -05:00 · 2026-06-26 04:36:26 -05:00 · 2026-06-26 04:36:26 -05:00
27 changed files with 3330 additions and 385 deletions
--- a/agent/trace_builder.py
+++ b/agent/trace_builder.py
@@ -0,0 +1,610 @@
+"""Derive OpenTelemetry-style traces from the Hermes session store.
+
+Hermes already persists everything a trace needs: ``sessions`` rows carry
+server-side ``started_at`` / ``ended_at`` and full token accounting, and
+``messages`` rows carry a server-side ``timestamp`` plus ``tool_calls`` (the
+OpenAI tool-call JSON) and ``tool_call_id`` so a tool call can be paired with
+its result. Every subagent is itself a session linked by
+``parent_session_id``. That means a complete, accurately-timed span tree can be
+reconstructed for *any* session — historical or live — with zero extra
+instrumentation.
+
+This module is the read-side "derive-on-read" trace builder. It turns a session
+(and its subagent descendants) into a provider-neutral :class:`Trace` of
+:class:`Span` objects. ``agent/trace_export.py`` renders that into OTLP/JSON
+(OpenInference conventions, ingestible by Arize Phoenix / any OTel backend) or
+the Chrome Trace Event format (viewable in https://ui.perfetto.dev).
+
+Accuracy note: the Hermes agent loop runs tool calls sequentially, so inferring
+span durations from consecutive message timestamps matches real execution. The
+only inferred link is a ``delegate_task`` tool call → its child session, matched
+by start-time proximity; a future precision pass can persist the spawning
+``tool_call_id`` to make that exact.
+"""
+
+from __future__ import annotations
+
+import json
+import logging
+from dataclasses import dataclass, field
+from typing import Any, Dict, List, Optional, Protocol
+
+logger = logging.getLogger(__name__)
+
+# OpenInference span kinds (what Phoenix and other OTel-GenAI viewers expect).
+KIND_AGENT = "AGENT"
+KIND_LLM = "LLM"
+KIND_TOOL = "TOOL"
+KIND_CHAIN = "CHAIN"
+
+# Status codes, mirroring OTLP (1 = OK, 2 = ERROR).
+STATUS_OK = "ok"
+STATUS_ERROR = "error"
+STATUS_UNSET = "unset"
+
+# Tool names that spawn subagent sessions. A span with one of these names gets
+# its matched child session's subtree nested underneath it.
+_DELEGATE_TOOL_NAMES = frozenset({"delegate_task"})
+
+# How long after the last message a session's ``ended_at`` may sit and still be
+# trusted as real activity (vs a cleanup/orphan reaper firing much later).
+_END_GRACE_SECONDS = 300.0
+
+
+class _SessionStore(Protocol):
+    """The slice of ``SessionDB`` the builder depends on (keeps it testable)."""
+
+    def get_session(self, session_id: str) -> Optional[Dict[str, Any]]: ...
+
+    def get_messages(
+        self, session_id: str, include_inactive: bool = False
+    ) -> List[Dict[str, Any]]: ...
+
+    def get_child_session_ids(self, parent_session_id: str) -> List[str]: ...
+
+
+@dataclass
+class Span:
+    """A single unit of work on the trace timeline.
+
+    Times are epoch seconds (float) to match ``messages.timestamp``. Exporters
+    convert to their own units (OTLP nanoseconds, Chrome microseconds).
+    """
+
+    span_id: str
+    parent_id: Optional[str]
+    name: str
+    kind: str
+    start: float
+    end: float
+    status: str = STATUS_UNSET
+    session_id: Optional[str] = None
+    attributes: Dict[str, Any] = field(default_factory=dict)
+
+    @property
+    def duration(self) -> float:
+        return max(0.0, self.end - self.start)
+
+    def to_dict(self) -> Dict[str, Any]:
+        return {
+            "span_id": self.span_id,
+            "parent_id": self.parent_id,
+            "name": self.name,
+            "kind": self.kind,
+            "start": self.start,
+            "end": self.end,
+            "duration": self.duration,
+            "status": self.status,
+            "session_id": self.session_id,
+            "attributes": {k: v for k, v in self.attributes.items() if v is not None},
+        }
+
+
+@dataclass
+class Trace:
+    """A full span tree rooted at one session (plus its subagent descendants)."""
+
+    trace_id: str
+    root_session_id: str
+    spans: List[Span] = field(default_factory=list)
+    root_span_id: Optional[str] = None
+    metadata: Dict[str, Any] = field(default_factory=dict)
+
+    @property
+    def start(self) -> float:
+        return min((s.start for s in self.spans), default=0.0)
+
+    @property
+    def end(self) -> float:
+        return max((s.end for s in self.spans), default=0.0)
+
+    @property
+    def duration(self) -> float:
+        return max(0.0, self.end - self.start)
+
+    def to_dict(self) -> Dict[str, Any]:
+        return {
+            "trace_id": self.trace_id,
+            "root_session_id": self.root_session_id,
+            "root_span_id": self.root_span_id,
+            "start": self.start,
+            "end": self.end,
+            "duration": self.duration,
+            "metadata": {k: v for k, v in self.metadata.items() if v is not None},
+            "spans": [s.to_dict() for s in self.spans],
+        }
+
+
+# ── tool-call shape helpers ──────────────────────────────────────────────────
+
+
+def _tool_call_id(call: Dict[str, Any]) -> str:
+    return str(call.get("id") or call.get("tool_call_id") or "")
+
+
+def _tool_call_name(call: Dict[str, Any]) -> str:
+    fn = call.get("function")
+    if isinstance(fn, dict) and fn.get("name"):
+        return str(fn["name"])
+    return str(call.get("name") or "tool")
+
+
+def _tool_call_args(call: Dict[str, Any]) -> Any:
+    fn = call.get("function")
+    raw = fn.get("arguments") if isinstance(fn, dict) else call.get("arguments")
+    if isinstance(raw, str):
+        try:
+            return json.loads(raw)
+        except (json.JSONDecodeError, TypeError):
+            return raw
+    return raw
+
+
+def _as_text(content: Any) -> str:
+    if content is None:
+        return ""
+    if isinstance(content, str):
+        return content
+    try:
+        return json.dumps(content, ensure_ascii=False, default=str)
+    except (TypeError, ValueError):
+        return str(content)
+
+
+def _looks_like_error(message: Dict[str, Any]) -> bool:
+    """Best-effort error detection on a tool-result message."""
+    text = _as_text(message.get("content")).lstrip()
+    if not text:
+        return False
+    head = text[:400].lower()
+    if text.startswith("{"):
+        try:
+            obj = json.loads(text)
+            if isinstance(obj, dict):
+                if obj.get("error") or obj.get("success") is False:
+                    return True
+                status = str(obj.get("status", "")).lower()
+                if status in {"error", "failed", "failure"}:
+                    return True
+        except (json.JSONDecodeError, TypeError):
+            pass
+    return any(
+        marker in head
+        for marker in ("traceback (most recent call last)", "error:", "exception:")
+    )
+
+
+# ── builder ──────────────────────────────────────────────────────────────────
+
+
+def _short(value: str, limit: int = 120) -> str:
+    flat = " ".join(value.split())
+    return flat if len(flat) <= limit else flat[: limit - 1] + "…"
+
+
+def _llm_span_name() -> str:
+    """Label for an LLM (assistant-turn) span. A plain structural "llm" (matching
+    the OTel/Langfuse convention of short, uniform LLM labels) — the model and
+    the response text live in the span's attributes / detail panel, not the row.
+    """
+    return "llm"
+
+
+def _mk_span_id(prefix: str, session_id: str, key: Any) -> str:
+    return f"{prefix}:{session_id}:{key}"
+
+
+def _trace_id(session_id: str) -> str:
+    return f"trace:{session_id}"
+
+
+def build_trace(
+    store: _SessionStore,
+    session_id: str,
+    *,
+    include_subagents: bool = True,
+    _depth: int = 0,
+    _max_depth: int = 8,
+) -> Optional[Trace]:
+    """Reconstruct a :class:`Trace` for ``session_id`` from the session store.
+
+    Returns ``None`` when the session does not exist. Walks delegate subagent
+    descendants (``parent_session_id``) and nests each under the
+    ``delegate_task`` tool span that spawned it.
+    """
+    session = store.get_session(session_id)
+    if not session:
+        return None
+
+    trace = Trace(
+        trace_id=_trace_id(session_id),
+        root_session_id=session_id,
+        metadata={
+            "source": session.get("source"),
+            "model": session.get("model"),
+            "cwd": session.get("cwd"),
+            "git_branch": session.get("git_branch"),
+        },
+    )
+
+    root_span = _build_session_spans(store, session, parent_span_id=None, trace=trace)
+    trace.root_span_id = root_span.span_id if root_span else None
+
+    if include_subagents and root_span and _depth < _max_depth:
+        _attach_subagents(store, session_id, trace, _depth=_depth, _max_depth=_max_depth)
+
+    trace.spans.sort(key=lambda s: (s.start, s.span_id))
+    return trace
+
+
+# Synthetic re-injections that re-enter the conversation as ``user`` messages
+# but are CONTINUATIONS of earlier work, not a fresh prompt: async-delegation
+# completions (`[ASYNC DELEGATION …]`) and background-process notifications
+# (`[IMPORTANT: …]`). They must not open a new turn — otherwise a background
+# subagent dispatched in turn N shows up as its own orphan "[ASYNC DELEGATION]"
+# turn when it finishes, instead of folding into the group that spawned it. This
+# mirrors the desktop live view, which only resets the live turn on a real
+# ``prompt.submit``.
+_CONTINUATION_PREFIXES = (
+    "[ASYNC DELEGATION",
+    "[IMPORTANT:",
+)
+
+
+def _is_continuation(message: Dict[str, Any]) -> bool:
+    """True for a synthetic re-injection that should merge into the current turn."""
+    if message.get("role") != "user":
+        return False
+    return _as_text(message.get("content")).lstrip().startswith(_CONTINUATION_PREFIXES)
+
+
+def _split_turns(messages: List[Dict[str, Any]]) -> List[tuple]:
+    """Split messages into turns. A turn begins at each *real* ``user`` message and
+    runs until the next one. Synthetic continuations (async-delegation /
+    background-process re-injections) do NOT start a turn — they merge into the
+    one that spawned the work. Leading non-user messages (e.g. system) join the
+    first turn. Returns ``[(start_idx, end_idx), ...]`` index ranges.
+    """
+    bounds: List[tuple] = []
+    start = 0
+    for i, m in enumerate(messages):
+        if i > 0 and m.get("role") == "user" and not _is_continuation(m):
+            bounds.append((start, i))
+            start = i
+    bounds.append((start, len(messages)))
+    return bounds
+
+
+def build_session_turns(
+    store: _SessionStore,
+    session_id: str,
+    *,
+    include_subagents: bool = True,
+) -> List[Trace]:
+    """Build one :class:`Trace` per turn for a session.
+
+    A turn (one user prompt → the agent's full response, subagents included) is
+    the natural trace unit — it has no inter-turn idle gaps, so each renders as a
+    tight waterfall. Returns traces in chronological order.
+    """
+    session = store.get_session(session_id)
+    if not session:
+        return []
+    messages = store.get_messages(session_id)
+    if not messages:
+        return []
+
+    meta = {
+        "source": session.get("source"),
+        "model": session.get("model"),
+        "cwd": session.get("cwd"),
+        "git_branch": session.get("git_branch"),
+    }
+    out: List[Trace] = []
+    for ti, (a, b) in enumerate(_split_turns(messages)):
+        slice_msgs = messages[a:b]
+        if not slice_msgs:
+            continue
+        trace = Trace(
+            trace_id=f"{_trace_id(session_id)}:t{ti}",
+            root_session_id=session_id,
+            metadata={**meta, "turn": ti},
+        )
+        root = _build_session_spans(
+            store,
+            session,
+            parent_span_id=None,
+            trace=trace,
+            messages=slice_msgs,
+            agent_key=f"turn{ti}",
+        )
+        if not root:
+            continue
+        trace.root_span_id = root.span_id
+        if include_subagents:
+            _attach_subagents(
+                store,
+                session_id,
+                trace,
+                agent_key=f"turn{ti}",
+                window=(root.start, root.end),
+                _depth=0,
+                _max_depth=8,
+            )
+        trace.spans.sort(key=lambda s: (s.start, s.span_id))
+        out.append(trace)
+
+    return out
+
+
+def _build_session_spans(
+    store: _SessionStore,
+    session: Dict[str, Any],
+    *,
+    parent_span_id: Optional[str],
+    trace: Trace,
+    messages: Optional[List[Dict[str, Any]]] = None,
+    agent_key: str = "root",
+) -> Optional[Span]:
+    """Append the AGENT span for one session (or a turn slice of it) plus its
+    LLM/TOOL child spans.
+
+    ``messages`` lets a caller pass a turn-scoped slice; ``agent_key`` keeps the
+    AGENT span id unique per turn. Returns the session's root (AGENT) span, or
+    ``None`` for an empty session.
+    """
+    session_id = session["id"]
+    if messages is None:
+        messages = store.get_messages(session_id)
+    if not messages:
+        return None
+
+    msg_start = min(float(m["timestamp"]) for m in messages)
+    msg_end = max(float(m["timestamp"]) for m in messages)
+    started_at = float(session.get("started_at") or msg_start)
+
+    # Clamp the AGENT span to real message activity. Session ``started_at`` /
+    # ``ended_at`` are unreliable for trace timing: ``ended_at`` can be a
+    # cleanup/orphan reaper firing hours later (e.g. ``ws_orphan_reap``), and on
+    # a turn slice ``started_at`` is the whole-session start, far before this
+    # turn. Snap to them only when they sit right at the slice's edges, so a
+    # span never balloons into inter-turn idle.
+    activity_start = msg_start
+    if msg_start - _END_GRACE_SECONDS <= started_at <= msg_start:
+        activity_start = started_at
+    activity_end = msg_end
+    raw_ended = session.get("ended_at")
+    if raw_ended is not None:
+        ended_at = float(raw_ended)
+        if msg_end <= ended_at <= msg_end + _END_GRACE_SECONDS:
+            activity_end = ended_at
+
+    goal = _session_goal(messages, session)
+    agent_span = Span(
+        span_id=_mk_span_id("agent", session_id, agent_key),
+        parent_id=parent_span_id,
+        name=goal,
+        kind=KIND_AGENT,
+        start=activity_start,
+        end=activity_end,
+        status=STATUS_ERROR if session.get("end_reason") in {"error", "failed"} else STATUS_OK,
+        session_id=session_id,
+        attributes={
+            "session.id": session_id,
+            "session.source": session.get("source"),
+            "llm.model_name": session.get("model"),
+            "llm.token_count.prompt": session.get("input_tokens"),
+            "llm.token_count.completion": session.get("output_tokens"),
+            "llm.token_count.reasoning": session.get("reasoning_tokens"),
+            "session.message_count": session.get("message_count"),
+            "session.tool_call_count": session.get("tool_call_count"),
+            "session.end_reason": session.get("end_reason"),
+        },
+    )
+    trace.spans.append(agent_span)
+
+    # Pre-index tool results by tool_call_id so calls pair with their output.
+    results_by_id: Dict[str, Dict[str, Any]] = {}
+    for m in messages:
+        if m.get("role") == "tool" and m.get("tool_call_id"):
+            results_by_id[str(m["tool_call_id"])] = m
+
+    # Walk the turn timeline. An assistant message closes the LLM span that
+    # began at the previous boundary; each tool_call it carries becomes a TOOL
+    # span ending at its paired result.
+    prev_boundary = activity_start
+    for m in messages:
+        role = m.get("role")
+        ts = float(m["timestamp"])
+
+        if role == "assistant":
+            llm_span = Span(
+                span_id=_mk_span_id("llm", session_id, m["id"]),
+                parent_id=agent_span.span_id,
+                name=_llm_span_name(),
+                kind=KIND_LLM,
+                start=prev_boundary,
+                end=ts,
+                status=STATUS_OK,
+                session_id=session_id,
+                attributes={
+                    "llm.model_name": session.get("model"),
+                    "llm.token_count.completion": m.get("token_count"),
+                    "output.value": _short(_as_text(m.get("content")), 2000),
+                    "hermes.finish_reason": m.get("finish_reason"),
+                    "hermes.has_reasoning": bool(
+                        m.get("reasoning") or m.get("reasoning_content")
+                    ),
+                },
+            )
+            trace.spans.append(llm_span)
+
+            for call in m.get("tool_calls") or []:
+                if not isinstance(call, dict):
+                    continue
+                _append_tool_span(
+                    trace=trace,
+                    session=session,
+                    parent_span_id=agent_span.span_id,
+                    call=call,
+                    call_ts=ts,
+                    results_by_id=results_by_id,
+                    fallback_end=activity_end,
+                )
+
+            prev_boundary = ts
+        elif role in {"user", "tool"}:
+            # User input and tool results define the next LLM span's start.
+            prev_boundary = ts
+
+    return agent_span
+
+
+def _append_tool_span(
+    *,
+    trace: Trace,
+    session: Dict[str, Any],
+    parent_span_id: str,
+    call: Dict[str, Any],
+    call_ts: float,
+    results_by_id: Dict[str, Dict[str, Any]],
+    fallback_end: float,
+) -> None:
+    session_id = session["id"]
+    call_id = _tool_call_id(call)
+    name = _tool_call_name(call)
+    result = results_by_id.get(call_id) if call_id else None
+    end = float(result["timestamp"]) if result else fallback_end
+    status = STATUS_OK
+    if result and _looks_like_error(result):
+        status = STATUS_ERROR
+    elif not result:
+        status = STATUS_UNSET
+
+    args = _tool_call_args(call)
+    span = Span(
+        span_id=_mk_span_id("tool", session_id, call_id or f"{call_ts}:{name}"),
+        parent_id=parent_span_id,
+        name=name,
+        kind=KIND_TOOL,
+        start=call_ts,
+        end=max(end, call_ts),
+        status=status,
+        session_id=session_id,
+        attributes={
+            "tool.name": name,
+            "tool.call_id": call_id or None,
+            "input.value": _short(_as_text(args), 2000),
+            "output.value": _short(_as_text(result.get("content")), 2000) if result else None,
+            "hermes.is_delegate": name in _DELEGATE_TOOL_NAMES,
+        },
+    )
+    trace.spans.append(span)
+
+
+def _attach_subagents(
+    store: _SessionStore,
+    session_id: str,
+    trace: Trace,
+    *,
+    agent_key: str = "root",
+    window: Optional[tuple] = None,
+    _depth: int,
+    _max_depth: int,
+) -> None:
+    """Nest each delegate child session under the tool span that spawned it.
+
+    Children are matched to ``delegate_task`` tool spans by start-time proximity
+    (each child consumed once). Unmatched children attach to the session's AGENT
+    span so they are never dropped from the trace. ``window`` (start, end) limits
+    attachment to children spawned during a turn slice.
+    """
+    child_ids = store.get_child_session_ids(session_id)
+    if not child_ids:
+        return
+
+    delegate_spans = sorted(
+        (
+            s
+            for s in trace.spans
+            if s.session_id == session_id
+            and s.kind == KIND_TOOL
+            and s.attributes.get("hermes.is_delegate")
+        ),
+        key=lambda s: s.start,
+    )
+    agent_span_id = _mk_span_id("agent", session_id, agent_key)
+
+    children = []
+    for cid in child_ids:
+        csess = store.get_session(cid)
+        if not csess:
+            continue
+        if window is not None:
+            cstart = float(csess.get("started_at") or 0.0)
+            if not (window[0] - 1.0 <= cstart <= window[1] + 1.0):
+                continue
+        children.append(csess)
+    children.sort(key=lambda c: float(c.get("started_at") or 0.0))
+
+    used: set = set()
+    for csess in children:
+        cstart = float(csess.get("started_at") or 0.0)
+        parent_span_id = agent_span_id
+        best = None
+        best_gap = None
+        for ds in delegate_spans:
+            if ds.span_id in used:
+                continue
+            gap = abs(ds.start - cstart)
+            if best_gap is None or gap < best_gap:
+                best, best_gap = ds, gap
+        if best is not None:
+            used.add(best.span_id)
+            parent_span_id = best.span_id
+
+        child_root = _build_session_spans(
+            store, csess, parent_span_id=parent_span_id, trace=trace
+        )
+        if child_root and _depth + 1 < _max_depth:
+            _attach_subagents(
+                store, csess["id"], trace, _depth=_depth + 1, _max_depth=_max_depth
+            )
+
+
+def _session_goal(messages: List[Dict[str, Any]], session: Dict[str, Any]) -> str:
+    """A human-readable label for an AGENT span.
+
+    Prefer the first user message in the given slice so per-turn spans get their
+    own prompt as a label (the session title is identical across every turn).
+    Fall back to the session title, then a short id.
+    """
+    for m in messages:
+        if m.get("role") == "user":
+            text = _as_text(m.get("content")).strip()
+            if text:
+                return _short(text, 120)
+    title = session.get("title")
+    if title:
+        return _short(str(title), 120)
+    return f"session {str(session.get('id', ''))[:8]}"
--- a/agent/trace_export.py
+++ b/agent/trace_export.py
@@ -0,0 +1,170 @@
+"""Render a :class:`~agent.trace_builder.Trace` into portable file formats.
+
+Two formats, both hand-built (no OTel SDK dependency):
+
+* **OTLP/JSON** with OpenInference semantic conventions — the industry standard
+  for LLM/agent traces. Ingestible by Arize Phoenix and any OpenTelemetry
+  backend, so we can confirm our spans are correct in a real viewer.
+* **Chrome Trace Event format** — each session becomes its own track, viewable
+  by dropping the file into https://ui.perfetto.dev or ``chrome://tracing``.
+
+Keeping these as plain dict/JSON builders means the trace layer has zero new
+third-party dependencies.
+"""
+
+from __future__ import annotations
+
+import hashlib
+import json
+from typing import Any, Dict, List
+
+from agent.trace_builder import STATUS_ERROR, STATUS_OK, Trace
+
+_SERVICE_NAME = "hermes-agent"
+_SCOPE_NAME = "hermes.tracing"
+
+
+def _hex_id(value: str, nbytes: int) -> str:
+    """Deterministic hex id of ``nbytes`` bytes from an arbitrary string."""
+    digest = hashlib.sha1(value.encode("utf-8")).hexdigest()
+    return digest[: nbytes * 2]
+
+
+def _otlp_any_value(value: Any) -> Dict[str, Any]:
+    if isinstance(value, bool):
+        return {"boolValue": value}
+    if isinstance(value, int):
+        return {"intValue": str(value)}
+    if isinstance(value, float):
+        return {"doubleValue": value}
+    return {"stringValue": str(value)}
+
+
+def _otlp_attributes(attrs: Dict[str, Any]) -> List[Dict[str, Any]]:
+    out: List[Dict[str, Any]] = []
+    for key, value in attrs.items():
+        if value is None:
+            continue
+        out.append({"key": key, "value": _otlp_any_value(value)})
+    return out
+
+
+def _otlp_status(status: str) -> Dict[str, Any]:
+    if status == STATUS_OK:
+        return {"code": 1}
+    if status == STATUS_ERROR:
+        return {"code": 2}
+    return {"code": 0}
+
+
+def _to_nanos(seconds: float) -> str:
+    return str(int(seconds * 1_000_000_000))
+
+
+def to_otlp_json(trace: Trace) -> Dict[str, Any]:
+    """Build an OTLP/JSON ``TracesData`` document with OpenInference attributes."""
+    trace_hex = _hex_id(trace.trace_id, 16)
+    otlp_spans: List[Dict[str, Any]] = []
+
+    for span in trace.spans:
+        attributes = dict(span.attributes)
+        # OpenInference: the GenAI span kind travels as an attribute; the OTLP
+        # SpanKind stays INTERNAL (1).
+        attributes["openinference.span.kind"] = span.kind
+        if span.session_id:
+            attributes.setdefault("session.id", span.session_id)
+
+        otlp_span: Dict[str, Any] = {
+            "traceId": trace_hex,
+            "spanId": _hex_id(span.span_id, 8),
+            "name": span.name,
+            "kind": 1,
+            "startTimeUnixNano": _to_nanos(span.start),
+            "endTimeUnixNano": _to_nanos(span.end),
+            "attributes": _otlp_attributes(attributes),
+            "status": _otlp_status(span.status),
+        }
+        if span.parent_id:
+            otlp_span["parentSpanId"] = _hex_id(span.parent_id, 8)
+        otlp_spans.append(otlp_span)
+
+    return {
+        "resourceSpans": [
+            {
+                "resource": {
+                    "attributes": _otlp_attributes(
+                        {
+                            "service.name": _SERVICE_NAME,
+                            "session.id": trace.root_session_id,
+                            "hermes.source": trace.metadata.get("source"),
+                        }
+                    )
+                },
+                "scopeSpans": [
+                    {
+                        "scope": {"name": _SCOPE_NAME},
+                        "spans": otlp_spans,
+                    }
+                ],
+            }
+        ]
+    }
+
+
+def to_chrome_trace(trace: Trace) -> Dict[str, Any]:
+    """Build a Chrome Trace Event document (one track per session)."""
+    base = trace.start
+    events: List[Dict[str, Any]] = []
+
+    # Stable, compact track ids per session — each session is its own lane.
+    tids: Dict[str, int] = {}
+
+    def tid_for(session_id: str) -> int:
+        if session_id not in tids:
+            tids[session_id] = len(tids) + 1
+        return tids[session_id]
+
+    for span in trace.spans:
+        sid = span.session_id or trace.root_session_id
+        events.append(
+            {
+                "name": span.name,
+                "cat": span.kind,
+                "ph": "X",
+                "ts": (span.start - base) * 1_000_000,
+                "dur": max(0.0, span.duration) * 1_000_000,
+                "pid": 1,
+                "tid": tid_for(sid),
+                "args": {k: v for k, v in span.attributes.items() if v is not None},
+            }
+        )
+
+    # Name each track after its session (root first) for legible lanes.
+    for session_id, tid in tids.items():
+        label = "root" if session_id == trace.root_session_id else f"subagent {session_id[:8]}"
+        events.append(
+            {
+                "name": "thread_name",
+                "ph": "M",
+                "pid": 1,
+                "tid": tid,
+                "args": {"name": label},
+            }
+        )
+
+    return {"traceEvents": events, "displayTimeUnit": "ms"}
+
+
+def dumps(trace: Trace, fmt: str = "otlp", *, indent: int = 2) -> str:
+    """Serialize ``trace`` to a JSON string in the requested format."""
+    fmt = (fmt or "otlp").lower()
+    if fmt in {"otlp", "otlp-json", "openinference"}:
+        doc = to_otlp_json(trace)
+    elif fmt in {"chrome", "perfetto", "trace-event"}:
+        doc = to_chrome_trace(trace)
+    else:
+        raise ValueError(f"unknown trace format: {fmt!r} (use 'otlp' or 'chrome')")
+    return json.dumps(doc, ensure_ascii=False, indent=indent, default=str)
+
+
+__all__ = ["dumps", "to_chrome_trace", "to_otlp_json"]
--- a/apps/desktop/package.json
+++ b/apps/desktop/package.json
@@ -80,6 +80,9 @@
    "class-variance-authority": "^0.7.1",
    "clsx": "^2.1.1",
    "cmdk": "^1.1.1",
+    "d3-scale": "^4.0.2",
+    "d3-selection": "^3.0.0",
+    "d3-zoom": "^3.0.0",
    "dnd-core": "^14.0.1",
    "hast-util-from-html-isomorphic": "^2.0.0",
    "hast-util-to-text": "^4.0.2",
@@ -115,6 +118,9 @@
    "@eslint/js": "^9.39.4",
    "@testing-library/dom": "^10.4.0",
    "@testing-library/react": "^16.3.2",
+    "@types/d3-scale": "^4.0.9",
+    "@types/d3-selection": "^3.0.11",
+    "@types/d3-zoom": "^3.0.8",
    "@types/hast": "^3.0.4",
    "@types/node": "^24.13.2",
    "@types/react": "^19.2.14",
--- a/apps/desktop/src/app/agents/build-live-trace.ts
+++ b/apps/desktop/src/app/agents/build-live-trace.ts
@@ -0,0 +1,194 @@
+import type { LiveTurn } from '@/store/live-turn'
+import type { SubagentProgress } from '@/store/subagents'
+import type { TraceDoc, TraceSpan, TraceSpanStatus } from '@/store/trace'
+
+const TERMINAL_SUB = new Set(['completed', 'failed', 'interrupted'])
+
+/**
+ * Stitch the live turn + subagent stream into a TraceDoc (client time, epoch
+ * seconds) in the SAME shape the DB trace produces, so the waterfall renders the
+ * in-flight turn with no separate code path. LLM spans are the gaps between tool
+ * calls; subagents nest under the `delegate_task` tool span that spawned them.
+ *
+ * `live=false` finalizes the snapshot (running → ok) so a settled turn stops
+ * pulsing. Returns null when there's nothing in flight to draw.
+ */
+export function buildLiveTrace(
+  turn: LiveTurn | undefined,
+  subs: SubagentProgress[],
+  nowMs: number,
+  rootLabel?: string,
+  live = true
+): null | TraceDoc {
+  const tools = turn?.tools ?? []
+
+  // Robust to opening/reloading mid-turn (when message.start was never captured
+  // in this renderer): build from whatever live data exists — the captured turn,
+  // its tools, or streamed subagents.
+  if (!turn?.busy && tools.length === 0 && subs.length === 0) {
+    return null
+  }
+
+  const nowSec = nowMs / 1000
+
+  const startCandidates = [turn?.turnStart, ...subs.map(s => s.startedAt)].filter(
+    (n): n is number => typeof n === 'number'
+  )
+
+  const startSec = (startCandidates.length ? Math.min(...startCandidates) : nowMs) / 1000
+  const rootId = 'live:root'
+
+  const spans: TraceSpan[] = [
+    {
+      id: rootId,
+      parentId: null,
+      name: rootLabel || 'Current turn',
+      kind: 'AGENT',
+      start: startSec,
+      end: nowSec,
+      duration: Math.max(0, nowSec - startSec),
+      status: live ? 'running' : 'ok',
+      sessionId: null,
+      attributes: {}
+    }
+  ]
+
+  const toolSpanIds = new Set<string>()
+  const sortedTools = [...tools].sort((a, b) => a.start - b.start)
+  let prev = startSec
+  let llmIdx = 0
+
+  const pushLlm = (start: number, end: number, running: boolean, output?: string) => {
+    if (end - start <= 0.05) {
+      return
+    }
+
+    const text = output?.trim()
+
+    spans.push({
+      id: `live:llm:${llmIdx++}`,
+      parentId: rootId,
+      name: 'llm',
+      kind: 'LLM',
+      start,
+      end,
+      duration: end - start,
+      status: running ? 'running' : 'ok',
+      sessionId: null,
+      attributes: text ? { 'output.value': text } : {}
+    })
+  }
+
+  for (const t of sortedTools) {
+    const ts = t.start / 1000
+    const te = (t.end ?? nowMs) / 1000
+    pushLlm(prev, ts, false)
+    const spanId = `live:tool:${t.id}`
+    toolSpanIds.add(spanId)
+    spans.push({
+      id: spanId,
+      parentId: rootId,
+      name: t.name,
+      kind: 'TOOL',
+      start: ts,
+      end: Math.max(te, ts),
+      duration: Math.max(0, te - ts),
+      status: t.status,
+      sessionId: null,
+      attributes: { 'tool.name': t.name }
+    })
+    prev = Math.max(prev, te)
+  }
+
+  // Trailing llm = the model's response after the last tool. Show it ONLY when
+  // there were tools (it's a distinct segment, and it's what streams/grows during
+  // "reporting back"). A pure no-tool turn skips it — the root already is the
+  // response, so a lone "llm" child would just duplicate it.
+  if (sortedTools.length > 0) {
+    pushLlm(prev, nowSec, live, turn?.replyText)
+  } else if (turn?.replyText.trim()) {
+    // No-tool turn: the root IS the response, so don't add a redundant llm row —
+    // instead hang the streamed reply on the root so selecting it shows the text.
+    spans[0].attributes = { ...spans[0].attributes, 'output.value': turn.replyText.trim() }
+  }
+
+  // Nest each subagent under: another subagent (parentId), else the delegate_task
+  // tool span that spawned it. Native subagents don't carry the tool id, so match
+  // by the nearest delegate span that started at/before the subagent.
+  const subIds = new Set(subs.map(s => s.id))
+
+  const delegateSpans = sortedTools
+    .filter(t => t.name === 'delegate_task')
+    .map(t => ({ id: `live:tool:${t.id}`, startMs: t.start }))
+
+  for (const s of subs) {
+    const start = s.startedAt / 1000
+    const terminal = TERMINAL_SUB.has(s.status)
+    const end = terminal && s.durationSeconds ? start + s.durationSeconds : nowSec
+
+    const status: TraceSpanStatus =
+      s.status === 'failed' || s.status === 'interrupted' ? 'error' : terminal ? 'ok' : 'running'
+
+    let parentId = rootId
+
+    if (s.parentId && subIds.has(s.parentId)) {
+      parentId = `live:sub:${s.parentId}`
+    } else {
+      const idMatch = /^delegate-tool:(.+):\d+$/.exec(s.id)
+
+      if (idMatch && toolSpanIds.has(`live:tool:${idMatch[1]}`)) {
+        parentId = `live:tool:${idMatch[1]}`
+      } else {
+        let best: null | { id: string; startMs: number } = null
+
+        for (const d of delegateSpans) {
+          if (d.startMs <= s.startedAt + 1000 && (!best || d.startMs > best.startMs)) {
+            best = d
+          }
+        }
+
+        if (best) {
+          parentId = best.id
+        }
+      }
+    }
+
+    spans.push({
+      id: `live:sub:${s.id}`,
+      parentId,
+      name: s.goal || 'subagent',
+      kind: 'AGENT',
+      start,
+      end: Math.max(end, start),
+      duration: Math.max(0, end - start),
+      status,
+      sessionId: s.sessionId ?? null,
+      attributes: {
+        'llm.model_name': s.model,
+        'llm.token_count.completion': s.outputTokens,
+        'llm.token_count.prompt': s.inputTokens
+      }
+    })
+  }
+
+  // Finalized snapshot (turn settled): no span should keep a 'running' status,
+  // or its bar pulses forever.
+  if (!live) {
+    for (const sp of spans) {
+      if (sp.status === 'running') {
+        sp.status = 'ok'
+      }
+    }
+  }
+
+  return {
+    traceId: 'live',
+    rootSessionId: 'live',
+    rootSpanId: rootId,
+    start: startSec,
+    end: nowSec,
+    duration: Math.max(0, nowSec - startSec),
+    metadata: { live: true },
+    spans
+  }
+}
--- a/apps/desktop/src/app/agents/empty-state.tsx
+++ b/apps/desktop/src/app/agents/empty-state.tsx
@@ -0,0 +1,10 @@
+import { Codicon } from '@/components/ui/codicon'
+
+export function EmptyState({ icon, text }: { icon: string; text: string }) {
+  return (
+    <div className="flex flex-1 flex-col items-center justify-center gap-2 text-center">
+      <Codicon className="text-muted-foreground/50" name={icon} size="1.25rem" />
+      <p className="text-xs text-muted-foreground/70">{text}</p>
+    </div>
+  )
+}
--- a/apps/desktop/src/app/agents/format.ts
+++ b/apps/desktop/src/app/agents/format.ts
@@ -0,0 +1,12 @@
+/** Span/trace duration (seconds) → "120ms" / "1.50s" / "2m 3s". */
+export function fmtDuration(s: number): string {
+  if (s < 1) {
+    return `${Math.round(s * 1000)}ms`
+  }
+
+  if (s < 60) {
+    return `${s.toFixed(s < 10 ? 2 : 1)}s`
+  }
+
+  return `${Math.floor(s / 60)}m ${Math.round(s % 60)}s`
+}
--- a/apps/desktop/src/app/agents/hooks/use-session-trace.ts
+++ b/apps/desktop/src/app/agents/hooks/use-session-trace.ts
@@ -0,0 +1,112 @@
+import { useCallback, useEffect } from 'react'
+
+import { useGatewayRequest } from '@/app/gateway/hooks/use-gateway-request'
+import {
+  $trace,
+  $traceError,
+  $traceLoading,
+  $traceTurns,
+  setTrace,
+  toTraceDoc,
+  toTurnSummaries,
+  type TraceDoc
+} from '@/store/trace'
+
+/**
+ * Fetch the execution trace for a session from the gateway (`trace.get`) and
+ * publish it into the trace store. Re-fetches when the session or turn changes.
+ */
+export function useSessionTrace(sessionId: null | string, turn?: number) {
+  const { requestGateway } = useGatewayRequest()
+
+  const load = useCallback(async () => {
+    if (!sessionId) {
+      setTrace(null)
+
+      return
+    }
+
+    $traceLoading.set(true)
+    $traceError.set(null)
+
+    try {
+      const params: Record<string, unknown> = { session_id: sessionId }
+
+      if (typeof turn === 'number') {
+        params.turn = turn
+      }
+
+      const wire = await requestGateway<Record<string, unknown>>('trace.get', params)
+      setTrace(toTraceDoc(wire))
+    } catch (error) {
+      $traceError.set(error instanceof Error ? error.message : String(error))
+      $trace.set(null)
+    } finally {
+      $traceLoading.set(false)
+    }
+  }, [requestGateway, sessionId, turn])
+
+  useEffect(() => {
+    void load()
+  }, [load])
+}
+
+/**
+ * Fetch the per-turn summaries for a session (`trace.turns`) so the overlay can
+ * offer a turn strip. Publishes into the trace store.
+ */
+export function useTraceTurns(sessionId: null | string) {
+  const { requestGateway } = useGatewayRequest()
+
+  const reloadTurns = useCallback(async () => {
+    if (!sessionId) {
+      $traceTurns.set([])
+
+      return
+    }
+
+    try {
+      const wire = await requestGateway<{ turns?: unknown[] }>('trace.turns', { session_id: sessionId })
+      $traceTurns.set(toTurnSummaries(wire as { turns?: never[] }))
+    } catch {
+      // Keep the previous list on a transient failure — never blank it, or the
+      // nav (and any latest-index math) flickers mid-turn.
+    }
+  }, [requestGateway, sessionId])
+
+  useEffect(() => {
+    void reloadTurns()
+  }, [reloadTurns])
+
+  return { reloadTurns }
+}
+
+/**
+ * Imperative one-shot fetch of a single turn's trace, WITHOUT touching the
+ * `$trace` store. The agents overlay uses this to grab a just-finished turn's
+ * exact DB trace and fold it into the live view, instead of letting the
+ * declarative `useSessionTrace` swap the store (which would race the live
+ * stitch and churn the view).
+ */
+export function useTraceFetcher() {
+  const { requestGateway } = useGatewayRequest()
+
+  const fetchTurn = useCallback(
+    async (sessionId: null | string, turn: number): Promise<null | TraceDoc> => {
+      if (!sessionId) {
+        return null
+      }
+
+      try {
+        const wire = await requestGateway<Record<string, unknown>>('trace.get', { session_id: sessionId, turn })
+
+        return toTraceDoc(wire)
+      } catch {
+        return null
+      }
+    },
+    [requestGateway]
+  )
+
+  return { fetchTurn }
+}
--- a/apps/desktop/src/app/agents/hooks/use-trace-view.ts
+++ b/apps/desktop/src/app/agents/hooks/use-trace-view.ts
@@ -0,0 +1,215 @@
+import { useStore } from '@nanostores/react'
+import { useEffect, useMemo, useRef, useState } from 'react'
+
+import { chatMessageText } from '@/lib/chat-messages'
+import { $liveTurnBySession } from '@/store/live-turn'
+import { $activeSessionId, $messages } from '@/store/session'
+import { $subagentsBySession } from '@/store/subagents'
+import {
+  $hoveredSpanId,
+  $selectedSpanId,
+  $trace,
+  $traceError,
+  $traceLoading,
+  $traceSelection,
+  $traceTurns,
+  rebaseTrace,
+  type TraceDoc
+} from '@/store/trace'
+
+import { buildLiveTrace } from '../build-live-trace'
+
+import { useSessionTrace, useTraceFetcher, useTraceTurns } from './use-session-trace'
+
+export interface TraceView {
+  activeIndex: null | number
+  error: null | string
+  liveIndex: null | number
+  loading: boolean
+  selectTurn: (index: number) => void
+  selection: ReturnType<typeof $traceSelection.get>
+  sessionId: null | string
+  trace: null | TraceDoc
+}
+
+/**
+ * The agents overlay's view-model: resolves the one trace to render from three
+ * sources — the live event stitch while a followed turn is in flight, the
+ * server-exact DB trace it folds into once settled, and any pinned/historical
+ * turn or whole-session view. Keeps the route root pure layout.
+ */
+export function useTraceView(): TraceView {
+  const sessionId = useStore($activeSessionId)
+  const turns = useStore($traceTurns)
+  const dbTrace = useStore($trace)
+  const loading = useStore($traceLoading)
+  const error = useStore($traceError)
+  const subsBySession = useStore($subagentsBySession)
+  const liveTurnBySession = useStore($liveTurnBySession)
+  const selection = useStore($traceSelection)
+
+  const [nowMs, setNowMs] = useState(() => Date.now())
+  // Frozen last live render — kept after a turn settles so the time mapping (and
+  // thus the view) can't shift. Cleared on session switch.
+  const liveTraceRef = useRef<null | TraceDoc>(null)
+  const finalizedRef = useRef(false)
+  // The just-finished followed turn re-fetched from the DB and rebased onto the
+  // live start: settled exactness folded into the live view (the "B" in A→B).
+  const [foldedTrace, setFoldedTrace] = useState<null | TraceDoc>(null)
+  const sessionRef = useRef(sessionId)
+
+  if (sessionRef.current !== sessionId) {
+    sessionRef.current = sessionId
+    liveTraceRef.current = null
+    finalizedRef.current = false
+  }
+
+  const liveSubs = useMemo(() => (sessionId ? (subsBySession[sessionId] ?? []) : []), [sessionId, subsBySession])
+  const liveTurn = sessionId ? liveTurnBySession[sessionId] : undefined
+  // Live = turn busy OR subagents in flight (robust to opening mid-turn).
+  const isLive = !!liveTurn?.busy || liveSubs.some(s => s.status === 'running' || s.status === 'queued')
+  const hasLiveData = isLive || (liveTurn?.tools.length ?? 0) > 0 || liveSubs.length > 0
+
+  const following = selection === 'latest'
+  const latestIndex = turns.length - 1
+  // Latch onto the live stitch while following: once a turn has produced live
+  // data we keep showing it (frozen, then folded with DB exactness after it
+  // settles) and NEVER let the declarative DB fetch swap the store under us.
+  // That swap + the turn-list reload races were the churn ("before subagents →
+  // all → end"). DB-by-store is only for an explicitly pinned turn/all, or the
+  // latest turn on a fresh idle open (no live data this session).
+  const showLive = following && (hasLiveData || liveTraceRef.current !== null)
+
+  const activeIndex =
+    selection === 'all' ? null : selection === 'latest' ? (latestIndex >= 0 ? latestIndex : null) : selection
+
+  const liveIndex = isLive && latestIndex >= 0 ? latestIndex : null
+
+  const { reloadTurns } = useTraceTurns(sessionId)
+  const { fetchTurn } = useTraceFetcher()
+
+  // DB fetch target: a pinned turn number, else the latest settled turn. Skipped
+  // entirely (undefined) while we render the live stitch or the whole session.
+  const dbTurnArg =
+    showLive || selection === 'all'
+      ? undefined
+      : typeof selection === 'number'
+        ? selection
+        : latestIndex >= 0
+          ? latestIndex
+          : undefined
+
+  useSessionTrace(sessionId, dbTurnArg)
+
+  // Drop the ephemeral hover/selection when the panel closes so a stale span
+  // can't auto-zoom the next time it opens.
+  useEffect(
+    () => () => {
+      $selectedSpanId.set(null)
+      $hoveredSpanId.set(null)
+    },
+    []
+  )
+
+  // Follow-latest on session switch.
+  useEffect(() => {
+    $traceSelection.set('latest')
+    setFoldedTrace(null)
+  }, [sessionId])
+
+  // A new live turn (or leaving 'latest') invalidates a folded snapshot.
+  useEffect(() => {
+    if (isLive || selection !== 'latest') {
+      setFoldedTrace(null)
+    }
+  }, [isLive, selection])
+
+  // While the turn streams, tick so running bars grow toward "now".
+  useEffect(() => {
+    if (!isLive) {
+      return
+    }
+
+    const id = window.setInterval(() => setNowMs(Date.now()), 400)
+
+    return () => window.clearInterval(id)
+  }, [isLive])
+
+  // Refresh the turn list on live edges so the nav reflects the in-flight /
+  // finished turn. Does NOT touch the displayed (live) trace.
+  const prevLive = useRef(isLive)
+  useEffect(() => {
+    if (isLive !== prevLive.current) {
+      void reloadTurns()
+    }
+
+    prevLive.current = isLive
+  }, [isLive, reloadTurns])
+
+  // Fold: once a followed turn settles, pull its exact DB trace and rebase it
+  // onto the frozen live start, then swap. rebaseTrace keeps it on the same
+  // on-screen window, and the waterfall preserves the view across the tmap
+  // change, so the swap is seamless — approximate live bars become server-exact
+  // in place, with no reframe.
+  useEffect(() => {
+    if (!following || isLive || !sessionId || !liveTraceRef.current) {
+      return
+    }
+
+    const liveStart = liveTraceRef.current.start
+    let cancelled = false
+
+    void (async () => {
+      await reloadTurns()
+      const idx = $traceTurns.get().length - 1
+
+      if (idx < 0) {
+        return
+      }
+
+      const db = await fetchTurn(sessionId, idx)
+
+      if (!cancelled && db && db.spans.length > 0) {
+        setFoldedTrace(rebaseTrace(db, liveStart))
+      }
+    })()
+
+    return () => {
+      cancelled = true
+    }
+  }, [following, isLive, sessionId, reloadTurns, fetchTurn])
+
+  const trace = useMemo<null | TraceDoc>(() => {
+    if (!showLive) {
+      return dbTrace
+    }
+
+    const lastUser = $messages.get().findLast(m => m.role === 'user' && !m.hidden)
+    const rootLabel = lastUser ? chatMessageText(lastUser).trim().slice(0, 80) : undefined
+
+    if (isLive) {
+      finalizedRef.current = false
+      liveTraceRef.current = buildLiveTrace(liveTurn, liveSubs, nowMs, rootLabel, true)
+
+      return liveTraceRef.current
+    }
+
+    // Settled & folded: server-exact spans rebased onto the live start.
+    if (foldedTrace) {
+      return foldedTrace
+    }
+
+    // Settled, fold not in yet: build the finalized snapshot once (running → ok,
+    // no pulse) and freeze it as the bridge until the DB fold lands.
+    if (!finalizedRef.current) {
+      liveTraceRef.current = buildLiveTrace(liveTurn, liveSubs, nowMs, rootLabel, false)
+      finalizedRef.current = true
+    }
+
+    return liveTraceRef.current
+  }, [showLive, isLive, dbTrace, liveTurn, liveSubs, nowMs, foldedTrace])
+
+  const selectTurn = (index: number) => $traceSelection.set(index === latestIndex ? 'latest' : index)
+
+  return { activeIndex, error, liveIndex, loading, selectTurn, selection, sessionId, trace }
+}
--- a/apps/desktop/src/app/agents/index.tsx
+++ b/apps/desktop/src/app/agents/index.tsx
@@ -1,398 +1,60 @@
-import { useStore } from '@nanostores/react'
-import { type ReactNode, useEffect, useMemo, useState } from 'react'
-
-import { useElapsedSeconds } from '@/components/chat/activity-timer'
-import { ActivityTimerText } from '@/components/chat/activity-timer-text'
-import { Codicon } from '@/components/ui/codicon'
-import { FadeText } from '@/components/ui/fade-text'
-import { GlyphSpinner } from '@/components/ui/glyph-spinner'
-import { type Translations, useI18n } from '@/i18n'
-import { AlertCircle, CheckCircle2 } from '@/lib/icons'
-import { useEnterAnimation } from '@/lib/use-enter-animation'
-import { cn } from '@/lib/utils'
-import {
-  $subagentsBySession,
-  allSubagents,
-  buildSubagentTree,
-  type SubagentNode,
-  type SubagentStatus,
-  type SubagentStreamEntry
-} from '@/store/subagents'
+import { $traceSelection } from '@/store/trace'

 import { OverlayView } from '../overlays/overlay-view'

-// Mirrors statusGlyph() in tool-fallback.tsx so subagent rows speak the
-// same visual vocabulary as the chat tool blocks.
-function statusGlyph(status: SubagentStatus, a: Translations['agents']): ReactNode {
-  if (status === 'running' || status === 'queued') {
-    return (
-      <GlyphSpinner
-        ariaLabel={a.running}
-        className="size-3.5 shrink-0 text-[0.95rem] text-muted-foreground/80"
-        spinner="breathe"
-      />
-    )
-  }
-
-  if (status === 'failed' || status === 'interrupted') {
-    return <AlertCircle aria-label={a.failed} className="size-3.5 shrink-0 text-destructive" />
-  }
-
-  return <CheckCircle2 aria-label={a.done} className="size-3.5 shrink-0 text-emerald-600/85 dark:text-emerald-400/85" />
-}
-
-const STREAM_TONE: Record<SubagentStreamEntry['kind'], string> = {
-  progress: 'text-muted-foreground/75',
-  summary: 'text-foreground/85',
-  thinking: 'text-muted-foreground/80',
-  tool: 'text-foreground/85'
-}
-
-function streamGlyph(entry: SubagentStreamEntry): ReactNode {
-  if (entry.isError) {
-    return <AlertCircle aria-hidden className="mt-0.5 size-3 shrink-0 text-destructive" />
-  }
-
-  if (entry.kind === 'tool') {
-    return <span aria-hidden className="mt-0.5 size-1.5 shrink-0 rounded-full bg-foreground/55" />
-  }
-
-  if (entry.kind === 'summary') {
-    return <CheckCircle2 aria-hidden className="mt-0.5 size-3 shrink-0 text-emerald-600/85 dark:text-emerald-400/85" />
-  }
-
-  if (entry.kind === 'thinking') {
-    return (
-      <span aria-hidden className="font-mono text-[0.7rem] leading-none text-muted-foreground/70">
-        …
-      </span>
-    )
-  }
-
-  return <span aria-hidden className="mt-0.5 size-1 shrink-0 rounded-full bg-muted-foreground/55" />
-}
+import { EmptyState } from './empty-state'
+import { fmtDuration } from './format'
+import { useTraceView } from './hooks/use-trace-view'
+import { SpanInspector } from './span-inspector'
+import { ROW_HEIGHT, TraceWaterfall } from './trace-waterfall'
+import { TurnStrip } from './turn-strip'

 interface AgentsViewProps {
  onClose: () => void
 }

 export function AgentsView({ onClose }: AgentsViewProps) {
-  const { t } = useI18n()
-  const subagentsBySession = useStore($subagentsBySession)
-
-  // Aggregate every session, matching the status-bar indicator — a subagent
-  // running in a background session must still be visible here, or the two
-  // desync ("Agents N running" vs an empty tree).
-  const tree = useMemo(() => buildSubagentTree(allSubagents(subagentsBySession)), [subagentsBySession])
+  const { activeIndex, error, liveIndex, loading, selectTurn, selection, sessionId, trace } = useTraceView()
+  const hasTrace = !!trace && trace.spans.length > 0

  return (
    <OverlayView
-      closeLabel={t.agents.close}
-      contentClassName="px-5 pt-5 pb-4 sm:px-6"
+      closeLabel="Close"
+      contentClassName="flex h-full flex-col px-4 py-4 sm:px-5"
      onClose={onClose}
-      rootClassName="mx-auto max-w-3xl"
+      rootClassName="mx-auto flex h-full w-full max-w-6xl flex-col"
    >
-      <header className="mb-3 shrink-0">
-        <h2 className="text-sm font-semibold text-foreground">{t.agents.title}</h2>
-        <p className="text-xs text-muted-foreground/80">{t.agents.subtitle}</p>
+      <header className="mb-2 flex shrink-0 items-center justify-between gap-3 pl-2">
+        <div className="min-w-0">
+          <h2 className="text-sm font-semibold text-foreground">Trace</h2>
+          <p className="truncate text-xs text-muted-foreground/80">
+            {sessionId ? `Execution waterfall · ${sessionId.slice(0, 16)}` : 'No active session'}
+            {hasTrace ? ` · ${trace.spans.length} spans · ${fmtDuration(trace.duration)}` : ''}
+          </p>
+        </div>
+        <TurnStrip
+          activeIndex={activeIndex}
+          allActive={selection === 'all'}
+          liveIndex={liveIndex}
+          onAll={() => $traceSelection.set('all')}
+          onTurn={selectTurn}
+        />
      </header>
-      <SubagentTree tree={tree} />
+
+      {hasTrace ? (
+        <div className="flex min-h-0 flex-1 gap-3 overflow-hidden">
+          <div className="flex min-w-0 flex-1 flex-col overflow-hidden">
+            <TraceWaterfall trace={trace} viewKey={`${sessionId ?? ''}:${selection}`} />
+          </div>
+          <div className="flex w-72 shrink-0 flex-col overflow-y-auto" style={{ paddingTop: ROW_HEIGHT }}>
+            <SpanInspector trace={trace} />
+          </div>
+        </div>
+      ) : error ? (
+        <EmptyState icon="warning" text={error} />
+      ) : (
+        <EmptyState icon={loading ? 'loading~spin' : 'hubot'} text={loading ? 'Loading trace…' : 'No trace yet'} />
+      )}
    </OverlayView>
  )
 }
-
-const fmtDuration = (seconds: number | undefined, a: Translations['agents']) => {
-  if (!seconds || seconds <= 0) {
-    return ''
-  }
-
-  if (seconds < 60) {
-    return a.durationSeconds(seconds.toFixed(1))
-  }
-
-  const m = Math.floor(seconds / 60)
-  const s = Math.round(seconds % 60)
-
-  return a.durationMinutes(m, s)
-}
-
-const fmtTokens = (value: number | undefined, a: Translations['agents']) => {
-  if (!value) {
-    return ''
-  }
-
-  return value >= 1000 ? a.tokensK((value / 1000).toFixed(1)) : a.tokens(value)
-}
-
-const fmtAge = (updatedAt: number, nowMs: number, a: Translations['agents']) => {
-  const s = Math.max(0, Math.round((nowMs - updatedAt) / 1000))
-
-  if (s < 2) {
-    return a.ageNow
-  }
-
-  if (s < 60) {
-    return a.ageSeconds(s)
-  }
-
-  const m = Math.floor(s / 60)
-
-  if (m < 60) {
-    return a.ageMinutes(m)
-  }
-
-  return a.ageHours(Math.floor(m / 60))
-}
-
-const flatten = (nodes: readonly SubagentNode[]): SubagentNode[] =>
-  nodes.flatMap(node => [node, ...flatten(node.children)])
-
-interface RootGroup {
-  id: string
-  delegationIndex: number
-  nodes: SubagentNode[]
-  taskCount: number
-}
-
-function groupDelegations(roots: readonly SubagentNode[]): RootGroup[] {
-  const groups: RootGroup[] = []
-  let n = 0
-
-  for (const node of roots) {
-    const prev = groups.at(-1)
-    const prevTail = prev?.nodes.at(-1)
-    const closeInTime = prevTail ? Math.abs(node.startedAt - prevTail.startedAt) <= 5_000 : false
-    const sameShape = prev && node.taskCount > 1 && prev.taskCount === node.taskCount
-    const uniqueStep = prev ? !prev.nodes.some(item => item.taskIndex === node.taskIndex) : false
-
-    if (prev && sameShape && closeInTime && uniqueStep) {
-      prev.nodes.push(node)
-
-      continue
-    }
-
-    if (node.taskCount > 1) {
-      n += 1
-      groups.push({ id: `delegation-${n}`, delegationIndex: n, nodes: [node], taskCount: node.taskCount })
-
-      continue
-    }
-
-    groups.push({ id: node.id, delegationIndex: 0, nodes: [node], taskCount: node.taskCount })
-  }
-
-  return groups
-}
-
-function SubagentTree({ tree }: { tree: SubagentNode[] }) {
-  const { t } = useI18n()
-  const flat = useMemo(() => flatten(tree), [tree])
-  const groups = useMemo(() => groupDelegations(tree), [tree])
-  const [nowMs, setNowMs] = useState(() => Date.now())
-
-  const active = flat.filter(n => n.status === 'running' || n.status === 'queued').length
-  const failed = flat.filter(n => n.status === 'failed' || n.status === 'interrupted').length
-  const tools = flat.reduce((sum, n) => sum + (n.toolCount ?? 0), 0)
-  const files = flat.reduce((sum, n) => sum + n.filesRead.length + n.filesWritten.length, 0)
-  const tokens = flat.reduce((sum, n) => sum + (n.inputTokens ?? 0) + (n.outputTokens ?? 0), 0)
-  const cost = flat.reduce((sum, n) => sum + (n.costUsd ?? 0), 0)
-
-  useEffect(() => {
-    if (active <= 0 || typeof window === 'undefined') {
-      return
-    }
-
-    const id = window.setInterval(() => setNowMs(Date.now()), 500)
-
-    return () => window.clearInterval(id)
-  }, [active])
-
-  if (tree.length === 0) {
-    return (
-      <div className="grid place-items-center gap-3 py-12 text-center">
-        <Codicon className="text-muted-foreground/60" name="hubot" size="1.5rem" />
-        <p className="text-sm font-medium text-foreground/90">{t.agents.emptyTitle}</p>
-        <p className="max-w-md text-xs leading-relaxed text-muted-foreground/75">{t.agents.emptyDesc}</p>
-      </div>
-    )
-  }
-
-  const summary = [
-    t.agents.agentsCount(flat.length),
-    active > 0 ? t.agents.activeCount(active) : '',
-    failed > 0 ? t.agents.failedCount(failed) : '',
-    tools > 0 ? t.agents.toolsCount(tools) : '',
-    files > 0 ? t.agents.filesCount(files) : '',
-    tokens > 0 ? fmtTokens(tokens, t.agents) : '',
-    cost > 0 ? `$${cost.toFixed(2)}` : ''
-  ].filter(Boolean)
-
-  return (
-    <div className="flex min-h-0 min-w-0 flex-1 flex-col gap-4 overflow-hidden">
-      <p className="shrink-0 text-[0.7rem] text-muted-foreground/70">{summary.join(' · ')}</p>
-      <div className="min-h-0 min-w-0 flex-1 overflow-x-hidden overflow-y-auto overscroll-contain pr-1">
-        <div className="flex min-w-0 flex-col gap-6">
-          {groups.map(group => (
-            <DelegationGroup group={group} key={group.id} nowMs={nowMs} />
-          ))}
-        </div>
-      </div>
-    </div>
-  )
-}
-
-function DelegationGroup({ group, nowMs }: { group: RootGroup; nowMs: number }) {
-  const { t } = useI18n()
-
-  if (group.nodes.length === 1 && group.taskCount <= 1) {
-    return <SubagentRow node={group.nodes[0]!} nowMs={nowMs} />
-  }
-
-  const activeWorkers = group.nodes.filter(n => n.status === 'running' || n.status === 'queued').length
-
-  return (
-    <section className="grid min-w-0 gap-3">
-      <p className="text-[0.66rem] font-medium uppercase tracking-wider text-muted-foreground/70">
-        {group.delegationIndex > 0 ? t.agents.delegation(group.delegationIndex) : ''}{' '}
-        <span className="text-muted-foreground/50">·</span> {t.agents.workers(group.nodes.length)}
-        {activeWorkers > 0 ? <span className="text-primary/85"> · {t.agents.workersActive(activeWorkers)}</span> : null}
-      </p>
-      <div className="grid min-w-0 gap-4">
-        {group.nodes.map(node => (
-          <SubagentRow key={node.id} node={node} nowMs={nowMs} />
-        ))}
-      </div>
-    </section>
-  )
-}
-
-function StreamLine({
-  active,
-  entry,
-  parentRunning,
-  rowKey
-}: {
-  active: boolean
-  entry: SubagentStreamEntry
-  parentRunning: boolean
-  rowKey: string
-}) {
-  const { t } = useI18n()
-  const enterRef = useEnterAnimation(parentRunning, `subagent-stream:${rowKey}`)
-  const isMono = entry.kind === 'tool'
-  const tone = entry.isError ? 'text-destructive' : STREAM_TONE[entry.kind]
-
-  return (
-    <div className="flex min-w-0 items-baseline gap-2 text-[0.72rem] leading-relaxed" ref={enterRef}>
-      <span className="flex h-[0.95rem] shrink-0 items-center">{streamGlyph(entry)}</span>
-      <span className={cn('min-w-0 flex-1 wrap-anywhere', tone, isMono && 'font-mono text-[0.69rem]')}>
-        {entry.text}
-        {active ? (
-          <GlyphSpinner
-            ariaLabel={t.agents.streaming}
-            className="ml-1 inline-block size-2.5 align-middle text-muted-foreground/70"
-            spinner="breathe"
-          />
-        ) : null}
-      </span>
-    </div>
-  )
-}
-
-function SubagentRow({ node, depth = 0, nowMs }: { node: SubagentNode; depth?: number; nowMs: number }) {
-  const { t } = useI18n()
-  const running = node.status === 'running' || node.status === 'queued'
-  const elapsed = useElapsedSeconds(running, `subagent:${node.id}`)
-
-  const durationSeconds =
-    typeof node.durationSeconds === 'number' ? Math.max(0, Math.round(node.durationSeconds)) : elapsed
-
-  const [open, setOpen] = useState(() => running || depth < 2)
-  const enterRef = useEnterAnimation(true, `subagent-row:${node.id}`)
-
-  useEffect(() => {
-    if (running) {
-      setOpen(true)
-    }
-  }, [running])
-
-  const visibleRows = open ? node.stream.slice(-10) : node.stream.slice(-2)
-  const fileLines = [...node.filesWritten.map(p => `+ ${p}`), ...node.filesRead.map(p => `· ${p}`)]
-
-  const subtitle = [
-    node.model,
-    fmtDuration(durationSeconds, t.agents),
-    node.toolCount ? t.agents.toolsCount(node.toolCount) : '',
-    fmtTokens((node.inputTokens ?? 0) + (node.outputTokens ?? 0), t.agents),
-    t.agents.updatedAgo(fmtAge(node.updatedAt, nowMs, t.agents))
-  ].filter(Boolean)
-
-  return (
-    <div className={cn('grid min-w-0 max-w-full gap-2', depth > 0 && 'pl-4')} data-slot="tool-block" ref={enterRef}>
-      <button
-        aria-expanded={open}
-        className="group flex w-full min-w-0 items-start gap-2.5 text-left"
-        onClick={() => setOpen(v => !v)}
-        type="button"
-      >
-        <span className="mt-0.5 flex h-[1.1rem] shrink-0 items-center">{statusGlyph(node.status, t.agents)}</span>
-        <span className="flex min-w-0 flex-1 flex-col gap-0.5">
-          <span
-            className={cn(
-              'wrap-anywhere text-[0.82rem] font-medium leading-[1.1rem] text-foreground/90 transition-colors group-hover:text-foreground',
-              running && 'shimmer text-foreground/65'
-            )}
-          >
-            {node.goal}
-          </span>
-          {subtitle.length > 0 ? (
-            <FadeText className="text-[0.66rem] leading-[1.05rem] text-muted-foreground/65">
-              {subtitle.join(' · ')}
-            </FadeText>
-          ) : null}
-        </span>
-        {running ? <ActivityTimerText className="mt-1 shrink-0 text-[0.6rem]" seconds={durationSeconds} /> : null}
-      </button>
-
-      {visibleRows.length > 0 ? (
-        <div className="grid min-w-0 gap-1 pl-6" data-selectable-text="true">
-          {visibleRows.map((entry, i) => (
-            <StreamLine
-              active={running && i === visibleRows.length - 1}
-              entry={entry}
-              key={`${entry.kind}:${entry.at}:${i}`}
-              parentRunning={running}
-              rowKey={`${node.id}:${entry.kind}:${entry.at}`}
-            />
-          ))}
-        </div>
-      ) : null}
-
-      {open && fileLines.length > 0 ? (
-        <div className="grid min-w-0 gap-0.5 pl-6" data-selectable-text="true">
-          <p className="text-[0.58rem] font-medium tracking-wider text-muted-foreground/60 uppercase">
-            {t.agents.files}
-          </p>
-          {fileLines.slice(0, 8).map(line => (
-            <p className="wrap-break-word font-mono text-[0.67rem] leading-relaxed text-muted-foreground/80" key={line}>
-              {line}
-            </p>
-          ))}
-          {fileLines.length > 8 ? (
-            <p className="font-mono text-[0.67rem] leading-relaxed text-muted-foreground/65">
-              {t.agents.moreFiles(fileLines.length - 8)}
-            </p>
-          ) : null}
-        </div>
-      ) : null}
-
-      {node.children.length > 0 ? (
-        <div className="grid min-w-0 gap-3 pl-6">
-          {node.children.map(child => (
-            <SubagentRow depth={depth + 1} key={child.id} node={child} nowMs={nowMs} />
-          ))}
-        </div>
-      ) : null}
-    </div>
-  )
-}
--- a/apps/desktop/src/app/agents/span-inspector.tsx
+++ b/apps/desktop/src/app/agents/span-inspector.tsx
@@ -0,0 +1,110 @@
+import { useStore } from '@nanostores/react'
+import { useMemo } from 'react'
+
+import { $hoveredSpanId, $selectedSpanId, type TraceDoc } from '@/store/trace'
+
+import { fmtDuration } from './format'
+import { ROW_HEIGHT } from './trace-waterfall'
+
+const fmtInt = (n: number) => n.toLocaleString()
+
+export function SpanInspector({ trace }: { trace: null | TraceDoc }) {
+  const selectedId = useStore($selectedSpanId)
+  const hoveredId = useStore($hoveredSpanId)
+  // Hover previews; the clicked span stays pinned when nothing is hovered.
+  const activeId = hoveredId ?? selectedId
+
+  const span = useMemo(() => trace?.spans.find(s => s.id === activeId) ?? null, [trace, activeId])
+
+  if (!span) {
+    return (
+      <div className="flex items-center text-[0.7rem] text-muted-foreground/55" style={{ height: ROW_HEIGHT }}>
+        Select a span to inspect its details.
+      </div>
+    )
+  }
+
+  const attrs = span.attributes
+  const num = (key: string) => (typeof attrs[key] === 'number' ? (attrs[key] as number) : undefined)
+
+  const meta: [string, string][] = [['kind', span.kind], ['status', span.status]]
+
+  // Where the span sits in the trace, then how long it ran.
+  if (trace) {
+    meta.push(['started', `+${fmtDuration(Math.max(0, span.start - trace.start))}`])
+  }
+
+  meta.push(['duration', fmtDuration(span.duration)])
+
+  // Push an attribute row when present; numbers are thousands-formatted.
+  const push = (label: string, key: string) => {
+    const v = attrs[key]
+
+    if (v !== undefined && v !== null && v !== '') {
+      meta.push([label, typeof v === 'number' ? fmtInt(v) : String(v)])
+    }
+  }
+
+  push('model', 'llm.model_name')
+  push('tokens in', 'llm.token_count.prompt')
+  push('tokens out', 'llm.token_count.completion')
+
+  const treason = num('llm.token_count.reasoning')
+
+  if (treason) {
+    meta.push(['reasoning', fmtInt(treason)])
+  }
+
+  const tin = num('llm.token_count.prompt')
+  const tout = num('llm.token_count.completion')
+
+  if (tin !== undefined || tout !== undefined) {
+    meta.push(['tokens total', fmtInt((tin ?? 0) + (tout ?? 0) + (treason ?? 0))])
+  }
+
+  push('finish', 'hermes.finish_reason')
+  push('tool', 'tool.name')
+  // Container (AGENT) spans carry session shape; subagents expose their id.
+  push('source', 'session.source')
+  push('messages', 'session.message_count')
+  push('tool calls', 'session.tool_call_count')
+
+  if (span.sessionId) {
+    meta.push(['session', span.sessionId.slice(0, 12)])
+  }
+
+  const input = attrs['input.value']
+  const output = attrs['output.value']
+
+  return (
+    <div className="flex flex-col gap-3 pb-3">
+      <p
+        className="flex items-center text-[0.82rem] font-medium break-words text-foreground/90"
+        style={{ minHeight: ROW_HEIGHT }}
+      >
+        {span.name}
+      </p>
+      <dl className="grid grid-cols-[6rem_1fr] gap-x-3 gap-y-1 text-[0.7rem]">
+        {meta.map(([k, v]) => (
+          <div className="contents" key={k}>
+            <dt className="truncate text-muted-foreground/55">{k}</dt>
+            <dd className="min-w-0 break-words text-foreground/85">{v}</dd>
+          </div>
+        ))}
+      </dl>
+      {input ? <InspectorBlock label="input" value={String(input)} /> : null}
+      {output ? <InspectorBlock label="output" value={String(output)} /> : null}
+    </div>
+  )
+}
+
+function InspectorBlock({ label, value }: { label: string; value: string }) {
+  return (
+    <div className="flex min-w-0 flex-col gap-1">
+      <span className="text-[0.6rem] font-medium tracking-wider text-muted-foreground/50 uppercase">{label}</span>
+      <pre className="max-h-40 overflow-auto rounded bg-foreground/5 p-2 text-[0.66rem] break-words whitespace-pre-wrap text-foreground/80">
+        {value}
+      </pre>
+    </div>
+  )
+}
--- a/apps/desktop/src/app/agents/span-style.ts
+++ b/apps/desktop/src/app/agents/span-style.ts
@@ -0,0 +1,56 @@
+import { toolIconName } from '@/components/assistant-ui/tool-fallback-model'
+import type { TraceSpanKind, TraceSpanNode } from '@/store/trace'
+
+// Category → bar classes. Tailwind -500 family reads at even weight on dark;
+// error tint overrides downstream.
+const KIND_BAR: Record<TraceSpanKind, string> = {
+  AGENT: 'bg-violet-500/70',
+  CHAIN: 'bg-slate-500/70',
+  LLM: 'bg-sky-500/70',
+  TOOL: 'bg-emerald-500/70'
+}
+
+const TOOL_BAR: Record<string, string> = {
+  delegate_task: 'bg-violet-500/70',
+  patch: 'bg-amber-500/70',
+  read_file: 'bg-cyan-500/70',
+  search_files: 'bg-cyan-500/70',
+  terminal: 'bg-zinc-400/70',
+  web_extract: 'bg-teal-500/70',
+  web_search: 'bg-teal-500/70',
+  write_file: 'bg-amber-500/70'
+}
+
+/** Bar color for a span: error tint wins, then per-tool, then per-kind. */
+export function barClass(node: TraceSpanNode): string {
+  if (node.status === 'error') {
+    return 'bg-red-500/80'
+  }
+
+  const tool = String(node.attributes['tool.name'] ?? '')
+
+  if (node.kind === 'TOOL' && TOOL_BAR[tool]) {
+    return TOOL_BAR[tool]
+  }
+
+  return KIND_BAR[node.kind]
+}
+
+/** Icon for a span, reusing the thread's tool icons so the trace matches what
+ *  the conversation shows mid-thread. */
+export function spanIconName(node: TraceSpanNode): string {
+  if (node.kind === 'TOOL') {
+    return toolIconName(String(node.attributes['tool.name'] ?? 'tools'))
+  }
+
+  // The root span is the user's turn (its label is the prompt) — mark it human.
+  if (node.kind === 'AGENT' && node.parentId === null) {
+    return 'account'
+  }
+
+  if (node.kind === 'AGENT' || node.kind === 'LLM') {
+    return 'hubot'
+  }
+
+  return 'list-tree'
+}
--- a/apps/desktop/src/app/agents/time-map.test.ts
+++ b/apps/desktop/src/app/agents/time-map.test.ts
@@ -0,0 +1,67 @@
+import { describe, expect, it } from 'vitest'
+
+import type { TraceSpanNode } from '@/store/trace'
+
+import { buildTimeMap, niceTicks } from './time-map'
+
+function node(kind: TraceSpanNode['kind'], start: number, end: number): TraceSpanNode {
+  return {
+    id: `${kind}:${start}`,
+    parentId: null,
+    name: kind,
+    kind,
+    start,
+    end,
+    duration: end - start,
+    status: 'ok',
+    sessionId: null,
+    attributes: {},
+    depth: 0,
+    children: []
+  }
+}
+
+describe('buildTimeMap', () => {
+  it('maps busy time 1:1 and compresses idle gaps', () => {
+    // Two 1s busy spans separated by a 100s idle gap.
+    const nodes = [node('TOOL', 0, 1), node('TOOL', 101, 102)]
+    const map = buildTimeMap(nodes, 0, 102)
+
+    // Endpoints anchor.
+    expect(map.toV(0)).toBe(0)
+    expect(map.toReal(0)).toBe(0)
+    expect(map.toReal(map.totalV)).toBeCloseTo(102)
+
+    // The 100s gap is collapsed: virtual width ≪ real width.
+    expect(map.totalV).toBeLessThan(20)
+    expect(map.gaps).toHaveLength(1)
+    expect(map.gaps[0]!.r1 - map.gaps[0]!.r0).toBeCloseTo(100)
+  })
+
+  it('round-trips real↔virtual within busy regions', () => {
+    const map = buildTimeMap([node('LLM', 10, 20)], 10, 20)
+
+    expect(map.toReal(map.toV(15))).toBeCloseTo(15)
+  })
+
+  it('survives degenerate input (only container spans)', () => {
+    const map = buildTimeMap([node('AGENT', 0, 0)], 0, 0)
+
+    expect(map.totalV).toBeGreaterThan(0)
+    expect(Number.isFinite(map.toReal(0))).toBe(true)
+  })
+})
+
+describe('niceTicks', () => {
+  it('returns ascending ticks covering the range', () => {
+    const ticks = niceTicks(0, 10)
+
+    expect(ticks.length).toBeGreaterThan(1)
+    expect(ticks).toEqual([...ticks].sort((a, b) => a - b))
+    expect(ticks.at(-1)!).toBeLessThanOrEqual(10)
+  })
+
+  it('degenerates safely on a zero-width range', () => {
+    expect(niceTicks(5, 5)).toEqual([5])
+  })
+})
--- a/apps/desktop/src/app/agents/time-map.ts
+++ b/apps/desktop/src/app/agents/time-map.ts
@@ -0,0 +1,146 @@
+import type { TraceSpanNode } from '@/store/trace'
+
+// Time compression. "All" spans a whole session, mostly idle between turns. We
+// build a compressed coordinate space (virtual units) where idle gaps collapse
+// to a fixed width, and route positioning / zoom / ticks through it — the
+// approach Sentry's compressed trace timeline uses. Real busy time maps 1:1;
+// long idle gaps shrink and get a marker.
+
+export interface TimeSeg {
+  gap: boolean
+  r0: number
+  r1: number
+  v0: number
+  v1: number
+}
+
+export interface TimeMap {
+  gaps: TimeSeg[]
+  toReal: (v: number) => number
+  toV: (t: number) => number
+  totalV: number
+}
+
+export function buildTimeMap(nodes: TraceSpanNode[], fullStart: number, fullEnd: number): TimeMap {
+  // Busy = actual work (LLM/TOOL). AGENT/CHAIN containers span whole turns
+  // including idle, so they'd hide every gap — exclude them from gap detection.
+  const busy = nodes
+    .filter(n => n.kind === 'LLM' || n.kind === 'TOOL')
+    .map(n => [n.start, Math.max(n.end, n.start)] as [number, number])
+    .sort((a, b) => a[0] - b[0])
+
+  const merged: [number, number][] = []
+
+  for (const [s, e] of busy) {
+    const last = merged.at(-1)
+
+    if (last && s <= last[1]) {
+      last[1] = Math.max(last[1], e)
+    } else {
+      merged.push([s, e])
+    }
+  }
+
+  if (merged.length === 0) {
+    merged.push([fullStart, fullEnd])
+  }
+
+  const activeTotal = merged.reduce((sum, [s, e]) => sum + (e - s), 0) || 1
+  const compressed = Math.min(Math.max(activeTotal * 0.04, 0.5), 4)
+
+  const segs: TimeSeg[] = []
+  let v = 0
+
+  const add = (r0: number, r1: number, gap: boolean) => {
+    if (r1 <= r0) {
+      return
+    }
+
+    const vlen = gap ? Math.min(r1 - r0, compressed) : r1 - r0
+    segs.push({ gap, r0, r1, v0: v, v1: v + vlen })
+    v += vlen
+  }
+
+  let cursor = fullStart
+
+  if (merged[0]![0] > cursor) {
+    add(cursor, merged[0]![0], true)
+    cursor = merged[0]![0]
+  }
+
+  for (let i = 0; i < merged.length; i++) {
+    const [s, e] = merged[i]!
+    add(Math.max(s, cursor), e, false)
+    cursor = Math.max(cursor, e)
+    const next = merged[i + 1]
+
+    if (next && next[0] > cursor) {
+      add(cursor, next[0], true)
+      cursor = next[0]
+    }
+  }
+
+  if (fullEnd > cursor) {
+    add(cursor, fullEnd, true)
+  }
+
+  // Degenerate input (e.g. a live trace with only AGENT spans, or a zero-width
+  // range at spawn) yields no segments — guarantee at least one so toV/toReal
+  // never index into an empty array.
+  if (segs.length === 0) {
+    const r1 = Math.max(fullEnd, fullStart + 0.001)
+    segs.push({ gap: false, r0: fullStart, r1, v0: 0, v1: r1 - fullStart })
+    v = r1 - fullStart
+  }
+
+  const totalV = v || 1
+
+  const toV = (t: number) => {
+    if (t <= segs[0]!.r0) {
+      return 0
+    }
+
+    for (const s of segs) {
+      if (t <= s.r1) {
+        return s.v0 + ((t - s.r0) / (s.r1 - s.r0 || 1)) * (s.v1 - s.v0)
+      }
+    }
+
+    return totalV
+  }
+
+  const toReal = (vv: number) => {
+    for (const s of segs) {
+      if (vv <= s.v1) {
+        return s.r0 + ((vv - s.v0) / (s.v1 - s.v0 || 1)) * (s.r1 - s.r0)
+      }
+    }
+
+    return fullEnd
+  }
+
+  const gaps = segs.filter(s => s.gap && s.r1 - s.r0 > s.v1 - s.v0 + 1e-6)
+
+  return { gaps, toReal, toV, totalV }
+}
+
+/** "Nice" axis ticks (1/2/5 × 10ⁿ) covering [start, end] in virtual units. */
+export function niceTicks(start: number, end: number, target = 6): number[] {
+  const span = end - start
+
+  if (span <= 0) {
+    return [start]
+  }
+
+  const raw = span / target
+  const mag = 10 ** Math.floor(Math.log10(raw))
+  const norm = raw / mag
+  const step = (norm >= 5 ? 5 : norm >= 2 ? 2 : 1) * mag
+  const ticks: number[] = []
+
+  for (let t = Math.ceil(start / step) * step; t <= end; t += step) {
+    ticks.push(t)
+  }
+
+  return ticks
+}
--- a/apps/desktop/src/app/agents/trace-waterfall.tsx
+++ b/apps/desktop/src/app/agents/trace-waterfall.tsx
@@ -0,0 +1,544 @@
+import { useStore } from '@nanostores/react'
+import { scaleLinear } from 'd3-scale'
+import { select } from 'd3-selection'
+import {
+  zoom as d3Zoom,
+  type D3ZoomEvent,
+  type ZoomBehavior,
+  zoomIdentity,
+  type ZoomTransform,
+  zoomTransform
+} from 'd3-zoom'
+import { useEffect, useLayoutEffect, useMemo, useRef, useState } from 'react'
+
+import { Codicon } from '@/components/ui/codicon'
+import { ToolIcon } from '@/components/ui/tool-icon'
+import { cn } from '@/lib/utils'
+import {
+  $hoveredSpanId,
+  $selectedSpanId,
+  $traceLabelsCollapsed,
+  clearHoveredSpan,
+  flattenSpanTree,
+  type TraceDoc,
+  type TraceSpanNode
+} from '@/store/trace'
+
+import { fmtDuration } from './format'
+import { barClass, spanIconName } from './span-style'
+import { buildTimeMap, niceTicks } from './time-map'
+
+export const ROW_HEIGHT = 26
+const LABEL_WIDTH = 280
+const LABEL_MAX_WIDTH = 240
+const MAX_ZOOM = 5000
+
+export function TraceWaterfall({ trace, viewKey }: { trace: TraceDoc; viewKey: string }) {
+  const nodes = useMemo(() => flattenSpanTree(trace), [trace])
+  const selectedId = useStore($selectedSpanId)
+  const hoveredId = useStore($hoveredSpanId)
+  const collapsed = useStore($traceLabelsCollapsed)
+  const trackRef = useRef<HTMLDivElement>(null)
+  const bodyRef = useRef<HTMLDivElement>(null)
+  const zoomRef = useRef<ZoomBehavior<HTMLDivElement, unknown> | null>(null)
+  const [size, setSize] = useState({ height: 0, width: 0 })
+  const [availHeight, setAvailHeight] = useState(0)
+  const [transform, setTransform] = useState<ZoomTransform>(zoomIdentity)
+
+  // Measure the track so the time scale + zoom extents track its real width.
+  useLayoutEffect(() => {
+    const el = trackRef.current
+
+    if (!el) {
+      return
+    }
+
+    const ro = new ResizeObserver(([entry]) => {
+      if (entry) {
+        setSize({ height: entry.contentRect.height, width: entry.contentRect.width })
+      }
+    })
+
+    ro.observe(el)
+
+    return () => ro.disconnect()
+  }, [])
+
+  // Available height of the scroll area, so the track can fill it when there are
+  // few rows (and still grow + scroll when there are many).
+  useLayoutEffect(() => {
+    const el = bodyRef.current
+
+    if (!el) {
+      return
+    }
+
+    const ro = new ResizeObserver(([entry]) => {
+      if (entry) {
+        setAvailHeight(entry.contentRect.height)
+      }
+    })
+
+    ro.observe(el)
+
+    return () => ro.disconnect()
+  }, [])
+
+  const tmap = useMemo(
+    () => buildTimeMap(nodes, trace.start, trace.end || trace.start + 1),
+    [nodes, trace.start, trace.end]
+  )
+
+  const xScale = useMemo(
+    () => scaleLinear().domain([0, tmap.totalV]).range([0, Math.max(1, size.width)]),
+    [tmap.totalV, size.width]
+  )
+
+  // d3-zoom owns drag-pan + click/drag separation; we drive the wheel ourselves.
+  // Attached ONCE on mount (not gated on measured size) so the gesture is live
+  // immediately — gating on size.width was the regression that "lost" zoom when
+  // the grid delayed the first measurement.
+  useEffect(() => {
+    const el = trackRef.current
+
+    if (!el) {
+      return
+    }
+
+    const behavior = d3Zoom<HTMLDivElement, unknown>()
+      .scaleExtent([1, MAX_ZOOM])
+      .clickDistance(4)
+      .filter((event: Event) => event.type !== 'wheel' && !(event as MouseEvent).button)
+      .on('zoom', (event: D3ZoomEvent<HTMLDivElement, unknown>) => setTransform(event.transform))
+
+    zoomRef.current = behavior
+    const sel = select(el)
+    sel.call(behavior)
+    sel.on('dblclick.zoom', null)
+
+    // Wheel routing on the track itself:
+    //   ⌘/Ctrl + wheel → zoom toward the cursor · horizontal/shift wheel → pan
+    //   time · plain vertical wheel → fall through so the row list scrolls.
+    const onWheel = (e: WheelEvent) => {
+      if (e.ctrlKey || e.metaKey) {
+        e.preventDefault()
+        e.stopPropagation()
+        const px = e.clientX - el.getBoundingClientRect().left
+        behavior.scaleBy(select(el), Math.exp(-e.deltaY * 0.002), [px, 0])
+
+        return
+      }
+
+      const horizontal = Math.abs(e.deltaX) > Math.abs(e.deltaY)
+      const dx = horizontal ? e.deltaX : e.shiftKey ? e.deltaY : 0
+
+      if (!dx) {
+        return // plain vertical → let the list scroll
+      }
+
+      e.preventDefault()
+      e.stopPropagation()
+      behavior.translateBy(select(el), -dx / zoomTransform(el).k, 0)
+    }
+
+    el.addEventListener('wheel', onWheel, { passive: false })
+
+    return () => {
+      sel.on('.zoom', null)
+      el.removeEventListener('wheel', onWheel)
+    }
+  }, [])
+
+  // Vertical drag scrubs the actual scroll container (the body owns row
+  // position; labels + canvas stay in sync because they share this scroller).
+  useEffect(() => {
+    const scroller = bodyRef.current
+
+    if (!scroller) {
+      return
+    }
+
+    let lastY: null | number = null
+
+    const stop = () => {
+      lastY = null
+      window.removeEventListener('mousemove', move, { capture: true })
+      window.removeEventListener('mouseup', stop, { capture: true })
+    }
+
+    const move = (e: MouseEvent) => {
+      if (!(e.buttons & 1) || lastY === null) {
+        stop()
+
+        return
+      }
+
+      scroller.scrollTop -= e.clientY - lastY
+      lastY = e.clientY
+    }
+
+    const start = (e: MouseEvent) => {
+      if (e.button !== 0) {
+        return
+      }
+
+      lastY = e.clientY
+      window.addEventListener('mousemove', move, { capture: true })
+      window.addEventListener('mouseup', stop, { capture: true })
+    }
+
+    scroller.addEventListener('mousedown', start, { capture: true })
+
+    return () => {
+      scroller.removeEventListener('mousedown', start, { capture: true })
+      stop()
+    }
+  }, [])
+
+  // Keep the pan/zoom extents in sync with the measured track size.
+  useEffect(() => {
+    const behavior = zoomRef.current
+
+    if (!behavior || size.width === 0) {
+      return
+    }
+
+    behavior
+      .extent([
+        [0, 0],
+        [size.width, size.height]
+      ])
+      .translateExtent([
+        [0, 0],
+        [size.width, size.height]
+      ])
+  }, [size.width, size.height])
+
+  // Reset the viewport only on an explicit navigation (session switch or pinning
+  // a different turn) — NOT when the followed turn settles live→DB, and not on
+  // live ticks. Keeps the user's zoom/pan; no auto-nav.
+  useEffect(() => {
+    const el = trackRef.current
+
+    if (el && zoomRef.current) {
+      select(el).call(zoomRef.current.transform, zoomIdentity)
+    } else {
+      setTransform(zoomIdentity)
+    }
+  }, [viewKey])
+
+  // Preserve the visible *real-time* window across tmap changes that AREN'T a
+  // nav (same viewKey): live ticks growing the trace, and the live→DB fold where
+  // the compressed axis (totalV) shifts. Without this, a zoomed-in view would
+  // drift each tick and the fold would reframe/jump. We translate the OLD view's
+  // edges back to real time, re-project them through the NEW map, and re-apply
+  // the transform — so the same span stays put while the bars get more exact.
+  const prevTmapRef = useRef(tmap)
+  const prevViewKeyRef = useRef(viewKey)
+  useLayoutEffect(() => {
+    const el = trackRef.current
+    const behavior = zoomRef.current
+    const width = size.width
+
+    // Nav reset owns viewKey changes; just resync our baselines and bail.
+    if (viewKey !== prevViewKeyRef.current) {
+      prevViewKeyRef.current = viewKey
+      prevTmapRef.current = tmap
+
+      return
+    }
+
+    const oldMap = prevTmapRef.current
+
+    if (oldMap === tmap || !el || !behavior || width === 0) {
+      prevTmapRef.current = tmap
+
+      return
+    }
+
+    const t = zoomTransform(el)
+    prevTmapRef.current = tmap
+
+    // Full view (identity) → keep showing the whole trace as it grows; nothing
+    // to preserve, and re-projecting would fight the natural "fit all".
+    if (t.k <= 1.0001 && Math.abs(t.x) < 0.5) {
+      return
+    }
+
+    const clampV = (v: number, total: number) => Math.max(0, Math.min(total, v))
+    const oldXs = scaleLinear().domain([0, oldMap.totalV]).range([0, width])
+    const zxOld = t.rescaleX(oldXs)
+    const realStart = oldMap.toReal(clampV(zxOld.invert(0), oldMap.totalV))
+    const realEnd = oldMap.toReal(clampV(zxOld.invert(width), oldMap.totalV))
+
+    const newXs = scaleLinear().domain([0, tmap.totalV]).range([0, width])
+    const bx0 = newXs(tmap.toV(realStart))
+    const bx1 = newXs(tmap.toV(realEnd))
+    const k = Math.max(1, Math.min(MAX_ZOOM, width / Math.max(1, bx1 - bx0)))
+    const x = Math.min(0, Math.max(width * (1 - k), -k * bx0))
+
+    select(el).call(behavior.transform, zoomIdentity.scale(k).translate(x / k, 0))
+  }, [tmap, viewKey, size.width])
+
+  // Latest nodes/tmap read via refs so the zoom-to-span effect can fire ONLY on
+  // a selection change — never on live ticks (which would re-snap the view).
+  const nodesRef = useRef(nodes)
+  nodesRef.current = nodes
+  const tmapRef = useRef(tmap)
+  tmapRef.current = tmap
+
+  // Selecting a span (clicking a row) zooms the timeline to frame it (~70%) and
+  // scrolls to it. Keyed on the selection alone so it never fights live updates.
+  useEffect(() => {
+    const el = trackRef.current
+    const behavior = zoomRef.current
+
+    if (!el || !behavior || !selectedId) {
+      return
+    }
+
+    const span = nodesRef.current.find(n => n.id === selectedId)
+    const width = el.clientWidth
+
+    if (!span || !width) {
+      return
+    }
+
+    const map = tmapRef.current
+    const xs = scaleLinear().domain([0, map.totalV]).range([0, width])
+    const bx0 = xs(map.toV(span.start))
+    const bx1 = xs(map.toV(span.end))
+    const bw = Math.max(2, bx1 - bx0)
+    const k = Math.max(1, Math.min(MAX_ZOOM, (width * 0.7) / bw))
+    const center = (bx0 + bx1) / 2
+    // Clamp the translate so the view can't slide past the start (left gap) or
+    // the end — matching d3's translateExtent, but on the target so there's no
+    // "show gap then snap back".
+    const x = Math.min(0, Math.max(width * (1 - k), width / 2 - k * center))
+    const next = zoomIdentity.scale(k).translate(x / k, 0)
+    select(el).transition().duration(250).call(behavior.transform, next)
+  }, [selectedId])
+
+  const view = useMemo(() => {
+    if (size.width === 0) {
+      return { end: tmap.totalV, start: 0 }
+    }
+
+    const zx = transform.rescaleX(xScale)
+
+    return { end: zx.invert(size.width), start: zx.invert(0) }
+  }, [transform, xScale, size.width, tmap.totalV])
+
+  // view is in compressed (virtual) coordinates.
+  const viewSpan = Math.max(1e-6, view.end - view.start)
+  const ticks = useMemo(() => niceTicks(view.start, view.end), [view.start, view.end])
+  // pctV: a virtual coord → %; pct: a real time → % (via the compression map).
+  const pctV = (vv: number) => ((vv - view.start) / viewSpan) * 100
+  const pct = (t: number) => pctV(tmap.toV(t))
+
+  const resetView = () => {
+    const el = trackRef.current
+
+    if (el && zoomRef.current) {
+      select(el).transition().duration(200).call(zoomRef.current.transform, zoomIdentity)
+    }
+  }
+
+  // One column template shared by the ruler and the body so the time axis and
+  // the bars can never drift out of alignment. Column 1 is the label tree (a
+  // thin gutter when collapsed, just wide enough for the expand toggle); the
+  // `gap-x-2` between the columns separates labels from the chart.
+  const cols = `${collapsed ? '1.5rem' : `${LABEL_WIDTH}px`} minmax(0, 1fr)`
+  // Fill the scroll area when rows are few; grow + scroll when they're many.
+  const bodyHeight = Math.max(nodes.length * ROW_HEIGHT, availHeight)
+
+  return (
+    <div className="relative flex min-h-0 min-w-0 flex-1 flex-col overflow-hidden">
+      {/* Ruler — shares the body's column template; column 1 holds the collapse
+          toggle (in flow so it never overlaps labels), column 2 the time axis. */}
+      <div
+        className="grid shrink-0 items-center gap-x-2 text-[0.62rem] text-muted-foreground/70"
+        style={{ gridTemplateColumns: cols, height: ROW_HEIGHT }}
+      >
+        <div className="flex items-center justify-end">
+          <button
+            className="rounded p-0.5 text-muted-foreground/55 hover:bg-foreground/10 hover:text-foreground"
+            onClick={() => $traceLabelsCollapsed.set(!collapsed)}
+            title={collapsed ? 'Show labels' : 'Hide labels'}
+            type="button"
+          >
+            <Codicon name={collapsed ? 'chevron-right' : 'chevron-left'} size="0.9rem" />
+          </button>
+        </div>
+        <div className="relative h-full overflow-hidden">
+          {ticks.map(t => {
+            const left = pctV(t)
+
+            if (left < 0 || left > 100) {
+              return null
+            }
+
+            // Edge ticks align inward so the first/last label isn't sliced by the
+            // track's clip; interior ticks center on their gridline.
+            const edge = left < 4 ? 'left' : left > 96 ? 'right' : 'center'
+
+            return (
+              <span
+                className={cn(
+                  'absolute top-1/2 -translate-y-1/2 px-1 whitespace-nowrap tabular-nums',
+                  edge === 'left' ? 'translate-x-0' : edge === 'right' ? '-translate-x-full' : '-translate-x-1/2'
+                )}
+                key={t}
+                style={{ left: `${left}%` }}
+              >
+                {fmtDuration(tmap.toReal(t) - trace.start)}
+              </span>
+            )
+          })}
+        </div>
+      </div>
+
+      {/* Body — same column template; vertically scrolls labels + track together.
+          scrollbar-gutter stays stable so switching between a short turn (no
+          scrollbar) and the full session (scrollbar) never reflows the track. */}
+      <div
+        className="min-h-0 flex-1 overflow-y-auto overscroll-contain [scrollbar-gutter:stable]"
+        ref={bodyRef}
+      >
+        <div className="grid min-h-full gap-x-2" style={{ gridTemplateColumns: cols }}>
+          {/* Column 1: label index (empty cell when collapsed keeps the grid 2-wide) */}
+          {collapsed ? (
+            <div />
+          ) : (
+            <div>
+              {nodes.map(node => (
+                <SpanLabel active={node.id === selectedId || node.id === hoveredId} key={node.id} node={node} />
+              ))}
+            </div>
+          )}
+
+          {/* Column 2: time track */}
+          <div
+            className="relative cursor-grab touch-none overflow-hidden select-none active:cursor-grabbing"
+            onClick={() => $selectedSpanId.set(null)}
+            onDoubleClick={resetView}
+            ref={trackRef}
+            style={{ height: bodyHeight }}
+          >
+            {ticks.map(t => {
+              const left = pctV(t)
+
+              // Skip the flush-left gridline at t=0 — it reads as a border.
+              if (left <= 0.1 || left > 100) {
+                return null
+              }
+
+              return (
+                <div
+                  className="pointer-events-none absolute top-0 bottom-0 w-px bg-border/30"
+                  key={t}
+                  style={{ left: `${left}%` }}
+                />
+              )
+            })}
+
+            {/* Collapsed-idle markers: a dashed seam where dead time was removed. */}
+            {tmap.gaps.map(g => {
+              const left = pctV(g.v0)
+              const right = pctV(g.v1)
+
+              if (right < 0 || left > 100) {
+                return null
+              }
+
+              return (
+                <div
+                  className="pointer-events-none absolute top-0 bottom-0 flex items-start justify-center border-l border-dashed border-border/50 bg-foreground/[0.03]"
+                  key={`gap-${g.v0}`}
+                  style={{ left: `${left}%`, width: `${Math.max(0.4, right - left)}%` }}
+                >
+                  <span className="mt-0.5 rounded bg-background/70 px-1 text-[0.55rem] whitespace-nowrap text-muted-foreground/55">
+                    {fmtDuration(g.r1 - g.r0)}
+                  </span>
+                </div>
+              )
+            })}
+
+            {nodes.map((node, i) => {
+              const left = pct(node.start)
+              const width = (tmap.toV(node.end) - tmap.toV(node.start)) / viewSpan
+              const active = node.id === selectedId || node.id === hoveredId
+
+              return (
+                <div
+                  className={cn(
+                    'absolute right-0 left-0 transition-colors duration-100 ease-out hover:bg-foreground/[0.035] hover:transition-none',
+                    active && 'bg-foreground/[0.035]'
+                  )}
+                  key={node.id}
+                  onMouseEnter={() => $hoveredSpanId.set(node.id)}
+                  onMouseLeave={() => clearHoveredSpan(node.id)}
+                  style={{ height: ROW_HEIGHT, top: i * ROW_HEIGHT }}
+                >
+                  <button
+                    className={cn(
+                      'absolute top-1/2 h-4 -translate-y-1/2 overflow-hidden rounded-[3px] transition-[filter] hover:brightness-125',
+                      barClass(node),
+                      node.status === 'running' && 'animate-pulse',
+                      active && 'ring-1 ring-foreground/70'
+                    )}
+                    onClick={e => {
+                      e.stopPropagation()
+                      $selectedSpanId.set(node.id)
+                    }}
+                    style={{ left: `${left}%`, minWidth: 2, width: `${width * 100}%` }}
+                    type="button"
+                  />
+                  {/* On-lane label: name + duration, anchored at the bar start.
+                      Only when the start is in view AND the bar is wide enough to
+                      host it (or it's active) — otherwise sliver bars (e.g. a
+                      sub-second span pinned at t=0) leave a floating duration
+                      stuck at the left edge. The left tree always has the full
+                      label, so slivers lose nothing. */}
+                  {left >= 0 && left < 100 && (active || width * size.width >= 32) ? (
+                    <span
+                      className="pointer-events-none absolute inset-y-0 flex items-center gap-1.5 truncate pl-1.5 text-[0.6rem] text-white/85"
+                      style={{ left: `${left}%`, maxWidth: LABEL_MAX_WIDTH }}
+                    >
+                      <span className="truncate">{node.name}</span>
+                      <span className="shrink-0 text-white/55">{fmtDuration(node.duration)}</span>
+                    </span>
+                  ) : null}
+                </div>
+              )
+            })}
+          </div>
+        </div>
+      </div>
+    </div>
+  )
+}
+
+function SpanLabel({ active, node }: { active: boolean; node: TraceSpanNode }) {
+  return (
+    <button
+      className={cn(
+        'flex w-full items-center gap-1.5 truncate pr-2 pl-2 text-left text-[0.72rem] text-(--ui-text-secondary) transition-colors duration-100 ease-out hover:bg-(--ui-row-hover-background) hover:text-foreground hover:transition-none',
+        active && 'bg-(--ui-row-active-background) text-foreground'
+      )}
+      onClick={() => $selectedSpanId.set(node.id)}
+      onMouseEnter={() => $hoveredSpanId.set(node.id)}
+      onMouseLeave={() => clearHoveredSpan(node.id)}
+      style={{ height: ROW_HEIGHT, paddingLeft: 8 + node.depth * 14 }}
+      type="button"
+    >
+      <ToolIcon
+        className={cn('shrink-0', node.status === 'error' ? 'text-red-500' : 'text-muted-foreground/60')}
+        name={spanIconName(node)}
+        size="0.8rem"
+      />
+      <span className="min-w-0 flex-1 truncate text-foreground/85">{node.name}</span>
+      <span className="shrink-0 tabular-nums text-[0.62rem] text-muted-foreground/55">{fmtDuration(node.duration)}</span>
+    </button>
+  )
+}
--- a/apps/desktop/src/app/agents/turn-strip.tsx
+++ b/apps/desktop/src/app/agents/turn-strip.tsx
@@ -0,0 +1,73 @@
+import { useStore } from '@nanostores/react'
+
+import { cn } from '@/lib/utils'
+import { $traceTurns } from '@/store/trace'
+
+interface TurnStripProps {
+  activeIndex: null | number
+  allActive: boolean
+  liveIndex: null | number
+  onAll: () => void
+  onTurn: (index: number) => void
+}
+
+// Turn nav as a row of timeline bars (à la the thread timeline). Each button is
+// full header height (the hit target); the bar inside is short when inactive,
+// full height when active/live.
+export function TurnStrip({ activeIndex, allActive, liveIndex, onAll, onTurn }: TurnStripProps) {
+  const turns = useStore($traceTurns)
+
+  if (turns.length === 0) {
+    return null
+  }
+
+  return (
+    <div className="flex shrink-0 self-stretch items-center gap-2 overflow-x-auto pt-7">
+      <button
+        aria-label="All turns"
+        className={cn(
+          'flex h-full shrink-0 items-center gap-1 rounded px-1 text-[0.6rem] font-medium tracking-wide uppercase transition-colors',
+          allActive ? 'text-foreground' : 'text-muted-foreground/45 hover:text-foreground/80'
+        )}
+        onClick={onAll}
+        title="All turns"
+        type="button"
+      >
+        All
+      </button>
+      <div className="flex h-full items-center gap-px">
+        {turns.map(turn => {
+          const active = turn.index === activeIndex
+          const live = turn.index === liveIndex
+
+          return (
+            <button
+              aria-label={`Turn ${turn.index + 1}`}
+              className="group flex h-full items-center px-px"
+              key={turn.index}
+              onClick={() => onTurn(turn.index)}
+              onMouseEnter={() => onTurn(turn.index)}
+              title={`#${turn.index + 1} · ${turn.label}`}
+              type="button"
+            >
+              {/* Fixed-height box so the strip never grows when a bar activates;
+                  only the inner fill changes height. */}
+              <span className="flex h-4 w-[3px] items-center justify-center">
+                <span
+                  className={cn(
+                    'w-full rounded-full transition-all duration-100 ease-out group-hover:transition-none',
+                    live
+                      ? 'h-full animate-pulse bg-emerald-500'
+                      : active
+                        ? 'h-full bg-foreground'
+                        : 'h-1/2 bg-foreground/25 group-hover:h-3/4 group-hover:bg-foreground/50'
+                  )}
+                />
+              </span>
+            </button>
+          )
+        })}
+      </div>
+    </div>
+  )
+}
--- a/apps/desktop/src/app/overlays/overlay-view.tsx
+++ b/apps/desktop/src/app/overlays/overlay-view.tsx
@@ -49,7 +49,11 @@ export function OverlayView({

  return (
    <div
-      className="fixed inset-0 z-50 bg-black/22 p-3 backdrop-blur-[0.125rem] sm:p-6"
+      className={cn(
+        'fixed inset-0 z-50 bg-black/22 backdrop-blur-[0.125rem]',
+        'p-3 pt-[calc(var(--titlebar-height)+0.625rem)] pl-[max(0.75rem,calc(var(--titlebar-content-inset,0px)+0.25rem))]',
+        'sm:p-6 sm:pt-[calc(var(--titlebar-height)+0.875rem)] sm:pl-[max(1.5rem,calc(var(--titlebar-content-inset,0px)+0.25rem))]'
+      )}
      onClick={event => {
        if (event.target === event.currentTarget) {
          closeOverlay()
--- a/apps/desktop/src/app/session/hooks/use-message-stream.ts
+++ b/apps/desktop/src/app/session/hooks/use-message-stream.ts
@@ -31,6 +31,7 @@ import { setClarifyRequest } from '@/store/clarify'
 import { setSessionCompacting } from '@/store/compaction'
 import { refreshBackgroundProcesses } from '@/store/composer-status'
 import { $gateway } from '@/store/gateway'
+import { liveToolComplete, liveToolStart, liveTurnAppendText, liveTurnEnd, liveTurnStart } from '@/store/live-turn'
 import { dispatchNativeNotification } from '@/store/native-notifications'
 import { notify } from '@/store/notifications'
 import { requestDesktopOnboarding } from '@/store/onboarding'
@@ -53,7 +54,7 @@ import {
  setYoloActive
 } from '@/store/session'
 import { broadcastSessionsChanged } from '@/store/session-sync'
-import { clearSessionSubagents, pruneDelegateFallbackSubagents, upsertSubagent } from '@/store/subagents'
+import { pruneDelegateFallbackSubagents, upsertSubagent } from '@/store/subagents'
 import { setSessionTodos } from '@/store/todos'
 import { recordToolDiff } from '@/store/tool-diffs'
 import { notifyWorkspaceChanged, toolMayMutateFiles } from '@/store/workspace-events'
@@ -857,7 +858,11 @@ export function useMessageStream({
        }

        flushQueuedDeltas(sessionId)
-        clearSessionSubagents(sessionId)
+        // NOTE: subagents are cleared on the user's submit (the real turn
+        // boundary), NOT here — message.start also fires per assistant round and
+        // for synthetic re-entries (async-delegation completion / notifications),
+        // which must accumulate into the same turn, not wipe it.
+        liveTurnStart(sessionId)
        setSessionCompacting(sessionId, false)
        compactedTurnRef.current.delete(sessionId)
        nativeSubagentSessionsRef.current.delete(sessionId)
@@ -880,7 +885,9 @@ export function useMessageStream({
        }
      } else if (event.type === 'message.delta') {
        if (sessionId) {
-          appendAssistantDelta(sessionId, coerceGatewayText(payload?.text))
+          const text = coerceGatewayText(payload?.text)
+          appendAssistantDelta(sessionId, text)
+          liveTurnAppendText(sessionId, text)
        }
      } else if (event.type === 'thinking.delta') {
        // thinking.delta carries the kawaii spinner status (face + verb from
@@ -921,6 +928,7 @@ export function useMessageStream({

        const finalText = coerceGatewayText(payload?.text) || coerceGatewayText(payload?.rendered)
        completeAssistantMessage(sessionId, finalText)
+        liveTurnEnd(sessionId)

        if (isActiveEvent) {
          setTurnStartedAt(null)
@@ -961,6 +969,14 @@ export function useMessageStream({
        flushQueuedDeltas(sessionId)
        upsertToolCall(sessionId, toTodoPayload(payload) ?? payload, 'running', event.type)

+        if (event.type === 'tool.start') {
+          const toolId = String(payload?.tool_id ?? payload?.tool_call_id ?? payload?.id ?? payload?.name ?? '')
+
+          if (toolId) {
+            liveToolStart(sessionId, toolId, String(payload?.name ?? 'tool'))
+          }
+        }
+
        if (isActiveEvent) {
          setPetActivity({ reasoning: false, toolRunning: true })
        }
@@ -969,6 +985,12 @@ export function useMessageStream({
          flushQueuedDeltas(sessionId)
          upsertToolCall(sessionId, toTodoPayload(payload) ?? payload, 'complete', event.type)

+          const toolId = String(payload?.tool_id ?? payload?.tool_call_id ?? payload?.id ?? payload?.name ?? '')
+
+          if (toolId) {
+            liveToolComplete(sessionId, toolId, payload?.error ? 'error' : 'ok')
+          }
+
          if (isActiveEvent) {
            setPetActivity({ toolRunning: false })
          }
--- a/apps/desktop/src/app/session/hooks/use-prompt-actions.ts
+++ b/apps/desktop/src/app/session/hooks/use-prompt-actions.ts
@@ -38,6 +38,7 @@ import {
  updateComposerAttachment
 } from '@/store/composer'
 import { resetSessionBackground } from '@/store/composer-status'
+import { liveTurnReset } from '@/store/live-turn'
 import { clearNotifications, notify, notifyError } from '@/store/notifications'
 import { requestDesktopOnboarding } from '@/store/onboarding'
 import { setPetScale } from '@/store/pet-gallery'
@@ -766,6 +767,12 @@ export function usePromptActions({
        rewriteOptimistic(sessionId)
        const text = buildContextText(syncedAttachments)

+        // New user turn = the real trace boundary: reset the live turn + clear
+        // the prior turn's subagents here, so async re-entries / multi-round
+        // message.starts accumulate into one turn instead of clobbering it.
+        liveTurnReset(sessionId)
+        clearSessionSubagents(sessionId)
+
        // On sleep/wake the gateway's in-memory session may have been cleared
        // while the desktop app still holds the old session ID. Detect this,
        // resume the stored session to re-register it, and retry once.
--- a/apps/desktop/src/components/assistant-ui/tool-fallback-model.ts
+++ b/apps/desktop/src/components/assistant-ui/tool-fallback-model.ts
@@ -257,6 +257,18 @@ function isToolTitleKey(name: string): name is ToolTitleKey {
  return name in TOOL_META
 }

+/** The icon name the thread uses for a tool, so other surfaces (e.g. the trace
+ *  waterfall) can render the same glyph. Falls back to the generic tools icon. */
+export function toolIconName(name: string): string {
+  if (isToolTitleKey(name)) {
+    return TOOL_META[name].icon ?? 'tools'
+  }
+
+  const prefix = PREFIX_META.find(p => name.startsWith(p.prefix))
+
+  return prefix?.icon ?? 'tools'
+}
+
 const INLINE_CODE_SPLIT_RE = /(`[^`\n]+`)/g
 const CITATION_MARKER_RE = /(?<=[\p{L}\p{N})\].,!?:;"'”’])\[(?:\d+(?:\s*,\s*\d+)*)\](?!\()/gu
 const BACKTICK_NOISE_RE = /`{3,}/g
--- a/apps/desktop/src/store/live-turn.ts
+++ b/apps/desktop/src/store/live-turn.ts
@@ -0,0 +1,102 @@
+import { atom } from 'nanostores'
+
+/**
+ * Live, in-flight turn state captured from the event stream (no backend).
+ *
+ * The chat runtime already receives `message.*` and `tool.*` events; here we
+ * record just their timing + streamed reply per session. `buildLiveTrace`
+ * (app/agents) stitches this into a TraceDoc so the waterfall can render the
+ * current turn live, then the DB trace folds in once it settles.
+ */
+
+export interface LiveToolEvent {
+  id: string
+  name: string
+  start: number
+  end?: number
+  status: 'error' | 'ok' | 'running'
+}
+
+export interface LiveTurn {
+  busy: boolean
+  turnStart: number
+  tools: LiveToolEvent[]
+  /** Streamed assistant text for the CURRENT round (reset each message.start).
+   *  After the final round this is the turn's reply — shown on the trailing
+   *  llm span so selecting it reveals what the turn produced. */
+  replyText: string
+}
+
+export const $liveTurnBySession = atom<Record<string, LiveTurn>>({})
+
+const REPLY_CAP = 4000
+
+function patch(sid: string, fn: (turn: LiveTurn) => LiveTurn) {
+  const map = $liveTurnBySession.get()
+  const existing = map[sid] ?? { busy: false, turnStart: Date.now(), tools: [], replyText: '' }
+  $liveTurnBySession.set({ ...map, [sid]: fn(existing) })
+}
+
+/** Mark the turn busy without wiping it. Called on every message.start — which
+ *  fires per assistant message AND for synthetic re-entries (async-delegation
+ *  completion, notifications), so it must accumulate, not reset. A new assistant
+ *  message means a fresh reply run, so reset just the streamed text (not tools). */
+export function liveTurnStart(sid: string) {
+  patch(sid, turn => ({ ...turn, busy: true, replyText: '' }))
+}
+
+/** Reset for a brand-new user turn (the real boundary — call it on submit). */
+export function liveTurnReset(sid: string, at = Date.now()) {
+  $liveTurnBySession.set({
+    ...$liveTurnBySession.get(),
+    [sid]: { busy: true, turnStart: at, tools: [], replyText: '' }
+  })
+}
+
+/** Accumulate streamed assistant text for the current round (message.delta). */
+export function liveTurnAppendText(sid: string, delta: string) {
+  if (!delta) {
+    return
+  }
+
+  patch(sid, turn => ({ ...turn, replyText: (turn.replyText + delta).slice(-REPLY_CAP) }))
+}
+
+export function liveTurnEnd(sid: string) {
+  const map = $liveTurnBySession.get()
+
+  if (!map[sid]?.busy) {
+    return
+  }
+
+  patch(sid, turn => ({ ...turn, busy: false }))
+}
+
+export function liveToolStart(sid: string, id: string, name: string, at = Date.now()) {
+  patch(sid, turn => {
+    if (turn.tools.some(t => t.id === id && t.end === undefined)) {
+      return turn // already tracking this running tool
+    }
+
+    return { ...turn, busy: true, tools: [...turn.tools, { id, name, start: at, status: 'running' }] }
+  })
+}
+
+export function liveToolComplete(sid: string, id: string, status: 'error' | 'ok', at = Date.now()) {
+  patch(sid, turn => {
+    let patched = false
+
+    const tools = turn.tools.map(t => {
+      if (!patched && t.id === id && t.end === undefined) {
+        patched = true
+
+        return { ...t, end: at, status }
+      }
+
+      return t
+    })
+
+    return { ...turn, tools }
+  })
+}
+
--- a/apps/desktop/src/store/trace.ts
+++ b/apps/desktop/src/store/trace.ts
@@ -0,0 +1,205 @@
+import { atom } from 'nanostores'
+
+/** Span kinds, mirroring agent/trace_builder.py (OpenInference conventions). */
+export type TraceSpanKind = 'AGENT' | 'CHAIN' | 'LLM' | 'TOOL'
+export type TraceSpanStatus = 'error' | 'ok' | 'running' | 'unset'
+
+export interface TraceSpan {
+  id: string
+  parentId: null | string
+  name: string
+  kind: TraceSpanKind
+  /** Epoch seconds. */
+  start: number
+  end: number
+  duration: number
+  status: TraceSpanStatus
+  sessionId: null | string
+  attributes: Record<string, unknown>
+}
+
+export interface TraceDoc {
+  traceId: string
+  rootSessionId: string
+  rootSpanId: null | string
+  start: number
+  end: number
+  duration: number
+  metadata: Record<string, unknown>
+  spans: TraceSpan[]
+}
+
+export interface TraceTurnSummary {
+  index: number
+  label: string
+  start: number
+  end: number
+  duration: number
+  spanCount: number
+}
+
+export interface TraceSpanNode extends TraceSpan {
+  depth: number
+  children: TraceSpanNode[]
+}
+
+/** Raw wire payloads from the gateway (snake_case). */
+interface WireSpan {
+  span_id?: string
+  parent_id?: null | string
+  name?: string
+  kind?: string
+  start?: number
+  end?: number
+  duration?: number
+  status?: string
+  session_id?: null | string
+  attributes?: Record<string, unknown>
+}
+
+interface WireTrace {
+  trace_id?: string
+  root_session_id?: string
+  root_span_id?: null | string
+  start?: number
+  end?: number
+  duration?: number
+  metadata?: Record<string, unknown>
+  spans?: WireSpan[]
+}
+
+const asKind = (v: unknown): TraceSpanKind =>
+  v === 'AGENT' || v === 'CHAIN' || v === 'LLM' || v === 'TOOL' ? v : 'CHAIN'
+
+const asStatus = (v: unknown): TraceSpanStatus => (v === 'ok' || v === 'error' ? v : 'unset')
+
+const num = (v: unknown, fallback = 0) => (typeof v === 'number' && Number.isFinite(v) ? v : fallback)
+
+function toSpan(w: WireSpan): TraceSpan {
+  const start = num(w.start)
+  const end = num(w.end, start)
+
+  return {
+    id: String(w.span_id ?? ''),
+    parentId: w.parent_id ?? null,
+    name: String(w.name ?? ''),
+    kind: asKind(w.kind),
+    start,
+    end,
+    duration: num(w.duration, Math.max(0, end - start)),
+    status: asStatus(w.status),
+    sessionId: w.session_id ?? null,
+    attributes: w.attributes ?? {}
+  }
+}
+
+export function toTraceDoc(wire: WireTrace): TraceDoc {
+  const spans = (wire.spans ?? []).map(toSpan)
+
+  return {
+    traceId: String(wire.trace_id ?? ''),
+    rootSessionId: String(wire.root_session_id ?? ''),
+    rootSpanId: wire.root_span_id ?? null,
+    start: num(wire.start),
+    end: num(wire.end),
+    duration: num(wire.duration),
+    metadata: wire.metadata ?? {},
+    spans
+  }
+}
+
+/**
+ * Shift every timestamp in a trace by a constant so its root starts at
+ * `newStart`, preserving all relative durations. Used to fold a settled DB
+ * trace (server epoch) onto the live turn's client epoch, so swapping live →
+ * DB lands on the same on-screen window — the time-rebase half of an A→B
+ * hand-off where B's exact spans replace A's approximate ones in place.
+ */
+export function rebaseTrace(doc: TraceDoc, newStart: number): TraceDoc {
+  const delta = newStart - doc.start
+
+  if (!Number.isFinite(delta) || delta === 0) {
+    return doc
+  }
+
+  return {
+    ...doc,
+    start: doc.start + delta,
+    end: doc.end + delta,
+    spans: doc.spans.map(s => ({ ...s, start: s.start + delta, end: s.end + delta }))
+  }
+}
+
+/** Pre-order flatten of the span tree with depth, sorted by start time. */
+export function flattenSpanTree(trace: TraceDoc): TraceSpanNode[] {
+  const byParent = new Map<null | string, TraceSpan[]>()
+
+  for (const span of trace.spans) {
+    const list = byParent.get(span.parentId) ?? []
+    list.push(span)
+    byParent.set(span.parentId, list)
+  }
+
+  for (const list of byParent.values()) {
+    list.sort((a, b) => a.start - b.start || a.id.localeCompare(b.id))
+  }
+
+  const out: TraceSpanNode[] = []
+
+  const walk = (parentId: null | string, depth: number) => {
+    for (const span of byParent.get(parentId) ?? []) {
+      const node: TraceSpanNode = { ...span, depth, children: [] }
+      out.push(node)
+      walk(span.id, depth + 1)
+    }
+  }
+
+  walk(null, 0)
+
+  return out
+}
+
+export const $trace = atom<TraceDoc | null>(null)
+export const $traceTurns = atom<TraceTurnSummary[]>([])
+export const $traceLoading = atom<boolean>(false)
+export const $traceError = atom<null | string>(null)
+export const $selectedSpanId = atom<null | string>(null)
+export const $hoveredSpanId = atom<null | string>(null)
+export const $traceLabelsCollapsed = atom<boolean>(false)
+
+/** Which turn the agents overlay shows: 'latest' follows the newest turn (and
+ *  the live stream), 'all' is the whole session, a number pins a settled turn. */
+export type TraceSelection = 'all' | 'latest' | number
+export const $traceSelection = atom<TraceSelection>('latest')
+
+/** Clear hover only if this span is the current one (avoids enter/leave races). */
+export function clearHoveredSpan(id: string) {
+  if ($hoveredSpanId.get() === id) {
+    $hoveredSpanId.set(null)
+  }
+}
+
+interface WireTurn {
+  index?: number
+  label?: string
+  start?: number
+  end?: number
+  duration?: number
+  span_count?: number
+}
+
+export function toTurnSummaries(wire: { turns?: WireTurn[] }): TraceTurnSummary[] {
+  return (wire.turns ?? []).map(t => ({
+    index: num(t.index),
+    label: String(t.label ?? `turn ${num(t.index)}`),
+    start: num(t.start),
+    end: num(t.end),
+    duration: num(t.duration),
+    spanCount: num(t.span_count)
+  }))
+}
+
+export function setTrace(trace: null | TraceDoc) {
+  $trace.set(trace)
+  $selectedSpanId.set(null)
+}
--- a/hermes_cli/main.py
+++ b/hermes_cli/main.py
@@ -295,6 +295,7 @@ from hermes_cli.subcommands.memory import build_memory_parser
 from hermes_cli.subcommands.acp import build_acp_parser
 from hermes_cli.subcommands.tools import build_tools_parser
 from hermes_cli.subcommands.insights import build_insights_parser
+from hermes_cli.subcommands.trace import build_trace_parser
 from hermes_cli.subcommands.skills import build_skills_parser
 from hermes_cli.subcommands.pairing import build_pairing_parser
 from hermes_cli.subcommands.plugins import build_plugins_parser
@@ -10407,6 +10408,7 @@ def _coalesce_session_name_args(argv: list) -> list:
        "pairing",
        "skills",
        "tools",
+        "trace",
        "mcp",
        "sessions",
        "insights",
@@ -11592,7 +11594,7 @@ _BUILTIN_SUBCOMMANDS = frozenset(
        "project", "proxy",
        "prompt-size",
        "send", "sessions", "setup",
-        "skills", "slack", "status", "tools", "uninstall", "update",
+        "skills", "slack", "status", "tools", "trace", "uninstall", "update",
        "version", "webhook", "whatsapp", "whatsapp-cloud", "chat", "secrets", "security",
        # Help-ish invocations — plugin commands not being listed in
        # top-level --help is an acceptable trade-off for skipping an
@@ -12024,6 +12026,83 @@ def cmd_insights(args):
        print(f"Error generating insights: {e}")


+def cmd_trace(args):
+    from hermes_state import SessionDB
+
+    db = None
+    try:
+        db = SessionDB()
+        sid = db.resolve_session_id(args.session)
+        if not sid:
+            print(f"Error: no session matching {args.session!r}")
+            return
+
+        from agent.trace_builder import build_trace
+
+        trace = build_trace(db, sid, include_subagents=not args.no_subagents)
+        if trace is None or not trace.spans:
+            print(f"No trace data for session {sid}")
+            return
+
+        action = getattr(args, "trace_action", None)
+        if action == "show":
+            _print_trace_tree(trace)
+            return
+
+        from agent.trace_export import dumps
+
+        payload = dumps(trace, getattr(args, "format", "otlp"))
+        output = getattr(args, "output", None)
+        if not output or output == "-":
+            print(payload)
+        else:
+            with open(output, "w", encoding="utf-8") as fh:
+                fh.write(payload)
+            print(
+                f"Wrote {len(trace.spans)} spans ({args.format}) to {output}"
+            )
+    except Exception as e:
+        print(f"Error building trace: {e}")
+    finally:
+        if db is not None:
+            db.close()
+
+
+def _print_trace_tree(trace) -> None:
+    """Render a span tree as an indented summary for the terminal."""
+    from agent.trace_builder import STATUS_ERROR
+
+    by_parent: dict = {}
+    for span in trace.spans:
+        by_parent.setdefault(span.parent_id, []).append(span)
+    for kids in by_parent.values():
+        kids.sort(key=lambda s: s.start)
+
+    glyph = {"AGENT": "◆", "LLM": "✦", "TOOL": "⚙", "CHAIN": "▸"}
+
+    print(
+        f"trace {trace.root_session_id[:12]} · {len(trace.spans)} spans · "
+        f"{trace.duration:.1f}s"
+    )
+
+    def walk(parent_id, depth):
+        for span in by_parent.get(parent_id, []):
+            mark = "✗" if span.status == STATUS_ERROR else glyph.get(span.kind, "·")
+            indent = "  " * depth
+            print(
+                f"{indent}{mark} [{span.kind:<5}] {span.name[:60]:<60} "
+                f"{span.duration:7.2f}s"
+            )
+            walk(span.span_id, depth + 1)
+
+    roots = [s for s in trace.spans if s.parent_id is None]
+    roots.sort(key=lambda s: s.start)
+    for root in roots:
+        mark = "✗" if root.status == STATUS_ERROR else glyph.get(root.kind, "·")
+        print(f"{mark} [{root.kind:<5}] {root.name[:60]:<60} {root.duration:7.2f}s")
+        walk(root.span_id, 1)
+
+
 def cmd_skills(args):
    # Route 'config' action to skills_config module
    if getattr(args, "skills_action", None) == "config":
@@ -13083,6 +13162,11 @@ def main():
    # =========================================================================
    build_insights_parser(subparsers, cmd_insights=cmd_insights)

+    # =========================================================================
+    # trace command  (parser built in hermes_cli/subcommands/trace.py)
+    # =========================================================================
+    build_trace_parser(subparsers, cmd_trace=cmd_trace)
+
    # =========================================================================
    # claw command  (parser built in hermes_cli/subcommands/claw.py)
    # =========================================================================
--- a/hermes_cli/subcommands/trace.py
+++ b/hermes_cli/subcommands/trace.py
@@ -0,0 +1,61 @@
+"""``hermes trace`` subcommand parser.
+
+Exports a session's reconstructed span tree to a portable trace file (OTLP/JSON
+for Phoenix or any OpenTelemetry backend, or Chrome Trace format for Perfetto),
+or prints a quick summary tree to the terminal. Handler injected to avoid
+importing ``main``.
+"""
+
+from __future__ import annotations
+
+from typing import Callable
+
+
+def build_trace_parser(subparsers, *, cmd_trace: Callable) -> None:
+    """Attach the ``trace`` subcommand to ``subparsers``."""
+    trace_parser = subparsers.add_parser(
+        "trace",
+        help="Export or inspect a session's execution trace",
+        description=(
+            "Reconstruct an OpenTelemetry-style span tree for any session "
+            "(including its subagents) from the session store, and export it to "
+            "a standard trace file or print a summary."
+        ),
+    )
+    trace_sub = trace_parser.add_subparsers(dest="trace_action", required=True)
+
+    export_p = trace_sub.add_parser(
+        "export",
+        help="Write a session trace to a file (OTLP/JSON or Chrome Trace)",
+    )
+    export_p.add_argument("session", help="Session id or unique prefix")
+    export_p.add_argument(
+        "--format",
+        "-f",
+        choices=["otlp", "chrome"],
+        default="otlp",
+        help="otlp = OTLP/JSON for Phoenix; chrome = Perfetto/chrome://tracing",
+    )
+    export_p.add_argument(
+        "--output",
+        "-o",
+        help="Output path (default: stdout). Use '-' for stdout.",
+    )
+    export_p.add_argument(
+        "--no-subagents",
+        action="store_true",
+        help="Exclude subagent (delegate) descendant sessions",
+    )
+
+    show_p = trace_sub.add_parser(
+        "show",
+        help="Print a summary span tree for a session to the terminal",
+    )
+    show_p.add_argument("session", help="Session id or unique prefix")
+    show_p.add_argument(
+        "--no-subagents",
+        action="store_true",
+        help="Exclude subagent (delegate) descendant sessions",
+    )
+
+    trace_parser.set_defaults(func=cmd_trace)
--- a/hermes_state.py
+++ b/hermes_state.py
@@ -1931,6 +1931,27 @@ class SessionDB:
            row = cursor.fetchone()
        return dict(row) if row else None

+    def get_child_session_ids(self, parent_session_id: str) -> List[str]:
+        """Return subagent child session ids spawned by ``parent_session_id``.
+
+        Scoped to *ephemeral* children — delegate/subagent runs — using the same
+        predicate that hides them from session pickers. Branch forks and
+        compression continuations (which also carry ``parent_session_id``) are
+        intentionally excluded: a branch is a separate trace, and a compression
+        continuation is the same conversation rather than a nested subagent.
+        Ordered by ``started_at`` so callers can match them to spawn order.
+        """
+        if not parent_session_id:
+            return []
+        with self._lock:
+            cursor = self._conn.execute(
+                f"SELECT id FROM sessions s "
+                f"WHERE s.parent_session_id = ? AND {_ephemeral_child_sql('s')} "
+                f"ORDER BY s.started_at ASC, s.id ASC",
+                (parent_session_id,),
+            )
+            return [row["id"] for row in cursor.fetchall()]
+
    def resolve_session_id(self, session_id_or_prefix: str) -> Optional[str]:
        """Resolve an exact or uniquely prefixed session ID to the full ID.

--- a/package-lock.json
+++ b/package-lock.json
@@ -94,6 +94,9 @@
        "class-variance-authority": "^0.7.1",
        "clsx": "^2.1.1",
        "cmdk": "^1.1.1",
+        "d3-scale": "^4.0.2",
+        "d3-selection": "^3.0.0",
+        "d3-zoom": "^3.0.0",
        "dnd-core": "^14.0.1",
        "hast-util-from-html-isomorphic": "^2.0.0",
        "hast-util-to-text": "^4.0.2",
@@ -129,6 +132,9 @@
        "@eslint/js": "^9.39.4",
        "@testing-library/dom": "^10.4.0",
        "@testing-library/react": "^16.3.2",
+        "@types/d3-scale": "^4.0.9",
+        "@types/d3-selection": "^3.0.11",
+        "@types/d3-zoom": "^3.0.8",
        "@types/hast": "^3.0.4",
        "@types/node": "^24.13.2",
        "@types/react": "^19.2.14",
--- a/tests/agent/test_trace_builder.py
+++ b/tests/agent/test_trace_builder.py
@@ -0,0 +1,331 @@
+"""Tests for the derive-on-read trace builder and exporters.
+
+Builds real sessions/messages in a temp SQLite store and asserts the
+reconstructed span tree, then checks the OTLP/JSON and Chrome export shapes.
+"""
+
+from __future__ import annotations
+
+import json
+
+import pytest
+
+from agent.trace_builder import (
+    KIND_AGENT,
+    KIND_LLM,
+    KIND_TOOL,
+    STATUS_ERROR,
+    STATUS_OK,
+    build_session_turns,
+    build_trace,
+)
+from agent.trace_export import to_chrome_trace, to_otlp_json
+from hermes_state import SessionDB
+
+BASE = 1_700_000_000.0
+
+
+def _tool_call(call_id: str, name: str, args: dict):
+    return {
+        "id": call_id,
+        "type": "function",
+        "function": {"name": name, "arguments": json.dumps(args)},
+    }
+
+
+@pytest.fixture
+def db(tmp_path):
+    store = SessionDB(db_path=tmp_path / "sessions.db")
+    yield store
+    store.close()
+
+
+def _set_times(db: SessionDB, session_id: str, started: float, ended: float):
+    db._conn.execute(
+        "UPDATE sessions SET started_at = ?, ended_at = ? WHERE id = ?",
+        (started, ended, session_id),
+    )
+    db._conn.commit()
+
+
+def _build_parent_with_subagent(db: SessionDB):
+    """Parent session that reads a file, then delegates to a subagent."""
+    db.create_session("parent", "cli", model="test-model")
+    db.append_message("parent", "user", "do the thing", timestamp=BASE)
+    db.append_message(
+        "parent",
+        "assistant",
+        "",
+        tool_calls=[_tool_call("call_read", "read_file", {"path": "a.py"})],
+        token_count=10,
+        timestamp=BASE + 1,
+    )
+    db.append_message(
+        "parent",
+        "tool",
+        '{"success": true}',
+        tool_name="read_file",
+        tool_call_id="call_read",
+        timestamp=BASE + 2,
+    )
+    db.append_message(
+        "parent",
+        "assistant",
+        "",
+        tool_calls=[_tool_call("call_deleg", "delegate_task", {"goal": "sub"})],
+        timestamp=BASE + 3,
+    )
+    # Subagent child session.
+    db.create_session("child", "tool", parent_session_id="parent", model="test-model")
+    db.append_message("child", "user", "sub goal", timestamp=BASE + 3.5)
+    db.append_message(
+        "child",
+        "assistant",
+        "working",
+        tool_calls=[_tool_call("call_search", "search_files", {"q": "x"})],
+        timestamp=BASE + 4,
+    )
+    db.append_message(
+        "child",
+        "tool",
+        "results",
+        tool_name="search_files",
+        tool_call_id="call_search",
+        timestamp=BASE + 4.5,
+    )
+    db.append_message("child", "assistant", "subagent done", timestamp=BASE + 5)
+    _set_times(db, "child", BASE + 3.4, BASE + 5)
+    # Delegate result lands back in the parent, then the parent wraps up.
+    db.append_message(
+        "parent",
+        "tool",
+        "subagent done",
+        tool_name="delegate_task",
+        tool_call_id="call_deleg",
+        timestamp=BASE + 6,
+    )
+    db.append_message("parent", "assistant", "all done", timestamp=BASE + 7)
+    _set_times(db, "parent", BASE, BASE + 7)
+
+
+def test_build_trace_basic_shape(db):
+    _build_parent_with_subagent(db)
+    trace = build_trace(db, "parent")
+
+    assert trace is not None
+    kinds = [s.kind for s in trace.spans]
+    assert kinds.count(KIND_AGENT) == 2  # parent + child
+    assert kinds.count(KIND_TOOL) == 3  # read_file, delegate_task, search_files
+    assert kinds.count(KIND_LLM) == 5  # 3 parent assistants + 2 child assistants
+
+    root = next(s for s in trace.spans if s.span_id == trace.root_span_id)
+    assert root.kind == KIND_AGENT
+    assert root.parent_id is None
+    assert root.session_id == "parent"
+
+
+def test_tool_span_pairs_and_times(db):
+    _build_parent_with_subagent(db)
+    trace = build_trace(db, "parent")
+
+    read = next(s for s in trace.spans if s.attributes.get("tool.name") == "read_file")
+    assert read.start == pytest.approx(BASE + 1)
+    assert read.end == pytest.approx(BASE + 2)
+    assert read.status == STATUS_OK
+    assert read.attributes["tool.call_id"] == "call_read"
+
+
+def test_subagent_nested_under_delegate_span(db):
+    _build_parent_with_subagent(db)
+    trace = build_trace(db, "parent")
+
+    delegate = next(
+        s for s in trace.spans if s.attributes.get("tool.name") == "delegate_task"
+    )
+    child_root = next(
+        s for s in trace.spans if s.kind == KIND_AGENT and s.session_id == "child"
+    )
+    assert child_root.parent_id == delegate.span_id
+    # The delegate tool span should envelop the child's work.
+    assert delegate.start <= child_root.start
+    assert delegate.end >= BASE + 6 - 0.001
+
+
+def test_no_subagents_flag_excludes_children(db):
+    _build_parent_with_subagent(db)
+    trace = build_trace(db, "parent", include_subagents=False)
+    assert all(s.session_id == "parent" for s in trace.spans)
+
+
+def test_error_status_detected(db):
+    db.create_session("err", "cli", model="m")
+    db.append_message("err", "user", "go", timestamp=BASE)
+    db.append_message(
+        "err",
+        "assistant",
+        "",
+        tool_calls=[_tool_call("c1", "terminal", {"cmd": "boom"})],
+        timestamp=BASE + 1,
+    )
+    db.append_message(
+        "err",
+        "tool",
+        '{"error": "command failed", "success": false}',
+        tool_name="terminal",
+        tool_call_id="c1",
+        timestamp=BASE + 2,
+    )
+    _set_times(db, "err", BASE, BASE + 2)
+
+    trace = build_trace(db, "err")
+    tool = next(s for s in trace.spans if s.kind == KIND_TOOL)
+    assert tool.status == STATUS_ERROR
+
+
+def test_missing_session_returns_none(db):
+    assert build_trace(db, "nope") is None
+
+
+def test_otlp_export_shape(db):
+    _build_parent_with_subagent(db)
+    trace = build_trace(db, "parent")
+    doc = to_otlp_json(trace)
+
+    spans = doc["resourceSpans"][0]["scopeSpans"][0]["spans"]
+    assert len(spans) == len(trace.spans)
+    one = spans[0]
+    assert len(one["traceId"]) == 32  # 16 bytes hex
+    assert len(one["spanId"]) == 16  # 8 bytes hex
+    keys = {a["key"] for a in one["attributes"]}
+    assert "openinference.span.kind" in keys
+    # All non-root spans carry a parentSpanId.
+    assert any("parentSpanId" in s for s in spans)
+
+
+def test_ended_at_orphan_reap_does_not_balloon_root(db):
+    # ended_at sits hours after the last message (cleanup reaper); the AGENT
+    # span must clamp to real activity, not the bogus ended_at.
+    db.create_session("orphan", "tui", model="m")
+    db.append_message("orphan", "user", "go", timestamp=BASE)
+    db.append_message("orphan", "assistant", "done", timestamp=BASE + 10)
+    db._conn.execute(
+        "UPDATE sessions SET started_at=?, ended_at=?, end_reason=? WHERE id=?",
+        (BASE, BASE + 18_000, "ws_orphan_reap", "orphan"),
+    )
+    db._conn.commit()
+
+    trace = build_trace(db, "orphan")
+    root = next(s for s in trace.spans if s.kind == KIND_AGENT)
+    assert root.duration < 60  # ~10s of real work, not 5 hours
+
+
+def test_build_session_turns_splits_and_tightens(db):
+    # Two turns separated by a long idle gap; each turn-trace must be tight.
+    db.create_session("multi", "tui", model="m")
+    db.append_message("multi", "user", "first task", timestamp=BASE)
+    db.append_message("multi", "assistant", "done one", timestamp=BASE + 5)
+    # User walks away for 800s, then a second turn.
+    db.append_message("multi", "user", "second task", timestamp=BASE + 805)
+    db.append_message("multi", "assistant", "done two", timestamp=BASE + 810)
+    db._conn.execute(
+        "UPDATE sessions SET started_at=?, ended_at=? WHERE id=?",
+        (BASE, BASE + 810, "multi"),
+    )
+    db._conn.commit()
+
+    turns = build_session_turns(db, "multi")
+    assert len(turns) == 2
+    # Neither turn contains the 800s idle gap.
+    assert all(t.duration < 60 for t in turns)
+    assert turns[0].metadata["turn"] == 0
+    assert turns[1].metadata["turn"] == 1
+
+
+def test_async_delegation_completion_merges_into_dispatch_turn(db):
+    # A background delegation dispatched in turn 0 re-enters as a synthetic
+    # `[ASYNC DELEGATION …]` user message. It must NOT open its own turn — it
+    # merges into the turn that spawned it, so the completion processing lands in
+    # the same group as the delegate_task call.
+    db.create_session("async", "tui", model="m")
+    db.append_message("async", "user", "kick off background work", timestamp=BASE)
+    db.append_message(
+        "async",
+        "assistant",
+        "",
+        tool_calls=[_tool_call("call_bg", "delegate_task", {"goal": "bg", "background": True})],
+        timestamp=BASE + 1,
+    )
+    db.append_message(
+        "async",
+        "tool",
+        '{"delegation_id": "d1"}',
+        tool_name="delegate_task",
+        tool_call_id="call_bg",
+        timestamp=BASE + 2,
+    )
+    db.append_message("async", "assistant", "dispatched, carrying on", timestamp=BASE + 3)
+    # Later, the background result re-enters as a synthetic continuation.
+    db.append_message(
+        "async",
+        "user",
+        "[ASYNC DELEGATION COMPLETE — d1]\nA background subagent finished.",
+        timestamp=BASE + 50,
+    )
+    db.append_message("async", "assistant", "acting on the result", timestamp=BASE + 52)
+    db._conn.execute(
+        "UPDATE sessions SET started_at=?, ended_at=? WHERE id=?",
+        (BASE, BASE + 52, "async"),
+    )
+    db._conn.commit()
+
+    turns = build_session_turns(db, "async")
+    assert len(turns) == 1  # not split by the re-injection
+    # Label is the real prompt, not the async marker.
+    root = next(s for s in turns[0].spans if s.span_id == turns[0].root_span_id)
+    assert "kick off background work" in root.name
+    assert "ASYNC DELEGATION" not in root.name
+
+
+def test_important_notification_merges_into_turn(db):
+    # Background-process notifications (`[IMPORTANT: …]`) are continuations too.
+    db.create_session("notif", "tui", model="m")
+    db.append_message("notif", "user", "run the build", timestamp=BASE)
+    db.append_message("notif", "assistant", "started", timestamp=BASE + 1)
+    db.append_message(
+        "notif",
+        "user",
+        "[IMPORTANT: Background process p1 exited with code 0]",
+        timestamp=BASE + 30,
+    )
+    db.append_message("notif", "assistant", "build finished", timestamp=BASE + 31)
+    db._conn.execute(
+        "UPDATE sessions SET started_at=?, ended_at=? WHERE id=?",
+        (BASE, BASE + 31, "notif"),
+    )
+    db._conn.commit()
+
+    turns = build_session_turns(db, "notif")
+    assert len(turns) == 1
+
+
+def test_to_dict_round_trips_shape(db):
+    _build_parent_with_subagent(db)
+    trace = build_trace(db, "parent")
+    d = trace.to_dict()
+    assert d["root_session_id"] == "parent"
+    assert d["root_span_id"] == trace.root_span_id
+    assert len(d["spans"]) == len(trace.spans)
+    span = d["spans"][0]
+    assert {"span_id", "parent_id", "kind", "start", "end", "duration", "status"} <= set(span)
+
+
+def test_chrome_export_one_track_per_session(db):
+    _build_parent_with_subagent(db)
+    trace = build_trace(db, "parent")
+    doc = to_chrome_trace(trace)
+
+    complete = [e for e in doc["traceEvents"] if e["ph"] == "X"]
+    assert len(complete) == len(trace.spans)
+    tids = {e["tid"] for e in complete}
+    assert len(tids) == 2  # parent + child lanes
+    assert all(e["ts"] >= 0 for e in complete)
--- a/tui_gateway/server.py
+++ b/tui_gateway/server.py
@@ -7773,6 +7773,109 @@ def _(rid, params: dict) -> dict:
    return _ok(rid, {"found": ok, "subagent_id": subagent_id})


+# ── Execution traces: span tree reconstructed from the session store ──
+# Powers the desktop /agents waterfall. A trace is derived on read from
+# sessions + messages (real server timestamps, parent_session_id lineage),
+# so it works for any session — live or historical — with no extra storage.
+
+
+def _resolve_trace_session(db, raw: str) -> str | None:
+    """Resolve a trace target id from a DB id/prefix OR a live gateway key.
+
+    The desktop tracks conversations by gateway ``session_key`` (a UUID), while
+    DB rows use timestamp ids. Fall back to the live agent's ``session_id`` for
+    that key so the overlay can trace the conversation the user is in.
+    """
+    sid = db.resolve_session_id(raw)
+    if sid:
+        return sid
+    sess = _sessions.get(raw)
+    agent = sess.get("agent") if isinstance(sess, dict) else None
+    live_id = getattr(agent, "session_id", None)
+    if live_id:
+        return db.resolve_session_id(live_id) or live_id
+    return None
+
+
+def _empty_trace(session_id: str) -> dict:
+    return {
+        "trace_id": f"trace:{session_id}",
+        "root_session_id": session_id,
+        "root_span_id": None,
+        "start": 0.0,
+        "end": 0.0,
+        "duration": 0.0,
+        "metadata": {},
+        "spans": [],
+    }
+
+
+@method("trace.get")
+def _(rid, params: dict) -> dict:
+    db = _get_db()
+    if db is None:
+        return _err(rid, 5000, "session store unavailable")
+
+    raw = str(params.get("session_id") or "").strip()
+    if not raw:
+        return _err(rid, 4000, "session_id required")
+    sid = _resolve_trace_session(db, raw)
+    if not sid:
+        return _err(rid, 4040, f"no session matching {raw!r}")
+
+    include_subagents = params.get("include_subagents", True)
+    turn = params.get("turn")
+
+    from agent.trace_builder import build_session_turns, build_trace
+
+    if turn is not None:
+        try:
+            turns = build_session_turns(db, sid, include_subagents=include_subagents)
+            ti = int(turn)
+            if ti < 0 or ti >= len(turns):
+                return _err(rid, 4040, f"turn {ti} out of range (0..{len(turns) - 1})")
+            return _ok(rid, turns[ti].to_dict())
+        except (TypeError, ValueError):
+            return _err(rid, 4000, "turn must be an integer")
+
+    trace = build_trace(db, sid, include_subagents=include_subagents)
+    # A live session with no flushed messages yet has no spans — return an empty
+    # trace (not an error) so the overlay shows "no activity" rather than failing.
+    return _ok(rid, trace.to_dict() if trace is not None else _empty_trace(sid))
+
+
+@method("trace.turns")
+def _(rid, params: dict) -> dict:
+    db = _get_db()
+    if db is None:
+        return _err(rid, 5000, "session store unavailable")
+
+    raw = str(params.get("session_id") or "").strip()
+    if not raw:
+        return _err(rid, 4000, "session_id required")
+    sid = _resolve_trace_session(db, raw)
+    if not sid:
+        return _ok(rid, {"session_id": raw, "turns": []})
+
+    from agent.trace_builder import build_session_turns
+
+    turns = build_session_turns(db, sid, include_subagents=True)
+    summaries = []
+    for ti, tr in enumerate(turns):
+        root = next((s for s in tr.spans if s.span_id == tr.root_span_id), None)
+        summaries.append(
+            {
+                "index": ti,
+                "label": root.name if root else f"turn {ti}",
+                "start": tr.start,
+                "end": tr.end,
+                "duration": tr.duration,
+                "span_count": len(tr.spans),
+            }
+        )
+    return _ok(rid, {"session_id": sid, "turns": summaries})
+
+
 # ── Spawn-tree snapshots: TUI-written, disk-persisted ────────────────
 # The TUI is the source of truth for subagent state (it assembles payloads
 # from the event stream).  On turn-complete it posts the final tree here;
Author	SHA1	Message	Date
Brooklyn Nicholson	17df5ea573	feat(desktop): agents waterfall — live + historical execution traces Replace the flat agents list with a zoomable d3 waterfall: a turn strip, collapsible label tree, time-compressed track (idle gaps collapse), and a span inspector. Live turns are stitched from the message/tool/subagent streams into the same TraceDoc shape and fold into the server-exact DB trace on settle, so following a turn never reframes. Clears the overlay's traffic-light/titlebar inset at the OverlayView level for every overlay.	2026-06-26 04:48:31 -05:00
Brooklyn Nicholson	0db227a6d8	feat(gateway): expose trace.get and trace.turns RPCs Resolve a desktop session key to its DB session id and serve the derived trace (whole session or a single turn) plus per-turn summaries; return an empty trace for known-but-empty sessions instead of erroring.	2026-06-26 04:36:26 -05:00
Brooklyn Nicholson	49f7a0b456	feat(cli): add hermes trace command to export/show a session trace Wires `hermes trace <session> [--format otlp\|chrome] [-o file]` and a terminal tree view through the central subcommand registry.	2026-06-26 04:36:26 -05:00
Brooklyn Nicholson	74ab798b49	feat(trace): derive OTel-style execution traces from the session store Reconstruct per-session and per-turn span trees (AGENT/LLM/TOOL, with subagents nested under their delegate_task span) straight from SQLite — no new write path. Turn-scoping splits on real user prompts and folds synthetic continuations ([ASYNC DELEGATION ...], [IMPORTANT: ...]) into the turn that spawned them. Exports to OTLP/JSON and Chrome Trace formats.	2026-06-26 04:36:26 -05:00