test(session-log): pin no-session_json regression + drop trailing whitespace

Adds TestNoSessionJsonSnapshot to lock in the contract that the session_log_file attribute, the _save_session_log method, and the per-session JSON snapshot writer are gone. logs_dir is retained for request_dump_*.json. Also cleans up stray trailing whitespace in test_run_agent_codex_responses that was left behind when the _save_session_log stub line was deleted.
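
The contract pinned by this test can be sketched in isolation. This is a minimal illustration, not the project's test code: `StubAgent` and `removed_surface_still_present` are hypothetical names, and only `session_log_file`, `_save_session_log`, and `logs_dir` come from the commit itself.

```python
from pathlib import Path


class StubAgent:
    """Hypothetical stand-in for the real agent; not the project's AIAgent."""

    def __init__(self, home: Path) -> None:
        # logs_dir survives because request_dump_*.json files still land there.
        self.logs_dir = home / "sessions"


def removed_surface_still_present(agent: object) -> list[str]:
    """Return the names of removed members that are still present on agent."""
    removed = ("session_log_file", "_save_session_log")
    return [name for name in removed if hasattr(agent, name)]


agent = StubAgent(Path("/tmp/hermes-example"))
leftovers = removed_surface_still_present(agent)
assert leftovers == [], f"removed members reintroduced: {leftovers}"
assert hasattr(agent, "logs_dir")
```

Checking with `hasattr` rather than calling anything keeps the regression test free of file I/O, so it presumably fails only if a later refactor puts the attribute or method back.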
2026-06-10 04:08:28 +08:00 · 2026-05-20 11:30:08 +02:00 · 2026-05-20 10:24:29 +02:00 · 2026-05-20 09:19:46 +02:00 · 2026-05-20 09:19:13 +02:00 · 2026-05-20 09:19:09 +02:00
14 changed files with 33 additions and 171 deletions
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -210,7 +210,7 @@ hermes-agent/
 | `~/.hermes/skills/` | All active skills (bundled + hub-installed + agent-created) |
 | `~/.hermes/memories/` | Persistent memory (MEMORY.md, USER.md) |
 | `~/.hermes/state.db` | SQLite session database |
-| `~/.hermes/sessions/` | JSON session logs |
+| `~/.hermes/sessions/` | Legacy session artifacts (no longer written; state.db is canonical). Holds the gateway routing index (`sessions.json`) and request-dump breadcrumbs. |
 | `~/.hermes/cron/` | Scheduled job data |
 | `~/.hermes/whatsapp/session/` | WhatsApp bridge credentials |

@@ -239,7 +239,7 @@ User message → AIAgent._run_agent_loop()

 - **Self-registering tools**: Each tool file calls `registry.register()` at import time. `model_tools.py` triggers discovery by importing all tool modules.
 - **Toolset grouping**: Tools are grouped into toolsets (`web`, `terminal`, `file`, `browser`, etc.) that can be enabled/disabled per platform.
- **Session persistence**: All conversations are stored in SQLite (`hermes_state.py`) with full-text search and unique session titles. JSON logs go to `~/.hermes/sessions/`.
+- **Session persistence**: All conversations are stored in SQLite (`hermes_state.py`) with full-text search and unique session titles. Per-session JSON snapshots in `~/.hermes/sessions/` were superseded by the SQLite store and are no longer written.
 - **Ephemeral injection**: System prompts and prefill messages are injected at API call time, never persisted to the database or logs.
 - **Provider abstraction**: The agent works with any OpenAI-compatible API. Provider resolution happens at init time (Nous Portal OAuth, OpenRouter API key, or custom endpoint).
 - **Provider routing**: When using OpenRouter, `provider_routing` in config.yaml controls provider selection (sort by throughput/latency/price, allow/ignore specific providers, data retention policies). These are injected as `extra_body.provider` in API requests.
--- a/agent/agent_init.py
+++ b/agent/agent_init.py
@@ -901,7 +901,8 @@ def init_agent(
    hermes_home = get_hermes_home()
    agent.logs_dir = hermes_home / "sessions"
    agent.logs_dir.mkdir(parents=True, exist_ok=True)
-    agent.session_log_file = agent.logs_dir / f"session_{agent.session_id}.json"
+    # session_log_file removed — state.db is the canonical message store.
+    # logs_dir retained for request_dump_*.json (debug breadcrumb path).
    
    # Track conversation messages for session logging
    agent._session_messages: List[Dict[str, Any]] = []
--- a/agent/conversation_compression.py
+++ b/agent/conversation_compression.py
@@ -387,8 +387,6 @@ def compress_context(
                _SESSION_ID.set(agent.session_id)
            except Exception:
                pass
-            # Update session_log_file to point to the new session's JSON file
-            agent.session_log_file = agent.logs_dir / f"session_{agent.session_id}.json"
            agent._session_db_created = False
            agent._session_db.create_session(
                session_id=agent.session_id,
--- a/agent/conversation_loop.py
+++ b/agent/conversation_loop.py
@@ -1454,7 +1454,6 @@ def run_conversation(
                                }
                                messages.append(continue_msg)
                                agent._session_messages = messages
-                                agent._save_session_log(messages)
                                restart_with_length_continuation = True
                                break

@@ -3086,7 +3085,6 @@ def run_conversation(
                    if not agent.quiet_mode:
                        agent._vprint(f"{agent.log_prefix}↻ Codex response incomplete; continuing turn ({agent._codex_incomplete_retries}/3)")
                    agent._session_messages = messages
-                    agent._save_session_log(messages)
                    continue

                agent._codex_incomplete_retries = 0
@@ -3411,7 +3409,6 @@ def run_conversation(
                
                # Save session log incrementally (so progress is visible even if interrupted)
                agent._session_messages = messages
-                agent._save_session_log(messages)
                
                # Continue loop for next response
                continue
@@ -3578,7 +3575,6 @@ def run_conversation(
                        interim_msg["_thinking_prefill"] = True
                        messages.append(interim_msg)
                        agent._session_messages = messages
-                        agent._save_session_log(messages)
                        continue

                    # ── Empty response retry ──────────────────────
@@ -3712,7 +3708,6 @@ def run_conversation(
                    }
                    messages.append(continue_msg)
                    agent._session_messages = messages
-                    agent._save_session_log(messages)
                    continue

                codex_ack_continuations = 0
--- a/cli.py
+++ b/cli.py
@@ -6501,12 +6501,6 @@ class HermesCLI:
        if self.agent:
            self.agent.session_id = new_session_id
            self.agent.session_start = now
-            # Redirect the JSON session log to the new branch session file so
-            # messages written after branching land in the correct file.
-            if hasattr(self.agent, "session_log_file") and hasattr(self.agent, "logs_dir"):
-                self.agent.session_log_file = (
-                    self.agent.logs_dir / f"session_{new_session_id}.json"
-                )
            self.agent.reset_session_state()
            if hasattr(self.agent, "_last_flushed_db_idx"):
                self.agent._last_flushed_db_idx = len(self.conversation_history)
--- a/run_agent.py
+++ b/run_agent.py
@@ -168,7 +168,6 @@ from agent.tool_result_classification import (
    file_mutation_result_landed,
 )
 from agent.trajectory import (
-    convert_scratchpad_to_think, has_incomplete_scratchpad,
    save_trajectory as _save_trajectory_to_file,
 )
 from agent.message_sanitization import (
@@ -1176,7 +1175,6 @@ class AIAgent:
        self._drop_trailing_empty_response_scaffolding(messages)
        self._apply_persist_user_message_override(messages)
        self._session_messages = messages
-        self._save_session_log(messages)
        self._flush_messages_to_session_db(messages, conversation_history)

    def _drop_trailing_empty_response_scaffolding(self, messages: List[Dict]) -> None:
@@ -1506,81 +1504,6 @@ class AIAgent:
        from agent.agent_runtime_helpers import dump_api_request_debug
        return dump_api_request_debug(self, api_kwargs, reason=reason, error=error)

-    @staticmethod
-    def _clean_session_content(content: str) -> str:
-        """Convert REASONING_SCRATCHPAD to think tags and clean up whitespace."""
-        if not content:
-            return content
-        content = convert_scratchpad_to_think(content)
-        content = re.sub(r'\n+(<think>)', r'\n\1', content)
-        content = re.sub(r'(</think>)\n+', r'\1\n', content)
-        return content.strip()
-
-    def _save_session_log(self, messages: List[Dict[str, Any]] = None):
-        """
-        Save the full raw session to a JSON file.
-
-        Stores every message exactly as the agent sees it: user messages,
-        assistant messages (with reasoning, finish_reason, tool_calls),
-        tool responses (with tool_call_id, tool_name), and injected system
-        messages (compression summaries, todo snapshots, etc.).
-
-        REASONING_SCRATCHPAD tags are converted to <think> blocks for consistency.
-        Overwritten after each turn so it always reflects the latest state.
-        """
-        messages = messages or self._session_messages
-        if not messages:
-            return
-
-        try:
-            # Clean assistant content for session logs
-            cleaned = []
-            for msg in messages:
-                if msg.get("role") == "assistant" and msg.get("content"):
-                    msg = dict(msg)
-                    msg["content"] = self._clean_session_content(msg["content"])
-                cleaned.append(msg)
-
-            # Guard: never overwrite a larger session log with fewer messages.
-            # This protects against data loss when --resume loads a session whose
-            # messages weren't fully written to SQLite — the resumed agent starts
-            # with partial history and would otherwise clobber the full JSON log.
-            if self.session_log_file.exists():
-                try:
-                    existing = json.loads(self.session_log_file.read_text(encoding="utf-8"))
-                    existing_count = existing.get("message_count", len(existing.get("messages", [])))
-                    if existing_count > len(cleaned):
-                        logging.debug(
-                            "Skipping session log overwrite: existing has %d messages, current has %d",
-                            existing_count, len(cleaned),
-                        )
-                        return
-                except Exception:
-                    pass  # corrupted existing file — allow the overwrite
-
-            entry = {
-                "session_id": self.session_id,
-                "model": self.model,
-                "base_url": self.base_url,
-                "platform": self.platform,
-                "session_start": self.session_start.isoformat(),
-                "last_updated": datetime.now().isoformat(),
-                "system_prompt": self._cached_system_prompt or "",
-                "tools": self.tools or [],
-                "message_count": len(cleaned),
-                "messages": cleaned,
-            }
-
-            atomic_json_write(
-                self.session_log_file,
-                entry,
-                indent=2,
-                default=str,
-            )
-
-        except Exception as e:
-            if self.verbose_logging:
-                logging.warning(f"Failed to save session log: {e}")

    def interrupt(self, message: str = None) -> None:
        """
--- a/skills/autonomous-ai-agents/hermes-agent/SKILL.md
+++ b/skills/autonomous-ai-agents/hermes-agent/SKILL.md
@@ -336,7 +336,8 @@ The registry of record is `hermes_cli/commands.py` — every consumer
 ~/.hermes/config.yaml       Main configuration
 ~/.hermes/.env              API keys and secrets
 $HERMES_HOME/skills/        Installed skills
-~/.hermes/sessions/         Session transcripts
+~/.hermes/sessions/         Legacy session artifacts (no longer written; state.db is canonical)
+~/.hermes/state.db          Canonical session store (SQLite + FTS5)
 ~/.hermes/logs/             Gateway and error logs
 ~/.hermes/auth.json         OAuth tokens and credential pools
 ~/.hermes/hermes-agent/     Source code (if git-installed)
@@ -867,7 +868,7 @@ hermes config set auxiliary.vision.model <model_name>
 | Env variables | `hermes config env-path` or [Env vars reference](https://hermes-agent.nousresearch.com/docs/reference/environment-variables) |
 | CLI commands | `hermes --help` or [CLI reference](https://hermes-agent.nousresearch.com/docs/reference/cli-commands) |
 | Gateway logs | `~/.hermes/logs/gateway.log` |
-| Session files | `~/.hermes/sessions/` or `hermes sessions browse` |
+| Session files | `hermes sessions browse` (reads state.db) |
 | Source code | `~/.hermes/hermes-agent/` |

 ---
--- a/tests/cli/test_branch_command.py
+++ b/tests/cli/test_branch_command.py
@@ -160,30 +160,6 @@ class TestBranchCommandCLI:
        assert agent.reset_session_state.called
        assert agent._last_flushed_db_idx == 4  # len(conversation_history)

-    def test_branch_updates_agent_session_log_file(self, cli_instance, session_db, tmp_path):
-        """Branching must redirect the agent's session_log_file to the new session's path."""
-        from cli import HermesCLI
-        from pathlib import Path
-
-        logs_dir = tmp_path / "sessions"
-        logs_dir.mkdir()
-
-        agent = MagicMock()
-        agent._last_flushed_db_idx = 0
-        agent.logs_dir = logs_dir
-        agent.session_log_file = logs_dir / f"session_{cli_instance.session_id}.json"
-        cli_instance.agent = agent
-
-        old_log_file = agent.session_log_file
-        HermesCLI._handle_branch_command(cli_instance, "/branch")
-
-        new_session_id = cli_instance.session_id
-        expected_log = logs_dir / f"session_{new_session_id}.json"
-        assert agent.session_log_file == expected_log, (
-            "session_log_file must point to the branch session, not the original"
-        )
-        assert agent.session_log_file != old_log_file
-
    def test_branch_sets_resumed_flag(self, cli_instance, session_db):
        """Branch should set _resumed=True to prevent auto-title generation."""
        from cli import HermesCLI
--- a/tests/cron/test_codex_execution_paths.py
+++ b/tests/cron/test_codex_execution_paths.py
@@ -74,7 +74,6 @@ class _Codex401ThenSuccessAgent(run_agent.AIAgent):
        self._cleanup_task_resources = lambda task_id: None
        self._persist_session = lambda messages, history=None: None
        self._save_trajectory = lambda messages, user_message, completed: None
-        self._save_session_log = lambda messages: None

    def _try_refresh_codex_client_credentials(self, *, force: bool = True) -> bool:
        type(self).refresh_attempts += 1
--- a/tests/run_agent/test_860_dedup.py
+++ b/tests/run_agent/test_860_dedup.py
@@ -110,8 +110,6 @@ class TestFlushDeduplication:
            db = SessionDB(db_path=db_path)

            agent = self._make_agent(db)
-            # Stub out _save_session_log to avoid file I/O
-            agent._save_session_log = MagicMock()

            conversation_history = [{"role": "user", "content": "old"}]
            messages = list(conversation_history) + [
--- a/tests/run_agent/test_context_token_tracking.py
+++ b/tests/run_agent/test_context_token_tracking.py
@@ -52,7 +52,7 @@ def _make_agent(monkeypatch, api_mode, provider, response_fn):
            kw.update(skip_context_files=True, skip_memory=True, max_iterations=4)
            super().__init__(*a, **kw)
            self._cleanup_task_resources = self._persist_session = lambda *a, **k: None
-            self._save_trajectory = self._save_session_log = lambda *a, **k: None
+            self._save_trajectory = lambda *a, **k: None

        def run_conversation(self, msg, conversation_history=None, task_id=None):
            self._interruptible_api_call = lambda kw: response_fn()
--- a/tests/run_agent/test_empty_response_recovery_persistence.py
+++ b/tests/run_agent/test_empty_response_recovery_persistence.py
@@ -9,11 +9,7 @@ def _agent_with_stubbed_persistence():
    agent._persist_user_message_override = None
    agent._session_db = None
    agent._session_messages = []
-    agent.saved_session_logs = []
    agent.flushed_session_db_messages = []
-    agent._save_session_log = lambda messages: agent.saved_session_logs.append(
-        [m.copy() for m in messages]
-    )
    agent._flush_messages_to_session_db = lambda messages, conversation_history=None: (
        agent.flushed_session_db_messages.append([m.copy() for m in messages])
    )
@@ -60,7 +56,7 @@ def test_persist_session_strips_trailing_empty_recovery_scaffolding():
    assert messages == [
        {"role": "user", "content": "run the task"},
    ]
-    assert agent.saved_session_logs[-1] == messages
+    assert agent.flushed_session_db_messages[-1] == messages
    assert all(not msg.get("_empty_recovery_synthetic") for msg in messages)


@@ -77,7 +73,7 @@ def test_persist_session_keeps_unmarked_terminal_empty_response():
        {"role": "user", "content": "run the task"},
        {"role": "assistant", "content": "(empty)"},
    ]
-    assert agent.saved_session_logs[-1] == messages
+    assert agent.flushed_session_db_messages[-1] == messages


 def test_persist_session_strips_marked_terminal_empty_sentinel():
@@ -94,5 +90,5 @@ def test_persist_session_strips_marked_terminal_empty_sentinel():
    AIAgent._persist_session(agent, messages, conversation_history=[])

    assert messages == [{"role": "user", "content": "continue"}]
-    assert agent.saved_session_logs[-1] == messages
+    assert agent.flushed_session_db_messages[-1] == messages
    assert all(not msg.get("_empty_terminal_sentinel") for msg in messages)
--- a/tests/run_agent/test_run_agent.py
+++ b/tests/run_agent/test_run_agent.py
@@ -554,23 +554,29 @@ class TestExtractReasoning:
        assert result == "from structured field"


-class TestCleanSessionContent:
-    def test_none_passthrough(self):
-        assert AIAgent._clean_session_content(None) is None
+class TestNoSessionJsonSnapshot:
+    """Regression: agent must not write session_{sid}.json snapshots.

-    def test_scratchpad_converted(self):
-        text = "<REASONING_SCRATCHPAD>think</REASONING_SCRATCHPAD> answer"
-        result = AIAgent._clean_session_content(text)
-        assert "<REASONING_SCRATCHPAD>" not in result
-        assert "<think>" in result
+    state.db is the canonical message store after #29182. The legacy snapshot
+    writer was removed; this test pins that contract so a future refactor
+    can't silently reintroduce the file (and the ~500MB/950-file disk usage
+    that came with it).
+    """

-    def test_extra_newlines_cleaned(self):
-        text = "\n\n\n<think>x</think>\n\n\nafter"
-        result = AIAgent._clean_session_content(text)
-        # Should not have excessive newlines around think block
-        assert "\n\n\n" not in result
-        # Content after think block must be preserved
-        assert "after" in result
+    def test_session_log_file_attribute_not_set(self, agent):
+        assert not hasattr(agent, "session_log_file"), (
+            "session_log_file attribute removed in #29182 — state.db is canonical"
+        )
+
+    def test_no_session_log_writer_method(self, agent):
+        assert not hasattr(agent, "_save_session_log"), (
+            "_save_session_log method removed in #29182"
+        )
+
+    def test_logs_dir_retained_for_request_dumps(self, agent):
+        # logs_dir is kept because agent_runtime_helpers.dump_api_request_debug
+        # still writes request_dump_*.json there (debug breadcrumb path).
+        assert hasattr(agent, "logs_dir")


 class TestGetMessagesUpToLastAssistant:
@@ -1901,7 +1907,6 @@ class TestExecuteToolCalls:
        agent._interruptible_api_call = _fake_api_call
        agent._persist_session = lambda *args, **kwargs: None
        agent._save_trajectory = lambda *args, **kwargs: None
-        agent._save_session_log = lambda *args, **kwargs: None

        captured = io.StringIO()
        agent._print_fn = lambda *args, **kw: print(*args, file=captured, **kw)
@@ -4300,22 +4305,6 @@ class TestSafeWriter:
        assert inner.getvalue() == "test"


-class TestSaveSessionLogAtomicWrite:
-    def test_uses_shared_atomic_json_helper(self, agent, tmp_path):
-        agent.session_log_file = tmp_path / "session.json"
-        messages = [{"role": "user", "content": "hello"}]
-
-        with patch("run_agent.atomic_json_write", create=True) as mock_atomic_write:
-            agent._save_session_log(messages)
-
-        mock_atomic_write.assert_called_once()
-        call_args = mock_atomic_write.call_args
-        assert call_args.args[0] == agent.session_log_file
-        payload = call_args.args[1]
-        assert payload["session_id"] == agent.session_id
-        assert payload["messages"] == messages
-        assert call_args.kwargs["indent"] == 2
-        assert call_args.kwargs["default"] is str


 # ===================================================================
@@ -5103,12 +5092,9 @@ class TestPersistUserMessageOverride:
            {"role": "assistant", "content": "Hi!"},
        ]

-        with patch.object(agent, "_save_session_log") as mock_save:
-            agent._persist_session(messages, [])
+        agent._persist_session(messages, [])

        assert messages[0]["content"] == "Hello there"
-        saved_messages = mock_save.call_args.args[0]
-        assert saved_messages[0]["content"] == "Hello there"
        first_db_write = agent._session_db.append_message.call_args_list[0].kwargs
        assert first_db_write["content"] == "Hello there"

--- a/tests/run_agent/test_run_agent_codex_responses.py
+++ b/tests/run_agent/test_run_agent_codex_responses.py
@@ -54,7 +54,6 @@ def _build_agent(monkeypatch):
    agent._cleanup_task_resources = lambda task_id: None
    agent._persist_session = lambda messages, history=None: None
    agent._save_trajectory = lambda messages, user_message, completed: None
-    agent._save_session_log = lambda messages: None
    return agent


@@ -75,7 +74,6 @@ def _build_copilot_agent(monkeypatch, *, model="gpt-5.4"):
    agent._cleanup_task_resources = lambda task_id: None
    agent._persist_session = lambda messages, history=None: None
    agent._save_trajectory = lambda messages, user_message, completed: None
-    agent._save_session_log = lambda messages: None
    return agent


@@ -335,7 +333,6 @@ def test_build_api_kwargs_codex_clamps_minimal_effort(monkeypatch):
    agent._cleanup_task_resources = lambda task_id: None
    agent._persist_session = lambda messages, history=None: None
    agent._save_trajectory = lambda messages, user_message, completed: None
-    agent._save_session_log = lambda messages: None

    kwargs = agent._build_api_kwargs(
        [
@@ -365,7 +362,6 @@ def test_build_api_kwargs_codex_preserves_supported_efforts(monkeypatch):
        agent._cleanup_task_resources = lambda task_id: None
        agent._persist_session = lambda messages, history=None: None
        agent._save_trajectory = lambda messages, user_message, completed: None
-        agent._save_session_log = lambda messages: None

        kwargs = agent._build_api_kwargs(
            [
@@ -594,7 +590,6 @@ def _build_xai_oauth_agent(monkeypatch):
    agent._cleanup_task_resources = lambda task_id: None
    agent._persist_session = lambda messages, history=None: None
    agent._save_trajectory = lambda messages, user_message, completed: None
-    agent._save_session_log = lambda messages: None
    return agent
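
Since the diff above stops writing `session_{session_id}.json` files but documents `~/.hermes/sessions/` as still holding `sessions.json` and request dumps, it may be useful to see which files in that directory are leftover snapshots. The helper below is a hypothetical sketch and not part of this change; it assumes leftover snapshots follow the `session_<id>.json` naming used by the removed code.

```python
from pathlib import Path


def find_legacy_snapshots(sessions_dir: Path) -> tuple[list[Path], int]:
    """Return (files, total_bytes) for leftover session_<id>.json snapshots.

    Hypothetical helper. The pattern requires the underscore after "session",
    so sessions.json and request_dump_*.json are never matched.
    """
    if not sessions_dir.is_dir():
        return [], 0
    files = sorted(p for p in sessions_dir.glob("session_*.json") if p.is_file())
    total_bytes = sum(p.stat().st_size for p in files)
    return files, total_bytes
```

A call such as `find_legacy_snapshots(Path.home() / ".hermes" / "sessions")` only reports files and their combined size. It deletes nothing, and the path shown assumes the default home location.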
Author	SHA1	Message	Date
yoniebans	36d2bbe87e	test(session-log): pin no-session_json regression + drop trailing whitespace Adds TestNoSessionJsonSnapshot to lock the contract that session_log_file attribute, _save_session_log method, and the per-session JSON snapshot writer are gone. logs_dir is retained for request_dump_*.json. Also cleans up stray trailing whitespace in test_run_agent_codex_responses introduced when the _save_session_log stub line was deleted.	2026-05-20 11:30:08 +02:00
yoniebans	27ceb3850e	refactor(session-log): delete dead _clean_session_content helper Only caller was the removed _save_session_log. Also removes the unused convert_scratchpad_to_think and has_incomplete_scratchpad imports from run_agent.py (both still used elsewhere via their own imports).	2026-05-20 10:24:29 +02:00
yoniebans	66685a6f85	docs(session-log): state.db is canonical; ~/.hermes/sessions/ is legacy	2026-05-20 09:19:46 +02:00
yoniebans	cb1b951691	refactor(session-log): drop branch/compress re-point of session_log_file The attribute no longer exists; nothing to re-point.	2026-05-20 09:19:13 +02:00
yoniebans	41d584d0d1	refactor(session-log): stop initializing session_log_file attribute	2026-05-20 09:19:09 +02:00
yoniebans	b8c60dc3d6	refactor(session-log): delete _save_session_log and all callers state.db now stores every message field the JSON snapshot stored. Removed the method, all 7 call-sites, and ~13 test stubs that suppressed its file I/O. Body is in git history if it ever needs to come back.	2026-05-20 09:18:05 +02:00
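
With the `_save_session_log` stubs gone, the updated tests in the diff assert against the stubbed database flush instead of a captured JSON log. A minimal sketch of that pattern follows. `RecordingAgent` and `persist` are hypothetical simplifications, and only the names `_flush_messages_to_session_db` and `flushed_session_db_messages` mirror the diff.

```python
from typing import Any, Dict, List, Optional

Message = Dict[str, Any]


class RecordingAgent:
    """Hypothetical test double; only the flush method name mirrors the diff."""

    def __init__(self) -> None:
        self.flushed_session_db_messages: List[List[Message]] = []

    def _flush_messages_to_session_db(
        self,
        messages: List[Message],
        conversation_history: Optional[List[Message]] = None,
    ) -> None:
        # Copy each message so later mutation cannot change what was recorded.
        self.flushed_session_db_messages.append([m.copy() for m in messages])


def persist(agent: RecordingAgent, messages: List[Message]) -> None:
    """Simplified stand-in for _persist_session: the DB flush is the only sink."""
    agent._flush_messages_to_session_db(messages, conversation_history=[])


agent = RecordingAgent()
messages = [{"role": "user", "content": "run the task"}]
persist(agent, messages)
assert agent.flushed_session_db_messages[-1] == messages
```

Recording copies rather than references means an assertion made after the call still sees the messages as they were at flush time.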