hermes-agent/tests/tools/test_mcp_tool_401_handling.py

140 lines
4.5 KiB
Python

fix(mcp): consolidate OAuth handling, pick up external token refreshes (#11383)

* feat(mcp-oauth): scaffold MCPOAuthManager

Central manager for per-server MCP OAuth state. Provides get_or_build_provider (cached), remove (evicts cache + deletes disk), invalidate_if_disk_changed (mtime watch, core fix for external-refresh workflow), and handle_401 (dedup'd recovery).

No behavior change yet: existing call sites still use build_oauth_auth directly.

Task 1 of 8 in the MCP OAuth consolidation (fixes Cthulhu's BetterStack reliability issues).

* feat(mcp-oauth): add HermesMCPOAuthProvider with pre-flow disk watch

Subclasses the MCP SDK's OAuthClientProvider to inject a disk mtime check before every async_auth_flow, via the central manager. When a subclass instance is used, external token refreshes (cron, another CLI instance) are picked up before the next API call.

Still dead code: the manager's _build_provider still delegates to build_oauth_auth and returns the plain OAuthClientProvider. Task 4 wires this subclass in.

Task 2 of 8.

* refactor(mcp-oauth): extract build_oauth_auth helpers

Decomposes build_oauth_auth into _configure_callback_port, _build_client_metadata, _maybe_preregister_client, and _parse_base_url. Public API preserved. These helpers let MCPOAuthManager._build_provider reuse the same logic in Task 4 instead of duplicating the construction dance.

Also updates the SDK version hint in the warning from 1.10.0 to 1.26.0 (which is what we actually require for the OAuth types used here).

Task 3 of 8.

* feat(mcp-oauth): manager now builds HermesMCPOAuthProvider directly

_build_provider constructs the disk-watching subclass using the helpers from Task 3, instead of delegating to the plain build_oauth_auth factory. Any consumer using the manager now gets pre-flow disk-freshness checks automatically.

build_oauth_auth is preserved as the public API for backwards compatibility.

The code path is now:

    MCPOAuthManager.get_or_build_provider
      -> _build_provider
        -> _configure_callback_port
           _build_client_metadata
           _maybe_preregister_client
           _parse_base_url
           HermesMCPOAuthProvider(...)

Task 4 of 8.

* feat(mcp): wire OAuth manager + add _reconnect_event

MCPServerTask gains _reconnect_event alongside _shutdown_event. When set, _run_http / _run_stdio exit their async-with blocks cleanly (no exception), and the outer run() loop re-enters the transport to rebuild the MCP session with fresh credentials. This is the recovery path for OAuth failures that the SDK's in-place httpx.Auth cannot handle (e.g. cron externally consumed the refresh_token, or server-side session invalidation).

_run_http now asks MCPOAuthManager for the OAuth provider instead of calling build_oauth_auth directly. Config-time, runtime, and reconnect paths all share one provider instance with pre-flow disk-watch active.

shutdown() defensively sets both events so there is no race between reconnect and shutdown signalling.

Task 5 of 8.

* feat(mcp): detect auth failures in tool handlers, trigger reconnect

All 5 MCP tool handlers (tool call, list_resources, read_resource, list_prompts, get_prompt) now detect auth failures and route through MCPOAuthManager.handle_401:

1. If the manager says recovery is viable (disk has fresh tokens, or SDK can refresh in-place), signal MCPServerTask._reconnect_event to tear down and rebuild the MCP session with fresh credentials, then retry the tool call once.
2. If no recovery path exists, return a structured needs_reauth JSON error so the model stops hallucinating manual refresh attempts (the 'let me curl the token endpoint' loop Cthulhu pasted from Discord).

_is_auth_error catches OAuthFlowError, OAuthTokenError, OAuthNonInteractiveError, and httpx.HTTPStatusError(401). Non-auth exceptions still surface via the generic error path unchanged.

Task 6 of 8.

* feat(mcp-cli): route add/remove through manager, add 'hermes mcp login'

cmd_mcp_add and cmd_mcp_remove now go through MCPOAuthManager instead of calling build_oauth_auth / remove_oauth_tokens directly. This means CLI config-time state and runtime MCP session state are backed by the same provider cache: removing a server evicts the live provider, and adding a server populates the same cache the MCP session will read from.

New 'hermes mcp login <name>' command:

- Wipes both the on-disk tokens file and the in-memory MCPOAuthManager cache
- Triggers a fresh OAuth browser flow via the existing probe path
- Intended target for the needs_reauth error Task 6 returns to the model

Task 7 of 8.

* test(mcp-oauth): end-to-end integration tests

Five new tests exercising the full consolidation with real file I/O and real imports (no transport mocks):

1. external_refresh_picked_up_without_restart: Cthulhu's cron workflow. External process writes fresh tokens to disk; on the next auth flow the manager's mtime-watch flips _initialized and the SDK re-reads from storage.
2. handle_401_deduplicates_concurrent_callers: 10 concurrent handlers for the same failed token fire exactly ONE recovery attempt (thundering-herd protection).
3. handle_401_returns_false_when_no_provider: defensive path for unknown servers.
4. invalidate_if_disk_changed_handles_missing_file: pre-auth state returns False cleanly.
5. provider_is_reused_across_reconnects: cache stickiness so reconnects preserve the disk-watch baseline mtime.

Task 8 of 8. Consolidation complete.
2026-04-16 21:57:10 -07:00
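
The commit message describes handle_401 as "dedup'd recovery": many concurrent handlers that hit the same failure should trigger only one recovery attempt. The manager's actual implementation is not shown here, so the following is only a minimal sketch of one way to coalesce concurrent callers with asyncio. `RecoveryDeduper`, `recover_once`, `fake_recover`, and `demo` are hypothetical names for illustration, not part of hermes-agent.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple


class RecoveryDeduper:
    """Hypothetical helper: coalesce concurrent recovery attempts per server."""

    def __init__(self, recover: Callable[[str], Awaitable[bool]]) -> None:
        self._recover = recover
        self._in_flight: Dict[str, "asyncio.Future[bool]"] = {}

    async def recover_once(self, server_name: str) -> bool:
        pending = self._in_flight.get(server_name)
        if pending is not None:
            # A recovery is already running: share its result.
            # shield() keeps one cancelled waiter from cancelling the shared future.
            return await asyncio.shield(pending)

        future: "asyncio.Future[bool]" = asyncio.get_running_loop().create_future()
        self._in_flight[server_name] = future
        ok = False
        try:
            ok = await self._recover(server_name)
        except Exception:
            # Assumption: a failed recovery is reported as "not recoverable".
            ok = False
        finally:
            self._in_flight.pop(server_name, None)
            if not future.done():
                future.set_result(ok)
        return ok


async def demo() -> Tuple[int, List[bool]]:
    """Fire 10 concurrent callers and count how often recovery really runs."""
    calls = 0

    async def fake_recover(server_name: str) -> bool:
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.01)  # simulate a slow token refresh
        return True

    deduper = RecoveryDeduper(fake_recover)
    results = await asyncio.gather(
        *(deduper.recover_once("srv") for _ in range(10))
    )
    return calls, list(results)
```

Running `asyncio.run(demo())` exercises the same shape as the handle_401_deduplicates_concurrent_callers test described above: the first caller starts the recovery and the rest await its result.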
"""Tests for MCP tool-handler auth-failure detection.

When a tool call raises UnauthorizedError / OAuthNonInteractiveError /
httpx.HTTPStatusError(401), the handler should:
  1. Ask MCPOAuthManager.handle_401 if recovery is viable.
  2. If yes, trigger MCPServerTask._reconnect_event and retry once.
  3. If no, return a structured needs_reauth error so the model stops
     hallucinating manual refresh attempts.
"""
import json
from unittest.mock import MagicMock

import pytest

pytest.importorskip("mcp.client.auth.oauth2")


def test_is_auth_error_detects_oauth_flow_error():
    from tools.mcp_tool import _is_auth_error
    from mcp.client.auth import OAuthFlowError

    assert _is_auth_error(OAuthFlowError("expired")) is True


def test_is_auth_error_detects_oauth_non_interactive():
    from tools.mcp_tool import _is_auth_error
    from tools.mcp_oauth import OAuthNonInteractiveError

    assert _is_auth_error(OAuthNonInteractiveError("no browser")) is True


def test_is_auth_error_detects_httpx_401():
    from tools.mcp_tool import _is_auth_error
    import httpx

    response = MagicMock()
    response.status_code = 401
    exc = httpx.HTTPStatusError("unauth", request=MagicMock(), response=response)
    assert _is_auth_error(exc) is True


def test_is_auth_error_rejects_httpx_500():
    from tools.mcp_tool import _is_auth_error
    import httpx

    response = MagicMock()
    response.status_code = 500
    exc = httpx.HTTPStatusError("oops", request=MagicMock(), response=response)
    assert _is_auth_error(exc) is False


def test_is_auth_error_rejects_generic_exception():
    from tools.mcp_tool import _is_auth_error

    assert _is_auth_error(ValueError("not auth")) is False
    assert _is_auth_error(RuntimeError("not auth")) is False


def test_call_tool_handler_returns_needs_reauth_on_unrecoverable_401(monkeypatch, tmp_path):
    """When session.call_tool raises 401 and handle_401 returns False,
    handler returns a structured needs_reauth error (not a generic failure)."""
    monkeypatch.setenv("HERMES_HOME", str(tmp_path))
    from tools.mcp_tool import _make_tool_handler
    from tools.mcp_oauth_manager import get_manager, reset_manager_for_tests
    from mcp.client.auth import OAuthFlowError

    reset_manager_for_tests()

    # Stub server
    server = MagicMock()
    server.name = "srv"
    session = MagicMock()

    async def _call_tool_raises(*a, **kw):
        raise OAuthFlowError("token expired")

    session.call_tool = _call_tool_raises
    server.session = session
    server._reconnect_event = MagicMock()
    server._ready = MagicMock()
    server._ready.is_set.return_value = True

    from tools import mcp_tool
    mcp_tool._servers["srv"] = server
    mcp_tool._server_error_counts.pop("srv", None)

    # Ensure the MCP loop exists (run_on_mcp_loop needs it)
    mcp_tool._ensure_mcp_loop()

    # Force handle_401 to return False (no recovery available)
    mgr = get_manager()

    async def _h401(name, token=None):
        return False

    monkeypatch.setattr(mgr, "handle_401", _h401)

    try:
        handler = _make_tool_handler("srv", "tool1", 10.0)
        result = handler({"arg": "v"})
        parsed = json.loads(result)
        assert parsed.get("needs_reauth") is True, f"expected needs_reauth, got: {parsed}"
        assert parsed.get("server") == "srv"
        assert "re-auth" in parsed.get("error", "").lower() or "reauth" in parsed.get("error", "").lower()
    finally:
        mcp_tool._servers.pop("srv", None)
        mcp_tool._server_error_counts.pop("srv", None)


def test_call_tool_handler_non_auth_error_still_generic(monkeypatch, tmp_path):
    """Non-auth exceptions still surface via the generic error path, not needs_reauth."""
    monkeypatch.setenv("HERMES_HOME", str(tmp_path))
    from tools.mcp_tool import _make_tool_handler

    server = MagicMock()
    server.name = "srv"
    session = MagicMock()

    async def _raises(*a, **kw):
        raise RuntimeError("unrelated")

    session.call_tool = _raises
    server.session = session

    from tools import mcp_tool
    mcp_tool._servers["srv"] = server
    mcp_tool._server_error_counts.pop("srv", None)
    mcp_tool._ensure_mcp_loop()

    try:
        handler = _make_tool_handler("srv", "tool1", 10.0)
        result = handler({"arg": "v"})
        parsed = json.loads(result)
        assert "needs_reauth" not in parsed
        assert "MCP call failed" in parsed.get("error", "")
    finally:
        mcp_tool._servers.pop("srv", None)
        mcp_tool._server_error_counts.pop("srv", None)
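

The tests above pin down the observable contract: an unrecoverable auth failure yields JSON with `needs_reauth`, `server`, and an `error` mentioning re-auth, while a recoverable one is retried once after a reconnect. The handler itself is not shown in this file, so the following is a rough, self-contained sketch of that control flow under stated assumptions. `FakeAuthError`, `needs_reauth_payload`, `call_with_auth_recovery`, the `"result"` key, and the error wording are all hypothetical stand-ins, not the hermes-agent implementation.

```python
import asyncio
import json
from typing import Any, Awaitable, Callable


class FakeAuthError(Exception):
    """Stand-in for an auth failure (e.g. OAuthFlowError or an HTTP 401)."""


def needs_reauth_payload(server_name: str) -> str:
    # Field names mirror what the tests assert on; the wording is assumed.
    return json.dumps({
        "error": f"MCP server '{server_name}' requires re-authentication",
        "needs_reauth": True,
        "server": server_name,
    })


async def call_with_auth_recovery(
    server_name: str,
    call: Callable[[], Awaitable[Any]],
    can_recover: Callable[[str], Awaitable[bool]],
    reconnect: Callable[[], Awaitable[None]],
) -> str:
    """Try the call; on auth failure, recover and retry once or give up."""
    try:
        return json.dumps({"result": await call()})
    except FakeAuthError:
        # Step 1: ask whether recovery is viable at all.
        if not await can_recover(server_name):
            # Step 3: no recovery path, so return a structured error.
            return needs_reauth_payload(server_name)

    # Step 2: rebuild the session, then retry exactly once.
    await reconnect()
    try:
        return json.dumps({"result": await call()})
    except FakeAuthError:
        return needs_reauth_payload(server_name)
```

Non-auth exceptions are deliberately not caught in this sketch, which matches the second test's expectation that they stay on the generic error path.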