mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-04-30 07:51:45 +08:00
* perf(startup): lazy-import OpenAI, Anthropic, Firecrawl, account_usage
Four heavy SDK/module imports are now deferred off the hot startup path.
Net savings on cold module imports:
cli 1200 → 958 ms (-242)
run_agent 1220 → 901 ms (-319)
tools.web_tools 711 → 423 ms (-288)
agent.anthropic_adapter 230 → 15 ms (-215)
agent.auxiliary_client 253 → 68 ms (-185)
Four independent changes in one PR since they all use the same pattern
and share the same risk profile (heavy SDK import → lazy proxy or
function-local import):
1. tools/web_tools.py:
'from firecrawl import Firecrawl' moved into _get_firecrawl_client(),
which is only called when backend='firecrawl'. Users on Exa/Tavily/
Parallel pay zero firecrawl cost.
2. cli.py + gateway/run.py:
'from agent.account_usage import ...' moved into the /limits handlers.
account_usage transitively pulls the OpenAI SDK chain; only needed
when the user runs /limits.
3. agent/anthropic_adapter.py:
'try: import anthropic as _anthropic_sdk' replaced with a cached
'_get_anthropic_sdk()' accessor. The three usage sites
(build_anthropic_client, build_anthropic_bedrock_client,
read_claude_code_credentials_from_keychain) now resolve via the
accessor. All pre-existing test patches of
'agent.anthropic_adapter._anthropic_sdk' keep working because the
accessor respects any value already in module globals.
4. agent/auxiliary_client.py AND run_agent.py:
'from openai import OpenAI' replaced with an '_OpenAIProxy()' module-
level object that looks like the OpenAI class but imports the SDK on
first call/isinstance check. This preserves:
- 15+ in-module OpenAI(...) construction sites in auxiliary_client
and the single site in run_agent's _create_openai_client (Python's
function-scope name lookup finds the proxy, forwards the call);
- 'patch("agent.auxiliary_client.OpenAI", ...)' and
'patch("run_agent.OpenAI", ...)' test patterns used by 28+ test
files (patch replaces the module attribute as usual).
Tried two alternatives first:
- 'from openai._client import OpenAI' — doesn't skip openai/__init__.py
(the audit's hypothesis here was wrong).
- Module-level __getattr__ — works for external access but Python
function-scope name resolution skips __getattr__, so in-module
OpenAI(...) calls NameError.
Note: 'openai' still loads on 'import cli' because
cli.py -> neuter_async_httpx_del() -> openai._base_client, and
run_agent.py -> code_execution_tool.py (module-level
build_execute_code_schema) -> _load_config() -> 'from cli import
CLI_CONFIG'. Deferring those is a separate, larger change — out of scope
for this PR. The savings above all come from avoiding the openai/*,
anthropic/*, and firecrawl/* top-level type-tree imports on paths that
don't need them.
Verified:
- 302/302 tests in tests/agent/{test_anthropic_adapter,
test_bedrock_1m_context, test_minimax_provider, test_anthropic_keychain}
pass. Two pre-existing failures on main unchanged.
- 106/106 tests/agent/test_auxiliary_client.py pass (1 pre-existing fail).
- 97/97 tests/run_agent/test_create_openai_client_kwargs_isolation.py,
test_plugin_context_engine_init.py, test_invalid_context_length_warning.py,
test_api_max_retries_config.py,
tests/hermes_cli/test_gemini_provider.py, test_ollama_cloud_provider.py
pass (1 pre-existing fail).
- Live hermes chat smoke: 2 turns + /model switch + tool calls, zero
errors in the 57-line agent.log window.
- Module-level import of run_agent + auxiliary_client + anthropic_adapter
no longer pulls 'anthropic' or 'firecrawl' at all.
* fix(gateway): restore top-level account_usage import for test-patch surface
CI caught two failures in tests/gateway/test_usage_command.py that I
missed locally:
AttributeError: 'module' object at gateway.run has no attribute 'fetch_account_usage'
The test uses monkeypatch.setattr('gateway.run.fetch_account_usage', ...)
to inject a fake account-fetch call. Moving the import inside the
handler deleted that module-level attribute, breaking the patch surface.
Restoring the top-level import in gateway/run.py gives up the ~230 ms
gateway-boot savings from that one lazy, but:
1. the gateway is a long-running daemon — boot cost is paid once per
install, not per turn;
2. the other four lazy-imports (firecrawl, openai, anthropic, cli's
account_usage) remain in place and still account for the bulk of
the savings reported in the PR body;
3. preserving the patch surface keeps the established
'gateway.run.fetch_account_usage' monkeypatch pattern working
without touching tests.
Verified: tests/gateway/test_usage_command.py — 8 passed, 0 failed.
Full targeted sweep (2336 tests across agent/gateway/hermes_cli/run_agent):
2332 passed, 4 failed — all 4 pre-existing on main.
---------
Co-authored-by: teknium1 <teknium@users.noreply.github.com>