Mirror of https://github.com/NousResearch/hermes-agent.git (synced 2026-04-28 06:51:16 +08:00)
docs: add Environments, Benchmarks & Data Generation guide
Comprehensive developer guide covering:
- Architecture (BaseEnv → HermesAgentBaseEnv → concrete envs)
- All three benchmarks (TerminalBench2, TBLite, YC-Bench)
- Training environments (TerminalTestEnv, HermesSweEnv)
- Core components (AgentLoop, ToolContext, Tool Call Parsers)
- Two-phase operation (Phase 1 OpenAI, Phase 2 VLLM)
- Running environments (evaluate, process, serve modes)
- Creating new environments (training + eval-only)
- Configuration reference and prerequisites

Also updates environments/README.md directory tree to include TBLite and YC-Bench benchmarks.
@@ -195,8 +195,12 @@ environments/
 │   └── hermes_swe_env.py
 │
 └── benchmarks/                    # Evaluation benchmarks
-    └── terminalbench_2/
-        └── terminalbench2_env.py
+    ├── terminalbench_2/           # 89 terminal tasks, Modal sandboxes
+    │   └── terminalbench2_env.py
+    ├── tblite/                    # 100 calibrated tasks (fast TB2 proxy)
+    │   └── tblite_env.py
+    └── yc_bench/                  # Long-horizon strategic benchmark
+        └── yc_bench_env.py
 ```
 
 ## Concrete Environments