update trajectory writing

alt-glitch
2026-04-01 22:48:22 -07:00
parent 0e459f2b7b
commit 59471b79e5
2 changed files with 17 additions and 3 deletions

@@ -25,6 +25,8 @@ env:
- After making changes, always test them — run the test command, check the output.
- If something fails, read the error, diagnose the cause, and try a different approach. Do not give up or repeat the same failing command.
- Do not stop until you have verified your solution works.
+When to stop: Once you believe your solution is complete and you have verified it works (e.g. the program runs correctly, the output looks right, the file is in place), respond with a plain text message summarizing what you did. Do NOT make any more tool calls after that.
enabled_toolsets: ["terminal", "file"]
max_agent_turns: 60
max_token_length: 32000
@@ -34,7 +36,7 @@ env:
tool_pool_size: 128 # thread pool for 89 parallel tasks
dataset_name: "sidbin/terminal-bench-2-verified-flattened"
test_timeout: 600
-task_timeout: 1800 # 30 min wall-clock per task, auto-FAIL if exceeded
+task_timeout: 900 # 15 min wall-clock per task, auto-FAIL if exceeded
tokenizer_name: "NousResearch/Hermes-3-Llama-3.1-8B"
use_wandb: true
wandb_name: "terminal-bench-2"
@@ -47,7 +49,7 @@ env:
openai:
base_url: "https://openrouter.ai/api/v1"
-model_name: "openai/gpt-oss-120b:nitro"
+model_name: "qwen/qwen3.5-122b-a10b:nitro"
server_type: "openai"
health_check: false
timeout: 300 # 5 min per API call (default 1200s causes 20min stalls)