mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-08 03:37:13 +08:00
Compare commits (17 commits): `add-morph-...` → `fix-termin...`

Commit SHAs: a6ec79730c, faecbddd9b, de9c0edc51, 8d256779d8, d36790de91, a398d320b7, 22b6d5866c, 0e2e69a71d, bc5f0e62d9, 6fac6fecde, c42d9055ed, a7ff4d49e9, 0411ca1880, c5386ed7e6, 2082c7caa3, 17608c1142, c7fa4447b8
.cursorrules (new file, 23 lines)
@@ -0,0 +1,23 @@
Hermes-Agent is an agent harness for LLMs.

When building, the tool functionality lives in the tools/ directory, where each specific tool (or, in some cases, a group of tools built for the same execution category or API) gets its own script.

Each tool is then consolidated in the model_tools.py file in the repo root.

Sets of tools can also be consolidated in toolsets.py for the agent to use.

The primary agent runner code is in run_agent, but other runners could be developed using the tools and framework.

Always keep the tools, model_tools.py, and toolsets.py consistent when changing any of them; otherwise they can become desynced in a way that is detrimental to functionality.

The expected pathway for using API keys is to set them up in a .env file in the repo root.

Test scripts are placed in tests/.

The run_agent loop is set up to:
- Process the enabled toolsets to provide to the model,
- Pipe a prompt or problem from the input to the agent,
- Loop the LLM each time it calls a tool, until the model decides no more tools are needed and provides a natural language response,
- Return that response.

There is an additional caveat for logging: we restructure the "tools" as a system prompt for storage, in a format that can be used and handled properly later.
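The run_agent loop described above can be sketched as a minimal tool-calling cycle. This is illustrative only; `call_llm` and `execute_tool` are hypothetical stand-ins for the real runner internals in run_agent:

```python
def run_agent_loop(prompt, tools, call_llm, execute_tool, max_turns=10):
    """Minimal sketch of a tool-calling loop: keep invoking the LLM and
    executing any requested tools until it answers in natural language."""
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_turns):
        reply = call_llm(messages, tools)     # model sees tools + history
        messages.append(reply)
        if not reply.get("tool_calls"):       # plain answer -> done
            return reply["content"]
        for call in reply["tool_calls"]:      # run each requested tool
            result = execute_tool(call)
            messages.append({"role": "tool",
                             "tool_call_id": call["id"],
                             "content": result})
    return None  # turn budget exhausted
```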
.env.example (new file, 49 lines)
@@ -0,0 +1,49 @@
# Hermes Agent Environment Configuration
# Copy this file to .env and fill in your API keys
# Get API keys from the URLs listed below

# =============================================================================
# REQUIRED API KEYS
# =============================================================================

# Anthropic API Key - Main agent model
# Get at: https://console.anthropic.com/
ANTHROPIC_API_KEY=

# Firecrawl API Key - Web search, extract, and crawl
# Get at: https://firecrawl.dev/
FIRECRAWL_API_KEY=

# Nous Research API Key - Vision analysis and multi-model reasoning
# Get at: https://inference-api.nousresearch.com/
NOUS_API_KEY=

# Morph API Key - Terminal/command execution tools
# Get at: https://morph.so/
MORPH_API_KEY=

# FAL.ai API Key - Image generation
# Get at: https://fal.ai/
FAL_KEY=

# =============================================================================
# OPTIONAL API KEYS
# =============================================================================

# OpenAI API Key - Optional, for enhanced Hecate features
# Get at: https://platform.openai.com/
OPENAI_API_KEY=

# =============================================================================
# OPTIONAL CONFIGURATION
# =============================================================================

# Terminal Tool Settings
HECATE_VM_LIFETIME_SECONDS=300
HECATE_DEFAULT_SNAPSHOT_ID=snapshot_p5294qxt

# Debug Logging (set to "true" to enable, logs saved to ./logs/)
WEB_TOOLS_DEBUG=false
VISION_TOOLS_DEBUG=false
MOA_TOOLS_DEBUG=false
IMAGE_TOOLS_DEBUG=false
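The KEY=VALUE format above can be consumed with a stdlib-only loader sketch. (The repo may well rely on a library such as python-dotenv instead; this just illustrates the format and the "existing environment wins" convention.)

```python
import os

def load_env_file(path=".env"):
    """Minimal .env loader sketch: KEY=VALUE lines, '#' comments and
    blank lines ignored; variables already in the environment win."""
    if not os.path.exists(path):
        return
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```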
.gitignore (vendored, 5 lines added)
@@ -16,3 +16,8 @@ __pycache__/
export*
__pycache__/model_tools.cpython-310.pyc
__pycache__/web_tools.cpython-310.pyc
logs/
data/
.pytest_cache/
tmp/
temp_vision_images/
README.md (230 lines changed)
@@ -1,13 +1,99 @@
# Hermes Agent

An AI agent with advanced tool-calling capabilities, featuring a flexible toolsets system for organizing and managing tools.

## Features

- **Web Tools**: Search, extract content, and crawl websites
- **Terminal Tools**: Execute commands with interactive session support
- **Vision Tools**: Analyze images from URLs
- **Reasoning Tools**: Advanced multi-model reasoning (Mixture of Agents)
- **Creative Tools**: Generate images from text prompts
- **Toolsets System**: Organize tools into logical groups for different scenarios
- **Batch Processing**: Process datasets in parallel with checkpointing and statistics tracking
- **Ephemeral System Prompts**: Guide model behavior without polluting training datasets

## Setup
### 1. Install Dependencies
```bash
# Create and activate virtual environment (recommended)
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install required packages
pip install -r requirements.txt

# Install Hecate for terminal tools
git clone git@github.com:NousResearch/hecate.git
cd hecate
pip install -e .
cd ..
```

### 2. Configure Environment Variables
```bash
# Copy the example environment file
cp .env.example .env

# Edit .env and add your API keys
nano .env  # or use your preferred editor
```

**Required API Keys:**
- `ANTHROPIC_API_KEY` - Main agent model (get at: https://console.anthropic.com/)
- `FIRECRAWL_API_KEY` - Web tools (get at: https://firecrawl.dev/)
- `NOUS_API_KEY` - Vision & reasoning tools (get at: https://inference-api.nousresearch.com/)
- `MORPH_API_KEY` - Terminal tools (get at: https://morph.so/)
- `FAL_KEY` - Image generation (get at: https://fal.ai/)
- `OPENAI_API_KEY` - Optional, for some Hecate features

See `.env.example` for all available configuration options, including debug settings and terminal tool configuration.
## Toolsets System

The agent uses a toolsets system for organizing and managing tools. All tools must be part of a toolset to be accessible; individual tool selection is not supported. This ensures consistent and logical grouping of capabilities.

### Key Concepts

- **Toolsets**: Logical groups of tools for specific use cases (e.g., "research", "development", "debugging")
- **Composition**: Toolsets can include other toolsets for powerful combinations
- **Custom Toolsets**: Create your own toolsets at runtime or by editing `toolsets.py`
- **Toolset-Only Access**: Tools are only accessible through toolsets, not individually

### Available Toolsets

See `toolsets.py` for the complete list of predefined toolsets, including:
- Basic toolsets (web, terminal, vision, creative, reasoning)
- Composite toolsets (research, development, analysis, etc.)
- Scenario-specific toolsets (debugging, documentation, API testing, etc.)
- Special toolsets (safe mode without terminal, minimal, offline)

### Using Toolsets

```bash
# Use a predefined toolset
python run_agent.py --enabled_toolsets=research --query "Find latest AI papers"

# Combine multiple toolsets
python run_agent.py --enabled_toolsets=web,vision --query "Analyze this website"

# Enable all toolsets explicitly (same as omitting the flag)
python run_agent.py --enabled_toolsets=all --query "Do web research and run commands if helpful"

# Safe mode (no terminal access)
python run_agent.py --enabled_toolsets=safe --query "Help without running commands"

# List all available toolsets and tools
python run_agent.py --list_tools
```

For detailed documentation on toolsets, see `TOOLSETS_README.md`.
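Toolset composition can be pictured as recursive expansion of a toolset's includes. The sketch below illustrates the concept only; the actual resolution logic and registry shape live in `toolsets.py`, and the `registry` layout here is an assumption:

```python
def resolve_toolset(name, registry, seen=None):
    """Expand a toolset into its full tool list, following nested
    includes and guarding against cycles. `registry` maps toolset
    name -> {"tools": [...], "includes": [...]} (hypothetical shape)."""
    seen = set() if seen is None else seen
    if name in seen:
        return []          # cycle guard: already expanded
    seen.add(name)
    entry = registry[name]
    tools = list(entry.get("tools", []))
    for included in entry.get("includes", []):
        tools.extend(resolve_toolset(included, registry, seen))
    # de-duplicate while preserving first-seen order
    return list(dict.fromkeys(tools))
```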
## Basic Usage

### Default (all tools enabled)
```bash
python run_agent.py \
    --query "search up the latest docs on jit in python 3.13 and write me basic example that's not in their docs. profile its perf" \
    --max_turns 20 \
@@ -15,3 +101,143 @@ python run_agent.py \
    --base_url https://api.anthropic.com/v1/ \
    --api_key $ANTHROPIC_API_KEY
```

### With specific toolset
```bash
python run_agent.py \
    --query "Debug this Python error" \
    --enabled_toolsets=debugging \
    --model claude-sonnet-4-20250514 \
    --api_key $ANTHROPIC_API_KEY
```

### Python API
```python
from run_agent import AIAgent

# Use a specific toolset
agent = AIAgent(
    model="claude-opus-4-20250514",
    enabled_toolsets=["research"]
)
response = agent.chat("Find information about quantum computing")

# Create custom toolset at runtime
from toolsets import create_custom_toolset

create_custom_toolset(
    name="my_tools",
    description="My custom toolkit",
    tools=["web_search"],
    includes=["terminal", "vision"]
)

agent = AIAgent(enabled_toolsets=["my_tools"])
```
## Batch Processing

Process multiple prompts from a dataset in parallel with automatic checkpointing and statistics tracking:

```bash
# Basic batch processing
python batch_runner.py \
    --dataset_file=prompts.jsonl \
    --batch_size=20 \
    --run_name=my_run

# With specific distribution
python batch_runner.py \
    --dataset_file=prompts.jsonl \
    --batch_size=20 \
    --run_name=image_run \
    --distribution=image_gen \
    --num_workers=4
```

**Key Features:**
- Parallel processing with configurable workers
- Toolset distributions for varied data generation
- Automatic checkpointing and resume capability
- Combined output in `data/<run_name>/trajectories.jsonl`
- Tool usage statistics and success rates

**Quick Start:** See [QUICKSTART_BATCH.md](QUICKSTART_BATCH.md) for a 5-minute getting started guide.
**Full Documentation:** See [BATCH_PROCESSING.md](BATCH_PROCESSING.md) for comprehensive documentation.

### Ephemeral System Prompts

The ephemeral system prompt feature allows you to guide the model's behavior during batch processing **without** saving that prompt to the training dataset trajectories. This is useful for:

- Guiding model behavior during data collection
- Adding task-specific instructions
- Keeping saved trajectories clean and focused on tool-calling format

**Example:**
```bash
python batch_runner.py \
    --dataset_file=prompts.jsonl \
    --batch_size=10 \
    --run_name=my_run \
    --ephemeral_system_prompt="You are a helpful assistant focused on image generation."
```

The ephemeral prompt will influence the model's behavior during execution, but **only the standard tool-calling system prompt** will be saved in the trajectory files.

**Documentation:** See [docs/ephemeral_system_prompt.md](docs/ephemeral_system_prompt.md) for complete details.
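The batch runner expects each line of the dataset file to be a JSON object with a `prompt` field (lines without one are skipped). A minimal `prompts.jsonl` can be produced like this (the prompt strings are just illustrative):

```python
import json

prompts = [
    {"prompt": "Summarize the latest Python 3.13 release notes."},
    {"prompt": "Generate an image of a sunset over mountains."},
]

# One JSON object per line: the JSONL format batch_runner.py loads
with open("prompts.jsonl", "w", encoding="utf-8") as f:
    for entry in prompts:
        f.write(json.dumps(entry, ensure_ascii=False) + "\n")
```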
## Command Line Arguments

**Single Agent (`run_agent.py`):**
- `--query`: The question or task for the agent
- `--model`: Model to use (default: claude-opus-4-20250514)
- `--api_key`: API key for authentication
- `--base_url`: API endpoint URL
- `--max_turns`: Maximum number of tool-calling iterations
- `--enabled_toolsets`: Comma-separated list of toolsets to enable. Use `all` (or `*`) to enable everything. If omitted, all toolsets are enabled by default.
- `--disabled_toolsets`: Comma-separated list of toolsets to disable
- `--list_tools`: List all available toolsets and tools
- `--save_trajectories`: Save conversation trajectories to JSONL files

**Batch Processing (`batch_runner.py`):**
- `--dataset_file`: Path to JSONL file with prompts
- `--batch_size`: Number of prompts per batch
- `--run_name`: Name for this run (for output/checkpointing)
- `--distribution`: Toolset distribution to use (default: "default")
- `--num_workers`: Number of parallel workers (default: 4)
- `--resume`: Resume from checkpoint if interrupted
- `--ephemeral_system_prompt`: System prompt used during execution but NOT saved to trajectories
- `--list_distributions`: List available toolset distributions

## Environment Variables

All environment variables can be configured in the `.env` file (copy from `.env.example`).

**Core API Keys:**
- `ANTHROPIC_API_KEY`: Main agent model
- `FIRECRAWL_API_KEY`: Web tools (search, extract, crawl)
- `NOUS_API_KEY`: Vision and reasoning tools
- `MORPH_API_KEY`: Terminal tools
- `FAL_KEY`: Image generation tools
- `OPENAI_API_KEY`: Optional, for some Hecate features

**Configuration Options:**
- `HECATE_VM_LIFETIME_SECONDS`: VM lifetime (default: 300)
- `HECATE_DEFAULT_SNAPSHOT_ID`: Default snapshot (default: snapshot_p5294qxt)
- `WEB_TOOLS_DEBUG`, `VISION_TOOLS_DEBUG`, `MOA_TOOLS_DEBUG`, `IMAGE_TOOLS_DEBUG`: Enable debug logging

## Documentation

**Single Agent Usage:**
- `TOOLSETS_README.md`: Comprehensive guide to the toolsets system
- `toolsets.py`: View and modify available toolsets
- `model_tools.py`: Core tool definitions and handlers

**Batch Processing:**
- `QUICKSTART_BATCH.md`: 5-minute quick start guide
- `BATCH_PROCESSING.md`: Complete batch processing documentation
- `toolset_distributions.py`: Toolset distributions for data generation

## Examples

See `TOOLSETS_README.md` for extensive examples of using different toolsets for various scenarios.
batch_runner.py (new file, 746 lines)
@@ -0,0 +1,746 @@
#!/usr/bin/env python3
"""
Batch Agent Runner

This module provides parallel batch processing capabilities for running the agent
across multiple prompts from a dataset. It includes:
- Dataset loading and batching
- Parallel batch processing with multiprocessing
- Checkpointing for fault tolerance and resumption
- Trajectory saving in the proper format (from/value pairs)
- Tool usage statistics aggregation across all batches

Usage:
    python batch_runner.py --dataset_file=data.jsonl --batch_size=10 --run_name=my_run

    # Resume an interrupted run
    python batch_runner.py --dataset_file=data.jsonl --batch_size=10 --run_name=my_run --resume

    # Use a specific toolset distribution
    python batch_runner.py --dataset_file=data.jsonl --batch_size=10 --run_name=my_run --distribution=image_gen
"""

import json
import logging
import os
import time
from pathlib import Path
from typing import List, Dict, Any, Optional, Tuple
from datetime import datetime
from multiprocessing import Pool
from multiprocessing.synchronize import Lock
import traceback

import fire

from run_agent import AIAgent
from toolset_distributions import (
    get_distribution,
    list_distributions,
    sample_toolsets_from_distribution,
    validate_distribution
)


# Global configuration for worker processes
_WORKER_CONFIG = {}
def _extract_tool_stats(messages: List[Dict[str, Any]]) -> Dict[str, Dict[str, int]]:
    """
    Extract tool usage statistics from message history.

    Args:
        messages (List[Dict]): Message history

    Returns:
        Dict: Tool statistics with counts and success/failure rates
    """
    tool_stats = {}

    # Track tool calls and their results
    tool_calls_map = {}  # Map tool_call_id to tool name

    for msg in messages:
        # Track tool calls from assistant messages
        if msg["role"] == "assistant" and "tool_calls" in msg and msg["tool_calls"]:
            for tool_call in msg["tool_calls"]:
                tool_name = tool_call["function"]["name"]
                tool_call_id = tool_call["id"]

                # Initialize stats for this tool if not exists
                if tool_name not in tool_stats:
                    tool_stats[tool_name] = {
                        "count": 0,
                        "success": 0,
                        "failure": 0
                    }

                tool_stats[tool_name]["count"] += 1
                tool_calls_map[tool_call_id] = tool_name

        # Track tool responses
        elif msg["role"] == "tool":
            tool_call_id = msg.get("tool_call_id", "")
            content = msg.get("content", "")

            # Determine if tool call was successful
            is_success = True
            try:
                # Try to parse as JSON and check for actual error values
                content_json = json.loads(content) if isinstance(content, str) else content

                if isinstance(content_json, dict):
                    # Check if error field exists AND has a non-null value
                    if "error" in content_json and content_json["error"] is not None:
                        is_success = False

                    # Special handling for terminal tool responses:
                    # the terminal wraps its response in a "content" field
                    if "content" in content_json and isinstance(content_json["content"], dict):
                        inner_content = content_json["content"]
                        # Check for actual error (non-null error field or non-zero exit code)
                        has_error = (inner_content.get("error") is not None or
                                     inner_content.get("exit_code", 0) != 0)
                        if has_error:
                            is_success = False

                    # Check for "success": false pattern used by some tools
                    if content_json.get("success") is False:
                        is_success = False

            except (json.JSONDecodeError, TypeError):
                # If not JSON, check if content is empty or explicitly states an error.
                # Note: we avoid simple substring matching to prevent false positives.
                if not content:
                    is_success = False
                # Only mark as failure if it explicitly starts with "Error:" or "ERROR:"
                elif content.strip().lower().startswith("error:"):
                    is_success = False

            # Update success/failure count
            if tool_call_id in tool_calls_map:
                tool_name = tool_calls_map[tool_call_id]
                if is_success:
                    tool_stats[tool_name]["success"] += 1
                else:
                    tool_stats[tool_name]["failure"] += 1

    return tool_stats
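The success-detection rules above can be condensed into a standalone predicate. This is a sketch mirroring the logic for illustration, not part of the module's API:

```python
import json

def looks_successful(content):
    """Condensed sketch of the success heuristic: JSON payloads fail on a
    non-null "error", a "success": false flag, or terminal-style inner
    content with an error / non-zero exit code; plain text fails only
    when empty or explicitly starting with "error:"."""
    try:
        payload = json.loads(content) if isinstance(content, str) else content
        if isinstance(payload, dict):
            if payload.get("error") is not None:
                return False
            inner = payload.get("content")
            if isinstance(inner, dict) and (
                inner.get("error") is not None or inner.get("exit_code", 0) != 0
            ):
                return False
            if payload.get("success") is False:
                return False
        return True
    except (json.JSONDecodeError, TypeError):
        return bool(content) and not content.strip().lower().startswith("error:")
```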
def _process_single_prompt(
    prompt_index: int,
    prompt_data: Dict[str, Any],
    batch_num: int,
    config: Dict[str, Any]
) -> Dict[str, Any]:
    """
    Process a single prompt with the agent.

    Args:
        prompt_index (int): Index of prompt in dataset
        prompt_data (Dict): Prompt data containing 'prompt' field
        batch_num (int): Batch number
        config (Dict): Configuration dict with agent parameters

    Returns:
        Dict: Result containing trajectory, stats, and metadata
    """
    prompt = prompt_data["prompt"]

    try:
        # Sample toolsets from distribution for this prompt
        selected_toolsets = sample_toolsets_from_distribution(config["distribution"])

        if config.get("verbose"):
            print(f"  Prompt {prompt_index}: Using toolsets {selected_toolsets}")

        # Initialize agent with sampled toolsets
        agent = AIAgent(
            base_url=config.get("base_url"),
            api_key=config.get("api_key"),
            model=config["model"],
            max_iterations=config["max_iterations"],
            enabled_toolsets=selected_toolsets,
            save_trajectories=False,  # We handle saving ourselves
            verbose_logging=config.get("verbose", False),
            ephemeral_system_prompt=config.get("ephemeral_system_prompt")
        )

        # Run the agent
        result = agent.run_conversation(prompt)

        # Extract tool usage statistics
        tool_stats = _extract_tool_stats(result["messages"])

        # Convert to trajectory format (using existing method)
        trajectory = agent._convert_to_trajectory_format(
            result["messages"],
            prompt,
            result["completed"]
        )

        return {
            "success": True,
            "prompt_index": prompt_index,
            "trajectory": trajectory,
            "tool_stats": tool_stats,
            "completed": result["completed"],
            "api_calls": result["api_calls"],
            "toolsets_used": selected_toolsets,
            "metadata": {
                "batch_num": batch_num,
                "timestamp": datetime.now().isoformat(),
                "model": config["model"]
            }
        }

    except Exception as e:
        print(f"❌ Error processing prompt {prompt_index}: {e}")
        if config.get("verbose"):
            traceback.print_exc()

        return {
            "success": False,
            "prompt_index": prompt_index,
            "error": str(e),
            "trajectory": None,
            "tool_stats": {},
            "toolsets_used": [],
            "metadata": {
                "batch_num": batch_num,
                "timestamp": datetime.now().isoformat()
            }
        }
def _process_batch_worker(args: Tuple) -> Dict[str, Any]:
    """
    Worker function to process a single batch of prompts.

    Args:
        args (Tuple): (batch_num, batch_data, output_dir, completed_prompts, config)

    Returns:
        Dict: Batch results with statistics
    """
    batch_num, batch_data, output_dir, completed_prompts_set, config = args

    output_dir = Path(output_dir)
    print(f"\n🔄 Batch {batch_num}: Starting ({len(batch_data)} prompts)")

    # Output file for this batch
    batch_output_file = output_dir / f"batch_{batch_num}.jsonl"

    # Filter out already completed prompts
    prompts_to_process = [
        (idx, data) for idx, data in batch_data
        if idx not in completed_prompts_set
    ]

    if not prompts_to_process:
        print(f"✅ Batch {batch_num}: Already completed (skipping)")
        return {
            "batch_num": batch_num,
            "processed": 0,
            "skipped": len(batch_data),
            "tool_stats": {},
            "completed_prompts": []
        }

    print(f"   Processing {len(prompts_to_process)} prompts (skipping {len(batch_data) - len(prompts_to_process)} already completed)")

    # Initialize aggregated stats for this batch
    batch_tool_stats = {}
    completed_in_batch = []

    # Process each prompt sequentially in this batch
    for prompt_index, prompt_data in prompts_to_process:
        # Process the prompt
        result = _process_single_prompt(
            prompt_index,
            prompt_data,
            batch_num,
            config
        )

        # Save trajectory if successful
        if result["success"] and result["trajectory"]:
            trajectory_entry = {
                "prompt_index": prompt_index,
                "conversations": result["trajectory"],
                "metadata": result["metadata"],
                "completed": result["completed"],
                "api_calls": result["api_calls"],
                "toolsets_used": result["toolsets_used"]
            }

            # Append to batch output file
            with open(batch_output_file, 'a', encoding='utf-8') as f:
                f.write(json.dumps(trajectory_entry, ensure_ascii=False) + "\n")

            # Aggregate tool statistics
            for tool_name, stats in result.get("tool_stats", {}).items():
                if tool_name not in batch_tool_stats:
                    batch_tool_stats[tool_name] = {
                        "count": 0,
                        "success": 0,
                        "failure": 0
                    }

                batch_tool_stats[tool_name]["count"] += stats["count"]
                batch_tool_stats[tool_name]["success"] += stats["success"]
                batch_tool_stats[tool_name]["failure"] += stats["failure"]

            completed_in_batch.append(prompt_index)
            print(f"   ✅ Prompt {prompt_index} completed")

    print(f"✅ Batch {batch_num}: Completed ({len(prompts_to_process)} prompts processed)")

    return {
        "batch_num": batch_num,
        "processed": len(prompts_to_process),
        "skipped": len(batch_data) - len(prompts_to_process),
        "tool_stats": batch_tool_stats,
        "completed_prompts": completed_in_batch
    }
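The count/success/failure aggregation used here (and again when combining batch results in `run()`) follows one pattern, which can be factored out as a sketch:

```python
def merge_tool_stats(total, delta):
    """Fold per-prompt (or per-batch) tool stats into a running total.
    Both arguments map tool name -> {"count", "success", "failure"}."""
    for tool, stats in delta.items():
        bucket = total.setdefault(tool, {"count": 0, "success": 0, "failure": 0})
        for key in ("count", "success", "failure"):
            bucket[key] += stats[key]
    return total
```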
class BatchRunner:
    """
    Manages batch processing of agent prompts with checkpointing and statistics.
    """

    def __init__(
        self,
        dataset_file: str,
        batch_size: int,
        run_name: str,
        distribution: str = "default",
        max_iterations: int = 10,
        base_url: str = None,
        api_key: str = None,
        model: str = "claude-opus-4-20250514",
        num_workers: int = 4,
        verbose: bool = False,
        ephemeral_system_prompt: str = None
    ):
        """
        Initialize the batch runner.

        Args:
            dataset_file (str): Path to the dataset JSONL file with 'prompt' field
            batch_size (int): Number of prompts per batch
            run_name (str): Name for this run (used for checkpointing and output)
            distribution (str): Toolset distribution to use (default: "default")
            max_iterations (int): Max iterations per agent run
            base_url (str): Base URL for model API
            api_key (str): API key for model
            model (str): Model name to use
            num_workers (int): Number of parallel workers
            verbose (bool): Enable verbose logging
            ephemeral_system_prompt (str): System prompt used during agent execution but NOT saved to trajectories (optional)
        """
        self.dataset_file = Path(dataset_file)
        self.batch_size = batch_size
        self.run_name = run_name
        self.distribution = distribution
        self.max_iterations = max_iterations
        self.base_url = base_url
        self.api_key = api_key
        self.model = model
        self.num_workers = num_workers
        self.verbose = verbose
        self.ephemeral_system_prompt = ephemeral_system_prompt

        # Validate distribution
        if not validate_distribution(distribution):
            raise ValueError(f"Unknown distribution: {distribution}. Available: {list(list_distributions().keys())}")

        # Setup output directory
        self.output_dir = Path("data") / run_name
        self.output_dir.mkdir(parents=True, exist_ok=True)

        # Checkpoint file
        self.checkpoint_file = self.output_dir / "checkpoint.json"

        # Statistics file
        self.stats_file = self.output_dir / "statistics.json"

        # Load dataset
        self.dataset = self._load_dataset()

        # Create batches
        self.batches = self._create_batches()

        print(f"📊 Batch Runner Initialized")
        print(f"   Dataset: {self.dataset_file} ({len(self.dataset)} prompts)")
        print(f"   Batch size: {self.batch_size}")
        print(f"   Total batches: {len(self.batches)}")
        print(f"   Run name: {self.run_name}")
        print(f"   Distribution: {self.distribution}")
        print(f"   Output directory: {self.output_dir}")
        print(f"   Workers: {self.num_workers}")
        if self.ephemeral_system_prompt:
            prompt_preview = self.ephemeral_system_prompt[:60] + "..." if len(self.ephemeral_system_prompt) > 60 else self.ephemeral_system_prompt
            print(f"   🔒 Ephemeral system prompt: '{prompt_preview}'")
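The initializer above pins down the on-disk layout for a run. A sketch of the resulting paths (the `run_name` value and temp base are hypothetical; the real runner anchors `data/` at the working directory):

```python
from pathlib import Path
import tempfile

base = Path(tempfile.mkdtemp())      # stand-in for the repo root
run_name = "my_run"                  # hypothetical run name

output_dir = base / "data" / run_name
output_dir.mkdir(parents=True, exist_ok=True)

checkpoint_file = output_dir / "checkpoint.json"   # resume/progress state
stats_file = output_dir / "statistics.json"        # aggregated tool stats
# workers append trajectories to data/<run_name>/batch_<n>.jsonl
```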
    def _load_dataset(self) -> List[Dict[str, Any]]:
        """
        Load dataset from JSONL file.

        Returns:
            List[Dict]: List of dataset entries
        """
        if not self.dataset_file.exists():
            raise FileNotFoundError(f"Dataset file not found: {self.dataset_file}")

        dataset = []
        with open(self.dataset_file, 'r', encoding='utf-8') as f:
            for line_num, line in enumerate(f, 1):
                line = line.strip()
                if not line:
                    continue

                try:
                    entry = json.loads(line)
                    if 'prompt' not in entry:
                        print(f"⚠️  Warning: Line {line_num} missing 'prompt' field, skipping")
                        continue
                    dataset.append(entry)
                except json.JSONDecodeError as e:
                    print(f"⚠️  Warning: Invalid JSON on line {line_num}: {e}")
                    continue

        if not dataset:
            raise ValueError(f"No valid entries found in dataset file: {self.dataset_file}")

        return dataset

    def _create_batches(self) -> List[List[Tuple[int, Dict[str, Any]]]]:
        """
        Split dataset into batches with indices.

        Returns:
            List of batches, where each batch is a list of (index, entry) tuples
        """
        batches = []
        for i in range(0, len(self.dataset), self.batch_size):
            batch = [(idx, entry) for idx, entry in enumerate(self.dataset[i:i + self.batch_size], start=i)]
            batches.append(batch)

        return batches
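The indexed batching above (global dataset indices via `enumerate(..., start=i)`, so checkpointing can refer to absolute prompt positions) can be exercised standalone:

```python
def create_batches(items, batch_size):
    """Split items into batches of (global_index, item) tuples,
    mirroring the enumerate(..., start=i) pattern above."""
    batches = []
    for i in range(0, len(items), batch_size):
        batches.append(list(enumerate(items[i:i + batch_size], start=i)))
    return batches
```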
    def _load_checkpoint(self) -> Dict[str, Any]:
        """
        Load checkpoint data if it exists.

        Returns:
            Dict: Checkpoint data with completed prompt indices
        """
        if not self.checkpoint_file.exists():
            return {
                "run_name": self.run_name,
                "completed_prompts": [],
                "batch_stats": {},
                "last_updated": None
            }

        try:
            with open(self.checkpoint_file, 'r', encoding='utf-8') as f:
                return json.load(f)
        except Exception as e:
            print(f"⚠️  Warning: Failed to load checkpoint: {e}")
            return {
                "run_name": self.run_name,
                "completed_prompts": [],
                "batch_stats": {},
                "last_updated": None
            }

    def _save_checkpoint(self, checkpoint_data: Dict[str, Any], lock: Optional[Lock] = None):
        """
        Save checkpoint data.

        Args:
            checkpoint_data (Dict): Checkpoint data to save
            lock (Lock): Optional lock for thread-safe access
        """
        checkpoint_data["last_updated"] = datetime.now().isoformat()

        if lock:
            with lock:
                with open(self.checkpoint_file, 'w', encoding='utf-8') as f:
                    json.dump(checkpoint_data, f, indent=2)
        else:
            with open(self.checkpoint_file, 'w', encoding='utf-8') as f:
                json.dump(checkpoint_data, f, indent=2)
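Note that the checkpoint write above is not atomic: a crash mid-write can leave a truncated `checkpoint.json`, which the loader then discards. A common hardening (not part of this module, shown only as a sketch) is write-then-rename:

```python
import json
import os
import tempfile

def save_json_atomic(path, data):
    """Write JSON to a temp file in the same directory, then atomically
    replace the target, so readers never see a half-written file."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=2)
        os.replace(tmp, path)  # atomic on POSIX and Windows
    except BaseException:
        os.unlink(tmp)
        raise
```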
    def run(self, resume: bool = False):
        """
        Run the batch processing pipeline.

        Args:
            resume (bool): Whether to resume from checkpoint
        """
        print("\n" + "=" * 70)
        print("🚀 Starting Batch Processing")
        print("=" * 70)

        # Load checkpoint
        checkpoint_data = self._load_checkpoint() if resume else {
            "run_name": self.run_name,
            "completed_prompts": [],
            "batch_stats": {},
            "last_updated": None
        }

        if resume and checkpoint_data.get("completed_prompts"):
            print(f"📂 Resuming from checkpoint ({len(checkpoint_data['completed_prompts'])} prompts already completed)")

        # Prepare configuration for workers
        config = {
            "distribution": self.distribution,
            "model": self.model,
            "max_iterations": self.max_iterations,
            "base_url": self.base_url,
            "api_key": self.api_key,
            "verbose": self.verbose,
            "ephemeral_system_prompt": self.ephemeral_system_prompt
        }

        # Get completed prompts set
        completed_prompts_set = set(checkpoint_data.get("completed_prompts", []))

        # Aggregate statistics across all batches
        total_tool_stats = {}

        start_time = time.time()

        # Process batches in parallel
        with Pool(processes=self.num_workers) as pool:
            # Create tasks for each batch
            tasks = [
                (
                    batch_num,
                    batch_data,
                    str(self.output_dir),  # Convert Path to string for pickling
                    completed_prompts_set,
                    config
                )
                for batch_num, batch_data in enumerate(self.batches)
            ]

            # Use map to process batches in parallel
            results = pool.map(_process_batch_worker, tasks)

        # Aggregate all batch statistics and update checkpoint
        all_completed_prompts = list(completed_prompts_set)
        for batch_result in results:
            # Add newly completed prompts
            all_completed_prompts.extend(batch_result.get("completed_prompts", []))

            # Aggregate tool stats
            for tool_name, stats in batch_result.get("tool_stats", {}).items():
                if tool_name not in total_tool_stats:
                    total_tool_stats[tool_name] = {
                        "count": 0,
                        "success": 0,
                        "failure": 0
                    }

                total_tool_stats[tool_name]["count"] += stats["count"]
                total_tool_stats[tool_name]["success"] += stats["success"]
                total_tool_stats[tool_name]["failure"] += stats["failure"]

        # Save final checkpoint
        checkpoint_data["completed_prompts"] = all_completed_prompts
        self._save_checkpoint(checkpoint_data)

        # Calculate success rates
        for tool_name in total_tool_stats:
            stats = total_tool_stats[tool_name]
            total_calls = stats["success"] + stats["failure"]
            if total_calls > 0:
                stats["success_rate"] = round(stats["success"] / total_calls * 100, 2)
                stats["failure_rate"] = round(stats["failure"] / total_calls * 100, 2)
            else:
                stats["success_rate"] = 0.0
                stats["failure_rate"] = 0.0

        # Combine all batch files into a single trajectories.jsonl file
        combined_file = self.output_dir / "trajectories.jsonl"
        print(f"\n📦 Combining batch files into {combined_file.name}...")

        with open(combined_file, 'w', encoding='utf-8') as outfile:
            for batch_num in range(len(self.batches)):
                batch_file = self.output_dir / f"batch_{batch_num}.jsonl"
                if batch_file.exists():
                    with open(batch_file, 'r', encoding='utf-8') as infile:
                        for line in infile:
                            outfile.write(line)

        print(f"✅ Combined {len(self.batches)} batch files into trajectories.jsonl")

        # Save final statistics
        final_stats = {
            "run_name": self.run_name,
            "distribution": self.distribution,
            "total_prompts": len(self.dataset),
            "total_batches": len(self.batches),
            "batch_size": self.batch_size,
            "model": self.model,
            "completed_at": datetime.now().isoformat(),
            "duration_seconds": round(time.time() - start_time, 2),
            "tool_statistics": total_tool_stats
        }

        with open(self.stats_file, 'w', encoding='utf-8') as f:
            json.dump(final_stats, f, indent=2)

        # Print summary
        print("\n" + "=" * 70)
        print("📊 BATCH PROCESSING COMPLETE")
        print("=" * 70)
        print(f"✅ Total prompts processed: {len(self.dataset)}")
        print(f"✅ Total batches: {len(self.batches)}")
        print(f"⏱️ Total duration: {round(time.time() - start_time, 2)}s")
        print(f"\n📈 Tool Usage Statistics:")
        print("-" * 70)

        if total_tool_stats:
            # Sort by count descending
            sorted_tools = sorted(
                total_tool_stats.items(),
                key=lambda x: x[1]["count"],
                reverse=True
            )

            print(f"{'Tool Name':<25} {'Count':<10} {'Success':<10} {'Failure':<10} {'Success Rate':<12}")
            print("-" * 70)
            for tool_name, stats in sorted_tools:
                print(
                    f"{tool_name:<25} "
                    f"{stats['count']:<10} "
                    f"{stats['success']:<10} "
                    f"{stats['failure']:<10} "
                    f"{stats['success_rate']:.1f}%"
                )
        else:
            print("No tool calls were made during this run.")

        print(f"\n💾 Results saved to: {self.output_dir}")
        print(f"   - Trajectories: trajectories.jsonl (combined)")
        print(f"   - Individual batches: batch_*.jsonl (for debugging)")
        print(f"   - Statistics: {self.stats_file.name}")
        print(f"   - Checkpoint: {self.checkpoint_file.name}")


def main(
    dataset_file: str = None,
    batch_size: int = None,
    run_name: str = None,
    distribution: str = "default",
    model: str = "claude-opus-4-20250514",
    api_key: str = None,
    base_url: str = "https://api.anthropic.com/v1/",
    max_turns: int = 10,
    num_workers: int = 4,
    resume: bool = False,
    verbose: bool = False,
    list_distributions: bool = False,
    ephemeral_system_prompt: str = None
):
    """
    Run batch processing of agent prompts from a dataset.

    Args:
        dataset_file (str): Path to JSONL file with 'prompt' field in each entry
        batch_size (int): Number of prompts per batch
        run_name (str): Name for this run (used for output and checkpointing)
        distribution (str): Toolset distribution to use (default: "default")
        model (str): Model name to use (default: "claude-opus-4-20250514")
        api_key (str): API key for model authentication
        base_url (str): Base URL for model API
        max_turns (int): Maximum number of tool calling iterations per prompt (default: 10)
        num_workers (int): Number of parallel worker processes (default: 4)
        resume (bool): Resume from checkpoint if run was interrupted (default: False)
        verbose (bool): Enable verbose logging (default: False)
        list_distributions (bool): List available toolset distributions and exit
        ephemeral_system_prompt (str): System prompt used during agent execution but NOT saved to trajectories (optional)

    Examples:
        # Basic usage
        python batch_runner.py --dataset_file=data.jsonl --batch_size=10 --run_name=my_run

        # Resume interrupted run
        python batch_runner.py --dataset_file=data.jsonl --batch_size=10 --run_name=my_run --resume

        # Use specific distribution
        python batch_runner.py --dataset_file=data.jsonl --batch_size=10 --run_name=image_test --distribution=image_gen

        # With ephemeral system prompt (not saved to dataset)
        python batch_runner.py --dataset_file=data.jsonl --batch_size=10 --run_name=my_run \\
            --ephemeral_system_prompt="You are a helpful assistant focused on image generation."

        # List available distributions
        python batch_runner.py --list_distributions
    """
    # Handle list distributions
    if list_distributions:
        from toolset_distributions import list_distributions as get_all_dists, print_distribution_info

        print("📊 Available Toolset Distributions")
        print("=" * 70)

        all_dists = get_all_dists()
        for dist_name in sorted(all_dists.keys()):
            print_distribution_info(dist_name)

        print("\n💡 Usage:")
        print("   python batch_runner.py --dataset_file=data.jsonl --batch_size=10 \\")
        print("       --run_name=my_run --distribution=<name>")
        return

    # Validate required arguments
    if not dataset_file:
        print("❌ Error: --dataset_file is required")
        return

    if not batch_size or batch_size < 1:
        print("❌ Error: --batch_size must be a positive integer")
        return

    if not run_name:
        print("❌ Error: --run_name is required")
        return

    # Initialize and run batch runner
    try:
        runner = BatchRunner(
            dataset_file=dataset_file,
            batch_size=batch_size,
            run_name=run_name,
            distribution=distribution,
            max_iterations=max_turns,
            base_url=base_url,
            api_key=api_key,
            model=model,
            num_workers=num_workers,
            verbose=verbose,
            ephemeral_system_prompt=ephemeral_system_prompt
        )

        runner.run(resume=resume)

    except Exception as e:
        print(f"\n❌ Fatal error: {e}")
        if verbose:
            traceback.print_exc()
        return 1


if __name__ == "__main__":
    fire.Fire(main)

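The checkpoint helpers above round-trip a small JSON document between runs. Below is a minimal standalone sketch of that format, using the field names taken from `_load_checkpoint` and `_save_checkpoint`; the `checkpoint.json` path is a hypothetical stand-in for `self.checkpoint_file`, and the prompt values are illustrative only:

```python
import json
from datetime import datetime
from pathlib import Path

# Hypothetical stand-in for self.checkpoint_file
checkpoint_file = Path("checkpoint.json")

# Same shape that _save_checkpoint writes and _load_checkpoint reads back
checkpoint = {
    "run_name": "my_run",
    "completed_prompts": ["prompt-0", "prompt-1"],
    "batch_stats": {},
    "last_updated": datetime.now().isoformat(),
}
checkpoint_file.write_text(json.dumps(checkpoint, indent=2), encoding="utf-8")

# On resume, completed prompts seed the skip set
loaded = json.loads(checkpoint_file.read_text(encoding="utf-8"))
completed = set(loaded["completed_prompts"])
print(sorted(completed))
```

On `--resume`, `run()` seeds `completed_prompts_set` from exactly this list, so prompts already recorded here are skipped by the workers.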
model_tools.py
@@ -8,27 +8,38 @@ for defining tools and executing function calls.

 Currently supports:
 - Web tools (search, extract, crawl) from web_tools.py
 - Terminal tools (command execution with interactive sessions) from terminal_tool.py
 - Vision tools (image analysis) from vision_tools.py
 - Mixture of Agents tools (collaborative multi-model reasoning) from mixture_of_agents_tool.py
 - Image generation tools (text-to-image with upscaling) from image_generation_tool.py

 Usage:
     from model_tools import get_tool_definitions, handle_function_call

-    # Get tool definitions for model API
+    # Get all available tool definitions for model API
     tools = get_tool_definitions()

+    # Get specific toolsets
+    web_tools = get_tool_definitions(enabled_toolsets=['web_tools'])
+
     # Handle function calls from model
-    result = handle_function_call("web_search_tool", {"query": "Python", "limit": 3})
+    result = handle_function_call("web_search", {"query": "Python"})
 """

 import json
 import asyncio
 from typing import Dict, Any, List

 # Import toolsets
-from web_tools import web_search_tool, web_extract_tool, web_crawl_tool, check_firecrawl_api_key
-from terminal_tool import terminal_tool, check_hecate_requirements, TERMINAL_TOOL_DESCRIPTION
-from vision_tools import vision_analyze_tool, check_vision_requirements
-from mixture_of_agents_tool import mixture_of_agents_tool, check_moa_requirements
-from image_generation_tool import image_generate_tool, check_image_generation_requirements
+from tools.web_tools import web_search_tool, web_extract_tool, web_crawl_tool, check_firecrawl_api_key
+from tools.terminal_tool import terminal_tool, check_hecate_requirements, TERMINAL_TOOL_DESCRIPTION
+from tools.vision_tools import vision_analyze_tool, check_vision_requirements
+from tools.mixture_of_agents_tool import mixture_of_agents_tool, check_moa_requirements
+from tools.image_generation_tool import image_generate_tool, check_image_generation_requirements
+from toolsets import (
+    get_toolset, resolve_toolset, resolve_multiple_toolsets,
+    get_all_toolsets, get_toolset_names, validate_toolset,
+    get_toolset_info, print_toolset_tree
+)

 def get_web_tool_definitions() -> List[Dict[str, Any]]:
     """
@@ -42,20 +53,13 @@ def get_web_tool_definitions() -> List[Dict[str, Any]]:
             "type": "function",
             "function": {
                 "name": "web_search",
-                "description": "Search the web for information on any topic. Returns relevant results with titles and URLs. Uses advanced search depth for comprehensive results.",
+                "description": "Search the web for information on any topic. Returns up to 5 relevant results with titles and URLs. Uses advanced search depth for comprehensive results.",
                 "parameters": {
                     "type": "object",
                     "properties": {
                         "query": {
                             "type": "string",
                             "description": "The search query to look up on the web"
-                        },
-                        "limit": {
-                            "type": "integer",
-                            "description": "Maximum number of results to return (default: 5, max: 10)",
-                            "default": 5,
-                            "minimum": 1,
-                            "maximum": 10
                         }
                     },
                     "required": ["query"]
@@ -75,11 +79,6 @@ def get_web_tool_definitions() -> List[Dict[str, Any]]:
                             "items": {"type": "string"},
                             "description": "List of URLs to extract content from (max 5 URLs per call)",
                             "maxItems": 5
-                        },
-                        "format": {
-                            "type": "string",
-                            "enum": ["markdown", "html"],
-                            "description": "Desired output format for extracted content (optional)"
                         }
                     },
                     "required": ["urls"]
@@ -101,12 +100,6 @@ def get_web_tool_definitions() -> List[Dict[str, Any]]:
                         "instructions": {
                             "type": "string",
                             "description": "Specific instructions for what to crawl/extract using AI intelligence (e.g., 'Find pricing information', 'Get documentation pages', 'Extract contact details')"
-                        },
-                        "depth": {
-                            "type": "string",
-                            "enum": ["basic", "advanced"],
-                            "description": "Depth of extraction - 'basic' for surface content, 'advanced' for deeper analysis (default: basic)",
-                            "default": "basic"
                         }
                     },
                     "required": ["url"]
@@ -185,12 +178,7 @@ def get_vision_tool_definitions() -> List[Dict[str, Any]]:
                         },
                         "question": {
                             "type": "string",
-                            "description": "Your specific question or request about the image to resolve. The AI will automatically provide a complete image description AND answer your specific question. Examples: 'What text can you read?', 'What architectural style is this?', 'Describe the mood and emotions', 'What safety hazards do you see?'"
-                        },
-                        "model": {
-                            "type": "string",
-                            "description": "The vision model to use for analysis (optional, default: gemini-2.5-flash)",
-                            "default": "gemini-2.5-flash"
+                            "description": "Your specific question or request about the image to resolve. The AI will automatically provide a complete image description AND answer your specific question."
                         }
                     },
                     "required": ["image_url", "question"]
@@ -212,7 +200,7 @@ def get_moa_tool_definitions() -> List[Dict[str, Any]]:
             "type": "function",
             "function": {
                 "name": "mixture_of_agents",
-                "description": "Process extremely difficult problems requiring intense reasoning using the Mixture-of-Agents methodology. This tool leverages multiple frontier language models to collaboratively solve complex tasks that single models struggle with. Uses a fixed 2-layer architecture: reference models (claude-opus-4, gemini-2.5-pro, o4-mini, deepseek-r1) generate diverse responses, then an aggregator synthesizes the best solution. Best for: complex mathematical proofs, advanced coding problems, multi-step analytical reasoning, precise and complex STEM problems, algorithm design, and problems requiring diverse domain expertise.",
+                "description": "Process extremely difficult problems requiring intense reasoning using a Mixture-of-Agents. This tool leverages multiple frontier language models to collaboratively solve complex tasks that single models struggle with. Uses a fixed 2-layer architecture: reference models generate diverse responses, then an aggregator synthesizes the best solution. Best for: complex mathematical proofs, advanced coding problems, multi-step analytical reasoning, precise and complex STEM problems, algorithm design, and problems requiring diverse domain expertise.",
                 "parameters": {
                     "type": "object",
                     "properties": {
@@ -240,13 +228,13 @@ def get_image_tool_definitions() -> List[Dict[str, Any]]:
             "type": "function",
             "function": {
                 "name": "image_generate",
-                "description": "Generate high-quality images from text prompts using FAL.ai's FLUX.1 Krea model with automatic 2x upscaling. Creates detailed, artistic images that are automatically enhanced for superior quality. Returns a single upscaled image URL that can be displayed using <img src=\"{URL}\"></img> tags.",
+                "description": "Generate high-quality images from text prompts using FLUX Krea model with automatic 2x upscaling. Creates detailed, artistic images that are automatically enhanced for superior quality. Returns a single upscaled image URL that can be displayed using <img src=\"{URL}\"></img> tags.",
                 "parameters": {
                     "type": "object",
                     "properties": {
                         "prompt": {
                             "type": "string",
-                            "description": "The text prompt describing the desired image. Be detailed and descriptive for best results."
+                            "description": "The text prompt describing the desired image. Be detailed and descriptive."
                         },
                         "image_size": {
                             "type": "string",
@@ -291,10 +279,6 @@ def get_all_tool_names() -> List[str]:
     if check_image_generation_requirements():
         tool_names.extend(["image_generate"])

-    # Future toolsets can be added here:
-    # if check_file_tools():
-    #     tool_names.extend(["file_read", "file_write"])
-
     return tool_names


@@ -316,154 +300,152 @@ def get_toolset_for_tool(tool_name: str) -> str:
         "vision_analyze": "vision_tools",
         "mixture_of_agents": "moa_tools",
         "image_generate": "image_tools"
         # Future tools can be added here
     }

     return toolset_mapping.get(tool_name, "unknown")


 def get_tool_definitions(
-    enabled_tools: List[str] = None,
-    disabled_tools: List[str] = None,
     enabled_toolsets: List[str] = None,
     disabled_toolsets: List[str] = None
 ) -> List[Dict[str, Any]]:
     """
-    Get tool definitions for model API calls with optional filtering.
+    Get tool definitions for model API calls with toolset-based filtering.

-    This function aggregates tool definitions from all available toolsets
-    and applies filtering based on the provided parameters.
-
-    Filter Priority (higher priority overrides lower):
-    1. enabled_tools (highest priority - only these tools, overrides everything)
-    2. disabled_tools (applied after toolset filtering)
-    3. enabled_toolsets (only tools from these toolsets)
-    4. disabled_toolsets (exclude tools from these toolsets)
+    This function aggregates tool definitions from available toolsets.
+    All tools must be part of a toolset to be accessible. Individual tool
+    selection is not supported - use toolsets to organize and select tools.

     Args:
-        enabled_tools (List[str]): Only include these specific tools. If provided,
-            ONLY these tools will be included (overrides all other filters)
-        disabled_tools (List[str]): Exclude these specific tools (applied after toolset filtering)
-        enabled_toolsets (List[str]): Only include tools from these toolsets
-        disabled_toolsets (List[str]): Exclude tools from these toolsets
+        enabled_toolsets (List[str]): Only include tools from these toolsets.
+            If None, all available tools are included.
+        disabled_toolsets (List[str]): Exclude tools from these toolsets.
+            Applied only if enabled_toolsets is None.

     Returns:
         List[Dict]: Filtered list of tool definitions

     Examples:
-        # Only web tools
-        tools = get_tool_definitions(enabled_toolsets=["web_tools"])
+        # Use predefined toolsets
+        tools = get_tool_definitions(enabled_toolsets=["research"])
+        tools = get_tool_definitions(enabled_toolsets=["development"])

-        # All tools except terminal
-        tools = get_tool_definitions(disabled_tools=["terminal"])
+        # Combine multiple toolsets
+        tools = get_tool_definitions(enabled_toolsets=["web", "vision"])

-        # Only specific tools (overrides toolset filters)
-        tools = get_tool_definitions(enabled_tools=["web_search", "web_extract"])
+        # All tools except those in terminal toolset
+        tools = get_tool_definitions(disabled_toolsets=["terminal"])

-        # Conflicting filters (enabled_tools wins)
-        tools = get_tool_definitions(enabled_toolsets=["web_tools"], enabled_tools=["terminal"])
-        # Result: Only terminal tool (enabled_tools overrides enabled_toolsets)
+        # Default - all available tools
+        tools = get_tool_definitions()
     """
-    # Detect and warn about potential conflicts
-    conflicts_detected = False
+    # Collect all available tool definitions
+    all_available_tools_map = {}

-    if enabled_tools and (enabled_toolsets or disabled_toolsets or disabled_tools):
-        print("⚠️ enabled_tools overrides all other filters")
-        conflicts_detected = True
+    # Map tool names to their definitions
+    if check_firecrawl_api_key():
+        for tool in get_web_tool_definitions():
+            all_available_tools_map[tool["function"]["name"]] = tool

-    if enabled_toolsets and disabled_toolsets:
-        # Check for overlap
-        enabled_set = set(enabled_toolsets)
-        disabled_set = set(disabled_toolsets)
-        overlap = enabled_set & disabled_set
-        if overlap:
-            print(f"⚠️ Conflicting toolsets: {overlap} in both enabled and disabled")
-            print(f"   → enabled_toolsets takes priority")
-            conflicts_detected = True
+    if check_hecate_requirements():
+        for tool in get_terminal_tool_definitions():
+            all_available_tools_map[tool["function"]["name"]] = tool

-    if enabled_tools and disabled_tools:
-        # Check for overlap
-        enabled_set = set(enabled_tools)
-        disabled_set = set(disabled_tools)
-        overlap = enabled_set & disabled_set
-        if overlap:
-            print(f"⚠️ Conflicting tools: {overlap} in both enabled and disabled")
-            print(f"   → enabled_tools takes priority")
-            conflicts_detected = True
+    if check_vision_requirements():
+        for tool in get_vision_tool_definitions():
+            all_available_tools_map[tool["function"]["name"]] = tool

-    all_tools = []
+    if check_moa_requirements():
+        for tool in get_moa_tool_definitions():
+            all_available_tools_map[tool["function"]["name"]] = tool

-    # Collect all available tools from each toolset
-    toolset_tools = {
-        "web_tools": get_web_tool_definitions() if check_firecrawl_api_key() else [],
-        "terminal_tools": get_terminal_tool_definitions() if check_hecate_requirements() else [],
-        "vision_tools": get_vision_tool_definitions() if check_vision_requirements() else [],
-        "moa_tools": get_moa_tool_definitions() if check_moa_requirements() else [],
-        "image_tools": get_image_tool_definitions() if check_image_generation_requirements() else []
-        # Future toolsets can be added here:
-        # "file_tools": get_file_tool_definitions() if check_file_tools() else [],
-    }
+    if check_image_generation_requirements():
+        for tool in get_image_tool_definitions():
+            all_available_tools_map[tool["function"]["name"]] = tool

-    # HIGHEST PRIORITY: enabled_tools (overrides everything)
-    if enabled_tools:
-        if conflicts_detected:
-            print(f"🎯 Using only enabled_tools: {enabled_tools}")
-
-        # Collect all available tools first
-        all_available_tools = []
-        for tools in toolset_tools.values():
-            all_available_tools.extend(tools)
-
-        # Only include specifically enabled tools
-        tool_names_to_include = set(enabled_tools)
-        filtered_tools = [
-            tool for tool in all_available_tools
-            if tool["function"]["name"] in tool_names_to_include
-        ]
-
-        # Warn about requested tools that aren't available
-        found_tools = {tool["function"]["name"] for tool in filtered_tools}
-        missing_tools = tool_names_to_include - found_tools
-        if missing_tools:
-            print(f"⚠️ Requested tools not available: {missing_tools}")
-
-        return filtered_tools
+    # Determine which tools to include based on toolsets
+    tools_to_include = set()

-    # Apply toolset-level filtering first
     if enabled_toolsets:
         # Only include tools from enabled toolsets
         for toolset_name in enabled_toolsets:
-            if toolset_name in toolset_tools:
-                all_tools.extend(toolset_tools[toolset_name])
+            if validate_toolset(toolset_name):
+                resolved_tools = resolve_toolset(toolset_name)
+                tools_to_include.update(resolved_tools)
+                print(f"✅ Enabled toolset '{toolset_name}': {', '.join(resolved_tools) if resolved_tools else 'no tools'}")
             else:
-                print(f"⚠️ Unknown toolset: {toolset_name}")
+                # Try legacy compatibility
+                if toolset_name in ["web_tools", "terminal_tools", "vision_tools", "moa_tools", "image_tools"]:
+                    # Map legacy names to new system
+                    legacy_map = {
+                        "web_tools": ["web_search", "web_extract", "web_crawl"],
+                        "terminal_tools": ["terminal"],
+                        "vision_tools": ["vision_analyze"],
+                        "moa_tools": ["mixture_of_agents"],
+                        "image_tools": ["image_generate"]
+                    }
+                    legacy_tools = legacy_map.get(toolset_name, [])
+                    tools_to_include.update(legacy_tools)
+                    print(f"✅ Enabled legacy toolset '{toolset_name}': {', '.join(legacy_tools)}")
+                else:
+                    print(f"⚠️ Unknown toolset: {toolset_name}")
     elif disabled_toolsets:
-        # Include all tools except from disabled toolsets
-        for toolset_name, tools in toolset_tools.items():
-            if toolset_name not in disabled_toolsets:
-                all_tools.extend(tools)
+        # Start with all tools from all toolsets, then remove disabled ones
+        # Note: Only tools that are part of toolsets are accessible
+        # We need to get all tools from all defined toolsets
+        from toolsets import get_all_toolsets
+        all_toolset_tools = set()
+        for toolset_name in get_all_toolsets():
+            resolved_tools = resolve_toolset(toolset_name)
+            all_toolset_tools.update(resolved_tools)
+
+        # Start with all tools from toolsets
+        tools_to_include = all_toolset_tools
+
+        # Remove tools from disabled toolsets
+        for toolset_name in disabled_toolsets:
+            if validate_toolset(toolset_name):
+                resolved_tools = resolve_toolset(toolset_name)
+                tools_to_include.difference_update(resolved_tools)
+                print(f"🚫 Disabled toolset '{toolset_name}': {', '.join(resolved_tools) if resolved_tools else 'no tools'}")
+            else:
+                # Try legacy compatibility
+                if toolset_name in ["web_tools", "terminal_tools", "vision_tools", "moa_tools", "image_tools"]:
+                    legacy_map = {
+                        "web_tools": ["web_search", "web_extract", "web_crawl"],
+                        "terminal_tools": ["terminal"],
+                        "vision_tools": ["vision_analyze"],
+                        "moa_tools": ["mixture_of_agents"],
+                        "image_tools": ["image_generate"]
+                    }
+                    legacy_tools = legacy_map.get(toolset_name, [])
+                    tools_to_include.difference_update(legacy_tools)
+                    print(f"🚫 Disabled legacy toolset '{toolset_name}': {', '.join(legacy_tools)}")
+                else:
+                    print(f"⚠️ Unknown toolset: {toolset_name}")
     else:
-        # Include all available tools
-        for tools in toolset_tools.values():
-            all_tools.extend(tools)
+        # No filtering - include all tools from all defined toolsets
+        from toolsets import get_all_toolsets
+        for toolset_name in get_all_toolsets():
+            resolved_tools = resolve_toolset(toolset_name)
+            tools_to_include.update(resolved_tools)

-    # Apply tool-level filtering (disabled_tools)
-    if disabled_tools:
-        tool_names_to_exclude = set(disabled_tools)
-        original_tools = [tool["function"]["name"] for tool in all_tools]
-
-        all_tools = [
-            tool for tool in all_tools
-            if tool["function"]["name"] not in tool_names_to_exclude
-        ]
-
-        # Show what was actually filtered out
-        remaining_tools = {tool["function"]["name"] for tool in all_tools}
-        actually_excluded = set(original_tools) & tool_names_to_exclude
-        if actually_excluded:
-            print(f"🚫 Excluded tools: {actually_excluded}")
+    # Build final tool list (only include tools that are available)
+    filtered_tools = []
+    for tool_name in tools_to_include:
+        if tool_name in all_available_tools_map:
+            filtered_tools.append(all_available_tools_map[tool_name])

-    return all_tools
+    # Sort tools for consistent ordering
+    filtered_tools.sort(key=lambda t: t["function"]["name"])
+
+    if filtered_tools:
+        tool_names = [t["function"]["name"] for t in filtered_tools]
+        print(f"🛠️ Final tool selection ({len(filtered_tools)} tools): {', '.join(tool_names)}")
+    else:
+        print("🛠️ No tools selected (all filtered out or unavailable)")
+
+    return filtered_tools

 def handle_web_function_call(function_name: str, function_args: Dict[str, Any]) -> str:
     """
@@ -478,25 +460,22 @@ def handle_web_function_call(function_name: str, function_args: Dict[str, Any])
     """
     if function_name == "web_search":
         query = function_args.get("query", "")
-        limit = function_args.get("limit", 5)
-        # Ensure limit is within bounds
-        limit = max(1, min(10, limit))
+        # Always use fixed limit of 5
+        limit = 5
        return web_search_tool(query, limit)

     elif function_name == "web_extract":
         urls = function_args.get("urls", [])
         # Limit URLs to prevent abuse
         urls = urls[:5] if isinstance(urls, list) else []
-        format = function_args.get("format")
         # Run async function in event loop
-        return asyncio.run(web_extract_tool(urls, format))
+        return asyncio.run(web_extract_tool(urls, "markdown"))

     elif function_name == "web_crawl":
         url = function_args.get("url", "")
         instructions = function_args.get("instructions")
-        depth = function_args.get("depth", "basic")
         # Run async function in event loop
-        return asyncio.run(web_crawl_tool(url, instructions, depth))
+        return asyncio.run(web_crawl_tool(url, instructions, "basic"))

     else:
         return json.dumps({"error": f"Unknown web function: {function_name}"})
@@ -518,9 +497,8 @@ def handle_terminal_function_call(function_name: str, function_args: Dict[str, A
         background = function_args.get("background", False)
         idle_threshold = function_args.get("idle_threshold", 5.0)
         timeout = function_args.get("timeout")
-        snapshot_id = function_args.get("snapshot_id")
         # Session management is handled internally - don't pass session_id from model
-        return terminal_tool(command, input_keys, None, background, idle_threshold, timeout, snapshot_id=snapshot_id)
+        return terminal_tool(command, input_keys, None, background, idle_threshold, timeout)

     else:
         return json.dumps({"error": f"Unknown terminal function: {function_name}"})
@@ -540,13 +518,11 @@ def handle_vision_function_call(function_name: str, function_args: Dict[str, Any
     if function_name == "vision_analyze":
         image_url = function_args.get("image_url", "")
         question = function_args.get("question", "")
-        model = function_args.get("model", "gemini-2.5-flash")

         # Automatically prepend full description request to user's question
-        full_prompt = f"Fully describe and explain everything about this image\n\n{question}"
-
+        full_prompt = f"Fully describe and explain everything about this image, then answer the following question:\n\n{question}"
         # Run async function in event loop
-        return asyncio.run(vision_analyze_tool(image_url, full_prompt, model))
+        return asyncio.run(vision_analyze_tool(image_url, full_prompt, "gemini-2.5-flash"))

     else:
         return json.dumps({"error": f"Unknown vision function: {function_name}"})
@@ -593,7 +569,6 @@ def handle_image_function_call(function_name: str, function_args: Dict[str, Any]
|
||||
if not prompt:
|
||||
return json.dumps({"success": False, "image": None})
|
||||
|
||||
# Extract only the exposed parameters
|
||||
image_size = function_args.get("image_size", "landscape_16_9")
|
||||
|
||||
# Use fixed internal defaults for all other parameters (not exposed to model)
|
||||
@@ -606,8 +581,21 @@ def handle_image_function_call(function_name: str, function_args: Dict[str, Any]
|
||||
allow_nsfw_images = True
|
||||
seed = None
|
||||
|
||||
# Run async function in event loop
|
||||
return asyncio.run(image_generate_tool(
|
||||
# Run async function in event loop with proper handling for multiprocessing
|
||||
try:
|
||||
# Try to get existing event loop
|
||||
loop = asyncio.get_event_loop()
|
||||
if loop.is_closed():
|
||||
# If closed, create a new one
|
||||
loop = asyncio.new_event_loop()
|
||||
asyncio.set_event_loop(loop)
|
||||
except RuntimeError:
|
||||
# No event loop in current thread, create one
|
||||
loop = asyncio.new_event_loop()
|
||||
asyncio.set_event_loop(loop)
|
||||
|
||||
# Run the coroutine in the event loop
|
||||
result = loop.run_until_complete(image_generate_tool(
|
||||
prompt=prompt,
|
||||
image_size=image_size,
|
||||
num_inference_steps=num_inference_steps,
|
||||
@@ -619,6 +607,8 @@ def handle_image_function_call(function_name: str, function_args: Dict[str, Any]
|
||||
allow_nsfw_images=allow_nsfw_images,
|
||||
seed=seed
|
||||
))
|
||||
|
||||
return result
|
||||
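The get-or-create event-loop logic above can be factored into a reusable helper. A minimal sketch (the helper name `run_coro_in_loop` is illustrative, not part of the codebase; note that `asyncio.get_event_loop()` is deprecated outside a running loop in newer Python versions, which is why the `RuntimeError` fallback matters):

```python
import asyncio

def run_coro_in_loop(coro):
    # Reuse the thread's event loop when one exists and is open;
    # otherwise create a fresh loop, as the handler above does.
    try:
        loop = asyncio.get_event_loop()
        if loop.is_closed():
            loop = asyncio.new_event_loop()
            asyncio.set_event_loop(loop)
    except RuntimeError:
        # No event loop in the current thread
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
    return loop.run_until_complete(coro)

async def _demo():
    return "done"

print(run_coro_in_loop(_demo()))
```

Unlike `asyncio.run`, this pattern does not close the loop afterward, so repeated tool calls in the same worker process keep working.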

    else:
        return json.dumps({"error": f"Unknown image generation function: {function_name}"})

@@ -663,12 +653,6 @@ def handle_function_call(function_name: str, function_args: Dict[str, Any]) -> s
    elif function_name in ["image_generate"]:
        return handle_image_function_call(function_name, function_args)

    # Future toolsets can be routed here:
    # elif function_name in ["file_read_tool", "file_write_tool"]:
    #     return handle_file_function_call(function_name, function_args)
    # elif function_name in ["code_execute_tool", "code_analyze_tool"]:
    #     return handle_code_function_call(function_name, function_args)

    else:
        error_msg = f"Unknown function: {function_name}"
        print(f"❌ {error_msg}")
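The routing in `handle_function_call` is name-based dispatch with a JSON error envelope as the fallback. A minimal sketch of the pattern (the handler bodies here are placeholders):

```python
import json

def dispatch(function_name: str, function_args: dict) -> str:
    # Route by tool name, falling back to a JSON error envelope,
    # mirroring handle_function_call's structure.
    handlers = {
        "web_search": lambda args: json.dumps({"results": []}),
        "vision_analyze": lambda args: json.dumps({"success": True}),
    }
    handler = handlers.get(function_name)
    if handler is None:
        return json.dumps({"error": f"Unknown function: {function_name}"})
    return handler(function_args)

print(dispatch("nope", {}))  # {"error": "Unknown function: nope"}
```

Because every branch returns a JSON string, the agent loop can always feed the result straight back to the model as a tool message.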
@@ -717,7 +701,6 @@ def get_available_toolsets() -> Dict[str, Dict[str, Any]]:
            "description": "Generate high-quality images from text prompts using FAL.ai's FLUX.1 Krea model with automatic 2x upscaling for enhanced quality",
            "requirements": ["FAL_KEY environment variable", "fal-client package"]
        }
        # Future toolsets can be added here
    }

    return toolsets

pyproject.toml (new file, 28 lines)
@@ -0,0 +1,28 @@
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

[project]
name = "hermes-agent"
version = "0.1.0"
description = "AI agent with advanced tool-calling and toolsets"
readme = "README.md"
requires-python = ">=3.10"
authors = [{ name = "Hermes Agent" }]
license = { text = "MIT" }
dependencies = [
    "firecrawl-py",
    "openai",
    "fal-client",
    "python-dotenv",
    "fire"
]

[project.scripts]
hermes-agent = "run_agent:main"

[tool.setuptools]
py-modules = ["run_agent", "model_tools", "toolsets"]

[tool.setuptools.packages.find]
include = ["tools"]

@@ -1,3 +1,6 @@
firecrawl-py
openai
fal-client
fal-client
python-dotenv
fire
requests

run_agent.py (1369 lines changed; diff suppressed because it is too large)

run_datagen_images.sh (new file, 12 lines)
@@ -0,0 +1,12 @@
python batch_runner.py \
    --dataset_file="hermes-agent-imagen-data/hermes_agent_imagen_eval.jsonl" \
    --batch_size=10 \
    --run_name="imagen_eval_gpt5" \
    --distribution="image_gen" \
    --model="gpt-5" \
    --base_url="https://api.openai.com/v1" \
    --api_key="${OPENAI_API_KEY}" \
    --num_workers=4 \
    --max_turns=5 \
    --verbose \
    --ephemeral_system_prompt="When generating an image for the user view the image by using the vision_analyze tool to ensure it is what the user wanted. If it isn't feel free to retry a few times. If none are perfect, choose the best option that is the closest match, and explain its imperfections. If the image generation tool fails, try again a few times. If the vision analyze tool fails, provide the image to the user and explain it is your best effort attempt."

test_run.sh (25 lines changed, Normal file → Executable file)
@@ -1,14 +1,23 @@
#!/bin/bash

# Check if a prompt argument was provided
if [ $# -eq 0 ]; then
    echo "Error: Please provide a prompt as an argument"
    echo "Usage: $0 \"your prompt here\""
    exit 1
fi

# Get the prompt from the first argument
PROMPT="$1"

# Set debug mode for web tools
export WEB_TOOLS_DEBUG=true

# Run the agent with the provided prompt
python run_agent.py \
    --query "Tell me about this animal pictured: https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQi1nkrYXY-ijQv5aCxkwooyg2roNFxj0ewJA&s" \
    --query "$PROMPT" \
    --max_turns 30 \
    --model claude-sonnet-4-20250514 \
    --model claude-sonnet-4-5-20250929 \
    --base_url https://api.anthropic.com/v1/ \
    --api_key $ANTHROPIC_API_KEY \
    --enabled_toolsets=vision_tools

#Possible Toolsets:
#web_tools
#vision_tools
#terminal_tools
    --save_trajectories

tests/__init__.py (new file, empty)

tests/test_batch_runner.py (new file, 129 lines)
@@ -0,0 +1,129 @@
#!/usr/bin/env python3
"""
Test script for batch runner

This script tests the batch runner with a small sample dataset
to verify functionality before running large batches.
"""

import json
import shutil
from pathlib import Path


def create_test_dataset():
    """Create a small test dataset."""
    test_file = Path("tests/test_dataset.jsonl")
    test_file.parent.mkdir(exist_ok=True)

    prompts = [
        {"prompt": "What is 2 + 2?"},
        {"prompt": "What is the capital of France?"},
        {"prompt": "Explain what Python is in one sentence."},
    ]

    with open(test_file, 'w') as f:
        for prompt in prompts:
            f.write(json.dumps(prompt) + "\n")

    print(f"✅ Created test dataset: {test_file}")
    return test_file
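The dataset creation above is a standard JSONL write: one JSON object per line. A self-contained sketch of the same pattern using a temporary directory (the helper name `write_jsonl` is illustrative):

```python
import json
import tempfile
from pathlib import Path

def write_jsonl(path: Path, rows: list) -> Path:
    # One JSON object per line, as create_test_dataset() does.
    with open(path, "w") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")
    return path

with tempfile.TemporaryDirectory() as d:
    p = write_jsonl(Path(d) / "dataset.jsonl", [{"prompt": "What is 2 + 2?"}])
    lines = p.read_text().splitlines()
    print(len(lines))  # 1
```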

def cleanup_test_run(run_name):
    """Clean up test run output."""
    output_dir = Path("data") / run_name
    if output_dir.exists():
        shutil.rmtree(output_dir)
        print(f"🗑️ Cleaned up test output: {output_dir}")


def verify_output(run_name):
    """Verify that output files were created correctly."""
    output_dir = Path("data") / run_name

    # Check directory exists
    if not output_dir.exists():
        print(f"❌ Output directory not found: {output_dir}")
        return False

    # Check for checkpoint
    checkpoint_file = output_dir / "checkpoint.json"
    if not checkpoint_file.exists():
        print(f"❌ Checkpoint file not found: {checkpoint_file}")
        return False

    # Check for statistics
    stats_file = output_dir / "statistics.json"
    if not stats_file.exists():
        print(f"❌ Statistics file not found: {stats_file}")
        return False

    # Check for batch files
    batch_files = list(output_dir.glob("batch_*.jsonl"))
    if not batch_files:
        print(f"❌ No batch files found in: {output_dir}")
        return False

    print(f"✅ Output verification passed:")
    print(f"  - Checkpoint: {checkpoint_file}")
    print(f"  - Statistics: {stats_file}")
    print(f"  - Batch files: {len(batch_files)}")

    # Load and display statistics
    with open(stats_file) as f:
        stats = json.load(f)

    print(f"\n📊 Statistics Summary:")
    print(f"  - Total prompts: {stats['total_prompts']}")
    print(f"  - Total batches: {stats['total_batches']}")
    print(f"  - Duration: {stats['duration_seconds']}s")

    if stats.get('tool_statistics'):
        print(f"  - Tool calls:")
        for tool, tool_stats in stats['tool_statistics'].items():
            print(f"    • {tool}: {tool_stats['count']} calls, {tool_stats['success_rate']:.1f}% success")

    return True


def main():
    """Run the test."""
    print("🧪 Batch Runner Test")
    print("=" * 60)

    run_name = "test_run"

    # Clean up any previous test run
    cleanup_test_run(run_name)

    # Create test dataset
    test_file = create_test_dataset()

    print(f"\n📝 To run the test manually:")
    print(f"  python batch_runner.py \\")
    print(f"    --dataset_file={test_file} \\")
    print(f"    --batch_size=2 \\")
    print(f"    --run_name={run_name} \\")
    print(f"    --distribution=minimal \\")
    print(f"    --num_workers=2")

    print(f"\n💡 Or test with different distributions:")
    print(f"  python batch_runner.py --list_distributions")

    print(f"\n🔍 After running, you can verify output with:")
    print(f"  python tests/test_batch_runner.py --verify")

    # Note: We don't actually run the batch runner here to avoid API calls during testing
    # Users should run it manually with their API keys configured


if __name__ == "__main__":
    import sys

    if "--verify" in sys.argv:
        run_name = "test_run"
        verify_output(run_name)
    else:
        main()

@@ -23,8 +23,8 @@ import argparse
from datetime import datetime
from typing import List, Dict, Any

# Import the web tools to test
from web_tools import (
# Import the web tools to test (updated path after moving tools/)
from tools.web_tools import (
    web_search_tool,
    web_extract_tool,
    web_crawl_tool,

tools/__init__.py (new file, 67 lines)
@@ -0,0 +1,67 @@
#!/usr/bin/env python3
"""
Tools Package

This package contains all the specific tool implementations for the Hermes Agent.
Each module provides specialized functionality for different capabilities:

- web_tools: Web search, content extraction, and crawling
- terminal_tool: Command execution on virtual machines
- vision_tools: Image analysis and understanding
- mixture_of_agents_tool: Multi-model collaborative reasoning
- image_generation_tool: Text-to-image generation with upscaling

The tools are imported into model_tools.py which provides a unified interface
for the AI agent to access all capabilities.
"""

# Export all tools for easy importing
from .web_tools import (
    web_search_tool,
    web_extract_tool,
    web_crawl_tool,
    check_firecrawl_api_key
)

from .terminal_tool import (
    terminal_tool,
    check_hecate_requirements,
    TERMINAL_TOOL_DESCRIPTION
)

from .vision_tools import (
    vision_analyze_tool,
    check_vision_requirements
)

from .mixture_of_agents_tool import (
    mixture_of_agents_tool,
    check_moa_requirements
)

from .image_generation_tool import (
    image_generate_tool,
    check_image_generation_requirements
)

__all__ = [
    # Web tools
    'web_search_tool',
    'web_extract_tool',
    'web_crawl_tool',
    'check_firecrawl_api_key',
    # Terminal tools
    'terminal_tool',
    'check_hecate_requirements',
    'TERMINAL_TOOL_DESCRIPTION',
    # Vision tools
    'vision_analyze_tool',
    'check_vision_requirements',
    # MoA tools
    'mixture_of_agents_tool',
    'check_moa_requirements',
    # Image generation tools
    'image_generate_tool',
    'check_image_generation_requirements',
]

@@ -319,9 +319,6 @@ async def image_generate_tool(
    if not prompt or not isinstance(prompt, str) or len(prompt.strip()) == 0:
        raise ValueError("Prompt is required and must be a non-empty string")

    if len(prompt) > 1000:
        raise ValueError("Prompt must be 1000 characters or less")

    # Check API key availability
    if not os.getenv("FAL_KEY"):
        raise ValueError("FAL_KEY environment variable not set")

(File diff suppressed because it is too large)

@@ -4,26 +4,27 @@ Terminal Tool Module

This module provides a single terminal tool using Hecate's VM infrastructure.
It wraps Hecate's functionality to provide a simple interface for executing commands
on Morph VMs with automatic lifecycle management.
on Morph VMs with automatic lifecycle management. VMs live for 5 minutes after last use.
Timer resets with each use.

Available tool:
- terminal_tool: Execute commands with optional interactive session support

Usage:
    from terminal_tool import terminal_tool

    # Execute a single command
    result = terminal_tool("ls -la")

    # Execute in an interactive session
    result = terminal_tool("python", input_keys="print('hello')\\nexit()\\n")
"""
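Every call to `terminal_tool` returns a JSON string, so callers should parse it and inspect the envelope fields before using the output. A sketch of consuming the documented shape (the payload below is fabricated for illustration; it only mirrors the fields the module returns):

```python
import json

# Fields match the formatted_result built by terminal_tool (sketch data).
raw = json.dumps({
    "output": "total 0\n",
    "screen": "",
    "session_id": None,
    "exit_code": 0,
    "error": None,
    "status": "ended",
})

result = json.loads(raw)
if result["error"]:
    print("tool failed:", result["error"])
else:
    print(result["status"], result["exit_code"])
```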

import json
import os
import uuid
import threading
from typing import Optional, Dict, Any
from hecate import run_tool_with_lifecycle_management
from morphcloud._llm import ToolCall

# Detailed description for the terminal tool based on Hermes Terminal system prompt
TERMINAL_TOOL_DESCRIPTION = """Execute commands on a secure, persistent Linux VM environment with full interactive application support.
@@ -72,14 +73,19 @@ When commands enter interactive mode (vim, nano, less, git prompts, package mana
- Test components incrementally with mock inputs
- Install whatever tools needed - full system access provided"""

# Global state for VM lifecycle management
# These persist across tool calls to enable session continuity
_active_instance = None
_active_context = None
_instance_lock = threading.Lock()

def terminal_tool(
    command: Optional[str] = None,
    input_keys: Optional[str] = None,
    session_id: Optional[str] = None,
    background: bool = False,
    idle_threshold: float = 5.0,
    timeout: Optional[int] = None,
    snapshot_id: str | None = None,
    timeout: Optional[int] = None
) -> str:
    """
    Execute a command on a Morph VM with optional interactive session support.
@@ -114,10 +120,60 @@ def terminal_tool(
    # Run a background task
    >>> result = terminal_tool(command="sleep 60", background=True)
    """
    global _active_instance, _active_context

    try:
        # Import required modules lazily so this module can be imported
        # even when hecate is not installed
        try:
            from morphcloud._llm import ToolCall
            from morphcloud.api import MorphCloudClient
            from hecate.cli import run_tool, ExecutionContext
            from rich.console import Console
            import io
        except ImportError as import_error:
            return json.dumps({
                "output": "",
                "screen": "",
                "session_id": None,
                "exit_code": -1,
                "error": f"Terminal tool is disabled due to import error: {import_error}",
                "status": "disabled"
            })

        # Get configuration from environment
        vm_lifetime_seconds = int(os.getenv("HECATE_VM_LIFETIME_SECONDS", "300"))
        snapshot_id = os.getenv("HECATE_DEFAULT_SNAPSHOT_ID", "python-2025-10-31")

        # Check API key
        morph_api_key = os.getenv("MORPH_API_KEY")
        if not morph_api_key:
            return json.dumps({
                "output": "",
                "screen": "",
                "session_id": None,
                "exit_code": -1,
                "error": "MORPH_API_KEY environment variable not set",
                "status": "disabled"
            })

        # Get or create VM instance and execution context
        # This is critical for interactive session support - the context must persist!
        with _instance_lock:
            if _active_instance is None:
                morph_client = MorphCloudClient(api_key=morph_api_key)
                _active_instance = morph_client.instances.start(snapshot_id=snapshot_id)

            # Get or create persistent execution context
            if _active_context is None:
                _active_context = ExecutionContext()

            instance = _active_instance
            ctx = _active_context

        # Build tool input based on provided parameters
        tool_input = {}

        if command:
            tool_input["command"] = command
        if input_keys:
@@ -130,15 +186,28 @@ def terminal_tool(
            tool_input["idle_threshold"] = idle_threshold
        if timeout is not None:
            tool_input["timeout"] = timeout

        tool_call = ToolCall(
            name="run_command",
            input=tool_input
        )

        # Execute with lifecycle management
        result = run_tool_with_lifecycle_management(tool_call, snapshot_id=snapshot_id)

        # Create a console for output (redirect to string buffer to avoid printing)
        console_output = io.StringIO()
        console = Console(file=console_output, force_terminal=False, legacy_windows=False)

        # Generate unique tool block ID
        tool_block_id = f"tool_{uuid.uuid4().hex[:8]}"

        # Execute the tool with hecate
        result = run_tool(
            tool_call=tool_call,
            instance=instance,
            console=console,
            tool_block_id=tool_block_id,
            ctx=ctx
        )

        # Format the result with all possible fields
        # Map hecate's "stdout" to "output" for compatibility
        formatted_result = {
@@ -149,9 +218,9 @@ def terminal_tool(
            "error": result.get("error"),
            "status": "active" if result.get("session_id") else "ended"
        }

        return json.dumps(formatted_result)

    except Exception as e:
        return json.dumps({
            "output": "",
@@ -184,12 +253,16 @@ def check_hecate_requirements() -> bool:
        print(f"Warning: Missing optional environment variables: {', '.join(missing_optional)}")
        print("  (Some Hecate features may be limited)")

    # Check if Hecate is importable
    # Check if Hecate and required modules are importable
    try:
        import hecate
        from morphcloud._llm import ToolCall
        from morphcloud.api import MorphCloudClient
        from hecate.cli import run_tool, ExecutionContext
        from rich.console import Console
        return True
    except ImportError:
        print("Hecate is not installed. Please install it with: pip install hecate")
    except Exception as e:
        print(f"Hecate not available: {e}")
        print(f"Make sure hecate is installed and MORPH_API_KEY is set.")
        return False

# Module-level initialization check
@@ -1,346 +1,471 @@
#!/usr/bin/env python3
"""
Vision Tools Module

This module provides vision analysis tools that work with image URLs.
Uses Gemini Flash via Nous Research API for intelligent image understanding.

Available tools:
- vision_analyze_tool: Analyze images from URLs with custom prompts

Features:
- Comprehensive image description
- Context-aware analysis based on user queries
- Proper error handling and validation
- Debug logging support

Usage:
    from vision_tools import vision_analyze_tool
    import asyncio

    # Analyze an image
    result = await vision_analyze_tool(
        image_url="https://example.com/image.jpg",
        user_prompt="What architectural style is this building?"
    )
"""

import json
import os
import asyncio
import uuid
import datetime
from pathlib import Path
from typing import Dict, Any, Optional
from openai import AsyncOpenAI

# Initialize Nous Research API client for vision processing
nous_client = AsyncOpenAI(
    api_key=os.getenv("NOUS_API_KEY"),
    base_url="https://inference-api.nousresearch.com/v1"
)

# Configuration for vision processing
DEFAULT_VISION_MODEL = "gemini-2.5-flash"

# Debug mode configuration
DEBUG_MODE = os.getenv("VISION_TOOLS_DEBUG", "false").lower() == "true"
DEBUG_SESSION_ID = str(uuid.uuid4())
DEBUG_LOG_PATH = Path("./logs")
DEBUG_DATA = {
    "session_id": DEBUG_SESSION_ID,
    "start_time": datetime.datetime.now().isoformat(),
    "debug_enabled": DEBUG_MODE,
    "tool_calls": []
} if DEBUG_MODE else None

# Create logs directory if debug mode is enabled
if DEBUG_MODE:
    DEBUG_LOG_PATH.mkdir(exist_ok=True)
    print(f"🐛 Vision debug mode enabled - Session ID: {DEBUG_SESSION_ID}")


def _log_debug_call(tool_name: str, call_data: Dict[str, Any]) -> None:
    """
    Log a debug call entry to the global debug data structure.

    Args:
        tool_name (str): Name of the tool being called
        call_data (Dict[str, Any]): Data about the call including parameters and results
    """
    if not DEBUG_MODE or not DEBUG_DATA:
        return

    call_entry = {
        "timestamp": datetime.datetime.now().isoformat(),
        "tool_name": tool_name,
        **call_data
    }

    DEBUG_DATA["tool_calls"].append(call_entry)


def _save_debug_log() -> None:
    """
    Save the current debug data to a JSON file in the logs directory.
    """
    if not DEBUG_MODE or not DEBUG_DATA:
        return

    try:
        debug_filename = f"vision_tools_debug_{DEBUG_SESSION_ID}.json"
        debug_filepath = DEBUG_LOG_PATH / debug_filename

        # Update end time
        DEBUG_DATA["end_time"] = datetime.datetime.now().isoformat()
        DEBUG_DATA["total_calls"] = len(DEBUG_DATA["tool_calls"])

        with open(debug_filepath, 'w', encoding='utf-8') as f:
            json.dump(DEBUG_DATA, f, indent=2, ensure_ascii=False)

        print(f"🐛 Vision debug log saved: {debug_filepath}")

    except Exception as e:
        print(f"❌ Error saving vision debug log: {str(e)}")


def _validate_image_url(url: str) -> bool:
    """
    Basic validation of image URL format.

    Args:
        url (str): The URL to validate

    Returns:
        bool: True if URL appears to be valid, False otherwise
    """
    if not url or not isinstance(url, str):
        return False

    # Check if it's a valid URL format
    if not (url.startswith('http://') or url.startswith('https://')):
        return False

    # Check for common image extensions (optional, as URLs may not have extensions)
    image_extensions = ['.jpg', '.jpeg', '.png', '.gif', '.bmp', '.webp', '.svg']

    return True  # Allow all HTTP/HTTPS URLs for flexibility
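Note that `_validate_image_url` only enforces the scheme; the extension list is collected but intentionally unused. A standalone equivalent of the check:

```python
def validate_image_url(url) -> bool:
    # Same behavior as _validate_image_url: require a string
    # with an http:// or https:// scheme, allow any path.
    if not url or not isinstance(url, str):
        return False
    return url.startswith(("http://", "https://"))

print(validate_image_url("https://example.com/image.jpg"))  # True
```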

async def vision_analyze_tool(
    image_url: str,
    user_prompt: str,
    model: str = DEFAULT_VISION_MODEL
) -> str:
    """
    Analyze an image from a URL using vision AI.

    This tool processes images using Gemini Flash via Nous Research API.
    The user_prompt parameter is expected to be pre-formatted by the calling
    function (typically model_tools.py) to include both full description
    requests and specific questions.

    Args:
        image_url (str): The URL of the image to analyze
        user_prompt (str): The pre-formatted prompt for the vision model
        model (str): The vision model to use (default: gemini-2.5-flash)

    Returns:
        str: JSON string containing the analysis results with the following structure:
        {
            "success": bool,
            "analysis": str (defaults to error message if None)
        }

    Raises:
        Exception: If analysis fails or API key is not set
    """
    debug_call_data = {
        "parameters": {
            "image_url": image_url,
            "user_prompt": user_prompt,
            "model": model
        },
        "error": None,
        "success": False,
        "analysis_length": 0,
        "model_used": model
    }

    try:
        print(f"🔍 Analyzing image from URL: {image_url[:60]}{'...' if len(image_url) > 60 else ''}")
        print(f"📝 User prompt: {user_prompt[:100]}{'...' if len(user_prompt) > 100 else ''}")

        # Validate image URL
        if not _validate_image_url(image_url):
            raise ValueError("Invalid image URL format. Must start with http:// or https://")

        # Check API key availability
        if not os.getenv("NOUS_API_KEY"):
            raise ValueError("NOUS_API_KEY environment variable not set")

        # Use the prompt as provided (model_tools.py now handles full description formatting)
        comprehensive_prompt = user_prompt

        # Prepare the message with image URL format
        messages = [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": comprehensive_prompt
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": image_url
                        }
                    }
                ]
            }
        ]
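The `messages` structure above is the OpenAI-style multimodal chat format: one user turn whose content is a text part plus an `image_url` part. A small builder sketch (the function name is illustrative):

```python
def build_vision_messages(prompt: str, image_url: str) -> list:
    # One user turn containing a text part and an image_url part,
    # matching the shape passed to chat.completions.create above.
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }
    ]

msgs = build_vision_messages("Describe this image.", "https://example.com/a.jpg")
print(msgs[0]["content"][1]["image_url"]["url"])
```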
|
||||
print(f"🧠 Processing image with {model}...")
|
||||
|
||||
# Call the vision API
|
||||
response = await nous_client.chat.completions.create(
|
||||
model=model,
|
||||
messages=messages,
|
||||
temperature=0.1, # Low temperature for consistent analysis
|
||||
max_tokens=2000 # Generous limit for detailed analysis
|
||||
)
|
||||
|
||||
# Extract the analysis
|
||||
analysis = response.choices[0].message.content.strip()
|
||||
analysis_length = len(analysis)
|
||||
|
||||
print(f"✅ Image analysis completed ({analysis_length} characters)")
|
||||
|
||||
# Prepare successful response
|
||||
result = {
|
||||
"success": True,
|
||||
"analysis": analysis or "There was a problem with the request and the image could not be analyzed."
|
||||
}
|
||||
|
||||
debug_call_data["success"] = True
|
||||
debug_call_data["analysis_length"] = analysis_length
|
||||
|
||||
# Log debug information
|
||||
_log_debug_call("vision_analyze_tool", debug_call_data)
|
||||
_save_debug_log()
|
||||
|
||||
return json.dumps(result, indent=2)
|
||||
|
||||
except Exception as e:
|
||||
error_msg = f"Error analyzing image: {str(e)}"
|
||||
print(f"❌ {error_msg}")
|
||||
|
||||
# Prepare error response
|
||||
result = {
|
||||
"success": False,
|
||||
"analysis": "There was a problem with the request and the image could not be analyzed."
|
||||
}
|
||||
|
||||
debug_call_data["error"] = error_msg
|
||||
_log_debug_call("vision_analyze_tool", debug_call_data)
|
||||
_save_debug_log()
|
||||
|
||||
return json.dumps(result, indent=2)
|
||||
|
||||
|
||||
def check_nous_api_key() -> bool:
|
||||
"""
|
||||
Check if the Nous Research API key is available in environment variables.
|
||||
|
||||
Returns:
|
||||
bool: True if API key is set, False otherwise
|
||||
"""
|
||||
return bool(os.getenv("NOUS_API_KEY"))
|
||||
|
||||
|
||||
def check_vision_requirements() -> bool:
|
||||
"""
|
||||
Check if all requirements for vision tools are met.
|
||||
|
||||
Returns:
|
||||
bool: True if requirements are met, False otherwise
|
||||
"""
|
||||
return check_nous_api_key()
|
||||
|
||||
|
||||
def get_debug_session_info() -> Dict[str, Any]:
|
||||
"""
|
||||
Get information about the current debug session.
|
||||
|
||||
Returns:
|
||||
Dict[str, Any]: Dictionary containing debug session information
|
||||
"""
|
||||
if not DEBUG_MODE or not DEBUG_DATA:
|
||||
return {
|
||||
"enabled": False,
|
||||
"session_id": None,
|
||||
"log_path": None,
|
||||
"total_calls": 0
|
||||
}
|
||||
|
||||
return {
|
||||
"enabled": True,
|
||||
"session_id": DEBUG_SESSION_ID,
|
||||
"log_path": str(DEBUG_LOG_PATH / f"vision_tools_debug_{DEBUG_SESSION_ID}.json"),
|
||||
"total_calls": len(DEBUG_DATA["tool_calls"])
|
||||
}
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
"""
|
||||
Simple test/demo when run directly
|
||||
"""
|
||||
print("👁️ Vision Tools Module")
|
||||
print("=" * 40)
|
||||
|
||||
# Check if API key is available
|
||||
api_available = check_nous_api_key()
|
||||
|
||||
if not api_available:
|
||||
print("❌ NOUS_API_KEY environment variable not set")
|
||||
print("Please set your API key: export NOUS_API_KEY='your-key-here'")
|
||||
print("Get API key at: https://inference-api.nousresearch.com/")
|
||||
exit(1)
|
||||
else:
|
||||
print("✅ Nous Research API key found")
|
||||
|
||||
print("🛠️ Vision tools ready for use!")
|
||||
print(f"🧠 Using model: {DEFAULT_VISION_MODEL}")
|
||||
|
||||
# Show debug mode status
|
||||
if DEBUG_MODE:
|
||||
print(f"🐛 Debug mode ENABLED - Session ID: {DEBUG_SESSION_ID}")
|
||||
print(f" Debug logs will be saved to: ./logs/vision_tools_debug_{DEBUG_SESSION_ID}.json")
|
||||
else:
|
||||
print("🐛 Debug mode disabled (set VISION_TOOLS_DEBUG=true to enable)")
|
||||
|
||||
print("\nBasic usage:")
|
||||
print(" from vision_tools import vision_analyze_tool")
|
||||
print(" import asyncio")
|
||||
print("")
|
||||
print(" async def main():")
|
||||
print(" result = await vision_analyze_tool(")
|
||||
print(" image_url='https://example.com/image.jpg',")
|
||||
print(" user_prompt='What do you see in this image?'")
|
||||
print(" )")
|
||||
print(" print(result)")
|
||||
print(" asyncio.run(main())")
|
||||
|
||||
print("\nExample prompts:")
|
||||
print(" - 'What architectural style is this building?'")
|
||||
print(" - 'Describe the emotions and mood in this image'")
|
||||
print(" - 'What text can you read in this image?'")
|
||||
print(" - 'Identify any safety hazards visible'")
|
||||
print(" - 'What products or brands are shown?'")
|
||||
|
||||
print("\nDebug mode:")
|
||||
print(" # Enable debug logging")
|
||||
print(" export VISION_TOOLS_DEBUG=true")
|
||||
print(" # Debug logs capture all vision analysis calls and results")
|
||||
print(" # Logs saved to: ./logs/vision_tools_debug_UUID.json")
|
||||
#!/usr/bin/env python3
"""
Vision Tools Module

This module provides vision analysis tools that work with image URLs.
Uses Gemini Flash via Nous Research API for intelligent image understanding.

Available tools:
- vision_analyze_tool: Analyze images from URLs with custom prompts

Features:
- Downloads images from URLs and converts to base64 for API compatibility
- Comprehensive image description
- Context-aware analysis based on user queries
- Automatic temporary file cleanup
- Proper error handling and validation
- Debug logging support

Usage:
    from vision_tools import vision_analyze_tool
    import asyncio

    # Analyze an image
    result = await vision_analyze_tool(
        image_url="https://example.com/image.jpg",
        user_prompt="What architectural style is this building?"
    )
"""

import json
import os
import asyncio
import uuid
import datetime
import base64
from pathlib import Path
from typing import Dict, Any, Optional
from openai import AsyncOpenAI
import httpx  # Use httpx for async HTTP requests

# Initialize Nous Research API client for vision processing
nous_client = AsyncOpenAI(
    api_key=os.getenv("NOUS_API_KEY"),
    base_url="https://inference-api.nousresearch.com/v1"
)

# Configuration for vision processing
DEFAULT_VISION_MODEL = "gemini-2.5-flash"

# Debug mode configuration
DEBUG_MODE = os.getenv("VISION_TOOLS_DEBUG", "false").lower() == "true"
DEBUG_SESSION_ID = str(uuid.uuid4())
DEBUG_LOG_PATH = Path("./logs")
DEBUG_DATA = {
    "session_id": DEBUG_SESSION_ID,
    "start_time": datetime.datetime.now().isoformat(),
    "debug_enabled": DEBUG_MODE,
    "tool_calls": []
} if DEBUG_MODE else None

# Create logs directory if debug mode is enabled
if DEBUG_MODE:
    DEBUG_LOG_PATH.mkdir(exist_ok=True)
    print(f"🐛 Vision debug mode enabled - Session ID: {DEBUG_SESSION_ID}")


def _log_debug_call(tool_name: str, call_data: Dict[str, Any]) -> None:
    """
    Log a debug call entry to the global debug data structure.

    Args:
        tool_name (str): Name of the tool being called
        call_data (Dict[str, Any]): Data about the call including parameters and results
    """
    if not DEBUG_MODE or not DEBUG_DATA:
        return

    call_entry = {
        "timestamp": datetime.datetime.now().isoformat(),
        "tool_name": tool_name,
        **call_data
    }

    DEBUG_DATA["tool_calls"].append(call_entry)


def _save_debug_log() -> None:
    """
    Save the current debug data to a JSON file in the logs directory.
    """
    if not DEBUG_MODE or not DEBUG_DATA:
        return

    try:
        debug_filename = f"vision_tools_debug_{DEBUG_SESSION_ID}.json"
        debug_filepath = DEBUG_LOG_PATH / debug_filename

        # Update end time
        DEBUG_DATA["end_time"] = datetime.datetime.now().isoformat()
        DEBUG_DATA["total_calls"] = len(DEBUG_DATA["tool_calls"])

        with open(debug_filepath, 'w', encoding='utf-8') as f:
            json.dump(DEBUG_DATA, f, indent=2, ensure_ascii=False)

        print(f"🐛 Vision debug log saved: {debug_filepath}")

    except Exception as e:
        print(f"❌ Error saving vision debug log: {str(e)}")


def _validate_image_url(url: str) -> bool:
    """
    Basic validation of image URL format.

    Args:
        url (str): The URL to validate

    Returns:
        bool: True if URL appears to be valid, False otherwise
    """
    if not url or not isinstance(url, str):
        return False

    # Accept any http:// or https:// URL; file extensions are deliberately
    # not checked, since many valid image URLs do not include one.
    return url.startswith('http://') or url.startswith('https://')


async def _download_image(image_url: str, destination: Path) -> Path:
    """
    Download an image from a URL to a local destination (async).

    Args:
        image_url (str): The URL of the image to download
        destination (Path): The path where the image should be saved

    Returns:
        Path: The path to the downloaded image

    Raises:
        Exception: If download fails or response is invalid
    """
    # Create parent directories if they don't exist
    destination.parent.mkdir(parents=True, exist_ok=True)

    # Download the image with appropriate headers using async httpx
    async with httpx.AsyncClient(timeout=30.0) as client:
        response = await client.get(
            image_url,
            headers={"User-Agent": "hermes-agent-vision/1.0"},
        )
        response.raise_for_status()

    # Save the image content
    destination.write_bytes(response.content)

    return destination


def _determine_mime_type(image_path: Path) -> str:
    """
    Determine the MIME type of an image based on its file extension.

    Args:
        image_path (Path): Path to the image file

    Returns:
        str: The MIME type (defaults to image/jpeg if unknown)
    """
    extension = image_path.suffix.lower()
    mime_types = {
        '.jpg': 'image/jpeg',
        '.jpeg': 'image/jpeg',
        '.png': 'image/png',
        '.gif': 'image/gif',
        '.bmp': 'image/bmp',
        '.webp': 'image/webp',
        '.svg': 'image/svg+xml'
    }
    return mime_types.get(extension, 'image/jpeg')


def _image_to_base64_data_url(image_path: Path, mime_type: Optional[str] = None) -> str:
    """
    Convert an image file to a base64-encoded data URL.

    Args:
        image_path (Path): Path to the image file
        mime_type (Optional[str]): MIME type of the image (auto-detected if None)

    Returns:
        str: Base64-encoded data URL (e.g., "data:image/jpeg;base64,...")
    """
    # Read the image as bytes
    data = image_path.read_bytes()

    # Encode to base64
    encoded = base64.b64encode(data).decode("ascii")

    # Determine MIME type
    mime = mime_type or _determine_mime_type(image_path)

    # Create data URL
    data_url = f"data:{mime};base64,{encoded}"

    return data_url
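The data-URL construction above can be sketched standalone with only the standard library. A minimal example (the byte payload here is a hypothetical stand-in for real image data, not something from this repo):

```python
import base64

def to_data_url(data: bytes, mime: str = "image/jpeg") -> str:
    # Same shape as _image_to_base64_data_url: "data:<mime>;base64,<payload>"
    return f"data:{mime};base64,{base64.b64encode(data).decode('ascii')}"

# Tiny illustrative payload (the 4-byte JPEG magic prefix, not a full image).
url = to_data_url(b"\xff\xd8\xff\xe0", "image/jpeg")
print(url)  # data:image/jpeg;base64,/9j/4A==
```

The payload round-trips: splitting on the first comma and base64-decoding recovers the original bytes, which is what the OpenAI-style `image_url` content part expects.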


async def vision_analyze_tool(
    image_url: str,
    user_prompt: str,
    model: str = DEFAULT_VISION_MODEL
) -> str:
    """
    Analyze an image from a URL using vision AI.

    This tool downloads images from URLs, converts them to base64, and processes
    them using Gemini Flash via Nous Research API. The image is downloaded to a
    temporary location and automatically cleaned up after processing.

    The user_prompt parameter is expected to be pre-formatted by the calling
    function (typically model_tools.py) to include both full description
    requests and specific questions.

    Args:
        image_url (str): The URL of the image to analyze (must be http:// or https://)
        user_prompt (str): The pre-formatted prompt for the vision model
        model (str): The vision model to use (default: gemini-2.5-flash)

    Returns:
        str: JSON string containing the analysis results with the following structure:
            {
                "success": bool,
                "analysis": str (defaults to error message if None)
            }

    Raises:
        Exception: If download fails, analysis fails, or API key is not set

    Note:
        - Temporary images are stored in ./temp_vision_images/
        - Images are automatically deleted after processing
        - Supports common image formats (JPEG, PNG, GIF, WebP, etc.)
    """
    debug_call_data = {
        "parameters": {
            "image_url": image_url,
            "user_prompt": user_prompt[:200] + "..." if len(user_prompt) > 200 else user_prompt,
            "model": model
        },
        "error": None,
        "success": False,
        "analysis_length": 0,
        "model_used": model,
        "image_size_bytes": 0
    }

    temp_image_path = None

    try:
        print(f"🔍 Analyzing image from URL: {image_url[:60]}{'...' if len(image_url) > 60 else ''}", flush=True)
        print(f"📝 User prompt: {user_prompt[:100]}{'...' if len(user_prompt) > 100 else ''}", flush=True)

        # Validate image URL
        if not _validate_image_url(image_url):
            raise ValueError("Invalid image URL format. Must start with http:// or https://")

        # Check API key availability
        if not os.getenv("NOUS_API_KEY"):
            raise ValueError("NOUS_API_KEY environment variable not set")

        # Download the image to a temporary location
        print("⬇️ Downloading image from URL...", flush=True)
        temp_dir = Path("./temp_vision_images")
        temp_image_path = temp_dir / f"temp_image_{uuid.uuid4()}.jpg"

        await _download_image(image_url, temp_image_path)

        # Get image file size for logging
        image_size_bytes = temp_image_path.stat().st_size
        image_size_kb = image_size_bytes / 1024
        print(f"✅ Image downloaded successfully ({image_size_kb:.1f} KB)", flush=True)

        # Convert image to base64 data URL
        print("🔄 Converting image to base64...", flush=True)
        image_data_url = _image_to_base64_data_url(temp_image_path)
        # Calculate size in KB for better readability
        data_size_kb = len(image_data_url) / 1024
        print(f"✅ Image converted to base64 ({data_size_kb:.1f} KB)", flush=True)

        debug_call_data["image_size_bytes"] = image_size_bytes

        # Use the prompt as provided (model_tools.py now handles full description formatting)
        comprehensive_prompt = user_prompt

        # Prepare the message with base64-encoded image
        messages = [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": comprehensive_prompt
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": image_data_url
                        }
                    }
                ]
            }
        ]

        print(f"🧠 Processing image with {model}...", flush=True)

        # Call the vision API
        response = await nous_client.chat.completions.create(
            model=model,
            messages=messages,
            temperature=0.1,  # Low temperature for consistent analysis
            max_tokens=2000   # Generous limit for detailed analysis
        )

        # Extract the analysis
        analysis = response.choices[0].message.content.strip()
        analysis_length = len(analysis)

        print(f"✅ Image analysis completed ({analysis_length} characters)", flush=True)

        # Prepare successful response
        result = {
            "success": True,
            "analysis": analysis or "There was a problem with the request and the image could not be analyzed."
        }

        debug_call_data["success"] = True
        debug_call_data["analysis_length"] = analysis_length

        # Log debug information
        _log_debug_call("vision_analyze_tool", debug_call_data)
        _save_debug_log()

        return json.dumps(result, indent=2)

    except Exception as e:
        error_msg = f"Error analyzing image: {str(e)}"
        print(f"❌ {error_msg}", flush=True)

        # Prepare error response
        result = {
            "success": False,
            "analysis": "There was a problem with the request and the image could not be analyzed."
        }

        debug_call_data["error"] = error_msg
        _log_debug_call("vision_analyze_tool", debug_call_data)
        _save_debug_log()

        return json.dumps(result, indent=2)

    finally:
        # Clean up temporary image file
        if temp_image_path and temp_image_path.exists():
            try:
                temp_image_path.unlink()
                print("🧹 Cleaned up temporary image file", flush=True)
            except Exception as cleanup_error:
                print(f"⚠️ Warning: Could not delete temporary file: {cleanup_error}", flush=True)


def check_nous_api_key() -> bool:
    """
    Check if the Nous Research API key is available in environment variables.

    Returns:
        bool: True if API key is set, False otherwise
    """
    return bool(os.getenv("NOUS_API_KEY"))


def check_vision_requirements() -> bool:
    """
    Check if all requirements for vision tools are met.

    Returns:
        bool: True if requirements are met, False otherwise
    """
    return check_nous_api_key()


def get_debug_session_info() -> Dict[str, Any]:
    """
    Get information about the current debug session.

    Returns:
        Dict[str, Any]: Dictionary containing debug session information
    """
    if not DEBUG_MODE or not DEBUG_DATA:
        return {
            "enabled": False,
            "session_id": None,
            "log_path": None,
            "total_calls": 0
        }

    return {
        "enabled": True,
        "session_id": DEBUG_SESSION_ID,
        "log_path": str(DEBUG_LOG_PATH / f"vision_tools_debug_{DEBUG_SESSION_ID}.json"),
        "total_calls": len(DEBUG_DATA["tool_calls"])
    }


if __name__ == "__main__":
    """
    Simple test/demo when run directly
    """
    print("👁️ Vision Tools Module")
    print("=" * 40)

    # Check if API key is available
    api_available = check_nous_api_key()

    if not api_available:
        print("❌ NOUS_API_KEY environment variable not set")
        print("Please set your API key: export NOUS_API_KEY='your-key-here'")
        print("Get API key at: https://inference-api.nousresearch.com/")
        exit(1)
    else:
        print("✅ Nous Research API key found")

    print("🛠️ Vision tools ready for use!")
    print(f"🧠 Using model: {DEFAULT_VISION_MODEL}")

    # Show debug mode status
    if DEBUG_MODE:
        print(f"🐛 Debug mode ENABLED - Session ID: {DEBUG_SESSION_ID}")
        print(f"   Debug logs will be saved to: ./logs/vision_tools_debug_{DEBUG_SESSION_ID}.json")
    else:
        print("🐛 Debug mode disabled (set VISION_TOOLS_DEBUG=true to enable)")

    print("\nBasic usage:")
    print("  from vision_tools import vision_analyze_tool")
    print("  import asyncio")
    print("")
    print("  async def main():")
    print("      result = await vision_analyze_tool(")
    print("          image_url='https://example.com/image.jpg',")
    print("          user_prompt='What do you see in this image?'")
    print("      )")
    print("      print(result)")
    print("  asyncio.run(main())")

    print("\nExample prompts:")
    print("  - 'What architectural style is this building?'")
    print("  - 'Describe the emotions and mood in this image'")
    print("  - 'What text can you read in this image?'")
    print("  - 'Identify any safety hazards visible'")
    print("  - 'What products or brands are shown?'")

    print("\nDebug mode:")
    print("  # Enable debug logging")
    print("  export VISION_TOOLS_DEBUG=true")
    print("  # Debug logs capture all vision analysis calls and results")
    print("  # Logs saved to: ./logs/vision_tools_debug_UUID.json")

File diff suppressed because it is too large.

toolset_distributions.py (new file, +270 lines)

#!/usr/bin/env python3
"""
Toolset Distributions Module

This module defines distributions of toolsets for data generation runs.
Each distribution specifies which toolsets should be used and their probability
of being selected for any given prompt during batch processing.

A distribution is a dictionary mapping toolset names to their selection probability (%).
Each toolset is sampled independently, so probabilities do not need to sum to 100.

Usage:
    from toolset_distributions import get_distribution, list_distributions

    # Get a specific distribution
    dist = get_distribution("image_gen")

    # List all available distributions
    all_dists = list_distributions()
"""

from typing import Dict, List, Optional
import random
from toolsets import validate_toolset


# Distribution definitions
# Each key is a distribution name, and the value is a dict of toolset_name: probability_percentage
DISTRIBUTIONS = {
    # Default: All tools available 100% of the time
    "default": {
        "description": "All available tools, all the time",
        "toolsets": {
            "web": 100,
            "vision": 100,
            "image_gen": 100,
            "terminal": 100,
            "moa": 100
        }
    },

    # Image generation focused distribution
    "image_gen": {
        "description": "Heavy focus on image generation with vision and web support",
        "toolsets": {
            "image_gen": 90,  # 90% chance of image generation tools
            "vision": 90,     # 90% chance of vision tools
            "web": 55,        # 55% chance of web tools
            "terminal": 45,   # 45% chance of terminal tools
            "moa": 10         # 10% chance of reasoning tools
        }
    },

    # Research-focused distribution
    "research": {
        "description": "Web research with vision analysis and reasoning",
        "toolsets": {
            "web": 90,      # 90% chance of web tools
            "vision": 50,   # 50% chance of vision tools
            "moa": 40,      # 40% chance of reasoning tools
            "terminal": 10  # 10% chance of terminal tools
        }
    },

    # Development-focused distribution
    "development": {
        "description": "Terminal and reasoning with occasional web lookup",
        "toolsets": {
            "terminal": 80,  # 80% chance of terminal tools
            "moa": 60,       # 60% chance of reasoning tools
            "web": 30,       # 30% chance of web tools
            "vision": 10     # 10% chance of vision tools
        }
    },

    # Safe mode (no terminal)
    "safe": {
        "description": "All tools except terminal for safety",
        "toolsets": {
            "web": 80,
            "vision": 60,
            "image_gen": 60,
            "moa": 50
        }
    },

    # Balanced distribution
    "balanced": {
        "description": "Equal probability of all toolsets",
        "toolsets": {
            "web": 50,
            "vision": 50,
            "image_gen": 50,
            "terminal": 50,
            "moa": 50
        }
    },

    # Minimal (web only)
    "minimal": {
        "description": "Only web tools for basic research",
        "toolsets": {
            "web": 100
        }
    },

    # Creative (vision + image generation)
    "creative": {
        "description": "Image generation and vision analysis focus",
        "toolsets": {
            "image_gen": 90,
            "vision": 90,
            "web": 30
        }
    },

    # Reasoning heavy
    "reasoning": {
        "description": "Heavy mixture of agents usage with minimal other tools",
        "toolsets": {
            "moa": 90,
            "web": 30,
            "terminal": 20
        }
    }
}


def get_distribution(name: str) -> Optional[Dict]:
    """
    Get a toolset distribution by name.

    Args:
        name (str): Name of the distribution

    Returns:
        Dict: Distribution definition with description and toolsets
        None: If distribution not found
    """
    return DISTRIBUTIONS.get(name)


def list_distributions() -> Dict[str, Dict]:
    """
    List all available distributions.

    Returns:
        Dict: All distribution definitions
    """
    return DISTRIBUTIONS.copy()


def sample_toolsets_from_distribution(distribution_name: str) -> List[str]:
    """
    Sample toolsets based on a distribution's probabilities.

    Each toolset in the distribution has a % chance of being included.
    This allows multiple toolsets to be active simultaneously.

    Args:
        distribution_name (str): Name of the distribution to sample from

    Returns:
        List[str]: List of sampled toolset names

    Raises:
        ValueError: If distribution name is not found
    """
    dist = get_distribution(distribution_name)
    if not dist:
        raise ValueError(f"Unknown distribution: {distribution_name}")

    # Sample each toolset independently based on its probability
    selected_toolsets = []

    for toolset_name, probability in dist["toolsets"].items():
        # Validate toolset exists
        if not validate_toolset(toolset_name):
            print(f"⚠️ Warning: Toolset '{toolset_name}' in distribution '{distribution_name}' is not valid")
            continue

        # Roll the dice - if random value is less than probability, include this toolset
        if random.random() * 100 < probability:
            selected_toolsets.append(toolset_name)

    # If no toolsets were selected (can happen with low probabilities),
    # ensure at least one toolset is selected by picking the highest probability one
    if not selected_toolsets and dist["toolsets"]:
        # Find toolset with highest probability
        highest_prob_toolset = max(dist["toolsets"].items(), key=lambda x: x[1])[0]
        if validate_toolset(highest_prob_toolset):
            selected_toolsets.append(highest_prob_toolset)

    return selected_toolsets
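The independent per-toolset sampling above can be checked with a quick standalone simulation. A minimal sketch, using a hypothetical distribution (the names and probabilities here are illustrative, not the module's real definitions):

```python
import random

# Hypothetical distribution: toolset name -> percent chance of inclusion
PROBS = {"web": 90, "vision": 50, "moa": 10}

def sample(probs, rng):
    """Include each toolset independently with its percent chance."""
    chosen = [name for name, p in probs.items() if rng.random() * 100 < p]
    if not chosen:
        # Fallback mirrors the module: keep the highest-probability toolset
        chosen = [max(probs, key=probs.get)]
    return chosen

rng = random.Random(0)
trials = 10_000
counts = {name: 0 for name in PROBS}
for _ in range(trials):
    for name in sample(PROBS, rng):
        counts[name] += 1

# Empirical inclusion rates land near the configured percentages; the
# empty-sample fallback nudges the top toolset slightly above its nominal rate.
for name in PROBS:
    print(name, round(100 * counts[name] / trials, 1))
```

Because each toolset is an independent Bernoulli draw, several can be active at once, and the percentages are per-toolset rates rather than shares of a single choice.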


def validate_distribution(distribution_name: str) -> bool:
    """
    Check if a distribution name is valid.

    Args:
        distribution_name (str): Distribution name to validate

    Returns:
        bool: True if valid, False otherwise
    """
    return distribution_name in DISTRIBUTIONS


def print_distribution_info(distribution_name: str) -> None:
    """
    Print detailed information about a distribution.

    Args:
        distribution_name (str): Distribution name
    """
    dist = get_distribution(distribution_name)
    if not dist:
        print(f"❌ Unknown distribution: {distribution_name}")
        return

    print(f"\n📊 Distribution: {distribution_name}")
    print(f"   Description: {dist['description']}")
    print("   Toolsets:")
    for toolset, prob in sorted(dist["toolsets"].items(), key=lambda x: x[1], reverse=True):
        print(f"     • {toolset:15} : {prob:3}% chance")


if __name__ == "__main__":
    """
    Demo and testing of the distributions system
    """
    print("📊 Toolset Distributions Demo")
    print("=" * 60)

    # List all distributions
    print("\n📋 Available Distributions:")
    print("-" * 40)
    for name, dist in list_distributions().items():
        print(f"\n  {name}:")
        print(f"    {dist['description']}")
        toolset_list = ", ".join([f"{ts}({p}%)" for ts, p in dist["toolsets"].items()])
        print(f"    Toolsets: {toolset_list}")

    # Demo sampling
    print("\n\n🎲 Sampling Examples:")
    print("-" * 40)

    test_distributions = ["image_gen", "research", "balanced", "default"]

    for dist_name in test_distributions:
        print(f"\n{dist_name}:")
        # Sample 5 times to show variability
        for i in range(1, 6):
            sampled = sample_toolsets_from_distribution(dist_name)
            print(f"  Sample {i}: {sorted(sampled)}")

    # Show detailed info
    print("\n\n📊 Detailed Distribution Info:")
    print("-" * 40)
    print_distribution_info("image_gen")
    print_distribution_info("research")

toolsets.py (new file, +339 lines)

#!/usr/bin/env python3
"""
Toolsets Module

This module provides a flexible system for defining and managing tool aliases/toolsets.
Toolsets allow you to group tools together for specific scenarios and can be composed
from individual tools or other toolsets.

Features:
- Define custom toolsets with specific tools
- Compose toolsets from other toolsets
- Built-in common toolsets for typical use cases
- Easy extension for new toolsets
- Support for dynamic toolset resolution

Usage:
    from toolsets import get_toolset, resolve_toolset, get_all_toolsets

    # Get the definition of a specific toolset
    toolset = get_toolset("web")

    # Resolve a toolset to get all tool names (including from composed toolsets)
    all_tools = resolve_toolset("safe")
"""

from typing import List, Dict, Any, Set, Optional


# Core toolset definitions
# These can include individual tools or reference other toolsets
TOOLSETS = {
    # Basic toolsets - individual tool categories
    "web": {
        "description": "Web research and content extraction tools",
        "tools": ["web_search", "web_extract", "web_crawl"],
        "includes": []  # No other toolsets included
    },

    "vision": {
        "description": "Image analysis and vision tools",
        "tools": ["vision_analyze"],
        "includes": []
    },

    "image_gen": {
        "description": "Creative generation tools (images)",
        "tools": ["image_generate"],
        "includes": []
    },

    "terminal": {
        "description": "Terminal/command execution tools",
        "tools": ["terminal"],
        "includes": []
    },

    "moa": {
        "description": "Advanced reasoning and problem-solving tools",
        "tools": ["mixture_of_agents"],
        "includes": []
    },

    # Scenario-specific toolsets

    "debugging": {
        "description": "Debugging and troubleshooting toolkit",
        "tools": ["terminal"],
        "includes": ["web"]  # For searching error messages and solutions
    },

    "safe": {
        "description": "Safe toolkit without terminal access",
        "tools": ["mixture_of_agents"],
        "includes": ["web", "vision", "image_gen"]
    }
}


def get_toolset(name: str) -> Optional[Dict[str, Any]]:
    """
    Get a toolset definition by name.

    Args:
        name (str): Name of the toolset

    Returns:
        Dict: Toolset definition with description, tools, and includes
        None: If toolset not found
    """
    return TOOLSETS.get(name)


def resolve_toolset(name: str, visited: Set[str] = None) -> List[str]:
    """
    Recursively resolve a toolset to get all tool names.

    This function handles toolset composition by recursively resolving
    included toolsets and combining all tools.

    Args:
        name (str): Name of the toolset to resolve
        visited (Set[str]): Set of already visited toolsets (for cycle detection)

    Returns:
        List[str]: List of all tool names in the toolset
    """
    if visited is None:
        visited = set()

    # Special aliases that represent all tools across every toolset.
    # This ensures future toolsets are automatically included without changes.
    if name in {"all", "*"}:
        all_tools: Set[str] = set()
        for toolset_name in get_toolset_names():
            # Use a fresh visited set per branch to avoid cross-branch contamination
            resolved = resolve_toolset(toolset_name, visited.copy())
            all_tools.update(resolved)
        return list(all_tools)

    # Check for cycles
    if name in visited:
        print(f"⚠️ Circular dependency detected in toolset '{name}'")
        return []

    visited.add(name)

    # Get toolset definition
    toolset = TOOLSETS.get(name)
    if not toolset:
        return []

    # Collect direct tools
    tools = set(toolset.get("tools", []))

    # Recursively resolve included toolsets
    for included_name in toolset.get("includes", []):
        included_tools = resolve_toolset(included_name, visited.copy())
        tools.update(included_tools)

    return list(tools)
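The recursive resolution with cycle detection can be illustrated on a toy registry. A minimal self-contained sketch (the entries below are illustrative, not the module's real TOOLSETS):

```python
from typing import Dict, List, Set

# Toy registry showing composition and a deliberate cycle.
REGISTRY: Dict[str, dict] = {
    "web": {"tools": ["web_search"], "includes": []},
    "debugging": {"tools": ["terminal"], "includes": ["web"]},
    "loop_a": {"tools": [], "includes": ["loop_b"]},
    "loop_b": {"tools": [], "includes": ["loop_a"]},
}

def resolve(name: str, visited: Set[str] = None) -> List[str]:
    visited = set() if visited is None else visited
    if name in visited:
        return []  # cycle detected: stop recursing instead of looping forever
    visited.add(name)
    entry = REGISTRY.get(name)
    if not entry:
        return []  # unknown toolset resolves to nothing
    tools = set(entry["tools"])
    for included in entry["includes"]:
        # Copy visited per branch so siblings do not block each other
        tools.update(resolve(included, visited.copy()))
    return sorted(tools)

print(resolve("debugging"))  # ['terminal', 'web_search']
print(resolve("loop_a"))     # []
```

Passing `visited.copy()` down each branch is the key design choice: it prevents infinite recursion on cycles while still allowing the same toolset to be reached through two different include paths.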


def resolve_multiple_toolsets(toolset_names: List[str]) -> List[str]:
    """
    Resolve multiple toolsets and combine their tools.

    Args:
        toolset_names (List[str]): List of toolset names to resolve

    Returns:
        List[str]: Combined list of all tool names (deduplicated)
    """
    all_tools = set()

    for name in toolset_names:
        tools = resolve_toolset(name)
        all_tools.update(tools)

    return list(all_tools)


def get_all_toolsets() -> Dict[str, Dict[str, Any]]:
    """
    Get all available toolsets with their definitions.

    Returns:
        Dict: All toolset definitions
    """
    return TOOLSETS.copy()


def get_toolset_names() -> List[str]:
    """
    Get names of all available toolsets (excluding aliases).

    Returns:
        List[str]: List of toolset names
    """
    return list(TOOLSETS.keys())


def validate_toolset(name: str) -> bool:
    """
    Check if a toolset name is valid.

    Args:
        name (str): Toolset name to validate

    Returns:
        bool: True if valid, False otherwise
    """
    # Accept special alias names for convenience
    if name in {"all", "*"}:
        return True
    return name in TOOLSETS


def create_custom_toolset(
    name: str,
    description: str,
    tools: List[str] = None,
    includes: List[str] = None
) -> None:
    """
    Create a custom toolset at runtime.

    Args:
        name (str): Name for the new toolset
        description (str): Description of the toolset
        tools (List[str]): Direct tools to include
        includes (List[str]): Other toolsets to include
    """
    TOOLSETS[name] = {
        "description": description,
        "tools": tools or [],
        "includes": includes or []
    }


def get_toolset_info(name: str) -> Optional[Dict[str, Any]]:
    """
    Get detailed information about a toolset including resolved tools.

    Args:
        name (str): Toolset name

    Returns:
        Dict: Detailed toolset information
        None: If toolset not found
    """
    toolset = get_toolset(name)
    if not toolset:
        return None

    resolved_tools = resolve_toolset(name)

    return {
        "name": name,
        "description": toolset["description"],
        "direct_tools": toolset["tools"],
        "includes": toolset["includes"],
        "resolved_tools": resolved_tools,
        "tool_count": len(resolved_tools),
        "is_composite": len(toolset["includes"]) > 0
    }


def print_toolset_tree(name: str, indent: int = 0) -> None:
    """
    Print a tree view of a toolset and its composition.

    Args:
        name (str): Toolset name
        indent (int): Current indentation level
    """
    prefix = "  " * indent
    toolset = get_toolset(name)

    if not toolset:
        print(f"{prefix}❌ Unknown toolset: {name}")
        return

    # Print toolset name and description
    print(f"{prefix}📦 {name}: {toolset['description']}")

    # Print direct tools
    if toolset["tools"]:
        print(f"{prefix}  🔧 Tools: {', '.join(toolset['tools'])}")

    # Print included toolsets
    if toolset["includes"]:
        print(f"{prefix}  📂 Includes:")
        for included in toolset["includes"]:
            print_toolset_tree(included, indent + 2)


if __name__ == "__main__":
    """
    Demo and testing of the toolsets system
    """
    print("🎯 Toolsets System Demo")
    print("=" * 60)

    # Show all available toolsets
    print("\n📦 Available Toolsets:")
    print("-" * 40)
    for name, toolset in get_all_toolsets().items():
        info = get_toolset_info(name)
        composite = "📂" if info["is_composite"] else "🔧"
        print(f"{composite} {name:20} - {toolset['description']}")
        print(f"   Tools: {len(info['resolved_tools'])} total")

    # Demo toolset resolution
    print("\n🔍 Toolset Resolution Examples:")
    print("-" * 40)

    examples = ["web", "terminal", "debugging", "safe", "all"]
    for name in examples:
        tools = resolve_toolset(name)
        print(f"\n{name}:")
        print(f"  Resolved to {len(tools)} tools: {', '.join(sorted(tools))}")

    # Show toolset composition tree
    print("\n🌳 Toolset Composition Tree:")
    print("-" * 40)
    print("\nExample: 'debugging' toolset:")
    print_toolset_tree("debugging")

    print("\nExample: 'safe' toolset:")
    print_toolset_tree("safe")

    # Demo multiple toolset resolution
    print("\n🔗 Multiple Toolset Resolution:")
    print("-" * 40)
    combined = resolve_multiple_toolsets(["web", "vision", "moa"])
    print("Combining ['web', 'vision', 'moa']:")
    print(f"  Result: {', '.join(sorted(combined))}")

    # Demo custom toolset creation
    print("\n➕ Custom Toolset Creation:")
    print("-" * 40)
    create_custom_toolset(
        name="my_custom",
        description="My custom toolset for specific tasks",
        tools=["web_search"],
        includes=["terminal", "vision"]
    )

    custom_info = get_toolset_info("my_custom")
    print("Created 'my_custom' toolset:")
    print(f"  Description: {custom_info['description']}")
    print(f"  Resolved tools: {', '.join(custom_info['resolved_tools'])}")
|
||||