diff --git a/optional-skills/creative/comfyui/SKILL.md b/optional-skills/creative/comfyui/SKILL.md new file mode 100644 index 00000000000..2853789b329 --- /dev/null +++ b/optional-skills/creative/comfyui/SKILL.md @@ -0,0 +1,335 @@ +--- +name: comfyui +description: "Use when generating images/video/audio with ComfyUI — import workflows, run them with friendly parameters, manage models and dependencies. Uses the comfyui-skill CLI over the REST API." +version: 3.0.0 +requires: ComfyUI running locally or via Comfy Cloud; comfyui-skill CLI (auto-installed via uvx) +author: kshitijk4poor +license: MIT +platforms: [macos, linux, windows] +prerequisites: + commands: ["uv"] +setup: + help: "CLI auto-runs via uvx. ComfyUI install: https://docs.comfy.org/installation" +metadata: + hermes: + tags: [comfyui, image-generation, stable-diffusion, flux, creative, generative-ai, video-generation] + related_skills: [stable-diffusion-image-generation, image_gen] + category: creative +--- + +# ComfyUI + +Generate images, video, and audio through ComfyUI using the `comfyui-skill` CLI. +The CLI wraps ComfyUI's REST API into an agent-friendly interface — workflows become +"skills" with named parameters (e.g., `prompt`, `seed`) instead of raw node graphs. 
+ +**Reference files in this skill:** +- `references/cli-reference.md` — complete command reference with all subcommands and options +- `references/api-notes.md` — underlying REST API routes (for debugging / advanced use) +- `scripts/comfyui_setup.sh` — workspace initialization script + +## When to Use + +- User asks to generate images with Stable Diffusion, SDXL, Flux, or other diffusion models +- User wants to run a specific ComfyUI workflow +- User wants to chain generative steps (txt2img → upscale → face restore) +- User needs ControlNet, inpainting, img2img, or other advanced pipelines +- User asks to manage ComfyUI queue, check models, or install custom nodes +- User wants video/audio generation via AnimateDiff, Hunyuan, AudioCraft, etc. + +## How It Works + +The `comfyui-skill` CLI turns ComfyUI workflows into callable "skills": + +1. **Import** a workflow JSON (from editor or API format) → CLI extracts a parameter schema +2. **Run** with friendly args (`--args '{"prompt": "a cat"}'`) → CLI injects values into the right nodes +3. **Retrieve** outputs → CLI downloads generated files locally + +The agent never sees raw node IDs or graph wiring. 
The CLI handles: +- Editor-format → API-format conversion (resolves reroutes, widget ordering via `/object_info`) +- Auto-upload of local images referenced in args +- Dependency checking (missing custom nodes, models) +- WebSocket streaming with polling fallback +- Multi-server routing +- Idempotent execution via `--job-id` + +## CLI Invocation + +The CLI is invoked via `uvx` (no persistent install needed): + +```bash +uvx --from comfyui-skill-cli comfyui-skill [OPTIONS] COMMAND [ARGS] +``` + +For brevity in all examples below, we alias this: + +```bash +# In execute_code / terminal, always use the full uvx form: +COMFY="uvx --from comfyui-skill-cli comfyui-skill" +``` + +**Always pass `--json` for structured output** the agent can parse: + +```bash +$COMFY --json list +$COMFY --json run my-workflow --args '{"prompt": "a cat"}' +``` + +If `comfyui-skill` is already installed as a `uv tool` (`uv tool install comfyui-skill-cli`), +it's on PATH directly and `uvx` is not needed. + +## Setup & Onboarding + +### 1. ComfyUI Must Be Running + +The CLI talks to a running ComfyUI server. If the user doesn't have one: + +- Point them to https://docs.comfy.org/installation +- Supports: NVIDIA (CUDA), AMD (ROCm), Intel Arc, Apple Silicon (MPS), CPU-only +- Desktop app available for Windows/macOS; manual install for Linux +- Comfy Cloud available for users without a GPU (https://platform.comfy.org) + +### 2. Initialize a Workspace + +The CLI reads `config.json` and `data/` from its working directory. Run the +setup script or initialize manually: + +```bash +bash scripts/comfyui_setup.sh +``` + +Or manually: + +```bash +mkdir -p ~/.hermes/comfyui && cd ~/.hermes/comfyui +``` + +Then add a server: + +```bash +$COMFY --json server add --id local --url http://127.0.0.1:8188 --name "Local ComfyUI" +``` + +For Comfy Cloud: + +```bash +$COMFY --json server add --id cloud --url https://cloud.comfy.org \ + --name "Comfy Cloud" --api-key "comfyui-xxxxxxxxxxxx" +``` + +### 3. 
Verify Connection + +```bash +$COMFY --json server status +``` + +Should return `{"status": "online", ...}`. If offline, user needs to start ComfyUI. + +### 4. Import a Workflow + +Users typically have workflow JSON files from the ComfyUI editor: + +```bash +$COMFY --json workflow import /path/to/workflow.json --name my-workflow +``` + +The CLI auto-detects format (editor or API), converts if needed, and extracts +a parameter schema. Both formats are accepted. + +To import from the ComfyUI server's saved workflows: + +```bash +$COMFY --json workflow import --from-server +``` + +## Core Workflow + +### Step 1: List Available Skills + +```bash +$COMFY --json list +``` + +Returns all imported workflows with their parameter schemas. Required params +must be provided; optional params have sensible defaults. + +### Step 2: Check Dependencies (First Run) + +```bash +$COMFY --json deps check my-workflow +``` + +Reports missing custom nodes and models. If `is_ready` is false: + +```bash +# Install missing nodes (requires ComfyUI Manager) +$COMFY --json deps install my-workflow --all + +# Missing models must be downloaded manually — CLI tells you which folder +``` + +### Step 3: Execute + +**Blocking (recommended for most use):** + +```bash +$COMFY --json run my-workflow --args '{"prompt": "a beautiful sunset", "seed": 42}' +``` + +Blocks until done, streams progress, downloads outputs. + +**Non-blocking (for long jobs):** + +```bash +# Submit +$COMFY --json submit my-workflow --args '{"prompt": "..."}' +# Returns: {"prompt_id": "abc-123"} + +# Poll (each poll = separate command, do NOT loop in shell) +$COMFY --json status abc-123 +# Returns: {"status": "running", "progress": {"value": 15, "max": 25}} + +# When status = "success", outputs are in the response +``` + +### Step 4: Present Results + +On success, the response contains output file paths. Show them to the user. +Images referenced in the output can be displayed via `vision_analyze` or +returned as file paths. 
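
The non-blocking flow above comes down to one small decision after each single `status` call. Here is a minimal sketch, assuming the response carries a quoted `"status"` field shaped like the examples in this section. The function names and the printed labels are hypothetical, and the parsing uses plain parameter expansion rather than a real JSON parser, so treat it as an illustration, not as the CLI's own logic.

```bash
#!/usr/bin/env bash
# Sketch only: interpret the JSON from ONE `status` call (no shell polling loop).

# Print the value of the first "status" field found in a JSON string.
# Assumption: the value is a quoted string, e.g. {"status": "running", ...}.
comfy_status_of() {
  local json="$1" rest
  rest=${json#*\"status\"}        # drop everything through the first "status" key
  if [ "$rest" = "$json" ]; then  # key not present at all
    echo "unknown"
    return 0
  fi
  rest=${rest#*\"}                # skip the colon and the value's opening quote
  echo "${rest%%\"*}"             # keep the text up to the closing quote
}

# Map a status to the agent's next step (labels are hypothetical).
comfy_next_step() {
  case "$(comfy_status_of "$1")" in
    queued|running) echo "poll-again" ;;        # issue a new status command later
    success)        echo "present-outputs" ;;   # outputs are in the response
    error)          echo "report-error" ;;
    *)              echo "inspect-response" ;;  # unexpected shape, look at raw JSON
  esac
}

# Example with the "running" response shown above.
comfy_next_step '{"status": "running", "progress": {"value": 15, "max": 25}}'
# prints: poll-again
```

When the label is `poll-again`, the intended follow-up is a fresh `status` command at a later point, not a loop inside the shell.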
+ +## Quick Decision Tree + +| User says | Command | +|-----------|---------| +| "generate an image" / "draw" | `run --args '{"prompt": "..."}'` | +| "import this workflow" | `workflow import ` | +| "use this image" (img2img) | `upload ` then `run` with the reference | +| "inpaint this" | `upload --mask` then `run` | +| "what workflows do I have" | `list` | +| "what models are available" | `models list checkpoints` | +| "check if everything's installed" | `deps check ` | +| "what failed" / "show history" | `history list ` | +| "cancel that" | `cancel ` | +| "free up GPU memory" | `free` | +| "which nodes exist for X" | `nodes search ` | + +## Multi-Server + +Skills are addressed as `server_id/workflow_id`: + +```bash +$COMFY --json list # all servers +$COMFY --json run local/txt2img --args '{...}' # specific server +$COMFY --json run cloud/flux --args '{...}' # different server +$COMFY --json server stats --all # VRAM/RAM across all servers +``` + +If `server_id` is omitted, the default server is used. + +## Image Upload (img2img / Inpainting) + +```bash +# Upload input image +$COMFY --json upload /path/to/photo.png +# Returns: {"filename": "photo.png", ...} + +# Upload mask for inpainting +$COMFY --json upload /path/to/mask.png --mask --original photo.png + +# Use in workflow args — if a param has type "image" and value is a local +# file path (starts with /, ./, ../, ~), the CLI auto-uploads it +$COMFY --json run inpaint --args '{"image": "/path/to/photo.png", "mask": "/path/to/mask.png", "prompt": "fill with flowers"}' +``` + +## Model Discovery + +```bash +$COMFY --json models list # all folder types +$COMFY --json models list checkpoints # checkpoint files +$COMFY --json models list loras # LoRA files +$COMFY --json models list controlnet # ControlNet models +``` + +Model folders: `checkpoints`, `loras`, `vae`, `controlnet`, `clip`, `clip_vision`, +`upscale_models`, `embeddings`, `unet`, `diffusion_models`. 
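
The argument conventions from the two sections above (the path prefixes that trigger auto-upload, and the model folder names) can be sketched as pre-flight checks an agent might run before building a command. This is a hedged illustration: the function names and labels are hypothetical, the prefix test only mirrors the rule as stated in this document, and the folder list is just the one given here, which may not be exhaustive for every ComfyUI install.

```bash
#!/usr/bin/env bash
# Sketch only: pre-flight checks mirroring rules stated in this document.

# Classify an image argument by the documented prefixes (/, ./, ../, ~).
# Anything else is assumed to name a file already uploaded to the server.
comfy_image_arg_kind() {
  case "$1" in
    /*|./*|../*|'~'*) echo "local-path" ;;       # CLI would auto-upload this
    *)                echo "server-filename" ;;  # e.g. the "filename" from `upload`
  esac
}

# Succeed only for folder names listed under Model Discovery (case-sensitive).
comfy_is_model_folder() {
  case "$1" in
    checkpoints|loras|vae|controlnet|clip|clip_vision|upscale_models|embeddings|unet|diffusion_models)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

comfy_image_arg_kind "/path/to/photo.png"   # prints: local-path
comfy_image_arg_kind "photo.png"            # prints: server-filename

if comfy_is_model_folder "loras"; then
  echo "loras is a known folder"
fi
```

A bare name such as `photo.png` does not match any of the prefixes, so under this reading it would be passed through unchanged instead of being uploaded.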
+ +## Node Discovery + +```bash +$COMFY --json nodes list # all nodes, grouped by category +$COMFY --json nodes list -c sampling # filter by category +$COMFY --json nodes info KSampler # full details of one node +$COMFY --json nodes search "upscale" # fuzzy search +``` + +## Queue & System + +```bash +$COMFY --json queue list # running + pending jobs +$COMFY --json queue clear # clear pending +$COMFY --json cancel # cancel specific job +$COMFY --json free # unload models + free VRAM +$COMFY --json server stats # system info (VRAM, RAM, GPU) +``` + +## Workflow Management + +```bash +$COMFY --json workflow import --name # import from file +$COMFY --json workflow import --from-server # import from ComfyUI server +$COMFY --json workflow enable # enable +$COMFY --json workflow disable # disable +$COMFY --json workflow delete # delete +$COMFY --json info # show schema + details +``` + +## Idempotent Execution + +For retries that shouldn't burn extra GPU: + +```bash +$COMFY --json run my-workflow --args '{"prompt": "..."}' --job-id "unique-key-123" +``` + +If `unique-key-123` was already executed, returns the cached result instantly. + +## Pitfalls + +1. **Working directory matters** — The CLI reads `config.json` and `data/` from CWD. + Always `cd` to the workspace directory before running commands. If `list` returns + empty or `server status` fails, you're in the wrong directory. + +2. **Editor format needs a live server** — Importing editor-format workflows requires + a running ComfyUI instance (calls `/object_info` to resolve widget ordering). + API-format imports work offline. + +3. **Missing custom nodes** — Always `deps check` before first run of an imported + workflow. "class_type not found" means missing nodes. + +4. **JSON args quoting** — Wrap `--args` in single quotes to prevent bash from + eating the double quotes: `--args '{"prompt": "a cat"}'`. + +5. **Comfy Cloud differences** — Cloud uses `/api/` prefix and `X-API-Key` auth. 
+ The CLI handles this transparently when configured with `--api-key`. + +6. **Model names are exact** — Case-sensitive, includes extension. Use + `models list checkpoints` to discover installed models. + +7. **Long generations** — Video and high-step workflows can take minutes. The `run` + command blocks and streams progress. For very long jobs, use `submit` + `status`. + +8. **Concurrent limits (Cloud)** — Free/Standard: 1 job. Creator: 3. Pro: 5. + Extra submits queue automatically. + +9. **Config portability** — Use `config export` / `config import` to transfer + setups between machines. + +## Verification Checklist + +- [ ] `uv` or `uvx` available on PATH +- [ ] `comfyui-skill --json server status` returns online +- [ ] Workspace dir has `config.json` and `data/` +- [ ] At least one workflow imported (`list` returns non-empty) +- [ ] `deps check` passes for imported workflows +- [ ] Test run completes and outputs are saved diff --git a/optional-skills/creative/comfyui/references/api-notes.md b/optional-skills/creative/comfyui/references/api-notes.md new file mode 100644 index 00000000000..50f54647701 --- /dev/null +++ b/optional-skills/creative/comfyui/references/api-notes.md @@ -0,0 +1,103 @@ +# ComfyUI REST API Notes + +The `comfyui-skill` CLI wraps these endpoints. This reference is for debugging, +understanding errors, or advanced use when the CLI doesn't cover a specific need. 
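
When calling these routes by hand for debugging, the main thing to get right is the route prefix: the Local vs Cloud table later in these notes lists no prefix for a local server and `/api` for Comfy Cloud. Here is a minimal sketch of that rule for HTTP routes only; the function name and the `local`/`cloud` labels are hypothetical. It assumes routes are passed without the prefix, so a route that already starts with `/api` would be doubled, and it does not cover the WebSocket URL.

```bash
#!/usr/bin/env bash
# Sketch only: compose a request URL from base URL, server kind, and route.

# Usage: comfy_route_url BASE_URL KIND ROUTE
#   KIND "cloud" adds the /api prefix; anything else is treated as local.
comfy_route_url() {
  local base="${1%/}"     # strip one trailing slash from the base URL
  local kind="$2"
  local route="/${3#/}"   # ensure exactly one leading slash on the route
  case "$kind" in
    cloud) printf '%s/api%s\n' "$base" "$route" ;;
    *)     printf '%s%s\n' "$base" "$route" ;;
  esac
}

comfy_route_url "http://127.0.0.1:8188" local /system_stats
# prints: http://127.0.0.1:8188/system_stats
comfy_route_url "https://cloud.comfy.org" cloud /system_stats
# prints: https://cloud.comfy.org/api/system_stats
```

The resulting URL can be passed to any HTTP client. For Cloud, the `X-API-Key` header described in these notes would still need to be added by the caller.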
+ +## Endpoints the CLI Uses + +| Endpoint | Method | CLI Command | +|----------|--------|-------------| +| `/system_stats` | GET | `server status`, `server stats` | +| `/prompt` | POST | `run`, `submit` | +| `/history/{prompt_id}` | GET | `status`, `run` (polling) | +| `/history` | GET | `history list --server` | +| `/queue` | GET | `queue list` | +| `/queue` | POST | `queue clear`, `queue delete` | +| `/interrupt` | POST | `cancel` | +| `/free` | POST | `free` | +| `/object_info` | GET | `nodes list`, `workflow import` (schema extraction) | +| `/object_info/{class}` | GET | `nodes info` | +| `/models` | GET | `models list` | +| `/models/{folder}` | GET | `models list `, `deps check` | +| `/view` | GET | `run` (output download) | +| `/upload/image` | POST | `upload` | +| `/upload/mask` | POST | `upload --mask` | +| `/node_replacements` | GET | `workflow import` (deprecated node detection) | +| `/internal/logs/raw` | GET | `logs show` | +| `/workflow_templates` | GET | `templates list` | +| `/global_subgraphs` | GET | `templates subgraphs` | +| `/v2/userdata` | GET | `workflow import --from-server` | +| `/ws` | WebSocket | `run` (real-time progress) | + +### Cloud-specific + +| Endpoint | Method | Purpose | +|----------|--------|---------| +| `/api/jobs` | GET | Job listing with filtering | +| `/api/jobs/{id}` | GET | Job details | + +### ComfyUI Manager (optional plugin) + +| Endpoint | Method | CLI Command | +|----------|--------|-------------| +| `/manager/queue/start` | GET | `deps install` | +| `/manager/queue/install` | POST | `deps install` (custom nodes) | +| `/manager/queue/install_model` | POST | `deps install --models` | +| `/manager/queue/status` | GET | `deps install` (progress) | + +## Local vs Cloud Differences + +| | Local | Cloud | +|---|---|---| +| Base URL | `http://127.0.0.1:8188` | `https://cloud.comfy.org` | +| Route prefix | none | `/api` | +| Auth | none or bearer token | `X-API-Key` header | +| Job status | Poll `/history/{id}` | 
`/api/jobs/{id}` | +| Output download | Direct bytes from `/view` | 302 redirect → signed URL | +| WebSocket | `ws://host:port/ws?clientId={uuid}` | `wss://host/ws?clientId={uuid}&token={key}` | +| Concurrent jobs | Sequential | Tier-limited (Free: 1, Creator: 3, Pro: 5) | + +The CLI handles all of these differences transparently based on the server config. + +## Workflow JSON Format (API Format) + +```json +{ + "node_id_string": { + "class_type": "NodeClassName", + "inputs": { + "param_name": "value", + "linked_input": ["source_node_id", output_index] + } + } +} +``` + +- Node IDs are strings (`"3"`, not `3`) +- Links: `["node_id", output_index]` — 0-based int +- `class_type` must match exactly (case-sensitive) + +## POST /prompt Payload + +```json +{ + "prompt": { "" }, + "client_id": "uuid", + "extra_data": { + "api_key_comfy_org": "key-for-paid-api-nodes" + } +} +``` + +The CLI constructs this from the imported workflow + injected parameters. + +## WebSocket Message Types + +| Type | When | Key Fields | +|------|------|------------| +| `execution_start` | Prompt begins | `prompt_id` | +| `executing` | Node running (`null` = done) | `node`, `prompt_id` | +| `progress` | Sampling steps | `node`, `value`, `max` | +| `executed` | Node output ready | `node`, `output` | +| `execution_success` | All nodes done | `prompt_id` | +| `execution_error` | Failure | `exception_type`, `exception_message` | diff --git a/optional-skills/creative/comfyui/references/cli-reference.md b/optional-skills/creative/comfyui/references/cli-reference.md new file mode 100644 index 00000000000..3f5e819d4cf --- /dev/null +++ b/optional-skills/creative/comfyui/references/cli-reference.md @@ -0,0 +1,172 @@ +# comfyui-skill CLI Reference + +Complete command map for `comfyui-skill` v0.2.x. 
+ +**Invocation:** `uvx --from comfyui-skill-cli comfyui-skill [OPTIONS] COMMAND [ARGS]` + +Or if installed as a tool: `comfyui-skill [OPTIONS] COMMAND [ARGS]` + +## Global Options + +| Option | Short | Description | +|--------|-------|-------------| +| `--version` | `-V` | Show version | +| `--json` | `-j` | JSON output (always use this for agent parsing) | +| `--output-format` | | `text`, `json`, or `stream-json` (NDJSON events) | +| `--server` | `-s` | Server ID override | +| `--dir` | `-d` | Data directory (default: CWD) | +| `--verbose` | `-v` | Verbose output | +| `--no-update-check` | | Skip CLI update check | + +## Standalone Commands + +### `list` +List all available skills across all enabled servers. + +### `info ` +Show skill details and parameter schema. Skill ID format: `server_id/workflow_id` or `workflow_id`. + +### `run [OPTIONS]` +Execute a skill (blocking — waits for completion, streams progress). + +| Option | Short | Description | +|--------|-------|-------------| +| `--args` | `-a` | JSON parameters (default: `{}`) | +| `--only` | | Comma-separated node IDs for partial execution | +| `--priority` | `-p` | Queue priority (lower = first, negative = jump queue; default: 0) | +| `--validate` | | Validate workflow without executing (dry run) | +| `--job-id` | | Idempotency key — reuse cached result if already executed | + +### `submit [OPTIONS]` +Submit a skill (non-blocking — returns `prompt_id` immediately). Same options as `run` except no streaming. + +### `status ` +Check execution status. Returns: `queued` (with `position`), `running` (with `progress`), `success` (with `outputs`), or `error`. + +### `upload [FILE_PATH] [OPTIONS]` +Upload a file to ComfyUI for use in workflows. 
+ +| Option | Description | +|--------|-------------| +| `--from-output` | Reuse output from a previous prompt_id as input | +| `--mask` | Upload as mask (for inpainting) | +| `--original` | Original image filename (for mask upload) | + +### `cancel ` +Cancel a running or queued job. + +### `free [OPTIONS]` +Release GPU memory. + +| Option | Short | Description | +|--------|-------|-------------| +| `--models` | `-m` | Unload all models from VRAM | +| `--memory` | | Free all cached memory | + +## Command Groups + +### `server` — Manage ComfyUI Servers + +| Subcommand | Description | +|------------|-------------| +| `server list` | List all configured servers | +| `server status [SERVER_ID]` | Check if server is online | +| `server stats [SERVER_ID]` | System stats: VRAM, RAM, GPU, versions (`--all` for all servers) | +| `server add` | Add server (`--id`, `--url` required; `--name`, `--output-dir`, `--auth`, `--api-key` optional) | +| `server enable ` | Enable a server | +| `server disable ` | Disable a server | +| `server remove ` | Remove a server | + +### `workflow` — Manage Workflows + +| Subcommand | Description | +|------------|-------------| +| `workflow import [JSON_PATH]` | Import workflow (`--name`, `--type` image/audio/video, `--from-server`, `--preview`, `--check-deps`) | +| `workflow enable ` | Enable a workflow | +| `workflow disable ` | Disable a workflow | +| `workflow delete ` | Delete a workflow | + +### `models` — Discover Models + +| Subcommand | Description | +|------------|-------------| +| `models list [FOLDER]` | List models in a folder (checkpoints, loras, vae, controlnet, etc.) 
| + +### `nodes` — Discover Nodes + +| Subcommand | Description | +|------------|-------------| +| `nodes list` | List all node classes (`-c` to filter by category) | +| `nodes info ` | Full details of a node type | +| `nodes search ` | Fuzzy search across names/categories | + +### `deps` — Dependency Management + +| Subcommand | Description | +|------------|-------------| +| `deps check ` | Check if dependencies are installed (returns `is_ready`) | +| `deps install ` | Install missing deps (`--repos` git URLs, `--models`, `--all`) | + +### `history` — Execution History + +| Subcommand | Description | +|------------|-------------| +| `history list [SKILL_ID]` | List history (`--server`, `--status`, `--limit`, `--sort`) | +| `history show ` | Show specific run details | + +### `queue` — Queue Management + +| Subcommand | Description | +|------------|-------------| +| `queue list` | Show running and pending jobs | +| `queue clear` | Clear all pending jobs | +| `queue delete ` | Remove specific jobs from queue | + +### `logs` — Server Logs + +| Subcommand | Description | +|------------|-------------| +| `logs show` | Show recent server logs (`--lines` / `-n`, default: 50) | + +### `templates` — Discover Templates + +| Subcommand | Description | +|------------|-------------| +| `templates list` | Workflow templates from custom nodes | +| `templates subgraphs` | Reusable subgraph components | + +### `config` — Configuration + +| Subcommand | Description | +|------------|-------------| +| `config export` | Export config + workflows as bundle (`--output`, `--portable-only`) | +| `config import ` | Import bundle (`--dry-run`, `--apply-environment`, `--no-overwrite`) | + +## Config File Format + +Located at `/config.json`: + +```json +{ + "default_server": "local", + "servers": [ + { + "id": "local", + "name": "Local ComfyUI", + "url": "http://127.0.0.1:8188", + "enabled": true, + "output_dir": "./outputs", + "auth": "", + "comfy_api_key": "" + } + ] +} +``` + +**Server 
fields:** +- `id` — unique identifier (no spaces/slashes/dots) +- `url` — ComfyUI base URL +- `enabled` — whether server is active +- `output_dir` — where outputs are saved (relative to workspace) +- `auth` — bearer token for authenticated servers +- `comfy_api_key` — Comfy Cloud API key (also sent as `extra_data.api_key_comfy_org` in prompts) diff --git a/optional-skills/creative/comfyui/scripts/comfyui_setup.sh b/optional-skills/creative/comfyui/scripts/comfyui_setup.sh new file mode 100755 index 00000000000..e0040597771 --- /dev/null +++ b/optional-skills/creative/comfyui/scripts/comfyui_setup.sh @@ -0,0 +1,41 @@ +#!/usr/bin/env bash +# Initialize a comfyui-skill workspace directory. +# Usage: bash scripts/comfyui_setup.sh [WORKSPACE_DIR] +# +# Creates the workspace, adds a default local server config, +# and verifies the connection. + +set -euo pipefail + +WORKSPACE="${1:-$HOME/.hermes/comfyui}" +COMFY="${COMFY:-uvx --from comfyui-skill-cli comfyui-skill}" + +echo "==> Initializing ComfyUI skill workspace at: $WORKSPACE" +mkdir -p "$WORKSPACE" +cd "$WORKSPACE" + +# If config.json doesn't exist, create it with a default local server +if [ ! -f config.json ]; then + echo "==> Creating default config (local server at 127.0.0.1:8188)" + $COMFY --json server add --id local --url http://127.0.0.1:8188 --name "Local ComfyUI" + echo "==> Config created: $WORKSPACE/config.json" +else + echo "==> config.json already exists, skipping" +fi + +# Verify connection +echo "==> Checking server connection..." +if $COMFY --json server status 2>/dev/null | grep -q '"online"'; then + echo "==> ComfyUI is reachable!" + $COMFY --json server stats 2>/dev/null || true +else + echo "==> ComfyUI is not reachable at the configured URL." 
+ echo " Start ComfyUI first, or update the server URL:" + echo " cd $WORKSPACE && $COMFY server add --id local --url " + echo "" + echo " Install ComfyUI: https://docs.comfy.org/installation" +fi + +echo "" +echo "==> Workspace ready: $WORKSPACE" +echo " Always cd here before running comfyui-skill commands."