Compare commits

...

2 Commits

Author SHA1 Message Date
alt-glitch
4d810d9591 style(comfyui): format SKILL.md — table alignment, YAML tags, onboarding hint 2026-04-29 13:50:17 +05:30
alt-glitch
fff37dd79e feat(skills): add comfyui skill — CLI-driven image/video/audio generation
Agent-friendly ComfyUI skill using the comfyui-skill CLI (invoked via uvx,
no persistent install). The CLI wraps ComfyUI's REST API into named 'skills'
with parameter schemas — the agent works with friendly args like
{"prompt": "a cat"} instead of raw node graphs.

- SKILL.md: setup/onboarding, core workflow, decision tree, multi-server,
  image upload, model/node discovery, queue management, pitfalls
- references/cli-reference.md: complete command map (27 leaf commands)
- references/api-notes.md: underlying REST endpoints for debugging
- scripts/comfyui_setup.sh: workspace initialization

Based on comfyui-skill-cli by HuangYuChuh, original skill work by kshitijk4poor.
2026-04-29 12:12:09 +05:30
4 changed files with 662 additions and 0 deletions

View File

@@ -0,0 +1,346 @@
---
name: comfyui
description: "Use when generating images/video/audio with ComfyUI — import workflows, run them with friendly parameters, manage models and dependencies. Uses the comfyui-skill CLI over the REST API."
version: 3.0.0
requires: ComfyUI running locally or via Comfy Cloud; comfyui-skill CLI (auto-installed via uvx)
author: kshitijk4poor
license: MIT
platforms: [macos, linux, windows]
prerequisites:
commands: ["uv"]
setup:
help: "CLI auto-runs via uvx. ComfyUI install: https://docs.comfy.org/installation"
metadata:
hermes:
tags:
[
comfyui,
image-generation,
stable-diffusion,
flux,
creative,
generative-ai,
video-generation,
]
related_skills: [stable-diffusion-image-generation, image_gen]
category: creative
---
# ComfyUI
Generate images, video, and audio through ComfyUI using the `comfyui-skill` CLI.
The CLI wraps ComfyUI's REST API into an agent-friendly interface — workflows become
"skills" with named parameters (e.g., `prompt`, `seed`) instead of raw node graphs.
**Reference files in this skill:**
- `references/cli-reference.md` — complete command reference with all subcommands and options
- `references/api-notes.md` — underlying REST API routes (for debugging / advanced use)
- `scripts/comfyui_setup.sh` — workspace initialization script
## When to Use
- User asks to generate images with Stable Diffusion, SDXL, Flux, or other diffusion models
- User wants to run a specific ComfyUI workflow
- User wants to chain generative steps (txt2img → upscale → face restore)
- User needs ControlNet, inpainting, img2img, or other advanced pipelines
- User asks to manage ComfyUI queue, check models, or install custom nodes
- User wants video/audio generation via AnimateDiff, Hunyuan, AudioCraft, etc.
## How It Works
The `comfyui-skill` CLI turns ComfyUI workflows into callable "skills":
1. **Import** a workflow JSON (from editor or API format) → CLI extracts a parameter schema
2. **Run** with friendly args (`--args '{"prompt": "a cat"}'`) → CLI injects values into the right nodes
3. **Retrieve** outputs → CLI downloads generated files locally
The agent never sees raw node IDs or graph wiring. The CLI handles:
- Editor-format → API-format conversion (resolves reroutes, widget ordering via `/object_info`)
- Auto-upload of local images referenced in args
- Dependency checking (missing custom nodes, models)
- WebSocket streaming with polling fallback
- Multi-server routing
- Idempotent execution via `--job-id`
## CLI Invocation
The CLI is invoked via `uvx` (no persistent install needed):
```bash
uvx --from comfyui-skill-cli comfyui-skill [OPTIONS] COMMAND [ARGS]
```
For brevity in all examples below, we alias this:
```bash
# In execute_code / terminal, always use the full uvx form:
COMFY="uvx --from comfyui-skill-cli comfyui-skill"
```
**Always pass `--json` for structured output** the agent can parse:
```bash
$COMFY --json list
$COMFY --json run my-workflow --args '{"prompt": "a cat"}'
```
If `comfyui-skill` is already installed as a `uv tool` (`uv tool install comfyui-skill-cli`),
it's on PATH directly and `uvx` is not needed.
## Setup & Onboarding
### 1. ComfyUI Must Be Running
The CLI talks to a running ComfyUI server. If the user doesn't have one:
- Point them to https://docs.comfy.org/installation. If they ask for help in onboarding, read the docs and help them set things up.
- Supports: NVIDIA (CUDA), AMD (ROCm), Intel Arc, Apple Silicon (MPS), CPU-only
- Desktop app available for Windows/macOS; manual install for Linux
- Comfy Cloud available for users without a GPU (https://platform.comfy.org)
### 2. Initialize a Workspace
The CLI reads `config.json` and `data/` from its working directory. Run the
setup script or initialize manually:
```bash
bash scripts/comfyui_setup.sh
```
Or manually:
```bash
mkdir -p ~/.hermes/comfyui && cd ~/.hermes/comfyui
```
Then add a server:
```bash
$COMFY --json server add --id local --url http://127.0.0.1:8188 --name "Local ComfyUI"
```
For Comfy Cloud:
```bash
$COMFY --json server add --id cloud --url https://cloud.comfy.org \
--name "Comfy Cloud" --api-key "comfyui-xxxxxxxxxxxx"
```
### 3. Verify Connection
```bash
$COMFY --json server status
```
Should return `{"status": "online", ...}`. If offline, user needs to start ComfyUI.
### 4. Import a Workflow
Users typically have workflow JSON files from the ComfyUI editor:
```bash
$COMFY --json workflow import /path/to/workflow.json --name my-workflow
```
The CLI auto-detects format (editor or API), converts if needed, and extracts
a parameter schema. Both formats are accepted.
To import from the ComfyUI server's saved workflows:
```bash
$COMFY --json workflow import --from-server
```
## Core Workflow
### Step 1: List Available Skills
```bash
$COMFY --json list
```
Returns all imported workflows with their parameter schemas. Required params
must be provided; optional params have sensible defaults.
### Step 2: Check Dependencies (First Run)
```bash
$COMFY --json deps check my-workflow
```
Reports missing custom nodes and models. If `is_ready` is false:
```bash
# Install missing nodes (requires ComfyUI Manager)
$COMFY --json deps install my-workflow --all
# Missing models must be downloaded manually — CLI tells you which folder
```
### Step 3: Execute
**Blocking (recommended for most use):**
```bash
$COMFY --json run my-workflow --args '{"prompt": "a beautiful sunset", "seed": 42}'
```
Blocks until done, streams progress, downloads outputs.
**Non-blocking (for long jobs):**
```bash
# Submit
$COMFY --json submit my-workflow --args '{"prompt": "..."}'
# Returns: {"prompt_id": "abc-123"}
# Poll (each poll = separate command, do NOT loop in shell)
$COMFY --json status abc-123
# Returns: {"status": "running", "progress": {"value": 15, "max": 25}}
# When status = "success", outputs are in the response
```
### Step 4: Present Results
On success, the response contains output file paths. Show them to the user.
Images referenced in the output can be displayed via `vision_analyze` or
returned as file paths.
## Quick Decision Tree
| User says | Command |
| --------------------------------- | ---------------------------------------------- |
| "generate an image" / "draw" | `run <skill> --args '{"prompt": "..."}'` |
| "import this workflow" | `workflow import <path>` |
| "use this image" (img2img) | `upload <image>` then `run` with the reference |
| "inpaint this" | `upload <mask> --mask` then `run` |
| "what workflows do I have" | `list` |
| "what models are available" | `models list checkpoints` |
| "check if everything's installed" | `deps check <skill>` |
| "what failed" / "show history" | `history list <skill>` |
| "cancel that" | `cancel <prompt_id>` |
| "free up GPU memory" | `free` |
| "which nodes exist for X" | `nodes search <query>` |
## Multi-Server
Skills are addressed as `server_id/workflow_id`:
```bash
$COMFY --json list # all servers
$COMFY --json run local/txt2img --args '{...}' # specific server
$COMFY --json run cloud/flux --args '{...}' # different server
$COMFY --json server stats --all # VRAM/RAM across all servers
```
If `server_id` is omitted, the default server is used.
## Image Upload (img2img / Inpainting)
```bash
# Upload input image
$COMFY --json upload /path/to/photo.png
# Returns: {"filename": "photo.png", ...}
# Upload mask for inpainting
$COMFY --json upload /path/to/mask.png --mask --original photo.png
# Use in workflow args — if a param has type "image" and value is a local
# file path (starts with /, ./, ../, ~), the CLI auto-uploads it
$COMFY --json run inpaint --args '{"image": "/path/to/photo.png", "mask": "/path/to/mask.png", "prompt": "fill with flowers"}'
```
## Model Discovery
```bash
$COMFY --json models list # all folder types
$COMFY --json models list checkpoints # checkpoint files
$COMFY --json models list loras # LoRA files
$COMFY --json models list controlnet # ControlNet models
```
Model folders: `checkpoints`, `loras`, `vae`, `controlnet`, `clip`, `clip_vision`,
`upscale_models`, `embeddings`, `unet`, `diffusion_models`.
## Node Discovery
```bash
$COMFY --json nodes list # all nodes, grouped by category
$COMFY --json nodes list -c sampling # filter by category
$COMFY --json nodes info KSampler # full details of one node
$COMFY --json nodes search "upscale" # fuzzy search
```
## Queue & System
```bash
$COMFY --json queue list # running + pending jobs
$COMFY --json queue clear # clear pending
$COMFY --json cancel <prompt_id> # cancel specific job
$COMFY --json free # unload models + free VRAM
$COMFY --json server stats # system info (VRAM, RAM, GPU)
```
## Workflow Management
```bash
$COMFY --json workflow import <path> --name <id> # import from file
$COMFY --json workflow import --from-server # import from ComfyUI server
$COMFY --json workflow enable <skill_id> # enable
$COMFY --json workflow disable <skill_id> # disable
$COMFY --json workflow delete <skill_id> # delete
$COMFY --json info <skill_id> # show schema + details
```
## Idempotent Execution
For retries that shouldn't burn extra GPU:
```bash
$COMFY --json run my-workflow --args '{"prompt": "..."}' --job-id "unique-key-123"
```
If `unique-key-123` was already executed, returns the cached result instantly.
## Pitfalls
1. **Working directory matters** — The CLI reads `config.json` and `data/` from CWD.
Always `cd` to the workspace directory before running commands. If `list` returns
empty or `server status` fails, you're in the wrong directory.
2. **Editor format needs a live server** — Importing editor-format workflows requires
a running ComfyUI instance (calls `/object_info` to resolve widget ordering).
API-format imports work offline.
3. **Missing custom nodes** — Always `deps check` before first run of an imported
workflow. "class_type not found" means missing nodes.
4. **JSON args quoting** — Wrap `--args` in single quotes to prevent bash from
eating the double quotes: `--args '{"prompt": "a cat"}'`.
5. **Comfy Cloud differences** — Cloud uses `/api/` prefix and `X-API-Key` auth.
The CLI handles this transparently when configured with `--api-key`.
6. **Model names are exact** — Case-sensitive, includes extension. Use
`models list checkpoints` to discover installed models.
7. **Long generations** — Video and high-step workflows can take minutes. The `run`
command blocks and streams progress. For very long jobs, use `submit` + `status`.
8. **Concurrent limits (Cloud)** — Free/Standard: 1 job. Creator: 3. Pro: 5.
Extra submits queue automatically.
9. **Config portability** — Use `config export` / `config import` to transfer
setups between machines.
## Verification Checklist
- [ ] `uv` or `uvx` available on PATH
- [ ] `comfyui-skill --json server status` returns online
- [ ] Workspace dir has `config.json` and `data/`
- [ ] At least one workflow imported (`list` returns non-empty)
- [ ] `deps check` passes for imported workflows
- [ ] Test run completes and outputs are saved

View File

@@ -0,0 +1,103 @@
# ComfyUI REST API Notes
The `comfyui-skill` CLI wraps these endpoints. This reference is for debugging,
understanding errors, or advanced use when the CLI doesn't cover a specific need.
## Endpoints the CLI Uses
| Endpoint | Method | CLI Command |
|----------|--------|-------------|
| `/system_stats` | GET | `server status`, `server stats` |
| `/prompt` | POST | `run`, `submit` |
| `/history/{prompt_id}` | GET | `status`, `run` (polling) |
| `/history` | GET | `history list --server` |
| `/queue` | GET | `queue list` |
| `/queue` | POST | `queue clear`, `queue delete` |
| `/interrupt` | POST | `cancel` |
| `/free` | POST | `free` |
| `/object_info` | GET | `nodes list`, `workflow import` (schema extraction) |
| `/object_info/{class}` | GET | `nodes info` |
| `/models` | GET | `models list` |
| `/models/{folder}` | GET | `models list <folder>`, `deps check` |
| `/view` | GET | `run` (output download) |
| `/upload/image` | POST | `upload` |
| `/upload/mask` | POST | `upload --mask` |
| `/node_replacements` | GET | `workflow import` (deprecated node detection) |
| `/internal/logs/raw` | GET | `logs show` |
| `/workflow_templates` | GET | `templates list` |
| `/global_subgraphs` | GET | `templates subgraphs` |
| `/v2/userdata` | GET | `workflow import --from-server` |
| `/ws` | WebSocket | `run` (real-time progress) |
### Cloud-specific
| Endpoint | Method | Purpose |
|----------|--------|---------|
| `/api/jobs` | GET | Job listing with filtering |
| `/api/jobs/{id}` | GET | Job details |
### ComfyUI Manager (optional plugin)
| Endpoint | Method | CLI Command |
|----------|--------|-------------|
| `/manager/queue/start` | GET | `deps install` |
| `/manager/queue/install` | POST | `deps install` (custom nodes) |
| `/manager/queue/install_model` | POST | `deps install --models` |
| `/manager/queue/status` | GET | `deps install` (progress) |
## Local vs Cloud Differences
| | Local | Cloud |
|---|---|---|
| Base URL | `http://127.0.0.1:8188` | `https://cloud.comfy.org` |
| Route prefix | none | `/api` |
| Auth | none or bearer token | `X-API-Key` header |
| Job status | Poll `/history/{id}` | `/api/jobs/{id}` |
| Output download | Direct bytes from `/view` | 302 redirect → signed URL |
| WebSocket | `ws://host:port/ws?clientId={uuid}` | `wss://host/ws?clientId={uuid}&token={key}` |
| Concurrent jobs | Sequential | Tier-limited (Free: 1, Creator: 3, Pro: 5) |
The CLI handles all of these differences transparently based on the server config.
## Workflow JSON Format (API Format)
```json
{
"node_id_string": {
"class_type": "NodeClassName",
"inputs": {
"param_name": "value",
"linked_input": ["source_node_id", output_index]
}
}
}
```
- Node IDs are strings (`"3"`, not `3`)
- Links: `["node_id", output_index]` — 0-based int
- `class_type` must match exactly (case-sensitive)
## POST /prompt Payload
```json
{
"prompt": { "<workflow>" },
"client_id": "uuid",
"extra_data": {
"api_key_comfy_org": "key-for-paid-api-nodes"
}
}
```
The CLI constructs this from the imported workflow + injected parameters.
## WebSocket Message Types
| Type | When | Key Fields |
|------|------|------------|
| `execution_start` | Prompt begins | `prompt_id` |
| `executing` | Node running (`null` = done) | `node`, `prompt_id` |
| `progress` | Sampling steps | `node`, `value`, `max` |
| `executed` | Node output ready | `node`, `output` |
| `execution_success` | All nodes done | `prompt_id` |
| `execution_error` | Failure | `exception_type`, `exception_message` |

View File

@@ -0,0 +1,172 @@
# comfyui-skill CLI Reference
Complete command map for `comfyui-skill` v0.2.x.
**Invocation:** `uvx --from comfyui-skill-cli comfyui-skill [OPTIONS] COMMAND [ARGS]`
Or if installed as a tool: `comfyui-skill [OPTIONS] COMMAND [ARGS]`
## Global Options
| Option | Short | Description |
|--------|-------|-------------|
| `--version` | `-V` | Show version |
| `--json` | `-j` | JSON output (always use this for agent parsing) |
| `--output-format` | | `text`, `json`, or `stream-json` (NDJSON events) |
| `--server` | `-s` | Server ID override |
| `--dir` | `-d` | Data directory (default: CWD) |
| `--verbose` | `-v` | Verbose output |
| `--no-update-check` | | Skip CLI update check |
## Standalone Commands
### `list`
List all available skills across all enabled servers.
### `info <SKILL_ID>`
Show skill details and parameter schema. Skill ID format: `server_id/workflow_id` or `workflow_id`.
### `run <SKILL_ID> [OPTIONS]`
Execute a skill (blocking — waits for completion, streams progress).
| Option | Short | Description |
|--------|-------|-------------|
| `--args` | `-a` | JSON parameters (default: `{}`) |
| `--only` | | Comma-separated node IDs for partial execution |
| `--priority` | `-p` | Queue priority (lower = first, negative = jump queue; default: 0) |
| `--validate` | | Validate workflow without executing (dry run) |
| `--job-id` | | Idempotency key — reuse cached result if already executed |
### `submit <SKILL_ID> [OPTIONS]`
Submit a skill (non-blocking — returns `prompt_id` immediately). Same options as `run` except no streaming.
### `status <PROMPT_ID>`
Check execution status. Returns: `queued` (with `position`), `running` (with `progress`), `success` (with `outputs`), or `error`.
### `upload [FILE_PATH] [OPTIONS]`
Upload a file to ComfyUI for use in workflows.
| Option | Description |
|--------|-------------|
| `--from-output` | Reuse output from a previous prompt_id as input |
| `--mask` | Upload as mask (for inpainting) |
| `--original` | Original image filename (for mask upload) |
### `cancel <PROMPT_ID>`
Cancel a running or queued job.
### `free [OPTIONS]`
Release GPU memory.
| Option | Short | Description |
|--------|-------|-------------|
| `--models` | `-m` | Unload all models from VRAM |
| `--memory` | | Free all cached memory |
## Command Groups
### `server` — Manage ComfyUI Servers
| Subcommand | Description |
|------------|-------------|
| `server list` | List all configured servers |
| `server status [SERVER_ID]` | Check if server is online |
| `server stats [SERVER_ID]` | System stats: VRAM, RAM, GPU, versions (`--all` for all servers) |
| `server add` | Add server (`--id`, `--url` required; `--name`, `--output-dir`, `--auth`, `--api-key` optional) |
| `server enable <SERVER_ID>` | Enable a server |
| `server disable <SERVER_ID>` | Disable a server |
| `server remove <SERVER_ID>` | Remove a server |
### `workflow` — Manage Workflows
| Subcommand | Description |
|------------|-------------|
| `workflow import [JSON_PATH]` | Import workflow (`--name`, `--type` image/audio/video, `--from-server`, `--preview`, `--check-deps`) |
| `workflow enable <SKILL_ID>` | Enable a workflow |
| `workflow disable <SKILL_ID>` | Disable a workflow |
| `workflow delete <SKILL_ID>` | Delete a workflow |
### `models` — Discover Models
| Subcommand | Description |
|------------|-------------|
| `models list [FOLDER]` | List models in a folder (checkpoints, loras, vae, controlnet, etc.) |
### `nodes` — Discover Nodes
| Subcommand | Description |
|------------|-------------|
| `nodes list` | List all node classes (`-c` to filter by category) |
| `nodes info <NODE_CLASS>` | Full details of a node type |
| `nodes search <QUERY>` | Fuzzy search across names/categories |
### `deps` — Dependency Management
| Subcommand | Description |
|------------|-------------|
| `deps check <SKILL_ID>` | Check if dependencies are installed (returns `is_ready`) |
| `deps install <SKILL_ID>` | Install missing deps (`--repos` git URLs, `--models`, `--all`) |
### `history` — Execution History
| Subcommand | Description |
|------------|-------------|
| `history list [SKILL_ID]` | List history (`--server`, `--status`, `--limit`, `--sort`) |
| `history show <SKILL_ID> <RUN_ID>` | Show specific run details |
### `queue` — Queue Management
| Subcommand | Description |
|------------|-------------|
| `queue list` | Show running and pending jobs |
| `queue clear` | Clear all pending jobs |
| `queue delete <PROMPT_IDS...>` | Remove specific jobs from queue |
### `logs` — Server Logs
| Subcommand | Description |
|------------|-------------|
| `logs show` | Show recent server logs (`--lines` / `-n`, default: 50) |
### `templates` — Discover Templates
| Subcommand | Description |
|------------|-------------|
| `templates list` | Workflow templates from custom nodes |
| `templates subgraphs` | Reusable subgraph components |
### `config` — Configuration
| Subcommand | Description |
|------------|-------------|
| `config export` | Export config + workflows as bundle (`--output`, `--portable-only`) |
| `config import <INPUT_PATH>` | Import bundle (`--dry-run`, `--apply-environment`, `--no-overwrite`) |
## Config File Format
Located at `<workspace>/config.json`:
```json
{
"default_server": "local",
"servers": [
{
"id": "local",
"name": "Local ComfyUI",
"url": "http://127.0.0.1:8188",
"enabled": true,
"output_dir": "./outputs",
"auth": "",
"comfy_api_key": ""
}
]
}
```
**Server fields:**
- `id` — unique identifier (no spaces/slashes/dots)
- `url` — ComfyUI base URL
- `enabled` — whether server is active
- `output_dir` — where outputs are saved (relative to workspace)
- `auth` — bearer token for authenticated servers
- `comfy_api_key` — Comfy Cloud API key (also sent as `extra_data.api_key_comfy_org` in prompts)

View File

@@ -0,0 +1,41 @@
#!/usr/bin/env bash
# Initialize a comfyui-skill workspace directory.
# Usage: bash scripts/comfyui_setup.sh [WORKSPACE_DIR]
#
# Creates the workspace, adds a default local server config,
# and verifies the connection.
set -euo pipefail
WORKSPACE="${1:-$HOME/.hermes/comfyui}"
COMFY="${COMFY:-uvx --from comfyui-skill-cli comfyui-skill}"
echo "==> Initializing ComfyUI skill workspace at: $WORKSPACE"
mkdir -p "$WORKSPACE"
cd "$WORKSPACE"
# If config.json doesn't exist, create it with a default local server
if [ ! -f config.json ]; then
echo "==> Creating default config (local server at 127.0.0.1:8188)"
$COMFY --json server add --id local --url http://127.0.0.1:8188 --name "Local ComfyUI"
echo "==> Config created: $WORKSPACE/config.json"
else
echo "==> config.json already exists, skipping"
fi
# Verify connection
echo "==> Checking server connection..."
if $COMFY --json server status 2>/dev/null | grep -q '"online"'; then
echo "==> ComfyUI is reachable!"
$COMFY --json server stats 2>/dev/null || true
else
echo "==> ComfyUI is not reachable at the configured URL."
echo " Start ComfyUI first, or update the server URL:"
echo " cd $WORKSPACE && $COMFY server add --id local --url <YOUR_URL>"
echo ""
echo " Install ComfyUI: https://docs.comfy.org/installation"
fi
echo ""
echo "==> Workspace ready: $WORKSPACE"
echo " Always cd here before running comfyui-skill commands."