Files
hermes-agent/website/docs/user-guide/features/image-generation.md

157 lines
6.8 KiB
Markdown
Raw Normal View History

---
title: Image Generation
description: Generate images via FAL.ai — 9 models including FLUX 2, GPT Image (1.5 & 2), Nano Banana Pro, Ideogram, Recraft V4 Pro, and more, selectable via `hermes tools`.
sidebar_label: Image Generation
sidebar_position: 6
---
# Image Generation
Hermes Agent generates images from text prompts via FAL.ai. Nine models are supported out of the box, each with different speed, quality, and cost tradeoffs. The active model is user-configurable via `hermes tools` and persists in `config.yaml`.
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
## Supported Models
| Model | Speed | Strengths | Price |
|---|---|---|---|
feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro (#11406) * feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro Upstream asked for these two upgrades ASAP — the old entries show stale models when newer, higher-quality versions are available on FAL. Recraft V3 → Recraft V4 Pro ID: fal-ai/recraft-v3 → fal-ai/recraft/v4/pro/text-to-image Price: $0.04/image → $0.25/image (6x — V4 Pro is premium tier) Schema: V4 dropped the required `style` enum entirely; defaults handle taste now. Added `colors` and `background_color` to supports for brand-palette control. `seed` is not supported by V4 per the API docs. Nano Banana → Nano Banana Pro ID: fal-ai/nano-banana → fal-ai/nano-banana-pro Price: $0.08/image → $0.15/image (1K); $0.30 at 4K Schema: Aspect ratio family unchanged. Added `resolution` (1K/2K/4K, default 1K for billing predictability), `enable_web_search` (real-time info grounding, +$0.015), and `limit_generations` (force exactly 1 image). Architecture: Gemini 2.5 Flash → Gemini 3 Pro Image. Quality and reasoning depth improved; slower (~6s → ~8s). Migration: users who had the old IDs in `image_gen.model` will fall through the existing 'unknown model → default' warning path in `_resolve_fal_model()` and get the Klein 9B default on the next run. Re-run `hermes tools` → Image Generation to pick the new version. No silent cost-upgrade aliasing — the 2-6x price jump on these tiers warrants explicit user re-selection. Portal note: both new model IDs need to be allowlisted on the Nous fal-queue-gateway alongside the previous 7 additions, or users on Nous Subscription will see the 'managed gateway rejected model' error we added previously (which is clear and self-remediating, just noisy). * docs: wrap '<1s' in backticks to unblock MDX compilation Docusaurus's MDX parser treats unquoted '<' as the start of JSX, and '<1s' fails because '1' isn't a valid tag-name start character. This was broken on main since PR #11265 (never noticed because docs-site-checks was failing on OTHER issues at the time and we admin-merged through it). Wrapping in backticks also gives the cell monospace styling which reads more cleanly alongside the inline-code model ID in the same row. The other '<1s' occurrence (line 52) is inside a fenced code block and is already safe — code fences bypass MDX parsing.
2026-04-16 22:05:41 -07:00
| `fal-ai/flux-2/klein/9b` *(default)* | `<1s` | Fast, crisp text | $0.006/MP |
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
| `fal-ai/flux-2-pro` | ~6s | Studio photorealism | $0.03/MP |
| `fal-ai/z-image/turbo` | ~2s | Bilingual EN/CN, 6B params | $0.005/MP |
feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro (#11406) * feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro Upstream asked for these two upgrades ASAP — the old entries show stale models when newer, higher-quality versions are available on FAL. Recraft V3 → Recraft V4 Pro ID: fal-ai/recraft-v3 → fal-ai/recraft/v4/pro/text-to-image Price: $0.04/image → $0.25/image (6x — V4 Pro is premium tier) Schema: V4 dropped the required `style` enum entirely; defaults handle taste now. Added `colors` and `background_color` to supports for brand-palette control. `seed` is not supported by V4 per the API docs. Nano Banana → Nano Banana Pro ID: fal-ai/nano-banana → fal-ai/nano-banana-pro Price: $0.08/image → $0.15/image (1K); $0.30 at 4K Schema: Aspect ratio family unchanged. Added `resolution` (1K/2K/4K, default 1K for billing predictability), `enable_web_search` (real-time info grounding, +$0.015), and `limit_generations` (force exactly 1 image). Architecture: Gemini 2.5 Flash → Gemini 3 Pro Image. Quality and reasoning depth improved; slower (~6s → ~8s). Migration: users who had the old IDs in `image_gen.model` will fall through the existing 'unknown model → default' warning path in `_resolve_fal_model()` and get the Klein 9B default on the next run. Re-run `hermes tools` → Image Generation to pick the new version. No silent cost-upgrade aliasing — the 2-6x price jump on these tiers warrants explicit user re-selection. Portal note: both new model IDs need to be allowlisted on the Nous fal-queue-gateway alongside the previous 7 additions, or users on Nous Subscription will see the 'managed gateway rejected model' error we added previously (which is clear and self-remediating, just noisy). * docs: wrap '<1s' in backticks to unblock MDX compilation Docusaurus's MDX parser treats unquoted '<' as the start of JSX, and '<1s' fails because '1' isn't a valid tag-name start character. This was broken on main since PR #11265 (never noticed because docs-site-checks was failing on OTHER issues at the time and we admin-merged through it). Wrapping in backticks also gives the cell monospace styling which reads more cleanly alongside the inline-code model ID in the same row. The other '<1s' occurrence (line 52) is inside a fenced code block and is already safe — code fences bypass MDX parsing.
2026-04-16 22:05:41 -07:00
| `fal-ai/nano-banana-pro` | ~8s | Gemini 3 Pro, reasoning depth, text rendering | $0.15/image (1K) |
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
| `fal-ai/gpt-image-1.5` | ~15s | Prompt adherence | $0.034/image |
| `fal-ai/gpt-image-2` | ~20s | SOTA text rendering + CJK, world-aware photorealism | $0.040.06/image |
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
| `fal-ai/ideogram/v3` | ~5s | Best typography | $0.030.09/image |
feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro (#11406) * feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro Upstream asked for these two upgrades ASAP — the old entries show stale models when newer, higher-quality versions are available on FAL. Recraft V3 → Recraft V4 Pro ID: fal-ai/recraft-v3 → fal-ai/recraft/v4/pro/text-to-image Price: $0.04/image → $0.25/image (6x — V4 Pro is premium tier) Schema: V4 dropped the required `style` enum entirely; defaults handle taste now. Added `colors` and `background_color` to supports for brand-palette control. `seed` is not supported by V4 per the API docs. Nano Banana → Nano Banana Pro ID: fal-ai/nano-banana → fal-ai/nano-banana-pro Price: $0.08/image → $0.15/image (1K); $0.30 at 4K Schema: Aspect ratio family unchanged. Added `resolution` (1K/2K/4K, default 1K for billing predictability), `enable_web_search` (real-time info grounding, +$0.015), and `limit_generations` (force exactly 1 image). Architecture: Gemini 2.5 Flash → Gemini 3 Pro Image. Quality and reasoning depth improved; slower (~6s → ~8s). Migration: users who had the old IDs in `image_gen.model` will fall through the existing 'unknown model → default' warning path in `_resolve_fal_model()` and get the Klein 9B default on the next run. Re-run `hermes tools` → Image Generation to pick the new version. No silent cost-upgrade aliasing — the 2-6x price jump on these tiers warrants explicit user re-selection. Portal note: both new model IDs need to be allowlisted on the Nous fal-queue-gateway alongside the previous 7 additions, or users on Nous Subscription will see the 'managed gateway rejected model' error we added previously (which is clear and self-remediating, just noisy). * docs: wrap '<1s' in backticks to unblock MDX compilation Docusaurus's MDX parser treats unquoted '<' as the start of JSX, and '<1s' fails because '1' isn't a valid tag-name start character. This was broken on main since PR #11265 (never noticed because docs-site-checks was failing on OTHER issues at the time and we admin-merged through it). Wrapping in backticks also gives the cell monospace styling which reads more cleanly alongside the inline-code model ID in the same row. The other '<1s' occurrence (line 52) is inside a fenced code block and is already safe — code fences bypass MDX parsing.
2026-04-16 22:05:41 -07:00
| `fal-ai/recraft/v4/pro/text-to-image` | ~8s | Design, brand systems, production-ready | $0.25/image |
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
| `fal-ai/qwen-image` | ~12s | LLM-based, complex text | $0.02/MP |
Prices are FAL's pricing at time of writing; check [fal.ai](https://fal.ai/) for current numbers.
## Setup
:::tip Nous Subscribers
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
If you have a paid [Nous Portal](https://portal.nousresearch.com) subscription, you can use image generation through the **[Tool Gateway](tool-gateway.md)** without a FAL API key. Your model selection persists across both paths.
If the managed gateway returns `HTTP 4xx` for a specific model, that model isn't yet proxied on the portal side — the agent will tell you so, with remediation steps (set `FAL_KEY` for direct access, or pick a different model).
:::
### Get a FAL API Key
1. Sign up at [fal.ai](https://fal.ai/)
2. Generate an API key from your dashboard
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
### Configure and Pick a Model
Run the tools command:
```bash
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
hermes tools
```
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
Navigate to **🎨 Image Generation**, pick your backend (Nous Subscription or FAL.ai), then the picker shows all supported models in a column-aligned table — arrow keys to navigate, Enter to select:
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
```
Model Speed Strengths Price
fal-ai/flux-2/klein/9b <1s Fast, crisp text $0.006/MP ← currently in use
fal-ai/flux-2-pro ~6s Studio photorealism $0.03/MP
fal-ai/z-image/turbo ~2s Bilingual EN/CN, 6B $0.005/MP
...
```
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
Your selection is saved to `config.yaml`:
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
```yaml
image_gen:
model: fal-ai/flux-2/klein/9b
use_gateway: false # true if using Nous Subscription
```
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
### GPT-Image Quality
The `fal-ai/gpt-image-1.5` and `fal-ai/gpt-image-2` request quality is pinned to `medium` (~$0.034$0.06/image at 1024×1024). We don't expose the `low` / `high` tiers as a user-facing option so that Nous Portal billing stays predictable across all users — the cost spread between tiers is 322×. If you want a cheaper option, pick Klein 9B or Z-Image Turbo; if you want higher quality, use Nano Banana Pro or Recraft V4 Pro.
## Usage
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
The agent-facing schema is intentionally minimal — the model picks up whatever you've configured:
```
Generate an image of a serene mountain landscape with cherry blossoms
```
```
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
Create a square portrait of a wise old owl — use the typography model
```
```
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
Make me a futuristic cityscape, landscape orientation
```
## Aspect Ratios
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
Every model accepts the same three aspect ratios from the agent's perspective. Internally, each model's native size spec is filled in automatically:
| Agent input | image_size (flux/z-image/qwen/recraft/ideogram) | aspect_ratio (nano-banana-pro) | image_size (gpt-image-1.5) | image_size (gpt-image-2) |
|---|---|---|---|---|
| `landscape` | `landscape_16_9` | `16:9` | `1536x1024` | `landscape_4_3` (1024×768) |
| `square` | `square_hd` | `1:1` | `1024x1024` | `square_hd` (1024×1024) |
| `portrait` | `portrait_16_9` | `9:16` | `1024x1536` | `portrait_4_3` (768×1024) |
GPT Image 2 maps to 4:3 presets rather than 16:9 because its minimum pixel count is 655,360 — the `landscape_16_9` preset (1024×576 = 589,824) would be rejected.
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
This translation happens in `_build_fal_payload()` — agent code never has to know about per-model schema differences.
## Automatic Upscaling
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
Upscaling via FAL's **Clarity Upscaler** is gated per-model:
| Model | Upscale? | Why |
|---|---|---|
| `fal-ai/flux-2-pro` | ✓ | Backward-compat (was the pre-picker default) |
| All others | ✗ | Fast models would lose their sub-second value prop; hi-res models don't need it |
When upscaling runs, it uses these settings:
| Setting | Value |
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
|---|---|
| Upscale factor | 2× |
| Creativity | 0.35 |
| Resemblance | 0.6 |
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
| Guidance scale | 4 |
| Inference steps | 18 |
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
If upscaling fails (network issue, rate limit), the original image is returned automatically.
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
## How It Works Internally
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
1. **Model resolution**`_resolve_fal_model()` reads `image_gen.model` from `config.yaml`, falls back to the `FAL_IMAGE_MODEL` env var, then to `fal-ai/flux-2/klein/9b`.
2. **Payload building**`_build_fal_payload()` translates your `aspect_ratio` into the model's native format (preset enum, aspect-ratio enum, or GPT literal), merges the model's default params, applies any caller overrides, then filters to the model's `supports` whitelist so unsupported keys are never sent.
3. **Submission**`_submit_fal_request()` routes via direct FAL credentials or the managed Nous gateway.
4. **Upscaling** — runs only if the model's metadata has `upscale: True`.
5. **Delivery** — final image URL returned to the agent, which emits a `MEDIA:<url>` tag that platform adapters convert to native media.
## Debugging
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
Enable debug logging:
```bash
export IMAGE_TOOLS_DEBUG=true
```
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
Debug logs go to `./logs/image_tools_debug_<session_id>.json` with per-call details (model, parameters, timing, errors).
docs: comprehensive documentation audit — fix stale info, expand thin pages, add depth (#5393) Major changes across 20 documentation pages: Staleness fixes: - Fix FAQ: wrong import path (hermes.agent → run_agent) - Fix FAQ: stale Gemini 2.0 model → Gemini 3 Flash - Fix integrations/index: missing MiniMax TTS provider - Fix integrations/index: web_crawl is not a registered tool - Fix sessions: add all 19 session sources (was only 5) - Fix cron: add all 18 delivery targets (was only telegram/discord) - Fix webhooks: add all delivery targets - Fix overview: add missing MCP, memory providers, credential pools - Fix all line-number references → use function name searches instead - Update file size estimates (run_agent ~9200, gateway ~7200, cli ~8500) Expanded thin pages (< 150 lines → substantial depth): - honcho.md: 43 → 108 lines — added feature comparison, tools, config, CLI - overview.md: 49 → 55 lines — added MCP, memory providers, credential pools - toolsets-reference.md: 57 → 175 lines — added explanations, config examples, custom toolsets, wildcards, platform differences table - optional-skills-catalog.md: 74 → 153 lines — added 25+ missing skills across communication, devops, mlops (18!), productivity, research categories - integrations/index.md: 82 → 115 lines — added messaging, HA, plugins sections - cron-internals.md: 90 → 195 lines — added job JSON example, lifecycle states, tick cycle, delivery targets, script-backed jobs, CLI interface - gateway-internals.md: 111 → 250 lines — added architecture diagram, message flow, two-level guard, platform adapters, token locks, process management - agent-loop.md: 112 → 235 lines — added entry points, API mode resolution, turn lifecycle detail, message alternation rules, tool execution flow, callback table, budget tracking, compression details - architecture.md: 152 → 295 lines — added system overview diagram, data flow diagrams, design principles table, dependency chain Other depth additions: - context-references.md: added platform availability, compression interaction, common patterns sections - slash-commands.md: added quick commands config example, alias resolution - image-generation.md: added platform delivery table - tools-reference.md: added tool counts, MCP tools note - index.md: updated platform count (5 → 14+), tool count (40+ → 47)
2026-04-05 19:45:50 -07:00
## Platform Delivery
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
| Platform | Delivery |
|---|---|
| **CLI** | Image URL printed as markdown `![](url)` — click to open |
| **Telegram** | Photo message with the prompt as caption |
| **Discord** | Embedded in a message |
| **Slack** | URL unfurled by Slack |
| **WhatsApp** | Media message |
| **Others** | URL in plain text |
docs: comprehensive documentation audit — fix stale info, expand thin pages, add depth (#5393) Major changes across 20 documentation pages: Staleness fixes: - Fix FAQ: wrong import path (hermes.agent → run_agent) - Fix FAQ: stale Gemini 2.0 model → Gemini 3 Flash - Fix integrations/index: missing MiniMax TTS provider - Fix integrations/index: web_crawl is not a registered tool - Fix sessions: add all 19 session sources (was only 5) - Fix cron: add all 18 delivery targets (was only telegram/discord) - Fix webhooks: add all delivery targets - Fix overview: add missing MCP, memory providers, credential pools - Fix all line-number references → use function name searches instead - Update file size estimates (run_agent ~9200, gateway ~7200, cli ~8500) Expanded thin pages (< 150 lines → substantial depth): - honcho.md: 43 → 108 lines — added feature comparison, tools, config, CLI - overview.md: 49 → 55 lines — added MCP, memory providers, credential pools - toolsets-reference.md: 57 → 175 lines — added explanations, config examples, custom toolsets, wildcards, platform differences table - optional-skills-catalog.md: 74 → 153 lines — added 25+ missing skills across communication, devops, mlops (18!), productivity, research categories - integrations/index.md: 82 → 115 lines — added messaging, HA, plugins sections - cron-internals.md: 90 → 195 lines — added job JSON example, lifecycle states, tick cycle, delivery targets, script-backed jobs, CLI interface - gateway-internals.md: 111 → 250 lines — added architecture diagram, message flow, two-level guard, platform adapters, token locks, process management - agent-loop.md: 112 → 235 lines — added entry points, API mode resolution, turn lifecycle detail, message alternation rules, tool execution flow, callback table, budget tracking, compression details - architecture.md: 152 → 295 lines — added system overview diagram, data flow diagrams, design principles table, dependency chain Other depth additions: - context-references.md: added platform availability, compression interaction, common patterns sections - slash-commands.md: added quick commands config example, alias resolution - image-generation.md: added platform delivery table - tools-reference.md: added tool counts, MCP tools note - index.md: updated platform count (5 → 14+), tool count (40+ → 47)
2026-04-05 19:45:50 -07:00
## Limitations
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
- **Requires FAL credentials** (direct `FAL_KEY` or Nous Subscription)
- **Text-to-image only** — no inpainting, img2img, or editing via this tool
- **Temporary URLs** — FAL returns hosted URLs that expire after hours/days; save locally if needed
- **Per-model constraints** — some models don't support `seed`, `num_inference_steps`, etc. The `supports` filter silently drops unsupported params; this is expected behavior