Two changes:
1. _PROVIDER_VISION_MODELS: add a 'nous' -> 'xiaomi/mimo-v2-omni' entry
   so the vision auto-detect chain picks the correct multimodal model.
2. resolve_provider_client: detect when the requested model is a vision
   model (from _PROVIDER_VISION_MODELS or known vision model names) and
   pass vision=True to _try_nous(). Previously, resolve_provider_client()
   always called _try_nous() without vision=True, so it returned the
   default text model (gemini-3-flash-preview or mimo-v2-pro) instead of
   the vision-capable mimo-v2-omni.
The _try_nous() function already handled free-tier vision correctly, but
the resolve_provider_client() path (used by the auto-detect vision chain)
never signaled that a vision task was in progress.
Verified: xiaomi/mimo-v2-omni returns HTTP 200 with image inputs on the
Nous inference API, while google/gemini-3-flash-preview returns HTTP 404
with image inputs.