/chat/completions
endpoint with both image and text inputs.
detail
field to control image resolution. The following modes are available:
high
: High resolution, preserves more detail—ideal for precision tasks.low
: Low resolution, faster response—suitable for real-time usage.auto
: Automatically selects the appropriate mode.meta-llama/llama-4-maverick-17b-128e-instruct-fp8
meta-llama/llama-4-scout-17b-16e-instruct
google/gemma-3-27b-it
qwen/qwen2.5-vl-72b-instruct
baidu/ernie-4.5-vl-424b-a47b