Model name | vision | reasoning | function-calling | structured output | context size | tensor type |
---|---|---|---|---|---|---|
deepseek/deepseek-r1-0528-qwen3-8b | no | yes | no | no | 128000 | BF16 |
meta-llama/llama-4-scout-17b-16e-instruct | yes | no | yes | no | 131072 | BF16 |
deepseek/deepseek-r1-distill-llama-8b | no | yes | no | yes | 32000 | BF16 |
qwen/qwen3-8b-fp8 | no | yes | no | no | 128000 | FP8 |
qwen/qwen3-4b-fp8 | no | yes | no | no | 128000 | FP8 |

Model name | vision | reasoning | function-calling | structured output | context size | tensor type |
---|---|---|---|---|---|---|
meta-llama/llama-3.1-8b-instruct | no | no | no | no | 16384 | FP8 |
meta-llama/llama-3.3-70b-instruct | no | no | yes | yes | 131072 | BF16 |
meta-llama/llama-3.2-3b-instruct | no | no | yes | no | 32768 | BF16 |
novita/inflatebot-mn-12B-mag-mell-r1-awq-fp8 | yes | no | yes | yes | 32000 | FP8 |
novita/sao10k-l3-8b-stheno-v3.2-awq-fp8 | no | no | no | no | 8192 | FP8 |
deepseek/deepseek-v3-0324 | no | no | yes | no | 163840 | FP8 |

Status combination | Behavior |
---|---|
Running | Able to accept inference requests |
Pending/Rolling → Running | Automatically returns to available status after a deployment change |
Terminating → Terminated | Resources are released, billing stops, and the deployment can be restarted |
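
The status table above can be read as a small state machine. The following is a minimal sketch of that reading, not an official client: the `Status` enum, the `TRANSITIONS` mapping, and the helper names are hypothetical and only encode the three rows listed in the table.

```python
from enum import Enum


class Status(Enum):
    """Status names taken from the table; the enum itself is hypothetical."""
    PENDING = "Pending"
    ROLLING = "Rolling"
    RUNNING = "Running"
    TERMINATING = "Terminating"
    TERMINATED = "Terminated"


# Only the transitions listed in the table are modeled here.
TRANSITIONS = {
    Status.PENDING: Status.RUNNING,
    Status.ROLLING: Status.RUNNING,
    Status.TERMINATING: Status.TERMINATED,
}


def settles_to(status: Status) -> Status:
    """Follow the listed transitions until a stable status is reached."""
    while status in TRANSITIONS:
        status = TRANSITIONS[status]
    return status


def accepts_requests(status: Status) -> bool:
    """Per the table, only a Running deployment accepts inference requests."""
    return status is Status.RUNNING
```

For example, `settles_to(Status.ROLLING)` returns `Status.RUNNING`, at which point `accepts_requests` is true again, while a deployment in `Terminating` settles to `Terminated` and stops accepting requests until it is restarted.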