Use the /chat/completions endpoint to invoke reasoning models.
- max_tokens: Sets the maximum number of tokens the model can return.
- temperature: Recommended between 0.5 and 0.7 (suggested: 0.6) to balance creativity and logic.
- top_p: Recommended value is 0.95.

deepseek/deepseek-r1-0528
deepseek/deepseek-r1-0528-qwen3-8b
deepseek/deepseek-r1-turbo
deepseek/deepseek-r1-distill-qwen-32b
deepseek/deepseek-r1-distill-qwen-14b
deepseek/deepseek-r1-distill-llama-70b
deepseek/deepseek-r1-distill-llama-8b
deepseek/deepseek-r1/community
qwen/qwen3-235b-a22b-fp8
qwen/qwen3-30b-a3b-fp8
qwen/qwen3-32b-fp8
qwen/qwen3-8b-fp8
qwen/qwen3-4b-fp8
thudm/glm-z1-rumination-32b-0414
thudm/glm-z1-9b-0414
thudm/glm-z1-32b-0414
meta-llama/llama-4-maverick-17b-128e-instruct-fp8
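As a sketch of how the parameters above fit together, the snippet below builds a request payload for the /chat/completions endpoint using the recommended sampling values. The base URL, API key, and chosen model are illustrative assumptions, not values fixed by this document.

```python
import json

# Hypothetical base URL and key -- substitute your provider's real values.
BASE_URL = "https://api.example.com/v1"
API_KEY = "YOUR_API_KEY"

payload = {
    "model": "deepseek/deepseek-r1-0528",  # any model from the list above
    "messages": [
        {"role": "user", "content": "Explain why the sky is blue, step by step."}
    ],
    "max_tokens": 4096,    # cap on the tokens the model can return
    "temperature": 0.6,    # recommended 0.5-0.7; 0.6 suggested
    "top_p": 0.95,         # recommended value
}

# Sending the request (requires the `requests` package and a valid key):
# import requests
# resp = requests.post(
#     f"{BASE_URL}/chat/completions",
#     headers={"Authorization": f"Bearer {API_KEY}"},
#     json=payload,
# )
# print(resp.json()["choices"][0]["message"]["content"])

print(json.dumps(payload, indent=2))
```

The sampling values mirror the recommendations above; only max_tokens needs tuning to your expected response length.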
Avoid placing instructions in the system message. Instead, make the intent explicit in the user message.
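To illustrate that guidance, the sketch below folds what would normally be a system prompt into the user message, so the request carries no system role at all. The helper name is hypothetical, not part of any API.

```python
def build_messages(instructions: str, question: str) -> list[dict]:
    """Combine instructions and the question into a single user message,
    rather than splitting them across system and user roles."""
    return [{"role": "user", "content": f"{instructions}\n\n{question}"}]

messages = build_messages(
    "Answer concisely and show your reasoning.",
    "What is 17 * 24?",
)

# Every message in the request uses the user role; there is no system message.
assert all(m["role"] == "user" for m in messages)
```

The resulting list drops straight into the "messages" field of a /chat/completions payload.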