AI endpoints
Easily access world-renowned pre-trained AI models.
Innovate with simple and secure APIs on OVHcloud's robust and confidential infrastructure. Optimize your applications with scalable AI capabilities, eliminating the need for deep expertise. Gain efficiency with powerful AI endpoints, designed for simplicity and reliability.
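Catalog models of this kind are commonly consumed through an OpenAI-style chat-completions API. As a minimal illustration of what such a request looks like, here is a sketch that only builds the request body; the base URL is a placeholder and the exact endpoint, authentication scheme, and accepted fields are assumptions, not documented values.

```python
import json

# Placeholder base URL -- an assumption for this sketch, not a real endpoint.
BASE_URL = "https://example-ai-endpoint.invalid/v1"

def build_chat_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat-completions request body for a catalog model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

# The payload would then be POSTed to f"{BASE_URL}/chat/completions"
# with an Authorization header carrying an API key.
payload = build_chat_request("Mistral-7B-Instruct-v0.3", "Summarise this ticket.")
print(json.dumps(payload, indent=2))
```

The same request shape works for any of the text-generation models listed below; only the `model` field changes.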
Discover our models
Explore our catalog of artificial intelligence models to find the one that fits your needs.
29 results available
Qwen3-Coder-30B-A3B-Instruct
Price: 0.06€/Mtoken (input), 0.22€/Mtoken (output)
Licence: Apache 2.0
Number of params: 30B
Quantization: fp8
Max. context size: 256K
Support: Function calling, Code Assistant
Gpt-oss-120b
Price: 0.08€/Mtoken (input), 0.40€/Mtoken (output)
Licence: Apache 2.0
Number of params: 117B
Quantization: fp4
Max. context size: 131K
Support: Function calling, Reasoning
Gpt-oss-20b
Price: 0.04€/Mtoken (input), 0.15€/Mtoken (output)
Licence: Apache 2.0
Number of params: 21B
Quantization: fp4
Max. context size: 131K
Support: Function calling, Reasoning
Whisper-large-v3
Price: 0.00004083€/second
Licence: Apache 2.0
Number of params: 1.54B
Quantization: fp16
Support: Automatic Speech Recognition
Whisper-large-v3-turbo
Price: 0.00001278€/second
Licence: Apache 2.0
Number of params: 0.81B
Quantization: fp16
Support: Automatic Speech Recognition
Qwen3-32B
Price: 0.08€/Mtoken (input), 0.23€/Mtoken (output)
Licence: Apache 2.0
Number of params: 32.8B
Quantization: fp8
Max. context size: 32K
Support: Function calling, Reasoning
Mistral-Small-3.2-24B-Instruct-2506
Price: 0.09€/Mtoken (input), 0.28€/Mtoken (output)
Licence: Apache 2.0
Number of params: 24B
Quantization: fp8
Max. context size: 128K
Support: Function calling, Multimodal
Llama-3.1-8B-Instruct
Price: 0.10€/Mtoken (input), 0.10€/Mtoken (output)
Licence: Llama 3.1 Community
Number of params: 8B
Quantization: fp16
Max. context size: 131K
Support: Function calling
Yolov11x-image-segmentation
Price: Free
Yolov11x-object-detection
Price: Free
Mixtral-8x7B-Instruct-v0.1
Price: 0.63€/Mtoken (input), 0.63€/Mtoken (output)
Meta-Llama-3_3-70B-Instruct
Price: 0.67€/Mtoken (input), 0.67€/Mtoken (output)
Licence: Llama 3.3 Community
Number of params: 70B
Quantization: fp8
Max. context size: 131K
Support: Function calling
Mistral-7B-Instruct-v0.3
Price: 0.10€/Mtoken (input), 0.10€/Mtoken (output)
Licence: Apache 2.0
Number of params: 7B
Quantization: fp16
Max. context size: 127K
Support: Function calling
Bge-base-en-v1.5
Price: 0.01€/Mtoken (input)
Qwen2.5-VL-72B-Instruct
Price: 0.91€/Mtoken (input), 0.91€/Mtoken (output)
Licence: Qwen
Number of params: 72B
Quantization: fp8
Max. context size: 32K
Support: Multimodal
Bge-multilingual-gemma2
Price: 0.01€/Mtoken (input)
DeepSeek-R1-Distill-Llama-70B
Price: 0.67€/Mtoken (input), 0.67€/Mtoken (output)
Licence: MIT & Meta Llama 3 Community License
Number of params: 70B
Quantization: fp8
Max. context size: 131K
Support: Function calling, Reasoning
Mistral-Nemo-Instruct-2407
Price: 0.13€/Mtoken (input), 0.13€/Mtoken (output)
Licence: Apache 2.0
Number of params: 12.2B
Quantization: fp16
Max. context size: 118K
Support: Function calling
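The per-Mtoken prices in the catalog can be turned into a rough per-request cost estimate. A minimal sketch, using prices taken from the entries above (the token counts in the example are made-up illustrative values, and 1 Mtoken = 1,000,000 tokens):

```python
# Prices in EUR per Mtoken, (input, output), copied from the catalog above.
PRICES_EUR_PER_MTOKEN = {
    "Qwen3-Coder-30B-A3B-Instruct": (0.06, 0.22),
    "Gpt-oss-120b": (0.08, 0.40),
    "Llama-3.1-8B-Instruct": (0.10, 0.10),
}

def estimate_cost_eur(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in euros of one request from its token counts."""
    price_in, price_out = PRICES_EUR_PER_MTOKEN[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

# Example: 10,000 input tokens and 2,000 output tokens on Gpt-oss-120b
# costs roughly 0.0016 EUR.
cost = estimate_cost_eur("Gpt-oss-120b", 10_000, 2_000)
print(f"{cost:.6f} EUR")
```

Speech models such as Whisper-large-v3 are billed per second of audio instead of per token, so the same idea applies with `duration_seconds * price_per_second`.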