AI Endpoints

Access well-known pre-trained AI models through simple, secure APIs.
AI Endpoints run on OVHcloud's confidential infrastructure. You can add scalable AI capabilities to your applications without deep machine-learning expertise, using endpoints designed for simplicity and reliability.

Discover our models

Explore our catalog of artificial intelligence models to find the one that fits your needs.

29 results available

CODE LLM
New

Qwen3-Coder-30B-A3B-Instruct

0.06 /Mtoken (input)

0.22 /Mtoken (output)

Licence: Apache 2.0

Number of params: 30B

Quantization: fp8

Max. context size: 256K

Support: Function calling, Code Assistant
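
The per-token prices in this catalog translate into a per-request cost with simple arithmetic. The sketch below is illustrative only: the `TokenPrice` class and `request_cost` function are hypothetical names, the token counts are made up, and the listing does not state a currency, so the result is in whatever unit the prices are quoted in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPrice:
    """Prices per million tokens, as shown on a model card (currency not stated)."""
    input_per_mtoken: float
    output_per_mtoken: float


def request_cost(price: TokenPrice, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost of one request in the listing's price unit."""
    total = (input_tokens * price.input_per_mtoken
             + output_tokens * price.output_per_mtoken)
    return total / 1_000_000  # prices are quoted per million tokens


# Values copied from the Qwen3-Coder-30B-A3B-Instruct card.
QWEN3_CODER = TokenPrice(input_per_mtoken=0.06, output_per_mtoken=0.22)

if __name__ == "__main__":
    # Hypothetical request: 12,000 prompt tokens and 1,500 generated tokens.
    cost = request_cost(QWEN3_CODER, input_tokens=12_000, output_tokens=1_500)
    print(f"{cost:.6f}")
```

The same function works for any card in the catalog that lists both an input and an output price; for embedding models, which list only an input price, the output price can be set to 0.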

REASONING LLM
New

Gpt-oss-120b

0.08 /Mtoken (input)

0.4 /Mtoken (output)

Licence: Apache 2.0

Number of params: 117B

Quantization: fp4

Max. context size: 131K

Support: Function calling, Reasoning

REASONING LLM
New

Gpt-oss-20b

0.04 /Mtoken (input)

0.15 /Mtoken (output)

Licence: Apache 2.0

Number of params: 21B

Quantization: fp4

Max. context size: 131K

Support: Function calling, Reasoning

AUDIO ANALYSIS
New

Whisper-large-v3

0.00004083 /second

Licence: Apache 2.0

Number of params: 1.54B

Quantization: fp16

Support: Automatic Speech Recognition
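
Audio models are priced per second of audio rather than per token. As a rough sketch, the cost of a transcription job can be estimated as below. The function name `transcription_cost` is hypothetical, the one-hour duration is an example, and the calculation assumes billing is strictly proportional to duration (the listing does not say whether durations are rounded).

```python
# Per-second prices copied from the two Whisper cards (currency not stated).
WHISPER_LARGE_V3_PER_SECOND = 0.00004083
WHISPER_LARGE_V3_TURBO_PER_SECOND = 0.00001278


def transcription_cost(duration_seconds: float, price_per_second: float) -> float:
    """Return the estimated cost of transcribing audio of the given length."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return duration_seconds * price_per_second


if __name__ == "__main__":
    one_hour = 60 * 60  # example duration in seconds
    for name, price in [
        ("Whisper-large-v3", WHISPER_LARGE_V3_PER_SECOND),
        ("Whisper-large-v3-turbo", WHISPER_LARGE_V3_TURBO_PER_SECOND),
    ]:
        print(f"{name}: {transcription_cost(one_hour, price):.6f} per hour of audio")
```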

AUDIO ANALYSIS
New

Whisper-large-v3-turbo

0.00001278 /second

Licence: Apache 2.0

Number of params: 0.81B

Quantization: fp16

Support: Automatic Speech Recognition

REASONING LLM
New

Qwen3-32B

0.08 /Mtoken (input)

0.23 /Mtoken (output)

Licence: Apache 2.0

Number of params: 32.8B

Quantization: fp8

Max. context size: 32K

Support: Function calling, Reasoning

VISUAL LLM
New

Mistral-Small-3.2-24B-Instruct-2506

0.09 /Mtoken (input)

0.28 /Mtoken (output)

Licence: Apache 2.0

Number of params: 24B

Quantization: fp8

Max. context size: 128K

Support: Function calling, Multimodal

LARGE LANGUAGE MODELS (LLM)
New

Llama-3.1-8B-Instruct

0.1 /Mtoken (input)

0.1 /Mtoken (output)

Licence: Llama 3.1 Community

Number of params: 8B

Quantization: fp16

Max. context size: 131K

Support: Function calling

COMPUTER VISION
Beta

Yolov11x-image-segmentation

Free

Licence: AGPL-3.0

Number of params: 0.06B

Quantization: fp16

COMPUTER VISION
Beta

Yolov11x-object-detection

Free

Licence: AGPL-3.0

Number of params: 0.06B

Quantization: fp16

LARGE LANGUAGE MODELS (LLM)

Mixtral-8x7B-Instruct-v0.1

0.63 /Mtoken (input)

0.63 /Mtoken (output)

Licence: Apache 2.0

Number of params: 46.7B

Quantization: fp16

Max. context size: 32K

LARGE LANGUAGE MODELS (LLM)
New

Meta-Llama-3_3-70B-Instruct

0.67 /Mtoken (input)

0.67 /Mtoken (output)

Licence: Llama 3.3 Community

Number of params: 70B

Quantization: fp8

Max. context size: 131K

Support: Function calling

LARGE LANGUAGE MODELS (LLM)
New

Mistral-7B-Instruct-v0.3

0.1 /Mtoken (input)

0.1 /Mtoken (output)

Licence: Apache 2.0

Number of params: 7B

Quantization: fp16

Max. context size: 127K

Support: Function calling

EMBEDDINGS

Bge-base-en-v1.5

0.01 /Mtoken (input)

Licence: MIT

Number of params: 0.109B

Quantization: fp16

VISUAL LLM
New

Qwen2.5-VL-72B-Instruct

0.91 /Mtoken (input)

0.91 /Mtoken (output)

Licence: Qwen

Number of params: 72B

Quantization: fp8

Max. context size: 32K

Support: Multimodal

EMBEDDINGS

Bge-multilingual-gemma2

0.01 /Mtoken (input)

Licence: Gemma

Number of params: 0.567B

Quantization: fp16

REASONING LLM
New

DeepSeek-R1-Distill-Llama-70B

0.67 /Mtoken (input)

0.67 /Mtoken (output)

Licence: MIT & Meta Llama 3 Community License

Number of params: 70B

Quantization: fp8

Max. context size: 131K

Support: Function calling, Reasoning

LARGE LANGUAGE MODELS (LLM)

Mistral-Nemo-Instruct-2407

0.13 /Mtoken (input)

0.13 /Mtoken (output)

Licence: Apache 2.0

Number of params: 12.2B

Quantization: fp16

Max. context size: 118K

Support: Function calling

EMBEDDINGS

BGE-M3

0.01 /Mtoken (input)

Licence: MIT

Number of params: 0.567B

Quantization: fp16

IMAGE GENERATION

Stable-diffusion-xl-base-v10

Free

Licence: OpenRail++

Number of params: 3.5B

Quantization: fp32

Support: Image Generation

AUDIO ANALYSIS

Nvr-tts-en-us

Free

Licence: Riva license

Number of params: B

Quantization: fp32

Support: Text To Speech

TRANSLATION

T5-large

Free

Licence: Apache 2.0

Number of params: 0.738B

Quantization: fp32

AUDIO ANALYSIS

Nvr-tts-it-it

Free

Licence: Riva license

Number of params: B

Quantization: fp32

Support: Text To Speech

NATURAL LANGUAGE PROCESSING

Roberta-base-go_emotions

Free

Licence: MIT

Number of params: 0.125B

Quantization: fp32

Support: Emotion Extraction

AUDIO ANALYSIS

Nvr-tts-de-de

Free

Licence: Riva license

Number of params: B

Quantization: fp32

Support: Text To Speech

AUDIO ANALYSIS

Nvr-tts-es-es

Free

Licence: Riva license

Number of params: B

Quantization: fp32

Support: Text To Speech

NATURAL LANGUAGE PROCESSING

Bert-base-multilingual-uncased-sentiment

Free

Licence: MIT

Number of params: 0.167B

Quantization: fp32

Support: Sentiment Analysis

NATURAL LANGUAGE PROCESSING

Bert-base-NER

Free

Licence: MIT

Number of params: 0.108B

Quantization: fp32

Support: Named Entity Recognition

NATURAL LANGUAGE PROCESSING

Bart-large-cnn

Free

Licence: MIT

Number of params: 0.406B

Quantization: fp32
