Qwen3Guard is a series of safety moderation models built upon Qwen3 and trained on a dataset of 1.19 million prompts and responses labeled for safety. The series includes models of three sizes (0.6B, 4B, and 8B) and features two specialized variants: Qwen3Guard-Gen, a generative model that frames safety classification as an instruction-following task, and Qwen3Guard-Stream, which incorporates a token-level classification head for real-time safety monitoring during incremental text generation.
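Because Qwen3Guard-Gen returns its verdict as generated text, client code usually has to parse that text before acting on it. The sketch below is illustrative only: it assumes the model answers with lines such as "Safety: Unsafe" and "Categories: Violent", a format that this guide does not specify, and the names parse_verdict and Verdict are hypothetical helpers, not part of any API. Check the actual output of the endpoint before relying on these patterns.

```python
import re
from typing import List, NamedTuple, Optional

# ASSUMPTION: the verdict text looks like
#   Safety: Unsafe
#   Categories: Violent
# This format is not confirmed by this guide; adjust the patterns if needed.
SAFETY_PATTERN = re.compile(r"Safety:\s*(Safe|Unsafe|Controversial)")
CATEGORIES_PATTERN = re.compile(r"Categories:\s*(.+)")


class Verdict(NamedTuple):
    """Hypothetical container for a parsed moderation verdict."""
    safety: Optional[str]   # None when no safety label was found
    categories: List[str]   # empty when no category was reported


def parse_verdict(text: str) -> Verdict:
    """Extract the safety label and categories from the generated text."""
    safety_match = SAFETY_PATTERN.search(text)
    categories_match = CATEGORIES_PATTERN.search(text)

    safety = safety_match.group(1) if safety_match else None

    categories: List[str] = []
    if categories_match:
        # ASSUMPTION: several categories, if any, are comma-separated.
        parts = [part.strip() for part in categories_match.group(1).split(",")]
        categories = [part for part in parts if part and part != "None"]

    return Verdict(safety, categories)
```

A caller could, for example, block a response whenever parse_verdict(text).safety is "Unsafe" and treat a missing label (None) as a parsing failure to be reviewed rather than as a safe result.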
The following examples walk you through using a VLM (Vision Language Model), which takes image and text prompts and generates text. To send multimodal input to the model, use a content list containing the text prompt and the base64-encoded image.
A VLM encodes the image as embeddings and represents it with tokens that are processed alongside the usual text tokens, so an image consumes part of the model's input context length.
These Vision Language Model APIs are based on Open-Source models:
Please ensure that you choose the appropriate model based on your specific use case.
First, install the requests library:
pip install requests

Next, export your access token to the OVH_AI_ENDPOINTS_ACCESS_TOKEN environment variable:

export OVH_AI_ENDPOINTS_ACCESS_TOKEN=<your-access-token>

If you do not have an access token yet, follow the instructions in the AI Endpoints – Getting Started guide.
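If the variable is not exported, os.getenv returns None and the request is sent with an invalid "Bearer None" header. A minimal sketch of an early check is shown below; load_access_token is a hypothetical helper name, and only the environment variable name comes from this guide.

```python
import os

ENV_VAR = "OVH_AI_ENDPOINTS_ACCESS_TOKEN"


def load_access_token(environ=os.environ) -> str:
    """Return the access token, or fail early with a clear message.

    `environ` defaults to the process environment; passing a plain dict
    makes the function easy to exercise without touching real variables.
    """
    token = environ.get(ENV_VAR, "").strip()
    if not token:
        raise RuntimeError(f"{ENV_VAR} is not set; export it before calling the API.")
    return token
```

The returned value can then be used wherever the examples call os.getenv('OVH_AI_ENDPOINTS_ACCESS_TOKEN').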
If you don't have any images available for testing, save the following image locally as sample.jpg:

And then run the following example Python code to ask the VLM to describe this image (or perform another task by adapting the content field of the messages dictionary):
import mimetypes
import os
import requests
import base64

image_filepath = "sample.jpg"
with open(image_filepath, "rb") as img_file:
    image_data = img_file.read()

# detect MIME type (default to jpeg if unknown)
mime_type, _ = mimetypes.guess_type(image_filepath)
if mime_type is None:
    mime_type = "image/jpeg"
encoded_image = f"data:{mime_type};base64,{base64.b64encode(image_data).decode('utf-8')}"

url = "https://oai.endpoints.kepler.ai.cloud.ovh.net/v1/chat/completions"
payload = {
    "max_tokens": 512,
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Describe this image."
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": encoded_image
                    }
                }
            ]
        }
    ],
    "model": "Qwen3Guard-Gen-8B",
    "temperature": 0.2,
}
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.getenv('OVH_AI_ENDPOINTS_ACCESS_TOKEN')}",
}

response = requests.post(url, json=payload, headers=headers)
if response.status_code == 200:
    # Handle response
    response_data = response.json()
    # Parse JSON response
    choices = response_data["choices"]
    for choice in choices:
        text = choice["message"]["content"]
        # Process text and finish_reason
        print(text)
else:
    print("Error:", response.status_code, response.text)

The Qwen3Guard-Gen-8B API is compatible with the OpenAI specification.
First, install the openai library:

pip install openai

Next, export your access token to the OVH_AI_ENDPOINTS_ACCESS_TOKEN environment variable:

export OVH_AI_ENDPOINTS_ACCESS_TOKEN=<your-access-token>

If you do not have an access token yet, follow the instructions in the AI Endpoints – Getting Started guide.
Finally, run the following Python code:
import mimetypes
import os
import base64
from openai import OpenAI

url = "https://oai.endpoints.kepler.ai.cloud.ovh.net/v1"
client = OpenAI(
    base_url=url,
    api_key=os.getenv("OVH_AI_ENDPOINTS_ACCESS_TOKEN")
)

def multimodal_chat_completion(new_message: str, image_filepath: str = None) -> str:
    new_user_message = {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": new_message
            }
        ]
    }
    if image_filepath is not None:
        with open(image_filepath, "rb") as img_file:
            image_data = img_file.read()
        # detect MIME type (default to jpeg if unknown)
        mime_type, _ = mimetypes.guess_type(image_filepath)
        if mime_type is None:
            mime_type = "image/jpeg"
        encoded_image = f"data:{mime_type};base64,{base64.b64encode(image_data).decode('utf-8')}"
        image_content = {
            "type": "image_url",
            "image_url": {
                "url": encoded_image
            }
        }
        new_user_message["content"].append(image_content)
    history_openai_format = [new_user_message]
    return client.chat.completions.create(
        model="Qwen3Guard-Gen-8B",
        messages=history_openai_format,
        temperature=0.2,
        max_tokens=1024
    ).choices.pop().message.content

if __name__ == '__main__':
    print(multimodal_chat_completion("Describe this image.", "sample.jpg"))

When using AI Endpoints, the following rate limits apply:
If you exceed this limit, a 429 error code will be returned.
If you require higher usage, please get in touch with us to discuss increasing your rate limits.
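One common way to cope with occasional 429 responses is to retry with exponential backoff. The sketch below is one possible approach, not an official client behaviour: call_with_backoff, its parameters, and the delay values are illustrative assumptions, and it does not rely on any particular response header being present.

```python
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `send` until it returns a status other than 429.

    `send` is any function returning (status_code, body). The wait time
    doubles after each rate-limited attempt: base_delay, 2x, 4x, ...
    The last (status_code, body) is returned even if it is still a 429.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))
    return status, body
```

With the requests example, send could be a small function that calls requests.post(url, json=payload, headers=headers) and returns (response.status_code, response.text).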
Want to explore the full capabilities of the LLM API? Dive into our dedicated Structured Output and Function Calling guides.
For a broader overview of AI Endpoints, explore the full AI Endpoints Documentation.
Reach out to our support team or join the OVHcloud Discord #ai-endpoints channel to share your questions, feedback, and suggestions for improving the service with the team and the community.