Qwen2.5-VL-72B-Instruct

Visual LLM

Qwen2.5-VL is a powerful vision-language model, designed for advanced image understanding. It can generate detailed image captions, analyze documents, OCR, detect objects, and answer questions based on visuals, making it useful for AI assistants, RAG and Agents.

Acerca del modelo Qwen2.5-VL-72B-Instruct

Publicado el huggingface

27/01/2025


Precio de entrada

0.91 /Mtoken(entrada)

Precio de salida

0.91 /Mtoken(salida)


Características soportadas
MultimodalStreaming
Formatos de salida
raw_textjson_objectjson_schema
Tamaños de contexto
32k
Parámetros
72B

Prueba el modelo jugando con él.