Bge-multilingual-gemma2

Embeddings

BGE-Multilingual-Gemma2 is an LLM-based multilingual embedding model, trained on a diverse range of languages and tasks. Its main advances are:

Diverse training data: The training data covers a wide range of languages, including English, Chinese, Japanese, Korean, French, and more. It also covers a variety of task types, such as retrieval, classification, and clustering.
Outstanding performance: The model achieves state-of-the-art (SOTA) results on multilingual benchmarks such as MIRACL, MTEB-pl, and MTEB-fr. It also performs strongly on other major evaluations, including MTEB, C-MTEB, and AIR-Bench.

About the Bge-multilingual-gemma2 model

Published on Hugging Face: 29/06/2024
License: Gemma
Tokens sent: 0.01 € / Mtoken (input)
Max sequence length: 8192 tokens
Max batch size: 25 samples
Output dimensions: 3584 dimensions
Parameters: 0.567B

Embeddings Converter API

The Embeddings Converter API provides a single unified endpoint that can handle both single sentence and batch requests.

Introduction

In NLP, word embeddings represent individual words as vectors in a high-dimensional space, where the vectors capture semantic and syntactic relationships between words. Embedding models let you perform tasks such as word analogy or sentence similarity. These tasks work by comparing vectors in the embedding space, for example by computing the distance or the cosine similarity between them.
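As a toy illustration of comparing vectors, the sketch below solves a word analogy with hand-made 3-dimensional vectors. The vectors and the helper names (`toy_vectors`, `solve_analogy`) are hypothetical and are not outputs of any model listed on this page; real embeddings have far more dimensions.

```python
import math

# Hypothetical toy vectors: the axes loosely mean (royalty, male, female).
toy_vectors = {
    "king":  [1.0, 1.0, 0.0],
    "queen": [1.0, 0.0, 1.0],
    "man":   [0.0, 1.0, 0.0],
    "woman": [0.0, 0.0, 1.0],
    "child": [0.0, 0.5, 0.5],
}

def euclidean_distance(vec_a, vec_b):
    """Return the straight-line distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(vec_a, vec_b)))

def solve_analogy(a, b, c, vectors):
    """Return the word closest to vec(b) - vec(a) + vec(c), excluding a, b and c."""
    target = [vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = {w: v for w, v in vectors.items() if w not in (a, b, c)}
    return min(candidates, key=lambda w: euclidean_distance(candidates[w], target))

# "man" is to "king" as "woman" is to ...?
answer = solve_analogy("man", "king", "woman", toy_vectors)
print(answer)  # queen
```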

AI Endpoints makes it easy, with ready-to-use inference APIs. Discover how to use them:

Model catalog

The embeddings API endpoint gives you access to state-of-the-art open-source models that transform raw text into high-dimensional embedding vectors, suitable for a variety of multilingual and monolingual applications:

BAAI/bge-multilingual-gemma2

Input Type: Raw text
Language Support: A wide range of languages, including English, Chinese, Japanese, Korean, French, and more
Output Type: Embedding vector (3584 dimensions)
Max Sequence Length: 8192 tokens
Max Batch Size: 25 samples

BAAI/bge-base-en-v1.5

Input Type: Raw text
Language Support: English
Output Type: Embedding vector (768 dimensions)
Max Sequence Length: 512 tokens
Max Batch Size: 25 samples

BAAI/bge-m3

Input Type: Raw text
Language Support: Supports over 100 languages - see model card for the full list
Output Type: Embedding vector (1024 dimensions)
Max Sequence Length: 8192 tokens
Max Batch Size: 25 samples

The request payload must contain two fields: model, which indicates the model to use, and input. The input field can be a single string or a JSON array of strings. The array is limited to the max batch size listed above for the chosen embedding model.
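When you have more texts than the batch limit allows, one option is to split them into several payloads before sending. The following is a minimal sketch of that idea; `build_payloads` is a hypothetical helper name, and the limit of 25 is taken from the catalog above.

```python
MAX_BATCH_SIZE = 25  # max batch size listed in the catalog above

def build_payloads(model, texts, batch_size=MAX_BATCH_SIZE):
    """Split texts into request payloads of at most batch_size inputs each."""
    if isinstance(texts, str):
        # A single string is sent as-is, without wrapping it in a list
        return [{"model": model, "input": texts}]
    return [
        {"model": model, "input": texts[start:start + batch_size]}
        for start in range(0, len(texts), batch_size)
    ]

sentences = [f"sentence {i}" for i in range(60)]
payloads = build_payloads("bge-multilingual-gemma2", sentences)
print([len(p["input"]) for p in payloads])  # [25, 25, 10]
```

Each payload can then be sent as its own request, and the returned embeddings concatenated in the same order.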

How to?

With a simple HTTP client (requests)

First install the requests library:

pip install requests

Next, export your access token to the OVH_AI_ENDPOINTS_ACCESS_TOKEN environment variable:

export OVH_AI_ENDPOINTS_ACCESS_TOKEN=<your-access-token>

If you do not have an access token yet, follow the instructions in the AI Endpoints – Getting Started guide.

Finally, run the following Python code:

import os
import requests

texts = [
    "Paris is the capital of France",
    "Paris is the capital of France",
    "Berlin is the capital of Germany",
    "This endpoint converts input sentences into vector embeddings"
]

url = "https://oai.endpoints.kepler.ai.cloud.ovh.net/v1/embeddings"
headers = {
    "Authorization": f"Bearer {os.getenv('OVH_AI_ENDPOINTS_ACCESS_TOKEN')}",
    "Content-Type": "application/json"
}
payload = {
    "model": "bge-multilingual-gemma2",
    "input": texts
}

# A timeout prevents the call from hanging indefinitely if the server does not answer
response = requests.post(url, headers=headers, json=payload, timeout=30)
if response.status_code == 200:
    data = response.json()
    print(f"✅ Received {len(data['data'])} embeddings")

    # Show a preview of the first embedding (first 5 values)
    print("First embedding preview:", data['data'][0]['embedding'][:5])
else:
    print("❌ Error:", response.status_code, response.text)
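The example above reads `data['data']` directly. A small helper can make that step explicit and check the vector size at the same time. This is only a sketch: `extract_embeddings` is a hypothetical name, and the `index` field is an assumption based on the OpenAI specification rather than something shown in the response above.

```python
def extract_embeddings(body, expected_dim=None):
    """Return the embedding vectors from a response body, in input order."""
    # Assumption: each item carries an "index" field, as in the OpenAI
    # specification. Items without it keep their position in the list.
    items = sorted(
        enumerate(body["data"]),
        key=lambda pair: pair[1].get("index", pair[0]),
    )
    vectors = [item["embedding"] for _, item in items]
    if expected_dim is not None:
        for position, vector in enumerate(vectors):
            if len(vector) != expected_dim:
                raise ValueError(
                    f"embedding {position} has {len(vector)} dimensions, "
                    f"expected {expected_dim}"
                )
    return vectors

# Hypothetical, shortened response body used only for illustration
sample_body = {
    "data": [
        {"index": 1, "embedding": [0.3, 0.4]},
        {"index": 0, "embedding": [0.1, 0.2]},
    ]
}
vectors = extract_embeddings(sample_body, expected_dim=2)
print(vectors)  # [[0.1, 0.2], [0.3, 0.4]]
```

With bge-multilingual-gemma2, `expected_dim` would be 3584, the output dimension listed in the catalog.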

With the Python OpenAI library

The bge-multilingual-gemma2 API is compatible with the OpenAI specification.

First install the openai library:

pip install openai

Next, export your access token to the OVH_AI_ENDPOINTS_ACCESS_TOKEN environment variable:

export OVH_AI_ENDPOINTS_ACCESS_TOKEN=<your-access-token>

Finally, run the following Python code:

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://oai.endpoints.kepler.ai.cloud.ovh.net/v1",
    api_key=os.getenv('OVH_AI_ENDPOINTS_ACCESS_TOKEN')
)

MODEL = "bge-multilingual-gemma2"
texts = [
    "Paris is the capital of France",
    "Paris is the capital of France",
    "Berlin is the capital of Germany",
    "This endpoint converts input sentences into vector embeddings"
]

response = client.embeddings.create(
    model=MODEL,
    input=texts
)

print(f"✅ Received {len(response.data)} embeddings")

# Show a preview of the first embedding (first 5 values)
print("First embedding preview:", response.data[0].embedding[:5])

With cURL

curl -X POST "https://oai.endpoints.kepler.ai.cloud.ovh.net/v1/embeddings" \
  -H "Authorization: Bearer $OVH_AI_ENDPOINTS_ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
      "model": "bge-multilingual-gemma2",
      "input": [
          "Paris is the capital of France",
          "Paris is the capital of France",
          "Berlin is the capital of Germany",
          "This endpoint converts input sentences into vector embeddings"
      ]
  }'

Vector similarity comparison

After obtaining the embeddings, you can assess how semantically similar the input sentences are. The following example shows how to compute cosine similarity between the first embedding and the others.

Install the numpy library:

pip install numpy

Then run the following code:

import numpy as np

def cosine_similarity(vec_a, vec_b):
    """Return cosine similarity between two vectors."""
    return np.dot(vec_a, vec_b) / (np.linalg.norm(vec_a) * np.linalg.norm(vec_b))

# `response` is the object returned by `requests.post` in the `requests` example above
data = response.json()
embeddings = [item["embedding"] for item in data["data"]]
base = embeddings[0]

# Compute similarities to the first embedding
similarities = {
    f"Similarity with sentence {i}": f"{cosine_similarity(base, vec):.3f}"
    for i, vec in enumerate(embeddings[1:], start=1)
}
print("Sentence similarities:", similarities)
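The same comparison extends naturally to a simple semantic search: score every stored vector against a query vector and keep the best matches. The sketch below uses only the standard library so it runs without extra packages; `rank_by_similarity` is a hypothetical helper, and the 2-dimensional vectors are made-up stand-ins for real embeddings.

```python
import math

def cosine_similarity(vec_a, vec_b):
    """Return cosine similarity between two non-zero vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(vec_a, vec_b))
    norm_a = math.sqrt(sum(a * a for a in vec_a))
    norm_b = math.sqrt(sum(b * b for b in vec_b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, doc_vecs, top_k=3):
    """Return (index, score) pairs for the top_k vectors most similar to query_vec."""
    scored = [(i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical 2-dimensional vectors standing in for real embeddings
query = [1.0, 0.0]
documents = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
ranking = rank_by_similarity(query, documents, top_k=2)
for index, score in ranking:
    print(f"document {index}: {score:.3f}")
```

In practice, `query` and `documents` would be embeddings returned by the endpoint, all produced by the same model so that their dimensions match.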

Model rate limit

Anonymous: 2 requests per minute, per IP and per model.
Authenticated with an API access key: 400 requests per minute, per Public Cloud project and per model.

If you exceed this limit, a 429 error code will be returned.
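One common way to handle a 429 response is to wait and retry with an increasing delay. The sketch below shows that pattern under stated assumptions: `post_with_retry` is a hypothetical helper, the delays (1 s, 2 s, 4 s, ...) are arbitrary choices rather than values recommended by the service, and `send` is any callable that performs the request and returns an object with a `status_code` attribute.

```python
import time

def post_with_retry(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-429 response or attempts run out."""
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        if attempt < max_attempts - 1:
            # Exponential backoff: wait base_delay * 1, 2, 4, ... seconds
            sleep(base_delay * 2 ** attempt)
    return response  # still 429 after the last attempt

# Stand-in for a real HTTP response, used only to demonstrate the helper
class FakeResponse:
    def __init__(self, status_code):
        self.status_code = status_code

replies = iter([FakeResponse(429), FakeResponse(429), FakeResponse(200)])
delays = []  # record the requested waits instead of actually sleeping
result = post_with_retry(lambda: next(replies), sleep=delays.append)
print(result.status_code, delays)  # 200 [1.0, 2.0]
```

With the `requests` example above, the call would look like `post_with_retry(lambda: requests.post(url, headers=headers, json=payload))`.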

If you require higher usage, please get in touch with us to discuss increasing your rate limits.

Going Further

For a broader overview of AI Endpoints, explore the full AI Endpoints Documentation.

Reach out to our support team or join the #ai-endpoints channel on the OVHcloud Discord to share your questions, feedback, and suggestions for improving the service with the team and the community.