Granite Embedding 107M Multilingual

This is a copy of the ibm-granite/granite-embedding-107m-multilingual model for document encoding purposes.

Model Summary

Granite-Embedding-107M-Multilingual is a 107M-parameter dense bi-encoder embedding model from the Granite Embeddings suite that generates high-quality text embeddings. The model produces embedding vectors of size 384.

Supported Languages

English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese.

Usage

With Sentence Transformers

from sentence_transformers import SentenceTransformer

model = SentenceTransformer('RikoteMaster/MNLP_M3_document_encoder')
embeddings = model.encode(['Your text here'])

With Transformers

from transformers import AutoModel, AutoTokenizer
import torch

model = AutoModel.from_pretrained('RikoteMaster/MNLP_M3_document_encoder')
tokenizer = AutoTokenizer.from_pretrained('RikoteMaster/MNLP_M3_document_encoder')
model.eval()

inputs = tokenizer(['Your text here'], return_tensors='pt', padding=True, truncation=True)
with torch.no_grad():
    outputs = model(**inputs)
    embeddings = outputs.last_hidden_state[:, 0]  # CLS pooling
    embeddings = torch.nn.functional.normalize(embeddings, dim=1)  # L2-normalize for cosine similarity
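Because the final step L2-normalizes the embeddings, cosine similarity between two of them reduces to a plain dot product. A minimal sketch with dummy 384-dimensional vectors standing in for real model output:

```python
import torch
import torch.nn.functional as F

# Dummy vectors in place of the normalized embeddings produced above
a = F.normalize(torch.randn(1, 384), dim=1)
b = F.normalize(torch.randn(1, 384), dim=1)

# For unit-norm vectors, the dot product equals cosine similarity
score = (a @ b.T).item()
```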

Original Model

This model is based on ibm-granite/granite-embedding-107m-multilingual by IBM.

Format: Safetensors · Model size: 107M params · Tensor type: F32
