The crispy, lightweight ColBERT family from Mixedbread.

🍞 Looking for a simple end-to-end retrieval solution? Meet Mixedbread Search, our multi-modal and multi-lingual search solution.

mxbai-edge-colbert-v0-32m

This model is a lightweight, 32-million-parameter ColBERT with a projection dimension of 64. It is built on top of Ettin-32M, so it inherits all of ModernBERT's architectural efficiencies. Despite its small size, it is the best-performing "edge-sized" retriever, outperforming ColBERTv2 and many models with over 10 times more parameters. It can create multi-vector representations for documents of up to 32,000 tokens and is fully compatible with the PyLate library.
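Under the hood, ColBERT models score a query-document pair via late interaction (MaxSim): every query token embedding is compared against every document token embedding, the best match per query token is kept, and those maxima are summed. A minimal NumPy sketch of that scoring rule (illustrative only, not PyLate's implementation):

```python
import numpy as np

def maxsim_score(query_tokens: np.ndarray, doc_tokens: np.ndarray) -> float:
    """Late-interaction (MaxSim) score.

    query_tokens: (num_query_tokens, dim) matrix of query token embeddings.
    doc_tokens: (num_doc_tokens, dim) matrix of document token embeddings.
    For each query token, keep its best similarity against any document
    token, then sum those maxima.
    """
    similarities = query_tokens @ doc_tokens.T  # (num_query, num_doc)
    return float(similarities.max(axis=1).sum())

# Toy example: 2 query tokens and 3 document tokens in 2 dimensions.
query = np.array([[1.0, 0.0], [0.0, 1.0]])
doc = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
score = maxsim_score(query, doc)  # 1.0 + 1.0 = 2.0
```

With mxbai-edge-colbert-v0-32m, the token embeddings would live in the model's 64-dimensional projection space rather than the 2 dimensions used here.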

Usage

To use this model, you first need to install PyLate:

via uv

# uv
uv add pylate
# uv + pip
uv pip install pylate

or pip

# pip
pip install -U pylate

Once installed, you can immediately use the model to generate representations and index documents:

from pylate import indexes, models, retrieve

# Step 1: Load the model
model = models.ColBERT(
    model_name_or_path="mixedbread-ai/mxbai-edge-colbert-v0-32m",
)


# Step 2: Initialize an index (here, PLAID, for larger document collections)
index = indexes.PLAID(
    index_folder="pylate-index",
    index_name="index",
    override=True,  # This overwrites the existing index if any
)

# Step 3: Encode your documents
documents_ids = ["1", "2", "3"]
documents = ["document 1 text", "document 2 text", "document 3 text"]

documents_embeddings = model.encode(
    documents,
    batch_size=32,
    is_query=False,  # Ensure that it is set to False to indicate that these are documents, not queries
    show_progress_bar=True,
)

# Step 4: Add document embeddings to the index by providing embeddings and corresponding ids
index.add_documents(
    documents_ids=documents_ids,
    documents_embeddings=documents_embeddings,
)

That's all you need to do to encode a full collection! Your documents are indexed and ready to be queried:

# Step 5: Initialize the ColBERT retriever
retriever = retrieve.ColBERT(index=index)

# Step 6: Encode the queries
queries_embeddings = model.encode(
    ["query for document 3", "query for document 1"],
    batch_size=32,
    is_query=True,  # Ensure that it is set to True to indicate that these are queries, not documents
    show_progress_bar=True,
)

# Step 7: Retrieve top-k documents
scores = retriever.retrieve(
    queries_embeddings=queries_embeddings,
    k=10,  # Retrieve the top 10 matches for each query
)
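The returned results follow PyLate's format: one list of hits per query, each hit a dict carrying a document id and its late-interaction score, sorted by decreasing score. A sketch of iterating over such results (the ids and scores below are invented for illustration):

```python
# Hypothetical retrieval output: one list of hits per query, each hit a
# dict with the document id and its MaxSim score (values invented).
scores = [
    [{"id": "3", "score": 21.5}, {"id": "1", "score": 18.2}],  # query 1
    [{"id": "1", "score": 20.1}, {"id": "2", "score": 15.0}],  # query 2
]

best_ids = []
for query_results in scores:
    # Hits are assumed to arrive sorted by decreasing score,
    # so the first entry is the best match for that query.
    best_ids.append(query_results[0]["id"])
```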

Reranking

Thanks to its extreme parameter efficiency, this model is particularly well-suited for use as a re-ranker following an even more lightweight first-stage retriever, such as a static embedding model. Re-ranking is just as straightforward:

from pylate import rank, models

# Load the model
model = models.ColBERT(
    model_name_or_path="mixedbread-ai/mxbai-edge-colbert-v0-32m",
)

# Define queries and documents
queries = [
    "query A",
    "query B",
]

documents = [
    ["document A", "document B"],
    ["document 1", "document C", "document B"],
]
documents_ids = [
    [1, 2],
    [1, 3, 2],
]

# Embed them
queries_embeddings = model.encode(
    queries,
    is_query=True,
)

documents_embeddings = model.encode(
    documents,
    is_query=False,
)

# Perform reranking
reranked_documents = rank.rerank(
    documents_ids=documents_ids,
    queries_embeddings=queries_embeddings,
    documents_embeddings=documents_embeddings,
)
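To make the two-stage pattern concrete, here is a toy first-stage retriever based on word overlap, standing in for a static embedding model or BM25. Its output ids per query would then be passed as documents_ids to rank.rerank as above (the corpus and scoring function are invented for illustration):

```python
def first_stage(query: str, corpus: dict[str, str], k: int = 3) -> list[str]:
    """Toy first-stage retriever: rank documents by word overlap with the
    query. Stands in for a cheap static-embedding or BM25 candidate generator."""
    query_terms = set(query.lower().split())
    scored = [
        (doc_id, len(query_terms & set(text.lower().split())))
        for doc_id, text in corpus.items()
    ]
    # Keep the k documents sharing the most terms with the query.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

corpus = {
    "1": "the cat sat on the mat",
    "2": "dogs chase cats in the park",
    "3": "quarterly financial report and earnings",
}
candidates = first_stage("cat on a mat", corpus, k=2)
# candidates would then be re-scored by the ColBERT model via rank.rerank
```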

Evaluation

Results on BEIR

| Model | AVG | MS MARCO | SciFact | Touche | FiQA | TREC-COVID | NQ | DBPedia |
|---|---|---|---|---|---|---|---|---|
| **Large Models (>100M)** | | | | | | | | |
| GTE-ModernColBERT-v1 | 0.547 | 0.453 | 0.763 | 0.312 | 0.453 | 0.836 | 0.618 | 0.480 |
| ColBERTv2 | 0.488 | 0.456 | 0.693 | 0.263 | 0.356 | 0.733 | 0.562 | 0.446 |
| **Medium Models (<35M)** | | | | | | | | |
| mxbai-edge-colbert-v0-32m | 0.521 | 0.450 | 0.740 | 0.313 | 0.390 | 0.775 | 0.600 | 0.455 |
| answerai-colbert-small-v1 | 0.534 | 0.434 | 0.740 | 0.250 | 0.410 | 0.831 | 0.594 | 0.464 |
| bge-small-en-v1.5 | 0.517 | 0.408 | 0.713 | 0.260 | 0.403 | 0.759 | 0.502 | 0.400 |
| snowflake-s | 0.519 | 0.402 | 0.722 | 0.235 | 0.407 | 0.801 | 0.509 | 0.410 |
| **Small Models (<25M)** | | | | | | | | |
| mxbai-edge-colbert-v0-17m | 0.490 | 0.416 | 0.719 | 0.316 | 0.326 | 0.713 | 0.551 | 0.410 |
| colbert-muvera-micro | 0.394 | 0.364 | 0.662 | 0.251 | 0.254 | 0.561 | 0.386 | 0.332 |
| all-MiniLM-L6-v2 | 0.419 | 0.365 | 0.645 | 0.169 | 0.369 | 0.472 | 0.439 | 0.323 |

Results on LongEmbed

| Model | AVG |
|---|---|
| **Large Models (>100M)** | |
| GTE-ModernColBERT-v1 (32k) | 0.898 |
| GTE-ModernColBERT-v1 (4k) | 0.809 |
| granite-embedding-english-r2 | 0.656 |
| ColBERTv2 | 0.428 |
| **Medium Models (<50M)** | |
| mxbai-edge-colbert-v0-32m (32k) | 0.849 |
| mxbai-edge-colbert-v0-32m (4k) | 0.783 |
| granite-embedding-small-english-r2 | 0.637 |
| answerai-colbert-small-v1 | 0.441 |
| bge-small-en-v1.5 | 0.312 |
| snowflake-arctic-embed-s | 0.356 |
| **Small Models (<25M)** | |
| mxbai-edge-colbert-v0-17m (32k) | 0.847 |
| mxbai-edge-colbert-v0-17m (4k) | 0.776 |
| all-MiniLM-L6-v2 | 0.298 |
| colbert-muvera-micro | 0.405 |

For more details on evaluations, please read our Tech Report.

Community

Please join our Discord Community and share your feedback and thoughts! We are here to help and also always happy to chat.

License

Apache 2.0

Citation

If you use our model, please cite the associated tech report:

@misc{takehi2025fantasticsmallretrieverstrain,
      title={Fantastic (small) Retrievers and How to Train Them: mxbai-edge-colbert-v0 Tech Report}, 
      author={Rikiya Takehi and Benjamin Clavié and Sean Lee and Aamir Shakir},
      year={2025},
      eprint={2510.14880},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2510.14880}, 
}

If you specifically use its projection heads, or discuss their effect, please cite our report on using different projections for ColBERT models:

@misc{clavie2025simpleprojectionvariantsimprove,
      title={Simple Projection Variants Improve ColBERT Performance}, 
      author={Benjamin Clavié and Sean Lee and Rikiya Takehi and Aamir Shakir and Makoto P. Kato},
      year={2025},
      eprint={2510.12327},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2510.12327}, 
}

Finally, if you use PyLate in your work, please cite PyLate itself:

@misc{PyLate,
title={PyLate: Flexible Training and Retrieval for Late Interaction Models},
author={Chaffin, Antoine and Sourty, Raphaël},
url={https://github.com/lightonai/pylate},
year={2024}
}