---
language:
- en
tags:
- ColBERT
- PyLate
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- transformers
pipeline_tag: sentence-similarity
library_name: PyLate
license: apache-2.0
---
The crispy, lightweight ColBERT family from Mixedbread.
🍞 Looking for a simple end-to-end retrieval solution? Meet Mixedbread Search, our multi-modal and multi-lingual search solution.
# mxbai-edge-colbert-v0-32m

This model is a lightweight, 32-million-parameter ColBERT with a projection dimension of 64. It is built on top of [Ettin-32M](https://huggingface.co/jhu-clsp/ettin-encoder-32m), meaning it benefits from all of ModernBERT's architectural efficiencies. Despite this extreme efficiency, it is the best-performing "edge-sized" retriever, outperforming ColBERTv2 and many models with over 10 times more parameters. It can create multi-vector representations for documents of up to 32,000 tokens and is fully compatible with the [PyLate](https://github.com/lightonai/pylate) library.

## Usage

To use this model, you first need to install PyLate, either via uv:

```bash
# uv
uv add pylate

# uv + pip
uv pip install pylate
```

or via pip:

```bash
# pip
pip install -U pylate
```

Once installed, the model is immediately ready to generate representations and index documents:

```python
from pylate import indexes, models, retrieve

# Step 1: Load the model
model = models.ColBERT(
    model_name_or_path="mixedbread-ai/mxbai-edge-colbert-v0-32m",
)

# Step 2: Initialize an index (here, PLAID, for larger document collections)
index = indexes.PLAID(
    index_folder="pylate-index",
    index_name="index",
    override=True,  # This overwrites the existing index if any
)

# Step 3: Encode your documents
documents_ids = ["1", "2", "3"]
documents = ["document 1 text", "document 2 text", "document 3 text"]

documents_embeddings = model.encode(
    documents,
    batch_size=32,
    is_query=False,  # Ensure that it is set to False to indicate that these are documents, not queries
    show_progress_bar=True,
)

# Step 4: Add document embeddings to the index by providing embeddings and corresponding ids
index.add_documents(
    documents_ids=documents_ids,
    documents_embeddings=documents_embeddings,
)
```

That's all you need to do to encode a full collection! Your documents are indexed and ready to be queried:

```python
# Step 5: Initialize the ColBERT retriever
retriever = retrieve.ColBERT(index=index)

# Step 6: Encode the queries
queries_embeddings = model.encode(
    ["query for document 3", "query for document 1"],
    batch_size=32,
    is_query=True,  # Ensure that it is set to True to indicate that these are queries, not documents
    show_progress_bar=True,
)

# Step 7: Retrieve top-k documents
scores = retriever.retrieve(
    queries_embeddings=queries_embeddings,
    k=10,  # Retrieve the top 10 matches for each query
)
```
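Under the hood, both retrieval and reranking score documents with ColBERT-style late interaction (MaxSim): every query token embedding is compared against every document token embedding, the best-matching document token is kept for each query token, and these maxima are summed. The sketch below reproduces that computation with plain PyTorch; the random tensors are purely illustrative stand-ins for the per-token embeddings returned by `model.encode`, and it assumes the usual ColBERT setup where embeddings are L2-normalized so the dot product is a cosine similarity.

```python
import torch

# Illustrative shapes only: 32 query tokens and 180 document tokens,
# each projected to the model's 64-dimensional output space.
query_embeddings = torch.nn.functional.normalize(torch.randn(32, 64), dim=-1)
document_embeddings = torch.nn.functional.normalize(torch.randn(180, 64), dim=-1)

# Late interaction (MaxSim):
# 1. compute the similarity of every query token against every document token,
# 2. keep the best-matching document token for each query token,
# 3. sum these maxima to obtain the relevance score for this query-document pair.
similarity_matrix = query_embeddings @ document_embeddings.T  # shape (32, 180)
score = similarity_matrix.max(dim=1).values.sum()
print(float(score))
```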
### Reranking

Thanks to its extreme parameter efficiency, this model is particularly well-suited to being used as a re-ranker following an even more lightweight first-stage retrieval, such as static embedding models. Re-ranking is just as straightforward:

```python
from pylate import rank, models

# Load the model
model = models.ColBERT(
    model_name_or_path="mixedbread-ai/mxbai-edge-colbert-v0-32m",
)

# Define queries and documents
queries = [
    "query A",
    "query B",
]

documents = [
    ["document A", "document B"],
    ["document 1", "document C", "document B"],
]

documents_ids = [
    [1, 2],
    [1, 3, 2],
]

# Embed them
queries_embeddings = model.encode(
    queries,
    is_query=True,
)

documents_embeddings = model.encode(
    documents,
    is_query=False,
)

# Perform reranking
reranked_documents = rank.rerank(
    documents_ids=documents_ids,
    queries_embeddings=queries_embeddings,
    documents_embeddings=documents_embeddings,
)
```

## Evaluation

### **Results on BEIR**

| Model                         |    AVG    | MS MARCO  |  SciFact  |  Touche   |   FiQA    | TREC-COVID |    NQ     |  DBPedia  |
| :---------------------------- | :-------: | :-------: | :-------: | :-------: | :-------: | :--------: | :-------: | :-------: |
| **Large Models (>100M)**      |           |           |           |           |           |            |           |           |
| GTE-ModernColBERT-v1          | **0.547** |   0.453   | **0.763** | **0.312** | **0.453** | **0.836**  | **0.618** | **0.480** |
| ColBERTv2                     |   0.488   | **0.456** |   0.693   |   0.263   |   0.356   |   0.733    |   0.562   |   0.446   |
| **Medium Models (<35M)**      |           |           |           |           |           |            |           |           |
| **mxbai-edge-colbert-v0-32m** |   0.521   | **0.450** | **0.740** | **0.313** |   0.390   |   0.775    | **0.600** |   0.455   |
| answerai-colbert-small-v1     | **0.534** |   0.434   | **0.740** |   0.250   | **0.410** | **0.831**  |   0.594   | **0.464** |
| bge-small-en-v1.5             |   0.517   |   0.408   |   0.713   |   0.260   |   0.403   |   0.759    |   0.502   |   0.400   |
| snowflake-s                   |   0.519   |   0.402   |   0.722   |   0.235   |   0.407   |   0.801    |   0.509   |   0.410   |
| **Small Models (<25M)**       |           |           |           |           |           |            |           |           |
| mxbai-edge-colbert-v0-17m     | **0.490** | **0.416** | **0.719** | **0.316** |   0.326   | **0.713**  | **0.551** | **0.410** |
| colbert-muvera-micro          |   0.394   |   0.364   |   0.662   |   0.251   |   0.254   |   0.561    |   0.386   |   0.332   |
| all-MiniLM-L6-v2              |   0.419   |   0.365   |   0.645   |   0.169   | **0.369** |   0.472    |   0.439   |   0.323   |

### **Results on LongEmbed**

| Model                               |    AVG    |
| :---------------------------------- | :-------: |
| **Large Models (>100M)**            |           |
| GTE-ModernColBERT-v1 (32k)          | **0.898** |
| GTE-ModernColBERT-v1 (4k)           |   0.809   |
| granite-embedding-english-r2        |   0.656   |
| ColBERTv2                           |   0.428   |
| **Medium Models (<50M)**            |           |
| **mxbai-edge-colbert-v0-32m (32k)** | **0.849** |
| **mxbai-edge-colbert-v0-32m (4k)**  |   0.783   |
| granite-embedding-small-english-r2  |   0.637   |
| answerai-colbert-small-v1           |   0.441   |
| bge-small-en-v1.5                   |   0.312   |
| snowflake-arctic-embed-s            |   0.356   |
| **Small Models (<25M)**             |           |
| mxbai-edge-colbert-v0-17m (32k)     | **0.847** |
| mxbai-edge-colbert-v0-17m (4k)      |   0.776   |
| all-MiniLM-L6-v2                    |   0.298   |
| colbert-muvera-micro                |   0.405   |

For more details on evaluations, please read our [Tech Report](https://mixedbread.com/papers/small_colbert_report.pdf).

## Community

Please join our [Discord Community](https://discord.gg/j5dWb3Qkm9) and share your feedback and thoughts! We are here to help and always happy to chat.
## License

Apache 2.0

## Citation

If you use our model, please cite the associated tech report:

```bibtex
@misc{takehi2025fantasticsmallretrieverstrain,
      title={Fantastic (small) Retrievers and How to Train Them: mxbai-edge-colbert-v0 Tech Report},
      author={Rikiya Takehi and Benjamin Clavié and Sean Lee and Aamir Shakir},
      year={2025},
      eprint={2510.14880},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2510.14880},
}
```

If you specifically use its projection heads, or discuss their effect, please cite our report on using different projections for ColBERT models:

```bibtex
@misc{clavie2025simpleprojectionvariantsimprove,
      title={Simple Projection Variants Improve ColBERT Performance},
      author={Benjamin Clavié and Sean Lee and Rikiya Takehi and Aamir Shakir and Makoto P. Kato},
      year={2025},
      eprint={2510.12327},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2510.12327},
}
```

Finally, if you use PyLate in your work, please cite PyLate itself:

```bibtex
@misc{PyLate,
      title={PyLate: Flexible Training and Retrieval for Late Interaction Models},
      author={Chaffin, Antoine and Sourty, Raphaël},
      url={https://github.com/lightonai/pylate},
      year={2024}
}
```