Fantastic (small) Retrievers and How to Train Them: mxbai-edge-colbert-v0 Tech Report
Abstract
The mxbai-edge-colbert-v0 models, at 17M and 32M parameters, outperform ColBERTv2 on short-text benchmarks (BEIR) and deliver a large step forward on long-context retrieval.
In this work, we introduce the mxbai-edge-colbert-v0 models at two parameter counts: 17M and 32M. As part of our research, we conduct numerous experiments to improve retrieval and late-interaction models, whose findings we distill into smaller models as proofs of concept. Our ultimate aim is to support retrieval at all scales, from large-scale deployments in the cloud to models that run locally on any device. We hope mxbai-edge-colbert-v0 will serve as a solid backbone for future experiments, as the first in a long series of small proof-of-concept models. During its development, we conducted multiple ablation studies, whose results we report here. In terms of downstream performance, mxbai-edge-colbert-v0 is a particularly capable small model: it outperforms ColBERTv2 on common short-text benchmarks (BEIR) and represents a large step forward on long-context tasks, with unprecedented efficiency.
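To make the late-interaction mechanism concrete, below is a minimal sketch of ColBERT-style MaxSim scoring in PyTorch. The function name, tensor shapes, and embedding dimension are illustrative assumptions, not the actual configuration of the mxbai-edge-colbert-v0 models.

```python
import torch

def maxsim_score(query_emb: torch.Tensor, doc_emb: torch.Tensor) -> torch.Tensor:
    """Late-interaction (MaxSim) relevance score for one query-document pair.

    query_emb: (num_query_tokens, dim) L2-normalized per-token embeddings
    doc_emb:   (num_doc_tokens, dim)   L2-normalized per-token embeddings
    """
    # Cosine similarity between every query token and every document token.
    sim = query_emb @ doc_emb.T  # (num_query_tokens, num_doc_tokens)
    # For each query token, keep its best-matching document token, then sum.
    return sim.max(dim=1).values.sum()

# Toy example with random normalized embeddings; shapes are illustrative only.
torch.manual_seed(0)
q = torch.nn.functional.normalize(torch.randn(32, 128), dim=-1)   # 32 query tokens
d = torch.nn.functional.normalize(torch.randn(180, 128), dim=-1)  # 180 doc tokens
print(maxsim_score(q, d).item())
```

Because each query token independently picks its best-matching document token, this scoring preserves token-level signals that a single pooled vector would average away, which is what makes late interaction attractive for small models.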
Community
Our latest tech report: a comprehensive account of how to take a model from its language-model pre-training weights, to a capable single-vector embedding model, and finally to a ColBERT model that outperforms 8B-parameter models on long-context retrieval tasks with just 0.017B parameters.
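As a companion to the MaxSim sketch above, here is a minimal illustration of the single-vector embedding stage the report describes (function name and shapes are again assumptions): token embeddings are pooled into one vector per text, and relevance collapses to a single cosine similarity.

```python
import torch

def single_vector_score(query_tokens: torch.Tensor, doc_tokens: torch.Tensor) -> torch.Tensor:
    """Single-vector retrieval: mean-pool (num_tokens, dim) token embeddings,
    then score with one cosine similarity."""
    q = torch.nn.functional.normalize(query_tokens.mean(dim=0), dim=0)
    d = torch.nn.functional.normalize(doc_tokens.mean(dim=0), dim=0)
    return q @ d  # scalar cosine similarity in [-1, 1]

torch.manual_seed(0)
print(single_vector_score(torch.randn(32, 128), torch.randn(180, 128)).item())
```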
This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- Granite Embedding R2 Models (2025)
- EmbeddingGemma: Powerful and Lightweight Text Representations (2025)
- Training LLMs to be Better Text Embedders through Bidirectional Reconstruction (2025)
- LEAF: Knowledge Distillation of Text Embedding Models with Teacher-Aligned Representations (2025)
- Simple Projection Variants Improve ColBERT Performance (2025)
- MetaEmbed: Scaling Multimodal Retrieval at Test-Time with Flexible Late Interaction (2025)
- Enhancing Document VQA Models via Retrieval-Augmented Generation (2025)