# LFM2-ColBERT-350M Inference Example
This repository demonstrates local GPU inference using the LiquidAI/LFM2-ColBERT-350M model for document retrieval and ranking tasks.
## Overview
The LFM2-ColBERT-350M is a neural retrieval model that uses contextualized embeddings to rank documents based on their relevance to queries. This project provides a complete example of:
- Loading the model and tokenizer
- Processing queries and documents
- Computing similarity scores
- Ranking documents by relevance
## Requirements
- Python 3.7+
- PyTorch
- Transformers
- scikit-learn
- CUDA-capable GPU (recommended)
## Installation

Install the required dependencies:

```bash
pip install transformers torch scikit-learn
```
## Usage
The Jupyter notebook demonstrates a complete workflow:

1. **Install Dependencies**: Installs `transformers` and `torch`
2. **Load Model**: Loads the LFM2-ColBERT-350M model from Hugging Face
3. **Prepare Data**: Creates example queries and documents
4. **Generate Embeddings**: Computes embeddings for queries and documents
5. **Rank Results**: Uses cosine similarity to rank documents by relevance
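The embedding step above can be sketched with a small mean-pooling helper. This is a minimal sketch under the assumption that one pooled vector per text is used for the cosine-similarity ranking; the random tensors stand in for real model outputs, and `mean_pool` is an illustrative helper, not part of the model's API:

```python
import torch

def mean_pool(last_hidden: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token embeddings over the sequence, ignoring padding positions."""
    mask = attention_mask.unsqueeze(-1).float()   # (batch, seq, 1)
    summed = (last_hidden * mask).sum(dim=1)      # (batch, hidden)
    counts = mask.sum(dim=1).clamp(min=1e-9)      # avoid divide-by-zero on empty rows
    return summed / counts

# Stand-in for model outputs: batch of 2 texts, 5 tokens, 8-dim hidden states.
hidden = torch.randn(2, 5, 8)
mask = torch.tensor([[1, 1, 1, 0, 0],            # first text has 2 padding tokens
                     [1, 1, 1, 1, 1]])
pooled = mean_pool(hidden, mask)
print(pooled.shape)  # torch.Size([2, 8])
```

In the notebook, `hidden` and `mask` would come from tokenizing a query or document and running it through the loaded model.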
## Quick Start
Open the `LFM2-ColBERT-350M.ipynb` notebook in Jupyter and run all cells. The example demonstrates:

```python
queries = [
    "What is the capital of France?",
    "Tell me about machine learning.",
    "How to train a neural network?",
]

documents = [
    "Paris is the capital and most populous city of France.",
    "Machine learning is a field of artificial intelligence...",
    # More documents...
]
```
The model successfully ranks relevant documents higher for each query.
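Given one embedding vector per query and per document, the ranking step can be sketched as follows. The small hand-written vectors below are illustrative stand-ins for real model output, not actual embeddings:

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Illustrative 4-dim embeddings standing in for real model output.
query_embs = np.array([[1.0, 0.0, 0.0, 0.1],
                       [0.0, 1.0, 0.1, 0.0]])
doc_embs = np.array([[0.9, 0.1, 0.0, 0.0],    # semantically close to query 0
                     [0.1, 0.95, 0.0, 0.0],   # semantically close to query 1
                     [0.0, 0.0, 1.0, 0.0]])

scores = cosine_similarity(query_embs, doc_embs)  # shape (n_queries, n_docs)
ranking = np.argsort(-scores, axis=1)             # best-scoring document first
print(ranking[0])  # → [0 1 2], document 0 ranked highest for query 0
```

Each row of `ranking` lists document indices from most to least relevant for the corresponding query, which is exactly the ordering the notebook prints.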
## Results
The example shows effective document ranking with high similarity scores for relevant query-document pairs:
- Query about France's capital correctly ranks Paris-related documents highest
- Machine learning queries prioritize ML-related content
- Neural network training queries rank technical documents first
## Model Information

- **Model**: LiquidAI/LFM2-ColBERT-350M
- **Task**: Document retrieval and ranking
- **Parameters**: 350M
- **Architecture**: ColBERT-style retrieval model
## Next Steps
- Evaluate on larger, more diverse datasets
- Compare performance with other retrieval models
- Fine-tune on domain-specific data
- Implement batch processing for larger document collections
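As a starting point for the batch-processing item, documents can be embedded in fixed-size chunks to keep GPU memory bounded. This sketch uses a hypothetical `embed_fn` callable standing in for the tokenize-and-forward model call:

```python
def embed_in_batches(texts, embed_fn, batch_size=32):
    """Run embed_fn over texts in fixed-size chunks and collect the vectors."""
    embeddings = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        embeddings.extend(embed_fn(batch))  # embed_fn returns one vector per text
    return embeddings

# Toy embed_fn for demonstration: each text's length as a 1-dim "embedding".
docs = [f"doc {i}" for i in range(10)]
vectors = embed_in_batches(docs, lambda batch: [[len(t)] for t in batch], batch_size=4)
print(len(vectors))  # 10
```

Replacing the toy lambda with the real model call (and moving each batch to the GPU before the forward pass) would make this usable for large document collections.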
## License
MIT License - see LICENSE file for details
## Acknowledgments
- Model developed by LiquidAI
- Built using Hugging Face Transformers
## Issues
If you encounter any problems or have questions:
- Check the model repository for model-specific issues
- Open an issue in this repository for implementation questions