FLES-1 v14: Sparse Lexical Encoder (Best Quality)

Paper: Closed-Loop FLOPS Regulation for Learned Sparse Retrieval, by Golvis Tavarez, Mindoval, Inc.

Model Description

FLES-1 transforms text into interpretable sparse vectors using BERT's MLM predictions. Each of the 30,522 dimensions corresponds to a token in BERT's WordPiece vocabulary, so the vectors are readable, debuggable, and compatible with standard inverted indices (Elasticsearch, OpenSearch).
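
For readers new to learned sparse retrieval, the snippet below is a minimal sketch of the general SPLADE-style pooling that turns MLM logits into a token-indexed sparse vector. It is not the exact FLES-1 implementation, and the bert-base-uncased checkpoint is only a stand-in: an untrained backbone will not produce sparse output on its own; sparsity comes from the trained FLES-1 weights.

# Minimal sketch of SPLADE-style sparse encoding (illustrative, not the FLES-1 code)
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")  # stand-in for the trained encoder

def sparse_encode(text):
    inputs = tokenizer(text, return_tensors="pt")
    logits = model(**inputs).logits                                  # (1, seq_len, 30522)
    # log(1 + ReLU) keeps weights non-negative; mask out padding positions
    weights = torch.log1p(torch.relu(logits)) * inputs["attention_mask"].unsqueeze(-1)
    vec = weights.max(dim=1).values.squeeze(0)                       # max-pool over the sequence
    nonzero = vec.nonzero().squeeze(-1)
    return {tokenizer.convert_ids_to_tokens(i.item()): round(vec[i].item(), 2) for i in nonzero}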

Trained with two novel techniques:

  • L1 FLOPS regularization – eliminates the gradient explosion that causes training instability in all published sparse retrieval models
  • Step-interval CLFR – closed-loop sparsity control that adjusts regularization strength every ~6,250 steps (one epoch in our setup) based on measured sparsity; a proportional-control sketch follows this list
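
The CLFR paper is not yet published, so the exact controller update is not reproduced here; the snippet below is only a plausible proportional-control sketch consistent with the settings reported in the Training section (target_nnz_d=400, gain=0.1, one adjustment per ~6,250 steps). All names are illustrative.

# Illustrative step-interval CLFR controller (assumed proportional update, not the published rule)
ADJUST_INTERVAL = 6_250   # ~one epoch in this setup
TARGET_NNZ_D = 400        # target average non-zeros per document vector
GAIN = 0.1

def clfr_update(lambda_d, measured_nnz_d):
    # Documents too dense -> raise the L1 FLOPS weight; too sparse -> relax it.
    relative_error = (measured_nnz_d - TARGET_NNZ_D) / TARGET_NNZ_D
    return lambda_d * (1.0 + GAIN * relative_error)

# In the training loop (sketch):
# if step > 0 and step % ADJUST_INTERVAL == 0:
#     lambda_d = clfr_update(lambda_d, running_avg_nnz_d)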

Metrics

nfcorpus (threshold=0.3)

| Metric     | Value  |
|------------|--------|
| NDCG@10    | 0.3049 |
| MRR        | 0.5182 |
| Recall@100 | 0.2544 |
| Avg NNZ    | 359    |

Reproducibility

This recipe was run 5 times with different seeds:

| Run             | NDCG@10 |
|-----------------|---------|
| v14 (original)  | 0.305   |
| v17c            | 0.299   |
| v31a            | 0.299   |
| v32 (seed=7777) | 0.299   |
| v26a (seed=42)  | 0.272   |

Mean: 0.295. Std: 0.013. v14 is at the high end of the variance for this recipe. Expected reproduction: 0.295 ± 0.013.

Baselines

| Model                             | NDCG@10 | NNZ | Distillation  | Training Data |
|-----------------------------------|---------|-----|---------------|---------------|
| FLES-1 v14                        | 0.305   | 359 | None          | 200K MS MARCO |
| BM25 (Pyserini, stemmed)          | 0.325   | –   | –             | –             |
| BM25 (regex, no stemming)         | 0.307   | –   | –             | –             |
| SPLADE-Doc (no distillation)      | 0.323   | –   | None          | Full MS MARCO |
| SPLADE original (no distillation) | 0.336   | –   | None          | Full MS MARCO |
| SPLADE-cocondenser (distilled)    | 0.340   | 125 | Cross-encoder | Full MS MARCO |

FLES-1 v14 is 6% behind Pyserini BM25 (0.325) and 6-10% behind non-distilled SPLADE variants. The paper's contribution is the training methodology (CLFR, L1 FLOPS, lambda-steps tradeoff), not the absolute numbers.

Cross-Domain (zero-shot)

| Dataset  | Domain             | NDCG@10 |
|----------|--------------------|---------|
| nfcorpus | Medical            | 0.305   |
| scifact  | Scientific claims  | 0.557   |
| fiqa     | Financial Q&A      | 0.212   |
| arguana  | Argument retrieval | 0.142   |
| scidocs  | Scientific docs    | 0.112   |

Production

| Metric                | GPU (A100)   | CPU         |
|-----------------------|--------------|-------------|
| Encoding throughput   | 245 docs/sec | 87 docs/sec |
| Query latency (avg)   | 10 ms        | 33 ms       |
| Index size (1K docs)  | 0.32 MB      | –           |
| vs. dense 768-d index | 9.5x smaller | –           |
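
Because each dimension maps to a vocabulary token, the sparse vectors can be served from a standard inverted index. The sketch below shows one way to do this with Elasticsearch's rank_features field type; it is an illustrative recipe, not an official integration, and the index name, mapping, and localhost URL are placeholders.

# Illustrative: store FLES-1 vectors as Elasticsearch rank_features and query with rank_feature clauses
from elasticsearch import Elasticsearch
from fles1_encoder import FLES1Encoder

es = Elasticsearch("http://localhost:9200")
encoder = FLES1Encoder.from_pretrained("mindoval/fles1-v14")

# One rank_features field holds the token -> weight map per document (weights must be positive).
es.indices.create(index="fles1-demo", mappings={
    "properties": {
        "text": {"type": "text"},
        "fles1": {"type": "rank_features"},
    },
})

text = "Machine learning is a subfield of artificial intelligence."
es.index(index="fles1-demo", id="doc1", document={"text": text, "fles1": encoder.encode(text)})

# Query: one rank_feature clause per query term, weighted by the query-side term weight.
query_vec = encoder.encode("What is machine learning?")
resp = es.search(index="fles1-demo", query={"bool": {"should": [
    {"rank_feature": {"field": f"fles1.{term}", "boost": weight, "linear": {}}}
    for term, weight in query_vec.items()
]}})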

Training

Foundation: fles1-v12b (2 generations from bert-base-uncased)
Data: 200,000 MS MARCO random negatives
Epochs: 2 (12,500 steps)
Loss: InfoNCE (τ=0.05) + L1 FLOPS (λ_d=0.00003) + anti-collapse
Controller: Step-interval CLFR, adjusted every ~6,250 steps (target_nnz_d=400, gain=0.1)
Optimizer: AdamW, lr=2e-5, batch_size=32, 7 negatives per query
Hardware: 1× A100 80GB, ~2 hours
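
As a reading aid, here is a minimal sketch of how the loss terms listed above could be combined. The InfoNCE temperature and λ_d match the values above, but the L1 FLOPS term is an assumed mean-L1 penalty on document vectors, and the anti-collapse term is left as a placeholder because its exact form is not specified here.

# Illustrative combined loss: InfoNCE (positives vs. random negatives) + L1 FLOPS on documents
import torch
import torch.nn.functional as F

TAU = 0.05        # InfoNCE temperature
LAMBDA_D = 3e-5   # L1 FLOPS weight on document vectors (0.00003)

def training_loss(q_vecs, d_vecs, anti_collapse=0.0):
    # q_vecs: (B, V) query vectors; d_vecs: (B * docs_per_query, V) document vectors,
    # where document i * docs_per_query is the positive for query i.
    scores = (q_vecs @ d_vecs.T) / TAU
    docs_per_query = d_vecs.shape[0] // q_vecs.shape[0]
    targets = torch.arange(q_vecs.shape[0]) * docs_per_query
    info_nce = F.cross_entropy(scores, targets)
    l1_flops = LAMBDA_D * d_vecs.abs().sum(dim=1).mean()   # pushes document vectors toward sparsity
    return info_nce + l1_flops + anti_collapse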

The CLFR Paper

Full paper coming soon.

This model is the primary result of a 75-run empirical study of training dynamics in sparse retrieval. The study discovered:

  • L1 FLOPS regularization (reduces training crashes from 10-17 to 0-7 per run)
  • Epoch-level closed-loop sparsity control (one adjustment per ~6,250 steps outperforms adjusting at every one of the 12,500 steps)
  • The lambda-steps tradeoff (eff_reg = λ × steps, sweet spot 0.10-0.20)
  • The binary contrastive ceiling (0.298 ± 0.007 for InfoNCE with random negatives)
  • Checkpoint archaeology (longitudinal weight analysis across 43 training runs)

Limitations

  • Trained on MS MARCO (English web Q&A). Domain transfer to non-English or specialized domains requires fine-tuning.
  • NNZ=359 is denser than SPLADE (125). For latency-critical deployments, consider fles1-v12b (NNZ=139).
  • The 0.305 result is at the high end of variance for this recipe (mean=0.295).
  • Does not use knowledge distillation; the gap to distilled SPLADE (10.4%) is structural.

Usage

from fles1_encoder import FLES1Encoder

# Load model
encoder = FLES1Encoder.from_pretrained("mindoval/fles1-v14")

# Encode text to sparse vector
sparse = encoder.encode("What is machine learning?")
# Returns: {'machine': 1.39, 'learning': 1.08, 'machines': 0.63, ...}

# Batch encode
vectors = encoder.encode_batch(["query 1", "query 2"], batch_size=32)

# Encode to term IDs (for inverted index)
ids, weights = encoder.encode_to_ids("What is machine learning?")
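
For quick experiments without a search engine, relevance can be scored directly as a sparse dot product over the token -> weight dictionaries returned by encode. The helper below is illustrative and not part of the fles1_encoder package; it assumes encode_batch returns one dictionary per input, like encode.

# Illustrative in-memory ranking via sparse dot product (assumes dict outputs as shown above)
def sparse_dot(query_vec, doc_vec):
    # Only terms present in both vectors contribute to the score.
    return sum(weight * doc_vec[term] for term, weight in query_vec.items() if term in doc_vec)

docs = ["Machine learning builds models from data.",
        "The stock market closed higher today."]
doc_vecs = encoder.encode_batch(docs, batch_size=32)   # assumed: one token->weight dict per doc
query_vec = encoder.encode("What is machine learning?")

for text, vec in sorted(zip(docs, doc_vecs), key=lambda p: sparse_dot(query_vec, p[1]), reverse=True):
    print(round(sparse_dot(query_vec, vec), 3), text)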

License

Apache 2.0

Golvis Tavarez, Mindoval, Inc. We thank Microsoft, Inc. for supporting this research through the Microsoft for Startups program. https://mindoval.com/ai-research
