mmBERT: a modern multilingual encoder
mmBERT is trained on 3T tokens from over 1,800 languages, achieving SoTA scores on standard benchmarks and exceptional low-resource performance.
- jhu-clsp/mmBERT-base (Fill-Mask)
- jhu-clsp/mmBERT-small (Fill-Mask)
- jhu-clsp/mmBERT-checkpoints
- jhu-clsp/mmBERT-pretrain-p1-fineweb2-langs
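Since the mmBERT checkpoints above are tagged Fill-Mask, the quickest way to try one is masked-token prediction. Below is a minimal sketch, assuming jhu-clsp/mmBERT-base loads through the standard transformers fill-mask pipeline (a recent transformers release is likely needed for its architecture):

```python
from transformers import AutoTokenizer, pipeline

# A minimal sketch, assuming jhu-clsp/mmBERT-base works with the
# standard transformers fill-mask API (it is tagged Fill-Mask above).
model_id = "jhu-clsp/mmBERT-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
unmasker = pipeline("fill-mask", model=model_id, tokenizer=tokenizer)

# Use the tokenizer's own mask token rather than hard-coding one.
text = f"Paris is the {tokenizer.mask_token} of France."
for pred in unmasker(text, top_k=3):
    print(pred["token_str"], round(pred["score"], 3))
```

The same call should work in any of the supported languages, which is the main point of a multilingual encoder; only the input sentence changes.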
Encoders vs Decoders: the Ettin Suite
A collection of SoTA, open-data, paired encoder-only and decoder-only models ranging from 17M to 1B parameters. See the paper "Seq vs Seq: An Open Suite of Paired Encoders and Decoders" (arXiv:2507.11412, published Jul 15) at https://arxiv.org/abs/2507.11412.
- jhu-clsp/ettin-encoder-17m (Fill-Mask)
- jhu-clsp/ettin-encoder-32m (Feature Extraction)
- jhu-clsp/ettin-encoder-68m (Fill-Mask)
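The 32M checkpoint is tagged Feature Extraction, i.e. producing embeddings rather than mask predictions. A minimal sketch, assuming jhu-clsp/ettin-encoder-32m loads via AutoModel and that mean pooling over non-padding tokens is a reasonable way to get a sentence vector (the pooling choice is an assumption, not something the collection specifies):

```python
import torch
from transformers import AutoModel, AutoTokenizer

# A minimal sketch: extract a sentence embedding from an Ettin encoder.
# Mean pooling over the attention mask is an assumption on our part.
model_id = "jhu-clsp/ettin-encoder-32m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

inputs = tokenizer(["paired encoders and decoders"], return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # (batch, seq_len, dim)

# Average token embeddings, ignoring padding positions.
mask = inputs["attention_mask"].unsqueeze(-1)   # (batch, seq_len, 1)
embedding = (hidden * mask).sum(1) / mask.sum(1)
print(embedding.shape)  # (1, hidden_dim)
```

Because the suite pairs encoders and decoders trained on the same open data, the encoder checkpoints can be swapped by size (17m, 32m, 68m, up to 1B) without changing this loading code.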