Bilingual LMs (L1: es, fr, de, pl, tr, ar, zh; L2: en) trained on Cultura-X (L1) and FineWebEdu (L2)
Suchir Salhan
suchirsalhan
AI & ML interests
Multilinguality and Cognitively-Inspired AI. Tokenization, Pretraining, Interpretability & Alignment.
Recent Activity
updated a dataset 25 minutes ago
Beetle-Data/hi-2B-pretok
published a dataset 26 minutes ago
Beetle-Data/hi-2B-pretok
updated a model 39 minutes ago
Beetle-nld-eng-Variants/beetle-bilingual-l2-50-sequential-33-67-b3-fineweb-2b-nld-eng-arch-moe