Article How to generate text: using different decoding methods for language generation with Transformers Mar 1, 2020 • 270
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published Jun 26 • 75
Article Building Conversational AI: A Deep Dive into Voice Agent Architectures and Best Practices Sep 2 • 11
Moroccan Darija LLMs Collection Language models that speak Moroccan Darija (ary) • 9 items • Updated Feb 20 • 4
Article Seeing Isn’t Understanding: The Spatial Reasoning Gap in Vision-Language Models Jul 13 • 9
Article Atlaset Dataset for Moroccan Darija: From Data Collection, Analysis, to Model Trainings Mar 6 • 26
Article TerjamaBench: A Cultural Benchmark for English-Darija Machine Translation Jan 10 • 34
ArTST - Arabic Text Speech Transformer Collection Open-source project for Arabic Speech Recognition and Generation • 15 items • Updated Jun 11 • 12