Article From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels Aug 18, 2025 • 95
Unifying Demonstration Selection and Compression for In-Context Learning Paper • 2405.17062 • Published May 27, 2024 • 1
TASTE: Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling Paper • 2504.07053 • Published Apr 9, 2025 • 6
Reply: Does Liger Kernel affect training speed at all? Is it faster, slower, or no difference compared to regular GRPO?
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset Paper • 2505.09568 • Published May 14, 2025 • 99
Tiny Series Collection Tiny datasets that empower the foundation of Small Language Model! • 11 items • Updated Jan 26, 2024 • 43