Article Improving Hugging Face Training Efficiency Through Packing with Flash Attention By lwtr and 5 others • Aug 21, 2024 • 39
Tiny Series Collection Tiny datasets that empower the foundation of Small Language Model! • 11 items • Updated Jan 26, 2024 • 41
Article Improving Parquet Dedupe on Hugging Face Hub By yuchenglow and 1 other • Oct 5, 2024 • 38
Article TimeScope: How Long Can Your Video Large Multimodal Model Go? By orrzohar and 3 others • 15 days ago • 32
Article Fast LoRA inference for Flux with Diffusers and PEFT By sayakpaul and 1 other • 15 days ago • 40
A little guide to building Large Language Models in 2024 Collection Resources mentioned by @thomwolf in https://x.com/Thom_Wolf/status/1773340316835131757 • 19 items • Updated Apr 1, 2024 • 16
📀 Dataset comparison models Collection 1.8B models trained on 350BT to compare different pretraining datasets • 8 items • Updated Jun 12, 2024 • 40
FineWeb2 Edu Japanese Collection FineWeb2 Edu Japanese: A high-quality, filtered Japanese dataset (120M texts, 89.3B tokens) for educational AI training. • 7 items • Updated Jun 19 • 1
Article FineWeb2-C: Help Build Better Language Models in Your Language By davanstrien and 5 others • Dec 23, 2024 • 21
Seed-X Collection A powerful open-source multilingual translation language model series, including instruction and reasoning models. • 6 items • Updated 9 days ago • 61