ImageChain: Advancing Sequential Image-to-Text Reasoning in Multimodal Large Language Models Paper • 2502.19409 • Published Feb 26, 2025
FineWeb: decanting the web for the finest text data at scale 🍷 Running • Featured • 1.28k • Generate high-quality text data for LLMs using FineWeb
CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Through Corpus Retrieval and Augmentation Paper • 2409.02098 • Published Sep 3, 2024 • 3
CRAFT: Corpus Retrieval and Augmentation for Fine-Tuning Collection • CRAFTed datasets and LoRA adapter checkpoints. All datasets are synthetically generated. Paper: https://arxiv.org/abs/2409.02098 • 11 items • Updated Sep 4, 2024 • 3