Reinforcement Fine-Tuning Naturally Mitigates Forgetting in Continual Post-Training Paper • 2507.05386 • Published Jul 7, 2025 • 1
SWE-Universe: Scale Real-World Verifiable Environments to Millions Paper • 2602.02361 • Published 3 days ago • 54
Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning Paper • 2602.01058 • Published 5 days ago • 38
Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models Paper • 2601.22060 • Published 7 days ago • 144
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text Paper • 2601.22975 • Published 6 days ago • 79
Quantization-Aware Distillation for NVFP4 Inference Accuracy Recovery Paper • 2601.20088 • Published 9 days ago • 1
DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation Paper • 2601.22153 • Published 7 days ago • 68
ECO: Quantized Training without Full-Precision Master Weights Paper • 2601.22101 • Published 7 days ago • 6
FineInstructions: Scaling Synthetic Instructions to Pre-Training Scale Paper • 2601.22146 • Published 7 days ago • 8
JUST-DUB-IT: Video Dubbing via Joint Audio-Visual Diffusion Paper • 2601.22143 • Published 7 days ago • 6
Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives Paper • 2601.20833 • Published 8 days ago • 171
ConceptMoE: Adaptive Token-to-Concept Compression for Implicit Compute Allocation Paper • 2601.21420 • Published 7 days ago • 41