LaViT: Aligning Latent Visual Thoughts for Multi-modal Reasoning Paper • 2601.10129 • Published 4 days ago • 9
LSRIF: Logic-Structured Reinforcement Learning for Instruction Following Paper • 2601.06431 • Published 9 days ago • 9
Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding Paper • 2601.10611 • Published 4 days ago • 23
Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs Paper • 2601.08763 • Published 6 days ago • 129
Urban Socio-Semantic Segmentation with Vision-Language Reasoning Paper • 2601.10477 • Published 4 days ago • 150