Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models Paper • 2602.12036 • Published 2 days ago • 85
Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners Paper • 2509.26226 • Published Sep 30, 2025 • 34
ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation Paper • 2506.18095 • Published Jun 22, 2025 • 66
Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models Paper • 2506.19697 • Published Jun 24, 2025 • 44
QFFT, Question-Free Fine-Tuning for Adaptive Reasoning Paper • 2506.12860 • Published Jun 15, 2025 • 18