MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering • Paper • 2410.07095 • Published Oct 9, 2024 • 8
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models • Paper • 2505.24864 • Published May 30 • 143
WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning • Paper • 2512.02425 • Published 25 days ago • 23
Adaptive Multi-Agent Response Refinement in Conversational Systems • Paper • 2511.08319 • Published Nov 11 • 41
CWM: An Open-Weights LLM for Research on Code Generation with World Models • Paper • 2510.02387 • Published Sep 30 • 8
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs • Paper • 2510.11696 • Published Oct 13 • 176
Front-Loading Reasoning: The Synergy between Pretraining and Post-Training Data • Paper • 2510.03264 • Published Sep 26 • 23
Revisiting the Uniform Information Density Hypothesis in LLM Reasoning Traces • Paper • 2510.06953 • Published Oct 8 • 8
Cogito, Ergo Ludo: An Agent that Learns to Play by Reasoning and Planning • Paper • 2509.25052 • Published Sep 29 • 4
Dyna-Mind: Learning to Simulate from Experience for Better AI Agents • Paper • 2510.09577 • Published Oct 10 • 7
Multimodal Prompt Optimization: Why Not Leverage Multiple Modalities for MLLMs • Paper • 2510.09201 • Published Oct 10 • 49
When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs • Paper • 2510.07499 • Published Oct 8 • 48
Rethinking Reward Models for Multi-Domain Test-Time Scaling • Paper • 2510.00492 • Published Oct 1 • 27
ACON: Optimizing Context Compression for Long-horizon LLM Agents • Paper • 2510.00615 • Published Oct 1 • 32
FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait • Paper • 2412.01064 • Published Dec 2, 2024 • 47
Distilling LLM Agent into Small Models with Retrieval and Code Tools • Paper • 2505.17612 • Published May 23 • 81