Spotlight on Token Perception for Multimodal Reinforcement Learning Paper • 2510.09285 • Published 11 days ago • 35
MemMamba: Rethinking Memory Patterns in State Space Model Paper • 2510.03279 • Published 23 days ago • 68
Diversity-Incentivized Exploration for Versatile Reasoning Paper • 2509.26209 • Published 21 days ago • 16
Native Hybrid Attention for Efficient Sequence Modeling Paper • 2510.07019 • Published 13 days ago • 16
Reasoning over Boundaries: Enhancing Specification Alignment via Test-time Deliberation Paper • 2509.14760 • Published Sep 18 • 52
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models Paper • 2505.22617 • Published May 28 • 130