CHARM: Calibrating Reward Models With Chatbot Arena Scores Paper • 2504.10045 • Published Apr 14, 2025
CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models Paper • 2602.17684 • Published 20 days ago
ACE: Attribution-Controlled Knowledge Editing for Multi-hop Factual Recall Paper • 2510.07896 • Published Oct 9, 2025
Reasoning in the Dark: Interleaved Vision-Text Reasoning in Latent Space Paper • 2510.12603 • Published Oct 14, 2025