Language Server CLI Empowers Language Agents with Process Rewards • Paper • 2510.22907 • Published Oct 27, 2025
CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization • Paper • 2507.06181 • Published Jul 8, 2025
On the Design of KL-Regularized Policy Gradient Algorithms for LLM Reasoning • Paper • 2505.17508 • Published May 23, 2025
FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models • Paper • 2505.02735 • Published May 5, 2025
Scaling Image Tokenizers with Grouped Spherical Quantization • Paper • 2412.02632 • Published Dec 3, 2024
Training and Evaluating Language Models with Template-based Data Generation • Paper • 2411.18104 • Published Nov 27, 2024
MARS: Unleashing the Power of Variance Reduction for Training Large Models • Paper • 2411.10438 • Published Nov 15, 2024