SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning Paper • 2504.08600 • Published Apr 11 • 31
Reasoning-SQL: Reinforcement Learning with SQL Tailored Partial Rewards for Reasoning-Enhanced Text-to-SQL Paper • 2503.23157 • Published Mar 29 • 10
AI Agents: Evolution, Architecture, and Real-World Applications Paper • 2503.12687 • Published Mar 16 • 2
OSUniverse: Benchmark for Multimodal GUI-navigation AI Agents Paper • 2505.03570 • Published May 6 • 8
J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning Paper • 2505.10320 • Published May 15 • 24
GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation Paper • 2502.01113 • Published Feb 3 • 5
From Local to Global: A Graph RAG Approach to Query-Focused Summarization Paper • 2404.16130 • Published Apr 24, 2024 • 6
Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning Paper • 2506.07044 • Published Jun 8 • 113
VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning Paper • 2506.09049 • Published Jun 10 • 36
ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation Paper • 2506.18095 • Published Jun 22 • 65
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning Paper • 2507.01006 • Published Jul 1 • 236
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning Paper • 2507.00432 • Published Jul 1 • 79
Coding Triangle: How Does Large Language Model Understand Code? Paper • 2507.06138 • Published Jul 8 • 21
KV Cache Steering for Inducing Reasoning in Small Language Models Paper • 2507.08799 • Published Jul 11 • 40
MUR: Momentum Uncertainty guided Reasoning for Large Language Models Paper • 2507.14958 • Published Jul 20 • 46
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens Paper • 2508.01191 • Published Aug 2 • 236
Agent Lightning: Train ANY AI Agents with Reinforcement Learning Paper • 2508.03680 • Published Aug 5 • 70
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published Aug 8 • 186
Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning Paper • 2508.08221 • Published Aug 11 • 47
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems Paper • 2508.07407 • Published Aug 10 • 97
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models Paper • 2508.09834 • Published Aug 13 • 53
Provable Benefits of In-Tool Learning for Large Language Models Paper • 2508.20755 • Published Aug 28 • 11
Think in Games: Learning to Reason in Games via Reinforcement Learning with Large Language Models Paper • 2508.21365 • Published Aug 29 • 27
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey Paper • 2509.02547 • Published Sep 2 • 217
LatticeWorld: A Multimodal Large Language Model-Empowered Framework for Interactive Complex World Generation Paper • 2509.05263 • Published Sep 5 • 10
Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models Paper • 2509.06949 • Published Sep 8 • 56
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning Paper • 2509.02479 • Published Sep 2 • 83
Lost in Embeddings: Information Loss in Vision-Language Models Paper • 2509.11986 • Published Sep 15 • 27
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use Paper • 2510.05592 • Published 12 days ago • 89
UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning Paper • 2510.13515 • Published 4 days ago • 11