Less is More: Recursive Reasoning with Tiny Networks • Paper • 2510.04871 • Published 14 days ago • 406
When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs • Paper • 2510.07499 • Published 11 days ago • 44
Improving Context Fidelity via Native Retrieval-Augmented Reasoning • Paper • 2509.13683 • Published Sep 17 • 8
Multimodal Iterative RAG for Knowledge-Intensive Visual Question Answering • Paper • 2509.00798 • Published Aug 31
Retrieval Feedback Memory Enhancement Large Model Retrieval Generation Method • Paper • 2508.17862 • Published Aug 25
Improving Factuality in LLMs via Inference-Time Knowledge Graph Construction • Paper • 2509.03540 • Published Aug 31
Transforming Questions and Documents for Semantically Aligned Retrieval-Augmented Generation • Paper • 2508.09755 • Published Aug 13
MIRAGE: Scaling Test-Time Inference with Parallel Graph-Retrieval-Augmented Reasoning Chains • Paper • 2508.18260 • Published Aug 25
From Ranking to Selection: A Simple but Efficient Dynamic Passage Selector for Retrieval Augmented Generation • Paper • 2508.09497 • Published Aug 13
MemMamba: Rethinking Memory Patterns in State Space Model • Paper • 2510.03279 • Published 22 days ago • 68
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters • Paper • 2408.03314 • Published Aug 6, 2024 • 63
TAG: A Decentralized Framework for Multi-Agent Hierarchical Reinforcement Learning • Paper • 2502.15425 • Published Feb 21 • 9
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning • Paper • 2406.06469 • Published Jun 10, 2024 • 29
Cognitive Kernel: An Open-source Agent System towards Generalist Autopilots • Paper • 2409.10277 • Published Sep 16, 2024 • 1
Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs • Paper • 2504.17432 • Published Apr 24 • 39
Beyond the Exploration-Exploitation Trade-off: A Hidden State Approach for LLM Reasoning in RLVR • Paper • 2509.23808 • Published 22 days ago • 47
Reactive Transformer (RxT) -- Stateful Real-Time Processing for Event-Driven Reactive Language Models • Paper • 2510.03561 • Published 16 days ago • 23
CodeContests+: High-Quality Test Case Generation for Competitive Programming • Paper • 2506.05817 • Published Jun 6 • 9
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens • Paper • 2508.01191 • Published Aug 2 • 236
Revisiting Chain-of-Thought Prompting: Zero-shot Can Be Stronger than Few-shot • Paper • 2506.14641 • Published Jun 17
The Synergy Dilemma of Long-CoT SFT and RL: Investigating Post-Training Techniques for Reasoning VLMs • Paper • 2507.07562 • Published Jul 10 • 1
Chain-of-Thought Prompting Obscures Hallucination Cues in Large Language Models: An Empirical Evaluation • Paper • 2506.17088 • Published Jun 20
DAPO: An Open-Source LLM Reinforcement Learning System at Scale • Paper • 2503.14476 • Published Mar 18 • 141
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? • Paper • 2504.13837 • Published Apr 18 • 135
Distilling LLM Agent into Small Models with Retrieval and Code Tools • Paper • 2505.17612 • Published May 23 • 81
Absolute Zero: Reinforced Self-play Reasoning with Zero Data • Paper • 2505.03335 • Published May 6 • 185
A Survey of Context Engineering for Large Language Models • Paper • 2507.13334 • Published Jul 17 • 257
Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning • Paper • 2510.04081 • Published 15 days ago • 19