Internalizing Meta-Experience into Memory for Guided Reinforcement Learning in Large Language Models Paper • 2602.10224 • Published 21 days ago • 19
LOCA-bench: Benchmarking Language Agents Under Controllable and Extreme Context Growth Paper • 2602.07962 • Published 23 days ago • 24
OdysseyArena: Benchmarking Large Language Models For Long-Horizon, Active and Inductive Interactions Paper • 2602.05843 • Published 26 days ago • 59
HER: Human-like Reasoning and Reinforcement Learning for LLM Role-playing Paper • 2601.21459 • Published Jan 29 • 9
SWE-Master: Unleashing the Potential of Software Engineering Agents via Post-Training Paper • 2602.03411 • Published 28 days ago • 37
SWE-World: Building Software Engineering Agents in Docker-Free Environments Paper • 2602.03419 • Published 28 days ago • 40
Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models Paper • 2601.22060 • Published Jan 29 • 158
Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models Paper • 2602.02185 • Published 29 days ago • 125
MMDeepResearch-Bench: A Benchmark for Multimodal Deep Research Agents Paper • 2601.12346 • Published Jan 18 • 49
The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning Paper • 2601.06002 • Published Jan 9 • 56
UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision Paper • 2601.03193 • Published Jan 6 • 47
UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision Paper • 2601.03193 • Published Jan 6 • 47
OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification Paper • 2512.10756 • Published Dec 11, 2025 • 35
Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving Paper • 2512.10739 • Published Dec 11, 2025 • 47
Nex-N1: Agentic Models Trained via a Unified Ecosystem for Large-Scale Environment Construction Paper • 2512.04987 • Published Dec 4, 2025 • 80
T-Eval: Evaluating the Tool Utilization Capability Step by Step Paper • 2312.14033 • Published Dec 21, 2023 • 2
LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding Paper • 2312.14074 • Published Dec 21, 2023
Point-DETR3D: Leveraging Imagery Data with Spatial Point Prior for Weakly Semi-supervised 3D Object Detection Paper • 2403.15317 • Published Mar 22, 2024 • 1
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition Paper • 2403.17695 • Published Mar 26, 2024 • 1