CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence Paper • 2605.12882 • Published 7 days ago • 254
Beyond Reasoning: Reinforcement Learning Unlocks Parametric Knowledge in LLMs Paper • 2605.07153 • Published 12 days ago • 7
DiPO: Disentangled Perplexity Policy Optimization for Fine-grained Exploration-Exploitation Trade-Off Paper • 2604.13902 • Published Apr 15 • 62
GBQA: A Game Benchmark for Evaluating LLMs as Quality Assurance Engineers Paper • 2604.02648 • Published Apr 3 • 47
CarePilot: A Multi-Agent Framework for Long-Horizon Computer Task Automation in Healthcare Paper • 2603.24157 • Published Mar 25 • 10
daaxila/twitter-jiwawa1314-2026.03.25-2036706214286155974-bR-32XcIA0k9_o8D-part1 Viewer • Updated Apr 1 • 1 • 17 • 1
InCoder-32B: Code Foundation Model for Industrial Scenarios Paper • 2603.16790 • Published Mar 17 • 311
HSImul3R: Physics-in-the-Loop Reconstruction of Simulation-Ready Human-Scene Interactions Paper • 2603.15612 • Published Mar 16 • 153
Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning Paper • 2603.04597 • Published Mar 4 • 210