HippoCamp: Benchmarking Contextual Agents on Personal Computers Paper • 2604.01221 • Published 15 days ago • 29
PerceptionComp: A Video Benchmark for Complex Perception-Centric Reasoning Paper • 2603.26653 • Published 20 days ago • 18
Supervised Fine-Tuning versus Reinforcement Learning: A Study of Post-Training Methods for Large Language Models Paper • 2603.13985 • Published Mar 14 • 10