TTRV: Test-Time Reinforcement Learning for Vision Language Models Paper • 2510.06783 • Published 19 days ago • 11
VisualOverload: Probing Visual Understanding of VLMs in Really Dense Scenes Paper • 2509.25339 • Published 28 days ago • 9
KV Cache Steering for Inducing Reasoning in Small Language Models Paper • 2507.08799 • Published Jul 11 • 40
LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content Paper • 2410.10783 • Published Oct 14, 2024 • 27
GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models Paper • 2410.06154 • Published Oct 8, 2024 • 16