CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark • Paper • 2505.16968 • Published May 22 • 41
Time Blindness: Why Video-Language Models Can't See What Humans Can? • Paper • 2505.24867 • Published May 30 • 80
SVRPBench: A Realistic Benchmark for Stochastic Vehicle Routing Problem • Paper • 2505.21887 • Published May 28 • 14
KITAB-Bench: A Comprehensive Multi-Domain Benchmark for Arabic OCR and Document Understanding • Paper • 2502.14949 • Published Feb 20 • 9