Infinite-World: Scaling Interactive World Models to 1000-Frame Horizons via Pose-Free Hierarchical Memory Paper • 2602.02393 • Published 8 days ago • 15
TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs Paper • 2509.18056 • Published Sep 22, 2025 • 27
LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs Paper • 2506.21862 • Published Jun 27, 2025 • 36
CrossKD: Cross-Head Knowledge Distillation for Object Detection Paper • 2306.11369 • Published Jun 20, 2023
Re-Aligning Language to Visual Objects with an Agentic Workflow Paper • 2503.23508 • Published Mar 30, 2025
YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection Paper • 2308.05480 • Published Aug 10, 2023 • 2
VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning Paper • 2504.07960 • Published Apr 10, 2025 • 50