MARC: Memory-Augmented RL Token Compression for Efficient Video Understanding Paper • 2510.07915 • Published 11 days ago • 1
UGC-VideoCaptioner: An Omni UGC Video Detail Caption Model and New Benchmarks Paper • 2507.11336 • Published Jul 15 • 5 • 1
ST-Think: How Multimodal Large Language Models Reason About 4D Worlds from Ego-Centric Videos Paper • 2503.12542 • Published Mar 16 • 1