UniVid: Unifying Vision Tasks with Pre-trained Video Generation Models • Paper • 2509.21760 • Published 23 days ago • 14
SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning • Paper • 2509.09674 • Published Sep 11 • 77
EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs • Paper • 2509.09174 • Published Sep 11 • 57
HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning • Paper • 2509.08519 • Published Sep 10 • 125
Pangu Pro MoE: Mixture of Grouped Experts for Efficient Sparsity • Paper • 2505.21411 • Published May 27 • 17
ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning • Paper • 2506.09513 • Published Jun 11 • 98