VMoBA: Mixture-of-Block Attention for Video Diffusion Models Paper • 2506.23858 • Published Jun 30 • 31
SimpleGVR: A Simple Baseline for Latent-Cascaded Video Super-Resolution Paper • 2506.19838 • Published Jun 24 • 12
Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control Paper • 2506.01943 • Published Jun 2 • 24
RICO: Improving Accuracy and Completeness in Image Recaptioning via Visual Reconstruction Paper • 2505.22613 • Published May 28 • 7
MME-VideoOCR: Evaluating OCR-Based Capabilities of Multimodal LLMs in Video Scenarios Paper • 2505.21333 • Published May 27 • 39
Training-Free Efficient Video Generation via Dynamic Token Carving Paper • 2505.16864 • Published May 22 • 22
Any2Caption:Interpreting Any Condition to Caption for Controllable Video Generation Paper • 2503.24379 • Published Mar 31 • 77
Position: Interactive Generative Video as Next-Generation Game Engine Paper • 2503.17359 • Published Mar 21 • 62
DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers Paper • 2503.14487 • Published Mar 18 • 27
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video Paper • 2503.11647 • Published Mar 14 • 144
CineMaster: A 3D-Aware and Controllable Framework for Cinematic Text-to-Video Generation Paper • 2502.08639 • Published Feb 12 • 43
PlacidDreamer: Advancing Harmony in Text-to-3D Generation Paper • 2407.13976 • Published Jul 19, 2024 • 5
VideoTetris: Towards Compositional Text-to-Video Generation Paper • 2406.04277 • Published Jun 6, 2024 • 26