Scaling Embeddings Outperforms Scaling Experts in Language Models Paper • 2601.21204 • Published 2 days ago • 76
Ground Slow, Move Fast: A Dual-System Foundation Model for Generalizable Vision-and-Language Navigation Paper • 2512.08186 • Published Dec 9, 2025 • 22
TTS-VAR: A Test-Time Scaling Framework for Visual Auto-Regressive Generation Paper • 2507.18537 • Published Jul 24, 2025 • 18
OpenSubject: Leveraging Video-Derived Identity and Diversity Priors for Subject-driven Image Generation and Manipulation Paper • 2512.08294 • Published Dec 9, 2025 • 18
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance Paper • 2512.08765 • Published Dec 9, 2025 • 132
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance Paper • 2512.08765 • Published Dec 9, 2025 • 132
EditThinker: Unlocking Iterative Reasoning for Any Image Editor Paper • 2512.05965 • Published Dec 5, 2025 • 38
OmniX: From Unified Panoramic Generation and Perception to Graphics-Ready 3D Scenes Paper • 2510.26800 • Published Oct 30, 2025 • 22
Routing Matters in MoE: Scaling Diffusion Transformers with Explicit Routing Guidance Paper • 2510.24711 • Published Oct 28, 2025 • 20
Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views Paper • 2510.18632 • Published Oct 21, 2025 • 22
OmniRetarget: Interaction-Preserving Data Generation for Humanoid Whole-Body Loco-Manipulation and Scene Interaction Paper • 2509.26633 • Published Sep 30, 2025 • 6
Paper2Video: Automatic Video Generation from Scientific Papers Paper • 2510.05096 • Published Oct 6, 2025 • 119