Spatial Forcing: Implicit Spatial Representation Alignment for Vision-language-action Model Paper • 2510.12276 • Published 7 days ago • 139
Hyperparameters are all you need: Using five-step inference for an original diffusion model to generate images comparable to the latest distillation model Paper • 2510.02390 • Published 21 days ago • 3
VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing Paper • 2502.17258 • Published Feb 24 • 79
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis Paper • 2410.08261 • Published Oct 10, 2024 • 52