Rethinking Training Dynamics in Scale-wise Autoregressive Generation • Paper • 2512.06421 • Published 23 days ago • 5
Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward • Paper • 2511.20561 • Published Nov 25 • 31
Rolling Forcing: Autoregressive Long Video Diffusion in Real Time • Paper • 2509.25161 • Published Sep 29 • 25
Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification • Paper • 2509.15591 • Published Sep 19 • 45
Locality in Image Diffusion Models Emerges from Data Statistics • Paper • 2509.09672 • Published Sep 11 • 12
T-LoRA: Single Image Diffusion Model Customization Without Overfitting • Paper • 2507.05964 • Published Jul 8 • 119
How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks • Paper • 2507.01955 • Published Jul 2 • 35
Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation • Paper • 2507.02608 • Published Jul 3 • 21
Inverse-and-Edit: Effective and Fast Image Editing by Cycle Consistency Models • Paper • 2506.19103 • Published Jun 23 • 42
Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models • Paper • 2506.19697 • Published Jun 24 • 44
Archetypal SAE: Adaptive and Stable Dictionary Learning for Concept Extraction in Large Vision Models • Paper • 2502.12892 • Published Feb 18 • 2
The Unreasonable Effectiveness of Gaussian Score Approximation for Diffusion Models and its Applications • Paper • 2412.09726 • Published Dec 12, 2024