Abstract
SD3.5-Flash is an efficient few-step distillation framework that enhances image generation on consumer devices using rectified flow models with innovations like timestep sharing and split-timestep fine-tuning.
We present SD3.5-Flash, an efficient few-step distillation framework that brings high-quality image generation to accessible consumer devices. Our approach distills computationally prohibitive rectified flow models through a reformulated distribution matching objective tailored specifically for few-step generation. We introduce two key innovations: "timestep sharing" to reduce gradient noise and "split-timestep fine-tuning" to improve prompt alignment. Combined with comprehensive pipeline optimizations like text encoder restructuring and specialized quantization, our system enables both rapid generation and memory-efficient deployment across different hardware configurations. This democratizes access across the full spectrum of devices, from mobile phones to desktop computers. Through extensive evaluation including large-scale user studies, we demonstrate that SD3.5-Flash consistently outperforms existing few-step methods, making advanced generative AI truly accessible for practical deployment.
Community
We present SD3.5-Flash, an efficient few-step distillation framework that brings high-quality image generation to accessible consumer devices. Our approach distills computationally prohibitive rectified flow models through a reformulated distribution matching objective tailored specifically for few-step generation. We introduce two key innovations: "timestep sharing" to reduce gradient noise and "split-timestep fine-tuning" to improve prompt alignment. Combined with comprehensive pipeline optimizations like text encoder restructuring and specialized quantization, our system enables both rapid generation and memory-efficient deployment across different hardware configurations. This democratizes access across the full spectrum of devices, from mobile phones to desktop computers. Through extensive evaluation including large-scale user studies, we demonstrate that SD3.5-Flash consistently outperforms existing few-step methods, making advanced generative AI truly accessible for practical deployment.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- POSE: Phased One-Step Adversarial Equilibrium for Video Diffusion Models (2025)
- SwiftVideo: A Unified Framework for Few-Step Video Generation through Trajectory-Distribution Alignment (2025)
- Q-Sched: Pushing the Boundaries of Few-Step Diffusion Models with Quantization-Aware Scheduling (2025)
- MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows (2025)
- Video-BLADE: Block-Sparse Attention Meets Step Distillation for Efficient Video Generation (2025)
- Hyper-Bagel: A Unified Acceleration Framework for Multimodal Understanding and Generation (2025)
- RespoDiff: Dual-Module Bottleneck Transformation for Responsible&Faithful T2I Generation (2025)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any Paper on Hugging Face checkout this Space
You can directly ask Librarian Bot for paper recommendations by tagging it in a comment:
@librarian-bot
recommend
Models citing this paper 0
No model linking this paper
Datasets citing this paper 0
No dataset linking this paper
Spaces citing this paper 0
No Space linking this paper