# Guidance-Free Training (GFT): Visual Generation Without Guidance

This repository hosts models related to the paper *Visual Generation Without Guidance*.

GFT is a novel approach that lets visual generative models operate without Classifier-Free Guidance (CFG). Because CFG requires both a conditional and an unconditional forward pass at every sampling step, removing it effectively halves the computational cost of sampling while maintaining comparable sample quality. The method is universal: it applies to diffusion, autoregressive, and masked-prediction models, and it can be trained from scratch or fine-tuned from existing models with minimal modifications to existing codebases.

Qualitative text-to-image comparison of vanilla conditional generation, GFT, and CFG on Stable Diffusion 1.5, using the prompt "Elegant crystal vase holding pink peonies, soft raindrops tracing paths down the window behind it".

For more details, including training code and example usage for different base models (like Stable Diffusion 1.5 and DiT), please refer to the official GitHub repository.
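
To make the sampling-cost difference concrete, below is a minimal sketch (not taken from the official codebase) that contrasts CFG sampling with guidance-free sampling using the diffusers library. The GFT checkpoint path is a hypothetical placeholder; substitute the actual fine-tuned weights.

```python
import torch
from diffusers import StableDiffusionPipeline

prompt = ("Elegant crystal vase holding pink peonies, "
          "soft raindrops tracing paths down the window behind it")

# Baseline: CFG runs two forward passes per denoising step
# (conditional + unconditional) and mixes them via guidance_scale.
cfg_pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
cfg_image = cfg_pipe(prompt, guidance_scale=7.5).images[0]

# GFT: guidance is baked into the fine-tuned model's conditional prediction,
# so CFG can be disabled (guidance_scale=1.0 in diffusers) and each step
# costs a single forward pass.
gft_pipe = StableDiffusionPipeline.from_pretrained(
    "path/to/gft-finetuned-sd15",  # hypothetical placeholder path
    torch_dtype=torch.float16,
).to("cuda")
gft_image = gft_pipe(prompt, guidance_scale=1.0).images[0]
```

In diffusers, a guidance_scale of 1.0 or below skips the unconditional forward pass entirely, which is what yields the roughly 2x sampling speedup.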

If you find this work helpful, please consider citing the original paper:

```bibtex
@article{chen2025visual,
  title={Visual Generation Without Guidance},
  author={Chen, Huayu and Jiang, Kai and Zheng, Kaiwen and Chen, Jianfei and Su, Hang and Zhu, Jun},
  journal={arXiv preprint arXiv:2501.15420},
  year={2025}
}
```