---
library_name: diffusers
pipeline_tag: text-to-image
---

# Phased Consistency Model

LoRA weights for Stable Diffusion XL that enable fast, few-step text-to-image generation.

## Important Usage Guidance

Use the DDIM or Euler scheduler for sampling, not the LCM scheduler. When using DDIM, set `timestep_spacing="trailing"`.
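The scheduler choice above could be applied to a loaded pipeline roughly as follows. This is a minimal sketch, not the authors' script: `scheduler_settings` and `apply_scheduler` are hypothetical helper names, and `pipe` is assumed to be an already loaded diffusers Stable Diffusion XL pipeline. `DDIMScheduler`, `EulerDiscreteScheduler`, and `from_config` are existing diffusers APIs.

```python
# Samplers recommended above, mapped to diffusers scheduler class names
# and the config overrides each one needs. LCM is deliberately absent.
_SAMPLERS = {
    "ddim": ("DDIMScheduler", {"timestep_spacing": "trailing"}),
    "euler": ("EulerDiscreteScheduler", {}),
}


def scheduler_settings(name):
    """Return (scheduler class name, config overrides) for a sampler name."""
    key = name.lower()
    if key not in _SAMPLERS:
        raise ValueError(
            f"unsupported sampler {name!r}; use one of {sorted(_SAMPLERS)}"
        )
    class_name, overrides = _SAMPLERS[key]
    return class_name, dict(overrides)  # copy so callers cannot mutate the table


def apply_scheduler(pipe, name="ddim"):
    """Swap the scheduler of an existing diffusers pipeline (assumed: SDXL)."""
    # Imported lazily so the helper above works without diffusers installed.
    import diffusers

    class_name, overrides = scheduler_settings(name)
    scheduler_cls = getattr(diffusers, class_name)
    # Reuse the pipeline's current scheduler config and override only what differs.
    pipe.scheduler = scheduler_cls.from_config(pipe.scheduler.config, **overrides)
    return pipe
```

Calling `apply_scheduler(pipe, "ddim")` before generation replaces whatever scheduler the pipeline shipped with, so the `timestep_spacing="trailing"` setting is not silently lost.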

The name of each LoRA weight file indicates how many inference steps it should be used with.

The name also indicates whether the LoRA is meant for normal or small CFG (classifier-free guidance) values.
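To read both pieces of information from a file name programmatically, something like the sketch below may help. The exact naming pattern is an assumption: the helper `parse_lora_name` is hypothetical and expects the name to contain a `normalcfg` or `smallcfg` token and a step count written like `4step`. Check the actual file names in the repository before relying on it.

```python
import re


def parse_lora_name(name):
    """Extract (cfg_variant, steps) from a LoRA file name.

    Assumed naming pattern (not confirmed by the model card): the name
    contains "normalcfg" or "smallcfg" and a step count such as "4step".
    Either element is returned as None when it cannot be found.
    """
    lowered = name.lower()

    if "normalcfg" in lowered:
        variant = "normalcfg"
    elif "smallcfg" in lowered:
        variant = "smallcfg"
    else:
        variant = None

    # Accept "4step", "4-step", "4_step" and "4 step".
    match = re.search(r"(\d+)[-_ ]?step", lowered)
    steps = int(match.group(1)) if match else None

    return variant, steps


# Hypothetical file name, used only to illustrate the assumed pattern.
variant, steps = parse_lora_name("example_normalcfg_8step.safetensors")
```

With the hypothetical name above, `variant` is `"normalcfg"` and `steps` is `8`. The step count is what you would pass as `num_inference_steps` when calling the pipeline.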

NormalCFG means that a model equipped with the LoRA can use CFG values of 2-9 for generation. However, you should adjust the CFG value to the number of steps you use.

With fewer steps, use smaller CFG values. For example, use CFG 2.5-3.5 with 4 steps and CFG 3-6 with 8 steps. This is because with fewer steps the model has fewer chances to correct the artifacts introduced by CFG.

SmallCFG means that a model equipped with the LoRA can use CFG values of 1-2 for generation.
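The CFG ranges above can be collected into a small lookup. This is only a sketch: `suggested_cfg_range` is a hypothetical helper, and for NormalCFG step counts other than 4 and 8 it falls back to the general 2-9 range, because no narrower range is given for them.

```python
def suggested_cfg_range(variant, steps):
    """Return the (low, high) CFG range suggested for a LoRA variant and step count."""
    key = variant.lower()

    if key == "smallcfg":
        # SmallCFG LoRAs: CFG 1-2, with no step-specific advice given.
        return (1.0, 2.0)

    if key == "normalcfg":
        # Step-specific recommendations; fewer steps call for smaller CFG values.
        per_step = {
            4: (2.5, 3.5),
            8: (3.0, 6.0),
        }
        # Fallback (assumption): the general NormalCFG range of 2-9.
        return per_step.get(steps, (2.0, 9.0))

    raise ValueError(f"unknown LoRA variant {variant!r}")


low, high = suggested_cfg_range("normalcfg", 8)
```

A value picked from this range is what you would pass as `guidance_scale` to a diffusers pipeline call, together with the matching `num_inference_steps`.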

## About the performance of NormalCFG LoRAs

Note: We have found that the 4-step NormalCFG LoRA does not work well. We are working on a fix.

[[paper](https://huggingface.co/papers/2405.18407)] [[arXiv](https://arxiv.org/abs/2405.18407)] [[code](https://github.com/G-U-N/Phased-Consistency-Model)] [[project page](https://g-u-n.github.io/projects/pcm)]