Update README.md
README.md
CHANGED
@@ -20,9 +20,9 @@ Inspired by diffusion processes in vision models, the system gradually improves
 
 This implementation has several benefits:
 - **Noiseless convergence**: A unique feature of this implementation is its ability to converge **without intermediate noising**, although this currently works best for simple or short questions.
-- Scalable test time compute
-- Reduced inference time
-- Greatly reduced training time
+- *Scalable test time compute*: Increasing the number of iterations improves answer quality.
+- *Reduced inference time*: Most questions can be answered with fewer iterations than the number of tokens generated!
+- *Greatly reduced training time*: By finetuning an autoregressive Llama-8B model using only LoRA for diffusive generation, we trained this model within several hours on a single GPU.
 
 ## 🔧 Settings
 - **Disable Intermediate Noising**: Speeds up convergence by skipping the noising step between iterations. Works best for short, factual questions.
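
The README lines in this diff describe an iterative refinement loop with an optional noising step between iterations. The following is a minimal toy sketch of that idea, not the repository's actual code. Every name here (`toy_denoise`, `add_noise`, `generate`, `tokens_per_step`, `noise_prob`) is hypothetical, and the "model" is a stand-in that simply copies tokens from a known target so the loop structure can be run and inspected.

```python
import random

MASK = "<mask>"


def toy_denoise(tokens, target, tokens_per_step):
    """Hypothetical stand-in for the model: fill the leftmost masked positions."""
    out = list(tokens)
    filled = 0
    for i, tok in enumerate(out):
        if filled == tokens_per_step:
            break
        if tok == MASK:
            out[i] = target[i]
            filled += 1
    return out


def add_noise(tokens, rng, noise_prob):
    """Re-mask each position independently with probability noise_prob."""
    return [MASK if rng.random() < noise_prob else tok for tok in tokens]


def generate(target, num_iterations, tokens_per_step=2,
             disable_intermediate_noising=False, noise_prob=0.2, seed=0):
    """Start fully masked, then alternate denoising and (optional) noising."""
    rng = random.Random(seed)
    tokens = [MASK] * len(target)
    for step in range(num_iterations):
        tokens = toy_denoise(tokens, target, tokens_per_step)
        is_last = step == num_iterations - 1
        # Assumption: noise is never applied after the final iteration.
        if not disable_intermediate_noising and not is_last:
            tokens = add_noise(tokens, rng, noise_prob)
    return tokens


def num_correct(tokens, target):
    """Count positions that already match the target."""
    return sum(tok == ref for tok, ref in zip(tokens, target))
```

In this toy setup, an 8-token target with `tokens_per_step=2` and `disable_intermediate_noising=True` is fully recovered after 4 iterations, which is fewer than the number of tokens. With noising enabled, some positions are re-masked between iterations, so more iterations may be needed to reach the same result.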