Update README.md
README.md
@@ -22,7 +22,7 @@ This implementation has several benefits:
  - **Noiseless convergence**: A unique feature of this implementation is its ability to converge **without intermediate noising**.
  - *Scalable test-time compute*: By increasing the number of iterations, the answer quality improves.
  - *Reduced inference time*: Most questions can be answered with fewer iterations than the number of tokens generated!
- - *Greatly reduced training time*: By finetuning an autoregressive
+ - *Greatly reduced training time*: By LoRA-based finetuning of an autoregressive model, this model can be trained within several hours on a single GPU.

  ---

@@ -51,9 +51,11 @@ See how low you can go with the number of iterations while still receiving adequate
  ---

  More technical details (architecture, training, and evaluation) can be found in the accompanying blog post:
+
  📘 [Read the blog post here](https://example.com/diffusion-language-model-blog)

  For a more tweakable version that includes all inference parameters, check out this version:
+
  🎛️ [Explore the model here](https://huggingface.co/spaces/Ruurd/tini)

  Paper coming out soon! If you already want to cite this model, please refer to the blog post.
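The convergence and test-time-compute bullets describe noiseless iterative refinement, but this README excerpt does not show the model's decoding interface. The sketch below is illustrative only, assuming a model whose forward pass predicts a token for every answer position directly; the mask-token initialization, `answer_len`, and convergence check are assumptions, not this model's actual API.

```python
# Illustrative sketch only: the real decoding API is not documented in this
# README, so the mask-token initialization and convergence check below are
# assumptions used to show the idea of noiseless iterative refinement.
import torch

def iterative_decode(model, prompt_ids, answer_len, mask_id, max_iters=16):
    # Initialize the answer region with mask tokens rather than sampled noise.
    answer = torch.full((1, answer_len), mask_id, dtype=torch.long)
    seq = torch.cat([prompt_ids, answer], dim=1)
    for _ in range(max_iters):
        logits = model(seq).logits                     # one full forward pass
        proposal = logits[:, -answer_len:].argmax(-1)  # re-predict all answer tokens at once
        if torch.equal(proposal, seq[:, -answer_len:]):
            break                                      # converged: nothing changed, stop early
        seq[:, -answer_len:] = proposal
    return seq[:, -answer_len:]
```

Under these assumptions, raising `max_iters` buys more refinement passes (the scalable test-time-compute claim), while the early exit is what lets many answers finish in fewer iterations than generated tokens.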
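The updated training-time bullet attributes the fast training to LoRA-based finetuning of an autoregressive model. As a rough sketch of what such a setup typically looks like with Hugging Face PEFT (the base checkpoint, adapter rank, and target modules below are placeholders, not taken from this repository):

```python
# Minimal LoRA finetuning sketch (illustrative only): the base checkpoint,
# adapter rank, and target modules are placeholders, not from this repo.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "Qwen/Qwen2.5-0.5B"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)

# Attach low-rank adapters to the attention projections; only these
# adapter weights are trained, which is what keeps training cheap.
lora_cfg = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # typically well under 1% of all weights

# ...train with your usual Trainer / training loop on a single GPU...
```

Because only the low-rank adapter weights receive gradients, the trainable parameter count and optimizer state stay small, which is what makes single-GPU runs on the order of hours plausible.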