
Commit 77f8336 (verified) by Ruurd · Parent: b92c569

Update README.md

Files changed (1): README.md (+3 -1)
README.md CHANGED
```diff
@@ -22,7 +22,7 @@ This implementation has several benefits:
 - **Noiseless convergence**: A unique feature of this implementation is its ability to convergence **without intermediate noising**.
 - *Scalable test time compute*: By increasing the number of iterations, the answer quality improves.
 - *Reduced inference time*: Most questions can be answered with less iterations then the number of tokens generated!
-- *Greatly reduced training time*: By finetuning an autoregressive Llama-8B model using only LoRA for diffusive generation, we trained this model within several hours on a single GPU.
+- *Greatly reduced training time*: By LoRA-based finetuning of an autoregressive model, this model can be trained within several hours on a single GPU.
 
 ---
 
@@ -51,9 +51,11 @@ See how low you can go with the number of iterations while still receiving adequ
 ---
 
 More technical details (architecture, training, and evaluation) can be found in the accompanying blog post:
+
 📘 [Read the blog post here](https://example.com/diffusion-language-model-blog)
 
 For a more tweakable version that includes all inference parameters, check out this version:
+
 🎛️ [Explore the model here](https://huggingface.co/spaces/Ruurd/tini)
 
 Paper coming out soon! If you already want to cite this model, please refer to the blogpost
```