
Ruurd committed · verified · commit b92c569 · 1 parent: a2ec89b

Update README.md

Files changed (1): README.md (+7, -1)
README.md CHANGED
@@ -19,20 +19,26 @@ This is an interactive demo of a **diffusion-style language model**, which gener
 Inspired by diffusion processes in vision models, the system gradually improves a corrupted text sequence until convergence.
 
 This implementation has several benefits:
-- **Noiseless convergence**: A unique feature of this implementation is its ability to converge **without intermediate noising**, although this currently works best for simple or short questions.
+- **Noiseless convergence**: A unique feature of this implementation is its ability to converge **without intermediate noising**.
 - *Scalable test-time compute*: Increasing the number of iterations improves answer quality.
 - *Reduced inference time*: Most questions can be answered in fewer iterations than the number of tokens generated!
 - *Greatly reduced training time*: By fine-tuning an autoregressive Llama-8B model using only LoRA for diffusive generation, we trained this model within several hours on a single GPU.
 
+---
+
 ## 🔧 Settings
 - **Disable Intermediate Noising**: Speeds up convergence by skipping the noising step between iterations. Works best for short, factual questions.
 - **Iterations**: Number of refinement steps. More iterations mean more time to refine the answer.
 - **Pause Between Steps**: Slows down the process so you can visually follow the changes.
 
+---
+
 ## 🖍️ Visualization
 - **Red tokens**: Masked (noised) tokens that will be regenerated.
 - **Green tokens**: Newly generated tokens compared to the previous step.
 
+---
+
 ## 🧪 Example Prompt
 For noiseless diffusion, try short questions like:
 > What's the capital of France?
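
The loop the README describes — start from masked tokens, re-predict until the sequence stops changing, optionally re-masking a fraction between steps — can be sketched as below. This is a minimal illustration under stated assumptions, not the Space's actual implementation: `model_fill`, `MASK`, and `noise_frac` are hypothetical names, and a real model would operate on token IDs rather than strings.

```python
import random

MASK = "<mask>"  # hypothetical mask token; real models use a tokenizer-specific ID

def refine(model_fill, prompt, seq_len=16, iterations=8, noise=True, noise_frac=0.3):
    """Iteratively refine a fully masked sequence until convergence.

    model_fill(prompt, seq) -> seq: predicts a token for every masked position
    (a stand-in for the diffusion-style LM described in the README).
    """
    seq = [MASK] * seq_len
    prev = list(seq)
    for step in range(iterations):
        seq = model_fill(prompt, seq)       # re-predict the sequence
        if seq == prev:                     # converged: nothing changed this step
            break
        prev = list(seq)
        # Intermediate noising: re-mask a random fraction of positions so the
        # next step can revise them. Disabling this ("noiseless convergence")
        # skips straight to the next prediction pass.
        if noise and step < iterations - 1:
            for i in random.sample(range(seq_len), int(noise_frac * seq_len)):
                seq[i] = MASK
    return seq
```

With noising disabled, a model that fills every mask identically converges in two passes (one to fill, one to confirm), which is why fewer iterations than generated tokens can suffice.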