Update README.md
README.md
CHANGED
@@ -1,14 +1,55 @@
 ---
-title: Tini
+title: Tini-Lad
 emoji: ⚡
 colorFrom: pink
 colorTo: red
 sdk: gradio
-sdk_version: 5.
+sdk_version: 5.33.0
 app_file: app.py
 pinned: false
 license: other
 short_description: DLM
 ---
 
-
+# 💬 Diffusion Language Model Demo
+
+Note: Paper coming out soon; if anyone is interested in discussing the model, please contact me.
+
+This is an interactive demo of a **diffusion-style language model**, which generates text through iterative refinement.
+Inspired by diffusion processes in vision models, the system gradually improves a corrupted text sequence until convergence.
+
+This implementation has several benefits:
+- **Noiseless convergence**: A unique feature of this implementation is its ability to converge **without intermediate noising**, although this currently works best for simple or short questions.
+- **Scalable test-time compute**: Increasing the number of iterations improves answer quality.
+- **Reduced inference time**: Most questions can be answered in fewer iterations than the number of tokens generated!
+- **Greatly reduced training time**: By fine-tuning an autoregressive Llama-8B model for diffusive generation using only LoRA, we trained this model within several hours on a single GPU.
+
+## 🔧 Settings
+- **Disable Intermediate Noising**: Speeds up convergence by skipping the noising step between iterations. Works best for short, factual questions.
+- **Iterations**: Number of refinement steps. More iterations means more time to refine the answer.
+- **Pause Between Steps**: Slows down the process so you can visually follow the changes.
+
+## 🖍️ Visualization
+- **Red tokens**: Masked (noised) tokens that will be regenerated.
+- **Green tokens**: Newly generated tokens compared to the previous step.
+
+## 🧪 Example Prompt
+For noiseless diffusion, try short questions like:
+> What's the capital of France?
+
+For more in-depth questions, enable intermediate noising. Increasing the number of iterations generally improves answer quality.
+> What do you know about Amsterdam?
+
+See how low you can go with the number of iterations while still receiving adequate answers!
+
+---
+
+More technical details (architecture, training, and evaluation) can be found in the accompanying blog post:
+📘 [Read the blog post here](https://example.com/diffusion-language-model-blog)
+
+For a more tweakable version that exposes all inference parameters, check out this Space:
+🎛️ [Explore the model here](https://huggingface.co/spaces/Ruurd/tini)
+
+Paper coming out soon! If you already want to cite this model, please refer to the blog post.
+
+
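
The README added above describes iterative refinement, the optional intermediate noising step, and the red/green token display, but not how they fit together. The sketch below is one possible reading, not the Space's actual code: `MASK`, `refine`, `highlight`, `toy_denoise`, `TARGET`, and the linear re-masking schedule are all assumptions made for illustration, and the toy denoiser stands in for the fine-tuned model.

```python
import random
from typing import Callable, List, Sequence

MASK = "<mask>"  # hypothetical mask token, chosen for this sketch

# Hypothetical stand-in for the fine-tuned model: takes a (partly masked)
# token sequence and returns a full-length prediction.
Denoiser = Callable[[Sequence[str]], List[str]]


def refine(denoise: Denoiser, length: int, iterations: int,
           intermediate_noising: bool = True, seed: int = 0) -> List[List[str]]:
    """Return the model's prediction after every refinement step."""
    rng = random.Random(seed)
    current = [MASK] * length            # start from a fully corrupted sequence
    history: List[List[str]] = []
    for step in range(iterations):
        predicted = denoise(current)     # re-predict every position
        if not intermediate_noising and predicted == current:
            break                        # noiseless mode: stop once nothing changes
        history.append(predicted)
        if intermediate_noising:
            # Assumed linear schedule: re-mask fewer tokens at each step.
            n_mask = round(length * (1 - (step + 1) / iterations))
            masked = set(rng.sample(range(length), n_mask))
            current = [MASK if i in masked else tok
                       for i, tok in enumerate(predicted)]
        else:
            current = predicted
    return history


def highlight(previous: Sequence[str], shown: Sequence[str]) -> List[str]:
    """Label tokens: 'red' if masked, 'green' if changed, otherwise 'plain'."""
    labels = []
    for old, new in zip(previous, shown):
        if new == MASK:
            labels.append("red")
        elif new != old:
            labels.append("green")
        else:
            labels.append("plain")
    return labels


TARGET = "The capital of France is Paris".split()


def toy_denoise(tokens: Sequence[str]) -> List[str]:
    """Toy model: fixes the first two positions that are still wrong."""
    out = list(tokens)
    wrong = [i for i, tok in enumerate(tokens) if tok != TARGET[i]]
    for i in wrong[:2]:
        out[i] = TARGET[i]
    return out


if __name__ == "__main__":
    steps = refine(toy_denoise, len(TARGET), iterations=10,
                   intermediate_noising=False)
    previous = [MASK] * len(TARGET)
    for tokens in steps:
        print(list(zip(tokens, highlight(previous, tokens))))
        previous = tokens
```

With the toy denoiser, the noiseless run stops as soon as a step changes nothing, which is how it can finish in fewer iterations than there are tokens. With noising enabled, every requested iteration runs and the final step re-masks nothing under the assumed schedule.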