Ruurd Kuiper
PRO
Ruurd
4 followers · 7 following
AI & ML interests
None yet
Recent Activity
reacted to their post with 🔥 about 5 hours ago
Over the past year I have been trying to get diffusion models to work for language generation without having to retrain an LLM from scratch. Recently, we finally succeeded: we introduce "LAD: LoRA-Adapted Denoiser", a method to convert a LLaMA model into a text diffusion model using LoRA finetuning and structured input corruption.

Try the demo and read the write-up here: https://ruurdkuiper.github.io/tini-lad/

Unlike autoregressive (word-by-word) models like ChatGPT, diffusion models iteratively refine a noised sequence. However, most current diffusion approaches rely on all-parameter retraining and repeatedly remasking tokens, which is costly and slow during both training and inference.

🧠 With LAD:
- We can finetune an autoregressive model for diffusive generation in just 10 hours on a single GPU.
- Test-time compute is fully adjustable: fewer steps mean faster outputs, while more steps improve output quality.
- Due to our unique noising schedule, remasking is not always needed during inference. All tokens are attended to in each iteration!

LAD is built using:
- A frozen LLaMA-8B backbone
- Structured noising: token swaps, duplications, replacements, span shifts
- Modified attention masks for bidirectional decoding

💡 We show that even small, fast-trained models can perform diffusive generation, with competitive benchmark performance and perplexity, and more flexible test-time behavior than traditional transformers.
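To make the "structured noising" in the post more concrete, here is a minimal sketch of the four corruption types it names (token swaps, duplications, replacements, span shifts) applied to a list of tokens. This is an illustration only, not the LAD implementation: the function name `corrupt`, the `noise_level` parameter, the uniform choice between operations, and the span length of 1 to 3 tokens are all assumptions made for the example. The write-up linked in the post is the place to look for the actual noising schedule.

```python
import random


def corrupt(tokens, vocab, noise_level, rng):
    """Return a noised copy of `tokens` (hypothetical helper, not from LAD).

    Applies about noise_level * len(tokens) random edits, each one a swap,
    duplication, replacement, or span shift. The sequence length is kept
    fixed so the output lines up position by position with the input.
    """
    out = list(tokens)
    n_ops = round(noise_level * len(out))
    for _ in range(n_ops):
        if len(out) < 2:
            break
        op = rng.choice(["swap", "duplicate", "replace", "shift"])
        i = rng.randrange(len(out) - 1)
        if op == "swap":
            # Exchange two neighbouring tokens.
            out[i], out[i + 1] = out[i + 1], out[i]
        elif op == "duplicate":
            # Copy token i over its right neighbour.
            out[i + 1] = out[i]
        elif op == "replace":
            # Substitute a random vocabulary token.
            out[i] = rng.choice(vocab)
        else:
            # Span shift: cut a short span and reinsert it elsewhere.
            j = min(len(out), i + rng.randint(1, 3))
            span = out[i:j]
            del out[i:j]
            k = rng.randrange(len(out) + 1)
            out[k:k] = span
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    sentence = "the cat sat on the mat".split()
    toy_vocab = ["dog", "ran", "under"]  # assumed toy vocabulary
    for level in (0.0, 0.25, 0.5, 1.0):
        print(level, corrupt(sentence, toy_vocab, level, rng))
```

A denoiser trained on pairs of (corrupted, clean) sequences at varying `noise_level` values would then be asked to undo these edits over several refinement steps, which is one way to read the post's claim that every token can be attended to and revised in each iteration rather than being masked.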
upvoted an article about 5 hours ago
LAD: LoRA-Adapted Denoiser
published an article about 20 hours ago
LAD: LoRA-Adapted Denoiser
Organizations
None yet
Ruurd's activity
liked a Space about 23 hours ago
Tini-Lad ⚡ DLM
Running on Zero
liked a model 1 day ago
meta-llama/Llama-3.1-8B-Instruct
Text Generation • Updated Sep 25, 2024 • 5.44M • 4.07k
liked a Space about 2 months ago
Tini ⚡ DLM
Running on Zero