
LLaDA-V

We introduce LLaDA-V, a competitive diffusion-based vision-language model that outperforms other diffusion-based multimodal large language models (MLLMs).

It was presented in the paper LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning.

Project Page: https://ml-gsai.github.io/LLaDA-V-demo/

Code: https://github.com/ML-GSAI/LLaDA-V


Weights format: Safetensors

Model size: 8B params

Tensor type: F16
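As a rough guide to what these figures imply, the sketch below estimates the memory needed for the weights alone. It assumes "8B" means exactly 8e9 parameters (the listed figure is rounded) and that every tensor is stored as F16 at 2 bytes per value. The helper name `weight_bytes` is hypothetical and not part of the LLaDA-V code, and the estimate ignores activations and any other runtime overhead.

```python
# Back-of-the-envelope estimate of weight memory for an F16 checkpoint.
# Assumption: "8B params" is read as exactly 8e9; the true count may differ.
BYTES_PER_F16 = 2  # a half-precision float occupies 16 bits


def weight_bytes(num_params: int, bytes_per_param: int = BYTES_PER_F16) -> int:
    """Return the number of bytes needed to hold the weights alone."""
    return num_params * bytes_per_param


approx_params = 8_000_000_000  # assumption: "8B" taken as 8e9
total = weight_bytes(approx_params)

# Report in decimal gigabytes and binary gibibytes.
print(f"{total / 1e9:.1f} GB ({total / 2**30:.1f} GiB)")
```

Under these assumptions the weights come to about 16 GB (roughly 14.9 GiB), so that is a lower bound on the memory needed to load the model in F16.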


Paper: LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning (arXiv:2505.16933, published May 22, 2025)