# WeDLM-7B
WeDLM-7B is a diffusion language model, initialized from Qwen2.5-7B, that performs parallel decoding under standard causal attention.
This is the base (pretrained) version. For the instruction-tuned version, see WeDLM-7B-Instruct.
Paper (Coming Soon) | Project Page | GitHub
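To make the parallel-decoding idea concrete, here is a toy sketch of iterative parallel unmasking. This is an illustration only, not WeDLM's actual algorithm: the function name, the dummy logits, and the confidence heuristic are all our own assumptions for exposition.

```python
# Toy illustration (NOT the paper's algorithm): diffusion-style decoding
# fills several masked positions per step instead of one token at a time.
# WeDLM's contribution is making this work under a standard causal mask;
# this sketch only conveys the iterative parallel-unmasking idea.
import torch

def parallel_unmask_step(logits, is_masked, k=4):
    """Fill the k masked positions the model is most confident about."""
    probs = torch.softmax(logits, dim=-1)      # (seq_len, vocab)
    conf, tokens = probs.max(dim=-1)           # per-position confidence
    # Only masked positions are candidates for unmasking this step.
    conf = torch.where(is_masked, conf, torch.full_like(conf, -1.0))
    pick = conf.topk(min(k, int(is_masked.sum()))).indices
    return pick, tokens[pick]

# Dummy example: 8 positions, 100-token vocab, everything still masked.
logits = torch.randn(8, 100)
is_masked = torch.ones(8, dtype=torch.bool)
positions, new_tokens = parallel_unmask_step(logits, is_masked)
print(positions, new_tokens)  # up to 4 positions filled in one step
```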
| Attribute | Value |
|---|---|
| Initialized From | Qwen2.5-7B |
| Parameters | 7B |
| Context Length | 32,768 |
For fast inference, use the `wedlm` engine:

```bash
pip install git+https://github.com/tencent/WeDLM.git
```

```python
from wedlm import LLM, SamplingParams

llm = LLM(model="tencent/WeDLM-7B")

prompt = "The theory of relativity states that"
outputs = llm.generate([prompt], SamplingParams(temperature=0.2, max_tokens=256))
print(outputs[0]["text"])
```
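Since `generate` already takes a list, the same interface extends naturally to batches of prompts. A small sketch reusing only the calls shown above (the `prompts` list and loop are our own illustration):

```python
from wedlm import LLM, SamplingParams

llm = LLM(model="tencent/WeDLM-7B")

# Batch several prompts in one call; the sampling settings apply to all.
prompts = [
    "The theory of relativity states that",
    "In number theory, a prime is",
]
params = SamplingParams(temperature=0.2, max_tokens=128)
for out in llm.generate(prompts, params):
    print(out["text"])
```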
For training or simple forward passes, you can load via Transformers:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("tencent/WeDLM-7B", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "tencent/WeDLM-7B",
    trust_remote_code=True,
    torch_dtype="auto",
    device_map="auto",
)

inputs = tokenizer("The theory of relativity", return_tensors="pt").to(model.device)
outputs = model(**inputs)
```
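If the remote-code model returns the usual Transformers causal-LM output with a `logits` field (an assumption worth verifying in your environment), you can inspect the next-token distribution from the forward pass above:

```python
import torch

# outputs.logits has shape (batch, seq_len, vocab); the last position
# holds the model's distribution over the next token.
next_token_logits = outputs.logits[0, -1]
probs = torch.softmax(next_token_logits, dim=-1)
top = torch.topk(probs, k=5)
for p, idx in zip(top.values.tolist(), top.indices.tolist()):
    print(f"{tokenizer.decode([idx])!r}: {p:.3f}")
```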
⚠️ Note: The HuggingFace interface is intended for training and forward-pass convenience. For optimized inference throughput, use the `wedlm` engine above.
| Benchmark | Qwen2.5-7B | WeDLM-7B |
|---|---|---|
| ARC-C (0-shot) | 89.93 | 90.70 |
| GSM8K (3-shot) | 79.23 | 84.76 |
| MATH (4-shot) | 43.40 | 48.20 |
| HumanEval (4-shot) | 59.14 | 68.90 |
| MMLU (5-shot) | 71.62 | 71.93 |
```bibtex
@article{liu2025wedlm,
  title={WeDLM: Reconciling Diffusion Language Models with Standard Causal Attention for Fast Inference},
  author={Liu, Aiwei and He, Minghua and Zeng, Shaoxun and Zhang, Linhao and Wu, Chuhan and Jia, Wei and Liu, Yuan and Yu, Yang and Zhou, Xiao and Zhou, Jie},
  year={2025}
}
```
License: Apache 2.0