Flow-OPD: On-Policy Distillation for Flow Matching Models
Evaluated on SD-3.5-Medium, Flow-OPD raises the average score from 0.72 (base model) to 0.90, a gain of 18 points. The GRPO-Mix baseline reaches 0.82 on the same benchmarks.
```python
import torch
from diffusers import StableDiffusion3Pipeline
from peft import PeftModel

model_id = "stabilityai/stable-diffusion-3.5-medium"
lora_ckpt_path = "CostaliyA/Flow-OPD"  # dev checkpoint
device = "cuda"

# Load the base pipeline, then attach the Flow-OPD LoRA adapter to its transformer.
pipe = StableDiffusion3Pipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.transformer = PeftModel.from_pretrained(pipe.transformer, lora_ckpt_path)
# Fold the adapter weights into the base weights so inference uses a plain transformer.
pipe.transformer = pipe.transformer.merge_and_unload()
pipe = pipe.to(device)

prompt = "a photo of a black kite and a green bear"
image = pipe(prompt, height=512, width=512, num_inference_steps=40, guidance_scale=4.5, negative_prompt="").images[0]
image.save("flow_opd.png")
```
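When comparing the adapter against the base model, it can help to fix the random seed so both runs start from the same noise. The sketch below is one possible way to do that, not part of the released code: `SAMPLING_KWARGS`, `build_kwargs`, and `generate_seeded` are hypothetical helper names, and the only library feature assumed is the `generator` argument that diffusers pipelines accept. The sampling values are copied from the snippet above.

```python
# Sampling settings copied from the usage snippet on this card.
SAMPLING_KWARGS = dict(
    height=512,
    width=512,
    num_inference_steps=40,
    guidance_scale=4.5,
    negative_prompt="",
)


def build_kwargs(**overrides):
    """Return the card's sampling settings with optional per-call overrides."""
    return {**SAMPLING_KWARGS, **overrides}


def generate_seeded(pipe, prompt, seed=0, **overrides):
    """Hypothetical helper: run an already-loaded `pipe` with a fixed seed."""
    import torch  # imported lazily so the helpers above load without torch

    # A CPU generator keeps the starting noise identical across devices.
    generator = torch.Generator(device="cpu").manual_seed(seed)
    return pipe(prompt, generator=generator, **build_kwargs(**overrides)).images[0]
```

With `pipe` loaded as shown earlier, `generate_seeded(pipe, prompt, seed=0).save("flow_opd_seed0.png")` would produce one image. Repeating the call with the same seed on the unmodified base pipeline gives a like-for-like comparison.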
| Model | GenEval | OCR | DeQA | PickScore | Average |
|---|---|---|---|---|---|
| SD-3.5-M (base) | 0.63 | 0.59 | 4.07 | 21.64 | 0.72 |
| GRPO-Mix | 0.73 | 0.83 | 4.33 | 21.84 | 0.82 |
| Flow-OPD | 0.92 | 0.94 | 4.35 | 23.08 | 0.90 |
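The per-metric gains implied by the table can be checked with a few lines of arithmetic. The snippet below is only a convenience sketch: the scores are transcribed from the table above, the names `METRICS`, `SCORES`, and `gains` are illustrative, and nothing is assumed about how the Average column itself was computed.

```python
# Scores transcribed from the results table on this card.
METRICS = ["GenEval", "OCR", "DeQA", "PickScore", "Average"]
SCORES = {
    "SD-3.5-M (base)": [0.63, 0.59, 4.07, 21.64, 0.72],
    "GRPO-Mix": [0.73, 0.83, 4.33, 21.84, 0.82],
    "Flow-OPD": [0.92, 0.94, 4.35, 23.08, 0.90],
}


def gains(model, baseline):
    """Per-metric difference `model - baseline`, rounded to two decimals."""
    return {
        metric: round(a - b, 2)
        for metric, a, b in zip(METRICS, SCORES[model], SCORES[baseline])
    }


for baseline in ("SD-3.5-M (base)", "GRPO-Mix"):
    print(f"Flow-OPD vs {baseline}: {gains('Flow-OPD', baseline)}")
```

Read this way, the Average column rises by 0.18 over the base model and by 0.08 over GRPO-Mix, with the largest gains on GenEval and OCR.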
```bibtex
@misc{fang2026flowopdonpolicydistillationflow,
  title={Flow-OPD: On-Policy Distillation for Flow Matching Models},
  author={Zhen Fang and Wenxuan Huang and Yu Zeng and Yiming Zhao and Shuang Chen and Kaituo Feng and Yunlong Lin and Lin Chen and Zehui Chen and Shaosheng Cao and Feng Zhao},
  year={2026},
  eprint={2605.08063},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2605.08063},
}
```
Base model: stabilityai/stable-diffusion-3.5-medium