Qwen-Image-ControlNet-Inpainting

This repository provides a ControlNet that supports mask-based image inpainting and outpainting for Qwen-Image.

Model Cards

  • This ControlNet consists of 6 double blocks copied from the pretrained transformer layers.
  • We train the model from scratch for 65K steps on a dataset of 10M high-quality general and human images.
  • We train at 1328x1328 resolution in BFloat16 with a batch size of 128 and a learning rate of 4e-5. The text drop ratio is set to 0.10.
  • The model supports object replacement, text modification, background replacement, and outpainting.
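
The training figures above can be made concrete with a little arithmetic and a sketch of what a text drop ratio typically means. This is a minimal illustration, not the authors' training code: it assumes "65K" and "10M" are round numbers, that the batch size is the global one, and that "text drop" means replacing a caption with an empty string at random (the usual trick for enabling classifier-free guidance). The helper `maybe_drop_caption` is hypothetical.

```python
import random

# Figures quoted in the bullets above; "65K" and "10M" are read as round numbers.
TRAIN_STEPS = 65_000
BATCH_SIZE = 128
DATASET_SIZE = 10_000_000
TEXT_DROP_RATIO = 0.10

# Total training samples processed, and how that compares with the dataset size.
samples_seen = TRAIN_STEPS * BATCH_SIZE
passes_over_dataset = samples_seen / DATASET_SIZE
print(f"samples seen: {samples_seen:,} (~{passes_over_dataset:.2f} passes over the data)")


def maybe_drop_caption(caption, rng, drop_ratio=TEXT_DROP_RATIO):
    """Return an empty caption with probability `drop_ratio` (assumed behavior)."""
    return "" if rng.random() < drop_ratio else caption


# Simulate the drop on a batch of identical captions to see the empirical rate.
rng = random.Random(0)
captions = ["a green taxi driving on the road"] * 10_000
dropped = sum(maybe_drop_caption(c, rng) == "" for c in captions)
print(f"dropped {dropped} of {len(captions)} captions")
```

Under these assumptions the run covers somewhat less than one full pass over the dataset, and roughly one caption in ten is trained without text conditioning.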

Showcases

You can find more use cases in this blog.


Inference

import torch
from diffusers.utils import load_image

# pip install git+https://github.com/huggingface/diffusers
from diffusers import QwenImageControlNetModel, QwenImageControlNetInpaintPipeline

base_model = "Qwen/Qwen-Image"
controlnet_model = "InstantX/Qwen-Image-ControlNet-Inpainting"

controlnet = QwenImageControlNetModel.from_pretrained(controlnet_model, torch_dtype=torch.bfloat16)

pipe = QwenImageControlNetInpaintPipeline.from_pretrained(
    base_model, controlnet=controlnet, torch_dtype=torch.bfloat16
)
pipe.to("cuda")

control_image = load_image("https://huggingface.co/InstantX/Qwen-Image-ControlNet-Inpainting/resolve/main/assets/images/image1.png")
control_mask = load_image("https://huggingface.co/InstantX/Qwen-Image-ControlNet-Inpainting/resolve/main/assets/masks/mask1.png")
# Prompt in Chinese: "A green taxi driving on the road"
prompt = "一辆绿色的出租车行驶在路上"
# ControlNet conditioning strength (1.0 is assumed here; tune as needed)
controlnet_conditioning_scale = 1.0

image = pipe(
    prompt=prompt,
    negative_prompt=" ",
    control_image=control_image,
    control_mask=control_mask,
    controlnet_conditioning_scale=controlnet_conditioning_scale,
    width=control_image.size[0],
    height=control_image.size[1],
    num_inference_steps=30,
    true_cfg_scale=4.0,
    generator=torch.Generator(device="cuda").manual_seed(42),
).images[0]
image.save("qwenimage_cn_inpaint_result.png")
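
The snippet above covers inpainting. For outpainting, the input image first has to be placed on a larger canvas. The sketch below only computes that layout. It is not part of the pipeline API: `outpaint_layout` and `OutpaintLayout` are hypothetical helpers, and rounding the canvas up to a multiple of 16 is an assumption about the sizes the pipeline prefers.

```python
from typing import NamedTuple, Tuple


class OutpaintLayout(NamedTuple):
    canvas_size: Tuple[int, int]          # (width, height) of the padded canvas
    paste_box: Tuple[int, int, int, int]  # (left, top, right, bottom) of the original image


def round_up(value: int, multiple: int) -> int:
    """Round `value` up to the nearest multiple of `multiple`."""
    return -(-value // multiple) * multiple


def outpaint_layout(width, height, pad_left=0, pad_top=0, pad_right=0, pad_bottom=0, multiple=16):
    """Compute the padded canvas size and where the original image sits on it.

    Any extra pixels introduced by rounding end up on the right and bottom edges.
    The default multiple of 16 is an assumption, not a documented requirement.
    """
    canvas_w = round_up(width + pad_left + pad_right, multiple)
    canvas_h = round_up(height + pad_top + pad_bottom, multiple)
    box = (pad_left, pad_top, pad_left + width, pad_top + height)
    return OutpaintLayout((canvas_w, canvas_h), box)


# Extend a 1000x750 image by 100 pixels on the left and right.
layout = outpaint_layout(1000, 750, pad_left=100, pad_right=100)
print(layout.canvas_size, layout.paste_box)
```

With Pillow, one could then create a blank canvas of `canvas_size`, paste the original at the top-left corner of `paste_box`, and build a mask that is white everywhere except inside `paste_box`. This assumes white marks the region to generate, as in the example mask. The canvas and mask would be passed as `control_image` and `control_mask`.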

ComfyUI Support

ComfyUI offers native support for Qwen-Image-ControlNet-Inpainting. The official workflow can be found here. Make sure your ComfyUI version is >=0.3.59.

Community Support

Liblib AI offers native support for Qwen-Image-ControlNet-Inpainting. Visit Liblib AI for online WebUI or ComfyUI inference.

Limitations

This model is slightly sensitive to user prompts. We highly recommend detailed prompts that describe the entire image (both the inpainted area and the background). Please use descriptive prompts rather than instructive ones.
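
To illustrate the difference between the two prompt styles, here is a small sketch. Both prompts are invented for illustration and do not come from the model's training data. The `looks_instructive` check is a hypothetical heuristic that only flags prompts opening with a common editing verb, so it is a rough aid at best.

```python
# Hypothetical list of imperative verbs that often open an editing instruction.
INSTRUCTIVE_OPENERS = ("replace", "remove", "change", "add", "make", "turn", "delete", "erase")


def looks_instructive(prompt: str) -> bool:
    """Rough check: does the prompt start with an editing verb?"""
    words = prompt.strip().lower().split()
    return bool(words) and words[0].strip(".,:;!") in INSTRUCTIVE_OPENERS


# Instructive style: tells the model what to do (not recommended).
instructive = "Replace the red car with a green taxi."

# Descriptive style: describes the full target image, including the background.
descriptive = (
    "A green taxi driving along a wet city street at dusk, "
    "with shop lights reflected on the road and pedestrians on the sidewalk."
)

for prompt in (instructive, descriptive):
    style = "instructive" if looks_instructive(prompt) else "descriptive"
    print(f"{style}: {prompt}")
```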

Acknowledgements

This model is developed by the InstantX Team. All rights reserved.
