BAGEL: Diffusers Integration

This repository hosts the BAGEL custom pipeline for 🤗 Diffusers, enabling seamless text-to-image, image editing, and visual understanding tasks with the BAGEL model.

🚀 Quick Start

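import torch
from diffusers import DiffusionPipeline

# Load the custom BAGEL pipeline code from the Hub (needs trust_remote_code)
pipe = DiffusionPipeline.from_pretrained(
    "JiaxinGe/Diffusers-BAGEL",
    custom_pipeline="JiaxinGe/Diffusers-BAGEL",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
pipe = pipe.to("cuda:0")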
# 1) text→image
prompt = "A female cosplayer portraying an ethereal fairy or elf, wearing a flowing dress made of delicate fabrics in soft, mystical colors like emerald green and silver. She has pointed ears, a gentle, enchanting expression, and her outfit is adorned with sparkling jewels and intricate patterns. The background is a magical forest with glowing plants, mystical creatures, and a serene atmosphere."
out = pipe(text=prompt)
print(out)
out['images'][0].save("bagel_text2img.png")

# 2) text→image with “think”
out = pipe(
    text="a car made of small cars",
    think=True
)
print(out['text'])
out['images'][0].save("bagel_text2img_think.png")

# 3) image editing
import os
from PIL import Image
# Image.open does not expand "~", so resolve the home directory explicitly
img = Image.open(os.path.expanduser("~/Bagel/test_images/women.jpg"))
out = pipe(
    image=img,
    text="She boards a modern subway, quietly reading a folded newspaper…"
)
out['images'][0].save("bagel_img_edit.png")

# 4) image editing + think
img = Image.open(os.path.expanduser("~/Bagel/test_images/octupusy.jpg"))
out = pipe(
    image=img,
    text="Could you display the sculpture that takes after this design?",
    think=True
)
print(out['text'])
out['images'][0].save("bagel_img_edit_think.png")

# 5) image understanding (returns text only, so no image is saved)
meme = Image.open(os.path.expanduser("~/Bagel/test_images/meme.jpg"))
out = pipe(
    image=meme,
    text="Can someone explain what’s funny about this meme?",
    understanding_output=True
)
print(out['text'])

🔧 Inference Hyperparameters

  • cfg_text_scale: text guidance strength (typical: 4.0–8.0)
  • cfg_image_scale: image guidance strength (typical: 1.0–2.0)
  • cfg_interval: fraction of denoising steps over which CFG is applied (e.g. 0.4–1.0)
  • num_timesteps: total number of denoising steps (e.g. 50)
  • timestep_shift: shift applied to the denoising schedule
  • cfg_renorm_min / cfg_renorm_type: renormalization settings for CFG

For detailed explanations, see the original BAGEL repository.
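
As a rough illustration of the typical ranges listed above, the sketch below collects a few settings in a dictionary and flags any value that falls outside the quoted range. The helper name `out_of_typical_range`, the `TYPICAL_RANGES` table, and the example values are hypothetical and not part of this repository. The sketch also assumes that the pipeline call accepts these names as keyword arguments, which is why that line is left commented out.

```python
from typing import Dict, List

# Typical ranges quoted in the list above; guidance only, not hard limits.
TYPICAL_RANGES = {
    "cfg_text_scale": (4.0, 8.0),
    "cfg_image_scale": (1.0, 2.0),
}


def out_of_typical_range(settings: Dict[str, float]) -> List[str]:
    """Return the names of settings whose values fall outside the typical range."""
    flagged = []
    for name, (low, high) in TYPICAL_RANGES.items():
        if name in settings and not (low <= settings[name] <= high):
            flagged.append(name)
    return flagged


# Hypothetical example values chosen from within the ranges above.
settings = {"cfg_text_scale": 4.0, "cfg_image_scale": 1.5, "num_timesteps": 50}
print(out_of_typical_range(settings))

# Assumption: the pipeline call accepts these names as keyword arguments.
# out = pipe(text="a car made of small cars", **settings)
```

A value outside the typical range is not necessarily wrong; the check is only meant to catch accidental settings such as a swapped text and image scale.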
