
license: mit
tags:
  - pytorch
  - diffusers
  - unconditional-image-generation
  - diffusion-models-class
  - medical-imaging
  - brain-mri
  - multiple-sclerosis

Brain MRI Synthesis with Latent Diffusion (from scratch)

This is a diffusion model for unconditional generation of brain MRI FLAIR slices. It operates on latent representations of the slices (a latent diffusion process) to synthesize 256x256-pixel images, using a U-Net denoiser built from ResNet and attention blocks.

Training Details

  • Architecture: Latent Diffusion Model (LDM)
  • Resolution: Latent resolution of 32x32 to generate 256x256 final images
  • Dataset: Lesion2D VH split (FLAIR MRI slices) (70% of the dataset)
  • Channels: 4 (latents are multi-channel representations of the original images)
  • Epochs: 100
  • Batch size: 16
  • Optimizer: AdamW with:
    • Learning Rate: 1.0e-4
    • Betas: (0.95, 0.999)
    • Weight Decay: 1.0e-6
    • Epsilon: 1.0e-8
  • Scheduler: Cosine with 500 warm-up steps
  • Gradient Accumulation: 1 step
  • Mixed Precision: No
  • Gradient Clipping: Max norm of 1.0
  • Noise Scheduler: Linear schedule with:
    • num_train_timesteps: 1000
    • beta_start: 0.0001
    • beta_end: 0.02
  • Hardware: Trained on NVIDIA GPUs with a distributed dataloader using 12 workers.
  • Memory Consumption: Approx. 2.5 GB during training.
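The two schedules listed above can be reproduced in plain Python as a rough sketch. It assumes the standard DDPM formulation for the linear noise schedule and a half-cosine decay to zero after warm-up for the learning rate. The total number of optimizer steps is not stated in this card, so `TOTAL_STEPS` below is a hypothetical value chosen only for illustration.

```python
import math

def linear_betas(num_train_timesteps=1000, beta_start=1e-4, beta_end=0.02):
    """Evenly spaced beta_t values from beta_start to beta_end (inclusive)."""
    step = (beta_end - beta_start) / (num_train_timesteps - 1)
    return [beta_start + i * step for i in range(num_train_timesteps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def cosine_lr_with_warmup(step, total_steps, base_lr=1e-4, warmup_steps=500):
    """Linear warm-up to base_lr, then half-cosine decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))

betas = linear_betas()
alpha_bars = cumulative_alphas(betas)

# Scales in the forward process:
#   x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps
signal_scale = [math.sqrt(a) for a in alpha_bars]
noise_scale = [math.sqrt(1.0 - a) for a in alpha_bars]

TOTAL_STEPS = 10_000  # hypothetical: depends on dataset size, not given in this card
for step in (0, 250, 500, 5_000, TOTAL_STEPS):
    print(step, cosine_lr_with_warmup(step, TOTAL_STEPS))
```

With these settings the signal scale shrinks monotonically over the 1000 timesteps, so the final latent is almost pure Gaussian noise, which is what sampling starts from.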

U-Net Architecture

  • Down Blocks: [DownBlock2D, DownBlock2D, DownBlock2D, DownBlock2D, AttnDownBlock2D, DownBlock2D]
  • Up Blocks: [UpBlock2D, AttnUpBlock2D, UpBlock2D, UpBlock2D, UpBlock2D, UpBlock2D]
  • Layers per Block: 2
  • Block Channels: [128, 128, 256, 256, 512, 512]
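To see what the block list implies for a 32x32 latent, the down path can be traced with a few lines of plain Python. This is a sketch under one assumption: every down block except the last halves the spatial resolution, which is the convention diffusers' `UNet2DModel` follows as far as we know. The helper `trace_down_path` is hypothetical and not part of any library.

```python
DOWN_BLOCKS = [
    "DownBlock2D", "DownBlock2D", "DownBlock2D",
    "DownBlock2D", "AttnDownBlock2D", "DownBlock2D",
]
BLOCK_CHANNELS = [128, 128, 256, 256, 512, 512]
LATENT_RESOLUTION = 32

def trace_down_path(down_blocks, channels, resolution):
    """Return one row per down block with its input/output resolution.

    Assumption: all blocks except the final one downsample by a factor of 2.
    """
    rows = []
    last = len(down_blocks) - 1
    for i, (block, ch) in enumerate(zip(down_blocks, channels)):
        out_res = resolution if i == last else resolution // 2
        rows.append({
            "block": block,
            "channels": ch,
            "in_res": resolution,
            "out_res": out_res,
            "attention": block.startswith("Attn"),
        })
        resolution = out_res
    return rows

rows = trace_down_path(DOWN_BLOCKS, BLOCK_CHANNELS, LATENT_RESOLUTION)
for row in rows:
    print(row)
```

Under that assumption the single attention block on the down path sees a 2x2 feature map with 512 channels, so self-attention is applied only where it is cheapest.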

The diffusion model operates on compressed latent representations of the brain MRI slices rather than on pixels, which makes training and sampling more memory-efficient than pixel-space diffusion at the same output resolution, while aiming to preserve image fidelity.
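The size of the saving can be estimated from the shapes given under Training Details. The short calculation below is a sketch that assumes the decoded FLAIR slice is single-channel (grayscale), which this card does not state explicitly.

```python
from math import prod

LATENT_SHAPE = (4, 32, 32)  # channels, height, width (from Training Details)
IMAGE_SIZE = (256, 256)     # height, width of the final image
IMAGE_CHANNELS = 1          # assumption: grayscale FLAIR slice

latent_values = prod(LATENT_SHAPE)
image_values = IMAGE_CHANNELS * prod(IMAGE_SIZE)

# Downsampling factor per spatial side, and overall reduction in values
spatial_factor = IMAGE_SIZE[0] // LATENT_SHAPE[1]
compression_ratio = image_values / latent_values

print(f"latent values per sample: {latent_values}")
print(f"pixel values per sample:  {image_values}")
print(f"spatial factor: {spatial_factor}x, compression: {compression_ratio:.0f}x")
```

The U-Net therefore processes far fewer values per sample than a pixel-space model would, at the cost of relying on a separate autoencoder to map between latents and images.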

Usage

You can use the model directly with the diffusers library:

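from diffusers import DiffusionPipeline
import torch

# Load the model. DiffusionPipeline is the generic diffusers loader; it picks
# the concrete pipeline class from the repository's model_index.json
# (assuming the repository provides one).
pipeline = DiffusionPipeline.from_pretrained("benetraco/latent_scratch")
pipeline.to("cuda" if torch.cuda.is_available() else "cpu")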

# Generate an image
image = pipeline(batch_size=1).images[0]

# Display the image
image.show()