HunyuanImage-2.1 fp8 e4m3fn

An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation


Performance on RTX 5090

When using HunyuanImage-2.1 with the quantized encoder and the quantized base model,
VRAM usage on an NVIDIA RTX 5090 typically ranges from 26 GB to 30 GB, with an average
inference time of about 16 seconds, depending on resolution, batch size, and prompt complexity.
There are also reports that it works on GPUs with 16 GB of VRAM.
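
As a rough sanity check on the figures above, the memory needed just to hold the weights can be estimated as parameter count times bytes per parameter. The sketch below is only illustrative: `ASSUMED_PARAMS` is a placeholder that is not stated in this card, and activations, the text encoder, and the VAE add to the total, so measured VRAM will be higher than the weight-only estimate.

```python
def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / (1024 ** 3)


# Assumption for illustration only: replace with the real parameter count
# of the checkpoint you are loading. This value is not taken from the card.
ASSUMED_PARAMS = 17e9

# bf16 stores 2 bytes per weight, fp8 e4m3fn stores 1 byte per weight.
for name, nbytes in [("bf16", 2), ("fp8 e4m3fn", 1)]:
    gib = weight_memory_gib(ASSUMED_PARAMS, nbytes)
    print(f"{name}: ~{gib:.1f} GiB for weights only")
```

Whatever the true parameter count is, storing weights in FP8 halves the weight footprint relative to a 16-bit format.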

⚠ Important Note:
The refiner is still not implemented and is not ready for use in ComfyUI.
However, the distilled model now works in ComfyUI with recommended settings of 8 steps / 1.5-2.5 CFG.
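
For readers unfamiliar with the CFG value, it is the classifier-free guidance scale: at each sampling step the conditional and unconditional predictions are blended, and the scale controls how far the result is pushed toward the prompt. The sketch below shows the standard formula on plain Python lists; `apply_cfg` is a hypothetical helper for illustration, not a ComfyUI or Diffusers function.

```python
def apply_cfg(uncond, cond, scale):
    """Standard classifier-free guidance blend, element by element:
    guided = uncond + scale * (cond - uncond)
    """
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# Toy predictions standing in for model outputs (made-up numbers).
uncond_pred = [0.0, 1.0]
cond_pred = [1.0, 3.0]

# Try the low and high ends of the recommended 1.5-2.5 range.
for scale in (1.5, 2.5):
    print(scale, apply_cfg(uncond_pred, cond_pred, scale))
```

A scale of 1.0 returns the conditional prediction unchanged, so the recommended 1.5-2.5 range corresponds to fairly mild guidance.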



Download Quantized Model (FP8 e4m3fn)

**Download hunyuanimage2.1_fp8_e4m3fn.safetensors**
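
To make the "e4m3fn" in the file name concrete: each weight is stored in one byte with 1 sign bit, 4 exponent bits (bias 7), and 3 mantissa bits. The "fn" variant has no infinities and reserves only the all-ones bit pattern for NaN, which gives a maximum finite value of 448. The following is a minimal pure-Python decoder written for illustration; `decode_e4m3fn` is a hypothetical helper, not part of any library used to load this checkpoint.

```python
import math


def decode_e4m3fn(byte: int) -> float:
    """Decode one FP8 e4m3fn byte into a Python float."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in the range 0..255")

    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0x0F  # 4 exponent bits, bias 7
    mantissa = byte & 0x07         # 3 mantissa bits

    # The only NaN encodings are S.1111.111; there are no infinities.
    if exponent == 0x0F and mantissa == 0x07:
        return math.nan

    # Exponent 0 encodes zero and the subnormals (no implicit leading 1).
    if exponent == 0:
        return sign * (mantissa / 8.0) * 2.0 ** -6

    # Normal numbers carry an implicit leading 1.
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7)


print(decode_e4m3fn(0x38))  # bit pattern 0.0111.000
print(decode_e4m3fn(0x7E))  # largest finite value
```

With only 3 mantissa bits the format is coarse, which is why FP8 checkpoints trade some precision for the smaller memory footprint.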

Workflow Notes

  • Model: HunyuanImage-2.1
  • Mode: Quantized Encoder + Quantized Base Model
  • VRAM Usage: ~26GB–30GB on RTX 5090
  • Resolution Tested: 2K (2048×2048)
  • Frameworks: ComfyUI & Diffusers
  • Optimisations: Works with Patch Sage Attention + Lazycache / TeaCache ✅
  • Distilled Model: ✅ Now works in ComfyUI with 8 steps / 1.5-2.5 CFG
  • Refiner: ❌ Still not implemented, not available in ComfyUI
  • License: tencent-hunyuan-community

🚀 **Optimized for High-Resolution, Memory-Efficient Text-to-Image Generation**
