Wan2.2-I2V-14B: HiFi-Surgical-FP8 & BF16 (ComfyUI Optimized)

This model follows the Wan-AI Software License Agreement. Please refer to the original repository for usage restrictions.

This repository provides two high-performance versions of Wan2.2-I2V-14B, meticulously optimized for the ComfyUI ecosystem. We offer a standard BF16 version and a specialized HiFi-Surgical-FP8 mixed-precision version.


💎 The HiFi-Surgical Optimization Strategy

Unlike generic "one-click" quantization scripts that often cause visual degradation in Wan2.2, our HiFi-Surgical-FP8 version uses a data-driven, diagnostic-led approach to preserve cinematic quality.

1. Layer-Wise SNR Calibration

We measured the FP8 quantization error of all 406 linear weight tensors in the FP32 master checkpoint. Only layers that maintained an SNR (signal-to-noise ratio) above 31.5 dB were converted to FP8; the rest stay in BF16, so layers that quantize poorly keep their higher precision.
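
A minimal sketch of how such a per-tensor SNR check could look, using only the Python standard library. It assumes the SNR is measured between the original weights and their FP8-E4M3 round trip with a single per-tensor scale; the function names and this exact SNR definition are illustrative assumptions, not the script used for this repository.

```python
import math
import random

E4M3_MAX = 448.0          # largest finite float8_e4m3fn value
SNR_THRESHOLD_DB = 31.5   # threshold quoted in the text


def quantize_e4m3(x: float) -> float:
    """Round x to the nearest float8_e4m3fn value, saturating at +/-448."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    a = min(abs(x), E4M3_MAX)
    # frexp gives a = m * 2**e with 0.5 <= m < 1, so the binade exponent is e - 1.
    exponent = max(math.frexp(a)[1] - 1, -6)   # -6 is the smallest normal exponent
    step = 2.0 ** (exponent - 3)               # 3 mantissa bits: 8 steps per binade
    return sign * min(round(a / step) * step, E4M3_MAX)


def fp8_snr_db(weights: list[float]) -> float:
    """SNR in dB between the weights and their scaled FP8 round trip (assumed definition)."""
    scale = max(abs(w) for w in weights) / E4M3_MAX
    if scale == 0.0:
        return math.inf  # all-zero tensor quantizes without error
    signal = sum(w * w for w in weights)
    noise = sum((w - quantize_e4m3(w / scale) * scale) ** 2 for w in weights)
    return math.inf if noise == 0.0 else 10.0 * math.log10(signal / noise)


# Synthetic stand-in for one weight tensor (real tensors come from the checkpoint).
random.seed(0)
demo = [random.gauss(0.0, 0.02) for _ in range(4096)]
snr = fp8_snr_db(demo)
print(f"SNR = {snr:.1f} dB -> {'FP8' if snr > SNR_THRESHOLD_DB else 'BF16'}")
```

With three mantissa bits, E4M3 yields an SNR of roughly 29 to 35 dB for well-scaled values, so a threshold near 31.5 dB separates tensors that sit comfortably in the format's range from those that do not.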

2. High-Outlier Protection

Wan2.2 weight tensors contain sharp numerical peaks that FP8 represents poorly. We compute an Outlier Index for each layer (the ratio of the maximum weight to the standard deviation) and keep every layer with an index above 12 in BF16. This specifically targets the "sparkle" noise and flickering artifacts common in standard FP8 conversions.
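
The outlier test can be sketched as follows. This assumes "Max" means the largest absolute weight and "Std" the population standard deviation of the tensor; both readings, and the function names, are assumptions made for illustration.

```python
import statistics

OUTLIER_THRESHOLD = 12.0  # from the text: an index above 12 keeps the layer in BF16


def outlier_index(weights: list[float]) -> float:
    """Largest absolute weight divided by the population standard deviation."""
    std = statistics.pstdev(weights)
    if std == 0.0:
        return 0.0  # constant tensor: no spread, treated here as outlier-free
    return max(abs(w) for w in weights) / std


def keep_in_bf16(weights: list[float]) -> bool:
    """True when the tensor is outlier-heavy under the assumed definition."""
    return outlier_index(weights) > OUTLIER_THRESHOLD


# A flat tensor versus the same tensor with one sharp peak added.
smooth = [1.0, -1.0] * 500
spiky = smooth + [50.0]
print(keep_in_bf16(smooth), keep_in_bf16(spiky))
```

A single large peak forces the per-tensor FP8 scale to stretch, which pushes the bulk of the weights toward the coarse low end of the format; that is the failure mode this check is meant to catch.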

3. Structural Integrity (Blocks 30-39)

We exclude the Cross-Attention layers in the final blocks (30-39) of the DiT architecture from FP8 conversion. Keeping these critical layers in BF16 ensures that prompt adherence and temporal consistency are not compromised.
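
As an illustration, a name-based filter for these layers might look like the sketch below. The key layout `blocks.<index>.cross_attn.<proj>.weight` (optionally behind a prefix) is an assumption about how the checkpoint names its tensors; adjust the pattern to the names actually present in the file.

```python
import re

# Assumed key layout: "[prefix.]blocks.<index>.cross_attn.<proj>.weight"
CROSS_ATTN_KEY = re.compile(r"^(?:.*\.)?blocks\.(\d+)\.cross_attn\..+\.weight$")
PROTECTED_BLOCKS = range(30, 40)  # blocks 30-39 inclusive, as stated in the text


def is_protected(name: str) -> bool:
    """True for cross-attention weights in the final blocks, which stay in BF16."""
    match = CROSS_ATTN_KEY.match(name)
    return match is not None and int(match.group(1)) in PROTECTED_BLOCKS


for key in ("blocks.35.cross_attn.q.weight",
            "blocks.12.cross_attn.q.weight",
            "blocks.35.ffn.0.weight"):
    print(key, "->", "BF16 (protected)" if is_protected(key) else "eligible for FP8")
```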


📊 Comparison & Specs

| Feature | Standard BF16 | HiFi-Surgical-FP8 (Recommended) |
| --- | --- | --- |
| File Size | ~27.2 GB | ~22.4 GB |
| Precision | Pure Bfloat16 | Hybrid FP8-E4M3 / BF16 |
| VRAM Requirement | 24 GB+ | 16-24 GB |
| Visual Fidelity | Reference Grade | 99% Reference Match |
| Inference Speed | Base Speed | Accelerated on Blackwell/Hopper |

🛠️ ComfyUI Integration & Usage

These models are specifically converted and tested for ComfyUI.

  1. Native Scaling Support: We have included the scale_weight metadata for every quantized tensor. This allows ComfyUI loaders to utilize hardware-level scaling on NVIDIA Blackwell (RTX 50-series) and Hopper architectures for maximum speed.
  2. How to Use:
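    • Place the .safetensors file in your `ComfyUI/models/diffusion_models/` directory.
    • Load it with the UNETLoader node (shown as "Load Diffusion Model" in the UI). CheckpointLoaderSimple reads from `ComfyUI/models/checkpoints/` and expects a full checkpoint, so it will not list this file.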
    • Ensure your ComfyUI is up-to-date to support the float8_e4m3fn type.
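
Before loading, the precision layout of a downloaded file can be checked without ComfyUI by reading the safetensors header, which is an 8-byte little-endian length followed by a JSON object. The sketch below counts tensors per dtype and entries whose name ends in `scale_weight`; treating `scale_weight` as a tensor-name suffix is an assumption, and the file name in the usage comment is hypothetical.

```python
import json
import struct
from collections import Counter


def read_safetensors_header(path: str) -> dict:
    """Return the JSON header of a .safetensors file without loading tensor data."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian uint64
        return json.loads(f.read(header_len))


def summarize(path: str) -> tuple[Counter, int]:
    """Count tensors per dtype and entries named '*scale_weight' (assumed suffix)."""
    header = read_safetensors_header(path)
    header.pop("__metadata__", None)  # optional free-form string metadata
    dtypes = Counter(entry["dtype"] for entry in header.values())
    n_scales = sum(1 for name in header if name.endswith("scale_weight"))
    return dtypes, n_scales


# Usage (hypothetical file name):
# dtypes, n_scales = summarize("ComfyUI/models/diffusion_models/model.safetensors")
# print(dtypes, n_scales)
```

In the safetensors format, FP8-E4M3 tensors are reported with the dtype string `F8_E4M3` and bfloat16 tensors with `BF16`, so the counter gives a quick view of how many tensors were quantized.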

Diagnostic Methodology

The precision of each weight tensor in the HiFi version was selected based on the following diagnostic results:

  • Total Layers Scanned: 406
  • FP8 Layers: 184 (Non-sensitive FFN & Attention layers)
  • BF16 Layers: 222 (Sensitive Cross-Attention & Outlier-heavy layers)
  • Target Hardware: Optimized for RTX 4090, 5090, and H100/H200.
