---
license: apache-2.0
language:
- en
base_model:
- Wan-AI/Wan2.1-I2V-14B-480P
- Wan-AI/Wan2.1-I2V-14B-480P-Diffusers
pipeline_tag: image-to-video
tags:
- image-to-video
- lora
- diffusers
- template:diffusion-lora
widget:
- text: >-
    In the video, a miniature dog is presented. The dog is held in a person's
    hands. The person then presses on the dog, causing a sq41sh squish effect.
    The person keeps pressing down on the dog, further showing the sq41sh
    squish effect.
  output:
    url: example_videos/dog_squish.mp4
- text: >-
    In the video, a miniature tank is presented. The tank is held in a person's
    hands. The person then presses on the tank, causing a sq41sh squish effect.
    The person keeps pressing down on the tank, further showing the sq41sh
    squish effect.
  output:
    url: example_videos/tank_squish.mp4
- text: >-
    In the video, a miniature balloon is presented. The balloon is held in a
    person's hands. The person then presses on the balloon, causing a sq41sh
    squish effect. The person keeps pressing down on the balloon, further
    showing the sq41sh squish effect.
  output:
    url: example_videos/balloon_squish.mp4
- text: >-
    In the video, a miniature rodent is presented. The rodent is held in a
    person's hands. The person then presses on the rodent, causing a sq41sh
    squish effect. The person keeps pressing down on the rodent, further
    showing the sq41sh squish effect.
  output:
    url: example_videos/rodent_squish.mp4
- text: >-
    In the video, a miniature person is presented. The person is held in a
    person's hands. The person then presses on the person, causing a sq41sh
    squish effect. The person keeps pressing down on the person, further
    showing the sq41sh squish effect.
  output:
    url: example_videos/person_squish.mp4
---
<div style="background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin-bottom: 20px;">
<h1 style="color: #24292e; margin-top: 0;">Squish Effect LoRA for Wan2.1 14B I2V 480p</h1>
<div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h2 style="color: #24292e; margin-top: 0;">Overview</h2>
<p>This LoRA is trained on the Wan2.1 14B I2V 480p model and allows you to squish any object in an image. The effect works on a wide variety of objects, from animals to vehicles to people!</p>
</div>
<div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h2 style="color: #24292e; margin-top: 0;">Features</h2>
<ul style="margin-bottom: 0;">
<li>Transform any image into a video of it being squished</li>
<li>Trained on the Wan2.1 14B 480p I2V base model</li>
<li>Consistent results across different object types</li>
<li>Simple prompt structure that's easy to adapt</li>
</ul>
</div>
<div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h2 style="color: #24292e; margin-top: 0;">Community</h2>
<ul style="margin-bottom: 0;">
<li>
Generate videos with 100+ Camera Control and VFX LoRAs on the
<a href="https://app.remade.ai/canvas/create" style="color: #0366d6; text-decoration: none;">Remade Canvas</a>.
</li>
<li>
<b>Discord:</b>
<a href="https://remade.ai/join-discord?utm_source=Huggingface&utm_medium=Social&utm_campaign=model_release&utm_content=crash_zoom_out" style="color: #0366d6; text-decoration: none;">
Join our community
</a> to generate videos with this LoRA for free
</li>
</ul>
</div>
<Gallery />
# Model File and Inference Workflow
## 📥 Download Links:
- [squish_18.safetensors](./squish_18.safetensors) - LoRA Model File
- [wan_img2video_lora_workflow.json](./workflow/wan_img2video_lora_workflow.json) - Wan I2V with LoRA Workflow for ComfyUI
## Using with Diffusers
```shell
pip install git+https://github.com/huggingface/diffusers.git
```
```py
import numpy as np
import torch
from diffusers import AutoencoderKLWan, WanImageToVideoPipeline
from diffusers.utils import export_to_video, load_image
from transformers import CLIPVisionModel

# Load the Wan2.1 14B I2V 480p base model (Diffusers format)
model_id = "Wan-AI/Wan2.1-I2V-14B-480P-Diffusers"
image_encoder = CLIPVisionModel.from_pretrained(model_id, subfolder="image_encoder", torch_dtype=torch.float32)
vae = AutoencoderKLWan.from_pretrained(model_id, subfolder="vae", torch_dtype=torch.float32)
pipe = WanImageToVideoPipeline.from_pretrained(model_id, vae=vae, image_encoder=image_encoder, torch_dtype=torch.bfloat16)
pipe.to("cuda")

# Load the Squish Effect LoRA
pipe.load_lora_weights("Remade/Squish")

pipe.enable_model_cpu_offload()  # optional: offload components to CPU for low-VRAM environments

# The prompt follows the template described below; keep the "sq41sh squish effect" trigger phrase
prompt = "In the video, a miniature cat toy is presented. The cat toy is held in a person's hands. The person then presses on the cat toy, causing a sq41sh squish effect. The person keeps pressing down on the cat toy, further showing the sq41sh squish effect."
image = load_image("https://huggingface.co/datasets/diffusers/cat_toy_example/resolve/main/1.jpeg")

# Resize the input image so its area is close to 480x832 while keeping both
# dimensions a multiple of the VAE/transformer patch size
max_area = 480 * 832
aspect_ratio = image.height / image.width
mod_value = pipe.vae_scale_factor_spatial * pipe.transformer.config.patch_size[1]
height = round(np.sqrt(max_area * aspect_ratio)) // mod_value * mod_value
width = round(np.sqrt(max_area / aspect_ratio)) // mod_value * mod_value
image = image.resize((width, height))

output = pipe(
    image=image,
    prompt=prompt,
    height=height,
    width=width,
    num_frames=81,
    guidance_scale=5.0,
    num_inference_steps=28
).frames[0]
export_to_video(output, "output.mp4", fps=16)
```
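Every example prompt on this card follows the same fixed structure (see the Prompt Template section below), so generating prompts for new objects is just string substitution. The helper below is only an illustration; the `squish_prompt` name is not part of this repository:

```py
def squish_prompt(obj: str) -> str:
    # Fill the documented prompt template, keeping the "sq41sh squish effect" trigger phrase intact
    return (
        f"In the video, a miniature {obj} is presented. The {obj} is held in a person's hands. "
        f"The person then presses on the {obj}, causing a sq41sh squish effect. "
        f"The person keeps pressing down on the {obj}, further showing the sq41sh squish effect."
    )

prompt = squish_prompt("rubber duck")
```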
---
<div style="background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin-bottom: 20px;">
<div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h2 style="color: #24292e; margin-top: 0;">Recommended Settings</h2>
<ul style="margin-bottom: 0;">
<li><b>LoRA Strength:</b> 1.0 (a Diffusers sketch for adjusting this follows after this section)</li>
<li><b>Embedded Guidance Scale:</b> 6.0</li>
<li><b>Flow Shift:</b> 5.0</li>
</ul>
</div>
<div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h2 style="color: #24292e; margin-top: 0;">Trigger Words</h2>
<p>The key trigger phrase is: <code style="background-color: #f0f0f0; padding: 3px 6px; border-radius: 4px;">sq41sh squish effect</code></p>
</div>
<div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h2 style="color: #24292e; margin-top: 0;">Prompt Template</h2>
<p>For best results, use this prompt structure:</p>
<div style="background-color: #f0f0f0; padding: 12px; border-radius: 6px; margin: 10px 0;">
<i>In the video, a miniature [object] is presented. The [object] is held in a person's hands. The person then presses on the [object], causing a sq41sh squish effect. The person keeps pressing down on the [object], further showing the sq41sh squish effect.</i>
</div>
<p>Simply replace <code style="background-color: #f0f0f0; padding: 3px 6px; border-radius: 4px;">[object]</code> with whatever you want to see squished!</p>
</div>
<div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h2 style="color: #24292e; margin-top: 0;">ComfyUI Workflow</h2>
<p>This LoRA works with a modified version of <a href="https://github.com/kijai/ComfyUI-WanVideoWrapper/blob/main/example_workflows/wanvideo_480p_I2V_example_02.json" style="color: #0366d6; text-decoration: none;">Kijai's Wan Video Wrapper workflow</a>. The main modification is adding a Wan LoRA node connected to the base model.</p>
<img src="./workflow/workflow_screenshot.png" style="width: 100%; border-radius: 8px; margin: 15px 0; box-shadow: 0 4px 8px rgba(0,0,0,0.1);">
<p>See the Downloads section above for the modified workflow.</p>
</div>
</div>
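The recommended LoRA strength of 1.0 is what `load_lora_weights` applies by default in the Diffusers example above. If you want to experiment with a weaker squish, the standard Diffusers adapter API lets you scale the LoRA down; this is a minimal sketch using that API, not a setting specific to this repository:

```py
# Assumes `pipe` is the WanImageToVideoPipeline from the Diffusers example above
pipe.load_lora_weights("Remade/Squish", adapter_name="squish")

# 1.0 matches the recommended strength; lower values soften the squish effect
pipe.set_adapters(["squish"], adapter_weights=[0.8])
```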
<div style="background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin-bottom: 20px;">
<div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h2 style="color: #24292e; margin-top: 0;">Model Information</h2>
<p>The model weights are available in Safetensors format. See the Downloads section above.</p>
</div>
<div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h2 style="color: #24292e; margin-top: 0;">Training Details</h2>
<ul style="margin-bottom: 0;">
<li><b>Base Model:</b> Wan2.1 14B I2V 480p</li>
<li><b>Training Data:</b> 1.5 minutes of video (20 short clips of things being squished)</li>
<li><b>Epochs:</b> 18</li>
</ul>
</div>
<div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h2 style="color: #24292e; margin-top: 0;">Additional Information</h2>
<p>Training was done using <a href="https://github.com/tdrussell/diffusion-pipe" style="color: #0366d6; text-decoration: none;">diffusion-pipe</a>.</p>
</div>
<div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h2 style="color: #24292e; margin-top: 0;">Acknowledgments</h2>
<p style="margin-bottom: 0;">Special thanks to Kijai for the ComfyUI Wan Video Wrapper and tdrussell for the training scripts!</p>
</div>
</div>