# Usage Guide - WAN 2.2 Image-to-Video LoRA Demo ## Quick Start ### 1. Deploying to Hugging Face Spaces To deploy this demo to Hugging Face Spaces: ```bash # Install git-lfs if not already installed git lfs install # Create a new Space on huggingface.co # Then clone your space repository git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME cd YOUR_SPACE_NAME # Copy all files from this demo cp -r * YOUR_SPACE_NAME/ # Commit and push git add . git commit -m "Initial commit: WAN 2.2 Image-to-Video LoRA Demo" git push ``` ### 2. Running Locally ```bash # Create a virtual environment python -m venv venv source venv/bin/activate # On Windows: venv\Scripts\activate # Install dependencies pip install -r requirements.txt # Run the app python app.py ``` The app will be available at `http://localhost:7860` ## Using the Demo ### Basic Usage 1. **Upload Image**: Click the image upload area and select an image file 2. **Enter Prompt**: Type a description of the motion you want (e.g., "A person walking forward, cinematic") 3. **Click Generate**: Wait for the video to be generated (first run will download the model) 4. **View Result**: The generated video will appear in the output area ### Advanced Settings Expand the "Advanced Settings" accordion to access: - **Inference Steps** (20-100): More steps = higher quality but slower generation - 20-30: Fast, lower quality - 50: Balanced (recommended) - 80-100: Slow, highest quality - **Guidance Scale** (1.0-15.0): How closely to follow the prompt - 1.0-3.0: More creative, less faithful to prompt - 6.0: Balanced (recommended) - 10.0-15.0: Very faithful to prompt, less creative - **Use LoRA**: Enable/disable LoRA fine-tuning - **LoRA Type**: - **High-Noise**: Best for dynamic, action-heavy scenes - **Low-Noise**: Best for subtle, smooth motions ## Example Prompts ### Good Prompts - "A cat walking through a garden, sunny day, high quality" - "Waves crashing on a beach, sunset lighting, cinematic" - "A car driving down a highway, fast motion, 4k" - "Smoke rising from a campfire, slow motion" ### Tips for Better Results 1. **Be Specific**: Include details about motion, lighting, and quality 2. **Use Keywords**: "cinematic", "high quality", "4k", "smooth" 3. **Describe Motion**: Clearly state what should move and how 4. **Consider Style**: Add style descriptors like "photorealistic" or "animated" ## Troubleshooting ### Out of Memory Error If you encounter OOM errors: 1. The model requires significant VRAM (16GB+ recommended) 2. On Hugging Face Spaces, ensure you're using at least `gpu-medium` hardware 3. For local runs, try reducing the number of frames or using CPU offloading ### Slow Generation - First generation will be slower (model downloads) - Reduce inference steps for faster results - Ensure GPU is being used (check logs for "Loading model on cuda") ### Model Not Loading If the model fails to load: 1. Check your internet connection (model is ~20GB) 2. Ensure sufficient disk space 3. For Hugging Face Spaces, check your Space's logs ## Customization ### Using Your Own LoRA Files To use your own LoRA weights: 1. Upload LoRA `.safetensors` files to Hugging Face 2. Update the URLs in `app.py`: ```python HIGH_NOISE_LORA_URL = "https://huggingface.co/YOUR_USERNAME/YOUR_REPO/resolve/main/your_lora.safetensors" ``` 3. Uncomment and implement the LoRA loading code in the `generate_video` function ### Changing the Model To use a different model: 1. Update `MODEL_ID` in `app.py` 2. Ensure the model is compatible with `CogVideoXImageToVideoPipeline` 3. Adjust memory optimizations if needed ## Performance Notes - **GPU (A10G/T4)**: ~2-3 minutes per video - **GPU (A100)**: ~1-2 minutes per video - **CPU**: Not recommended (20+ minutes) ## API Access For programmatic access, you can use the Gradio Client: ```python from gradio_client import Client client = Client("YOUR_USERNAME/YOUR_SPACE_NAME") result = client.predict( image="path/to/image.jpg", prompt="A cat walking", api_name="/predict" ) ``` ## Credits - Model: CogVideoX by THUDM - Framework: Hugging Face Diffusers - Interface: Gradio ## License Apache 2.0 - See LICENSE file for details