Qwen-Image-Lora-Faceseg


Model description

Face Segmentation Model Description

Overview

This is a LoRA fine-tuned face segmentation model based on Qwen-Image-Edit, which is itself built on the Qwen-VL (Qwen Vision-Language) foundation. It is designed to transform facial images into segmentation masks. The base model's multimodal capabilities are adapted through Parameter-Efficient Fine-Tuning (PEFT) using the LoRA (Low-Rank Adaptation) technique.

Model Architecture

  • Base Model: Qwen-Image-Edit (built on Qwen-VL foundation)
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Task: Image-to-Image translation (Face → Segmentation Mask)
  • Input: RGB facial images
  • Output: Binary/grayscale segmentation masks highlighting facial regions
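
Because the output may be a grayscale rather than a strictly binary mask, downstream code may want to threshold it. The following is a minimal sketch in plain Python, assuming the mask has already been read into rows of 0-255 intensities. The function name `binarize_mask` and the default threshold of 128 are illustrative choices, not values from this model card.

```python
def binarize_mask(gray, threshold=128):
    """Return a 0/1 mask from a grayscale mask given as rows of 0-255 ints.

    `threshold` is a hypothetical default; tune it on your own outputs.
    """
    return [[1 if px >= threshold else 0 for px in row] for row in gray]


gray = [
    [0, 30, 200],
    [10, 128, 255],
]
print(binarize_mask(gray))  # [[0, 0, 1], [0, 1, 1]]
```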

Training Configuration

  • Dataset: 20 carefully curated face segmentation samples
  • Training Steps: 900-1000 steps
  • Prompt: "change the image from the face to the face segmentation mask"
  • Precision Options:
    • BF16 precision for high-quality results
    • FP4 quantization for memory-efficient deployment

Key Features

  1. High Precision Segmentation: Accurately identifies and segments facial boundaries with fine detail preservation
  2. Memory Efficient: FP4 quantized version maintains competitive quality while significantly reducing memory footprint
  3. Fast Inference: Optimized for real-time applications with 20 inference steps
  4. Robust Performance: Handles various lighting conditions and facial orientations
  5. Parameter Efficient: Only trains LoRA adapters (~1M parameters) while keeping base model frozen
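
To give a feel for where a figure on the order of 1M trainable parameters can come from, here is a rough sketch of LoRA parameter counting. For a linear layer with `d_in` inputs and `d_out` outputs, LoRA at rank `r` adds two matrices of shapes `(r, d_in)` and `(d_out, r)`. The layer width, layer count, and rank below are hypothetical and are not the configuration used to train this model.

```python
def lora_param_count(d_in, d_out, rank):
    """Trainable parameters LoRA adds to one d_in -> d_out linear layer.

    LoRA learns an update B @ A, with A of shape (rank, d_in)
    and B of shape (d_out, rank).
    """
    return rank * (d_in + d_out)


def total_lora_params(layer_shapes, rank):
    """Sum the LoRA parameters over a list of (d_in, d_out) layer shapes."""
    return sum(lora_param_count(d_in, d_out, rank) for d_in, d_out in layer_shapes)


# Hypothetical example: 16 square projections of width 3072, adapted at rank 8.
shapes = [(3072, 3072)] * 16
print(total_lora_params(shapes, rank=8))  # 786432
```

The count grows linearly in both the rank and the number of adapted layers, so adapter size is easy to budget before training.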

Technical Specifications

  • Inference Steps: 20
  • CFG Scale: 2.5
  • Input Resolution: Configurable (typically 512x512)
  • Model Size: Base model + ~1M LoRA parameters
  • Memory Usage:
    • BF16 version: Higher memory, best quality
    • FP4 version: 75% memory reduction, competitive quality
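
The 75% figure follows from bit widths alone: BF16 stores 16 bits per weight and FP4 stores 4. The sketch below works through that arithmetic. It assumes weight storage only and ignores quantization scale factors and activation memory, so real savings are typically a little smaller. The parameter count is a hypothetical round number, not the size of this model.

```python
BITS_PER_PARAM = {"bf16": 16, "fp4": 4}


def weight_memory_gib(num_params, precision):
    """Approximate weight storage in GiB (weights only, no scales or activations)."""
    return num_params * BITS_PER_PARAM[precision] / 8 / 2**30


def reduction_vs_bf16(precision):
    """Fraction of BF16 weight memory saved by storing weights in `precision`."""
    return 1 - BITS_PER_PARAM[precision] / BITS_PER_PARAM["bf16"]


n = 2**30  # hypothetical parameter count, chosen for round numbers
print(weight_memory_gib(n, "bf16"))  # 2.0
print(weight_memory_gib(n, "fp4"))   # 0.5
print(reduction_vs_bf16("fp4"))      # 0.75
```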

Use Cases

  • Identity Verification: KYC (Know Your Customer) applications
  • Privacy Protection: Face anonymization while preserving facial structure
  • Medical Applications: Facial analysis and dermatological assessments
  • AR/VR Applications: Real-time face tracking and segmentation
  • Content Creation: Automated face masking for video editing
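
As one simple reading of the privacy and content-creation use cases, a binary mask can be used to blank out the face region of an image. The sketch below is plain Python and assumes the image is given as rows of `(r, g, b)` tuples and the mask as rows of 0/1 values of the same shape. The helper name `anonymize` and the fill colour are illustrative only.

```python
def anonymize(image, mask, fill=(0, 0, 0)):
    """Replace every pixel where mask == 1 with `fill`.

    `image` is rows of (r, g, b) tuples; `mask` is rows of 0/1 with the same shape.
    """
    if len(image) != len(mask) or any(len(r) != len(m) for r, m in zip(image, mask)):
        raise ValueError("image and mask must have the same shape")
    return [
        [fill if m else px for px, m in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


image = [[(10, 20, 30), (40, 50, 60)]]
mask = [[0, 1]]
print(anonymize(image, mask))  # [[(10, 20, 30), (0, 0, 0)]]
```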

Performance Highlights

  • Accuracy: Significantly improved boundary detection compared to base model
  • Detail Preservation: Maintains fine facial features in segmentation masks
  • Consistency: Stable segmentation quality across different input conditions
  • Efficiency: FP4 quantization achieves 4x memory savings with minimal quality loss
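
This card does not report quantitative metrics. One common way to put a number on claims such as boundary accuracy is intersection over union (IoU) between a predicted mask and a reference mask. The sketch below is a minimal, plain-Python version that assumes both masks are rows of 0/1 values with the same shape. The name `mask_iou` is illustrative.

```python
def mask_iou(pred, target):
    """Intersection over union of two binary masks given as rows of 0/1 values."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly, so define their IoU as 1.0.
    return inter / union if union else 1.0


pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
print(mask_iou(pred, target))  # 0.5
```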

Deployment Options

  • High-Quality Mode: BF16 precision for maximum accuracy
  • Efficient Mode: FP4 quantization for resource-constrained environments
  • Real-time Applications: Optimized inference pipeline for low-latency requirements

This model represents a practical solution for face segmentation tasks, offering an excellent balance between accuracy, efficiency, and deployability across various hardware configurations.

Example:

Control image: input_image.jpg

Image edited with the base Qwen-Image-Edit model using the prompt `change the face to face segmentation mask`:

result_base_model.jpg

After LoRA fine-tuning, with the same prompt:

result_lora_model.jpg

Code

The LoRA fine-tuning code for Qwen-Image-Edit is available at https://github.com/tsiendragon/qwen-image-finetune
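
For inference, the following is a hedged sketch that uses the `diffusers` library rather than the repository linked above. It has not been verified against this adapter. It assumes that the adapter files in `TsienDragon/qwen-image-edit-lora-face-segmentation` are in a layout that `load_lora_weights` accepts, and that the "CFG Scale" listed in the Technical Specifications corresponds to the pipeline's `true_cfg_scale` argument. The function name `segment_face` is illustrative.

```python
# Settings taken from the Technical Specifications section of this card.
INFERENCE_CONFIG = {
    "prompt": "change the face to face segmentation mask",
    "num_inference_steps": 20,
    "true_cfg_scale": 2.5,   # assumption: the card's "CFG Scale" maps to this argument
    "negative_prompt": " ",  # a negative prompt is needed for true CFG to take effect
}

BASE_MODEL = "Qwen/Qwen-Image-Edit"
LORA_REPO = "TsienDragon/qwen-image-edit-lora-face-segmentation"


def segment_face(image_path, output_path, device="cuda"):
    """Run the base pipeline with the LoRA adapter and save the resulting mask."""
    # Heavy imports stay local so the config can be inspected without a GPU.
    import torch
    from diffusers import QwenImageEditPipeline
    from PIL import Image

    pipe = QwenImageEditPipeline.from_pretrained(BASE_MODEL, torch_dtype=torch.bfloat16)
    # Assumption: the repository's adapter file can be loaded directly;
    # pass weight_name="..." if the repository holds several weight files.
    pipe.load_lora_weights(LORA_REPO)
    pipe.to(device)

    image = Image.open(image_path).convert("RGB")
    result = pipe(image=image, **INFERENCE_CONFIG)
    result.images[0].save(output_path)
```

A call would look like `segment_face("input_image.jpg", "mask.png")`. If the adapter does not load this way, the fine-tuning repository's own inference code is the place to look.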

Download model

Download the model weights from the Files & versions tab of the TsienDragon/qwen-image-edit-lora-face-segmentation repository.

