title: Multi-Modal Knowledge Distillation
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: docker
app_file: app.py
pinned: false
license: mit
short_description: Multi-Modal Knowledge Distillation for AI models
tags:
  - machine-learning
  - knowledge-distillation
  - multi-modal
  - pytorch
  - transformers
  - computer-vision
  - nlp
suggested_hardware: t4-small
suggested_storage: medium

Multi-Modal Knowledge Distillation

Create new AI models through knowledge distillation from multiple pre-trained models across different modalities (text, vision, audio, and multimodal).

Features

  • Multi-Modal Support: Distill knowledge from text, vision, audio, and multimodal models
  • Multiple Input Sources: Upload local files, use Hugging Face repositories, or direct URLs
  • Real-Time Monitoring: Live progress tracking with WebSocket updates
  • Flexible Configuration: Customizable student model architecture and training parameters
  • Production Ready: Built with FastAPI, comprehensive error handling, and security measures
  • Responsive UI: Modern, mobile-friendly web interface
  • Multiple Formats: Support for PyTorch (.pt, .pth, .bin), Safetensors, and Hugging Face models

🆕 New Advanced Features

🔧 System Optimization

  • Memory Management: Advanced memory management for 16GB RAM systems
  • CPU Optimization: Optimized for CPU-only training environments
  • Chunk Loading: Progressive loading for large models
  • Performance Monitoring: Real-time system performance tracking

🔑 Token Management

  • Secure Storage: Encrypted storage of Hugging Face tokens
  • Multiple Token Types: Support for read, write, and fine-grained tokens
  • Auto Validation: Automatic token validation and recommendations
  • Usage Tracking: Monitor token usage and access patterns

🏥 Medical AI Support

  • Medical Datasets: Specialized medical datasets (ROCOv2, CT-RATE, UMIE)
  • DICOM Processing: Advanced DICOM file processing and visualization
  • Medical Preprocessing: Specialized preprocessing for medical images
  • Modality Support: CT, MRI, X-ray, and ultrasound image processing

🌐 Enhanced Model Support

  • Google Models: Direct access to Google's open-source models
  • Streaming Datasets: Memory-efficient dataset streaming
  • Progressive Training: Incremental model training capabilities
  • Arabic Documentation: Full Arabic language support

How to Use

  1. Select Teacher Models: Choose 1-10 pre-trained models as teachers

    • Upload local model files (.pt, .pth, .bin, .safetensors)
    • Enter Hugging Face repository names (format: organization/model-name)
    • Provide direct download URLs to model files
    • For private/gated models: Add your HF token in Space settings
  2. Configure Training: Set up training parameters

    • Student model architecture (hidden size, layers)
    • Training parameters (steps, learning rate, temperature)
    • Distillation strategy (ensemble, weighted, sequential)
  3. Monitor Training: Watch real-time progress

    • Live progress bar and metrics
    • Training console output
    • Download trained model when complete
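
The temperature and the "ensemble" and "weighted" strategies from step 2 are commonly realized as a softened KL-divergence loss against an averaged teacher distribution. The following is a minimal, framework-free sketch of that general technique, not this Space's actual implementation. The function names and the T² scaling are assumptions taken from the standard distillation formulation.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_targets(teacher_logits, temperature, weights=None):
    """Average the softened distributions of several teachers.

    weights=None gives a uniform ("ensemble") average; explicit weights
    give a "weighted" average. Both mappings are illustrative assumptions.
    """
    if weights is None:
        weights = [1.0] * len(teacher_logits)
    norm = sum(weights)
    probs = [softmax(t, temperature) for t in teacher_logits]
    n_classes = len(probs[0])
    return [
        sum(w * p[i] for w, p in zip(weights, probs)) / norm
        for i in range(n_classes)
    ]


def distillation_loss(student_logits, teacher_logits, temperature=2.0, weights=None):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    target = ensemble_targets(teacher_logits, temperature, weights)
    student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(target, student) if t > 0)
    return kl * temperature ** 2
```

A higher temperature flattens the teacher distributions, so the student also learns from the relative probabilities of the non-top classes. The loss is zero when the student's logits reproduce the averaged teacher distribution exactly.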

Setup for Private/Gated Models

To access private or gated Hugging Face models:

  1. Get your Hugging Face token from the Access Tokens page in your Hugging Face account settings.

  2. Add token to Hugging Face Space:

    • Go to your Space settings
    • Add a new secret: HF_TOKEN = your_token_here
    • Restart your Space
  3. Alternative: Enter token in the interface

    • Use the "Hugging Face Token" field in the web interface
    • This is temporary and only for the current session
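
The two options above amount to a lookup order for the token. The sketch below shows one way to express it. `resolve_hf_token` is a hypothetical helper, and the precedence (session token first, then the `HF_TOKEN` secret) is an assumption, not something this README confirms.

```python
import os


def resolve_hf_token(session_token=None, env=None):
    """Return the token to use, or None for anonymous access.

    Assumed precedence (not confirmed by the source): a token typed into
    the interface for this session wins over the HF_TOKEN Space secret.
    """
    env = os.environ if env is None else env
    for candidate in (session_token, env.get("HF_TOKEN")):
        if candidate and candidate.strip():
            return candidate.strip()
    return None
```

Returning `None` when neither source is set keeps public models usable without a token. Private or gated repositories would still fail at download time.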

Supported Formats

  • PyTorch: .pt, .pth, .bin files
  • Safetensors: .safetensors files
  • Hugging Face: Any public repository, or a private/gated repository when a token is provided
  • Direct URLs: Publicly accessible model files
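
To make the distinction between these source types concrete, here is a small sketch that classifies a teacher reference by its shape. `classify_source` and the loose repository pattern are illustrative assumptions. The application's real validation may differ.

```python
import re
from urllib.parse import urlparse

MODEL_EXTENSIONS = (".pt", ".pth", ".bin", ".safetensors")
# Loose pattern for "organization/model-name"; an assumption, not Hub validation.
REPO_ID = re.compile(r"^[\w.-]+/[\w.-]+$")


def classify_source(source):
    """Classify a teacher-model reference as 'url', 'file', or 'hf_repo'."""
    source = source.strip()
    if urlparse(source).scheme in ("http", "https"):
        return "url"
    # Checked before the repo pattern so "models/teacher.pt" counts as a file.
    if source.lower().endswith(MODEL_EXTENSIONS):
        return "file"
    if REPO_ID.match(source):
        return "hf_repo"
    raise ValueError(f"Unrecognized model source: {source!r}")
```

For example, `classify_source("google/siglip-base-patch16-224")` falls through the URL and extension checks and matches the repository pattern.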

Supported Modalities

  • Text: BERT, GPT, RoBERTa, T5, DistilBERT, etc.
  • Vision: ViT, ResNet, EfficientNet, SigLIP, etc.
  • Multimodal: CLIP, BLIP, ALBEF, etc.
  • Audio: Wav2Vec2, Whisper, etc.
  • Specialized: Background removal (RMBG), Medical imaging (MedSigLIP), etc.

Troubleshooting Common Models

SigLIP Models (e.g., google/siglip-base-patch16-224)

  • These models may require "Trust Remote Code" to be enabled
  • Use the "Test Model" button to verify compatibility before training

Custom Architecture Models

  • Some models use custom code that requires "Trust Remote Code"
  • Always test models before starting training
  • Check model documentation on Hugging Face for requirements

Gemma Models (e.g., google/gemma-2b, google/gemma-3-27b-it)

  • Requires: Hugging Face token AND access permission
  • Steps:
    1. Request access at the model page on Hugging Face
    2. Add your HF token in Space settings or interface
    3. Enable "Trust Remote Code" if needed
  • Note: Gemma 3 models require a recent release of the transformers library

Technical Details

  • Backend: FastAPI with async support
  • ML Framework: PyTorch with Transformers
  • Frontend: Responsive HTML/CSS/JavaScript
  • Real-time Updates: WebSocket communication
  • Security: File validation, input sanitization, resource limits

🚀 Quick Start (Optimized)

Option 1: Standard Run

python app.py

Option 2: Optimized Run (Recommended)

python run_optimized.py

The optimized runner provides:

  • ✅ Automatic CPU optimization
  • ✅ Memory management setup
  • ✅ System requirements check
  • ✅ Performance recommendations
  • ✅ Enhanced logging

Option 3: Docker (Coming Soon)

docker run -p 8000:8000 ai-knowledge-distillation

🔧 Advanced Configuration

Environment Variables

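# CPU threading (set to the number of physical cores)
export OMP_NUM_THREADS=8
export MKL_NUM_THREADS=8

# CUDA allocator tuning (only takes effect when a GPU is used)
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128

# Cache directories
export HF_DATASETS_CACHE=./cache/datasets
export TRANSFORMERS_CACHE=./cache/transformers  # deprecated in newer transformers releases in favor of HF_HOME

# Token management
export HF_TOKEN=your_token_here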
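
The same variables can be set from Python, provided it happens before `torch` or `transformers` is imported, since threading libraries typically read them at load time. This is a sketch only: `configure_environment` is a hypothetical helper, and defaulting the thread count to `os.cpu_count()` is an assumption.

```python
import os


def configure_environment(num_threads=None, cache_root="./cache", env=None):
    """Set threading and cache variables without overriding existing values."""
    env = os.environ if env is None else env
    threads = str(num_threads or os.cpu_count() or 1)
    defaults = {
        "OMP_NUM_THREADS": threads,
        "MKL_NUM_THREADS": threads,
        "HF_DATASETS_CACHE": os.path.join(cache_root, "datasets"),
        "TRANSFORMERS_CACHE": os.path.join(cache_root, "transformers"),
    }
    for name, value in defaults.items():
        # setdefault leaves values exported in the shell untouched
        env.setdefault(name, value)
    return {name: env[name] for name in defaults}
```

Using `setdefault` means a value exported in the shell or set as a Space variable always takes priority over these defaults.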

System Requirements

Minimum Requirements

  • Python 3.9+
  • 4GB RAM
  • 10GB free disk space
  • CPU with 2+ cores

Recommended Requirements

  • Python 3.10+
  • 16GB RAM
  • 50GB free disk space
  • CPU with 8+ cores
  • Intel CPU with MKL support

For Medical AI

  • 16GB+ RAM
  • 100GB+ free disk space
  • Fast SSD storage
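
A quick way to check a machine against the minimum requirements is sketched below using only the standard library. It is not the check performed by `run_optimized.py`, and the function names are hypothetical. RAM is left out because the standard library has no portable way to read it.

```python
import os
import shutil
import sys

# Thresholds copied from the "Minimum Requirements" list.
MIN_PYTHON = (3, 9)
MIN_CORES = 2
MIN_FREE_DISK_GB = 10


def find_problems(python_version, cores, free_disk_gb):
    """Return a list of human-readable problems; empty means all checks pass."""
    problems = []
    if tuple(python_version[:2]) < MIN_PYTHON:
        problems.append("Python 3.9+ is required")
    if cores < MIN_CORES:
        problems.append("a CPU with 2+ cores is required")
    if free_disk_gb < MIN_FREE_DISK_GB:
        problems.append("10GB of free disk space is required")
    return problems


def check_this_machine(path="."):
    """Measure the current machine and compare it to the minimums."""
    free_gb = shutil.disk_usage(path).free / 1024 ** 3
    return find_problems(sys.version_info, os.cpu_count() or 1, free_gb)
```

Calling `check_this_machine()` returns an empty list when the Python version, core count, and free disk space all meet the minimums.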

📊 Performance Tips

  1. Memory Optimization:

    • Use streaming datasets for large medical datasets
    • Enable chunk loading for models >2GB
    • Monitor memory usage in real-time
  2. CPU Optimization:

    • Install Intel Extension for PyTorch
    • Use optimized BLAS libraries (MKL, OpenBLAS)
    • Set appropriate thread counts
  3. Storage Optimization:

    • Use SSD for cache directories
    • Regular cleanup of old datasets
    • Compress model checkpoints
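
The "regular cleanup" tip can be automated for directories of plain files, such as downloaded datasets or old checkpoints. The sketch below is one possible approach. `remove_stale_files` and the 30-day default are assumptions for illustration.

```python
import os
import time


def remove_stale_files(cache_dir, max_age_days=30, now=None):
    """Delete files under cache_dir not modified within max_age_days.

    Returns the list of removed paths. The 30-day default is arbitrary.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.getmtime(path) < cutoff:
                os.remove(path)
                removed.append(path)
    return removed
```

For the cache managed by the Hugging Face Hub client, which uses symlinks internally, the `huggingface-cli delete-cache` command is likely a safer choice than deleting files directly.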

Built with ❤️ for the AI community