# Deployment Guide for Hugging Face Spaces
This guide provides step-by-step instructions for deploying the Multi-Modal Knowledge Distillation application to Hugging Face Spaces.
## Pre-Deployment Checklist

### ✅ Project Structure Complete
- All required files and directories are present
- Python syntax validation passed
- Frontend files are properly structured

### ✅ Configuration Validated
- `requirements.txt` contains all necessary dependencies
- `spaces_config.yaml` is properly configured
- API endpoints are implemented and accessible

### ✅ Documentation Complete
- Comprehensive README.md with usage instructions
- API documentation included
- Troubleshooting guide provided
## Deployment Steps

### Step 1: Create a Hugging Face Space

1. **Go to Hugging Face Spaces**
   - Visit https://huggingface.co/spaces
   - Click "Create new Space"
2. **Configure the Space settings**
   - Space name: `multi-modal-knowledge-distillation` (or your preferred name)
   - License: MIT
   - SDK: Gradio
   - Hardware: T4 small (minimum) or T4 medium (recommended)
   - Visibility: Public or Private (your choice)
3. **Initialize the repository**
   - Choose "Initialize with README"
   - Click "Create Space"
### Step 2: Upload Project Files

Upload all of the following files to your Space repository.

**Core application files:**

```
app.py               # Main FastAPI application
requirements.txt     # Python dependencies
spaces_config.yaml   # Hugging Face Spaces configuration
README.md            # Project documentation
.gitignore           # Git ignore rules
```

**Source code:**

```
src/
├── __init__.py        # Package initialization
├── model_loader.py    # Model loading utilities
├── distillation.py    # Knowledge distillation engine
└── utils.py           # Utility functions
```

**Frontend files:**

```
templates/
└── index.html         # Main web interface
static/
├── css/
│   └── style.css      # Application styles
└── js/
    └── main.js        # Frontend JavaScript
```

**Runtime directories (created automatically):**

```
uploads/   # Uploaded model files
models/    # Trained models
temp/      # Temporary files
logs/      # Application logs
```
### Step 3: Configure Hardware

1. **Open the Space settings**
   - Click the "Settings" tab in your Space
   - Navigate to the "Hardware" section
2. **Select hardware**
   - Minimum: T4 small (16GB RAM, 1x T4 GPU)
   - Recommended: T4 medium (32GB RAM, 1x T4 GPU)
   - For large models: A10G small or larger
3. **Apply the changes**
   - Click "Update hardware"
   - Your Space will restart with the new hardware
### Step 4: Monitor Deployment

**Build process:**
- Watch the "Logs" tab for build progress
- A build typically takes 5-10 minutes
- Dependencies are installed automatically

**Common build issues:**
- PyTorch installation: may take several minutes
- CUDA compatibility: ensure the PyTorch version supports your hardware
- Memory issues: upgrade the hardware tier if needed

**Successful deployment:**
- The Space status shows "Running"
- The application is accessible via the Space URL
- The health check endpoint responds correctly
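The health check can be as simple as an endpoint returning basic status information. A minimal sketch of what such a payload might contain (the field names here are illustrative, not the actual `/health` schema in `app.py`):

```python
import shutil
import time

APP_START = time.time()  # recorded once at process start


def health_payload(base_path: str = ".") -> dict:
    """Build a simple health-check response (illustrative field names)."""
    usage = shutil.disk_usage(base_path)
    return {
        "status": "ok",
        "uptime_seconds": round(time.time() - APP_START, 1),
        "disk_free_gb": round(usage.free / 1024**3, 2),
    }
```

The FastAPI route would simply return this dictionary, which FastAPI serializes to JSON automatically.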
## Configuration Options

### Environment Variables

You can set these in your Space settings:

```
# Server configuration
PORT=7860                  # Default port (usually not needed)
HOST=0.0.0.0               # Default host

# Resource limits
MAX_FILE_SIZE=5368709120   # 5GB max file size
MAX_MODELS=10              # Maximum teacher models
MAX_TRAINING_TIME=3600     # 1 hour training limit

# GPU configuration
CUDA_VISIBLE_DEVICES=0     # GPU device selection
```
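A sketch of how the application might read these limits, falling back to the documented defaults when a variable is unset or malformed (the helper name `int_env` is an illustration, not a function from the codebase):

```python
import os


def int_env(name: str, default: int) -> int:
    """Read an integer setting from the environment, falling back to default."""
    try:
        return int(os.environ.get(name, default))
    except ValueError:
        return default  # ignore malformed values rather than crashing at startup


MAX_FILE_SIZE = int_env("MAX_FILE_SIZE", 5 * 1024**3)   # 5GB
MAX_MODELS = int_env("MAX_MODELS", 10)
MAX_TRAINING_TIME = int_env("MAX_TRAINING_TIME", 3600)  # seconds
```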
### Hardware Recommendations

| Use Case | Hardware | RAM | GPU | Cost |
|---|---|---|---|---|
| Demo/Testing | CPU Basic | 16GB | None | Free |
| Small Models | T4 small | 16GB | T4 | Low |
| Production | T4 medium | 32GB | T4 | Medium |
| Large Models | A10G small | 24GB | A10G | High |
## Testing Your Deployment

### 1. Health Check

```
curl https://your-space-name-username.hf.space/health
```

### 2. Web Interface
- Visit your Space URL
- Test the file upload functionality
- Verify that model selection works
- Check the training configuration options
### 3. API Endpoints

Test the key endpoints:

- `GET /` - Main interface
- `POST /upload` - File upload
- `GET /models` - List models
- `WebSocket /ws/{session_id}` - Real-time updates
## Troubleshooting

### Build Failures

**PyTorch installation issues:**

```
# Check that the CUDA version is compatible and update requirements.txt if needed
torch==2.1.0+cu118
```

**Memory issues during the build:**
- Upgrade to a higher hardware tier
- Reduce dependency versions
- Remove unnecessary packages
### Runtime Issues

**Out of memory:**
- Increase the hardware tier
- Reduce the training batch size
- Implement model sharding
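One common way to reduce the batch size automatically is to halve it and retry whenever an out-of-memory error occurs. A sketch of the pattern, using the builtin `MemoryError` as a stand-in for what would be `torch.cuda.OutOfMemoryError` in the real training loop:

```python
def train_with_backoff(train_step, batch_size: int, min_batch: int = 1):
    """Retry a training step with a halved batch size on out-of-memory.

    `train_step` is any callable taking a batch size; in the real app it
    would wrap the distillation loop, and the exception caught would be
    torch.cuda.OutOfMemoryError rather than the builtin MemoryError.
    """
    while batch_size >= min_batch:
        try:
            return train_step(batch_size)
        except MemoryError:
            batch_size //= 2  # halve and retry with a smaller batch
    raise RuntimeError("Out of memory even at the minimum batch size")
```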
**Model loading failures:**
- Check file format compatibility
- Verify that the Hugging Face model exists
- Ensure sufficient disk space

**WebSocket connection issues:**
- Check browser compatibility
- Verify firewall settings
- Try refreshing the page
### Performance Issues

**Slow training:**
- Upgrade to GPU hardware
- Increase the batch size
- Use mixed precision training

**High memory usage:**
- Monitor system resources
- Implement automatic cleanup
- Reduce the model cache size
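Reducing the model cache size usually means bounding how many models stay in memory at once and evicting the least recently used one. A minimal sketch of such a bounded cache (illustrative; the real application may cache models differently):

```python
from collections import OrderedDict


class ModelCache:
    """Keep at most `capacity` models in memory, evicting least recently used."""

    def __init__(self, capacity: int = 3):
        self.capacity = capacity
        self._cache = OrderedDict()

    def get(self, name, loader):
        """Return the cached model, loading it with `loader(name)` on a miss."""
        if name in self._cache:
            self._cache.move_to_end(name)    # mark as recently used
            return self._cache[name]
        model = loader(name)                 # load on miss
        self._cache[name] = model
        if len(self._cache) > self.capacity:
            self._cache.popitem(last=False)  # evict the LRU entry
        return model
```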
## Monitoring and Maintenance

### Logs and Monitoring
- Check the Space logs regularly
- Monitor resource usage
- Set up alerts for failures

### Updates and Maintenance
- Keep dependencies updated
- Monitor for security issues
- Clean up temporary files regularly
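Regular cleanup can be a small age-based sweep over directories such as `temp/` and `uploads/`. A sketch (the one-hour age limit and the directory layout are assumptions, not values from the codebase):

```python
import os
import time


def clean_old_files(directory: str, max_age_seconds: int = 3600) -> int:
    """Delete files older than max_age_seconds; return the number removed."""
    removed = 0
    now = time.time()
    for entry in os.scandir(directory):
        if entry.is_file() and now - entry.stat().st_mtime > max_age_seconds:
            os.remove(entry.path)
            removed += 1
    return removed
```

This could be called periodically from a background task, once per directory that accumulates transient files.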
### Scaling Considerations
- Monitor user load
- Consider multiple Space instances
- Implement load balancing if needed
## Security Best Practices

### File Upload Security
- Validate all uploaded files
- Enforce size limits
- Scan for malicious content
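Size and type validation can happen before any file is written to disk. A sketch, where the allowed extension set is an assumption rather than the application's actual whitelist:

```python
import os
from typing import Optional

# Assumed set of acceptable model formats; adjust to match the real app.
ALLOWED_EXTENSIONS = {".pt", ".pth", ".bin", ".safetensors"}
MAX_FILE_SIZE = 5 * 1024**3  # 5GB, matching the documented limit


def validate_upload(filename: str, size_bytes: int) -> Optional[str]:
    """Return an error message, or None if the upload looks acceptable."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        return f"Unsupported file type: {ext or '(none)'}"
    if size_bytes > MAX_FILE_SIZE:
        return "File exceeds the 5GB size limit"
    return None
```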
### API Security
- Implement rate limiting
- Validate all inputs
- Use HTTPS only
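Rate limiting can be approximated with a per-client token bucket. A minimal sketch (a production Space would more likely rely on middleware or a dedicated library, and would keep one bucket per client IP or session):

```python
import time


class RateLimiter:
    """Simple token-bucket rate limiter (illustrative)."""

    def __init__(self, rate: float, burst: int):
        self.rate = rate              # tokens added per second
        self.burst = burst            # bucket capacity
        self.tokens = float(burst)    # start with a full bucket
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Consume one token if available; return False when over the limit."""
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```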
### Resource Protection
- Monitor resource usage
- Enforce timeouts
- Run automatic cleanup procedures
## Performance Optimization

### Model Loading
- Cache frequently used models
- Implement lazy loading
- Use model compression

### Training Optimization
- Use mixed precision
- Enable gradient checkpointing
- Tune batch sizes

### Frontend Performance
- Minimize the JavaScript bundle
- Optimize CSS delivery
- Use a CDN for static assets
## Success Metrics

Your deployment is successful when:

### ✅ Functionality
- All API endpoints respond correctly
- File uploads work without errors
- Training completes successfully
- Model downloads work properly

### ✅ Performance
- Pages load in under 3 seconds
- Training starts within 30 seconds
- Real-time updates work smoothly
- Resource usage stays within limits

### ✅ User Experience
- The interface is responsive on all devices
- Error messages are clear and helpful
- Progress tracking works accurately
- Documentation is accessible
## Support and Resources

- Hugging Face Spaces documentation: https://huggingface.co/docs/hub/spaces
- FastAPI documentation: https://fastapi.tiangolo.com/
- PyTorch documentation: https://pytorch.org/docs/

Your Multi-Modal Knowledge Distillation application is now ready for production deployment!