# Deployment Guide for Hugging Face Spaces

This guide provides step-by-step instructions for deploying the Multi-Modal Knowledge Distillation application to Hugging Face Spaces.
## Pre-Deployment Checklist

✅ **Project Structure Complete**
- All required files and directories are present
- Python syntax validation passed
- Frontend files are properly structured

✅ **Configuration Validated**
- `requirements.txt` contains all necessary dependencies
- `spaces_config.yaml` is properly configured
- API endpoints are implemented and accessible

✅ **Documentation Complete**
- Comprehensive README.md with usage instructions
- API documentation included
- Troubleshooting guide provided
## Deployment Steps

### Step 1: Create a Hugging Face Space

1. **Go to Hugging Face Spaces**
   - Visit [https://huggingface.co/spaces](https://huggingface.co/spaces)
   - Click "Create new Space"

2. **Configure Space Settings**
   - **Space name**: `multi-modal-knowledge-distillation` (or your preferred name)
   - **License**: MIT
   - **SDK**: Gradio
   - **Hardware**: T4 small (minimum) or T4 medium (recommended)
   - **Visibility**: Public or Private (your choice)

3. **Initialize Repository**
   - Choose "Initialize with README"
   - Click "Create Space"
### Step 2: Upload Project Files

Upload the following files to your Space repository:

#### Core Application Files

```
app.py                # Main FastAPI application
requirements.txt      # Python dependencies
spaces_config.yaml    # Hugging Face Spaces configuration
README.md             # Project documentation
.gitignore            # Git ignore rules
```
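Note that Hugging Face Spaces also reads build settings from YAML front matter at the top of `README.md`, so it is worth checking that this header agrees with `spaces_config.yaml`. A minimal header might look like this (the values shown are illustrative):

```yaml
---
title: Multi-Modal Knowledge Distillation
emoji: 🚀
sdk: gradio
app_file: app.py
license: mit
---
```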
#### Source Code

```
src/
├── __init__.py          # Package initialization
├── model_loader.py      # Model loading utilities
├── distillation.py      # Knowledge distillation engine
└── utils.py             # Utility functions
```
#### Frontend Files

```
templates/
└── index.html           # Main web interface
static/
├── css/
│   └── style.css        # Application styles
└── js/
    └── main.js          # Frontend JavaScript
```
#### Directory Structure (created automatically)

```
uploads/    # Uploaded model files
models/     # Trained models
temp/       # Temporary files
logs/       # Application logs
```
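The application is expected to create these directories on startup. A minimal sketch of how that might be done (the `ensure_dirs` helper is an assumption for illustration, not the project's actual code):

```python
from pathlib import Path

# Directory names taken from the layout above.
REQUIRED_DIRS = ["uploads", "models", "temp", "logs"]

def ensure_dirs(base="."):
    """Create the runtime directories if they do not exist yet."""
    for name in REQUIRED_DIRS:
        Path(base, name).mkdir(parents=True, exist_ok=True)
```

Calling a helper like this early in `app.py` keeps the Space from failing on a missing directory after a restart, since Space storage is ephemeral.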
### Step 3: Configure Hardware

1. **Go to Space Settings**
   - Click the "Settings" tab in your Space
   - Navigate to the "Hardware" section

2. **Select Hardware**
   - **Minimum**: T4 small (16GB RAM, 1x T4 GPU)
   - **Recommended**: T4 medium (32GB RAM, 1x T4 GPU)
   - **For large models**: A10G small or larger

3. **Apply Changes**
   - Click "Update hardware"
   - Your Space will restart with the new hardware
### Step 4: Monitor Deployment

1. **Build Process**
   - Watch the "Logs" tab for build progress
   - The build typically takes 5-10 minutes
   - Dependencies are installed automatically

2. **Common Build Issues**
   - **PyTorch installation**: may take several minutes
   - **CUDA compatibility**: ensure your PyTorch version supports the selected hardware
   - **Memory issues**: upgrade hardware if needed

3. **Successful Deployment**
   - Space status shows "Running"
   - The application is accessible via the Space URL
   - The health check endpoint responds correctly
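A health check endpoint typically returns a small status document. A sketch of what such a handler might compute (the payload fields here are illustrative assumptions, not the app's actual schema):

```python
import shutil
import time

START_TIME = time.time()

def health_payload(disk_path="."):
    """Assemble an illustrative health-check body: status, uptime, free disk."""
    usage = shutil.disk_usage(disk_path)
    return {
        "status": "ok",
        "uptime_seconds": round(time.time() - START_TIME, 1),
        "disk_free_bytes": usage.free,
    }
```

In a FastAPI app, a dictionary like this would be returned from the `GET /health` route handler and serialized to JSON automatically.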
## Configuration Options

### Environment Variables

You can set these in your Space settings:

```bash
# Server configuration
PORT=7860                  # Default port (usually not needed)
HOST=0.0.0.0               # Default host

# Resource limits
MAX_FILE_SIZE=5368709120   # 5 GB maximum file size
MAX_MODELS=10              # Maximum number of teacher models
MAX_TRAINING_TIME=3600     # 1-hour training limit

# GPU configuration
CUDA_VISIBLE_DEVICES=0     # GPU device selection
```
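Inside the application, these variables would be read with sensible fallbacks. A minimal sketch (the `env_int` helper is an assumption for illustration, not the project's actual code):

```python
import os

def env_int(name, default):
    """Read an integer setting from the environment, falling back to a default."""
    raw = os.environ.get(name)
    return int(raw) if raw is not None else default

MAX_FILE_SIZE = env_int("MAX_FILE_SIZE", 5 * 1024**3)   # 5 GB = 5368709120 bytes
MAX_MODELS = env_int("MAX_MODELS", 10)
MAX_TRAINING_TIME = env_int("MAX_TRAINING_TIME", 3600)  # seconds
```

Keeping the defaults in code means the Space runs correctly even when no environment variables are configured.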
### Hardware Recommendations

| Use Case | Hardware | RAM | GPU | Cost |
|----------|----------|-----|-----|------|
| Demo/Testing | CPU Basic | 16GB | None | Free |
| Small Models | T4 small | 16GB | T4 | Low |
| Production | T4 medium | 32GB | T4 | Medium |
| Large Models | A10G small | 24GB | A10G | High |
## Testing Your Deployment

### 1. Health Check

```bash
# Direct Space URLs follow the pattern <owner>-<space-name>.hf.space
curl https://username-your-space-name.hf.space/health
```

### 2. Web Interface

- Visit your Space URL
- Test the file upload functionality
- Verify that model selection works
- Check the training configuration options

### 3. API Endpoints

Test the key endpoints:

- `GET /` - Main interface
- `POST /upload` - File upload
- `GET /models` - List models
- `WebSocket /ws/{session_id}` - Real-time updates
## Troubleshooting

### Build Failures

**PyTorch Installation Issues:**

```bash
# Check that the CUDA version is compatible and update requirements.txt if needed.
# Note: +cuXXX builds are served from PyTorch's own package index, e.g.
# --extra-index-url https://download.pytorch.org/whl/cu118
torch==2.1.0+cu118
```

**Memory Issues During Build:**
- Upgrade to a higher hardware tier
- Pin lighter dependency versions
- Remove unnecessary packages

### Runtime Issues

**Out of Memory:**
- Increase the hardware tier
- Reduce the training batch size
- Implement model sharding

**Model Loading Failures:**
- Check file format compatibility
- Verify that the Hugging Face model exists
- Ensure sufficient disk space

**WebSocket Connection Issues:**
- Check browser compatibility
- Verify firewall settings
- Try refreshing the page
### Performance Issues

**Slow Training:**
- Upgrade to GPU hardware
- Increase the batch size
- Use mixed precision training

**High Memory Usage:**
- Monitor system resources
- Implement automatic cleanup
- Reduce the model cache size
## Monitoring and Maintenance

### Logs and Monitoring
- Check Space logs regularly
- Monitor resource usage
- Set up alerts for failures

### Updates and Maintenance
- Keep dependencies updated
- Monitor for security issues
- Clean up temporary files regularly

### Scaling Considerations
- Monitor user load
- Consider multiple Space instances
- Implement load balancing if needed
## Security Best Practices

### File Upload Security
- Validate all uploaded files
- Enforce size limits
- Scan for malicious content
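A basic validation gate along these lines covers the first two points (the allowed extensions and the helper name are assumptions; adapt them to the formats your loader actually supports):

```python
from pathlib import Path

# Assumed set of model formats the loader accepts; adjust to your application.
ALLOWED_EXTS = {".bin", ".safetensors", ".pt", ".ckpt"}

def validate_upload(filename, size_bytes, max_bytes=5 * 1024**3):
    """Return (ok, reason) for an uploaded model file."""
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTS:
        return False, f"unsupported file type: {ext or '(none)'}"
    if size_bytes > max_bytes:
        return False, "file exceeds size limit"
    return True, "ok"
```

Rejecting files before they are written to disk also protects the Space's limited storage.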
### API Security
- Implement rate limiting
- Validate all inputs
- Use HTTPS only

### Resource Protection
- Monitor resource usage
- Enforce timeouts
- Run automatic cleanup procedures
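Timeout enforcement for training jobs can be as simple as a wall-clock budget checked each step. A minimal sketch (the class name and its use are assumptions, not the project's actual code):

```python
import time

class TrainingTimeout:
    """Wall-clock budget for a long-running job; check expired() inside the loop."""

    def __init__(self, limit_seconds=3600):
        self.limit = limit_seconds
        self.start = time.monotonic()

    def expired(self):
        return time.monotonic() - self.start > self.limit
```

The training loop would check `expired()` once per step and stop gracefully, saving a checkpoint, once the budget runs out.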
## Performance Optimization

### Model Loading
- Cache frequently used models
- Load models lazily
- Use model compression
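Caching and lazy loading can be combined in a small LRU cache: a model is loaded only on first use and the least recently used one is evicted when the cache is full. A sketch under those assumptions (the class and the loader callback are illustrative, not the project's actual code):

```python
from collections import OrderedDict

class ModelCache:
    """Tiny LRU cache: load on first use, evict the least recently used entry."""

    def __init__(self, max_items=3):
        self.max_items = max_items
        self._store = OrderedDict()

    def get(self, name, loader):
        if name in self._store:
            self._store.move_to_end(name)  # mark as most recently used
            return self._store[name]
        model = loader(name)               # lazy load on first request
        self._store[name] = model
        if len(self._store) > self.max_items:
            self._store.popitem(last=False)  # evict least recently used
        return model
```

Bounding `max_items` to what fits in GPU/CPU memory directly addresses the "reduce the model cache size" advice in the troubleshooting section.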
### Training Optimization
- Use mixed precision
- Enable gradient checkpointing
- Tune batch sizes

### Frontend Performance
- Minimize the JavaScript bundle
- Optimize CSS delivery
- Serve static assets from a CDN
## Success Metrics

Your deployment is successful when:

✅ **Functionality**
- All API endpoints respond correctly
- File uploads complete without errors
- Training runs to completion
- Model downloads work properly

✅ **Performance**
- Pages load in under 3 seconds
- Training starts within 30 seconds
- Real-time updates work smoothly
- Resource usage stays within limits

✅ **User Experience**
- The interface is responsive on all devices
- Error messages are clear and helpful
- Progress tracking is accurate
- Documentation is accessible
## Support and Resources

- **Hugging Face Spaces Documentation**: [https://huggingface.co/docs/hub/spaces](https://huggingface.co/docs/hub/spaces)
- **FastAPI Documentation**: [https://fastapi.tiangolo.com/](https://fastapi.tiangolo.com/)
- **PyTorch Documentation**: [https://pytorch.org/docs/](https://pytorch.org/docs/)

---

**Your Multi-Modal Knowledge Distillation application is now ready for production deployment!**