# Deployment Guide for Hugging Face Spaces
This guide provides step-by-step instructions for deploying the Multi-Modal Knowledge Distillation application to Hugging Face Spaces.
## πŸ“‹ Pre-Deployment Checklist
βœ… **Project Structure Complete**
- All required files and directories are present
- Python syntax validation passed
- Frontend files are properly structured
βœ… **Configuration Validated**
- `requirements.txt` contains all necessary dependencies
- `spaces_config.yaml` is properly configured
- API endpoints are implemented and accessible
βœ… **Documentation Complete**
- Comprehensive README.md with usage instructions
- API documentation included
- Troubleshooting guide provided
## πŸš€ Deployment Steps
### Step 1: Create Hugging Face Space
1. **Go to Hugging Face Spaces**
- Visit [https://huggingface.co/spaces](https://huggingface.co/spaces)
- Click "Create new Space"
2. **Configure Space Settings**
- **Space name**: `multi-modal-knowledge-distillation` (or your preferred name)
- **License**: MIT
   - **SDK**: Docker (the app is a custom FastAPI server; the Gradio SDK expects a Gradio app object)
- **Hardware**: T4 small (minimum) or T4 medium (recommended)
- **Visibility**: Public or Private (your choice)
3. **Initialize Repository**
- Choose "Initialize with README"
- Click "Create Space"
### Step 2: Upload Project Files
Upload all the following files to your Space repository:
#### Core Application Files
```
app.py # Main FastAPI application
requirements.txt # Python dependencies
spaces_config.yaml # Hugging Face Spaces configuration
README.md # Project documentation
.gitignore # Git ignore rules
```
#### Source Code
```
src/
β”œβ”€β”€ __init__.py # Package initialization
β”œβ”€β”€ model_loader.py # Model loading utilities
β”œβ”€β”€ distillation.py # Knowledge distillation engine
└── utils.py # Utility functions
```
#### Frontend Files
```
templates/
└── index.html # Main web interface
static/
β”œβ”€β”€ css/
β”‚ └── style.css # Application styles
└── js/
└── main.js # Frontend JavaScript
```
#### Directory Structure (will be created automatically)
```
uploads/ # Uploaded model files
models/ # Trained models
temp/ # Temporary files
logs/ # Application logs
```
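Since these directories are created automatically, `app.py` presumably runs a small bootstrap at startup; a stdlib-only sketch (the directory names match the list above):

```python
from pathlib import Path

RUNTIME_DIRS = ("uploads", "models", "temp", "logs")

def ensure_runtime_dirs(base: str = ".") -> list:
    """Create the writable directories the app needs, skipping ones that exist."""
    created = []
    for name in RUNTIME_DIRS:
        path = Path(base) / name
        path.mkdir(parents=True, exist_ok=True)  # idempotent across restarts
        created.append(path)
    return created
```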
### Step 3: Configure Hardware
1. **Go to Space Settings**
- Click on "Settings" tab in your Space
- Navigate to "Hardware" section
2. **Select Hardware**
- **Minimum**: T4 small (16GB RAM, 1x T4 GPU)
- **Recommended**: T4 medium (32GB RAM, 1x T4 GPU)
- **For large models**: A10G small or larger
3. **Apply Changes**
- Click "Update hardware"
- Your Space will restart with new hardware
### Step 4: Monitor Deployment
1. **Build Process**
- Watch the "Logs" tab for build progress
- Build typically takes 5-10 minutes
- Dependencies will be installed automatically
2. **Common Build Issues**
- **PyTorch installation**: May take several minutes
- **CUDA compatibility**: Ensure PyTorch version supports your hardware
- **Memory issues**: Upgrade hardware if needed
3. **Successful Deployment**
- Space status shows "Running"
- Application is accessible via the Space URL
- Health check endpoint responds correctly
## πŸ”§ Configuration Options
### Environment Variables
You can set these in your Space settings:
```bash
# Server Configuration
PORT=7860 # Default port (usually not needed)
HOST=0.0.0.0 # Default host
# Resource Limits
MAX_FILE_SIZE=5368709120 # 5GB max file size
MAX_MODELS=10 # Maximum teacher models
MAX_TRAINING_TIME=3600 # 1 hour training limit
# GPU Configuration
CUDA_VISIBLE_DEVICES=0 # GPU device selection
```
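On the application side, these variables are typically read with safe fallbacks so the app still boots if a setting is missing or malformed; a sketch (the variable names match the block above, the `env_int` helper is hypothetical):

```python
import os

def env_int(name: str, default: int) -> int:
    """Read an integer setting from the environment, falling back to a default."""
    try:
        return int(os.environ.get(name, default))
    except ValueError:
        return default  # malformed value: ignore it rather than crash at boot

MAX_FILE_SIZE = env_int("MAX_FILE_SIZE", 5 * 1024**3)   # 5 GB
MAX_MODELS = env_int("MAX_MODELS", 10)
MAX_TRAINING_TIME = env_int("MAX_TRAINING_TIME", 3600)  # seconds
```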
### Hardware Recommendations
| Use Case | Hardware | RAM | GPU | Cost |
|----------|----------|-----|-----|------|
| Demo/Testing | CPU Basic | 16GB | None | Free |
| Small Models | T4 small | 16GB | T4 | Low |
| Production | T4 medium | 32GB | T4 | Medium |
| Large Models | A10G small | 24GB | A10G | High |
## πŸ§ͺ Testing Your Deployment
### 1. Health Check
```bash
curl https://your-username-your-space-name.hf.space/health
```
### 2. Web Interface
- Visit your Space URL
- Test file upload functionality
- Verify model selection works
- Check training configuration options
### 3. API Endpoints
Test key endpoints:
- `GET /` - Main interface
- `POST /upload` - File upload
- `GET /models` - List models
- `WebSocket /ws/{session_id}` - Real-time updates
## πŸ› Troubleshooting
### Build Failures
**PyTorch Installation Issues:**
```text
# requirements.txt — pin a CUDA build that matches the Space's GPU.
# +cu118 wheels live on PyTorch's package index, not PyPI, so include:
--extra-index-url https://download.pytorch.org/whl/cu118
torch==2.1.0+cu118
```
**Memory Issues During Build:**
- Upgrade to higher hardware tier
- Reduce dependency versions
- Remove unnecessary packages
### Runtime Issues
**Out of Memory:**
- Increase hardware tier
- Reduce batch size in training
- Implement model sharding
**Model Loading Failures:**
- Check file format compatibility
- Verify Hugging Face model exists
- Ensure sufficient disk space
**WebSocket Connection Issues:**
- Check browser compatibility
- Verify firewall settings
- Try refreshing the page
### Performance Issues
**Slow Training:**
- Upgrade to GPU hardware
- Increase batch size
- Use mixed precision training
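Mixed precision usually amounts to wrapping the forward pass in `torch.autocast`; a minimal sketch of one training step (standard PyTorch AMP API; on CUDA with float16 you would additionally wrap the backward pass with `torch.cuda.amp.GradScaler`, which bfloat16 does not need):

```python
import torch
from torch import nn

def train_step_amp(model, batch, target, optimizer, device_type="cpu"):
    """One optimization step with autocast mixed precision."""
    optimizer.zero_grad()
    # Eligible ops run in bfloat16, cutting memory use and time.
    with torch.autocast(device_type=device_type, dtype=torch.bfloat16):
        loss = nn.functional.mse_loss(model(batch), target)
    loss.backward()   # gradients are accumulated in float32
    optimizer.step()
    return loss.item()
```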
**High Memory Usage:**
- Monitor system resources
- Implement automatic cleanup
- Reduce model cache size
## πŸ“Š Monitoring and Maintenance
### Logs and Monitoring
- Check Space logs regularly
- Monitor resource usage
- Set up alerts for failures
### Updates and Maintenance
- Keep dependencies updated
- Monitor for security issues
- Regular cleanup of temporary files
### Scaling Considerations
- Monitor user load
- Consider multiple Space instances
- Implement load balancing if needed
## πŸ”’ Security Best Practices
### File Upload Security
- Validate all uploaded files
- Implement size limits
- Scan for malicious content
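The first two checks can be a few lines run before anything touches disk; a stdlib sketch (the allowed extensions are an assumption, and the 5 GB cap mirrors `MAX_FILE_SIZE` above):

```python
import os

ALLOWED_EXTENSIONS = {".pt", ".pth", ".bin", ".safetensors"}  # assumed formats
MAX_FILE_SIZE = 5 * 1024**3  # 5 GB, matching MAX_FILE_SIZE

def validate_upload(filename: str, size_bytes: int):
    """Return (ok, reason) for a candidate upload before saving it."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        return False, f"unsupported file type: {ext or 'none'}"
    if size_bytes > MAX_FILE_SIZE:
        return False, "file exceeds the 5 GB limit"
    return True, "ok"
```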
### API Security
- Implement rate limiting
- Validate all inputs
- Use HTTPS only
### Resource Protection
- Monitor resource usage
- Implement timeouts
- Automatic cleanup procedures
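An automatic-cleanup pass over `temp/` might look like this (stdlib sketch; the one-hour age threshold is an assumption):

```python
import time
from pathlib import Path

def cleanup_old_files(directory: str, max_age_seconds: int = 3600) -> int:
    """Delete files older than max_age_seconds; return how many were removed."""
    removed = 0
    now = time.time()
    for path in Path(directory).glob("*"):
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed += 1
    return removed
```

Scheduling this on a timer (or at the end of each training session) keeps disk usage bounded between restarts.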
## πŸ“ˆ Performance Optimization
### Model Loading
- Cache frequently used models
- Implement lazy loading
- Use model compression
### Training Optimization
- Use mixed precision
- Implement gradient checkpointing
- Optimize batch sizes
### Frontend Performance
- Minimize JavaScript bundle
- Optimize CSS delivery
- Use CDN for static assets
## 🎯 Success Metrics
Your deployment is successful when:
βœ… **Functionality**
- All API endpoints respond correctly
- File uploads work without errors
- Training completes successfully
- Model downloads work properly
βœ… **Performance**
- Page loads in < 3 seconds
- Training starts within 30 seconds
- Real-time updates work smoothly
- Resource usage is within limits
βœ… **User Experience**
- Interface is responsive on all devices
- Error messages are clear and helpful
- Progress tracking works accurately
- Documentation is accessible
## πŸ“ž Support and Resources
- **Hugging Face Spaces Documentation**: [https://huggingface.co/docs/hub/spaces](https://huggingface.co/docs/hub/spaces)
- **FastAPI Documentation**: [https://fastapi.tiangolo.com/](https://fastapi.tiangolo.com/)
- **PyTorch Documentation**: [https://pytorch.org/docs/](https://pytorch.org/docs/)
---
**Your Multi-Modal Knowledge Distillation application is now ready for production deployment! πŸŽ‰**