# HuggingFace Spaces Deployment Guide
## Enhanced RISC-V RAG System
### Quick Deployment Steps
1. **Create HuggingFace Space**
- Go to [HuggingFace Spaces](https://huggingface.co/spaces)
- Click "Create new Space"
- Choose **Streamlit** as SDK
- Set hardware to **CPU Basic** (2 cores, 16GB RAM)
2. **Upload Files**
Upload all files from this directory to your space:
```
app.py # Main entry point
streamlit_epic2_demo.py # Enhanced RAG demo
requirements.txt # Dependencies
config/ # Configuration files
src/ # Core system
data/ # Sample documents
demo/ # Demo utilities
```
3. **Set Environment Variables** (Optional)
In your Space settings, under **Variables and secrets**, add the token as a secret so it is not publicly visible:
```
HF_TOKEN=your_huggingface_token_here
```
**Note**: The system runs without `HF_TOKEN`, but only in demo mode with limited live query functionality. Setting the token enables the full feature set.
4. **Build & Deploy**
- HuggingFace Spaces will automatically build your app
- Monitor build logs for any issues
- App will be available at: `https://huggingface.co/spaces/your-username/your-space-name`
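Before uploading, it can help to confirm that every entry from the file list in step 2 is present locally. The following is a minimal sketch, not part of the project: the helper name `find_missing_entries` is hypothetical, and it assumes the script is run from the directory that will be uploaded.

```python
from pathlib import Path
from typing import List

# Entries taken from the upload list in step 2; a trailing "/" marks a directory.
REQUIRED_ENTRIES = [
    "app.py",
    "streamlit_epic2_demo.py",
    "requirements.txt",
    "config/",
    "src/",
    "data/",
    "demo/",
]


def find_missing_entries(root: str = ".") -> List[str]:
    """Return the required files or directories that are absent under root.

    Hypothetical helper for illustration; it is not part of the project.
    """
    base = Path(root)
    missing = []
    for entry in REQUIRED_ENTRIES:
        path = base / entry.rstrip("/")
        # Directories must exist as directories, files as regular files.
        exists = path.is_dir() if entry.endswith("/") else path.is_file()
        if not exists:
            missing.append(entry)
    return missing


if __name__ == "__main__":
    missing = find_missing_entries()
    if missing:
        print("Missing before upload:", ", ".join(missing))
    else:
        print("All required files and directories are present.")
```

An empty result means the directory matches the list above; it does not check file contents or `requirements.txt` compatibility.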
### System Capabilities
#### **With HF_TOKEN (Recommended)**
- ✅ Full advanced RAG capabilities
- ✅ Neural reranking with cross-encoder models
- ✅ Graph enhancement for document relationships
- ✅ Real-time analytics and performance monitoring
- ✅ API-based LLM integration (memory efficient)
#### **Without HF_TOKEN (Demo Mode)**
- ✅ System architecture demonstration
- ✅ Performance metrics display
- ✅ Technical documentation showcase
- ℹ️ Limited live query functionality
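The split between the two modes can be pictured as a single check on the `HF_TOKEN` environment variable at startup. This is an illustrative sketch only: the function `select_mode` and the mode names `"full"` and `"demo"` are assumptions, not the project's actual API.

```python
import os
from typing import Mapping, Optional


def select_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "full" when a non-empty HF_TOKEN is set, otherwise "demo".

    Hypothetical helper; the real system may detect its mode differently.
    """
    env = os.environ if env is None else env
    # Treat a missing, empty, or whitespace-only token as "not configured".
    token = (env.get("HF_TOKEN") or "").strip()
    return "full" if token else "demo"


if __name__ == "__main__":
    print(f"Starting in {select_mode()} mode")
```

Passing the environment as an argument keeps the check easy to test without touching the real process environment.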
### Performance Expectations
- **Memory Usage**: < 16GB (HF Spaces compatible)
- **Startup Time**: 30-60 seconds (model loading)
- **Query Response**: 1-3 seconds per query
- **Concurrent Users**: Supports multiple simultaneous users
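To compare a deployed Space against the query-response figure above, a small timing harness can be enough. The sketch below assumes a callable `run_query` that accepts a query string; that name and the 3-second default target are placeholders chosen to match the numbers in this guide, not part of the project.

```python
import time
from statistics import median
from typing import Callable, Dict, Iterable


def time_queries(
    run_query: Callable[[str], object],
    queries: Iterable[str],
    target_seconds: float = 3.0,
) -> Dict[str, object]:
    """Time each query and summarize latency against a target.

    `run_query` is a hypothetical stand-in for whatever function
    answers a query in the deployed app.
    """
    durations = []
    for query in queries:
        start = time.perf_counter()
        run_query(query)
        durations.append(time.perf_counter() - start)
    if not durations:
        raise ValueError("at least one query is required")
    return {
        "median_s": median(durations),
        "max_s": max(durations),
        "within_target": max(durations) <= target_seconds,
    }
```

Because the first query may include model loading, it is reasonable to run one warm-up query before measuring.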
### Monitoring & Troubleshooting
#### **Common Issues**
1. **Build Fails**
- Check `requirements.txt` compatibility
- Ensure all files are uploaded
- Monitor build logs for specific errors
2. **High Memory Usage**
- System is optimized for <16GB usage
- Models load efficiently with lazy loading
- Consider upgrading the Space hardware (for example, to the **CPU Upgrade** tier) if needed
3. **Slow Response Times**
- First query may be slower (model loading)
- Subsequent queries should be <3 seconds
- Check HF_TOKEN configuration for API access
#### **Health Check Endpoints**
The system provides built-in health monitoring:
- Automatic environment detection
- Configuration validation
- Component status reporting
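As a rough illustration of what such a health report might contain, the sketch below checks the three items listed above. It is not the system's actual health check: `health_report` and its keys are hypothetical, and it assumes the Space runtime exposes a `SPACE_ID` environment variable, which is used here to distinguish a Space from a local run.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional


def health_report(
    env: Optional[Mapping[str, str]] = None, root: str = "."
) -> Dict[str, object]:
    """Build a small status dictionary (hypothetical structure)."""
    env = os.environ if env is None else env
    base = Path(root)
    return {
        # Environment detection: assumes SPACE_ID is set only on Spaces.
        "environment": "huggingface_spaces" if env.get("SPACE_ID") else "local",
        # Configuration validation: only checks that the directory exists.
        "config_dir_present": (base / "config").is_dir(),
        # Component status: presence of the entry point and of the token.
        "entry_point_present": (base / "app.py").is_file(),
        "hf_token_set": bool((env.get("HF_TOKEN") or "").strip()),
    }


if __name__ == "__main__":
    for key, value in health_report().items():
        print(f"{key}: {value}")
```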
### Tips for Best Performance
1. **Use HF_TOKEN**: Enables full capabilities and better performance
2. **Monitor Logs**: Check for initialization and query processing
3. **Sample Queries**: Use provided RISC-V technical queries for demo
4. **Configuration**: System auto-selects optimal configuration based on environment
### Expected Demo Results
With proper setup, your demo will showcase:
- **Neural reranking** with cross-encoder models
- **Graph enhancement** for document relationships
- **Hybrid search** combining semantic and keyword matching
- **Real-time analytics** with performance metrics
- **Professional UI** with technical feature focus
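One common way to combine semantic and keyword results is reciprocal rank fusion (RRF). Whether this system uses RRF is not stated here, so the sketch below only illustrates the general idea of hybrid search; the function name, the document IDs, and the constant `k = 60` are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]], k: int = 60
) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) per list it appears in,
    with rank starting at 1; higher total score ranks first.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by document ID for stability.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    semantic = ["d1", "d2", "d3"]  # hypothetical dense-retrieval ranking
    keyword = ["d3", "d1", "d4"]   # hypothetical keyword-search ranking
    print(reciprocal_rank_fusion([semantic, keyword]))
    # prints ['d1', 'd3', 'd2', 'd4']
```

Documents that appear near the top of both lists (`d1`, `d3`) outrank those found by only one retriever, which is the behavior a hybrid search aims for.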
### Portfolio Impact
This deployment demonstrates:
- Advanced RAG system implementation
- Modular 6-component architecture
- Neural reranking and graph enhancement techniques
- Modern ML engineering practices
It showcases technical RAG implementation skills, with a focus on advanced features.