HuggingFace Spaces Deployment Guide
Enhanced RISC-V RAG System
Quick Deployment Steps
Create HuggingFace Space
- Go to HuggingFace Spaces
- Click "Create new Space"
- Choose Streamlit as SDK
- Set hardware to CPU Basic (2 cores, 16GB RAM)
Upload Files
Upload all files from this directory to your Space:
    app.py                     # Main entry point
    streamlit_epic2_demo.py    # Enhanced RAG demo
    requirements.txt           # Dependencies
    config/                    # Configuration files
    src/                       # Core system
    data/                      # Sample documents
    demo/                      # Demo utilities
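Before uploading, it can help to confirm that nothing from the list above is missing. The following is a minimal sketch, not part of the project: the `missing_entries` helper and the `REQUIRED_ENTRIES` constant are hypothetical names, and the check only tests for presence, not content.

```python
from pathlib import Path

# Entries taken from the upload list above; a trailing "/" marks a directory.
REQUIRED_ENTRIES = [
    "app.py",
    "streamlit_epic2_demo.py",
    "requirements.txt",
    "config/",
    "src/",
    "data/",
    "demo/",
]


def missing_entries(root: str) -> list[str]:
    """Return the required files or directories that are absent under root."""
    base = Path(root)
    missing = []
    for entry in REQUIRED_ENTRIES:
        path = base / entry.rstrip("/")
        present = path.is_dir() if entry.endswith("/") else path.is_file()
        if not present:
            missing.append(entry)
    return missing


if __name__ == "__main__":
    gaps = missing_entries(".")
    print("Ready to upload" if not gaps else f"Missing: {', '.join(gaps)}")
```

Run it from the directory you intend to upload; an empty result means every listed entry was found.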
Set Environment Variables (Optional)
In your Space settings, add the following as a secret:
HF_TOKEN=your_huggingface_token_here
Note: The system runs without HF_TOKEN in a limited demo mode; setting the token enables the full feature set.
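As an illustration of how an app could branch on the token, here is a minimal sketch. The `select_mode` function and the mode names "full" and "demo" are assumptions made for this example; the actual application may detect the token differently.

```python
import os


def select_mode(env=None) -> str:
    """Return "full" when a non-empty HF_TOKEN is present, otherwise "demo"."""
    # Accept an explicit mapping so the logic can be tested without
    # touching the real process environment.
    env = os.environ if env is None else env
    token = (env.get("HF_TOKEN") or "").strip()
    return "full" if token else "demo"
```

A whitespace-only value is treated as unset, which guards against a secret that was saved empty by mistake.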
Build & Deploy
- HuggingFace Spaces will automatically build your app
- Monitor build logs for any issues
- App will be available at:
https://huggingface.co/spaces/your-username/your-space-name
System Capabilities
With HF_TOKEN (Recommended)
- Full advanced RAG capabilities
- Neural reranking with cross-encoder models
- Graph enhancement for document relationships
- Real-time analytics and performance monitoring
- API-based LLM integration (memory efficient)
Without HF_TOKEN (Demo Mode)
- System architecture demonstration
- Performance metrics display
- Technical documentation showcase
- Limited live query functionality
Performance Expectations
- Memory Usage: < 16GB (HF Spaces compatible)
- Startup Time: 30-60 seconds (model loading)
- Query Response: 1-3 seconds per query
- Concurrent Users: Supports multiple simultaneous users
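To check the query-response figure on your own Space, you can time a handful of queries. This is a rough sketch under stated assumptions: `answer` stands for whatever callable runs a query in your app (hypothetical here), and the 3-second budget mirrors the upper bound quoted above. The first query is reported separately because it may include model loading.

```python
import time
from statistics import median


def time_queries(answer, queries):
    """Call answer(query) for each query; return per-query latency in seconds."""
    latencies = []
    for query in queries:
        start = time.perf_counter()
        answer(query)
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies, budget_s=3.0):
    """Summarize latencies, judging only warm queries against the budget."""
    return {
        "first_s": latencies[0],
        "median_s": median(latencies),
        # Skip the first (cold) query when checking the budget.
        "within_budget": all(t <= budget_s for t in latencies[1:]),
    }
```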
Monitoring & Troubleshooting
Common Issues
Build Fails
- Check requirements.txt compatibility
- Ensure all files are uploaded
- Monitor build logs for specific errors
High Memory Usage
- System is optimized for <16GB usage
- Models load efficiently with lazy loading
- Consider upgrading to a larger hardware tier (such as CPU Upgrade) if needed
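Lazy loading, mentioned above, means a model is loaded only when it is first requested and then reused. The sketch below shows the general pattern with the standard library; `expensive_load` is a hypothetical stand-in for a real model load, and this is not necessarily how the project implements it.

```python
from functools import lru_cache

load_calls = []  # records each real load, for illustration only


def expensive_load(name: str) -> dict:
    """Stand-in for a costly model load (hypothetical; no real model here)."""
    load_calls.append(name)
    return {"name": name}


@lru_cache(maxsize=None)
def get_model(name: str) -> dict:
    """Load the named model on first request and reuse it afterwards."""
    return expensive_load(name)
```

Nothing is loaded at import time, so startup memory stays low until a feature actually needs its model. In a Streamlit app, `st.cache_resource` serves a similar purpose across reruns.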
Slow Response Times
- First query may be slower (model loading)
- Subsequent queries should be <3 seconds
- Check HF_TOKEN configuration for API access
Health Check Endpoints
The system provides built-in health monitoring:
- Automatic environment detection
- Configuration validation
- Component status reporting
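The three checks listed above could be combined into a single report along these lines. This is a sketch only: the function names and report keys are hypothetical, the directory names come from the upload list in this guide, and it assumes that Spaces exposes a `SPACE_ID` environment variable to the running app.

```python
import os
from pathlib import Path


def detect_environment(env=None) -> str:
    """Guess where the app is running from environment variables."""
    env = os.environ if env is None else env
    # Assumption: SPACE_ID is set inside a HuggingFace Space.
    return "hf_spaces" if env.get("SPACE_ID") else "local"


def health_report(root=".", env=None) -> dict:
    """Report environment, per-component status, and overall health."""
    env = os.environ if env is None else env
    base = Path(root)
    components = {
        "config_dir": (base / "config").is_dir(),
        "source_dir": (base / "src").is_dir(),
        "sample_data": (base / "data").is_dir(),
        "hf_token": bool((env.get("HF_TOKEN") or "").strip()),
    }
    # HF_TOKEN is optional, so it does not affect overall health.
    required = ("config_dir", "source_dir", "sample_data")
    return {
        "environment": detect_environment(env),
        "components": components,
        "healthy": all(components[name] for name in required),
    }
```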
Tips for Best Performance
- Use HF_TOKEN: Enables full capabilities and better performance
- Monitor Logs: Check for initialization and query processing
- Sample Queries: Use provided RISC-V technical queries for demo
- Configuration: System auto-selects optimal configuration based on environment
Expected Demo Results
With proper setup, your demo will showcase:
- Neural reranking with cross-encoder models
- Graph enhancement for document relationships
- Hybrid search combining semantic and keyword matching
- Real-time analytics with performance metrics
- Professional UI with technical feature focus
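For readers unfamiliar with hybrid search, one common way to combine a semantic ranking with a keyword ranking is reciprocal rank fusion (RRF). The sketch below illustrates that general technique; it is not confirmed to be the fusion method this system uses, and the function name and the constant `k=60` (a conventional default) are choices made for the example.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores the sum of 1 / (k + rank) over the lists it
    appears in, with ranks starting at 1. Higher scores rank first.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by id for a stable result.
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    semantic = ["a", "b", "c"]  # e.g. from embedding similarity
    keyword = ["b", "d", "a"]   # e.g. from a keyword scorer
    print(reciprocal_rank_fusion([semantic, keyword]))
```

Documents that appear high in both lists rise to the top, while documents found by only one retriever are still kept.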
Portfolio Impact
This deployment demonstrates:
- Advanced RAG system implementation
- Modular 6-component architecture
- Neural reranking and graph enhancement techniques
- Modern ML engineering practices
It showcases technical RAG implementation skills, with a focus on advanced features.