# HuggingFace Spaces Deployment Guide
## Enhanced RISC-V RAG System

### 🚀 Quick Deployment Steps

1. **Create HuggingFace Space**
   - Go to [HuggingFace Spaces](https://huggingface.co/spaces)
   - Click "Create new Space"
   - Choose **Streamlit** as SDK
   - Set hardware to **CPU Basic** (2 cores, 16GB RAM)

2. **Upload Files**
   Upload all files from this directory to your space:
   ```
   app.py                    # Main entry point
   streamlit_epic2_demo.py   # Enhanced RAG demo
   requirements.txt          # Dependencies
   config/                   # Configuration files
   src/                      # Core system
   data/                     # Sample documents
   demo/                     # Demo utilities
   ```

3. **Set Environment Variables** (Optional)
   In your Space settings, add:
   ```
   HF_TOKEN=your_huggingface_token_here
   ```
   
   **Note**: The system works without HF_TOKEN but provides enhanced capabilities with it.

4. **Build & Deploy**
   - HuggingFace Spaces will automatically build your app
   - Monitor build logs for any issues
   - App will be available at: `https://huggingface.co/spaces/your-username/your-space-name`
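
Before uploading, it can help to confirm that the local directory contains everything listed in step 2. The following is a minimal sketch rather than part of the project: the helper name `missing_entries` is hypothetical, and the expected names are copied from the listing above.

```python
from pathlib import Path

# Entries copied from the upload listing in step 2.
EXPECTED_ENTRIES = [
    "app.py",
    "streamlit_epic2_demo.py",
    "requirements.txt",
    "config",
    "src",
    "data",
    "demo",
]


def missing_entries(root, expected=EXPECTED_ENTRIES):
    """Return the expected files or directories that are absent under root."""
    root_path = Path(root)
    return [name for name in expected if not (root_path / name).exists()]


if __name__ == "__main__":
    missing = missing_entries(".")
    if missing:
        print("Missing before upload:", ", ".join(missing))
    else:
        print("All expected entries are present.")
```

Running this from the project directory before uploading gives an early warning for the "Ensure all files are uploaded" class of build failures.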

### 🔧 System Capabilities

#### **With HF_TOKEN (Recommended)**
- ✅ Full advanced RAG capabilities
- ✅ Neural reranking with cross-encoder models
- ✅ Graph enhancement for document relationships
- ✅ Real-time analytics and performance monitoring
- ✅ API-based LLM integration (memory efficient)

#### **Without HF_TOKEN (Demo Mode)**
- ✅ System architecture demonstration
- ✅ Performance metrics display
- ✅ Technical documentation showcase
- ℹ️ Limited live query functionality

### 📊 Performance Expectations

- **Memory Usage**: < 16GB (HF Spaces compatible)
- **Startup Time**: 30-60 seconds (model loading)
- **Query Response**: 1-3 seconds per query
- **Concurrent Users**: Supports multiple simultaneous users
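
To compare a running Space against the query-response figure above, a small timing wrapper is enough. The sketch below is an assumption-laden illustration: `timed_call` and `within_budget` are hypothetical helpers, and the lambda stands in for whatever function actually answers a query in the app.

```python
import time

# Upper end of the 1-3 second query-response range quoted above.
QUERY_BUDGET_SECONDS = 3.0


def timed_call(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def within_budget(elapsed, budget=QUERY_BUDGET_SECONDS):
    """Return True when the measured time does not exceed the budget."""
    return elapsed <= budget


if __name__ == "__main__":
    # Stand-in for a real query function.
    answer, seconds = timed_call(lambda q: q.upper(), "what is risc-v?")
    print(f"{seconds:.4f}s, within budget: {within_budget(seconds)}")
```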

### 🔍 Monitoring & Troubleshooting

#### **Common Issues**

1. **Build Fails**
   - Check `requirements.txt` compatibility
   - Ensure all files are uploaded
   - Monitor build logs for specific errors

2. **High Memory Usage**
   - System is optimized for <16GB usage
   - Models load efficiently with lazy loading
   - Consider upgrading to a larger hardware tier (for example, CPU Upgrade) if needed

3. **Slow Response Times**
   - First query may be slower (model loading)
   - Subsequent queries should be <3 seconds
   - Check HF_TOKEN configuration for API access
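
The "first query is slower" behaviour follows from lazy loading: the model is loaded on first use and reused afterwards. Here is a minimal sketch of that pattern using only the standard library. The names `get_model` and `answer` and the placeholder model are hypothetical; in a Streamlit app, the `st.cache_resource` decorator plays a similar role.

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def get_model():
    """Load the (placeholder) model on first use and reuse it afterwards."""
    # Placeholder for an expensive step such as loading model weights.
    return {"name": "placeholder-model"}


def answer(query):
    """Answer a query, triggering the model load only on the first call."""
    model = get_model()
    return f"{model['name']} received: {query}"


if __name__ == "__main__":
    answer("first query")   # pays the loading cost
    answer("second query")  # reuses the cached model
    print(get_model.cache_info())
```

Because the load happens inside the first request rather than at import time, startup stays quick and memory is only used for components that are actually exercised.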

#### **Health Check Endpoints**

The system provides built-in health monitoring:
- Automatic environment detection
- Configuration validation
- Component status reporting
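
The three checks above could be combined into one status report along the following lines. This is a sketch under stated assumptions, not the system's actual health code: `health_report` and `detect_environment` are hypothetical names, the component names are made up, and the environment check assumes that the `SPACE_ID` variable is set when the app runs inside a HuggingFace Space.

```python
import os


def detect_environment(env):
    """Guess where the app is running.

    Assumption: SPACE_ID is present inside a HuggingFace Space.
    """
    return "hf_spaces" if env.get("SPACE_ID") else "local"


def health_report(env=None, components=None):
    """Build a status dictionary from the environment and component flags.

    components maps a (hypothetical) component name to True or False.
    """
    env = os.environ if env is None else env
    components = components or {}
    return {
        "environment": detect_environment(env),
        "hf_token_configured": bool(env.get("HF_TOKEN", "").strip()),
        "components": {
            name: ("ok" if healthy else "unavailable")
            for name, healthy in components.items()
        },
    }


if __name__ == "__main__":
    print(health_report(components={"retriever": True, "reranker": False}))
```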

### 💡 Tips for Best Performance

1. **Use HF_TOKEN**: Enables full capabilities and better performance
2. **Monitor Logs**: Check for initialization and query processing
3. **Sample Queries**: Use provided RISC-V technical queries for demo
4. **Configuration**: System auto-selects optimal configuration based on environment

### 📈 Expected Demo Results

With proper setup, your demo will showcase:
- **Neural reranking** with cross-encoder models
- **Graph enhancement** for document relationships
- **Hybrid search** combining semantic and keyword matching
- **Real-time analytics** with performance metrics
- **Professional UI** with technical feature focus

### 🎯 Portfolio Impact

This deployment demonstrates:
- Advanced RAG system implementation
- Modular 6-component architecture
- Neural reranking and graph enhancement techniques
- Modern ML engineering practices

It showcases technical RAG implementation skills, with a focus on advanced features.