---
title: CosmicCat AI Assistant
emoji: 🐱
colorFrom: purple
colorTo: blue
sdk: streamlit
sdk_version: "1.24.0"
app_file: app.py
pinned: false
---

# CosmicCat AI Assistant 🐱

Your personal AI-powered life coaching assistant with a cosmic twist.

## Features

- Personalized life coaching conversations with a space-cat theme
- Redis-based conversation memory
- Multiple LLM provider support (Ollama, Hugging Face, OpenAI)
- Dynamic model selection
- Remote Ollama integration via ngrok
- Automatic fallback between providers
- Cosmic Cascade mode for enhanced responses

## How to Use

1. Select a user from the sidebar
2. Configure your Ollama connection (if using remote Ollama)
3. Choose your preferred model
4. Start chatting with your CosmicCat AI Assistant!

## Requirements

All dependencies are specified in requirements.txt. The app includes:
- Streamlit UI
- FastAPI backend (for future expansion)
- Redis connection for persistent memory
- Multiple LLM integrations

## Environment Variables

Configure these in your Hugging Face Space secrets or local .env file:

- OLLAMA_HOST: Your Ollama server URL (default: ngrok URL)
- LOCAL_MODEL_NAME: Default model name (default: mistral)
- HF_TOKEN: Hugging Face API token (for Hugging Face models)
- HF_API_ENDPOINT_URL: Hugging Face inference API endpoint
- USE_FALLBACK: Whether to use fallback providers (true/false)

Note: Redis configuration is now hardcoded for reliability.
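
For reference, here is a minimal sketch of reading these variables in Python. The actual parsing lives in utils/config.py; the defaults shown below are illustrative assumptions, not the app's confirmed values:

```python
import os

# Illustrative config loading; mirrors the variables documented above.
OLLAMA_HOST = os.getenv("OLLAMA_HOST", "http://localhost:11434")
LOCAL_MODEL_NAME = os.getenv("LOCAL_MODEL_NAME", "mistral")
HF_TOKEN = os.getenv("HF_TOKEN")                # no default: required for HF models
HF_API_ENDPOINT_URL = os.getenv("HF_API_ENDPOINT_URL")
USE_FALLBACK = os.getenv("USE_FALLBACK", "true").lower() == "true"
```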

## Provider Details

### Ollama (Primary Local Provider)

Setup:
1. Install Ollama: https://ollama.com/download
2. Pull a model: `ollama pull mistral`
3. Start the server: `ollama serve`
4. Configure ngrok: `ngrok http 11434`
5. Set OLLAMA_HOST to your ngrok URL
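
To confirm the server (local or via ngrok) is reachable, you can query Ollama's /api/tags endpoint, which lists installed models. A minimal sketch, independent of the bundled test_ollama_connection.py:

```python
import os

import requests

# Ask the Ollama server which models it has pulled.
host = os.getenv("OLLAMA_HOST", "http://localhost:11434")
resp = requests.get(f"{host}/api/tags", timeout=10)
resp.raise_for_status()
print("Available models:", [m["name"] for m in resp.json().get("models", [])])
```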

Advantages:
- No cost for inference
- Full control over models
- Fast response times
- Privacy: all processing stays local

### Hugging Face Inference API (Fallback)

Current Endpoint: https://zxzbfrlg3ssrk7d9.us-east-1.aws.endpoints.huggingface.cloud

Important Scaling Behavior:
- ⚠️ Scale-to-Zero: Endpoint automatically scales to zero after 15 minutes of inactivity
- ⏱️ Cold Start: Takes approximately 4 minutes to initialize when first requested
- 🔄 Automatic Wake-up: Sending any request will automatically start the endpoint
- 💰 Cost: $0.536/hour while running (not billed when scaled to zero)
- 📍 Location: AWS us-east-1 (Intel Sapphire Rapids, 16vCPUs, 32GB RAM)

Handling 503 Errors:
When using the Hugging Face fallback, you may encounter 503 errors initially. This indicates the endpoint is initializing. Simply retry your request after 30-60 seconds, or wait for the initialization to complete (typically 4 minutes).
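
Programmatic callers can absorb the cold start with a simple retry loop. A minimal sketch, assuming the standard Hugging Face Inference payload shape ({"inputs": ...}); the wait times mirror the guidance above:

```python
import os
import time

import requests

ENDPOINT = os.environ["HF_API_ENDPOINT_URL"]
HEADERS = {"Authorization": f"Bearer {os.environ['HF_TOKEN']}"}

def query_with_retry(prompt: str, max_wait: float = 300):
    """POST to the endpoint, retrying on 503 until the cold start completes."""
    deadline = time.monotonic() + max_wait
    while True:
        resp = requests.post(ENDPOINT, headers=HEADERS, json={"inputs": prompt}, timeout=60)
        if resp.status_code != 503:      # anything but "still initializing"
            resp.raise_for_status()
            return resp.json()
        if time.monotonic() > deadline:
            raise TimeoutError("Endpoint did not warm up in time")
        time.sleep(30)                   # retry every 30-60 seconds
```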

Model: OpenAI GPT OSS 20B (Uncensored variant)

### OpenAI (Alternative Fallback)

Configure with OPENAI_API_KEY environment variable.

## Switching Between Providers

### For Local Development (Windows/Ollama)

1. Install Ollama:

   ```bash
   # Download from https://ollama.com/download/OllamaSetup.exe
   ```

2. Pull and run models:

   ```bash
   ollama pull mistral
   ollama pull llama3
   ollama serve
   ```

3. Start the ngrok tunnel:

   ```bash
   ngrok http 11434
   ```

4. Update the environment variables:

   ```bash
   OLLAMA_HOST=https://your-ngrok-url.ngrok-free.app
   LOCAL_MODEL_NAME=mistral
   USE_FALLBACK=false
   ```

### For Production Deployment

The application automatically handles provider fallback (see the sketch below):

- Primary: Ollama (via ngrok)
- Secondary: Hugging Face Inference API
- Tertiary: OpenAI (if configured)
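
In code, the chain amounts to trying each provider in order. A minimal sketch; the provider callables are hypothetical stand-ins for the real implementations in core/llm.py:

```python
from typing import Callable, Sequence

def generate_with_fallback(prompt: str, providers: Sequence[Callable[[str], str]]) -> str:
    """Try each provider in order, falling through to the next on any failure."""
    errors = []
    for provider in providers:
        try:
            return provider(prompt)
        except Exception as err:   # connection refused, 503, missing API key, ...
            errors.append(f"{getattr(provider, '__name__', 'provider')}: {err}")
    raise RuntimeError("All providers failed: " + "; ".join(errors))

# Hypothetical usage, ordered primary -> tertiary:
# reply = generate_with_fallback(prompt, [ask_ollama, ask_hf_endpoint, ask_openai])
```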

## Architecture

This application consists of:

- Streamlit frontend (app.py)
- Core LLM abstraction (core/llm.py)
- Memory management (core/memory.py)
- Configuration management (utils/config.py)
- API endpoints (in the api/ directory, for future expansion)

Built with Python, Streamlit, FastAPI, and Redis.

## Troubleshooting Common Issues

### 503 Errors with Hugging Face Fallback

- Wait about 4 minutes for cold-start initialization
- Retry the request after the endpoint warms up

### Ollama Connection Issues

- Verify `ollama serve` is running locally
- Check the ngrok tunnel status
- Confirm the ngrok URL matches OLLAMA_HOST
- Test with test_ollama_connection.py

### Redis Connection Problems

- The Redis configuration is now hardcoded for maximum reliability
- If issues persist, check network connectivity to Redis Cloud

### Model Not Found

- Pull the required model: `ollama pull <model-name>`
- Check available models: `ollama list`

### Diagnostic Scripts

- Run `python test_ollama_connection.py` to verify Ollama connectivity.
- Run `python diagnose_ollama.py` for detailed connection diagnostics.
- Run `python test_hardcoded_redis.py` to verify Redis connectivity with the hardcoded configuration.

## Redis Database Configuration

The application now uses a non-SSL connection to Redis Cloud for maximum compatibility:

```python
import redis

# Hardcoded Redis Cloud connection used for persistent conversation memory
r = redis.Redis(
    host='redis-16717.c85.us-east-1-2.ec2.redns.redis-cloud.com',
    port=16717,
    username="default",
    password="bNQGmfkB2fRo4KrT3UXwhAUEUmgDClx7",
    decode_responses=True,       # return str instead of bytes
    socket_connect_timeout=15,   # seconds allowed for the initial connection
    socket_timeout=15,           # seconds allowed per blocking command
    health_check_interval=30,    # ping the server after 30s of idleness
    retry_on_timeout=True        # retry commands that hit a timeout
)
```
Note: SSL is disabled due to record layer failures with Redis Cloud. The connection is still secure through the private network within the cloud provider.
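
With that connection object, conversation persistence reduces to ordinary Redis operations. An illustrative sketch (the key schema here is hypothetical, not the app's actual one):

```python
# Uses the `r` connection from the snippet above.
r.ping()                                           # raises if the server is unreachable
r.rpush("chat:alice", "user: Hello, CosmicCat!")   # append to a conversation list
r.rpush("chat:alice", "assistant: Greetings, earthling!")
print(r.lrange("chat:alice", 0, -1))               # full history, oldest first
```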

## 🚀 Hugging Face Space Deployment

This application is designed for deployment on Hugging Face Spaces with the following configuration.

### Required HF Space Secrets

- OLLAMA_HOST: your ngrok tunnel to the Ollama server
- LOCAL_MODEL_NAME: default is mistral:latest
- HF_TOKEN: Hugging Face API token (for HF endpoint access)
- HF_API_ENDPOINT_URL: your custom HF inference endpoint
- TAVILY_API_KEY: for web search capabilities
- OPENWEATHER_API_KEY: for weather data integration

Redis Configuration: the application uses hardcoded Redis Cloud credentials for persistent storage.

## Multi-Model Coordination

- Primary: Ollama (fast responses, local processing)
- Secondary: Hugging Face Endpoint (deep analysis, cloud processing)
- Coordination: both providers work together rather than as fallbacks (sketched below)
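
The pattern can be sketched as: return the fast local answer immediately while the deeper analysis runs in the background. The provider callables below are hypothetical placeholders, not the app's actual API:

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable

_executor = ThreadPoolExecutor(max_workers=1)

def coordinated_reply(
    prompt: str,
    ask_ollama: Callable[[str], str],
    ask_hf_endpoint: Callable[[str], str],
) -> tuple[str, Future]:
    """Return the fast local answer now; deep analysis arrives on the Future."""
    deep_future = _executor.submit(ask_hf_endpoint, prompt)  # may absorb the cold start
    fast_answer = ask_ollama(prompt)                         # local, low-latency reply
    return fast_answer, deep_future

# Usage: render fast_answer immediately, then update the UI when deep_future resolves.
```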

## System Architecture

The coordinated AI system automatically handles:

- External data gathering (web search, weather, and time; see the sketch below)
- Fast initial responses from Ollama
- Background HF endpoint initialization
- Deep analysis coordination
- Session persistence with Redis
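
As an example of the external-data step, the weather and time lookups amount to small REST calls made before prompting. A minimal sketch against the OpenWeather current-weather API (the function name and prompt wiring are illustrative):

```python
import os
from datetime import datetime, timezone

import requests

def gather_context(city: str) -> str:
    """Collect current time and weather to prepend to the LLM prompt (illustrative)."""
    now = datetime.now(timezone.utc).isoformat(timespec="seconds")
    resp = requests.get(
        "https://api.openweathermap.org/data/2.5/weather",
        params={"q": city, "appid": os.environ["OPENWEATHER_API_KEY"], "units": "metric"},
        timeout=10,
    )
    resp.raise_for_status()
    data = resp.json()
    return (
        f"UTC time: {now}. Weather in {city}: "
        f"{data['weather'][0]['description']}, {data['main']['temp']}°C."
    )
```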

Once these variables are configured in your HF Space, the coordinated system runs end to end; running the app locally exercises the same architecture before deployment.