# LangChain Implementation Guide for Advanced Chatbot
## **Technology Stack Overview**
### **Core Technologies:**
- **LangChain** - Conversation memory and chain management
- **FAISS** - Vector storage and similarity search
- **OpenAI Embeddings** - Text vectorization
- **SQLAlchemy** - Persistent conversation storage
- **Redis** (optional) - Caching conversation context
### **Architecture Flow:**
```
User Message → LangChain Memory → RAG Retrieval → Context Assembly → LLM Response
```
## **Implementation Status**
### ✅ **Completed Components:**
1. **LangChain Conversation Service** (`services/langchain_conversation_service.py`)
- Module content loading from GPT FINAL FLOW folders
- Vector store creation with FAISS
- Conversation chain with memory
- RAG retrieval from module content
- Database integration for persistent storage
2. **Enhanced Chatbot Service** (`services/chatbot_service.py`)
- LangChain integration
- Fallback to original conversation service
- Module-specific content loading
3. **Database Models** (`models.py`)
- `ConversationMemory` - Stores conversation context
- `CrossModuleMemory` - Shares context across modules
- `ConversationMessage` - Individual message storage
4. **Dependencies** (`requirements.txt`)
- LangChain packages added
- FAISS for vector storage
- ChromaDB for alternative vector store
## **Installation Steps**
### **1. Install Dependencies**
```bash
pip install -r requirements.txt
```
### **2. Set Environment Variables**
```bash
# Required
export OPENAI_API_KEY="your_openai_api_key"
export SUPABASE_DB_URL="your_supabase_connection_string"
# Optional
export ENVIRONMENT="production"
export DEBUG="false"
```
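Before starting the app, it can help to verify these variables in code. A minimal sketch (the variable names come from the list above; `check_environment` is a hypothetical helper, not part of the codebase):

```python
import os

# Required variables from the list above; the optional ones are omitted.
REQUIRED_VARS = ["OPENAI_API_KEY", "SUPABASE_DB_URL"]

def check_environment() -> list[str]:
    """Return the names of any required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not os.environ.get(name)]

missing = check_environment()
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```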
### **3. Test the Integration**
```bash
python test_langchain_integration.py
```
## **Configuration**
### **Module Mapping**
The system maps module IDs to folder names:
```python
module_mapping = {
    "offer_clarifier": "1_The Offer Clarifier GPT",
    "avatar_creator": "2_Avatar Creator and Empathy Map GPT",
    "before_state": "3_Before State Research GPT",
    "after_state": "4_After State Research GPT",
    "avatar_validator": "5_Avatar Validator GPT",
    "trigger_gpt": "6_TriggerGPT",
    "epo_builder": "7_EPO Builder GPT - Copy",
    "scamper_synthesizer": "8_SCAMPER Synthesizer",
    "wildcard_idea": "9_Wildcard Idea Bot",
    "concept_crafter": "10_Concept Crafter GPT",
    "hook_headline": "11_Hook & Headline GPT",
    "campaign_concept": "12_Campaign Concept Generator GPT",
    "ideation_injection": "13_Ideation Injection Bot"
}
```
### **Content Loading**
For each module, the system loads:
- **System Prompts** (`System Prompt/*.txt`)
- **RAG Content** (`RAG/*.txt`)
- **Output Templates** (`Output template/*.txt`)
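A loader for this layout might look like the following sketch. The subfolder names come from the list above; `load_module_files` is an illustrative name, not the actual service method:

```python
from pathlib import Path

# Subfolder names taken from the content list above.
CONTENT_FOLDERS = ["System Prompt", "RAG", "Output template"]

def load_module_files(module_folder: str) -> dict[str, list[str]]:
    """Read every .txt file in each content subfolder of a module."""
    base = Path(module_folder)
    contents: dict[str, list[str]] = {}
    for folder in CONTENT_FOLDERS:
        contents[folder] = [
            path.read_text(encoding="utf-8")  # content files are expected to be UTF-8
            for path in sorted((base / folder).glob("*.txt"))
        ]
    return contents
```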
## **How It Works**
### **1. Content Processing**
```python
# Load module content
documents = await service.load_module_content("offer_clarifier")
# Create vector store
vector_store = FAISS.from_documents(texts, embeddings)
# Create conversation chain
chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=vector_store.as_retriever(),
    memory=memory
)
```
### **2. Message Processing**
```python
# Process user message
result = await chain.ainvoke({
    "question": user_message,
    "chat_history": []
})
# Extract response and sources
response = result.get("answer")
source_documents = result.get("source_documents")
```
### **3. Memory Management**
- **Conversation History** - Stored in database
- **Context Summary** - AI-generated summaries
- **User Profile** - Extracted user information
- **Cross-Module Memory** - Shared across modules
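As a rough illustration of what gets tracked, the four kinds of memory above could be mirrored by a single record like this (hypothetical names; the real persistence lives in the `models.py` classes listed earlier):

```python
from dataclasses import dataclass, field

@dataclass
class ConversationRecord:
    """Hypothetical in-memory mirror of the memory kinds listed above."""
    session_id: str
    history: list[dict] = field(default_factory=list)   # conversation history
    context_summary: str = ""                           # AI-generated summary
    user_profile: dict = field(default_factory=dict)    # extracted user info
    cross_module: dict = field(default_factory=dict)    # shared across modules

    def add_message(self, role: str, content: str) -> None:
        self.history.append({"role": role, "content": content})

record = ConversationRecord(session_id="demo")
record.add_message("user", "Hello")
```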
## **Features**
### ✅ **Conversation Memory**
- Remembers previous conversations
- Maintains context across sessions
- Stores conversation history in database
### ✅ **RAG (Retrieval Augmented Generation)**
- Loads content from GPT module folders
- Creates vector embeddings for similarity search
- Retrieves relevant content for responses
### ✅ **Context Awareness**
- Understands conversation flow
- Maintains user preferences
- Shares context across modules
### ✅ **Natural Language Processing**
- Detects user intent
- Generates contextual responses
- Handles conversation transitions
## **Integration with Existing System**
### **Backward Compatibility**
The implementation maintains backward compatibility:
1. **Primary**: LangChain conversation service
2. **Fallback**: Original conversation service
3. **Legacy**: Traditional Q&A mode
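The three-tier fallback can be sketched as a priority-ordered dispatch (the service names here are stand-ins, not the actual classes):

```python
def respond(message: str, services: list) -> str:
    """Try each conversation service in priority order, falling back on failure.

    `services` is ordered: LangChain service first, original conversation
    service second, legacy Q&A last (the ordering described above).
    """
    last_error = None
    for service in services:
        try:
            return service(message)
        except Exception as exc:  # a real implementation would log this
            last_error = exc
    raise RuntimeError("All conversation services failed") from last_error

def unavailable(_msg):  # stands in for a failing LangChain service
    raise ConnectionError("vector store unavailable")

def legacy_qa(msg):  # stands in for the legacy Q&A mode
    return f"Q&A answer to: {msg}"

print(respond("hi", [unavailable, legacy_qa]))
```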
### **Database Integration**
- Uses existing database models
- Maintains conversation history
- Supports cross-module memory
## **Testing**
### **Run Integration Tests**
```bash
python test_langchain_integration.py
```
### **Test Individual Components**
```python
# Test content loading
documents = await service.load_module_content("offer_clarifier")
# Test vector store
vector_store = await service.create_vector_store("offer_clarifier")
# Test conversation chain
chain = await service.create_conversation_chain("offer_clarifier", "test_id")
```
## **Deployment**
### **Production Setup**
1. **Install Dependencies**
```bash
pip install -r requirements.txt
```
2. **Set Environment Variables**
```bash
export OPENAI_API_KEY="your_key"
export SUPABASE_DB_URL="your_connection_string"
```
3. **Run Database Migration**
```bash
python setup_database.py
```
4. **Start the Application**
```bash
python main.py
```
### **Hugging Face Deployment**
1. **Set Secrets** in Hugging Face Space
2. **Deploy** the application
3. **Test** the conversational features
## **Performance Optimization**
### **Caching Strategy**
- **Vector Stores** - Cached per module
- **Conversation Chains** - Cached per session
- **Embeddings** - Reused across requests
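The per-module caching idea can be illustrated with a plain dictionary keyed by module ID (a simplification; a production cache would also need eviction and invalidation):

```python
class ModuleCache:
    """Cache expensive objects (e.g. vector stores) keyed by module ID."""

    def __init__(self, factory):
        self._factory = factory  # builds the object on a cache miss
        self._store = {}
        self.misses = 0

    def get(self, key: str):
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._factory(key)
        return self._store[key]

# A stand-in factory; the real one would build a FAISS store for the module.
cache = ModuleCache(factory=lambda module_id: f"vector_store_for_{module_id}")
cache.get("offer_clarifier")
cache.get("offer_clarifier")  # second call is served from the cache
```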
### **Memory Management**
- **Window Memory** - Last 10 exchanges
- **Summary Memory** - AI-generated summaries
- **Token Management** - Track token usage
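The "last 10 exchanges" window behaves like a bounded queue. A stdlib sketch of the same idea that LangChain's `ConversationBufferWindowMemory` implements:

```python
from collections import deque

class WindowMemory:
    """Keep only the most recent k exchanges (user/assistant pairs)."""

    def __init__(self, k: int = 10):
        self._exchanges = deque(maxlen=k)  # oldest entries drop off automatically

    def add_exchange(self, user_msg: str, ai_msg: str) -> None:
        self._exchanges.append((user_msg, ai_msg))

    @property
    def history(self) -> list[tuple[str, str]]:
        return list(self._exchanges)

memory = WindowMemory(k=10)
for i in range(15):
    memory.add_exchange(f"question {i}", f"answer {i}")
# only the last 10 exchanges (5 through 14) remain
```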
## **Customization**
### **Adding New Modules**
1. **Create folder** in `GPT FINAL FLOW/`
2. **Add content** files (System Prompt, RAG, Output template)
3. **Update mapping** in `LangChainConversationService`
### **Customizing Memory**
```python
# Custom memory configuration
memory = ConversationBufferWindowMemory(
    memory_key="chat_history",
    return_messages=True,
    k=15  # Increase window size
)
```
### **Customizing RAG**
```python
# Custom retriever configuration
retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 5}  # Increase results
)
```
## **Troubleshooting**
### **Common Issues**
1. **OpenAI API Key Not Set**
```bash
export OPENAI_API_KEY="your_key"
```
2. **Module Content Not Found**
- Check folder structure in `GPT FINAL FLOW/`
- Verify file naming conventions
3. **Vector Store Creation Fails**
- Check OpenAI API key
- Verify content files exist
- Check file encoding (UTF-8)
4. **Database Connection Issues**
- Verify Supabase connection string
- Check database permissions
- Run database setup script
### **Debug Mode**
```python
import logging

# Enable verbose logging
logging.basicConfig(level=logging.DEBUG)

# Enable LangChain verbose mode
chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=retriever,
    memory=memory,
    verbose=True  # Enable verbose mode
)
```
## **Next Steps**
### **Immediate Actions**
1. **Install dependencies** and test integration
2. **Configure environment variables**
3. **Run database setup**
4. **Test with sample conversations**
### **Future Enhancements**
1. **Redis Caching** - For better performance
2. **Advanced Memory** - Conversation summary memory
3. **Multi-modal Support** - Images, documents
4. **Real-time Updates** - WebSocket integration
## **Benefits**
### **For Users**
- **Natural Conversations** - More human-like interactions
- **Context Awareness** - Remembers previous conversations
- **Relevant Responses** - Based on module content
- **Smooth Transitions** - Between questions and modules
### **For Developers**
- **Modular Architecture** - Easy to extend
- **Backward Compatibility** - Existing features work
- **Scalable Design** - Handles multiple modules
- **Production Ready** - Database persistence
---
**This implementation provides a robust foundation for advanced conversational AI with proper memory management and context awareness.**