# LangChain Implementation Guide for Advanced Chatbot
## **Technology Stack Overview**
### **Core Technologies:**
- **LangChain** - Conversation memory and chain management
- **FAISS** - Vector storage and similarity search
- **OpenAI Embeddings** - Text vectorization
- **SQLAlchemy** - Persistent conversation storage
- **Redis** (optional) - Caching conversation context
### **Architecture Flow:**
```
User Message → LangChain Memory → RAG Retrieval → Context Assembly → LLM Response
```
## **Implementation Status**
### ✅ **Completed Components:**
1. **LangChain Conversation Service** (`services/langchain_conversation_service.py`)
   - Module content loading from GPT FINAL FLOW folders
   - Vector store creation with FAISS
   - Conversation chain with memory
   - RAG retrieval from module content
   - Database integration for persistent storage
2. **Enhanced Chatbot Service** (`services/chatbot_service.py`)
   - LangChain integration
   - Fallback to original conversation service
   - Module-specific content loading
3. **Database Models** (`models.py`)
   - `ConversationMemory` - Stores conversation context
   - `CrossModuleMemory` - Shares context across modules
   - `ConversationMessage` - Individual message storage
4. **Dependencies** (`requirements.txt`)
   - LangChain packages added
   - FAISS for vector storage
   - ChromaDB for alternative vector store
## **Installation Steps**
### **1. Install Dependencies**
```bash
pip install -r requirements.txt
```
### **2. Set Environment Variables**
```bash
# Required
export OPENAI_API_KEY="your_openai_api_key"
export SUPABASE_DB_URL="your_supabase_connection_string"

# Optional
export ENVIRONMENT="production"
export DEBUG="false"
```
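If startup failures are hard to diagnose, a small check at application start can surface missing variables early. This is a generic sketch, not part of the existing codebase:
```python
import os

# Variables the services above rely on; extend with optional ones as needed.
REQUIRED_VARS = ["OPENAI_API_KEY", "SUPABASE_DB_URL"]

def check_environment() -> None:
    """Raise a clear error if a required environment variable is missing."""
    missing = [name for name in REQUIRED_VARS if not os.getenv(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
```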
### **3. Test the Integration**
```bash
python test_langchain_integration.py
```
## **Configuration**
### **Module Mapping**
The system maps module IDs to folder names:
```python
module_mapping = {
    "offer_clarifier": "1_The Offer Clarifier GPT",
    "avatar_creator": "2_Avatar Creator and Empathy Map GPT",
    "before_state": "3_Before State Research GPT",
    "after_state": "4_After State Research GPT",
    "avatar_validator": "5_Avatar Validator GPT",
    "trigger_gpt": "6_TriggerGPT",
    "epo_builder": "7_EPO Builder GPT - Copy",
    "scamper_synthesizer": "8_SCAMPER Synthesizer",
    "wildcard_idea": "9_Wildcard Idea Bot",
    "concept_crafter": "10_Concept Crafter GPT",
    "hook_headline": "11_Hook & Headline GPT",
    "campaign_concept": "12_Campaign Concept Generator GPT",
    "ideation_injection": "13_Ideation Injection Bot"
}
```
### **Content Loading**
For each module, the system loads the following files (a minimal loading sketch follows the list):
- **System Prompts** (`System Prompt/*.txt`)
- **RAG Content** (`RAG/*.txt`)
- **Output Templates** (`Output template/*.txt`)
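A minimal sketch of how `load_module_content` could assemble these files into LangChain documents, assuming the module folders live under `GPT FINAL FLOW/` as described above; the real implementation is in `services/langchain_conversation_service.py`:
```python
from pathlib import Path
from langchain.schema import Document

CONTENT_ROOT = Path("GPT FINAL FLOW")  # assumed repository location of the module folders
SUBFOLDERS = ["System Prompt", "RAG", "Output template"]

def load_module_documents(folder_name: str) -> list[Document]:
    """Read every .txt file in the module's subfolders into LangChain Documents."""
    documents: list[Document] = []
    for subfolder in SUBFOLDERS:
        for path in sorted((CONTENT_ROOT / folder_name / subfolder).glob("*.txt")):
            documents.append(
                Document(
                    page_content=path.read_text(encoding="utf-8"),
                    metadata={"source": str(path), "content_type": subfolder},
                )
            )
    return documents
```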
## **How It Works**
### **1. Content Processing**
```python
# Load module content
documents = await service.load_module_content("offer_clarifier")

# Create vector store
vector_store = FAISS.from_documents(texts, embeddings)

# Create conversation chain
chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=vector_store.as_retriever(),
    memory=memory
)
```
### **2. Message Processing**
```python
# Process user message
result = await chain.ainvoke({
    "question": user_message,
    "chat_history": []
})

# Extract response and sources
response = result.get("answer")
source_documents = result.get("source_documents")
```
### **3. Memory Management**
- **Conversation History** - Stored in the database (a persistence sketch follows the list)
- **Context Summary** - AI-generated summaries
- **User Profile** - Extracted user information
- **Cross-Module Memory** - Shared across modules
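How a single exchange might be persisted with the models above; the column names here are illustrative assumptions, and the real schema lives in `models.py`:
```python
from sqlalchemy.orm import Session
from models import ConversationMessage  # existing model; field names below are assumed

def persist_turn(db: Session, session_id: str, user_message: str, answer: str) -> None:
    """Store one user/assistant exchange so later sessions can rebuild context."""
    db.add(ConversationMessage(session_id=session_id, role="user", content=user_message))
    db.add(ConversationMessage(session_id=session_id, role="assistant", content=answer))
    db.commit()
```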
## **Features**
### **✅ Conversation Memory**
- Remembers previous conversations
- Maintains context across sessions
- Stores conversation history in the database
### **✅ RAG (Retrieval Augmented Generation)**
- Loads content from GPT module folders
- Creates vector embeddings for similarity search
- Retrieves relevant content for responses
### **✅ Context Awareness**
- Understands conversation flow
- Maintains user preferences
- Shares context across modules
### **✅ Natural Language Processing**
- Detects user intent (one possible approach is sketched below)
- Generates contextual responses
- Handles conversation transitions
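Intent detection is not spelled out above; one lightweight approach, shown here as an illustration rather than the service's actual implementation, is to ask the chat model for a label:
```python
from langchain_openai import ChatOpenAI

INTENT_LABELS = ["answer_question", "provide_information", "switch_module", "small_talk"]  # illustrative

def detect_intent(user_message: str) -> str:
    """Classify a user message into one of a fixed set of intent labels."""
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # model name is an example
    prompt = (
        f"Classify the following message into exactly one of {INTENT_LABELS}. "
        f"Reply with the label only.\n\nMessage: {user_message}"
    )
    return llm.invoke(prompt).content.strip()
```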
## **Integration with Existing System**
### **Backward Compatibility**
The implementation maintains backward compatibility through a three-level fallback (sketched below):
1. **Primary**: LangChain conversation service
2. **Fallback**: Original conversation service
3. **Legacy**: Traditional Q&A mode
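A sketch of how that fallback order could be wired; the attribute and method names are placeholders for whatever `services/chatbot_service.py` actually exposes:
```python
import logging

logger = logging.getLogger(__name__)

async def generate_reply(chatbot, session_id: str, user_message: str) -> str:
    """Try the LangChain service first, then the original service, then legacy Q&A."""
    try:
        return await chatbot.langchain_service.process_message(session_id, user_message)
    except Exception:
        logger.warning("LangChain path failed, falling back", exc_info=True)
    try:
        return await chatbot.conversation_service.process_message(session_id, user_message)
    except Exception:
        logger.warning("Original conversation service failed, using legacy Q&A", exc_info=True)
    return chatbot.legacy_qa(user_message)
```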
### **Database Integration**
- Uses existing database models
- Maintains conversation history
- Supports cross-module memory
## **Testing**
### **Run Integration Tests**
```bash
python test_langchain_integration.py
```
### **Test Individual Components**
```python
# Test content loading
documents = await service.load_module_content("offer_clarifier")

# Test vector store
vector_store = await service.create_vector_store("offer_clarifier")

# Test conversation chain
chain = await service.create_conversation_chain("offer_clarifier", "test_id")
```
## **Deployment**
### **Production Setup**
1. **Install Dependencies**
   ```bash
   pip install -r requirements.txt
   ```
2. **Set Environment Variables**
   ```bash
   export OPENAI_API_KEY="your_key"
   export SUPABASE_DB_URL="your_connection_string"
   ```
3. **Run Database Migration**
   ```bash
   python setup_database.py
   ```
4. **Start the Application**
   ```bash
   python main.py
   ```
### **Hugging Face Deployment**
1. **Set Secrets** in the Hugging Face Space
2. **Deploy** the application
3. **Test** the conversational features
## **Performance Optimization**
### **Caching Strategy**
- **Vector Stores** - Cached per module (see the sketch below)
- **Conversation Chains** - Cached per session
- **Embeddings** - Reused across requests
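A minimal per-module cache along those lines; `build` stands in for whatever creates the FAISS store (for example the service's `create_vector_store`), and conversation chains can be cached per session the same way:
```python
from typing import Any, Callable

# module_id -> vector store; lives for the lifetime of the process
_vector_store_cache: dict[str, Any] = {}

def get_vector_store(module_id: str, build: Callable[[str], Any]) -> Any:
    """Return the cached vector store for a module, building it once on first use."""
    if module_id not in _vector_store_cache:
        _vector_store_cache[module_id] = build(module_id)
    return _vector_store_cache[module_id]
```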
### **Memory Management**
- **Window Memory** - Keeps the last 10 exchanges
- **Summary Memory** - AI-generated summaries (see the sketch below)
- **Token Management** - Tracks token usage
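If the fixed window ever drops too much context, LangChain's `ConversationSummaryBufferMemory` covers the summary and token-budget points in one object; a sketch, assuming an OpenAI chat model:
```python
from langchain.memory import ConversationSummaryBufferMemory
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(temperature=0)

# Recent messages are kept verbatim; older ones are summarized once the token budget is hit.
memory = ConversationSummaryBufferMemory(
    llm=llm,
    max_token_limit=1000,  # illustrative budget
    memory_key="chat_history",
    return_messages=True,
)
```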
## **Customization**
### **Adding New Modules**
1. **Create a folder** in `GPT FINAL FLOW/`
2. **Add content files** (System Prompt, RAG, Output template)
3. **Update the mapping** in `LangChainConversationService` (example below)
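Step 3 amounts to one new entry in the `module_mapping` shown earlier; the id and folder name below are examples only:
```python
# services/langchain_conversation_service.py — extend the existing mapping
module_mapping["new_module_id"] = "14_New Module GPT"  # hypothetical entry
```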
### **Customizing Memory**
```python
# Custom memory configuration
memory = ConversationBufferWindowMemory(
    memory_key="chat_history",
    return_messages=True,
    k=15  # Increase window size
)
```
### **Customizing RAG**
```python
# Custom retriever configuration
retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 5}  # Increase results
)
```
## **Troubleshooting**
### **Common Issues**
1. **OpenAI API Key Not Set**
   ```bash
   export OPENAI_API_KEY="your_key"
   ```
2. **Module Content Not Found**
   - Check the folder structure in `GPT FINAL FLOW/`
   - Verify file naming conventions
3. **Vector Store Creation Fails**
   - Check the OpenAI API key
   - Verify that content files exist
   - Check file encoding (UTF-8)
4. **Database Connection Issues**
   - Verify the Supabase connection string (a standalone connection check is sketched below)
   - Check database permissions
   - Run the database setup script
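When the connection string is the suspect, a quick standalone SQLAlchemy check isolates it from the rest of the stack; it uses the same `SUPABASE_DB_URL` the application reads:
```python
import os
from sqlalchemy import create_engine, text

engine = create_engine(os.environ["SUPABASE_DB_URL"])
with engine.connect() as conn:
    # Should print 1 if the connection string and credentials are good.
    print(conn.execute(text("SELECT 1")).scalar())
```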
### **Debug Mode**
```python
import logging

# Enable verbose logging
logging.basicConfig(level=logging.DEBUG)

# Enable LangChain verbose mode
chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=retriever,
    memory=memory,
    verbose=True  # Enable verbose mode
)
```
## **Next Steps**
### **Immediate Actions**
1. **Install dependencies** and test the integration
2. **Configure environment variables**
3. **Run database setup**
4. **Test with sample conversations**
### **Future Enhancements**
1. **Redis Caching** - For better performance
2. **Advanced Memory** - Conversation summary memory
3. **Multi-modal Support** - Images, documents
4. **Real-time Updates** - WebSocket integration
## **Benefits**
### **For Users**
- **Natural Conversations** - More human-like interactions
- **Context Awareness** - Remembers previous conversations
- **Relevant Responses** - Based on module content
- **Smooth Transitions** - Between questions and modules
### **For Developers**
- **Modular Architecture** - Easy to extend
- **Backward Compatibility** - Existing features work
- **Scalable Design** - Handles multiple modules
- **Production Ready** - Database persistence
---
**This implementation provides a robust foundation for advanced conversational AI with proper memory management and context awareness.**