# 🚀 Dubsway Video AI - Groq Agentic System Guide

## Overview

This guide will help you set up and run the enhanced agentic video analysis system using **Groq** with the **Llama3-8b-8192** model. The system provides:

- 🤖 **Agentic Analysis**: Multi-modal video understanding with reasoning capabilities
- 🎯 **MCP/ACP Integration**: Model Context Protocol tools for enhanced analysis
- 🔍 **Multi-modal Processing**: Audio, visual, and text analysis
- 🌐 **Web Integration**: Real-time web search and Wikipedia lookups
- 📊 **Beautiful Reports**: Comprehensive, formatted analysis reports
- 💾 **Enhanced Vector Storage**: Better RAG capabilities with metadata

## 🛠️ Setup Instructions

### 1. Get Groq API Key

1. Visit [Groq Console](https://console.groq.com/)
2. Sign up for a free account
3. Get your API key from the dashboard
4. Set the environment variable (Windows Command Prompt):
   ```bat
   set GROQ_API_KEY=your_key_here
   ```
   Or add it to your `.env` file:
   ```
   GROQ_API_KEY=your_key_here
   ```
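
To illustrate how the key can be resolved at runtime, here is a minimal standard-library sketch that checks the environment first and then falls back to a `.env` file. The helper name `load_groq_api_key` is hypothetical and not part of this project, which may well load its configuration differently (for example through a dotenv library).

```python
import os
from pathlib import Path


def load_groq_api_key(env_file=".env"):
    """Return GROQ_API_KEY from the environment, else from a .env file, else None.

    Hypothetical helper for illustration; not part of the project code.
    """
    key = os.environ.get("GROQ_API_KEY")
    if key:
        return key

    path = Path(env_file)
    if not path.is_file():
        return None

    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and lines that are not KEY=VALUE pairs
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == "GROQ_API_KEY":
            # Drop optional surrounding quotes; treat an empty value as missing
            return value.strip().strip("'\"") or None
    return None


if __name__ == "__main__":
    print("key found" if load_groq_api_key() else "GROQ_API_KEY is not configured")
```

Checking the environment before the file means a key set in the shell overrides the one in `.env`, which is convenient when testing with a temporary key.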

### 2. Install Dependencies

Run the setup script (Windows):
```bat
setup_agentic_system.bat
```

Or manually:
```bat
rem Activate virtual environment
myenv31\Scripts\activate.bat

rem Install dependencies
pip install -r requirements.txt

rem Install the Groq integration explicitly
pip install langchain-groq
```

### 3. Test the System

Run the test script to verify everything is working:
```bash
python test_agentic_system.py
```

You should see:
```
🚀 Dubsway Video AI - Agentic System Test
============================================================
📦 Testing Dependencies
============================================================
✅ opencv-python
✅ pillow
✅ torch
✅ transformers
✅ faster_whisper
✅ langchain
✅ langchain_groq
✅ duckduckgo-search
✅ wikipedia-api

🧪 Testing Groq Integration for Agentic Video Analysis
============================================================
✅ GROQ_API_KEY found
✅ langchain-groq imported successfully
✅ Groq test successful: Hello from Groq!

🔍 Testing Enhanced Analysis Components
============================================================
✅ Enhanced analysis imports successful
✅ MultiModalAnalyzer initialized successfully
✅ Agent created successfully

🤖 Testing Agentic Integration
============================================================
✅ Agentic integration imports successful
✅ AgenticVideoProcessor initialized successfully
✅ MCPToolManager initialized successfully
✅ 5 tools registered

🎉 All tests passed! Your agentic system is ready to use.
```
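
The dependency section of that output can be approximated with a short check like the one below. This is a sketch, not the contents of `test_agentic_system.py`. The function name `check_dependencies` is hypothetical, and the import names in the mapping are assumptions based on how these packages are usually imported (several differ from their pip names).

```python
from importlib.util import find_spec

# pip package name -> top-level import name (assumed; several differ from the pip name)
REQUIRED = {
    "opencv-python": "cv2",
    "pillow": "PIL",
    "torch": "torch",
    "transformers": "transformers",
    "faster-whisper": "faster_whisper",
    "langchain": "langchain",
    "langchain-groq": "langchain_groq",
    "duckduckgo-search": "duckduckgo_search",
    "wikipedia-api": "wikipediaapi",
}


def check_dependencies(required=REQUIRED):
    """Map each pip package name to True if its module can be located."""
    return {pkg: find_spec(module) is not None for pkg, module in required.items()}


if __name__ == "__main__":
    for pkg, ok in check_dependencies().items():
        print("OK     " if ok else "MISSING", pkg)
```

`find_spec` only locates a module without importing it, so the check stays fast even for heavy packages such as `torch`.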

## 🏃‍♂️ Running the Agentic System

### Option 1: Use Setup Script
```bat
setup_agentic_system.bat
```

### Option 2: Manual Setup
```bat
rem 1. Activate environment
myenv31\Scripts\activate.bat

rem 2. Set API key
set GROQ_API_KEY=your_key_here

rem 3. Run the daemon
python -m worker.daemon
```

### Option 3: Start Server
```bat
start-server.bat
```

## 🔧 System Architecture

### Enhanced Analysis Flow

```
Video Upload → Agentic Processor → Multi-modal Analysis
     ↓
┌─────────────────────────────────────────────────────┐
│ 1. Audio Analysis (Whisper + Emotion Detection)    │
│ 2. Visual Analysis (Object Detection + OCR)        │
│ 3. Agentic Reasoning (Groq Llama3-8b-8192)         │
│ 4. Web Search Integration                          │
│ 5. Wikipedia Lookups                               │
│ 6. Beautiful Report Generation                     │
│ 7. Enhanced Vector Storage                         │
└─────────────────────────────────────────────────────┘
     ↓
Comprehensive Analysis Report + PDF + Vector Embeddings
```
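
The flow above can be pictured as an ordered list of stages that share a context. The sketch below is illustrative only. `run_pipeline` and the placeholder stages are hypothetical and do not reflect how `AgenticVideoProcessor` is actually implemented.

```python
from typing import Callable, Dict, List, Tuple

Context = Dict[str, object]
Stage = Tuple[str, Callable[[Context], object]]


def run_pipeline(video_path: str, stages: List[Stage]) -> Context:
    """Run each stage in order; every stage can read the results of earlier ones."""
    context: Context = {"video_path": video_path}
    for name, stage in stages:
        context[name] = stage(context)
    return context


# Placeholder stages standing in for the real analysis steps
stages: List[Stage] = [
    ("audio", lambda ctx: {"transcript": "placeholder transcript"}),
    ("visual", lambda ctx: {"objects": ["person", "computer"]}),
    ("reasoning", lambda ctx: f"summary of: {ctx['audio']['transcript']}"),
    ("report", lambda ctx: f"# Video Analysis Report\n\n{ctx['reasoning']}"),
]

result = run_pipeline("video.mp4", stages)
print(result["report"])
```

Keeping stages as plain callables makes it easy to swap one out or to skip a stage when its dependency is unavailable.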

### Key Components

1. **MultiModalAnalyzer**: Handles audio, visual, and text analysis
2. **AgenticVideoProcessor**: Orchestrates the entire analysis pipeline
3. **MCPToolManager**: Manages web search, Wikipedia, and other tools
4. **Enhanced Vector Storage**: Stores analysis with rich metadata
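
As a rough picture of what a tool manager does, the sketch below keeps a registry of named tools and dispatches queries to them. `ToolRegistry` and the stub tools are hypothetical. The real `MCPToolManager` interface is not shown in this guide and may look quite different.

```python
from typing import Callable, Dict, List


class ToolRegistry:
    """Hypothetical stand-in for a tool manager: maps tool names to callables."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[[str], str]] = {}

    def register(self, name: str, func: Callable[[str], str]) -> None:
        if name in self._tools:
            raise ValueError(f"tool already registered: {name}")
        self._tools[name] = func

    def call(self, name: str, query: str) -> str:
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](query)

    def names(self) -> List[str]:
        return sorted(self._tools)


registry = ToolRegistry()
# Stub tools; real ones would call a search API or Wikipedia
registry.register("web_search", lambda q: f"[search results for {q!r}]")
registry.register("wikipedia", lambda q: f"[wikipedia summary for {q!r}]")

print(registry.names())
print(registry.call("wikipedia", "machine learning"))
```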

## 📊 Enhanced Features

### Multi-modal Analysis
- **Audio**: Transcription, emotion detection, speaker identification
- **Visual**: Object detection, scene understanding, OCR text extraction
- **Text**: Sentiment analysis, topic extraction, context enrichment

### Agentic Capabilities
- **Reasoning**: Advanced understanding using Groq Llama3
- **Context**: Web search for additional information
- **Knowledge**: Wikipedia lookups for detailed explanations
- **Insights**: Actionable recommendations and analysis

### Beautiful Reports
```
# 📹 Video Analysis Report

## 📊 Overview
- **Duration**: 120 seconds
- **Resolution**: 1920x1080
- **Language**: English

## 🎵 Audio Analysis
### Transcription Summary
[Enhanced transcription with context]

### Key Audio Segments
- **0.0s - 30.0s**: Introduction to the topic
- **30.0s - 60.0s**: Main content discussion
- **60.0s - 90.0s**: Technical details
- **90.0s - 120.0s**: Conclusion and summary

## 🎬 Visual Analysis
### Scene Breakdown
- **0.0s**: Presenter in office setting
- **30.0s**: Screen sharing with diagrams
- **60.0s**: Close-up of technical specifications
- **90.0s**: Return to presenter view

### Key Visual Elements
- **Person**: appears 45 times
- **Computer**: appears 12 times
- **Text**: appears 8 times
- **Diagram**: appears 5 times

## 🎯 Key Insights
### Topics Covered
- Artificial Intelligence
- Machine Learning
- Technology Innovation
- Business Applications

### Sentiment Analysis
- **Positive**: 75%
- **Negative**: 10%
- **Neutral**: 15%

### Important Moments
- **15s**: Key insight about AI applications
- **45s**: Technical breakthrough mentioned
- **75s**: Business impact discussion

## 📈 Recommendations
Based on the analysis, consider:
- Content engagement opportunities
- Areas for improvement
- Target audience insights

---
*Report generated using Groq Llama3-8b-8192*
```
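
A section such as "Key Visual Elements" can be produced by counting detected labels across frames. The following is a small sketch under that assumption. `render_visual_elements` and the input format (one list of labels per sampled frame) are hypothetical and not taken from the project code.

```python
from collections import Counter


def render_visual_elements(labels_per_frame):
    """Render a 'Key Visual Elements' section from per-frame label lists."""
    counts = Counter(label for frame in labels_per_frame for label in frame)
    lines = ["### Key Visual Elements"]
    for label, count in counts.most_common():
        unit = "time" if count == 1 else "times"
        lines.append(f"- **{label.capitalize()}**: appears {count} {unit}")
    return "\n".join(lines)


# Labels detected in three sampled frames (made-up example data)
frames = [["person", "computer"], ["person"], ["person", "diagram"]]
print(render_visual_elements(frames))
```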

## 🔍 Troubleshooting

### Common Issues

1. **GROQ_API_KEY not found**
   ```
   ❌ GROQ_API_KEY environment variable not found!
   ```
   **Solution**: Set the environment variable or add it to your `.env` file

2. **Import errors**
   ```
   ❌ Failed to import langchain-groq
   ```
   **Solution**: Install with `pip install langchain-groq`

3. **Agentic analysis fails**
   ```
   Agentic analysis failed, falling back to basic Whisper
   ```
   **Solution**: Check Groq API key and internet connection

4. **Memory issues**
   ```
   CUDA out of memory
   ```
   **Solution**: Reduce batch size or use CPU processing
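
When it is unclear which of the first three issues applies, a small smoke test can separate them. The sketch below assumes `langchain-groq` is installed and uses its `ChatGroq` class, which reads `GROQ_API_KEY` from the environment. The function name `groq_smoke_test` is hypothetical, and the model name is the one used throughout this guide, so its availability on Groq may change.

```python
import os


def groq_smoke_test(prompt="Reply with: Hello from Groq!"):
    """Return (ok, message) describing whether a Groq round trip works."""
    if not os.getenv("GROQ_API_KEY"):
        return False, "GROQ_API_KEY is not set"
    try:
        from langchain_groq import ChatGroq  # imported lazily so a missing package is reported
    except ImportError:
        return False, "langchain-groq is not installed"

    llm = ChatGroq(model="llama3-8b-8192", temperature=0)
    try:
        reply = llm.invoke(prompt)
    except Exception as exc:  # network, authentication, or model errors
        return False, f"Groq request failed: {exc}"
    return True, reply.content


if __name__ == "__main__":
    ok, message = groq_smoke_test()
    print("OK:" if ok else "FAILED:", message)
```

The three failure messages map onto issues 1 to 3 above: missing key, missing package, and a failing request.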

### Performance Optimization

1. **GPU Usage**: The system automatically detects and uses CUDA if available
2. **Sequential Processing**: Videos are processed one at a time to manage memory
3. **Caching**: Analysis results are cached to avoid reprocessing
4. **Fallback**: System falls back to basic analysis if enhanced features fail
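
The caching and fallback points can be combined in one wrapper, sketched below. This is an illustration, not the project's implementation: `analyze_with_fallback`, `file_digest`, and the in-memory `_cache` are hypothetical, and the real system may cache in a database or on disk.

```python
import hashlib

_cache = {}  # content hash -> analysis result (in-memory, for illustration only)


def file_digest(path):
    """Hash the file contents so identical videos share one cache entry."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def analyze_with_fallback(path, enhanced, basic):
    """Try the enhanced analysis, fall back to the basic one, and cache the result."""
    key = file_digest(path)
    if key in _cache:
        return _cache[key]
    try:
        result = enhanced(path)
    except Exception as exc:  # broad on purpose: any failure triggers the fallback
        print(f"Enhanced analysis failed ({exc}); falling back to basic analysis")
        result = basic(path)
    _cache[key] = result
    return result
```

One design choice to be aware of: this sketch also caches fallback results, so a video that failed enhanced analysis is not retried until the cache is cleared. Caching only successful enhanced results would be an equally reasonable alternative.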

## 🎯 Usage Examples

### Basic Usage
```python
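import asyncio

from app.utils.agentic_integration import analyze_with_agentic_capabilities


async def main(session):
    # `session` is your application's existing database session
    # Process video with agentic capabilities
    transcription, summary = await analyze_with_agentic_capabilities(
        video_url="https://example.com/video.mp4",
        user_id=1,
        db=session,
    )
    return transcription, summary

# Run from synchronous code with: asyncio.run(main(session))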
```

### Advanced Usage
```python
import asyncio
import os

from app.utils.enhanced_analysis import MultiModalAnalyzer


async def main():
    # Create analyzer, reading the key from the environment instead of hardcoding it
    analyzer = MultiModalAnalyzer(groq_api_key=os.environ["GROQ_API_KEY"])

    # Perform comprehensive analysis
    analysis = await analyzer.analyze_video_enhanced("video.mp4")

    # Access results
    print(analysis.formatted_report)
    print(analysis.audio_analysis)
    print(analysis.visual_analysis)


asyncio.run(main())
```

## 📈 Benefits of Agentic System

1. **Better Understanding**: Multi-modal analysis provides deeper insights
2. **Context Awareness**: Web search and Wikipedia integration
3. **Beautiful Output**: Professional, formatted reports
4. **Enhanced RAG**: Better vector embeddings for retrieval
5. **Open Model**: Uses Meta's open-weight Llama 3 8B model (`llama3-8b-8192`), served through Groq
6. **Scalable**: Handles multiple video formats and sizes
7. **Reliable**: Fallback to basic analysis if enhanced features fail

## 🔮 Future Enhancements

- **Real-time Processing**: Stream video analysis
- **Custom Models**: Integration with custom fine-tuned models
- **Advanced OCR**: Better text extraction from videos
- **Emotion Detection**: Advanced audio and visual emotion analysis
- **Multi-language**: Support for multiple languages
- **API Endpoints**: REST API for external integration

## 📞 Support

If you encounter issues:

1. Check the troubleshooting section above
2. Run `python test_agentic_system.py` to diagnose issues
3. Check the logs in `worker.log`
4. Ensure all dependencies are installed correctly
5. Verify your Groq API key is valid and has sufficient credits

---

**Happy analyzing! 🎉**