Vietnamese Legal Chatbot - Presentation Setup Guide
Quick Start for Presentation
Prerequisites
- Python 3.8+ installed
- Docker installed and running
- Google API Key for Gemini (optional for demo)
Step 1: Install Dependencies
pip install -r requirements.txt
Step 2: Start Qdrant Database
python start_qdrant.py
This will:
- Pull Qdrant Docker image
- Start Qdrant on http://localhost:6333
- Create persistent storage in the qdrant_data/ folder
Step 3: Set Up the System
python setup_system.py
This will:
- Load 3,271 legal documents
- Create 61,068 document chunks
- Build vector and BM25 indices
- Set up the RAG system
Step 4: Run the Application
python app.py
This will:
- Start the Gradio web interface
- Open at http://localhost:7860
- Show initialization progress
Demo Questions for Presentation
Sample Legal Questions to Try:
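"Điều kiện thành lập doanh nghiệp là gì?" ("What are the conditions for establishing a business?")
- Tests basic legal knowledge retrieval
"Quy định về thời gian làm việc tối đa trong ngày?" ("What are the regulations on maximum daily working hours?")
- Tests labor law knowledge
"Thủ tục đăng ký kết hôn cần những gì?" ("What is required to register a marriage?")
- Tests civil law procedures
"Mức phạt vi phạm giao thông đường bộ?" ("What are the penalties for road traffic violations?")
- Tests administrative law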
Presentation Structure (10-12 minutes)
1. Introduction (2 min)
- Problem: Legal information access in Vietnam
- Solution: AI-powered legal assistant using RAG
- Technology: Hybrid search (BM25 + Vector) + LLM
2. Technical Architecture (3 min)
- Show the system components
- Explain hybrid retrieval approach
- Highlight Vietnamese-specific optimizations
3. Live Demo (3 min)
- Show the web interface
- Ask sample questions
- Demonstrate response quality and citations
4. Performance Results (2 min)
- Show the performance table from results_table.txt
- Highlight the 60.82% MRR result
- Compare different methods
5. Future Work (1 min)
- Expand legal corpus
- Mobile app development
- Integration with legal services
Troubleshooting
If Qdrant fails to start:
# Check Docker status
docker ps
# Restart Qdrant
python start_qdrant.py stop
python start_qdrant.py
If setup fails:
# Clean up and retry
rm -rf qdrant_data/
python start_qdrant.py
python setup_system.py
If app fails to start:
- Check if Google API key is set (optional)
- Ensure Qdrant is running on port 6333
- Check console for error messages
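Before digging through console output, it can help to confirm that the two ports named in this guide are actually accepting connections. The following is a minimal sketch using only the Python standard library; the helper name is_port_open is hypothetical and not part of this project, and a successful TCP connection only shows that something is listening on the port, not that Qdrant or Gradio is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    # 6333 and 7860 are the Qdrant and Gradio ports mentioned in this guide.
    for name, port in [("Qdrant", 6333), ("Gradio", 7860)]:
        status = "reachable" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {status}")
```

If Qdrant shows as not reachable, restart it with python start_qdrant.py before launching the app.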
Key Features to Highlight
- Hybrid Search: Combines keyword (BM25) and semantic (vector) search
- Vietnamese-Specific: Uses specialized Vietnamese embedding models
- Reranking: Advanced document re-ranking for better relevance
- Real-time Interface: Gradio web interface with progress indicators
- Source Attribution: Always cites specific legal documents
- Fallback System: Can fall back to Google search if local documents are insufficient
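To make the hybrid search idea concrete for the audience, here is a small illustration of how a keyword ranking and a semantic ranking can be merged into one list. This guide does not state which fusion method the project uses, so the sketch below uses reciprocal rank fusion (RRF), one common technique, purely as an assumption; the function name, the constant k = 60, and the document IDs are all illustrative.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into one list using RRF.

    Each document scores 1 / (k + rank) for every list it appears in,
    where rank is its 1-based position in that list.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    # Hypothetical results from a keyword (BM25) and a vector retriever.
    bm25_results = ["doc_a", "doc_b", "doc_c"]
    vector_results = ["doc_b", "doc_d", "doc_a"]
    print(reciprocal_rank_fusion([bm25_results, vector_results]))
    # -> ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Documents that both retrievers rank highly (doc_b and doc_a here) rise to the top, which is the intuition worth conveying when explaining why hybrid retrieval helps.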
Performance Metrics
- Best Method: Hybrid 2 + Reranking
- MRR: 60.82%
- Coverage: 88.99%
- Response Time: ~0.6 seconds
- Documents: 3,271 legal documents, 61,068 chunks
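If someone in the audience asks what MRR means, a short worked example may help. MRR (mean reciprocal rank) averages 1/rank of the first relevant document over all test queries. The sketch below uses made-up ranks, not this project's evaluation data, and it assumes that a query with no relevant document retrieved contributes 0; the project's evaluation script may handle that case differently.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(first_relevant_ranks: Sequence[Optional[int]]) -> float:
    """Compute MRR from the 1-based rank of the first relevant hit per query.

    None means no relevant document was retrieved for that query; it
    contributes 0 to the sum (an assumption made for this sketch).
    """
    if not first_relevant_ranks:
        return 0.0
    total = sum(1.0 / rank for rank in first_relevant_ranks if rank is not None)
    return total / len(first_relevant_ranks)


if __name__ == "__main__":
    # Toy data: four queries whose first relevant hit is at rank 1, 2, none, and 4.
    toy_ranks = [1, 2, None, 4]
    print(f"MRR: {mean_reciprocal_rank(toy_ranks):.2%}")
    # -> MRR: 43.75%
```

Read this way, an MRR of 60.82% means the first relevant document tends to appear near the top of the results, typically within the first two positions.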